First, upload the files from the dnn-mode-connectivity directory to the session storage of this notebook. You may need to upload files inside subdirectories separately (e.g. the `models` directory). Then run this command to initialize the environment:

In [9]:
!mkdir -p data checkpoints checkpoints2 checkpoints_curve curve_eval
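
Before moving on, it can help to confirm that the upload is complete. The cell below is a minimal sketch, not part of the repository: the helper `missing_uploads` is hypothetical, and the list contains only the names mentioned in this notebook (`train.py`, `eval_curve.py`, and the `models` directory), so it is likely not exhaustive.

```python
from pathlib import Path

# Assumption: only the entries named in this notebook are checked.
# Extend the list with any other modules the scripts import.
REQUIRED = ["train.py", "eval_curve.py", "models"]


def missing_uploads(root=".", required=REQUIRED):
    """Return the required entries that do not exist under root."""
    root = Path(root)
    return [name for name in required if not (root / name).exists()]


missing = missing_uploads()
if missing:
    print("Still missing from session storage:", ", ".join(missing))
else:
    print("All expected files are present.")
```

If anything is reported as missing, upload it before running the training cells.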

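At any point, you can reset the state of the notebook by running this command. Note that it deletes the downloaded dataset in `data` along with all checkpoints and evaluation results: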

In [2]:
!rm -rf data/* checkpoints/* checkpoints2/* checkpoints_curve/* curve_eval/*

Next, train the two networks that serve as the endpoints of the curve. Each of the two runs below trains one network from scratch:

In [1]:
%run train.py --dir=checkpoints --dataset=CIFAR10 --data_path=data --model=PreResNet110 --epochs=200 --lr=0.1 --wd=1e-4 --transform=ResNet

Files already downloaded and verified
Using train (45000) + validation (5000)
Files already downloaded and verified
----  ---------  ---------  ---------  ---------  ---------  ---------
  ep         lr    tr_loss     tr_acc     te_nll     te_acc       time
----  ---------  ---------  ---------  ---------  ---------  ---------
   1     0.1000     1.6046    40.5089     1.2962    53.0000   152.5687
   2     0.1000     1.1559    58.4578     1.1634    58.7400   144.2256
   3     0.1000     1.0076    64.1556     0.9334    67.3000   144.8487
   4     0.1000     0.9005    68.6622     1.0374    66.2000   144.7706
   5     0.1000     0.8112    71.8044     0.7993    73.7400   144.4986
   6     0.1000     0.7568    73.8733     0.9110    69.1600   144.8276
   7     0.1000     0.7214    75.1356     0.6975    76.0200   144.5329
   8     0.1000     0.6841    76.4911     0.6971    76.1600   144.7541
   9     0.1000     0.6557    77.5378     0.6192    77.9000   144.7023
  10     0.1000     0.6406    77

In [8]:
%run train.py --dir=checkpoints2 --dataset=CIFAR10 --data_path=data --model=PreResNet110 --epochs=200 --lr=0.1 --wd=1e-4 --transform=ResNet

Files already downloaded and verified
Using train (45000) + validation (5000)
Files already downloaded and verified
----  ---------  ---------  ---------  ---------  ---------  ---------
  ep         lr    tr_loss     tr_acc     te_nll     te_acc       time
----  ---------  ---------  ---------  ---------  ---------  ---------
   1     0.1000     1.6202    39.8000     1.2659    54.4600   150.0946
   2     0.1000     1.1729    57.6933     1.5773    52.3400   148.6091
   3     0.1000     1.0248    63.4467     0.9805    65.9600   146.5757
   4     0.1000     0.9264    67.3622     1.0052    67.3400   147.9227
   5     0.1000     0.8520    70.0444     0.8997    69.7600   150.2749
   6     0.1000     0.7837    72.7511     0.7616    74.7600   148.0893
   7     0.1000     0.7365    74.5044     0.7240    75.3800   144.9041
   8     0.1000     0.7013    75.6489     0.6947    76.3200   145.8027
   9     0.1000     0.6726    76.7667     0.7085    76.6400   151.4990
  10     0.1000     0.6566    77

Now that we have two independently trained endpoints, we can try to find a curve that connects them.
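
For intuition, here is a toy sketch of the kind of curve being fitted. It assumes that `--curve=Bezier --num_bends=3` corresponds to a quadratic Bezier curve with two fixed endpoints and one trainable middle point, which matches the "point #0" and "point #2" lines in the log further down. The function `bezier_point` and the 2-D "weights" are illustrative only and are not taken from the repository.

```python
import numpy as np


def bezier_point(t, w_start, theta, w_end):
    """Quadratic Bezier curve: two endpoints plus one middle bend."""
    return (1 - t) ** 2 * w_start + 2 * t * (1 - t) * theta + t ** 2 * w_end


# Hypothetical 2-D stand-ins for the weights of the two trained checkpoints.
w_start = np.array([0.0, 0.0])
w_end = np.array([2.0, 0.0])

# Assumed meaning of "Linear initialization": the middle bend starts
# on the straight segment between the endpoints.
theta = 0.5 * (w_start + w_end)

# With --fix_start and --fix_end, only theta would change during training.
for t in np.linspace(0.0, 1.0, 5):
    print(f"t={t:.2f} -> {bezier_point(t, w_start, theta, w_end)}")
```

With the middle point at its initial position the curve is simply the straight line between the endpoints; training moves the middle point so that the loss stays low along the whole path.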

In [6]:
%run train.py --dir=checkpoints_curve --dataset=CIFAR10 --transform=ResNet --data_path=data --model=PreResNet110 --curve=Bezier --num_bends=3 --init_start=checkpoints/checkpoint-200.pt --init_end=checkpoints2/checkpoint-200.pt --fix_start --fix_end --epochs=200 --lr=0.03 --wd=1e-4

Files already downloaded and verified
Using train (45000) + validation (5000)
Files already downloaded and verified
Loading checkpoints/checkpoint-200.pt as point #0
Loading checkpoints2/checkpoint-200.pt as point #2
Linear initialization.
----  ---------  ---------  ---------  --------  --------  ---------
  ep         lr    tr_loss     tr_acc  te_nll    te_acc         time
----  ---------  ---------  ---------  --------  --------  ---------
   1     0.0300     0.4070    90.8644                       285.1157
   2     0.0300     0.3205    94.0644                       276.7805
   3     0.0300     0.2913    94.8956                       278.8230
   4     0.0300     0.2868    94.9244                       280.3426
   5     0.0300     0.2742    95.5422                       277.6911
   6     0.0300     0.2763    95.3511                       277.9759
   7     0.0300     0.2760    95.5067                       277.5783
   8     0.0300     0.2782    95.3511                       277.8616
 

Finally, evaluate the trained curve at 30 evenly spaced values of t between 0 and 1:

In [12]:
%run eval_curve.py --dir=curve_eval --dataset=CIFAR10 --data_path=data --transform=ResNet --model=PreResNet110 --wd=1e-4 --curve=Bezier --num_bends=3 --ckpt=checkpoints_curve/checkpoint-200.pt --num_points=30

Files already downloaded and verified
Using train (45000) + validation (5000)
Files already downloaded and verified
----------  ------------  -----------  -----------------  ----------  ----------------
         t    Train loss    Train nll    Train error (%)    Test nll    Test error (%)
----------  ------------  -----------  -----------------  ----------  ----------------
    0.0000        0.1911       0.0032             0.0311      0.2014            5.1600
    0.0345        0.1754       0.0027             0.0156      0.2002            5.0600
    0.0690        0.1629       0.0025             0.0178      0.1994            5.0200
    0.1034        0.1531       0.0026             0.0222      0.2002            4.9000
    0.1379        0.1455       0.0027             0.0156      0.2011            4.9000
    0.1724        0.1395       0.0027             0.0111      0.2030            4.8600
    0.2069        0.1350       0.0026             0.0267      0.2052            5.1000
    0.2414    