# Investigate Bad Filters

In the previous notebook, we observed four filters in the first convolutional layer that didn't seem to respond to much of anything. Let's see what happens when we drop those filters altogether. Do we still get good results, or are these filters actually useful, so that the network performs worse without them?

Since our network was pretrained on ImageNet, we'll test it on CIFAR-10.

In [69]:
from fastai.vision import *

## Basic Training

First, let's sanity-check that we can train on CIFAR.

For pretrained models, fastai freezes the body of the network and trains only the head.

In [70]:
path = untar_data(URLs.CIFAR)

In [100]:
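def trainLearner():
    # Use the CIFAR `test` folder as the validation set.
    data = ImageDataBunch.from_folder(path, valid='test', bs=128)
    # ImageNet-pretrained ResNet-50; the body stays frozen, so only the head trains.
    learner = cnn_learner(data, models.resnet50, metrics=accuracy, pretrained=True)
    learner.fit_one_cycle(10, max_lr=1e-2)
    # Accuracy recorded for the final epoch.
    acc = learner.recorder.metrics[-1]
    return acc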

In [101]:
acc1 = trainLearner()
acc2 = trainLearner()
acc3 = trainLearner()

print("Avg accuracy:", np.mean((acc1, acc2, acc3)))

epoch,train_loss,valid_loss,accuracy,time
0,1.244443,1.088225,0.6463,00:16
1,0.927991,0.885325,0.7059,00:16
2,0.892676,0.846013,0.7079,00:16
3,0.736301,0.833539,0.7182,00:17
4,0.701119,0.698902,0.7601,00:17
5,0.644356,0.635338,0.7774,00:17
6,0.547222,0.578339,0.7989,00:17
7,0.469942,0.541901,0.8118,00:17
8,0.406963,0.547133,0.8169,00:17
9,0.345667,0.550512,0.8131,00:17


epoch,train_loss,valid_loss,accuracy,time
0,1.228941,1.105009,0.6455,00:16
1,0.967721,0.965166,0.6806,00:16
2,0.8773,1.211801,0.6872,00:17
3,0.788042,0.860623,0.7071,00:17
4,0.722246,0.689881,0.7605,00:16
5,0.614706,0.627281,0.7867,00:17
6,0.542406,0.596784,0.7968,00:17
7,0.466417,0.550768,0.8098,00:17
8,0.397846,0.545762,0.8165,00:17
9,0.36579,0.545271,0.8177,00:17


epoch,train_loss,valid_loss,accuracy,time
0,1.236661,1.014586,0.6469,00:16
1,0.906971,0.845509,0.7117,00:17
2,0.914414,0.834185,0.7155,00:17
3,0.793425,0.737999,0.7479,00:16
4,0.703719,0.727799,0.7537,00:16
5,0.630391,0.697964,0.7605,00:17
6,0.537599,0.582982,0.7979,00:17
7,0.453328,0.557602,0.8128,00:16
8,0.403366,0.55181,0.8164,00:17
9,0.360014,0.558081,0.8185,00:16


Avg accuracy: 0.81643337


## Training Without "Bad" Filters

Next we'll try the same exercise, but we'll completely omit the "bad" filters. If they're truly doing nothing, then we should expect to get comparable performance without them.

In [104]:
def trainLearnerButDropFiltersFromFirstConv(filter_indexes):
    data = ImageDataBunch.from_folder(path, valid='test', bs=128)
    learner = cnn_learner(data, models.resnet50, metrics=accuracy, pretrained=True)

    # "Drop" the conv1 filters listed in `filter_indexes` by zeroing their weights in place.
    # no_grad() keeps autograd from tracking the edit; because the body is frozen,
    # these weights are not updated again during training.
    filters = learner.model[0][0].weight
    with torch.no_grad():
        for idx in filter_indexes:
            filters[idx].zero_()

    learner.fit_one_cycle(10, max_lr=1e-2)
    acc = learner.recorder.metrics[-1]
    return acc
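
To see why zeroing a filter's weights amounts to dropping it, here is a minimal pure-Python sketch, separate from the fastai code above. The `correlate2d` helper, the toy image, and the two toy filters are hypothetical and exist only for illustration. It assumes the convolution has no bias term (which I believe is the case for the first conv layer of torchvision's ResNet), so a zeroed filter produces an all-zero feature map regardless of the input.

```python
def correlate2d(image, kernel):
    """Valid-mode 2D cross-correlation (what 'conv' layers compute), with no bias."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out

# Toy 4x4 single-channel "image" (made-up values).
image = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 8.0, 7.0, 6.0],
    [5.0, 4.0, 3.0, 2.0],
]

# Two toy 2x2 filters: a horizontal-difference filter and a zeroed ("dropped") one.
edge_filter = [[1.0, -1.0], [1.0, -1.0]]
dropped_filter = [[0.0, 0.0], [0.0, 0.0]]

edge_map = correlate2d(image, edge_filter)
dropped_map = correlate2d(image, dropped_filter)

print("edge filter output:   ", edge_map)
print("dropped filter output:", dropped_map)
```

The zeroed filter's channel carries no information about the image, which is the sense in which the experiment "drops" it while leaving the layer's shape unchanged.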

In [105]:
acc1 = trainLearnerButDropFiltersFromFirstConv([13, 42, 52, 58])
acc2 = trainLearnerButDropFiltersFromFirstConv([13, 42, 52, 58])
acc3 = trainLearnerButDropFiltersFromFirstConv([13, 42, 52, 58])

print("Avg accuracy:", np.mean((acc1, acc2, acc3)))

epoch,train_loss,valid_loss,accuracy,time
0,1.247845,1.080709,0.6424,00:15
1,1.050161,1.262804,0.5936,00:15
2,0.97237,0.905316,0.6822,00:16
3,0.760521,0.799287,0.7281,00:16
4,0.693087,0.742944,0.7658,00:16
5,0.62796,0.647821,0.7718,00:16
6,0.563726,0.584559,0.795,00:16
7,0.474862,0.553025,0.8067,00:15
8,0.400794,0.550689,0.8132,00:16
9,0.36015,0.559005,0.8121,00:16


epoch,train_loss,valid_loss,accuracy,time
0,1.234343,0.99944,0.6618,00:16
1,1.034868,1.219926,0.668,00:16
2,0.813259,0.976878,0.7108,00:15
3,0.749725,0.68571,0.7616,00:16
4,0.6601,0.686995,0.7636,00:18
5,0.611054,0.607105,0.792,00:16
6,0.536444,0.790822,0.8,00:16
7,0.456535,0.540729,0.8142,00:16
8,0.390104,0.542062,0.8174,00:17
9,0.349075,0.548315,0.817,00:17


epoch,train_loss,valid_loss,accuracy,time
0,1.276838,1.220675,0.6445,00:17
1,1.007382,0.906003,0.6852,00:16
2,0.906013,0.984407,0.665,00:16
3,0.758477,0.688748,0.7585,00:16
4,0.682145,0.675771,0.7636,00:16
5,0.594602,0.620319,0.7843,00:16
6,0.535475,0.565233,0.8015,00:15
7,0.457108,0.536073,0.8146,00:16
8,0.383365,0.532336,0.8215,00:16
9,0.335319,0.539524,0.8223,00:16


Avg accuracy: 0.81713337


So accuracy is roughly unchanged. (I think the slight increase is likely just chance; in my earlier runs it seemed consistently lower.)

## Training Without Random Filters

Is accuracy unchanged because the filters we dropped are useless, or is it just a property of neural networks that they're pretty good at recovering after brain surgery?

Let's try the same experiment, but we'll drop random filters that are not in our list of "bad" filters.

In [115]:
def get_filters_to_drop():
    # Pick 4 distinct random conv1 filter indexes (0-63), skipping the "bad" ones.
    bad_filter_idx = [13, 42, 52, 58]

    filters_to_drop = []

    while len(filters_to_drop) < 4:
        rand_idx = np.random.randint(0, 64)  # upper bound is exclusive
        if rand_idx not in bad_filter_idx and rand_idx not in filters_to_drop:
            filters_to_drop.append(rand_idx)

    return filters_to_drop
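
The retry loop above can also be written as a single draw without replacement from the allowed indexes. This is a sketch using only the standard library; the function name `sample_filters_to_drop` and its parameters (`n_filters`, `k`, `seed`) are my own and not part of the notebook.

```python
import random

def sample_filters_to_drop(n_filters=64, bad_filter_idx=(13, 42, 52, 58), k=4, seed=None):
    """Pick k distinct filter indexes, excluding the known "bad" ones."""
    rng = random.Random(seed)  # pass a seed to make the picks reproducible
    candidates = [i for i in range(n_filters) if i not in bad_filter_idx]
    return rng.sample(candidates, k)

picked = sample_filters_to_drop(seed=0)
print("Dropping", picked)
```

Seeding the generator would make it possible to rerun the experiment with the same set of dropped filters.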

In [116]:
filters_to_drop = get_filters_to_drop()
print("Dropping", filters_to_drop)
acc1 = trainLearnerButDropFiltersFromFirstConv(filters_to_drop)
filters_to_drop = get_filters_to_drop()
print("Dropping", filters_to_drop)
acc2 = trainLearnerButDropFiltersFromFirstConv(filters_to_drop)
filters_to_drop = get_filters_to_drop()
print("Dropping", filters_to_drop)
acc3 = trainLearnerButDropFiltersFromFirstConv(filters_to_drop)

print("Avg accuracy:", np.mean((acc1, acc2, acc3)))

Dropping [32, 1, 5, 17]


epoch,train_loss,valid_loss,accuracy,time
0,1.28604,1.217069,0.6203,00:16
1,1.005321,0.9957,0.6789,00:17
2,0.982823,1.169003,0.6994,00:17
3,0.747405,0.811925,0.7187,00:16
4,0.695665,0.728269,0.7491,00:17
5,0.620434,0.628402,0.7815,00:17
6,0.532214,0.598709,0.7947,00:17
7,0.458927,0.560244,0.8093,00:16
8,0.395093,0.54983,0.8153,00:17
9,0.357761,0.556244,0.8157,00:17


Dropping [33, 34, 24, 57]


epoch,train_loss,valid_loss,accuracy,time
0,1.254056,1.066417,0.6413,00:17
1,0.99704,1.005947,0.6688,00:17
2,1.059929,0.968771,0.6635,00:16
3,0.799531,0.831358,0.7198,00:17
4,0.693111,0.722479,0.7462,00:17
5,0.627966,0.635978,0.7812,00:17
6,0.548624,0.58408,0.8021,00:17
7,0.482482,0.553478,0.8087,00:17
8,0.39156,0.554136,0.8154,00:17
9,0.35861,0.562674,0.8163,00:17


Dropping [55, 47, 27, 48]


epoch,train_loss,valid_loss,accuracy,time
0,1.244725,1.107273,0.6406,00:17
1,1.074177,1.498895,0.6524,00:17
2,0.916429,0.863429,0.7119,00:17
3,0.765995,0.748045,0.7423,00:16
4,0.686289,0.773112,0.7308,00:16
5,0.633192,0.660814,0.7691,00:17
6,0.535396,0.583198,0.7942,00:16
7,0.469384,0.558696,0.808,00:16
8,0.407914,0.559097,0.8093,00:16
9,0.359167,0.566663,0.8082,00:16


Avg accuracy: 0.8134


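So we see a small decrease: 0.8134 on average, versus 0.8164 for the baseline and 0.8171 with the "bad" filters dropped. With only three runs per configuration, and individual runs varying by a similar amount, that's not enough for me to say anything conclusive.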
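
One rough way to judge whether a gap like this matters is to compare it with the run-to-run spread. The sketch below uses only the final-epoch accuracies reported in the logs above; the dictionary keys are labels I chose for illustration. With three runs per configuration the standard deviations are themselves very noisy, so treat this as a sanity check rather than a significance test.

```python
from statistics import mean, stdev

# Final-epoch accuracies copied from the training logs above.
runs = {
    "baseline": [0.8131, 0.8177, 0.8185],
    "bad filters dropped": [0.8121, 0.8170, 0.8223],
    "random filters dropped": [0.8157, 0.8163, 0.8082],
}

# Mean and sample standard deviation for each configuration.
summary = {name: (mean(accs), stdev(accs)) for name, accs in runs.items()}

for name, (avg, sd) in summary.items():
    print(f"{name:>24}: mean={avg:.4f}  stdev={sd:.4f}")

# Gap between the baseline and the random-filter runs, for comparison with the spreads.
gap = summary["baseline"][0] - summary["random filters dropped"][0]
print(f"baseline - random gap: {gap:.4f}")
```

If the gap is no larger than the spread within a configuration, more runs would be needed before reading anything into it.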

To be honest, after running this experiment, we haven't really learned that much. But we've at least crossed off this line of investigation for now.

A potential follow-up would be to run this same experiment on a different (maybe more challenging) dataset.