# THINGS (CNN) Feature Extraction

The THINGS dataset contains `1,854` classes. Each class has `12` sample images. The MEG participants saw the entire dataset, while the fMRI participants saw only a subset of `720` classes. Here, features are extracted for the entire dataset and associated with the image filenames, so that features for specific CNN layers can be retrieved at later stages. We use the `thingsvision` library to extract and save the features. The activations are stored in the folder specified below (`save_folder` / `layer_name`).

Prior to this step, the THINGS image folder was renamed to serve as the training image folder. Its directory structure was then replicated in the same parent directory (folders only, no images) to serve as the validation image folder. Next, one image from each of the `1,854` classes was randomly selected and moved to the corresponding folder in the validation set. This resulted in a training folder and a validation folder, each containing one subfolder per class label (consistent with PyTorch's `ImageFolder` specification). The training folder contained 11 images per class and the validation folder contained 1 image per class. In the code below, the folder in use is referred to as `images_folder`, selected by setting `data_split` to either `train` or `val`.
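
The split described above could be reproduced roughly as follows. This is a minimal sketch rather than the script that was actually used: the function name `split_one_per_class`, the `seed` parameter, and the example folder names are assumptions made for illustration. It uses only the Python standard library.

```python
import random
import shutil
from pathlib import Path


def split_one_per_class(train_dir: Path, val_dir: Path, seed: int = 0) -> int:
    """Move one randomly chosen image per class folder from train_dir to val_dir.

    Hypothetical helper: returns the number of images moved.
    """
    rng = random.Random(seed)  # fixed seed so the split can be repeated
    moved = 0
    for class_dir in sorted(p for p in train_dir.iterdir() if p.is_dir()):
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        if not images:
            continue  # nothing to move for an empty class folder
        # Mirror the class subfolder in the validation folder.
        target_dir = val_dir / class_dir.name
        target_dir.mkdir(parents=True, exist_ok=True)
        chosen = rng.choice(images)
        shutil.move(str(chosen), str(target_dir / chosen.name))
        moved += 1
    return moved


# Example call (paths are placeholders, not the ones used in this notebook):
# split_one_per_class(Path("images/imgs_train"), Path("images/imgs_val"))
```

With 12 images per class, a call like this would leave 11 images per class in the training folder and 1 per class in the validation folder, matching the layout described above.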

In [1]:
import thingsvision
import torch

from pathlib import Path
from thingsvision import get_extractor
from thingsvision.utils.storing import save_features
from thingsvision.utils.data import ImageDataset, DataLoader



In [6]:
model_name = 'cornet-s'  # alternative: 'resnet18'
data_split = 'val'  # alternative: 'train'
source = 'custom'  # alternative: 'torchvision'
device = 'cuda' if torch.cuda.is_available() else 'cpu'
batch_size = 32

images_folder = Path(f'/Users/alxmrphi/Documents/Data/THINGS/images/imgs_{data_split}')
save_folder = Path('/Users/alxmrphi/Documents/Data/THINGS/activations') / model_name

In [7]:
extractor = get_extractor(
  model_name=model_name,
  source=source,
  device=device,
  pretrained=True
)

Downloading: "https://s3.amazonaws.com/cornet-models/cornet_s-1d3f7974.pth" to /Users/alxmrphi/.cache/torch/hub/checkpoints/cornet_s-1d3f7974.pth
100%|██████████| 408M/408M [00:23<00:00, 18.5MB/s] 


Using device:  cpu


In [8]:
extractor.show_model()

Sequential(
  (V1): Sequential(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (nonlin1): ReLU(inplace=True)
    (pool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (norm2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (nonlin2): ReLU(inplace=True)
    (output): Identity()
  )
  (V2): CORblock_S(
    (conv_input): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (skip): Conv2d(128, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
    (norm_skip): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv1): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (nonlin1): ReLU(inplace=True)
    (conv2): Conv2d(512, 512, k

In [11]:
layer_names_resnet18 = [ 'conv1', 'maxpool',
                        'layer1.0.conv1', 'layer1.0.conv2', 'layer1.1.conv1', 'layer1.1.conv2',
                        'layer2.0.conv1', 'layer2.0.conv2', 'layer2.1.conv1', 'layer2.1.conv2',
                        'layer3.0.conv1', 'layer3.0.conv2', 'layer3.1.conv1', 'layer3.1.conv2',
                        'layer4.0.conv1', 'layer4.0.conv2', 'layer4.1.conv1', 'layer4.1.conv2',
                        'avgpool', 'fc']

layer_names_cornet_s = ['V1.conv1', 'V1.conv2', 'V2.conv1', 'V2.conv2', 'V2.conv3',
                        'V4.conv1', 'V4.conv2', 'V4.conv3', 'IT.conv1', 'IT.conv2', 'IT.conv3',
                        'decoder.avgpool', 'decoder.linear']
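
Because the extraction loop further down flattens each layer's activations (`flatten_acts=True`), early convolutional layers produce very long feature vectors. The sketch below estimates the size of the flattened `V1.conv1` features for the validation split. It assumes a 224 x 224 input crop and 32-bit floats, neither of which is stated in this notebook, and the helper `conv_output_size` is introduced here only for illustration. The kernel size, stride, padding, and channel count are taken from the `V1.conv1` entry in the model printout above.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution without dilation."""
    return (size + 2 * padding - kernel) // stride + 1


input_size = 224   # assumption: standard ImageNet-style centre crop
n_images = 1854    # one validation image per class
out_channels = 64  # V1.conv1: Conv2d(3, 64, kernel_size=7, stride=2, padding=3)

side = conv_output_size(input_size, kernel=7, stride=2, padding=3)
features_per_image = out_channels * side * side
bytes_total = n_images * features_per_image * 4  # assumption: float32 (4 bytes)

print(f"spatial size: {side} x {side}")
print(f"features per image: {features_per_image:,}")
print(f"approx. array size: {bytes_total / 1024**3:.2f} GiB")
```

Under these assumptions the flattened `V1.conv1` array alone is several gibibytes, which is worth keeping in mind when choosing which layers to extract on a machine with limited memory.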

In [12]:
dataset = ImageDataset(
  root=images_folder,
  out_path=save_folder,
  backend=extractor.get_backend(),
  transforms=extractor.get_transformations()
)

batches = DataLoader(
  dataset=dataset,
  batch_size=batch_size, 
  backend=extractor.get_backend()
)


...Creating dataset.


In [13]:
layer_names = layer_names_cornet_s
#layer_names = layer_names_resnet18

for layer_name in layer_names:
    print(f"Processing: {layer_name}")

    features = extractor.extract_features(
        batches=batches,
        module_name=layer_name,
        flatten_acts=True,
    )

    save_path = save_folder / layer_name
    save_features(features, out_path=save_path, file_format='npy')

Processing: V1.conv1


Batch: 100%|██████████| 58/58 [10:06<00:00, 10.45s/it]


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: V1.conv2


Batch: 100%|██████████| 58/58 [10:07<00:00, 10.48s/it]


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: V2.conv1


Batch: 100%|██████████| 58/58 [09:55<00:00, 10.26s/it]


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: V2.conv2


Batch: 100%|██████████| 58/58 [09:57<00:00, 10.31s/it]


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: V2.conv3


Batch: 100%|██████████| 58/58 [09:54<00:00, 10.24s/it]


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: V4.conv1


Batch: 100%|██████████| 58/58 [09:51<00:00, 10.20s/it]


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: V4.conv2


Batch: 100%|██████████| 58/58 [36:00<00:00, 37.26s/it]   


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: V4.conv3


Batch: 100%|██████████| 58/58 [41:24<00:00, 42.83s/it] 


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: IT.conv1


Batch: 100%|██████████| 58/58 [5:09:57<00:00, 320.65s/it]    


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: IT.conv2


Batch: 100%|██████████| 58/58 [09:57<00:00, 10.30s/it]


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: IT.conv3


Batch: 100%|██████████| 58/58 [09:51<00:00, 10.19s/it]


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: decoder.avgpool


Batch: 100%|██████████| 58/58 [09:50<00:00, 10.19s/it]


...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.

Processing: decoder.linear


Batch: 100%|██████████| 58/58 [09:21<00:00,  9.69s/it]

...Features successfully extracted for all 1854 images in the database.

Output directory did not exist. Creating directories to save features...

...Features successfully saved to disk.
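
To use the saved features at a later stage, each feature row has to be matched back to its image and class. The sketch below shows one way to do that from a list of image paths in row order. It assumes that such a list is available, for example from a text file with one path per line. As far as I know `thingsvision` writes a file of this kind (`file_names.txt`) into `out_path`, but treat that filename as an assumption and check your own output folder. Both helper functions are hypothetical names introduced for illustration.

```python
from pathlib import Path


def class_labels_from_file_names(file_names):
    """Return the class label (parent folder name) for each image path, in row order."""
    return [Path(name.strip()).parent.name for name in file_names if name.strip()]


def rows_for_class(file_names, class_label):
    """Return the row indices whose image belongs to the given class."""
    labels = class_labels_from_file_names(file_names)
    return [i for i, label in enumerate(labels) if label == class_label]


# Illustrative paths only; real entries depend on the local folder layout.
example_names = [
    "imgs_val/aardvark/aardvark_01b.jpg\n",
    "imgs_val/abacus/abacus_01b.jpg\n",
]
print(class_labels_from_file_names(example_names))
print(rows_for_class(example_names, "abacus"))
```

The returned indices can then be used to select rows from a layer's feature array once it has been loaded, for instance to restrict the features to the classes seen by the fMRI participants.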




