# Feature Extraction

In [1]:
!git clone -b Stable https://github.com/cvai-roig-lab/Net2Brain.git

Cloning into 'Net2Brain'...
remote: Enumerating objects: 545, done.
remote: Counting objects: 100% (203/203), done.
remote: Compressing objects: 100% (127/127), done.
remote: Total 545 (delta 115), reused 150 (delta 74), pack-reused 342
Receiving objects: 100% (545/545), 101.04 MiB | 24.19 MiB/s, done.
Resolving deltas: 100% (202/202), done.
Checking out files: 100% (327/327), done.


In [2]:
!pip install Net2Brain/.

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Processing ./Net2Brain
  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
Collecting cornet@ git+https://github.com/dicarlolab/CORnet
  Cloning https://github.com/dicarlolab/CORnet to /tmp/pip-install-1wd_vbuw/cornet_8b9302bf96834d0fb1053a057969ef85
  Running command git clone -q https://github.com/dicarlolab/CORnet /tmp/pip-install-1wd_vbuw/cornet_8b9302bf96834d0fb1053a057969ef85
Collecting clip@ git+https://github.com/openai/CLIP.git
  Cloning https://github.com/openai/CLIP.git to /tmp/pip-install-1wd_vbuw/clip_76a4c14e90464fce8ca14089813a

__Net2Brain__ gives you access to over 600 Deep Neural Networks (DNNs) for experiments that compare human brain activity with the activations of artificial neural networks. These networks come from what the toolbox calls _netsets_: libraries that provide different pretrained models.

__Net2Brain__ provides access to the following _netsets_:
- [Timm](https://github.com/rwightman/pytorch-image-models#models) models

You can print the available models from every netset using the function `print_all_models()`. 

In [3]:
from net2brain.feature_extraction import print_all_models

print_all_models()

Vissl models are not installed
Detectron2 is not installed.


NetSet: standard
Models: ['AlexNet', 'ResNet18', 'ResNet34', 'ResNet50', 'ResNet101', 'ResNet152', 'Squeezenet1_0', 'Squeezenet1_1', 'VGG11', 'VGG11_bn', 'VGG13', 'VGG13_bn', 'VGG16', 'VGG16_bn', 'VGG19', 'VGG19_bn', 'Densenet121', 'Densenet161', 'Densenet169', 'Densenet201', 'GoogleNet', 'ShuffleNetV2x05', 'ShuffleNetV2x10', 'mobilenet_v2', 'mobilenet_v3_large', 'mobilenet_v3_small', 'resnext50_32x4d', 'resnext101_32x8d', 'wide_resnet101_2', 'wide_resnet50_2', 'mnasnet05', 'mnasnet10', 'efficientnet_b0', 'efficientnet_b1', 'efficientnet_b2', 'efficientnet_b3', 'efficientnet_b4', 'efficientnet_b5', 'efficientnet_b6', 'efficientnet_b7', 'regnet_y_400mf', 'regnet_y_800mf', 'regnet_y_1_6gf', 'regnet_y_3_2gf', 'regnet_y_8gf', 'regnet_y_16gf', 'regnet_y_32gf', 'regnet_x_400mf', 'regnet_x_800mf', 'regnet_x_1_6gf', 'regnet_x_3_2gf', 'regnet_x_8gf', 'regnet_x_16gf', 'regnet_x_32gf']


NetSet: timm
Models: ['adv_inception_v3', 'cait_

You can also inspect the models available from a particular _netset_ using the function `print_netset_models()`:

In [4]:
from net2brain.feature_extraction import print_netset_models

print_netset_models('pyvideo')

['slow_r50', 'slowfast_r101', 'slowfast_r50', 'x3d_m', 'x3d_s', 'x3d_xs']

Or you can find a model by its name using the function `find_model_like()`:

In [None]:
from net2brain.feature_extraction import find_model_like

find_model_like('resnet50')

standard: ResNet50
standard: wide_resnet50_2
timm: cspresnet50
timm: ecaresnet50d
timm: ecaresnet50d_pruned
timm: ecaresnet50t
timm: gluon_resnet50_v1b
timm: gluon_resnet50_v1c
timm: gluon_resnet50_v1d
timm: gluon_resnet50_v1s
timm: legacy_seresnet50
timm: nf_resnet50
timm: resnet50
timm: resnet50d
timm: seresnet50
timm: ssl_resnet50
timm: swsl_resnet50
timm: tv_resnet50
timm: wide_resnet50_2
pytorch: deeplabv3_resnet50
pytorch: fcn_resnet50
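
The output above suggests that the search is a case-insensitive substring match across all netsets. The following is only a sketch of that idea, not Net2Brain's implementation: `CATALOGUE` and `search_models` are hypothetical names, and the catalogue holds just a handful of the model names shown in this notebook.

```python
# Hypothetical, heavily abbreviated catalogue; the real netsets hold many more models.
CATALOGUE = {
    "standard": ["AlexNet", "ResNet50", "wide_resnet50_2", "VGG16"],
    "timm": ["adv_inception_v3", "resnet50", "resnet50d", "seresnet50"],
    "pytorch": ["deeplabv3_resnet50", "fcn_resnet50"],
}


def search_models(query, catalogue=CATALOGUE):
    """Return (netset, model) pairs whose model name contains `query`, ignoring case."""
    query = query.lower()
    return [
        (netset, name)
        for netset, names in catalogue.items()
        for name in names
        if query in name.lower()
    ]


matches = search_models("resnet50")
for netset, name in matches:
    print(f"{netset}: {name}")
```

Lower-casing both the query and the candidate names is what lets `'resnet50'` match `ResNet50` in the `standard` netset.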


## Using `FeatureExtractor` with a pretrained DNN

To extract the activations of a pretrained model from one of the netsets, first initialize the `FeatureExtractor` class with the name of the model and the name of its _netset_. You can also choose the device on which the extraction runs, for example if you want to use a GPU.

In [6]:
from net2brain.feature_extraction import FeatureExtractor

fx = FeatureExtractor(model='ResNet50', netset='standard', device='cpu')
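
If you do not know in advance whether a GPU is available, one option is to pick the device at runtime. The sketch below assumes that `FeatureExtractor` accepts the device string `'cuda'`, which this notebook does not show; `pick_device` is a hypothetical helper.

```python
def pick_device():
    """Return 'cuda' if PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # PyTorch is not importable here, so fall back to the CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)

# The result could then be passed on (assumption: 'cuda' is an accepted value):
# fx = FeatureExtractor(model='ResNet50', netset='standard', device=device)
```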

By default, __Net2Brain__ chooses which layers of the model to extract features from. You can inspect the default selection through the `layers_to_extract` attribute:

In [7]:
fx.layers_to_extract

['layer1', 'layer2', 'layer3', 'layer4']

However, you can also select the layers yourself. For example, if you only want the activations from layer 4, you can specify this in the `FeatureExtractor` arguments:

In [None]:
fx = FeatureExtractor(
    model='ResNet50', netset='standard',
    layers_to_extract=['layer4'],
    device='cpu'
)

If you are not sure which layers you could extract from a given model beyond the default ones, you can use the `get_all_layers()` method to print all the possibilities:

In [9]:
fx.get_all_layers()

['',
 'conv1',
 'bn1',
 'relu',
 'maxpool',
 'layer1',
 'layer1.0',
 'layer1.0.conv1',
 'layer1.0.bn1',
 'layer1.0.conv2',
 'layer1.0.bn2',
 'layer1.0.conv3',
 'layer1.0.bn3',
 'layer1.0.relu',
 'layer1.0.downsample',
 'layer1.0.downsample.0',
 'layer1.0.downsample.1',
 'layer1.1',
 'layer1.1.conv1',
 'layer1.1.bn1',
 'layer1.1.conv2',
 'layer1.1.bn2',
 'layer1.1.conv3',
 'layer1.1.bn3',
 'layer1.1.relu',
 'layer1.2',
 'layer1.2.conv1',
 'layer1.2.bn1',
 'layer1.2.conv2',
 'layer1.2.bn2',
 'layer1.2.conv3',
 'layer1.2.bn3',
 'layer1.2.relu',
 'layer2',
 'layer2.0',
 'layer2.0.conv1',
 'layer2.0.bn1',
 'layer2.0.conv2',
 'layer2.0.bn2',
 'layer2.0.conv3',
 'layer2.0.bn3',
 'layer2.0.relu',
 'layer2.0.downsample',
 'layer2.0.downsample.0',
 'layer2.0.downsample.1',
 'layer2.1',
 'layer2.1.conv1',
 'layer2.1.bn1',
 'layer2.1.conv2',
 'layer2.1.bn2',
 'layer2.1.conv3',
 'layer2.1.bn3',
 'layer2.1.relu',
 'layer2.2',
 'layer2.2.conv1',
 'layer2.2.bn1',
 'layer2.2.conv2',
 'layer2.2.bn2',
 '
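
Because the full list is long, it can help to filter it before choosing what to pass as `layers_to_extract`. The sketch below works on a short excerpt of the names printed above, hard-coded so that it runs on its own. The two filters are illustrative choices, not something Net2Brain prescribes.

```python
# Short excerpt of the names returned by fx.get_all_layers() for ResNet50.
all_layers = [
    '', 'conv1', 'bn1', 'relu', 'maxpool',
    'layer1', 'layer1.0', 'layer1.0.conv1', 'layer1.0.bn1',
    'layer1.1', 'layer1.1.conv1',
    'layer2', 'layer2.0', 'layer2.0.conv1',
]

# Top-level modules have no dot in their name; skip the empty root entry.
top_level = [name for name in all_layers if name and "." not in name]

# In this model, convolutions are named 'conv<N>' at any nesting depth.
conv_layers = [
    name for name in all_layers
    if name.split(".")[-1].startswith("conv")
]

print(top_level)
print(conv_layers)
```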

To start the extraction, call the `extract()` method. Its `save_format` argument controls how the activations are stored. With `pt` or `npz`, the activations of each image are stored separately in tensor or array format, respectively. With `dataset`, the activations are stored as a `Dataset` object of the [RSA toolbox](https://rsatoolbox.readthedocs.io/en/stable/).

You can also choose the folder in which the activations are stored with the `save_path` argument. By default this argument is `None`, in which case the activations are stored in a folder named `features` at the root of the project.
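
When using the `npz` format, a quick way to check what ended up in the output folder is to list the files and look at the arrays they contain. The sketch below uses only NumPy and writes a stand-in file to a temporary folder so that it runs on its own. The file name, the key `layer1` and the array shape are hypothetical, and the files Net2Brain writes may be organised differently.

```python
import tempfile
from pathlib import Path

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # Stand-in for one extracted file (hypothetical name, key and shape).
    np.savez(folder / "image_01.npz", layer1=np.zeros((1, 8), dtype=np.float32))

    # List every .npz file in the folder and record the arrays it holds.
    summary = {}
    for path in sorted(folder.glob("*.npz")):
        with np.load(path) as data:
            summary[path.name] = {key: data[key].shape for key in data.files}

print(summary)
```

To inspect real output, point `folder` at your own `save_path` and drop the `np.savez` line.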

In [None]:
images_path = '/Users/m_vilas/projects/Net2Brain/input_data/stimuli_data/78images'
save_path = '/Users/m_vilas/test'

fx = FeatureExtractor(model='ResNet50', netset='standard', device='cpu')
fts_datasets = fx.extract(
    dataset_path=images_path, save_format='dataset', save_path=save_path
)
print(fts_datasets)

100%|███████████████████████████████████████████| 78/78 [00:15<00:00,  4.98it/s]


{'layer1': rsatoolbox.data.Dataset(
measurements = 
[[0.00651395 0.00523897 0.00502557 ... 0.03112683 0.0168194  0.        ]
 [0.2371985  0.2579319  0.3460365  ... 1.2546022  0.24196889 0.        ]
 [0.23620495 0.2895102  0.24857639 ... 0.8353056  1.0888933  0.        ]
 ...
 [0.32463378 0.26182312 0.2564817  ... 0.04701129 0.13045819 0.        ]
 [0.00445512 0.0047385  0.00366665 ... 0.         0.         0.3265368 ]
 [0.21103615 0.15448973 0.00474278 ... 0.28030264 0.         0.61205375]]
descriptors = 
{'dnn': 'ResNet50', 'layer': 'layer1'}
obs_descriptors = 
{'images': array(['image_01', 'image_02', 'image_03', 'image_04', 'image_05',
       'image_06', 'image_07', 'image_08', 'image_09', 'image_10',
       'image_11', 'image_12', 'image_13', 'image_14', 'image_15',
       'image_16', 'image_17', 'image_18', 'image_19', 'image_20',
       'image_21', 'image_22', 'image_23', 'image_24', 'image_25',
       'image_26', 'image_27', 'image_28', 'image_29', 'image_30',
       'image_31',

In [None]:
from pathlib import Path
from rsatoolbox.data.dataset import load_dataset

filename = Path(save_path) / 'ResNet50_layer1.hdf5'
layer1_dataset = load_dataset(filename, file_type='hdf5')
layer1_dataset

rsatoolbox.data.Dataset(
measurements = 
[[0.00651395 0.00523897 0.00502557 ... 0.03112683 0.0168194  0.        ]
 [0.2371985  0.2579319  0.3460365  ... 1.2546022  0.24196889 0.        ]
 [0.23620495 0.2895102  0.24857639 ... 0.8353056  1.0888933  0.        ]
 ...
 [0.32463378 0.26182312 0.2564817  ... 0.04701129 0.13045819 0.        ]
 [0.00445512 0.0047385  0.00366665 ... 0.         0.         0.3265368 ]
 [0.21103615 0.15448973 0.00474278 ... 0.28030264 0.         0.61205375]]
descriptors = 
{'dnn': 'ResNet50', 'layer': 'layer1'}
obs_descriptors = 
{'images': array(['image_01', 'image_02', 'image_03', 'image_04', 'image_05',
       'image_06', 'image_07', 'image_08', 'image_09', 'image_10',
       'image_11', 'image_12', 'image_13', 'image_14', 'image_15',
       'image_16', 'image_17', 'image_18', 'image_19', 'image_20',
       'image_21', 'image_22', 'image_23', 'image_24', 'image_25',
       'image_26', 'image_27', 'image_28', 'image_29', 'image_30',
       'image_31', 'image_32'

## Using `FeatureExtractor` with your own DNN

In [None]:
from torchvision import models
from torchvision import transforms as T
from torchvision.models import AlexNet_Weights

# Define model and transforms
model = models.alexnet(weights=AlexNet_Weights.DEFAULT)
transforms = T.Compose([
    T.Resize((224, 224)),  # transform images if needed
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

# Define extractor
fx = FeatureExtractor(model, transforms=transforms, device='cpu')


In [None]:
# Print the model nodes to see which ones we could extract
all_layers = fx.get_all_layers()
print(all_layers)

# Only take nodes with "features." in the name
layers_to_extract = [x for x in all_layers if "features." in x]
print(layers_to_extract)

## Next steps!