<a href="https://colab.research.google.com/github/lookingglasslab/VisualFeatureSearch/blob/widen-support/notebooks/Interactive_Visual_Feature_Search_Caching_COCO.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Interactive Visual Feature Search
## COCO Caching

This notebook demonstrates how to build a feature cache from scratch for Visual Feature Search. We focus on creating a cache for a pretrained ResNet50 model evaluated on the COCO 2014 validation set; see the Basic Demo in our [GitHub repo](https://github.com/lookingglasslab/VisualFeatureSearch) for how this cached data can be used for efficient similarity search.

Note: if you are creating your own cache, we highly recommend doing so on a dedicated VM or local machine rather than on Google Colab. The resulting feature data can be quite large (e.g. 1-10 GB depending on the model and dataset size), and it's difficult to export files of this size out of Colab.

## Setup

In [None]:
!wget http://images.cocodataset.org/zips/val2014.zip
!wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip

!unzip annotations_trainval2014.zip
!unzip val2014.zip

In [None]:
# TODO: change to regular pip upon release
!pip install -i https://test.pypi.org/simple/ --no-deps visualfeaturesearch==0.0.11
# zarr.DirectoryStore (used below) is part of the zarr v2 API
!pip install "zarr<3"

In [5]:
import numpy as np
import torch
import torchvision
from torchvision import transforms

import zarr
import visualfeaturesearch as vfs

if torch.cuda.is_available():
  device = torch.device('cuda:0')
else:
  raise Exception('No GPU available')

## Dataset and Model Code

The caching function is designed to be easy to call once you have a trained model and a dataset to search across. The next few cells set up the COCO dataset and the ResNet50 model.

In [6]:
coco_ds = torchvision.datasets.CocoDetection(root='val2014/',
                                             annFile='annotations/instances_val2014.json',
                                             transform=vfs.data.net_transform,
                                             target_transform=lambda x : 0)
# the target_transform ensures that all target data are a fixed size (otherwise, the PyTorch dataloaders would raise an exception)

coco_dl = torch.utils.data.DataLoader(coco_ds, batch_size=256)

loading annotations into memory...
Done (t=9.89s)
creating index...
index created!
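
To illustrate why the `target_transform` above is needed, here is a minimal sketch of how PyTorch's default batching treats variable-length targets. The target lists below are hypothetical stand-ins for COCO annotations (each image can have a different number of them), not real dataset entries.

```python
import torch
from torch.utils.data.dataloader import default_collate

# Hypothetical per-image targets: one image has 1 annotation, the other has 2.
variable_targets = [
    [{"category_id": 1}],
    [{"category_id": 2}, {"category_id": 3}],
]

try:
    default_collate(variable_targets)
    collate_failed = False
except RuntimeError as err:
    # The default collate function requires per-sample lists of equal length.
    collate_failed = True
    print("collate error:", err)

# Replacing every target with the constant 0 (as the lambda above does)
# gives targets that batch cleanly into a single tensor.
constant_targets = [0, 0]
batched = default_collate(constant_targets)
print(batched)
```

Since caching only needs the images, discarding the annotations this way avoids writing a custom `collate_fn`.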


In [7]:
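# `pretrained=True` is deprecated in torchvision >= 0.13; IMAGENET1K_V1 selects the same checkpoint
model = torchvision.models.resnet50(weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V1)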
model = model.cuda().eval()

model_conv5 = vfs.util.FeatureHook(model, model.layer4[2].conv2)

Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to /root/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
100%|██████████| 97.8M/97.8M [00:00<00:00, 147MB/s]
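
For readers unfamiliar with feature hooks, the sketch below shows one way to capture an intermediate layer's output with PyTorch's `register_forward_hook`. It is not the implementation of `vfs.util.FeatureHook`; the `ForwardHookCapture` class and the toy network are hypothetical, chosen so the example runs without downloading weights.

```python
import torch
from torch import nn


class ForwardHookCapture(nn.Module):
    """Hypothetical wrapper: runs `model` and returns the output of `layer`."""

    def __init__(self, model, layer):
        super().__init__()
        self.model = model
        self._features = None
        # The hook fires every time `layer` finishes its forward pass.
        self._handle = layer.register_forward_hook(self._store)

    def _store(self, module, inputs, output):
        self._features = output.detach()

    def forward(self, x):
        self.model(x)  # the final output is discarded; only the hooked features are kept
        return self._features


# Toy network standing in for ResNet50 (an assumption for illustration only).
toy = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
).eval()

capture = ForwardHookCapture(toy, toy[0])
with torch.no_grad():
    feats = capture(torch.randn(4, 3, 16, 16))
print(feats.shape)
```

The returned tensor has the shape of the hooked layer's output rather than the model's final output, which is the behavior the caching step relies on.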


## Caching

It's as simple as one function call! This function computes the features batch by batch and saves them to disk using the Zarr Python library.

In [8]:
vfs.caching.precompute(coco_dl,
                       model_conv5,
                       cache_path='/content/ResNet_COCO_08172023_f16',
                       array_name='conv5',
                       device=device,
                       dtype=np.float16)
# we use float16 to make the resulting feature data have a much smaller memory
# footprint (but with very similar search results)

Progress: 256 / 40504
Progress: 512 / 40504
Progress: 768 / 40504
Progress: 1024 / 40504
...
[output truncated]
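
The float16 trade-off mentioned in the caching call above can be sanity-checked with a small NumPy sketch. It uses random non-negative vectors as a hypothetical stand-in for cached features and cosine similarity as an assumed scoring function, so it only illustrates the precision effect and says nothing about real search quality.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for cached features: 1000 non-negative vectors of length 512.
features32 = np.abs(rng.standard_normal((1000, 512))).astype(np.float32)
features16 = features32.astype(np.float16)  # what a float16 cache would store
query = np.abs(rng.standard_normal(512)).astype(np.float32)


def cosine_scores(features, query):
    """Cosine similarity of each feature row against the query, computed in float32."""
    features = features.astype(np.float32)
    norms = np.linalg.norm(features, axis=1) * np.linalg.norm(query)
    return features @ query / norms


scores32 = cosine_scores(features32, query)
scores16 = cosine_scores(features16, query)

max_diff = float(np.abs(scores32 - scores16).max())
top10_overlap = len(set(np.argsort(-scores32)[:10]) & set(np.argsort(-scores16)[:10]))

print("bytes (float32, float16):", features32.nbytes, features16.nbytes)
print("largest score difference:", max_diff)
print("top-10 results shared:", top10_overlap)
```

Storage halves exactly, while the similarity scores shift only by a small rounding error.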

Below we open the cached data and print its metadata. The shape is what we'd expect (40,504 image features of size $512 \times 7 \times 7$), and the uncompressed float16 data totals 2,032,328,704 bytes (about 1.9 GiB).

In [12]:
# open the cache directory written by vfs.caching.precompute above
cache_store = zarr.DirectoryStore('/content/ResNet_COCO_08172023_f16')
cache_root = zarr.group(store=cache_store, overwrite=False)
cache_root['conv5'].info

| Property | Value |
|---|---|
| Name | /conv5 |
| Type | zarr.core.Array |
| Data type | float16 |
| Shape | (40504, 512, 7, 7) |
| Chunk shape | (500, 512, 7, 7) |
| Order | C |
| Read-only | False |
| Compressor | Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0) |
| Store type | zarr.storage.DirectoryStore |
| No. bytes | 2032328704 (1.9G) |
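
The reported byte count can be reproduced by hand, which is a handy way to estimate how large a cache will be before computing it. The sketch below takes the image count, feature shape, and chunk length from the metadata above; the float32 row is included only for comparison.

```python
import math

num_images = 40504
feature_shape = (512, 7, 7)   # channels x height x width of the cached layer
chunk_rows = 500              # images per Zarr chunk (first entry of the chunk shape)
bytes_per_value = {"float16": 2, "float32": 4}

values_per_image = math.prod(feature_shape)

# Uncompressed size = images x values per image x bytes per value.
for name, nbytes in bytes_per_value.items():
    total = num_images * values_per_image * nbytes
    print(f"{name}: {total:,} bytes = {total / 2**30:.2f} GiB")

total_f16 = num_images * values_per_image * bytes_per_value["float16"]

# The last chunk is only partially filled, hence the ceiling.
num_chunks = math.ceil(num_images / chunk_rows)
chunk_bytes = chunk_rows * values_per_image * bytes_per_value["float16"]
print(f"{num_chunks} chunks of up to {chunk_bytes:,} bytes each (uncompressed)")
```

These are uncompressed sizes; the space actually used on disk depends on how well the Blosc compressor handles the features.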
