<a href="https://colab.research.google.com/github/YannDubs/lossyless/blob/main/Hub.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using lossyless CLIP compressor

This notebook contains a minimal example of how to use the pretrained CLIP compressor, available on PyTorch Hub, from our paper [**Lossy Compression for Lossless Prediction**](https://arxiv.org/pdf/2106.10800.pdf).

**Make sure that you use a GPU** (on COLAB: runtime -> change runtime type -> Hardware accelerator: GPU)

## Environment

In [1]:
!pip install torch torchvision tqdm numpy compressai scikit-learn git+https://github.com/openai/CLIP.git --quiet


## Downloading the pretrained compressor

First, we download the compressor. The following command returns the compressor together with the transform that should be applied to images before compression. The transform resizes and crops images to `(3, 224, 224)`, converts them to tensors, and applies CLIP normalization.

In [3]:
import torch

# Load the desired compressor and the transformation to apply to images (by default on GPU if available)
compressor, transform = torch.hub.load(
    "YannDubs/lossyless:main", "clip_compressor_b005"
)

Downloading: "https://github.com/YannDubs/lossyless/archive/main.zip" to /root/.cache/torch/hub/main.zip
Downloading: "https://github.com/YannDubs/lossyless/releases/download/v0.1-alpha/beta5e-02_factorized_rate.pt" to /root/.cache/torch/hub/checkpoints/beta5e-02_factorized_rate.pt
100%|███████████████████████████████████████| 354M/354M [00:10<00:00, 33.8MiB/s]
  "Argument interpolation should be of type InterpolationMode instead of int. "


You can also use a stronger or a weaker compressor. Specifically, `b005` stands for $\beta=0.05$, and increasing $\beta$ increases the compression strength (note that this corresponds to $\frac{1}{\beta}$ in the paper). To see the available compressors, use the following command:

In [4]:
# list available compressors. b01 compresses the most (b01 > b005 > b001)
torch.hub.list("YannDubs/lossyless:main")

Using cache found in /root/.cache/torch/hub/YannDubs_lossyless_main


['clip_compressor_b001', 'clip_compressor_b005', 'clip_compressor_b01']
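
As a small illustration of the naming scheme, the sketch below recovers $\beta$ from an entry-point name and sorts the names from strongest to weakest compression. The helper `beta_from_name` is hypothetical (it is not part of the lossyless package), and the parsing rule is an assumption inferred from the three names listed above.

```python
def beta_from_name(name):
    """Hypothetical helper: 'clip_compressor_b005' -> 0.05.

    Assumes the digits after '_b' encode beta with the decimal point
    placed after the first digit (b01 -> 0.1, b005 -> 0.05, b001 -> 0.01).
    """
    digits = name.rsplit("_b", 1)[1]
    return float(f"{digits[0]}.{digits[1:]}")


names = ["clip_compressor_b001", "clip_compressor_b005", "clip_compressor_b01"]

# Larger beta means stronger compression in this notebook's convention.
by_strength = sorted(names, key=beta_from_name, reverse=True)
for name in by_strength:
    print(name, beta_from_name(name))
```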

## Compressing an entire dataset

Let's see how to compress and save a torchvision dataset to file. We will use STL10 as it is quick and easy to download.

Importantly, we apply `transform` to each image so that the inputs have the format the compressor expects.


In [7]:
from torchvision.datasets import STL10
DATA_DIR = "data/"

# Load some data to compress and apply transformation
stl10_train = STL10(DATA_DIR, download=True, split="train", transform=transform)
stl10_test = STL10(DATA_DIR, download=True, split="test", transform=transform)

Downloading http://ai.stanford.edu/~acoates/stl10/stl10_binary.tar.gz to data/stl10_binary.tar.gz




Extracting data/stl10_binary.tar.gz to data/
Files already downloaded and verified
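
To make the effect of `transform` more concrete, here is a rough NumPy sketch of the cropping and normalization steps. It is only an illustration under stated assumptions: the real `transform` returned by `torch.hub.load` is applied to the dataset images and yields torch tensors, the resizing step is omitted here (the fake image is assumed to be at least 224 pixels on each side), and the helper names are hypothetical. The mean and standard deviation are the per-channel statistics used by CLIP's own preprocessing.

```python
import numpy as np

# Per-channel (RGB) statistics used by CLIP's preprocessing.
CLIP_MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
CLIP_STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)


def center_crop(img, size=224):
    """Hypothetical helper: central size x size window of an (H, W, 3) array."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]


def to_normalized_chw(img_uint8):
    """Hypothetical helper: uint8 (H, W, 3) -> float32 (3, H, W), CLIP-normalized."""
    x = img_uint8.astype(np.float32) / 255.0  # scale to [0, 1]
    x = (x - CLIP_MEAN) / CLIP_STD            # per-channel normalization
    return np.transpose(x, (2, 0, 1))         # channels first


# Stand-in for an image that has already been resized.
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(256, 300, 3), dtype=np.uint8)

x = to_normalized_chw(center_crop(fake_image))
print(x.shape, x.dtype)
```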


Let us now compress the entire dataset and save it to file. We provide a helper method for this, `compress_dataset` (see its docstring for more information). This requires a GPU.

In [9]:
# Rate: 1506.50 bits/img | Encoding: 347.82 img/sec
compressor.compress_dataset(
    stl10_train,
    f"{DATA_DIR}/stl10_train_Z.bin",
    label_file=f"{DATA_DIR}/stl10_train_Y.npy",
)
compressor.compress_dataset(
    stl10_test,
    f"{DATA_DIR}/stl10_test_Z.bin",
    label_file=f"{DATA_DIR}/stl10_test_Y.npy",
)

100%|██████████| 40/40 [00:18<00:00,  2.18it/s]

Rate: 1506.62 bits/img | Encoding: 271.71 img/sec 


100%|██████████| 63/63 [00:26<00:00,  2.37it/s]

Rate: 1507.56 bits/img | Encoding: 301.09 img/sec 





The dataset is now saved to file.

In [11]:
!du -sh data/stl10_train_Z.bin

920K	data/stl10_train_Z.bin
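
As a sanity check, the reported file size is consistent with the rate printed during compression. The short calculation below uses the training rate from the output above (1506.62 bits/img) and the 5000 training images. The comparison with raw pixels assumes uncompressed 96x96 RGB images at 8 bits per channel, which is the native STL10 format; it ignores any file header overhead and the block rounding done by `du`.

```python
bits_per_img = 1506.62  # training-set rate reported by compress_dataset above
n_images = 5000         # size of the STL10 training split

# Total compressed size implied by the rate.
compressed_bytes = bits_per_img * n_images / 8
compressed_kib = compressed_bytes / 1024

# Raw size of one STL10 image: 96 x 96 pixels, 3 channels, 8 bits each.
raw_bits_per_img = 96 * 96 * 3 * 8
ratio = raw_bits_per_img / bits_per_img

print(f"Implied file size: {compressed_kib:.0f} KiB")
print(f"Compression vs. raw pixels: {ratio:.0f}x")
```

The implied size is roughly 920 KiB, in line with what `du` reports.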


Let us now load and decompress the dataset from file. The decompressed data is loaded as a NumPy array. By default, this does not use a GPU.

In [12]:
# Decoding: 1062.38 img/sec
Z_train, Y_train = compressor.decompress_dataset(
    f"{DATA_DIR}/stl10_train_Z.bin", label_file=f"{DATA_DIR}/stl10_train_Y.npy"
)
Z_test, Y_test = compressor.decompress_dataset(
    f"{DATA_DIR}/stl10_test_Z.bin", label_file=f"{DATA_DIR}/stl10_test_Y.npy"
)

100%|██████████| 5000/5000 [00:04<00:00, 1090.42it/s]

Decoding: 1086.62 img/sec 


100%|██████████| 8000/8000 [00:07<00:00, 1104.77it/s]

Decoding: 1101.41 img/sec 





Now that we have the decompressed data, let's test how well we can classify from it.

In [15]:
from sklearn.svm import LinearSVC
import time

# Accuracy: 98.65% | Training time: 0.5 sec
clf = LinearSVC(C=7e-3)
start = time.time()
clf.fit(Z_train, Y_train)
delta_time = time.time() - start
acc = clf.score(Z_test, Y_test)
print(
    f"Downstream STL10 accuracy: {acc*100:.2f}%.  \t Training time: {delta_time:.1f} "
)

Downstream STL10 accuracy: 98.64%.  	 Training time: 0.6 


## Representing a batch of images

If you have a batch of images and only want to represent them (skipping the compression and decompression steps), you can do the following.

In [33]:
from torch.utils.data import DataLoader

# 1. Get a batch of images from STL10 (note that the correct transform is already applied)
X, _ = next(iter(DataLoader(stl10_train, batch_size=128)))
print("X shape:", X.shape)

# 2. Transfer batch to CUDA and half precision
X = X.to("cuda").half()

# 3. Represent the data (equivalent of compression + decompressing but quicker)
Z = compressor(X)
print("Z shape:", Z.shape)

X shape: torch.Size([128, 3, 224, 224])
Z shape: torch.Size([128, 512])
