# Model loading
We start by loading the CNN used to extract the convolutional features.  
We use the `densenet169` model pretrained on ImageNet, available in the `torchvision` library. Since we do not need the final linear classifier, we keep only the convolutional part with `.features`.  

We also set the device for all subsequent computations: pick `cpu` to run everything on the CPU, or `cuda:x` if your machine has one or more GPUs. In the latter case, `x` is the index of the GPU to use, starting from 0.

Finally, we need to know the size of the convolutional features output by the network, so we pass a dummy input through it to retrieve this information.  

In [1]:
import torch
from torchvision.models import densenet169

device = "cuda:0"

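# Keep only the convolutional part of the network. Recent torchvision releases
# deprecate `pretrained=True` in favor of the `weights=` argument.
model = densenet169(pretrained=True).features
model.to(device)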

input_shape=(1, 3, 224, 224)

dummy_input = torch.ones(input_shape, requires_grad=False).to(device)
output_size = model(dummy_input).shape[1:].numel()

print("model output size = ", output_size)

model output size =  81536
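
The printed size can be cross-checked by hand. The sketch below assumes that the convolutional part of DenseNet-169 outputs 1664 channels and downsamples the input by a factor of 32; these two numbers come from the architecture, not from this notebook, and the variable names are illustrative.

```python
# Assumptions (not stated in the notebook): DenseNet-169 ends with 1664
# feature maps and reduces the spatial resolution by a factor of 32.
n_channels = 1664
downsampling = 32
height, width = 224, 224

feature_map = (n_channels, height // downsampling, width // downsampling)
output_size = feature_map[0] * feature_map[1] * feature_map[2]

print(feature_map)   # (1664, 7, 7)
print(output_size)   # 81536
```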


# Dataloading

Next we load the dataset images. In this notebook we use the `STL10` dataset available in the `torchvision` library.  
We start by setting the following parameters:
- `root`: path to the data directory. If no dataset is present, it will be downloaded here;
- `transform`: transforms to apply to each sample. We stick with a simple normalization and resize the images to (224, 224);
- `batch_size`: number of samples processed at the same time;
- `num_workers`: number of worker processes used by the dataloaders.

In [2]:
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from torch.utils.data import DataLoader

root = "/data/mldata/STL10/"
normalize = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
transform = transforms.Compose([transforms.Resize(224),
                                transforms.ToTensor(),
                                normalize
                                ])
batch_size = 32
num_workers = 12

    
train_data = datasets.STL10(root=root, split='train', transform=transform, download=True)
test_data = datasets.STL10(root=root, split='test', transform=transform, download=True)

train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=num_workers,drop_last=False)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=True, num_workers=num_workers,drop_last=False)

Files already downloaded and verified
Files already downloaded and verified
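
To make the effect of the normalization concrete: `ToTensor` scales pixel values to [0, 1], and `Normalize` with mean 0.5 and standard deviation 0.5 then maps them to [-1, 1]. A minimal plain-Python sketch of that per-channel arithmetic follows; the helper `normalize_value` is hypothetical and only meant as an illustration.

```python
def normalize_value(pixel, mean=0.5, std=0.5):
    """Same arithmetic as transforms.Normalize, applied to a single value."""
    return (pixel - mean) / std

# Pixel intensities after ToTensor lie in [0, 1].
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize_value(pixel))
```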


# Extracting the convolutional features

We need to extract the convolutional features of the train and test sets and encode them. We do so with the `get_conv_features` function defined below:

In [3]:
from time import time
import numpy as np

def get_conv_features(loader, model, out_shape, device='cpu'):
    """
    Computes the convolutional features of the images in the loader and encodes them.

    Parameters
    ----------

    loader: torch DataLoader,
        contains the images to extract the convolutional features from.
    model: torch.nn.Module,
        architecture to use to get the convolutional features.
    out_shape: int,
        output size of the last layer.
    device: string,
        device to use for the computation. Choose between 'cpu' and 'cuda:x', where
        x is the GPU number. Defaults to 'cpu'.

    Returns
    -------

    conv_features: torch tensor,
        binarized convolutional features (conv_features > 0).
        Format is (# of samples * # of features). They are already moved to CPU.
    labels: numpy array of uint8,
        labels associated to each image.
    conv_time: float,
        time required to compute the convolutional features. It includes the data loading.
    encode_time: float,
        encoding time, measured on the selected device.
    """

    n_images = len(loader.dataset)

    batch_size = loader.batch_size

    conv_features = torch.FloatTensor(n_images, out_shape).to(device)
    labels = np.empty(n_images, dtype='uint8')
    model.eval()
    
    torch.cuda.synchronize()
    t0 = time()
    
    with torch.no_grad():
        
        for i, (images, targets) in enumerate(loader):
            images = images.to(device)
            outputs = model(images)

            conv_features[i * batch_size: (i + 1) * batch_size, :] = outputs.data.view(images.size(0), -1)
            labels[i * batch_size: (i + 1) * batch_size] = targets.numpy()

        torch.cuda.synchronize()
        conv_time = time() - t0

        torch.cuda.synchronize()
        start = time()
        conv_features = (conv_features > 0)
        torch.cuda.synchronize()
        encode_time = time() - start

        conv_features = conv_features.cpu()

    return conv_features, labels, conv_time, encode_time

In [4]:
train_conv_f, train_labels, train_conv_t, train_encode_t = get_conv_features(train_loader, model, output_size,
                                                                             device=device)
print("train conv features time = {0:3.2f} s\tencoding = {1:1.5f} s".format(train_conv_t, train_encode_t))

test_conv_f, test_labels, test_conv_t, test_encode_t= get_conv_features(test_loader, model, output_size,
                                                                        device=device)
print("test conv features time  = {0:3.2f} s\tencoding = {1:1.5f} s".format(test_conv_t, test_encode_t))

train conv features time = 11.76 s	encoding = 0.00561 s
test conv features time  = 18.48 s	encoding = 0.00708 s
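
The encoding step in `get_conv_features` is the comparison `conv_features > 0`, which keeps only the sign of each feature. A toy version on plain Python lists, with made-up values:

```python
# Made-up feature values for two images, three features each.
conv_features = [[0.7, -1.2, 0.0],
                 [-0.3, 2.5, 0.1]]

# Keep a 1 where the feature is strictly positive, 0 elsewhere.
encoded = [[int(value > 0) for value in row] for row in conv_features]

print(encoded)  # [[1, 0, 0], [0, 1, 1]]
```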


# Random projection
### With GPU

We start by generating the random matrix, of size `(out_shape, n_components)`, where `n_components` is the number of random projections. We split the matrix into `10` blocks because it has to be moved to GPU memory along with the matrix of convolutional features, and there might not be enough space for both at once.
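
A rough estimate shows why the split helps. The sketch below uses the values from this notebook (`output_size = 81536`, `n_components = 120000`, `float32` entries, 10 blocks). Actual GPU memory use also depends on the feature matrix and on intermediate buffers, which are not counted here.

```python
n_features = 81536       # output_size printed earlier
n_components = 120000    # value used later in this notebook
n_blocks = 10
bytes_per_entry = 4      # float32

full_bytes = n_features * n_components * bytes_per_entry
block_bytes = n_features * (n_components // n_blocks) * bytes_per_entry

print("full matrix: {:.1f} GB".format(full_bytes / 1e9))   # full matrix: 39.1 GB
print("one block:   {:.1f} GB".format(block_bytes / 1e9))  # one block:   3.9 GB
```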

In [5]:
import cupy as cp
import numpy as np
from time import time

def generate_RM(n_components, n_features, n_ram=10, normalize=True):
    """
    Generates the splits for the random matrix, ready to be moved to GPU.

    Parameters
    ----------

    n_ram: int,
        number of splits for the random matrix.
    n_components: int,
        number of random projections.
    n_features: int,
        number of convolutional features of the input matrix.
    normalize: boolean,
        if True, normalizes the matrix by dividing each entry by np.sqrt(n_components).
        Defaults to True.

    Returns
    -------

    R: list of numpy array,
        blocks of the random projection matrix.
    generation_time: float,
        time to generate the matrix.

    """

    matrix_shape = (n_features, n_components // n_ram)
    R = []
    since = time()

    for i in range(n_ram):
        print('Generating random matrix # ', i + 1)
        # allocate the right amount of memory
        R_tmp = np.zeros(shape=matrix_shape, dtype='float32')
        # fill that amount of memory and no more
        R_tmp[:] = np.random.randn(*matrix_shape)
        if normalize is True:
            R_tmp /= np.sqrt(n_components)
        R.append(R_tmp)

    generation_time = time() - since

    return R, generation_time
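
Note that each block in `generate_RM` has `n_components // n_ram` columns, so the blocks add up to `n_components` columns only when `n_components` is a multiple of `n_ram`. A quick check, using a hypothetical helper `total_columns`:

```python
def total_columns(n_components, n_ram=10):
    """Number of columns obtained by stacking the n_ram blocks of generate_RM."""
    return (n_components // n_ram) * n_ram

print(total_columns(120000))  # 120000: the value used in this notebook divides evenly
print(total_columns(120005))  # 120000: the 5 extra columns are silently dropped
```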

Once we generate the matrix, we can perform the projection. We multiply each block of the random matrix with the convolutional feature matrix and we concatenate the partial outputs to obtain the final result. 



In [6]:
def compute_dot_split(x, random_matrix):
    """
    Computes the random projection dot product on GPU.

    Parameters
    ----------
    x: cupy array,
        contains the data to project.
    random_matrix: numpy array,
        one block of the random projection matrix.

    Returns
    -------

    output: numpy array,
        contains the random projected matrix, moved back to CPU.
    """
    
    rm = cp.asarray(random_matrix)
    output = cp.asnumpy(cp.abs(cp.dot(x, rm))**2)
    return output

def get_rand_features_GPU(R, X):
    """
    Computes the random projection on GPU.

    Parameters
    ----------

    R: list of numpy array,
        blocks of the random projection matrix.
    X: numpy array,
        matrix to project.

    Returns
    -------
    random_features: numpy array,
        matrix of random features.
    proj_time: float,
        projection time.
        
    """
    random_features = []

    # Export the features to GPU
    X = cp.asarray(X)

    # Do the RP

    t0 = time()
    for matrix in R:
        random_features.append(compute_dot_split(X, matrix))
    
    # Each block is already a NumPy array (see compute_dot_split),
    # so we only need to concatenate the blocks column-wise.
    random_features = np.concatenate(random_features, axis=1)
    proj_time = time() - t0

    return random_features, proj_time
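
Splitting the random matrix by columns does not change the result: projecting with each block and concatenating the partial outputs gives the same matrix as projecting with the full matrix. The sketch below checks this on a tiny made-up example with plain Python lists, so it runs without a GPU; the helpers `matmul` and `squared_modulus` are hypothetical stand-ins for `cp.dot` and `cp.abs(...)**2`.

```python
def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def squared_modulus(m):
    """Element-wise |v|**2, the nonlinearity used in compute_dot_split."""
    return [[abs(v) ** 2 for v in row] for row in m]

X = [[1, 0, 1],                      # 2 samples, 3 binary features
     [0, 1, 1]]
R = [[0.5, -1.0, 2.0, 0.0],          # 3 features, 4 random projections
     [1.5, 0.5, -0.5, 1.0],
     [-1.0, 2.0, 0.0, 0.5]]

full = squared_modulus(matmul(X, R))

# Split R into 2 column blocks, project with each, then concatenate along axis 1.
n_blocks = 2
width = len(R[0]) // n_blocks
blocks = [[row[b * width:(b + 1) * width] for row in R] for b in range(n_blocks)]
partial = [squared_modulus(matmul(X, block)) for block in blocks]
blockwise = [sum((p[i] for p in partial), []) for i in range(len(X))]

print(blockwise == full)  # True
print(full[0])            # [0.25, 1.0, 4.0, 0.25]
```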

The only thing left is to pick the number of random features by setting the `n_components` variable and to generate the matrix `R`.

In [7]:
n_components = 120000

R, generation_time = generate_RM(n_components, output_size)
print("Generation time = {0:3.2f} s".format(generation_time))

Generating random matrix #  1
Generating random matrix #  2
Generating random matrix #  3
Generating random matrix #  4
Generating random matrix #  5
Generating random matrix #  6
Generating random matrix #  7
Generating random matrix #  8
Generating random matrix #  9
Generating random matrix #  10
Generation time = 456.23 s


And then we compute the random features by calling `get_rand_features_GPU`:

In [9]:
train_random_features, train_proj_t = get_rand_features_GPU(R, train_conv_f)
print("Train projection time = {0:3.2f} s".format(train_proj_t))

test_random_features, test_proj_t = get_rand_features_GPU(R, test_conv_f)
print("Test projection time = {0:3.2f} s".format(test_proj_t))

Train projection time = 22.53 s
Test projection time = 30.42 s


### With OPU

To perform the random projection with the OPU we use the `lightonml` library, which provides a simple Python API to perform random projections with LightOn’s OPU.

We import the `OPUMap` object from the `lightonml` library and create an instance of the class, passing the number of random projections (`n_components`) as an argument.
A simple call to `opu.transform(X)` performs the random projection of the input matrix `X`, which contains the convolutional features of the train/test set. We store the output matrix in the `random_features` variable.



In [5]:
from lightonml.projections.sklearn import OPUMap

def get_random_features(X, n_components):
    """
    Computes the random projection with LightOn's OPU and converts the features to float32.

    Parameters
    ----------
    
    X: torch tensor or numpy array,
        matrix of convolutional features
    n_components: int,
        number of random features
        
    Returns
    -------
    
    random_features: numpy array or torch tensor,
        matrix of random features;
    proj_time: float,
        projection time.
    decode_time: float,
        decoding time.
    
    """
    
    
    opu = OPUMap(n_components=n_components)
    
    since = time()
    random_features = opu.transform(X)
    proj_time = time() - since

    since = time()
    random_features = random_features.type(torch.FloatTensor)
    decode_time = time() - since
    
    return random_features, proj_time, decode_time

In [6]:
n_components = 120000

train_random_features, train_proj_t, decode_train_t = get_random_features(train_conv_f, n_components)
print("Train projection time = {0:3.2f} s\tTrain decode time = {1:3.2f} s".format(train_proj_t, decode_train_t))

test_random_features, test_proj_t, decode_test_t = get_random_features(test_conv_f, n_components)
print("Test projection time = {0:3.2f} s\tTest decode time = {1:3.2f} s".format(test_proj_t, decode_test_t))

Train projection time = 11.29 s	Train decode time = 0.16 s
Test projection time = 8.48 s	Test decode time = 0.23 s
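
No OPU is needed to get a feel for the operation. The sketch below is a software stand-in, not the `lightonml` API: it assumes that the OPU output behaves like the squared modulus of a complex Gaussian random projection of a binary input, in line with the `cp.abs(...)**2` used in the GPU section. The function `simulated_opu` and its arguments are hypothetical.

```python
import random

def simulated_opu(x, n_components, seed=0):
    """Squared modulus of a complex Gaussian projection of a binary vector.

    Assumption: this only mimics the kind of transform an OPU performs.
    """
    rng = random.Random(seed)
    output = []
    for _ in range(n_components):
        # Draw one random complex row and project the input onto it.
        projection = sum(
            complex(rng.gauss(0, 1), rng.gauss(0, 1)) * bit for bit in x
        )
        output.append(abs(projection) ** 2)
    return output

x = [1, 0, 1, 1, 0]                      # binary-encoded features
y = simulated_opu(x, n_components=4)
print(len(y), all(v >= 0 for v in y))    # 4 True
```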


# Fitting the linear classifier

Now we can simply fit a linear classifier on the training features and evaluate its performance on the test set. We chose the `RidgeClassifier` available in `scikit-learn` because its implementation is fast compared to other classifiers such as logistic regression.


For the regularization coefficient `alpha`, values close to 1e6 tend to yield good results with LightOn’s OPU. If the random projection is computed on a GPU instead, this value needs to be lowered to 1e3.
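
A suitable `alpha` depends on the scale of the features. As an illustration, a one-feature ridge regression without intercept has the closed-form weight `w = sum(x*y) / (sum(x*x) + alpha)`, so multiplying the features by `c` requires multiplying `alpha` by `c**2` to obtain the same predictions. The sketch below checks this on made-up data. It is a simplified stand-in for `RidgeClassifier`, and this notebook does not state whether feature scale is what drives the different values for the OPU and the GPU.

```python
import math

def ridge_weight(xs, ys, alpha):
    """Closed-form ridge solution for a single feature, no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

xs = [1.0, 2.0, 3.0]     # made-up feature values
ys = [-1.0, 1.0, 1.0]    # targets in {-1, +1}

c = 10.0                 # rescaling factor applied to the features
w = ridge_weight(xs, ys, alpha=1.0)
w_scaled = ridge_weight([c * x for x in xs], ys, alpha=c ** 2 * 1.0)

preds = [w * x for x in xs]
preds_scaled = [w_scaled * c * x for x in xs]
same = all(math.isclose(a, b) for a, b in zip(preds, preds_scaled))
print(same)  # True
```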


In [10]:
from sklearn.linear_model import RidgeClassifier


clf = RidgeClassifier(alpha=1e3)
since = time()
clf.fit(train_random_features, train_labels)
fit_time = time() - since

train_accuracy = clf.score(train_random_features, train_labels) * 100
test_accuracy = clf.score(test_random_features, test_labels) * 100

print('Train acc = {0:3.2f}\tTest acc = {1:2.2f}\tFit time = {2:3.2f} s'
      .format(train_accuracy, test_accuracy, fit_time))

Train acc = 100.00	Test acc = 96.90	Fit time = 11.37 s
