<a href="https://colab.research.google.com/github/kiran74-ds/zero_shot_learning/blob/main/Embarrassingly_Simple_ZSL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **Embarrassingly simple approach to ZSL**

The approach has two parts and uses a linear model to relate features, attributes, and classes.

The first part models the relationship between features and attributes through a learned weight matrix.

The second part models the relationship between attributes and classes, using a fixed, prescribed matrix of attribute signatures.

<img src="https://iq.opengenus.org/content/images/2020/01/Screenshot-from-2020-01-27-00-56-44.png">
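
As a minimal sketch of this two-part model, the score of class `c` for a feature vector `x` can be written as `x @ V @ S[:, c]`, where `V` maps features to attributes and `S` holds one attribute signature per class. The matrices below are random placeholders rather than trained weights, and the toy sizes `d`, `a`, `z` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, a, z = 6, 4, 3  # feature dim, number of attributes, number of classes (toy sizes)

V = rng.normal(size=(d, a))  # part 1: features -> attributes weights (random here, learned in practice)
S = rng.normal(size=(a, z))  # part 2: fixed attribute signatures, one column per class
x = rng.normal(size=d)       # one image feature vector

attribute_scores = x @ V              # shape (a,): predicted attribute activations
class_scores = attribute_scores @ S   # shape (z,): compatibility with each class
predicted_class = int(np.argmax(class_scores))
```

Because `S` is fixed and only `V` is learned, swapping in the signatures of unseen classes at test time is what makes zero-shot prediction possible.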

### Download the dataset

In [None]:
!wget http://datasets.d2.mpi-inf.mpg.de/xian/xlsa17.zip

In [None]:
!unzip xlsa17.zip

### Import the necessary libraries

In [163]:
import numpy as np
import os
import scipy.io
from sklearn.metrics import classification_report,confusion_matrix, accuracy_score

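### Load the Dataset

Choose one of the available datasets (APY, AWA1, AWA2, SUN, CUB).
The cell below selects AWA2 (Animals with Attributes 2). Note that the outputs recorded further down (64 attributes, 12 unseen classes such as cow and horse) correspond to a run with `dataset = 'APY'`.

+ `res101` contains image features extracted with a ResNet-101 model, along with the image labels.
+ `att_splits` contains the train, validation, and test split indices, the class attribute vectors, and the class names.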
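
The split arrays in `att_splits` (for example `train_loc`) store 1-based MATLAB positions, which is why the cells below subtract 1 before indexing. A small sketch of that conversion, using made-up toy arrays that only mimic the layout of the loaded `.mat` dictionaries:

```python
import numpy as np

# Toy stand-ins for the loaded .mat contents (hypothetical values).
features = np.arange(12).reshape(2, 6)              # (feature_dim, n_images), like res101['features']
labels = np.array([[3], [1], [2], [3], [1], [2]])   # (n_images, 1), like res101['labels']
train_loc = np.array([[1], [4], [6]])               # 1-based image positions, like att_splits['train_loc']

idx = np.squeeze(train_loc) - 1    # convert to 0-based NumPy indices
train_features = features[:, idx]  # images are columns of the feature matrix
train_labels = labels[idx]         # images are rows of the label array
```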

In [390]:
dataset = 'AWA2'
res101 = scipy.io.loadmat('xlsa17/data/'+dataset+'/res101.mat')
att_splits = scipy.io.loadmat('xlsa17/data/'+dataset+'/att_splits.mat')

In [376]:
res101.keys()

dict_keys(['__header__', '__version__', '__globals__', 'image_files', 'features', 'labels'])

In [377]:
att_splits.keys()

dict_keys(['__header__', '__version__', '__globals__', 'allclasses_names', 'att', 'original_att', 'test_seen_loc', 'test_unseen_loc', 'train_loc', 'trainval_loc', 'val_loc'])

### Prepare the Dataset for Training 

In [378]:
# labels for all images come from res101; the *_loc arrays in att_splits
# hold the 1-based image indices of each split
labels = res101['labels']
labels_train = labels[np.squeeze(att_splits['train_loc']-1)]
labels_val = labels[np.squeeze(att_splits['val_loc']-1)]
labels_trainval = labels[np.squeeze(att_splits['trainval_loc']-1)]
labels_test = labels[np.squeeze(att_splits['test_unseen_loc']-1)]

In [379]:
# unique class ids present in each split
train_labels_seen = np.unique(labels_train)
val_labels_unseen = np.unique(labels_val)
trainval_labels_seen = np.unique(labels_trainval)
test_labels_unseen = np.unique(labels_test)

In [380]:
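# Re-index the labels of each split to 0..(number of classes in the split - 1),
# in ascending order of the original class id.
# The loop variable is named class_id so it does not overwrite the `labels` array.
for new_id, class_id in enumerate(train_labels_seen):
    labels_train[labels_train == class_id] = new_id
for new_id, class_id in enumerate(val_labels_unseen):
    labels_val[labels_val == class_id] = new_id
for new_id, class_id in enumerate(trainval_labels_seen):
    labels_trainval[labels_trainval == class_id] = new_id
for new_id, class_id in enumerate(test_labels_unseen):
    labels_test[labels_test == class_id] = new_id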

test_label_classes = []
for i in test_labels_unseen:
  test_label_classes.append(att_splits['allclasses_names'][i-1][0][0])
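
The in-place re-indexing above is safe only because the class ids are processed in ascending order. As a hedged alternative, `np.unique` with `return_inverse=True` produces the same 0-based indices in one call. The toy labels below are made up for illustration.

```python
import numpy as np

labels_raw = np.array([[7], [3], [7], [12], [3]])  # toy class ids, shape (n, 1)

# classes: sorted unique ids; inverse: position of each label within `classes`
classes, inverse = np.unique(labels_raw.ravel(), return_inverse=True)
labels_indexed = inverse.reshape(-1, 1)            # same (n, 1) layout as the input
```

Here `classes[labels_indexed]` recovers the original ids, so the mapping back to class names is not lost.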

In [381]:
X_features = res101['features']
train_vec = X_features[:,np.squeeze(att_splits['train_loc']-1)]
val_vec = X_features[:,np.squeeze(att_splits['val_loc']-1)]
trainval_vec = X_features[:,np.squeeze(att_splits['trainval_loc']-1)]
test_vec = X_features[:,np.squeeze(att_splits['test_unseen_loc']-1)]


print("Features for train:", train_vec.shape)
print("Features for val:", val_vec.shape)
print("Features for trainval:", trainval_vec.shape)
print("Features for test:", test_vec.shape)

Features for train: (2048, 4906)
Features for val: (2048, 1026)
Features for trainval: (2048, 5932)
Features for test: (2048, 7924)


In [382]:
# Signature matrix: one column of attribute values per class (class ids are 1-based)
signature = att_splits['att']
train_sig = signature[:,(train_labels_seen)-1]
val_sig = signature[:,(val_labels_unseen)-1]
trainval_sig = signature[:,(trainval_labels_seen)-1]
test_sig = signature[:,(test_labels_unseen)-1]

print("Signature for train:", train_sig.shape)
print("Signature for val:", val_sig.shape)
print("Signature for trainval:", trainval_sig.shape)
print("Signature for test:", test_sig.shape)


Signature for train: (64, 15)
Signature for val: (64, 5)
Signature for trainval: (64, 20)
Signature for test: (64, 12)


In [383]:
#params for train and val set
m_train = labels_train.shape[0]
n_val = labels_val.shape[0]
z_train = len(train_labels_seen)
z1_val = len(val_labels_unseen)

#params for trainval and test set
m_trainval = labels_trainval.shape[0]
n_test = labels_test.shape[0]
z_trainval = len(trainval_labels_seen)
z1_test = len(test_labels_unseen)


# one-hot ground truth for the train set (used for hyperparameter tuning)
gt_train = np.zeros((m_train, z_train))
gt_train[np.arange(m_train), np.squeeze(labels_train)] = 1

# one-hot ground truth for the trainval set (used for the final model)
gt_trainval = np.zeros((m_trainval, z_trainval))
gt_trainval[np.arange(m_trainval), np.squeeze(labels_trainval)] = 1

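### **We will implement the model using this simple one-line solution**

$$V = (XX^\top + \gamma I)^{-1} \, X Y S^\top \, (SS^\top + \lambda I)^{-1}$$

Here $X$ holds the training features (one column per image), $Y$ the one-hot labels (one row per image), $S$ the attribute signatures of the seen classes (one column per class), and $\gamma$ and $\lambda$ are the regularisation strengths.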
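
The closed-form solution above can be sketched as a small function. This version uses `np.linalg.solve` instead of explicit (pseudo-)inverses, which is a substitution on my part and not what the notebook cells do. The function name `eszsl_fit` and the random toy data are hypothetical. Note that `gamma` and `lam` here are the regularisation strengths themselves, whereas the notebook stores exponents and passes `10**gamma` and `10**lamda`.

```python
import numpy as np

def eszsl_fit(X, Y, S, gamma, lam):
    """Closed-form weights V = (XX' + gamma I)^-1 X Y S' (SS' + lam I)^-1.

    X: (d, m) features, Y: (m, z) one-hot labels, S: (a, z) class signatures.
    """
    d, a = X.shape[0], S.shape[0]
    left = X @ X.T + gamma * np.eye(d)   # (d, d), positive definite for gamma > 0
    right = S @ S.T + lam * np.eye(a)    # (a, a), positive definite for lam > 0
    middle = X @ Y @ S.T                 # (d, a)
    V = np.linalg.solve(left, middle)    # applies left^-1 from the left
    V = np.linalg.solve(right.T, V.T).T  # applies right^-1 from the right
    return V

# Toy data with made-up sizes, only to exercise the function.
rng = np.random.default_rng(0)
d, m, a, z = 5, 20, 4, 3
X = rng.normal(size=(d, m))
S = rng.normal(size=(a, z))
Y = np.eye(z)[rng.integers(0, z, size=m)]  # one-hot rows
V = eszsl_fit(X, Y, S, gamma=10.0, lam=0.1)
```

For well-conditioned regularised matrices this should agree with the `pinv`-based cells up to floating-point error.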

In [384]:
#train set
d_train = train_vec.shape[0]
a_train = train_sig.shape[0]

#Weights
V = np.zeros((d_train,a_train))

In [385]:
#trainval set
d_trainval = trainval_vec.shape[0]
a_trainval = trainval_sig.shape[0]
W = np.zeros((d_trainval,a_trainval))

# Note: these hyperparameters were found with the tuning snippet further below.
# They are exponents: the regularisers actually used are 10**gamma and 10**lamda.
gamma = 3
lamda = -1

In [386]:
part_left = np.linalg.pinv(np.matmul(trainval_vec, trainval_vec.transpose()) + (10**gamma)*np.eye(d_trainval))
part_middle= np.matmul(np.matmul(trainval_vec,gt_trainval),trainval_sig.transpose())
part_right = np.linalg.pinv(np.matmul(trainval_sig, trainval_sig.transpose()) + (10**lamda)*np.eye(a_trainval))

W = np.matmul(np.matmul(part_left,part_middle),part_right)


### Making Predictions

In [387]:
outputs = np.matmul(np.matmul(test_vec.transpose(),W),test_sig)
preds = np.argmax(outputs, axis=1)  # highest-scoring unseen class for each test image

### Model Evaluation

In [388]:
cm = confusion_matrix(labels_test, preds)
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
avg = sum(cm.diagonal())/len(test_labels_unseen)

print("The mean per-class top-1 accuracy (%) is:", avg*100)

The mean per-class top-1 accuracy (%) is: 38.47890467016037


In [389]:
print(classification_report(labels_test, preds, target_names=test_label_classes))

              precision    recall  f1-score   support

         cow       0.14      0.40      0.21       197
       horse       0.53      0.33      0.41       306
   motorbike       0.20      0.93      0.33       297
      person       0.78      0.15      0.25      5071
 pottedplant       0.06      0.04      0.05       436
       sheep       0.30      0.27      0.28       234
       train       0.27      0.69      0.39       176
   tvmonitor       0.11      0.49      0.18       299
      donkey       0.31      0.24      0.28       139
        goat       0.24      0.14      0.18       163
      jetski       0.19      0.89      0.31       399
      statue       0.02      0.04      0.03       207

    accuracy                           0.25      7924
   macro avg       0.26      0.38      0.24      7924
weighted avg       0.57      0.25      0.24      7924
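
The 38.5 figure printed earlier is the mean of the per-class accuracies (the diagonal of the row-normalised confusion matrix). That is why it matches the macro-average recall (0.38) in the report rather than the overall accuracy (0.25), which is dominated by the large `person` class. A minimal NumPy sketch of the metric on made-up toy labels; the helper name `mean_per_class_accuracy` is hypothetical:

```python
import numpy as np

def mean_per_class_accuracy(y_true, y_pred, n_classes):
    """Average of the per-class accuracies, so every class counts equally."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    per_class = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():  # skip classes with no samples
            per_class.append(np.mean(y_pred[mask] == c))
    return float(np.mean(per_class))

y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
score = mean_per_class_accuracy(y_true, y_pred, n_classes=2)
plain_accuracy = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```

In this toy case class 0 scores 0.75 and class 1 scores 0.5, so the mean per-class accuracy is 0.625 while the plain accuracy is 4/6.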



**Hyperparameter Tuning**

In [186]:
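# Grid search over the exponents of the two regularisers:
# fit on the train classes, score on the validation classes.
accu = 0.10   # best validation accuracy so far (initial threshold)
gamma1 = 4    # best gamma exponent so far
lamda1 = 1    # best lambda exponent so far
for gamma in range(-2, 4):
    for lamda in range(-2,4):
        # One-line closed-form solution with regularisers 10**gamma and 10**lamda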
        part_left = np.linalg.pinv(np.matmul(train_vec, train_vec.transpose()) + (10**gamma)*np.eye(d_train))
        part_middle = np.matmul(np.matmul(train_vec,gt_train),train_sig.transpose())
        part_right = np.linalg.pinv(np.matmul(train_sig, train_sig.transpose()) + (10**lamda)*np.eye(a_train))

        V = np.matmul(np.matmul(part_left,part_middle),part_right)

        #predictions
        outputs = np.matmul(np.matmul(val_vec.transpose(),V),val_sig)
        preds = np.argmax(outputs, axis=1)

        cm = confusion_matrix(labels_val, preds)
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        avg = sum(cm.diagonal())/len(val_labels_unseen)
        print("Average Accuracy {}, gamma:{}, lambda:{}".format(avg, gamma, lamda))

        if avg > accu:
            accu = avg
            gamma1 = gamma
            lamda1 = lamda
print("-------------------------")         
print("Best Hyperparameters gamma: {} , lambda: {}".format(gamma1, lamda1))

Average Accuracy 0.530569855230006, gamma:2, lambda:2
Average Accuracy 0.5170873122068254, gamma:2, lambda:3
Average Accuracy 0.5489361826794676, gamma:3, lambda:2
Average Accuracy 0.5327763165316234, gamma:3, lambda:3
-------------------------
Best Hyperparameters gamma: 3 , lambda: 2


### Top-1 accuracy results on different datasets

| Dataset | Top-1 Accuracy (%) | Hyperparameters (exponents of 10) |
| --- | --- | --- |
| CUB | 51.3 | Gamma=2, Lambda=0 |
| AWA1 | 56.19 | Gamma=3, Lambda=0 |
| AWA2 | 54.5 | Gamma=3, Lambda=0 |
| SUN | 52.3 | Gamma=2, Lambda=2 |
| APY | 38.5 | Gamma=3, Lambda=-1 |