# Extracting coarse-grained classifiers from large CNN models

The following example shows how to extract coarse-grained classifiers from a fine-grained classifier pretrained on ImageNet.

For each desired coarse-grained class, we use all of its hyponyms (sub-categories) that are available as ImageNet classes. The hyponyms are found using WordNet.

Import all the necessary libraries:

In [1]:
import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
import numpy as np

from nltk.corpus import wordnet as wn
import nltk
from tensorflow.keras.applications.imagenet_utils import decode_predictions
from tensorflow.keras.utils import to_categorical

Below is the implementation of a function that finds all the hyponyms (sub-classes) of a given WordNet node. We use the NLTK WordNet implementation.

In [2]:
def find_hyponyms(hypernym_name):
    """
    Find all the ImageNet classes that are hyponyms of a given WordNet node
    (the hypernym).

    Parameters
    ----------
    hypernym_name - name of the WordNet node for which we want to find all the
    hyponyms available in ImageNet

    Returns
    -------
    ids_array - a NumPy array with the ImageNet ids of the matching fine-grained classes (hyponyms)
    """
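    # Use the first WordNet sense returned for the given name.
    hypernym = wn.synsets(hypernym_name)[0]
    # Transitive closure: all direct and indirect hyponyms of that sense.
    hyponyms = set(hypernym.closure(lambda s: s.hyponyms()))

    # One-hot rows make decode_predictions return the (wnid, name, score)
    # triple of every ImageNet class, in class-index order.
    one_hot = to_categorical(np.arange(1000), num_classes=1000)
    imagenet_classes = decode_predictions(one_hot, top=1)

    # A wnid is the letter 'n' followed by the WordNet offset.
    offsets = [int(c[0][0][1:]) for c in imagenet_classes]

    ids = []
    for idx, offset in enumerate(offsets):
        synset = wn.synset_from_pos_and_offset('n', offset)
        if synset in hyponyms:
            ids.append(idx)
    ids_array = np.array(ids)
    return ids_array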
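
To make the two steps of find_hyponyms easier to follow without downloading WordNet, here is a minimal sketch on a toy hierarchy: first collect every direct and indirect hyponym of a node, then keep the indices of the classes that fall inside that set. The hierarchy, the class list, the example id and the helper names (`transitive_hyponyms`, `wnid_to_offset`) are invented for illustration; only the id format (the letter `n` followed by a zero-padded offset) mirrors what the function above parses.

```python
# Toy hierarchy (hypothetical, not real WordNet data): node -> direct hyponyms.
TOY_HYPONYMS = {
    "animal": ["dog", "cat"],
    "dog": ["terrier", "hound"],
    "terrier": ["airedale"],
    "cat": ["tabby"],
}


def transitive_hyponyms(root, tree):
    """Return every descendant of `root`; the root itself is not included."""
    found = set()
    stack = list(tree.get(root, []))
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(tree.get(node, []))
    return found


def wnid_to_offset(wnid):
    """Strip the leading 'n' and parse the remaining digits as an integer."""
    return int(wnid[1:])


# Hypothetical fine-grained class list; the list position is the class index.
toy_classes = ["tabby", "airedale", "hound", "teapot"]

dog_descendants = transitive_hyponyms("dog", TOY_HYPONYMS)
dog_ids = [i for i, name in enumerate(toy_classes) if name in dog_descendants]
print(dog_ids)

# Made-up id, used only to show the parsing step.
print(wnid_to_offset("n00012345"))
```

As in this sketch, the closure used in find_hyponyms does not contain the starting node itself, so a fine-grained class that is exactly the hypernym would not be selected.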

Load an example model from Keras (MobileNetV2):

In [3]:
model = MobileNetV2(
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    input_shape=None,
    pooling='avg',  # has no effect when include_top=True
    classes=1000,
    classifier_activation="softmax"
)

Load a dataset. We use image_dataset_from_directory from tf.keras.preprocessing to build the dataset. The example below uses our own very small dataset; the results reported in our paper were obtained on the Kaggle Dogs vs. Cats dataset (see https://www.kaggle.com/c/dogs-vs-cats), which has the same format as our tiny example.

In [4]:
# set a correct image size for a network (MobileNetV2)
image_size = (224, 224)
batch_size = 32

test_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "PetImages",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
    label_mode='categorical',
    shuffle=False
)
normalized_ds = test_ds.map(lambda x, y: (preprocess_input(x), y))

# here, we read the labels for testing (correct only when shuffle is False)
labels = np.concatenate([y for x, y in normalized_ds], axis=0) 

Found 20 files belonging to 2 classes.

The function below can be used to create coarse-grained classifiers:

In [5]:
def create_coarse_grained_model(model, hypernyms):
    """ Function takes a pre-trained fine-grained CNN model as an argument and returns
    a coarse-grained model built on top of it. The new coarse-grained classes
    are placed in the same order as the corresponding hypernyms (coarse-grained classes)
    in the hypernyms list.
    
    Parameters
    ----------
    model - pre-trained Keras model with fine-grained classes
    hypernyms - list of hypernym names, one per coarse-grained class
    
    Returns
    -------
    new_dense_coarse - resulting Keras model with coarse-grained classes
    """
    coarse_grained_weights = []
    coarse_grained_bias = []

    for hyp in hypernyms:
        hyponyms = find_hyponyms(hypernym_name=hyp)
        weights_for_init = np.mean(model.layers[-1].get_weights()[0][:, hyponyms], axis=1)
        bias_for_init = np.mean(model.layers[-1].get_weights()[1][hyponyms], axis=0)

        coarse_grained_weights.append(weights_for_init)
        coarse_grained_bias.append(bias_for_init)

    new_weights = np.moveaxis(np.array(coarse_grained_weights), 0, -1)
    new_biases = np.array(coarse_grained_bias)   

    new_number_of_classes = len(hypernyms)
    new_dense_coarse = tf.keras.layers.Dense(new_number_of_classes, activation='softmax')(model.layers[-2].output)
    new_dense_coarse = tf.keras.models.Model(inputs=model.input, outputs=new_dense_coarse)
    new_dense_coarse.layers[-1].set_weights([new_weights, new_biases])
    
    return new_dense_coarse
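
The effect of averaging the weight columns can be checked on a small example. Because the last layer is linear before the softmax, the logit produced by the averaged weights and bias equals the mean of the fine-grained logits of the selected hyponyms. The NumPy sketch below verifies this with random values; the layer sizes and the two index groups are arbitrary assumptions, not values taken from MobileNetV2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 input features, 6 fine-grained classes.
n_features, n_fine = 8, 6
W = rng.normal(size=(n_features, n_fine))  # dense kernel, shape (features, classes)
b = rng.normal(size=n_fine)                # dense bias

# Hypothetical hyponym indices for two coarse-grained classes.
groups = [np.array([0, 1, 2]), np.array([3, 5])]

# Same construction as in create_coarse_grained_model: average the columns
# and the biases of the hyponyms of each coarse-grained class.
W_coarse = np.stack([W[:, g].mean(axis=1) for g in groups], axis=-1)
b_coarse = np.array([b[g].mean() for g in groups])

x = rng.normal(size=(4, n_features))       # a batch of 4 feature vectors
fine_logits = x @ W + b
coarse_logits = x @ W_coarse + b_coarse

# Mean of the fine-grained logits within each group.
mean_of_fine = np.stack([fine_logits[:, g].mean(axis=1) for g in groups], axis=-1)

print(W_coarse.shape)
print(np.allclose(coarse_logits, mean_of_fine))
```

Note that the identity holds for the logits only: the softmax output of the coarse-grained model is not, in general, the sum of the fine-grained class probabilities.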

Set the parameter values:
* hypernyms - a list of class names for the coarse-grained classifier. Each name has to be a WordNet node, and the model to be modified has to include at least some of its hyponyms (e.g. for the WordNet node dog, there are numerous ImageNet classes).

In [6]:
hypernyms = ['cat', 'dog']  # coarse-grained classes in alphabetical order (same order as the folder names)
coarse_grained_model = create_coarse_grained_model(model=model, hypernyms=hypernyms)
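
The order of the hypernyms matters because the position of each name becomes the output index of the coarse-grained model, while image_dataset_from_directory assigns label indices to the class folders in alphanumerical order. A minimal sketch of that correspondence, assuming two hypothetical sub-folders named `Cat` and `Dog`:

```python
# Hypothetical sub-folder names of the dataset directory (assumption).
folder_names = ["Dog", "Cat"]

# Alphanumerical sorting decides which folder gets label index 0, 1, ...
class_names = sorted(folder_names)
class_to_index = {name: idx for idx, name in enumerate(class_names)}
print(class_to_index)

# The hypernyms list has to follow the same order, otherwise the predicted
# indices and the label indices would refer to different classes.
hypernyms = ["cat", "dog"]
order_matches = [name.lower() for name in class_names] == hypernyms
print(order_matches)
```

Plain sorting is case-sensitive, so folder names with mixed capitalization are worth checking explicitly.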

We can now test our model:

In [7]:
from sklearn.metrics import accuracy_score, confusion_matrix
predictions = coarse_grained_model.predict(normalized_ds)
true_y = np.argmax(labels, axis=1)
predicted_y = np.argmax(predictions, axis=1)
print('Accuracy')
print(accuracy_score(true_y, predicted_y))
print('Confusion matrix')
print(confusion_matrix(true_y, predicted_y, normalize='true'))

Accuracy
0.7
Confusion matrix
[[0.9 0.1]
 [0.5 0.5]]
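
The reported numbers can be reproduced by hand. With 20 images, an accuracy of 0.7 and per-class recalls of 0.9 and 0.5 are consistent with 10 images per class (0.9 * 10 + 0.5 * 10 = 14 correct predictions out of 20). The NumPy sketch below rebuilds the row-normalized confusion matrix from label vectors; the individual predictions are made up to match the reported rates and are not the actual model outputs.

```python
import numpy as np

# Inferred split: 10 cat images (label 0) followed by 10 dog images (label 1).
true_y = np.array([0] * 10 + [1] * 10)
# Made-up predictions matching the reported rates: 9 of 10 cats and
# 5 of 10 dogs are classified correctly.
predicted_y = np.array([0] * 9 + [1] * 1 + [0] * 5 + [1] * 5)

# Count matrix: rows are true classes, columns are predicted classes.
counts = np.zeros((2, 2), dtype=int)
np.add.at(counts, (true_y, predicted_y), 1)

# normalize='true' in scikit-learn divides each row by its sum.
row_normalized = counts / counts.sum(axis=1, keepdims=True)
accuracy = np.trace(counts) / counts.sum()

print(accuracy)
print(row_normalized)
```

The diagonal of the row-normalized matrix holds the per-class recall, which is why the dog class (0.5) pulls the overall accuracy down even though cats are recognized well.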
