## Multi-Label Image Classification

**The aim of this model is to predict the genres of a movie from its poster image alone. A movie does not have to belong to a single category such as action or comedy; it can combine two or more genres. We therefore use multi-label image classification to predict the genres of a movie.**

In [None]:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.preprocessing import image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tqdm import tqdm
%matplotlib inline

In [None]:
# reading the csv file
data = pd.read_csv('multi_label_ds/train.csv')
train = data.copy()

In [None]:
train.shape

In [None]:
# printing first five rows of the file
train.head() 

In [None]:
train.dtypes

In [None]:
train.describe(include = 'all')

In [None]:
train.isnull().sum()

In [None]:
# print the column names of the training data
train.columns

**The first step in building any image classification model is to load and preprocess the data. So, let’s read in all the training images.**

In [None]:
train_image = []
for i in tqdm(range(train.shape[0])):
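    # load each poster resized to 400x400; target_size is (height, width)
    img = image.load_img('multi_label_ds/images/'+train['Id'][i]+'.jpg',target_size=(400,400))
    img = image.img_to_array(img)
    # scale pixel values to the [0, 1] range
    img = img/255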
    train_image.append(img)
X = np.array(train_image)

In [None]:
# view shape of the array
X.shape
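
Loading every poster into a single in-memory array can be expensive. The sketch below estimates the size of such an array. It assumes the pixels are stored as `float32` (an assumption about what `img_to_array` followed by the division produces in this setup), and the helper name `image_array_bytes` is made up for illustration.

```python
import numpy as np

def image_array_bytes(n_images, height=400, width=400, channels=3, dtype=np.float32):
    """Return the number of bytes a dense (n, height, width, channels) array occupies."""
    return n_images * height * width * channels * np.dtype(dtype).itemsize

# 7254 is the number of posters reported for this dataset
n_bytes = image_array_bytes(7254)
print(f"{n_bytes / 1024**3:.1f} GiB")
```

Under these assumptions the array needs roughly 13 GiB, so on a machine with less memory it may help to use a smaller `target_size` or to load the images in batches.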

There are 7254 poster images, and all of them have been resized to a shape of (400, 400, 3). Let’s plot and visualize one of the images:

In [None]:
plt.imshow(X[2])

This is the poster for the movie ‘Trading Places’. Let’s also print the genre of this movie:

In [None]:
train['Genre'][2]

The Genre column lists 'Comedy' for this movie; in general, a movie can carry several genres at once. Our model needs the true labels for all of these images. There are 25 possible genres in total, so each image has 25 targets, one per genre, each indicating whether the movie belongs to that genre or not. All 25 targets therefore take a value of either 0 or 1.
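
To make the 0/1 targets concrete, here is a minimal sketch of how a list of genres maps to such a vector. The five-genre vocabulary and the helper `multi_hot` are hypothetical and only for illustration; the real dataset already ships these columns, 25 of them.

```python
import numpy as np

# Hypothetical genre vocabulary; the real dataset has 25 genre columns
GENRES = ["Action", "Comedy", "Drama", "Romance", "Thriller"]

def multi_hot(movie_genres, vocabulary=GENRES):
    """Return a 0/1 vector with a 1 at the position of every genre of the movie."""
    index = {genre: i for i, genre in enumerate(vocabulary)}
    vec = np.zeros(len(vocabulary), dtype=np.int64)
    for genre in movie_genres:
        vec[index[genre]] = 1
    return vec

print(multi_hot(["Comedy", "Romance"]))  # [0 1 0 1 0]
```

Unlike one-hot encoding, more than one position can be 1 at the same time, which is what makes the problem multi-label.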

Now we will remove the **'Id'** and **'Genre'** columns from the training data and convert the remaining columns to an array, which will be the target for our images.

In [None]:
y = np.array(train.drop(['Id', 'Genre'],axis=1))
y.shape

The shape of the output array is (7254, 25) as we expected. Now, let’s create a validation set which will help us check the performance of our model on unseen data. We will randomly separate 20% of the images as our validation set.

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=30, test_size=0.2)
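
With a purely random split, a rare genre might end up with no positive example in one of the two sets. The sketch below shows one possible check on toy label arrays; the helper `labels_missing_from_split` is a made-up name, and the same function could be applied to `y_train` and `y_test`.

```python
import numpy as np

def labels_missing_from_split(y_split):
    """Return the indices of label columns that have no positive example."""
    return np.flatnonzero(y_split.sum(axis=0) == 0)

# Toy example with 3 labels; the third label is rare
y_train_toy = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 1]])
y_val_toy = np.array([[1, 0, 0], [0, 1, 0]])

print(labels_missing_from_split(y_train_toy))  # []
print(labels_missing_from_split(y_val_toy))    # [2]
```

A label that is missing from the validation set cannot be evaluated there, so it is worth knowing about before reading too much into the validation scores.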

The next step is to define the architecture of our model. The output layer will have 25 neurons (equal to the number of genres) and we’ll use sigmoid as the activation function.

In [None]:
model = Sequential()
model.add(Conv2D(filters=16, kernel_size=(5, 5), activation="relu", input_shape=(400,400,3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(filters=32, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(filters=64, kernel_size=(5, 5), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(25, activation='sigmoid'))
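
Why sigmoid rather than softmax in the output layer? Sigmoid scores every genre independently, while softmax forces the scores to compete and sum to 1. The NumPy sketch below illustrates the difference on made-up logits; it is not part of the model itself.

```python
import numpy as np

def sigmoid(z):
    """Element-wise sigmoid: each output is an independent probability."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Softmax: the outputs compete with each other and sum to 1."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Made-up logits for three genres; the first two are both strongly indicated
logits = np.array([2.0, 2.0, -1.0])
sig = sigmoid(logits)
soft = softmax(logits)
print(sig)   # both of the first two genres get a high probability
print(soft)  # the first two genres have to share the probability mass
```

With sigmoid, both of the first two genres can be predicted with high confidence at once; with softmax, neither can exceed 0.5 here.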

In [None]:
# print the model summary
model.summary()
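
The shapes reported in the summary can be checked by hand. The sketch below traces the spatial size through the four convolution and pooling blocks, assuming the Keras defaults used above (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`). The helper names are made up for illustration.

```python
def conv_out(size, kernel=5):
    """Output size of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of an unpadded max pooling with stride equal to the pool size."""
    return size // pool

def trace_spatial_size(size, n_blocks=4):
    """Return the spatial size after each conv + pool block."""
    sizes = []
    for _ in range(n_blocks):
        size = pool_out(conv_out(size))
        sizes.append(size)
    return sizes

sizes = trace_spatial_size(400)
flat_features = sizes[-1] * sizes[-1] * 64       # 64 filters in the last conv layer
dense_params = flat_features * 128 + 128         # weights plus biases of Dense(128)
print(sizes)          # [198, 97, 46, 21]
print(flat_features)  # 28224
print(dense_params)   # 3612800
```

Most of the parameters sit in the first dense layer, so that is the place to look if the model needs to shrink.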

Now, compile the model. We will use 'binary_crossentropy' as the loss function and 'adam' as the optimizer (you can also use other optimizers, such as 'Adamax' or 'Adagrad').

In [None]:
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
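
Binary cross-entropy treats each of the 25 outputs as its own yes/no problem and averages the per-label losses. The NumPy sketch below shows the idea on toy values. It is a simplified illustration rather than the Keras implementation, and the clipping constant `eps` is an assumption.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over all samples and labels."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return losses.mean()

y_true_toy = np.array([[1, 0, 1]])
loss_good = binary_crossentropy(y_true_toy, np.array([[0.9, 0.1, 0.8]]))
loss_bad = binary_crossentropy(y_true_toy, np.array([[0.1, 0.9, 0.2]]))
print(f"{loss_good:.4f}")  # small loss for predictions close to the labels
print(f"{loss_bad:.4f}")   # large loss for confident wrong predictions
```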

Now we come to the most interesting part: **training the model**. We will train the model for 20 epochs with a batch size of 64, and also pass the validation data we created earlier in order to validate the model’s performance:

In [None]:
model.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test), batch_size=64)
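
A note on reading the accuracy printed during training. If it is computed element-wise over all 25 labels (an assumption; how Keras resolves the string 'accuracy' can depend on the version), it can look high even for a useless model, because most labels are 0 for most movies. The sketch below uses made-up labels to show this.

```python
import numpy as np

def elementwise_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of individual 0/1 labels predicted correctly."""
    return np.mean((y_pred >= threshold).astype(int) == y_true)

# Made-up labels: 4 movies, 25 genres, 2 genres per movie
y_true_demo = np.zeros((4, 25), dtype=int)
for row, (a, b) in enumerate([(0, 3), (1, 7), (2, 3), (5, 9)]):
    y_true_demo[row, [a, b]] = 1

# A "model" that never predicts any genre
y_pred_demo = np.zeros((4, 25))
acc_all_zero = elementwise_accuracy(y_true_demo, y_pred_demo)
print(f"{acc_all_zero:.2f}")  # 0.92
```

Predicting no genre at all already scores 0.92 here, so it may be worth also tracking a metric such as per-label precision and recall.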

Before making predictions on a new poster, we need to preprocess it using the same steps we applied to the training images.

In [None]:
img = image.load_img('GOT.jpg',target_size=(400,400))
img = image.img_to_array(img)
img = img/255

Now we will predict the genres for this poster using our trained model. The model outputs a probability for each genre, and we take the three most probable ones.

In [None]:
classes = np.array(train.columns[2:])
proba = model.predict(img.reshape(1,400,400,3))
# indices of the three highest probabilities, in descending order
top_3 = np.argsort(proba[0])[:-4:-1]
for i in range(3):
    print("{} ({:.3})".format(classes[top_3[i]], proba[0][top_3[i]]))
plt.imshow(img)

Good! Our model suggests the genres Drama, Thriller and Action for Game of Thrones.
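
The slice `[:-4:-1]` used above is compact but easy to misread. The sketch below shows on made-up probabilities that it is equivalent to reversing the sorted indices and taking the first three. It also shows an alternative: keeping every genre above a threshold. The threshold of 0.5 and the helper names are assumptions for illustration, not part of the original workflow.

```python
import numpy as np

def top_k(proba, k=3):
    """Indices of the k largest probabilities, in descending order."""
    return np.argsort(proba)[::-1][:k]

def above_threshold(proba, threshold=0.5):
    """Indices of all labels whose probability reaches the threshold."""
    return np.flatnonzero(proba >= threshold)

proba_demo = np.array([0.10, 0.70, 0.05, 0.55, 0.30])
print(top_k(proba_demo))                 # [1 3 4]
print(np.argsort(proba_demo)[:-4:-1])    # [1 3 4]
print(above_threshold(proba_demo))       # [1 3]
```

Taking the top 3 always returns three genres, even when the third is unlikely; a threshold lets the number of predicted genres vary per poster.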