# Transfer Learning MNIST

* Train a simple convnet on the MNIST dataset the first 5 digits [0..4].
* Freeze convolutional layers and fine-tune dense layers for the classification of digits [5..9].

## 1. Import necessary libraries for the model

In [1]:
import numpy as np
import pandas as pd
import keras
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline

Using TensorFlow backend.


## 2. Import MNIST data and create 2 datasets with one dataset having digits from 0 to 4 and other from 5 to 9 

In [0]:
(trainX, trainY),(testX, testY) = tf.keras.datasets.mnist.load_data()

In [0]:
trainX = trainX.astype('float32')
testX = testX.astype('float32')

In [0]:
trainX_1 = trainX[trainY<5]
trainX_2 = trainX[trainY>=5]

trainY_1 = trainY[trainY<5]
trainY_2 = trainY[trainY>=5]

testX_1 = testX[testY<5]
testX_2 = testX[testY>=5]

testY_1 = testY[testY<5]
testY_2 = testY[testY>=5]

In [5]:
trainX_1.shape

(30596, 28, 28)

## 3. Print x_train, y_train, x_test and y_test for both the datasets

In [6]:
print(trainX_1.shape)
print(trainX_2.shape)
print(trainY_1.shape)
print(trainY_2.shape)

(30596, 28, 28)
(29404, 28, 28)
(30596,)
(29404,)


In [7]:
print(testX_1.shape)
print(testX_2.shape)
print(testY_1.shape)
print(testY_2.shape)

(5139, 28, 28)
(4861, 28, 28)
(5139,)
(4861,)


## ** 4. Let us take only the dataset (x_train, y_train, x_test, y_test) for Integers 0 to 4 in MNIST **
## Reshape x_train and x_test to a 4 Dimensional array (channel = 1) to pass it into a Conv2D layer

In [0]:
trainX_1 = np.reshape(trainX_1,(30596,28,28,1))
testX_1 = np.reshape(testX_1,(5139,28,28,1))

trainX_2 = np.reshape(trainX_2,(29404,28,28,1))
testX_2 = np.reshape(testX_2,(4861,28,28,1))

In [9]:
trainX_1.shape

(30596, 28, 28, 1)

## 5. Normalize x_train and x_test by dividing it by 255

In [0]:
trainX_1 = trainX_1/255
testX_1 = testX_1/255

trainX_2 = trainX_2/255
testX_2 = testX_2/255

In [11]:
np.unique(trainY_1)

array([0, 1, 2, 3, 4], dtype=uint8)

## 6. Use One-hot encoding to divide y_train and y_test into required no of output classes

In [0]:
trainY_1 = tf.keras.utils.to_categorical(trainY_1, num_classes=5)
testY_1 = tf.keras.utils.to_categorical(testY_1, num_classes=5)

In [0]:
trainY_2 = trainY_2 - 5
testY_2 = testY_2 - 5

In [0]:
trainY_2 = tf.keras.utils.to_categorical(trainY_2, num_classes=5)
testY_2 = tf.keras.utils.to_categorical(testY_2, num_classes=5)

## 7. Build a sequential model with 2 Convolutional layers with 32 kernels of size (3,3) followed by a Max pooling layer of size (2,2) followed by a drop out layer to be trained for classification of digits 0-4  

In [0]:
tf.keras.backend.clear_session()
#Initialize model
model = tf.keras.models.Sequential()

In [16]:
#Add first convolutional layer
model.add(tf.keras.layers.Conv2D(32, #Number of filters 
                                 kernel_size=(3,3), #Size of the filter
                                 activation='relu',input_shape=(28,28,1)))
#Add second convolutional layer
model.add(tf.keras.layers.Conv2D(32, #Number of filters 
                                 kernel_size=(3,3), #Size of the filter
                                 activation='relu'))
#Add MaxPooling layer
model.add(tf.keras.layers.MaxPool2D(pool_size=(2,2)))

#Add dropout layer
model.add(tf.keras.layers.Dropout(0.25))

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


## 8. Post that flatten the data and add 2 Dense layers with 128 neurons and neurons = output classes with activation = 'relu' and 'softmax' respectively. Add dropout layer inbetween if necessary  

In [0]:
#Flatten the output
model.add(tf.keras.layers.Flatten())

#Dense layer
model.add(tf.keras.layers.Dense(128, activation='relu'))

#Add another dropout layer
#model.add(tf.keras.layers.Dropout(0.25))

#Output layer
model.add(tf.keras.layers.Dense(5, activation='softmax'))

In [0]:
model.compile(optimizer='adam', 
              loss='categorical_crossentropy', metrics=['accuracy'])

In [19]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)        9248      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 32)        0         
_________________________________________________________________
dropout (Dropout)            (None, 12, 12, 32)        0         
_________________________________________________________________
flatten (Flatten)            (None, 4608)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               589952    
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 645       
Total para

## 9. Print the training and test accuracy

In [20]:
#Train the model
model.fit(trainX_1,trainY_1,          
          validation_data=(testX_1,testY_1),
          epochs=10,
          batch_size=32)

Train on 30596 samples, validate on 5139 samples
Instructions for updating:
Use tf.cast instead.
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fccf268edd0>

In [21]:
# Final Train Loss & Accuracy
model.evaluate(trainX_1,trainY_1)



[0.001138369440492378, 0.99964046]

In [22]:
# Final Validation Loss & Accuracy
model.evaluate(testX_1,testY_1)



[0.007715748299476817, 0.9978595]

## 10. Make only the dense layers to be trainable and convolutional layers to be non-trainable

In [23]:
#Freezing layers in the model which don't have 'dense' in their name
for layer in model.layers:
  if('dense' not in layer.name): #prefix detection to freeze layers which does not have dense
    #Freezing a layer
    layer.trainable = False
    
#Module to print colourful statements
from termcolor import colored

#Check which layers have been frozen 
for layer in model.layers:
  print (colored(layer.name, 'blue'))
  print (colored(layer.trainable, 'red'))

[34mconv2d[0m
[31mFalse[0m
[34mconv2d_1[0m
[31mFalse[0m
[34mmax_pooling2d[0m
[31mFalse[0m
[34mdropout[0m
[31mFalse[0m
[34mflatten[0m
[31mFalse[0m
[34mdense[0m
[31mTrue[0m
[34mdense_1[0m
[31mTrue[0m


## 11. Use the model trained on 0 to 4 digit classification and train it on the dataset which has digits 5 to 9  (Using Transfer learning keeping only the dense layers to be trainable)

In [24]:
#Train the model
model.fit(trainX_2,trainY_2,          
          validation_data=(testX_2,testY_2),
          epochs=10,
          batch_size=32)

Train on 29404 samples, validate on 4861 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fcce662ab50>

## 12. Print the accuracy for classification of digits 5 to 9

In [25]:
# Final Train Loss & Accuracy
model.evaluate(trainX_2,trainY_2)



[0.0026608804317516293, 0.99908173]

In [26]:
# Final Validation Loss & Accuracy
model.evaluate(testX_2,testY_2)



[0.03902185546897462, 0.9915655]

## Sentiment analysis <br> 

The objective of the second problem is to perform Sentiment analysis from the tweets data collected from the users targeted at various mobile devices.
Based on the tweet posted by a user (text), we will classify if the sentiment of the user targeted at a particular mobile device is positive or not.

### 13. Read the dataset (tweets.csv) and drop the NA's while reading the dataset

### 14. Preprocess the text and add the preprocessed text in a column with name `text` in the dataframe.

In [0]:
def preprocess(text):
    try:
        return text.decode('ascii')
    except Exception as e:
        return ""

In [0]:
data['text'] = [preprocess(text) for text in data.tweet_text]

### 15. Consider only rows having Positive emotion and Negative emotion and remove other rows from the dataframe.

### 16. Represent text as numerical data using `CountVectorizer` and get the document term frequency matrix

#### Use `vect` as the variable name for initialising CountVectorizer.

### 17. Find number of different words in vocabulary

#### Tip: To see all available functions for an Object use dir

### 18. Find out how many Positive and Negative emotions are there.

Hint: Use value_counts on that column

### 19. Change the labels for Positive and Negative emotions as 1 and 0 respectively and store in a different column in the same dataframe named 'Label'

Hint: use map on that column and give labels

### 20. Define the feature set (independent variable or X) to be `text` column and `labels` as target (or dependent variable)  and divide into train and test datasets

## 21. **Predicting the sentiment:**


### Use Naive Bayes and Logistic Regression and their accuracy scores for predicting the sentiment of the given text

## 22. Create a function called `tokenize_predict` which can take count vectorizer object as input and prints the accuracy for x (text) and y (labels)

In [0]:
def tokenize_test(vect):
    x_train_dtm = vect.fit_transform(x_train)
    print('Features: ', x_train_dtm.shape[1])
    x_test_dtm = vect.transform(x_test)
    nb = MultinomialNB()
    nb.fit(x_train_dtm, y_train)
    y_pred_class = nb.predict(x_test_dtm)
    print('Accuracy: ', metrics.accuracy_score(y_test, y_pred_class))

### Create a count vectorizer function which includes n_grams = 1,2  and pass it to tokenize_predict function to print the accuracy score

### Create a count vectorizer function with stopwords = 'english'  and pass it to tokenize_predict function to print the accuracy score

### Create a count vectorizer function with stopwords = 'english' and max_features =300  and pass it to tokenize_predict function to print the accuracy score

### Create a count vectorizer function with n_grams = 1,2  and max_features = 15000  and pass it to tokenize_predict function to print the accuracy score

### Create a count vectorizer function with n_grams = 1,2  and include terms that appear at least 2 times (min_df = 2)  and pass it to tokenize_predict function to print the accuracy score