# Transfer Learning MNIST

* Train a simple convnet on the first 5 digits [0..4] of the MNIST dataset.
* Freeze convolutional layers and fine-tune dense layers for the classification of digits [5..9].

## 1. Import necessary libraries for the model

In [115]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from keras import applications
from keras.models import Sequential, Model 
from keras import backend as k 
from keras.callbacks import ModelCheckpoint, EarlyStopping
from keras.layers import Conv2D, MaxPooling2D, Activation, Flatten, Dense, Dropout

## 2. Import MNIST data and create two datasets: one with digits 0 to 4 and the other with digits 5 to 9

In [116]:
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# create two datasets one with digits from 0 to 4 and one with 5 to 9
x_train_lt5 = x_train[y_train < 5]
y_train_lt5 = y_train[y_train < 5]
x_test_lt5 = x_test[y_test < 5]
y_test_lt5 = y_test[y_test < 5]

x_train_gte5 = x_train[y_train >= 5]
y_train_gte5 = y_train[y_train >= 5]
x_test_gte5 = x_test[y_test >= 5]
y_test_gte5 = y_test[y_test >= 5]

## 3. Print x_train, y_train, x_test and y_test for both the datasets

In [117]:
print("Dataset Samples: \n")
print("X Train < 5: ", x_train_lt5[x_train_lt5 > 0])
print("X Test < 5: ", x_test_lt5[x_test_lt5 > 0])
print("Y Train < 5: ", y_train_lt5[0])
print("Y Test < 5: ", y_test_lt5[0])
print("X Train >= 5: ", x_train_gte5[x_train_gte5 > 0])
print("X Test >=5 : ", x_test_gte5[x_test_gte5 > 0])
print("Y Train >= 5: ", y_train_gte5[0])
print("Y Test >= 5: ", y_test_gte5[0])

Dataset Samples: 

X Train < 5:  [ 51 159 253 ... 168 108  15]
X Test < 5:  [116 125 171 ... 255 230  38]
Y Train < 5:  0
Y Test < 5:  2
X Train >= 5:  [  3  18  18 ... 193 197 134]
X Test >=5 :  [ 84 185 159 ... 132 110   4]
Y Train >= 5:  5
Y Test >= 5:  7


## 4. Let us take only the dataset (x_train, y_train, x_test, y_test) for digits 0 to 4 in MNIST
## Reshape x_train and x_test to a 4-dimensional array (channel = 1) to pass it into a Conv2D layer

In [118]:
x_train_lt5_4d = np.expand_dims(x_train_lt5, axis = 3)
x_test_lt5_4d = np.expand_dims(x_test_lt5, axis = 3)

In [119]:
print("New Shape X Train: ", x_train_lt5_4d.shape)
print("New Shape X Test: ", x_test_lt5_4d.shape)

New Shape X Train:  (30596, 28, 28, 1)
New Shape X Test:  (5139, 28, 28, 1)


## 5. Normalize x_train and x_test by dividing it by 255

In [120]:
x_train_lt5_4d = x_train_lt5_4d/255
x_test_lt5_4d = x_test_lt5_4d/255

## 6. Use one-hot encoding to convert y_train and y_test into the required number of output classes

In [121]:
y_train_enc = pd.get_dummies(y_train_lt5)
y_test_enc = pd.get_dummies(y_test_lt5)

In [122]:
y_train_lt5 = y_train_enc
y_test_lt5 = y_test_enc

## 7. Build a sequential model with 2 convolutional layers with 32 kernels of size (3,3), followed by a max pooling layer of size (2,2) and a dropout layer, to be trained for classification of digits 0-4

In [126]:
# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
filters = 32
# size of pooling area for max pooling
pool_size = 2
# convolution kernel size
kernel_size = 3
# number of classes
num_classes = 5

conv_layers = [
    Conv2D(filters, kernel_size,
           padding='valid',
           input_shape=(28, 28, 1)),
    Activation('relu'),
    Conv2D(filters, kernel_size),
    Activation('relu'),
    MaxPooling2D(pool_size = pool_size),
    Dropout(0.25),
    Flatten(),
]

## 8. After that, flatten the data and add 2 dense layers: one with 128 neurons and activation = 'relu', and one with neurons = output classes and activation = 'softmax'. Add a dropout layer in between if necessary

In [129]:
#Referenced from a similar GIT Project
output_layers = [
    Dense(128),
    Activation('relu'),
    Dropout(0.5),
    Dense(num_classes),
    Activation('softmax')
]

# create complete model
model = Sequential(conv_layers + output_layers)

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.


In [130]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_5 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
activation_5 (Activation)    (None, 26, 26, 32)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 24, 24, 32)        9248      
_________________________________________________________________
activation_6 (Activation)    (None, 24, 24, 32)        0         
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 12, 12, 32)        0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 12, 12, 32)        0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 4608)              0         
__________

In [None]:
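# Compile before training; the loss matches the one-hot labels, the optimizer is an assumption
model.compile(loss = 'categorical_crossentropy',
              optimizer = 'adam',
              metrics = ['accuracy'])
# Train on the reshaped, normalized 4-D arrays expected by the Conv2D input
model.fit(x_train_lt5_4d, y_train_lt5,
          batch_size = 512,
          epochs = 10,
          verbose = 1,
          validation_data=(x_test_lt5_4d, y_test_lt5))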

In [None]:
model_score = model.evaluate(x_test_lt5_4d, y_test_lt5)

## 9. Print the training and test accuracy

In [0]:
print('Test score:', model_score[0])
print('Test accuracy:', model_score[1])

## 10. Make only the dense layers to be trainable and convolutional layers to be non-trainable
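A minimal sketch of this step, assuming the model is assembled from two layer lists as in steps 7 and 8. The layer sizes are shrunk for illustration and `Input` is used so the snippet stands alone; setting `trainable = False` on each layer in `conv_layers` before compiling is one common way to freeze them, not necessarily the only one.

```python
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential

# Illustrative stand-ins for the notebook's conv_layers / output_layers (sizes are assumptions).
conv_layers = [
    Conv2D(8, 3, activation='relu'),
    MaxPooling2D(pool_size=2),
    Flatten(),
]
output_layers = [
    Dense(16, activation='relu'),
    Dense(5, activation='softmax'),
]
model = Sequential([Input(shape=(28, 28, 1))] + conv_layers + output_layers)

# Freeze the feature-extraction layers; only the dense layers keep trainable weights.
for layer in conv_layers:
    layer.trainable = False

# Compile after changing `trainable` so the setting is used during training.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

print('Trainable weight tensors:', len(model.trainable_weights))
print('Frozen weight tensors:', len(model.non_trainable_weights))
```

The two dense layers contribute a kernel and a bias each, so four weight tensors remain trainable, while the convolution's kernel and bias are frozen.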

## 11. Use the model trained on 0 to 4 digit classification and train it on the dataset which has digits 5 to 9  (Using Transfer learning keeping only the dense layers to be trainable)
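One possible shape for this step is sketched below. The helper `fine_tune` is hypothetical, the tiny model and the random images are placeholders so the snippet runs on its own, and the label shift (`digits - offset`) is an assumption about how digits 5..9 are mapped onto the five existing output units.

```python
import numpy as np
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential
from keras.utils import to_categorical

def fine_tune(model, images, digits, offset=5, num_classes=5, epochs=1):
    """Fit `model` on digits offset..offset+num_classes-1 (hypothetical helper)."""
    x = np.expand_dims(images, axis=3).astype('float32') / 255  # (N, 28, 28, 1) in [0, 1]
    y = to_categorical(digits - offset, num_classes=num_classes)  # digits 5..9 -> classes 0..4
    return model.fit(x, y, batch_size=32, epochs=epochs, verbose=0)

# Tiny stand-in model with a frozen convolutional layer (sizes are assumptions).
conv = Conv2D(8, 3, activation='relu')
conv.trainable = False
model = Sequential([Input(shape=(28, 28, 1)), conv, MaxPooling2D(2), Flatten(),
                    Dense(16, activation='relu'), Dense(5, activation='softmax')])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Random placeholder data; in the notebook, pass x_train_gte5 and y_train_gte5 instead.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(64, 28, 28), dtype=np.uint8)
digits = rng.integers(5, 10, size=64)

frozen_before = conv.get_weights()[0].copy()
history = fine_tune(model, images, digits)
print('Final training loss:', history.history['loss'][-1])
print('Conv kernel unchanged:', np.array_equal(frozen_before, conv.get_weights()[0]))
```

Comparing the convolution kernel before and after training is a quick way to confirm that only the dense layers were updated.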

## 12. Print the accuracy for classification of digits 5 to 9
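If the model was compiled with `metrics=['accuracy']`, `model.evaluate` returns the loss and the accuracy. The accuracy can also be computed by hand from predicted probabilities, as in the sketch below. The `probs` array is made up for illustration (in the notebook it would come from `model.predict`), and the `+ 5` shift assumes output unit 0 stands for digit 5.

```python
import numpy as np

# Made-up softmax outputs for four images (one row per image, one column per output unit).
probs = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.60, 0.10, 0.10],
    [0.20, 0.20, 0.20, 0.30, 0.10],
    [0.05, 0.05, 0.10, 0.10, 0.70],
])
y_true = np.array([5, 7, 9, 9])

# Most probable output unit, shifted back to the digit range 5..9.
y_pred = probs.argmax(axis=1) + 5
accuracy = (y_pred == y_true).mean()

print('Predicted digits:', y_pred)
print('Accuracy for digits 5 to 9:', accuracy)
```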

## Sentiment analysis

The objective of the second problem is to perform sentiment analysis on tweets that users posted about various mobile devices.
Based on the text of a tweet, we will classify whether the user's sentiment toward a particular mobile device is positive or not.

### 13. Read the dataset (tweets.csv) and drop the NAs while reading the dataset
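A small sketch of reading and cleaning in one chained expression. An in-memory buffer stands in for `tweets.csv` so the snippet runs on its own; only the `tweet_text` column appears in this notebook, so the `emotion` column name and the rows themselves are assumptions.

```python
import io
import pandas as pd

# Stand-in for tweets.csv; `emotion` is a hypothetical column name.
csv_file = io.StringIO(
    'tweet_text,emotion\n'
    'I love my new phone,Positive emotion\n'
    'Battery died again,Negative emotion\n'
    ',Positive emotion\n'
    'No label on this one,\n'
)

# Read the file and drop every row with a missing value in one step.
data = pd.read_csv(csv_file).dropna()
print(data.shape)
```

With the real file, the buffer would be replaced by the path, for example `pd.read_csv('tweets.csv').dropna()`.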

### 14. Preprocess the text and add the preprocessed text in a column with name `text` in the dataframe.

In [0]:
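def preprocess(text):
    # Keep a tweet only if it is pure ASCII; anything else becomes an empty string
    try:
        text.encode('ascii')
        return text
    except (UnicodeEncodeError, AttributeError):
        return ""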

In [0]:
data['text'] = [preprocess(text) for text in data.tweet_text]

### 15. Consider only rows having Positive emotion and Negative emotion and remove other rows from the dataframe.
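One way to keep only the two classes is a boolean mask built with `isin`. The DataFrame below is made up, and `emotion` is a hypothetical name for the column holding the sentiment label.

```python
import pandas as pd

# Made-up rows; `emotion` is a hypothetical column name.
data = pd.DataFrame({
    'text': ['love it', 'hate it', 'just a phone', 'not sure'],
    'emotion': ['Positive emotion', 'Negative emotion',
                'Some other label', 'Another label'],
})

# Keep rows whose label is one of the two classes of interest.
keep = ['Positive emotion', 'Negative emotion']
data = data[data['emotion'].isin(keep)].reset_index(drop=True)
print(data)
```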

### 16. Represent text as numerical data using `CountVectorizer` and get the document term frequency matrix

#### Use `vect` as the variable name for initialising CountVectorizer.
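A minimal sketch on three made-up sentences. With default settings, `CountVectorizer` lowercases the text and ignores one-character tokens, so "I" does not become a column.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Made-up documents standing in for the tweet text.
texts = ['I love my new phone', 'my phone battery died', 'love the battery']

vect = CountVectorizer()
dtm = vect.fit_transform(texts)  # sparse matrix: one row per document, one column per term

print(dtm.shape)
print(dtm.toarray())
```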

### 17. Find number of different words in vocabulary

#### Tip: To see all available functions for an Object use dir
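As a sketch on made-up sentences, the fitted `vocabulary_` attribute maps each term to its column index, so its length is the number of distinct words. The `dir` filter shown is just one way to narrow the listing down to fitted attributes.

```python
from sklearn.feature_extraction.text import CountVectorizer

vect = CountVectorizer()
vect.fit(['I love my new phone', 'my phone battery died', 'love the battery'])

# vocabulary_ maps term -> column index, so its length is the vocabulary size.
print('Vocabulary size:', len(vect.vocabulary_))

# dir() lists everything available on the object; fitted attributes end with "_".
print([name for name in dir(vect) if name.endswith('_') and not name.startswith('_')])
```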

### 18. Find out how many Positive and Negative emotions are there.

Hint: Use value_counts on that column
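For example, on a made-up column (the name `emotion` is hypothetical):

```python
import pandas as pd

data = pd.DataFrame({'emotion': ['Positive emotion', 'Negative emotion',
                                 'Positive emotion', 'Positive emotion']})

# value_counts returns one count per distinct value, most frequent first.
counts = data['emotion'].value_counts()
print(counts)
```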

### 19. Change the labels for Positive and Negative emotions as 1 and 0 respectively and store in a different column in the same dataframe named 'Label'

Hint: use map on that column and give labels
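A short sketch of the mapping, again on made-up rows with a hypothetical `emotion` column:

```python
import pandas as pd

data = pd.DataFrame({'emotion': ['Positive emotion', 'Negative emotion', 'Positive emotion']})

# Map each class name to its numeric label and store it in a new column.
data['Label'] = data['emotion'].map({'Positive emotion': 1, 'Negative emotion': 0})
print(data)
```

Any value missing from the dictionary would become `NaN`, which is why the rows are filtered to the two classes first.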

### 20. Define the feature set (independent variable or X) to be the `text` column and the `Label` column as the target (or dependent variable), and divide into train and test datasets
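A sketch of the split on a made-up eight-row frame. The `random_state` value is an arbitrary choice, and the default split keeps 25% of the rows for testing.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up frame with the two columns used from here on.
data = pd.DataFrame({'text': [f'tweet {i}' for i in range(8)],
                     'Label': [1, 0] * 4})

X = data['text']
y = data['Label']

# Default test_size is 0.25; random_state makes the split reproducible.
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=1)
print(len(x_train), len(x_test))
```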

## 21. **Predicting the sentiment:**


### Use Naive Bayes and Logistic Regression to predict the sentiment of the given text, and report their accuracy scores
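The sketch below fits both classifiers on the same document-term matrix. The twelve tweets are invented for illustration, so the printed accuracies say nothing about the real dataset; `max_iter=1000` and the stratified split are assumptions, not settings taken from the notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Invented tweets: six positive (1) followed by six negative (0).
texts = ['love my new phone', 'great battery life on this tablet',
         'love the camera on this phone', 'great screen and great sound',
         'this tablet is awesome', 'awesome phone love it',
         'hate the battery on this phone', 'terrible screen on my tablet',
         'my phone keeps crashing', 'hate this terrible camera',
         'battery died again terrible', 'crashing tablet hate it']
labels = [1] * 6 + [0] * 6

x_train, x_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0)

# Learn the vocabulary on the training text only, then reuse it for the test text.
vect = CountVectorizer()
x_train_dtm = vect.fit_transform(x_train)
x_test_dtm = vect.transform(x_test)

scores = {}
for name, clf in [('Naive Bayes', MultinomialNB()),
                  ('Logistic Regression', LogisticRegression(max_iter=1000))]:
    clf.fit(x_train_dtm, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(x_test_dtm))
    print(name, 'accuracy:', scores[name])
```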

## 22. Create a function called `tokenize_predict` which takes a count vectorizer object as input and prints the accuracy for x (text) and y (labels)

In [0]:
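from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics

def tokenize_predict(vect):
    # Build document-term matrices from the train/test text using the given vectorizer
    x_train_dtm = vect.fit_transform(x_train)
    print('Features: ', x_train_dtm.shape[1])
    x_test_dtm = vect.transform(x_test)
    # Fit Naive Bayes on the training matrix and score it on the test matrix
    nb = MultinomialNB()
    nb.fit(x_train_dtm, y_train)
    y_pred_class = nb.predict(x_test_dtm)
    print('Accuracy: ', metrics.accuracy_score(y_test, y_pred_class))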

### Create a count vectorizer with `ngram_range = (1, 2)` and pass it to the tokenize_predict function to print the accuracy score

### Create a count vectorizer with `stop_words = 'english'` and pass it to the tokenize_predict function to print the accuracy score

### Create a count vectorizer with `stop_words = 'english'` and `max_features = 300` and pass it to the tokenize_predict function to print the accuracy score

### Create a count vectorizer with `ngram_range = (1, 2)` and `max_features = 15000` and pass it to the tokenize_predict function to print the accuracy score

### Create a count vectorizer with `ngram_range = (1, 2)` that includes only terms appearing in at least 2 documents (`min_df = 2`) and pass it to the tokenize_predict function to print the accuracy score
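The five configurations can be compared in one loop, as sketched below. To keep the snippet runnable on its own, it uses invented tweets and a variant of the function that also returns the feature count and accuracy; the real notebook would use its own train and test splits.

```python
from sklearn import metrics
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Invented tweets: six positive (1) followed by six negative (0).
texts = ['love my new phone', 'great battery life on this tablet',
         'love the camera on this phone', 'great screen and great sound',
         'this tablet is awesome', 'awesome phone love it',
         'hate the battery on this phone', 'terrible screen on my tablet',
         'my phone keeps crashing', 'hate this terrible camera',
         'battery died again terrible', 'crashing tablet hate it']
labels = [1] * 6 + [0] * 6
x_train, x_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0)

def tokenize_predict(vect):
    """Variant for illustration: prints and also returns (n_features, accuracy)."""
    x_train_dtm = vect.fit_transform(x_train)
    x_test_dtm = vect.transform(x_test)
    nb = MultinomialNB().fit(x_train_dtm, y_train)
    accuracy = metrics.accuracy_score(y_test, nb.predict(x_test_dtm))
    print('Features:', x_train_dtm.shape[1], 'Accuracy:', accuracy)
    return x_train_dtm.shape[1], accuracy

variants = {
    'ngram_range=(1, 2)': CountVectorizer(ngram_range=(1, 2)),
    "stop_words='english'": CountVectorizer(stop_words='english'),
    "stop_words='english', max_features=300":
        CountVectorizer(stop_words='english', max_features=300),
    'ngram_range=(1, 2), max_features=15000':
        CountVectorizer(ngram_range=(1, 2), max_features=15000),
    'ngram_range=(1, 2), min_df=2': CountVectorizer(ngram_range=(1, 2), min_df=2),
}

results = {}
for name, vect in variants.items():
    print(name)
    results[name] = tokenize_predict(vect)
```

On a corpus this small, `max_features` never reaches its cap, so its effect only becomes visible on the full tweet data; `min_df=2` does shrink the bigram vocabulary even here.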