# Transfer Learning CIFAR10

* Train a simple convnet on the first 5 classes [0..4] of the CIFAR10 dataset.
* Freeze the convolutional layers and fine-tune the dense layers on the last 5 output classes [5..9].


### 1. Import CIFAR10 data and create 2 datasets with one dataset having classes from 0 to 4 and other having classes from 5 to 9 

In [0]:
#Importing important modules
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
import numpy as np

In [5]:
(trainX, trainY),(testX, testY) = cifar10.load_data()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


In [52]:
trainX.shape

(50000, 32, 32, 3)

In [61]:
testX.shape[0]

10000

In [0]:
X1_train = []
Y1_train = []
X2_train = []
Y2_train = []
X1_test = [] 
Y1_test = []
X2_test = []
Y2_test = []

In [0]:
for ix in range(trainX.shape[0]):
    if trainY[ix] < 5:
        # put data in set 1
        X1_train.append(trainX[ix])
        Y1_train.append(trainY[ix])
    else:
        # put data in set 2
        X2_train.append(trainX[ix])
        Y2_train.append(trainY[ix])


In [0]:
for ix in range(testX.shape[0]):
    if testY[ix] < 5:
        # put data in set 1
        X1_test.append(testX[ix])
        Y1_test.append(testY[ix])
    else:
        # put data in set 2
        X2_test.append(testX[ix])
        Y2_test.append(testY[ix])

In [0]:
X1_train = np.asarray(X1_train).reshape((-1, 32, 32, 3))
X1_test = np.asarray(X1_test).reshape((-1, 32, 32, 3))
X2_train = np.asarray(X2_train).reshape((-1, 32, 32, 3))
X2_test = np.asarray(X2_test).reshape((-1, 32, 32, 3))

In [78]:
print(X1_train.shape)
print(X1_test.shape)
print(X2_train.shape)
print(X2_test.shape)

(25000, 32, 32, 3)
(5000, 32, 32, 3)
(25000, 32, 32, 3)
(5000, 32, 32, 3)


In [0]:
X1_train = X1_train.astype('float32')/255
X1_test = X1_test.astype('float32')/255
X2_train = X2_train.astype('float32')/255
X2_test = X2_test.astype('float32')/255
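The loop-based split above can also be written with NumPy boolean masks. The following is a minimal sketch on synthetic data, not the notebook's own code: the arrays `images` and `labels` are hypothetical stand-ins for `trainX` and `trainY`, assumed to have the same layout (labels shaped `(N, 1)`).

```python
import numpy as np

# Synthetic stand-in for CIFAR10: 20 random "images" with labels shaped (N, 1),
# the same label layout that cifar10.load_data() returns.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(20, 32, 32, 3), dtype=np.uint8)
labels = (np.arange(20) % 10).reshape(-1, 1)

# Flatten the (N, 1) labels so the comparison yields a 1-D boolean mask.
low_mask = labels.ravel() < 5

# Index with the mask (and its negation) and scale pixels to [0, 1].
x_low = images[low_mask].astype('float32') / 255
x_high = images[~low_mask].astype('float32') / 255
y_low = labels[low_mask]
y_high = labels[~low_mask] - 5  # shift classes 5..9 down to 0..4

print(x_low.shape, x_high.shape)
```

Masking avoids the intermediate Python lists and the later `reshape`, since indexing preserves the image dimensions.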


### 2. Use one-hot encoding to convert y_train and y_test into the required number of output classes

In [0]:
import keras

In [0]:
Y1_train = keras.utils.to_categorical(Y1_train, 5)
Y1_test = keras.utils.to_categorical(Y1_test, 5)

In [0]:
y_train_gt5 = trainY[trainY >= 5] - 5
y_test_gt5 = testY[testY >= 5] - 5

In [0]:
Y2_train = keras.utils.to_categorical(y_train_gt5, 5)
Y2_test = keras.utils.to_categorical(y_test_gt5, 5)
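To make the encoding step concrete, here is a small NumPy sketch of what one-hot encoding produces for the second label set. The helper `one_hot` is a hypothetical illustration, not the Keras implementation; it assumes integer labels already shifted into the range 0..4.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an (N, num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels).ravel()
    return np.eye(num_classes, dtype='float32')[labels]

# Raw CIFAR10 labels 5, 7 and 9 become 0, 2 and 4 after subtracting 5.
raw = np.array([[5], [7], [9]])
encoded = one_hot(raw - 5, 5)
print(encoded)
```

Without the subtraction, a label such as 9 would fall outside the 5 available columns, which is why the labels for classes 5 to 9 are shifted before encoding.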

### 3. Build a sequential neural network model which can classify the classes 0 to 4 of CIFAR10 dataset with at least 80% accuracy on test data

In [0]:
#Initialize the model
model = Sequential()

model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=(32,32,3),name='conv_1'))
model.add(Conv2D(32, (3, 3), activation='relu',name='conv_2'))

#Add a MaxPooling Layer of size 2X2 
model.add(MaxPooling2D(pool_size=(2, 2),name='max_1'))

#Apply Dropout with 0.25 probability 
model.add(Dropout(0.25,name='drop_1'))
model.add(Conv2D(64, (3, 3), activation='relu',name='conv_3'))
model.add(Conv2D(64, (3, 3),activation='relu',name='conv_4'))
model.add(MaxPooling2D(pool_size=(2, 2),name='max_2'))
model.add(Dropout(0.25,name='drop_2'))

#Flatten the layer
model.add(Flatten())

#Add Fully Connected Layer with 512 units and activation function as 'ReLU'
model.add(Dense(512, activation='relu',name='dense_1'))
model.add(Dropout(0.5,name='drop_3'))
#Add Fully Connected Layer with 5 units (one per class) and activation function as 'softmax'
model.add(Dense(5, activation='softmax',name='dense_2'))

In [97]:
model.summary()

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv_1 (Conv2D)              (None, 30, 30, 32)        896       
_________________________________________________________________
conv_2 (Conv2D)              (None, 28, 28, 32)        9248      
_________________________________________________________________
max_1 (MaxPooling2D)         (None, 14, 14, 32)        0         
_________________________________________________________________
drop_1 (Dropout)             (None, 14, 14, 32)        0         
_________________________________________________________________
conv_3 (Conv2D)              (None, 12, 12, 64)        18496     
_________________________________________________________________
conv_4 (Conv2D)              (None, 10, 10, 64)        36928     
_________________________________________________________________
max_2 (MaxPooling2D)         (None, 5, 5, 64)         
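The parameter counts and output shapes in the summary can be checked by hand. The sketch below assumes the Keras defaults used in the model definition (stride 1, `'valid'` padding, one bias per filter); the helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def valid_size(size, kernel):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

# (layer name, input channels, filters) for the four 3x3 convolutions.
layers = [('conv_1', 3, 32), ('conv_2', 32, 32), ('conv_3', 32, 64), ('conv_4', 64, 64)]
param_counts = {name: conv2d_params(3, 3, cin, cout) for name, cin, cout in layers}
print(param_counts)

size = 32
size = valid_size(valid_size(size, 3), 3) // 2  # conv_1, conv_2, max_1
size = valid_size(valid_size(size, 3), 3) // 2  # conv_3, conv_4, max_2
print(size)
```

The computed counts match the `Param #` column for the four convolutional layers, and the final spatial size matches the `(None, 5, 5, 64)` output of `max_2`.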

In [0]:
from keras.optimizers import Adam
from keras.losses import categorical_crossentropy

#To use adam optimizer for learning weights with learning rate = 0.001
optimizer = Adam(lr=0.001)
#Set the loss function and optimizer for the model training
model.compile(loss=categorical_crossentropy,
              optimizer=optimizer,
              metrics=['accuracy'])

In [103]:
model.fit(X1_train, Y1_train,
          batch_size=128,
          epochs=10,
          verbose=1,
          validation_data=(X1_test, Y1_test))

Train on 25000 samples, validate on 5000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f00b69b9198>

### 4. In the model which was built above (for classification of classes 0-4 in CIFAR10), make only the dense layers to be trainable and conv layers to be non-trainable

In [0]:
model2 = Sequential()

model2.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=(32,32,3),name='conv_1'))
model2.add(Conv2D(32, (3, 3), activation='relu',name='conv_2'))

#Add a MaxPooling Layer of size 2X2 
model2.add(MaxPooling2D(pool_size=(2, 2),name='max_1'))

#Apply Dropout with 0.25 probability 
model2.add(Dropout(0.25,name='drop_1'))
model2.add(Conv2D(64, (3, 3), activation='relu',name='conv_3'))
model2.add(Conv2D(64, (3, 3),activation='relu',name='conv_4'))
model2.add(MaxPooling2D(pool_size=(2, 2),name='max_2'))
model2.add(Dropout(0.25,name='drop_2'))

#Flatten the layer
model2.add(Flatten())

#Add Fully Connected Layer with 512 units and activation function as 'ReLU'
model2.add(Dense(512, activation='relu',name='dense_1'))
model2.add(Dropout(0.5,name='drop_3'))
#Add Fully Connected Layer with 5 units (one per class) and activation function as 'softmax'
model2.add(Dense(5, activation='softmax',name='dense_2'))

In [0]:
for layer in model2.layers:
  if('dense' not in layer.name): #name check: freeze every layer whose name does not contain 'dense'
    #Freezing a layer
    layer.trainable = False
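After this freeze, only `dense_1` and `dense_2` still carry trainable weights. As a rough check, the split between trainable and frozen parameters can be worked out from the layer sizes; this sketch takes the flattened size from the `(None, 5, 5, 64)` output of `max_2` and the convolution counts from the summary of the first model. The helper `dense_params` is a hypothetical name.

```python
def dense_params(inputs, units):
    """Weight matrix plus one bias per unit."""
    return inputs * units + units

# Flatten turns the (5, 5, 64) feature map into a vector of 1600 values.
flatten_units = 5 * 5 * 64

# Parameters that keep training: dense_1 (512 units) and dense_2 (5 units).
trainable = dense_params(flatten_units, 512) + dense_params(512, 5)

# Parameters that are frozen: the four convolutional layers.
frozen = 896 + 9248 + 18496 + 36928

print(trainable, frozen)
```

Pooling, dropout and flatten layers have no weights, so setting `trainable = False` on them has no effect on training.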

In [106]:
#Module to print colourful statements
from termcolor import colored

#Check which layers have been frozen 
for layer in model2.layers:
  print (colored(layer.name, 'blue'))
  print (colored(layer.trainable, 'red'))

[34mconv_1[0m
[31mFalse[0m
[34mconv_2[0m
[31mFalse[0m
[34mmax_1[0m
[31mFalse[0m
[34mdrop_1[0m
[31mFalse[0m
[34mconv_3[0m
[31mFalse[0m
[34mconv_4[0m
[31mFalse[0m
[34mmax_2[0m
[31mFalse[0m
[34mdrop_2[0m
[31mFalse[0m
[34mflatten_5[0m
[31mFalse[0m
[34mdense_1[0m
[31mTrue[0m
[34mdrop_3[0m
[31mFalse[0m
[34mdense_2[0m
[31mTrue[0m


### 5. Use the model trained on CIFAR10 (classes 0 to 4) to classify classes 5 to 9 of CIFAR10 (transfer learning)

Achieve an accuracy of more than 85% on test data.

In [0]:
model2.set_weights(model.get_weights())

In [0]:
model2.compile(loss=categorical_crossentropy,
              optimizer=optimizer,
              metrics=['accuracy'])

In [109]:
#Training on the dataset
model2.fit(X2_train, Y2_train,
          batch_size=128,
          epochs=15,
          verbose=1,
          validation_data=(X2_test, Y2_test))

Train on 25000 samples, validate on 5000 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<keras.callbacks.History at 0x7f00b660b780>

## Sentiment analysis

The objective of the second problem is to perform sentiment analysis on tweets that users posted about various mobile devices.
Based on the text of a tweet, we classify whether the sentiment the user directs at a particular mobile device is positive or negative.

### 6. Read the dataset (tweets.csv) and drop rows with missing values (NA) while reading it

In [0]:
import pandas as pd
data = pd.read_csv('./tweets.csv', encoding = "ISO-8859-1").dropna()

In [111]:
data.shape

(3291, 3)

In [112]:
data.head()

|   | tweet_text | emotion_in_tweet_is_directed_at | is_there_an_emotion_directed_at_a_brand_or_product |
|---|---|---|---|
| 0 | .@wesley83 I have a 3G iPhone. After 3 hrs twe... | iPhone | Negative emotion |
| 1 | @jessedee Know about @fludapp ? Awesome iPad/i... | iPad or iPhone App | Positive emotion |
| 2 | @swonderlin Can not wait for #iPad 2 also. The... | iPad | Positive emotion |
| 3 | @sxsw I hope this year's festival isn't as cra... | iPad or iPhone App | Negative emotion |
| 4 | @sxtxstate great stuff on Fri #SXSW: Marissa M... | Google | Positive emotion |


### Consider only rows having Positive emotion and Negative emotion and remove other rows from the dataframe.

In [0]:
data = data[(data['is_there_an_emotion_directed_at_a_brand_or_product'] == 'Positive emotion') | (data['is_there_an_emotion_directed_at_a_brand_or_product'] == 'Negative emotion')]

In [114]:
data.shape

(3191, 3)
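The same row filter can be expressed with `Series.isin`, which avoids repeating the long column name. This is a sketch on a toy frame; the two non-polar label strings are hypothetical placeholders, since the notebook does not show which other values the column contains.

```python
import pandas as pd

col = 'is_there_an_emotion_directed_at_a_brand_or_product'

# Toy frame; the last two label values are made up for illustration.
toy = pd.DataFrame({
    'tweet_text': ['love it', 'hate it', 'no idea', 'just a link'],
    col: ['Positive emotion', 'Negative emotion', 'other label A', 'other label B'],
})

# Keep only rows whose label is one of the two polar classes.
kept = toy[toy[col].isin(['Positive emotion', 'Negative emotion'])]
print(kept.shape)
```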

### 7. Represent text as numerical data using `CountVectorizer` and get the document term frequency matrix

#### Use `vect` as the variable name for initialising CountVectorizer.

In [0]:
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

In [0]:
X = data.tweet_text
y=data.is_there_an_emotion_directed_at_a_brand_or_product
# split the new DataFrame into training and testing sets [Default test size = 25%]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

In [0]:
vect = CountVectorizer()
X_train_dtm = vect.fit_transform(X_train)
X_test_dtm = vect.transform(X_test)
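To see what the document-term matrix looks like, here is a minimal sketch on three made-up sentences (the toy documents and the name `toy_vect` are assumptions, not part of the dataset). It also shows that `transform` ignores words that were not seen during `fit_transform`.

```python
from sklearn.feature_extraction.text import CountVectorizer

train_docs = ['my iphone crashes', 'love my ipad']
test_docs = ['my new iphone']

toy_vect = CountVectorizer()
train_dtm = toy_vect.fit_transform(train_docs)  # learn vocabulary, then count
test_dtm = toy_vect.transform(test_docs)        # count using the learned vocabulary only

# Order the terms by their column index in the matrix.
terms = sorted(toy_vect.vocabulary_, key=toy_vect.vocabulary_.get)
print(terms)
print(train_dtm.toarray())
print(test_dtm.toarray())
```

Each row is a document and each column a vocabulary term. The word "new" in the test sentence has no column because it never appeared in the training text.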

### 8. Find number of different words in vocabulary

In [126]:
vect.vocabulary_

{'tried': 4435,
 'installing': 2239,
 'mention': 2721,
 'on': 2999,
 'my': 2858,
 'iphone': 2291,
 'but': 669,
 'it': 2311,
 'crashes': 1027,
 'every': 1480,
 'time': 4347,
 'open': 3008,
 'sxsw': 4149,
 'ipad2': 2285,
 'rocks': 3606,
 'apple': 315,
 'pop': 3244,
 'up': 4546,
 'store': 4044,
 'link': 2520,
 'what': 4722,
 'your': 4863,
 'take': 4191,
 'ipad': 2283,
 'really': 3452,
 'want': 4671,
 'checkins': 791,
 'aron': 344,
 'pilhofer': 3182,
 'from': 1736,
 'the': 4279,
 'new': 2900,
 'york': 4860,
 'times': 4351,
 'just': 2363,
 'endorsed': 1418,
 'html': 2119,
 'over': 3054,
 'at': 367,
 'newsapps': 2904,
 'and': 268,
 'asked': 360,
 'us': 4562,
 'not': 2937,
 'to': 4366,
 'tweet': 4475,
 'he': 2007,
 'actually': 162,
 'said': 3640,
 'lt': 2595,
 'guess': 1929,
 'who': 4729,
 'won': 4787,
 'an': 265,
 'unsix': 4538,
 'tweetup': 4484,
 'thanks': 4276,
 'amp': 262,
 'happydance': 1983,
 'pedicab': 3137,
 'charger': 776,
 'would': 4816,
 'be': 482,
 'epic': 1448,
 'win': 4752,
 'da
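Printing the whole dictionary shows the term-to-column mapping, but the number of different words is simply its length. A short sketch on toy sentences (assumed for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ['sxsw ipad rocks', 'ipad store at sxsw']

size_vect = CountVectorizer()
dtm = size_vect.fit_transform(docs)

# The vocabulary size equals the number of columns of the document-term matrix.
vocab_size = len(size_vect.vocabulary_)
print(vocab_size, dtm.shape[1])
```

For the notebook's fitted vectorizer, the equivalent call would be `len(vect.vocabulary_)`.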

#### Tip: To see all available functions for an Object use dir

In [128]:
dir(vect)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_char_ngrams',
 '_char_wb_ngrams',
 '_check_stop_words_consistency',
 '_check_vocabulary',
 '_count_vocab',
 '_get_param_names',
 '_get_tags',
 '_limit_features',
 '_more_tags',
 '_sort_features',
 '_stop_words_id',
 '_validate_custom_analyzer',
 '_validate_params',
 '_validate_vocabulary',
 '_white_spaces',
 '_word_ngrams',
 'analyzer',
 'binary',
 'build_analyzer',
 'build_preprocessor',
 'build_tokenizer',
 'decode',
 'decode_error',
 'dtype',
 'encoding',
 'fit',
 'fit_transform',
 'fixed_vocabulary_',
 'get_feature_names',
 'get_params',
 'get_stop_words',
 'input',
 'inverse_transf

### Find out how many Positive and Negative emotions are there.

Hint: Use value_counts on that column

In [129]:
data['is_there_an_emotion_directed_at_a_brand_or_product'].value_counts()

Positive emotion    2672
Negative emotion     519
Name: is_there_an_emotion_directed_at_a_brand_or_product, dtype: int64
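These counts show a strong class imbalance, which matters when reading accuracy scores. As a rough reference point, the sketch below computes the accuracy of a classifier that always predicts the majority class, using the counts printed above. It is calculated on the full filtered dataset, so the figure for a particular test split may differ slightly.

```python
# Class counts taken from the value_counts output above.
counts = {'Positive emotion': 2672, 'Negative emotion': 519}

total = sum(counts.values())
majority_share = max(counts.values()) / total

print(total, round(majority_share, 3))
```

A model has to beat roughly 84% accuracy before it does better than always answering "Positive emotion".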

###  Change the labels for Positive and Negative emotions as 1 and 0 respectively and store in a different column in the same dataframe named 'label'

Hint: use map on that column and give labels

In [0]:
data['label'] = data.is_there_an_emotion_directed_at_a_brand_or_product.map({'Positive emotion':1, 'Negative emotion':0})

### 9. Define the feature set (independent variable or X) to be the `tweet_text` column and `label` as the target (or dependent variable), and divide into train and test datasets

In [0]:
X = data.tweet_text
y=data.label
# split the new DataFrame into training and testing sets [Default test size = 25%]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

## 10. **Predicting the sentiment:**


### Use Naive Bayes and Logistic Regression to predict the sentiment of the given text and compare their accuracy scores

## 11. Create a function called `tokenize_predict` which takes a count vectorizer object as input and prints the accuracy for x (text) and y (labels)

In [0]:
def tokenize_test(vect):
    # Learn the vocabulary on the training text only, then build both document-term matrices
    x_train_dtm = vect.fit_transform(X_train)
    print('Features: ', x_train_dtm.shape[1])
    x_test_dtm = vect.transform(X_test)
    # Multinomial Naive Bayes
    nb = MultinomialNB()
    nb.fit(x_train_dtm, y_train)
    y_pred_class = nb.predict(x_test_dtm)
    print('muLTINOMIAL Test Accuracy: ', metrics.accuracy_score(y_test, y_pred_class))
    # Logistic Regression
    logreg=LogisticRegression()
    logreg.fit(x_train_dtm, y_train)
    y_pred_class = logreg.predict(x_test_dtm)
    print('Logistic Test Accuracy: ', metrics.accuracy_score(y_test, y_pred_class))

# The section headings call this function tokenize_predict; expose it under that name too
tokenize_predict = tokenize_test
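The function above reads `X_train`, `X_test`, `y_train` and `y_test` from the notebook's global scope and prints its results. A possible variant, sketched here under the assumption that returning the scores is more convenient for comparing vectorizers, takes the data as arguments instead. The name `score_vectorizer` and the toy data are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB

def score_vectorizer(vect, train_text, test_text, train_labels, test_labels):
    """Fit both classifiers on the vectorized text and return their test accuracies."""
    train_dtm = vect.fit_transform(train_text)
    test_dtm = vect.transform(test_text)
    scores = {'features': train_dtm.shape[1]}
    for name, clf in [('nb', MultinomialNB()), ('logreg', LogisticRegression())]:
        clf.fit(train_dtm, train_labels)
        scores[name] = accuracy_score(test_labels, clf.predict(test_dtm))
    return scores

# Toy data, made up for illustration only.
toy_train = ['love my ipad', 'great app', 'iphone crashes', 'battery is awful']
toy_train_labels = [1, 1, 0, 0]
toy_test = ['love this app', 'awful battery']
toy_test_labels = [1, 0]

result = score_vectorizer(CountVectorizer(), toy_train, toy_test,
                          toy_train_labels, toy_test_labels)
print(result)
```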

### Create a count vectorizer with `ngram_range=(1, 2)` and pass it to the tokenize_predict function to print the accuracy score

In [136]:
# include 1-grams and 2-grams
vect = CountVectorizer(ngram_range=(1, 2))
tokenize_test(vect)

Features:  24855
muLTINOMIAL Test Accuracy:  0.8558897243107769
Logistic Test Accuracy:  0.8659147869674185
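The jump in feature count comes from the added bigrams. A minimal sketch on one made-up sentence shows which features `ngram_range=(1, 2)` generates:

```python
from sklearn.feature_extraction.text import CountVectorizer

ngram_vect = CountVectorizer(ngram_range=(1, 2))
ngram_vect.fit(['not good at all'])

# Terms ordered by their column index.
ngram_terms = sorted(ngram_vect.vocabulary_, key=ngram_vect.vocabulary_.get)
print(ngram_terms)
```

Four unigrams plus three bigrams give seven features. Bigrams such as "not good" keep word order that single-word counts lose.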


### 12. Create a count vectorizer with `stop_words='english'` and pass it to the tokenize_predict function to print the accuracy score

In [138]:
# remove English stop words
vect = CountVectorizer(stop_words='english')
tokenize_test(vect)

Features:  4681
muLTINOMIAL Test Accuracy:  0.8533834586466166
Logistic Test Accuracy:  0.8671679197994987
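To illustrate what stop-word removal does to a sentence, here is a small sketch on a made-up example. It relies on scikit-learn's built-in English stop-word list, which also contains negations such as "not".

```python
from sklearn.feature_extraction.text import CountVectorizer

sentence = ['the battery is not great']

plain_vocab = sorted(CountVectorizer().fit(sentence).vocabulary_)
stop_vocab = sorted(CountVectorizer(stop_words='english').fit(sentence).vocabulary_)

print(plain_vocab)
print(stop_vocab)
```

Dropping "not" removes the negation from the features, which is worth keeping in mind for sentiment tasks.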


### 13. Create a count vectorizer with `stop_words='english'` and `max_features=300` and pass it to the tokenize_predict function to print the accuracy score

In [139]:
vect = CountVectorizer(stop_words='english',max_features=300)
tokenize_test(vect)

Features:  300
muLTINOMIAL Test Accuracy:  0.8107769423558897
Logistic Test Accuracy:  0.8333333333333334
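`max_features` keeps only the terms with the highest total count across the training corpus. A minimal sketch on toy documents (assumed for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ['ipad ipad ipad app', 'iphone app', 'ipad store']

# Corpus-wide counts: ipad 4, app 2, iphone 1, store 1.
top_vect = CountVectorizer(max_features=2)
top_vect.fit(docs)

top_terms = sorted(top_vect.vocabulary_)
print(top_terms)
```

Rare terms are discarded first, so a small limit such as 300 trades vocabulary coverage for a much smaller matrix.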


### 14. Create a count vectorizer with `stop_words='english'`, `ngram_range=(1, 2)` and `max_features=15000`, and pass it to the tokenize_predict function to print the accuracy score

In [141]:
vect = CountVectorizer(stop_words='english',ngram_range=(1, 2),max_features=15000)
tokenize_test(vect)

Features:  15000
muLTINOMIAL Test Accuracy:  0.8558897243107769
Logistic Test Accuracy:  0.8659147869674185


### 15. Create a count vectorizer with `stop_words='english'`, `ngram_range=(1, 2)` and `min_df=2` (keep only terms that appear in at least 2 documents), and pass it to the tokenize_predict function to print the accuracy score

In [142]:
vect = CountVectorizer(stop_words='english',ngram_range=(1, 2),min_df=2)
tokenize_test(vect)

Features:  5451
muLTINOMIAL Test Accuracy:  0.8659147869674185
Logistic Test Accuracy:  0.8671679197994987
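With an integer value, `min_df` is a threshold on document frequency: a term is kept only if it occurs in at least that many documents, regardless of how often it repeats within one document. A minimal sketch on toy documents (assumed for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ['ipad app', 'ipad store store', 'iphone app']

# Document frequencies: ipad 2, app 2, store 1, iphone 1.
df_vect = CountVectorizer(min_df=2)
df_vect.fit(docs)

df_terms = sorted(df_vect.vocabulary_)
print(df_terms)
```

"store" appears twice but only in one document, so it is dropped along with "iphone".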
