# Transfer Learning CIFAR10

* Train a simple convnet on the CIFAR dataset the first 5 output classes [0..4].
* Freeze convolutional layers and fine-tune dense layers for the last 5 ouput classes [5..9].


### 1. Import CIFAR10 data and create 2 datasets with one dataset having classes from 0 to 4 and other having classes from 5 to 9 

In [0]:
# Import CIFAR10

from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D

# Read the MNIST Dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

In [49]:
# Shapes of X sets
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
print (y_train.shape,y_test.shape)

x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
(50000, 1) (10000, 1)


In [31]:
import numpy as np
y_test

array([[3],
       [8],
       [8],
       ...,
       [5],
       [1],
       [7]])

In [0]:
Y_test = np.array([item[0] for item in y_test])
Y_train = np.array([item[0] for item in y_train])



In [0]:
# Create two datasets 
# Datasets with digits below 5
x_train_lt5 = x_train[Y_train < 5]
y_train_lt5 = Y_train[Y_train < 5]
x_test_lt5 = x_test[Y_test < 5]
y_test_lt5 = Y_test[Y_test < 5]

# Datasets with digits 5 and above
x_train_gt5 = x_train[Y_train >= 5]
y_train_gt5 = Y_train[Y_train >= 5] - 5  # make classes start at 0 for
x_test_gt5 = x_test[Y_test >= 5]         # np_utils.to_categorical
y_test_gt5 = Y_test[Y_test >= 5] - 5

In [44]:
y_test_lt5

array([3, 0, 1, ..., 3, 3, 1])

### 2. Use One-hot encoding to divide y_train and y_test into required no of output classes

In [0]:
# One-hot encoding for Y classes
y_train_lt5 = keras.utils.to_categorical(y_train_lt5, 5)
y_test_lt5 = keras.utils.to_categorical(y_test_lt5, 5)

In [0]:
# Datatype changes
x_train_lt5 = x_train_lt5.astype('float32')
x_test_lt5 = x_test_lt5.astype('float32')

# Normailze X sets
x_train_lt5 /= 255
x_test_lt5 /= 255

### 3. Build a sequential neural network model which can classify the classes 0 to 4 of CIFAR10 dataset with at least 80% accuracy on test data

In [70]:
from keras.layers import Dense, Dropout, Activation, Flatten
# Model initialisation
model = Sequential()

# FIrst layer
model.add(Conv2D(32, (3, 3), padding='same'))
model.add(Activation('relu'))

# Second layer
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# Third layer
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))

# Fourth layer
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))

# Fully connected layer
model.add(Dense(5))
model.add(Activation('softmax'))

from keras.optimizers import SGD,Adam
from keras.losses import categorical_crossentropy

#To use adam optimizer for learning weights with learning rate = 0.001
optimizer = Adam()

#Set the loss function and optimizer for the model training
model.compile(loss=categorical_crossentropy, optimizer=optimizer, metrics=['accuracy'])

# Train and Test Accuracy

#Training on the dataset
model.fit(x_train_lt5, y_train_lt5,  batch_size=128,  epochs=10, validation_data=(x_test_lt5, y_test_lt5))

#Accuracy of Train set
score_lt5_Train = model.evaluate(x_train_lt5, y_train_lt5)
print ("Accuracy of Train set", score_lt5_Train)

#Accuracy of Test set
score_lt5_Test = model.evaluate(x_test_lt5, y_test_lt5)
print ("Accuracy of Test set", score_lt5_Test)

Train on 25000 samples, validate on 5000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Accuracy of Train set [0.30700460375785826, 0.89376]
Accuracy of Test set [0.4800745431900024, 0.8292]


### 4. In the model which was built above (for classification of classes 0-4 in CIFAR10), make only the dense layers to be trainable and conv layers to be non-trainable

In [71]:
# Freeze all layers except Dense for Transfer Learning
for layers in model.layers:
    print(layers.name)
    if('dense' not in layers.name):
        layers.trainable = False
        print(layers.name + 'is not trainable\n')
    if('dense' in layers.name):
        print(layers.name + ' is trainable\n')

conv2d_43
conv2d_43is not trainable

activation_54
activation_54is not trainable

conv2d_44
conv2d_44is not trainable

activation_55
activation_55is not trainable

max_pooling2d_22
max_pooling2d_22is not trainable

dropout_25
dropout_25is not trainable

conv2d_45
conv2d_45is not trainable

activation_56
activation_56is not trainable

conv2d_46
conv2d_46is not trainable

activation_57
activation_57is not trainable

max_pooling2d_23
max_pooling2d_23is not trainable

dropout_26
dropout_26is not trainable

flatten_13
flatten_13is not trainable

dense_25
dense_25 is trainable

activation_58
activation_58is not trainable

dropout_27
dropout_27is not trainable

dense_26
dense_26 is trainable

activation_59
activation_59is not trainable



### 5. Utilize the the model trained on CIFAR 10 (classes 0 to 4) to classify the classes 5 to 9 of CIFAR 10  (Use Transfer Learning) <br>
Achieve an accuracy of more than 85% on test data

In [0]:
# Datatype changes
x_train_gt5 = x_train_gt5.astype('float32')
x_test_gt5 = x_test_gt5.astype('float32')

# Normailze X sets
x_train_gt5 /= 255
x_test_gt5 /= 255

# One-hot encoding for Y classes
y_train_gt5 = keras.utils.to_categorical(y_train_gt5, 5)
y_test_gt5 = keras.utils.to_categorical(y_test_gt5, 5)

In [73]:
# Train and Test Accuracy

#Training on the dataset
model.fit(x_train_gt5, y_train_gt5,  batch_size=128,  epochs=10, validation_data=(x_test_gt5, y_test_gt5))

#Accuracy of Train set
score_gt5_Train = model.evaluate(x_train_gt5, y_train_gt5)
print ("Accuracy of Train set", score_gt5_Train)

#Accuracy of Test set
score_gt5_Test = model.evaluate(x_test_gt5, y_test_gt5)
print ("Accuracy of Train set", score_gt5_Test)

  'Discrepancy between trainable weights and collected trainable'


Train on 25000 samples, validate on 5000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Accuracy of Train set [0.1887032545876503, 0.93412]
Accuracy of Train set [0.29767795960903165, 0.8984]


## Sentiment analysis <br> 

The objective of the second problem is to perform Sentiment analysis from the tweets data collected from the users targeted at various mobile devices.
Based on the tweet posted by a user (text), we will classify if the sentiment of the user targeted at a particular mobile device is positive or not.

### 6. Read the dataset (tweets.csv) and drop the NA's while reading the dataset

In [0]:
# Read the dataset and drop NAs
import pandas as pd
data = pd.read_csv('tweets.csv', encoding = "ISO-8859-1").dropna()

In [77]:
data.shape

(3291, 3)

In [78]:
data.head()

Unnamed: 0,tweet_text,emotion_in_tweet_is_directed_at,is_there_an_emotion_directed_at_a_brand_or_product
0,.@wesley83 I have a 3G iPhone. After 3 hrs twe...,iPhone,Negative emotion
1,@jessedee Know about @fludapp ? Awesome iPad/i...,iPad or iPhone App,Positive emotion
2,@swonderlin Can not wait for #iPad 2 also. The...,iPad,Positive emotion
3,@sxsw I hope this year's festival isn't as cra...,iPad or iPhone App,Negative emotion
4,@sxtxstate great stuff on Fri #SXSW: Marissa M...,Google,Positive emotion


### Consider only rows having Positive emotion and Negative emotion and remove other rows from the dataframe.

In [0]:
# Remove rows containing neither Positive or Negative Emotion
data = data[(data['is_there_an_emotion_directed_at_a_brand_or_product'] == 'Positive emotion') | (data['is_there_an_emotion_directed_at_a_brand_or_product'] == 'Negative emotion')]

In [80]:
data.shape

(3191, 3)

### 7. Represent text as numerical data using `CountVectorizer` and get the document term frequency matrix

#### Use `vect` as the variable name for initialising CountVectorizer.

In [81]:
# import and instantiate CountVectorizer (with the default parameters)
from sklearn.feature_extraction.text import CountVectorizer
vect = CountVectorizer(ngram_range=(1, 1))
simple_train= data['tweet_text']
vect.fit(simple_train)

CountVectorizer(analyzer='word', binary=False, decode_error='strict',
                dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
                lowercase=True, max_df=1.0, max_features=None, min_df=1,
                ngram_range=(1, 1), preprocessor=None, stop_words=None,
                strip_accents=None, token_pattern='(?u)\\b\\w\\w+\\b',
                tokenizer=None, vocabulary=None)

In [82]:
# transform training data into a 'document-term matrix'
simple_train_dtm = vect.transform(simple_train)
simple_train_dtm

<3191x5648 sparse matrix of type '<class 'numpy.int64'>'
	with 53275 stored elements in Compressed Sparse Row format>

In [83]:
# convert sparse matrix to a dense matrix
simple_train_dtm.toarray()

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

In [84]:
# examine the vocabulary and document-term matrix together
pd.DataFrame(simple_train_dtm.toarray(), columns=vect.get_feature_names())

Unnamed: 0,000,02,03,08,10,100,100s,100tc,101,106,10am,10k,10mins,10pm,10x,11,11ntc,11th,12,12b,12th,13,130,14,1406,1413,1415,15,150,1500,150m,157,15am,15k,16162,16gb,16mins,17,188,1986,...,zite,zms,zombies,zomg,zone,zoom,zzzs,¼¼,á¾_î¾ð,äá,å_,åç,åçwhat,çü,èï,ðü,öý,ù_¾,û_,ûª,ûªll,ûªm,ûªs,ûªt,ûï,ûï35,ûïbuttons,ûïfoursquare,ûïline,ûïmore,ûïmute,ûïspecials,ûïthe,ûïview,ûò,ûòand,ûó,ûójust,ûólewis,ûóthe
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3186,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3187,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3188,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3189,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### 8. Find number of different words in vocabulary

In [92]:
# examine the fitted vocabulary
a=vect.get_feature_names()
b=pd.value_counts(a)
b
# b gives the total distinct words in vocabulary, which is 5648

excellent     1
ness          1
latism        1
flaw          1
company       1
             ..
8p            1
shiner        1
incredibly    1
stay          1
formula       1
Length: 5648, dtype: int64

#### Tip: To see all available functions for an Object use dir

In [94]:
# Available functions of the Count Vectorizer object
dir(vect)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_char_ngrams',
 '_char_wb_ngrams',
 '_check_stop_words_consistency',
 '_check_vocabulary',
 '_count_vocab',
 '_get_param_names',
 '_get_tags',
 '_limit_features',
 '_more_tags',
 '_sort_features',
 '_stop_words_id',
 '_validate_custom_analyzer',
 '_validate_params',
 '_validate_vocabulary',
 '_white_spaces',
 '_word_ngrams',
 'analyzer',
 'binary',
 'build_analyzer',
 'build_preprocessor',
 'build_tokenizer',
 'decode',
 'decode_error',
 'dtype',
 'encoding',
 'fit',
 'fit_transform',
 'fixed_vocabulary_',
 'get_feature_names',
 'get_params',
 'get_stop_words',
 'input',
 'inverse_transf

### Find out how many Positive and Negative emotions are there.

Hint: Use value_counts on that column

In [95]:
# Count of Postive and Negative emotion
pd.value_counts(data['is_there_an_emotion_directed_at_a_brand_or_product'])

Positive emotion    2672
Negative emotion     519
Name: is_there_an_emotion_directed_at_a_brand_or_product, dtype: int64

###  Change the labels for Positive and Negative emotions as 1 and 0 respectively and store in a different column in the same dataframe named 'label'

Hint: use map on that column and give labels

In [0]:
# Label change for Positive and Negative emotion value
data['label'] = data.is_there_an_emotion_directed_at_a_brand_or_product.map({'Positive emotion':1, 'Negative emotion':0})

### 9. Define the feature set (independent variable or X) to be `text` column and `labels` as target (or dependent variable)  and divide into train and test datasets

In [99]:
# Defining X and Y

X= data['tweet_text']
Y= data['label']

# split X and Y into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=2)
print(X_train.shape)
print(X_test.shape)
print(Y_train.shape)
print(Y_test.shape)

(2393,)
(798,)
(2393,)
(798,)


## 10. **Predicting the sentiment:**


### Use Naive Bayes and Logistic Regression and their accuracy scores for predicting the sentiment of the given text

In [0]:
# import and instantiate a Multinomial Naive Bayes model, Logistic Regression model
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

## 11. Create a function called `tokenize_predict` which can take count vectorizer object as input and prints the accuracy for x (text) and y (labels)

In [0]:
def tokenize_test(vect):
    x_train_dtm = vect.fit_transform(X_train)
    print('Features: ', x_train_dtm.shape[1])
    x_test_dtm = vect.transform(X_test)
    nb = MultinomialNB()
    nb.fit(x_train_dtm, Y_train)
    y_pred_nb = nb.predict(x_test_dtm)
    lr = LogisticRegression()
    lr.fit(x_train_dtm, Y_train)
    y_pred_lr = lr.predict(x_test_dtm)
    print('Accuracy using Naive Bayes: ', metrics.accuracy_score(Y_test, y_pred_nb));
    print('Accuracy using Logistic Regression: ', metrics.accuracy_score(Y_test, y_pred_lr));

### Create a count vectorizer function which includes n_grams = 1,2  and pass it to tokenize_predict function to print the accuracy score

In [104]:
# include 1-grams and 2-grams
vect = CountVectorizer(ngram_range=(1, 2))
tokenize_test(vect)

Features:  24650
Accuracy using Naive Bayes:  0.8759398496240601
Accuracy using Logistic Regression:  0.8771929824561403




### 12. Create a count vectorizer function with stopwords = 'english'  and pass it to tokenize_predict function to print the accuracy score

In [105]:
# include stop words
vect = CountVectorizer(stop_words='english')
tokenize_test(vect)


Features:  4661
Accuracy using Naive Bayes:  0.8659147869674185
Accuracy using Logistic Regression:  0.8721804511278195




### 13. Create a count vectorizer function with stopwords = 'english' and max_features =300  and pass it to tokenize_predict function to print the accuracy score

In [106]:
# include stop words
vect = CountVectorizer(stop_words='english', max_features=300)
tokenize_test(vect)


Features:  300
Accuracy using Naive Bayes:  0.8233082706766918
Accuracy using Logistic Regression:  0.8508771929824561




### 14. Create a count vectorizer function with n_grams = 1,2  and max_features = 15000  and pass it to tokenize_predict function to print the accuracy score

In [107]:
# include stop words
vect = CountVectorizer(ngram_range=(1, 2), max_features=15000)
tokenize_test(vect)


Features:  15000
Accuracy using Naive Bayes:  0.8771929824561403
Accuracy using Logistic Regression:  0.8721804511278195




### 15. Create a count vectorizer function with n_grams = 1,2  and include terms that appear at least 2 times (min_df = 2)  and pass it to tokenize_predict function to print the accuracy score

In [108]:
# include stop words
vect = CountVectorizer(ngram_range=(1, 2), min_df=2)
tokenize_test(vect)


Features:  7925
Accuracy using Naive Bayes:  0.8796992481203008
Accuracy using Logistic Regression:  0.8721804511278195


