# Transfer Learning CIFAR10

* Train a simple convnet on the first 5 output classes [0..4] of the CIFAR10 dataset.
* Freeze the convolutional layers and fine-tune the dense layers on the last 5 output classes [5..9].


### 1. Import CIFAR10 data and create 2 datasets, one with classes 0 to 4 and the other with classes 5 to 9

In [0]:
import tensorflow as tf
tf.reset_default_graph()
tf.set_random_seed(42)

In [0]:
(Xtrain,Ytrain),(Xtest,Ytest) = tf.keras.datasets.cifar10.load_data()

In [0]:
# load_data() returns the labels with shape (n, 1); flatten them to 1-D
Ytrain = Ytrain.flatten()
Ytest = Ytest.flatten()

In [0]:
Xtrain_lt5 = Xtrain[Ytrain < 5]
Ytrain_lt5 = Ytrain[Ytrain < 5]
Xtest_lt5 = Xtest[Ytest < 5]
Ytest_lt5 = Ytest[Ytest < 5]

# Shift labels 5..9 down to 0..4 so that to_categorical yields 5 classes
Xtrain_gt5 = Xtrain[Ytrain >= 5]
Ytrain_gt5 = Ytrain[Ytrain >= 5] - 5
Xtest_gt5 = Xtest[Ytest >= 5]
Ytest_gt5 = Ytest[Ytest >= 5] - 5

### 2. Use one-hot encoding to convert y_train and y_test into the required number of output classes

In [0]:
Ytrain_lt5 = tf.keras.utils.to_categorical(Ytrain_lt5,num_classes=5)
Ytest_lt5 = tf.keras.utils.to_categorical(Ytest_lt5,num_classes=5)
Ytrain_gt5 = tf.keras.utils.to_categorical(Ytrain_gt5,num_classes=5)
Ytest_gt5 = tf.keras.utils.to_categorical(Ytest_gt5,num_classes=5)
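To make the label shift and the one-hot step concrete, here is a minimal plain-Python sketch on a few made-up labels. The helper `one_hot` is hypothetical and only mimics the shape of what `to_categorical` returns; it is not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Return one row per label with a single 1.0 at the label's index."""
    return [[1.0 if i == label else 0.0 for i in range(num_classes)]
            for label in labels]

# Made-up labels from the second half of CIFAR10 (classes 5..9).
raw_labels = [5, 9, 7]

# Shift them into the range 0..4, as done for Ytrain_gt5 and Ytest_gt5.
shifted = [label - 5 for label in raw_labels]

encoded = one_hot(shifted, num_classes=5)
for label, row in zip(shifted, encoded):
    print(label, row)
```

Without the shift, a label such as 9 would not fit into a 5-column encoding.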

### 3. Build a sequential neural network model which can classify the classes 0 to 4 of CIFAR10 dataset with at least 80% accuracy on test data

In [0]:
# Normalize & change Dtype
Xtrain_lt5 = Xtrain_lt5.astype('float32') / 255
Xtest_lt5 = Xtest_lt5.astype('float32') / 255

In [0]:
Xtrain_lt5 = Xtrain_lt5.reshape(Xtrain_lt5.shape[0],32,32,3)
Xtest_lt5 = Xtest_lt5.reshape(Xtest_lt5.shape[0],32,32,3)

In [0]:
import keras
from keras.layers import Dense, Activation,Dropout, Flatten, Reshape
from keras.layers import Convolution2D, MaxPooling2D
from keras.models import Sequential
from keras.optimizers import Adam
from keras.layers.normalization import BatchNormalization

In [0]:
model = Sequential()

In [294]:
# 1st Conv Layer
model.add(Convolution2D(64, (3, 3), input_shape=(32, 32, 3)))
model.add(Activation('relu'))

# 2nd Conv Layer   
model.add(Convolution2D(64, (3, 3)))
model.add(Activation('relu'))

# Max Pooling
model.add(MaxPooling2D(pool_size=(2,2)))

# Dropout Layer
model.add(Dropout(0.25))

# 3rd Conv Layer   
model.add(Convolution2D(128, (3, 3)))
model.add(Activation('relu'))

# Max Pooling
model.add(MaxPooling2D(pool_size=(2,2)))

# Fully Connected Layer
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(BatchNormalization())
  
# Prediction Layer
model.add(Dense(5))
model.add(Activation('softmax'))
    
# Loss and Optimizer
# Note: compile() below receives the string 'adam', so Keras uses its default
# Adam settings and this `adam` object (lr=0.0001) has no effect.
# Pass optimizer=adam instead to apply the custom learning rate.
adam = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, decay=0.0)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
model.fit(Xtrain_lt5, Ytrain_lt5, batch_size=32, epochs=10,
          validation_data=(Xtest_lt5, Ytest_lt5))


Train on 25000 samples, validate on 5000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fadfc8422b0>

In [295]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 30, 30, 64)        1792      
_________________________________________________________________
activation_1 (Activation)    (None, 30, 30, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 28, 28, 64)        36928     
_________________________________________________________________
activation_2 (Activation)    (None, 28, 28, 64)        0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 64)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 14, 14, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 12, 12, 128)       73856     
__________
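The output shapes and parameter counts in the summary can be checked by hand. The sketch below assumes 3x3 kernels, stride 1 and no padding, which is consistent with the shapes shown; the helper names are made up for illustration.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial size after a convolution without padding."""
    return (size - kernel) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

size = 32                              # CIFAR10 images are 32x32x3
size = conv_output_size(size, 3)       # conv2d_1
params_1 = conv_params(3, 3, 64)
print('conv2d_1', size, params_1)

size = conv_output_size(size, 3)       # conv2d_2
params_2 = conv_params(3, 64, 64)
print('conv2d_2', size, params_2)

size = size // 2                       # max_pooling2d_1 with pool_size=(2, 2)
size = conv_output_size(size, 3)       # conv2d_3
params_3 = conv_params(3, 64, 128)
print('conv2d_3', size, params_3)
```

The three parameter counts match the `Param #` column for `conv2d_1`, `conv2d_2` and `conv2d_3`.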

In [302]:
from keras.preprocessing.image import ImageDataGenerator

EPOCHS = 20
BS = 32

# construct the training image generator for data augmentation
aug = ImageDataGenerator(rotation_range=20, zoom_range=0.15,
	width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15,
	horizontal_flip=True, fill_mode="nearest")
 
# train the network
model.fit_generator(aug.flow(Xtrain_lt5, Ytrain_lt5, batch_size=BS),
	validation_data=(Xtest_lt5, Ytest_lt5), steps_per_epoch=len(Xtrain_lt5) // BS,
	epochs=EPOCHS)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7fadfa12f4e0>
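The `steps_per_epoch` value passed to `fit_generator` is plain integer division. A small worked example, taking the training-set size from the "Train on 25000 samples" line of the earlier log:

```python
train_size = 25000   # size of Xtrain_lt5, from the earlier training log
batch_size = 32      # BS in the cell above

# Number of batches drawn from the generator in each epoch.
steps_per_epoch = train_size // batch_size

# Images seen per epoch if every batch is full.
images_per_epoch = steps_per_epoch * batch_size

print(steps_per_epoch, images_per_epoch, train_size - images_per_epoch)
```

This gives 781 steps per epoch, the same denominator that appears in the `13/781` progress line later in the notebook.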

### 4. In the model which was built above (for classification of classes 0-4 in CIFAR10), make only the dense layers to be trainable and conv layers to be non-trainable

In [303]:
for layers in model.layers:
    print(layers.name)
    if('dense' not in layers.name):
        layers.trainable = False
        print(layers.name + 'is not trainable\n')
    if('dense' in layers.name):
        print(layers.name + ' is trainable\n')

conv2d_1
conv2d_1is not trainable

activation_1
activation_1is not trainable

conv2d_2
conv2d_2is not trainable

activation_2
activation_2is not trainable

max_pooling2d_1
max_pooling2d_1is not trainable

dropout_1
dropout_1is not trainable

conv2d_3
conv2d_3is not trainable

activation_3
activation_3is not trainable

max_pooling2d_2
max_pooling2d_2is not trainable

flatten_1
flatten_1is not trainable

dense_1
dense_1 is trainable

activation_4
activation_4is not trainable

batch_normalization_1
batch_normalization_1is not trainable

dense_2
dense_2 is trainable

activation_5
activation_5is not trainable
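One thing worth checking: Keras applies a change to `layer.trainable` only when the model is compiled, so `model.compile(...)` generally needs to be called again after the loop above. The "Discrepancy between trainable weights" warning in the next training run suggests this step may have been skipped. The sketch below shows the freeze step as a function. `freeze_non_dense` is a hypothetical helper, and the toy objects stand in for a real Keras model so that the snippet runs on its own.

```python
from types import SimpleNamespace

def freeze_non_dense(model):
    """Keep layers with 'dense' in their name trainable and freeze the rest.

    Returns the names of the layers left trainable.
    """
    trainable = []
    for layer in model.layers:
        layer.trainable = 'dense' in layer.name
        if layer.trainable:
            trainable.append(layer.name)
    return trainable

# Stand-in objects; a real Keras model exposes the same `layers`,
# `name` and `trainable` attributes.
names = ['conv2d_1', 'flatten_1', 'dense_1', 'batch_normalization_1', 'dense_2']
toy_model = SimpleNamespace(
    layers=[SimpleNamespace(name=n, trainable=True) for n in names])

print(freeze_non_dense(toy_model))

# With a real Keras model, recompile afterwards so the change takes effect:
# model.compile(loss='categorical_crossentropy', optimizer='adam',
#               metrics=['accuracy'])
```

After recompiling, `model.summary()` should report a non-zero count of non-trainable parameters, which is a quick way to confirm the freeze.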



### 5. Use the model trained on CIFAR10 (classes 0 to 4) to classify classes 5 to 9 of CIFAR10 (use transfer learning)
Achieve an accuracy of more than 85% on test data

In [0]:
# Reshape
Xtrain_gt5 = Xtrain_gt5.reshape(Xtrain_gt5.shape[0],32,32,3)
Xtest_gt5 = Xtest_gt5.reshape(Xtest_gt5.shape[0],32,32,3)

In [0]:
# Normalize & change Dtype
Xtrain_gt5 = Xtrain_gt5.astype('float32') / 255
Xtest_gt5 = Xtest_gt5.astype('float32') / 255

In [389]:
#model.fit(Xtrain_gt5, Ytrain_gt5, batch_size=32, nb_epoch=10, 
#              validation_data=(Xtest_gt5, Ytest_gt5))

# Reuse the weights of the previous model and augment the data to improve accuracy

EPOCHS = 20
BS = 32

# construct the training image generator for data augmentation
aug1 = ImageDataGenerator(rotation_range=20, zoom_range=0.25,
	width_shift_range=0.1, height_shift_range=0.1, shear_range=0.15,
	horizontal_flip=True, fill_mode="nearest")
 
# train the network
model.fit_generator(aug1.flow(Xtrain_gt5, Ytrain_gt5, batch_size=BS),
	validation_data=(Xtest_gt5, Ytest_gt5), steps_per_epoch=len(Xtrain_gt5) // BS,
	epochs=EPOCHS)

Epoch 1/20
 13/781 [..............................] - ETA: 11s - loss: 0.3881 - acc: 0.8582

  'Discrepancy between trainable weights and collected trainable'


Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7fadef453f98>

In [390]:
score = model.evaluate(Xtest_gt5, Ytest_gt5, batch_size=128, verbose=0)
print(score)

[0.3485453933954239, 0.8806]


In [0]:
# Data augmentation didn't help much: accuracy improved only marginally with augmentation.
# I would like to understand where I am going wrong and which hyperparameters would be worth tuning.

## Sentiment analysis

The objective of the second problem is to perform sentiment analysis on tweets that users posted about various mobile devices.
Based on the text of a tweet, we will classify whether the user's sentiment toward a particular mobile device is positive or not.

### 6. Read the dataset (tweets.csv) and drop the NA's while reading the dataset

In [307]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [0]:
import pandas as pd
data = pd.read_csv('/content/drive/My Drive/Colab Notebooks/R8/tweets.csv', encoding = "ISO-8859-1").dropna()

In [310]:
data.shape

(3291, 3)

In [311]:
data.head()

| | tweet_text | emotion_in_tweet_is_directed_at | is_there_an_emotion_directed_at_a_brand_or_product |
|---|---|---|---|
| 0 | .@wesley83 I have a 3G iPhone. After 3 hrs twe... | iPhone | Negative emotion |
| 1 | @jessedee Know about @fludapp ? Awesome iPad/i... | iPad or iPhone App | Positive emotion |
| 2 | @swonderlin Can not wait for #iPad 2 also. The... | iPad | Positive emotion |
| 3 | @sxsw I hope this year's festival isn't as cra... | iPad or iPhone App | Negative emotion |
| 4 | @sxtxstate great stuff on Fri #SXSW: Marissa M... | Google | Positive emotion |


### Consider only rows having Positive emotion and Negative emotion and remove other rows from the dataframe.

In [0]:
data = data[(data['is_there_an_emotion_directed_at_a_brand_or_product'] == 'Positive emotion') | (data['is_there_an_emotion_directed_at_a_brand_or_product'] == 'Negative emotion')]

In [313]:
data.shape

(3191, 3)

In total, 100 records were dropped because they were labelled neither positive nor negative.

### 7. Represent text as numerical data using `CountVectorizer` and get the document term frequency matrix

#### Use `vect` as the variable name for initialising CountVectorizer.

Split the data into X (the tweet text) and Y (the Positive/Negative label).

In [0]:
X = data['tweet_text']

In [349]:
X.shape

(3191,)

In [0]:
Y = data['is_there_an_emotion_directed_at_a_brand_or_product']

In [0]:
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

In [0]:
# Term-frequency (document-term) matrix
# Note: the name `tf` shadows the TensorFlow import from the first section
vect = CountVectorizer()
tf = vect.fit_transform(X)

### 8. Find number of different words in vocabulary

In [333]:
tf.shape

(3191, 5648)
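The second dimension of the matrix (5648) is the vocabulary size; `len(vect.vocabulary_)` should report the same number. As a rough picture of how that vocabulary comes about, the sketch below rebuilds the idea in plain Python using the default token pattern of `CountVectorizer` (lowercased words of two or more characters). The tweets are made up, and `build_vocabulary` is a hypothetical helper, not part of scikit-learn.

```python
import re

# Default token pattern of CountVectorizer: words of 2+ word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

def build_vocabulary(documents):
    """Collect the distinct lowercased tokens across all documents."""
    vocab = set()
    for doc in documents:
        vocab.update(TOKEN_PATTERN.findall(doc.lower()))
    return sorted(vocab)

# Made-up tweets for illustration.
tweets = ["I love my iPad", "My iPhone battery died", "love the iPad 2"]
vocab = build_vocabulary(tweets)
print(len(vocab), vocab)
```

Single-character tokens such as "I" and "2" are dropped, which is why the vocabulary is smaller than the raw word count.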

#### Tip: To see all available functions for an Object use dir

In [335]:
dir(tf)

['__abs__',
 '__add__',
 '__array_priority__',
 '__bool__',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__div__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattr__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__idiv__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__isub__',
 '__iter__',
 '__itruediv__',
 '__le__',
 '__len__',
 '__lt__',
 '__matmul__',
 '__module__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__nonzero__',
 '__pow__',
 '__radd__',
 '__rdiv__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmatmul__',
 '__rmul__',
 '__rsub__',
 '__rtruediv__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__weakref__',
 '_add_dense',
 '_add_sparse',
 '_arg_min_or_max',
 '_arg_min_or_max_axis',
 '_asindices',
 '_binopt',
 '_cs_matrix__get_has_canonical_format',
 '_cs_matrix__get_sorted',
 '_cs_matrix__set_has_canonical_format',
 '_cs_matrix__set_sorted

### Find out how many Positive and Negative emotions are there.

Hint: Use value_counts on that column

In [336]:
data['is_there_an_emotion_directed_at_a_brand_or_product'].value_counts()

Positive emotion    2672
Negative emotion     519
Name: is_there_an_emotion_directed_at_a_brand_or_product, dtype: int64
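These counts show a strong class imbalance, which gives a useful reference point for the accuracy scores later on. A small worked calculation of what a classifier would score if it always predicted the majority class (computed on the full dataset here; the exact figure on the test split may differ slightly):

```python
positive, negative = 2672, 519   # counts from value_counts above
total = positive + negative

# Accuracy of always predicting the majority class.
majority_baseline = max(positive, negative) / total

print(total, round(majority_baseline, 3))
```

A model scoring around 0.84 is therefore doing little better than always answering "Positive emotion".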

###  Change the labels for Positive and Negative emotions as 1 and 0 respectively and store in a different column in the same dataframe named 'label'

Hint: use map on that column and give labels

In [0]:
data['label'] = data.is_there_an_emotion_directed_at_a_brand_or_product.map({'Positive emotion':1, 'Negative emotion':0})

### 9. Define the feature set (independent variable or X) to be `text` column and `labels` as target (or dependent variable)  and divide into train and test datasets

In [0]:
x = data.tweet_text
y = data['label']

In [0]:
from sklearn.model_selection import train_test_split
xtrain,xtest,ytrain,ytest = train_test_split(x,y, random_state=1)

## 10. **Predicting the sentiment:**


### Use Naive Bayes and Logistic Regression to predict the sentiment of the given text and compare their accuracy scores

In [0]:
vect = CountVectorizer(ngram_range=(1, 1))
X_train_dtm = vect.fit_transform(xtrain)
X_test_dtm = vect.transform(xtest)

In [0]:
from sklearn.naive_bayes import MultinomialNB
nb = MultinomialNB()

In [364]:
nb.fit(X_train_dtm,ytrain)

MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)

In [0]:
y_pred = nb.predict(X_test_dtm)

In [367]:
from sklearn import metrics
metrics.accuracy_score(ytest, y_pred)

0.8471177944862155

In [0]:
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()

In [375]:
lr.fit(X_train_dtm,ytrain)



LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='warn', n_jobs=None, penalty='l2',
                   random_state=None, solver='warn', tol=0.0001, verbose=0,
                   warm_start=False)

In [0]:
y_pred_lr = lr.predict(X_test_dtm)

In [377]:
metrics.accuracy_score(ytest, y_pred_lr)

0.868421052631579

Logistic Regression gives better accuracy here (0.868 versus 0.847 for Naive Bayes). An ensemble technique might improve the overall accuracy further.
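Because negative tweets are the minority class, accuracy alone can hide weak performance on them. The sketch below uses made-up labels and plain Python to show accuracy next to per-class recall. The helper names are hypothetical, and `sklearn.metrics` offers equivalent functions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall(y_true, y_pred, label):
    """Fraction of the true `label` examples that were predicted as `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_label) / len(predictions_for_label)

# Made-up example: 8 positive (1) and 2 negative (0) tweets.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

print('accuracy       ', accuracy(y_true, y_pred))
print('recall positive', recall(y_true, y_pred, 1))
print('recall negative', recall(y_true, y_pred, 0))
```

In this toy case the accuracy is high even though only half of the negative tweets are recognised.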

## 11. Create a function called `tokenize_test` which takes a count vectorizer object as input and prints the accuracy for x (text) and y (labels)

In [0]:
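def tokenize_test(vect):
    """Fit `vect` on the training text, then print the vocabulary size
    and the Naive Bayes accuracy on the test split."""
    # Learn the vocabulary on the training split only
    x_train_dtm = vect.fit_transform(xtrain)
    print('Features: ', x_train_dtm.shape[1])
    # Reuse that vocabulary to encode the test split
    x_test_dtm = vect.transform(xtest)
    nb = MultinomialNB()
    nb.fit(x_train_dtm, ytrain)
    y_pred_class = nb.predict(x_test_dtm)
    print('Accuracy: ', metrics.accuracy_score(ytest, y_pred_class))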

### Create a count vectorizer with ngram_range = (1, 2) and pass it to the tokenize_test function to print the accuracy score

In [381]:
# include 1-grams and 2-grams
vect = CountVectorizer(ngram_range=(1, 2))
tokenize_test(vect)

Features:  24855
Accuracy:  0.8558897243107769
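The jump in feature count comes from adding every pair of adjacent words as its own feature. A minimal sketch of what `ngram_range=(1, 2)` produces for a single tokenized text; `word_ngrams` is a hypothetical helper, not the scikit-learn implementation.

```python
def word_ngrams(tokens, n_min, n_max):
    """Return all word n-grams with n_min <= n <= n_max, joined by spaces."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams

# Made-up tokens for illustration.
tokens = ["not", "good", "battery"]
print(word_ngrams(tokens, 1, 2))
```

The bigram "not good" carries information that the separate unigrams "not" and "good" do not.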


### 12. Create a count vectorizer with stop_words = 'english' and pass it to the tokenize_test function to print the accuracy score

In [382]:
vect1 = CountVectorizer(stop_words='english')
tokenize_test(vect1)

Features:  4681
Accuracy:  0.8533834586466166


### 13. Create a count vectorizer with stop_words = 'english' and max_features = 300 and pass it to the tokenize_test function to print the accuracy score

In [383]:
vect2 = CountVectorizer(stop_words='english',max_features =300)
tokenize_test(vect2)

Features:  300
Accuracy:  0.8107769423558897
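`max_features` keeps only the terms with the highest total frequency across the corpus, which explains the drop in accuracy when just 300 terms remain. A plain-Python sketch of that selection on made-up data; `top_terms` is a hypothetical helper.

```python
from collections import Counter

def top_terms(tokenized_docs, max_features):
    """Keep the max_features terms with the highest corpus-wide counts."""
    counts = Counter(token for doc in tokenized_docs for token in doc)
    return [term for term, _ in counts.most_common(max_features)]

# Made-up tokenized tweets for illustration.
docs = [["ipad", "love", "ipad"], ["iphone", "love"], ["ipad", "battery"]]
print(top_terms(docs, 2))
```

Rare terms such as "battery" are discarded first, even though they may be informative for sentiment.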


### 14. Create a count vectorizer with ngram_range = (1, 2) and max_features = 15000 and pass it to the tokenize_test function to print the accuracy score

In [384]:
vect3 = CountVectorizer(ngram_range=(1, 2),max_features =15000)
tokenize_test(vect3)

Features:  15000
Accuracy:  0.8533834586466166


### 15. Create a count vectorizer with ngram_range = (1, 2) that includes only terms appearing in at least 2 documents (min_df = 2) and pass it to the tokenize_test function to print the accuracy score

In [385]:
vect4 = CountVectorizer(ngram_range=(1, 2),min_df = 2)
tokenize_test(vect4)

Features:  7764
Accuracy:  0.8583959899749374
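Note that `min_df` counts documents, not occurrences: a term repeated several times inside a single tweet still has a document frequency of 1. A plain-Python sketch of the filter on made-up data; `filter_by_min_df` is a hypothetical helper.

```python
from collections import Counter

def filter_by_min_df(tokenized_docs, min_df):
    """Keep terms that occur in at least min_df different documents."""
    doc_freq = Counter()
    for doc in tokenized_docs:
        doc_freq.update(set(doc))   # count each term once per document
    return sorted(term for term, df in doc_freq.items() if df >= min_df)

# Made-up tokenized tweets; "crash" appears twice but only in one tweet.
docs = [["ipad", "crash", "crash"], ["ipad", "love"], ["love", "sxsw"]]
print(filter_by_min_df(docs, 2))
```

Here "crash" is removed despite occurring twice, because both occurrences are in the same document.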


## Bonus code
Below I define an equivalent function that uses LogisticRegression, then pass both functions a vectorizer that combines all of the settings above to compare their accuracy.


In [0]:
def tokenize_test_lr(vect):
    x_train_dtm = vect.fit_transform(xtrain)
    print('Features: ', x_train_dtm.shape[1])
    x_test_dtm = vect.transform(xtest)
    lr = LogisticRegression()
    lr.fit(x_train_dtm, ytrain)
    y_pred_lr = lr.predict(x_test_dtm)
    print('Accuracy: ', metrics.accuracy_score(ytest, y_pred_lr))

In [387]:
vect5 = CountVectorizer(ngram_range=(1, 2),min_df = 2,stop_words='english',max_features =15000)
tokenize_test_lr(vect5)

Features:  5451
Accuracy:  0.8671679197994987




In [388]:
tokenize_test(vect5)

Features:  5451
Accuracy:  0.8659147869674185


## Both Logistic Regression and Naive Bayes give almost the same accuracy (0.867 versus 0.866)
