# Transfer Learning CIFAR10

* Train a simple convnet on the first 5 output classes [0..4] of the CIFAR10 dataset.
* Freeze the convolutional layers and fine-tune the dense layers on the last 5 output classes [5..9].


### 1. Import the CIFAR10 data and create 2 datasets: one with classes 0 to 4 and the other with classes 5 to 9

In [1]:
import keras
from keras.datasets import cifar10

Using TensorFlow backend.


In [2]:
(x_train,y_train),(x_test,y_test)=cifar10.load_data()

In [3]:
x_train.shape

(50000, 32, 32, 3)

In [4]:
y_train.shape

(50000, 1)

In [5]:
x_test.shape

(10000, 32, 32, 3)

In [6]:
import numpy as np

In [7]:
# Flatten the labels from (N, 1) to (N,) so they can be used as boolean row masks
train_labels = y_train.reshape(-1)
test_labels = y_test.reshape(-1)

# Classes 0..4
x_train_lt5 = x_train[train_labels < 5]
y_train_lt5 = y_train[train_labels < 5]
x_test_lt5 = x_test[test_labels < 5]
y_test_lt5 = y_test[test_labels < 5]

# Classes 5..9, with labels shifted down to 0..4
x_train_gt5 = x_train[train_labels >= 5]
y_train_gt5 = y_train[train_labels >= 5] - 5
x_test_gt5 = x_test[test_labels >= 5]
y_test_gt5 = y_test[test_labels >= 5] - 5
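
The split above relies on boolean-mask indexing. A minimal sketch of the same idea on a toy array, assuming labels arrive with shape (N, 1) as in CIFAR10 (the names `split_by_class`, `images`, and `labels` are hypothetical and not part of the notebook):

```python
import numpy as np

def split_by_class(images, labels, threshold=5):
    """Split into (low, high) halves by class id; high labels are shifted to start at 0."""
    flat = labels.reshape(-1)          # (N, 1) -> (N,) so it can index rows
    low_mask = flat < threshold
    high_mask = ~low_mask
    low = (images[low_mask], labels[low_mask])
    high = (images[high_mask], labels[high_mask] - threshold)
    return low, high

# Toy data: 6 "images" of shape (2, 2, 3) with labels 0, 7, 3, 9, 5, 4
images = np.arange(6 * 2 * 2 * 3).reshape(6, 2, 2, 3)
labels = np.array([[0], [7], [3], [9], [5], [4]])

(low_x, low_y), (high_x, high_y) = split_by_class(images, labels)
print(low_x.shape, low_y.ravel())    # (3, 2, 2, 3) [0 3 4]
print(high_x.shape, high_y.ravel())  # (3, 2, 2, 3) [2 4 0]
```

Shifting the upper labels down by the threshold keeps both halves in the range 0..4, so the same 5-unit softmax output can serve either task.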

In [8]:
print(x_train_lt5.shape)
print(y_train_lt5.shape)
print(x_test_lt5.shape)
print(y_test_lt5.shape)

(25000, 32, 32, 3)
(25000, 1)
(5000, 32, 32, 3)
(5000, 1)


In [9]:
print(x_train_gt5.shape)
print(y_train_gt5.shape)
print(x_test_gt5.shape)
print(y_test_gt5.shape)

(25000, 32, 32, 3)
(25000, 1)
(5000, 32, 32, 3)
(5000, 1)


In [10]:
x_train_lt5_normalized=x_train_lt5.astype(float)/255

In [11]:
x_test_lt5_normalized=x_test_lt5.astype(float)/255

### 2. Use one-hot encoding to convert y_train and y_test into the required number of output classes

In [12]:
y_train_lt5_ohe=keras.utils.to_categorical(y_train_lt5)

In [13]:
y_train_lt5_ohe.shape

(25000, 5)

In [14]:
y_test_lt5_ohe=keras.utils.to_categorical(y_test_lt5)

In [15]:
y_test_lt5_ohe.shape

(5000, 5)
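
For intuition, one-hot encoding turns each integer label into a row with a single 1. A minimal NumPy sketch of the kind of array `to_categorical` produces for these labels (the helper `one_hot` is a hypothetical stand-in, not the Keras implementation):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an (N, num_classes) float array with a 1.0 at each label's index."""
    flat = np.asarray(labels).reshape(-1)
    return np.eye(num_classes, dtype="float32")[flat]

y = np.array([[0], [4], [2]])       # same (N, 1) layout as the CIFAR10 labels
encoded = one_hot(y, num_classes=5)
print(encoded.shape)                # (3, 5)
print(encoded[1])                   # [0. 0. 0. 0. 1.]
```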

### 3. Build a sequential neural network model which can classify the classes 0 to 4 of CIFAR10 dataset with at least 80% accuracy on test data

In [16]:
from keras.models import Sequential
from keras.layers import Dense,Convolution2D,MaxPooling2D,Dropout,Activation,Flatten

In [20]:
model=Sequential()

model.add(Convolution2D(32,(3,3),input_shape=(32,32,3)))
model.add(Activation('relu'))

model.add(Convolution2D(64,(3,3)))
model.add(Activation('relu'))

model.add(MaxPooling2D(2,2))
model.add(Dropout(0.25))

model.add(Convolution2D(128,(3,3)))
model.add(Activation('relu'))

model.add(Convolution2D(256,(3,3)))
model.add(Activation('relu'))

model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))

model.add(Dense(64))
model.add(Activation('relu'))
          
model.add(Dense(5))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
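
The layer sizes in this network follow from simple arithmetic: a 3x3 convolution without padding (the Keras default, `padding='valid'`) shrinks each spatial dimension by 2, and 2x2 max pooling halves it. The sketch below traces the output shapes and convolution parameter counts for the layers as defined in the cell above; the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' (unpadded) convolution with stride 1."""
    return size - kernel + 1

def conv_params(in_channels, out_channels, kernel=3):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

size, channels = 32, 3
shapes, params = [], []
for step in [32, 64, "pool", 128, 256]:
    if step == "pool":
        size //= 2                      # 2x2 max pooling halves height and width
    else:
        params.append(conv_params(channels, step))
        size, channels = conv_out(size), step
    shapes.append((size, size, channels))

flattened = size * size * channels      # inputs to the first Dense layer after Flatten
print(shapes)     # [(30, 30, 32), (28, 28, 64), (14, 14, 64), (12, 12, 128), (10, 10, 256)]
print(params)     # [896, 18496, 73856, 295168]
print(flattened)  # 25600
```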

In [21]:
model.fit(x_train_lt5_normalized,y_train_lt5_ohe,validation_data=(x_test_lt5_normalized,y_test_lt5_ohe),epochs=5)

Train on 25000 samples, validate on 5000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x840630f5c0>

### 4. In the model built above (for classification of classes 0-4 in CIFAR10), make only the dense layers trainable and the conv layers non-trainable

In [88]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_14 (Conv2D)           (None, 30, 30, 32)        896       
_________________________________________________________________
activation_13 (Activation)   (None, 30, 30, 32)        0         
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 28, 28, 64)        18496     
_________________________________________________________________
activation_14 (Activation)   (None, 28, 28, 64)        0         
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 14, 14, 64)        0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 14, 14, 64)        0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 12544)             0         
__________

In [89]:
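for layer in model.layers:
    print(layer.name)
    if 'dense' not in layer.name:
        # Freeze everything except the dense layers
        layer.trainable = False
        print(layer.name + 'is not trainable\n')
    else:
        print(layer.name + ' is trainable\n')

# Recompile so the updated trainable flags take effect during training
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])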

conv2d_14
conv2d_14is not trainable

activation_13
activation_13is not trainable

conv2d_15
conv2d_15is not trainable

activation_14
activation_14is not trainable

max_pooling2d_4
max_pooling2d_4is not trainable

dropout_3
dropout_3is not trainable

flatten_3
flatten_3is not trainable

dense_5
dense_5 is trainable

activation_15
activation_15is not trainable

dense_6
dense_6 is trainable

activation_16
activation_16is not trainable

dense_7
dense_7 is trainable

activation_17
activation_17is not trainable



### 5. Use the model trained on CIFAR10 (classes 0 to 4) to classify classes 5 to 9 of CIFAR10 (use transfer learning) <br>
Achieve an accuracy of more than 85% on test data

In [87]:
x_train_gt5_normalized=x_train_gt5.astype(float)/255

In [86]:
x_test_gt5_normalized=x_test_gt5.astype(float)/255

In [90]:
y_train_gt5_ohe=keras.utils.to_categorical(y_train_gt5)

In [91]:
y_test_gt5_ohe=keras.utils.to_categorical(y_test_gt5)

In [93]:
model.fit(x_train_gt5_normalized,y_train_gt5_ohe,validation_data=(x_test_gt5_normalized,y_test_gt5_ohe),epochs=5)


Train on 25000 samples, validate on 5000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x8a849d63c8>

## Sentiment analysis <br> 

The objective of the second problem is to perform sentiment analysis on tweets in which users express opinions about various mobile devices.
Based on the text of a tweet, we will classify whether the user's sentiment toward a particular mobile device is positive or not.

### 6. Read the dataset (tweets.csv) and drop the NA's while reading the dataset

In [95]:
import pandas as pd
data = pd.read_csv('./tweets.csv', encoding = "ISO-8859-1").dropna()

In [96]:
data.shape

(3291, 3)

In [97]:
data.head()

| | tweet_text | emotion_in_tweet_is_directed_at | is_there_an_emotion_directed_at_a_brand_or_product |
|---|---|---|---|
| 0 | .@wesley83 I have a 3G iPhone. After 3 hrs twe... | iPhone | Negative emotion |
| 1 | @jessedee Know about @fludapp ? Awesome iPad/i... | iPad or iPhone App | Positive emotion |
| 2 | @swonderlin Can not wait for #iPad 2 also. The... | iPad | Positive emotion |
| 3 | @sxsw I hope this year's festival isn't as cra... | iPad or iPhone App | Negative emotion |
| 4 | @sxtxstate great stuff on Fri #SXSW: Marissa M... | Google | Positive emotion |


### Consider only rows having Positive emotion and Negative emotion and remove other rows from the dataframe.

In [98]:
data = data[(data['is_there_an_emotion_directed_at_a_brand_or_product'] == 'Positive emotion') | (data['is_there_an_emotion_directed_at_a_brand_or_product'] == 'Negative emotion')]

In [99]:
data.shape

(3191, 3)

### 7. Represent text as numerical data using `CountVectorizer` and get the document term frequency matrix

#### Use `vect` as the variable name for initialising CountVectorizer.

In [101]:
from sklearn.feature_extraction.text import CountVectorizer

In [102]:
vect=CountVectorizer()

In [103]:
vect.fit(data['tweet_text'])

CountVectorizer(analyzer='word', binary=False, decode_error='strict',
        dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
        lowercase=True, max_df=1.0, max_features=None, min_df=1,
        ngram_range=(1, 1), preprocessor=None, stop_words=None,
        strip_accents=None, token_pattern='(?u)\\b\\w\\w+\\b',
        tokenizer=None, vocabulary=None)
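
To make the document-term matrix concrete, here is a minimal pure-Python sketch of the counting step. It mirrors two of the defaults printed above (`lowercase=True` and a token pattern that keeps runs of two or more word characters) but is only an approximation of `CountVectorizer`; the function name `count_matrix` and the sample sentences are made up for illustration.

```python
import re
from collections import Counter

# Same pattern as the vectorizer's default token_pattern: 2+ word characters
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

def count_matrix(docs):
    """Return (vocabulary, rows) where rows[i][j] counts vocabulary[j] in docs[i]."""
    tokenized = [TOKEN_PATTERN.findall(doc.lower()) for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows

docs = ["I love my iPad", "My iPhone battery died, my bad"]
vocabulary, rows = count_matrix(docs)
print(vocabulary)  # ['bad', 'battery', 'died', 'ipad', 'iphone', 'love', 'my']
print(rows)        # [[0, 0, 0, 1, 0, 1, 1], [1, 1, 1, 0, 1, 0, 2]]
```

Each row is one document and each column one vocabulary term, which is why the number of columns equals the vocabulary size.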

### 8. Find number of different words in vocabulary

In [104]:
# vocabulary_ maps each learned term to its column index, so its length is the vocabulary size
len(vect.vocabulary_)

5648

#### Tip: To see all available attributes and methods of an object, use `dir`

In [106]:
dir(vect)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_char_ngrams',
 '_char_wb_ngrams',
 '_check_stop_words_consistency',
 '_check_vocabulary',
 '_count_vocab',
 '_get_param_names',
 '_limit_features',
 '_sort_features',
 '_stop_words_id',
 '_validate_params',
 '_validate_vocabulary',
 '_white_spaces',
 '_word_ngrams',
 'analyzer',
 'binary',
 'build_analyzer',
 'build_preprocessor',
 'build_tokenizer',
 'decode',
 'decode_error',
 'dtype',
 'encoding',
 'fit',
 'fit_transform',
 'fixed_vocabulary_',
 'get_feature_names',
 'get_params',
 'get_stop_words',
 'input',
 'inverse_transform',
 'lowercase',
 'max_df',
 'max_features',
 'min_df',


### Find out how many Positive and Negative emotions are there.

Hint: Use value_counts on that column

In [107]:
data['is_there_an_emotion_directed_at_a_brand_or_product'].value_counts()

Positive emotion    2672
Negative emotion     519
Name: is_there_an_emotion_directed_at_a_brand_or_product, dtype: int64
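
These counts show a strong class imbalance, which matters when reading the accuracy scores later on. As a rough reference point, the sketch below computes the accuracy of a trivial classifier that always predicts the majority class over the full filtered dataset; the exact baseline on a particular test split may differ slightly.

```python
counts = {"Positive emotion": 2672, "Negative emotion": 519}  # from value_counts above

total = sum(counts.values())
majority_label = max(counts, key=counts.get)
baseline_accuracy = counts[majority_label] / total

print(total)                        # 3191
print(majority_label)               # Positive emotion
print(round(baseline_accuracy, 4))  # 0.8374
```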

### Change the labels for Positive and Negative emotions to 1 and 0 respectively, and store them in a new column named 'label' in the same dataframe

Hint: use map on that column and give labels

In [108]:
data['label'] = data.is_there_an_emotion_directed_at_a_brand_or_product.map({'Positive emotion':1, 'Negative emotion':0})

### 9. Define the feature set (independent variable or X) to be the `tweet_text` column and `label` as the target (or dependent variable), and divide into train and test datasets

In [112]:
from sklearn.model_selection import train_test_split

In [123]:
X_train,X_test,Y_train,Y_test=train_test_split(data['tweet_text'],data['label'],test_size=0.2,random_state=1)

In [115]:
vect.fit(X_train)

CountVectorizer(analyzer='word', binary=False, decode_error='strict',
        dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
        lowercase=True, max_df=1.0, max_features=None, min_df=1,
        ngram_range=(1, 1), preprocessor=None, stop_words=None,
        strip_accents=None, token_pattern='(?u)\\b\\w\\w+\\b',
        tokenizer=None, vocabulary=None)

In [116]:
# Vocabulary size after refitting on the training split only
len(vect.vocabulary_)

5035

In [117]:
X=vect.transform(X_train)

In [124]:
Y_train.shape

(2552,)
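
The training size follows from `test_size=0.2` applied to the 3191 filtered rows. The sketch below reproduces the arithmetic, assuming the test share is rounded up and the remainder goes to the training set (an assumption that is consistent with the 2552 training rows reported above):

```python
import math

n_samples = 3191
test_size = 0.2

n_test = math.ceil(test_size * n_samples)   # 638.2 rounds up
n_train = n_samples - n_test

print(n_test, n_train)  # 639 2552
```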

In [125]:
test=vect.transform(X_test)

## 10. **Predicting the sentiment:**


### Use Naive Bayes and Logistic Regression to predict the sentiment of the given text and compare their accuracy scores

In [119]:
from sklearn.naive_bayes import MultinomialNB
nb=MultinomialNB()

In [126]:
nb.fit(X,Y_train)

MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)

In [127]:
from sklearn.metrics import accuracy_score

In [130]:
y_pred_nb=nb.predict(test)

In [131]:
accuracy_score(Y_test,y_pred_nb)

0.8325508607198748

In [133]:
#Logistic Regression
from sklearn.linear_model import LogisticRegression
lr=LogisticRegression()

In [134]:
lr.fit(X,Y_train)



LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='warn',
          n_jobs=None, penalty='l2', random_state=None, solver='warn',
          tol=0.0001, verbose=0, warm_start=False)

In [135]:
y_pred_lr=lr.predict(test)

In [136]:
accuracy_score(Y_test,y_pred_lr)

0.8528951486697965

## 11. Create a function called `tokenize_test` which takes a count vectorizer object as input and prints the accuracy for x (text) and y (labels)

In [143]:
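def tokenize_test(vect):
    """Fit `vect` on X_train, train Naive Bayes, and print the feature count and test accuracy."""
    # Learn the vocabulary on the training text only, then build the train/test matrices
    x_train_dtm = vect.fit_transform(X_train)
    print('Features: ', x_train_dtm.shape[1])
    x_test_dtm = vect.transform(X_test)
    # Train and evaluate a multinomial Naive Bayes classifier on the token counts
    nb = MultinomialNB()
    nb.fit(x_train_dtm, Y_train)
    y_pred_class = nb.predict(x_test_dtm)
    print('Accuracy: ', accuracy_score(Y_test, y_pred_class))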

### Create a count vectorizer with `ngram_range=(1, 2)` and pass it to the `tokenize_test` function to print the accuracy score

In [144]:
# include 1-grams and 2-grams
vect = CountVectorizer(ngram_range=(1, 2))
tokenize_test(vect)

Features:  25815
Accuracy:  0.838810641627543


### 12. Create a count vectorizer with `stop_words='english'` and pass it to the `tokenize_test` function to print the accuracy score

In [151]:
vect = CountVectorizer(stop_words = 'english')
tokenize_test(vect)

Features:  4796
Accuracy:  0.838810641627543


### 13. Create a count vectorizer with `stop_words='english'` and `max_features=300` and pass it to the `tokenize_test` function to print the accuracy score

In [150]:
vect = CountVectorizer(stop_words = 'english',max_features=300)
tokenize_test(vect)

Features:  300
Accuracy:  0.7996870109546166


### 14. Create a count vectorizer with `ngram_range=(1, 2)` and `max_features=15000` and pass it to the `tokenize_test` function to print the accuracy score

In [149]:
vect = CountVectorizer(ngram_range=(1, 2),max_features=15000)
tokenize_test(vect)

Features:  15000
Accuracy:  0.8341158059467919


### 15. Create a count vectorizer with `ngram_range=(1, 2)` that keeps only terms appearing in at least 2 documents (`min_df=2`) and pass it to the `tokenize_test` function to print the accuracy score

In [152]:
vect = CountVectorizer(ngram_range=(1, 2),min_df=2)
tokenize_test(vect)

Features:  8298
Accuracy:  0.8435054773082942
