# Replication Summary

## Part A - Baselines
---

### TF-IDF + Balanced SVM

__Command:__
`!python tfidf.py -m tfidf_svm --kernel rbf --class_weight balanced --max_ngram 3 --jobs -2 --seed 42 --tokenizer glove`

__Result:__ 

replicated (vs. original)

Precision(avg): 0.823 (vs. 0.816)

Recall(avg): 0.826 (vs. 0.816)

F1-score(avg): 0.823 (vs. 0.816)

__Reflection:__ Looks good!
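For context on `--class_weight balanced`: assuming `tfidf.py` forwards this flag to scikit-learn's `class_weight="balanced"` option (an assumption, since the script itself is not shown here), each class is weighted by `n_samples / (n_classes * class_count)`. Below is a minimal sketch of that formula. The label counts are the class supports of one test fold printed later in this notebook (1155 / 205 / 331) and serve only as an illustrative distribution.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Per-class weight n_samples / (n_classes * count), i.e. the heuristic
    scikit-learn documents for class_weight="balanced"."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Illustrative label distribution (supports of one test fold, not the training set).
labels = [0] * 1155 + [1] * 205 + [2] * 331
weights = balanced_class_weights(labels)
for cls in sorted(weights):
    print(cls, round(weights[cls], 3))
```

Under this weighting the two minority classes count roughly 3.5 to 5.6 times as much per sample as the majority class.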


In [13]:
!python tfidf.py -m tfidf_svm --kernel rbf --class_weight balanced --max_ngram 3 --jobs -2 --seed 42 --tokenizer glove

Max-ngram-length: 3
Found 16907 texts. (samples)
Model Type: tfidf_svm
[Parallel(n_jobs=-2)]: Using backend LokyBackend with 3 concurrent workers.
[CV]  ................................................................
[CV]  ................................................................
[CV]  ................................................................
[CV] ................................................. , total= 2.9min
[CV]  ................................................................
[CV] ................................................. , total= 2.9min
[CV]  ................................................................
[CV] ................................................. , total= 3.0min
[CV]  ................................................................
[CV] ................................................. , total= 3.6min
[CV]  ................................................................
[CV] ................................................. , total= 3.6min
[

### TF-IDF + GBDT

__Command:__
`!python tfidf.py -m tfidf_gradient_boosting --max_ngram 3 --loss deviance --estimators 100 --jobs -2 --seed 42 --tokenizer glove `

__Result:__ 

replicated (vs. original)

Precision(avg): 0.826 (vs. 0.819)

Recall(avg): 0.817 (vs. 0.807)

F1-score(avg): 0.801 (vs. 0.813)

__Reflection:__
Close enough: precision and recall are slightly above the reported values, while F1 is about 0.01 below. Note that the second run below enables inverse-document-frequency reweighting (`--use-inverse-doc-freq`), which does not affect the results.
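As a reminder of what inverse-document-frequency reweighting does, here is a minimal sketch on a toy corpus. It uses the smoothed formula `ln((1 + n_docs) / (1 + df)) + 1` that scikit-learn documents for `TfidfTransformer`. Whether `--use-inverse-doc-freq` maps onto exactly this option inside `tfidf.py` is an assumption. The toy documents and the helper names `idf_weights` and `tf_vector` are made up for illustration.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Smoothed IDF per term: ln((1 + n_docs) / (1 + df)) + 1."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}


def tf_vector(doc, idf=None):
    """Raw term counts, optionally rescaled by IDF (no row normalization)."""
    tf = Counter(doc)
    return {t: c * (idf[t] if idf else 1.0) for t, c in tf.items()}


docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "bird"]]
idf = idf_weights(docs)
print(tf_vector(docs[0]))       # plain term counts
print(tf_vector(docs[0], idf))  # rare terms such as "cat" are weighted up
```

IDF multiplies each term by a fixed positive factor. One possible, unverified explanation for the unchanged scores is that tree-based learners split on thresholds and are largely insensitive to such rescaling.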

In [2]:
!python tfidf.py -m tfidf_gradient_boosting --max_ngram 3 --loss deviance --estimators 100 --jobs -2 --seed 42 --tokenizer glove 

Max-ngram-length: 3
Found 16907 texts. (samples)
Model Type: tfidf_gradient_boosting
[Parallel(n_jobs=-2)]: Using backend LokyBackend with 3 concurrent workers.
[CV]  ................................................................
[CV]  ................................................................
[CV]  ................................................................
[CV] ................................................. , total=15.1min
[CV]  ................................................................
[CV] ................................................. , total=15.2min
[CV]  ................................................................
[CV] ................................................. , total=15.4min
[CV]  ................................................................
[CV] ................................................. , total=15.2min
[CV]  ................................................................
[CV] ................................................. , t

In [4]:
!python tfidf.py -m tfidf_gradient_boosting --max_ngram 3 --loss deviance --estimators 100 --jobs -2 --seed 42 --tokenizer glove --use-inverse-doc-freq

Max-ngram-length: 3
Found 16907 texts. (samples)
Model Type: tfidf_gradient_boosting
[Parallel(n_jobs=-2)]: Using backend LokyBackend with 3 concurrent workers.
[CV]  ................................................................
[CV]  ................................................................
[CV]  ................................................................
[CV] ................................................. , total=15.3min
[CV]  ................................................................
[CV] ................................................. , total=15.8min
[CV]  ................................................................
[CV] ................................................. , total=16.1min
[CV]  ................................................................
[CV] ................................................. , total=15.1min
[CV]  ................................................................
[CV] ................................................. , t

### BoWV + Balanced SVM 

__Command:__
`!python BoWV.py -m svm --kernel rbf --class_weight balanced --jobs -2 -f GloVe/glove.twitter.27B.200d.txt -d 200 --seed 42 --tokenizer glove `

__Result:__ 

replicated (vs. original)

Precision(avg): 0.758 (vs. 0.791)

Recall(avg): 0.692 (vs. 0.788)

F1-score(avg): 0.705 (vs. 0.789)

__Reflection:__ SVMs are very sensitive to the choice of hyperparameters, and with the default `rbf` kernel settings the performance is much worse than reported in the paper. However, the same SVM configuration performs well on the TF-IDF features, and the gradient boosting estimator also underperforms on the BoWV features. The kernel alone is therefore unlikely to explain the gap, and we have reason to believe that something is wrong with the BoWV results rather than with the classifier.
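For reference, a bag-of-word-vectors representation is commonly built by averaging the embeddings of a tweet's tokens. The sketch below shows that idea on a toy embedding table. Whether `BoWV.py` does exactly this, and how it treats out-of-vocabulary tokens, is an assumption. The function name `average_embedding` and the 3-dimensional vectors are hypothetical stand-ins for the 200-d GloVe vectors.

```python
def average_embedding(tokens, embeddings, dim):
    """Mean of the vectors of in-vocabulary tokens; zeros if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


# Toy 3-dimensional table standing in for the 200-d GloVe vectors.
toy_embeddings = {
    "good": [1.0, 0.0, 2.0],
    "day": [3.0, 2.0, 0.0],
}
features = average_embedding(["good", "day", "unknownword"], toy_embeddings, dim=3)
print(features)
```

If tokenization or out-of-vocabulary handling differs from the original setup, the features change while the classifier stays the same, which would be consistent with both BoWV models falling short.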

In [12]:
!python BoWV.py -m svm --kernel rbf --class_weight balanced --jobs -2 -f GloVe/glove.twitter.27B.200d.txt -d 200 --seed 42 --tokenizer glove 

GLOVE embedding: GloVe/glove.twitter.27B.200d.txt
Embedding Dimension: 200
GloVe model loaded successfully.
Tweets selected: 16905
Features and labels loaded from pickled files.
Model Type: svm
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  10 out of  10 | elapsed: 12.8min finished
Precision(avg): 0.758 (+/- 0.017)
Recall(avg): 0.692 (+/- 0.022)
F1-score(avg): 0.705 (+/- 0.021)


### BoWV + GBDT

__Command:__
`!python BoWV.py -m gradient_boosting --loss deviance --estimators 500 --jobs -2 -f GloVe/glove.twitter.27B.200d.txt -d 200 --seed 42 --tokenizer glove `

__Result:__ 

replicated (vs. original)

Precision(avg): 0.765 (vs. 0.800)

Recall(avg): 0.774 (vs. 0.802)

F1-score(avg): 0.759 (vs. 0.801)

__Reflection:__
Similar to the above: all three scores fall below the reported values, although the gap is smaller than for the SVM.

In [1]:
!python BoWV.py -m gradient_boosting --loss deviance --estimators 500 --jobs -2 -f GloVe/glove.twitter.27B.200d.txt -d 200 --seed 42 --tokenizer glove 

GLOVE embedding: GloVe/glove.twitter.27B.200d.txt
Embedding Dimension: 200
GloVe model loaded successfully.
Tweets selected: 16905
Features and labels loaded from pickled files.
Model Type: gradient_boosting
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  10 out of  10 | elapsed: 19.6min finished
Precision(avg): 0.765 (+/- 0.013)
Recall(avg): 0.774 (+/- 0.012)
F1-score(avg): 0.759 (+/- 0.014)


## Part B - DNNs only
---
### CNN + Random Embedding

__Command:__
`!python cnn.py -f GloVe/glove.twitter.27B.200d.txt -d 200 --tokenizer glove --loss categorical_crossentropy --optimizer adam --epochs 10 --batch-size 128 --folds 10 --initialize-weights random --learn-embeddings`

__Result:__ 

replicated (vs. original)

Precision(avg): 0.816 (vs. 0.813)

Recall(avg): 0.816 (vs. 0.816)

F1-score(avg): 0.816 (vs. 0.814)

__Reflection:__
Looks good! Note that the authors do not state how many epochs they trained the networks for; I trained for 10 epochs.
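The per-fold reports in the output of this run list micro, macro and weighted averages. As a reminder of how the last two differ, here is a minimal sketch that recomputes them from the rounded per-class recall values and supports of one fold shown in this section (0.88 / 0.72 / 0.63 with supports 1155 / 205 / 331). Which of these averages the script's `Recall(avg)` line corresponds to is not stated in this summary.

```python
def macro_average(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_average(values, supports):
    """Mean over classes, weighted by the number of samples per class."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)


recall = [0.88, 0.72, 0.63]   # per-class recall, as printed (rounded)
support = [1155, 205, 331]    # per-class sample counts in that fold

macro = round(macro_average(recall), 2)
weighted = round(weighted_average(recall, support), 2)
print(macro, weighted)
```

Because the inputs are already rounded to two decimals, recomputed averages can differ from the printed ones in the last digit.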

In [6]:
!python cnn.py -f GloVe/glove.twitter.27B.200d.txt -d 200 --tokenizer glove --loss categorical_crossentropy --optimizer adam --epochs 10 --batch-size 128 --folds 10 --initialize-weights random --learn-embeddings

Using TensorFlow backend.
GLOVE embedding: GloVe/glove.twitter.27B.200d.txt
Embedding Dimension: 200
Allowing embedding learning: True
Tweets loaded from pickled file.
Tweets selected: 16905
Vocabs loaded from pickled files.
X and y loaded from pickled files.
max seq length is 28
3450 embedding missed
Model variation is CNN-rand
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
embedding_1 (Embedding)          (None, 28, 200)       3402800     embedding_input_1[0][0]          
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 28, 200)       0           embedding_1[0][0]                
____________________________________________________________________________________________________
model_1 (Model)                  (None, 300)           240300  

Epoch: 8/10.	Batch: 35.	Loss: 0.048335.	Accuracy: 0.983929
Epoch: 8/10.	Batch: 70.	Loss: 0.047847.	Accuracy: 0.984263
Epoch: 8/10.	Batch: 105.	Loss: 0.045784.	Accuracy: 0.984970
Epoch: 9/10.	Batch: 35.	Loss: 0.045184.	Accuracy: 0.986830
Epoch: 9/10.	Batch: 70.	Loss: 0.041895.	Accuracy: 0.987388
Epoch: 9/10.	Batch: 105.	Loss: 0.040375.	Accuracy: 0.987574

              precision    recall  f1-score   support

           0       0.85      0.88      0.86      1155
           1       0.72      0.72      0.72       205
           2       0.70      0.63      0.66       331

   micro avg       0.81      0.81      0.81      1691
   macro avg       0.76      0.74      0.75      1691
weighted avg       0.81      0.81      0.81      1691

Epoch: 0/10.	Batch: 35.	Loss: 0.873144.	Accuracy: 0.660268
Epoch: 0/10.	Batch: 70.	Loss: 0.758676.	Accuracy: 0.707924
Epoch: 0/10.	Batch: 105.	Loss: 0.685781.	Accuracy: 0.737798
Epoch: 1/10.	Batch: 35.	Loss: 0.335062.	Accuracy: 0.872991
Epoch: 1/10.	Batch: 70.	L

Epoch: 5/10.	Batch: 105.	Loss: 0.112313.	Accuracy: 0.960119
Epoch: 6/10.	Batch: 35.	Loss: 0.091431.	Accuracy: 0.968750
Epoch: 6/10.	Batch: 70.	Loss: 0.095103.	Accuracy: 0.966964
Epoch: 6/10.	Batch: 105.	Loss: 0.092931.	Accuracy: 0.968229
Epoch: 7/10.	Batch: 35.	Loss: 0.083757.	Accuracy: 0.970536
Epoch: 7/10.	Batch: 70.	Loss: 0.081412.	Accuracy: 0.971205
Epoch: 7/10.	Batch: 105.	Loss: 0.077920.	Accuracy: 0.972098
Epoch: 8/10.	Batch: 35.	Loss: 0.074322.	Accuracy: 0.977679
Epoch: 8/10.	Batch: 70.	Loss: 0.073357.	Accuracy: 0.977121
Epoch: 8/10.	Batch: 105.	Loss: 0.070471.	Accuracy: 0.977753
Epoch: 9/10.	Batch: 35.	Loss: 0.063309.	Accuracy: 0.979018
Epoch: 9/10.	Batch: 70.	Loss: 0.064530.	Accuracy: 0.979799
Epoch: 9/10.	Batch: 105.	Loss: 0.060675.	Accuracy: 0.981101

              precision    recall  f1-score   support

           0       0.87      0.88      0.87      1150
           1       0.71      0.77      0.74       209
           2       0.74      0.69      0.72       331

   micro 

### CNN + GloVe

__Command:__
`!python cnn.py -f GloVe/glove.twitter.27B.200d.txt -d 200 --tokenizer glove --loss categorical_crossentropy --optimizer adam --epochs 10 --batch-size 128 --folds 10 --initialize-weights glove --learn-embeddings`

__Result:__ 

replicated (vs. original)

Precision(avg): 0.836 (vs. 0.839)

Recall(avg): 0.837 (vs. 0.840)

F1-score(avg): 0.835 (vs. 0.839)

__Reflection:__
Close enough. Note that the authors do not state how many epochs they trained the networks for; I trained for 10 epochs.

In [7]:
!python cnn.py -f GloVe/glove.twitter.27B.200d.txt -d 200 --tokenizer glove --loss categorical_crossentropy --optimizer adam --epochs 10 --batch-size 128 --folds 10 --initialize-weights glove --learn-embeddings

Using TensorFlow backend.
GLOVE embedding: GloVe/glove.twitter.27B.200d.txt
Embedding Dimension: 200
Allowing embedding learning: True
Tweets loaded from pickled file.
Tweets selected: 16905
Vocabs loaded from pickled files.
X and y loaded from pickled files.
max seq length is 28
3450 embedding missed
Model variation is CNN-rand
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
embedding_1 (Embedding)          (None, 28, 200)       3402800     embedding_input_1[0][0]          
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 28, 200)       0           embedding_1[0][0]                
____________________________________________________________________________________________________
model_1 (Model)                  (None, 300)           240300  

Epoch: 8/10.	Batch: 35.	Loss: 0.157955.	Accuracy: 0.942411
Epoch: 8/10.	Batch: 70.	Loss: 0.150249.	Accuracy: 0.945871
Epoch: 8/10.	Batch: 105.	Loss: 0.150660.	Accuracy: 0.944940
Epoch: 9/10.	Batch: 35.	Loss: 0.137948.	Accuracy: 0.948438
Epoch: 9/10.	Batch: 70.	Loss: 0.137960.	Accuracy: 0.949554
Epoch: 9/10.	Batch: 105.	Loss: 0.138997.	Accuracy: 0.948363

              precision    recall  f1-score   support

           0       0.86      0.90      0.88      1155
           1       0.77      0.77      0.77       205
           2       0.75      0.61      0.68       331

   micro avg       0.83      0.83      0.83      1691
   macro avg       0.79      0.76      0.77      1691
weighted avg       0.83      0.83      0.83      1691

Epoch: 0/10.	Batch: 35.	Loss: 1.044777.	Accuracy: 0.681696
Epoch: 0/10.	Batch: 70.	Loss: 0.878581.	Accuracy: 0.714844
Epoch: 0/10.	Batch: 105.	Loss: 0.789872.	Accuracy: 0.731845
Epoch: 1/10.	Batch: 35.	Loss: 0.463239.	Accuracy: 0.815625
Epoch: 1/10.	Batch: 70.	L

Epoch: 5/10.	Batch: 105.	Loss: 0.276481.	Accuracy: 0.892560
Epoch: 6/10.	Batch: 35.	Loss: 0.248141.	Accuracy: 0.906027
Epoch: 6/10.	Batch: 70.	Loss: 0.250129.	Accuracy: 0.903571
Epoch: 6/10.	Batch: 105.	Loss: 0.248791.	Accuracy: 0.902976
Epoch: 7/10.	Batch: 35.	Loss: 0.216819.	Accuracy: 0.914509
Epoch: 7/10.	Batch: 70.	Loss: 0.211454.	Accuracy: 0.917634
Epoch: 7/10.	Batch: 105.	Loss: 0.212014.	Accuracy: 0.916964
Epoch: 8/10.	Batch: 35.	Loss: 0.194787.	Accuracy: 0.924330
Epoch: 8/10.	Batch: 70.	Loss: 0.193829.	Accuracy: 0.928125
Epoch: 8/10.	Batch: 105.	Loss: 0.193040.	Accuracy: 0.928869
Epoch: 9/10.	Batch: 35.	Loss: 0.165726.	Accuracy: 0.937500
Epoch: 9/10.	Batch: 70.	Loss: 0.167136.	Accuracy: 0.935603
Epoch: 9/10.	Batch: 105.	Loss: 0.165930.	Accuracy: 0.935863

              precision    recall  f1-score   support

           0       0.87      0.90      0.88      1150
           1       0.73      0.82      0.77       209
           2       0.78      0.63      0.70       331

   micro 

### LSTM + Random Embedding

__Command:__
`!python lstm.py -f GloVe/glove.twitter.27B.200d.txt -d 200 --tokenizer glove --loss categorical_crossentropy --optimizer adam --initialize-weights random --learn-embeddings --epochs 10 --batch-size 128`

__Result:__ 

replicated (vs. original)

Precision(avg): 0.802 (vs. 0.805)

Recall(avg): 0.805 (vs. 0.804)

F1-score(avg): 0.803 (vs. 0.804)

__Reflection:__
It checks out.

In [17]:
!python lstm.py -f GloVe/glove.twitter.27B.200d.txt -d 200 --tokenizer glove --loss categorical_crossentropy --optimizer adam --initialize-weights random --learn-embeddings --epochs 10 --batch-size 128

Using TensorFlow backend.
GLOVE embedding: GloVe/glove.twitter.27B.200d.txt
Embedding Dimension: 200
Allowing embedding learning: True
Tweets loaded from pickled file.
Tweets selected: 16905
Vocabs loaded from pickled files.
X and y loaded from pickled files.
max seq length is 28
3450 embedding missed
Model variation is LSTM
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
embedding_1 (Embedding)          (None, 28, 200)       3402800     embedding_input_1[0][0]          
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 28, 200)       0           embedding_1[0][0]                
____________________________________________________________________________________________________
lstm_1 (LSTM)                    (None, 50)            50200       

Epoch: 9/10.	Batch: 70.	Loss: 0.050759.	Accuracy: 0.983036
Epoch: 9/10.	Batch: 105.	Loss: 0.046350.	Accuracy: 0.984226
              precision    recall  f1-score   support

           0       0.83      0.89      0.86      1155
           1       0.74      0.56      0.64       205
           2       0.69      0.61      0.65       331

   micro avg       0.80      0.80      0.80      1691
   macro avg       0.75      0.69      0.72      1691
weighted avg       0.79      0.80      0.79      1691

Epoch: 0/10.	Batch: 35.	Loss: 0.786997.	Accuracy: 0.669643
Epoch: 0/10.	Batch: 70.	Loss: 0.672668.	Accuracy: 0.725000
Epoch: 0/10.	Batch: 105.	Loss: 0.606243.	Accuracy: 0.753646
Epoch: 1/10.	Batch: 35.	Loss: 0.308503.	Accuracy: 0.880134
Epoch: 1/10.	Batch: 70.	Loss: 0.284786.	Accuracy: 0.892746
Epoch: 1/10.	Batch: 105.	Loss: 0.265689.	Accuracy: 0.899777
Epoch: 2/10.	Batch: 35.	Loss: 0.180018.	Accuracy: 0.938393
Epoch: 2/10.	Batch: 70.	Loss: 0.172317.	Accuracy: 0.939955
Epoch: 2/10.	Batch: 105.	L

Epoch: 7/10.	Batch: 35.	Loss: 0.073140.	Accuracy: 0.975000
Epoch: 7/10.	Batch: 70.	Loss: 0.072192.	Accuracy: 0.975000
Epoch: 7/10.	Batch: 105.	Loss: 0.069100.	Accuracy: 0.976042
Epoch: 8/10.	Batch: 35.	Loss: 0.064909.	Accuracy: 0.980580
Epoch: 8/10.	Batch: 70.	Loss: 0.063399.	Accuracy: 0.981250
Epoch: 8/10.	Batch: 105.	Loss: 0.058970.	Accuracy: 0.982143
Epoch: 9/10.	Batch: 35.	Loss: 0.050986.	Accuracy: 0.980134
Epoch: 9/10.	Batch: 70.	Loss: 0.051251.	Accuracy: 0.981585
Epoch: 9/10.	Batch: 105.	Loss: 0.049634.	Accuracy: 0.982068
              precision    recall  f1-score   support

           0       0.85      0.87      0.86      1150
           1       0.74      0.68      0.71       209
           2       0.66      0.65      0.66       331

   micro avg       0.80      0.80      0.80      1690
   macro avg       0.75      0.73      0.74      1690
weighted avg       0.80      0.80      0.80      1690

Epoch: 0/10.	Batch: 35.	Loss: 0.877204.	Accuracy: 0.653795
Epoch: 0/10.	Batch: 70.	Lo

### LSTM + GloVe

__Command:__
`!python lstm.py -f GloVe/glove.twitter.27B.200d.txt -d 200 --tokenizer glove --loss categorical_crossentropy --optimizer adam --initialize-weights glove --learn-embeddings --epochs 10 --batch-size 128`

__Result:__ 

replicated (vs. original)

Precision(avg): 0.834 (vs. 0.807)

Recall(avg): 0.835 (vs. 0.809)

F1-score(avg): 0.836 (vs. 0.808)

__Reflection:__
The results are much better than reported in the paper. Note that the improvement with respect to the random initialization is similar to the one observed for CNNs.
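To make the comparison with the CNNs concrete, the following sketch computes the gain from GloVe initialization over random initialization for both architectures. It uses only the replicated precision, recall and F1 values listed in Part B; the helper name `glove_gain` is made up for illustration.

```python
# (precision, recall, F1) from the replicated column of Part B.
replicated = {
    ("CNN", "random"): (0.816, 0.816, 0.816),
    ("CNN", "glove"): (0.836, 0.837, 0.835),
    ("LSTM", "random"): (0.802, 0.805, 0.803),
    ("LSTM", "glove"): (0.834, 0.835, 0.836),
}


def glove_gain(model):
    """Difference (GloVe minus random) for precision, recall and F1."""
    rand = replicated[(model, "random")]
    glove = replicated[(model, "glove")]
    return tuple(round(g - r, 3) for g, r in zip(glove, rand))


for model in ("CNN", "LSTM"):
    print(model, glove_gain(model))
```

With these numbers the gain is about 0.02 on every metric for the CNN and about 0.03 for the LSTM.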

In [23]:
!python lstm.py -f GloVe/glove.twitter.27B.200d.txt -d 200 --tokenizer glove --loss categorical_crossentropy --optimizer adam --initialize-weights glove --learn-embeddings --epochs 10 --batch-size 128

Using TensorFlow backend.
GLOVE embedding: GloVe/glove.twitter.27B.200d.txt
Embedding Dimension: 200
Allowing embedding learning: True
Tweets loaded from pickled file.
Tweets selected: 16905
Vocabs loaded from pickled files.
X and y loaded from pickled files.
max seq length is 28
3450 embedding missed
Model variation is LSTM
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
embedding_1 (Embedding)          (None, 28, 200)       3402800     embedding_input_1[0][0]          
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 28, 200)       0           embedding_1[0][0]                
____________________________________________________________________________________________________
lstm_1 (LSTM)                    (None, 50)            50200       

Epoch: 9/10.	Batch: 70.	Loss: 0.138503.	Accuracy: 0.952121
Epoch: 9/10.	Batch: 105.	Loss: 0.133872.	Accuracy: 0.951637
              precision    recall  f1-score   support

           0       0.87      0.89      0.88      1155
           1       0.71      0.78      0.74       205
           2       0.75      0.66      0.70       331

   micro avg       0.83      0.83      0.83      1691
   macro avg       0.78      0.78      0.78      1691
weighted avg       0.83      0.83      0.83      1691

Epoch: 0/10.	Batch: 35.	Loss: 0.866980.	Accuracy: 0.646652
Epoch: 0/10.	Batch: 70.	Loss: 0.761629.	Accuracy: 0.688504
Epoch: 0/10.	Batch: 105.	Loss: 0.700121.	Accuracy: 0.715179
Epoch: 1/10.	Batch: 35.	Loss: 0.489990.	Accuracy: 0.803571
Epoch: 1/10.	Batch: 70.	Loss: 0.480168.	Accuracy: 0.809152
Epoch: 1/10.	Batch: 105.	Loss: 0.468053.	Accuracy: 0.813318
Epoch: 2/10.	Batch: 35.	Loss: 0.400154.	Accuracy: 0.843527
Epoch: 2/10.	Batch: 70.	Loss: 0.393794.	Accuracy: 0.848326
Epoch: 2/10.	Batch: 105.	L

Epoch: 7/10.	Batch: 35.	Loss: 0.190188.	Accuracy: 0.930580
Epoch: 7/10.	Batch: 70.	Loss: 0.191058.	Accuracy: 0.929464
Epoch: 7/10.	Batch: 105.	Loss: 0.185370.	Accuracy: 0.931399
Epoch: 8/10.	Batch: 35.	Loss: 0.167182.	Accuracy: 0.942187
Epoch: 8/10.	Batch: 70.	Loss: 0.165609.	Accuracy: 0.941295
Epoch: 8/10.	Batch: 105.	Loss: 0.161623.	Accuracy: 0.942113
Epoch: 9/10.	Batch: 35.	Loss: 0.148983.	Accuracy: 0.949330
Epoch: 9/10.	Batch: 70.	Loss: 0.145385.	Accuracy: 0.948772
Epoch: 9/10.	Batch: 105.	Loss: 0.142125.	Accuracy: 0.950298
              precision    recall  f1-score   support

           0       0.86      0.89      0.87      1150
           1       0.71      0.78      0.75       209
           2       0.76      0.63      0.69       331

   micro avg       0.82      0.82      0.82      1690
   macro avg       0.78      0.77      0.77      1690
weighted avg       0.82      0.82      0.82      1690

Epoch: 0/10.	Batch: 35.	Loss: 0.839351.	Accuracy: 0.659598
Epoch: 0/10.	Batch: 70.	Lo

## Part C - DNNs + GBDT Classifier
---
### CNN + GloVe + GBDT

__Command:__
`!python nn_classifier.py gradient_boosting cnn_glove.h5`

__Result:__ 

replicated (vs. original)

Precision(avg): 0.869 (vs. 0.864)

Recall(avg): 0.869 (vs. 0.864)

F1-score(avg): 0.864 (vs. 0.864)

__Reflection:__
It checks out.

In [28]:
!python nn_classifier.py gradient_boosting cnn_glove.h5

Using TensorFlow backend.
Embedding Dimension: 200
2019-10-06 21:48:48.024133: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2
Tweets loaded from pickled file.
Tweets selected: 16905
Model Type: gradient_boosting
[Parallel(n_jobs=-2)]: Using backend LokyBackend with 3 concurrent workers.
[Parallel(n_jobs=-2)]: Done  10 out of  10 | elapsed: 11.2min finished
Precision(avg): 0.869 (+/- 0.011)
Recall(avg): 0.869 (+/- 0.011)
F1-score(avg): 0.864 (+/- 0.012)


### CNN + Random Embedding + GBDT

__Command:__
`!python nn_classifier.py gradient_boosting cnn_random.h5`

__Result:__ 

replicated (vs. original)

Precision(avg): 0.913 (vs. 0.864)

Recall(avg): 0.913 (vs. 0.864)

F1-score(avg): 0.912 (vs. 0.864)

__Reflection:__ The results I obtained are much higher than those reported in the paper. It is also strange that the paper reports exactly the same results for both the GloVe and the random embedding initialization.

In [27]:
!python nn_classifier.py gradient_boosting cnn_random.h5

Using TensorFlow backend.
Embedding Dimension: 200
2019-10-06 21:38:10.343954: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2
Tweets loaded from pickled file.
Tweets selected: 16905
Model Type: gradient_boosting
[Parallel(n_jobs=-2)]: Using backend LokyBackend with 3 concurrent workers.
[Parallel(n_jobs=-2)]: Done  10 out of  10 | elapsed: 10.4min finished
Precision(avg): 0.913 (+/- 0.013)
Recall(avg): 0.913 (+/- 0.013)
F1-score(avg): 0.912 (+/- 0.013)


### LSTM + GloVe + GBDT

__Command:__
`!python nn_classifier.py gradient_boosting lstm_glove.h5`

__Result:__ 

replicated (vs. original)

Precision(avg): 0.865 (vs. 0.849)

Recall(avg): 0.864 (vs. 0.848)

F1-score(avg): 0.859 (vs. 0.848)

__Reflection:__ The results I obtained are slightly better.

In [30]:
!python nn_classifier.py gradient_boosting lstm_glove.h5

Using TensorFlow backend.
Embedding Dimension: 200
2019-10-06 22:11:47.580129: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2
Tweets loaded from pickled file.
Tweets selected: 16905
Model Type: gradient_boosting
[Parallel(n_jobs=-2)]: Using backend LokyBackend with 3 concurrent workers.
[Parallel(n_jobs=-2)]: Done  10 out of  10 | elapsed: 11.3min finished
Precision(avg): 0.865 (+/- 0.013)
Recall(avg): 0.864 (+/- 0.012)
F1-score(avg): 0.859 (+/- 0.013)


### LSTM + Random Embedding + GBDT

__Command:__
`!python nn_classifier.py gradient_boosting lstm_random.h5`

__Result:__ 

replicated (vs. original)

Precision(avg): 0.934 (vs. 0.930)

Recall(avg): 0.934 (vs. 0.930)

F1-score(avg): 0.933 (vs. 0.930)

__Reflection:__ It checks out.

In [29]:
!python nn_classifier.py gradient_boosting lstm_random.h5

Using TensorFlow backend.
Embedding Dimension: 200
2019-10-06 22:00:13.692928: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2
Tweets loaded from pickled file.
Tweets selected: 16905
Model Type: gradient_boosting
[Parallel(n_jobs=-2)]: Using backend LokyBackend with 3 concurrent workers.
[Parallel(n_jobs=-2)]: Done  10 out of  10 | elapsed: 11.3min finished
Precision(avg): 0.934 (+/- 0.012)
Recall(avg): 0.934 (+/- 0.012)
F1-score(avg): 0.933 (+/- 0.012)
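
To put the reflections side by side, the following sketch recomputes the F1 gap (replicated minus original) for every configuration in this notebook and flags the ones outside a tolerance. The 0.02 tolerance is an arbitrary threshold chosen only for illustration, and the helper name `f1_gaps` is made up.

```python
# (configuration, replicated F1, original F1), copied from the sections above.
F1_SCORES = [
    ("TF-IDF + Balanced SVM", 0.823, 0.816),
    ("TF-IDF + GBDT", 0.801, 0.813),
    ("BoWV + Balanced SVM", 0.705, 0.789),
    ("BoWV + GBDT", 0.759, 0.801),
    ("CNN + Random Embedding", 0.816, 0.814),
    ("CNN + GloVe", 0.835, 0.839),
    ("LSTM + Random Embedding", 0.803, 0.804),
    ("LSTM + GloVe", 0.836, 0.808),
    ("CNN + GloVe + GBDT", 0.864, 0.864),
    ("CNN + Random Embedding + GBDT", 0.912, 0.864),
    ("LSTM + GloVe + GBDT", 0.859, 0.848),
    ("LSTM + Random Embedding + GBDT", 0.933, 0.930),  # 0.933 per the run log
]
TOLERANCE = 0.02  # arbitrary threshold, chosen only for illustration


def f1_gaps(rows, tolerance):
    """Return (name, gap, flagged) triples with gap = replicated - original."""
    return [
        (name, round(rep - orig, 3), abs(rep - orig) > tolerance)
        for name, rep, orig in rows
    ]


gaps = f1_gaps(F1_SCORES, TOLERANCE)
for name, gap, flagged in gaps:
    marker = "  <-- outside tolerance" if flagged else ""
    print(f"{name:<32} {gap:+.3f}{marker}")
```

With this threshold, the flagged configurations are the two BoWV baselines (below the paper), LSTM + GloVe and CNN + Random Embedding + GBDT (both above the paper).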
