## Experimental Setting 1: Tf-idf for unigrams + linear SVM 
### Task 2: Classification Claims vs Premises

**Importing necessary packages**

In [1]:
import pandas as pd
import numpy as np
import string
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV

In [22]:
filename = '../../../data/sentence_db_candidate.csv'
df = pd.read_csv(filename)

In [23]:
comps = ['Claim', 'Premise']
# .copy() avoids a SettingWithCopyWarning when the 'Annotation' column is added later
df = df.loc[df['Component'].isin(comps)].copy()

**Mapping the labels "Claim" and "Premise" to numeric classes: 1 and 0, respectively**

In [24]:
classes = []

for s in df.Component:
    if s == 'Claim':
        classes.append(1.0)
    else:
        classes.append(0.0)

In [25]:
df['Annotation'] = classes
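
The same mapping can also be written as a single vectorized comparison. The sketch below is only an illustration: `toy` is a hypothetical stand-in for the filtered `df` and is not part of the original data.

```python
import pandas as pd

# Hypothetical toy frame standing in for the filtered `df`
toy = pd.DataFrame({'Component': ['Claim', 'Premise', 'Premise', 'Claim']})

# Boolean comparison cast to float: Claim -> 1.0, Premise -> 0.0
toy['Annotation'] = (toy['Component'] == 'Claim').astype(float)

print(toy['Annotation'].tolist())
```

This relies on the frame containing only the two labels, which the `isin(comps)` filter above is meant to guarantee.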

In [26]:
df.shape

(22280, 19)

**Simple preprocessing of sentences with lowercasing and punctuation removal**

In [27]:
def preproc(sentence):
    sentence = sentence.lower()
    sentence = ''.join([i for i in sentence if i not in string.punctuation])
    return sentence

In [28]:
df['Speech'] = df['Speech'].apply(preproc)
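
To see what this preprocessing does to a single sentence, here is a small sketch using `str.translate`, which should behave the same as the character-by-character join in `preproc`. The function name `preproc_translate` and the example sentence are made up for illustration.

```python
import string

# Translation table that deletes every ASCII punctuation character
_PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def preproc_translate(sentence):
    """Lowercase the sentence and strip ASCII punctuation."""
    return sentence.lower().translate(_PUNCT_TABLE)

example = "Well, I think we SHOULD cut taxes!"
cleaned = preproc_translate(example)
print(cleaned)
```

Note that apostrophes are also removed, so a contraction such as "don't" becomes "dont" and is treated as a single token by the vectorizer.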

**Splitting the data into three sets, identical to the authors' train, validation and test sets**

In [29]:
df_train = df[df['Set'] == 'TRAIN']
df_val = df[df['Set'] == 'VALIDATION']
df_test = df[df['Set'] == 'TEST']

**Separating the features and the target variable**

In [30]:
X_train = df_train.Speech
y_train = df_train.Annotation

X_test = df_test.Speech
y_test = df_test.Annotation

X_val = df_val.Speech
y_val = df_val.Annotation

**Initializing the tf-idf feature matrix. The vectorizer is fitted on the training set only; the validation and test sets are only transformed. We use the whole vocabulary, which here contains 9,006 words**

In [31]:
tfidf = TfidfVectorizer()

train_tfidf =  tfidf.fit_transform(X_train)
val_tfidf = tfidf.transform(X_val)
test_tfidf = tfidf.transform(X_test)

In [32]:
test_tfidf

<6575x9006 sparse matrix of type '<class 'numpy.float64'>'
	with 96527 stored elements in Compressed Sparse Row format>
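
The fit/transform split can be illustrated on a tiny corpus. This is a minimal sketch with made-up sentences (`train_docs` and `test_docs` are hypothetical): the vocabulary is learned from the training documents only, so the test matrix has the same number of columns and words unseen during fitting are dropped.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpora, not taken from the dataset
train_docs = ["we must cut taxes", "taxes fund schools"]
test_docs = ["schools need teachers"]

vec = TfidfVectorizer()
train_m = vec.fit_transform(train_docs)  # learns vocabulary and idf weights
test_m = vec.transform(test_docs)        # reuses them, learns nothing new

vocab_size = len(vec.vocabulary_)
print(train_m.shape, test_m.shape, vocab_size)
print(test_m.nnz)  # only "schools" is in the learned vocabulary
```

This is why `test_tfidf` above has 9006 columns: its width is fixed by the training vocabulary, regardless of which words occur in the test sentences.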

### Part 1: Replication

**First of all, repeating the authors' setting. Kernel is `linear`, penalty parameter `C=10`**

In [33]:
svm = SVC(kernel='linear', C=10, random_state=42)
svm.fit(train_tfidf, y_train)

SVC(C=10, kernel='linear', random_state=42)

In [34]:
y_pred_test_svm = svm.predict(test_tfidf)

**class 1 stands for CLAIM, class 0 stands for PREMISE**

In [35]:
target_names = ['class 0', 'class 1']
print(classification_report(y_test, y_pred_test_svm, target_names=target_names, digits=3))

              precision    recall  f1-score   support

     class 0      0.620     0.534     0.574      3214
     class 1      0.607     0.687     0.644      3361

    accuracy                          0.612      6575
   macro avg      0.613     0.610     0.609      6575
weighted avg      0.613     0.612     0.610      6575
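
The per-class precision and recall in the report are derived from the confusion matrix. `confusion_matrix` is imported at the top of the notebook but not used in this section, so the sketch below shows how it relates to recall on made-up labels (`y_true_toy` and `y_pred_toy` are hypothetical, not model output).

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 0 = premise, 1 = claim
y_true_toy = [0, 0, 1, 1, 1]
y_pred_toy = [0, 1, 1, 1, 0]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_toy, y_pred_toy, labels=[0, 1])
print(cm)

# Recall for class 1 = correctly predicted claims / all true claims
recall_claim = cm[1, 1] / cm[1].sum()
print(round(recall_claim, 3))
```

Applied to the real predictions, the same call would be `confusion_matrix(y_test, y_pred_test_svm)`.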





### Part 2: Hyperparameter tuning

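**First, initializing the parameter grid. We are tuning the parameters `C` and `gamma`. Note that `SVC` ignores `gamma` when `kernel='linear'`, so only `C` actually affects the results**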

In [39]:
param_grid = {'C': [0.1, 1, 5, 10], 'gamma': [1, 0.1, 0.01]}

In [40]:
grid = GridSearchCV(SVC(kernel='linear'), param_grid, refit=True, verbose=2)
grid.fit(val_tfidf, y_val)

Fitting 5 folds for each of 12 candidates, totalling 60 fits
[CV] C=0.1, gamma=1 ..................................................


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


[CV] ................................... C=0.1, gamma=1, total=   4.2s
[CV] C=0.1, gamma=1 ..................................................


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    4.2s remaining:    0.0s


[CV] ................................... C=0.1, gamma=1, total=   4.3s
[CV] C=0.1, gamma=1 ..................................................
[CV] ................................... C=0.1, gamma=1, total=   4.5s
[CV] C=0.1, gamma=1 ..................................................
[CV] ................................... C=0.1, gamma=1, total=   4.1s
[CV] C=0.1, gamma=1 ..................................................
[CV] ................................... C=0.1, gamma=1, total=   3.9s
[CV] C=0.1, gamma=0.1 ................................................
[CV] ................................. C=0.1, gamma=0.1, total=   3.3s
[CV] C=0.1, gamma=0.1 ................................................
[CV] ................................. C=0.1, gamma=0.1, total=   3.3s
[CV] C=0.1, gamma=0.1 ................................................
[CV] ................................. C=0.1, gamma=0.1, total=   3.4s
[CV] C=0.1, gamma=0.1 ................................................
[CV] .

[CV] ................................. C=10, gamma=0.01, total=   3.5s


[Parallel(n_jobs=1)]: Done  60 out of  60 | elapsed:  3.3min finished


GridSearchCV(estimator=SVC(kernel='linear'),
             param_grid={'C': [0.1, 1, 5, 10], 'gamma': [1, 0.1, 0.01]},
             verbose=2)

In [41]:
grid.best_params_

{'C': 1, 'gamma': 1}
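
Because `gamma` is a kernel coefficient for the `rbf`, `poly` and `sigmoid` kernels, it should not change a linear SVM at all. The sketch below checks this on synthetic data generated with `make_classification`; the data and variable names are assumptions for illustration, not part of the experiment.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic binary classification data (hypothetical, not the debate corpus)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Fit a linear SVM for each gamma value from the grid above
scores = []
for gamma in (1, 0.1, 0.01):
    clf = SVC(kernel='linear', C=1, gamma=gamma).fit(X, y)
    scores.append(clf.decision_function(X))

# If gamma is ignored, all decision values coincide
same = all(np.allclose(scores[0], s) for s in scores[1:])
print(same)
```

If the decision values coincide, all `gamma` candidates tie for a given `C`, and the `gamma` reported in `best_params_` is simply one of the tied values rather than a meaningful optimum.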


### Part 3: Training and testing with the best parameters

**As the tuning above shows, the best parameters are `C=1` and `gamma=1`. Now we train and test the model with this parameter setting**

In [42]:
svm_best = SVC(kernel='linear', C=1, gamma=1, random_state=42)
svm_best.fit(train_tfidf, y_train)

SVC(C=1, gamma=1, kernel='linear', random_state=42)

In [43]:
y_pred_test_svm_best = svm_best.predict(test_tfidf)

In [44]:
target_names = ['class 0', 'class 1']
print(classification_report(y_test, y_pred_test_svm_best, target_names=target_names, digits=3))

              precision    recall  f1-score   support

     class 0      0.655     0.571     0.610      3214
     class 1      0.634     0.712     0.671      3361

    accuracy                          0.643      6575
   macro avg      0.645     0.641     0.640      6575
weighted avg      0.644     0.643     0.641      6575




**Conclusion: after hyperparameter tuning, we reach results comparable to the authors'. Our F1-score for class 0 (premises) is higher than the authors' value (0.610 vs 0.599), while our F1-score for class 1 (claims) is slightly lower**