## Experimental Setting 1: TF-IDF for unigrams + linear SVM
### Task 1: Classifying Argument (contains either a Claim or a Premise) vs. non-Argument

**Importing necessary packages**

In [21]:
import pandas as pd
import numpy as np
import string
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV

In [2]:
filename = '../../../data/sentence_db_candidate.csv'
df = pd.read_csv(filename)

In [3]:
df.shape

(29621, 18)

**Simple preprocessing of sentences with lowercasing and punctuation removal**

In [4]:
def preproc(sentence):
    sentence = sentence.lower()
    sentence = ''.join([i for i in sentence if i not in string.punctuation])
    return sentence

In [5]:
df['Speech'] = df['Speech'].apply(preproc)
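
The character-by-character join in `preproc` can also be written with `str.translate`, which is usually faster on a long column. This is a minimal sketch, assuming the same behaviour is wanted (lowercase, then drop ASCII punctuation). The names `preproc_translate` and `_PUNCT_TABLE` are hypothetical and not part of the notebook.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def preproc_translate(sentence):
    """Lowercase a sentence and drop ASCII punctuation (same result as preproc)."""
    return sentence.lower().translate(_PUNCT_TABLE)

example = "It's a very important event, and they've done"
cleaned = preproc_translate(example)
print(cleaned)
```

Note that apostrophes and slashes are deleted rather than replaced by spaces, so "It's" becomes "its" and "9/11" becomes "911", as visible in the `Speech` column below.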

In [6]:
df.head()

Unnamed: 0,Text,Part,Document,Order,Sentence,Start,End,Annotator,Tag,Component,Speech,Speaker,SpeakerType,Set,Date,Year,Name,MainTag
0,"CHENEY: Gwen, I want to thank you, and I want ...",1,30_2004,0,0,2101,2221,,"{""O"": 27}",O,gwen i want to thank you and i want to thank ...,CHENEY,Candidate,TRAIN,05 Oct 2004,2004,Richard(Dick) B. Cheney,O
1,"It's a very important event, and they've done ...",1,30_2004,1,1,2221,2304,,"{""O"": 19}",O,its a very important event and theyve done a s...,CHENEY,Candidate,TRAIN,05 Oct 2004,2004,Richard(Dick) B. Cheney,O
2,It's important to look at all of our developme...,1,30_2004,2,2,2304,2418,,"{""O"": 23}",O,its important to look at all of our developmen...,CHENEY,Candidate,TRAIN,05 Oct 2004,2004,Richard(Dick) B. Cheney,O
3,"And, after 9/11, it became clear that we had t...",1,30_2004,3,3,2418,2744,,"{""O"": 16, ""Claim"": 50}",Claim,and after 911 it became clear that we had to d...,CHENEY,Candidate,TRAIN,05 Oct 2004,2004,Richard(Dick) B. Cheney,Claim
4,And we also then finally had to stand up democ...,1,30_2004,4,4,2744,2974,,"{""O"": 4, ""Claim"": 13, ""Premise"": 25}",Premise,and we also then finally had to stand up democ...,CHENEY,Candidate,TRAIN,05 Oct 2004,2004,Richard(Dick) B. Cheney,Mixed


**Below we turn the component labels into binary classes: 1 for sentences labelled Claim or Premise, 0 for sentences labelled O (no argument component)**

In [7]:
valid = ['Claim', 'Premise', 'O']
# .copy() so that adding a column later does not write to a slice of the original frame
df = df.loc[df['Component'].isin(valid)].copy()

In [9]:
classes = []

for s in df.Component:
    if s == 'O':
        classes.append(0.0)
    else:
        classes.append(1.0)

In [10]:
df['Annotation'] = classes

In [11]:
df.Annotation.value_counts()

1.0    22280
0.0     7252
Name: Annotation, dtype: int64

**Splitting the data into three sets. We use the predefined `Set` column, so our sets are identical to the authors'**

In [12]:
df_train = df[df['Set'] == 'TRAIN']
df_val = df[df['Set'] == 'VALIDATION']
df_test = df[df['Set'] == 'TEST']

**Separating the features and the target variable**

In [13]:
X_train = df_train.Speech
y_train = df_train.Annotation

X_test = df_test.Speech
y_test = df_test.Annotation

X_val = df_val.Speech
y_val = df_val.Annotation

**Initializing the tf-idf feature matrix. We fit and transform the sentences on the train set and only transform the validation and test sets. We use the whole vocabulary, which here is 9,833 words**

In [15]:
tfidf = TfidfVectorizer()

train_tfidf =  tfidf.fit_transform(X_train)
val_tfidf = tfidf.transform(X_val)
test_tfidf = tfidf.transform(X_test)

In [16]:
train_tfidf

<14044x9833 sparse matrix of type '<class 'numpy.float64'>'
	with 195357 stored elements in Compressed Sparse Row format>
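
To make the weighting concrete, here is a minimal pure-Python sketch of what `TfidfVectorizer()` computes with its default settings, as far as the scikit-learn documentation describes them: tokens of two or more word characters, raw term counts, smoothed idf `ln((1 + n) / (1 + df)) + 1`, and L2 normalisation of each row. The function `tfidf_rows` and the toy sentences are illustrative assumptions, not part of the notebook.

```python
import math
import re
from collections import Counter

# Default TfidfVectorizer token pattern: tokens of 2+ word characters.
TOKEN = re.compile(r"(?u)\b\w\w+\b")

def tfidf_rows(docs):
    """Return one {term: weight} dict per document, L2-normalised."""
    tokenised = [TOKEN.findall(d.lower()) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(t for tokens in tokenised for t in set(tokens))
    # Smoothed idf, matching the documented smooth_idf=True formula.
    idf = {t: math.log((1 + n_docs) / (1 + n)) + 1 for t, n in doc_freq.items()}
    rows = []
    for tokens in tokenised:
        counts = Counter(tokens)
        raw = {t: c * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        rows.append({t: w / norm for t, w in raw.items()} if norm else {})
    return rows

toy_docs = [
    "we had to stand up democracy",
    "we want to thank you",
    "a very important event",
]
rows = tfidf_rows(toy_docs)
for row in rows:
    print({t: round(w, 3) for t, w in row.items()})
```

In the toy output, "we" and "to" appear in two of three sentences and so receive a lower weight than "democracy", and the one-letter token "a" is dropped by the token pattern.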

### Part 1: Replication

**First of all, repeating the authors' setting. Kernel is `linear`, penalty parameter `C=10`**

In [17]:
svm = SVC(kernel='linear', C=10, random_state=42)
svm.fit(train_tfidf, y_train)

SVC(C=10, kernel='linear', random_state=42)

In [18]:
y_pred_test_svm = svm.predict(test_tfidf)

**class 1 stands for ARGUMENT, class 0 stands for NONE**

In [20]:
# Classification report for the SVM on the test set
target_names = ['class 0', 'class 1']
print(classification_report(y_test, y_pred_test_svm, target_names=target_names, digits=3))

              precision    recall  f1-score   support

     class 0      0.467     0.462     0.465      1880
     class 1      0.847     0.849     0.848      6575

    accuracy                          0.763      8455
   macro avg      0.657     0.656     0.656      8455
weighted avg      0.762     0.763     0.763      8455
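
Because the classes are imbalanced, it helps to compare these numbers with a trivial baseline. The sketch below uses only the support counts from the report above and computes what a classifier that always predicts class 1 would score. The baseline itself is an illustration added here, not something the notebook or the authors ran.

```python
# Support counts taken from the test-set report above.
support = {'class 0': 1880, 'class 1': 6575}
total = sum(support.values())

# A classifier that always predicts class 1 is right on every class 1 sentence.
majority_accuracy = support['class 1'] / total

# F1 of that baseline: for class 1, precision equals the class 1 share and
# recall is 1; class 0 is never predicted, so its F1 is taken as 0.
precision_1 = majority_accuracy
recall_1 = 1.0
f1_class1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)
macro_f1 = (0.0 + f1_class1) / 2

print(f"majority-class accuracy: {majority_accuracy:.3f}")
print(f"majority-class macro F1: {macro_f1:.3f}")
```

On these counts the baseline accuracy is roughly 0.778, slightly above the 0.763 of the replicated model, while its macro F1 is roughly 0.437, well below the model's 0.656. This suggests that macro F1 or the per-class scores are the more informative figures here.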




### Part 2: Hyperparameter tuning

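**First, initializing the parameter grid over `C` and `gamma`. Note that `gamma` is only used by the `rbf`, `poly` and `sigmoid` kernels, so with `kernel='linear'` it has no effect on the model and only `C` is effectively tuned**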

In [36]:
param_grid = {'C': [0.1, 1, 5, 10], 'gamma': [1, 0.1, 0.01]}

In [37]:
grid = GridSearchCV(SVC(kernel='linear'), param_grid, refit=True, verbose=2)
grid.fit(val_tfidf, y_val)

Fitting 5 folds for each of 12 candidates, totalling 60 fits
[CV] C=0.1, gamma=1 ..................................................


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


[CV] ................................... C=0.1, gamma=1, total=   3.8s
[CV] C=0.1, gamma=1 ..................................................


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    3.8s remaining:    0.0s


[CV] ................................... C=0.1, gamma=1, total=   3.7s
[CV] C=0.1, gamma=1 ..................................................
[CV] ................................... C=0.1, gamma=1, total=   3.5s
[CV] C=0.1, gamma=1 ..................................................
[CV] ................................... C=0.1, gamma=1, total=   3.7s
[CV] C=0.1, gamma=1 ..................................................
[CV] ................................... C=0.1, gamma=1, total=   3.6s
[CV] C=0.1, gamma=0.1 ................................................
[CV] ................................. C=0.1, gamma=0.1, total=   3.3s
[CV] C=0.1, gamma=0.1 ................................................
[CV] ................................. C=0.1, gamma=0.1, total=   3.5s
[CV] C=0.1, gamma=0.1 ................................................
[CV] ................................. C=0.1, gamma=0.1, total=   3.5s
[CV] C=0.1, gamma=0.1 ................................................
[CV] .

[CV] ................................. C=10, gamma=0.01, total=   4.1s


[Parallel(n_jobs=1)]: Done  60 out of  60 | elapsed:  3.7min finished


GridSearchCV(estimator=SVC(kernel='linear'),
             param_grid={'C': [0.1, 1, 5, 10], 'gamma': [1, 0.1, 0.01]},
             verbose=2)
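
The search above runs 5-fold cross-validation inside the validation set only. One alternative is to fit each candidate on the train split and score it on the validation split, which `PredefinedSplit` supports. The sketch below is an illustration under stated assumptions: it uses synthetic data from `make_classification` in place of the tf-idf matrices, tunes `C` alone, and picks `f1_macro` as the metric, which is an assumption and not the notebook's setting.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, PredefinedSplit
from sklearn.svm import SVC

# Synthetic stand-in for the real train/validation features and labels.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_tr, y_tr = X[:150], y[:150]
X_va, y_va = X[150:], y[150:]

# Stack both splits; -1 marks rows that are always used for training,
# 0 marks rows that form the single validation fold.
X_all = np.vstack([X_tr, X_va])
y_all = np.concatenate([y_tr, y_va])
test_fold = np.concatenate([
    np.full(len(X_tr), -1),
    np.zeros(len(X_va), dtype=int),
])
split = PredefinedSplit(test_fold)

search = GridSearchCV(
    SVC(kernel='linear'),
    {'C': [0.1, 1, 5, 10]},
    cv=split,
    scoring='f1_macro',
    refit=False,  # only the selected C is needed here
)
search.fit(X_all, y_all)
best_C = search.best_params_['C']
print(best_C)
```

With the notebook's sparse tf-idf matrices, the stacking step would need `scipy.sparse.vstack` instead of `np.vstack`.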

In [38]:
grid.best_params_

{'C': 1, 'gamma': 1}
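
The `gamma` value reported here should be read with care: according to the scikit-learn documentation, `gamma` is the kernel coefficient for the `rbf`, `poly` and `sigmoid` kernels only. A small check on synthetic data (an assumption standing in for the real features) illustrates that a linear-kernel `SVC` produces the same decision values for every `gamma` in the grid.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic data; the real tf-idf matrices are not needed to see the effect.
X_toy, y_toy = make_classification(n_samples=120, n_features=10, random_state=0)

scores = []
for gamma in [1.0, 0.1, 0.01]:
    model = SVC(kernel='linear', C=1, gamma=gamma).fit(X_toy, y_toy)
    scores.append(model.decision_function(X_toy))

# Compare the decision values of every model with those of the first one.
same_for_all_gammas = all(np.allclose(scores[0], s) for s in scores[1:])
print(same_for_all_gammas)
```

If the decision values match, as expected for a linear kernel, all `gamma` candidates tie for a given `C`, so the 12 candidates in the grid reduce to 4 distinct models.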


### Part 3: Training and testing with the best parameters

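**As the tuning above shows, the best parameters are `C=1` and `gamma=1`. Because the linear kernel ignores `gamma`, all `gamma` values should score the same for a given `C`, so the reported `gamma=1` is presumably just the first of the tied candidates. Now we train the model on the train set and test it with this parameter setting**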

In [32]:
svm_best = SVC(kernel='linear', C=1, gamma=1, random_state=42)

In [33]:
svm_best.fit(train_tfidf, y_train)

SVC(C=1, gamma=1, kernel='linear', random_state=42)

In [34]:
y_pred_test_svm_best = svm_best.predict(test_tfidf)

In [35]:
target_names = ['class 0', 'class 1']
print(classification_report(y_test, y_pred_test_svm_best, target_names=target_names, digits=3))

              precision    recall  f1-score   support

     class 0      0.711     0.337     0.457      1880
     class 1      0.835     0.961     0.894      6575

    accuracy                          0.822      8455
   macro avg      0.773     0.649     0.676      8455
weighted avg      0.808     0.822     0.797      8455





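**Conclusion: we have reached results comparable to the authors' both before and after hyperparameter tuning. After tuning, our f1-score for class 1 (Argument) is higher than the authors' value: 0.894 vs 0.855, while our f1-score for class 0 (None) is slightly lower. Our weighted-average f1-score over both classes (0.797) is higher than the authors' (0.737). Note that tuning raised accuracy from 0.763 to 0.822 mainly through higher class 1 recall (0.849 to 0.961), while class 0 recall fell from 0.462 to 0.337**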
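
The 0.797 figure in the conclusion corresponds to the `weighted avg` row of the report, which weights each class by its support. A short arithmetic sketch, using only the rounded per-class values from the tuned model's report, shows how the weighted and macro averages differ:

```python
# Per-class F1 and support from the tuned model's test-set report.
f1 = {'class 0': 0.457, 'class 1': 0.894}
support = {'class 0': 1880, 'class 1': 6575}
total = sum(support.values())

# Macro average: every class counts equally.
macro_f1 = sum(f1.values()) / len(f1)

# Weighted average: every class counts in proportion to its support.
weighted_f1 = sum(f1[c] * support[c] for c in f1) / total

print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
```

Since class 1 makes up about 78% of the test set, the weighted average sits much closer to the class 1 score. When comparing with published numbers, it is worth checking which of the two averages they report.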