## Introduction

- Tutorial on mining of supporting and objecting units.
- Presenter: Harsh Shah, Meher Vivek, Rene Sherf
- Task: Find the stance of evidence towards the claim

- Inspired by [Bar-Haim et al. 2017](https://www.aclweb.org/anthology/E17-1024).
    - Paper: Stance Classification of Context-Dependent Claims
    - Input: Topics, Claims 
    - Output: Stance of claim towards the topic


In [2]:
from assignment_stanceDetectionModel import StanceDetectionModel

      topicId  split                                          topicText  \
0           1   test  This house believes that the sale of violent v...   
1           1   test  This house believes that the sale of violent v...   
2           1   test  This house believes that the sale of violent v...   
3           1   test  This house believes that the sale of violent v...   
4           1   test  This house believes that the sale of violent v...   
5           1   test  This house believes that the sale of violent v...   
6           1   test  This house believes that the sale of violent v...   
7           1   test  This house believes that the sale of violent v...   
8           1   test  This house believes that the sale of violent v...   
9           1   test  This house believes that the sale of violent v...   
10          1   test  This house believes that the sale of violent v...   
11          1   test  This house believes that the sale of violent v...   
12          1   test  Thi

In [4]:
model = StanceDetectionModel()

# Import the data

In [5]:
import pandas as pd
data = pd.read_csv('Dataset/claim_stance_dataset_v1.csv')    

In [6]:
data

Unnamed: 0,topicId,split,topicText,topicTarget,topicSentiment,claims.claimId,claims.stance,claims.claimCorrectedText,claims.claimOriginalText,claims.article.rawFile,...,claims.article.rawSpan.end,claims.article.cleanFile,claims.article.cleanSpan.start,claims.article.cleanSpan.end,claims.Compatible,claims.claimTarget.text,claims.claimTarget.span.start,claims.claimTarget.span.end,claims.claimSentiment,claims.targetsRelation
0,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2973,PRO,Exposure to violent video games causes at leas...,exposure to violent video games causes at leas...,articles/t1/raw_1.txt,...,640,articles/t1/clean_1.txt,418,568,yes,Exposure to violent video games,0.0,31.0,-1.0,1.0
1,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2974,CON,video game violence is not related to serious ...,video game violence is not related to serious ...,articles/t1/raw_1.txt,...,1697,articles/t1/clean_1.txt,829,907,yes,video game violence,0.0,19.0,1.0,1.0
2,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2975,CON,some violent video games may actually have a p...,some violent video games may actually have a p...,articles/t1/raw_1.txt,...,2089,articles/t1/clean_1.txt,1004,1082,yes,some violent video games,0.0,24.0,1.0,1.0
3,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2977,PRO,exposure to violent video games causes both sh...,exposure to violent video games causes both sh...,articles/t1/raw_1.txt,...,3695,articles/t1/clean_1.txt,1442,1577,yes,exposure to violent video games,0.0,31.0,-1.0,1.0
4,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2978,PRO,Violent video games increase the violent tende...,they increase the violent tendencies among youth,articles/t1/raw_1.txt,...,8167,articles/t1/clean_1.txt,3900,3948,yes,Violent video games,0.0,19.0,-1.0,1.0
5,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2979,CON,No conclusive link was found between video gam...,have shown no conclusive link between video ga...,articles/t1/raw_1.txt,...,9167,articles/t1/clean_1.txt,4124,4199,yes,video game usage,37.0,53.0,1.0,1.0
6,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2980,PRO,violent video games are significantly associat...,violent video games are significantly associat...,articles/t1/raw_1.txt,...,11587,articles/t1/clean_1.txt,5608,5792,yes,violent video games,0.0,19.0,-1.0,1.0
7,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2981,PRO,video game publishers unethically train childr...,video game publishers unethically train childr...,articles/t1/raw_1.txt,...,13905,articles/t1/clean_1.txt,7084,7222,yes,video game publishers,0.0,21.0,-1.0,1.0
8,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2982,PRO,violent video games may increase mild forms of...,violent video games may increase mild forms of...,articles/t1/raw_1.txt,...,14480,articles/t1/clean_1.txt,7476,7571,yes,violent video games,0.0,19.0,-1.0,1.0
9,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2983,PRO,exposure to violent video games results in inc...,exposure to violent video games results in inc...,articles/t1/raw_1.txt,...,14664,articles/t1/clean_1.txt,7599,7755,yes,exposure to violent video games,0.0,31.0,-1.0,1.0


In [7]:
# Get the corrected claim text (column index 7, claims.claimCorrectedText)
claim_corrected_data = data.iloc[:,7]
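
Positional indexing with `iloc` depends on the exact column order of the CSV. As a small sketch, assuming the column keeps the name `claims.claimCorrectedText` shown in the table above, the same column can be selected by label. The two-row frame below is made-up illustration data, not taken from the dataset.

```python
import pandas as pd

# Tiny made-up frame that mimics a few of the dataset's columns (illustration only)
example = pd.DataFrame({
    "topicId": [1, 1],
    "claims.stance": ["PRO", "CON"],
    "claims.claimCorrectedText": ["first claim text", "second claim text"],
})

# Label-based selection keeps working if columns are reordered or added
claims_by_name = example["claims.claimCorrectedText"]

# Positional selection returns the same column only while it stays at index 2
claims_by_position = example.iloc[:, 2]

print(claims_by_name.equals(claims_by_position))
```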

# Data preprocessing on the corrected claim text

## Remove stop words 

In [8]:
claim_corrected_data_cleaned = model.remove_stop_words(claim_corrected_data)
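
The notebook does not show what `remove_stop_words` does internally. A minimal sketch, assuming whitespace tokenization and a case-sensitive lookup in a lowercase stop-word list; `STOP_WORDS` and `remove_stop_words_sketch` are hypothetical names, and the sentences are modeled on the claims above rather than copied from the dataset. Under these assumptions the negation "not" is dropped while a capitalized "No" survives, which resembles the preprocessed output shown further below.

```python
# Hypothetical, deliberately small stop-word list (lowercase only)
STOP_WORDS = {"to", "at", "is", "was", "not", "the", "a", "of", "in", "and", "between"}

def remove_stop_words_sketch(texts):
    """Drop tokens that appear in STOP_WORDS from each text (case-sensitive match)."""
    cleaned = []
    for text in texts:
        kept = [token for token in text.split() if token not in STOP_WORDS]
        cleaned.append(" ".join(kept))
    return cleaned

claims = [
    "video game violence is not related to serious aggressive behavior in real life",
    "No conclusive link was found between video game usage and violent activity",
]
for line in remove_stop_words_sketch(claims):
    print(line)
```

Because removing "not" can flip the meaning of a claim, it may be worth keeping negation words out of the stop-word list when the downstream task is sentiment or stance.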

## Stemming

In [9]:
claim_corrected_data_stemmed = model.get_stemmed_text(claim_corrected_data_cleaned) 
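
Stems such as "exposur" and "violenc" in the output further below look like Porter-stemmer results, but the notebook does not show how `get_stemmed_text` is implemented. A minimal sketch, assuming NLTK's `PorterStemmer` and whitespace tokenization; `stem_texts_sketch` is a hypothetical helper.

```python
from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()

def stem_texts_sketch(texts):
    """Stem every whitespace-separated token of every text."""
    return [" ".join(stemmer.stem(token) for token in text.split()) for text in texts]

print(stem_texts_sketch(["exposure violent video games"]))
```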

## Lemmatization

In [10]:
claim_lemmatized_reviews = model.get_lemmatized_text(claim_corrected_data_stemmed)

# Data after preprocessing

In [11]:
claim_lemmatized_reviews

['exposur violent video game caus least temporari increas aggress exposur correl aggress real world',
 'video game violenc relat seriou aggress behavior real life',
 'violent video game may actual prosoci effect context',
 'exposur violent video game caus short term long term aggress player decreas empathi prosoci behavior',
 'violent video game increas violent tendenc among youth',
 'No conclus link found video game usag violent activ',
 'violent video game significantli associ with: increas aggress behavior, thoughts, affect; increas physiolog arousal; decreas pro-soci (helping) behavior',
 'video game publish uneth train child use weapon and, importantly, harden emot act murder',
 'violent video game may increas mild form aggress behavior child young adult',
 'exposur violent video game result increas physiolog arousal, aggression-rel thought feel well decreas prosoci behavior',
 'No long-term relationship found play violent video game youth violenc bulli',
 'aggress child tend sele

# Updating the dataset

In [12]:
data['claims.claimCorrectedText'] = claim_lemmatized_reviews
data

Unnamed: 0,topicId,split,topicText,topicTarget,topicSentiment,claims.claimId,claims.stance,claims.claimCorrectedText,claims.claimOriginalText,claims.article.rawFile,...,claims.article.rawSpan.end,claims.article.cleanFile,claims.article.cleanSpan.start,claims.article.cleanSpan.end,claims.Compatible,claims.claimTarget.text,claims.claimTarget.span.start,claims.claimTarget.span.end,claims.claimSentiment,claims.targetsRelation
0,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2973,PRO,exposur violent video game caus least temporar...,exposure to violent video games causes at leas...,articles/t1/raw_1.txt,...,640,articles/t1/clean_1.txt,418,568,yes,Exposure to violent video games,0.0,31.0,-1.0,1.0
1,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2974,CON,video game violenc relat seriou aggress behavi...,video game violence is not related to serious ...,articles/t1/raw_1.txt,...,1697,articles/t1/clean_1.txt,829,907,yes,video game violence,0.0,19.0,1.0,1.0
2,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2975,CON,violent video game may actual prosoci effect c...,some violent video games may actually have a p...,articles/t1/raw_1.txt,...,2089,articles/t1/clean_1.txt,1004,1082,yes,some violent video games,0.0,24.0,1.0,1.0
3,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2977,PRO,exposur violent video game caus short term lon...,exposure to violent video games causes both sh...,articles/t1/raw_1.txt,...,3695,articles/t1/clean_1.txt,1442,1577,yes,exposure to violent video games,0.0,31.0,-1.0,1.0
4,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2978,PRO,violent video game increas violent tendenc amo...,they increase the violent tendencies among youth,articles/t1/raw_1.txt,...,8167,articles/t1/clean_1.txt,3900,3948,yes,Violent video games,0.0,19.0,-1.0,1.0
5,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2979,CON,No conclus link found video game usag violent ...,have shown no conclusive link between video ga...,articles/t1/raw_1.txt,...,9167,articles/t1/clean_1.txt,4124,4199,yes,video game usage,37.0,53.0,1.0,1.0
6,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2980,PRO,violent video game significantli associ with: ...,violent video games are significantly associat...,articles/t1/raw_1.txt,...,11587,articles/t1/clean_1.txt,5608,5792,yes,violent video games,0.0,19.0,-1.0,1.0
7,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2981,PRO,video game publish uneth train child use weapo...,video game publishers unethically train childr...,articles/t1/raw_1.txt,...,13905,articles/t1/clean_1.txt,7084,7222,yes,video game publishers,0.0,21.0,-1.0,1.0
8,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2982,PRO,violent video game may increas mild form aggre...,violent video games may increase mild forms of...,articles/t1/raw_1.txt,...,14480,articles/t1/clean_1.txt,7476,7571,yes,violent video games,0.0,19.0,-1.0,1.0
9,1,test,This house believes that the sale of violent v...,the sale of violent video games to minors,-1,2983,PRO,exposur violent video game result increas phys...,exposure to violent video games results in inc...,articles/t1/raw_1.txt,...,14664,articles/t1/clean_1.txt,7599,7755,yes,exposure to violent video games,0.0,31.0,-1.0,1.0


# Split train and test data and handle missing values

In [13]:
train,test = model.load_dataset(data)
print('Train Size',train.shape)
print('Test Size', test.shape)

Train Size (0, 0)
Test Size (0, 0)
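
The sizes printed above are both `(0, 0)`, so this particular run produced empty frames, which is consistent with the `IndexError` in the next cell. The notebook does not show the body of `load_dataset`; as a rough sketch of the step named in the heading, assuming the `split` column holds the values `train` and `test`, rows with missing values can be dropped before splitting. `split_dataset_sketch` and the example frame are hypothetical.

```python
import pandas as pd

def split_dataset_sketch(frame, required_columns):
    """Drop rows with missing required values, then split on the 'split' column."""
    complete = frame.dropna(subset=required_columns)
    train_part = complete[complete["split"] == "train"].reset_index(drop=True)
    test_part = complete[complete["split"] == "test"].reset_index(drop=True)
    return train_part, test_part

# Made-up rows; one training row has a missing sentiment value
example = pd.DataFrame({
    "split": ["train", "train", "test", "test"],
    "claims.claimCorrectedText": ["a", "b", "c", "d"],
    "claims.claimSentiment": [1.0, None, -1.0, 1.0],
})
train_part, test_part = split_dataset_sketch(example, ["claims.claimSentiment"])
print("Train Size", train_part.shape)
print("Test Size", test_part.shape)
```

Checking the shapes right after the split, as the cell above does, is a cheap way to catch an empty result before any positional indexing is attempted.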


# Three Approaches to predict claim sentiment


 - VADER sentiment analysis (baseline)
 - SVM Classifier
    - Without claim target
    - With claim target

# VADER sentiment analysis (baseline)

In [14]:
claim_corrected_data_test = test.iloc[:,5]
VaderStanceclaimDetection = pd.DataFrame([model._vadersentiment_analysis(evidence) for evidence in claim_corrected_data_test])
# Print the first five predictions to inspect the predicted sentiment
VaderStanceclaimDetection[0:5]

IndexError: single positional indexer is out-of-bounds
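
VADER reports a compound score between -1 and 1, while the confusion matrix below is binary, so some mapping from score to label is needed. How `_vadersentiment_analysis` does this is not shown; one simple possibility is thresholding the compound score. In the sketch below, `compound_to_label`, the threshold of 0.0, and the choice to count a neutral score as positive are all assumptions.

```python
def compound_to_label(compound, threshold=0.0):
    """Map a VADER-style compound score in [-1, 1] to a binary sentiment label."""
    return 1 if compound >= threshold else -1

# Illustrative compound scores
scores = [-0.5423, 0.4404, 0.0]
print([compound_to_label(score) for score in scores])
```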

# Confusion matrix for predicted claim sentiment 

In [13]:
from sklearn.metrics import confusion_matrix
claim_sentiment_data_test = test.iloc[:,17]
actual_sentiment = claim_sentiment_data_test
matrix = confusion_matrix(actual_sentiment, VaderStanceclaimDetection)
print(matrix)

[[273 450]
 [ 99 464]]


# Evaluation of model for claim prediction 

In [14]:
# Calculating precision, recall, fmeasure, accuracy from confusion matrix
accuracy_vader = (matrix[0][0]+matrix[1][1]) / (matrix[0][0]+matrix[1][1]+matrix[0][1]+matrix[1][0])
precision_vader = matrix[1][1] / (matrix[0][1]+matrix[1][1])
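# Note: this expression is the precision of the negative class (first column of the matrix);
# recall of the positive class would be matrix[1][1] / (matrix[1][0]+matrix[1][1])
recall_vader    = matrix[0][0] / (matrix[0][0]+matrix[1][0])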
f_measure_vader = (2 * precision_vader * recall_vader ) / (precision_vader + recall_vader) 
print("Accuracy:\t" + str(accuracy_vader))
print("precision:\t" +   str(precision_vader))
print("Recall:\t\t" + str(recall_vader))
print("f_measure:\t" + str(f_measure_vader)) 

Accuracy:	0.573094867807154
precision:	0.5076586433260394
Recall:		0.7338709677419355
f_measure:	0.6001563499395921
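
For reference, `sklearn.metrics.confusion_matrix` puts actual labels in rows and predicted labels in columns, with labels in sorted order (here -1, then 1). Under that convention, the metrics for the positive class can be read off as in the sketch below, which reuses the VADER matrix printed above. The variable names are illustrative.

```python
import numpy as np

# Confusion matrix of the VADER baseline as printed above:
# rows are actual labels, columns are predicted labels, ordered (-1, 1)
matrix = np.array([[273, 450],
                   [99, 464]])

tn, fp = matrix[0]
fn, tp = matrix[1]

accuracy = (tp + tn) / matrix.sum()
precision = tp / (tp + fp)   # share of predicted positives that are correct
recall = tp / (tp + fn)      # share of actual positives that are found
f_measure = 2 * precision * recall / (precision + recall)

print("Accuracy:\t" + str(accuracy))
print("Precision:\t" + str(precision))
print("Recall:\t\t" + str(recall))
print("F-measure:\t" + str(f_measure))
```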


# SVM classifier without claim target

## Select columns from train data

In [15]:
claim_corrected_data_train = train.iloc[:,5]
claim_sentiment_data_train = train.iloc[:,17]

# Feature calculation for the train data
- Negative_score
- Neutral_score
- Positive_score
- Num_of_positive_words
- Num_of_negative_words
- Num_of_neutral_words
- Avg_tfidf_feature
- Max_tfidf_feature
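
How `_instance_features` derives the two TF-IDF features is not shown in this notebook. A minimal sketch of one way to get an average and a maximum TF-IDF weight per claim, assuming scikit-learn's `TfidfVectorizer` with default settings and averaging over the terms that actually occur in the claim; `tfidf_summary_features` is a hypothetical helper.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def tfidf_summary_features(texts):
    """Return one (average, maximum) TF-IDF pair per text."""
    weights = TfidfVectorizer().fit_transform(texts).toarray()
    # Count the terms present in each text; clamp to 1 to avoid division by zero
    nonzero_counts = np.maximum((weights > 0).sum(axis=1), 1)
    average = weights.sum(axis=1) / nonzero_counts
    maximum = weights.max(axis=1)
    return np.column_stack([average, maximum])

claims = [
    "violent video game increas violent tendenc among youth",
    "video game",
    "game",
]
print(tfidf_summary_features(claims))
```

With the default L2 normalization, a claim that contains a single term gets a maximum weight of 1.0.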

In [17]:
# This step will take some time because the features are calculated here
X_train_data = model._instance_features(claim_corrected_data_train)
# The claim sentences are now converted into features
X_train_data

Unnamed: 0,0,1,2,3,4,5,6,7
0,0.259,0.741,0.000,0,1,10,0.201983,0.469099
1,0.000,0.674,0.326,1,0,6,0.301540,0.579753
2,0.000,0.808,0.192,1,0,8,0.245826,0.538193
3,0.000,0.585,0.415,3,0,10,0.239538,0.389968
4,0.000,0.815,0.185,1,0,11,0.199175,0.521924
5,0.000,1.000,0.000,0,0,8,0.125000,1.000000
6,0.000,1.000,0.000,0,0,13,0.174656,0.692578
7,0.000,1.000,0.000,0,0,3,0.333333,1.000000
8,0.000,1.000,0.000,0,0,6,0.286983,0.619440
9,0.000,0.556,0.444,2,0,7,0.233757,0.576349


# Select columns from test data

In [18]:
claim_corrected_data_test = test.iloc[:,5]
claim_sentiment_data_test = test.iloc[:,17]

# claim sentiment from train data
actual_claim_sentiment =  claim_sentiment_data_train

# Training SVM model on features

In [19]:
# This step will take some time as model is getting trained
from sklearn import svm
clf=svm.SVC(gamma='auto')
clf.fit(X_train_data,actual_claim_sentiment)
Y_test_data_transfom = model._instance_features(claim_corrected_data_test)
# predict claim sentiment on the test data
Y_test_class_data_model1 = clf.predict(Y_test_data_transfom)
print('SVM model trained on training data; predictions made on test data')

SVM model trained on training data; predictions made on test data


In [20]:
# Display the predicted sentiment for the claim sentences
Y_test_class_data_model1

array([-1., -1., -1., ..., -1., -1.,  1.])
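
The feature table mixes proportions between 0 and 1 with word counts that reach double digits, and an RBF-kernel SVM is sensitive to feature scale. As a sketch of one possible variation (the scaling step is an assumption, not something this notebook does), the same `SVC(gamma='auto')` can be wrapped in a pipeline with `StandardScaler`. The data below is synthetic.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the 8 claim features; labels are -1 / 1 (illustration only)
X_train = rng.normal(size=(200, 8))
y_train = np.where(X_train[:, 0] + 0.5 * X_train[:, 1] > 0, 1, -1)
X_test = rng.normal(size=(50, 8))

# Scale features, then fit the same SVC configuration used in the notebook
classifier = make_pipeline(StandardScaler(), SVC(gamma="auto"))
classifier.fit(X_train, y_train)          # training uses the train split only
predictions = classifier.predict(X_test)  # the test split is only used for prediction

print(predictions[:10])
```

Keeping the scaler inside the pipeline means it is fitted on the training features only and then reused unchanged for the test features.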

# Confusion matrix for predicted claim sentiment 

In [21]:
from sklearn.metrics import confusion_matrix
actual = claim_sentiment_data_test
matrix_model1 = confusion_matrix(actual, Y_test_class_data_model1)
print(matrix_model1)


[[561 162]
 [329 234]]


# Evaluation of model for claim prediction 

In [22]:
# Calculating precision, recall, fmeasure, accuracy from confusion matrix
accuracy_model1 = (matrix_model1[0][0]+matrix_model1[1][1]) / (matrix_model1[0][0]+matrix_model1[1][1]+matrix_model1[0][1]+matrix_model1[1][0])
precision_model1 = matrix_model1[1][1] / (matrix_model1[0][1]+matrix_model1[1][1])
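# Note: this expression is the precision of the negative class (first column of the matrix);
# recall of the positive class would be matrix_model1[1][1] / (matrix_model1[1][0]+matrix_model1[1][1])
recall_model1    = matrix_model1[0][0] / (matrix_model1[0][0]+matrix_model1[1][0])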
f_measure_model1 = (2 * precision_model1 * recall_model1 ) / (precision_model1 + recall_model1)
print("Accuracy:\t" + str(accuracy_model1))
print("precision:\t" +   str(precision_model1))
print("Recall:\t\t" + str(recall_model1))
print("f_measure:\t" + str(f_measure_model1)) 

Accuracy:	0.6181959564541213
precision:	0.5909090909090909
Recall:		0.6303370786516854
f_measure:	0.6099866175978589


# SVM classifier with claim target

In [23]:
from stanceDetectionModelTarget import StanceDetectionModelTarget

In [24]:
model2 = StanceDetectionModelTarget()

In [25]:
# The claim target is also passed in so that a claim/target similarity feature can be added
targets_claim_data_train = train.iloc[:,14]
X_target_claim_filtered = model2._instance_features(claim_corrected_data_train,targets_claim_data_train)
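
The notebook does not say which similarity measure `StanceDetectionModelTarget` uses. Purely as an illustration of what a claim/target similarity feature could look like, the sketch below uses the character-level ratio from Python's `difflib`; `target_similarity_sketch` is a hypothetical helper, and the real model may compute something different.

```python
from difflib import SequenceMatcher

def target_similarity_sketch(claim_text, target_text):
    """Character-level similarity in [0, 1] between a claim and its target phrase."""
    return SequenceMatcher(None, claim_text.lower(), target_text.lower()).ratio()

claim = "violent video game increas violent tendenc among youth"
target = "Violent video games"

print(round(target_similarity_sketch(claim, target), 3))
print(target_similarity_sketch(target, target))
```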

# Train data are transformed to features

In [26]:
X_target_claim_filtered

Unnamed: 0,0,1,2,3,4,5,6,7,8,9
0,0.259,0.741,0.000,-0.5423,0,1,10,0.201983,0.328947,0.469099
1,0.000,0.674,0.326,0.4404,1,0,6,0.301540,0.341463,0.579753
2,0.000,0.808,0.192,0.2263,1,0,8,0.245826,0.176471,0.538193
3,0.000,0.585,0.415,0.7269,3,0,10,0.239538,0.325000,0.389968
4,0.000,0.815,0.185,0.3612,1,0,11,0.199175,0.186667,0.521924
5,0.000,1.000,0.000,0.0000,0,0,8,0.125000,0.180000,1.000000
6,0.000,1.000,0.000,0.0000,0,0,13,0.174656,0.129412,0.692578
7,0.000,1.000,0.000,0.0000,0,0,3,0.333333,0.291667,1.000000
8,0.000,1.000,0.000,0.0000,0,0,6,0.286983,0.405405,0.619440
9,0.000,0.556,0.444,0.6808,2,0,7,0.233757,0.237288,0.576349


# Training SVM on model2

In [28]:
# This step will take some time as model is getting trained
claim_target_data_test = test.iloc[:,14]
X_train_class_data = claim_sentiment_data_train
Y_test_data_transfom = model2._instance_features(claim_corrected_data_test,claim_target_data_test)
from sklearn import svm
clf=svm.SVC(gamma='auto')
clf.fit(X_target_claim_filtered,X_train_class_data)
Y_test_data = claim_corrected_data_test
# predict claim sentiment on the test data
Y_test_class_data_model2 = clf.predict(Y_test_data_transfom)
print('SVM model trained on training data; predictions made on test data')

SVM model trained on training data; predictions made on test data


In [29]:
# Display the predicted sentiment for the claim sentences
Y_test_class_data_model2

array([-1., -1., -1., ..., -1., -1.,  1.])

# Confusion matrix for predicted claim sentiment 

In [30]:
from sklearn.metrics import confusion_matrix
actual = claim_sentiment_data_test
matrix_model2 = confusion_matrix(actual, Y_test_class_data_model2)
print(matrix_model2)


[[566 157]
 [331 232]]


# Evaluation of model for claim prediction 

In [31]:
# Calculating precision, recall, fmeasure, accuracy from confusion matrix
accuracy_model2 = (matrix_model2[0][0]+matrix_model2[1][1]) / (matrix_model2[0][0]+matrix_model2[1][1]+matrix_model2[0][1]+matrix_model2[1][0])
precision_model2 = matrix_model2[1][1] / (matrix_model2[0][1]+matrix_model2[1][1])
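# Note: this expression is the precision of the negative class (first column of the matrix);
# recall of the positive class would be matrix_model2[1][1] / (matrix_model2[1][0]+matrix_model2[1][1])
recall_model2    = matrix_model2[0][0] / (matrix_model2[0][0]+matrix_model2[1][0])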
f_measure_model2 = (2 * precision_model2 * recall_model2 ) / (precision_model2 + recall_model2)
print("Accuracy:\t" + str(accuracy_model2))
print("precision:\t" +   str(precision_model2))
print("Recall:\t\t" + str(recall_model2))
print("f_measure:\t" + str(f_measure_model2)) 

Accuracy:	0.6205287713841369
precision:	0.596401028277635
Recall:		0.6309921962095875
f_measure:	0.6132091772166678


# Comparing accuracy of different approaches 

In [32]:
print('Accuracy VaderSentiment:' ,accuracy_vader)
print('Accuracy Model1:\t' ,accuracy_model1)
print('Accuracy Model2:\t', accuracy_model2)


Accuracy VaderSentiment: 0.573094867807154
Accuracy Model1:	 0.6181959564541213
Accuracy Model2:	 0.6205287713841369


# Calculating stance for model2 with the formula from the paper

In [33]:
topic_sentiment_data_test = test.iloc[:,2]
claim_target_data_relation = test.iloc[:,18]
topic_text = test.iloc[:,0]
stance_test = test.iloc[:,4]
# TODO: compute predicted_stance from topic_sentiment_data_test,
# claim_target_data_relation and Y_test_class_data_model2 (left open in this notebook)
predicted_stance[0:7]

0    1.0
1    1.0
2    1.0
3    1.0
4    1.0
5    1.0
6    1.0
dtype: float64
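
The cell above leaves the computation itself open. In Bar-Haim et al. (2017), stance is modeled as the product of the claim sentiment, the consistent/contrastive relation between claim target and topic target, and the topic sentiment, each taking the value 1 or -1. A minimal sketch under that reading, with made-up inputs; `stance_from_components` is a hypothetical helper and not part of the assignment code.

```python
import pandas as pd

def stance_from_components(claim_sentiment, targets_relation, topic_sentiment):
    """Multiply the three signed components element-wise to get a stance in {-1, 1}."""
    return claim_sentiment * targets_relation * topic_sentiment

# Made-up values for three claims on a topic with negative sentiment
claim_sentiment = pd.Series([-1.0, 1.0, -1.0])
targets_relation = pd.Series([1.0, 1.0, -1.0])
topic_sentiment = pd.Series([-1, -1, -1])

stance_sketch = stance_from_components(claim_sentiment, targets_relation, topic_sentiment)
print(stance_sketch.map({1.0: "PRO", -1.0: "CON"}).tolist())
```

A negative claim sentiment towards a target that is consistent with a negatively framed topic therefore yields a PRO stance, which matches the pattern in the first rows above.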

# Confusion matrix for stance prediction 

In [34]:
# Convert the given stance labels (PRO/CON) to 1/-1 so they can be compared with the predicted stance
stance_filtered = stance_test.replace('PRO', 1)
stance_filtered = stance_filtered.replace('CON', -1)
matrix_stance = confusion_matrix(stance_filtered, predicted_stance)
matrix_stance

array([[398, 220],
       [269, 399]], dtype=int64)

In [35]:
accuracy_stance = (matrix_stance[0][0]+matrix_stance[1][1]) /(matrix_stance[0][0]+matrix_stance[1][1]+matrix_stance[0][1]+matrix_stance[1][0]) 
print("Accuracy:\t" + str(accuracy_stance))

Accuracy:	0.619751166407465


In [36]:
# Converting -1/1 to pro and con for comparison with the given stance 
predicted_stances = predicted_stance.replace(1, 'PRO')
predicted_stance = predicted_stances.replace(-1,'CON')

# Final result: actual stance compared with predicted stance (calculated using the formula)

In [37]:
DataFrameCombination = pd.concat([topic_text,claim_corrected_data_test,stance_test],axis =1)
DataFrameCombination['predicted.Stance'] = predicted_stance

In [38]:
DataFrameCombination

Unnamed: 0,topicText,claims.claimCorrectedText,claims.stance,predicted.Stance
0,This house believes that the sale of violent v...,exposur violent video game caus least temporar...,PRO,PRO
1,This house believes that the sale of violent v...,video game violenc relat seriou aggress behavi...,CON,PRO
2,This house believes that the sale of violent v...,violent video game may actual prosoci effect c...,CON,PRO
3,This house believes that the sale of violent v...,exposur violent video game caus short term lon...,PRO,PRO
4,This house believes that the sale of violent v...,violent video game increas violent tendenc amo...,PRO,PRO
5,This house believes that the sale of violent v...,No conclus link found video game usag violent ...,CON,PRO
6,This house believes that the sale of violent v...,violent video game significantli associ with: ...,PRO,PRO
7,This house believes that the sale of violent v...,video game publish uneth train child use weapo...,PRO,PRO
8,This house believes that the sale of violent v...,violent video game may increas mild form aggre...,PRO,PRO
9,This house believes that the sale of violent v...,exposur violent video game result increas phys...,PRO,PRO
