# Project Milestone 4: Linguistic Harbingers of Betrayal

## Feature extraction
In this notebook, we use the same methods as the original paper to extract the following features from messages: politeness, sentiment and discourse.

The dataset that we extract the features from comes from the same game as the original dataset, but instead of games with multiple features as entries, this dataset has messages as entries. The dataset comes from a ConvoKit corpus.

### Step 1: Importing the dataset

In [2]:
import nltk
import pandas as pd
import numpy as np
from tqdm import tqdm
from collections import defaultdict
import convokit
from convokit import Corpus, Speaker, Utterance
from convokit import download
from pandas import DataFrame
from typing import List, Dict, Set

In [2]:
from convokit import Corpus, download

#We first import the corpus
corpus = Corpus(filename=download("diplomacy-corpus"))

#Then we turn the corpus into a dataframe in order to visualize it
corpus_df=corpus.get_utterances_dataframe()
corpus_df.head()

#In this notebook, we are only interested in the "text" feature, which contains the messages.
#We keep the other features because they will be helpful for the analysis later on

Dataset already exists at C:\Users\Ludovic\.convokit\downloads\diplomacy-corpus


id,timestamp,text,speaker,reply_to,conversation_id,meta.speaker_intention,meta.receiver_perception,meta.receiver,meta.absolute_message_index,meta.relative_message_index,meta.year,meta.game_score,meta.game_score_delta,meta.deception_quadrant
Game1-italy-germany-0,74,Germany!\n\nJust the person I want to speak wi...,italy-Game1,,Game1-italy-germany,Truth,Truth,germany-Game1,74,0,1901,3,0,Straightforward
Game1-italy-germany-1,76,"You've whet my appetite, Italy. What's the sug...",germany-Game1,Game1-italy-germany-0,Game1-italy-germany,Truth,Truth,italy-Game1,76,1,1901,3,0,Straightforward
Game1-italy-germany-2,86,👍,italy-Game1,Game1-italy-germany-1,Game1-italy-germany,Truth,Truth,germany-Game1,86,2,1901,3,0,Straightforward
Game1-italy-germany-3,87,It seems like there are a lot of ways that cou...,germany-Game1,Game1-italy-germany-2,Game1-italy-germany,Truth,Truth,italy-Game1,87,3,1901,3,0,Straightforward
Game1-italy-germany-4,89,"Yeah, I can’t say I’ve tried it and it works, ...",italy-Game1,Game1-italy-germany-3,Game1-italy-germany,Truth,,germany-Game1,89,4,1901,3,0,Unknown


### Step 2: Politeness extraction

To extract politeness, we follow this method: https://github.com/CornellNLP/Cornell-Conversational-Analysis-Toolkit/blob/master/examples/politeness-strategies/politeness_demo.ipynb

It uses the Stanford Politeness Corpus to train a classifier that estimates whether a given text is polite. The classifier returns a score between 0 (certain that the text is impolite) and 1 (certain that the text is polite). This score is our politeness coefficient.

The portion of the Stanford Politeness Corpus used here is a dataset of Wikipedia discussions in which every message has been labelled as polite, impolite or neutral.
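To make the meaning of the politeness coefficient concrete, the sketch below shows how a logistic model turns 0/1 strategy indicators into a score between 0 and 1. The feature names follow the `feature_politeness_==...==` pattern produced by ConvoKit, but the weights, the bias and the helper names `politeness_score` and `to_label` are invented for illustration; they are not the values or functions used by the classifier trained in this notebook.

```python
import math

# Hypothetical weights, for illustration only: a positive weight pushes the
# score towards "polite", a negative weight towards "impolite".
EXAMPLE_WEIGHTS = {
    "feature_politeness_==Please==": 0.8,
    "feature_politeness_==Gratitude==": 1.5,
    "feature_politeness_==Direct_question==": -0.9,
}
EXAMPLE_BIAS = -0.2


def politeness_score(features, weights=EXAMPLE_WEIGHTS, bias=EXAMPLE_BIAS):
    """Return a logistic score in (0, 1) from 0/1 strategy indicators."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def to_label(score, threshold=0.5):
    """Map a score to a binary prediction (1 = polite, 0 = impolite)."""
    return 1 if score >= threshold else 0
```

With these made-up weights, a message that only expresses gratitude gets a score above 0.5, while a message that only contains a direct question gets a score below 0.5.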

In [3]:
#We create a text parser, which splits each message into tokens and tags every token
#(the result is stored in meta.parsed)
#We have to use the same text parser on every dataset
from convokit import TextParser
parser=TextParser(verbosity=1000)

In [4]:
#We parse the diplomacy corpus
corpus = parser.transform(corpus)

1000/17289 utterances processed
2000/17289 utterances processed
3000/17289 utterances processed
4000/17289 utterances processed
5000/17289 utterances processed
6000/17289 utterances processed
7000/17289 utterances processed
8000/17289 utterances processed
9000/17289 utterances processed
10000/17289 utterances processed
11000/17289 utterances processed
12000/17289 utterances processed
13000/17289 utterances processed
14000/17289 utterances processed
15000/17289 utterances processed
16000/17289 utterances processed
17000/17289 utterances processed
17289/17289 utterances processed


In [5]:
# Downloading the wikipedia portion of annotated data
# This is the Stanford politeness corpus
wiki_corpus = Corpus(download("wikipedia-politeness-corpus"))

Dataset already exists at C:\Users\Ludovic\.convokit\downloads\wikipedia-politeness-corpus


In [6]:
# We parse the wiki corpus
wiki_corpus = parser.transform(wiki_corpus)

1000/4353 utterances processed
2000/4353 utterances processed
3000/4353 utterances processed
4000/4353 utterances processed
4353/4353 utterances processed


In [7]:
#ConvoKit comes with several text analysis tools; one of them extracts politeness strategies
#We apply that tool to the wiki corpus
from convokit import PolitenessStrategies
ps = PolitenessStrategies()
wiki_corpus = ps.transform(wiki_corpus, markers=True)

#We also apply it to the diplomacy corpus
#The features meta.politeness_strategies and meta.politeness_markers have been added
corpus=ps.transform(corpus, markers=True)
corpus.get_utterances_dataframe().head()

id,timestamp,text,speaker,reply_to,conversation_id,meta.speaker_intention,meta.receiver_perception,meta.receiver,meta.absolute_message_index,meta.relative_message_index,meta.year,meta.game_score,meta.game_score_delta,meta.deception_quadrant,meta.parsed,meta.politeness_strategies,meta.politeness_markers
Game1-italy-germany-0,74,Germany!\n\nJust the person I want to speak wi...,italy-Game1,,Game1-italy-germany,Truth,Truth,germany-Game1,74,0,1901,3,0,Straightforward,"[{'rt': 0, 'toks': [{'tok': 'germany', 'tag': ...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen..."
Game1-italy-germany-1,76,"You've whet my appetite, Italy. What's the sug...",germany-Game1,Game1-italy-germany-0,Game1-italy-germany,Truth,Truth,italy-Game1,76,1,1901,3,0,Straightforward,"[{'rt': 2, 'toks': [{'tok': 'you', 'tag': 'PRP...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen..."
Game1-italy-germany-2,86,👍,italy-Game1,Game1-italy-germany-1,Game1-italy-germany,Truth,Truth,germany-Game1,86,2,1901,3,0,Straightforward,"[{'rt': 0, 'toks': [{'tok': '👍', 'tag': 'JJ', ...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen..."
Game1-italy-germany-3,87,It seems like there are a lot of ways that cou...,germany-Game1,Game1-italy-germany-2,Game1-italy-germany,Truth,Truth,italy-Game1,87,3,1901,3,0,Straightforward,"[{'rt': 1, 'toks': [{'tok': 'it', 'tag': 'PRP'...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen..."
Game1-italy-germany-4,89,"Yeah, I can’t say I’ve tried it and it works, ...",italy-Game1,Game1-italy-germany-3,Game1-italy-germany,Truth,,germany-Game1,89,4,1901,3,0,Unknown,"[{'rt': 5, 'toks': [{'tok': 'yeah', 'tag': 'UH...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen..."


In [8]:
#It's time to train our classifier. Let's import the libraries for that
import random
from sklearn import svm
from scipy.sparse import csr_matrix
from sklearn.metrics import classification_report
from convokit import Classifier

In [9]:
#We are only interested in polite and impolite messages, not the neutral ones
binary_corpus = Corpus(utterances=[utt for utt in wiki_corpus.iter_utterances() if utt.meta["Binary"] != 0])

In [10]:
#We create test and train sets to train the classifier
test_ids = binary_corpus.get_utterance_ids()[-100:]
train_corpus = Corpus(utterances=[utt for utt in binary_corpus.iter_utterances() if utt.id not in test_ids])
test_corpus = Corpus(utterances=[utt for utt in binary_corpus.iter_utterances() if utt.id in test_ids])
print("train size = {}, test size = {}".format(len(train_corpus.get_utterance_ids()),
                                               len(test_corpus.get_utterance_ids())))

train size = 2078, test size = 100
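The split above holds out the last 100 utterance IDs. A minimal, library-free sketch of the same idea is shown below; `holdout_split` and `accuracy` are hypothetical helper names, not ConvoKit functions. Storing the held-out IDs in a set keeps each membership test constant-time, which matters little for 100 IDs but helps with larger test sets.

```python
def holdout_split(ids, n_test):
    """Split a list of IDs into (train_ids, test_ids), holding out the last n_test."""
    if not 0 < n_test < len(ids):
        raise ValueError("n_test must be between 1 and len(ids) - 1")
    test_set = set(ids[-n_test:])
    train_ids = [i for i in ids if i not in test_set]
    test_ids = [i for i in ids if i in test_set]
    return train_ids, test_ids


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

A function like `accuracy` could be used to compare the classifier's predictions on the held-out utterances with their `Binary` labels before the classifier is applied to the Diplomacy messages.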


In [14]:
#We create the classifier which uses the feature politeness_strategies to classify the politeness of each message (utterance)
clf = Classifier(obj_type="utterance", 
                        pred_feats=["politeness_strategies"], 
                        labeller=lambda utt: utt.meta['Binary'] == 1)
clf.fit(train_corpus)

Initialized default classification model (standard scaled logistic regression).


<convokit.classifier.classifier.Classifier at 0x240f55e4e88>

In [15]:
#We make the prediction on the diplomacy corpus
test_pred = clf.transform(corpus)
clf.summarize(test_pred)

id,prediction,pred_score
Game7-austria-russia-110,0,0.011240
Game7-austria-russia-133,0,0.015805
Game5-turkey-france-3,0,0.018057
Game3-england-france-45,0,0.018603
Game11-austria-germany-110,0,0.018603
...,...,...
Game8-turkey-france-136,1,0.998873
Game1-england-france-16,1,0.998889
Game1-italy-russia-20,1,0.999130
Game4-turkey-england-6,1,0.999174


In [33]:
#Here is an example of how it works: 2 messages, one judged polite and the other judged impolite
pred2label = {1: "polite", 0: "impolite"}
test_ids = corpus.get_utterance_ids()[-100:]
for i, idx in enumerate(test_ids[27:29]):
    print(i)
    test_utt = corpus.get_utterance(idx)
    ypred, yprob = test_utt.meta['prediction'], test_utt.meta['pred_score']
    print("test utterance:\n{}".format(test_utt.text))
    print("------------------------")
    print("Result: {}, probability estimates = {}\n".format(pred2label[ypred], yprob))

0
test utterance:
Ok. I was thinking... do you want to have Munich and I get Belgium? May require a bit of trust, but I am full in for this alliance. Also Munich is better for you to have then me
------------------------
Result: polite, probability estimates = 0.6732540104404642

1
test utterance:
I actually thought about that, but it's considerably riskier. A Ruhr is decently likely to have its support cut, while it's impossible for that to happen to A Munich
------------------------
Result: impolite, probability estimates = 0.23976556264648305



In [23]:
#Here is the dataset once we've added the politeness coefficient
a=corpus.get_utterances_dataframe()
a.head()

id,timestamp,text,speaker,reply_to,conversation_id,meta.speaker_intention,meta.receiver_perception,meta.receiver,meta.absolute_message_index,meta.relative_message_index,meta.year,meta.game_score,meta.game_score_delta,meta.deception_quadrant,meta.parsed,meta.politeness_strategies,meta.politeness_markers,meta.prediction,meta.pred_score
Game1-italy-germany-0,74,Germany!\n\nJust the person I want to speak wi...,italy-Game1,,Game1-italy-germany,Truth,Truth,germany-Game1,74,0,1901,3,0,Straightforward,"[{'rt': 0, 'toks': [{'tok': 'germany', 'tag': ...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen...",0,0.112246
Game1-italy-germany-1,76,"You've whet my appetite, Italy. What's the sug...",germany-Game1,Game1-italy-germany-0,Game1-italy-germany,Truth,Truth,italy-Game1,76,1,1901,3,0,Straightforward,"[{'rt': 2, 'toks': [{'tok': 'you', 'tag': 'PRP...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen...",0,0.107987
Game1-italy-germany-2,86,👍,italy-Game1,Game1-italy-germany-1,Game1-italy-germany,Truth,Truth,germany-Game1,86,2,1901,3,0,Straightforward,"[{'rt': 0, 'toks': [{'tok': '👍', 'tag': 'JJ', ...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen...",0,0.220636
Game1-italy-germany-3,87,It seems like there are a lot of ways that cou...,germany-Game1,Game1-italy-germany-2,Game1-italy-germany,Truth,Truth,italy-Game1,87,3,1901,3,0,Straightforward,"[{'rt': 1, 'toks': [{'tok': 'it', 'tag': 'PRP'...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen...",1,0.685146
Game1-italy-germany-4,89,"Yeah, I can’t say I’ve tried it and it works, ...",italy-Game1,Game1-italy-germany-3,Game1-italy-germany,Truth,,germany-Game1,89,4,1901,3,0,Unknown,"[{'rt': 5, 'toks': [{'tok': 'yeah', 'tag': 'UH...","{'feature_politeness_==Please==': 0, 'feature_...","{'politeness_markers_==Please==': [], 'politen...",0,0.171131


### Step 3: Sentiment extraction

To extract the sentiment feature, we use Stanford CoreNLP. First, we had to download Stanford CoreNLP from:

https://stanfordnlp.github.io/CoreNLP/download.html#getting-a-copy

The download contains the CoreNLP code and the models it needs to analyse text.

Once it is downloaded, we start a CoreNLP server locally with Java by running this command in a terminal, from the CoreNLP folder:

**java -mx1g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 50000**

The server is then reachable on localhost, port 9000.

The method for using Stanford CoreNLP to extract sentiment was found at:

https://github.com/coffeewithshiva/Sentiment_Analysis_Stanford_NLP/blob/master/sentiment_analysis_stanford_nlp.py
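For readers who want to see what a request to the local server looks like without the `pycorenlp` wrapper, here is a minimal sketch using only the standard library. It assumes the server accepts the text as the POST body and the annotation settings as a JSON-encoded `properties` query parameter; `build_corenlp_url` and `annotate` are hypothetical helper names, and the `timeout` property simply mirrors the value passed in this notebook.

```python
import json
import urllib.parse
import urllib.request


def build_corenlp_url(base_url="http://localhost:9000", annotators="sentiment", timeout_ms=60000):
    """Build the server URL with the annotation properties encoded as a query string."""
    properties = {
        "annotators": annotators,
        "outputFormat": "json",
        "timeout": timeout_ms,
    }
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return f"{base_url}/?{query}"


def annotate(text, base_url="http://localhost:9000"):
    """POST the text to a running CoreNLP server and return the parsed JSON reply."""
    request = urllib.request.Request(
        build_corenlp_url(base_url),
        data=text.encode("utf-8"),
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Only `annotate` needs the server to be running; `build_corenlp_url` can be inspected on its own to check which settings are sent.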

In [38]:
from pycorenlp import StanfordCoreNLP

In [39]:
nlp = StanfordCoreNLP('http://localhost:9000')

In [40]:
import math
#This function returns a list containing the sentiment of each sentence of a message (text)
#If a message cannot be analysed (for example, it takes over a minute), we return NaN. This happens rarely
#There are more than 17,000 messages in total, so running it takes a long time.
def get_sentiment(text):

    #We use Stanford CoreNLP to extract the wanted features
    results = nlp.annotate(text, properties={
        'annotators': 'sentiment, ner, pos',
        'outputFormat': 'json',
        'timeout': 60000,
        })

    sentiment_acc = []
    #For each sentence, we append 1 if the sentence is positive, -1 if negative and 0 otherwise
    try:
        for s in results["sentences"]:
            if s["sentiment"] == "Negative":
                sentiment_acc.append(-1)
            elif s["sentiment"] == "Positive":
                sentiment_acc.append(1)
            else:
                sentiment_acc.append(0)
        print(sentiment_acc)
        return sentiment_acc
    except Exception:
        #results has no usable "sentences" entry (for example after a timeout)
        print("error")
        return math.nan
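To our understanding, the CoreNLP sentiment annotator can also return the labels `Verypositive` and `Verynegative`, which the function above records as 0. The sketch below is one possible variant that folds them into 1 and -1. The label set is an assumption to check against the server's actual output, and `sentence_polarities` is a hypothetical helper that works on an already parsed response, so it can be tried without a running server.

```python
import math

# Assumed label set of the CoreNLP sentiment annotator; anything else counts as neutral.
POLARITY = {
    "Verynegative": -1,
    "Negative": -1,
    "Neutral": 0,
    "Positive": 1,
    "Verypositive": 1,
}


def sentence_polarities(results):
    """Return one polarity per sentence, or NaN if the response is unusable."""
    try:
        sentences = results["sentences"]
    except (TypeError, KeyError):
        # For example, the server returned an error string instead of JSON.
        return math.nan
    return [POLARITY.get(s.get("sentiment"), 0) for s in sentences]
```

Whether strong and mild sentiment should be merged like this is a modelling choice; keeping five levels would be an equally reasonable alternative.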

In [34]:
#We add a new column to our dataframe containing the sentiment list. This takes a very long time to compute
data_sentiment=a
data_sentiment["sentiment"]=a["text"].apply(lambda x: get_sentiment(x))

[0, 0, -1, -1, 0, 0, 1, 0]
[0, 0]
[0]
[-1]
[-1, -1, 0, 0, 0, -1, 1, 1]
...
[-1]
error
[-1]
...

In [48]:
#Here are a few examples of the function applied to some sentences:

text_1 = "I smile whenever I see you."

text_2 = "I have never seen something as disgusting as this."

text_3 = "the cat eats a mouse."

print(text_1, " sentiment:  ", get_sentiment(text_1)) #positive sentiment
print(text_2, " sentiment: ", get_sentiment(text_2)) #negative sentiment
print(text_3, " sentiment: ", get_sentiment(text_3)) #neutral sentiment

[1]
I smile whenever I see you.  sentiment:   [1]
[-1]
I have never seen something as disgusting as this.  sentiment:  [-1]
[0]
the cat eats a mouse.  sentiment:  [0]


In [36]:
sdata=data_sentiment
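The sentiment column holds one list per message, or NaN when the analysis failed. One way to turn it into per-message features is to count the positive, neutral and negative sentences. The sketch below is an illustration under that assumption; `sentiment_counts` is a hypothetical helper name and is not part of the notebook.

```python
def sentiment_counts(polarities):
    """Count positive, neutral and negative sentences in one message.

    Returns None when the analysis failed (NaN instead of a list).
    """
    if not isinstance(polarities, list):
        return None
    return {
        "positive": sum(1 for p in polarities if p > 0),
        "neutral": sum(1 for p in polarities if p == 0),
        "negative": sum(1 for p in polarities if p < 0),
    }
```

It could be applied to the dataframe with something like `sdata["sentiment"].apply(sentiment_counts)`.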

### Step 4: Discourse extraction

To extract the discourse features from the messages, we use the Penn Discourse Treebank 2.0 (pdtb2).

From:

https://github.com/cgpotts/pdtb2

we download pdtb2.csv and pdtb2.py, which we put in the same folder as this notebook. The way we compute the discourse features of the sentences is simple:

    1) We extract the discourse markers from the pdtb2 corpus
    2) For every sentence, we count how many of these markers it contains
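The second step can be sketched with a regular expression, as below. This is a minimal illustration, not necessarily the counting method used later in the project: `build_marker_pattern` and `count_discourse_markers` are hypothetical helper names, matching is case-insensitive, and word boundaries prevent a marker such as "then" from matching inside a longer word.

```python
import re


def build_marker_pattern(markers):
    """Compile one case-insensitive regex matching any marker as a whole phrase."""
    if not markers:
        raise ValueError("at least one marker is required")
    # Longest markers first, so a multi-word marker wins over a shorter one
    # that starts at the same position.
    ordered = sorted(set(markers), key=len, reverse=True)
    alternatives = "|".join(re.escape(m) for m in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def count_discourse_markers(text, pattern):
    """Return the number of non-overlapping marker occurrences in the text."""
    return len(pattern.findall(text))
```

For example, with a pattern built from "however" and "then", the sentence "However, I will move then." contains two markers.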

In [50]:
#first we need to import and read the discourse corpus
from pdtb2 import CorpusReader
discourse_corpus = CorpusReader('pdtb2.csv')

In [53]:
#We use the attribute ConnHead to extract the discourse markers from the corpus and add them to a list
discourse_markers=[]
for datum in discourse_corpus.iter_data(): 
    sc= datum.ConnHead
    if sc not in discourse_markers:
        discourse_markers.append(sc)

#We remove the first discourse marker, which is "None": it isn't a real discourse marker
#but appears because some entries have no marker
discourse_markers.pop(0)

#We remove the last discourse marker because it is an 8-word marker that is not a single English expression
#It was probably there by mistake. This marker was:
# "on the one hand on the other hand"
discourse_markers=discourse_markers[:-1]

#Finally, just as in the paper, we remove the unwanted markers from the list, i.e. the ones that are too common
unwanted_discourse_markers=['And', 'and', 'for','as', 'but', 'if', 'or', 'so']

for u in unwanted_discourse_markers:
    discourse_markers.remove(u)
    



In [54]:
#Here is the final list
discourse_markers

['once',
 'although',
 'though',
 'because',
 'nevertheless',
 'before',
 'for example',
 'until',
 'previously',
 'when',
 'then',
 'while',
 'as long as',
 'however',
 'also',
 'after',
 'separately',
 'still',
 'so that',
 'moreover',
 'in addition',
 'instead',
 'on the other hand',
 'for instance',
 'nonetheless',
 'unless',
 'meanwhile',
 'yet',
 'since',
 'rather',
 'in fact',
 'indeed',
 'later',
 'ultimately',
 'as a result',
 'either or',
 'therefore',
 'in turn',
 'thus',
 'in particular',
 'further',
 'afterward',
 'next',
 'similarly',
 'besides',
 'if and when',
 'nor',
 'alternatively',
 'whereas',
 'overall',
 'by comparison',
 'till',
 'in contrast',
 'finally',
 'otherwise',
 'as if',
 'thereby',
 'now that',
 'before and after',
 'additionally',
 'meantime',
 'by contrast',
 'if then',
 'likewise',
 'in the end',
 'regardless',
 'thereafter',
 'earlier',
 'in other words',
 'as soon as',
 'except',
 'in short',
 'neither nor',
 'furthermore',
 'lest',
 'as though',
 

In [40]:
# Returns the elements of lst1 that also appear in lst2 (order and duplicates of lst1 are kept)
def intersection(lst1, lst2):
    return [value for value in lst1 if value in lst2]

# Searches for discourse markers in a text
def search_discourse(text, markers):
    # First we tokenize the text into words
    tokens = nltk.word_tokenize(text)

    # Because the markers can have up to 4 words, single tokens aren't enough:
    # we also need to search through bigrams, trigrams and fourgrams
    bigrams = [" ".join(pair) for pair in nltk.bigrams(tokens)]
    trigrams = [" ".join(trio) for trio in nltk.trigrams(tokens)]
    fourgrams = [" ".join(quad) for quad in nltk.ngrams(tokens, 4)]

    # We keep every n-gram that is a marker, as a single flat list
    candidates = tokens + bigrams + trigrams + fourgrams
    return intersection(candidates, markers)
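
One caveat: the comparison is case-sensitive, and the markers in the list shown above are lowercase, so a sentence-initial "However" is not counted. The following is a minimal sketch of a case-insensitive variant, not the method used in this notebook. It replaces `nltk.word_tokenize` with a simple regular expression so that it runs without NLTK data, and `ngrams` and `search_discourse_ci` are hypothetical names.

```python
import re


def ngrams(tokens, n):
    """Return all contiguous n-grams of `tokens` as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def search_discourse_ci(text, markers, max_len=4):
    """Case-insensitive search for markers of up to `max_len` words."""
    marker_set = {m.lower() for m in markers}
    # Simplified tokenizer (assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    found = []
    for n in range(1, max_len + 1):
        found.extend(g for g in ngrams(tokens, n) if g in marker_set)
    return found


sample_markers = ["however", "as long as", "on the other hand", "then"]
sample = ("However, I will support you as long as you hold Munich. "
          "On the other hand, Russia worries me.")
found_markers = search_discourse_ci(sample, sample_markers)
print(found_markers)
```

Storing the markers in a set makes each lookup constant-time, which matters when the search is applied to every message in the corpus.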
    



In [41]:
data_sentiment["discourse_markers"] = data_sentiment.text.apply(lambda x: search_discourse(x, discourse_markers))
data_sentiment["discourse_number"] = data_sentiment.discourse_markers.apply(len)

In [43]:
data_sentiment.to_csv("final_data.csv")