## École Polytechnique de Montréal
## Département Génie Informatique et Génie Logiciel

## INF8460 – Traitement automatique de la langue naturelle - TP1

## Objectifs d'apprentissage: 

•	Savoir accéder à un corpus, le nettoyer et effectuer divers pré-traitements sur les données
•	Savoir effectuer une classification automatique des textes pour l’analyse de sentiments
•	Evaluer l’impact des pré-traitements sur les résultats obtenus


## Équipe et contributions 
Veuillez indiquer la contribution effective de chaque membre de l'équipe en pourcentage et en indiquant les modules ou questions sur lesquelles chaque membre a travaillé


Nom Étudiant 1: Luu Thien-Kim (1834378) 33.33%

Nom Étudiant 2: Mellouk Souhaila (1835144) 33.33%

Nom Étudiant 3: Younes Mourad (1832387) 33.33%

Nous avons tous travaillé ensemble sur chaque question

## Librairies externes

In [1]:
import os
import pandas as pd
from typing import List, Literal, Tuple

## Valeurs globales

In [2]:
data_path = "data"
output_path = "output"

## Données

In [3]:
def read_data(path: str) -> Tuple[List[str], List[bool], List[Literal["M", "W"]]]:
    data = pd.read_csv(path)
    inputs = data["response_text"].tolist()
    labels = (data["sentiment"] == "Positive").tolist()
    gender = data["op_gender"].tolist()
    return inputs, labels, gender

In [4]:
train_data = read_data(os.path.join(data_path, "train.csv"))
test_data = read_data(os.path.join(data_path, "test.csv"))

train_data = ([text.lower() for text in train_data[0]], train_data[1], train_data[2])
test_data = ([text.lower() for text in test_data[0]], test_data[1], test_data[2])

## 1. Pré-traitement et Exploration des données

### Lecture et prétraitement

Dans cette section, vous devez compléter la fonction preprocess_corpus qui doit être appelée sur les fichiers train.csv et test.csv. La fonction preprocess_corpus appellera les différentes fonctions créées ci-dessous. Les différents fichiers de sortie doivent se retrouver dans le répertoire output.  Chacune des sous-questions suivantes devraient être une ou plusieurs fonctions.

In [5]:
train_path = os.path.join(data_path, "train.csv")
test_path = os.path.join(data_path, "test.csv")

train_phrases_path = os.path.join(output_path, "train_phrases.csv")
test_phrases_path = os.path.join(output_path, "test_phrases.csv")

#### 1) Segmentez chaque corpus en phrases, et stockez-les dans un fichier `nomcorpus`_phrases.csv (une phrase par ligne)

In [81]:
import nltk
nltk.download("punkt") 
nltk.download("wordnet")
import csv

def segmentSentences(path) :
    data = read_data(path)[0]
    if not os.path.isdir(output_path) :
        try:
            os.mkdir(output_path)
        except OSError:
            print ("Creation of the directory %s failed" % path)
        else:
            print ("Successfully created the directory %s " % path)
    newFilePath = output_path + '/' + os.path.splitext(os.path.basename(path))[0] + "_phrases.csv"
    file = open(newFilePath, "w")
    with open(newFilePath, "w") as f: 
        for sentences in data :
            sentences = nltk.sent_tokenize(sentences)
            for sentence in sentences:
                sentence = sentence.replace('"', '""').replace('"', '""')
                f.write('"'+sentence+'"')
                f.write("\n")
    return newFilePath


[nltk_data] Downloading package punkt to /Users/kimluu/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to /Users/kimluu/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


In [84]:
segmentSentences(train_path)
segmentSentences(test_path)

"I don't think any one there has EBOLA Bob Latta You should be back in Washington actually getting something done there on the House floor."
";-)...anything other than jeans and t-shirts are superfluous, by the way."
"'Update your wardrobe'...pfft."
"Meh, I could only get to 8."
"Need to work up."
"A bill consisting of a single sentence."
"Very well done, sir."
"So far, so good."
"Thx !"
"My buddy Jeff Johnson was your prop master on that."
"she had me at everlasting youth."
"Congratulations to you for a well deserved recognition!"
"baffoon, idiot, dumb."
"The intelegent conversation continues......."
"They don't deserve that honour(stupid Hollywood movie business people)"
"Yawn!"
"Is this honestly news?"
"Same to you brotha!"
"Get at it :)"
"Would good to know how the age of all of these things have been measured."
"the perfect society is shaped in the form of a pyramid, the old at the top and the young at the bottom supporting the old."
"if we kill off the young that will take care o

"what do you drink with these Chevy??"
"goat milk??"
"lol"
"Thank you sir."
"Need all the support I can get!"
"I think we need to keep empty profile empty heads out of Facebook @courtney"
"Wonderful and so much to learn from these stories."
"It would be awesome if y'all come to okc"
"my new id :) please add me :) <3"
"a really charismatic speech.I wonder if we can control our right brain to have a very different feeling of our world.maybe when we have a deeper understanding of our universe and ourself ,we will treasure what we have and the world can become more harmonious."
"I love the show, but I'm wildly crazy about Jermaine Fowler!!"
"He's funny and sweet, but he has one of the most lovable faces I've ever seen...he makes me smile whenever I see him."
"<3 And I love the radio show, Craig, I just wish I could watch as well as listen!!"
"<3"
"I am sad you did not cast your vote for a Fellow Texan Rep. Louie Gohmert (TX-01) you won't be getting our vote next time around."
"Lol."
"I'll 

"Thank You Ruben for being a champion of the environment on top of being a champion of the the people."
"no problem, and ditto :)"
"beautiful :-* <3 I love you <3"
":D Thanks!"
"It's good to be back!"
"Gotta try get some better lifts again now haha"
"Great job."
"God bless ."
"Hell yeah Josh."
"Sounds really good."
"Can't wait"
"i have another question, is pryor and his staff going to take the illegal subsidy?"
"An amazing cross between Don Ho, Connor McLeod, and Al Pacino."
"Thanks for the follow!"
"Hoping you feel better......"
"yes the republican plan is better, oh wait after seven years they are to stupid to come up with one mainly they the pro-life party don't give a dam if people die because they can't afford health insurance"
"Your presence, support and help will mean so much to the Puna families."
"My heart goes out to them who are currently suffering."
"How would that stop someone from voting more than once?"
"So your telling me there's a chance!"
"So much love for you and you

"Your welcome and thank you in return."
"Good luck with your goals!"
"Congratulations Karen!"
"Well earned!!!"
"!"
"Thank you for the FB!"
"I have a bestie named """"Karli"""", Ms. Carly :) Looking forward to propping you in my feed :)"
"Thanks for following back!"
"He faced overwhelming challenges that most of us, in our lives, will never have to face."
"But his intelligence, his passion, and his determination brought about success and he was able to help not only himself, but his family and his community as well."
"You're an inspiration, William Kamkwamba."
"How can Dennett's argument be 'compelling' if his conclusion is that you can't trust your own brains?"
"Surely this is self-defeating!"
"I know Craig likes crazy socks."
"Check these out."
"Also an inspiring story."
"(Hopefully I posted this right) https://www.facebook.com/johnscrazysocks/posts/275038306233288:0"
"Same here!"
"Work hard, play hard:)"
"It's cool, mine was too slow."
"Keep it up!"
"vamos todavia q quiero verte en l

"That is exciting!"
":)"
">lets just FUCK the entire english language, right."
"in."
"the."
"asss >FTFYFH FTFTFTFY"
"Oh, I just read it wrong then."
"For all my muso friends, including my fellow steel panners!....So inspiring.......but even more than that, I so get."
"What she is saying and doing......enjoyed every minute of it."
"Interesting but this seems to be based on opinion vs actual data."
"Also this goes against Steven Johnsons speech on where good ideas come from."
"http://www.ted.com/talks/steven_johnson_where_good_ideas_come_from.html"
"I love this talk, as it touches some points that many of us would love to be able to ignore (talk about numbing)."
"Already shared it with my friends, and I'm sure I'll watch it again whenever I feel insecure on my path."
"That was a great and energetic talk."
"I would defiantly like to see another one by him."
"Thanks for the follow back!"
"Good luck with your goals!"
"THAT'S TRUTH, THAT'S TRUTH!"
"!"
"Help the Middle Class by enacting HR 82

":P"
"time to get back to work, you people take to much time off as is."
"Thank you for visiting us here at Ft Rucker!"
"Thank you so much Jim, and happy Passover for you and your family..."
"really inspiring and moving..Was moved both by what he was saying and by how he said it - all that passion!"
"Makes you think of your own dreams...and if any of those dreams can make this place a better world"
"Cant wait 2 see u Queen:) Bless and i know u will Win"
"Thank you for taking the time to speak with us."
"TY AND GOOD MORNING AND HAPPY MONDEY FOR YOU TRACI....!"
"I LOVE YOU...!!"
";-) :-*** <3"
"Merry Christmas to you ..."
"Doing a great Job... Have a fruitful season ..."
"Thanks for visiting CCES!"
"The Upper School truly enjoyed the discussion this morning."
"I'm glad you are working for the most vulnerable!"
"Well, you are my hero - I didn't know about your role in the riots - but I'm not surprised."
"I never miss an episode and I tell all my friends about Greenleaf - can't wait for th

"All the beautiful people!!"
"!"
"You look so handsome"
"One more for my collection !"
"It's so cool."
"If the kit was to be sold for ordinary people I think it would be a good tool for learning about electronics."
"Fascinating and promising medical development"
"damn wish i was there"
"yeah, i need to get back to exercising regularly."
"I've been slacking this semester with school and attempting to spend my free time with Jen."
"I'll try to keep this up."
"Do you know if there is an app for this?"
"Child pornography is legal now?"
"Obama got his behind handed back to him yesterday by the leaders of the emerging Republican Majority."
"As good as I expected..."
"I applaud you Monica!"
"You are claiming your story and using your voice for good."
"To the Ted moderators I respectfully ask that the team review the moderation on this thread - the misogynistic comments that are getting through here are not helpful to discussion"
"I think more interesting then the talk was the photographers co

"Nice talk."
"Would a car or motorcycle tire grip the road better if the tire (rubber) was embedded with the product ?"
"Please do, Im pretty interested."
"Thank you."
"Merry Christmas!!!!"
"!"
"Yeah!!!!"
"#seahawks baby!"
"!"
"You see?"
"If you get the conditions right, the implausible becomes plausible and the impossible might become possible :) Makes you wonder what mankind could achieve by embracing ideas like this one."
"Nice talk."
"I would love to try that."
"I think its included as a whole workout"
"All the way from Baltimore, Maryland---KNOCK 'EM DEAD B!"
"<3"
"Nice Vincent Ward II."
"Zombies like you."
"You all can't part."
"Lol.."
"A cute pink costume ball half bunny costume it's cute Gio Benitez"
"Support for ethnic cleansing and oppression is progressive?..."
"https://www.facebook.com/PalestineProjectPage/videos/661351454015967/?hc_ref=NEWSFEED"
"And thank you Steven Schuyler and Liam Foote for your service, and your voice in support of common sense solutions to gun violen

"I the couch to 5k with a bunch of coworkers."
"It's a great way to introduce yourself to running."
":)"
"I dont agree."
"Being physically fit is a major advantage over being androgynous."
"Thanks for following!"
"Best of luck with the gains :)"
"I'm with you Dutch."
"And Israel!"
"Thank you!"
"I studied music for my first few 5 semesters of school; the Harp Performance piqued my interest."
"I love finding musicians who like to train!"
"They took the land to protect it......."
"That Fred!!!!!!"
"Grrrrr!"
"Gary and I send our best wishes!"
"!"
"Good luck on the run."
"We will be supporting you, although you won't see us, along with other hams from Phoenix CERT and allies."
"North Koreans are passing through a hard time...Indeed,after every hardship there is ease.."
"Congratulations, Soror."
"Keep up the great work."
"YEA!!!!!"
"good job!!!"
"*cheer cheer*"
"Not bad to lose to the partner... Shelby."
"Go get'em tomorrow!"
"From the looks of your personal records, I'm pretty sure I'm actu

"I get messages weekly this helps some."
"Thank you, lovely lady."
"I'm glad I discovered this!"
"Very motivated to make some progress!"
"350.00 or 2000.00 is too steep for me!"
"Sorry to disappoint :( Maybe one day I'll get one of me on the kayak."
"Your new pp is good though :-)"
"Him and my uncle tore that engine down."
"She was endorsed by Paul Ryan which is why she is another RINO FRAUD."
"HA!"
"thats a paysite, nobody pays for porn."
"im semi that ur semi"
"Congratulations Al."
"You deserve all the glory tonight by setting the standard for all Rattlers!"
"What you on about tryna act like the hijab is not modest or respectful of a womans rights when the average car ad is a **** in a bikini"
"I would love to hear from him """"these are the new steps to get rid of the Republican Party once an for all""""."
"Thanks for the follow back!"
"Looking good."
":)"
"Not a problem, sir!"
"Thank you!"
"were you high or something?"
"I am choosing NAU over ASU entirely because of cost."
"ASU has

"Cane wait!!!"
"Ur a beast Greg!"
"Corp of engineers is our problem."
"Kids Need Free Minds you came up with one Sheila J Lee Mam"
"Thanks for the follow too!"
"Don't worry to much about the competition from me, I'm not that much of a menace!"
"I wonder if knee push ups count into the challene..."
"I'll have to ask."
"XD"
"If you genuinely think like that, you are either fourteen or a sociopath."
"Were not living in the Iron Age anymore; people have the luxury of not being survivalist robots, for one thing."
"But he is a very talented vocalist."
"I just got a 50$ GC there, so I have a feeling Ill be going there a few times."
"Any other decent recommendations for the nutrition-conscious diner?"
"It is great!"
"Probably the most fun I have had since organized sports."
"Thatd be great."
"Breeze is a bit of a young Jericho."
"el azul es mi color preferido"
"Great Team of Weather Reporters!"
"Incredible!"
"I was absolutely mesmerized by this talk and the social and physical science behind i

"This excerpt is only a tiny taste of what is a great show!"
"Watching All-Ireland GAA!!"
"!"
"It would most likely only be grab a few more ghosts and kill some new bosses."
"But this is open for suggestion."
"I just think a little bit of rewards would be cool for max people and people working their way there."
"ALERT: A government """"whistleblower"""" inside the Obama regime has blown the lid off of why Barack Hussein Obama has recently signed Executive Orders that give him the power to declare martial law across America!"
"http://ppsimmons.blogspot.com/2012/05/government-whistleblower-says-obama-is.html"
"I'm sure Algore was glad to see you again after that questioning you gave him!"
"LOVED IT!"
"I always pictured a somewhat stern grandmother in a rocking chair, knitting and listening, and watching, and looking on, and ..."
"Remember those lumberjack commercials Sean did?"
"They were awesome!"
"How about some more of those?!"
"!"
"Wow."
"Very amazing teacher."
"Very beautiful and po

"when/if you ever start serving the people of the ENTIRE state, then i'll congratulate you."
"However, i'm sure you'll just do more of the same catering to your own self serving interests."
"Just awesome."
"He is the man."
"Insightful, helpful!"
"Thanks!"
"Best director ever!"
"Explain to them why you guys continue letting Obama break our laws."
"Lawlessness abounds in D.C. Where are the checks and balances?"
"Arrest him."
"Bravo Mr. and Mrs. Floyd Thomure- I wish we could give a medal to his lovely bride for supporting him all these years!!!"
"Thanks for your service!!"
"!"
"Twice as much light.."
"Very interesting way on closing the speech."
"Thank you!"
"<<Blushes>>"
"It was Blunt and his ilk that added the abortion language that prevents this bill from moving forward."
"You know that."
"Thank you as well :)"
"I wonder what software he uses to construct this presentation?"
"His presentation slides are really impressive!"
"I like to use this software for my presentation as well!"
"Aw

"NAVIENT BANK LIES CHEATS AND COMMITS FRAUD!"
"I HAVE AUDIO PROOF OF THEIR DECEPTION!"
"Glad to see another Odd Book!"
"Keep em coming!"
"I think Bob is a good guy."
"Don't see him doing anything though !"
"excellent open stance!"
"Are there federal agencies still operating that are not under a gag order?"
"I thought the current administration was in process of dismantling all federal agencies?"
"A reflective moment in time ...... emotive photo."
"Nice Shaun"
"recorded it!"
"You did very well, future Governor Pascrell!!"
"!"
"Do you actually think the Big 8 would be alive today if the SWC members didnt want to join?"
"I play the French Horn in an orchestra in Mexico."
":) I love families of musicians!"
"yaay!"
"Lucky you."
":)"
"The lead donkey needs to resign."
"Katie thank you so much!"
"Love your hair btw!"
"Was really moving."
"I'm sure to try it."
"Those counties were hit pretty hard by the flooding we had."
"It's good to know that they will be getting relief."
"Finally someone wa

'output/test_phrases.csv'

#### 2) Normalisez chaque corpus au moyen d’expressions régulières en annotant les négations avec _Neg L’annotation de la négation doit ajouter un suffixe _NEG à chaque mot qui apparait entre une négation et un signe de ponctuation qui identifie une clause. Exemple : 
No one enjoys it.  no one_NEG enjoys_NEG it_NEG .

I don’t think I will enjoy it, but I might.  i don’t think_NEG i_NEG will_NEG enjoy_NEG it_NEG, but i might.

In [85]:
def getPath(path) :
    if "train" in path :
        path = train_path
    elif "test" in path :
        path = test_path
        
    return path

In [86]:
import re

def normalize(path) :
    with open(path, "r") as f :
        data = list(f)
    
    newFilePath = output_path + '/' + os.path.splitext(os.path.basename(getPath(path)))[0] + "_negation.csv"
    file = open(newFilePath, "w")
    with open(newFilePath, "w") as f:
        for sentence in data:
            match = re.sub("(?i)(?<=not |n't | no )(.*?[,.(?!;]+)", lambda m: m.group(1).replace(" ", "_NEG ")
                           .replace(".", "_NEG.").replace(",", "_NEG,").replace("?", "_NEG?").replace("!", "_NEG!")
                           .replace("(", "_NEG(").replace(";", "_NEG;"), sentence)
            f.write(match)
            
    return newFilePath
            

In [87]:
normalize(train_phrases_path)
normalize(test_phrases_path)

'output/test_negation.csv'

#### 3) Segmentez chaque phrase en mots (tokenisation) et stockez-les dans un fichier `nomcorpus`_mots.csv. (Une phrase par ligne, chaque token séparé par un espace, il n’est pas nécessaire de stocker la phrase non segmentée ici) ;

In [104]:
def tokenize(path) :
    sentences = []
    
    with open(path, "r") as f :
        data = list(f)
        
    newFilePath = output_path + '/' + os.path.splitext(os.path.basename(getPath(path)))[0] + "_mots.csv"
    file = open(newFilePath, "w")
    with open(newFilePath, "w") as f: 
        for sentence in data :
            listTokens = nltk.word_tokenize(sentence)
            tokens = ' '.join(listTokens)
            tokens = sentence.replace('"', '""').replace('"', '')
            f.write('"' + tokens + '"')
            f.write('\n')
                
    return newFilePath

In [105]:
tokenize(train_phrases_path)
tokenize(test_phrases_path)

train_mots_path = os.path.join(output_path, "train_mots.csv")
test_mots_path = os.path.join(output_path, "test_mots.csv")

#### 4) Lemmatisez les mots et stockez les lemmes dans un fichier `nomcorpus`_lemmes.csv (une phrase par ligne, les lemmes séparés par un espace) ;

In [106]:
def lemmatize(path) :
    with open(path, "r") as f :
        data = list(f)
        
    newFilePath = output_path + '/' + os.path.splitext(os.path.basename(getPath(path)))[0] + "_lemmes.csv"
    lemmzer = nltk.WordNetLemmatizer()
    
    file = open(newFilePath, "w")
    with open(newFilePath, "w") as f: 
        for sentences in data :
            lemmes = [lemmzer.lemmatize(token) for token in sentences.split()]
            sentences = ' '.join(lemmes)
            f.write(sentences)
            f.write('\n')
                
    return newFilePath

In [107]:
lemmatize(train_mots_path)
lemmatize(test_mots_path)

'output/test_lemmes.csv'

#### 5) Retrouvez la racine des mots (stemming) en utilisant nltk.PorterStemmer(). Stockez-les dans un fichier `nomcorpus`_stems.csv (une phrase par ligne, les racines séparées par une espace) ;

In [114]:
def stemmize(path) :    
    with open(path, "r") as f :
        reader = csv.reader(f)
        data = list(reader)
        
    path = getPath(path)
    newFilePath = output_path + '/' + os.path.splitext(os.path.basename(path))[0] + "_stems.csv"
    
    stemmer = nltk.PorterStemmer()
    
    file = open(newFilePath, "w")
    with open(newFilePath, "w") as f: 
        for sentences in data :
            for sentence in sentences :
                stems = [stemmer.stem(token) for token in sentence.split()]
                sentences = ' '.join(stems)
                f.write('"' + sentences + '"')
                f.write('\n')
                
    return newFilePath
    

In [115]:
stemmize(train_mots_path)
stemmize(test_mots_path)

"I don't think ani one there ha ebola bob latta you should be back in washington actual get someth done there on the hous floor."
";-)...anyth other than jean and t-shirt are superfluous, by the way."
"'updat your wardrobe'...pfft."
"meh, I could onli get to 8."
"need to work up."
"A bill consist of a singl sentence."
"veri well done, sir."
"So far, so good."
"thx !"
"My buddi jeff johnson wa your prop master on that."
"she had me at everlast youth."
"congratul to you for a well deserv recognition!"
"baffoon, idiot, dumb."
"the inteleg convers continues......."
"they don't deserv that honour(stupid hollywood movi busi people)"
"yawn!"
"Is thi honestli news?"
"same to you brotha!"
"get at it :)"
"would good to know how the age of all of these thing have been measured."
"the perfect societi is shape in the form of a pyramid, the old at the top and the young at the bottom support the old."
"if we kill off the young that will take care of us with thier s.s. taxes, we will have no support."

"i'm count on see you on saturday and/or sunday!"
"keep up the good work!"
"awww what a wonder blessings."
"just follow back, fellow 1000 lb."
"club member."
"nice workouts!"
"I wish I could go to sleep and never wake up..... deep financi reasons......."
"that is the sweetest thing for you to say!"
"happi new year to you, you are do super awesom and inspir me more than you can possibl know!"
"<333"
"hey moeee :)yea it absolut fine to train fasted...if you'r plan to train at 7pm...could be tricky..."
"the rose bowl stadium ha been design by the govern as a US histor landmark for obviou reasons."
"I notic that a few second after submit my comment."
"good stuff all around!"
"omg yes!!!!!"
"stephani cum"
"He just kill her."
"divorc is so messy."
"toujour aussi pro, que ce soit en match ou pour commenter!"
"thats... complet normal then... like she wa freak 12 tops, do you know what kinda weird shit kid do when theyr 12?"
"not say it okay what she did but cmon she wa bare a form person at th

"congratul to all of these emerg artists."
"thank to staff for point us to the link to vote."
"yep, it digital!"
"I person use kyle webster brush for photoshop."
"thi piece, I use a crayon one and a color pencil one."
"I highli recommend hi brushes!"
"congratul on cross the last technolog barrier in recycling/wast management!!"
"you'v help construct a fruit future."
"soph is a cuti patootie, no need to be jealous."
"give her a big kiss for me."
"<3 westi :)"
"absolut love it linda :D hope the book is go well!"
"I can't wait until I can get my hand on a copi of this!"
"it alway so excit find out about a new book by your favorit author, yay!"
"wonder talk amy!"
"I realli love to learn about bodi languag and how thi can show who we are."
"i love this!!"
"i could watch thi everi day!!"
"(and probabl will!)"
"zooey rocks!!"
"!"
"extrem interesting."
"We need to 'discover' the gene that creat superstit about animals, and anim appendages."
"onc we do, possibl we can breed it out of human and 

"thank for follow back:)"
"I would love to know how to make a donat that might help advanc thi new technology."
"If I had gold, Id give some to you."
"As fantast as thi theori sounds, use number make the so believ fact sound even more dubious."
"As a thai buddhist monk, even I had to laugh... bless condom - what next?"
"In a villag with no run water, no electricity, no televis and no telephones, are laptop realli the priority?"
"265,139 ride ha had a few one-second replies."
"I never have."
"theyr both on averag faster than me, Im pretti sure"
"It is particularli import for woman veteran who consid VA servic chariti servic as oppos to servic they have earned."
"thanks; thi is a great step forward for our servicepeople!"
"I spoke with steve at some length at the lincoln-reagan day lunch last month."
"besid the stand on fiscal respons I wa most impress by hi desir TO listen."
"It wa striking."
"pleas consid solar roads, too :) their first two round of fund were provid by the u.s. feder h

"Im with her!!"
"!"
"thi would not have been possibl if it wasn't the webb"
"kira kosarin como siempr teve muy hermosa"
"lol, tell me about it!"
"yeah I heard u r a great singer!"
"I rare comment, when I click thi talk I wa about to give up when I saw eat, pay love... but damn thi is the most inspir ted video I have seen!"
"veri inspir and resonating!"
"It made me curiou to read the book"
"you'r welcome."
"thank you for follow me."
"fantast talk!"
"I am thank that you continu to dream and share your dream with everyone!"
"that is veri true."
"We honor them by protect and serv our veteran of today."
"that tell them their sacrific wa not in vain."
"sick video!!!!!!!!!"
"go bethanie!!!!!!!!!"
"love the do!!!!"
"!"
"My pleasure; all I do is click, you'r the one do all the work."
"the onli realli amaz thing is how much I am on fitocracy."
":-)"
"On that note congresswoman, how DO you stand ON the iran nuke deal?."
"."
"."
"."
"thank you for speak up for us!"
"thi talk show us we an spend so

"congratul !!!!!"
"and keep go !!!"
"!"
"congratulations!"
"On to the next!"
"!"
"billi gonna take the republican primari debat and win the gop nomine for presid ......better believ it!!!"
"!"
"i'v eaten there kyrsten,good food!"
"london and gloriou in the same sentence... that new!"
":)"
"look back dure the last few decad , the clinton year were the most prosper and peac for our nation ."
"furthermor they were the onli time in gener that we had budget surplus , real peac , and real econom expans ."
"what a terrif match up with suzanne,helen or maureen...just love the style of play of suzann ,seen it on old vintag footag what grace ...."
"$40 with all you can eat concess and a free t shirt last time I wa at a buc game."
"I fuck hate basebal but at those price it just a nice place to hang out for an afternoon."
"monday im start stronglift 5x5!"
"just waitin for my tattoo to heal lol."
"go to start with the bar onli and ride thi program out and see how it goe"
"luv those boot beth!!"
"br

"We have to find effect way to deal with plastic problem."
"ha!"
"erica yackel dial - you have to watch and listen - chick dig scars!"
"http://thinkprogress.org/politics/2011/03/29/154368/rep-sean-duffy-complains-about-his-salary/"
"brene brown, i'm gonna rememb her name... her inspir talk simpli sum our human up."
"i'm grate astonish to have listen to thi :)"
"promis big dreams, and deliv morsel to keep the dream alive."
"classic intern tri and test elect gimmick"
"never seen insidi but love josh xx"
"that' a great article!"
"i'm write my amazon review sometim thi week, now that i'v finish the book."
"..."
"It will certainli give praise, but probabl not as well written as thi stranger article."
"I will do my best."
":)"
"I can't wait to get thi book."
"I love the odd thoma series!"
"haha."
"Is there a gener 'cardio' option?"
":pi total had to dig my car out of the snow yesterday, that wa a workout fershure."
"npr realli is a wast of money some thing DO need to lose fund"
"He is fire u

"shame shame shame for vote for rex tillerson....a a moder dem, I am depend on you and other senat to stand strong against thi administr incess threat to our democracy!"
"the key is to get the guy who made the mortgag mess out."
"don't let the one who made the mortgag mess give the solutions, they failed."
"look nice mr. ward as unsual"
"whi is it that the bill when pass wa revenu neutral but now that the republican want to repeal it the cost (read lower taxes) is $230 billion."
"what changed?"
"repres cousin!!!!"
"you make us proud!!"
":)"
"josh stewart, you'r such a stone cold fox."
":)"
"right deserved!"
"hey are u guy friends?"
"how lovely."
"U both are awesom :)"
"just side comment."
"interestingli those pictur look similar to the one taken by nazi' offic on top of human bodi and bones..."
"anytime!"
"have a great day :D"
"thanks!"
"they'r come along!"
"no: I disagre with you becaus you are a racist dick."
"thank for the follow back!!"
"check out your web site and fito knowledg pa

"you sure look amaz but none of them even come close to be as good as you."
"i'm a progress democrat Of america member write to thank you for support HR 292 and HJ re 119; I hope you'll lead on these bill and whip up some votes."
"also we ask that you cosponsor the two other bill outlin in our letter to you: viz."
"HJ re 44 and 113."
"thank you!"
"www.pdamerica.org"
"it probabl wors in poland."
"If you arent with them, you are a jew support communist that want to destroy the cathol church and polish traditions."
"As our ours...pray for the victims, a non-partisan act of faith."
"do you expect a 12 year old who wa rape to carri the child?"
"https://www.facebook.com/winningateverything/photos/a.115511545191454.20134.115346608541281/654566051285998/?type=1"
"that realli cool and sexy, nice fangs!"
"and how you plan on fund this?"
"I hope not on the back of our kids...."
"yes!"
"thi is usual my second or third, simpli becaus the person Im show it to doesnt know yet who stringer or avon is,

"maluquito que esti :( Te amo ok?"
"she regret it in the book."
"I forget who she name in the show but she stupidli impati and use it on her boss who is mean to her, big fuck whoop."
"thank you!"
"likewis :)"
"good looking."
"that girl doesnt look bad, either."
":P"
"time to get back to work, you peopl take to much time off as is."
"thank you for visit us here at Ft rucker!"
"thank you so much jim, and happi passov for you and your family..."
"realli inspir and moving..wa move both by what he wa say and by how he said it - all that passion!"
"make you think of your own dreams...and if ani of those dream can make thi place a better world"
"cant wait 2 see u queen:) bless and i know u will win"
"thank you for take the time to speak with us."
"TY and good morn and happi mondey for you traci....!"
"I love you...!!"
";-) :-*** <3"
"merri christma to you ..."
"do a great job... have a fruit season ..."
"thank for visit cces!"
"the upper school truli enjoy the discuss thi morning."
"i'm glad 

"sweet !"
"simple, concis and illuminating: beauti"
"thank for follow back."
"have you travel much in time?"
"yea, steve!"
"our man for such a time as this!"
"now you just have to get a real job"
"bring to new jersey also thi place is great"
"Oh you two are still friend after that 11 hour ordeal a few year back?"
"?"
"thank you for speak today at anaconda' memori day service."
"It wa an honor have you there."
"thanks!"
"i'm all about those natti gains!"
":D"
"you can't deni what doesn't exist."
"between your facebook updates, your youtub site and cspan, it ha been a joy to watch you work."
"thank for repres us the way you do."
"may god continu to bless you and your famili and your critic work in the house."
"elizabeth...thank you for give me a glimps into whi you are a success."
"It is your overal charisma!"
"absolut what got me were your close thought of ole!"
"o_o had no idea there wa go to be anoth one."
"butcher bay is an awesom video game, a classic."
"where were you when you were

"nice to meet you!"
"thi is the first time I see a succes group production."
"i'm happi to know someth like it is realli possible."
"and i'm sad that I know so littl about music and i'm (also first time) jealou I couldn't be part of this."
"thank you."
"may he rest in peac in heaven with our lord jesu christ"
"beauti smile, beauti pic!!"
"My daughter is 6yr old and she said she gonna beat u one day!"
"lol"
"ms. chapman, you are sell the repulican agenda say the republican are the onli one that have A plan TO save medicare."
"our social program don't need saving, just not turn over TO the republican party."
"best parad in the counti on monday is in tini rockdale, right where aston mill and lenni height get together."
"mani familiar face from our local government."
"I receiv in jesu name, amen!"
"brit your segment about trumpa potenti suit against the women accus him of grope (assault) is total bunk..ther is evidenv that he wa set up...that is the most relev part of the trial..th questio

"no joy when you are a paid agent of turkey who support isi and who' citizen are tri to murder our US troop http://youtu.be/3cvnnno4hlk"
"guy in sweater like that kill hilter."
"thank and welcom back [from injury]...happi training!"
"would love to know more about that fibr optic weav inspir by the octupu"
"So glad to hear thi happening, we can not solv problem creativ or critic with polar divid our country, proud to be a north shore resident."
"thank-you both."
"alreadi got it on kindl and will be read it at lunchtime."
"see you here."
"here is hope the blizzard won't cancel your boston shows."
"i'd say hit the scorpion bar after the show, but I know @craigyferg is abstaining."
"We learn about the world befor we enter it.our most import learn happen befor we are born, while we are still in the womb."
"and our health and well be throughout our life, is crucial affect by the nine month we spend in the womb."
"~anni murphi paul (book: origins)"
"and you as well!"
":) alway a pleasure, ms.

"emot correctness...i'v alway recognis it essenc (more correctly, comprehend it in feel - if anyon know a better word to describ this, pleas share) but these word support verbal translat beauti :) thank you sally!"
"seem like we share a similar profil description!"
";)"
"these are enjoy way to learn and school should start to use these teach method to feed children creativity."
"question everyth you hear, includ what I am tell you today."
"inspir peopl to be more self awar by lead by example."
"awesom"
"thi is whi I am alway studying."
"I will have to work with all kind of convert as we begin use a varieti of energi sourc"
"Im not happi about /r/books."
"ill probabl stop post there as it get flood even more with pictur of book and endless circlejerk over the same 6 books."
"that is also part of the shame, yes."
"I still disagre that we should stop drill for oil just becaus we spill some... how sensic is that?"
"jane it is bore veri bore and drawn out!!!!!!"
"also jane pleas tone down y

"(:"
"I miss watch your tv show"
"@john- thi is not about you!"
"thi post is about pray care for the famili who just lost their love ones.."
"pleas have merci and respect."
"principl OF great must BE constant goal OF usa"
"actual with dumbbells, you just log one set and it is assum you did both arms."
"for instance, if you do 5 set of 5 with a 10 lb dumbbel with each arm, you just log 10 lb 5 rep x 5."
"At least that is my understand of how thi works."
"just saw your interview on tavi in chicago, never heard of you until now, great interview and perspective!!"
"I will be read your book."
"yeaaaaay im glad u will be with fox 2 more year"
"It won't let me view it caus i'm in canada eh."
":( ... lol"
"hello, I realli appreci the fact that you teach philosohpi to prison inmates."
"such effort can realli help in contribut toward distinguish right and wrong action among criminals."
"check out some of my philosophi and ethic essay at http://www.researchomatic.com/philosophy-essay/"
"stand and

"what I don't like is brit hume, say quot quot quot everi 10 seconds."
"i' annoying."
"I switch to a differ channel when he' on air and I come back when he' done report"
"lol..."
"i'd total forgotten."
"next time, tell me the day of so I don't come in to work, mmkay?"
"say hi to your mom for me."
";-)"
"congratul bianca!"
"ridgeland is veri proud if you!"
"likewise, it' a pleasur : )"
"now just cut bannon out and you will be my hero"
"gee, where are all the job the republican promis as we now head into march?"
"good luck!"
"We will miss you!"
"god bless you and your family."
"i want to thank you mr water for all your hard work."
"i'm so glad that michel bachman bomb in iowa !"
"I found the talk vague."
"similar to say we should have an economi that oper less on greed."
"ye but it' built into the dynam of the system so doe he have ani idea on how make a signific system change?"
"I didn't hear it."
"thi wa wonderful!"
"what an inspir woman!"
"rivalri be damned, he is fine as hell"
"hones

"morn like this, I feel like i'v been run over."
"sorri I wasn't there fred."
"i'm in flagtaff, AZ where our daughter lives."
"heard you got zing a few time (all in fun)."
"I appreci your dedic to help the US get better"
"thank you ashle watkins, let hope a healthi dialogu can start"
"welcom to fitocracy!"
"thank for the follow, and if you have ani question about thi site, pleas feel free to ask."
"good luck."
"also... gotta get your about me section fill out."
"let peopl know what your about and what you wanna achieve."
":)"
"I way almost a goner at age 6 month now I am age 60 in august.a onli son"
"No problem."
"and thank for the prop."
"i'm realli like thi site, veri fluid and easi to catch on."
":D"
"good job he play realli well happi to see isner play a great match!!!"
"he is my favourit player"
"should have been fire a long time ago."
"We all knew he wa dirty."
"now let him pay for hi actions."
"pink is a pretti color for him."
"."
"thi talk remind me to be grateful, that while m

"It come on but it doesnt respond to ani buttons."
"NP read your about and notic it wa real similar to my follow and thank for the luck."
"seem like it go to help take me to my next level."
"I am a civil as well but in construct rather then design."
"exactly!"
"it just find time and drive to do that."
"Do you think it will come off to strong though if all of a sudden Im all about work out, with him?"
"Hi ashley!"
"im your big fan here in philippines!"
"these data are inde scary."
"that wa a realli good interview."
"I didn't know we had a person in congress that made that much sense."
"so excit for the potenti of thi new chapter for america"
"those ring thing you attach to the back of your phone to make it easi to hold onto"
"I think that our nation is great that peopl can make into offic without the need to be from the right family."
"are we not?"
"you point that out well in your respons to me."
"that the major of the founder were not slave owner and most were industrious."
"amazing, m

"i wont vote for you ani of the other senat you all take money and dont stand up for patriots."
"I still havent even been abl to sign in for the first time."
":D ... :("
"I remain veri unhappi that we onli had two choic today."
"veri inform talk about a veri import issue."
"No problem :] & you have a gorgeou dog."
"her improvis is too predictable."
"anyon can compos in a primit pseudo-romant style."
"especi in c-dur."
"reward from hard work."
"beauti"
"wow BS photo and no immedi import of our elect coming."
"stop play around and help the peopl jesu help us get you to repres"
"I am so total go to tri and make thi one."
"thank for the reminder."
"quantum flux: that data program is readili avail and easi to use!"
"han rosl ha help make it avail at gapminder.org have fun!"
"It is a fantast program!"
"I can't understand whi flag weren't lowered, not veri respect to these innoc victim of thi horribl tragedi"
"yesssss!"
"you deserv that and more!"
"happi boss day!"
"It is interest studi and t

"It would go to show he wa a true profession in control of all that he could control in thi sport that push the boundaries."
"they are all dirti and should be held account no matter what alligi they have with politicians."
"My mother' premium just went UP again the second time in 8 mos."
"what planet is hh on and where is their info come from?"
"they just manufactur fact to match their agenda!"
"did they explain how much they dump in the river legal of course?"
"i would love a sequel!!"
"!"
"I sens as much spin as solid evid or science."
"mahalo senat mazi K. hirono for thi veri import issue!"
"I myself is now experienc thi situat while tri to finish up my bachelor."
"again mahalo!"
"Is it just me or are the ted talk get less inspirational?"
"yeh, thi talk wa a littl interesting, but what she had to say wa not exactli rocket science."
"I suppos it show how stupid donor agenc can be...but not much more."
"thi might be the onli way."
"I doubt wed ever see a jump straight to a 3-year ban 

"anoth reason olpc is a good idea is books; In the west we have easi access to book but in develop countri there are book their kid would never have access to, with access to a laptop and the internet all that knowledg is within reach!"
"take a look around...kick your feet up and enjoy!"
"My mind is still tri to absorb all that she said and it natur consequences."
"impressive."
"yeah, I work with them."
"I do bioinformat - comput and data scienc appli to biology."
"I analyz dna sequenc data and genom on computers, it' fun!"
":)"
"As a former runner I salut you with laurels!"
"may el-khalil, you are exemplar to us all women, one of the bravest, most spirit and beauti woman, insid and outside."
"you two are the best team hawaii could hope for!"
"If onli everybodi could see that!!!!!!!!!!!!!"
"you both have such great track record base on the ideal of commun servic - admir :-)"
"noth like the rich grounds/steet of nashville."
"a good time for a stride not a race to think immigr reform."
"

"I can't wait for the day when I have photo op and say i'm do something, when in reality, i'm not."
"thi is crazy!!"
"how can parent allow this??"
"I didn't pay tuition for my boy to have safe places."
"what will they do in the real world??"
"?"
"thank for the follow back!"
"I dig your profil pic."
":-d"
"pray for you and them and our nation in jesu name."
"thank buddy, i'm readi for la!"
"can't wait."
":)"
"hire in the US hit 8-year high, job open hit 13-year high."
"thanks, #obama."
"http://bit.ly/1cszjrf [[share]]"
"you can!"
"Go through my site- maxbeechcreative.com"
"congrat and huge thank to congresswoman water for her tireless efforts."
"We appreic all that you do on behalf of your constitu and abroad."
"It is nice to have a politician who isn't in the news for deceit and corruption."
"I dress your husband' grave on memori day - he is one of my heroes."
"thank back!"
"yep, univers of alberta."
"you live around here?"
"pleas don't sell my land steve"
"just shake my head at the ig

"!"
"awesome!!"
"i'v been so busi with in-law in town I need to bing watch the last coupl of episodes."
"great news.....but not surprising!!"
"We know 'winners' when we see 'em!"
"<smile>"
"anoth corrupt politician and a hrc shill vote against her"
"one of the best ted talk ever."
"true, inspiring, and cool!"
"that is a wonder fb game.wow.88 https://www.facebook.com/pages/allflogk/239888956148041?sk=app_208195102528120"
"vote ye on audit the fed is the best thing you could do... next time, I hope you will."
"you are do a fabul job."
"you were made for politics."
"there is NO video :(( onli sound...:("
"chri is alway #1 in all our hearts."
"soror fudge, victori is yours!"
"hear, hear!"
"great, inspir talk."
"I am go to be an upstander!"
"Go monica!"
"We are with you."
"pleas fight the republican who want to hold feder assist hostag for more budget cut"
"shut up and take my money!"
"such love young ladies....and nice as well!"
"thank girls!"
"need your dad in the u.s. senate!"
"I bet you

"lol!"
"what a pleasur to meet you and your famili at combat veteran for congress in san diego sat evening!"
"you'll be a great asset in congress."
"love your websit video....beauti countri up there in montana."
"thank for your servic as a seal and i'm pray for a zink victory!!"
"!"
"!buena suert senat we'r with you all the way!!"
"!"
"bkb is absolut core lategam on medusa."
"If she not get it, she becom useless be stunned-hexed-disarmed."
"and she have no mobility, kite her have 2 halberd is quit easy."
"duh...ya think...unbelievable!"
"We realli need more of those!"
":)"
"have safe and good trip!"
"I vote thi morning, color in the circl besid your name, as I have been sinc I wa 18 (i'm 40)."
"My onli wish ..."
"It wa for governor!"
"that wa truli amaz and special natasha wa a rather intellig child,might have help her cope with stuff,mak you wonder what is the sourc of confid for children like her,"
"i wa inspir to get out of the rut."
"http://blackopscharlie.com/2012/03/02/poem-livin

'output/test_stems.csv'

#### 6) Ecrivez une fonction qui supprime les mots outils (stopwords) du corpus. Vous devez utiliser la liste de stopwords de NLTK ;

In [118]:
nltk.download("stopwords")
from nltk.corpus import stopwords
stopwords.words("english")

def deleteStopWords(path) :
    with open(path, "r") as f :
        reader = csv.reader(f)
        data = list(reader)
        
    path = getPath(path)
    newFilePath = output_path + '/' + os.path.splitext(os.path.basename(path))[0] + "_stopWords.csv"
    stopwords_english = set(stopwords.words("english"))
    output = []
    
    file = open(newFilePath, "w")
    with open(newFilePath, "w") as f: 
        for sentences in data :
            for sentence in sentences :
                newSentence = [token for token in nltk.word_tokenize(sentence) if token not in stopwords_english]
                sentences = ' '.join(newSentence)
                output.append(sentences)
                f.write('"' + sentences + '"')
                f.write('\n')
                
    return output
                
#enlever la création de nouveaux fichiers
    

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/kimluu/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [119]:
deleteStopWords(train_mots_path)
deleteStopWords(test_mots_path)

['Thanks back !',
 'Yep , University Alberta .',
 'You live around ?',
 "please n't sell land Steve",
 'shaking head ignorance deliberate ignoring facts FDR , Pearl Harbor , WWII .',
 'To contemplated tri , perhaps ?',
 'Pshh ... Is treat props .. Just go around deleting ? ! ? !',
 'Sureeeeeeeeeeeeeeeeeeee I see .',
 ": pYeah 's definitely still bugs around .",
 "My workout last night posted today 's date .",
 'lol',
 'Thanks !',
 'I also love bacon .',
 ': )',
 'Hello Isaac !',
 "My copy arrived yesterday France , I 'm happy excited read ! ! !",
 'XD',
 'We need keep Bob Menendez congress .',
 "It 's really excellent lecture , I believe her.so , fake till make !",
 "I 'm human according questions .",
 'And tone great , I lot fun : )',
 "You 're awesome ! !",
 '!',
 'B Mattek awesome , I love bad girl risque style .',
 "Brit : I 'm glad u r Fox tonight .",
 'You calm people bring common sense .',
 'You class act !',
 '!',
 'Good luck .',
 'I really hope make .',
 'Please reach Mr. Trum

#### 7) Écrivez une fonction preprocess_corpus(corpus) qui prend un corpus brut stocké dans un fichier.csv, effectue les étapes précédentes, puis stocke le résultat de ces différentes opérations dans un fichier corpus _norm.csv

In [124]:
def preprocess_corpus(input_file: str, output_file: str) :
    #to do : vérifier si c'est bien le résultat voulu
    results = deleteStopWords(stemmize(lemmatize(tokenize(normalize(segmentSentences(input_file))))))
    file = open(output_file, "w")
    with open(output_file, "w") as f:
        for element in results :
            r = element.replace('"', '""').replace('"', '')
            f.write('"' + r + '"')
            f.write('\n')
            
    

In [125]:
preprocess_corpus(
   os.path.join(data_path, "train.csv"), os.path.join(output_path, "train_norm.csv")
)
preprocess_corpus(
   os.path.join(data_path, "test.csv"), os.path.join(output_path, "test_norm.csv")
)

"I don't think any one there has EBOLA Bob Latta You should be back in Washington actually getting something done there on the House floor."
";-)...anything other than jeans and t-shirts are superfluous, by the way."
"'Update your wardrobe'...pfft."
"Meh, I could only get to 8."
"Need to work up."
"A bill consisting of a single sentence."
"Very well done, sir."
"So far, so good."
"Thx !"
"My buddy Jeff Johnson was your prop master on that."
"she had me at everlasting youth."
"Congratulations to you for a well deserved recognition!"
"baffoon, idiot, dumb."
"The intelegent conversation continues......."
"They don't deserve that honour(stupid Hollywood movie business people)"
"Yawn!"
"Is this honestly news?"
"Same to you brotha!"
"Get at it :)"
"Would good to know how the age of all of these things have been measured."
"the perfect society is shaped in the form of a pyramid, the old at the top and the young at the bottom supporting the old."
"if we kill off the young that will take care o

"Keep moving up the ladder Senator!"
"These guys didn't green light this sort of program without authorization!"
"big animal big idiot"
"Fab!"
"Great writer """"Manifest your destiny"""" is a must"
"Gracias igualmente para ti ___ Rubn J.Kihuen__Bendiciones para toda la familia"
"your awesome!"
"congrats."
"Learn how hacking & fake news could quickly escalate into the next 9/11."
"It's the asymmetric cyber-warfare threat that NOBODY's allowed to talk about... not even Donald Trump."
"Hint: Everyone has a cell phone."
"agsaf.org/could-donald-trumps-twitter-feed-be-weaponized"
"You deserve this based on hard work alone."
"You have my vote."
"I like the Ted talks, but they have a very left-wing bias."
"Let's face it, Barak Obama talking about morality I about as cogent as Darth Vader talking about openness and sensitivity."
"The only reason to sight his empty speeches is to score political points."
"Confermation..nice efford wit your game!"
"!"
"Aw shucks."
"Thank you"
"Simple."
"Get rid o

"lol."
"Thank you, I think."
":)"
"So what!t The real news will be that YOU ARE DOING SOMETHING TO REPEAL IT!"
"Until then, this is just ho hum...I told you so news."
"DO SOMETHING instead of just giving verbal spews."
"As long as its making USA citizens into job and career holders , excludes illegal immigrants and safe for the environment, promotes state and national gross domestic product without unions , should pass easily."
"Love it!"
"Now you can tease him about not knowing who Bobby Governale was!"
"Thank you for co-sponsoring the ABLE Act and continuing the work of ensuring our loved ones with disabilities are given the opportunity to be full, contributing members of society."
"Good luck on the run."
"We will be supporting you, although you won't see us, along with other hams from Phoenix CERT and allies."
"You know what?"
"That has got to be the best side effects I've ever heard of."
"Seriously that made me smile hard as hell when I read that."
"And keep running girl, you're do

"!"
"My pleasure!"
":) Have a great day!!"
":p"
"and for god's sakes people, if there's a butterfly ballot, PLEASE look at it carefully and don't vote for pat buchanan!!"
":)"
"With wisdom and grace guiding you, Bonnie go forth and do great things!"
"Optical drives are becoming a thing of the past."
"I completely left one out of my rig"
"and yet, my last two health care premium raises have been the smallest percentage rises in 10 years.."
"I love the way this guy thinks critically about himself and still thinks outside the box!!"
"Its really cool how he chases these big topics."
"I Hope he will go further with the other questions!"
"different world + great ideals + hard work = best story about the most interesting things in the world .. about us"
">Long enough to be familiar with the time in hip-hop where it was common to kill other rappers for their music."
"How many rappers have killed other rappers?"
"Ive heard theories with Tupac, but besides that, I cant think of any cases where t

"I wish I could convince him to become an expert on several other topics that I want to learn -- and then have him serve as my Professor."
";) LOL"
"Awesome talk."
"and only 274k views."
"One of the humanitys best talking her mindout about real problems."
"Thanks!"
"Writing my article for next week right now, actually"
"The public sector union was so awesome in wisconsin that it lost 60% of it's members when they were no longer forced to pay dues."
"Thank you Representative Correa!"
"Our community needs this."
"We appreciate your leadership."
"Bravo!"
"We need to try & get on something."
"My Husband works retail & I'm on Disability."
"Very scary."
"First Oregon Trail reference I have seen on Fitocracy...double props ;)"
"I live in New Jersey and we have a storm coming so I'll be doing the same soon."
"Yup the druid deck is pretty easy mode."
"Was trying other classes for quests and just not having any luck at all."
"Including mage and priest."
"Maybe just on a real bad luck streak."
"M

"LIVE AND IN LIV.i.NG COLOR, HAHAHAHA, with love Danny"
"You're right Richard, it is jobs."
"So perhaps we should start by ending the firing of hard working Americans from the military simply for being gay."
"Good morning from Okinawa, Japan."
"Please explain to your cohost that being consistently moronic is not a quality we, the American people, require in a president."
"A little too much too fast!"
"He is NOT MY PRESIDENT!!!"
"With that said, we shall wait and see just how much clout and pushing Congress around he has and does."
"Jill and those that enjoyed this video will most likely appreciate the Urantia Book."
"This is true especially if they grasped her closing remarks."
"Brother In Faith, JJ"
"No prob thanks for the props"
"Filled with factual errors and what can only be intentional distortions and turning a blind eye."
"What a biased presentation."
"I expected more from TED."
"A fascinating use of anthropology as part of conceptual art."
"With the added layer of technology thi

"Four boys!"
"Respect!!!"
":)"
"The National Association of REALTORS just formally approved me to represent you as an FPC!"
"Hey you got it!"
"Thanks for the follow!"
":)"
"So glad this talk is well received."
"Water, sanitation & hygiene are such important topics - and not only in the developing world."
"Sounds like an absolutely awesome weekend :)"
"It doesn't have to be a competition!"
"We can all love each other's pictures."
":)"
"..and you voted to shut down the government too."
"Which way is the wind blowing today?"
"Thank you Senator."
"Now......please help to protect the Delaware Bay shoreline and enhance our economic development by finding the funding for the Channel Deepening Project."
"Would love the talk if I could watch it."
"This talk seems to have been hijacked by ads."
"As soon as his images start it switches to an ad for Coca Cola and various other ads."
"So mad about this, and don't know how to fix it!!!"
"!"
"Yep, she'd rather talk about her fkn turkey and ruin every

"Ally Spw Mary Singer Albertson Susan Beckett Yvonne Wyborny Delores Lyons etc"
"You're welcome :) Yup, great site :)"
"Good substitution."
"I had one of those days, and my hubby suggested subway instead."
":)"
"You're very welcome!"
"Anytime :)"
"One more thing...did anyone else find her examples of memes at the beginning of her talk to be a bit weak?"
"PLEASE look into cosponsoring the bill that would definately be a HUGE economic stimulus."
"It is the FairTax Bill and would do for our Republic what is already done for WA State """"no income tax."""""
"This is such a phenomenal story."
"Her journey is very inspiring and amazing."
"Thank you for sharing this with all of us!"
"Praise God!"
"That is really exciting!"
"My pleasure... likewise:)"
"amy, please have fun on the set of the movie :) & glad your in the movie as well"
"I did, you guys are mocking the fact that CNN called out a racist, and thinking thats bad"
"Congratulations on your sweet baby girl."
"What a joy she will be to y

"Nice im gonna order me a pair!!"
"!"
"God bless ya brother."
"If i could give negative props for the fraternity though, I would."
"Props are for 240."
"Ito nailed learning versus education, """"To me, education is what people do to you; and learning is what you do to yourself."""""
"As usual, you presented yourself beautifully on Fox news!"
"I want to see you as president someday!"
"A powerfully moving and heartfelt speech."
"haha I'm glad to hear that."
"Looks like it cause you to level up too."
"awesome."
"the misinformation, well, """"LIES"""" are alive and well on this page."
"Very unfortunate that so many WILLFULLY IGNORANT voters who believe anything at all just because they hear it on Fox."
"You were great Jaime, Hannity even commented how much he liked what you had to say."
"Hope to see more of you on Fox."
"As a straight guy, even I can agree..."
"wat a hansome kitty u have"
"Just did some of these with a band - felt pretty good."
"Brillant!!"
"Ingels has a terrific future ah

"Also... Gotta get your """"About me"""" section filled out."
"Let people know what your about and what you wanna achieve."
":)"
"I ways almost a goner at age 6 months now I am age 60 in August.A only son"
"No problem."
"And thanks for the prop."
"I'm really liking this site, very fluid and easy to catch on."
":D"
"Good job he played really well happy to see isner play a great match!!!"
"he is my favourite player"
"Should have been fired a long time ago."
"We all knew he was dirty."
"Now let him pay for his actions."
"Pink is a pretty color for him."
"."
"This talk reminds me to be grateful, that while my country is not perfect, there are still many good things going for me."
"I am thankful."
"PLEASE SIR."
"WE HAVE ASKED MANY TIMES FOR YOU TO COSPONSOR H RES 752."
"WE HAVE 159 COSPONSORS NOW SO SAY YES AND JOIN YOUR FELLOW CONGRESSMEN."
"THANKYOU."
"Saw both games... Michigan Wins.... Way to go!"
"Good."
"I hope you support regime change in Iran."
"Reek, it rhymes with leek."
"Thanks D

"RBA!"
"And injuries suck!"
"@Susan, the ceremonies are on the weekends when Congress is not voting."
"I feel very strongly about attending these ceremonies because it is quite an accomplishment."
"These young men deserve my recognition."
"-- L.L."
"That's funny Chrissy :)"
"Kaaabbbboooommm!"
"You is welcome!"
"Yom yom yom, tucker time."
"1000 is plenty if they are well randomized, 3.2 percent margin of error."
"Hope to see you Monday in Charleston."
"I would like to see that myself, and i would attend."
"As you say you represent us all."
"Cats are cunts...Dogs rule."
"Had another great workout this morning too!"
":-D"
"Congratulations, Girl!!!"
"SO proud of you"
"REALLY WELL DEMOCRATS WERE THE CAUSE OF IT"
"Jesus, then Trump, then the world implodes."
"Wow!"
"I can't help but laugh at stupid."
"Union labor build this countyr corporate greed will destroy it"
"Thank you, Representative """"Raj!"""""
"@Jordan_E hey no problem thanks"
"It's always a special treat to see Miss Edna."
"Her g

"Goverment is everything to Obama."
"I'm astounded you are astounded this!"
"Aliens... BAH!"
"Do you really expect me to believe that there are beings on other planets that fly off in spaceships and land on other planets...lol."
"Give me a break!!!"
"Errr wait..... we do that."
"Never mind."
"To say that guns are just the harmless tool, neither evil or good is false."
"My box of Cheerios doesn't kill my family when I drop it."
"Guns are extremely dangerous, and it is foolish to deny it."
"The amount students are charged on text books is high way robbery.....Why can't they be free via downloading pdf??"
"?"
"Hi Kristi!"
"Unless I missed it on your website, when will you be speaking in Sioux Falls again?"
"Thanks, in advance!"
"Noooo."
"The escape pod is in a car jail awaiting parts."
"Meanwhile I'm driving a gas guzzling rental."
"Killing me!"
"I am so very jelly!!!"
"I mean, hey YAY congrats.... and happy birthday :).... and yeah, still can admit I am way jelly!"
"!"
"Thanks for being 

"Twice as much light.."
"Very interesting way on closing the speech."
"Thank you!"
"<<Blushes>>"
"It was Blunt and his ilk that added the abortion language that prevents this bill from moving forward."
"You know that."
"Thank you as well :)"
"I wonder what software he uses to construct this presentation?"
"His presentation slides are really impressive!"
"I like to use this software for my presentation as well!"
"Awwwwwwww, noooooooo!"
"Snatefinch!!!"
"::single tear::"
"Jesus Christ!"
"You're killing me."
"If only dreams came true."
"you'll always be a pro to me"
"Nice to meet you."
":)"
"This truely is an amazing job, but the blind can share driving feelings which used to be impossible for them."
"Wish for the best on this work."
"Bravo for your hard work on this important topic!"
"as a project manager, the communication is the most important skill."
"Of course, we have to deal with the human right and message security issues carefully."
"This is a good briefing on it."
"why is Zimmerm

"I don't think_neg any_neg one_neg there_neg has_neg ebola_neg bob_neg latta_neg you_neg should_neg be_neg back_neg in_neg washington_neg actually_neg getting_neg something_neg done_neg there_neg on_neg the_neg house_neg floor_neg."
";-)...anyth other than jean and t-shirt are superfluous, by the way."
"'updat your wardrobe'...pfft."
"meh, I could onli get to 8."
"need to work up."
"A bill consist of a singl sentence."
"veri well done, sir."
"So far, so good."
"thx !"
"My buddi jeff johnson wa your prop master on that."
"she had me at everlast youth."
"congratul to you for a well deserv recognition!"
"baffoon, idiot, dumb."
"the inteleg convers continues......."
"they don't deserve_neg that_neg honour_neg(stupid hollywood movi busi people)"
"yawn!"
"Is thi honestli news?"
"same to you brotha!"
"get at it :)"
"would good to know how the age of all of these thing have been measured."
"the perfect societi is shape in the form of a pyramid, the old at the top and the young at the bottom su

"you influenc me a lot."
"everyth is differ now."
"Of cours bro!"
"i'll shoot you my info on reddit"
"keep hold your vagina you pussi as pog."
"marin hate you"
"lol it all good."
"nice work."
"No joel - SC voter like mr. sanford and we love the fact he will repres u in dc!"
"you GO mark sanford!"
"!"
"I will catch up one day!!"
"!"
"thank for take the time to check it out!"
"just want to thank susan collin for be in the bangor parad today."
"It wa so awesom to see you, olympia snow and mike michaud."
"thank for all that you do :)"
"thank you veri much!!"
"and you are welcome!"
"I hadn't seen_neg you_neg in_neg my_neg feed_neg for_neg a_neg few_neg days_neg._neg._neg."
"I must have miss you..."
"So I had some catch up to do!"
":-)"
"I did my best to make sure all my friend knew about thi film, to thi day ive found onli one person who hate it."
"the soundtrack wa perfect and you mean brie larson, who is hoooooooooooooooooooooooooot"
"what a great spoke person for ani event."
"ever sinc I

"If your carri around a firearm that is damag or malfunct or you are not completely_neg competent_neg with_neg you_neg are:_neg a_neg."
"An idiot B."
"A danger to yourself C. A danger to everyon around you D. all of the above."
"and see all the slacker at work!!!"
"!"
"train sniffer dog are better than ani machine."
"haha thank you--it' actual onli 3 awesom dogs, that basset hound is a real troublemaker!"
":)"
"taxat and regul have made econom grow stall; it hasn't stalled_neg by_neg itself_neg."
"there isnt 0 chanc they havent trade him"
"you betcha!"
"and I vote for you!"
"he danc to beat it."
"rain booti look g8 on you lol #six"
"I have a fish from feel that john mel shape great board and surf shop."
"you are welcom :) heh, my neck is do much better after surviv that attack on the work halloween parti :P"
"thoma windu help is on the way"
"I feel like she threw shade and snuck in a plug for her movie?"
"I am in!!!"
"I think I might die alittl !!!"
"!"
"what lobbi group told you to do

"an electrolyt gener is simpl and reliable."
"what sort of first world solut were avail to priestley and others?"
"the moral turpitud is the problem."
"america is start to feel the weight of graft and corrupt the level in mani place in africa make societi impossible."
"and how is that of relev to the conversation, their policies, or what peopl should vote for?"
"did you all wear them?"
"dear elizabeth, hope you had a fantast birthday!"
"you look great and what a perfect day to be out in our perfect CT weather."
"It need so much courag for her to share her stori weith u .thank you jill bolt taylor,what you said teach me a lot."
"while the rest of u are for send back the one that are alreadi here!"
"happi new year sir!"
"miss the laugh but I do enjoy see you on you tube."
"enjoy your day and pleas be well."
"allonsy!"
"outstand first word to your peers."
"congrats!"
"for sure."
"I think the onli way to get through to him on how uncool it is would be to not go_neg."
"have a bless time!"
"

"sound good!"
"make sens sinc it between them."
"thank :)"
"congressman you rock!"
"pleas tell me thi deutch guy can come close to you and congressman wexler."
"nevermind I restart and got to see the end."
"thank you again!"
"it' cool, mine wa too slow."
"keep it up!"
"hey!"
"all good!"
"thankyou, same here."
"will do!"
"(:"
"remov and flagged."
"thank guys."
"I can not, and will not support_neg any_neg one_neg who_neg supports_neg the_neg racist_neg donald_neg trump_neg."
"If you support a racist then you must be one a well."
"racist !!"
"!"
"whomev stand with israel stand with the devil"
"thi is shame and is total against human right to be heard."
"I am go to see if my daughter can come."
"and now at cvh peopl stay on it year after year after year to call out without reprimand.....tot abused!"
":("
"no, but I will be walk for you in lasalle!!!"
"!"
"interest bike, but they spent literal 2/3 of the talk joke about themselv befor we even saw the bike, and then they didn't really_neg sa

"well cunti flop is claim to be a troll when she clearli just desper for attention."
"destroy them...lik I destroy beatl rock band."
"dear grace: yourself is a great achiev for our commun already!"
"keep up your great work and happi women' histori month to my mom, wife, sister and daughter:))"
"i'm sorry, not in_neg annapolis_neg, west point and colorado springs..."
"these graduat will know what it mean to have freedom and to preserv it."
"So sick of all you asshol at fox new repeat the democrat talk point over and over again."
"unless you idiot get behind trump , it' over for the republican parti and you will be to blame."
"Go vania!"
"will sneak away from centr court when you are on :)"
"well durbin ha the latino vote!!!"
"!"
"realli glad to hear!"
"can confirm lost 10 lb in a month with thi one easi trick that doctor hate"
"you'r a brutal bastard, I love it."
"It wa a great game."
"i'm from indiana but I bleed blue."
"great game UK"
"I love the part of withhold the pay of lawmak if 

"thank you a well."
"haha can't make_neg any_neg promises_neg _neg;)"
"As much a I enjoy neighborhoods, california feel like the album we should have gotten."
"ive been enjoy it so much in the past 24 hours."
"candid obama promis to end these two wars."
"ask him whi he broke hi promise."
"i'v seen that talk so mani times, it is my joint favourit on ted.com."
"hi enthusiasm is so infectious!"
"doe anyon have the transcript of the piec he read at the end?"
"yeah, and those peopl are usual abl to wake up befor midday i.e."
"peopl who cant afford to sit around watch video game all day."
"why?"
"whi is my mind so fuck up that a sad a that made me read it I too want to see a picture."
"love thi pic steve!"
"so happy!"
"<3"
"So you'r quick to jump onto the clinton scandal but don't give_neg a_neg single_neg care_neg to_neg the_neg many_neg, mani scandal trump ha brought on himself."
"gotcha."
"#keepoklahomar #andbrok #anduneduc #andarcha"
"other sport like nascar?"
"Or golf?"
"please... ufc i

"none of that is opinion, except the veri last sentence..."
"thank you, florida times-union, for recogn our outstand congressman, jim cooper."
"great talk."
"ted deliv yet again."
"keep it up."
"thi truli is an inspir site..i might need help co i'm total hook on it."
"perhap dan areili is right after all."
"irrat behaviour!"
"No prob, had to fix that littl oversight ;)"
"and congrat on the weight loss progress!"
"well deserved, I have neglect to properli prop you!"
":)"
"I found your fiction dialogu veri entertaining, and i'v often thought that laugh at monstros make them a whole lot smaller and le dangerous."
"It is thi kind of talk that make ted what it is, simpli mind-blowing."
"now thi is a ted talk; ask for fact and self control in the face of great potential."
"great job!"
"were all busi crying."
"nice!"
"what all do you have planned?"
"As someon with a home gym i'd be happi to go over stuff I wish wa different."
"good morn mr. cuomo"
"not bad_neg for_neg a_neg first_neg attempt_

"just bring hi usual renergi to proceedings... and hi entir famili are basic use car salesmen, pander their warez."
"bjj is the viscou greas they use to oil their flimflam machine."
"thi is a must see!"
"!"
"hey there!"
"My pleasur and thank you!"
"!"
"bend over good citizens, here it comes!"
"how about repeal obamacar all togeth - are you alreadi so out of touch that you don't realize_neg that_neg our_neg premiums_neg are_neg out_neg of_neg control_neg -_neg deductibles_neg are_neg way_neg up_neg._neg._neg._neg."
"I guess a along a your health care is paid for by u it realli doe not matter_neg -_neg right_neg?"
"you'r with one of my faves."
"Hi BJ - it' dave and ami kahn here in mv."
"We just watch your talk togeth and are truli inspir by your dedic to improv live and die within our healthcar system."
"We will make sure to forward thi on to debbi in case she hasn't seen_neg it_neg."
"see you around town."
"xo!"
"thanks, terry!"
"your is look pretti good, too!"
"thank u for stand up fo

"they are such hero and heroines!"
"may they rest in peac with god."
"most great revolutions, the marin corp and nation are born in breweries!"
"it just a much of a cunt move a dive to get a penalty."
"yeah, got a bit bore with the last one."
"tri to eat more fruit too so thought i'd chang up the picture."
"sick of see my mug whenev I log on!"
"awesome."
"A thinker for earth."
"It is onli possibl by educ the local populations."
"It need patienc to coordin everyth but I wa rather surpris how it went fast to recreat thi kind of ecosystem."
"thank you!"
"good luck on your goal a well."
"everyth from bf4 to dirt ralli to civil V and kerbal space program."
"I agree, it is more obviou in fast pace games, but in ani game, everyth is much smoother."
"love thi common sens approach."
"it' nice in thi crazi polit climat to be abl to be proud of one' congresswoman."
"thanks."
"My old profil wa me a guil so I had to follow."
"yes, I probabl should :)"
"their stock sure plummet be associ with your s

"there is anoth reason for want to know what conscious is and that is to know at what point we can say that someon ha inde died."
"happi birthday and good luck"
"I onli go to the gym two or three time a week, but when I do, I spend all day there."
"woo!!"
"see you around (probabl in the weight room)!!"
"ps: I love your profil picture."
"b'awwww <3 thank you - that pictur is adorable!"
"maggi hassan, I don't represent_neg the_neg people_neg who_neg elected_neg me_neg into_neg office_neg, I have been bought and paid for BY the clinton campaign !"
"you give me kind word :) thanks."
"enjoy central bank everyone?"
"I hope it wa a great game"
"keep up the good work seanator....."
"your first match win at wimbledon is super"
"great work chellie!"
"you speak the truth and make u first district proud."
"they are constantli interrupt her!"
"!"
"you look quit sculpted!"
"i'm happi to discov such a talent musician."
"happi holiday to you, your famili and your staff."
"peac to all"
"you are even mo

"phylli gave that bomber a C2 system that allow it to excel and survive."
"thank phyllis!"
"no, it isn't."
"but it' onli a temporari setback."
"don't let_neg it_neg get_neg you_neg down_neg, and get better soon!"
"I would like find out if the gop ha tie with compani that they support there caus and who are part owner in."
"I know there are some who' hand are veri dirty."
"hey mark - are you still around?"
"good for you."
"you are awesom sean!!!!"
"big fan!"
"thank you veri much!"
"I think i'll keep them ariund for a while!"
";)"
"like, it realli not that_neg hard_neg to_neg listen_neg to_neg the_neg speech_neg and_neg put_neg everything_neg in_neg context_neg."
"liber are just be intellectu dishonest"
"you got some fuckin weird fantasi b"
"If I could tell her that it shouldnt be my problem and that she should do it herself Id probabl get a lot of shit and be grounded."
"pres."
"obama and menendez in november."
"yay!"
"thank for the fb!"
"I will watch longmir and be happi"
"these creep 

"I have famili in goshen and I am so glad to see you are reach out to the peopl you serve."
"1 child polici for 25 year in 3rd world will end human miseri and tragedies; pleas do that and not wast a singl second"
"becaus natali is a troll...."
"yes, there is a chemic neuro basis, and not just_neg warm_neg emotion_neg, for everyth that you think."
"haha what is thi real?"
"you can still log them."
"go back (left arrow) and add them."
"I realli enjoy and appreci davis, hi write and talks, and all he doe to inform u about what he ha learned."
"I onli wish more peopl would pay attention."
"tomorrow you'll have mine for sure."
"#helplifevotehouseresolution30 !"
"https://www.youtube.com/watch?v=r_p71bjjcp8"
"you'r veri welcome."
"that not logical_neg at_neg all_neg."
"they can learn a lot from you sir."
"[[photo]]"
"serious tho."
"If you have a second copi in your hand, and keep it, you can use it forever..."
"greta on fox new ask a good question whi isn't gov_neg.oversight done befor wast t

"well thi sub did start a a refuge from the shitti /r/cring mod shitti rules."
"kee wa not what_neg i_neg expected_neg -_neg very_neg funny_neg talk_neg but_neg with_neg good_neg strong_neg underpinnings_neg on_neg interesting_neg animal_neg behaviour_neg!"
"mine too...way to go ML"
"rive creat a marvellous, fascin and funni poem with nowaday internet words, tag and names."
"great!"
"idk if it consid punk, but escap the fate is great, a well a fall in reverse."
"ashley is a great escap the fate song."
"the hive are realli good too"
"much appreciated, thanks!"
"great win!"
"My famili is cheer for you from the beach!"
"!"
"> that white imag is crucial to find ani kind of employment."
"seriously?"
"preserv your hairdo == white imag == necessari condit for get a job????"
"?"
"yes, say NO to_neg all_neg government_neg spending_neg, unless, of course, you are talk about the iraq war or the idaho f-35 project."
"and, then, by gosh don't raise_neg my_neg taxes_neg to_neg pay_neg for_neg either

"steal the fruit of your labor by forc and send it down a black hole sinc 1913 !!"
"[[image_share]]"
"great to see you live woodsy."
"So impress on the jumps."
"I hope you didn't give_neg him_neg a_neg snake_neg cup_neg!"
":O"
"thanks, lois."
"I have been a long time support and fan."
"I appreci you look out for the rest of u"
"great talk, should have had more time."
"welcom to peak oil, how may I help you?"
"that is amazing."
"nice work B."
"sarah-jayn blakemor is a delight to listen to, and so knowledgable."
"I ran across her 38 minut convers at edge."
"it' great to see her present at ted http://edge.org/conversation/the-adolescent-brain"
"barri schwartz is reason express idea how to be a man in societi and just in life, and I veri much agre with him."
"I am pleas to listen to hi speech."
"thank you veri much ..."
"I listen to the ted talk of michel obama."
"I think, she is power and she ha vision."
"educ is the key."
"It is the most import thing."
"I think I should take good educati

"good win & it wa nice see u here in jamaica...mak sure to come back next year.."
"bad skelet joke give me a femur."
"zoe!"
"you are an exampl of dedic and courage."
"thank you!"
"Do we realli need 8 separ post about this?"
"great talk, monica, with such a time and import message."
"have share it with my group of 15-year old girl guid - I think they especi could benefit from your messag on the power of compass onlin and our respons (not right)_neg of_neg free_neg speech_neg."
"hope they will all becom upstanders!"
"vote for you and ms. landrieu today a well."
"best of luck to both of you!"
"halperin you'r such A right wing tool"
"wcw you are beautiful!!!"
"!"
"Of course."
"To be the best you have to be realli hardworking."
"realli pete ,defend black panthers."
"tri say the bias comment Al sharpton say to defend white ,you will be off the air."
"when i do I get call a troll."
"pleas help lee zeldin in ani way that you can."
"not only_neg is_neg lee_neg the_neg right_neg man_neg to_neg r

"que lindo e nadar y pasar relax."
"welcom to fito!"
"can't wait_neg to_neg see_neg you_neg get_neg involved_neg."
"beauti baby!"
"!"
"go to miss you and the rest of the cast a well."
"take care and have a nice time off."
"peac"
"It wa pleasur and honor for sapac to have you at our lunch DA rice."
"thank you"
"see you there:) i'll bring my camera."
"now, how about our marin who ha been left to rot in a mexican jail?"
"Or do we need to trade more terrorist for traitors?"
"total blame you if NY get destroyed."
"those were the good ole days!"
"Ha karl establish rove.that po is what is wrong with our elect system with hi super pac bs.someon need to kick him the nuts."
"thank you."
"i'v had a veri rough coupl of weeks."
"thi put eas to my mind."
"other danc @ an all that remains/dethklok show = follow.your gorgeou pic mayyyyyyy have had an influence."
"mebb a lil."
"hey john, Do u rememb me?"
"you gave me your racket at tallahassee, Fl in the challeng when u beat prakash armatraj!"
"I saw m

"but do you realli need to explain it in the context of our monkey ancestor and the current econom bust ?"
"today you also have the opportun to honor other brave men and woman who you salut everi 9/11."
"sign on to sponsor the 9/11 health bill congressman."
"realli honor them instead of just give them lip servic each year."
"hey!"
"He wa veri good to me!"
"wa he good to you?"
"you look great !!!!"
"!"
"We are so proud of you!"
"way to go alex!"
"great pictur of you three."
"sometim those turn out to be veri good sessions."
"arbol hombre, you didn't tell_neg me_neg ja_neg rule_neg was_neg the_neg headliner_neg!"
"that' the real travesty!"
"powerful, insightful, inspiring!"
"thank for share with passion."
"how could you slap the trump voter in the face mr. huffman.w are voter in your district."
"you were one of the 5 or six that refus to go to the inaguration.i'm asham of you and will encourag all my friend to not vote_neg for_neg you_neg when_neg and_neg if_neg you_neg run_neg again_neg

"I dont think you should be charg 18% interest against ur campaign funds!"
"!"
"MY beautigul daughter mari carolin"
"No problem!"
"thank for the follow!"
"As we say mazel tov."
"We know you will be a great congresswomen!"
"!"
"actually, she isn't."
"she is spot on."
"what she say about the church is accurate."
"I know that if integr ha anyth to do with thi race (and it should!)"
", then loi frankel is our onli choice."
"chad gonna beat the fuck out of him dude no matter_neg what_neg money_neg is_neg thrown_neg at_neg him_neg."
"thank for the props!"
":)"
"when u come to ny,knight..i think i love u"
"borax is a base."
"It is also anti-fungal."
"seem like a great choic for moldi clothes."
"I grew up think that vulner wa a weakness, that I had to creat a fortif to protect me from the danger of the world...but it depriv me of someth I wish I had gotten to know earlier in my life...myself."
"A beauti talk."
"ye let' rais the minimum wage so busi have to lay peopl off or rais price and it wi

"yep, no irony_neg here_neg what_neg so_neg ever_neg._neg."
"thanks, I wish the medium would publish such a break down."
"awesome..."
"i'm one outta shape gymnast."
":)"
"great previou race congrad"
"thanks!"
"happi to follow back :) welcom to fito!"
"!"
"had a great time and feel bless to hear your platforo"
"simple."
"effective."
"By not actually_neg talking_neg directly_neg about_neg his_neg main_neg idea_neg, and instead illustr with a raft of histor comparison jeff made a lucid, applic comparison to our current state of the web."
"brilliant."
"If you like asian ab so much, you should follow stea1ksauc in thi group."
"I think he' also in the other same groups."
"lion dancing, hors stance, kung Fu fighter... our kind of asian."
"lol"
"I have been tell peopl for year that governor cuomo love landlord and real estat developers."
"He doe not deserve_neg any_neg renter's_neg vote_neg."
"He is a creep, a liar and hi betray of work peopl is immens and unforgivable."
"merri christma to suc

"even if she ha ani posist she is not move forward or backward thi is not accept from you ted"
"dramedy, or is it just a comma !??"
"food for thought !?"
"?lol"
"We thurston of gloucest heritag endors seth enthusiastically!"
"tough loss for your guy joseph but they put up a good fight!"
"one of the best, hard-fought game in a long time."
"thank for your support james!"
"amen!"
"thank you father"
"did you ask him what hi favorit cuban food is."
"racist"
"dude, you'r welcome!"
":)"
"great."
"finally, be shameless will no longer_neg be_neg a_neg stigma_neg for_neg women_neg, but a honor quality..."
"good morn l.d....welcom back..."
"debbie, we need help get improv in mass transit."
"individu citi shouldn't be_neg able_neg to_neg opt_neg out_neg like_neg rochester_neg, so a poor man would have to walk 21 mile to work."
"I believ that jodi is veri on her game..and I mean game."
"she make juan martinez get off of hi goal and game and she make hime danc like a puppet"
"it' not how_neg many_ne

"That b**** weighs 45 lbs (35 for me) -- always include!"
"Very creative with the animals and shapes of people."
"Like, how he used them."
"The story was good."
"hahaha why thank you, sir!!!"
"I will continue to try for better, but I am pretty pleased so far!"
"My quads are starting to super stick out now and my abs are starting to show!"
"yayyy!"
"Thanks for this talk."
"It makes total sense..."
"I encourage TedXers to do an experiment."
"One day of your lives, smile to all strangers you bump into during the day, what effect does it have?"
"Go for it young lady and I know you did your best on all of them"
"Congratulations, Congresswoman Fudge!"
"Despite all the comments here, I found Thandie's talk insightful, genuine and heartwarming."
"I wonder if she is a Buddhist ?"
"Thank you Thandie :)"
"Never."
"Youre too close."
"Everyone else has appeal because theyre so far away"
"One of the best talks i found in TED."
"Thank you for Jill & TED for this wonderful talk :)"
"Wow, this brillian

"thank back!"
"yep, univers of alberta."
"you live around here?"
"pleas don't sell my land steve"
"just shake my head at the ignor and deliber ignor of the fact about fdr, pearl harbor, and wwii."
"To be contempl dure your tri, perhaps?"
"pshh... Is that how you treat my props.. just go around delet them?!?!"
"sureeeeeeeeeeeeeeeeeee I see how it is."
":pyeah there' definit still some bug around here."
"My workout from last night post with today' date on it."
"lol"
"thanks!"
"I also love bacon."
":)"
"hello isaac !"
"My copi arriv yesterday in france, i'm so happi and veri excit to read it !!!"
"XD"
"We need to keep bob menendez in congress."
"it' realli an excel lecture,i believ in her.so,fak it till you make it!"
"i'm human accord to most of the questions."
"and hi tone wa great, I had a lot of fun :)"
"you'r both awesome!!"
"!"
"B mattek is awesome, I love her bad girl risqu style."
"brit: i'm so glad u r on fox tonight."
"you calm peopl down and bring common sens to all this."
"you 

"It must be abl to process that inform insan quick!..."
"remark senseri perception."
"thank you for vote against hunt in nation parks...it would be veri scari go there and know hunter are shoot at animals.it would be horrible....thank you patricia macher"
"best news iv heard in a long time!!!"
"congrats!"
"!"
"awesome!!"
"i'v been so busi with in-law in town I need to bing watch the last coupl of episodes."
"great news.....but not surprising_neg!_neg!"
"We know 'winners' when we see 'em!"
"<smile>"
"anoth corrupt politician and a hrc shill vote against her"
"one of the best ted talk ever."
"true, inspiring, and cool!"
"that is a wonder fb game.wow.88 https://www.facebook.com/pages/allflogk/239888956148041?sk=app_208195102528120"
"vote ye on audit the fed is the best thing you could do... next time, I hope you will."
"you are do a fabul job."
"you were made for politics."
"there is NO video_neg :_neg(_neg( onli sound...:("
"chri is alway #1 in all our hearts."
"soror fudge, victori is y

"i'd wish I could studi under him."
"betti white I love her"
"we'r honor you let us."
"your work inspir mani feel and theme most dont dare go near so thank you."
"can't someone_neg stop_neg that_neg train_neg!_neg!"
"!"
"Of course!"
"that b**** weigh 45 lb (35 for me) -- alway include!"
"veri creativ with the anim and shape of people."
"like, how he use them."
"the stori wa good."
"hahaha whi thank you, sir!!!"
"I will continu to tri for better, but I am pretti pleas so far!"
"My quad are start to super stick out now and my ab are start to show!"
"yayyy!"
"thank for thi talk."
"It make total sense..."
"I encourag tedxer to do an experiment."
"one day of your lives, smile to all stranger you bump into dure the day, what effect doe it have?"
"Go for it young ladi and I know you did your best on all of them"
"congratulations, congresswoman fudge!"
"despit all the comment here, I found thandie' talk insightful, genuin and heartwarming."
"I wonder if she is a buddhist ?"
"thank you thandi :

### Exploration des données

#### 1)

Complétez les fonctions retournant les informations suivantes (une fonction par information, chaque fonction prenant en argument un corpus composé d'une liste de phrases segmentées en tokens(tokenization)) ou une liste de genres et une liste de sentiments:

In [21]:
corpus = [['soso.', 'kim a acheté un mbp13 silver!', 'mourad a acheté un mbp16 spacegrey']]
corpus = [['soso', 'a', 'acheté', 'un', 'mbp16', 'silver', '.', 'Je', 'ne', 'suis', 'pas', 'daccord!'],
          ['kim', 'a', 'acheté', 'un', 'mpb13', 'silver','.'], 
          ['mourad', 'a', 'acheté', 'un', 'mbp16', 'spacegrey', '.', 'Quil', 'aime', 'beaucoup']]

In [22]:
#Return if the corpus is tokenized or not
def isTokenized(corpus) :
    for sentences in corpus :
        for sentence in corpus :
            for l in sentence :
                if ' ' in l :
                    return 0
    return 1

In [23]:
#Return the corpus as a list of documents, that are not tokenizated, but segmented in sentences
def getListOfSentences(corpus):
    listOfDocs = []
    listOfTokens = []
    if isTokenized(corpus) :
        for sentences in corpus :
            for token in sentences :
                listOfTokens.append(token)
            s = ' '.join(listOfTokens)
            s = nltk.sent_tokenize(s)
            listOfDocs.append(s)
            listOfTokens = []
    return listOfDocs

In [24]:
getListOfSentences(corpus)

[['soso a acheté un mbp16 silver .', 'Je ne suis pas daccord!'],
 ['kim a acheté un mpb13 silver .'],
 ['mourad a acheté un mbp16 spacegrey .', 'Quil aime beaucoup']]

##### a. Le nombre total de tokens (mots non distincts)

In [25]:
def getNumberOfTokens(corpus):
    corpus = getListOfSentences(corpus)
    count = 0
    for sentences in corpus :
        for sentence in sentences :
            count = count + len(nltk.word_tokenize(sentence))
    return count

In [26]:
getNumberOfTokens(corpus)

30

##### b. Le nombre total de types

In [27]:
def getNumberOfTypes(corpus):
    corpus = getListOfSentences(corpus)
    listOfTokens = []
    for sentences in corpus :
        for sentence in sentences :
            tokenList = nltk.word_tokenize(sentence)
            for token in tokenList :
                listOfTokens.append(token)
    listOfTypes = list(dict.fromkeys(listOfTokens))  
    return len(listOfTypes)

In [28]:
getNumberOfTypes(corpus)

20

##### c. Le nombre total de phrases avec négation

In [29]:
def getNumberOfNeg(corpus) :
    corpus = getListOfSentences(corpus)
    #to do
    pass

##### d. Le ratio token/type

In [30]:
def getRatioTokenType(corpus):
    return float(getNumberOfTokens(corpus)/getNumberOfTypes(corpus))

In [31]:
getRatioTokenType(corpus)

1.5

##### e. Le nombre total de lemmes distincts

In [32]:
import nltk
def getLemmesNumber(corpus):
    corpus = getListOfSentences(corpus)
    lemmzer = nltk.WordNetLemmatizer()
    lemmesList = []
    for sentences in corpus :
        for sentence in sentences :
            lemmes = [lemmzer.lemmatize(token) for token in sentence.split()]
            for lemme in lemmes :  
                lemmesList.append(lemme)
    
    lemmesList = list(dict.fromkeys(lemmesList))   
    return len(lemmesList)

In [33]:
getLemmesNumber(corpus)

19

##### f. Le nombre total de racines (stems) distinctes

In [34]:
import nltk
def getStemsNumber(corpus):
    corpus = getListOfSentences(corpus)
    stemmer = nltk.PorterStemmer()
    stemsList = []
    for sentences in corpus :
        for sentence in sentences :
            stems = [stemmer.stem(token) for token in sentence.split()]
            for stem in stems :
                stemsList.append(stem)
    stemsList = list(dict.fromkeys(stemsList))     
    return len(stemsList)

In [35]:
getStemsNumber(corpus)

19

##### g. Le nombre total de documents (par classe)

In [36]:
def getNumberOfDocPerClass(sentiments):
    countPositive = 0
    countNegative = 0
    for sentiment in sentiments : 
        if sentiment : #positif
            countPositive = countPositive + 1
        else : #negatif
            countNegative = countNegative + 1
    return countPositive, countNegative
        
        

In [37]:
semtiments = [0,
          0,
          0,
          0,
          0, #5 negatif
          1,
          1,
          1,
          1,
          1,
          1] #6 positif

getNumberOfDocPerClass(semtiments)

(6, 5)

##### h. Le nombre total de phrases (par classe)

In [38]:
def getNumberOfSentencesPerClass(corpus, sentiments) :
    corpus = getListOfSentences(corpus)
    countSentencesPositives = 0
    countSentencesNegatives = 0
    for i in range(len(corpus)):
        if sentiments[i] : #positive
            countSentencesPositives = countSentencesPositives + len(corpus[i])
        else : #negative
            countSentencesNegatives = countSentencesNegatives + len(corpus[i])          
    return countSentencesPositives, countSentencesNegatives   

In [39]:
sentiments = [0,1,1]
getNumberOfSentencesPerClass(corpus, sentiments)

(3, 2)

##### i. Le nombre total de phrases avec négation (par classe)

In [40]:
#to do

##### j. Le pourcentage de réponses positives par genre de la personne à qui cette réponse est faite (op_gender)

In [41]:
genders = [['M', 'M', 'W', 'W', 'M', 'M', 'W', 'W', 'M', 'M', 'W', 'W']]
sentiments = [1 , 0, 1, 0, 1 , 0, 1,
             0, 0, 1,  0, 0]

In [71]:
def getPourcentageOfPositiveReponsesPerGender(genders, sentiments):
    countPosM = countPosW = 0
    iterator = 0
    totalResponse = len(sentiments)
  
    for sentiment in sentiments:
        if sentiment :
            if genders[0][iterator] == 'M':
                countPosM = countPosM + 1
            elif genders[0][iterator] == 'W':
                countPosW = countPosW + 1
        iterator = iterator + 1
  
    pourcentageW = float(countPosW / totalResponse) * 100
    pourcentageM = float(countPosM / totalResponse) * 100
  
    return(pourcentageM,pourcentageW)       

In [72]:
getPourcentageOfPositiveReponsesPerGender(genders, sentiments)

(25.0, 16.666666666666664)

#### 2) Écrivez la fonction explore(corpus, sentiments, genders) qui calcule et affiche toutes ces informations, précédées d'une légende reprenant l’énoncé de chaque question (a,b, ….j).

In [70]:
def explore(
    corpus: List[List[str]], sentiments: List[bool], genders: List[Literal["M", "W"]]
) -> None:
    print("Le nombre total de tokens (mots non distincts) : " + getNumberOfTokens(corpus) + "\n")
    print("Le nombre total de types : " + getNumberOfTypes(corpus) + "\n")
    print("Le nombre total de phrases avec négation : "   + "\n") #to do
    print("Le ratio token/type : " + getRatioTokenType(corpus) + "\n")
    print("Le nombre total de lemmes distincts : " + getLemmesNumber(corpus) + "\n")
    print("Le nombre total de racines (stems) distinctes : " + getStemsNumber(corpus) + "\n")
    print("Le nombre total de documents (par classe) : " + getNumberOfDocPerClass(semtiments) + "\n")
    print("Le nombre total de phrases (par classe) : " + getNumberOfSentencesPerClass(corpus, sentiments) + "\n")
    print("Le nombre total de phrases avec négation (par classe) : "  + "\n") #to do
    print("Le pourcentage de réponses positives par genre de la personne à qui cette réponse est faite (op_gender) : " 
          + getPourcentageOfPositiveReponsesPerGender(sentiments, genders)  + "\n")

    
    

#### 3) Calculer une table de fréquence (lemme, rang (le mot le plus fréquent a le rang 1 etc.) ; fréquence (le nombre de fois où il a été vu dans le corpus).  Seuls les N mots les plus fréquents du vocabulaire (N est un paramètre) doivent être gardés. Vous devez stocker les 1000 premières lignes de cette table dans un fichier nommé table_freq.csv

In [45]:
def calculateFrequences(corpus) :
   
    corpus = getListOfSentences(corpus)
    lemmzer = nltk.WordNetLemmatizer()
    lemmesList = []
    sorted_dict = {}
   
    for sentences in corpus :
        for sentence in sentences :
            lemmes = [lemmzer.lemmatize(token) for token in sentence.split()]
            for lemme in lemmes :
                lemmesList.append(lemme)
               
    for word in lemmesList:
        if word not in sorted_dict:
            sorted_dict[word] = 0
        sorted_dict[word] += 1
    words = sorted_dict.items()
    sorted_lemme = sorted(words, key= lambda kv: kv[1], reverse=True)
    return sorted_lemme

In [46]:
calculateFrequences(corpus)

[('a', 3),
 ('acheté', 3),
 ('un', 3),
 ('.', 3),
 ('mbp16', 2),
 ('silver', 2),
 ('soso', 1),
 ('Je', 1),
 ('ne', 1),
 ('suis', 1),
 ('pa', 1),
 ('daccord!', 1),
 ('kim', 1),
 ('mpb13', 1),
 ('mourad', 1),
 ('spacegrey', 1),
 ('Quil', 1),
 ('aime', 1),
 ('beaucoup', 1)]

## 2. Classification automatique

### a) Classification  automatique avec un modèle sac de mots (unigrammes), Naive Bayes et la régression logistique

En utilisant la librairie scikitLearn et l’algorithme Multinomial Naive Bayes et Logistic Regression, effectuez la classification des textes avec un modèle sac de mots unigramme pondéré avec TF-IDF.  Vous devez entrainer chaque modèle sur l’ensemble d’entrainement et le construire à partir de votre fichier corpus_train.csv. 

Construisez et sauvegardez votre modèle sac de mots avec les données d’entrainement en testant les pré-traitements suivants (séparément et en combinaison): tokenisation, lemmatisation, stemming, normalisation des négations, et suppression des mots outils. Vous ne devez garder que la combinaison d’opérations qui vous donne les meilleures performances sur le corpus de test. Indiquez dans un commentaire les pré-traitements qui vous amènent à votre meilleure performance (voir la section 3 – évaluation). Il est possible que la combinaison optimale ne soit pas la même selon que vous utilisiez la régression logistique ou Naive Bayes. On s’attend à avoir deux modèles optimaux, un pour Naive Bayes, et un avec régression logistique.

In [47]:
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.metrics import classification_report

### Naive Bayes

In [48]:
def naiveBayes(train_data, test_data):
    vectorizer = TfidfVectorizer()
    vectors = vectorizer.fit_transform(train_data[0])
    clf = MultinomialNB(alpha=0.5)
    clf.fit(vectors, train_data[1])
    
    vectors_test = vectorizer.transform(test_data[0])
    y_pred = clf.predict(vectors_test)
    return y_pred

### Régression Logistique

In [49]:
def logisticsRegression(train_data, test_data):
    vectorizer = TfidfVectorizer()
    vectors = vectorizer.fit_transform(train_data[0])
    model = LogisticRegression(C=1.0)
    model.fit(vectors, train_data[1])
    
    vectors_test = vectorizer.transform(test_data[0])
    y_pred = model.predict(vectors_test)
    return y_pred

###  b) Autre représentation pour l’analyse de sentiments et classification automatique

On vous propose maintenant d’utiliser une nouvelle représentation de chaque document à classifier.
Vous devez créer à partir de votre corpus la table suivante :

| Vocabulaire | Freq-positive | Freq-négative |
|-------------|---------------|---------------|
| happy | 10 | 1 |
| ... | ... | ... |

Où :

• Vocabulaire représente tous les types (mots uniques) de votre corpus d’entrainement

• Freq-positive : représente la somme des fréquences du mot dans tous les documents de la classe positive

• Freq-négative : représente la somme des fréquences du mot dans tous les documents de la classe négative

Notez qu’en Python, vous pouvez créer un dictionnaire associant à tout (mot, classe) une fréquence.
Ensuite il vous suffit de représenter chaque document par un vecteur à 3 dimensions dont le premier élément représente un biais (initialisé à 1), le deuxième élément représente la somme des fréquences positives (freq-pos) de tous les mots uniques (types) du document et enfin le troisième élément représente la somme des fréquences négative (freq-neg) de tous les mots uniques du document. 

En utilisant cette représentation ainsi que les pré-traitements suggérés, trouvez le meilleur modèle possible en testant la régression logistique et Naive Bayes. Vous ne devez fournir que le code de votre meilleur modèle dans votre notebook.

## 3. Évaluation et discussion

#### a) Pour déterminer la performance de vos modèles, vous devez tester vos modèles de classification sur l’ensemble de test et générer vos résultats pour chaque modèle dans une table avec les métriques suivantes : Accuracy et pour chaque classe, la précision, le rappel et le F1 score. On doit voir cette table générée dans votre notebook avec la liste de vos modèles de la section 2 et leurs performances respectives. 

In [50]:
y_pred_bayes = naiveBayes(train_data, test_data)
print(classification_report(test_data[1], y_pred_bayes))

              precision    recall  f1-score   support

       False       0.85      0.26      0.40       254
        True       0.80      0.98      0.88       751

    accuracy                           0.80      1005
   macro avg       0.82      0.62      0.64      1005
weighted avg       0.81      0.80      0.76      1005



In [51]:
y_pred_regression = logisticsRegression(train_data, test_data)
print(classification_report(test_data[1], y_pred_regression))

              precision    recall  f1-score   support

       False       0.80      0.53      0.64       254
        True       0.86      0.95      0.90       751

    accuracy                           0.85      1005
   macro avg       0.83      0.74      0.77      1005
weighted avg       0.84      0.85      0.84      1005



#### b) Générez un graphique qui représente la performance moyenne (mean accuracy – 10 Fold cross-validation) de vos différents modèles par tranches de 500 textes sur l’ensemble d’entrainement.

#### c) Que se passe-t-il lorsque le paramètre de régularisation de la régression logisque (C) est augmenté ?

## 4. Analyse et discussion

#### a) En considérant les deux types de représentations, répondez aux question suivantes en reportant la question dans le notebook et en inscrivant votre réponse:

#### b) Quel est l’impact de l’annotation de la négation ?

#### c) La suppression des stopwords est-elle une bonne idée pour l’analyse de sentiments ?

#### d) Le stemming et/ou la lemmatisation sont-ils souhaitables dans le cadre de l’analyse de sentiments ?

## 5. Contribution

Complétez la section en haut du notebook indiquant la contribution de chaque membre de l’équipe en indiquant ce qui a été effectué par chaque membre et le pourcentage d’effort du membre dans le TP. 