# Wikipedia Abuse Filter
### Created by: Abhay Agarwal [abhayka@stanford.edu](mailto:abhayka@stanford.edu) (10/16/18)
This is an interactive notebook that downloads a dataset of abusive Wikipedia comments and edits (almost 30 MB). To try it yourself, first run all the code once (open the Runtime menu and select "Run all").

This code uses a [Naive Bayes](https://en.wikipedia.org/wiki/Naive_Bayes_classifier) classifier to label each comment as spam or not spam. Try to find where this classifier breaks down. Can you trick the system? How does it respond to adversarial techniques such as misspellings or extra spaces?
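
If you are curious how a Naive Bayes classifier reaches its decision, here is a minimal sketch in plain Python. It is not the code this notebook runs: it works on raw word counts with add-one smoothing, while the notebook uses scikit-learn with tf-idf weights. The function names and the four toy comments are made up for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(texts, labels):
    """Count words per class and return what is needed for prediction."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    class_counts = Counter(labels)      # label -> number of comments
    vocabulary = set()
    for text, label in zip(texts, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocabulary.update(words)
    return word_counts, class_counts, vocabulary

def predict_naive_bayes(model, text):
    """Return the label with the highest log-probability score."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Start from the log prior: how common is this label overall?
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids log(0) for rare words.
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data: 0 = not spam, 1 = spam.
toy_texts = [
    "thanks for the helpful edit",
    "please restore the article",
    "you are a stupid idiot",
    "stupid idiot go away",
]
toy_labels = [0, 0, 1, 1]
toy_model = train_naive_bayes(toy_texts, toy_labels)

print(predict_naive_bayes(toy_model, "you stupid idiot"))
print(predict_naive_bayes(toy_model, "thanks for the edit"))
```

Notice that words missing from the training data are skipped entirely. That is one reason misspellings can slip past this kind of classifier.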

In [0]:
# You don't have to understand this part. These imports load the libraries that will be used later.

from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
import pandas as pd

# Show the full text of each comment instead of truncating it.
# (None replaces the older -1 setting, which recent pandas versions no longer accept.)
pd.set_option('display.max_colwidth', None)

In [0]:
# read the data into a "data frame" (used to quickly process the data)
df_train = pd.read_csv('https://raw.githubusercontent.com/designing-ml/wikipedia/master/wikipedia_dataset.tsv', sep='\t', header=0)

You should see two tables of data below. The first table shows the first nine rows with a label of "0", which means the dataset regards these passages as "not spam" (in computing parlance this is called "ham"). The second table shows the first nine rows with a label of "1", which means these are considered spam.

In [103]:
# Edit the "9" to show more or fewer examples
df_train.loc[lambda d: d.label == 0][0:9]

Unnamed: 0,label,comment
0,0,"I want to lay out the major problemsin terms of structure and clarityafflicting the Native American controversy paragraph. As I show below, this inept writing serves to obscure the controversy surrounding her claim to Native American ancestry, which was the most-covered issue in her 2012 Senate Campaign. Competently-written paragraphs have clear topic sentences that state the main point of the paragraph. The topic sentenceof the current, incompetently written paragraph on the Native American controversy is as follows: In April 2012, the Boston Herald sparked a campaign controversy when it reported that Association of American Law Schools (AALS) directories from 1986 to 1995 listed Warren as a minority professor.This topic sentence is unclear; a reader who reads it would not see what the controversy was about. Being listed as a minority is not controversial. The issue is that Warren 1) listed herself in a 2) minority-recruitment directory for law schools, on the basis of 3) undocumented claims to Native American ancestry. A competently written topic sentence would convey those three main points. Here is an example of such a topic sentence: In April 2012, the Boston Herald sparked a campaign controversy when it noted that Warren had listed herself in a directory of minority law professors, used by law schools for recruitment purposes, despite the fact that she lacked any documented minority ancestry.That competently-written topic sentence states the controversy in a nutshell. I am willing to entertain any number of alternative topic sentences, including those that are unduly favorable to Warren. (I've given up on truly achieving NPOV for this article.) But I am not willing to abide unclear writing. 
The current paragraph states that Warren's opponent Scott Brown speculated that she had fabricated a native ancestry to gain advantage in employment.At this point, the paragraph has not stated that she claimed to be Native American, so it is not clear what type of ancestry the term native ancestryrefers to. I have tried to replace native ancestrywith Native American ancestry,but have been repeatedly reverted. The current paragraph states that Scott Brown issued attack ads that referred to Warren's ancestry.Whether made out of incompetence or malice, this statement constitutes a BLP violation. The statement indicates that Brown attacked Warren because of her ancestry or race. In reality, he attacked her for (allegedly) lying about her race to get ahead in the employment market."
1,0,"He is deleting the NYTimes, Chosun Ilbo reference. You also deleted a chunk of info without explantion and could've rephrased it. Why does he/she do that? NYTimes and Chosun Ilbo(the largest newspaper company in South Korea) http://pr.chosun.com/chosun_english/"
2,0,"I please you Pleas, Cool Cat is an Turkish Nationalis is not got Administrator -"
3,0,"*Ok I see, you restored my faith in the Wikipedia process, sorry for the misunderstanding."
4,0,In reverting the talk page there you had reverted comments i had recently placed there. please reffrain from doing so in the future. January 14th 2007
5,0,"Well, at last I see a person being called a Palestinian. Didn't know Palestinia was a country. Or a religion. Or anything. But at last, overhere, the word seems to exists, albeit negative. White phosphor, anyone? -"
6,0,:Sorry for the delay; I've restored the article.
7,0,":Well, that's very generous of you! But I'm not exactly into putting green circles. Although I'd use my credit when I'd really need it, if you know what I mean. ''''"
9,0,"Time for a WIKI investigation here. It is no secret Admins work for pay and this is as bad as it gets. I believe I could find at very least a dozen references for the main criminal, Mr. Bermender. Purely laughable in that Mr. Bermender's ad for affilates isn't good enough for this admin."


In [104]:
# Edit the "9" to show more or fewer examples
df_train.loc[lambda d: d.label == 1][0:9]

Unnamed: 0,label,comment
8,1,"You, sir, are a capital douche bag. Please listen to some Dragonforce and stop making wiki lame. thank you"
10,1,FUCK THE WIKI PUNKS who will fuck wikpeida over.... NOW I go to IRS and file HAVE FUN ASSWIPES... YOU JUST COST WIKI BIG TIME...
15,1,"Ya so the us sucs cuz bugh is dumb and they have to much army and people are to mean in new york and jesusland is to big and ya. So anyway, this was severs opinion"
18,1,"Oh go fuck yourself and stop stalking my page, you twisted bag of wank."
46,1,"PLEASE OTHER THAN HIDE BEHIND WK RULES acutally IDENTITY THE OFFESNES COMMIMITED! YOU JUST SAID you dont care about my OPINIONS!..yet the opinions that where QUOTED WHERE FROM THE REFERENCES YOU HAD ACCEPTED!! LOL... SO in which case i am formally complaining about YOU AND YOUR BIASED STANCE! ALL MY REFERNCES HAVE ISBN NUMBERS, YEAR AND PUBLISHERS! YOU ARE PROTECTING YOUR BIASED VIEW!"
47,1,"Why you being such as hater all of the time man? Get off my dick. - A Different World - 4:59 PM - Friday, August 25, 2006."
48,1,I don't know who died and made you king of situations in which you have no first-hand knowledge. Keep doing what you're doing and I'll keep correcting it and other people who also are familiar with these matters will step in and continue to correct it. ()
50,1,"This wikipedia stuff is good shit, next rounds on me aye."
59,1,keep your sorry ass shit post off my page


In [0]:
# This code takes all the training data and uses it to train a Naive Bayes classifier.

# transform the comment column into feature vectors
count_vec = CountVectorizer()
x_train_counts = count_vec.fit_transform(df_train.comment)

# from occurrences to frequencies (tf-idf)
tfidf_transformer = TfidfTransformer()
x_train_tfidf = tfidf_transformer.fit_transform(x_train_counts)

# training a classifier
clf = MultinomialNB().fit(x_train_tfidf, df_train.label)
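
To see what the "occurrences to frequencies" step does, here is a rough plain-Python sketch of tf-idf weighting. It assumes the default `TfidfTransformer` settings (smoothed idf and L2 normalization) and splits on whitespace, which is simpler than what `CountVectorizer` really does. The function name and the example sentences are made up for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return one {word: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for words in tokenized:
        doc_freq.update(set(words))

    # Smoothed inverse document frequency: rare words get larger weights.
    idf = {w: math.log((1 + n_docs) / (1 + df)) + 1 for w, df in doc_freq.items()}

    vectors = []
    for words in tokenized:
        counts = Counter(words)
        weights = {w: c * idf[w] for w, c in counts.items()}
        # L2-normalize so long and short comments are comparable.
        norm = math.sqrt(sum(v * v for v in weights.values())) or 1.0
        vectors.append({w: v / norm for w, v in weights.items()})
    return vectors

example_docs = ["the edit was good", "the edit was bad", "bad bad comment"]
example_vectors = tfidf_vectors(example_docs)
for doc, vec in zip(example_docs, example_vectors):
    print(doc, {w: round(v, 2) for w, v in vec.items()})
```

In the first sentence, "good" ends up with a larger weight than "the" because it appears in fewer documents.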

In [106]:
#@title Enter a phrase to test! { run: "auto", vertical-output: true, display-mode: "both" }
test_phrase = "This is a test phrase (edit to add your own) " #@param {type:"string"}

df_test = pd.DataFrame(data={'comment':[test_phrase]})

# transform test data into feature vectors
x_new_counts = count_vec.transform(df_test.comment)
x_new_tfidf = tfidf_transformer.transform(x_new_counts)

predicted = clf.predict(x_new_tfidf)

print("Phrase is Spam? : " + ("True" if predicted[0] == 1 else "False"))

Phrase is Spam? : False
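
The introduction asks whether you can trick the system. As a starting point, here is a small sketch that generates a few altered versions of a phrase, which you can paste into the `test_phrase` field above one at a time. The function name and the particular substitutions are only illustrative choices.

```python
def adversarial_variants(phrase):
    """Return a few simple perturbations of a phrase to try by hand."""
    # Swap some letters for look-alike symbols.
    leet = str.maketrans({"a": "@", "e": "3", "i": "1", "o": "0", "s": "$"})
    return {
        "original": phrase,
        # Put a space between every character.
        "spaced_out": " ".join(phrase),
        "leetspeak": phrase.lower().translate(leet),
        # Repeat every letter twice.
        "doubled_letters": "".join(ch * 2 if ch.isalpha() else ch for ch in phrase),
    }

variants = adversarial_variants("bad idea")
for name, text in variants.items():
    print(name + ": " + text)
```

One thing to watch for: with its default settings, `CountVectorizer` only keeps tokens of two or more word characters, so the "spaced_out" version leaves the classifier with no words to look at.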
