# Gridsearch

Using grid search to find the vectorizer and classifier parameters that improve the score.

In [1]:
import requests
import time
import pandas as pd
import numpy as np
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import CountVectorizer, HashingVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline, Pipeline
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
import sklearn.metrics as sklm

import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
pd.set_option('display.max_columns', 500)

### Modeling

In [3]:
df = pd.read_csv('./data.csv').drop(columns='Unnamed: 0')

In [4]:
df.head()

| | text | title | target |
|---|---|---|---|
| 0 | Good Morning r/wow! Welcome to the World First... | Azshara's Eternal Palace World First Race Mega... | 1 |
| 1 | Weekly healing thread. | Midweek Mending - Your Weekly Healing Thread | 1 |
| 2 | | i found this like 2 years ago and not a dungeo... | 1 |
| 3 | | The jump says it all | 1 |
| 4 | | She was a bad bad Warchief. Punish her, Anduin! | 1 |


In [5]:
X = df['title']
y = df.target

In [6]:
X_train, X_test, y_train, y_test = train_test_split(X,
                                                    y,
                                                    test_size=0.25,
                                                    random_state=42,
                                                    stratify=y)

In [7]:
# Grid-search a pipeline of the given text transformer followed by a Multinomial Naive Bayes model.
def pipe_gs(transformer):
    pipe = Pipeline([
        ('tf', transformer),
        ('clf', MultinomialNB())])
    parameters = {
        'tf__stop_words': ['english', None],
        'tf__strip_accents': ['ascii'],
        'tf__ngram_range': [(1, 1), (1, 2), (1, 3)],
        # Floats are a share of the documents; the int 1 is an absolute document count.
        'tf__min_df': [0.01, 0.1, 1],
        'tf__max_features': [5000, None],
        'clf__alpha': [0, 0.5, 1],
        'clf__fit_prior': [True, False]
        }

    grid_search = GridSearchCV(pipe, parameters, scoring='accuracy', cv=5, n_jobs=-1, verbose=1)
    grid_search.fit(X_train, y_train)

    print("Best parameters:")
    print(grid_search.best_params_)

    # score() evaluates the best estimator, refitted on the whole training set.
    print("Train model score: {}".format(grid_search.score(X_train, y_train)))
    print("Test model score: {}".format(grid_search.score(X_test, y_test)))

In [8]:
pipe_gs(CountVectorizer())

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.


Fitting 5 folds for each of 216 candidates, totalling 1080 fits


[Parallel(n_jobs=-1)]: Done  34 tasks      | elapsed:    2.8s
[Parallel(n_jobs=-1)]: Done 314 tasks      | elapsed:    6.3s
[Parallel(n_jobs=-1)]: Done 814 tasks      | elapsed:   14.9s


Best parameters:
{'clf__alpha': 1, 'clf__fit_prior': True, 'tf__max_features': 5000, 'tf__min_df': 1, 'tf__ngram_range': (1, 1), 'tf__stop_words': None, 'tf__strip_accents': 'ascii'}
Train model score: 0.9803476946334089
Test model score: 0.854875283446712


[Parallel(n_jobs=-1)]: Done 1080 out of 1080 | elapsed:   19.7s finished
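The "216 candidates, totalling 1080 fits" line in the log follows directly from the parameter grid in `pipe_gs`: every combination of one value per parameter is one candidate, and each candidate is fitted once per fold. A minimal sketch of that arithmetic, using only the standard library (the grid is copied from the function above; the variable names are illustrative):

```python
from itertools import product

# Parameter grid copied from pipe_gs above.
parameters = {
    'tf__stop_words': ['english', None],
    'tf__strip_accents': ['ascii'],
    'tf__ngram_range': [(1, 1), (1, 2), (1, 3)],
    'tf__min_df': [0.01, 0.1, 1],
    'tf__max_features': [5000, None],
    'clf__alpha': [0, 0.5, 1],
    'clf__fit_prior': [True, False],
}
cv_folds = 5  # cv=5 in the GridSearchCV call

# One candidate per combination of parameter values: 2 * 1 * 3 * 3 * 2 * 3 * 2.
candidates = [dict(zip(parameters, combo)) for combo in product(*parameters.values())]
n_candidates = len(candidates)

# Each candidate is fitted once per cross-validation fold.
n_fits = n_candidates * cv_folds

print(n_candidates, n_fits)  # prints "216 1080"
```

Because `refit` defaults to `True` in `GridSearchCV`, one more fit on the whole training set happens after the search, which is why `grid_search.score` can be called directly afterwards.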


In [9]:
pipe_gs(TfidfVectorizer())

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.


Fitting 5 folds for each of 216 candidates, totalling 1080 fits


[Parallel(n_jobs=-1)]: Done 124 tasks      | elapsed:    1.6s
[Parallel(n_jobs=-1)]: Done 685 tasks      | elapsed:   12.1s


Best parameters:
{'clf__alpha': 0.5, 'clf__fit_prior': True, 'tf__max_features': 5000, 'tf__min_df': 1, 'tf__ngram_range': (1, 1), 'tf__stop_words': None, 'tf__strip_accents': 'ascii'}
Train model score: 0.983371126228269
Test model score: 0.8503401360544217


[Parallel(n_jobs=-1)]: Done 1080 out of 1080 | elapsed:   19.0s finished


In [10]:
# Compare the features produced by the default CountVectorizer with those produced by the tuned CountVectorizer.

In [11]:
vec=CountVectorizer()

In [12]:
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it.
X_train_vec = pd.DataFrame(vec.fit_transform(X_train).toarray(), columns=vec.get_feature_names_out())

In [13]:
X_train_vec.head()

| | 10 | 100 | 1000th | 11 | 110 | ... | zone | zones | zoth | zul |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 |

5 rows × 3047 columns


In [14]:
vec=CountVectorizer(stop_words='english',strip_accents='ascii',ngram_range=(1,3), min_df=1, max_features=None)

In [15]:
X_train_vec = pd.DataFrame(vec.fit_transform(X_train).toarray(), columns=vec.get_feature_names_out())

In [16]:
X_train_vec.head()

| | 10 | 10 dumbest | 10 dumbest nerfs | 10 hours | 10 ill | ... | zones | zoth | zul | zul drak | zul gurub |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 |

5 rows × 11836 columns


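The number of features increases from 3047 to 11836. The increase comes from `ngram_range` being set to (1, 3), which adds every two-word and three-word sequence as a new feature on top of the single words.
The `min_df=1` value selected by the grid search does not remove any terms: with `min_df=1` a term only has to appear in one document to be kept, so no terms are ignored.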
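To make the effect of `ngram_range` and `min_df` concrete, here is a minimal pure-Python sketch of how an n-gram vocabulary grows and how a document-frequency threshold prunes it. It is a simplified stand-in, not scikit-learn's implementation: the helper names `ngrams` and `vocabulary` and the toy titles are hypothetical, stop-word removal is left out, and tokenization only approximates `CountVectorizer`'s default of lower-casing and keeping word tokens of two or more characters.

```python
import re
from collections import Counter

def ngrams(text, ngram_range=(1, 1)):
    """Lower-case, split into word tokens of 2+ characters, and emit n-grams."""
    tokens = re.findall(r"\b\w\w+\b", text.lower())
    lo, hi = ngram_range
    return [" ".join(tokens[i:i + n])
            for n in range(lo, hi + 1)
            for i in range(len(tokens) - n + 1)]

def vocabulary(docs, ngram_range=(1, 1), min_df=1):
    """Return the sorted terms whose document frequency reaches min_df."""
    # Document frequency: number of documents containing the term at least once.
    doc_freq = Counter(term for doc in docs for term in set(ngrams(doc, ngram_range)))
    # An int min_df is an absolute count; a float is a share of the documents.
    threshold = min_df if isinstance(min_df, int) else min_df * len(docs)
    return sorted(term for term, count in doc_freq.items() if count >= threshold)

# Hypothetical toy titles, not taken from the dataset.
toy_titles = [
    "Weekly healing thread",
    "Weekly trash talk thread",
    "Best healing class",
]

unigrams = vocabulary(toy_titles, ngram_range=(1, 1))
up_to_trigrams = vocabulary(toy_titles, ngram_range=(1, 3))
frequent_only = vocabulary(toy_titles, ngram_range=(1, 1), min_df=0.5)

print(len(unigrams), len(up_to_trigrams))  # prints "7 18"
print(frequent_only)  # prints "['healing', 'thread', 'weekly']"
```

On this toy corpus the vocabulary more than doubles when bigrams and trigrams are added, the same pattern as the jump from 3047 to 11836 features above. With `min_df=1` every term survives, while a fractional threshold such as 0.5 keeps only terms found in at least half of the titles.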

In [17]:
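# Fit the tuned pipeline. alpha, fit_prior and min_df match the best_params_ printed by the grid search;
# stop_words, ngram_range and max_features differ from that printed output.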
pipe = make_pipeline(CountVectorizer(stop_words='english',strip_accents='ascii',ngram_range=(1,3), min_df=1, max_features=None),
                     MultinomialNB(alpha=1, fit_prior=True)) 
pipe = pipe.fit(X_train, y_train)               
y_pred = pipe.predict(X_test)
        
print("Train model score: {}" .format(pipe.score(X_train, y_train)))  
print("Test model score: {}" .format(pipe.score(X_test, y_test)))  

Train model score: 0.9924414210128496
Test model score: 0.854875283446712


In [18]:
# Check accuracy
print(sklm.accuracy_score(y_test, y_pred))
tn, fp, fn, tp = sklm.confusion_matrix(y_test, y_pred).ravel()

0.854875283446712


In [19]:
print("True Negatives: %s" % tn)
print("False Positives: %s" % fp)
print("False Negatives: %s" % fn)
print("True Positives: %s" % tp)

True Negatives: 158
False Positives: 36
False Negatives: 28
True Positives: 219
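The four counts above are enough to recompute the accuracy and to derive the usual per-class metrics. A short worked sketch using the counts exactly as printed (the metric definitions are the standard ones; nothing here comes from additional model output):

```python
# Confusion-matrix counts as printed above.
tn, fp, fn, tp = 158, 36, 28, 219

total = tn + fp + fn + tp                # number of test posts

accuracy = (tp + tn) / total             # share of all posts classified correctly
precision = tp / (tp + fp)               # of posts predicted as class 1, share that are class 1
recall = tp / (tp + fn)                  # of true class-1 posts, share that were found
specificity = tn / (tn + fp)             # of true class-0 posts, share that were found
f1 = 2 * precision * recall / (precision + recall)

for name, value in [("accuracy", accuracy), ("precision", precision),
                    ("recall", recall), ("specificity", specificity), ("f1", f1)]:
    print("{}: {:.3f}".format(name, value))
```

The recomputed accuracy, 377 correct out of 441 posts, agrees with the `accuracy_score` value printed earlier.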


In [20]:
count = 0
for pred, idx in zip(y_pred, y_test.index):
    # False positive: predicted 1 while the true label is 0
    if pred == 1 and y_test.loc[idx] != pred:
        print(X_test.loc[idx])
        count += 1
print('Number of False positives: {}'.format(count))  # Number of misclassified posts printed above

Wait what?
A small effort by me
What level do I have to get to to unlock everything?
A Junkrat Hiding in the Sewers Saves the Day!
When in doubt, swing randomly
Is there a video or post that explains the entirety of the overarching story?
How To Deal With Toxic People
Muh Stuns!!
Yeah no easter egg here :(
Sig’s Character design
Weekly Trash Talk Thread - July 23, 2019
[HELP] I set my profile to Public... except it isn't.
Best way to introduce a friend to the game?
Every time a DPS player gives credit to support, an angel gets their wings.
Big shatter from god
Can we play normal gamemodes in third person ?
Aussie player looking for teammates to play comp with.
Oppinions
My entire sub box :/
A fast win that my friend and I did a little while ago on No Limit...
Fly Orisa, Fly!!
My hands are still shaky from the intensity of this round!
Hi Reddit! I’m a clinical psychology grad student from California conducting a study on the gaming habits of different types of gamers and I need your hel

In [21]:
count = 0
for pred, idx in zip(y_pred, y_test.index):
    # False negative: predicted 0 while the true label is 1
    if pred == 0 and y_test.loc[idx] != pred:
        print(X_test.loc[idx])
        count += 1
print('Number of False negatives: {}'.format(count))  # Number of misclassified posts printed above

How I made Over 100k In One Week
Is there any information on the beguilling changes from week to week?
Huge FPS drops in +10 dungeons after season 3 started - anyone else have the same issues?
Radeon Driver Update
Wonder what's going on here
Hey guys, here's that Mining enchant bug in action! 1.6 second mining speed with Kul Tiran Herbalism on my gloves... 1.7 second mining speed with Kul Tiran Mining on my gloves!
Okay, I'm going to trolled for this, but I got to know, wtf is this thing?
This Summer Fashion Trends
What is the route of progression for someone just coming back into the game right now?
Meanwhile, Vision of Perfection for Arms....
Aqua Team Murder Force Achievement
Emotes are not to be taken lightly...
Why can't I enable dx12 in Win 8.1?
Quel´Dorei Symmetra (Fan Skin, I hope you like it! :D)
I'm ok at battlegrounds but absolute trash at WorldPVP. What am I doing wrong?
Personal space/housing
Attention to detail
What am I doing wrong?
Why can Fire Mages still not get the f

In the posts that are misclassified, the vocabulary of the title is general and does not contain words that distinguish between WoW and Overwatch. As both games come from Blizzard, the model is unable to tell these posts apart.
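The following sketch illustrates why such titles are hard for a Naive Bayes model. It is a heavily simplified stand-in for `MultinomialNB` with additive smoothing (`alpha=1`) and equal class priors. The helper names `train_counts` and `log_odds` and the toy titles are hypothetical, chosen only to show the mechanism: words that occur equally often in both classes contribute nothing to the decision.

```python
import math
from collections import Counter

def train_counts(titles):
    """Count how often each lower-cased word occurs across one class's titles."""
    return Counter(word for title in titles for word in title.lower().split())

def log_odds(title, counts_a, counts_b, alpha=1.0):
    """Log-likelihood ratio of class A over class B with additive smoothing."""
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    score = 0.0
    for word in title.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no information
        p_a = (counts_a[word] + alpha) / total_a
        p_b = (counts_b[word] + alpha) / total_b
        score += math.log(p_a / p_b)
    return score

# Hypothetical toy titles, not taken from the dataset.
wow_titles = ["warchief raid help", "need help with raid"]
overwatch_titles = ["junkrat ult help", "need help with ult"]

wow = train_counts(wow_titles)
overwatch = train_counts(overwatch_titles)

generic = log_odds("need help with", wow, overwatch)   # only shared words
specific = log_odds("warchief raid", wow, overwatch)   # game-specific words

print(generic, specific)
```

A log-odds of zero means the model has no evidence either way, so the prediction falls back on the class prior. Titles such as "Wait what?" or "What am I doing wrong?" in the lists above are in that situation.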