# Lab Assignment 1: Exploring Text Data
## by Avi Sinha

### 1. Business Understanding

All html files are collected from the IMDb archive in the domain of movies. Each of the 30,000 documents is a review. The reviews are professionally written and are posted to different online newsgroups. Data collected by Bo Pang and Lillian Lee. http://www.cs.cornell.edu/people/pabo/movie-review-data/


#### Purpose of the Data and Analysis
Understanding human sentiment is an important part of businesses to understand the consumers relationship management (CRM). Since humans are verbal communicators, simple numbers are not an accurate indicator. Numeric rating systems can only describe sentiment to a certain extent and are not always available. A better approach is to understand general sentiment from the vocabulary collected in freely available reviews and posts. 

This knowledge of sentiment can be especially beneficial when applied to movie distributers who want a deeper understanding of what qualities make a movie successful before they spend millions to distribute them through channels (either streamed or physical). This way more financially viable movies can be chosen from studios and sold based on reviews. In the end, distributers make money from lucrative movies and consumers would get what they wanted to watch.

#### Prediction Task
The nuances can become extremely fine-grained with implict meanings such as intent, emotion, subjectivity. However, this prediction task would be a basic polarity analysis, a simple positive or negative, coupled with key words describing them, which is basically enough to take decent advantage of the wealth of data available. 

#### Level of Accuracy
The success of this task would result in basic classification of a review as positive or negative by analyzing vocabulary used. The level of success of this kind of classification depends on the length of the review and the complexity of the language used to describe it. Taking all this together, the required accuracy for this data classification to be of use would be around 90+ percent or above because any false classification could result in the movie not being distributed or wrongly distributed in place of a better performer, thus causing massive losses in revenue through wasted production.



### 2. Data Encoding

#### 2.1 Read in raw text documents

In [4]:
# %load parse.py
import glob
import re
import string

def preprocess(text):
    text= re.sub(b"<.*?>", b" ", text)#no_tags
    text= re.sub(b"\n", b" ", text)#no_new_lines
    text= re.sub(b"\r", b" ", text)#no_returns
    #lowered with no punctuation
    text= text.translate(None, b'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~').lower()
    return text


documents = []
for filename in glob.glob('polarity_html/sample/*.html'):
    with open(filename, 'rb') as f:
        raw = f.read()
        cleaned = preprocess(raw)
        documents.append(cleaned)


print(len(documents))


8


#### 2.2 Verify Data Quality and Implement Stemming

In [5]:
documents[2]


b'     review for hollywood shuffle 1987              hollywood shuffle 1987   reviewed by  steve fritzinger                                    hollywood shuffle                     a film review by steve fritzinger                      copyright 1987 steve fritzinger         with hollywood shuffle first time writerdirector robert townsend creates  a very entertaining film  this goodnatured parody of moviemaking gets most  of its laughs from gentle but wellaimed shots at the stereotypes connected to  black actors and white film makers  along the way hollywood shuffle takes time  out to lampoon hardboiled detectives eddie murphy tv movie critics and the  rewards of stardom         townsend also stars as bobby a struggling young black actor who works  parttime at the local winky dinky dog hot dog stand  bobby is trying out  for his first movie role playing a street hood named jivetime jimmy  unfortunately while bobby hopes this part will put him on the road to stardom  he also feels the 

This is the cleaned data

#### 2.3 Convert data from raw text into sparse encoded bag-of-words

In [29]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction import text 
from nltk.corpus import words


domain_specific_stop_words = ["author", "movies", "movie", "film", "reviews", "review"]
stop_words = text.ENGLISH_STOP_WORDS.union(domain_specific_stop_words)

count_vect = CountVectorizer(stop_words= stop_words, 
                             decode_error='ignore',
                             vocabulary=vocab
                                ) # an object capable of counting words in a document!

bag_words = count_vect.fit_transform(documents)

documents[0]

TypeError: 'method' object is not iterable

In [22]:
print(bag_words.shape) # this is a sparse matrix
print('=========')
print(bag_words[0])

(8, 1)
  (0, 0)	1


In [23]:
print(len(count_vect.vocabulary_))
#print(count_vect.vocabulary_)
count_vect.inverse_transform(bag_words[0])

1


[array(['gun'], dtype='<U3')]

In [24]:
# now let's create a pandas API out of this
import pandas as pd

pd.options.display.max_columns = 999
df = pd.DataFrame(data=bag_words.toarray(),columns=count_vect.get_feature_names())
df

Unnamed: 0,gun
0,1
1,0
2,0
3,0
4,0
5,0
6,0
7,0


In [13]:
# print out 10 most common words in our data
df.sum().sort_values()[-10:]

good                    10
arizona                 12
raising                 12
scene                   13
steve                   13
like                    14
responsibility          16
recartsmoviesreviews    16
copyright               17
1987                    20
dtype: int64

#### 2.4 Convert the data into a sparse encoded tf-idf representation.

In [14]:
from sklearn.feature_extraction.text import TfidfVectorizer
tfidf_vect = TfidfVectorizer(stop_words= stop_words, decode_error='ignore', 
                             max_df=0.01,
                             min_df=4)

tfidf_mat = tfidf_vect.fit_transform(documents)
tfidf_mat

ValueError: max_df corresponds to < documents than min_df

In [47]:
# convert to pandas to get better idea about the data
df = pd.DataFrame(data=tfidf_mat.toarray(),columns=tfidf_vect.get_feature_names())
df

MemoryError: 

In [74]:
# print out 10 words with max tfidf, normalized by document occurrence
df.max().sort_values()[-10:]

blind        0.320479
date         0.322304
palace       0.349054
greasers     0.349054
shermans     0.351081
hollywood    0.367310
cambodia     0.376265
march        0.421297
shuffle      0.438277
gibson       0.438717
dtype: float64

### 3. Data Visualization: Visualize statistical summaries of the text data

#### 3.1 word frequencies, most relevant words; Termite chart

In [17]:
count_vect = CountVectorizer(stop_words= stop_words, 
                             decode_error='ignore',
                             ngram_range=(1, 2)
                                ) # an object capable of counting words in a document!
print(bag_words.shape) # this is a sparse matrix
print('=========')
print(bag_words[0])



(8, 2756)
  (0, 1262)	1
  (0, 1464)	1
  (0, 1996)	1
  (0, 502)	1
  (0, 1209)	1
  (0, 162)	1
  (0, 1731)	1
  (0, 621)	1
  (0, 1458)	1
  (0, 900)	1
  (0, 2020)	1
  (0, 1295)	1
  (0, 2569)	1
  (0, 328)	1
  (0, 1688)	1
  (0, 2004)	1
  (0, 465)	1
  (0, 635)	1
  (0, 252)	1
  (0, 507)	1
  (0, 2314)	1
  (0, 2553)	1
  (0, 497)	1
  (0, 724)	1
  (0, 495)	1
  :	:
  (0, 992)	5
  (0, 546)	1
  (0, 601)	1
  (0, 1852)	2
  (0, 812)	1
  (0, 85)	1
  (0, 387)	1
  (0, 2277)	1
  (0, 2334)	2
  (0, 366)	1
  (0, 2417)	1
  (0, 990)	1
  (0, 2648)	1
  (0, 1928)	1
  (0, 850)	1
  (0, 1718)	1
  (0, 1549)	1
  (0, 2419)	1
  (0, 505)	2
  (0, 1584)	4
  (0, 1326)	4
  (0, 2031)	1
  (0, 12)	3
  (0, 2650)	4
  (0, 1417)	4
(8, 2756)
  (0, 1262)	1
  (0, 1464)	1
  (0, 1996)	1
  (0, 502)	1
  (0, 1209)	1
  (0, 162)	1
  (0, 1731)	1
  (0, 621)	1
  (0, 1458)	1
  (0, 900)	1
  (0, 2020)	1
  (0, 1295)	1
  (0, 2569)	1
  (0, 328)	1
  (0, 1688)	1
  (0, 2004)	1
  (0, 465)	1
  (0, 635)	1
  (0, 252)	1
  (0, 507)	1
  (0, 2314)	1
  (0, 2553)	1


In [18]:
# now let's create a pandas API out of this
import pandas as pd

pd.options.display.max_columns = 999
df = pd.DataFrame(data=bag_words.toarray(),columns=count_vect.get_feature_names())
df

Unnamed: 0,10minute,10minute chase,16yearold,16yearold girl,1972,1972 greasers,1972 reviewed,1972written,1972written directed,1986,1986 reviewed,1986 shermans,1987,1987 billy,1987 blind,1987 gary,1987 hollywood,1987 jeff,1987 lethal,1987 raising,1987 reviewed,1987 steve,1987 swimming,1987 tod,30,30 skip,37,37 thought,80s,80s irony,9199672225,9199672225 email,absolutely,absolutely best,absolutely justify,absolutely wonderful,absorb,absorb probing,absurd,absurd visual,absurdly,absurdly unrealistic,accent,accent said,accents,accents amusing,accepts,accepts blind,accepts responsibility,accompanying,accompanying rock,accompanying story,accomplishment,accomplishment centered,accounting,accounting going,act,act imagine,acting,acting small,action,action starts,actionadventure,actionadventure hero,actions,actions tone,actor,actor works,actors,actors hollywood,actors producers,actors pull,actors white,add,add sexual,addition,addition mugged,adison,adison image,admit,admit lil,adults,adults thats,afternoon,afternoon seeing,aging,aging familyman,agony,agony disrupts,aimed,aimed bewilderingly,airplane,airplane exception,airplane wasnt,aka,aka jeff,albert,albert henderson,albinocrazed,albinocrazed henchman,allan,allan arbus,allegra,allegra sb6,allens,allens family,allows,allows drink,allows viewer,ally,ally psycho,alright,alright new,american,american foreign,americana,americana occasionally,amigos,amigos ive,amusing,amusing gosh,amusing im,analyst,analyst style,andersons,andersons understated,andrews,andrews dialogue,angles,angles crosscutting,anticlimatic,anticlimatic fight,antiheroes,antiheroes hope,apart,apart ive,appealing,appealing im,approach,approach comedy,approach liked,arbus,arbus parachutes,archaic,archaic words,archetype,archetype gray,arent,arent going,arizona,arizona 1987,arizona comedy,arizona farting,arizona gary,arizona jeff,arizona laughed,arizona like,arizona think,arizona trying,art,art think,ascii,ascii html,ask,ask borrow,aspects,aspects make,ass,ass happens,assassination,assassination attempt,associated,associated stereotypes,assume,assume doesnt,assume youre,attachment,attachment main,attempt,attempt police,attention,attention imagination,audience,audience perfomance,average,average 16yearold,averages,averages standard,avoid,avoid raising,away,away goons,away tv,babies,babies people,babyseye,babyseye views,bach,bach complained,bach played,bad,bad slapstick,badass,badass biker,badger,badger really,band,band suffers,bangkok,bangkok gulf,banjo,banjo look,bar,bar seaweedhead,base,base death,basically,basically like,basis,basis tv,bassinger,bassinger gives,bathos,bathos difficult,beats,beats country,beautiful,beautiful woman,beautiful young,beauty,beauty business,beavers,beavers mom,beckons,beckons return,beer,beer fridge,begin,begin relationship,begin stallone,begin track,behave,behave drunkenly,believable,believable form,believable fresh,believable world,believe,believe events,believes,believes world,belongs,belongs direct,benson,benson copyright,benson raising,best,best scenes,best zany,better,better avoid,better place,beverly,beverly hills,bewilderingly,bewilderingly stupid,biggest,biggest clients,biker,biker total,billy,billy green,bit,bit drawnout,bit long,bit offbeat,bitching,bitching hope,bizarre,bizarre seen,black,black actor,black actors,blacks,blacks selling,blake,blake edwards,blazing,blazing saddles,blind,blind date,blows,blows away,bobby,bobby hopes,bobby struggling,bobby trying,bobbys,bobbys family,bobbys mind,boogie,boogie fingers,border,border slimylooking,border thailand,boring,boring films,borrow,borrow expensive,bottle,bottle champagne,brazil,brazil brazil,brazil probably,brazil silver,break,break disco,break unlikely,breed,breed storytellers,brief,brief summary,bringing,bringing home,bringing lamy,brings,brings life,broken,broken urls,bronson,bronson good,brown,brown mustard,bruce,bruce willis,built,built saloon,buried,buried hanford,busey,busey albinocrazed,business,business man,business script,butt,butt slapstick,cage,cage mariachi,called,called greasers,calls,calls smiles,cambodia,cambodia 1987,cambodia hour,cambodia lot,cambodia nature,cambodia steve,came,came realizing,camera,camera angles,camera shoulder,camera work,canada,canada capsule,cancel,cancel story,capsule,capsule terrific,care,care emotional,caricatures,caricatures actors,caricatures presented,carrboro,carrboro nc,carried,carried bar,case,case constipation,caught,caught characters,cciosd,cciosd reston,centered,centered internal,centers,centers aging,certain,certain emotional,certainly,certainly allows,challenge,challenge observe,champagne,champagne unfortunately,changes,changes hours,chaos,chaos figures,character,character badger,character puts,character tells,characters,characters caricatures,characters explain,characters heads,characters perfect,characters say,characters screen,characters talk,charm,charm minor,charming,charming beautiful,chase,chase scene,chest,chest slimylooking,christmas,christmas imagine,cigar,cigar chest,cinema,cinema verite,civilized,civilized world,classic,classic fun,classic innocence,clients,clients regular,close,close hes,clue,clue posted,cobra,cobra invasion,coens,coens writerdirectors,cold,cold airplane,come,come sharing,come thumbsupthumbsdown,come years,comedy,comedy did,comedy effective,comedy isnt,comedy mercenary,comedy oil,comedy pretty,comedy really,comedy usually,comedy wild,comfortable,comfortable roles,command,command totality,comments,comments scattered,commentscriticisms,commentscriticisms relevant,commercials,commercials admit,comparison,comparison ridiculous,complained,complained bitching,complete,complete high,completely,completely successful,complicated,complicated members,concert,concert stop,connected,connected black,connection,connection detectives,consequences,consequences actions,considering,considering bad,consistency,consistency logic,constipation,constipation entire,contains,contains archaic,contents,contents editorial,control,control unless,controversial,...,sorry jet,sort,sort things,sort vocabulary,soul,soul seaweedhead,sound,sound effects,south,south familysomewhat,south gets,south just,south start,southern,southern americana,spalding,spalding gray,spalding grays,speaking,speaking leaves,special,special centers,speeches,speeches like,speed,speed chase,spent,spent weeks,spoofing,spoofing stereotypes,sprawl,sprawl couch,stale,stale predictable,stallone,stallone lifeless,stance,stance sacred,stand,stand bobby,stand hear,standard,standard deviation,standard scale,star,star course,stardom,stardom feels,stardom townsend,stars,stars bobby,start,start girlfriend,starts,starts lets,starts run,stated,stated copyright,states,states grist,statusimmediately,statusimmediately introduces,stereotypes,stereotypes connected,stereotypes intelligent,stereotypes poking,steve,steve fritzinger,steve martin,steve upstill,stole,stole christmas,stood,stood minute,stop,stop making,stop reading,story,story jarring,story special,storytellers,storytellers command,strangest,strangest engrossing,streak,streak tell,street,street hood,struggles,struggles david,struggling,struggling filmmaker,struggling young,stupid,stupid criminal,stupid harrison,stupid humor,stupid said,stupid wearing,style,style social,subdued,subdued providing,subtlety,subtlety humor,successful,successful exercise,suffers,suffers severe,suit,suit time,summary,summary follows,sun,sun allegra,support,support united,surrealistic,surrealistic badass,surrounding,surrounding filming,surrounds,surrounds seedy,survivalists,survivalists southern,sw,sw accents,swimming,swimming cambodia,symbiosis,symbiosis doubt,symbolic,symbolic approach,sympathy,sympathy early,synthetic,synthetic way,takes,takes time,taking,taking meat,taking stance,talk,talk like,talking,talking heads,talking reality,talking spalding,tarry,tarry yon,tastes,tastes humor,teach,teach chase,tears,tears youre,tell,tell orientation,tells,tells head,tend,tend cancel,tends,tends marginal,terrific,terrific fun,thailand,thailand acting,thailand waiting,thats,thats believable,thats charm,thats ithope,theater,theater imagining,theater live,theater set,theres,theres plot,theyre,theyre going,theyve,theyve given,things,things jet,think,think ally,think eventually,think intelligent,think moriarty,think shot,thought,thought ending,thought hail,threehour,threehour cinema,thrown,thrown firms,throws,throws life,thumbsupthumbsdown,thumbsupthumbsdown rating,time,time greaser,time lampoon,time licketysplit,time writerdirector,timing,timing pacing,tod,tod kuykendall,today,today understand,told,told walter,tone,tone scenes,tons,tons great,tornado,tornado walter,total,total morons,totality,totality attention,town,town built,town follows,town gathers,townsend,townsend creates,townsend spoofing,townsend stars,track,track men,tranqs,tranqs timing,transcends,transcends dialogue,translated,translated jive,transmutes,transmutes believable,treat,treat mark,tried,tried come,trip,trip scene,trip south,trudge,trudge door,truly,truly bizarre,try,try rate,trying,trying leave,trying role,tunneling,tunneling prison,turned,turned insideout,turns,turns heads,tv,tv critics,tv series,tv videotape,twice,twice long,tykes,tykes flick,types,types basically,ugh,ugh holy,unadulterated,unadulterated dumb,understand,understand funny,understand reservation,understated,understated comedy,understated music,understated symbolic,understatedly,understatedly deadpan,unfolded,unfolded impossible,unfortunately,unfortunately bobby,unfortunately told,united,united states,unless,unless stated,unless youre,unlikely,unlikely places,unnoticed,unnoticed walks,unrealistic,unrealistic portrayal,unscrupulous,unscrupulous exfiance,upstill,upstill copyright,upstill posted,upstill spalding,upstill swimming,urls,urls inthe,usa,usa rest,used,used teach,usually,usually leave,uucp,uucp uwbeaver,uwbeaver,uwbeaver sun,va,va seismorlgvaxjsf,valuable,valuable reasons,vaudeville,vaudeville entertainer,vendors,vendors caricatures,verite,verite excursion,version,version gospel,videos,videos guess,videotape,videotape im,viewer,viewer certain,viewer enjoying,views,views adults,visit,visit entire,visit firefights,visual,visual humor,visual joketelling,vocabulary,vocabulary does,vs,vs fleshandblood,wacky,wacky girl,waiting,waiting hey,waiting support,walks,walks particularly,walter,walter course,walter disorganized,walter finally,walter nadia,walter problem,wanted,wanted write,warm,warm human,warning,warning slimylooking,warnings,warnings sisterinlaw,wasnt,wasnt funny,waste,waste site,wasted,wasted afternoon,wasted pig,water,water bringing,way,way averages,way finally,way formal,way gibson,way hollywood,way left,weaknesses,weaknesses glaring,weapon,weapon 1987,weapon jeff,weapon tends,wear,wear plaid,wear theyre,wearing,wearing way,weeks,weeks thailand,wellaimed,wellaimed shots,werent,werent going,west,west small,white,white enters,white makers,wife,wife gibson,wife houseguests,wife leave,wife share,wild,wild rapidly,willis,willis struggles,winky,winky dinky,wisconsins,wisconsins kentucky,wish,wish mel,wishing,wishing twice,wo,wo clue,woman,woman harm,woman sings,womans,womans lust,wonderful,wonderful stand,woody,woody allens,words,words phrases,work,work got,works,works parttime,world,world canada,world creates,world eyes,world far,worth,worth waiting,write,write quotes,writerdirector,writerdirector robert,writerdirectors,writerdirectors raising,yeah,yeah parallels,year,year running,years,years finest,yes,yes art,yes including,yoks,yoks diet,yoks farcical,yoks just,yoks rude,yoks satirical,yon,yon giving,york,york monologue,york young,youll,youll like,young,young black,young struggling,young woman,young womans,youre,youre comedy,youre going,youre interested,youre picky,youre treat,yuppie,yuppie comedy,zany,zany come,zoot,zoot suit
0,0,0,1,1,0,0,0,0,0,0,0,0,3,0,0,0,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,1,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,2,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,1,1,0,0,1,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,1,0,0,1,1,4,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,2,1,1,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,...,0,0,0,0,0,0,0,0,4,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,2,0,1,0,0,1,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,4,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,2,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,1,1,1,0,0,0,0,0,0,0,0,3,1,1,1,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,2,1,1,1,1,0,0,0,0,0,1,1,0,0,0,0,3,1,1,1,4,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,1,0,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,3,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,2,1,1,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,3,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,1,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,1,1,2,...,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1
4,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,1,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,1,1,0,0,0,0,5,2,1,1,1,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,2,0,1,1,0,0,0,0,1,1,3,2,1,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,2,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,1,1,2,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,2,1,0,0,0,0,0,0,1,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,1,1,1,1,0,...,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,1,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,2,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,0,0,0,0,1,1,1,1,0,0,1,1,1,1,0,0,0,0,0,0,1,1,0,0,0,2,1,1,0,2,0,1,1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,1,1,0,0,1,1,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,1,1,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,2,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,1,1,0,0,0,1,1,0,0,0,0,0,2,0,1,0,0,1,0,0,0,0,0,0
5,1,1,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,1,0,1,1,0,0,0,1,1,0,0,0,0,0,0,2,1,0,1,0,0,0,0,0,0,1,1,2,2,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,1,1,1,0,1,0,0,1,1,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,2,1,1,0,0,0,0,0,0,0,0,7,2,0,0,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,1,0,0,2,1,1,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,1,0,1,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,1,1,1,0,0,0,0,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,1,4,0,0,1,1,1,0,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,1,1,0,...,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,1,0,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,1,1,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,2,2,0,0,1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,0,1,2,1,0,1,0,0,0,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,1,0,0,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,1,1,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,1,0,0,0,1,0,0,1,1,0,0,1,1,0,0,0,0,1,1,0,0,0,5,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,3,1,1,0,1,0,0,0,1,1,0,0
6,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,2,1,1,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,...,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,3,2,1,1,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,4,0,0,4,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,5,5,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,3,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,4,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,1,1,0,0,0,0
7,0,0,0,0,0,0,0,0,0,0,0,0,3,0,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,5,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,4,4,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,1,1,1,1,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [19]:
df.max().sort_values()[-10:]

shuffle              7
hollywood shuffle    7
raising              7
arizona              7
greaser              7
hollywood            7
greasers             8
greasers palace      8
palace               8
gibson               9
dtype: int64

#### 3.2  most relevant words Cloud chart

In [67]:
from yellowbrick.text import TSNEVisualizer
from sklearn.feature_extraction.text import TfidfVectorizer

tfidf = TfidfVectorizer(dtype='float64')


tsne = TSNEVisualizer(labels=["documents"])

docs = [documents]

tsne.fit(docs)
tsne.poof()

TypeError: Cannot cast array data from dtype('float64') to dtype('S32') according to the rule 'safe'