<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Acknowledgements" data-toc-modified-id="Acknowledgements-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Acknowledgements</a></span></li><li><span><a href="#Prepare-data-and-model" data-toc-modified-id="Prepare-data-and-model-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Prepare data and model</a></span></li><li><span><a href="#Make-feature-matrix-(word2vec,-votes,-stars)" data-toc-modified-id="Make-feature-matrix-(word2vec,-votes,-stars)-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Make feature matrix (word2vec, votes, stars)</a></span></li><li><span><a href="#Create-Label-y-(Business-categories)" data-toc-modified-id="Create-Label-y-(Business-categories)-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Create Label y (Business categories)</a></span></li><li><span><a href="#Join-x,y-(feature-matrix,-category)-using-business_id" data-toc-modified-id="Join-x,y-(feature-matrix,-category)-using-business_id-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Join x,y (feature matrix, category) using business_id</a></span></li><li><span><a href="#Category-Prediction" data-toc-modified-id="Category-Prediction-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Category Prediction</a></span><ul class="toc-item"><li><span><a href="#Recall-(and-other-classification-metrics)" data-toc-modified-id="Recall-(and-other-classification-metrics)-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>Recall (and other classification metrics)</a></span></li><li><span><a href="#Top-RECOMMENDATIONS" data-toc-modified-id="Top-RECOMMENDATIONS-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>Top RECOMMENDATIONS</a></span></li></ul></li><li><span><a href="#Cluster-with-metadata-(useful,-cool,-funny,-stars)" data-toc-modified-id="Cluster-with-metadata-(useful,-cool,-funny,-stars)-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Cluster with metadata (useful, cool, funny, stars)</a></span></li></ul></div>

# Acknowledgements
Thanks to the tutorial: https://www.kaggle.com/c/word2vec-nlp-tutorial/overview/part-3-more-fun-with-word-vectors

# Prepare data and model

In [183]:
%matplotlib inline
import pandas as pd
pd.options.display.max_columns = 999
pd.options.display.max_rows=70
import numpy as np
import matplotlib.pyplot as plt

import re

import nltk
import nltk.data
nltk.download('stopwords')
from nltk.corpus import stopwords # Import the stop word list



[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/daviderickson/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [2]:
def load_reviews(size='small'): 
    if size == 'small':
        filename = r'../../data/small-review.json'
    elif size == 'intermediate':
        filename = r'../../data/intermediate-review.json'
    elif size == 'full':
        filename = r'../../data/review.json'
    new_list = []
    for line in open(filename):
       new_list.append(json.loads(line))
    return pd.DataFrame.from_records(new_list)

dfreviews = load_reviews(size='intermediate')

In [3]:
dfreviews.head()

Unnamed: 0,business_id,cool,date,funny,review_id,stars,text,useful,user_id
0,ujmEBvifdJM6h6RLv4wQIg,0,2013-05-07 04:34:36,1,Q1sbwvVQXV2734tPgoKj4Q,1.0,Total bill for this horrible service? Over $8G...,6,hG7b0MtEbXx5QzbzE6C_VA
1,NZnhc2sEQy3RmzKTZnqtwQ,0,2017-01-14 21:30:33,0,GJXCdrto3ASJOqKeVWPi6Q,5.0,I *adore* Travis at the Hard Rock's new Kelly ...,0,yXQM5uF2jS6es16SJzNHfg
2,WTqjgwHlXbSFevF32_DJVw,0,2016-11-09 20:09:03,0,2TzJjDVDEuAW6MR5Vuc1ug,5.0,I have to say that this office really has it t...,3,n6-Gk65cPZL6Uz8qRm3NYw
3,ikCg8xy5JIg_NGPx-MSIDA,0,2018-01-09 20:56:38,0,yi0R0Ugj_xUx_Nek0-_Qig,5.0,Went in for a lunch. Steak sandwich was delici...,0,dacAIZ6fTM6mqwW5uxkskg
4,b1b1eb3uo-w561D0ZfCEiQ,0,2018-01-30 23:07:38,0,11a8sVPMUFtaC7_ABRkmtw,1.0,Today was my second out of three sessions I ha...,7,ssoyf2_x0EQMed6fgHeMyQ


In [4]:
dfreviews.columns

Index(['business_id', 'cool', 'date', 'funny', 'review_id', 'stars', 'text',
       'useful', 'user_id'],
      dtype='object')

In [5]:
dfreviews['text'][0]

'Total bill for this horrible service? Over $8Gs. These crooks actually had the nerve to charge us $69 for 3 pills. I checked online the pills can be had for 19 cents EACH! Avoid Hospital ERs at all costs.'

In [6]:
# For simplicity, drop anything that isn't a letter
# Numbers and symbols may have interesting meaning and could be explore later

def lettersOnly(string):
    return re.sub("[^a-zA-Z]", " ", string) 

dfreviews['text'] = dfreviews['text'].apply(lettersOnly)


In [7]:
dfreviews['text'][0]

'Total bill for this horrible service  Over   Gs  These crooks actually had the nerve to charge us     for   pills  I checked online the pills can be had for    cents EACH  Avoid Hospital ERs at all costs '

In [8]:
def review_to_wordlist(string, remove_stopwords=False):
    string = re.sub("[^a-zA-Z]", " ", string) # keep only letters. more complex model possible later
    words =  string.lower().split() # make everything lowercase. split into words
    if remove_stopwords:
        stops = set(stopwords.words('english')) # create a fast lookup for stopwords
        words = [w for w in words if not w in stops] # remove stopwords
    return( words) # return a list of words
    
# dfreviews['text'] = dfreviews['text'].apply(review_to_words) # apply to reviews in dataframe


In [9]:
# Word2Vec expects single sentences, each one as a list of words

# Load the punkt tokenizer
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')

# Define a function to split a review into parsed sentences
def review_to_sentences( review, tokenizer, remove_stopwords=False ):
    # Function to split a review into parsed sentences. Returns a 
    # list of sentences, where each sentence is a list of words
    #
    # 1. Use the NLTK tokenizer to split the paragraph into sentences
    raw_sentences = tokenizer.tokenize(review.strip())
    #
    # 2. Loop over each sentence
    sentences = []
    for raw_sentence in raw_sentences:
        # If a sentence is empty, skip it
        if len(raw_sentence) > 0:
            # Otherwise, call review_to_wordlist to get a list of words
            sentences.append( review_to_wordlist( raw_sentence, \
              remove_stopwords ))
    #
    # Return the list of sentences (each sentence is a list of words,
    # so this returns a list of lists
    return sentences

In [10]:
sentences = []  # Initialize an empty list of sentences

print("Parsing sentences")
for review in dfreviews["text"]:
    sentences += review_to_sentences(review, tokenizer)

Parsing sentences


In [11]:
# Import the built-in logging module and configure it so that Word2Vec 
# creates nice output messages
import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',\
    level=logging.INFO)

# Set values for various parameters
num_features = 300    # Word vector dimensionality                      
min_word_count = 40   # Minimum word count                        
num_workers = 4       # Number of threads to run in parallel
context = 10          # Context window size                                                                                    
downsampling = 1e-3   # Downsample setting for frequent words

# Initialize and train the model (this will take some time)
from gensim.models import word2vec
print("Training model...")
model = word2vec.Word2Vec(sentences, workers=num_workers, \
            size=num_features, min_count = min_word_count, \
            window = context, sample = downsampling)

# If you don't plan to train the model any further, calling 
# init_sims will make the model much more memory-efficient.
model.init_sims(replace=True)

# It can be helpful to create a meaningful model name and 
# save the model for later use. You can load it later using Word2Vec.load()
model_name = "300features_40minwords_10context"
model.save(model_name)

2020-01-21 15:20:16,054 : INFO : 'pattern' package not found; tag filters are not available for English
2020-01-21 15:20:16,064 : INFO : collecting all words and their counts
2020-01-21 15:20:16,065 : INFO : PROGRESS: at sentence #0, processed 0 words, keeping 0 word types


Training model...


2020-01-21 15:20:16,369 : INFO : PROGRESS: at sentence #10000, processed 1088334 words, keeping 25539 word types
2020-01-21 15:20:16,704 : INFO : PROGRESS: at sentence #20000, processed 2172597 words, keeping 35463 word types
2020-01-21 15:20:17,086 : INFO : PROGRESS: at sentence #30000, processed 3251616 words, keeping 42649 word types
2020-01-21 15:20:17,460 : INFO : PROGRESS: at sentence #40000, processed 4373996 words, keeping 48893 word types
2020-01-21 15:20:17,703 : INFO : PROGRESS: at sentence #50000, processed 5471587 words, keeping 53964 word types
2020-01-21 15:20:18,011 : INFO : PROGRESS: at sentence #60000, processed 6570064 words, keeping 58362 word types
2020-01-21 15:20:18,423 : INFO : PROGRESS: at sentence #70000, processed 7667364 words, keeping 62704 word types
2020-01-21 15:20:18,670 : INFO : PROGRESS: at sentence #80000, processed 8768955 words, keeping 66443 word types
2020-01-21 15:20:18,852 : INFO : PROGRESS: at sentence #90000, processed 9872097 words, keeping 

2020-01-21 15:21:04,131 : INFO : EPOCH 4 - PROGRESS: at 86.02% examples, 664160 words/s, in_qsize 7, out_qsize 0
2020-01-21 15:21:05,132 : INFO : EPOCH 4 - PROGRESS: at 93.44% examples, 656797 words/s, in_qsize 7, out_qsize 0
2020-01-21 15:21:06,047 : INFO : worker thread finished; awaiting finish of 3 more threads
2020-01-21 15:21:06,061 : INFO : worker thread finished; awaiting finish of 2 more threads
2020-01-21 15:21:06,097 : INFO : worker thread finished; awaiting finish of 1 more threads
2020-01-21 15:21:06,104 : INFO : worker thread finished; awaiting finish of 0 more threads
2020-01-21 15:21:06,105 : INFO : EPOCH - 4 : training on 10978770 raw words (7805065 effective words) took 12.1s, 646858 effective words/s
2020-01-21 15:21:07,142 : INFO : EPOCH 5 - PROGRESS: at 6.40% examples, 480993 words/s, in_qsize 7, out_qsize 1
2020-01-21 15:21:08,150 : INFO : EPOCH 5 - PROGRESS: at 15.29% examples, 578248 words/s, in_qsize 7, out_qsize 0
2020-01-21 15:21:09,151 : INFO : EPOCH 5 - PRO

In [12]:
model.most_similar('pizza')

  """Entry point for launching an IPython kernel.


[('crust', 0.7200956344604492),
 ('pizzas', 0.6885954141616821),
 ('pepperoni', 0.6838212013244629),
 ('margherita', 0.6435902118682861),
 ('calzone', 0.6134117841720581),
 ('mozzarella', 0.5657241344451904),
 ('meatball', 0.5640321969985962),
 ('lasagna', 0.5393074750900269),
 ('dough', 0.533336877822876),
 ('slice', 0.5115900635719299)]

In [13]:
model.most_similar('service')

  """Entry point for launching an IPython kernel.


[('waitstaff', 0.5476328134536743),
 ('staff', 0.4523521065711975),
 ('servers', 0.40953201055526733),
 ('value', 0.39693307876586914),
 ('bartenders', 0.39646321535110474),
 ('communication', 0.3960760235786438),
 ('execution', 0.39064428210258484),
 ('experience', 0.3879009485244751),
 ('hostesses', 0.3836747705936432),
 ('food', 0.3836521506309509)]

In [14]:
model.most_similar('bad')

  """Entry point for launching an IPython kernel.


[('terrible', 0.6351137161254883),
 ('horrible', 0.5976879000663757),
 ('awful', 0.5804861783981323),
 ('good', 0.5499963164329529),
 ('poor', 0.530113697052002),
 ('disappointing', 0.4936771094799042),
 ('alright', 0.4890596866607666),
 ('subpar', 0.4678184390068054),
 ('acceptable', 0.4646115303039551),
 ('meh', 0.45676177740097046)]

In [15]:
import numpy as np  # Make sure that numpy is imported

def makeFeatureVec(words, model, num_features):
    # Function to average all of the word vectors in a given
    # paragraph
    #
    # Pre-initialize an empty numpy array (for speed)
    featureVec = np.zeros((num_features,),dtype="float32")
    #
    nwords = 0.
    # 
    # WV.Index2word is a list that contains the names of the words in 
    # the model's vocabulary. Convert it to a set, for speed 
    index2word_set = set(model.wv.index2word)
    #
    # Loop over each word in the review and, if it is in the model's
    # vocaublary, add its feature vector to the total
    for word in words:
        if word in index2word_set: 
            nwords = nwords + 1.
            featureVec = np.add(featureVec,model[word])
    # 
    # Divide the result by the number of words to get the average
    featureVec = np.divide(featureVec,nwords)
    return featureVec


def getAvgFeatureVecs(reviews, model, num_features):
    # Given a set of reviews (each one a list of words), calculate 
    # the average feature vector for each one and return a 2D numpy array 
    # 
    # Initialize a counter
    counter = int(0.)
    # 
    # Preallocate a 2D numpy array, for speed
    reviewFeatureVecs = np.zeros((len(reviews),num_features),dtype="float32")
    # 
    # Loop through the reviews
    for review in reviews:
       #
       # Print a status message every 1000th review
       if counter%1000. == 0.:
           print ("Review %d of %d" % (counter, len(reviews)))
       # 
       # Call the function (defined above) that makes average feature vectors
       reviewFeatureVecs[counter] = makeFeatureVec(review, model, \
           num_features)
       #
       # Increment the counter
       counter = counter + 1
    return reviewFeatureVecs

In [16]:
# ****************************************************************
# Calculate average feature vectors
# using the functions we defined above. Notice that we now use stop word
# removal.

clean_reviews = []
for review in dfreviews["text"]:
    clean_reviews.append( review_to_wordlist( review, \
        remove_stopwords=True ))

reviewDataVecs = getAvgFeatureVecs( clean_reviews, model, num_features )

Review 0 of 100000




Review 1000 of 100000
Review 2000 of 100000
Review 3000 of 100000
Review 4000 of 100000
Review 5000 of 100000
Review 6000 of 100000
Review 7000 of 100000
Review 8000 of 100000




Review 9000 of 100000
Review 10000 of 100000
Review 11000 of 100000
Review 12000 of 100000
Review 13000 of 100000
Review 14000 of 100000
Review 15000 of 100000
Review 16000 of 100000
Review 17000 of 100000
Review 18000 of 100000
Review 19000 of 100000
Review 20000 of 100000
Review 21000 of 100000
Review 22000 of 100000
Review 23000 of 100000
Review 24000 of 100000
Review 25000 of 100000
Review 26000 of 100000
Review 27000 of 100000
Review 28000 of 100000
Review 29000 of 100000
Review 30000 of 100000
Review 31000 of 100000
Review 32000 of 100000
Review 33000 of 100000
Review 34000 of 100000
Review 35000 of 100000
Review 36000 of 100000
Review 37000 of 100000
Review 38000 of 100000
Review 39000 of 100000
Review 40000 of 100000
Review 41000 of 100000
Review 42000 of 100000
Review 43000 of 100000
Review 44000 of 100000
Review 45000 of 100000
Review 46000 of 100000
Review 47000 of 100000
Review 48000 of 100000
Review 49000 of 100000
Review 50000 of 100000
Review 51000 of 100000
Review 52000

# Make feature matrix (word2vec, votes, stars)

In [17]:
reviewDataVecs.shape[1]

300

In [18]:
# Add non-text data back to feature matrix
review_features = ['cool', 'funny', 'useful', 'stars' , 'business_id']
all_features_labels = ['w2v{}'.format(idx) for idx in range(reviewDataVecs.shape[1])] + review_features
all_features = np.append(reviewDataVecs, dfreviews[review_features].to_numpy(), 1)


In [19]:
# Create df 
all_features_df = pd.DataFrame(data=all_features, columns=all_features_labels)

# Convert all but business_id to numerical
business_ids = all_features_df['business_id']
all_features_df = all_features_df.iloc[:,:-1].astype('float64')
all_features_df['business_id'] = business_ids
del business_ids

# Group by business_id
all_features_business = all_features_df.groupby(by='business_id').mean()

In [20]:
all_features_business.head()

Unnamed: 0_level_0,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars
business_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1,Unnamed: 49_level_1,Unnamed: 50_level_1,Unnamed: 51_level_1,Unnamed: 52_level_1,Unnamed: 53_level_1,Unnamed: 54_level_1,Unnamed: 55_level_1,Unnamed: 56_level_1,Unnamed: 57_level_1,Unnamed: 58_level_1,Unnamed: 59_level_1,Unnamed: 60_level_1,Unnamed: 61_level_1,Unnamed: 62_level_1,Unnamed: 63_level_1,Unnamed: 64_level_1,Unnamed: 65_level_1,Unnamed: 66_level_1,Unnamed: 67_level_1,Unnamed: 68_level_1,Unnamed: 69_level_1,Unnamed: 70_level_1,Unnamed: 71_level_1,Unnamed: 72_level_1,Unnamed: 73_level_1,Unnamed: 74_level_1,Unnamed: 75_level_1,Unnamed: 76_level_1,Unnamed: 77_level_1,Unnamed: 78_level_1,Unnamed: 79_level_1,Unnamed: 80_level_1,Unnamed: 81_level_1,Unnamed: 82_level_1,Unnamed: 83_level_1,Unnamed: 84_level_1,Unnamed: 85_level_1,Unnamed: 86_level_1,Unnamed: 87_level_1,Unnamed: 88_level_1,Unnamed: 89_level_1,Unnamed: 90_level_1,Unnamed: 91_level_1,Unnamed: 92_level_1,Unnamed: 93_level_1,Unnamed: 94_level_1,Unnamed: 95_level_1,Unnamed: 96_level_1,Unnamed: 97_level_1,Unnamed: 98_level_1,Unnamed: 99_level_1,Unnamed: 100_level_1,Unnamed: 101_level_1,Unnamed: 102_level_1,Unnamed: 103_level_1,Unnamed: 104_level_1,Unnamed: 105_level_1,Unnamed: 106_level_1,Unnamed: 107_level_1,Unnamed: 108_level_1,Unnamed: 109_level_1,Unnamed: 110_level_1,Unnamed: 111_level_1,Unnamed: 112_level_1,Unnamed: 113_level_1,Unnamed: 114_level_1,Unnamed: 115_level_1,Unnamed: 116_level_1,Unnamed: 117_level_1,Unnamed: 118_level_1,Unnamed: 119_level_1,Unnamed: 120_level_1,Unnamed: 121_level_1,Unnamed: 122_level_1,Unnamed: 123_level_1,Unnamed: 124_level_1,Unnamed: 125_level_1,Unnamed: 126_level_1,Unnamed: 127_level_1,Unnamed: 128_level_1,Unnamed: 129_level_1,Unnamed: 130_level_1,Unnamed: 131_level_1,Unnamed: 132_level_1,Unnamed: 133_level_1,Unnamed: 134_level_1,Unnamed: 135_level_1,Unnamed: 136_level_1,Unnamed: 137_level_1,Unnamed: 138_level_1,Unnamed: 139_level_1,Unnamed: 140_level_1,Unnamed: 141_level_1,Unnamed: 142_level_1,Unnamed: 143_level_1,Unnamed: 144_level_1,Unnamed: 145_level_1,Unnamed: 146_level_1,Unnamed: 147_level_1,Unnamed: 148_level_1,Unnamed: 149_level_1,Unnamed: 150_level_1,Unnamed: 151_level_1,Unnamed: 152_level_1,Unnamed: 153_level_1,Unnamed: 154_level_1,Unnamed: 155_level_1,Unnamed: 156_level_1,Unnamed: 157_level_1,Unnamed: 158_level_1,Unnamed: 159_level_1,Unnamed: 160_level_1,Unnamed: 161_level_1,Unnamed: 162_level_1,Unnamed: 163_level_1,Unnamed: 164_level_1,Unnamed: 165_level_1,Unnamed: 166_level_1,Unnamed: 167_level_1,Unnamed: 168_level_1,Unnamed: 169_level_1,Unnamed: 170_level_1,Unnamed: 171_level_1,Unnamed: 172_level_1,Unnamed: 173_level_1,Unnamed: 174_level_1,Unnamed: 175_level_1,Unnamed: 176_level_1,Unnamed: 177_level_1,Unnamed: 178_level_1,Unnamed: 179_level_1,Unnamed: 180_level_1,Unnamed: 181_level_1,Unnamed: 182_level_1,Unnamed: 183_level_1,Unnamed: 184_level_1,Unnamed: 185_level_1,Unnamed: 186_level_1,Unnamed: 187_level_1,Unnamed: 188_level_1,Unnamed: 189_level_1,Unnamed: 190_level_1,Unnamed: 191_level_1,Unnamed: 192_level_1,Unnamed: 193_level_1,Unnamed: 194_level_1,Unnamed: 195_level_1,Unnamed: 196_level_1,Unnamed: 197_level_1,Unnamed: 198_level_1,Unnamed: 199_level_1,Unnamed: 200_level_1,Unnamed: 201_level_1,Unnamed: 202_level_1,Unnamed: 203_level_1,Unnamed: 204_level_1,Unnamed: 205_level_1,Unnamed: 206_level_1,Unnamed: 207_level_1,Unnamed: 208_level_1,Unnamed: 209_level_1,Unnamed: 210_level_1,Unnamed: 211_level_1,Unnamed: 212_level_1,Unnamed: 213_level_1,Unnamed: 214_level_1,Unnamed: 215_level_1,Unnamed: 216_level_1,Unnamed: 217_level_1,Unnamed: 218_level_1,Unnamed: 219_level_1,Unnamed: 220_level_1,Unnamed: 221_level_1,Unnamed: 222_level_1,Unnamed: 223_level_1,Unnamed: 224_level_1,Unnamed: 225_level_1,Unnamed: 226_level_1,Unnamed: 227_level_1,Unnamed: 228_level_1,Unnamed: 229_level_1,Unnamed: 230_level_1,Unnamed: 231_level_1,Unnamed: 232_level_1,Unnamed: 233_level_1,Unnamed: 234_level_1,Unnamed: 235_level_1,Unnamed: 236_level_1,Unnamed: 237_level_1,Unnamed: 238_level_1,Unnamed: 239_level_1,Unnamed: 240_level_1,Unnamed: 241_level_1,Unnamed: 242_level_1,Unnamed: 243_level_1,Unnamed: 244_level_1,Unnamed: 245_level_1,Unnamed: 246_level_1,Unnamed: 247_level_1,Unnamed: 248_level_1,Unnamed: 249_level_1,Unnamed: 250_level_1,Unnamed: 251_level_1,Unnamed: 252_level_1,Unnamed: 253_level_1,Unnamed: 254_level_1,Unnamed: 255_level_1,Unnamed: 256_level_1,Unnamed: 257_level_1,Unnamed: 258_level_1,Unnamed: 259_level_1,Unnamed: 260_level_1,Unnamed: 261_level_1,Unnamed: 262_level_1,Unnamed: 263_level_1,Unnamed: 264_level_1,Unnamed: 265_level_1,Unnamed: 266_level_1,Unnamed: 267_level_1,Unnamed: 268_level_1,Unnamed: 269_level_1,Unnamed: 270_level_1,Unnamed: 271_level_1,Unnamed: 272_level_1,Unnamed: 273_level_1,Unnamed: 274_level_1,Unnamed: 275_level_1,Unnamed: 276_level_1,Unnamed: 277_level_1,Unnamed: 278_level_1,Unnamed: 279_level_1,Unnamed: 280_level_1,Unnamed: 281_level_1,Unnamed: 282_level_1,Unnamed: 283_level_1,Unnamed: 284_level_1,Unnamed: 285_level_1,Unnamed: 286_level_1,Unnamed: 287_level_1,Unnamed: 288_level_1,Unnamed: 289_level_1,Unnamed: 290_level_1,Unnamed: 291_level_1,Unnamed: 292_level_1,Unnamed: 293_level_1,Unnamed: 294_level_1,Unnamed: 295_level_1,Unnamed: 296_level_1,Unnamed: 297_level_1,Unnamed: 298_level_1,Unnamed: 299_level_1,Unnamed: 300_level_1,Unnamed: 301_level_1,Unnamed: 302_level_1,Unnamed: 303_level_1,Unnamed: 304_level_1
--I7YYLada0tSLkORTHb5Q,-0.005223,0.013203,0.003038,0.005219,-0.015445,-0.0239,0.022235,0.005243,-0.004331,-0.016159,-0.004644,-0.015904,-0.01401,-0.001277,-0.010583,0.018494,-0.032239,0.009304,0.019475,0.020988,0.008241,-0.003566,-0.011737,-0.007616,-0.002773,-0.001232,0.001906,0.003526,0.008889,-0.003706,-0.006945,0.013901,-0.005344,0.022541,-0.019252,0.024952,-0.005151,0.007677,-0.001113,0.011767,-0.002323,-0.01123,0.0013,0.003953,-0.007738,-0.003068,-0.010008,-0.004464,0.004149,-0.002037,-0.014858,0.011594,-0.006096,-0.008228,0.019046,-0.001743,-0.002748,0.003867,0.001787,0.023379,0.000696,0.00366,0.003319,-0.011378,0.022503,0.000559,0.006824,0.006366,0.010075,0.004468,-0.003759,-0.006353,-0.029037,0.019419,-0.023796,0.013552,-0.008152,0.007197,0.004987,0.003728,0.006133,0.008269,0.008479,0.00612,-0.023127,-0.0126,0.002558,-0.021814,0.027417,0.005371,-0.003589,0.005786,-0.014635,0.018341,-0.003121,0.006018,0.018851,0.01104,-0.021877,-0.004835,0.019864,-0.000397,-0.007807,0.009183,-0.006709,-0.002768,0.029296,-0.003871,-0.000575,-0.012479,0.00489,0.023094,0.013396,-0.026432,0.006571,-0.000186,0.002575,-0.007381,0.014821,-0.010161,0.009179,-0.012276,0.016624,0.000203,0.015883,-0.003842,0.001717,0.018,0.002103,0.00119,-0.013259,0.004317,0.003305,-0.009963,0.000248,-0.006219,-0.005504,-0.012631,0.021161,-0.013545,0.003748,0.001858,-0.0027,-0.016487,-0.014337,-0.013673,-0.008969,0.002243,-0.004917,0.00461,0.019599,0.011536,0.013265,-0.018489,-0.003279,0.021542,0.004779,-0.010109,-0.007909,-0.011632,0.011815,0.004837,0.0024,-0.005271,-0.011838,-0.00092,0.000854,0.016514,-0.00266,0.005007,0.00745,0.002366,0.0186,0.012341,-0.003923,0.006543,-0.005612,-0.017796,-0.000585,-0.018443,0.021079,-0.020229,-0.014831,-0.004145,-0.001142,0.026174,-0.009629,-0.008576,0.007016,0.018397,0.003416,0.017813,0.001769,-0.000691,0.005052,-0.022161,-0.008082,0.002377,-0.015207,-0.003983,0.01159,-0.001088,0.020586,-0.013535,0.00733,-0.007836,-0.01056,0.013222,-0.002292,-0.015976,-0.000667,-0.005512,-0.007929,-0.007196,-0.004776,0.003071,-0.020419,0.013903,-0.011624,0.004576,0.014348,-0.000237,-0.015393,-0.012079,0.021398,0.001066,0.006931,-0.000196,-0.004139,0.017844,-0.00135,0.009227,0.027137,-0.00218,-0.014655,0.015815,0.007232,0.015476,-0.001738,0.009113,-0.009466,0.013402,-0.002148,-0.003952,0.01799,-0.02201,-0.006022,0.004883,-0.006745,0.006215,-0.001315,-0.012486,0.017581,-0.010556,0.009122,-0.007456,-0.007119,-0.001699,-0.006113,0.015474,0.001463,0.007989,0.009898,0.017991,-0.022558,0.010675,0.005857,-0.00322,5.3e-05,-0.007513,0.019841,0.021682,-0.020227,0.001476,0.00465,-0.010603,-0.000549,0.024183,0.013865,0.01075,-0.002404,-0.005597,-0.01739,-0.007033,-0.005543,-0.008435,-0.009834,-0.001851,-0.021076,-0.012153,0.001812,0.004794,0.006246,0.006409,-0.005728,0.016663,-0.002259,0.007373,-0.017416,-0.012376,0.352941,0.352941,0.823529,3.647059
--U98MNlDym2cLn36BBPgQ,-0.009473,0.007201,0.007892,0.007558,0.01489,-0.01304,0.000998,0.003142,0.01251,-0.009641,-0.008363,-0.001879,0.008971,-0.012477,-0.007306,0.015207,-0.00671,0.006058,0.015347,0.013466,0.008532,-0.009809,-0.014297,-0.010219,-0.009368,-0.008026,0.000225,-0.01447,0.011642,-0.021837,-0.003795,0.008734,-0.006346,0.012801,-0.008032,0.008933,-0.007662,-0.003781,0.015866,-0.004885,-0.00342,0.001984,-0.00896,-0.008293,0.000109,0.000424,-0.0179,-0.008771,0.005597,-0.006098,-0.013686,-0.01124,0.000259,-0.018127,0.013997,-0.007821,-0.006437,0.020163,-0.001049,0.008458,0.000157,0.005871,0.010256,0.005336,0.000152,-0.003606,-0.004147,0.003571,0.008488,0.002955,-0.023666,0.002256,-0.016963,0.0199,-0.035946,0.001264,-0.003626,-0.007445,0.001252,-0.009165,0.00174,-0.002499,0.029358,0.009762,-0.014283,-0.014628,0.009347,-0.005357,0.023107,0.017772,0.005322,0.000268,-0.019699,0.010072,-0.007535,0.002688,0.019548,0.005258,-0.005424,-0.012128,0.019191,0.003372,0.003562,0.009147,-0.018953,0.008326,0.021718,0.010737,0.010248,-0.016259,-0.01494,0.014091,0.015749,-0.034193,0.011758,-0.002841,0.01151,-0.002001,0.00191,-0.005489,0.014577,-0.011231,0.018701,0.008329,0.006234,-0.020305,0.000372,0.016223,-0.003292,-0.00572,-0.02751,0.007529,-0.002863,-0.000749,0.006188,-0.001027,0.003565,-0.002575,0.008592,-0.012345,0.003579,0.003061,0.010515,-0.008463,-0.0052,-0.0027,0.001463,-0.00914,-0.017576,0.00618,0.006529,-0.005442,-0.000809,-0.014269,-0.000472,0.00848,0.016967,-0.023097,0.005209,-0.016292,0.00596,0.00741,0.003231,-0.018636,0.007962,0.00977,-0.014006,0.013873,-0.00222,0.001197,-0.000364,-0.004451,0.005394,0.019312,-0.010599,0.00819,0.009584,-0.016238,0.00414,-0.005972,0.021776,-0.016622,-0.010791,-0.004947,-0.023165,0.019233,-0.000613,-0.007492,-0.000699,0.013902,0.005464,0.012575,-0.004372,0.004327,-0.003115,-0.012252,0.006913,0.005236,-0.019368,-0.005391,0.005623,0.006432,0.022029,-0.009159,-0.006492,0.002881,-0.014241,0.004627,-0.004803,-0.005085,0.000408,-0.006104,-0.007565,0.004088,-0.011873,0.002603,-0.019869,0.010066,-0.00583,0.006663,0.010104,-0.000855,-0.022315,-0.017786,0.022162,-0.009721,0.018821,0.002622,-0.012028,0.021037,-0.006022,-0.005586,0.0033,-0.014013,-0.007531,0.006539,0.007552,0.012004,-0.007375,0.009872,-0.011202,0.012094,-0.004685,-0.006495,0.006865,-0.006358,0.005425,-0.00094,-0.006885,-0.007055,0.005787,-0.00611,0.008055,0.012035,0.000168,-0.004976,-0.021966,0.012122,-0.006307,0.000289,-0.006792,0.013264,0.010632,0.006377,-0.004647,-0.008811,0.006545,0.003384,0.008636,-0.012588,0.002656,0.027923,-0.012597,-0.002731,-0.003223,-0.015098,-0.0003,0.005892,0.00719,-0.00083,-0.006708,0.00201,-0.015133,-0.004134,0.013898,0.000469,-0.004306,-0.002744,-0.005321,-0.002581,0.004612,0.004995,-0.002042,0.003872,-0.00744,0.010921,-0.007165,0.010107,-0.014089,-0.010551,0.0,0.0,2.0,3.0
--j-kaNMCo1-DYzddCsA5Q,0.009633,0.017234,0.02172,0.004169,-0.017047,0.021239,0.022135,0.018492,-0.025737,-0.037928,0.01828,-7e-06,-0.005802,-0.015352,-0.006522,0.002334,0.010471,0.027408,0.026609,0.010042,0.012227,-0.007788,-0.00143,0.034607,0.004424,-0.021396,-0.001293,-0.028477,-0.009573,-0.003362,0.007298,-0.019318,0.00776,0.035141,-0.02902,0.01573,-0.021251,-0.026926,-0.007267,-0.027688,-0.016911,-0.008173,-0.011411,0.016458,0.021287,0.015095,-0.011617,-0.017024,0.012681,-0.017895,0.001415,-0.0019,-0.016979,-0.020762,0.025932,0.00695,0.020581,0.009177,-0.020471,0.01232,0.013417,-0.006696,-0.021582,-0.010728,0.029076,-0.035886,0.026735,0.041019,0.028155,-0.004064,0.008174,-0.000281,-0.017237,0.012496,-0.016756,0.013198,-0.005503,0.0031,-0.00878,0.015854,0.022162,0.001933,-0.006392,-0.011483,-0.022019,0.008398,0.033826,0.003696,0.038147,-0.002722,0.012157,0.001011,0.008341,0.03413,-0.012817,0.010063,0.032218,0.00091,-0.007468,-0.02715,0.002558,0.022478,0.006398,-0.028837,-0.01712,-0.008066,-0.004648,0.017424,-0.038165,0.012726,-0.014071,0.006842,0.024031,-0.011678,-0.026158,-0.008592,-0.004284,-0.020306,0.015129,-0.001771,0.01084,0.032584,0.017343,-0.010482,-0.003187,-0.01053,0.012006,0.023894,0.000882,-0.00435,0.000148,0.022759,-0.033508,-0.022849,0.040028,-0.027479,-0.000731,0.025781,0.006088,0.015409,-0.017167,-0.02443,0.005184,-0.027714,-0.01012,-0.024396,-0.016754,-0.015322,-0.021488,-0.020994,0.025677,-0.001122,-0.005009,-0.032246,-0.017809,0.066689,-0.020374,-0.008658,-0.020044,0.021057,0.027508,0.030709,0.0053,-0.010031,-0.003083,-0.003467,0.04925,0.006947,0.020152,0.011769,0.002683,0.023202,-0.012936,0.004806,-0.012107,-0.012848,-0.019727,0.000232,0.003374,-0.008491,0.014565,0.004403,-0.006044,0.005692,-0.038754,-0.006393,-0.011121,0.025353,0.000418,0.002281,-0.004344,0.024774,0.011273,-0.014918,-0.017253,-0.007505,-0.000373,-0.024779,-0.021933,-0.001134,-0.010386,0.01263,0.000258,-0.003047,0.02557,-0.01956,-0.022088,0.030279,0.007202,0.00806,-0.018217,-0.02514,0.012217,-0.014579,-0.021562,-0.002347,0.015548,0.018797,0.029962,-0.012895,0.022535,-0.025045,0.019349,-0.027831,0.051708,0.022441,0.007402,-0.007486,0.018309,0.007561,0.003333,0.005273,0.020453,-0.012347,-0.002165,0.017393,-0.028464,0.013932,-0.003051,-0.003757,0.022351,-0.00033,-0.002272,0.007608,0.010865,-0.009705,-0.002934,0.012057,0.023419,0.007891,-0.049629,-0.019722,0.045787,-0.032095,-0.001229,-0.017484,0.008668,-0.001412,-0.019967,-0.006188,0.021955,0.014569,0.032061,0.025408,-0.027979,-0.023355,0.041538,-0.018833,0.020042,0.002688,0.012669,0.016015,-0.010872,0.032645,0.003781,-0.025301,-0.002471,-0.004316,0.016233,0.030151,-0.014278,0.02077,0.0053,0.004483,-0.004641,-0.002624,-0.029591,-0.004469,0.005594,-0.017324,-0.001699,0.026008,-0.014737,-0.005179,0.027793,0.034811,0.00841,0.016346,0.004344,0.00249,0.0,0.0,0.0,5.0
--wIGbLEhlpl_UeAIyDmZQ,-0.005456,0.009802,0.01644,0.004111,0.026086,0.012159,-0.01191,0.019884,-0.000267,0.009167,-0.022002,-0.000376,0.022163,-0.012592,-0.016783,0.007154,0.030941,0.010765,0.013459,0.011096,0.019684,0.008331,0.005821,-0.019253,0.01065,-0.026495,0.0028,0.004084,0.015505,-0.022831,0.009739,0.000924,-0.003843,0.001612,-0.001618,-0.007613,-0.009442,-0.005349,-0.002633,-0.003725,0.014835,0.013204,-0.009251,-0.003674,0.004898,0.01651,-0.012062,-2.8e-05,-0.003391,-0.023944,0.000422,-0.012971,0.005639,-0.005384,0.018614,0.013661,0.013891,0.020802,-0.015975,0.002161,0.00921,0.018026,-0.018251,-0.017948,-0.010045,-0.006756,-0.023177,0.017577,0.015901,-0.017156,-0.030951,0.006246,0.004852,0.010155,-0.01865,-0.003744,-0.006849,0.002759,-0.013968,-0.023144,-0.011524,-0.014173,0.01309,-0.008078,0.005399,0.011827,0.011999,0.006736,-0.002318,0.017966,-0.001693,-0.020036,-0.004857,-0.019216,-0.023563,0.013857,-0.003167,0.006324,0.008236,-0.012372,-0.005626,0.007414,0.014332,-0.021266,-0.005814,-0.001382,-0.019748,0.001419,-0.005417,-0.0118,-0.002742,-0.008808,0.005969,-0.01634,0.004003,-0.02227,-0.001811,0.005603,-0.000967,-0.010754,0.009281,0.010762,-0.00423,0.017744,-0.005627,-0.021057,0.004062,-0.009549,-0.026366,-0.012437,-0.018023,-0.005857,-0.004375,-0.009411,0.006758,-0.012338,-0.016047,0.016621,-0.00795,0.005086,-0.009438,-0.015627,-0.008897,0.009759,0.020124,0.006253,0.010768,-0.02042,0.003077,-0.00804,0.011339,0.002316,-0.005414,-0.004821,-0.001869,0.015924,-0.002956,-0.025049,-0.01275,0.001372,0.007707,0.002852,0.004081,0.003575,0.034528,0.028531,0.005457,-0.00511,0.004155,-0.012019,0.001941,0.01011,-0.006044,0.007181,-0.005672,0.018773,0.018902,-0.000433,0.012279,0.005938,0.000586,0.001359,-0.018573,-0.009612,-0.008295,-0.010517,-0.014956,-0.003387,-0.014494,0.001876,0.012112,0.002879,-0.011921,0.020464,-0.029488,0.016904,0.00712,-0.003552,-0.028669,0.002088,-0.001001,-0.004917,-0.002217,-0.009109,-0.003918,0.003644,-0.004833,0.00502,-0.0047,0.010091,-0.002118,-0.008017,-0.007088,0.03099,-0.014254,0.011027,-0.01389,0.008267,0.009554,0.005332,0.012902,-0.000232,0.007208,-0.018913,0.008089,-0.001019,0.017901,-0.008769,0.009083,0.007134,0.0089,0.008112,0.021203,-0.013173,0.005674,0.021077,-0.001882,-0.003455,0.003263,-0.011494,-0.002357,-0.013575,-0.014636,-0.001604,-0.009959,-0.00911,0.016737,-0.008355,0.0031,0.002337,-0.000418,-0.005265,0.00794,0.003551,-0.012436,-0.008322,0.017082,0.011061,-0.029862,-0.019501,-0.002858,0.001951,-0.000122,0.003853,0.00827,-0.023261,0.012332,-0.009129,0.000946,-0.011941,0.022079,-0.005741,-0.000776,-0.00152,0.009702,-0.007817,0.001017,-0.014863,0.009951,-0.000982,-0.013537,0.002709,0.014539,-0.007789,0.031251,0.011054,-0.014146,-0.003772,0.002339,0.011464,-0.006964,0.023242,0.004164,0.004424,-0.001646,0.005231,-0.00919,-0.002553,0.002079,-0.02935,0.666667,0.166667,3.0,3.833333
-000aQFeK6tqVLndf7xORg,0.010634,0.02208,0.014209,-0.017468,0.022044,-0.0014,-0.002871,0.023146,0.002384,0.001417,-0.010938,0.005055,0.027587,-0.017047,-0.013811,-0.002557,0.041712,0.002163,0.028847,0.000981,0.024016,0.0029,-0.005201,-0.009223,-0.002743,-0.017438,-0.003572,0.006089,0.010395,-0.013209,0.012886,-0.001177,-0.014673,0.022489,-0.009622,-0.008018,-0.018578,-0.016016,-0.012349,-0.014922,0.011807,0.012447,-0.011533,0.00777,0.021696,0.018031,-0.009846,0.018853,-0.006351,-0.024626,-0.005112,-0.009186,0.006745,-0.015911,0.015193,0.013527,0.023132,0.027268,-0.010801,0.012058,0.004636,0.011673,-0.029823,-0.026776,-0.001081,-0.013225,-0.019039,0.0359,0.018251,-0.018557,-0.02795,0.005416,-0.006096,0.014236,-0.016713,0.007266,-0.006211,0.013172,-0.027839,-0.01764,-0.010107,-0.012879,0.005412,-0.002554,0.003613,-0.000204,0.0191,0.015422,0.012548,0.025712,0.003411,-0.020152,0.010858,-0.018758,-0.026953,0.018693,-0.008354,0.01096,-0.000944,-0.027852,-5.5e-05,-0.001036,0.008829,-0.023901,-0.004498,-0.012934,-0.001759,0.00228,-0.010393,-0.001095,0.009954,0.000122,0.017624,-0.028894,0.008357,-0.017014,-0.009783,0.018886,-0.009965,-0.015121,0.00761,0.025632,-0.005411,0.021959,-0.012702,-0.042302,0.004007,-0.010247,-0.025184,-0.011162,-0.01954,0.015698,-0.020799,-0.011743,0.032221,-0.039323,-0.014192,0.035332,-0.019248,0.007542,-0.012299,-0.021357,0.001803,0.011896,0.01258,-0.007564,0.010964,-0.029188,-0.01086,-0.021306,0.014168,-0.00815,-0.006527,0.007753,-0.003765,0.011067,-0.004922,-0.023077,-0.007052,0.015688,0.018018,-0.000261,0.004033,0.011681,0.028107,0.016449,0.012073,-0.003362,0.001624,-0.013004,-0.00061,0.007606,0.008367,0.000196,-0.00947,0.017505,-0.002442,0.008911,0.010356,0.003271,-0.006186,-0.001911,-0.012913,0.008914,-0.023485,-0.0192,-0.031872,-0.007105,-0.024002,-0.002116,0.016141,0.007223,-0.011142,0.018811,-0.043218,0.011272,0.010321,-0.018093,-0.021208,0.003711,-0.0045,-0.006089,-0.001891,-0.007828,-0.010498,-0.001914,-0.00019,0.015799,-0.006593,0.032376,-0.01564,-0.002543,9e-05,0.024382,-0.024875,0.017178,-0.026717,0.00886,0.015729,0.002536,0.019144,0.001338,0.00015,-0.023235,0.018891,-0.008904,0.031017,-0.009575,0.007091,0.021063,0.010742,0.015952,0.016892,-0.020155,0.000906,0.037056,-0.00926,-0.009961,0.015665,-0.015672,0.006226,-0.025471,-0.019,0.001999,-0.01858,-0.010708,0.013912,-0.021817,-0.001499,0.009913,-0.022437,-0.002935,0.003617,-0.007614,-0.016705,-0.02221,0.031644,3.8e-05,-0.042605,-0.019705,0.005296,0.004769,-0.001515,0.01438,0.000286,-0.025709,0.018251,-0.010956,-0.004383,-0.01399,0.013778,-0.006711,0.002063,0.009063,-0.001155,-0.017032,0.008638,-0.022255,0.014825,0.002668,-0.017494,0.010238,0.009991,-0.01573,0.032765,0.01212,-0.005577,-0.006436,0.006122,0.005328,-1.9e-05,0.027976,0.005147,0.000412,0.008574,0.0186,-0.008427,-0.000187,0.014652,-0.030129,0.666667,0.0,0.0,5.0


In [21]:
all_features_business.describe()

Unnamed: 0,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars
count,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13943.0,13943.0,13943.0,13943.0
mean,-0.003064,0.013271,0.009814,0.00503,0.010627,-0.001113,0.002001,0.01039,-0.00032,-0.006723,-0.00532,-0.005815,0.007759,-0.009522,-0.008265,0.005916,0.004258,0.006717,0.012721,0.014328,0.008059,0.000892,-0.00348,-0.003057,0.003986,-0.009438,0.000703,-0.008268,0.007722,-0.017623,0.001843,-0.00093,-0.001647,0.019616,-0.00692,0.003965,-0.013586,-0.003097,-0.000797,-0.005616,0.002817,-0.001244,-0.006199,0.002333,0.00749,0.010137,-0.01208,-0.007829,-0.000653,-0.007427,-0.009152,-0.004206,-0.001295,-0.013134,0.011092,0.003128,0.00845,0.010707,-0.004164,0.009868,0.007101,0.008359,-0.008024,-0.008058,0.00878,-0.006673,-0.005685,0.01201,0.014049,-0.006189,-0.011193,0.001834,-0.011907,0.011114,-0.011875,0.004391,-0.009875,0.002256,-0.00572,-0.005301,0.002592,-0.003507,0.010428,-0.002655,-0.012499,0.001018,0.016215,-0.002295,0.015542,0.003795,-0.00212,-0.010085,-0.002769,0.003489,-0.007287,0.005785,0.005239,0.003826,-0.007431,-0.006113,0.003796,0.002484,0.001984,-0.006809,-0.011613,-0.006055,0.006124,-0.000657,-0.008005,-0.006148,-0.001333,0.007654,0.014817,-0.019549,0.002022,-0.011361,-0.00406,0.001153,0.004859,-0.00653,0.009043,0.009244,0.007514,0.004294,0.000595,-0.014303,0.004017,0.007292,-0.007148,-0.002959,-0.013875,0.006336,-0.009457,-0.007612,0.006327,-0.009667,-0.005108,0.004141,0.003471,-0.00251,-0.006891,-0.009714,-0.004066,-0.001494,0.001505,-0.005379,-0.001923,-0.01549,-0.006395,-0.00712,0.00812,0.008524,0.002329,-0.008556,-0.004764,0.020033,-0.007436,-0.014332,-0.010447,0.001724,0.015051,0.002115,0.005672,-0.003593,0.009698,0.00842,0.009863,0.003763,0.003922,0.002519,-0.002151,0.011328,0.003406,0.007475,-0.00489,0.007277,0.002408,-0.00618,0.002133,-0.008397,0.009599,-0.000802,-0.010138,0.00096,-0.010933,0.009575,-0.018738,-0.000841,-0.007573,0.002059,0.004265,0.013202,-0.000846,0.008352,-0.015107,-0.001042,0.003079,-0.002016,-0.014228,-0.001775,-0.001738,-6.7e-05,0.003867,-0.006766,0.000148,-0.006928,-0.007547,0.013669,-0.001067,0.003061,-0.003466,-0.002121,-0.005304,0.003852,-0.013774,0.007965,-0.014116,0.01098,6.7e-05,0.006122,0.014892,-0.004645,-0.002933,-0.023498,0.023194,-0.002134,0.017847,-0.004226,0.004651,0.009921,0.005313,0.005632,0.022859,-0.006774,-0.007394,0.016785,-0.000706,0.004866,-0.003649,0.000424,-0.003166,-0.001598,-0.006525,-0.000646,0.00281,-0.011868,0.009359,-0.006614,-0.000138,0.006705,-0.010223,-0.010017,0.011592,-0.000926,-0.003894,-0.012812,0.007333,-0.000477,-0.01864,-0.007945,0.003704,0.007203,0.006916,0.008091,-0.007295,-0.00833,0.013342,-0.001083,0.000486,-0.009093,0.014864,0.008571,-0.01146,0.004608,0.006137,-0.009683,0.00154,-0.000861,0.010936,0.004821,-0.011965,-4.3e-05,-0.000589,-0.007936,0.01545,-0.001201,-0.012113,-0.006083,-0.001173,-0.000291,-0.006021,0.011332,-0.001086,0.001629,0.003123,0.01788,-0.008149,0.006232,-0.007356,-0.01373,0.486991,0.423987,1.434996,3.615964
std,0.012574,0.012691,0.013487,0.011095,0.021245,0.016907,0.017223,0.01277,0.01584,0.014542,0.012649,0.018037,0.016,0.00863,0.011354,0.012275,0.0298,0.009474,0.014504,0.013937,0.009808,0.009913,0.015431,0.018432,0.013293,0.011701,0.012123,0.010297,0.012869,0.011821,0.012709,0.013073,0.014403,0.013582,0.013856,0.016305,0.012447,0.012857,0.01268,0.017846,0.010599,0.014781,0.010113,0.011828,0.015224,0.013605,0.011191,0.010592,0.011755,0.013023,0.011626,0.014875,0.011862,0.010413,0.011215,0.00913,0.013108,0.014191,0.01486,0.017919,0.014657,0.015697,0.013521,0.010449,0.015768,0.011161,0.012792,0.015422,0.009833,0.014837,0.018848,0.010213,0.015429,0.014219,0.014298,0.011426,0.010927,0.013764,0.011169,0.012041,0.011255,0.012486,0.015638,0.012546,0.013671,0.014496,0.015246,0.017183,0.013725,0.017746,0.013077,0.014652,0.013447,0.016087,0.012791,0.010634,0.015896,0.012322,0.013207,0.012448,0.012594,0.011329,0.012405,0.022496,0.011583,0.009242,0.019767,0.012708,0.014739,0.014589,0.012295,0.016529,0.011211,0.013053,0.011754,0.013579,0.014731,0.011809,0.012352,0.012342,0.010389,0.018576,0.017107,0.013816,0.012026,0.017056,0.011515,0.013253,0.013614,0.011637,0.013557,0.012294,0.014592,0.010147,0.014395,0.013536,0.017008,0.01866,0.021773,0.015211,0.013368,0.012922,0.018828,0.01452,0.013401,0.010855,0.012483,0.012431,0.014119,0.013193,0.009707,0.014342,0.011141,0.012118,0.01049,0.013089,0.014265,0.012903,0.011055,0.019661,0.010756,0.01145,0.011387,0.012621,0.017064,0.016006,0.014778,0.013955,0.010968,0.011681,0.009931,0.013572,0.015357,0.011429,0.011536,0.013619,0.012767,0.013583,0.01209,0.011902,0.011884,0.013692,0.012744,0.010288,0.014662,0.01843,0.014575,0.013594,0.013225,0.010749,0.011462,0.0118,0.011482,0.015187,0.017512,0.019378,0.0106,0.012213,0.013172,0.011278,0.010627,0.012003,0.014814,0.010535,0.015839,0.010345,0.010264,0.010842,0.009741,0.018139,0.010813,0.013837,0.011864,0.018308,0.010047,0.009261,0.010783,0.010399,0.016454,0.011223,0.009965,0.012264,0.013687,0.01077,0.011726,0.010528,0.014726,0.011782,0.0133,0.014093,0.011001,0.013986,0.013266,0.014947,0.015156,0.009714,0.011817,0.012059,0.012412,0.011963,0.015178,0.019311,0.015687,0.008356,0.017704,0.010752,0.012176,0.013567,0.010724,0.013507,0.015091,0.009027,0.010639,0.014732,0.013494,0.013288,0.016772,0.012381,0.014501,0.017422,0.011337,0.010657,0.009578,0.009371,0.01645,0.021723,0.011894,0.011679,0.013377,0.013417,0.0115,0.016707,0.010082,0.014509,0.010318,0.010081,0.012555,0.019478,0.009747,0.014546,0.011093,0.012609,0.015258,0.009686,0.020021,0.014047,0.014542,0.011661,0.017527,0.012784,0.014536,0.011553,0.010494,0.013065,0.011615,0.011922,0.010597,0.014169,0.01787,0.013494,1.299472,1.070148,2.371442,1.277067
min,-0.083584,-0.048183,-0.082702,-0.108288,-0.115774,-0.086931,-0.10857,-0.062102,-0.086272,-0.086558,-0.078426,-0.127634,-0.058273,-0.055188,-0.065513,-0.067697,-0.088034,-0.040384,-0.068092,-0.050891,-0.043811,-0.052155,-0.076216,-0.115352,-0.05179,-0.068577,-0.066455,-0.074327,-0.055605,-0.101189,-0.106899,-0.080055,-0.084317,-0.070692,-0.069446,-0.086763,-0.074888,-0.076831,-0.071654,-0.081089,-0.056127,-0.059005,-0.070768,-0.10885,-0.079988,-0.054087,-0.073513,-0.060223,-0.093119,-0.085957,-0.078384,-0.111495,-0.061136,-0.113398,-0.05252,-0.050458,-0.064211,-0.052294,-0.070343,-0.061472,-0.069506,-0.073227,-0.0674,-0.063204,-0.092215,-0.070846,-0.063995,-0.072909,-0.038463,-0.077258,-0.101383,-0.070499,-0.111484,-0.065544,-0.079804,-0.06323,-0.092385,-0.05905,-0.055972,-0.0837,-0.066289,-0.086322,-0.084504,-0.095018,-0.074781,-0.075327,-0.048621,-0.105973,-0.062609,-0.077603,-0.063582,-0.070549,-0.064844,-0.08559,-0.062849,-0.069915,-0.083331,-0.055568,-0.08717,-0.069933,-0.080017,-0.06473,-0.058979,-0.102921,-0.089095,-0.061811,-0.100191,-0.079627,-0.082636,-0.145845,-0.0517,-0.075631,-0.055132,-0.086985,-0.067807,-0.077298,-0.072372,-0.049288,-0.068825,-0.089492,-0.06577,-0.079663,-0.063194,-0.078701,-0.092992,-0.087106,-0.052613,-0.138988,-0.064867,-0.078219,-0.088858,-0.07685,-0.084366,-0.079238,-0.07423,-0.072232,-0.077914,-0.080781,-0.08354,-0.097923,-0.087568,-0.076769,-0.189754,-0.067418,-0.065557,-0.083685,-0.071574,-0.083164,-0.109107,-0.075874,-0.049538,-0.062527,-0.066495,-0.073781,-0.06084,-0.086051,-0.071932,-0.064877,-0.077329,-0.082992,-0.048539,-0.081197,-0.065844,-0.073435,-0.087719,-0.126048,-0.066422,-0.058022,-0.042334,-0.057865,-0.050671,-0.063593,-0.084013,-0.085433,-0.068843,-0.058188,-0.083884,-0.0977,-0.112267,-0.066147,-0.06208,-0.064511,-0.092434,-0.102401,-0.083442,-0.079474,-0.101518,-0.105695,-0.066116,-0.044844,-0.075232,-0.034519,-0.095657,-0.07663,-0.094195,-0.09083,-0.057373,-0.081108,-0.076312,-0.084817,-0.05352,-0.056505,-0.07046,-0.050453,-0.081961,-0.062613,-0.060985,-0.046992,-0.049998,-0.063819,-0.06686,-0.095871,-0.069913,-0.07738,-0.082403,-0.038838,-0.065892,-0.06159,-0.082533,-0.056202,-0.036492,-0.074352,-0.064665,-0.082717,-0.034178,-0.072146,-0.065978,-0.069489,-0.059326,-0.060248,-0.051783,-0.091879,-0.04766,-0.07206,-0.083009,-0.028998,-0.075281,-0.058935,-0.085736,-0.087183,-0.080319,-0.071016,-0.080142,-0.050536,-0.070369,-0.131792,-0.061645,-0.075445,-0.059933,-0.067897,-0.106021,-0.06684,-0.091128,-0.08807,-0.074678,-0.08998,-0.055124,-0.106685,-0.101585,-0.121885,-0.065813,-0.049703,-0.048747,-0.058144,-0.105451,-0.080401,-0.068748,-0.083766,-0.086956,-0.1387,-0.047121,-0.084898,-0.090269,-0.062505,-0.046821,-0.057777,-0.107601,-0.093094,-0.04623,-0.071981,-0.074638,-0.130865,-0.068574,-0.071326,-0.096004,-0.080342,-0.109352,-0.06719,-0.092592,-0.087325,-0.147928,-0.056686,-0.077646,-0.065126,-0.046775,-0.046602,-0.076124,-0.056595,-0.089082,-0.096181,0.0,0.0,0.0,1.0
25%,-0.011047,0.005704,0.002057,-0.001162,-0.003995,-0.013807,-0.009842,0.00228,-0.009508,-0.016474,-0.012641,-0.016447,-0.005067,-0.014398,-0.01494,-0.001582,-0.022835,0.001037,0.003752,0.005267,0.001656,-0.00561,-0.013774,-0.014248,-0.004736,-0.017333,-0.006562,-0.014543,-0.000189,-0.024654,-0.006131,-0.009696,-0.010131,0.011981,-0.015955,-0.007725,-0.021689,-0.010412,-0.008177,-0.0188,-0.003462,-0.012058,-0.012795,-0.004974,-0.002849,0.001326,-0.018304,-0.014664,-0.007698,-0.015514,-0.01718,-0.014092,-0.008737,-0.01946,0.004114,-0.002157,-0.001116,0.001324,-0.01275,-0.002786,-0.002367,-0.002003,-0.016963,-0.014185,-0.000787,-0.013448,-0.014193,0.001253,0.007743,-0.016647,-0.022909,-0.003984,-0.022926,0.002811,-0.020874,-0.002882,-0.016172,-0.00671,-0.013486,-0.012478,-0.004289,-0.011952,0.001653,-0.009752,-0.021436,-0.008986,0.005373,-0.015982,0.007111,-0.008244,-0.010466,-0.020511,-0.011879,-0.007619,-0.015974,-0.000517,-0.004708,-0.003255,-0.016061,-0.013757,-0.004644,-0.004398,-0.006519,-0.023414,-0.018615,-0.011481,-0.008272,-0.008243,-0.016469,-0.016678,-0.009064,-0.003131,0.008687,-0.027614,-0.004317,-0.021021,-0.012721,-0.007292,-0.002488,-0.014033,0.003235,-0.005907,-0.004343,-0.004012,-0.007174,-0.026132,-0.002868,-0.00248,-0.016639,-0.010279,-0.022394,-0.000233,-0.019461,-0.013093,-0.002783,-0.017456,-0.017561,-0.010013,-0.013727,-0.013652,-0.014504,-0.018622,-0.011732,-0.012173,-0.007957,-0.012042,-0.01022,-0.023647,-0.014651,-0.015055,0.002094,-0.000721,-0.003687,-0.017369,-0.010935,0.011669,-0.01679,-0.02224,-0.016874,-0.012067,0.008129,-0.004498,-0.001175,-0.011995,-0.002183,-0.00034,0.000348,-0.006334,-0.003104,-0.004747,-0.008264,0.00304,-0.007221,0.001055,-0.011585,-0.001436,-0.005622,-0.01398,-0.005644,-0.016781,0.001996,-0.010992,-0.017311,-0.005523,-0.019912,-0.004943,-0.028406,-0.009608,-0.016233,-0.005188,-0.002175,0.005832,-0.007517,-0.001715,-0.028283,-0.016314,-0.003955,-0.009178,-0.021955,-0.008351,-0.008601,-0.006972,-0.005233,-0.01323,-0.009535,-0.012968,-0.013708,0.007173,-0.007144,-0.010048,-0.009622,-0.01008,-0.012333,-0.009246,-0.019356,0.002084,-0.020758,0.004624,-0.012005,-0.000385,0.008811,-0.011584,-0.012753,-0.029865,0.01633,-0.007654,0.007772,-0.011472,-0.003075,0.000686,-0.001073,-0.003557,0.014632,-0.015979,-0.017751,0.010715,-0.007386,-0.002794,-0.010916,-0.006231,-0.01351,-0.016188,-0.017097,-0.005677,-0.010669,-0.018296,0.001997,-0.015883,-0.006943,-0.001656,-0.018427,-0.015054,0.005327,-0.009387,-0.012994,-0.02119,-0.00483,-0.007981,-0.028222,-0.01916,-0.002989,0.000431,0.001314,0.002222,-0.018487,-0.023525,0.005611,-0.007868,-0.007073,-0.016213,0.007427,-0.003003,-0.018163,-0.004812,0.000357,-0.016093,-0.004923,-0.015839,0.005216,-0.004672,-0.01841,-0.005725,-0.012568,-0.013484,-8.7e-05,-0.010836,-0.021065,-0.013971,-0.014057,-0.008244,-0.011715,0.004486,-0.007152,-0.005825,-0.004292,0.010479,-0.013931,-0.003333,-0.021107,-0.022517,0.0,0.0,0.15251,3.0
50%,-0.002658,0.012379,0.010436,0.006015,0.010233,0.000163,0.001477,0.00981,-0.001206,-0.007999,-0.005203,-0.006089,0.009373,-0.008939,-0.007374,0.006492,0.003939,0.006416,0.011701,0.01557,0.007658,0.00054,-0.004754,-0.003026,0.003099,-0.009317,0.001174,-0.008948,0.007178,-0.017525,0.001988,-0.001564,-0.00111,0.020149,-0.007503,0.004415,-0.014125,-0.002935,-0.000489,-0.008182,0.002553,-0.001784,-0.006986,0.002087,0.009229,0.010038,-0.011713,-0.008454,-0.000526,-0.00641,-0.009069,-0.003945,-0.001933,-0.013888,0.011175,0.003155,0.007651,0.010264,-0.004809,0.009416,0.006283,0.006794,-0.007894,-0.008055,0.010385,-0.006726,-0.005295,0.011657,0.013235,-0.005708,-0.009874,0.001877,-0.011782,0.012617,-0.012106,0.00405,-0.009966,0.001053,-0.006024,-0.00437,0.002593,-0.003452,0.011099,-0.00303,-0.014262,0.000137,0.014266,-0.002177,0.016125,0.00391,-0.002714,-0.010276,-0.002581,0.004077,-0.00694,0.005837,0.006849,0.003959,-0.007457,-0.005482,0.003961,0.00284,0.001468,-0.009959,-0.011296,-0.005431,0.003756,-0.000196,-0.007042,-0.006421,-0.001724,0.006567,0.014889,-0.020908,0.003097,-0.011575,-0.003561,0.000224,0.005921,-0.006236,0.009569,0.010081,0.008745,0.00392,0.000739,-0.014368,0.003586,0.006873,-0.007018,-0.002702,-0.014756,0.00681,-0.010264,-0.007308,0.006319,-0.009394,-0.005344,0.005452,0.002174,-0.001973,-0.006749,-0.009807,-0.003456,-0.001577,0.000803,-0.005833,-0.001952,-0.014674,-0.006533,-0.005757,0.008223,0.008263,0.003413,-0.009073,-0.004498,0.019216,-0.007722,-0.014306,-0.009905,-0.000976,0.014548,0.002036,0.006008,-0.004115,0.009206,0.009507,0.008844,0.002813,0.003708,0.002262,-0.0026,0.011126,0.003814,0.007801,-0.004684,0.007264,0.001929,-0.006471,0.001855,-0.008353,0.009856,-0.001242,-0.010117,0.000503,-0.011065,0.008917,-0.017407,-0.002077,-0.006338,0.002239,0.00442,0.012682,-0.001081,0.007653,-0.016276,0.003136,0.002883,-0.001888,-0.014103,-0.001709,-0.001297,0.000299,0.005511,-0.00756,0.00056,-0.006938,-0.008117,0.01397,-0.001419,0.002501,-0.003548,-0.002237,-0.005639,0.00296,-0.013334,0.007439,-0.014228,0.010718,0.000298,0.006759,0.014607,-0.004458,-0.002849,-0.023027,0.022793,-0.001804,0.015627,-0.003757,0.004383,0.011346,0.004664,0.005108,0.023254,-0.007889,-0.007686,0.01589,0.000177,0.005623,-0.003809,0.001445,-0.004063,-0.005099,-0.004762,-0.000665,0.00072,-0.011613,0.008923,-0.005837,-0.000885,0.005195,-0.00897,-0.01001,0.011394,-0.000108,-0.002784,-0.011845,0.006356,-0.000602,-0.018049,-0.008284,0.003302,0.007851,0.007237,0.008307,-0.007468,-0.010933,0.013694,-0.001725,0.000795,-0.007967,0.014468,0.00963,-0.011637,0.003581,0.005408,-0.009879,0.001196,-0.004268,0.010715,0.004136,-0.011368,0.000207,-0.000183,-0.007313,0.016607,-0.000991,-0.012075,-0.006378,0.000164,-0.00015,-0.004814,0.011287,-0.000715,0.001593,0.002437,0.017351,-0.007808,0.005315,-0.009506,-0.013766,0.076923,0.0,1.0,4.0
75%,0.00478,0.020252,0.018238,0.011961,0.024504,0.010952,0.013728,0.018753,0.008473,0.002566,0.002315,0.005706,0.02001,-0.004428,-0.000945,0.014176,0.02823,0.012277,0.021024,0.024249,0.013947,0.006965,0.005988,0.008122,0.012346,-0.001133,0.008306,-0.002465,0.014762,-0.010677,0.009949,0.007774,0.007443,0.027952,0.00169,0.015935,-0.006209,0.004651,0.006993,0.007801,0.009068,0.009294,-0.000185,0.009499,0.018319,0.01897,-0.005984,-0.001299,0.006212,0.001053,-0.001523,0.005973,0.006287,-0.007256,0.018077,0.00861,0.016983,0.020108,0.003506,0.022301,0.015294,0.018426,0.00145,-0.001891,0.019104,0.000141,0.002544,0.022492,0.020192,0.004266,0.000867,0.00791,-0.000718,0.020409,-0.003404,0.01111,-0.003541,0.010284,0.002383,0.002389,0.009652,0.004592,0.019919,0.004541,-0.004012,0.010344,0.025779,0.009909,0.024097,0.016593,0.006067,-0.000162,0.005914,0.015438,0.001393,0.012307,0.016222,0.011001,0.001364,0.001998,0.012857,0.009505,0.010062,0.009879,-0.004754,-0.000135,0.022151,0.007231,0.001411,0.003795,0.005767,0.019184,0.021343,-0.01251,0.009272,-0.002017,0.004868,0.00897,0.012643,0.001031,0.015266,0.022555,0.019883,0.013055,0.008717,-0.002486,0.0101,0.017659,0.002954,0.004372,-0.00655,0.013351,0.000635,-0.00154,0.015405,-0.002743,0.007162,0.018292,0.021763,0.008292,0.000272,1.2e-05,0.005635,0.008515,0.010811,0.001168,0.006807,-0.00664,0.001354,0.001833,0.013934,0.017835,0.008929,-0.000355,0.001492,0.027582,0.002745,-0.007325,-0.003433,0.014476,0.021824,0.008913,0.012605,0.004428,0.021251,0.018573,0.018423,0.014375,0.010718,0.00962,0.003435,0.019639,0.01441,0.014523,0.001899,0.016181,0.01043,0.001706,0.009565,-0.000304,0.017232,0.008234,-0.003003,0.007053,-0.002419,0.025621,-0.008146,0.006968,0.001156,0.009441,0.010977,0.019978,0.00588,0.01851,-0.002028,0.013926,0.009997,0.005491,-0.006582,0.005034,0.005619,0.007083,0.013817,-0.001085,0.009861,-0.000692,-0.001448,0.020362,0.004766,0.015251,0.002951,0.005484,0.001482,0.016604,-0.007674,0.013736,-0.007226,0.016925,0.011651,0.013158,0.021094,0.002668,0.006191,-0.017027,0.029109,0.003769,0.026487,0.002829,0.011633,0.019498,0.010888,0.014854,0.031093,0.001177,0.002938,0.021867,0.006578,0.013596,0.003986,0.008195,0.006934,0.013602,0.004534,0.004272,0.015925,-0.005194,0.015947,0.003232,0.006285,0.013643,-0.000879,-0.00461,0.017397,0.008318,0.005989,-0.004453,0.018277,0.007032,-0.008424,0.003893,0.009991,0.014134,0.012832,0.01417,0.003669,0.006186,0.021077,0.005675,0.008557,-0.0008,0.022057,0.021446,-0.004896,0.013021,0.011192,-0.003576,0.008622,0.015484,0.016441,0.014253,-0.004915,0.006751,0.011057,-0.001958,0.030322,0.008285,-0.00374,0.002407,0.011518,0.007738,0.001674,0.018039,0.005256,0.008372,0.009789,0.024422,-0.00215,0.015139,0.005131,-0.00461,0.555556,0.5,1.833333,5.0
max,0.064741,0.084106,0.083579,0.059902,0.114816,0.088511,0.107744,0.064037,0.104323,0.060413,0.059461,0.106553,0.079047,0.05178,0.040758,0.07467,0.148787,0.06985,0.082168,0.069665,0.065093,0.064273,0.075607,0.082578,0.070099,0.073794,0.06368,0.041751,0.096844,0.065913,0.058294,0.065041,0.05861,0.08579,0.08167,0.089162,0.069089,0.066031,0.073925,0.092793,0.061816,0.079833,0.051851,0.087835,0.063425,0.104404,0.075669,0.044585,0.070669,0.056045,0.050594,0.070684,0.070373,0.047868,0.095339,0.04793,0.092214,0.074881,0.118972,0.088602,0.099991,0.106282,0.059356,0.051338,0.104422,0.059245,0.061403,0.090159,0.064966,0.059494,0.086312,0.120114,0.052091,0.077865,0.064638,0.074588,0.05742,0.109389,0.049867,0.050275,0.075055,0.072317,0.097469,0.07698,0.059878,0.062964,0.092983,0.087314,0.143254,0.093507,0.09422,0.064179,0.060462,0.075104,0.051324,0.064009,0.066012,0.212004,0.069685,0.121798,0.08262,0.087511,0.070212,0.109514,0.064896,0.042922,0.085761,0.06725,0.065836,0.058039,0.091415,0.100598,0.134917,0.048666,0.053686,0.064059,0.10636,0.062286,0.071713,0.046965,0.071885,0.081932,0.117653,0.063065,0.059206,0.094772,0.074873,0.065642,0.068973,0.065462,0.069886,0.083824,0.064573,0.082648,0.074903,0.09479,0.103808,0.073123,0.086542,0.067639,0.096775,0.075212,0.123111,0.06972,0.062728,0.057725,0.060153,0.063671,0.091946,0.069615,0.066404,0.093567,0.069819,0.072468,0.073103,0.103464,0.06381,0.088411,0.038714,0.105896,0.071241,0.200819,0.056651,0.08411,0.093192,0.069224,0.098708,0.074043,0.065436,0.078068,0.052464,0.088688,0.076044,0.069463,0.067749,0.071696,0.075246,0.067711,0.066252,0.062688,0.082437,0.068429,0.060225,0.083045,0.073303,0.094942,0.035412,0.074462,0.073156,0.085236,0.098512,0.081254,0.078143,0.08884,0.06778,0.069411,0.091263,0.065147,0.069609,0.086773,0.06136,0.072729,0.074741,0.057549,0.109259,0.0432,0.050935,0.078957,0.05338,0.094015,0.058053,0.07639,0.060163,0.095398,0.060069,0.086257,0.051201,0.07288,0.08125,0.056427,0.076087,0.084965,0.069629,0.031196,0.107125,0.051818,0.136806,0.088436,0.090186,0.087792,0.084302,0.090949,0.093948,0.097882,0.064035,0.079617,0.074585,0.062478,0.051324,0.06004,0.076301,0.112198,0.073741,0.055612,0.087965,0.040769,0.116986,0.056504,0.072346,0.087871,0.053042,0.051203,0.090984,0.072543,0.062687,0.061973,0.132132,0.069153,0.050634,0.070602,0.068575,0.062326,0.061243,0.083753,0.060195,0.166511,0.087642,0.065116,0.068844,0.063331,0.091909,0.069396,0.046898,0.078878,0.086678,0.043481,0.060154,0.111422,0.059133,0.080705,0.049321,0.06431,0.075124,0.037624,0.097773,0.058061,0.101336,0.051585,0.09861,0.059169,0.050044,0.081725,0.057365,0.119919,0.098944,0.084284,0.040886,0.093771,0.080132,0.053972,56.0,28.0,75.0,5.0


# Create Label y (Business categories)

In [22]:
def load_business_df(): 
    filename = r'../../data/business.json'
    new_list = []
    for line in open(filename):
       new_list.append(json.loads(line))
    return pd.DataFrame.from_records(new_list)

dfbusiness = load_business_df()

In [23]:
dfbusiness.head()

Unnamed: 0,address,attributes,business_id,categories,city,hours,is_open,latitude,longitude,name,postal_code,review_count,stars,state
0,2818 E Camino Acequia Drive,{'GoodForKids': 'False'},1SWheh84yJXfytovILXOAQ,"Golf, Active Life",Phoenix,,0,33.522143,-112.018481,Arizona Biltmore Golf Club,85016,5,3.0,AZ
1,30 Eglinton Avenue W,"{'RestaurantsReservations': 'True', 'GoodForMe...",QXAEGFB4oINsVuTFxEYKFQ,"Specialty Food, Restaurants, Dim Sum, Imported...",Mississauga,"{'Monday': '9:0-0:0', 'Tuesday': '9:0-0:0', 'W...",1,43.605499,-79.652289,Emerald Chinese Restaurant,L5R 3E7,128,2.5,ON
2,"10110 Johnston Rd, Ste 15","{'GoodForKids': 'True', 'NoiseLevel': 'u'avera...",gnKjwL_1w79qoiV3IC_xQQ,"Sushi Bars, Restaurants, Japanese",Charlotte,"{'Monday': '17:30-21:30', 'Wednesday': '17:30-...",1,35.092564,-80.859132,Musashi Japanese Restaurant,28210,170,4.0,NC
3,"15655 W Roosevelt St, Ste 237",,xvX2CttrVhyG2z1dFg_0xw,"Insurance, Financial Services",Goodyear,"{'Monday': '8:0-17:0', 'Tuesday': '8:0-17:0', ...",1,33.455613,-112.395596,Farmers Insurance - Paul Lorenz,85338,3,5.0,AZ
4,"4209 Stuart Andrew Blvd, Ste F","{'BusinessAcceptsBitcoin': 'False', 'ByAppoint...",HhyxOkGAM07SRYtlQ4wMFQ,"Plumbing, Shopping, Local Services, Home Servi...",Charlotte,"{'Monday': '7:0-23:0', 'Tuesday': '7:0-23:0', ...",1,35.190012,-80.887223,Queen City Plumbing,28217,4,4.0,NC


# Join x,y (feature matrix, category) using business_id

In [24]:
dfbusiness.columns

Index(['address', 'attributes', 'business_id', 'categories', 'city', 'hours',
       'is_open', 'latitude', 'longitude', 'name', 'postal_code',
       'review_count', 'stars', 'state'],
      dtype='object')

In [25]:
len(dfbusiness['stars'].unique())

9

In [26]:
# Add business details to features df
keep_cols = ['business_id', 'categories', 'review_count']
all_features_business = all_features_business.merge(dfbusiness[keep_cols], how='left', on='business_id') 

In [27]:
all_features_business.head()

Unnamed: 0,business_id,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars,categories,review_count
0,--I7YYLada0tSLkORTHb5Q,-0.005223,0.013203,0.003038,0.005219,-0.015445,-0.0239,0.022235,0.005243,-0.004331,-0.016159,-0.004644,-0.015904,-0.01401,-0.001277,-0.010583,0.018494,-0.032239,0.009304,0.019475,0.020988,0.008241,-0.003566,-0.011737,-0.007616,-0.002773,-0.001232,0.001906,0.003526,0.008889,-0.003706,-0.006945,0.013901,-0.005344,0.022541,-0.019252,0.024952,-0.005151,0.007677,-0.001113,0.011767,-0.002323,-0.01123,0.0013,0.003953,-0.007738,-0.003068,-0.010008,-0.004464,0.004149,-0.002037,-0.014858,0.011594,-0.006096,-0.008228,0.019046,-0.001743,-0.002748,0.003867,0.001787,0.023379,0.000696,0.00366,0.003319,-0.011378,0.022503,0.000559,0.006824,0.006366,0.010075,0.004468,-0.003759,-0.006353,-0.029037,0.019419,-0.023796,0.013552,-0.008152,0.007197,0.004987,0.003728,0.006133,0.008269,0.008479,0.00612,-0.023127,-0.0126,0.002558,-0.021814,0.027417,0.005371,-0.003589,0.005786,-0.014635,0.018341,-0.003121,0.006018,0.018851,0.01104,-0.021877,-0.004835,0.019864,-0.000397,-0.007807,0.009183,-0.006709,-0.002768,0.029296,-0.003871,-0.000575,-0.012479,0.00489,0.023094,0.013396,-0.026432,0.006571,-0.000186,0.002575,-0.007381,0.014821,-0.010161,0.009179,-0.012276,0.016624,0.000203,0.015883,-0.003842,0.001717,0.018,0.002103,0.00119,-0.013259,0.004317,0.003305,-0.009963,0.000248,-0.006219,-0.005504,-0.012631,0.021161,-0.013545,0.003748,0.001858,-0.0027,-0.016487,-0.014337,-0.013673,-0.008969,0.002243,-0.004917,0.00461,0.019599,0.011536,0.013265,-0.018489,-0.003279,0.021542,0.004779,-0.010109,-0.007909,-0.011632,0.011815,0.004837,0.0024,-0.005271,-0.011838,-0.00092,0.000854,0.016514,-0.00266,0.005007,0.00745,0.002366,0.0186,0.012341,-0.003923,0.006543,-0.005612,-0.017796,-0.000585,-0.018443,0.021079,-0.020229,-0.014831,-0.004145,-0.001142,0.026174,-0.009629,-0.008576,0.007016,0.018397,0.003416,0.017813,0.001769,-0.000691,0.005052,-0.022161,-0.008082,0.002377,-0.015207,-0.003983,0.01159,-0.001088,0.020586,-0.013535,0.00733,-0.007836,-0.01056,0.013222,-0.002292,-0.015976,-0.000667,-0.005512,-0.007929,-0.007196,-0.004776,0.003071,-0.020419,0.013903,-0.011624,0.004576,0.014348,-0.000237,-0.015393,-0.012079,0.021398,0.001066,0.006931,-0.000196,-0.004139,0.017844,-0.00135,0.009227,0.027137,-0.00218,-0.014655,0.015815,0.007232,0.015476,-0.001738,0.009113,-0.009466,0.013402,-0.002148,-0.003952,0.01799,-0.02201,-0.006022,0.004883,-0.006745,0.006215,-0.001315,-0.012486,0.017581,-0.010556,0.009122,-0.007456,-0.007119,-0.001699,-0.006113,0.015474,0.001463,0.007989,0.009898,0.017991,-0.022558,0.010675,0.005857,-0.00322,5.3e-05,-0.007513,0.019841,0.021682,-0.020227,0.001476,0.00465,-0.010603,-0.000549,0.024183,0.013865,0.01075,-0.002404,-0.005597,-0.01739,-0.007033,-0.005543,-0.008435,-0.009834,-0.001851,-0.021076,-0.012153,0.001812,0.004794,0.006246,0.006409,-0.005728,0.016663,-0.002259,0.007373,-0.017416,-0.012376,0.352941,0.352941,0.823529,3.647059,"Nightlife, Sports Bars, Restaurants, Bars, Ame...",96
1,--U98MNlDym2cLn36BBPgQ,-0.009473,0.007201,0.007892,0.007558,0.01489,-0.01304,0.000998,0.003142,0.01251,-0.009641,-0.008363,-0.001879,0.008971,-0.012477,-0.007306,0.015207,-0.00671,0.006058,0.015347,0.013466,0.008532,-0.009809,-0.014297,-0.010219,-0.009368,-0.008026,0.000225,-0.01447,0.011642,-0.021837,-0.003795,0.008734,-0.006346,0.012801,-0.008032,0.008933,-0.007662,-0.003781,0.015866,-0.004885,-0.00342,0.001984,-0.00896,-0.008293,0.000109,0.000424,-0.0179,-0.008771,0.005597,-0.006098,-0.013686,-0.01124,0.000259,-0.018127,0.013997,-0.007821,-0.006437,0.020163,-0.001049,0.008458,0.000157,0.005871,0.010256,0.005336,0.000152,-0.003606,-0.004147,0.003571,0.008488,0.002955,-0.023666,0.002256,-0.016963,0.0199,-0.035946,0.001264,-0.003626,-0.007445,0.001252,-0.009165,0.00174,-0.002499,0.029358,0.009762,-0.014283,-0.014628,0.009347,-0.005357,0.023107,0.017772,0.005322,0.000268,-0.019699,0.010072,-0.007535,0.002688,0.019548,0.005258,-0.005424,-0.012128,0.019191,0.003372,0.003562,0.009147,-0.018953,0.008326,0.021718,0.010737,0.010248,-0.016259,-0.01494,0.014091,0.015749,-0.034193,0.011758,-0.002841,0.01151,-0.002001,0.00191,-0.005489,0.014577,-0.011231,0.018701,0.008329,0.006234,-0.020305,0.000372,0.016223,-0.003292,-0.00572,-0.02751,0.007529,-0.002863,-0.000749,0.006188,-0.001027,0.003565,-0.002575,0.008592,-0.012345,0.003579,0.003061,0.010515,-0.008463,-0.0052,-0.0027,0.001463,-0.00914,-0.017576,0.00618,0.006529,-0.005442,-0.000809,-0.014269,-0.000472,0.00848,0.016967,-0.023097,0.005209,-0.016292,0.00596,0.00741,0.003231,-0.018636,0.007962,0.00977,-0.014006,0.013873,-0.00222,0.001197,-0.000364,-0.004451,0.005394,0.019312,-0.010599,0.00819,0.009584,-0.016238,0.00414,-0.005972,0.021776,-0.016622,-0.010791,-0.004947,-0.023165,0.019233,-0.000613,-0.007492,-0.000699,0.013902,0.005464,0.012575,-0.004372,0.004327,-0.003115,-0.012252,0.006913,0.005236,-0.019368,-0.005391,0.005623,0.006432,0.022029,-0.009159,-0.006492,0.002881,-0.014241,0.004627,-0.004803,-0.005085,0.000408,-0.006104,-0.007565,0.004088,-0.011873,0.002603,-0.019869,0.010066,-0.00583,0.006663,0.010104,-0.000855,-0.022315,-0.017786,0.022162,-0.009721,0.018821,0.002622,-0.012028,0.021037,-0.006022,-0.005586,0.0033,-0.014013,-0.007531,0.006539,0.007552,0.012004,-0.007375,0.009872,-0.011202,0.012094,-0.004685,-0.006495,0.006865,-0.006358,0.005425,-0.00094,-0.006885,-0.007055,0.005787,-0.00611,0.008055,0.012035,0.000168,-0.004976,-0.021966,0.012122,-0.006307,0.000289,-0.006792,0.013264,0.010632,0.006377,-0.004647,-0.008811,0.006545,0.003384,0.008636,-0.012588,0.002656,0.027923,-0.012597,-0.002731,-0.003223,-0.015098,-0.0003,0.005892,0.00719,-0.00083,-0.006708,0.00201,-0.015133,-0.004134,0.013898,0.000469,-0.004306,-0.002744,-0.005321,-0.002581,0.004612,0.004995,-0.002042,0.003872,-0.00744,0.010921,-0.007165,0.010107,-0.014089,-0.010551,0.0,0.0,2.0,3.0,"Pizza, Restaurants",4
2,--j-kaNMCo1-DYzddCsA5Q,0.009633,0.017234,0.02172,0.004169,-0.017047,0.021239,0.022135,0.018492,-0.025737,-0.037928,0.01828,-7e-06,-0.005802,-0.015352,-0.006522,0.002334,0.010471,0.027408,0.026609,0.010042,0.012227,-0.007788,-0.00143,0.034607,0.004424,-0.021396,-0.001293,-0.028477,-0.009573,-0.003362,0.007298,-0.019318,0.00776,0.035141,-0.02902,0.01573,-0.021251,-0.026926,-0.007267,-0.027688,-0.016911,-0.008173,-0.011411,0.016458,0.021287,0.015095,-0.011617,-0.017024,0.012681,-0.017895,0.001415,-0.0019,-0.016979,-0.020762,0.025932,0.00695,0.020581,0.009177,-0.020471,0.01232,0.013417,-0.006696,-0.021582,-0.010728,0.029076,-0.035886,0.026735,0.041019,0.028155,-0.004064,0.008174,-0.000281,-0.017237,0.012496,-0.016756,0.013198,-0.005503,0.0031,-0.00878,0.015854,0.022162,0.001933,-0.006392,-0.011483,-0.022019,0.008398,0.033826,0.003696,0.038147,-0.002722,0.012157,0.001011,0.008341,0.03413,-0.012817,0.010063,0.032218,0.00091,-0.007468,-0.02715,0.002558,0.022478,0.006398,-0.028837,-0.01712,-0.008066,-0.004648,0.017424,-0.038165,0.012726,-0.014071,0.006842,0.024031,-0.011678,-0.026158,-0.008592,-0.004284,-0.020306,0.015129,-0.001771,0.01084,0.032584,0.017343,-0.010482,-0.003187,-0.01053,0.012006,0.023894,0.000882,-0.00435,0.000148,0.022759,-0.033508,-0.022849,0.040028,-0.027479,-0.000731,0.025781,0.006088,0.015409,-0.017167,-0.02443,0.005184,-0.027714,-0.01012,-0.024396,-0.016754,-0.015322,-0.021488,-0.020994,0.025677,-0.001122,-0.005009,-0.032246,-0.017809,0.066689,-0.020374,-0.008658,-0.020044,0.021057,0.027508,0.030709,0.0053,-0.010031,-0.003083,-0.003467,0.04925,0.006947,0.020152,0.011769,0.002683,0.023202,-0.012936,0.004806,-0.012107,-0.012848,-0.019727,0.000232,0.003374,-0.008491,0.014565,0.004403,-0.006044,0.005692,-0.038754,-0.006393,-0.011121,0.025353,0.000418,0.002281,-0.004344,0.024774,0.011273,-0.014918,-0.017253,-0.007505,-0.000373,-0.024779,-0.021933,-0.001134,-0.010386,0.01263,0.000258,-0.003047,0.02557,-0.01956,-0.022088,0.030279,0.007202,0.00806,-0.018217,-0.02514,0.012217,-0.014579,-0.021562,-0.002347,0.015548,0.018797,0.029962,-0.012895,0.022535,-0.025045,0.019349,-0.027831,0.051708,0.022441,0.007402,-0.007486,0.018309,0.007561,0.003333,0.005273,0.020453,-0.012347,-0.002165,0.017393,-0.028464,0.013932,-0.003051,-0.003757,0.022351,-0.00033,-0.002272,0.007608,0.010865,-0.009705,-0.002934,0.012057,0.023419,0.007891,-0.049629,-0.019722,0.045787,-0.032095,-0.001229,-0.017484,0.008668,-0.001412,-0.019967,-0.006188,0.021955,0.014569,0.032061,0.025408,-0.027979,-0.023355,0.041538,-0.018833,0.020042,0.002688,0.012669,0.016015,-0.010872,0.032645,0.003781,-0.025301,-0.002471,-0.004316,0.016233,0.030151,-0.014278,0.02077,0.0053,0.004483,-0.004641,-0.002624,-0.029591,-0.004469,0.005594,-0.017324,-0.001699,0.026008,-0.014737,-0.005179,0.027793,0.034811,0.00841,0.016346,0.004344,0.00249,0.0,0.0,0.0,5.0,"Hair Removal, Nail Technicians, Beauty & Spas,...",4
3,--wIGbLEhlpl_UeAIyDmZQ,-0.005456,0.009802,0.01644,0.004111,0.026086,0.012159,-0.01191,0.019884,-0.000267,0.009167,-0.022002,-0.000376,0.022163,-0.012592,-0.016783,0.007154,0.030941,0.010765,0.013459,0.011096,0.019684,0.008331,0.005821,-0.019253,0.01065,-0.026495,0.0028,0.004084,0.015505,-0.022831,0.009739,0.000924,-0.003843,0.001612,-0.001618,-0.007613,-0.009442,-0.005349,-0.002633,-0.003725,0.014835,0.013204,-0.009251,-0.003674,0.004898,0.01651,-0.012062,-2.8e-05,-0.003391,-0.023944,0.000422,-0.012971,0.005639,-0.005384,0.018614,0.013661,0.013891,0.020802,-0.015975,0.002161,0.00921,0.018026,-0.018251,-0.017948,-0.010045,-0.006756,-0.023177,0.017577,0.015901,-0.017156,-0.030951,0.006246,0.004852,0.010155,-0.01865,-0.003744,-0.006849,0.002759,-0.013968,-0.023144,-0.011524,-0.014173,0.01309,-0.008078,0.005399,0.011827,0.011999,0.006736,-0.002318,0.017966,-0.001693,-0.020036,-0.004857,-0.019216,-0.023563,0.013857,-0.003167,0.006324,0.008236,-0.012372,-0.005626,0.007414,0.014332,-0.021266,-0.005814,-0.001382,-0.019748,0.001419,-0.005417,-0.0118,-0.002742,-0.008808,0.005969,-0.01634,0.004003,-0.02227,-0.001811,0.005603,-0.000967,-0.010754,0.009281,0.010762,-0.00423,0.017744,-0.005627,-0.021057,0.004062,-0.009549,-0.026366,-0.012437,-0.018023,-0.005857,-0.004375,-0.009411,0.006758,-0.012338,-0.016047,0.016621,-0.00795,0.005086,-0.009438,-0.015627,-0.008897,0.009759,0.020124,0.006253,0.010768,-0.02042,0.003077,-0.00804,0.011339,0.002316,-0.005414,-0.004821,-0.001869,0.015924,-0.002956,-0.025049,-0.01275,0.001372,0.007707,0.002852,0.004081,0.003575,0.034528,0.028531,0.005457,-0.00511,0.004155,-0.012019,0.001941,0.01011,-0.006044,0.007181,-0.005672,0.018773,0.018902,-0.000433,0.012279,0.005938,0.000586,0.001359,-0.018573,-0.009612,-0.008295,-0.010517,-0.014956,-0.003387,-0.014494,0.001876,0.012112,0.002879,-0.011921,0.020464,-0.029488,0.016904,0.00712,-0.003552,-0.028669,0.002088,-0.001001,-0.004917,-0.002217,-0.009109,-0.003918,0.003644,-0.004833,0.00502,-0.0047,0.010091,-0.002118,-0.008017,-0.007088,0.03099,-0.014254,0.011027,-0.01389,0.008267,0.009554,0.005332,0.012902,-0.000232,0.007208,-0.018913,0.008089,-0.001019,0.017901,-0.008769,0.009083,0.007134,0.0089,0.008112,0.021203,-0.013173,0.005674,0.021077,-0.001882,-0.003455,0.003263,-0.011494,-0.002357,-0.013575,-0.014636,-0.001604,-0.009959,-0.00911,0.016737,-0.008355,0.0031,0.002337,-0.000418,-0.005265,0.00794,0.003551,-0.012436,-0.008322,0.017082,0.011061,-0.029862,-0.019501,-0.002858,0.001951,-0.000122,0.003853,0.00827,-0.023261,0.012332,-0.009129,0.000946,-0.011941,0.022079,-0.005741,-0.000776,-0.00152,0.009702,-0.007817,0.001017,-0.014863,0.009951,-0.000982,-0.013537,0.002709,0.014539,-0.007789,0.031251,0.011054,-0.014146,-0.003772,0.002339,0.011464,-0.006964,0.023242,0.004164,0.004424,-0.001646,0.005231,-0.00919,-0.002553,0.002079,-0.02935,0.666667,0.166667,3.0,3.833333,"Electronics, Professional Services, Local Serv...",14
4,-000aQFeK6tqVLndf7xORg,0.010634,0.02208,0.014209,-0.017468,0.022044,-0.0014,-0.002871,0.023146,0.002384,0.001417,-0.010938,0.005055,0.027587,-0.017047,-0.013811,-0.002557,0.041712,0.002163,0.028847,0.000981,0.024016,0.0029,-0.005201,-0.009223,-0.002743,-0.017438,-0.003572,0.006089,0.010395,-0.013209,0.012886,-0.001177,-0.014673,0.022489,-0.009622,-0.008018,-0.018578,-0.016016,-0.012349,-0.014922,0.011807,0.012447,-0.011533,0.00777,0.021696,0.018031,-0.009846,0.018853,-0.006351,-0.024626,-0.005112,-0.009186,0.006745,-0.015911,0.015193,0.013527,0.023132,0.027268,-0.010801,0.012058,0.004636,0.011673,-0.029823,-0.026776,-0.001081,-0.013225,-0.019039,0.0359,0.018251,-0.018557,-0.02795,0.005416,-0.006096,0.014236,-0.016713,0.007266,-0.006211,0.013172,-0.027839,-0.01764,-0.010107,-0.012879,0.005412,-0.002554,0.003613,-0.000204,0.0191,0.015422,0.012548,0.025712,0.003411,-0.020152,0.010858,-0.018758,-0.026953,0.018693,-0.008354,0.01096,-0.000944,-0.027852,-5.5e-05,-0.001036,0.008829,-0.023901,-0.004498,-0.012934,-0.001759,0.00228,-0.010393,-0.001095,0.009954,0.000122,0.017624,-0.028894,0.008357,-0.017014,-0.009783,0.018886,-0.009965,-0.015121,0.00761,0.025632,-0.005411,0.021959,-0.012702,-0.042302,0.004007,-0.010247,-0.025184,-0.011162,-0.01954,0.015698,-0.020799,-0.011743,0.032221,-0.039323,-0.014192,0.035332,-0.019248,0.007542,-0.012299,-0.021357,0.001803,0.011896,0.01258,-0.007564,0.010964,-0.029188,-0.01086,-0.021306,0.014168,-0.00815,-0.006527,0.007753,-0.003765,0.011067,-0.004922,-0.023077,-0.007052,0.015688,0.018018,-0.000261,0.004033,0.011681,0.028107,0.016449,0.012073,-0.003362,0.001624,-0.013004,-0.00061,0.007606,0.008367,0.000196,-0.00947,0.017505,-0.002442,0.008911,0.010356,0.003271,-0.006186,-0.001911,-0.012913,0.008914,-0.023485,-0.0192,-0.031872,-0.007105,-0.024002,-0.002116,0.016141,0.007223,-0.011142,0.018811,-0.043218,0.011272,0.010321,-0.018093,-0.021208,0.003711,-0.0045,-0.006089,-0.001891,-0.007828,-0.010498,-0.001914,-0.00019,0.015799,-0.006593,0.032376,-0.01564,-0.002543,9e-05,0.024382,-0.024875,0.017178,-0.026717,0.00886,0.015729,0.002536,0.019144,0.001338,0.00015,-0.023235,0.018891,-0.008904,0.031017,-0.009575,0.007091,0.021063,0.010742,0.015952,0.016892,-0.020155,0.000906,0.037056,-0.00926,-0.009961,0.015665,-0.015672,0.006226,-0.025471,-0.019,0.001999,-0.01858,-0.010708,0.013912,-0.021817,-0.001499,0.009913,-0.022437,-0.002935,0.003617,-0.007614,-0.016705,-0.02221,0.031644,3.8e-05,-0.042605,-0.019705,0.005296,0.004769,-0.001515,0.01438,0.000286,-0.025709,0.018251,-0.010956,-0.004383,-0.01399,0.013778,-0.006711,0.002063,0.009063,-0.001155,-0.017032,0.008638,-0.022255,0.014825,0.002668,-0.017494,0.010238,0.009991,-0.01573,0.032765,0.01212,-0.005577,-0.006436,0.006122,0.005328,-1.9e-05,0.027976,0.005147,0.000412,0.008574,0.0186,-0.008427,-0.000187,0.014652,-0.030129,0.666667,0.0,0.0,5.0,"Automotive, Auto Repair",7


In [28]:
all_features_business['categories'][0]

'Nightlife, Sports Bars, Restaurants, Bars, American (Traditional)'

In [29]:
all_features_business.head()

Unnamed: 0,business_id,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars,categories,review_count
0,--I7YYLada0tSLkORTHb5Q,-0.005223,0.013203,0.003038,0.005219,-0.015445,-0.0239,0.022235,0.005243,-0.004331,-0.016159,-0.004644,-0.015904,-0.01401,-0.001277,-0.010583,0.018494,-0.032239,0.009304,0.019475,0.020988,0.008241,-0.003566,-0.011737,-0.007616,-0.002773,-0.001232,0.001906,0.003526,0.008889,-0.003706,-0.006945,0.013901,-0.005344,0.022541,-0.019252,0.024952,-0.005151,0.007677,-0.001113,0.011767,-0.002323,-0.01123,0.0013,0.003953,-0.007738,-0.003068,-0.010008,-0.004464,0.004149,-0.002037,-0.014858,0.011594,-0.006096,-0.008228,0.019046,-0.001743,-0.002748,0.003867,0.001787,0.023379,0.000696,0.00366,0.003319,-0.011378,0.022503,0.000559,0.006824,0.006366,0.010075,0.004468,-0.003759,-0.006353,-0.029037,0.019419,-0.023796,0.013552,-0.008152,0.007197,0.004987,0.003728,0.006133,0.008269,0.008479,0.00612,-0.023127,-0.0126,0.002558,-0.021814,0.027417,0.005371,-0.003589,0.005786,-0.014635,0.018341,-0.003121,0.006018,0.018851,0.01104,-0.021877,-0.004835,0.019864,-0.000397,-0.007807,0.009183,-0.006709,-0.002768,0.029296,-0.003871,-0.000575,-0.012479,0.00489,0.023094,0.013396,-0.026432,0.006571,-0.000186,0.002575,-0.007381,0.014821,-0.010161,0.009179,-0.012276,0.016624,0.000203,0.015883,-0.003842,0.001717,0.018,0.002103,0.00119,-0.013259,0.004317,0.003305,-0.009963,0.000248,-0.006219,-0.005504,-0.012631,0.021161,-0.013545,0.003748,0.001858,-0.0027,-0.016487,-0.014337,-0.013673,-0.008969,0.002243,-0.004917,0.00461,0.019599,0.011536,0.013265,-0.018489,-0.003279,0.021542,0.004779,-0.010109,-0.007909,-0.011632,0.011815,0.004837,0.0024,-0.005271,-0.011838,-0.00092,0.000854,0.016514,-0.00266,0.005007,0.00745,0.002366,0.0186,0.012341,-0.003923,0.006543,-0.005612,-0.017796,-0.000585,-0.018443,0.021079,-0.020229,-0.014831,-0.004145,-0.001142,0.026174,-0.009629,-0.008576,0.007016,0.018397,0.003416,0.017813,0.001769,-0.000691,0.005052,-0.022161,-0.008082,0.002377,-0.015207,-0.003983,0.01159,-0.001088,0.020586,-0.013535,0.00733,-0.007836,-0.01056,0.013222,-0.002292,-0.015976,-0.000667,-0.005512,-0.007929,-0.007196,-0.004776,0.003071,-0.020419,0.013903,-0.011624,0.004576,0.014348,-0.000237,-0.015393,-0.012079,0.021398,0.001066,0.006931,-0.000196,-0.004139,0.017844,-0.00135,0.009227,0.027137,-0.00218,-0.014655,0.015815,0.007232,0.015476,-0.001738,0.009113,-0.009466,0.013402,-0.002148,-0.003952,0.01799,-0.02201,-0.006022,0.004883,-0.006745,0.006215,-0.001315,-0.012486,0.017581,-0.010556,0.009122,-0.007456,-0.007119,-0.001699,-0.006113,0.015474,0.001463,0.007989,0.009898,0.017991,-0.022558,0.010675,0.005857,-0.00322,5.3e-05,-0.007513,0.019841,0.021682,-0.020227,0.001476,0.00465,-0.010603,-0.000549,0.024183,0.013865,0.01075,-0.002404,-0.005597,-0.01739,-0.007033,-0.005543,-0.008435,-0.009834,-0.001851,-0.021076,-0.012153,0.001812,0.004794,0.006246,0.006409,-0.005728,0.016663,-0.002259,0.007373,-0.017416,-0.012376,0.352941,0.352941,0.823529,3.647059,"Nightlife, Sports Bars, Restaurants, Bars, Ame...",96
1,--U98MNlDym2cLn36BBPgQ,-0.009473,0.007201,0.007892,0.007558,0.01489,-0.01304,0.000998,0.003142,0.01251,-0.009641,-0.008363,-0.001879,0.008971,-0.012477,-0.007306,0.015207,-0.00671,0.006058,0.015347,0.013466,0.008532,-0.009809,-0.014297,-0.010219,-0.009368,-0.008026,0.000225,-0.01447,0.011642,-0.021837,-0.003795,0.008734,-0.006346,0.012801,-0.008032,0.008933,-0.007662,-0.003781,0.015866,-0.004885,-0.00342,0.001984,-0.00896,-0.008293,0.000109,0.000424,-0.0179,-0.008771,0.005597,-0.006098,-0.013686,-0.01124,0.000259,-0.018127,0.013997,-0.007821,-0.006437,0.020163,-0.001049,0.008458,0.000157,0.005871,0.010256,0.005336,0.000152,-0.003606,-0.004147,0.003571,0.008488,0.002955,-0.023666,0.002256,-0.016963,0.0199,-0.035946,0.001264,-0.003626,-0.007445,0.001252,-0.009165,0.00174,-0.002499,0.029358,0.009762,-0.014283,-0.014628,0.009347,-0.005357,0.023107,0.017772,0.005322,0.000268,-0.019699,0.010072,-0.007535,0.002688,0.019548,0.005258,-0.005424,-0.012128,0.019191,0.003372,0.003562,0.009147,-0.018953,0.008326,0.021718,0.010737,0.010248,-0.016259,-0.01494,0.014091,0.015749,-0.034193,0.011758,-0.002841,0.01151,-0.002001,0.00191,-0.005489,0.014577,-0.011231,0.018701,0.008329,0.006234,-0.020305,0.000372,0.016223,-0.003292,-0.00572,-0.02751,0.007529,-0.002863,-0.000749,0.006188,-0.001027,0.003565,-0.002575,0.008592,-0.012345,0.003579,0.003061,0.010515,-0.008463,-0.0052,-0.0027,0.001463,-0.00914,-0.017576,0.00618,0.006529,-0.005442,-0.000809,-0.014269,-0.000472,0.00848,0.016967,-0.023097,0.005209,-0.016292,0.00596,0.00741,0.003231,-0.018636,0.007962,0.00977,-0.014006,0.013873,-0.00222,0.001197,-0.000364,-0.004451,0.005394,0.019312,-0.010599,0.00819,0.009584,-0.016238,0.00414,-0.005972,0.021776,-0.016622,-0.010791,-0.004947,-0.023165,0.019233,-0.000613,-0.007492,-0.000699,0.013902,0.005464,0.012575,-0.004372,0.004327,-0.003115,-0.012252,0.006913,0.005236,-0.019368,-0.005391,0.005623,0.006432,0.022029,-0.009159,-0.006492,0.002881,-0.014241,0.004627,-0.004803,-0.005085,0.000408,-0.006104,-0.007565,0.004088,-0.011873,0.002603,-0.019869,0.010066,-0.00583,0.006663,0.010104,-0.000855,-0.022315,-0.017786,0.022162,-0.009721,0.018821,0.002622,-0.012028,0.021037,-0.006022,-0.005586,0.0033,-0.014013,-0.007531,0.006539,0.007552,0.012004,-0.007375,0.009872,-0.011202,0.012094,-0.004685,-0.006495,0.006865,-0.006358,0.005425,-0.00094,-0.006885,-0.007055,0.005787,-0.00611,0.008055,0.012035,0.000168,-0.004976,-0.021966,0.012122,-0.006307,0.000289,-0.006792,0.013264,0.010632,0.006377,-0.004647,-0.008811,0.006545,0.003384,0.008636,-0.012588,0.002656,0.027923,-0.012597,-0.002731,-0.003223,-0.015098,-0.0003,0.005892,0.00719,-0.00083,-0.006708,0.00201,-0.015133,-0.004134,0.013898,0.000469,-0.004306,-0.002744,-0.005321,-0.002581,0.004612,0.004995,-0.002042,0.003872,-0.00744,0.010921,-0.007165,0.010107,-0.014089,-0.010551,0.0,0.0,2.0,3.0,"Pizza, Restaurants",4
2,--j-kaNMCo1-DYzddCsA5Q,0.009633,0.017234,0.02172,0.004169,-0.017047,0.021239,0.022135,0.018492,-0.025737,-0.037928,0.01828,-7e-06,-0.005802,-0.015352,-0.006522,0.002334,0.010471,0.027408,0.026609,0.010042,0.012227,-0.007788,-0.00143,0.034607,0.004424,-0.021396,-0.001293,-0.028477,-0.009573,-0.003362,0.007298,-0.019318,0.00776,0.035141,-0.02902,0.01573,-0.021251,-0.026926,-0.007267,-0.027688,-0.016911,-0.008173,-0.011411,0.016458,0.021287,0.015095,-0.011617,-0.017024,0.012681,-0.017895,0.001415,-0.0019,-0.016979,-0.020762,0.025932,0.00695,0.020581,0.009177,-0.020471,0.01232,0.013417,-0.006696,-0.021582,-0.010728,0.029076,-0.035886,0.026735,0.041019,0.028155,-0.004064,0.008174,-0.000281,-0.017237,0.012496,-0.016756,0.013198,-0.005503,0.0031,-0.00878,0.015854,0.022162,0.001933,-0.006392,-0.011483,-0.022019,0.008398,0.033826,0.003696,0.038147,-0.002722,0.012157,0.001011,0.008341,0.03413,-0.012817,0.010063,0.032218,0.00091,-0.007468,-0.02715,0.002558,0.022478,0.006398,-0.028837,-0.01712,-0.008066,-0.004648,0.017424,-0.038165,0.012726,-0.014071,0.006842,0.024031,-0.011678,-0.026158,-0.008592,-0.004284,-0.020306,0.015129,-0.001771,0.01084,0.032584,0.017343,-0.010482,-0.003187,-0.01053,0.012006,0.023894,0.000882,-0.00435,0.000148,0.022759,-0.033508,-0.022849,0.040028,-0.027479,-0.000731,0.025781,0.006088,0.015409,-0.017167,-0.02443,0.005184,-0.027714,-0.01012,-0.024396,-0.016754,-0.015322,-0.021488,-0.020994,0.025677,-0.001122,-0.005009,-0.032246,-0.017809,0.066689,-0.020374,-0.008658,-0.020044,0.021057,0.027508,0.030709,0.0053,-0.010031,-0.003083,-0.003467,0.04925,0.006947,0.020152,0.011769,0.002683,0.023202,-0.012936,0.004806,-0.012107,-0.012848,-0.019727,0.000232,0.003374,-0.008491,0.014565,0.004403,-0.006044,0.005692,-0.038754,-0.006393,-0.011121,0.025353,0.000418,0.002281,-0.004344,0.024774,0.011273,-0.014918,-0.017253,-0.007505,-0.000373,-0.024779,-0.021933,-0.001134,-0.010386,0.01263,0.000258,-0.003047,0.02557,-0.01956,-0.022088,0.030279,0.007202,0.00806,-0.018217,-0.02514,0.012217,-0.014579,-0.021562,-0.002347,0.015548,0.018797,0.029962,-0.012895,0.022535,-0.025045,0.019349,-0.027831,0.051708,0.022441,0.007402,-0.007486,0.018309,0.007561,0.003333,0.005273,0.020453,-0.012347,-0.002165,0.017393,-0.028464,0.013932,-0.003051,-0.003757,0.022351,-0.00033,-0.002272,0.007608,0.010865,-0.009705,-0.002934,0.012057,0.023419,0.007891,-0.049629,-0.019722,0.045787,-0.032095,-0.001229,-0.017484,0.008668,-0.001412,-0.019967,-0.006188,0.021955,0.014569,0.032061,0.025408,-0.027979,-0.023355,0.041538,-0.018833,0.020042,0.002688,0.012669,0.016015,-0.010872,0.032645,0.003781,-0.025301,-0.002471,-0.004316,0.016233,0.030151,-0.014278,0.02077,0.0053,0.004483,-0.004641,-0.002624,-0.029591,-0.004469,0.005594,-0.017324,-0.001699,0.026008,-0.014737,-0.005179,0.027793,0.034811,0.00841,0.016346,0.004344,0.00249,0.0,0.0,0.0,5.0,"Hair Removal, Nail Technicians, Beauty & Spas,...",4
3,--wIGbLEhlpl_UeAIyDmZQ,-0.005456,0.009802,0.01644,0.004111,0.026086,0.012159,-0.01191,0.019884,-0.000267,0.009167,-0.022002,-0.000376,0.022163,-0.012592,-0.016783,0.007154,0.030941,0.010765,0.013459,0.011096,0.019684,0.008331,0.005821,-0.019253,0.01065,-0.026495,0.0028,0.004084,0.015505,-0.022831,0.009739,0.000924,-0.003843,0.001612,-0.001618,-0.007613,-0.009442,-0.005349,-0.002633,-0.003725,0.014835,0.013204,-0.009251,-0.003674,0.004898,0.01651,-0.012062,-2.8e-05,-0.003391,-0.023944,0.000422,-0.012971,0.005639,-0.005384,0.018614,0.013661,0.013891,0.020802,-0.015975,0.002161,0.00921,0.018026,-0.018251,-0.017948,-0.010045,-0.006756,-0.023177,0.017577,0.015901,-0.017156,-0.030951,0.006246,0.004852,0.010155,-0.01865,-0.003744,-0.006849,0.002759,-0.013968,-0.023144,-0.011524,-0.014173,0.01309,-0.008078,0.005399,0.011827,0.011999,0.006736,-0.002318,0.017966,-0.001693,-0.020036,-0.004857,-0.019216,-0.023563,0.013857,-0.003167,0.006324,0.008236,-0.012372,-0.005626,0.007414,0.014332,-0.021266,-0.005814,-0.001382,-0.019748,0.001419,-0.005417,-0.0118,-0.002742,-0.008808,0.005969,-0.01634,0.004003,-0.02227,-0.001811,0.005603,-0.000967,-0.010754,0.009281,0.010762,-0.00423,0.017744,-0.005627,-0.021057,0.004062,-0.009549,-0.026366,-0.012437,-0.018023,-0.005857,-0.004375,-0.009411,0.006758,-0.012338,-0.016047,0.016621,-0.00795,0.005086,-0.009438,-0.015627,-0.008897,0.009759,0.020124,0.006253,0.010768,-0.02042,0.003077,-0.00804,0.011339,0.002316,-0.005414,-0.004821,-0.001869,0.015924,-0.002956,-0.025049,-0.01275,0.001372,0.007707,0.002852,0.004081,0.003575,0.034528,0.028531,0.005457,-0.00511,0.004155,-0.012019,0.001941,0.01011,-0.006044,0.007181,-0.005672,0.018773,0.018902,-0.000433,0.012279,0.005938,0.000586,0.001359,-0.018573,-0.009612,-0.008295,-0.010517,-0.014956,-0.003387,-0.014494,0.001876,0.012112,0.002879,-0.011921,0.020464,-0.029488,0.016904,0.00712,-0.003552,-0.028669,0.002088,-0.001001,-0.004917,-0.002217,-0.009109,-0.003918,0.003644,-0.004833,0.00502,-0.0047,0.010091,-0.002118,-0.008017,-0.007088,0.03099,-0.014254,0.011027,-0.01389,0.008267,0.009554,0.005332,0.012902,-0.000232,0.007208,-0.018913,0.008089,-0.001019,0.017901,-0.008769,0.009083,0.007134,0.0089,0.008112,0.021203,-0.013173,0.005674,0.021077,-0.001882,-0.003455,0.003263,-0.011494,-0.002357,-0.013575,-0.014636,-0.001604,-0.009959,-0.00911,0.016737,-0.008355,0.0031,0.002337,-0.000418,-0.005265,0.00794,0.003551,-0.012436,-0.008322,0.017082,0.011061,-0.029862,-0.019501,-0.002858,0.001951,-0.000122,0.003853,0.00827,-0.023261,0.012332,-0.009129,0.000946,-0.011941,0.022079,-0.005741,-0.000776,-0.00152,0.009702,-0.007817,0.001017,-0.014863,0.009951,-0.000982,-0.013537,0.002709,0.014539,-0.007789,0.031251,0.011054,-0.014146,-0.003772,0.002339,0.011464,-0.006964,0.023242,0.004164,0.004424,-0.001646,0.005231,-0.00919,-0.002553,0.002079,-0.02935,0.666667,0.166667,3.0,3.833333,"Electronics, Professional Services, Local Serv...",14
4,-000aQFeK6tqVLndf7xORg,0.010634,0.02208,0.014209,-0.017468,0.022044,-0.0014,-0.002871,0.023146,0.002384,0.001417,-0.010938,0.005055,0.027587,-0.017047,-0.013811,-0.002557,0.041712,0.002163,0.028847,0.000981,0.024016,0.0029,-0.005201,-0.009223,-0.002743,-0.017438,-0.003572,0.006089,0.010395,-0.013209,0.012886,-0.001177,-0.014673,0.022489,-0.009622,-0.008018,-0.018578,-0.016016,-0.012349,-0.014922,0.011807,0.012447,-0.011533,0.00777,0.021696,0.018031,-0.009846,0.018853,-0.006351,-0.024626,-0.005112,-0.009186,0.006745,-0.015911,0.015193,0.013527,0.023132,0.027268,-0.010801,0.012058,0.004636,0.011673,-0.029823,-0.026776,-0.001081,-0.013225,-0.019039,0.0359,0.018251,-0.018557,-0.02795,0.005416,-0.006096,0.014236,-0.016713,0.007266,-0.006211,0.013172,-0.027839,-0.01764,-0.010107,-0.012879,0.005412,-0.002554,0.003613,-0.000204,0.0191,0.015422,0.012548,0.025712,0.003411,-0.020152,0.010858,-0.018758,-0.026953,0.018693,-0.008354,0.01096,-0.000944,-0.027852,-5.5e-05,-0.001036,0.008829,-0.023901,-0.004498,-0.012934,-0.001759,0.00228,-0.010393,-0.001095,0.009954,0.000122,0.017624,-0.028894,0.008357,-0.017014,-0.009783,0.018886,-0.009965,-0.015121,0.00761,0.025632,-0.005411,0.021959,-0.012702,-0.042302,0.004007,-0.010247,-0.025184,-0.011162,-0.01954,0.015698,-0.020799,-0.011743,0.032221,-0.039323,-0.014192,0.035332,-0.019248,0.007542,-0.012299,-0.021357,0.001803,0.011896,0.01258,-0.007564,0.010964,-0.029188,-0.01086,-0.021306,0.014168,-0.00815,-0.006527,0.007753,-0.003765,0.011067,-0.004922,-0.023077,-0.007052,0.015688,0.018018,-0.000261,0.004033,0.011681,0.028107,0.016449,0.012073,-0.003362,0.001624,-0.013004,-0.00061,0.007606,0.008367,0.000196,-0.00947,0.017505,-0.002442,0.008911,0.010356,0.003271,-0.006186,-0.001911,-0.012913,0.008914,-0.023485,-0.0192,-0.031872,-0.007105,-0.024002,-0.002116,0.016141,0.007223,-0.011142,0.018811,-0.043218,0.011272,0.010321,-0.018093,-0.021208,0.003711,-0.0045,-0.006089,-0.001891,-0.007828,-0.010498,-0.001914,-0.00019,0.015799,-0.006593,0.032376,-0.01564,-0.002543,9e-05,0.024382,-0.024875,0.017178,-0.026717,0.00886,0.015729,0.002536,0.019144,0.001338,0.00015,-0.023235,0.018891,-0.008904,0.031017,-0.009575,0.007091,0.021063,0.010742,0.015952,0.016892,-0.020155,0.000906,0.037056,-0.00926,-0.009961,0.015665,-0.015672,0.006226,-0.025471,-0.019,0.001999,-0.01858,-0.010708,0.013912,-0.021817,-0.001499,0.009913,-0.022437,-0.002935,0.003617,-0.007614,-0.016705,-0.02221,0.031644,3.8e-05,-0.042605,-0.019705,0.005296,0.004769,-0.001515,0.01438,0.000286,-0.025709,0.018251,-0.010956,-0.004383,-0.01399,0.013778,-0.006711,0.002063,0.009063,-0.001155,-0.017032,0.008638,-0.022255,0.014825,0.002668,-0.017494,0.010238,0.009991,-0.01573,0.032765,0.01212,-0.005577,-0.006436,0.006122,0.005328,-1.9e-05,0.027976,0.005147,0.000412,0.008574,0.0186,-0.008427,-0.000187,0.014652,-0.030129,0.666667,0.0,0.0,5.0,"Automotive, Auto Repair",7


In [30]:
def stringDFColToBinaryCols(df, series_name):
    # Create list of all categories
    all_cats = []
    for string in df[series_name]:
        string = str(string)
        cats = string.strip().replace(' ', '').split(',')
        for cat in cats:
            if cat not in all_cats:
                all_cats.append(cat)
    # Make binary for each cat for each row
    for cat in all_cats:
        df[cat] = df[series_name].str.strip().str.replace(' ', '').str.contains(cat)
        # This technique will have some problems. 'Golf' may appear in non-Golf categories (ie 'Disc Golf')
        # Can be fixed with regular expressions: ',Golf,' OR 'BOF Golf,' OR ',Golf EOF'
    
    return df, all_cats
        
all_features_business, all_cats = stringDFColToBinaryCols(all_features_business, 'categories')

  if sys.path[0] == '':


In [31]:
print(all_cats)

['Nightlife', 'SportsBars', 'Restaurants', 'Bars', 'American(Traditional)', 'Pizza', 'HairRemoval', 'NailTechnicians', 'Beauty&Spas', 'NailSalons', 'Waxing', 'DaySpas', 'Electronics', 'ProfessionalServices', 'LocalServices', 'ElectronicsRepair', 'Computers', 'Shopping', 'Automotive', 'AutoRepair', 'Chinese', 'EyelashService', 'TobaccoShops', 'VapeShops', 'CarDealers', 'UsedCarDealers', 'Dentists', 'GeneralDentistry', 'CosmeticDentists', 'PediatricDentists', 'Health&Medical', 'Tex-Mex', 'Mexican', 'Arts&Entertainment', 'Festivals', 'Food', 'FoodTrucks', 'FarmersMarket', 'Portuguese', 'Bakeries', 'ChickenShop', 'Barbeque', 'EventPlanning&Services', 'EventPhotography', 'Photographers', 'SessionPhotography', 'SkinCare', 'Antiques', 'IceCream&FrozenYogurt', 'Donuts', 'SpecialtyFood', 'WebDesign', 'GraphicDesign', 'Marketing', 'RecyclingCenter', 'Caterers', 'Southern', 'ComfortFood', 'Breakfast&Brunch', 'French', 'American(New)', 'Burgers', 'Sandwiches', 'Coffee&Tea', 'Brasseries', 'Gyms', '

In [32]:
print(
    len(all_features_business[all_features_business['Golf']==True]), 
    len(all_features_business[all_features_business['DiscGolf']==True]), 
)

61 1


In [33]:
print(all_features_business[all_features_business['DiscGolf']==True]['categories'].values)
print('Should not have a True value for Golf, but does. Problem to deal with in the future.')
print(all_features_business[all_features_business['DiscGolf']==True]['Golf'].values)

['Sporting Goods, Active Life, Bike Rentals, Disc Golf, Shopping']
Should not have a True value for Golf, but does. Problem to deal with in the future.
[True]


In [34]:
all_features_business.head()

Unnamed: 0,business_id,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars,categories,review_count,Nightlife,SportsBars,Restaurants,Bars,American(Traditional),Pizza,HairRemoval,NailTechnicians,Beauty&Spas,NailSalons,Waxing,DaySpas,Electronics,ProfessionalServices,LocalServices,ElectronicsRepair,Computers,Shopping,Automotive,AutoRepair,Chinese,EyelashService,TobaccoShops,VapeShops,CarDealers,UsedCarDealers,Dentists,GeneralDentistry,CosmeticDentists,PediatricDentists,Health&Medical,Tex-Mex,Mexican,Arts&Entertainment,Festivals,Food,FoodTrucks,FarmersMarket,Portuguese,Bakeries,ChickenShop,Barbeque,EventPlanning&Services,EventPhotography,Photographers,SessionPhotography,SkinCare,Antiques,IceCream&FrozenYogurt,Donuts,SpecialtyFood,WebDesign,GraphicDesign,Marketing,RecyclingCenter,Caterers,Southern,ComfortFood,Breakfast&Brunch,French,American(New),Burgers,Sandwiches,Coffee&Tea,Brasseries,Gyms,ChildCare&DayCare,LeisureCenters,Fitness&Instruction,ActiveLife,HardwareStores,Home&Garden,RealEstate,Condominiums,Hotels,HomeServices,ShoppingCenters,Hotels&Travel,HairSalons,EthnicFood,Turkish,InternationalGrocery,TapasBars,ShippingCenters,PrintingServices,Massage,MassageTherapy,Reflexology,Buffets,Korean,SushiBars,Japanese,Cafes,Soup,Golf,Venues&EventSpaces,AutoDetailing,BodyShops,AutoCustomization,Towing,Trainers,WeightLossCenters,FoodDeliveryServices,FastFood,Delis,Ethiopian,Vegetarian,Painters,DrywallInstallation&Repair,StuccoServices,Orthodontists,Periodontists,OralSurgeons,Piercing,Tattoo,Chiropractors,Optometrists,Italian,Couriers&DeliveryServices,PublicServices&Government,SportingGoods,Fashion,GolfEquipment,Bikes,Ski&SnowboardShops,SportsWear,BikeRepair/Maintenance,Filipino,PetGroomers,Veterinarians,PetSitting,Pets,PetServices,AutoGlassServices,RealEstateServices,RealEstateAgents,Pakistani,Indian,CardioClasses,DanceStudios,ChickenWings,Cosmetics&BeautySupply,Desserts,Sewing&Alterations,Arts&Crafts,Wheel&RimRepair,Tires,AutoParts&Supplies,Colonics,Saunas,Doctors,MedicalSpas,Naturopathic/Holistic,MeditationCenters,Reiki,SpiritualShop,Orthopedists,SportsMedicine,Surgeons,Grocery,MedicalCenters,InteriorDesign,Rugs,FurnitureStores,HomeDecor,Mattresses,Women'sClothing,Men'sClothing,ShoeStores,JuiceBars&Smoothies,Acupuncture,LaserHairRemoval,FamilyPractice,UrgentCare,Thai,AsianFusion,Vietnamese,Laotian,HomeCleaning,CarpetCleaning,Accessories,Barbers,Gluten-Free,SpeechTherapists,PhysicalTherapy,OccupationalTherapy,Seafood,Steakhouses,Wholesalers,DiscountStore,PartySupplies,DepartmentStores,...,Gelato,TelevisionServiceProviders,Fences&Gates,MetalFabricators,ScubaDiving,Diving,DiveShops,WatchRepair,Halotherapy,CulturalCenter,Lakes,Macarons,CustomCakes,Aquariums,BusinessConsulting,BotanicalGardens,PaintStores,Moroccan,Persian/Iranian,DataRecovery,Cajun/Creole,PartyEquipmentRentals,CarBrokers,BootCamps,Musicians,PartyCharacters,MusicProductionServices,Cuban,PuertoRican,RVDealers,RVRental,Bowling,Venezuelan,SummerCamps,PetAdoption,RefinishingServices,PublicTransportation,CommercialTruckDealers,CommercialTruckRepair,FoodStands,CommercialRealEstate,OutletStores,Campgrounds,RVParks,Resorts,TalentAgencies,GutterServices,UsedBookstore,AdultEducation,StripteaseDancers,DanceSchools,Wallpapering,GoldBuyers,PawnShops,Videographers,Arabian,DonationCenter,TravelAgents,Basque,Spanish,WaterDelivery,WaterStores,Kosher,SkateParks,Izakaya,Poutineries,BailBondsmen,PressureWashers,Herbs&Spices,PhotoBoothRentals,CannabisDispensaries,Poke,ArtClasses,Teppanyaki,Oncologist,HotPot,Szechuan,IrishPub,CyclingClasses,MountainBiking,ShoeRepair,ShoeShine,Cupcakes,SafeStores,Hunting&FishingSupplies,RehabilitationCenter,BasketballCourts,CountryClubs,Endocrinologists,Neurologist,Irish,PetCremationServices,PersonalInjuryLaw,Divorce&FamilyLaw,BankruptcyLaw,Immunodermatologists,RetirementHomes,Cantonese,PoleDancingClasses,Rodeo,VinylRecords,Props,Delicatessen,EthnicGrocery,GuestHouses,YelpEvents,RestaurantSupplies,PatioCoverings,Masonry/Concrete,DigitizingServices,Framing,TestPreparation,PrivateTutors,Skydiving,HomeHealthCare,MedicalSupplies,Psychologists,ModernEuropean,Shutters,FabricStores,SouvenirShops,Russian,CheeseShops,CarWindowTinting,FireProtectionServices,FacePainting,Tuscan,Gastroenterologist,Butcher,Blood&PlasmaDonationCenters,German,Keys&Locksmiths,DUILaw,CriminalDefenseLaw,Investing,SmogCheckStations,CarInspectors,BrewingSupplies,HongKongStyleCafe,PublicMarkets,VehicleWraps,Airports,TeethWhitening,RVRepair,CountertopInstallation,MortuaryServices,SnowRemoval,EstatePlanningLaw,Wills,Trusts,&Probates,BusinessLaw,Airlines,Estheticians,Engraving,TrophyShops,CandleStores,PopcornShops,Fishing,TrailerDealers,BeachBars,BeachVolleyball,ArtificialTurf,PanAsian,DJs,Paintball,MiniGolf,GoKarts,Wigs,GolfLessons,Opera&Ballet,Jazz&Blues,Waffles,SolarInstallation,HomeEnergyAuditors,CannabisClinics,Uzbek,Prenatal/PerinatalCare,Hypnosis/Hypnotherapy,Eatertainment,Afghan,HealthInsuranceOffices,BeverageStore,Tiling,Sicilian,Bartenders,SpineSurgeons,Carpenters,Singaporean,SkilledNursing,Live/RawFood,SepticServices,PrintMedia,SkatingRinks,InternetCafes,WineTours,Boating,DemolitionServices,ProductDesign,3DPrinting,RoadsideAssistance,Himalayan/Nepalese,Officiants,Kickboxing,Boxing,CookingClasses,CookingSchools,PersonalChefs,Indonesian,AquariumServices,Brazilian,LaboratoryTesting,HockeyEquipment,SkateShops,RealEstatePhotography,Video/FilmProduction,Sandblasting,Perfume,PrivateJetCharter,SoulFood,Bookbinding,TanningBeds,RealEstateLaw,EmergencyPetHospital,BoatCharters,Rafting/Kayaking,BoudoirPhotography,Argentine,SocialClubs,OutdoorFurnitureStores,SouthAfrican,AcaiBowls,LactationServices,PlacentaEncapsulations,Observatories,Ukrainian,Planetarium,Cabaret,Hakka,Sailing,FireplaceServices,Gunsmith,UniversityHousing,IndoorPlaycentre,Embassy,OliveOil,Karate,LocalFishStores,MotorsportVehicleRepairs,Synagogues,GuitarStores,MobileDentRepair,Paddleboarding,Distilleries,PostOffices,PetTransportation,CurrencyExchange,PastaShops,Smokehouse,Hydrotherapy,Pop-upShops,Videos&VideoGameRental,OxygenBars,ExcavationServices,MobileHomeRepair,PickYourOwnFarms,Farms,Scottish,British,Passport&VisaServices,PianoBars,PoliceDepartments,WeddingChapels,RegistrationServices,FloatSpa,DayCamps,TrainStations,Prosthodontists,MedicalCannabisReferrals,Mongolian,Orthotics,ChristmasTrees,ClubCrawl,ScreenPrinting,HazardousWasteDisposal,EnvironmentalAbatement,LawnServices,HennaArtists,KidsHairSalons,Zoos,EmploymentLaw,DebtReliefServices,VehicleShipping,Hats,BusTours,DinnerTheater,EstateLiquidation,GeneralLitigation,Coffee&TeaSupplies,Soccer,TrailerRepair,Awnings,Pretzels,ArtSpaceRentals,EditorialServices,Honduran,Nicaraguan,Marinas,CareerCounseling,TeamBuildingActivities,TownCarService,PayrollServices,AerialFitness,CremationServices,GolfCartRentals,GolfCartDealers,LivestockFeed&Supply,UltrasoundImagingCenters,GrillingEquipment,LightingStores,Donairs,Falafel,CannabisTours,PersonalAssistants,AcneTreatment,Clowns,Magicians,InstallmentLoans,Prosthetics,ParentingClasses,FoodBanks,StreetArt,Buses,DialysisClinics,Newspapers&Magazines,Cideries,AutoSecurity,TrailerRental,TabletopGames,MedicalTransportation,SoftwareDevelopment,HolidayDecoratingServices,HolidayDecorations,Cambodian,BirdShops,LanguageSchools,SeniorCenters,OsteopathicPhysicians,PetHospice,TrafficSchools,TrafficTicketingLaw,Urologists,Taekwondo,FarmEquipmentRepair,Coffeeshops,Sunglasses,AnimalPhysicalTherapy,Rheumatologists,PartyBikeRentals,Bangladeshi,Vocational&TechnicalSchool,PetWasteRemoval,Pathologists,Aestheticians,PsychicMediums,TastingClasses,WineTastingClasses,BodyContouring,PumpkinPatches,GeneratorInstallation/Repair,AddictionMedicine,VacationRentalAgents,AppraisalServices,Snorkeling,Dominican,Gemstones&Minerals,Cryotherapy,Trinidadian,ImmigrationLaw,SupperClubs,Burmese,AssistedLivingFacilities,PianoServices,HomeownerAssociation,ScavengerHunts,WalkingTours,BeerTours,BartendingSchools,Carousels,ConciergeMedicine,Matchmakers,WellDrilling,SriLankan,Trains,FurnitureRental,Badminton,PetPhotography,TitleLoans,DanceWear,IVHydration,CPRClasses,BikeSharing,NannyServices,Cafeteria,MistingSystemServices,HorseBoarding,Recording&RehearsalStudios,DisabilityLaw,SocialSecurityLaw,HabilitativeServices,CSA,RetinaSpecialists,BoatDealers,HearingAidProviders,PowderCoating,CircuitTrainingGyms,RotisserieChicken,EnvironmentalTesting,BingoHalls,ValetServices,SugarShacks,Austrian,Races&Competitions,Anesthesiologists,HouseSitters,TikiBars,CarShareServices,Squash,VisitorCenters,CheeseTastingClasses,FleaMarkets,WorkersCompensationLaw,Mosques,HolisticAnimalCare,Firewood,FoodTours,VascularMedicine,Tableware,Hydroponics,HighFidelityAudioEquipment,BarCrawl,BounceHouseRentals,BuddhistTemples,DIYAutoShop,HerbalShops,LANCenters,ConveyorBeltSushi,Egyptian,ReligiousSchools,HairLossCenters,Armenian,MotorcycleGear,ElderCarePlanning,BoatTours,BusRental,RacingExperience,HomeStaging,ReligiousItems,Ziplining,Colombian,Rolfing,Haitian,WildlifeControl,ConceptShops,DiscGolf,Drive-InTheater,TaiChi,International,TenantandEvictionLaw,Doulas,Neurotologists,Belgian,EthicalGrocery,Shanghainese,Machine&ToolRental,FirstAidClasses,HealthRetreats,Empanadas,AirportTerminals,RoofInspectors,Airsoft,VocalCoach,TelevisionStations,IceDelivery,Gerontologists,CustomsBrokers,MotorsportVehicleDealers,FlightInstruction,Cheerleading,RockClimbing,BalloonServices,ATVRentals/Tours,MassageSchools,Pool&Billiards,PettingZoos,Toxicologists,WaterParks,AirportLounges,Australian
0,--I7YYLada0tSLkORTHb5Q,-0.005223,0.013203,0.003038,0.005219,-0.015445,-0.0239,0.022235,0.005243,-0.004331,-0.016159,-0.004644,-0.015904,-0.01401,-0.001277,-0.010583,0.018494,-0.032239,0.009304,0.019475,0.020988,0.008241,-0.003566,-0.011737,-0.007616,-0.002773,-0.001232,0.001906,0.003526,0.008889,-0.003706,-0.006945,0.013901,-0.005344,0.022541,-0.019252,0.024952,-0.005151,0.007677,-0.001113,0.011767,-0.002323,-0.01123,0.0013,0.003953,-0.007738,-0.003068,-0.010008,-0.004464,0.004149,-0.002037,-0.014858,0.011594,-0.006096,-0.008228,0.019046,-0.001743,-0.002748,0.003867,0.001787,0.023379,0.000696,0.00366,0.003319,-0.011378,0.022503,0.000559,0.006824,0.006366,0.010075,0.004468,-0.003759,-0.006353,-0.029037,0.019419,-0.023796,0.013552,-0.008152,0.007197,0.004987,0.003728,0.006133,0.008269,0.008479,0.00612,-0.023127,-0.0126,0.002558,-0.021814,0.027417,0.005371,-0.003589,0.005786,-0.014635,0.018341,-0.003121,0.006018,0.018851,0.01104,-0.021877,-0.004835,0.019864,-0.000397,-0.007807,0.009183,-0.006709,-0.002768,0.029296,-0.003871,-0.000575,-0.012479,0.00489,0.023094,0.013396,-0.026432,0.006571,-0.000186,0.002575,-0.007381,0.014821,-0.010161,0.009179,-0.012276,0.016624,0.000203,0.015883,-0.003842,0.001717,0.018,0.002103,0.00119,-0.013259,0.004317,0.003305,-0.009963,0.000248,-0.006219,-0.005504,-0.012631,0.021161,-0.013545,0.003748,0.001858,-0.0027,-0.016487,-0.014337,-0.013673,-0.008969,0.002243,-0.004917,0.00461,0.019599,0.011536,0.013265,-0.018489,-0.003279,0.021542,0.004779,-0.010109,-0.007909,-0.011632,0.011815,0.004837,0.0024,-0.005271,-0.011838,-0.00092,0.000854,0.016514,-0.00266,0.005007,0.00745,0.002366,0.0186,0.012341,-0.003923,0.006543,-0.005612,-0.017796,-0.000585,-0.018443,0.021079,-0.020229,-0.014831,-0.004145,-0.001142,0.026174,-0.009629,-0.008576,0.007016,0.018397,0.003416,0.017813,0.001769,-0.000691,0.005052,-0.022161,-0.008082,0.002377,-0.015207,-0.003983,0.01159,-0.001088,0.020586,-0.013535,0.00733,-0.007836,-0.01056,0.013222,-0.002292,-0.015976,-0.000667,-0.005512,-0.007929,-0.007196,-0.004776,0.003071,-0.020419,0.013903,-0.011624,0.004576,0.014348,-0.000237,-0.015393,-0.012079,0.021398,0.001066,0.006931,-0.000196,-0.004139,0.017844,-0.00135,0.009227,0.027137,-0.00218,-0.014655,0.015815,0.007232,0.015476,-0.001738,0.009113,-0.009466,0.013402,-0.002148,-0.003952,0.01799,-0.02201,-0.006022,0.004883,-0.006745,0.006215,-0.001315,-0.012486,0.017581,-0.010556,0.009122,-0.007456,-0.007119,-0.001699,-0.006113,0.015474,0.001463,0.007989,0.009898,0.017991,-0.022558,0.010675,0.005857,-0.00322,5.3e-05,-0.007513,0.019841,0.021682,-0.020227,0.001476,0.00465,-0.010603,-0.000549,0.024183,0.013865,0.01075,-0.002404,-0.005597,-0.01739,-0.007033,-0.005543,-0.008435,-0.009834,-0.001851,-0.021076,-0.012153,0.001812,0.004794,0.006246,0.006409,-0.005728,0.016663,-0.002259,0.007373,-0.017416,-0.012376,0.352941,0.352941,0.823529,3.647059,"Nightlife, Sports Bars, Restaurants, Bars, Ame...",96,True,True,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
1,--U98MNlDym2cLn36BBPgQ,-0.009473,0.007201,0.007892,0.007558,0.01489,-0.01304,0.000998,0.003142,0.01251,-0.009641,-0.008363,-0.001879,0.008971,-0.012477,-0.007306,0.015207,-0.00671,0.006058,0.015347,0.013466,0.008532,-0.009809,-0.014297,-0.010219,-0.009368,-0.008026,0.000225,-0.01447,0.011642,-0.021837,-0.003795,0.008734,-0.006346,0.012801,-0.008032,0.008933,-0.007662,-0.003781,0.015866,-0.004885,-0.00342,0.001984,-0.00896,-0.008293,0.000109,0.000424,-0.0179,-0.008771,0.005597,-0.006098,-0.013686,-0.01124,0.000259,-0.018127,0.013997,-0.007821,-0.006437,0.020163,-0.001049,0.008458,0.000157,0.005871,0.010256,0.005336,0.000152,-0.003606,-0.004147,0.003571,0.008488,0.002955,-0.023666,0.002256,-0.016963,0.0199,-0.035946,0.001264,-0.003626,-0.007445,0.001252,-0.009165,0.00174,-0.002499,0.029358,0.009762,-0.014283,-0.014628,0.009347,-0.005357,0.023107,0.017772,0.005322,0.000268,-0.019699,0.010072,-0.007535,0.002688,0.019548,0.005258,-0.005424,-0.012128,0.019191,0.003372,0.003562,0.009147,-0.018953,0.008326,0.021718,0.010737,0.010248,-0.016259,-0.01494,0.014091,0.015749,-0.034193,0.011758,-0.002841,0.01151,-0.002001,0.00191,-0.005489,0.014577,-0.011231,0.018701,0.008329,0.006234,-0.020305,0.000372,0.016223,-0.003292,-0.00572,-0.02751,0.007529,-0.002863,-0.000749,0.006188,-0.001027,0.003565,-0.002575,0.008592,-0.012345,0.003579,0.003061,0.010515,-0.008463,-0.0052,-0.0027,0.001463,-0.00914,-0.017576,0.00618,0.006529,-0.005442,-0.000809,-0.014269,-0.000472,0.00848,0.016967,-0.023097,0.005209,-0.016292,0.00596,0.00741,0.003231,-0.018636,0.007962,0.00977,-0.014006,0.013873,-0.00222,0.001197,-0.000364,-0.004451,0.005394,0.019312,-0.010599,0.00819,0.009584,-0.016238,0.00414,-0.005972,0.021776,-0.016622,-0.010791,-0.004947,-0.023165,0.019233,-0.000613,-0.007492,-0.000699,0.013902,0.005464,0.012575,-0.004372,0.004327,-0.003115,-0.012252,0.006913,0.005236,-0.019368,-0.005391,0.005623,0.006432,0.022029,-0.009159,-0.006492,0.002881,-0.014241,0.004627,-0.004803,-0.005085,0.000408,-0.006104,-0.007565,0.004088,-0.011873,0.002603,-0.019869,0.010066,-0.00583,0.006663,0.010104,-0.000855,-0.022315,-0.017786,0.022162,-0.009721,0.018821,0.002622,-0.012028,0.021037,-0.006022,-0.005586,0.0033,-0.014013,-0.007531,0.006539,0.007552,0.012004,-0.007375,0.009872,-0.011202,0.012094,-0.004685,-0.006495,0.006865,-0.006358,0.005425,-0.00094,-0.006885,-0.007055,0.005787,-0.00611,0.008055,0.012035,0.000168,-0.004976,-0.021966,0.012122,-0.006307,0.000289,-0.006792,0.013264,0.010632,0.006377,-0.004647,-0.008811,0.006545,0.003384,0.008636,-0.012588,0.002656,0.027923,-0.012597,-0.002731,-0.003223,-0.015098,-0.0003,0.005892,0.00719,-0.00083,-0.006708,0.00201,-0.015133,-0.004134,0.013898,0.000469,-0.004306,-0.002744,-0.005321,-0.002581,0.004612,0.004995,-0.002042,0.003872,-0.00744,0.010921,-0.007165,0.010107,-0.014089,-0.010551,0.0,0.0,2.0,3.0,"Pizza, Restaurants",4,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2,--j-kaNMCo1-DYzddCsA5Q,0.009633,0.017234,0.02172,0.004169,-0.017047,0.021239,0.022135,0.018492,-0.025737,-0.037928,0.01828,-7e-06,-0.005802,-0.015352,-0.006522,0.002334,0.010471,0.027408,0.026609,0.010042,0.012227,-0.007788,-0.00143,0.034607,0.004424,-0.021396,-0.001293,-0.028477,-0.009573,-0.003362,0.007298,-0.019318,0.00776,0.035141,-0.02902,0.01573,-0.021251,-0.026926,-0.007267,-0.027688,-0.016911,-0.008173,-0.011411,0.016458,0.021287,0.015095,-0.011617,-0.017024,0.012681,-0.017895,0.001415,-0.0019,-0.016979,-0.020762,0.025932,0.00695,0.020581,0.009177,-0.020471,0.01232,0.013417,-0.006696,-0.021582,-0.010728,0.029076,-0.035886,0.026735,0.041019,0.028155,-0.004064,0.008174,-0.000281,-0.017237,0.012496,-0.016756,0.013198,-0.005503,0.0031,-0.00878,0.015854,0.022162,0.001933,-0.006392,-0.011483,-0.022019,0.008398,0.033826,0.003696,0.038147,-0.002722,0.012157,0.001011,0.008341,0.03413,-0.012817,0.010063,0.032218,0.00091,-0.007468,-0.02715,0.002558,0.022478,0.006398,-0.028837,-0.01712,-0.008066,-0.004648,0.017424,-0.038165,0.012726,-0.014071,0.006842,0.024031,-0.011678,-0.026158,-0.008592,-0.004284,-0.020306,0.015129,-0.001771,0.01084,0.032584,0.017343,-0.010482,-0.003187,-0.01053,0.012006,0.023894,0.000882,-0.00435,0.000148,0.022759,-0.033508,-0.022849,0.040028,-0.027479,-0.000731,0.025781,0.006088,0.015409,-0.017167,-0.02443,0.005184,-0.027714,-0.01012,-0.024396,-0.016754,-0.015322,-0.021488,-0.020994,0.025677,-0.001122,-0.005009,-0.032246,-0.017809,0.066689,-0.020374,-0.008658,-0.020044,0.021057,0.027508,0.030709,0.0053,-0.010031,-0.003083,-0.003467,0.04925,0.006947,0.020152,0.011769,0.002683,0.023202,-0.012936,0.004806,-0.012107,-0.012848,-0.019727,0.000232,0.003374,-0.008491,0.014565,0.004403,-0.006044,0.005692,-0.038754,-0.006393,-0.011121,0.025353,0.000418,0.002281,-0.004344,0.024774,0.011273,-0.014918,-0.017253,-0.007505,-0.000373,-0.024779,-0.021933,-0.001134,-0.010386,0.01263,0.000258,-0.003047,0.02557,-0.01956,-0.022088,0.030279,0.007202,0.00806,-0.018217,-0.02514,0.012217,-0.014579,-0.021562,-0.002347,0.015548,0.018797,0.029962,-0.012895,0.022535,-0.025045,0.019349,-0.027831,0.051708,0.022441,0.007402,-0.007486,0.018309,0.007561,0.003333,0.005273,0.020453,-0.012347,-0.002165,0.017393,-0.028464,0.013932,-0.003051,-0.003757,0.022351,-0.00033,-0.002272,0.007608,0.010865,-0.009705,-0.002934,0.012057,0.023419,0.007891,-0.049629,-0.019722,0.045787,-0.032095,-0.001229,-0.017484,0.008668,-0.001412,-0.019967,-0.006188,0.021955,0.014569,0.032061,0.025408,-0.027979,-0.023355,0.041538,-0.018833,0.020042,0.002688,0.012669,0.016015,-0.010872,0.032645,0.003781,-0.025301,-0.002471,-0.004316,0.016233,0.030151,-0.014278,0.02077,0.0053,0.004483,-0.004641,-0.002624,-0.029591,-0.004469,0.005594,-0.017324,-0.001699,0.026008,-0.014737,-0.005179,0.027793,0.034811,0.00841,0.016346,0.004344,0.00249,0.0,0.0,0.0,5.0,"Hair Removal, Nail Technicians, Beauty & Spas,...",4,False,False,False,False,False,False,True,True,True,True,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
3,--wIGbLEhlpl_UeAIyDmZQ,-0.005456,0.009802,0.01644,0.004111,0.026086,0.012159,-0.01191,0.019884,-0.000267,0.009167,-0.022002,-0.000376,0.022163,-0.012592,-0.016783,0.007154,0.030941,0.010765,0.013459,0.011096,0.019684,0.008331,0.005821,-0.019253,0.01065,-0.026495,0.0028,0.004084,0.015505,-0.022831,0.009739,0.000924,-0.003843,0.001612,-0.001618,-0.007613,-0.009442,-0.005349,-0.002633,-0.003725,0.014835,0.013204,-0.009251,-0.003674,0.004898,0.01651,-0.012062,-2.8e-05,-0.003391,-0.023944,0.000422,-0.012971,0.005639,-0.005384,0.018614,0.013661,0.013891,0.020802,-0.015975,0.002161,0.00921,0.018026,-0.018251,-0.017948,-0.010045,-0.006756,-0.023177,0.017577,0.015901,-0.017156,-0.030951,0.006246,0.004852,0.010155,-0.01865,-0.003744,-0.006849,0.002759,-0.013968,-0.023144,-0.011524,-0.014173,0.01309,-0.008078,0.005399,0.011827,0.011999,0.006736,-0.002318,0.017966,-0.001693,-0.020036,-0.004857,-0.019216,-0.023563,0.013857,-0.003167,0.006324,0.008236,-0.012372,-0.005626,0.007414,0.014332,-0.021266,-0.005814,-0.001382,-0.019748,0.001419,-0.005417,-0.0118,-0.002742,-0.008808,0.005969,-0.01634,0.004003,-0.02227,-0.001811,0.005603,-0.000967,-0.010754,0.009281,0.010762,-0.00423,0.017744,-0.005627,-0.021057,0.004062,-0.009549,-0.026366,-0.012437,-0.018023,-0.005857,-0.004375,-0.009411,0.006758,-0.012338,-0.016047,0.016621,-0.00795,0.005086,-0.009438,-0.015627,-0.008897,0.009759,0.020124,0.006253,0.010768,-0.02042,0.003077,-0.00804,0.011339,0.002316,-0.005414,-0.004821,-0.001869,0.015924,-0.002956,-0.025049,-0.01275,0.001372,0.007707,0.002852,0.004081,0.003575,0.034528,0.028531,0.005457,-0.00511,0.004155,-0.012019,0.001941,0.01011,-0.006044,0.007181,-0.005672,0.018773,0.018902,-0.000433,0.012279,0.005938,0.000586,0.001359,-0.018573,-0.009612,-0.008295,-0.010517,-0.014956,-0.003387,-0.014494,0.001876,0.012112,0.002879,-0.011921,0.020464,-0.029488,0.016904,0.00712,-0.003552,-0.028669,0.002088,-0.001001,-0.004917,-0.002217,-0.009109,-0.003918,0.003644,-0.004833,0.00502,-0.0047,0.010091,-0.002118,-0.008017,-0.007088,0.03099,-0.014254,0.011027,-0.01389,0.008267,0.009554,0.005332,0.012902,-0.000232,0.007208,-0.018913,0.008089,-0.001019,0.017901,-0.008769,0.009083,0.007134,0.0089,0.008112,0.021203,-0.013173,0.005674,0.021077,-0.001882,-0.003455,0.003263,-0.011494,-0.002357,-0.013575,-0.014636,-0.001604,-0.009959,-0.00911,0.016737,-0.008355,0.0031,0.002337,-0.000418,-0.005265,0.00794,0.003551,-0.012436,-0.008322,0.017082,0.011061,-0.029862,-0.019501,-0.002858,0.001951,-0.000122,0.003853,0.00827,-0.023261,0.012332,-0.009129,0.000946,-0.011941,0.022079,-0.005741,-0.000776,-0.00152,0.009702,-0.007817,0.001017,-0.014863,0.009951,-0.000982,-0.013537,0.002709,0.014539,-0.007789,0.031251,0.011054,-0.014146,-0.003772,0.002339,0.011464,-0.006964,0.023242,0.004164,0.004424,-0.001646,0.005231,-0.00919,-0.002553,0.002079,-0.02935,0.666667,0.166667,3.0,3.833333,"Electronics, Professional Services, Local Serv...",14,False,False,False,False,False,False,False,False,False,False,False,False,True,True,True,True,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,-000aQFeK6tqVLndf7xORg,0.010634,0.02208,0.014209,-0.017468,0.022044,-0.0014,-0.002871,0.023146,0.002384,0.001417,-0.010938,0.005055,0.027587,-0.017047,-0.013811,-0.002557,0.041712,0.002163,0.028847,0.000981,0.024016,0.0029,-0.005201,-0.009223,-0.002743,-0.017438,-0.003572,0.006089,0.010395,-0.013209,0.012886,-0.001177,-0.014673,0.022489,-0.009622,-0.008018,-0.018578,-0.016016,-0.012349,-0.014922,0.011807,0.012447,-0.011533,0.00777,0.021696,0.018031,-0.009846,0.018853,-0.006351,-0.024626,-0.005112,-0.009186,0.006745,-0.015911,0.015193,0.013527,0.023132,0.027268,-0.010801,0.012058,0.004636,0.011673,-0.029823,-0.026776,-0.001081,-0.013225,-0.019039,0.0359,0.018251,-0.018557,-0.02795,0.005416,-0.006096,0.014236,-0.016713,0.007266,-0.006211,0.013172,-0.027839,-0.01764,-0.010107,-0.012879,0.005412,-0.002554,0.003613,-0.000204,0.0191,0.015422,0.012548,0.025712,0.003411,-0.020152,0.010858,-0.018758,-0.026953,0.018693,-0.008354,0.01096,-0.000944,-0.027852,-5.5e-05,-0.001036,0.008829,-0.023901,-0.004498,-0.012934,-0.001759,0.00228,-0.010393,-0.001095,0.009954,0.000122,0.017624,-0.028894,0.008357,-0.017014,-0.009783,0.018886,-0.009965,-0.015121,0.00761,0.025632,-0.005411,0.021959,-0.012702,-0.042302,0.004007,-0.010247,-0.025184,-0.011162,-0.01954,0.015698,-0.020799,-0.011743,0.032221,-0.039323,-0.014192,0.035332,-0.019248,0.007542,-0.012299,-0.021357,0.001803,0.011896,0.01258,-0.007564,0.010964,-0.029188,-0.01086,-0.021306,0.014168,-0.00815,-0.006527,0.007753,-0.003765,0.011067,-0.004922,-0.023077,-0.007052,0.015688,0.018018,-0.000261,0.004033,0.011681,0.028107,0.016449,0.012073,-0.003362,0.001624,-0.013004,-0.00061,0.007606,0.008367,0.000196,-0.00947,0.017505,-0.002442,0.008911,0.010356,0.003271,-0.006186,-0.001911,-0.012913,0.008914,-0.023485,-0.0192,-0.031872,-0.007105,-0.024002,-0.002116,0.016141,0.007223,-0.011142,0.018811,-0.043218,0.011272,0.010321,-0.018093,-0.021208,0.003711,-0.0045,-0.006089,-0.001891,-0.007828,-0.010498,-0.001914,-0.00019,0.015799,-0.006593,0.032376,-0.01564,-0.002543,9e-05,0.024382,-0.024875,0.017178,-0.026717,0.00886,0.015729,0.002536,0.019144,0.001338,0.00015,-0.023235,0.018891,-0.008904,0.031017,-0.009575,0.007091,0.021063,0.010742,0.015952,0.016892,-0.020155,0.000906,0.037056,-0.00926,-0.009961,0.015665,-0.015672,0.006226,-0.025471,-0.019,0.001999,-0.01858,-0.010708,0.013912,-0.021817,-0.001499,0.009913,-0.022437,-0.002935,0.003617,-0.007614,-0.016705,-0.02221,0.031644,3.8e-05,-0.042605,-0.019705,0.005296,0.004769,-0.001515,0.01438,0.000286,-0.025709,0.018251,-0.010956,-0.004383,-0.01399,0.013778,-0.006711,0.002063,0.009063,-0.001155,-0.017032,0.008638,-0.022255,0.014825,0.002668,-0.017494,0.010238,0.009991,-0.01573,0.032765,0.01212,-0.005577,-0.006436,0.006122,0.005328,-1.9e-05,0.027976,0.005147,0.000412,0.008574,0.0186,-0.008427,-0.000187,0.014652,-0.030129,0.666667,0.0,0.0,5.0,"Automotive, Auto Repair",7,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False


In [35]:
# Clean

# Remove rows with NaNs
print('Before: ', len(all_features_business))
all_features_business = all_features_business.dropna(axis=0)
print('After:  ', len(all_features_business))

Before:  13943
After:   13922


In [236]:
# First, shuffle the dataframe 
all_features_business = all_features_business.sample(frac=1)

# Create final y and x 
y_df = all_features_business[all_cats]
x_cols = [ele for ele in all_features_business.columns if ele not in all_cats+['categories', 'business_id']]
# May also want to remove from x_cols: 'cool', 'funny', 'useful', 'stars', 'categories', 'review_count' 
# May also want to drop rows that do not contain more than a threshold number of reviews (20?, 100?)
x_df = all_features_business[x_cols]

# Numpy arrays
x = x_df.values
y = y_df.values

# Classifier wants 1/0, not T/F
y = y.astype(int)

# Split into Train/Test sets
def splitSets(x, y, test_size=0.2):
    test_size_absolute = np.int(test_size * len(x))
    X_test, X_train = x[:test_size_absolute,:], x[test_size_absolute:,:]
    y_test, y_train = y[:test_size_absolute,:], y[test_size_absolute:,:]
    return X_train, X_test, y_train, y_test
    
test_size = 0.2
X_train, X_test, y_train, y_test = splitSets(x, y, test_size=test_size)

In [237]:
y_test

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 1, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

# Category Prediction

In [238]:
# Multilabel Classification
# RandomForestClassifier supports multilabel classification

# Most other classifiers will require use of 
    # sklearn.multioutput.MultiOutputClassifier to run a separate classifier model for each targe
    
from sklearn.ensemble import RandomForestClassifier

In [239]:
rfc = RandomForestClassifier(n_estimators=10, n_jobs=-1)

In [240]:
rfc.fit(X_train,y_train)

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=-1,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)

## Recall (and other classification metrics)

In our case we want a Recall = TPR (True Positive Rate) close to 1 since we want to Recall ALL correct categories. 

The only requirement we have for Precision is that it be less than 1. This is because we want some FPs (False Positives) since these are what WE ARE RECOMMENDING!!

In [241]:
from sklearn.metrics import classification_report

y_predict = rfc.predict(X_test)
print(classification_report(y_test, y_predict, target_names=all_cats))

                                precision    recall  f1-score   support

                     Nightlife       0.77      0.33      0.46       220
                    SportsBars       1.00      0.03      0.05        40
                   Restaurants       0.91      0.86      0.88       966
                          Bars       0.74      0.28      0.40       253
         American(Traditional)       0.00      0.00      0.00         0
                         Pizza       1.00      0.33      0.49       122
                   HairRemoval       0.25      0.02      0.03        55
               NailTechnicians       0.00      0.00      0.00         5
                   Beauty&Spas       0.97      0.55      0.70       282
                    NailSalons       0.83      0.34      0.49        73
                        Waxing       0.50      0.03      0.06        34
                       DaySpas       0.00      0.00      0.00        50
                   Electronics       0.00      0.00      0.00  

  'precision', 'predicted', average, warn_for)
  'recall', 'true', average, warn_for)
  'precision', 'predicted', average, warn_for)


In [242]:
from sklearn.metrics import recall_score 

recall_all_cats = recall_score(y_test, y_predict, average=None)
recall_all_cats

  'recall', 'true', average, warn_for)


array([0.32727273, 0.025     , 0.86231884, ..., 0.        , 0.        ,
       0.        ])

## Top RECOMMENDATIONS

Look at the top NONMATCHING RESULTS, which are the top recommendations!

In [243]:
y_proba = rfc.predict_proba(X_test)

In [244]:
print( len(y_proba), ' L')
print( len(y_proba[0]), ' W')
print( len(y_proba[0][0]), " D (0: False prob'y, 1: True prob'y)")

1090  L
2784  W
2  D (0: False prob'y, 1: True prob'y)


In [245]:
y_proba[0][0]

array([1., 0.])

In [246]:
reccs_binary = (y_test == 0) & (y_predict == 1)
reccs_binary

array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]])

In [247]:
all_cats_ser = pd.Series(data=all_cats)

In [248]:
all_cats_true = []
all_cats_recc = []
for biz in range(len(y_test)):
    cats_true = ', '.join(list(all_cats_ser[y_test[biz,:]==1]))
    all_cats_true.append(cats_true)
    
    cats_recc = ', '.join(list(all_cats_ser[reccs_binary[biz,:]==True]))
    all_cats_recc.append(cats_recc)

reccs_df = pd.DataFrame(data=all_cats_true, columns=['Labeled'])
reccs_df['Recommended'] = all_cats_recc
reccs_df.tail()

Unnamed: 0,Labeled,Recommended
2779,"RealEstate, HomeServices, PropertyManagement, ...",
2780,"Health&Medical, Doctors, Obstetricians&Gynecol...",
2781,"LocalServices, SelfStorage",
2782,"Restaurants, Sandwiches",
2783,"Shopping, Fashion, Women'sClothing, Men'sCloth...",


In [249]:
reccs_df['categories'] = all_features_business['categories'].iloc[:len(reccs_df)]
reccs_df.tail()

Unnamed: 0,Labeled,Recommended,categories
2779,"RealEstate, HomeServices, PropertyManagement, ...",,
2780,"Health&Medical, Doctors, Obstetricians&Gynecol...",,
2781,"LocalServices, SelfStorage",,
2782,"Restaurants, Sandwiches",,"Arts & Entertainment, Lounges, Coffee & Tea, N..."
2783,"Shopping, Fashion, Women'sClothing, Men'sCloth...",,


In [None]:
break

In [None]:
# This is where I need to pick up. I need to match the dataframes 
# so that I can match reviews etc and judge how well the recommender is doing

In [232]:
list(all_features_business['categories'].tail())

['Nail Salons, Beauty & Spas',
 'Event Planning & Services, Hotels & Travel, Restaurants, Hotels',
 'Preschools, Education, Child Care & Day Care, Local Services',
 'Japanese, Restaurants, Sushi Bars',
 'Automotive, Commercial Truck Dealers, Commercial Truck Repair, Auto Repair, Body Shops']

In [221]:
reccs_df[reccs_df['Recommended']!=''].sort_values(by='Recommended')

Unnamed: 0,Labeled,Recommended
10202,"RealEstate, HomeServices, RealEstateServices, ...",ActiveLife
8046,"LocalServices, ChildCare&DayCare",ActiveLife
654,"LocalServices, Arts&Entertainment, CommunitySe...",ActiveLife
3366,"Electronics, Shopping, PhotographyStores&Services",ActiveLife
2815,"LocalServices, Shopping, Education, MusicalIns...",ActiveLife
10807,"EventPlanning&Services, Party&EventPlanning, P...",ActiveLife
10263,"Nightlife, Arts&Entertainment, ComedyClubs, Pe...",ActiveLife
1627,"Shopping, EventPlanning&Services, Education, M...",ActiveLife
8442,"Restaurants, Food, Sandwiches, FastFood",ActiveLife
8464,"Arts&Entertainment, PerformingArts",ActiveLife


In [222]:
# Given X_test row, find identical row in all_features_business and use that to find 'business_id'
all_features_business[x_cols].values == X_test[0]

array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False,  True]])

In [223]:
X_test.shape

(11138, 305)

In [224]:
all_features_business[x_cols].values.shape

(13922, 305)