<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Acknowledgements" data-toc-modified-id="Acknowledgements-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Acknowledgements</a></span></li><li><span><a href="#Prepare-data-and-model" data-toc-modified-id="Prepare-data-and-model-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Prepare data and model</a></span></li><li><span><a href="#Make-feature-matrix-(word2vec,-votes,-stars)" data-toc-modified-id="Make-feature-matrix-(word2vec,-votes,-stars)-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Make feature matrix (word2vec, votes, stars)</a></span></li><li><span><a href="#Create-Label-y-(Business-categories)" data-toc-modified-id="Create-Label-y-(Business-categories)-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Create Label y (Business categories)</a></span></li><li><span><a href="#Join-x,y-(feature-matrix,-category)-using-business_id" data-toc-modified-id="Join-x,y-(feature-matrix,-category)-using-business_id-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Join x,y (feature matrix, category) using business_id</a></span></li><li><span><a href="#Category-Prediction" data-toc-modified-id="Category-Prediction-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Category Prediction</a></span><ul class="toc-item"><li><span><a href="#Recall-(and-other-classification-metrics)" data-toc-modified-id="Recall-(and-other-classification-metrics)-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>Recall (and other classification metrics)</a></span></li><li><span><a href="#Top-RECOMMENDATIONS" data-toc-modified-id="Top-RECOMMENDATIONS-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>Top RECOMMENDATIONS</a></span></li></ul></li><li><span><a href="#Cluster-with-metadata-(useful,-cool,-funny,-stars)" data-toc-modified-id="Cluster-with-metadata-(useful,-cool,-funny,-stars)-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Cluster with metadata (useful, cool, funny, stars)</a></span></li></ul></div>

# Acknowledgements
Thanks to the tutorial: https://www.kaggle.com/c/word2vec-nlp-tutorial/overview/part-3-more-fun-with-word-vectors

# Prepare data and model

In [76]:
%matplotlib inline
import pandas as pd
pd.options.display.max_columns = 999
pd.options.display.max_rows=999
import numpy as np
import matplotlib.pyplot as plt

import re

import nltk
import nltk.data
nltk.download('stopwords')
from nltk.corpus import stopwords # Import the stop word list



[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/daviderickson/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [2]:
def load_reviews(size='small'): 
    if size == 'small':
        filename = r'../../data/small-review.json'
    elif size == 'intermediate':
        filename = r'../../data/intermediate-review.json'
    elif size == 'full':
        filename = r'../../data/review.json'
    new_list = []
    for line in open(filename):
       new_list.append(json.loads(line))
    return pd.DataFrame.from_records(new_list)

dfreviews = load_reviews(size='intermediate')

In [3]:
dfreviews.head()

Unnamed: 0,business_id,cool,date,funny,review_id,stars,text,useful,user_id
0,ujmEBvifdJM6h6RLv4wQIg,0,2013-05-07 04:34:36,1,Q1sbwvVQXV2734tPgoKj4Q,1.0,Total bill for this horrible service? Over $8G...,6,hG7b0MtEbXx5QzbzE6C_VA
1,NZnhc2sEQy3RmzKTZnqtwQ,0,2017-01-14 21:30:33,0,GJXCdrto3ASJOqKeVWPi6Q,5.0,I *adore* Travis at the Hard Rock's new Kelly ...,0,yXQM5uF2jS6es16SJzNHfg
2,WTqjgwHlXbSFevF32_DJVw,0,2016-11-09 20:09:03,0,2TzJjDVDEuAW6MR5Vuc1ug,5.0,I have to say that this office really has it t...,3,n6-Gk65cPZL6Uz8qRm3NYw
3,ikCg8xy5JIg_NGPx-MSIDA,0,2018-01-09 20:56:38,0,yi0R0Ugj_xUx_Nek0-_Qig,5.0,Went in for a lunch. Steak sandwich was delici...,0,dacAIZ6fTM6mqwW5uxkskg
4,b1b1eb3uo-w561D0ZfCEiQ,0,2018-01-30 23:07:38,0,11a8sVPMUFtaC7_ABRkmtw,1.0,Today was my second out of three sessions I ha...,7,ssoyf2_x0EQMed6fgHeMyQ


In [4]:
dfreviews.columns

Index(['business_id', 'cool', 'date', 'funny', 'review_id', 'stars', 'text',
       'useful', 'user_id'],
      dtype='object')

In [5]:
dfreviews['text'][0]

'Total bill for this horrible service? Over $8Gs. These crooks actually had the nerve to charge us $69 for 3 pills. I checked online the pills can be had for 19 cents EACH! Avoid Hospital ERs at all costs.'

In [6]:
# For simplicity, drop anything that isn't a letter
# Numbers and symbols may have interesting meaning and could be explore later

def lettersOnly(string):
    return re.sub("[^a-zA-Z]", " ", string) 

dfreviews['text'] = dfreviews['text'].apply(lettersOnly)


In [7]:
dfreviews['text'][0]

'Total bill for this horrible service  Over   Gs  These crooks actually had the nerve to charge us     for   pills  I checked online the pills can be had for    cents EACH  Avoid Hospital ERs at all costs '

In [8]:
def review_to_wordlist(string, remove_stopwords=False):
    string = re.sub("[^a-zA-Z]", " ", string) # keep only letters. more complex model possible later
    words =  string.lower().split() # make everything lowercase. split into words
    if remove_stopwords:
        stops = set(stopwords.words('english')) # create a fast lookup for stopwords
        words = [w for w in words if not w in stops] # remove stopwords
    return( words) # return a list of words
    
# dfreviews['text'] = dfreviews['text'].apply(review_to_words) # apply to reviews in dataframe


In [9]:
# Word2Vec expects single sentences, each one as a list of words

# Load the punkt tokenizer
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')

# Define a function to split a review into parsed sentences
def review_to_sentences( review, tokenizer, remove_stopwords=False ):
    # Function to split a review into parsed sentences. Returns a 
    # list of sentences, where each sentence is a list of words
    #
    # 1. Use the NLTK tokenizer to split the paragraph into sentences
    raw_sentences = tokenizer.tokenize(review.strip())
    #
    # 2. Loop over each sentence
    sentences = []
    for raw_sentence in raw_sentences:
        # If a sentence is empty, skip it
        if len(raw_sentence) > 0:
            # Otherwise, call review_to_wordlist to get a list of words
            sentences.append( review_to_wordlist( raw_sentence, \
              remove_stopwords ))
    #
    # Return the list of sentences (each sentence is a list of words,
    # so this returns a list of lists
    return sentences

In [10]:
sentences = []  # Initialize an empty list of sentences

print("Parsing sentences")
for review in dfreviews["text"]:
    sentences += review_to_sentences(review, tokenizer)

Parsing sentences


In [11]:
# Import the built-in logging module and configure it so that Word2Vec 
# creates nice output messages
import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',\
    level=logging.INFO)

# Set values for various parameters
num_features = 300    # Word vector dimensionality                      
min_word_count = 40   # Minimum word count                        
num_workers = 4       # Number of threads to run in parallel
context = 10          # Context window size                                                                                    
downsampling = 1e-3   # Downsample setting for frequent words

# Initialize and train the model (this will take some time)
from gensim.models import word2vec
print("Training model...")
model = word2vec.Word2Vec(sentences, workers=num_workers, \
            size=num_features, min_count = min_word_count, \
            window = context, sample = downsampling)

# If you don't plan to train the model any further, calling 
# init_sims will make the model much more memory-efficient.
model.init_sims(replace=True)

# It can be helpful to create a meaningful model name and 
# save the model for later use. You can load it later using Word2Vec.load()
model_name = "300features_40minwords_10context"
model.save(model_name)

2020-01-21 21:41:36,516 : INFO : 'pattern' package not found; tag filters are not available for English
2020-01-21 21:41:36,528 : INFO : collecting all words and their counts
2020-01-21 21:41:36,528 : INFO : PROGRESS: at sentence #0, processed 0 words, keeping 0 word types


Training model...


2020-01-21 21:41:36,729 : INFO : PROGRESS: at sentence #10000, processed 1088334 words, keeping 25539 word types
2020-01-21 21:41:36,911 : INFO : PROGRESS: at sentence #20000, processed 2172597 words, keeping 35463 word types
2020-01-21 21:41:37,086 : INFO : PROGRESS: at sentence #30000, processed 3251616 words, keeping 42649 word types
2020-01-21 21:41:37,292 : INFO : PROGRESS: at sentence #40000, processed 4373996 words, keeping 48893 word types
2020-01-21 21:41:37,474 : INFO : PROGRESS: at sentence #50000, processed 5471587 words, keeping 53964 word types
2020-01-21 21:41:37,649 : INFO : PROGRESS: at sentence #60000, processed 6570064 words, keeping 58362 word types
2020-01-21 21:41:37,832 : INFO : PROGRESS: at sentence #70000, processed 7667364 words, keeping 62704 word types
2020-01-21 21:41:38,004 : INFO : PROGRESS: at sentence #80000, processed 8768955 words, keeping 66443 word types
2020-01-21 21:41:38,176 : INFO : PROGRESS: at sentence #90000, processed 9872097 words, keeping 

2020-01-21 21:42:20,436 : INFO : EPOCH 5 - PROGRESS: at 20.53% examples, 789415 words/s, in_qsize 7, out_qsize 0
2020-01-21 21:42:21,448 : INFO : EPOCH 5 - PROGRESS: at 30.26% examples, 772301 words/s, in_qsize 7, out_qsize 0
2020-01-21 21:42:22,451 : INFO : EPOCH 5 - PROGRESS: at 40.21% examples, 777214 words/s, in_qsize 7, out_qsize 0
2020-01-21 21:42:23,452 : INFO : EPOCH 5 - PROGRESS: at 50.41% examples, 780440 words/s, in_qsize 7, out_qsize 0
2020-01-21 21:42:24,458 : INFO : EPOCH 5 - PROGRESS: at 60.76% examples, 784652 words/s, in_qsize 7, out_qsize 0
2020-01-21 21:42:25,474 : INFO : EPOCH 5 - PROGRESS: at 71.12% examples, 786514 words/s, in_qsize 7, out_qsize 0
2020-01-21 21:42:26,474 : INFO : EPOCH 5 - PROGRESS: at 79.34% examples, 768400 words/s, in_qsize 7, out_qsize 0
2020-01-21 21:42:27,475 : INFO : EPOCH 5 - PROGRESS: at 89.58% examples, 771905 words/s, in_qsize 7, out_qsize 0
2020-01-21 21:42:28,461 : INFO : worker thread finished; awaiting finish of 3 more threads
2020-

In [12]:
model.most_similar('pizza')

  """Entry point for launching an IPython kernel.


[('crust', 0.7070269584655762),
 ('pizzas', 0.6834176182746887),
 ('pepperoni', 0.6815404891967773),
 ('calzone', 0.6130505800247192),
 ('margherita', 0.6126407384872437),
 ('dough', 0.5469943284988403),
 ('mozzarella', 0.5322738885879517),
 ('slice', 0.5247739553451538),
 ('lasagna', 0.5243027210235596),
 ('meatball', 0.5210219621658325)]

In [13]:
model.most_similar('service')

  """Entry point for launching an IPython kernel.


[('waitstaff', 0.5219206809997559),
 ('staff', 0.46909892559051514),
 ('servers', 0.43119579553604126),
 ('hospitality', 0.41726261377334595),
 ('communication', 0.40516912937164307),
 ('execution', 0.4035489559173584),
 ('ambience', 0.401478111743927),
 ('consistently', 0.3970610499382019),
 ('value', 0.39341211318969727),
 ('bartenders', 0.3915520906448364)]

In [14]:
model.most_similar('bad')

  """Entry point for launching an IPython kernel.


[('terrible', 0.6198078393936157),
 ('horrible', 0.5894107222557068),
 ('good', 0.5501216650009155),
 ('poor', 0.544106125831604),
 ('awful', 0.5215410590171814),
 ('alright', 0.4838787317276001),
 ('disappointing', 0.4818119704723358),
 ('greatest', 0.4734565019607544),
 ('crappy', 0.45638489723205566),
 ('pathetic', 0.4444549083709717)]

In [15]:
import numpy as np  # Make sure that numpy is imported

def makeFeatureVec(words, model, num_features):
    # Function to average all of the word vectors in a given
    # paragraph
    #
    # Pre-initialize an empty numpy array (for speed)
    featureVec = np.zeros((num_features,),dtype="float32")
    #
    nwords = 0.
    # 
    # WV.Index2word is a list that contains the names of the words in 
    # the model's vocabulary. Convert it to a set, for speed 
    index2word_set = set(model.wv.index2word)
    #
    # Loop over each word in the review and, if it is in the model's
    # vocaublary, add its feature vector to the total
    for word in words:
        if word in index2word_set: 
            nwords = nwords + 1.
            featureVec = np.add(featureVec,model[word])
    # 
    # Divide the result by the number of words to get the average
    featureVec = np.divide(featureVec,nwords)
    return featureVec


def getAvgFeatureVecs(reviews, model, num_features):
    # Given a set of reviews (each one a list of words), calculate 
    # the average feature vector for each one and return a 2D numpy array 
    # 
    # Initialize a counter
    counter = int(0.)
    # 
    # Preallocate a 2D numpy array, for speed
    reviewFeatureVecs = np.zeros((len(reviews),num_features),dtype="float32")
    # 
    # Loop through the reviews
    for review in reviews:
       #
       # Print a status message every 1000th review
       if counter%1000. == 0.:
           print ("Review %d of %d" % (counter, len(reviews)))
       # 
       # Call the function (defined above) that makes average feature vectors
       reviewFeatureVecs[counter] = makeFeatureVec(review, model, \
           num_features)
       #
       # Increment the counter
       counter = counter + 1
    return reviewFeatureVecs

In [16]:
# ****************************************************************
# Calculate average feature vectors
# using the functions we defined above. Notice that we now use stop word
# removal.

clean_reviews = []
for review in dfreviews["text"]:
    clean_reviews.append( review_to_wordlist( review, \
        remove_stopwords=True ))

reviewDataVecs = getAvgFeatureVecs( clean_reviews, model, num_features )

Review 0 of 100000




Review 1000 of 100000
Review 2000 of 100000
Review 3000 of 100000
Review 4000 of 100000
Review 5000 of 100000
Review 6000 of 100000
Review 7000 of 100000
Review 8000 of 100000




Review 9000 of 100000
Review 10000 of 100000
Review 11000 of 100000
Review 12000 of 100000
Review 13000 of 100000
Review 14000 of 100000
Review 15000 of 100000
Review 16000 of 100000
Review 17000 of 100000
Review 18000 of 100000
Review 19000 of 100000
Review 20000 of 100000
Review 21000 of 100000
Review 22000 of 100000
Review 23000 of 100000
Review 24000 of 100000
Review 25000 of 100000
Review 26000 of 100000
Review 27000 of 100000
Review 28000 of 100000
Review 29000 of 100000
Review 30000 of 100000
Review 31000 of 100000
Review 32000 of 100000
Review 33000 of 100000
Review 34000 of 100000
Review 35000 of 100000
Review 36000 of 100000
Review 37000 of 100000
Review 38000 of 100000
Review 39000 of 100000
Review 40000 of 100000
Review 41000 of 100000
Review 42000 of 100000
Review 43000 of 100000
Review 44000 of 100000
Review 45000 of 100000
Review 46000 of 100000
Review 47000 of 100000
Review 48000 of 100000
Review 49000 of 100000
Review 50000 of 100000
Review 51000 of 100000
Review 52000

# Make feature matrix (word2vec, votes, stars)

In [17]:
reviewDataVecs.shape[1]

300

In [18]:
# Add non-text data back to feature matrix
review_features = ['cool', 'funny', 'useful', 'stars' , 'business_id']
all_features_labels = ['w2v{}'.format(idx) for idx in range(reviewDataVecs.shape[1])] + review_features
all_features = np.append(reviewDataVecs, dfreviews[review_features].to_numpy(), 1)


In [19]:
# Create df 
all_features_df = pd.DataFrame(data=all_features, columns=all_features_labels)

# Convert all but business_id to numerical
business_ids = all_features_df['business_id']
all_features_df = all_features_df.iloc[:,:-1].astype('float64')
all_features_df['business_id'] = business_ids
del business_ids

# Group by business_id
all_features_business = all_features_df.groupby(by='business_id').mean()

In [20]:
all_features_business.head()

Unnamed: 0_level_0,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars
business_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1,Unnamed: 49_level_1,Unnamed: 50_level_1,Unnamed: 51_level_1,Unnamed: 52_level_1,Unnamed: 53_level_1,Unnamed: 54_level_1,Unnamed: 55_level_1,Unnamed: 56_level_1,Unnamed: 57_level_1,Unnamed: 58_level_1,Unnamed: 59_level_1,Unnamed: 60_level_1,Unnamed: 61_level_1,Unnamed: 62_level_1,Unnamed: 63_level_1,Unnamed: 64_level_1,Unnamed: 65_level_1,Unnamed: 66_level_1,Unnamed: 67_level_1,Unnamed: 68_level_1,Unnamed: 69_level_1,Unnamed: 70_level_1,Unnamed: 71_level_1,Unnamed: 72_level_1,Unnamed: 73_level_1,Unnamed: 74_level_1,Unnamed: 75_level_1,Unnamed: 76_level_1,Unnamed: 77_level_1,Unnamed: 78_level_1,Unnamed: 79_level_1,Unnamed: 80_level_1,Unnamed: 81_level_1,Unnamed: 82_level_1,Unnamed: 83_level_1,Unnamed: 84_level_1,Unnamed: 85_level_1,Unnamed: 86_level_1,Unnamed: 87_level_1,Unnamed: 88_level_1,Unnamed: 89_level_1,Unnamed: 90_level_1,Unnamed: 91_level_1,Unnamed: 92_level_1,Unnamed: 93_level_1,Unnamed: 94_level_1,Unnamed: 95_level_1,Unnamed: 96_level_1,Unnamed: 97_level_1,Unnamed: 98_level_1,Unnamed: 99_level_1,Unnamed: 100_level_1,Unnamed: 101_level_1,Unnamed: 102_level_1,Unnamed: 103_level_1,Unnamed: 104_level_1,Unnamed: 105_level_1,Unnamed: 106_level_1,Unnamed: 107_level_1,Unnamed: 108_level_1,Unnamed: 109_level_1,Unnamed: 110_level_1,Unnamed: 111_level_1,Unnamed: 112_level_1,Unnamed: 113_level_1,Unnamed: 114_level_1,Unnamed: 115_level_1,Unnamed: 116_level_1,Unnamed: 117_level_1,Unnamed: 118_level_1,Unnamed: 119_level_1,Unnamed: 120_level_1,Unnamed: 121_level_1,Unnamed: 122_level_1,Unnamed: 123_level_1,Unnamed: 124_level_1,Unnamed: 125_level_1,Unnamed: 126_level_1,Unnamed: 127_level_1,Unnamed: 128_level_1,Unnamed: 129_level_1,Unnamed: 130_level_1,Unnamed: 131_level_1,Unnamed: 132_level_1,Unnamed: 133_level_1,Unnamed: 134_level_1,Unnamed: 135_level_1,Unnamed: 136_level_1,Unnamed: 137_level_1,Unnamed: 138_level_1,Unnamed: 139_level_1,Unnamed: 140_level_1,Unnamed: 141_level_1,Unnamed: 142_level_1,Unnamed: 143_level_1,Unnamed: 144_level_1,Unnamed: 145_level_1,Unnamed: 146_level_1,Unnamed: 147_level_1,Unnamed: 148_level_1,Unnamed: 149_level_1,Unnamed: 150_level_1,Unnamed: 151_level_1,Unnamed: 152_level_1,Unnamed: 153_level_1,Unnamed: 154_level_1,Unnamed: 155_level_1,Unnamed: 156_level_1,Unnamed: 157_level_1,Unnamed: 158_level_1,Unnamed: 159_level_1,Unnamed: 160_level_1,Unnamed: 161_level_1,Unnamed: 162_level_1,Unnamed: 163_level_1,Unnamed: 164_level_1,Unnamed: 165_level_1,Unnamed: 166_level_1,Unnamed: 167_level_1,Unnamed: 168_level_1,Unnamed: 169_level_1,Unnamed: 170_level_1,Unnamed: 171_level_1,Unnamed: 172_level_1,Unnamed: 173_level_1,Unnamed: 174_level_1,Unnamed: 175_level_1,Unnamed: 176_level_1,Unnamed: 177_level_1,Unnamed: 178_level_1,Unnamed: 179_level_1,Unnamed: 180_level_1,Unnamed: 181_level_1,Unnamed: 182_level_1,Unnamed: 183_level_1,Unnamed: 184_level_1,Unnamed: 185_level_1,Unnamed: 186_level_1,Unnamed: 187_level_1,Unnamed: 188_level_1,Unnamed: 189_level_1,Unnamed: 190_level_1,Unnamed: 191_level_1,Unnamed: 192_level_1,Unnamed: 193_level_1,Unnamed: 194_level_1,Unnamed: 195_level_1,Unnamed: 196_level_1,Unnamed: 197_level_1,Unnamed: 198_level_1,Unnamed: 199_level_1,Unnamed: 200_level_1,Unnamed: 201_level_1,Unnamed: 202_level_1,Unnamed: 203_level_1,Unnamed: 204_level_1,Unnamed: 205_level_1,Unnamed: 206_level_1,Unnamed: 207_level_1,Unnamed: 208_level_1,Unnamed: 209_level_1,Unnamed: 210_level_1,Unnamed: 211_level_1,Unnamed: 212_level_1,Unnamed: 213_level_1,Unnamed: 214_level_1,Unnamed: 215_level_1,Unnamed: 216_level_1,Unnamed: 217_level_1,Unnamed: 218_level_1,Unnamed: 219_level_1,Unnamed: 220_level_1,Unnamed: 221_level_1,Unnamed: 222_level_1,Unnamed: 223_level_1,Unnamed: 224_level_1,Unnamed: 225_level_1,Unnamed: 226_level_1,Unnamed: 227_level_1,Unnamed: 228_level_1,Unnamed: 229_level_1,Unnamed: 230_level_1,Unnamed: 231_level_1,Unnamed: 232_level_1,Unnamed: 233_level_1,Unnamed: 234_level_1,Unnamed: 235_level_1,Unnamed: 236_level_1,Unnamed: 237_level_1,Unnamed: 238_level_1,Unnamed: 239_level_1,Unnamed: 240_level_1,Unnamed: 241_level_1,Unnamed: 242_level_1,Unnamed: 243_level_1,Unnamed: 244_level_1,Unnamed: 245_level_1,Unnamed: 246_level_1,Unnamed: 247_level_1,Unnamed: 248_level_1,Unnamed: 249_level_1,Unnamed: 250_level_1,Unnamed: 251_level_1,Unnamed: 252_level_1,Unnamed: 253_level_1,Unnamed: 254_level_1,Unnamed: 255_level_1,Unnamed: 256_level_1,Unnamed: 257_level_1,Unnamed: 258_level_1,Unnamed: 259_level_1,Unnamed: 260_level_1,Unnamed: 261_level_1,Unnamed: 262_level_1,Unnamed: 263_level_1,Unnamed: 264_level_1,Unnamed: 265_level_1,Unnamed: 266_level_1,Unnamed: 267_level_1,Unnamed: 268_level_1,Unnamed: 269_level_1,Unnamed: 270_level_1,Unnamed: 271_level_1,Unnamed: 272_level_1,Unnamed: 273_level_1,Unnamed: 274_level_1,Unnamed: 275_level_1,Unnamed: 276_level_1,Unnamed: 277_level_1,Unnamed: 278_level_1,Unnamed: 279_level_1,Unnamed: 280_level_1,Unnamed: 281_level_1,Unnamed: 282_level_1,Unnamed: 283_level_1,Unnamed: 284_level_1,Unnamed: 285_level_1,Unnamed: 286_level_1,Unnamed: 287_level_1,Unnamed: 288_level_1,Unnamed: 289_level_1,Unnamed: 290_level_1,Unnamed: 291_level_1,Unnamed: 292_level_1,Unnamed: 293_level_1,Unnamed: 294_level_1,Unnamed: 295_level_1,Unnamed: 296_level_1,Unnamed: 297_level_1,Unnamed: 298_level_1,Unnamed: 299_level_1,Unnamed: 300_level_1,Unnamed: 301_level_1,Unnamed: 302_level_1,Unnamed: 303_level_1,Unnamed: 304_level_1
--I7YYLada0tSLkORTHb5Q,-0.013726,-0.000967,0.009651,-0.00913,0.016051,0.000565,-0.006745,0.006949,0.021217,0.017539,0.002967,-0.003814,0.016394,0.011443,-0.010743,0.015422,-0.007492,0.006412,-0.018508,-0.003763,-0.010258,-0.034641,0.003971,0.012807,0.020696,0.008306,0.000858,0.016784,-0.017261,-0.019222,0.018138,0.002409,0.030819,0.007352,-0.009905,0.003296,0.009896,0.020073,0.009849,-0.011995,-0.003346,0.018278,0.011761,-0.008173,-0.002904,0.000559,-0.023254,-0.000756,0.014323,0.021623,0.009647,-0.000923,0.018409,0.00414,-0.01598,0.003323,-0.012654,0.005351,-0.01058,0.004594,-0.014951,0.002638,-0.001018,0.019544,0.009782,0.034841,-0.003797,0.001427,-0.016187,-0.008701,-0.004137,-0.003884,0.012173,-0.004549,-0.008592,-0.019691,-0.01144,-0.001867,0.003111,-0.013886,0.008983,0.011444,0.00786,0.005983,0.007405,0.007301,0.01534,-0.007426,0.025241,-0.018652,-0.003055,0.006571,-0.003629,0.005686,0.008752,0.000279,-0.017226,0.01451,-0.003971,-0.003905,0.016058,-0.022591,0.009685,0.020854,-0.006142,-0.005774,0.009259,0.008155,-0.007269,-0.007763,0.006829,-0.021465,-0.007562,0.003837,-0.003232,-0.010939,-0.001754,-0.019062,-0.013329,-0.009903,-0.001265,0.021219,-0.003071,0.002521,-0.009624,-0.006912,-0.008346,0.018096,-0.008363,-0.004684,0.006336,0.01281,-0.000807,0.008201,0.012035,-0.009213,-0.009921,0.007764,0.008899,0.007216,-0.005796,0.006133,0.013801,-0.015935,-0.007177,0.002236,0.003129,-0.012927,-0.006159,0.004225,0.026937,0.001358,0.012673,-0.004315,-0.002035,0.008295,-0.001315,0.005393,0.003733,0.024581,-0.01967,0.008099,0.003728,-0.018815,0.00101,-0.008273,0.00078,0.007821,-0.010501,0.013363,-0.017681,-0.014448,0.004403,0.000807,-0.007189,0.00873,0.000223,0.014109,0.030282,-0.002383,-0.013673,-0.0035,-0.00739,0.013017,-0.007018,-0.018764,0.004093,0.001161,-0.003004,0.00046,-0.01164,-0.014427,0.006012,-0.003372,0.008276,0.000163,0.005292,0.009312,0.005778,0.01478,0.010748,0.006528,-0.011686,-0.003398,-0.008658,0.005432,0.016919,-0.006005,-0.005508,-0.001077,0.001628,0.004388,-0.006823,0.00876,-0.006772,-0.00931,-0.002256,0.013054,-0.005372,0.019703,-0.004731,-0.01366,-0.003449,-0.00618,0.001936,0.002446,-0.010641,-0.010339,-0.008864,0.019644,-0.009308,-0.014862,0.018339,0.021996,-0.015752,0.000254,-0.016399,-0.011201,0.006879,-0.002404,-0.002863,0.009902,-0.020603,0.010032,0.000563,-0.035289,-0.00262,0.005378,0.001884,-0.026027,0.000318,-0.01816,0.013455,0.03332,0.021081,-0.007704,0.010812,0.000512,-0.000615,0.00342,0.005232,-0.009454,0.020532,-0.004067,0.006363,-2e-05,-0.010406,-0.024523,-0.008722,-0.002289,0.00773,-0.001044,-0.00901,-0.004684,-0.012579,0.006227,0.002931,-0.022759,0.003203,-0.00883,0.006987,-0.013734,-0.02531,-0.002771,-0.002157,-0.018193,-0.002759,0.006644,-0.003563,-0.000256,0.004574,-0.013884,0.010145,-0.018473,0.009391,-0.010135,-0.01557,0.007416,0.000119,0.006864,0.352941,0.352941,0.823529,3.647059
--U98MNlDym2cLn36BBPgQ,-0.005564,-0.000944,-0.00348,0.002853,0.007491,-0.00066,-0.003497,0.002325,0.026685,0.005464,-0.007248,-0.000974,0.018046,0.008357,-0.019039,0.008971,-0.001258,0.003943,-0.025755,-0.00109,0.014484,-0.030295,0.021154,0.005877,0.010187,0.013338,0.007523,0.01294,-0.014777,-0.024164,0.0411,-0.00077,0.007845,0.014877,-0.005499,-0.015006,0.019996,0.004474,0.001769,-0.011596,0.013088,0.00664,-0.003198,-0.006436,0.00995,-0.001843,-0.026097,-0.008505,0.000224,0.004894,0.008225,-0.003288,0.005818,-0.010148,-0.010819,0.005193,-0.005695,0.016921,-0.004829,0.016215,-0.020262,-0.007885,0.012467,0.021577,0.002958,0.00394,0.003204,-0.009085,-0.006976,0.004098,-0.008992,-0.013135,-0.000485,0.005959,-0.002578,-0.020665,-0.020021,-0.001745,0.00531,0.005375,0.026599,-0.002281,0.002476,-0.008919,0.011867,0.011753,0.013241,0.007646,0.015286,-0.019387,-0.000861,0.00991,0.004722,0.006433,0.019949,-0.003987,0.005567,0.023403,-0.00255,-0.009002,0.005454,-0.006714,0.007594,0.016321,-0.000346,-0.000252,0.014122,-0.007365,-0.023506,-0.004877,-0.003139,-0.01621,0.002644,-0.010215,0.010724,-0.00731,0.005211,-0.016495,-0.010979,0.0036,-0.009721,0.012106,-0.00903,0.006079,-0.013465,-0.004483,-0.001365,0.012768,-0.013028,0.000788,0.007313,0.012509,-0.006384,-0.004753,0.008417,-0.011181,-0.01563,0.005005,0.003815,-0.000974,-0.009535,-0.000988,0.012781,-0.020511,-0.006674,0.011453,-0.018501,-0.005196,-0.003227,-0.010871,0.006396,0.004615,-0.000946,-0.004127,-0.003311,-0.000663,0.00251,0.008027,0.000299,0.012433,-0.013752,0.002294,0.004479,-0.019047,0.009406,-0.001238,-0.004585,0.009398,-0.012283,0.013845,-0.004649,-0.009626,0.009543,0.001538,-0.021331,0.012181,-0.00048,0.017829,0.008854,-0.011601,-0.002123,-0.013013,0.00407,-0.002811,-0.017783,-0.006919,0.009057,0.003706,0.001674,-0.004231,-0.022987,-0.002309,0.00444,-0.003228,-0.004219,0.003064,-0.009686,0.006762,0.000211,-0.00358,0.009524,0.005179,-0.007249,0.001858,-0.012122,0.002947,0.008471,0.001587,-0.014388,-0.004259,0.000955,0.00248,-0.0132,0.010114,-0.004476,-0.002501,-0.009151,-0.011466,-0.009791,0.025125,-0.009181,-0.003719,-0.002027,-0.002515,-0.009595,-0.010926,-0.008428,-0.013462,-0.012293,0.012809,-0.01791,-0.001546,0.007263,0.005538,0.001986,-0.011381,-0.00802,-0.013125,0.011645,-0.010927,0.019354,-7.5e-05,-0.012854,-0.002883,0.001277,-0.026105,0.008271,-0.000996,0.004661,-0.016013,-0.002846,-0.014104,0.00492,0.017741,0.011252,-0.002487,0.005355,-0.001425,0.006644,0.007952,0.020606,-0.016269,0.022072,-0.009253,0.004192,0.000376,-0.002333,-0.025644,0.001453,-0.000391,0.011531,0.005834,-0.006042,-0.016676,-0.004796,0.010255,0.001142,-0.019872,-0.000122,-0.00659,0.008564,0.011062,-0.006372,0.000337,0.003716,-0.003676,-0.002182,-0.006977,-0.002101,-0.021149,-0.00186,-0.015195,0.015674,-0.015427,-0.009989,-0.003153,-0.007485,-0.008983,0.005612,-0.005011,0.0,0.0,2.0,3.0
--j-kaNMCo1-DYzddCsA5Q,-0.011408,0.004498,-0.005131,0.040393,0.015439,-0.023149,-0.018121,-0.01275,0.048993,-0.003531,-0.024677,-0.032462,-0.015342,0.000457,-0.004219,0.025674,-0.009732,0.007939,0.027066,-0.026292,-0.020693,0.019704,-0.00073,0.006854,0.021814,0.014953,-0.006072,0.02136,-0.024738,-0.02165,-0.003612,0.00143,0.030509,-0.009074,-0.037931,0.03775,-0.011981,0.023262,-0.019438,-0.011353,0.015401,0.012555,-0.026999,-0.012126,0.040942,0.043306,-0.030898,-0.00164,0.025986,-0.013791,-0.013998,0.011637,0.01525,0.014251,0.015818,0.01556,-0.003851,0.014891,0.016089,-0.019991,0.0242,-0.007817,-0.010958,0.026268,0.014333,0.027851,0.021072,0.000597,-0.009619,0.007727,-0.019088,-0.011223,-0.016202,0.008886,-0.018291,0.008942,-0.006631,-0.003544,0.037693,0.03982,0.012824,0.006585,0.002338,0.000905,-0.000615,0.025102,-0.00569,0.039397,0.035818,0.02456,-0.000305,0.018925,-0.020685,-0.029385,-0.015098,-0.010915,0.013361,0.028953,0.000481,-0.005198,-0.00643,-0.033768,0.00138,0.000388,-0.032795,-0.008521,0.005831,-0.005367,-0.006409,-0.059394,0.005388,-0.017854,-0.010545,0.018931,-0.006061,-0.007233,-0.002718,-0.006306,0.001461,-0.003748,-0.001259,-0.005952,-0.000632,-0.013196,-0.000417,-0.010504,0.00345,0.02094,-0.019679,-0.029353,0.01774,-0.010969,-0.019441,0.0028,0.002309,0.004318,0.001813,0.019408,-0.01001,-0.021436,0.028445,0.015566,0.00528,0.009365,-0.015892,-0.001641,-0.001963,-0.009057,0.003751,0.02662,0.045652,-0.016511,0.03674,-0.036992,0.019457,0.005505,-0.00045,0.039375,0.028597,0.012334,-0.017266,-0.001321,0.017346,-0.028994,0.000102,-0.022917,0.022877,-0.006573,-0.018327,0.005066,0.007471,-0.014181,0.013244,-0.002325,0.013869,0.012507,-0.013054,-0.003175,0.022951,0.000244,-0.008114,0.039611,-0.003074,-0.033306,-0.010178,-0.00323,0.014981,0.004657,0.003973,-0.001902,0.000326,-0.011839,0.027661,-0.002376,0.023028,0.010613,-0.007019,0.020521,0.006992,-0.00388,0.008437,0.012277,-0.036878,-0.002529,-0.009768,-0.006173,0.025762,0.037304,-0.002869,0.005554,-0.013933,0.008529,-0.020774,0.012585,0.008027,-0.003281,-0.008263,0.005063,0.016019,0.007121,-0.011471,-0.020077,0.016207,-0.024511,0.0003,-0.012528,0.004442,-0.008622,0.001793,0.000132,-0.013576,-0.018616,0.014267,0.01957,-0.055023,0.009012,0.001078,-0.008804,0.005712,-0.004416,0.011757,0.002296,-0.049666,0.007852,-0.028842,0.02521,-0.018244,-0.029429,0.005054,-0.01895,0.003347,-0.020108,0.01701,0.01176,0.004549,-0.037733,-0.02671,-0.008685,0.013084,0.022865,-0.010566,0.001083,0.040527,0.006276,0.009053,0.002601,-0.033974,-0.044425,-0.013743,-0.018428,0.009389,-0.031036,0.012233,0.012775,0.02045,0.001214,-0.01151,0.000485,-0.004856,0.010159,0.004035,-0.000834,-0.022557,0.011678,0.022661,-0.029659,-0.002098,0.005847,-0.001321,0.004675,-0.027056,-0.014537,0.033893,-0.032624,0.037274,-0.004122,-0.024268,-0.00463,-0.013603,0.012318,0.0,0.0,0.0,5.0
--wIGbLEhlpl_UeAIyDmZQ,0.03603,-0.004617,-0.013974,0.020747,0.038429,-0.005669,-0.002198,0.015588,-0.002853,-0.009565,-0.002643,-0.014395,-0.012292,0.00845,0.007603,-0.003827,0.021573,0.008389,-0.002452,0.010293,0.022692,0.002092,-0.011285,0.0121,-0.02604,0.003841,-0.000393,0.008389,-0.013863,-0.019479,0.035048,0.008522,-0.007378,0.004424,-0.009327,-0.014849,0.0085,-0.023346,-0.022681,-0.015705,0.023771,-0.001025,-0.022194,0.019416,0.001624,0.011531,-0.015406,-0.00171,0.009043,0.00396,-0.006161,-0.000354,-0.00407,0.00208,0.008475,-0.012978,-0.008557,0.008782,-0.007786,0.002105,-0.015182,0.015293,0.009434,-0.001137,0.007613,-0.016997,0.002806,-0.018171,-0.011341,0.022138,-0.023217,-0.013575,-0.016987,-0.003429,0.008313,-0.000669,-0.021147,0.000362,0.005093,0.021298,0.014194,-0.015987,-0.020088,-0.006929,-0.010896,0.006858,0.005904,0.031907,0.025973,-0.01457,0.002702,0.020056,-0.01249,-0.010422,0.011489,-0.012906,0.011035,0.03269,-0.011232,0.001898,0.01185,0.01813,-0.012305,-0.008048,0.00629,0.000533,0.01797,-0.003583,-0.01239,-0.017582,-0.006649,0.002712,-0.007194,-0.01096,0.008641,0.001024,-0.003371,-0.013689,-0.002284,0.01018,-0.005039,-0.01597,-0.004896,0.004551,0.009634,-0.001396,-0.004828,0.017685,-0.008758,-0.018465,0.010553,0.004184,-0.00454,-0.003628,-0.002454,-0.006859,8.7e-05,0.007116,0.008144,-0.013943,0.007145,0.003522,-0.001099,-0.000666,0.00379,0.006113,0.003153,0.005155,0.006425,-0.013315,-0.012087,-0.012384,0.007106,0.002679,0.010285,-0.005324,-0.004294,0.017046,0.011173,0.003112,-0.012589,-0.014093,0.008187,-0.009397,0.014803,0.003589,0.022513,-0.011319,0.02168,-0.000437,0.001318,-0.009923,0.007855,-0.006369,-0.016201,0.012389,0.005126,-0.000259,0.008205,0.002956,0.007049,-9e-06,-0.003225,-0.007604,-0.022469,0.012772,0.014779,-0.02159,0.001233,-0.000396,-0.005885,-0.002753,0.01871,-0.006596,-0.0002,0.002954,0.001937,0.008445,0.00241,-0.007021,0.010886,-0.017201,-0.007942,0.005915,-0.014905,-0.00642,-0.003849,0.01359,-0.002348,0.005322,0.000939,0.015895,-0.042152,-0.007171,0.006589,0.006256,-0.023661,-0.000313,0.015247,-0.00439,-0.034916,-0.001136,-0.02216,0.001242,-0.010273,-0.014189,0.002097,-0.003391,-0.000624,0.015829,-0.023511,-0.009568,-0.015818,-0.016161,0.001785,-0.006375,-0.00487,-0.010851,-0.007529,-0.019133,0.006462,-0.022415,-0.008734,-0.018169,-0.012631,0.023061,0.013576,-0.019669,-0.00891,0.005105,0.022213,-0.013286,0.011676,-0.006114,-0.018936,-0.007927,-0.00477,0.010402,0.00524,-0.000165,-0.000383,0.002561,0.001053,-0.001729,-0.004861,0.000307,0.008562,-0.008165,-0.000832,-0.009696,0.006255,0.009771,-0.002139,-0.000518,0.000565,0.007697,-0.014454,-0.003493,-0.021064,0.001639,0.001644,0.004618,0.002439,0.005728,0.024569,-0.016353,-0.008612,-0.008367,0.003,-0.001397,-0.009965,0.001797,0.001847,0.004512,-0.01266,0.014142,-0.015593,-0.015314,0.012027,-0.008484,0.666667,0.166667,3.0,3.833333
-000aQFeK6tqVLndf7xORg,0.037063,-0.009315,-0.026367,0.034366,0.063646,0.002534,-0.017493,0.019709,0.01087,-0.006368,-0.008021,-0.0156,-0.005306,0.002922,-0.01058,0.001384,0.017555,0.014387,0.016768,-0.00018,0.016056,0.006687,-0.007475,0.019319,-0.027481,0.013232,0.011752,0.003663,-0.015931,-0.024199,0.029718,0.011359,-0.004956,0.005804,-0.020293,0.001284,0.015996,-0.029603,-0.041141,-0.014429,0.021545,0.000492,-0.034298,0.019427,0.002762,0.021009,-0.032378,-0.004774,0.017271,0.00297,0.004246,0.006514,-0.007337,-0.001333,0.012854,-0.010315,-0.001858,0.008479,0.000681,-0.006574,-0.004562,0.022622,0.008401,0.006384,0.00242,-0.019281,-0.005052,-0.010938,0.002747,0.018117,-0.034967,-0.018739,-0.013164,0.006984,0.014279,0.002119,-0.02226,0.005805,0.006742,0.033168,0.012665,0.004269,-0.013083,-0.006228,-0.022316,0.008169,0.00291,0.05077,0.036307,-0.018526,0.012091,0.01554,-0.017117,-0.003561,0.023017,-0.012839,0.017719,0.033495,-0.00655,0.005886,0.021863,0.021454,-0.018608,-0.001935,0.002049,0.001544,0.022386,-0.000454,-0.011431,-0.022651,0.000362,0.004123,-0.00905,-0.001328,0.007845,0.000739,0.006232,-0.007364,0.002612,0.00221,0.004153,-0.034616,0.001015,-0.005685,0.01145,-0.008641,-0.010007,0.018973,-0.011076,-0.016471,0.005733,-0.004936,-0.01818,-0.001098,-0.001558,-0.000404,0.012638,0.004509,0.00615,-0.017909,0.017598,0.01067,-0.005524,0.016061,-0.002045,0.022372,0.01295,0.012163,0.028965,-0.011939,0.006377,-0.000237,0.015302,-0.00127,0.008944,-0.011075,-0.012375,0.01696,0.010727,0.009859,-0.031381,-0.000325,0.010991,-0.011284,0.011155,-0.004994,0.031776,-0.00276,0.022864,0.008213,0.005658,-0.015535,0.008584,-0.00343,-0.021115,0.014513,-0.010297,0.000324,0.016335,0.005401,0.009448,0.002699,-0.01129,-0.012861,-0.011274,-0.00531,0.018629,-0.011298,0.007405,0.000958,-0.010471,-0.00069,0.018937,-0.009247,-0.000434,0.004826,-0.003482,0.020534,0.012395,-0.010455,0.010883,-0.022955,-0.01066,0.015764,-0.011173,0.000755,-0.005792,0.022928,0.00315,0.016144,0.010836,0.021907,-0.048891,-0.000169,0.005724,0.008939,-0.023912,0.012689,0.016462,-0.012716,-0.037714,0.000926,-0.029815,-0.008867,-0.018469,-0.01903,0.001562,-0.016534,0.009348,-0.001217,-0.030756,-0.015793,-0.009945,-0.001586,-0.018029,-0.008044,-0.006953,-0.018411,-0.013626,-0.021759,0.004791,-0.021245,-0.018038,-0.03359,-0.023835,0.024545,-0.000545,-0.017671,-0.002003,0.006448,0.030536,-0.029008,0.014564,-0.011317,-0.013324,-0.023969,-0.010229,0.008008,0.01544,0.021932,-0.009427,-1.3e-05,0.01224,0.014289,0.004621,-0.002303,0.013994,-0.013895,-0.006972,-0.010657,0.017143,0.020828,0.013929,0.007456,0.007746,-0.012964,-0.025667,0.002786,-0.020329,0.004065,0.015754,0.020314,0.008037,2.7e-05,0.034369,-0.016854,0.009845,-0.012831,0.00527,-0.01097,-0.006481,0.012282,0.006312,0.002442,0.007883,0.003245,-0.032445,-0.011714,0.013191,-0.002767,0.666667,0.0,0.0,5.0


In [21]:
all_features_business.describe()

Unnamed: 0,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars
count,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13943.0,13943.0,13943.0,13943.0
mean,0.016325,-0.004364,-0.006402,0.020158,0.029502,-0.003208,-0.010685,0.009376,0.012552,-0.004472,-0.00331,-0.014167,0.002102,0.004008,-0.007513,0.002721,0.002262,0.008271,-0.002091,0.001201,0.003657,-0.004931,-0.004659,0.012236,0.001064,0.00538,0.003229,0.015308,-0.0175,-0.02208,0.023508,0.001627,0.008613,0.010868,-0.015696,0.002348,0.012835,-0.004725,-0.011509,-0.008027,0.011702,0.002077,-0.016804,0.004758,0.010488,0.015464,-0.016128,-0.005858,0.007545,0.012074,-0.003229,0.003621,0.003007,0.008399,-0.004939,-0.007916,-0.014333,0.008893,0.001688,-0.002745,-0.005491,0.006055,0.000806,0.010479,0.010355,0.004473,0.001419,-0.006288,-0.00677,0.00828,-0.014861,-0.007812,-0.000345,-0.000589,-0.001102,-0.005828,-0.008541,-4.3e-05,0.01047,0.007231,0.013327,0.002172,-0.001125,-0.002084,0.00035,0.010422,0.006678,0.014141,0.022084,-0.012352,0.000125,0.014464,-0.006962,-0.002064,0.007533,-0.007547,0.00034,0.017195,-0.001856,-0.002323,0.006585,-0.00519,0.002508,0.009883,-0.005286,-0.001015,0.013699,-0.000954,-0.006634,-0.022509,0.00432,-0.006362,-0.008834,0.000226,0.00358,-0.001637,0.002604,-0.013077,-0.006838,-0.001755,-0.004074,-0.002535,-0.002303,0.000682,0.002816,-0.003984,-0.00812,0.014383,-0.011605,-0.007736,0.008098,0.008186,-0.006991,0.000786,0.005079,-0.005328,-0.007102,0.007856,0.001601,-0.002274,0.004999,0.005435,0.007235,-0.00343,-0.006605,0.006426,0.005171,-0.002945,0.003437,-0.000989,0.012936,-0.001124,0.009919,-0.010635,0.008095,3.2e-05,-0.002573,0.011731,0.010755,0.007755,-0.012716,-0.003414,0.007596,-0.015664,0.01329,-0.006062,0.013889,0.001042,-1.3e-05,0.006703,-0.005037,-0.012355,0.002775,-0.000866,-0.011181,0.010144,-0.003659,0.002984,0.014948,-0.002125,-0.003902,0.008244,-0.002106,-0.004093,-0.010134,-0.003427,0.00935,-0.006079,0.001406,0.002739,-0.005702,-0.007553,0.012649,-0.003805,0.005536,0.004437,-0.002156,0.011755,0.000147,-0.002476,0.00695,-3.4e-05,-0.011695,0.001917,-0.003531,0.002458,0.011911,0.006233,-0.003769,-0.000725,-0.001579,0.00786,-0.022498,0.002156,0.005403,0.001114,-0.008939,0.000973,0.005984,0.004714,-0.016299,-0.004099,-0.008188,-0.002256,-0.004915,-0.003994,-0.008201,-0.00672,-0.004167,0.011294,-0.015993,-0.010462,0.003714,0.003722,-0.016488,-0.002046,-0.006642,-0.009611,0.000742,-0.008056,0.00246,-0.005209,-0.017966,-0.005446,-0.006858,0.001209,0.005783,-0.007552,-0.002264,-0.007618,0.006727,-0.015642,0.011891,0.009692,-0.000371,-0.007003,-4.7e-05,0.002266,0.002852,0.00276,0.005937,-0.003825,0.013298,-0.004863,0.005719,-0.000142,-0.005432,-0.014547,-0.005443,-0.00758,0.007072,0.000137,-0.000767,-0.006952,0.001103,0.006491,-0.004917,-0.008838,-0.002564,0.001235,0.005916,0.003009,-0.006107,0.004903,0.009185,-0.014782,-0.000598,-0.000796,-0.001341,-0.000836,-0.009814,-0.00353,0.006524,-0.012717,0.005736,0.000154,-0.011608,-0.008251,0.006874,-0.002084,0.486991,0.423987,1.434996,3.615964
std,0.020554,0.012452,0.014713,0.018115,0.016154,0.016375,0.014017,0.012072,0.017694,0.014805,0.012081,0.013043,0.015999,0.009887,0.010943,0.015948,0.016816,0.010805,0.019317,0.011971,0.019381,0.02254,0.01433,0.010719,0.019765,0.012857,0.010735,0.011013,0.011178,0.011968,0.017179,0.00982,0.018909,0.012169,0.011306,0.013636,0.011342,0.023447,0.016721,0.011306,0.012687,0.012359,0.023068,0.018725,0.011798,0.018817,0.012682,0.013829,0.011311,0.016433,0.015707,0.012399,0.017634,0.011882,0.013756,0.017883,0.012234,0.010305,0.012433,0.010396,0.014381,0.013071,0.014185,0.011517,0.010877,0.022234,0.009885,0.015301,0.010162,0.015272,0.016318,0.010238,0.014695,0.009296,0.010829,0.017681,0.01492,0.012561,0.013988,0.020116,0.014862,0.011736,0.014181,0.014143,0.012466,0.010994,0.01311,0.023644,0.010428,0.012139,0.009228,0.016228,0.012725,0.015899,0.011339,0.013259,0.019478,0.01448,0.01064,0.009745,0.011242,0.018351,0.012597,0.012072,0.013552,0.011336,0.01272,0.012014,0.014945,0.014781,0.012127,0.012491,0.010273,0.011488,0.010633,0.010956,0.01197,0.011445,0.012978,0.014046,0.01294,0.019911,0.010413,0.013838,0.012718,0.010022,0.009834,0.010084,0.008063,0.01598,0.014206,0.01049,0.012687,0.013874,0.011176,0.012597,0.014283,0.00895,0.015268,0.009524,0.01188,0.014478,0.010808,0.013622,0.01072,0.011716,0.011737,0.01046,0.011349,0.014648,0.017057,0.011491,0.012712,0.011835,0.013107,0.011011,0.009211,0.013702,0.012745,0.012421,0.009939,0.011656,0.01337,0.012365,0.012303,0.010567,0.014304,0.010146,0.015899,0.011502,0.01598,0.010509,0.010762,0.014364,0.010368,0.010773,0.012851,0.012772,0.014983,0.011662,0.017801,0.015477,0.0087,0.018481,0.014031,0.011728,0.012535,0.011977,0.008994,0.01139,0.012383,0.015358,0.011328,0.01076,0.015521,0.011118,0.010124,0.011056,0.010779,0.017004,0.010101,0.016308,0.012653,0.012113,0.011694,0.01091,0.010882,0.015487,0.011973,0.013723,0.011185,0.014902,0.01743,0.00837,0.011059,0.01391,0.0153,0.012739,0.012899,0.013975,0.01562,0.010723,0.0141,0.009838,0.013218,0.014188,0.009491,0.011881,0.010988,0.014753,0.011313,0.01084,0.016657,0.015125,0.014965,0.011463,0.012535,0.01179,0.012287,0.012918,0.015016,0.013582,0.013511,0.019018,0.014519,0.026055,0.012715,0.014212,0.011216,0.011976,0.012583,0.012151,0.013909,0.019666,0.019604,0.012433,0.013876,0.011735,0.011831,0.012005,0.011337,0.011828,0.014168,0.011532,0.013186,0.009804,0.012762,0.013548,0.008838,0.01312,0.013056,0.014841,0.012631,0.013153,0.010179,0.012877,0.013392,0.016627,0.013472,0.012573,0.010218,0.013189,0.01921,0.011567,0.016143,0.009654,0.012193,0.013455,0.009357,0.016418,0.013683,0.015193,0.013677,0.015928,0.013133,0.011674,0.012714,0.0134,0.013025,0.013634,1.299472,1.070148,2.371442,1.277067
min,-0.073719,-0.068793,-0.125102,-0.110815,-0.077022,-0.07573,-0.094913,-0.077733,-0.099305,-0.117995,-0.068821,-0.099214,-0.067135,-0.062514,-0.06513,-0.125602,-0.067201,-0.055624,-0.073307,-0.065356,-0.078135,-0.092116,-0.067021,-0.072985,-0.078486,-0.058699,-0.092668,-0.047436,-0.091846,-0.097693,-0.061998,-0.050814,-0.073023,-0.076901,-0.075857,-0.067967,-0.049055,-0.107107,-0.085696,-0.06461,-0.052952,-0.055837,-0.146782,-0.104218,-0.053225,-0.067503,-0.095154,-0.112098,-0.04041,-0.074357,-0.068572,-0.055941,-0.087082,-0.104769,-0.07713,-0.103339,-0.148718,-0.074796,-0.084428,-0.056771,-0.065648,-0.068903,-0.071832,-0.051936,-0.069274,-0.075849,-0.049353,-0.073867,-0.05188,-0.091876,-0.079002,-0.078624,-0.073353,-0.070125,-0.075235,-0.07967,-0.108385,-0.067899,-0.065875,-0.10544,-0.069059,-0.059534,-0.066836,-0.094225,-0.060152,-0.066346,-0.060756,-0.093567,-0.032249,-0.089628,-0.057734,-0.060555,-0.082647,-0.066506,-0.049713,-0.08054,-0.120024,-0.059101,-0.057598,-0.06764,-0.065931,-0.114235,-0.060646,-0.060138,-0.065763,-0.060679,-0.05068,-0.083897,-0.09827,-0.116362,-0.055602,-0.071361,-0.076855,-0.063167,-0.047553,-0.065728,-0.070179,-0.081278,-0.089729,-0.088007,-0.096402,-0.084499,-0.060242,-0.080945,-0.059027,-0.061199,-0.067998,-0.036976,-0.118344,-0.089133,-0.075235,-0.046505,-0.078282,-0.066676,-0.073388,-0.078787,-0.097518,-0.070549,-0.073255,-0.062906,-0.078614,-0.063481,-0.07986,-0.059192,-0.064969,-0.060857,-0.0672,-0.066126,-0.046514,-0.069588,-0.085414,-0.08414,-0.049769,-0.115653,-0.055584,-0.062517,-0.050557,-0.109405,-0.048033,-0.061188,-0.075351,-0.069088,-0.050148,-0.083824,-0.080002,-0.068857,-0.060934,-0.057806,-0.074038,-0.07547,-0.103698,-0.066464,-0.069583,-0.077002,-0.090499,-0.068838,-0.076005,-0.078568,-0.041799,-0.074521,-0.083732,-0.075671,-0.053274,-0.078415,-0.088355,-0.065679,-0.044575,-0.060134,-0.058103,-0.069079,-0.05717,-0.072961,-0.070845,-0.054738,-0.077747,-0.068209,-0.059248,-0.042737,-0.07616,-0.087065,-0.066344,-0.081634,-0.09302,-0.062501,-0.06483,-0.055145,-0.07333,-0.067167,-0.055143,-0.064509,-0.050577,-0.063369,-0.114923,-0.04779,-0.051962,-0.070044,-0.099449,-0.104439,-0.0876,-0.082254,-0.085637,-0.057474,-0.090904,-0.062226,-0.070627,-0.079993,-0.125287,-0.077741,-0.061486,-0.059418,-0.084991,-0.079817,-0.118719,-0.072377,-0.095838,-0.061635,-0.065326,-0.099982,-0.062443,-0.083863,-0.088458,-0.072548,-0.089128,-0.092696,-0.096745,-0.119593,-0.052382,-0.074802,-0.081027,-0.104036,-0.06538,-0.084104,-0.082247,-0.072778,-0.083132,-0.063662,-0.085287,-0.061191,-0.080648,-0.059286,-0.044697,-0.071333,-0.058157,-0.091648,-0.097666,-0.062937,-0.076294,-0.079168,-0.061294,-0.103937,-0.059084,-0.090549,-0.053707,-0.076091,-0.060789,-0.065553,-0.071756,-0.095134,-0.057994,-0.060692,-0.061964,-0.059437,-0.13635,-0.058042,-0.080274,-0.078592,-0.068832,-0.05433,-0.063027,-0.070677,-0.090119,-0.076373,-0.082559,-0.129757,-0.099355,-0.067213,-0.089026,-0.089942,-0.059396,-0.092728,0.0,0.0,0.0,1.0
25%,-0.000546,-0.012085,-0.014651,0.006877,0.018292,-0.012245,-0.018752,0.002143,0.001716,-0.014701,-0.010641,-0.02256,-0.00984,-0.001791,-0.014448,-0.005668,-0.009667,0.002185,-0.016138,-0.006082,-0.010662,-0.023947,-0.013981,0.006981,-0.014114,-0.002546,-0.003099,0.008209,-0.024174,-0.029393,0.012966,-0.003986,-0.004708,0.003519,-0.022272,-0.005863,0.006185,-0.022004,-0.023501,-0.014775,0.003637,-0.006656,-0.032072,-0.009335,0.00327,0.003292,-0.024111,-0.013986,0.000327,0.002063,-0.013566,-0.00458,-0.008912,0.000974,-0.015029,-0.019704,-0.022078,0.002723,-0.005896,-0.008499,-0.013693,-0.002761,-0.007861,0.003377,0.003622,-0.012286,-0.004017,-0.015551,-0.013965,-0.002706,-0.026891,-0.01392,-0.010572,-0.005924,-0.007846,-0.019074,-0.017745,-0.007651,0.001763,-0.006785,0.004636,-0.005172,-0.011012,-0.011258,-0.008624,0.003989,-0.001878,-0.005288,0.016068,-0.01947,-0.005384,0.003134,-0.01516,-0.01256,0.001037,-0.015603,-0.013889,0.007748,-0.008032,-0.008576,-0.000295,-0.019079,-0.006176,0.002261,-0.012965,-0.007815,0.006096,-0.007802,-0.015544,-0.031315,-0.003489,-0.014115,-0.014786,-0.006988,-0.003035,-0.008558,-0.004276,-0.020175,-0.01514,-0.010658,-0.010864,-0.017093,-0.008528,-0.007411,-0.00651,-0.009924,-0.013897,0.00843,-0.016264,-0.017576,-0.001003,0.002,-0.01494,-0.008003,-0.0017,-0.013789,-0.017066,0.00297,-0.007691,-0.008189,-0.002672,-0.003243,0.000389,-0.013514,-0.012931,-0.001707,-0.002554,-0.009041,-0.004517,-0.010298,0.002734,-0.007299,0.001902,-0.017778,-0.001504,-0.007408,-0.008292,0.002974,0.002772,-0.000402,-0.018475,-0.01048,-0.001219,-0.022629,0.006593,-0.012222,0.004642,-0.004797,-0.012293,0.000154,-0.016296,-0.018244,-0.003148,-0.010594,-0.017652,0.003132,-0.011669,-0.004692,0.004887,-0.009833,-0.016181,-0.00124,-0.007114,-0.016286,-0.018767,-0.011067,0.001598,-0.013833,-0.004042,-0.004196,-0.013753,-0.018226,0.005799,-0.009734,-0.0038,-0.002911,-0.008268,0.005065,-0.005394,-0.014491,0.001497,-0.01099,-0.018656,-0.005516,-0.010075,-0.003414,0.004867,-0.003924,-0.011213,-0.008849,-0.008521,-0.002054,-0.035191,-0.002767,-0.002421,-0.009238,-0.019082,-0.006838,-0.003087,-0.004136,-0.026951,-0.010861,-0.016188,-0.00777,-0.013518,-0.012908,-0.013971,-0.014323,-0.011392,0.001659,-0.022878,-0.016921,-0.005671,-0.0067,-0.025529,-0.008778,-0.015209,-0.016334,-0.00745,-0.017172,-0.006637,-0.013966,-0.026076,-0.019867,-0.015949,-0.021245,-0.001809,-0.018143,-0.009421,-0.014818,-0.002181,-0.0226,0.003264,-0.005097,-0.015078,-0.014423,-0.009641,-0.005228,-0.004155,-0.004397,-0.000895,-0.011057,0.004407,-0.010854,-0.001398,-0.005349,-0.013223,-0.023273,-0.010521,-0.015164,-0.001287,-0.007554,-0.009038,-0.014909,-0.005469,-0.001503,-0.01376,-0.022154,-0.011103,-0.007917,8.5e-05,-0.005702,-0.018035,-0.001905,-0.002105,-0.02021,-0.00888,-0.008314,-0.007287,-0.009577,-0.018188,-0.013646,-0.000589,-0.023249,-0.002379,-0.007412,-0.019201,-0.015968,-0.001378,-0.010072,0.0,0.0,0.15251,3.0
50%,0.016733,-0.004203,-0.005613,0.019299,0.028704,-0.00342,-0.010226,0.009292,0.012851,-0.004694,-0.002751,-0.013656,0.001169,0.004559,-0.007221,0.003518,0.000732,0.008244,-0.00505,0.001088,0.002679,-0.002163,-0.004313,0.012295,0.000544,0.006248,0.003028,0.015657,-0.017548,-0.021962,0.023822,0.001898,0.009893,0.011054,-0.014987,0.002904,0.01226,-0.004953,-0.012634,-0.008609,0.011747,0.002076,-0.014898,0.004121,0.010448,0.013138,-0.016394,-0.005305,0.007054,0.012467,-0.002712,0.002299,0.004749,0.008064,-0.004381,-0.007167,-0.01442,0.009141,0.000481,-0.00219,-0.006764,0.00561,0.000576,0.010757,0.010424,0.003957,0.001171,-0.006407,-0.007056,0.010103,-0.014794,-0.007554,-0.001895,-0.000763,-0.001567,-0.0053,-0.008403,-0.000613,0.010248,0.010729,0.013325,0.002085,-0.001082,-0.002152,-0.00025,0.010666,0.007272,0.018077,0.022068,-0.013305,-3.3e-05,0.015226,-0.007096,-0.00169,0.007955,-0.006257,0.002287,0.016286,-0.002371,-0.002855,0.006662,-0.005629,0.0021,0.010128,-0.004969,-0.001456,0.012903,-0.000333,-0.007319,-0.021835,0.003784,-0.006325,-0.008265,0.00027,0.00285,-0.001741,0.002571,-0.013106,-0.00601,-0.002932,-0.004619,-0.006498,-0.002814,0.000604,0.002247,-0.004392,-0.008308,0.014505,-0.011591,-0.007312,0.007323,0.008679,-0.006662,0.00036,0.005708,-0.005919,-0.007708,0.007877,0.001876,-0.002371,0.004169,0.004725,0.007707,-0.003724,-0.006898,0.00582,0.004806,-0.003918,0.001945,-0.001302,0.014367,-0.000254,0.009372,-0.010349,0.007203,0.000632,-0.002693,0.010655,0.0093,0.007953,-0.012048,-0.003273,0.007454,-0.014914,0.013473,-0.006315,0.013733,0.001252,0.001974,0.007177,-0.004045,-0.011933,0.002531,-0.000167,-0.011054,0.010116,-0.003738,0.003277,0.015103,-0.003169,-0.005894,0.007003,-0.002008,-0.006109,-0.009463,-0.002946,0.008962,-0.005141,0.001113,0.002326,-0.006496,-0.007592,0.01305,-0.003545,0.005596,0.0044,-0.002508,0.01176,0.00091,-0.00367,0.007497,0.00036,-0.011842,0.001724,-0.003294,0.002222,0.011963,0.005576,-0.00492,-0.001721,-0.001349,0.006968,-0.021972,0.00221,0.005081,0.000801,-0.007988,0.0013,0.006936,0.005645,-0.016734,-0.004536,-0.008001,-0.002072,-0.004525,-0.002293,-0.008391,-0.007073,-0.005003,0.012356,-0.015467,-0.010841,0.005623,0.00368,-0.016782,-0.002405,-0.006967,-0.008923,0.001028,-0.007609,0.002467,-0.004758,-0.017338,-0.004776,-0.005599,0.007109,0.005003,-0.008906,-0.001899,-0.007628,0.006108,-0.015762,0.012614,0.007818,-0.003246,-0.007106,0.00013,0.001322,0.002232,0.00341,0.005785,-0.004521,0.014114,-0.004863,0.006439,0.00015,-0.005741,-0.015861,-0.005195,-0.006224,0.006738,0.000979,-0.002102,-0.006931,0.000682,0.007846,-0.004542,-0.00851,-0.00131,0.000734,0.006049,0.003436,-0.005875,0.003982,0.00922,-0.014545,-0.001121,-0.00119,-0.001391,-0.001532,-0.009373,-0.00481,0.007044,-0.01197,0.00511,-0.000143,-0.010749,-0.007847,0.006637,-0.002199,0.076923,0.0,1.0,4.0
75%,0.031606,0.002963,0.003282,0.031759,0.039735,0.00464,-0.002006,0.016706,0.02336,0.005852,0.00423,-0.005677,0.014889,0.009979,-0.000589,0.012215,0.013939,0.014547,0.009744,0.008668,0.017672,0.012231,0.005085,0.017971,0.017645,0.013565,0.009517,0.022374,-0.010644,-0.014523,0.03396,0.007517,0.022794,0.018662,-0.00864,0.010686,0.018866,0.013547,0.00103,-0.001922,0.020083,0.010783,0.000759,0.017616,0.01765,0.025395,-0.008236,0.002454,0.014345,0.021911,0.006863,0.011238,0.015138,0.015243,0.004695,0.004022,-0.006695,0.015247,0.008791,0.003859,0.000827,0.014694,0.009948,0.018279,0.017258,0.023096,0.006447,0.001894,-0.000191,0.019325,-0.002791,-0.00175,0.009527,0.004718,0.005288,0.005971,0.000635,0.00708,0.019054,0.021449,0.022569,0.009431,0.009269,0.00681,0.009568,0.017108,0.015854,0.033177,0.02825,-0.006539,0.00549,0.026002,0.001275,0.00848,0.014669,0.001508,0.014343,0.026254,0.003724,0.003839,0.013292,0.007639,0.011527,0.017482,0.003126,0.005317,0.020334,0.006692,0.001713,-0.012769,0.011803,0.001517,-0.002322,0.007364,0.010116,0.00508,0.009708,-0.006059,0.00184,0.006144,0.001605,0.012543,0.003472,0.008948,0.011669,0.00146,-0.002585,0.020265,-0.006896,0.002068,0.016338,0.014464,0.001034,0.008723,0.012169,0.002936,0.002482,0.012813,0.011172,0.003814,0.012234,0.013384,0.01459,0.005619,-0.000272,0.013796,0.012171,0.00291,0.010795,0.007757,0.024405,0.006099,0.017258,-0.003349,0.01633,0.007388,0.003036,0.019838,0.017515,0.015916,-0.006532,0.003636,0.016217,-0.007894,0.020493,0.000132,0.023613,0.007222,0.011305,0.013443,0.006594,-0.006124,0.009038,0.009042,-0.004674,0.016998,0.00453,0.011157,0.024522,0.004672,0.008522,0.016889,0.002915,0.007852,-0.001189,0.004362,0.016661,0.002144,0.006637,0.009359,0.001806,0.003374,0.020018,0.002324,0.014761,0.011846,0.003429,0.018479,0.006508,0.009009,0.012921,0.012071,-0.004342,0.009883,0.003446,0.00781,0.019165,0.015765,0.002609,0.006089,0.005392,0.016987,-0.009086,0.007075,0.012809,0.01111,0.00224,0.008896,0.0154,0.014413,-0.005378,0.002615,-0.00017,0.003712,0.003836,0.005475,-0.002661,0.000674,0.002827,0.02076,-0.008485,-0.004297,0.014502,0.014437,-0.007365,0.004287,0.001482,-0.002133,0.009256,0.001129,0.01187,0.003877,-0.009286,0.008633,0.002799,0.020111,0.012526,0.003163,0.004909,0.000259,0.015259,-0.008997,0.020592,0.026085,0.014981,0.000358,0.009421,0.009224,0.009648,0.010231,0.012498,0.003089,0.022872,0.001825,0.013503,0.005319,0.002441,-0.006557,-0.000197,0.001357,0.015541,0.009133,0.006696,0.000703,0.007077,0.015077,0.003765,0.003487,0.006023,0.010201,0.011774,0.012384,0.006981,0.01055,0.020284,-0.009004,0.006924,0.00552,0.004594,0.006203,-0.001236,0.005053,0.014192,-0.00149,0.013304,0.007298,-0.003696,-0.000167,0.01523,0.005924,0.555556,0.5,1.833333,5.0
max,0.118439,0.06502,0.060483,0.112224,0.143686,0.140567,0.077582,0.068756,0.116803,0.082203,0.065248,0.056565,0.070505,0.074439,0.047104,0.076161,0.083904,0.076685,0.117653,0.057574,0.088794,0.083256,0.063776,0.069566,0.06745,0.066656,0.07135,0.082424,0.054426,0.03519,0.141051,0.056965,0.104441,0.077002,0.056302,0.070126,0.082541,0.121275,0.063158,0.065974,0.072392,0.078398,0.082,0.100086,0.083056,0.130159,0.07497,0.071468,0.065421,0.118037,0.105601,0.066878,0.087606,0.072196,0.063808,0.096269,0.053936,0.06216,0.086588,0.055589,0.116356,0.170715,0.108233,0.104528,0.071074,0.089545,0.093725,0.101497,0.059559,0.065063,0.054461,0.054452,0.093303,0.076354,0.06055,0.075497,0.069522,0.098912,0.074669,0.108568,0.086664,0.06338,0.085351,0.069534,0.050293,0.068003,0.067995,0.105155,0.079075,0.089125,0.075645,0.077203,0.061159,0.096841,0.068444,0.048177,0.07522,0.098224,0.062449,0.061327,0.082086,0.077775,0.07191,0.073408,0.070114,0.070761,0.117867,0.048679,0.075462,0.055618,0.092096,0.059352,0.036258,0.067987,0.070877,0.058852,0.087522,0.072463,0.064422,0.109064,0.106999,0.104666,0.057919,0.090779,0.071362,0.064292,0.064215,0.068166,0.035795,0.085827,0.096853,0.070158,0.064224,0.097405,0.07368,0.06609,0.082592,0.052233,0.081036,0.090104,0.065014,0.097103,0.065538,0.074973,0.128249,0.088321,0.079515,0.073947,0.108221,0.09008,0.129148,0.091241,0.084517,0.066951,0.076197,0.063615,0.062311,0.084779,0.089203,0.133346,0.02762,0.100878,0.135393,0.040098,0.107249,0.04736,0.079742,0.05741,0.071348,0.125158,0.061837,0.050592,0.074983,0.069387,0.050011,0.070236,0.068579,0.07768,0.087382,0.054772,0.078923,0.085908,0.04198,0.110059,0.110044,0.052548,0.093191,0.073495,0.06139,0.058937,0.066723,0.062152,0.099096,0.058306,0.106666,0.06237,0.059176,0.064789,0.058455,0.110749,0.122443,0.065,0.057058,0.062965,0.063099,0.075713,0.066453,0.112149,0.086971,0.10886,0.080972,0.072949,0.053625,0.058112,0.068601,0.059471,0.065012,0.072987,0.058883,0.073975,0.07588,0.047688,0.097169,0.068253,0.105917,0.059336,0.066174,0.071191,0.063082,0.105063,0.07144,0.050368,0.098291,0.089725,0.063963,0.070184,0.059162,0.045104,0.063342,0.059629,0.080921,0.063152,0.055867,0.106638,0.056562,0.126117,0.12017,0.053867,0.065164,0.113357,0.065641,0.06903,0.084079,0.125217,0.139557,0.071125,0.067028,0.069521,0.085559,0.065913,0.097048,0.066735,0.095808,0.05163,0.065127,0.070766,0.053448,0.066154,0.059844,0.047832,0.068704,0.068147,0.073129,0.129443,0.107777,0.076251,0.064082,0.074898,0.088251,0.065712,0.067805,0.076649,0.075781,0.092052,0.094725,0.031567,0.142107,0.117434,0.04684,0.156178,0.070533,0.123118,0.073632,0.047041,0.071957,0.059871,0.058829,0.084698,0.081167,0.104584,56.0,28.0,75.0,5.0


# Create Label y (Business categories)

In [22]:
def load_business_df(): 
    filename = r'../../data/business.json'
    new_list = []
    for line in open(filename):
       new_list.append(json.loads(line))
    return pd.DataFrame.from_records(new_list)

dfbusiness = load_business_df()

In [23]:
dfbusiness.head()

Unnamed: 0,address,attributes,business_id,categories,city,hours,is_open,latitude,longitude,name,postal_code,review_count,stars,state
0,2818 E Camino Acequia Drive,{'GoodForKids': 'False'},1SWheh84yJXfytovILXOAQ,"Golf, Active Life",Phoenix,,0,33.522143,-112.018481,Arizona Biltmore Golf Club,85016,5,3.0,AZ
1,30 Eglinton Avenue W,"{'RestaurantsReservations': 'True', 'GoodForMe...",QXAEGFB4oINsVuTFxEYKFQ,"Specialty Food, Restaurants, Dim Sum, Imported...",Mississauga,"{'Monday': '9:0-0:0', 'Tuesday': '9:0-0:0', 'W...",1,43.605499,-79.652289,Emerald Chinese Restaurant,L5R 3E7,128,2.5,ON
2,"10110 Johnston Rd, Ste 15","{'GoodForKids': 'True', 'NoiseLevel': 'u'avera...",gnKjwL_1w79qoiV3IC_xQQ,"Sushi Bars, Restaurants, Japanese",Charlotte,"{'Monday': '17:30-21:30', 'Wednesday': '17:30-...",1,35.092564,-80.859132,Musashi Japanese Restaurant,28210,170,4.0,NC
3,"15655 W Roosevelt St, Ste 237",,xvX2CttrVhyG2z1dFg_0xw,"Insurance, Financial Services",Goodyear,"{'Monday': '8:0-17:0', 'Tuesday': '8:0-17:0', ...",1,33.455613,-112.395596,Farmers Insurance - Paul Lorenz,85338,3,5.0,AZ
4,"4209 Stuart Andrew Blvd, Ste F","{'BusinessAcceptsBitcoin': 'False', 'ByAppoint...",HhyxOkGAM07SRYtlQ4wMFQ,"Plumbing, Shopping, Local Services, Home Servi...",Charlotte,"{'Monday': '7:0-23:0', 'Tuesday': '7:0-23:0', ...",1,35.190012,-80.887223,Queen City Plumbing,28217,4,4.0,NC


# Join x,y (feature matrix, category) using business_id

In [24]:
dfbusiness.columns

Index(['address', 'attributes', 'business_id', 'categories', 'city', 'hours',
       'is_open', 'latitude', 'longitude', 'name', 'postal_code',
       'review_count', 'stars', 'state'],
      dtype='object')

In [25]:
len(dfbusiness['stars'].unique())

9

In [26]:
# Add business details to features df
keep_cols = ['business_id', 'categories', 'review_count']
all_features_business = all_features_business.merge(dfbusiness[keep_cols], how='left', on='business_id') 

In [27]:
all_features_business.head()

Unnamed: 0,business_id,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars,categories,review_count
0,--I7YYLada0tSLkORTHb5Q,-0.013726,-0.000967,0.009651,-0.00913,0.016051,0.000565,-0.006745,0.006949,0.021217,0.017539,0.002967,-0.003814,0.016394,0.011443,-0.010743,0.015422,-0.007492,0.006412,-0.018508,-0.003763,-0.010258,-0.034641,0.003971,0.012807,0.020696,0.008306,0.000858,0.016784,-0.017261,-0.019222,0.018138,0.002409,0.030819,0.007352,-0.009905,0.003296,0.009896,0.020073,0.009849,-0.011995,-0.003346,0.018278,0.011761,-0.008173,-0.002904,0.000559,-0.023254,-0.000756,0.014323,0.021623,0.009647,-0.000923,0.018409,0.00414,-0.01598,0.003323,-0.012654,0.005351,-0.01058,0.004594,-0.014951,0.002638,-0.001018,0.019544,0.009782,0.034841,-0.003797,0.001427,-0.016187,-0.008701,-0.004137,-0.003884,0.012173,-0.004549,-0.008592,-0.019691,-0.01144,-0.001867,0.003111,-0.013886,0.008983,0.011444,0.00786,0.005983,0.007405,0.007301,0.01534,-0.007426,0.025241,-0.018652,-0.003055,0.006571,-0.003629,0.005686,0.008752,0.000279,-0.017226,0.01451,-0.003971,-0.003905,0.016058,-0.022591,0.009685,0.020854,-0.006142,-0.005774,0.009259,0.008155,-0.007269,-0.007763,0.006829,-0.021465,-0.007562,0.003837,-0.003232,-0.010939,-0.001754,-0.019062,-0.013329,-0.009903,-0.001265,0.021219,-0.003071,0.002521,-0.009624,-0.006912,-0.008346,0.018096,-0.008363,-0.004684,0.006336,0.01281,-0.000807,0.008201,0.012035,-0.009213,-0.009921,0.007764,0.008899,0.007216,-0.005796,0.006133,0.013801,-0.015935,-0.007177,0.002236,0.003129,-0.012927,-0.006159,0.004225,0.026937,0.001358,0.012673,-0.004315,-0.002035,0.008295,-0.001315,0.005393,0.003733,0.024581,-0.01967,0.008099,0.003728,-0.018815,0.00101,-0.008273,0.00078,0.007821,-0.010501,0.013363,-0.017681,-0.014448,0.004403,0.000807,-0.007189,0.00873,0.000223,0.014109,0.030282,-0.002383,-0.013673,-0.0035,-0.00739,0.013017,-0.007018,-0.018764,0.004093,0.001161,-0.003004,0.00046,-0.01164,-0.014427,0.006012,-0.003372,0.008276,0.000163,0.005292,0.009312,0.005778,0.01478,0.010748,0.006528,-0.011686,-0.003398,-0.008658,0.005432,0.016919,-0.006005,-0.005508,-0.001077,0.001628,0.004388,-0.006823,0.00876,-0.006772,-0.00931,-0.002256,0.013054,-0.005372,0.019703,-0.004731,-0.01366,-0.003449,-0.00618,0.001936,0.002446,-0.010641,-0.010339,-0.008864,0.019644,-0.009308,-0.014862,0.018339,0.021996,-0.015752,0.000254,-0.016399,-0.011201,0.006879,-0.002404,-0.002863,0.009902,-0.020603,0.010032,0.000563,-0.035289,-0.00262,0.005378,0.001884,-0.026027,0.000318,-0.01816,0.013455,0.03332,0.021081,-0.007704,0.010812,0.000512,-0.000615,0.00342,0.005232,-0.009454,0.020532,-0.004067,0.006363,-2e-05,-0.010406,-0.024523,-0.008722,-0.002289,0.00773,-0.001044,-0.00901,-0.004684,-0.012579,0.006227,0.002931,-0.022759,0.003203,-0.00883,0.006987,-0.013734,-0.02531,-0.002771,-0.002157,-0.018193,-0.002759,0.006644,-0.003563,-0.000256,0.004574,-0.013884,0.010145,-0.018473,0.009391,-0.010135,-0.01557,0.007416,0.000119,0.006864,0.352941,0.352941,0.823529,3.647059,"Nightlife, Sports Bars, Restaurants, Bars, Ame...",96
1,--U98MNlDym2cLn36BBPgQ,-0.005564,-0.000944,-0.00348,0.002853,0.007491,-0.00066,-0.003497,0.002325,0.026685,0.005464,-0.007248,-0.000974,0.018046,0.008357,-0.019039,0.008971,-0.001258,0.003943,-0.025755,-0.00109,0.014484,-0.030295,0.021154,0.005877,0.010187,0.013338,0.007523,0.01294,-0.014777,-0.024164,0.0411,-0.00077,0.007845,0.014877,-0.005499,-0.015006,0.019996,0.004474,0.001769,-0.011596,0.013088,0.00664,-0.003198,-0.006436,0.00995,-0.001843,-0.026097,-0.008505,0.000224,0.004894,0.008225,-0.003288,0.005818,-0.010148,-0.010819,0.005193,-0.005695,0.016921,-0.004829,0.016215,-0.020262,-0.007885,0.012467,0.021577,0.002958,0.00394,0.003204,-0.009085,-0.006976,0.004098,-0.008992,-0.013135,-0.000485,0.005959,-0.002578,-0.020665,-0.020021,-0.001745,0.00531,0.005375,0.026599,-0.002281,0.002476,-0.008919,0.011867,0.011753,0.013241,0.007646,0.015286,-0.019387,-0.000861,0.00991,0.004722,0.006433,0.019949,-0.003987,0.005567,0.023403,-0.00255,-0.009002,0.005454,-0.006714,0.007594,0.016321,-0.000346,-0.000252,0.014122,-0.007365,-0.023506,-0.004877,-0.003139,-0.01621,0.002644,-0.010215,0.010724,-0.00731,0.005211,-0.016495,-0.010979,0.0036,-0.009721,0.012106,-0.00903,0.006079,-0.013465,-0.004483,-0.001365,0.012768,-0.013028,0.000788,0.007313,0.012509,-0.006384,-0.004753,0.008417,-0.011181,-0.01563,0.005005,0.003815,-0.000974,-0.009535,-0.000988,0.012781,-0.020511,-0.006674,0.011453,-0.018501,-0.005196,-0.003227,-0.010871,0.006396,0.004615,-0.000946,-0.004127,-0.003311,-0.000663,0.00251,0.008027,0.000299,0.012433,-0.013752,0.002294,0.004479,-0.019047,0.009406,-0.001238,-0.004585,0.009398,-0.012283,0.013845,-0.004649,-0.009626,0.009543,0.001538,-0.021331,0.012181,-0.00048,0.017829,0.008854,-0.011601,-0.002123,-0.013013,0.00407,-0.002811,-0.017783,-0.006919,0.009057,0.003706,0.001674,-0.004231,-0.022987,-0.002309,0.00444,-0.003228,-0.004219,0.003064,-0.009686,0.006762,0.000211,-0.00358,0.009524,0.005179,-0.007249,0.001858,-0.012122,0.002947,0.008471,0.001587,-0.014388,-0.004259,0.000955,0.00248,-0.0132,0.010114,-0.004476,-0.002501,-0.009151,-0.011466,-0.009791,0.025125,-0.009181,-0.003719,-0.002027,-0.002515,-0.009595,-0.010926,-0.008428,-0.013462,-0.012293,0.012809,-0.01791,-0.001546,0.007263,0.005538,0.001986,-0.011381,-0.00802,-0.013125,0.011645,-0.010927,0.019354,-7.5e-05,-0.012854,-0.002883,0.001277,-0.026105,0.008271,-0.000996,0.004661,-0.016013,-0.002846,-0.014104,0.00492,0.017741,0.011252,-0.002487,0.005355,-0.001425,0.006644,0.007952,0.020606,-0.016269,0.022072,-0.009253,0.004192,0.000376,-0.002333,-0.025644,0.001453,-0.000391,0.011531,0.005834,-0.006042,-0.016676,-0.004796,0.010255,0.001142,-0.019872,-0.000122,-0.00659,0.008564,0.011062,-0.006372,0.000337,0.003716,-0.003676,-0.002182,-0.006977,-0.002101,-0.021149,-0.00186,-0.015195,0.015674,-0.015427,-0.009989,-0.003153,-0.007485,-0.008983,0.005612,-0.005011,0.0,0.0,2.0,3.0,"Pizza, Restaurants",4
2,--j-kaNMCo1-DYzddCsA5Q,-0.011408,0.004498,-0.005131,0.040393,0.015439,-0.023149,-0.018121,-0.01275,0.048993,-0.003531,-0.024677,-0.032462,-0.015342,0.000457,-0.004219,0.025674,-0.009732,0.007939,0.027066,-0.026292,-0.020693,0.019704,-0.00073,0.006854,0.021814,0.014953,-0.006072,0.02136,-0.024738,-0.02165,-0.003612,0.00143,0.030509,-0.009074,-0.037931,0.03775,-0.011981,0.023262,-0.019438,-0.011353,0.015401,0.012555,-0.026999,-0.012126,0.040942,0.043306,-0.030898,-0.00164,0.025986,-0.013791,-0.013998,0.011637,0.01525,0.014251,0.015818,0.01556,-0.003851,0.014891,0.016089,-0.019991,0.0242,-0.007817,-0.010958,0.026268,0.014333,0.027851,0.021072,0.000597,-0.009619,0.007727,-0.019088,-0.011223,-0.016202,0.008886,-0.018291,0.008942,-0.006631,-0.003544,0.037693,0.03982,0.012824,0.006585,0.002338,0.000905,-0.000615,0.025102,-0.00569,0.039397,0.035818,0.02456,-0.000305,0.018925,-0.020685,-0.029385,-0.015098,-0.010915,0.013361,0.028953,0.000481,-0.005198,-0.00643,-0.033768,0.00138,0.000388,-0.032795,-0.008521,0.005831,-0.005367,-0.006409,-0.059394,0.005388,-0.017854,-0.010545,0.018931,-0.006061,-0.007233,-0.002718,-0.006306,0.001461,-0.003748,-0.001259,-0.005952,-0.000632,-0.013196,-0.000417,-0.010504,0.00345,0.02094,-0.019679,-0.029353,0.01774,-0.010969,-0.019441,0.0028,0.002309,0.004318,0.001813,0.019408,-0.01001,-0.021436,0.028445,0.015566,0.00528,0.009365,-0.015892,-0.001641,-0.001963,-0.009057,0.003751,0.02662,0.045652,-0.016511,0.03674,-0.036992,0.019457,0.005505,-0.00045,0.039375,0.028597,0.012334,-0.017266,-0.001321,0.017346,-0.028994,0.000102,-0.022917,0.022877,-0.006573,-0.018327,0.005066,0.007471,-0.014181,0.013244,-0.002325,0.013869,0.012507,-0.013054,-0.003175,0.022951,0.000244,-0.008114,0.039611,-0.003074,-0.033306,-0.010178,-0.00323,0.014981,0.004657,0.003973,-0.001902,0.000326,-0.011839,0.027661,-0.002376,0.023028,0.010613,-0.007019,0.020521,0.006992,-0.00388,0.008437,0.012277,-0.036878,-0.002529,-0.009768,-0.006173,0.025762,0.037304,-0.002869,0.005554,-0.013933,0.008529,-0.020774,0.012585,0.008027,-0.003281,-0.008263,0.005063,0.016019,0.007121,-0.011471,-0.020077,0.016207,-0.024511,0.0003,-0.012528,0.004442,-0.008622,0.001793,0.000132,-0.013576,-0.018616,0.014267,0.01957,-0.055023,0.009012,0.001078,-0.008804,0.005712,-0.004416,0.011757,0.002296,-0.049666,0.007852,-0.028842,0.02521,-0.018244,-0.029429,0.005054,-0.01895,0.003347,-0.020108,0.01701,0.01176,0.004549,-0.037733,-0.02671,-0.008685,0.013084,0.022865,-0.010566,0.001083,0.040527,0.006276,0.009053,0.002601,-0.033974,-0.044425,-0.013743,-0.018428,0.009389,-0.031036,0.012233,0.012775,0.02045,0.001214,-0.01151,0.000485,-0.004856,0.010159,0.004035,-0.000834,-0.022557,0.011678,0.022661,-0.029659,-0.002098,0.005847,-0.001321,0.004675,-0.027056,-0.014537,0.033893,-0.032624,0.037274,-0.004122,-0.024268,-0.00463,-0.013603,0.012318,0.0,0.0,0.0,5.0,"Hair Removal, Nail Technicians, Beauty & Spas,...",4
3,--wIGbLEhlpl_UeAIyDmZQ,0.03603,-0.004617,-0.013974,0.020747,0.038429,-0.005669,-0.002198,0.015588,-0.002853,-0.009565,-0.002643,-0.014395,-0.012292,0.00845,0.007603,-0.003827,0.021573,0.008389,-0.002452,0.010293,0.022692,0.002092,-0.011285,0.0121,-0.02604,0.003841,-0.000393,0.008389,-0.013863,-0.019479,0.035048,0.008522,-0.007378,0.004424,-0.009327,-0.014849,0.0085,-0.023346,-0.022681,-0.015705,0.023771,-0.001025,-0.022194,0.019416,0.001624,0.011531,-0.015406,-0.00171,0.009043,0.00396,-0.006161,-0.000354,-0.00407,0.00208,0.008475,-0.012978,-0.008557,0.008782,-0.007786,0.002105,-0.015182,0.015293,0.009434,-0.001137,0.007613,-0.016997,0.002806,-0.018171,-0.011341,0.022138,-0.023217,-0.013575,-0.016987,-0.003429,0.008313,-0.000669,-0.021147,0.000362,0.005093,0.021298,0.014194,-0.015987,-0.020088,-0.006929,-0.010896,0.006858,0.005904,0.031907,0.025973,-0.01457,0.002702,0.020056,-0.01249,-0.010422,0.011489,-0.012906,0.011035,0.03269,-0.011232,0.001898,0.01185,0.01813,-0.012305,-0.008048,0.00629,0.000533,0.01797,-0.003583,-0.01239,-0.017582,-0.006649,0.002712,-0.007194,-0.01096,0.008641,0.001024,-0.003371,-0.013689,-0.002284,0.01018,-0.005039,-0.01597,-0.004896,0.004551,0.009634,-0.001396,-0.004828,0.017685,-0.008758,-0.018465,0.010553,0.004184,-0.00454,-0.003628,-0.002454,-0.006859,8.7e-05,0.007116,0.008144,-0.013943,0.007145,0.003522,-0.001099,-0.000666,0.00379,0.006113,0.003153,0.005155,0.006425,-0.013315,-0.012087,-0.012384,0.007106,0.002679,0.010285,-0.005324,-0.004294,0.017046,0.011173,0.003112,-0.012589,-0.014093,0.008187,-0.009397,0.014803,0.003589,0.022513,-0.011319,0.02168,-0.000437,0.001318,-0.009923,0.007855,-0.006369,-0.016201,0.012389,0.005126,-0.000259,0.008205,0.002956,0.007049,-9e-06,-0.003225,-0.007604,-0.022469,0.012772,0.014779,-0.02159,0.001233,-0.000396,-0.005885,-0.002753,0.01871,-0.006596,-0.0002,0.002954,0.001937,0.008445,0.00241,-0.007021,0.010886,-0.017201,-0.007942,0.005915,-0.014905,-0.00642,-0.003849,0.01359,-0.002348,0.005322,0.000939,0.015895,-0.042152,-0.007171,0.006589,0.006256,-0.023661,-0.000313,0.015247,-0.00439,-0.034916,-0.001136,-0.02216,0.001242,-0.010273,-0.014189,0.002097,-0.003391,-0.000624,0.015829,-0.023511,-0.009568,-0.015818,-0.016161,0.001785,-0.006375,-0.00487,-0.010851,-0.007529,-0.019133,0.006462,-0.022415,-0.008734,-0.018169,-0.012631,0.023061,0.013576,-0.019669,-0.00891,0.005105,0.022213,-0.013286,0.011676,-0.006114,-0.018936,-0.007927,-0.00477,0.010402,0.00524,-0.000165,-0.000383,0.002561,0.001053,-0.001729,-0.004861,0.000307,0.008562,-0.008165,-0.000832,-0.009696,0.006255,0.009771,-0.002139,-0.000518,0.000565,0.007697,-0.014454,-0.003493,-0.021064,0.001639,0.001644,0.004618,0.002439,0.005728,0.024569,-0.016353,-0.008612,-0.008367,0.003,-0.001397,-0.009965,0.001797,0.001847,0.004512,-0.01266,0.014142,-0.015593,-0.015314,0.012027,-0.008484,0.666667,0.166667,3.0,3.833333,"Electronics, Professional Services, Local Serv...",14
4,-000aQFeK6tqVLndf7xORg,0.037063,-0.009315,-0.026367,0.034366,0.063646,0.002534,-0.017493,0.019709,0.01087,-0.006368,-0.008021,-0.0156,-0.005306,0.002922,-0.01058,0.001384,0.017555,0.014387,0.016768,-0.00018,0.016056,0.006687,-0.007475,0.019319,-0.027481,0.013232,0.011752,0.003663,-0.015931,-0.024199,0.029718,0.011359,-0.004956,0.005804,-0.020293,0.001284,0.015996,-0.029603,-0.041141,-0.014429,0.021545,0.000492,-0.034298,0.019427,0.002762,0.021009,-0.032378,-0.004774,0.017271,0.00297,0.004246,0.006514,-0.007337,-0.001333,0.012854,-0.010315,-0.001858,0.008479,0.000681,-0.006574,-0.004562,0.022622,0.008401,0.006384,0.00242,-0.019281,-0.005052,-0.010938,0.002747,0.018117,-0.034967,-0.018739,-0.013164,0.006984,0.014279,0.002119,-0.02226,0.005805,0.006742,0.033168,0.012665,0.004269,-0.013083,-0.006228,-0.022316,0.008169,0.00291,0.05077,0.036307,-0.018526,0.012091,0.01554,-0.017117,-0.003561,0.023017,-0.012839,0.017719,0.033495,-0.00655,0.005886,0.021863,0.021454,-0.018608,-0.001935,0.002049,0.001544,0.022386,-0.000454,-0.011431,-0.022651,0.000362,0.004123,-0.00905,-0.001328,0.007845,0.000739,0.006232,-0.007364,0.002612,0.00221,0.004153,-0.034616,0.001015,-0.005685,0.01145,-0.008641,-0.010007,0.018973,-0.011076,-0.016471,0.005733,-0.004936,-0.01818,-0.001098,-0.001558,-0.000404,0.012638,0.004509,0.00615,-0.017909,0.017598,0.01067,-0.005524,0.016061,-0.002045,0.022372,0.01295,0.012163,0.028965,-0.011939,0.006377,-0.000237,0.015302,-0.00127,0.008944,-0.011075,-0.012375,0.01696,0.010727,0.009859,-0.031381,-0.000325,0.010991,-0.011284,0.011155,-0.004994,0.031776,-0.00276,0.022864,0.008213,0.005658,-0.015535,0.008584,-0.00343,-0.021115,0.014513,-0.010297,0.000324,0.016335,0.005401,0.009448,0.002699,-0.01129,-0.012861,-0.011274,-0.00531,0.018629,-0.011298,0.007405,0.000958,-0.010471,-0.00069,0.018937,-0.009247,-0.000434,0.004826,-0.003482,0.020534,0.012395,-0.010455,0.010883,-0.022955,-0.01066,0.015764,-0.011173,0.000755,-0.005792,0.022928,0.00315,0.016144,0.010836,0.021907,-0.048891,-0.000169,0.005724,0.008939,-0.023912,0.012689,0.016462,-0.012716,-0.037714,0.000926,-0.029815,-0.008867,-0.018469,-0.01903,0.001562,-0.016534,0.009348,-0.001217,-0.030756,-0.015793,-0.009945,-0.001586,-0.018029,-0.008044,-0.006953,-0.018411,-0.013626,-0.021759,0.004791,-0.021245,-0.018038,-0.03359,-0.023835,0.024545,-0.000545,-0.017671,-0.002003,0.006448,0.030536,-0.029008,0.014564,-0.011317,-0.013324,-0.023969,-0.010229,0.008008,0.01544,0.021932,-0.009427,-1.3e-05,0.01224,0.014289,0.004621,-0.002303,0.013994,-0.013895,-0.006972,-0.010657,0.017143,0.020828,0.013929,0.007456,0.007746,-0.012964,-0.025667,0.002786,-0.020329,0.004065,0.015754,0.020314,0.008037,2.7e-05,0.034369,-0.016854,0.009845,-0.012831,0.00527,-0.01097,-0.006481,0.012282,0.006312,0.002442,0.007883,0.003245,-0.032445,-0.011714,0.013191,-0.002767,0.666667,0.0,0.0,5.0,"Automotive, Auto Repair",7


In [28]:
all_features_business['categories'][0]

'Nightlife, Sports Bars, Restaurants, Bars, American (Traditional)'

In [29]:
all_features_business.head()

Unnamed: 0,business_id,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars,categories,review_count
0,--I7YYLada0tSLkORTHb5Q,-0.013726,-0.000967,0.009651,-0.00913,0.016051,0.000565,-0.006745,0.006949,0.021217,0.017539,0.002967,-0.003814,0.016394,0.011443,-0.010743,0.015422,-0.007492,0.006412,-0.018508,-0.003763,-0.010258,-0.034641,0.003971,0.012807,0.020696,0.008306,0.000858,0.016784,-0.017261,-0.019222,0.018138,0.002409,0.030819,0.007352,-0.009905,0.003296,0.009896,0.020073,0.009849,-0.011995,-0.003346,0.018278,0.011761,-0.008173,-0.002904,0.000559,-0.023254,-0.000756,0.014323,0.021623,0.009647,-0.000923,0.018409,0.00414,-0.01598,0.003323,-0.012654,0.005351,-0.01058,0.004594,-0.014951,0.002638,-0.001018,0.019544,0.009782,0.034841,-0.003797,0.001427,-0.016187,-0.008701,-0.004137,-0.003884,0.012173,-0.004549,-0.008592,-0.019691,-0.01144,-0.001867,0.003111,-0.013886,0.008983,0.011444,0.00786,0.005983,0.007405,0.007301,0.01534,-0.007426,0.025241,-0.018652,-0.003055,0.006571,-0.003629,0.005686,0.008752,0.000279,-0.017226,0.01451,-0.003971,-0.003905,0.016058,-0.022591,0.009685,0.020854,-0.006142,-0.005774,0.009259,0.008155,-0.007269,-0.007763,0.006829,-0.021465,-0.007562,0.003837,-0.003232,-0.010939,-0.001754,-0.019062,-0.013329,-0.009903,-0.001265,0.021219,-0.003071,0.002521,-0.009624,-0.006912,-0.008346,0.018096,-0.008363,-0.004684,0.006336,0.01281,-0.000807,0.008201,0.012035,-0.009213,-0.009921,0.007764,0.008899,0.007216,-0.005796,0.006133,0.013801,-0.015935,-0.007177,0.002236,0.003129,-0.012927,-0.006159,0.004225,0.026937,0.001358,0.012673,-0.004315,-0.002035,0.008295,-0.001315,0.005393,0.003733,0.024581,-0.01967,0.008099,0.003728,-0.018815,0.00101,-0.008273,0.00078,0.007821,-0.010501,0.013363,-0.017681,-0.014448,0.004403,0.000807,-0.007189,0.00873,0.000223,0.014109,0.030282,-0.002383,-0.013673,-0.0035,-0.00739,0.013017,-0.007018,-0.018764,0.004093,0.001161,-0.003004,0.00046,-0.01164,-0.014427,0.006012,-0.003372,0.008276,0.000163,0.005292,0.009312,0.005778,0.01478,0.010748,0.006528,-0.011686,-0.003398,-0.008658,0.005432,0.016919,-0.006005,-0.005508,-0.001077,0.001628,0.004388,-0.006823,0.00876,-0.006772,-0.00931,-0.002256,0.013054,-0.005372,0.019703,-0.004731,-0.01366,-0.003449,-0.00618,0.001936,0.002446,-0.010641,-0.010339,-0.008864,0.019644,-0.009308,-0.014862,0.018339,0.021996,-0.015752,0.000254,-0.016399,-0.011201,0.006879,-0.002404,-0.002863,0.009902,-0.020603,0.010032,0.000563,-0.035289,-0.00262,0.005378,0.001884,-0.026027,0.000318,-0.01816,0.013455,0.03332,0.021081,-0.007704,0.010812,0.000512,-0.000615,0.00342,0.005232,-0.009454,0.020532,-0.004067,0.006363,-2e-05,-0.010406,-0.024523,-0.008722,-0.002289,0.00773,-0.001044,-0.00901,-0.004684,-0.012579,0.006227,0.002931,-0.022759,0.003203,-0.00883,0.006987,-0.013734,-0.02531,-0.002771,-0.002157,-0.018193,-0.002759,0.006644,-0.003563,-0.000256,0.004574,-0.013884,0.010145,-0.018473,0.009391,-0.010135,-0.01557,0.007416,0.000119,0.006864,0.352941,0.352941,0.823529,3.647059,"Nightlife, Sports Bars, Restaurants, Bars, Ame...",96
1,--U98MNlDym2cLn36BBPgQ,-0.005564,-0.000944,-0.00348,0.002853,0.007491,-0.00066,-0.003497,0.002325,0.026685,0.005464,-0.007248,-0.000974,0.018046,0.008357,-0.019039,0.008971,-0.001258,0.003943,-0.025755,-0.00109,0.014484,-0.030295,0.021154,0.005877,0.010187,0.013338,0.007523,0.01294,-0.014777,-0.024164,0.0411,-0.00077,0.007845,0.014877,-0.005499,-0.015006,0.019996,0.004474,0.001769,-0.011596,0.013088,0.00664,-0.003198,-0.006436,0.00995,-0.001843,-0.026097,-0.008505,0.000224,0.004894,0.008225,-0.003288,0.005818,-0.010148,-0.010819,0.005193,-0.005695,0.016921,-0.004829,0.016215,-0.020262,-0.007885,0.012467,0.021577,0.002958,0.00394,0.003204,-0.009085,-0.006976,0.004098,-0.008992,-0.013135,-0.000485,0.005959,-0.002578,-0.020665,-0.020021,-0.001745,0.00531,0.005375,0.026599,-0.002281,0.002476,-0.008919,0.011867,0.011753,0.013241,0.007646,0.015286,-0.019387,-0.000861,0.00991,0.004722,0.006433,0.019949,-0.003987,0.005567,0.023403,-0.00255,-0.009002,0.005454,-0.006714,0.007594,0.016321,-0.000346,-0.000252,0.014122,-0.007365,-0.023506,-0.004877,-0.003139,-0.01621,0.002644,-0.010215,0.010724,-0.00731,0.005211,-0.016495,-0.010979,0.0036,-0.009721,0.012106,-0.00903,0.006079,-0.013465,-0.004483,-0.001365,0.012768,-0.013028,0.000788,0.007313,0.012509,-0.006384,-0.004753,0.008417,-0.011181,-0.01563,0.005005,0.003815,-0.000974,-0.009535,-0.000988,0.012781,-0.020511,-0.006674,0.011453,-0.018501,-0.005196,-0.003227,-0.010871,0.006396,0.004615,-0.000946,-0.004127,-0.003311,-0.000663,0.00251,0.008027,0.000299,0.012433,-0.013752,0.002294,0.004479,-0.019047,0.009406,-0.001238,-0.004585,0.009398,-0.012283,0.013845,-0.004649,-0.009626,0.009543,0.001538,-0.021331,0.012181,-0.00048,0.017829,0.008854,-0.011601,-0.002123,-0.013013,0.00407,-0.002811,-0.017783,-0.006919,0.009057,0.003706,0.001674,-0.004231,-0.022987,-0.002309,0.00444,-0.003228,-0.004219,0.003064,-0.009686,0.006762,0.000211,-0.00358,0.009524,0.005179,-0.007249,0.001858,-0.012122,0.002947,0.008471,0.001587,-0.014388,-0.004259,0.000955,0.00248,-0.0132,0.010114,-0.004476,-0.002501,-0.009151,-0.011466,-0.009791,0.025125,-0.009181,-0.003719,-0.002027,-0.002515,-0.009595,-0.010926,-0.008428,-0.013462,-0.012293,0.012809,-0.01791,-0.001546,0.007263,0.005538,0.001986,-0.011381,-0.00802,-0.013125,0.011645,-0.010927,0.019354,-7.5e-05,-0.012854,-0.002883,0.001277,-0.026105,0.008271,-0.000996,0.004661,-0.016013,-0.002846,-0.014104,0.00492,0.017741,0.011252,-0.002487,0.005355,-0.001425,0.006644,0.007952,0.020606,-0.016269,0.022072,-0.009253,0.004192,0.000376,-0.002333,-0.025644,0.001453,-0.000391,0.011531,0.005834,-0.006042,-0.016676,-0.004796,0.010255,0.001142,-0.019872,-0.000122,-0.00659,0.008564,0.011062,-0.006372,0.000337,0.003716,-0.003676,-0.002182,-0.006977,-0.002101,-0.021149,-0.00186,-0.015195,0.015674,-0.015427,-0.009989,-0.003153,-0.007485,-0.008983,0.005612,-0.005011,0.0,0.0,2.0,3.0,"Pizza, Restaurants",4
2,--j-kaNMCo1-DYzddCsA5Q,-0.011408,0.004498,-0.005131,0.040393,0.015439,-0.023149,-0.018121,-0.01275,0.048993,-0.003531,-0.024677,-0.032462,-0.015342,0.000457,-0.004219,0.025674,-0.009732,0.007939,0.027066,-0.026292,-0.020693,0.019704,-0.00073,0.006854,0.021814,0.014953,-0.006072,0.02136,-0.024738,-0.02165,-0.003612,0.00143,0.030509,-0.009074,-0.037931,0.03775,-0.011981,0.023262,-0.019438,-0.011353,0.015401,0.012555,-0.026999,-0.012126,0.040942,0.043306,-0.030898,-0.00164,0.025986,-0.013791,-0.013998,0.011637,0.01525,0.014251,0.015818,0.01556,-0.003851,0.014891,0.016089,-0.019991,0.0242,-0.007817,-0.010958,0.026268,0.014333,0.027851,0.021072,0.000597,-0.009619,0.007727,-0.019088,-0.011223,-0.016202,0.008886,-0.018291,0.008942,-0.006631,-0.003544,0.037693,0.03982,0.012824,0.006585,0.002338,0.000905,-0.000615,0.025102,-0.00569,0.039397,0.035818,0.02456,-0.000305,0.018925,-0.020685,-0.029385,-0.015098,-0.010915,0.013361,0.028953,0.000481,-0.005198,-0.00643,-0.033768,0.00138,0.000388,-0.032795,-0.008521,0.005831,-0.005367,-0.006409,-0.059394,0.005388,-0.017854,-0.010545,0.018931,-0.006061,-0.007233,-0.002718,-0.006306,0.001461,-0.003748,-0.001259,-0.005952,-0.000632,-0.013196,-0.000417,-0.010504,0.00345,0.02094,-0.019679,-0.029353,0.01774,-0.010969,-0.019441,0.0028,0.002309,0.004318,0.001813,0.019408,-0.01001,-0.021436,0.028445,0.015566,0.00528,0.009365,-0.015892,-0.001641,-0.001963,-0.009057,0.003751,0.02662,0.045652,-0.016511,0.03674,-0.036992,0.019457,0.005505,-0.00045,0.039375,0.028597,0.012334,-0.017266,-0.001321,0.017346,-0.028994,0.000102,-0.022917,0.022877,-0.006573,-0.018327,0.005066,0.007471,-0.014181,0.013244,-0.002325,0.013869,0.012507,-0.013054,-0.003175,0.022951,0.000244,-0.008114,0.039611,-0.003074,-0.033306,-0.010178,-0.00323,0.014981,0.004657,0.003973,-0.001902,0.000326,-0.011839,0.027661,-0.002376,0.023028,0.010613,-0.007019,0.020521,0.006992,-0.00388,0.008437,0.012277,-0.036878,-0.002529,-0.009768,-0.006173,0.025762,0.037304,-0.002869,0.005554,-0.013933,0.008529,-0.020774,0.012585,0.008027,-0.003281,-0.008263,0.005063,0.016019,0.007121,-0.011471,-0.020077,0.016207,-0.024511,0.0003,-0.012528,0.004442,-0.008622,0.001793,0.000132,-0.013576,-0.018616,0.014267,0.01957,-0.055023,0.009012,0.001078,-0.008804,0.005712,-0.004416,0.011757,0.002296,-0.049666,0.007852,-0.028842,0.02521,-0.018244,-0.029429,0.005054,-0.01895,0.003347,-0.020108,0.01701,0.01176,0.004549,-0.037733,-0.02671,-0.008685,0.013084,0.022865,-0.010566,0.001083,0.040527,0.006276,0.009053,0.002601,-0.033974,-0.044425,-0.013743,-0.018428,0.009389,-0.031036,0.012233,0.012775,0.02045,0.001214,-0.01151,0.000485,-0.004856,0.010159,0.004035,-0.000834,-0.022557,0.011678,0.022661,-0.029659,-0.002098,0.005847,-0.001321,0.004675,-0.027056,-0.014537,0.033893,-0.032624,0.037274,-0.004122,-0.024268,-0.00463,-0.013603,0.012318,0.0,0.0,0.0,5.0,"Hair Removal, Nail Technicians, Beauty & Spas,...",4
3,--wIGbLEhlpl_UeAIyDmZQ,0.03603,-0.004617,-0.013974,0.020747,0.038429,-0.005669,-0.002198,0.015588,-0.002853,-0.009565,-0.002643,-0.014395,-0.012292,0.00845,0.007603,-0.003827,0.021573,0.008389,-0.002452,0.010293,0.022692,0.002092,-0.011285,0.0121,-0.02604,0.003841,-0.000393,0.008389,-0.013863,-0.019479,0.035048,0.008522,-0.007378,0.004424,-0.009327,-0.014849,0.0085,-0.023346,-0.022681,-0.015705,0.023771,-0.001025,-0.022194,0.019416,0.001624,0.011531,-0.015406,-0.00171,0.009043,0.00396,-0.006161,-0.000354,-0.00407,0.00208,0.008475,-0.012978,-0.008557,0.008782,-0.007786,0.002105,-0.015182,0.015293,0.009434,-0.001137,0.007613,-0.016997,0.002806,-0.018171,-0.011341,0.022138,-0.023217,-0.013575,-0.016987,-0.003429,0.008313,-0.000669,-0.021147,0.000362,0.005093,0.021298,0.014194,-0.015987,-0.020088,-0.006929,-0.010896,0.006858,0.005904,0.031907,0.025973,-0.01457,0.002702,0.020056,-0.01249,-0.010422,0.011489,-0.012906,0.011035,0.03269,-0.011232,0.001898,0.01185,0.01813,-0.012305,-0.008048,0.00629,0.000533,0.01797,-0.003583,-0.01239,-0.017582,-0.006649,0.002712,-0.007194,-0.01096,0.008641,0.001024,-0.003371,-0.013689,-0.002284,0.01018,-0.005039,-0.01597,-0.004896,0.004551,0.009634,-0.001396,-0.004828,0.017685,-0.008758,-0.018465,0.010553,0.004184,-0.00454,-0.003628,-0.002454,-0.006859,8.7e-05,0.007116,0.008144,-0.013943,0.007145,0.003522,-0.001099,-0.000666,0.00379,0.006113,0.003153,0.005155,0.006425,-0.013315,-0.012087,-0.012384,0.007106,0.002679,0.010285,-0.005324,-0.004294,0.017046,0.011173,0.003112,-0.012589,-0.014093,0.008187,-0.009397,0.014803,0.003589,0.022513,-0.011319,0.02168,-0.000437,0.001318,-0.009923,0.007855,-0.006369,-0.016201,0.012389,0.005126,-0.000259,0.008205,0.002956,0.007049,-9e-06,-0.003225,-0.007604,-0.022469,0.012772,0.014779,-0.02159,0.001233,-0.000396,-0.005885,-0.002753,0.01871,-0.006596,-0.0002,0.002954,0.001937,0.008445,0.00241,-0.007021,0.010886,-0.017201,-0.007942,0.005915,-0.014905,-0.00642,-0.003849,0.01359,-0.002348,0.005322,0.000939,0.015895,-0.042152,-0.007171,0.006589,0.006256,-0.023661,-0.000313,0.015247,-0.00439,-0.034916,-0.001136,-0.02216,0.001242,-0.010273,-0.014189,0.002097,-0.003391,-0.000624,0.015829,-0.023511,-0.009568,-0.015818,-0.016161,0.001785,-0.006375,-0.00487,-0.010851,-0.007529,-0.019133,0.006462,-0.022415,-0.008734,-0.018169,-0.012631,0.023061,0.013576,-0.019669,-0.00891,0.005105,0.022213,-0.013286,0.011676,-0.006114,-0.018936,-0.007927,-0.00477,0.010402,0.00524,-0.000165,-0.000383,0.002561,0.001053,-0.001729,-0.004861,0.000307,0.008562,-0.008165,-0.000832,-0.009696,0.006255,0.009771,-0.002139,-0.000518,0.000565,0.007697,-0.014454,-0.003493,-0.021064,0.001639,0.001644,0.004618,0.002439,0.005728,0.024569,-0.016353,-0.008612,-0.008367,0.003,-0.001397,-0.009965,0.001797,0.001847,0.004512,-0.01266,0.014142,-0.015593,-0.015314,0.012027,-0.008484,0.666667,0.166667,3.0,3.833333,"Electronics, Professional Services, Local Serv...",14
4,-000aQFeK6tqVLndf7xORg,0.037063,-0.009315,-0.026367,0.034366,0.063646,0.002534,-0.017493,0.019709,0.01087,-0.006368,-0.008021,-0.0156,-0.005306,0.002922,-0.01058,0.001384,0.017555,0.014387,0.016768,-0.00018,0.016056,0.006687,-0.007475,0.019319,-0.027481,0.013232,0.011752,0.003663,-0.015931,-0.024199,0.029718,0.011359,-0.004956,0.005804,-0.020293,0.001284,0.015996,-0.029603,-0.041141,-0.014429,0.021545,0.000492,-0.034298,0.019427,0.002762,0.021009,-0.032378,-0.004774,0.017271,0.00297,0.004246,0.006514,-0.007337,-0.001333,0.012854,-0.010315,-0.001858,0.008479,0.000681,-0.006574,-0.004562,0.022622,0.008401,0.006384,0.00242,-0.019281,-0.005052,-0.010938,0.002747,0.018117,-0.034967,-0.018739,-0.013164,0.006984,0.014279,0.002119,-0.02226,0.005805,0.006742,0.033168,0.012665,0.004269,-0.013083,-0.006228,-0.022316,0.008169,0.00291,0.05077,0.036307,-0.018526,0.012091,0.01554,-0.017117,-0.003561,0.023017,-0.012839,0.017719,0.033495,-0.00655,0.005886,0.021863,0.021454,-0.018608,-0.001935,0.002049,0.001544,0.022386,-0.000454,-0.011431,-0.022651,0.000362,0.004123,-0.00905,-0.001328,0.007845,0.000739,0.006232,-0.007364,0.002612,0.00221,0.004153,-0.034616,0.001015,-0.005685,0.01145,-0.008641,-0.010007,0.018973,-0.011076,-0.016471,0.005733,-0.004936,-0.01818,-0.001098,-0.001558,-0.000404,0.012638,0.004509,0.00615,-0.017909,0.017598,0.01067,-0.005524,0.016061,-0.002045,0.022372,0.01295,0.012163,0.028965,-0.011939,0.006377,-0.000237,0.015302,-0.00127,0.008944,-0.011075,-0.012375,0.01696,0.010727,0.009859,-0.031381,-0.000325,0.010991,-0.011284,0.011155,-0.004994,0.031776,-0.00276,0.022864,0.008213,0.005658,-0.015535,0.008584,-0.00343,-0.021115,0.014513,-0.010297,0.000324,0.016335,0.005401,0.009448,0.002699,-0.01129,-0.012861,-0.011274,-0.00531,0.018629,-0.011298,0.007405,0.000958,-0.010471,-0.00069,0.018937,-0.009247,-0.000434,0.004826,-0.003482,0.020534,0.012395,-0.010455,0.010883,-0.022955,-0.01066,0.015764,-0.011173,0.000755,-0.005792,0.022928,0.00315,0.016144,0.010836,0.021907,-0.048891,-0.000169,0.005724,0.008939,-0.023912,0.012689,0.016462,-0.012716,-0.037714,0.000926,-0.029815,-0.008867,-0.018469,-0.01903,0.001562,-0.016534,0.009348,-0.001217,-0.030756,-0.015793,-0.009945,-0.001586,-0.018029,-0.008044,-0.006953,-0.018411,-0.013626,-0.021759,0.004791,-0.021245,-0.018038,-0.03359,-0.023835,0.024545,-0.000545,-0.017671,-0.002003,0.006448,0.030536,-0.029008,0.014564,-0.011317,-0.013324,-0.023969,-0.010229,0.008008,0.01544,0.021932,-0.009427,-1.3e-05,0.01224,0.014289,0.004621,-0.002303,0.013994,-0.013895,-0.006972,-0.010657,0.017143,0.020828,0.013929,0.007456,0.007746,-0.012964,-0.025667,0.002786,-0.020329,0.004065,0.015754,0.020314,0.008037,2.7e-05,0.034369,-0.016854,0.009845,-0.012831,0.00527,-0.01097,-0.006481,0.012282,0.006312,0.002442,0.007883,0.003245,-0.032445,-0.011714,0.013191,-0.002767,0.666667,0.0,0.0,5.0,"Automotive, Auto Repair",7


In [30]:
def stringDFColToBinaryCols(df, series_name):
    # Create list of all categories
    all_cats = []
    for string in df[series_name]:
        string = str(string)
        cats = string.strip().replace(' ', '').split(',')
        for cat in cats:
            if cat not in all_cats:
                all_cats.append(cat)
    # Make binary for each cat for each row
    for cat in all_cats:
        df[cat] = df[series_name].str.strip().str.replace(' ', '').str.contains(cat)
        # This technique will have some problems. 'Golf' may appear in non-Golf categories (ie 'Disc Golf')
        # Can be fixed with regular expressions: ',Golf,' OR 'BOF Golf,' OR ',Golf EOF'
    
    return df, all_cats
        
all_features_business, all_cats = stringDFColToBinaryCols(all_features_business, 'categories')

  if sys.path[0] == '':


In [31]:
print(all_cats)

['Nightlife', 'SportsBars', 'Restaurants', 'Bars', 'American(Traditional)', 'Pizza', 'HairRemoval', 'NailTechnicians', 'Beauty&Spas', 'NailSalons', 'Waxing', 'DaySpas', 'Electronics', 'ProfessionalServices', 'LocalServices', 'ElectronicsRepair', 'Computers', 'Shopping', 'Automotive', 'AutoRepair', 'Chinese', 'EyelashService', 'TobaccoShops', 'VapeShops', 'CarDealers', 'UsedCarDealers', 'Dentists', 'GeneralDentistry', 'CosmeticDentists', 'PediatricDentists', 'Health&Medical', 'Tex-Mex', 'Mexican', 'Arts&Entertainment', 'Festivals', 'Food', 'FoodTrucks', 'FarmersMarket', 'Portuguese', 'Bakeries', 'ChickenShop', 'Barbeque', 'EventPlanning&Services', 'EventPhotography', 'Photographers', 'SessionPhotography', 'SkinCare', 'Antiques', 'IceCream&FrozenYogurt', 'Donuts', 'SpecialtyFood', 'WebDesign', 'GraphicDesign', 'Marketing', 'RecyclingCenter', 'Caterers', 'Southern', 'ComfortFood', 'Breakfast&Brunch', 'French', 'American(New)', 'Burgers', 'Sandwiches', 'Coffee&Tea', 'Brasseries', 'Gyms', '

In [32]:
print(
    len(all_features_business[all_features_business['Golf']==True]), 
    len(all_features_business[all_features_business['DiscGolf']==True]), 
)

61 1


In [33]:
print(all_features_business[all_features_business['DiscGolf']==True]['categories'].values)
print('Should not have a True value for Golf, but does. Problem to deal with in the future.')
print(all_features_business[all_features_business['DiscGolf']==True]['Golf'].values)

['Sporting Goods, Active Life, Bike Rentals, Disc Golf, Shopping']
Should not have a True value for Golf, but does. Problem to deal with in the future.
[True]


In [34]:
all_features_business.head()

Unnamed: 0,business_id,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars,categories,review_count,Nightlife,SportsBars,Restaurants,Bars,American(Traditional),Pizza,HairRemoval,NailTechnicians,Beauty&Spas,NailSalons,Waxing,DaySpas,Electronics,ProfessionalServices,LocalServices,ElectronicsRepair,Computers,Shopping,Automotive,AutoRepair,Chinese,EyelashService,TobaccoShops,VapeShops,CarDealers,UsedCarDealers,Dentists,GeneralDentistry,CosmeticDentists,PediatricDentists,Health&Medical,Tex-Mex,Mexican,Arts&Entertainment,Festivals,Food,FoodTrucks,FarmersMarket,Portuguese,Bakeries,ChickenShop,Barbeque,EventPlanning&Services,EventPhotography,Photographers,SessionPhotography,SkinCare,Antiques,IceCream&FrozenYogurt,Donuts,SpecialtyFood,WebDesign,GraphicDesign,Marketing,RecyclingCenter,Caterers,Southern,ComfortFood,Breakfast&Brunch,French,American(New),Burgers,Sandwiches,Coffee&Tea,Brasseries,Gyms,ChildCare&DayCare,LeisureCenters,Fitness&Instruction,ActiveLife,HardwareStores,Home&Garden,RealEstate,Condominiums,Hotels,HomeServices,ShoppingCenters,Hotels&Travel,HairSalons,EthnicFood,Turkish,InternationalGrocery,TapasBars,ShippingCenters,PrintingServices,Massage,MassageTherapy,Reflexology,Buffets,Korean,SushiBars,Japanese,Cafes,Soup,Golf,Venues&EventSpaces,AutoDetailing,BodyShops,AutoCustomization,Towing,Trainers,WeightLossCenters,FoodDeliveryServices,FastFood,Delis,Ethiopian,Vegetarian,Painters,DrywallInstallation&Repair,StuccoServices,Orthodontists,Periodontists,OralSurgeons,Piercing,Tattoo,Chiropractors,Optometrists,Italian,Couriers&DeliveryServices,PublicServices&Government,SportingGoods,Fashion,GolfEquipment,Bikes,Ski&SnowboardShops,SportsWear,BikeRepair/Maintenance,Filipino,PetGroomers,Veterinarians,PetSitting,Pets,PetServices,AutoGlassServices,RealEstateServices,RealEstateAgents,Pakistani,Indian,CardioClasses,DanceStudios,ChickenWings,Cosmetics&BeautySupply,Desserts,Sewing&Alterations,Arts&Crafts,Wheel&RimRepair,Tires,AutoParts&Supplies,Colonics,Saunas,Doctors,MedicalSpas,Naturopathic/Holistic,MeditationCenters,Reiki,SpiritualShop,Orthopedists,SportsMedicine,Surgeons,Grocery,MedicalCenters,InteriorDesign,Rugs,FurnitureStores,HomeDecor,Mattresses,Women'sClothing,Men'sClothing,ShoeStores,JuiceBars&Smoothies,Acupuncture,LaserHairRemoval,FamilyPractice,UrgentCare,Thai,AsianFusion,Vietnamese,Laotian,HomeCleaning,CarpetCleaning,Accessories,Barbers,Gluten-Free,SpeechTherapists,PhysicalTherapy,OccupationalTherapy,Seafood,Steakhouses,Wholesalers,DiscountStore,PartySupplies,DepartmentStores,...,Gelato,TelevisionServiceProviders,Fences&Gates,MetalFabricators,ScubaDiving,Diving,DiveShops,WatchRepair,Halotherapy,CulturalCenter,Lakes,Macarons,CustomCakes,Aquariums,BusinessConsulting,BotanicalGardens,PaintStores,Moroccan,Persian/Iranian,DataRecovery,Cajun/Creole,PartyEquipmentRentals,CarBrokers,BootCamps,Musicians,PartyCharacters,MusicProductionServices,Cuban,PuertoRican,RVDealers,RVRental,Bowling,Venezuelan,SummerCamps,PetAdoption,RefinishingServices,PublicTransportation,CommercialTruckDealers,CommercialTruckRepair,FoodStands,CommercialRealEstate,OutletStores,Campgrounds,RVParks,Resorts,TalentAgencies,GutterServices,UsedBookstore,AdultEducation,StripteaseDancers,DanceSchools,Wallpapering,GoldBuyers,PawnShops,Videographers,Arabian,DonationCenter,TravelAgents,Basque,Spanish,WaterDelivery,WaterStores,Kosher,SkateParks,Izakaya,Poutineries,BailBondsmen,PressureWashers,Herbs&Spices,PhotoBoothRentals,CannabisDispensaries,Poke,ArtClasses,Teppanyaki,Oncologist,HotPot,Szechuan,IrishPub,CyclingClasses,MountainBiking,ShoeRepair,ShoeShine,Cupcakes,SafeStores,Hunting&FishingSupplies,RehabilitationCenter,BasketballCourts,CountryClubs,Endocrinologists,Neurologist,Irish,PetCremationServices,PersonalInjuryLaw,Divorce&FamilyLaw,BankruptcyLaw,Immunodermatologists,RetirementHomes,Cantonese,PoleDancingClasses,Rodeo,VinylRecords,Props,Delicatessen,EthnicGrocery,GuestHouses,YelpEvents,RestaurantSupplies,PatioCoverings,Masonry/Concrete,DigitizingServices,Framing,TestPreparation,PrivateTutors,Skydiving,HomeHealthCare,MedicalSupplies,Psychologists,ModernEuropean,Shutters,FabricStores,SouvenirShops,Russian,CheeseShops,CarWindowTinting,FireProtectionServices,FacePainting,Tuscan,Gastroenterologist,Butcher,Blood&PlasmaDonationCenters,German,Keys&Locksmiths,DUILaw,CriminalDefenseLaw,Investing,SmogCheckStations,CarInspectors,BrewingSupplies,HongKongStyleCafe,PublicMarkets,VehicleWraps,Airports,TeethWhitening,RVRepair,CountertopInstallation,MortuaryServices,SnowRemoval,EstatePlanningLaw,Wills,Trusts,&Probates,BusinessLaw,Airlines,Estheticians,Engraving,TrophyShops,CandleStores,PopcornShops,Fishing,TrailerDealers,BeachBars,BeachVolleyball,ArtificialTurf,PanAsian,DJs,Paintball,MiniGolf,GoKarts,Wigs,GolfLessons,Opera&Ballet,Jazz&Blues,Waffles,SolarInstallation,HomeEnergyAuditors,CannabisClinics,Uzbek,Prenatal/PerinatalCare,Hypnosis/Hypnotherapy,Eatertainment,Afghan,HealthInsuranceOffices,BeverageStore,Tiling,Sicilian,Bartenders,SpineSurgeons,Carpenters,Singaporean,SkilledNursing,Live/RawFood,SepticServices,PrintMedia,SkatingRinks,InternetCafes,WineTours,Boating,DemolitionServices,ProductDesign,3DPrinting,RoadsideAssistance,Himalayan/Nepalese,Officiants,Kickboxing,Boxing,CookingClasses,CookingSchools,PersonalChefs,Indonesian,AquariumServices,Brazilian,LaboratoryTesting,HockeyEquipment,SkateShops,RealEstatePhotography,Video/FilmProduction,Sandblasting,Perfume,PrivateJetCharter,SoulFood,Bookbinding,TanningBeds,RealEstateLaw,EmergencyPetHospital,BoatCharters,Rafting/Kayaking,BoudoirPhotography,Argentine,SocialClubs,OutdoorFurnitureStores,SouthAfrican,AcaiBowls,LactationServices,PlacentaEncapsulations,Observatories,Ukrainian,Planetarium,Cabaret,Hakka,Sailing,FireplaceServices,Gunsmith,UniversityHousing,IndoorPlaycentre,Embassy,OliveOil,Karate,LocalFishStores,MotorsportVehicleRepairs,Synagogues,GuitarStores,MobileDentRepair,Paddleboarding,Distilleries,PostOffices,PetTransportation,CurrencyExchange,PastaShops,Smokehouse,Hydrotherapy,Pop-upShops,Videos&VideoGameRental,OxygenBars,ExcavationServices,MobileHomeRepair,PickYourOwnFarms,Farms,Scottish,British,Passport&VisaServices,PianoBars,PoliceDepartments,WeddingChapels,RegistrationServices,FloatSpa,DayCamps,TrainStations,Prosthodontists,MedicalCannabisReferrals,Mongolian,Orthotics,ChristmasTrees,ClubCrawl,ScreenPrinting,HazardousWasteDisposal,EnvironmentalAbatement,LawnServices,HennaArtists,KidsHairSalons,Zoos,EmploymentLaw,DebtReliefServices,VehicleShipping,Hats,BusTours,DinnerTheater,EstateLiquidation,GeneralLitigation,Coffee&TeaSupplies,Soccer,TrailerRepair,Awnings,Pretzels,ArtSpaceRentals,EditorialServices,Honduran,Nicaraguan,Marinas,CareerCounseling,TeamBuildingActivities,TownCarService,PayrollServices,AerialFitness,CremationServices,GolfCartRentals,GolfCartDealers,LivestockFeed&Supply,UltrasoundImagingCenters,GrillingEquipment,LightingStores,Donairs,Falafel,CannabisTours,PersonalAssistants,AcneTreatment,Clowns,Magicians,InstallmentLoans,Prosthetics,ParentingClasses,FoodBanks,StreetArt,Buses,DialysisClinics,Newspapers&Magazines,Cideries,AutoSecurity,TrailerRental,TabletopGames,MedicalTransportation,SoftwareDevelopment,HolidayDecoratingServices,HolidayDecorations,Cambodian,BirdShops,LanguageSchools,SeniorCenters,OsteopathicPhysicians,PetHospice,TrafficSchools,TrafficTicketingLaw,Urologists,Taekwondo,FarmEquipmentRepair,Coffeeshops,Sunglasses,AnimalPhysicalTherapy,Rheumatologists,PartyBikeRentals,Bangladeshi,Vocational&TechnicalSchool,PetWasteRemoval,Pathologists,Aestheticians,PsychicMediums,TastingClasses,WineTastingClasses,BodyContouring,PumpkinPatches,GeneratorInstallation/Repair,AddictionMedicine,VacationRentalAgents,AppraisalServices,Snorkeling,Dominican,Gemstones&Minerals,Cryotherapy,Trinidadian,ImmigrationLaw,SupperClubs,Burmese,AssistedLivingFacilities,PianoServices,HomeownerAssociation,ScavengerHunts,WalkingTours,BeerTours,BartendingSchools,Carousels,ConciergeMedicine,Matchmakers,WellDrilling,SriLankan,Trains,FurnitureRental,Badminton,PetPhotography,TitleLoans,DanceWear,IVHydration,CPRClasses,BikeSharing,NannyServices,Cafeteria,MistingSystemServices,HorseBoarding,Recording&RehearsalStudios,DisabilityLaw,SocialSecurityLaw,HabilitativeServices,CSA,RetinaSpecialists,BoatDealers,HearingAidProviders,PowderCoating,CircuitTrainingGyms,RotisserieChicken,EnvironmentalTesting,BingoHalls,ValetServices,SugarShacks,Austrian,Races&Competitions,Anesthesiologists,HouseSitters,TikiBars,CarShareServices,Squash,VisitorCenters,CheeseTastingClasses,FleaMarkets,WorkersCompensationLaw,Mosques,HolisticAnimalCare,Firewood,FoodTours,VascularMedicine,Tableware,Hydroponics,HighFidelityAudioEquipment,BarCrawl,BounceHouseRentals,BuddhistTemples,DIYAutoShop,HerbalShops,LANCenters,ConveyorBeltSushi,Egyptian,ReligiousSchools,HairLossCenters,Armenian,MotorcycleGear,ElderCarePlanning,BoatTours,BusRental,RacingExperience,HomeStaging,ReligiousItems,Ziplining,Colombian,Rolfing,Haitian,WildlifeControl,ConceptShops,DiscGolf,Drive-InTheater,TaiChi,International,TenantandEvictionLaw,Doulas,Neurotologists,Belgian,EthicalGrocery,Shanghainese,Machine&ToolRental,FirstAidClasses,HealthRetreats,Empanadas,AirportTerminals,RoofInspectors,Airsoft,VocalCoach,TelevisionStations,IceDelivery,Gerontologists,CustomsBrokers,MotorsportVehicleDealers,FlightInstruction,Cheerleading,RockClimbing,BalloonServices,ATVRentals/Tours,MassageSchools,Pool&Billiards,PettingZoos,Toxicologists,WaterParks,AirportLounges,Australian
0,--I7YYLada0tSLkORTHb5Q,-0.013726,-0.000967,0.009651,-0.00913,0.016051,0.000565,-0.006745,0.006949,0.021217,0.017539,0.002967,-0.003814,0.016394,0.011443,-0.010743,0.015422,-0.007492,0.006412,-0.018508,-0.003763,-0.010258,-0.034641,0.003971,0.012807,0.020696,0.008306,0.000858,0.016784,-0.017261,-0.019222,0.018138,0.002409,0.030819,0.007352,-0.009905,0.003296,0.009896,0.020073,0.009849,-0.011995,-0.003346,0.018278,0.011761,-0.008173,-0.002904,0.000559,-0.023254,-0.000756,0.014323,0.021623,0.009647,-0.000923,0.018409,0.00414,-0.01598,0.003323,-0.012654,0.005351,-0.01058,0.004594,-0.014951,0.002638,-0.001018,0.019544,0.009782,0.034841,-0.003797,0.001427,-0.016187,-0.008701,-0.004137,-0.003884,0.012173,-0.004549,-0.008592,-0.019691,-0.01144,-0.001867,0.003111,-0.013886,0.008983,0.011444,0.00786,0.005983,0.007405,0.007301,0.01534,-0.007426,0.025241,-0.018652,-0.003055,0.006571,-0.003629,0.005686,0.008752,0.000279,-0.017226,0.01451,-0.003971,-0.003905,0.016058,-0.022591,0.009685,0.020854,-0.006142,-0.005774,0.009259,0.008155,-0.007269,-0.007763,0.006829,-0.021465,-0.007562,0.003837,-0.003232,-0.010939,-0.001754,-0.019062,-0.013329,-0.009903,-0.001265,0.021219,-0.003071,0.002521,-0.009624,-0.006912,-0.008346,0.018096,-0.008363,-0.004684,0.006336,0.01281,-0.000807,0.008201,0.012035,-0.009213,-0.009921,0.007764,0.008899,0.007216,-0.005796,0.006133,0.013801,-0.015935,-0.007177,0.002236,0.003129,-0.012927,-0.006159,0.004225,0.026937,0.001358,0.012673,-0.004315,-0.002035,0.008295,-0.001315,0.005393,0.003733,0.024581,-0.01967,0.008099,0.003728,-0.018815,0.00101,-0.008273,0.00078,0.007821,-0.010501,0.013363,-0.017681,-0.014448,0.004403,0.000807,-0.007189,0.00873,0.000223,0.014109,0.030282,-0.002383,-0.013673,-0.0035,-0.00739,0.013017,-0.007018,-0.018764,0.004093,0.001161,-0.003004,0.00046,-0.01164,-0.014427,0.006012,-0.003372,0.008276,0.000163,0.005292,0.009312,0.005778,0.01478,0.010748,0.006528,-0.011686,-0.003398,-0.008658,0.005432,0.016919,-0.006005,-0.005508,-0.001077,0.001628,0.004388,-0.006823,0.00876,-0.006772,-0.00931,-0.002256,0.013054,-0.005372,0.019703,-0.004731,-0.01366,-0.003449,-0.00618,0.001936,0.002446,-0.010641,-0.010339,-0.008864,0.019644,-0.009308,-0.014862,0.018339,0.021996,-0.015752,0.000254,-0.016399,-0.011201,0.006879,-0.002404,-0.002863,0.009902,-0.020603,0.010032,0.000563,-0.035289,-0.00262,0.005378,0.001884,-0.026027,0.000318,-0.01816,0.013455,0.03332,0.021081,-0.007704,0.010812,0.000512,-0.000615,0.00342,0.005232,-0.009454,0.020532,-0.004067,0.006363,-2e-05,-0.010406,-0.024523,-0.008722,-0.002289,0.00773,-0.001044,-0.00901,-0.004684,-0.012579,0.006227,0.002931,-0.022759,0.003203,-0.00883,0.006987,-0.013734,-0.02531,-0.002771,-0.002157,-0.018193,-0.002759,0.006644,-0.003563,-0.000256,0.004574,-0.013884,0.010145,-0.018473,0.009391,-0.010135,-0.01557,0.007416,0.000119,0.006864,0.352941,0.352941,0.823529,3.647059,"Nightlife, Sports Bars, Restaurants, Bars, Ame...",96,True,True,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
1,--U98MNlDym2cLn36BBPgQ,-0.005564,-0.000944,-0.00348,0.002853,0.007491,-0.00066,-0.003497,0.002325,0.026685,0.005464,-0.007248,-0.000974,0.018046,0.008357,-0.019039,0.008971,-0.001258,0.003943,-0.025755,-0.00109,0.014484,-0.030295,0.021154,0.005877,0.010187,0.013338,0.007523,0.01294,-0.014777,-0.024164,0.0411,-0.00077,0.007845,0.014877,-0.005499,-0.015006,0.019996,0.004474,0.001769,-0.011596,0.013088,0.00664,-0.003198,-0.006436,0.00995,-0.001843,-0.026097,-0.008505,0.000224,0.004894,0.008225,-0.003288,0.005818,-0.010148,-0.010819,0.005193,-0.005695,0.016921,-0.004829,0.016215,-0.020262,-0.007885,0.012467,0.021577,0.002958,0.00394,0.003204,-0.009085,-0.006976,0.004098,-0.008992,-0.013135,-0.000485,0.005959,-0.002578,-0.020665,-0.020021,-0.001745,0.00531,0.005375,0.026599,-0.002281,0.002476,-0.008919,0.011867,0.011753,0.013241,0.007646,0.015286,-0.019387,-0.000861,0.00991,0.004722,0.006433,0.019949,-0.003987,0.005567,0.023403,-0.00255,-0.009002,0.005454,-0.006714,0.007594,0.016321,-0.000346,-0.000252,0.014122,-0.007365,-0.023506,-0.004877,-0.003139,-0.01621,0.002644,-0.010215,0.010724,-0.00731,0.005211,-0.016495,-0.010979,0.0036,-0.009721,0.012106,-0.00903,0.006079,-0.013465,-0.004483,-0.001365,0.012768,-0.013028,0.000788,0.007313,0.012509,-0.006384,-0.004753,0.008417,-0.011181,-0.01563,0.005005,0.003815,-0.000974,-0.009535,-0.000988,0.012781,-0.020511,-0.006674,0.011453,-0.018501,-0.005196,-0.003227,-0.010871,0.006396,0.004615,-0.000946,-0.004127,-0.003311,-0.000663,0.00251,0.008027,0.000299,0.012433,-0.013752,0.002294,0.004479,-0.019047,0.009406,-0.001238,-0.004585,0.009398,-0.012283,0.013845,-0.004649,-0.009626,0.009543,0.001538,-0.021331,0.012181,-0.00048,0.017829,0.008854,-0.011601,-0.002123,-0.013013,0.00407,-0.002811,-0.017783,-0.006919,0.009057,0.003706,0.001674,-0.004231,-0.022987,-0.002309,0.00444,-0.003228,-0.004219,0.003064,-0.009686,0.006762,0.000211,-0.00358,0.009524,0.005179,-0.007249,0.001858,-0.012122,0.002947,0.008471,0.001587,-0.014388,-0.004259,0.000955,0.00248,-0.0132,0.010114,-0.004476,-0.002501,-0.009151,-0.011466,-0.009791,0.025125,-0.009181,-0.003719,-0.002027,-0.002515,-0.009595,-0.010926,-0.008428,-0.013462,-0.012293,0.012809,-0.01791,-0.001546,0.007263,0.005538,0.001986,-0.011381,-0.00802,-0.013125,0.011645,-0.010927,0.019354,-7.5e-05,-0.012854,-0.002883,0.001277,-0.026105,0.008271,-0.000996,0.004661,-0.016013,-0.002846,-0.014104,0.00492,0.017741,0.011252,-0.002487,0.005355,-0.001425,0.006644,0.007952,0.020606,-0.016269,0.022072,-0.009253,0.004192,0.000376,-0.002333,-0.025644,0.001453,-0.000391,0.011531,0.005834,-0.006042,-0.016676,-0.004796,0.010255,0.001142,-0.019872,-0.000122,-0.00659,0.008564,0.011062,-0.006372,0.000337,0.003716,-0.003676,-0.002182,-0.006977,-0.002101,-0.021149,-0.00186,-0.015195,0.015674,-0.015427,-0.009989,-0.003153,-0.007485,-0.008983,0.005612,-0.005011,0.0,0.0,2.0,3.0,"Pizza, Restaurants",4,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2,--j-kaNMCo1-DYzddCsA5Q,-0.011408,0.004498,-0.005131,0.040393,0.015439,-0.023149,-0.018121,-0.01275,0.048993,-0.003531,-0.024677,-0.032462,-0.015342,0.000457,-0.004219,0.025674,-0.009732,0.007939,0.027066,-0.026292,-0.020693,0.019704,-0.00073,0.006854,0.021814,0.014953,-0.006072,0.02136,-0.024738,-0.02165,-0.003612,0.00143,0.030509,-0.009074,-0.037931,0.03775,-0.011981,0.023262,-0.019438,-0.011353,0.015401,0.012555,-0.026999,-0.012126,0.040942,0.043306,-0.030898,-0.00164,0.025986,-0.013791,-0.013998,0.011637,0.01525,0.014251,0.015818,0.01556,-0.003851,0.014891,0.016089,-0.019991,0.0242,-0.007817,-0.010958,0.026268,0.014333,0.027851,0.021072,0.000597,-0.009619,0.007727,-0.019088,-0.011223,-0.016202,0.008886,-0.018291,0.008942,-0.006631,-0.003544,0.037693,0.03982,0.012824,0.006585,0.002338,0.000905,-0.000615,0.025102,-0.00569,0.039397,0.035818,0.02456,-0.000305,0.018925,-0.020685,-0.029385,-0.015098,-0.010915,0.013361,0.028953,0.000481,-0.005198,-0.00643,-0.033768,0.00138,0.000388,-0.032795,-0.008521,0.005831,-0.005367,-0.006409,-0.059394,0.005388,-0.017854,-0.010545,0.018931,-0.006061,-0.007233,-0.002718,-0.006306,0.001461,-0.003748,-0.001259,-0.005952,-0.000632,-0.013196,-0.000417,-0.010504,0.00345,0.02094,-0.019679,-0.029353,0.01774,-0.010969,-0.019441,0.0028,0.002309,0.004318,0.001813,0.019408,-0.01001,-0.021436,0.028445,0.015566,0.00528,0.009365,-0.015892,-0.001641,-0.001963,-0.009057,0.003751,0.02662,0.045652,-0.016511,0.03674,-0.036992,0.019457,0.005505,-0.00045,0.039375,0.028597,0.012334,-0.017266,-0.001321,0.017346,-0.028994,0.000102,-0.022917,0.022877,-0.006573,-0.018327,0.005066,0.007471,-0.014181,0.013244,-0.002325,0.013869,0.012507,-0.013054,-0.003175,0.022951,0.000244,-0.008114,0.039611,-0.003074,-0.033306,-0.010178,-0.00323,0.014981,0.004657,0.003973,-0.001902,0.000326,-0.011839,0.027661,-0.002376,0.023028,0.010613,-0.007019,0.020521,0.006992,-0.00388,0.008437,0.012277,-0.036878,-0.002529,-0.009768,-0.006173,0.025762,0.037304,-0.002869,0.005554,-0.013933,0.008529,-0.020774,0.012585,0.008027,-0.003281,-0.008263,0.005063,0.016019,0.007121,-0.011471,-0.020077,0.016207,-0.024511,0.0003,-0.012528,0.004442,-0.008622,0.001793,0.000132,-0.013576,-0.018616,0.014267,0.01957,-0.055023,0.009012,0.001078,-0.008804,0.005712,-0.004416,0.011757,0.002296,-0.049666,0.007852,-0.028842,0.02521,-0.018244,-0.029429,0.005054,-0.01895,0.003347,-0.020108,0.01701,0.01176,0.004549,-0.037733,-0.02671,-0.008685,0.013084,0.022865,-0.010566,0.001083,0.040527,0.006276,0.009053,0.002601,-0.033974,-0.044425,-0.013743,-0.018428,0.009389,-0.031036,0.012233,0.012775,0.02045,0.001214,-0.01151,0.000485,-0.004856,0.010159,0.004035,-0.000834,-0.022557,0.011678,0.022661,-0.029659,-0.002098,0.005847,-0.001321,0.004675,-0.027056,-0.014537,0.033893,-0.032624,0.037274,-0.004122,-0.024268,-0.00463,-0.013603,0.012318,0.0,0.0,0.0,5.0,"Hair Removal, Nail Technicians, Beauty & Spas,...",4,False,False,False,False,False,False,True,True,True,True,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
3,--wIGbLEhlpl_UeAIyDmZQ,0.03603,-0.004617,-0.013974,0.020747,0.038429,-0.005669,-0.002198,0.015588,-0.002853,-0.009565,-0.002643,-0.014395,-0.012292,0.00845,0.007603,-0.003827,0.021573,0.008389,-0.002452,0.010293,0.022692,0.002092,-0.011285,0.0121,-0.02604,0.003841,-0.000393,0.008389,-0.013863,-0.019479,0.035048,0.008522,-0.007378,0.004424,-0.009327,-0.014849,0.0085,-0.023346,-0.022681,-0.015705,0.023771,-0.001025,-0.022194,0.019416,0.001624,0.011531,-0.015406,-0.00171,0.009043,0.00396,-0.006161,-0.000354,-0.00407,0.00208,0.008475,-0.012978,-0.008557,0.008782,-0.007786,0.002105,-0.015182,0.015293,0.009434,-0.001137,0.007613,-0.016997,0.002806,-0.018171,-0.011341,0.022138,-0.023217,-0.013575,-0.016987,-0.003429,0.008313,-0.000669,-0.021147,0.000362,0.005093,0.021298,0.014194,-0.015987,-0.020088,-0.006929,-0.010896,0.006858,0.005904,0.031907,0.025973,-0.01457,0.002702,0.020056,-0.01249,-0.010422,0.011489,-0.012906,0.011035,0.03269,-0.011232,0.001898,0.01185,0.01813,-0.012305,-0.008048,0.00629,0.000533,0.01797,-0.003583,-0.01239,-0.017582,-0.006649,0.002712,-0.007194,-0.01096,0.008641,0.001024,-0.003371,-0.013689,-0.002284,0.01018,-0.005039,-0.01597,-0.004896,0.004551,0.009634,-0.001396,-0.004828,0.017685,-0.008758,-0.018465,0.010553,0.004184,-0.00454,-0.003628,-0.002454,-0.006859,8.7e-05,0.007116,0.008144,-0.013943,0.007145,0.003522,-0.001099,-0.000666,0.00379,0.006113,0.003153,0.005155,0.006425,-0.013315,-0.012087,-0.012384,0.007106,0.002679,0.010285,-0.005324,-0.004294,0.017046,0.011173,0.003112,-0.012589,-0.014093,0.008187,-0.009397,0.014803,0.003589,0.022513,-0.011319,0.02168,-0.000437,0.001318,-0.009923,0.007855,-0.006369,-0.016201,0.012389,0.005126,-0.000259,0.008205,0.002956,0.007049,-9e-06,-0.003225,-0.007604,-0.022469,0.012772,0.014779,-0.02159,0.001233,-0.000396,-0.005885,-0.002753,0.01871,-0.006596,-0.0002,0.002954,0.001937,0.008445,0.00241,-0.007021,0.010886,-0.017201,-0.007942,0.005915,-0.014905,-0.00642,-0.003849,0.01359,-0.002348,0.005322,0.000939,0.015895,-0.042152,-0.007171,0.006589,0.006256,-0.023661,-0.000313,0.015247,-0.00439,-0.034916,-0.001136,-0.02216,0.001242,-0.010273,-0.014189,0.002097,-0.003391,-0.000624,0.015829,-0.023511,-0.009568,-0.015818,-0.016161,0.001785,-0.006375,-0.00487,-0.010851,-0.007529,-0.019133,0.006462,-0.022415,-0.008734,-0.018169,-0.012631,0.023061,0.013576,-0.019669,-0.00891,0.005105,0.022213,-0.013286,0.011676,-0.006114,-0.018936,-0.007927,-0.00477,0.010402,0.00524,-0.000165,-0.000383,0.002561,0.001053,-0.001729,-0.004861,0.000307,0.008562,-0.008165,-0.000832,-0.009696,0.006255,0.009771,-0.002139,-0.000518,0.000565,0.007697,-0.014454,-0.003493,-0.021064,0.001639,0.001644,0.004618,0.002439,0.005728,0.024569,-0.016353,-0.008612,-0.008367,0.003,-0.001397,-0.009965,0.001797,0.001847,0.004512,-0.01266,0.014142,-0.015593,-0.015314,0.012027,-0.008484,0.666667,0.166667,3.0,3.833333,"Electronics, Professional Services, Local Serv...",14,False,False,False,False,False,False,False,False,False,False,False,False,True,True,True,True,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,-000aQFeK6tqVLndf7xORg,0.037063,-0.009315,-0.026367,0.034366,0.063646,0.002534,-0.017493,0.019709,0.01087,-0.006368,-0.008021,-0.0156,-0.005306,0.002922,-0.01058,0.001384,0.017555,0.014387,0.016768,-0.00018,0.016056,0.006687,-0.007475,0.019319,-0.027481,0.013232,0.011752,0.003663,-0.015931,-0.024199,0.029718,0.011359,-0.004956,0.005804,-0.020293,0.001284,0.015996,-0.029603,-0.041141,-0.014429,0.021545,0.000492,-0.034298,0.019427,0.002762,0.021009,-0.032378,-0.004774,0.017271,0.00297,0.004246,0.006514,-0.007337,-0.001333,0.012854,-0.010315,-0.001858,0.008479,0.000681,-0.006574,-0.004562,0.022622,0.008401,0.006384,0.00242,-0.019281,-0.005052,-0.010938,0.002747,0.018117,-0.034967,-0.018739,-0.013164,0.006984,0.014279,0.002119,-0.02226,0.005805,0.006742,0.033168,0.012665,0.004269,-0.013083,-0.006228,-0.022316,0.008169,0.00291,0.05077,0.036307,-0.018526,0.012091,0.01554,-0.017117,-0.003561,0.023017,-0.012839,0.017719,0.033495,-0.00655,0.005886,0.021863,0.021454,-0.018608,-0.001935,0.002049,0.001544,0.022386,-0.000454,-0.011431,-0.022651,0.000362,0.004123,-0.00905,-0.001328,0.007845,0.000739,0.006232,-0.007364,0.002612,0.00221,0.004153,-0.034616,0.001015,-0.005685,0.01145,-0.008641,-0.010007,0.018973,-0.011076,-0.016471,0.005733,-0.004936,-0.01818,-0.001098,-0.001558,-0.000404,0.012638,0.004509,0.00615,-0.017909,0.017598,0.01067,-0.005524,0.016061,-0.002045,0.022372,0.01295,0.012163,0.028965,-0.011939,0.006377,-0.000237,0.015302,-0.00127,0.008944,-0.011075,-0.012375,0.01696,0.010727,0.009859,-0.031381,-0.000325,0.010991,-0.011284,0.011155,-0.004994,0.031776,-0.00276,0.022864,0.008213,0.005658,-0.015535,0.008584,-0.00343,-0.021115,0.014513,-0.010297,0.000324,0.016335,0.005401,0.009448,0.002699,-0.01129,-0.012861,-0.011274,-0.00531,0.018629,-0.011298,0.007405,0.000958,-0.010471,-0.00069,0.018937,-0.009247,-0.000434,0.004826,-0.003482,0.020534,0.012395,-0.010455,0.010883,-0.022955,-0.01066,0.015764,-0.011173,0.000755,-0.005792,0.022928,0.00315,0.016144,0.010836,0.021907,-0.048891,-0.000169,0.005724,0.008939,-0.023912,0.012689,0.016462,-0.012716,-0.037714,0.000926,-0.029815,-0.008867,-0.018469,-0.01903,0.001562,-0.016534,0.009348,-0.001217,-0.030756,-0.015793,-0.009945,-0.001586,-0.018029,-0.008044,-0.006953,-0.018411,-0.013626,-0.021759,0.004791,-0.021245,-0.018038,-0.03359,-0.023835,0.024545,-0.000545,-0.017671,-0.002003,0.006448,0.030536,-0.029008,0.014564,-0.011317,-0.013324,-0.023969,-0.010229,0.008008,0.01544,0.021932,-0.009427,-1.3e-05,0.01224,0.014289,0.004621,-0.002303,0.013994,-0.013895,-0.006972,-0.010657,0.017143,0.020828,0.013929,0.007456,0.007746,-0.012964,-0.025667,0.002786,-0.020329,0.004065,0.015754,0.020314,0.008037,2.7e-05,0.034369,-0.016854,0.009845,-0.012831,0.00527,-0.01097,-0.006481,0.012282,0.006312,0.002442,0.007883,0.003245,-0.032445,-0.011714,0.013191,-0.002767,0.666667,0.0,0.0,5.0,"Automotive, Auto Repair",7,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False


In [35]:
# Clean

# Remove rows with NaNs
print('Before: ', len(all_features_business))
all_features_business = all_features_business.dropna(axis=0)
print('After:  ', len(all_features_business))

Before:  13943
After:   13922


In [80]:
# First, shuffle the dataframe 
# and reset the index. (Makes for easier handling of train/test later)
all_features_business = all_features_business.sample(frac=1).reset_index(drop=True)

# Create final y and x 
y_df = all_features_business[all_cats]
x_cols = [ele for ele in all_features_business.columns if ele not in all_cats+['categories', 'business_id']]
# May also want to remove from x_cols: 'cool', 'funny', 'useful', 'stars', 'categories', 'review_count' 
# May also want to drop rows that do not contain more than a threshold number of reviews (20?, 100?)
x_df = all_features_business[x_cols]

# Numpy arrays
x = x_df.values
y = y_df.values

# Classifier wants 1/0, not T/F
y = y.astype(int)

# Split into Train/Test sets
def splitSets(x, y, test_size=0.2):
    test_size_absolute = np.int(test_size * len(x))
    X_test, X_train = x[:test_size_absolute,:], x[test_size_absolute:,:]
    y_test, y_train = y[:test_size_absolute,:], y[test_size_absolute:,:]
    return X_train, X_test, y_train, y_test
    
test_size = 0.2
X_train, X_test, y_train, y_test = splitSets(x, y, test_size=test_size)

In [81]:
y_test

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [1, 0, 1, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

# Category Prediction

In [82]:
# Multilabel Classification
# RandomForestClassifier supports multilabel classification

# Most other classifiers will require use of 
    # sklearn.multioutput.MultiOutputClassifier to run a separate classifier model for each targe
    
from sklearn.ensemble import RandomForestClassifier

In [83]:
rfc = RandomForestClassifier(n_estimators=10, n_jobs=-1)

In [84]:
rfc.fit(X_train,y_train)

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=-1,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)

## Recall (and other classification metrics)

In our case we want a Recall = TPR (True Positive Rate) close to 1 since we want to Recall ALL correct categories. 

The only requirement we have for Precision is that it be less than 1. This is because we want some FPs (False Positives) since these are what WE ARE RECOMMENDING!!

In [85]:
from sklearn.metrics import classification_report

y_predict = rfc.predict(X_test)
print(classification_report(y_test, y_predict, target_names=all_cats))

                                precision    recall  f1-score   support

                     Nightlife       0.84      0.30      0.44       233
                    SportsBars       0.00      0.00      0.00        36
                   Restaurants       0.91      0.88      0.89       964
                          Bars       0.81      0.27      0.41       259
         American(Traditional)       0.00      0.00      0.00         0
                         Pizza       0.97      0.31      0.47       106
                   HairRemoval       0.33      0.02      0.03        66
               NailTechnicians       0.00      0.00      0.00         4
                   Beauty&Spas       0.99      0.58      0.73       264
                    NailSalons       0.92      0.45      0.61        75
                        Waxing       0.50      0.02      0.05        41
                       DaySpas       1.00      0.03      0.05        37
                   Electronics       0.00      0.00      0.00  

  'precision', 'predicted', average, warn_for)
  'recall', 'true', average, warn_for)
  'precision', 'predicted', average, warn_for)


In [86]:
from sklearn.metrics import recall_score 

recall_all_cats = recall_score(y_test, y_predict, average=None)
recall_all_cats

  'recall', 'true', average, warn_for)


array([0.30042918, 0.        , 0.87966805, ..., 0.        , 0.        ,
       0.        ])

## Top RECOMMENDATIONS

Look at the top NONMATCHING RESULTS, which are the top recommendations!

In [87]:
y_proba = rfc.predict_proba(X_test)

In [88]:
print( len(y_proba), ' L')
print( len(y_proba[0]), ' W')
print( len(y_proba[0][0]), " D (0: False prob'y, 1: True prob'y)")

1090  L
2784  W
2  D (0: False prob'y, 1: True prob'y)


In [89]:
y_proba[0][0]

array([0.7, 0.3])

In [90]:
reccs_binary = (y_test == 0) & (y_predict == 1)
reccs_binary

array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False,  True, ..., False, False, False],
       [False, False, False, ..., False, False, False]])

In [91]:
all_cats_ser = pd.Series(data=all_cats)

In [92]:
all_cats_true = []
all_cats_recc = []
for biz in range(len(y_test)):
    cats_true = ', '.join(list(all_cats_ser[y_test[biz,:]==1]))
    all_cats_true.append(cats_true)
    
    cats_recc = ', '.join(list(all_cats_ser[reccs_binary[biz,:]==True]))
    all_cats_recc.append(cats_recc)

reccs_df = pd.DataFrame(data=all_cats_true, columns=['Labeled'])
reccs_df['Recommended'] = all_cats_recc
reccs_df.tail()

Unnamed: 0,Labeled,Recommended
2779,"Veterinarians, Pets",
2780,"Food, ConvenienceStores",Shopping
2781,"Shopping, Home&Garden, HomeDecor, Kitchen&Bath",
2782,"Food, IceCream&FrozenYogurt",Restaurants
2783,"LocalServices, PestControl",


In [93]:
list(all_features_business.columns)

['business_id',
 'w2v0',
 'w2v1',
 'w2v2',
 'w2v3',
 'w2v4',
 'w2v5',
 'w2v6',
 'w2v7',
 'w2v8',
 'w2v9',
 'w2v10',
 'w2v11',
 'w2v12',
 'w2v13',
 'w2v14',
 'w2v15',
 'w2v16',
 'w2v17',
 'w2v18',
 'w2v19',
 'w2v20',
 'w2v21',
 'w2v22',
 'w2v23',
 'w2v24',
 'w2v25',
 'w2v26',
 'w2v27',
 'w2v28',
 'w2v29',
 'w2v30',
 'w2v31',
 'w2v32',
 'w2v33',
 'w2v34',
 'w2v35',
 'w2v36',
 'w2v37',
 'w2v38',
 'w2v39',
 'w2v40',
 'w2v41',
 'w2v42',
 'w2v43',
 'w2v44',
 'w2v45',
 'w2v46',
 'w2v47',
 'w2v48',
 'w2v49',
 'w2v50',
 'w2v51',
 'w2v52',
 'w2v53',
 'w2v54',
 'w2v55',
 'w2v56',
 'w2v57',
 'w2v58',
 'w2v59',
 'w2v60',
 'w2v61',
 'w2v62',
 'w2v63',
 'w2v64',
 'w2v65',
 'w2v66',
 'w2v67',
 'w2v68',
 'w2v69',
 'w2v70',
 'w2v71',
 'w2v72',
 'w2v73',
 'w2v74',
 'w2v75',
 'w2v76',
 'w2v77',
 'w2v78',
 'w2v79',
 'w2v80',
 'w2v81',
 'w2v82',
 'w2v83',
 'w2v84',
 'w2v85',
 'w2v86',
 'w2v87',
 'w2v88',
 'w2v89',
 'w2v90',
 'w2v91',
 'w2v92',
 'w2v93',
 'w2v94',
 'w2v95',
 'w2v96',
 'w2v97',
 'w2v98',
 'w2

In [94]:
reccs_df['categories'] = all_features_business['categories'].iloc[:len(reccs_df)]
reccs_df['business_id'] = all_features_business['business_id'].iloc[:len(reccs_df)]
reccs_df.tail()

Unnamed: 0,Labeled,Recommended,categories,business_id
2779,"Veterinarians, Pets",,"Pets, Veterinarians",XCNFeIUMsMV5_NaHi6bHGw
2780,"Food, ConvenienceStores",Shopping,"Food, Convenience Stores",lHZgNaqcTTTu2qX8j2qzxA
2781,"Shopping, Home&Garden, HomeDecor, Kitchen&Bath",,"Shopping, Home & Garden, Home Decor, Kitchen & Bath",H7rpWv02D6WTu6IpNNDkWw
2782,"Food, IceCream&FrozenYogurt",Restaurants,"Food, Ice Cream & Frozen Yogurt",EwN1LCoJXB0z_a-LxLFKyQ
2783,"LocalServices, PestControl",,"Pest Control, Local Services",Q_BzLkCJWf0iB1wfRWcnZw


In [95]:
# This is where I need to pick up. I need to match the dataframes 
# so that I can match reviews etc and judge how well the recommender is doing

In [96]:
list(all_features_business['categories'].tail())

['Chicken Wings, Fast Food, Restaurants, Chicken Shop',
 'Home Services, Junk Removal & Hauling, Hotels & Travel, Piano Services, Automotive, Musical Instrument Services, Movers, Local Services, Couriers & Delivery Services, Vehicle Shipping, Transportation',
 'Local Services, Dry Cleaning & Laundry, Laundry Services',
 'Food, Street Vendors, Farmers Market',
 'Real Estate, Home Services, Apartments']

In [97]:
len(reccs_df[reccs_df['Recommended']!=''])

296

In [98]:
reccs_df[reccs_df['Recommended']!=''].sort_values(by='Recommended')

Unnamed: 0,Labeled,Recommended,categories,business_id
998,"Arts&Entertainment, Festivals",ActiveLife,"Festivals, Arts & Entertainment",8PhLT_WRxnrZfKWICllMUQ
643,"Nightlife, Bars, EventPlanning&Services, ActiveLife, Venues&EventSpaces, Pubs, RecreationCenters, SportsClubs, Beaches, BeachBars, BeachVolleyball",Arts&Entertainment,"Recreation Centers, Pubs, Beach Bars, Bars, Nightlife, Active Life, Venues & Event Spaces, Beach Volleyball, Event Planning & Services, Beaches, Sports Clubs",8XSC5y10zpdDITOC-pUK5Q
885,"Restaurants, Vietnamese",AsianFusion,"Vietnamese, Restaurants",KEGLWeFAWXvo0W2LnujhtQ
1526,"Automotive, AutoParts&Supplies",AutoRepair,"Automotive, Auto Parts & Supplies",QgXTuzc_i8dO_p2XD7PP0Q
2088,"Automotive, AutoGlassServices, Wheel&RimRepair, Tires, CarWash",AutoRepair,"Tires, Wheel & Rim Repair, Automotive, Auto Glass Services, Car Wash",qNChnVaoWDS-RFd9NDBPEA
741,"Automotive, CarDealers, AutoParts&Supplies",AutoRepair,"Auto Parts & Supplies, Automotive, Car Dealers",gazN60LwAxWUVk2xo_3KbA
780,"Automotive, BodyShops",AutoRepair,"Automotive, Body Shops",va8PGeWUCaCnKsRPXGp1LQ
1128,"Automotive, Hotels, Hotels&Travel, MotorcycleRepair, MotorcycleRental, MotorcycleDealers",AutoRepair,"Motorcycle Repair, Motorcycle Rental, Motorcycle Dealers, Automotive, Hotels & Travel",-ucQnELMVRIUOi3-Kv5r0Q
978,"Automotive, AutoParts&Supplies",AutoRepair,"Auto Parts & Supplies, Automotive",jubHvev4-PO1PBNHFD73vw
775,"Automotive, Wheel&RimRepair, Tires, AutoParts&Supplies","AutoRepair, OilChangeStations","Wheel & Rim Repair, Automotive, Auto Parts & Supplies, Tires",FlUimPjl-6AkZk0neebkWA


In [99]:
dfreviews.columns

Index(['business_id', 'cool', 'date', 'funny', 'review_id', 'stars', 'text',
       'useful', 'user_id'],
      dtype='object')

In [100]:
pd.set_option('display.max_colwidth',2000)
dfreviews[dfreviews['business_id']=='QZkSIa1Be9QIFLD6NUBxPQ']['text']

24768                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   

In [101]:
# Given X_test row, find identical row in all_features_business and use that to find 'business_id'
all_features_business[x_cols].values == X_test[0]

array([[ True,  True,  True, ...,  True,  True,  True],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]])

In [102]:
X_test.shape

(2784, 305)

In [103]:
all_features_business[x_cols].values.shape

(13922, 305)