<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Acknowledgements" data-toc-modified-id="Acknowledgements-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Acknowledgements</a></span></li><li><span><a href="#Prepare-data-and-model" data-toc-modified-id="Prepare-data-and-model-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Prepare data and model</a></span></li><li><span><a href="#Make-feature-matrix-(word2vec,-votes,-stars)" data-toc-modified-id="Make-feature-matrix-(word2vec,-votes,-stars)-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Make feature matrix (word2vec, votes, stars)</a></span></li><li><span><a href="#Create-Label-y-(Business-categories)" data-toc-modified-id="Create-Label-y-(Business-categories)-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Create Label y (Business categories)</a></span></li><li><span><a href="#Join-x,y-(feature-matrix,-category)-using-business_id" data-toc-modified-id="Join-x,y-(feature-matrix,-category)-using-business_id-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Join x,y (feature matrix, category) using business_id</a></span></li><li><span><a href="#Category-Prediction" data-toc-modified-id="Category-Prediction-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Category Prediction</a></span><ul class="toc-item"><li><span><a href="#Recall-(and-other-classification-metrics)" data-toc-modified-id="Recall-(and-other-classification-metrics)-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>Recall (and other classification metrics)</a></span></li><li><span><a href="#RECOMMENDATIONS" data-toc-modified-id="RECOMMENDATIONS-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>RECOMMENDATIONS</a></span></li></ul></li></ul></div>

# Acknowledgements
Thanks to the tutorial: https://www.kaggle.com/c/word2vec-nlp-tutorial/overview/part-3-more-fun-with-word-vectors

# Prepare data and model

In [1]:
%matplotlib inline
import pandas as pd
pd.options.display.max_columns = 999
pd.options.display.max_rows=999
import numpy as np
import matplotlib.pyplot as plt

import re

import nltk
import nltk.data
nltk.download('stopwords')
from nltk.corpus import stopwords # Import the stop word list



[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/daviderickson/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [2]:
def load_reviews(size='small'): 
    if size == 'small':
        filename = r'../../data/small-review.json'
    elif size == 'intermediate':
        filename = r'../../data/intermediate-review.json'
    elif size == 'full':
        filename = r'../../data/review.json'
    new_list = []
    for line in open(filename):
       new_list.append(json.loads(line))
    return pd.DataFrame.from_records(new_list)

dfreviews = load_reviews(size='intermediate')

In [3]:
dfreviews.head()

Unnamed: 0,business_id,cool,date,funny,review_id,stars,text,useful,user_id
0,ujmEBvifdJM6h6RLv4wQIg,0,2013-05-07 04:34:36,1,Q1sbwvVQXV2734tPgoKj4Q,1.0,Total bill for this horrible service? Over $8G...,6,hG7b0MtEbXx5QzbzE6C_VA
1,NZnhc2sEQy3RmzKTZnqtwQ,0,2017-01-14 21:30:33,0,GJXCdrto3ASJOqKeVWPi6Q,5.0,I *adore* Travis at the Hard Rock's new Kelly ...,0,yXQM5uF2jS6es16SJzNHfg
2,WTqjgwHlXbSFevF32_DJVw,0,2016-11-09 20:09:03,0,2TzJjDVDEuAW6MR5Vuc1ug,5.0,I have to say that this office really has it t...,3,n6-Gk65cPZL6Uz8qRm3NYw
3,ikCg8xy5JIg_NGPx-MSIDA,0,2018-01-09 20:56:38,0,yi0R0Ugj_xUx_Nek0-_Qig,5.0,Went in for a lunch. Steak sandwich was delici...,0,dacAIZ6fTM6mqwW5uxkskg
4,b1b1eb3uo-w561D0ZfCEiQ,0,2018-01-30 23:07:38,0,11a8sVPMUFtaC7_ABRkmtw,1.0,Today was my second out of three sessions I ha...,7,ssoyf2_x0EQMed6fgHeMyQ


In [4]:
dfreviews.columns

Index(['business_id', 'cool', 'date', 'funny', 'review_id', 'stars', 'text',
       'useful', 'user_id'],
      dtype='object')

In [5]:
dfreviews['text'][0]

'Total bill for this horrible service? Over $8Gs. These crooks actually had the nerve to charge us $69 for 3 pills. I checked online the pills can be had for 19 cents EACH! Avoid Hospital ERs at all costs.'

In [6]:
# For simplicity, drop anything that isn't a letter
# Numbers and symbols may have interesting meaning and could be explore later

def lettersOnly(string):
    return re.sub("[^a-zA-Z]", " ", string) 

dfreviews['text'] = dfreviews['text'].apply(lettersOnly)


In [7]:
dfreviews['text'][0]

'Total bill for this horrible service  Over   Gs  These crooks actually had the nerve to charge us     for   pills  I checked online the pills can be had for    cents EACH  Avoid Hospital ERs at all costs '

In [8]:
def review_to_wordlist(string, remove_stopwords=False):
    string = re.sub("[^a-zA-Z]", " ", string) # keep only letters. more complex model possible later
    words =  string.lower().split() # make everything lowercase. split into words
    if remove_stopwords:
        stops = set(stopwords.words('english')) # create a fast lookup for stopwords
        words = [w for w in words if not w in stops] # remove stopwords
    return( words) # return a list of words
    
# dfreviews['text'] = dfreviews['text'].apply(review_to_words) # apply to reviews in dataframe


In [9]:
# Word2Vec expects single sentences, each one as a list of words

# Load the punkt tokenizer
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')

# Define a function to split a review into parsed sentences
def review_to_sentences( review, tokenizer, remove_stopwords=False ):
    # Function to split a review into parsed sentences. Returns a 
    # list of sentences, where each sentence is a list of words
    #
    # 1. Use the NLTK tokenizer to split the paragraph into sentences
    raw_sentences = tokenizer.tokenize(review.strip())
    #
    # 2. Loop over each sentence
    sentences = []
    for raw_sentence in raw_sentences:
        # If a sentence is empty, skip it
        if len(raw_sentence) > 0:
            # Otherwise, call review_to_wordlist to get a list of words
            sentences.append( review_to_wordlist( raw_sentence, \
              remove_stopwords ))
    #
    # Return the list of sentences (each sentence is a list of words,
    # so this returns a list of lists
    return sentences

In [10]:
sentences = []  # Initialize an empty list of sentences

print("Parsing sentences")
for review in dfreviews["text"]:
    sentences += review_to_sentences(review, tokenizer)

Parsing sentences


In [11]:
# Import the built-in logging module and configure it so that Word2Vec 
# creates nice output messages
import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',\
    level=logging.INFO)

# Set values for various parameters
num_features = 300    # Word vector dimensionality                      
min_word_count = 40   # Minimum word count                        
num_workers = 4       # Number of threads to run in parallel
context = 10          # Context window size                                                                                    
downsampling = 1e-3   # Downsample setting for frequent words

# Initialize and train the model (this will take some time)
from gensim.models import word2vec
print("Training model...")
model = word2vec.Word2Vec(sentences, workers=num_workers, \
            size=num_features, min_count = min_word_count, \
            window = context, sample = downsampling)

# If you don't plan to train the model any further, calling 
# init_sims will make the model much more memory-efficient.
model.init_sims(replace=True)

# It can be helpful to create a meaningful model name and 
# save the model for later use. You can load it later using Word2Vec.load()
model_name = "300features_40minwords_10context"
model.save(model_name)

2020-01-22 12:49:05,613 : INFO : 'pattern' package not found; tag filters are not available for English
2020-01-22 12:49:05,631 : INFO : collecting all words and their counts
2020-01-22 12:49:05,632 : INFO : PROGRESS: at sentence #0, processed 0 words, keeping 0 word types


Training model...


2020-01-22 12:49:05,845 : INFO : PROGRESS: at sentence #10000, processed 1088334 words, keeping 25539 word types
2020-01-22 12:49:06,032 : INFO : PROGRESS: at sentence #20000, processed 2172597 words, keeping 35463 word types
2020-01-22 12:49:06,229 : INFO : PROGRESS: at sentence #30000, processed 3251616 words, keeping 42649 word types
2020-01-22 12:49:06,588 : INFO : PROGRESS: at sentence #40000, processed 4373996 words, keeping 48893 word types
2020-01-22 12:49:06,865 : INFO : PROGRESS: at sentence #50000, processed 5471587 words, keeping 53964 word types
2020-01-22 12:49:07,115 : INFO : PROGRESS: at sentence #60000, processed 6570064 words, keeping 58362 word types
2020-01-22 12:49:07,313 : INFO : PROGRESS: at sentence #70000, processed 7667364 words, keeping 62704 word types
2020-01-22 12:49:07,518 : INFO : PROGRESS: at sentence #80000, processed 8768955 words, keeping 66443 word types
2020-01-22 12:49:07,721 : INFO : PROGRESS: at sentence #90000, processed 9872097 words, keeping 

2020-01-22 12:49:56,094 : INFO : EPOCH 3 - PROGRESS: at 77.19% examples, 393850 words/s, in_qsize 7, out_qsize 2
2020-01-22 12:49:57,095 : INFO : EPOCH 3 - PROGRESS: at 80.32% examples, 384762 words/s, in_qsize 7, out_qsize 0
2020-01-22 12:49:58,122 : INFO : EPOCH 3 - PROGRESS: at 85.05% examples, 383075 words/s, in_qsize 8, out_qsize 0
2020-01-22 12:49:59,142 : INFO : EPOCH 3 - PROGRESS: at 90.29% examples, 384375 words/s, in_qsize 7, out_qsize 0
2020-01-22 12:50:00,149 : INFO : EPOCH 3 - PROGRESS: at 93.99% examples, 379255 words/s, in_qsize 7, out_qsize 0
2020-01-22 12:50:01,153 : INFO : EPOCH 3 - PROGRESS: at 97.27% examples, 373284 words/s, in_qsize 7, out_qsize 0
2020-01-22 12:50:01,675 : INFO : worker thread finished; awaiting finish of 3 more threads
2020-01-22 12:50:01,694 : INFO : worker thread finished; awaiting finish of 2 more threads
2020-01-22 12:50:01,701 : INFO : worker thread finished; awaiting finish of 1 more threads
2020-01-22 12:50:01,727 : INFO : worker thread fi

In [12]:
model.most_similar('pizza')

  """Entry point for launching an IPython kernel.


[('pepperoni', 0.6951333284378052),
 ('crust', 0.6904908418655396),
 ('pizzas', 0.6607687473297119),
 ('calzone', 0.6115842461585999),
 ('margherita', 0.6036222577095032),
 ('lasagna', 0.5305042862892151),
 ('mozzarella', 0.5288822650909424),
 ('slice', 0.5260140299797058),
 ('subs', 0.5125235319137573),
 ('pasta', 0.5124855041503906)]

In [13]:
model.most_similar('service')

  """Entry point for launching an IPython kernel.


[('waitstaff', 0.5213185548782349),
 ('staff', 0.47348546981811523),
 ('bartenders', 0.418529748916626),
 ('servers', 0.4099265933036804),
 ('communication', 0.40983614325523376),
 ('personality', 0.40979495644569397),
 ('hospitality', 0.39634326100349426),
 ('hostesses', 0.392345666885376),
 ('food', 0.38768860697746277),
 ('experience', 0.38350218534469604)]

In [14]:
model.most_similar('bad')

  """Entry point for launching an IPython kernel.


[('terrible', 0.6226011514663696),
 ('horrible', 0.5923745036125183),
 ('good', 0.5514881610870361),
 ('awful', 0.5388085842132568),
 ('poor', 0.5210546255111694),
 ('disappointing', 0.500664234161377),
 ('subpar', 0.48986661434173584),
 ('alright', 0.48567622900009155),
 ('greatest', 0.46916308999061584),
 ('okay', 0.45262426137924194)]

In [15]:
import numpy as np  # Make sure that numpy is imported

def makeFeatureVec(words, model, num_features):
    # Function to average all of the word vectors in a given
    # paragraph
    #
    # Pre-initialize an empty numpy array (for speed)
    featureVec = np.zeros((num_features,),dtype="float32")
    #
    nwords = 0.
    # 
    # WV.Index2word is a list that contains the names of the words in 
    # the model's vocabulary. Convert it to a set, for speed 
    index2word_set = set(model.wv.index2word)
    #
    # Loop over each word in the review and, if it is in the model's
    # vocaublary, add its feature vector to the total
    for word in words:
        if word in index2word_set: 
            nwords = nwords + 1.
            featureVec = np.add(featureVec,model[word])
    # 
    # Divide the result by the number of words to get the average
    featureVec = np.divide(featureVec,nwords)
    return featureVec


def getAvgFeatureVecs(reviews, model, num_features):
    # Given a set of reviews (each one a list of words), calculate 
    # the average feature vector for each one and return a 2D numpy array 
    # 
    # Initialize a counter
    counter = int(0.)
    # 
    # Preallocate a 2D numpy array, for speed
    reviewFeatureVecs = np.zeros((len(reviews),num_features),dtype="float32")
    # 
    # Loop through the reviews
    for review in reviews:
       #
       # Print a status message every 1000th review
       if counter%1000. == 0.:
           print ("Review %d of %d" % (counter, len(reviews)))
       # 
       # Call the function (defined above) that makes average feature vectors
       reviewFeatureVecs[counter] = makeFeatureVec(review, model, \
           num_features)
       #
       # Increment the counter
       counter = counter + 1
    return reviewFeatureVecs

In [16]:
# ****************************************************************
# Calculate average feature vectors
# using the functions we defined above. Notice that we now use stop word
# removal.

clean_reviews = []
for review in dfreviews["text"]:
    clean_reviews.append( review_to_wordlist( review, \
        remove_stopwords=True ))

reviewDataVecs = getAvgFeatureVecs( clean_reviews, model, num_features )

Review 0 of 100000




Review 1000 of 100000
Review 2000 of 100000
Review 3000 of 100000
Review 4000 of 100000
Review 5000 of 100000
Review 6000 of 100000
Review 7000 of 100000
Review 8000 of 100000




Review 9000 of 100000
Review 10000 of 100000
Review 11000 of 100000
Review 12000 of 100000
Review 13000 of 100000
Review 14000 of 100000
Review 15000 of 100000
Review 16000 of 100000
Review 17000 of 100000
Review 18000 of 100000
Review 19000 of 100000
Review 20000 of 100000
Review 21000 of 100000
Review 22000 of 100000
Review 23000 of 100000
Review 24000 of 100000
Review 25000 of 100000
Review 26000 of 100000
Review 27000 of 100000
Review 28000 of 100000
Review 29000 of 100000
Review 30000 of 100000
Review 31000 of 100000
Review 32000 of 100000
Review 33000 of 100000
Review 34000 of 100000
Review 35000 of 100000
Review 36000 of 100000
Review 37000 of 100000
Review 38000 of 100000
Review 39000 of 100000
Review 40000 of 100000
Review 41000 of 100000
Review 42000 of 100000
Review 43000 of 100000
Review 44000 of 100000
Review 45000 of 100000
Review 46000 of 100000
Review 47000 of 100000
Review 48000 of 100000
Review 49000 of 100000
Review 50000 of 100000
Review 51000 of 100000
Review 52000

# Make feature matrix (word2vec, votes, stars)

In [17]:
reviewDataVecs.shape[1]

300

In [18]:
# Add non-text data back to feature matrix
review_features = ['cool', 'funny', 'useful', 'stars' , 'business_id']
all_features_labels = ['w2v{}'.format(idx) for idx in range(reviewDataVecs.shape[1])] + review_features
all_features = np.append(reviewDataVecs, dfreviews[review_features].to_numpy(), 1)


In [19]:
# Create df 
all_features_df = pd.DataFrame(data=all_features, columns=all_features_labels)

# Convert all but business_id to numerical
business_ids = all_features_df['business_id']
all_features_df = all_features_df.iloc[:,:-1].astype('float64')
all_features_df['business_id'] = business_ids
del business_ids

# Group by business_id
all_features_business = all_features_df.groupby(by='business_id').mean()

In [20]:
all_features_business.head()

Unnamed: 0_level_0,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars
business_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1,Unnamed: 49_level_1,Unnamed: 50_level_1,Unnamed: 51_level_1,Unnamed: 52_level_1,Unnamed: 53_level_1,Unnamed: 54_level_1,Unnamed: 55_level_1,Unnamed: 56_level_1,Unnamed: 57_level_1,Unnamed: 58_level_1,Unnamed: 59_level_1,Unnamed: 60_level_1,Unnamed: 61_level_1,Unnamed: 62_level_1,Unnamed: 63_level_1,Unnamed: 64_level_1,Unnamed: 65_level_1,Unnamed: 66_level_1,Unnamed: 67_level_1,Unnamed: 68_level_1,Unnamed: 69_level_1,Unnamed: 70_level_1,Unnamed: 71_level_1,Unnamed: 72_level_1,Unnamed: 73_level_1,Unnamed: 74_level_1,Unnamed: 75_level_1,Unnamed: 76_level_1,Unnamed: 77_level_1,Unnamed: 78_level_1,Unnamed: 79_level_1,Unnamed: 80_level_1,Unnamed: 81_level_1,Unnamed: 82_level_1,Unnamed: 83_level_1,Unnamed: 84_level_1,Unnamed: 85_level_1,Unnamed: 86_level_1,Unnamed: 87_level_1,Unnamed: 88_level_1,Unnamed: 89_level_1,Unnamed: 90_level_1,Unnamed: 91_level_1,Unnamed: 92_level_1,Unnamed: 93_level_1,Unnamed: 94_level_1,Unnamed: 95_level_1,Unnamed: 96_level_1,Unnamed: 97_level_1,Unnamed: 98_level_1,Unnamed: 99_level_1,Unnamed: 100_level_1,Unnamed: 101_level_1,Unnamed: 102_level_1,Unnamed: 103_level_1,Unnamed: 104_level_1,Unnamed: 105_level_1,Unnamed: 106_level_1,Unnamed: 107_level_1,Unnamed: 108_level_1,Unnamed: 109_level_1,Unnamed: 110_level_1,Unnamed: 111_level_1,Unnamed: 112_level_1,Unnamed: 113_level_1,Unnamed: 114_level_1,Unnamed: 115_level_1,Unnamed: 116_level_1,Unnamed: 117_level_1,Unnamed: 118_level_1,Unnamed: 119_level_1,Unnamed: 120_level_1,Unnamed: 121_level_1,Unnamed: 122_level_1,Unnamed: 123_level_1,Unnamed: 124_level_1,Unnamed: 125_level_1,Unnamed: 126_level_1,Unnamed: 127_level_1,Unnamed: 128_level_1,Unnamed: 129_level_1,Unnamed: 130_level_1,Unnamed: 131_level_1,Unnamed: 132_level_1,Unnamed: 133_level_1,Unnamed: 134_level_1,Unnamed: 135_level_1,Unnamed: 136_level_1,Unnamed: 137_level_1,Unnamed: 138_level_1,Unnamed: 139_level_1,Unnamed: 140_level_1,Unnamed: 141_level_1,Unnamed: 142_level_1,Unnamed: 143_level_1,Unnamed: 144_level_1,Unnamed: 145_level_1,Unnamed: 146_level_1,Unnamed: 147_level_1,Unnamed: 148_level_1,Unnamed: 149_level_1,Unnamed: 150_level_1,Unnamed: 151_level_1,Unnamed: 152_level_1,Unnamed: 153_level_1,Unnamed: 154_level_1,Unnamed: 155_level_1,Unnamed: 156_level_1,Unnamed: 157_level_1,Unnamed: 158_level_1,Unnamed: 159_level_1,Unnamed: 160_level_1,Unnamed: 161_level_1,Unnamed: 162_level_1,Unnamed: 163_level_1,Unnamed: 164_level_1,Unnamed: 165_level_1,Unnamed: 166_level_1,Unnamed: 167_level_1,Unnamed: 168_level_1,Unnamed: 169_level_1,Unnamed: 170_level_1,Unnamed: 171_level_1,Unnamed: 172_level_1,Unnamed: 173_level_1,Unnamed: 174_level_1,Unnamed: 175_level_1,Unnamed: 176_level_1,Unnamed: 177_level_1,Unnamed: 178_level_1,Unnamed: 179_level_1,Unnamed: 180_level_1,Unnamed: 181_level_1,Unnamed: 182_level_1,Unnamed: 183_level_1,Unnamed: 184_level_1,Unnamed: 185_level_1,Unnamed: 186_level_1,Unnamed: 187_level_1,Unnamed: 188_level_1,Unnamed: 189_level_1,Unnamed: 190_level_1,Unnamed: 191_level_1,Unnamed: 192_level_1,Unnamed: 193_level_1,Unnamed: 194_level_1,Unnamed: 195_level_1,Unnamed: 196_level_1,Unnamed: 197_level_1,Unnamed: 198_level_1,Unnamed: 199_level_1,Unnamed: 200_level_1,Unnamed: 201_level_1,Unnamed: 202_level_1,Unnamed: 203_level_1,Unnamed: 204_level_1,Unnamed: 205_level_1,Unnamed: 206_level_1,Unnamed: 207_level_1,Unnamed: 208_level_1,Unnamed: 209_level_1,Unnamed: 210_level_1,Unnamed: 211_level_1,Unnamed: 212_level_1,Unnamed: 213_level_1,Unnamed: 214_level_1,Unnamed: 215_level_1,Unnamed: 216_level_1,Unnamed: 217_level_1,Unnamed: 218_level_1,Unnamed: 219_level_1,Unnamed: 220_level_1,Unnamed: 221_level_1,Unnamed: 222_level_1,Unnamed: 223_level_1,Unnamed: 224_level_1,Unnamed: 225_level_1,Unnamed: 226_level_1,Unnamed: 227_level_1,Unnamed: 228_level_1,Unnamed: 229_level_1,Unnamed: 230_level_1,Unnamed: 231_level_1,Unnamed: 232_level_1,Unnamed: 233_level_1,Unnamed: 234_level_1,Unnamed: 235_level_1,Unnamed: 236_level_1,Unnamed: 237_level_1,Unnamed: 238_level_1,Unnamed: 239_level_1,Unnamed: 240_level_1,Unnamed: 241_level_1,Unnamed: 242_level_1,Unnamed: 243_level_1,Unnamed: 244_level_1,Unnamed: 245_level_1,Unnamed: 246_level_1,Unnamed: 247_level_1,Unnamed: 248_level_1,Unnamed: 249_level_1,Unnamed: 250_level_1,Unnamed: 251_level_1,Unnamed: 252_level_1,Unnamed: 253_level_1,Unnamed: 254_level_1,Unnamed: 255_level_1,Unnamed: 256_level_1,Unnamed: 257_level_1,Unnamed: 258_level_1,Unnamed: 259_level_1,Unnamed: 260_level_1,Unnamed: 261_level_1,Unnamed: 262_level_1,Unnamed: 263_level_1,Unnamed: 264_level_1,Unnamed: 265_level_1,Unnamed: 266_level_1,Unnamed: 267_level_1,Unnamed: 268_level_1,Unnamed: 269_level_1,Unnamed: 270_level_1,Unnamed: 271_level_1,Unnamed: 272_level_1,Unnamed: 273_level_1,Unnamed: 274_level_1,Unnamed: 275_level_1,Unnamed: 276_level_1,Unnamed: 277_level_1,Unnamed: 278_level_1,Unnamed: 279_level_1,Unnamed: 280_level_1,Unnamed: 281_level_1,Unnamed: 282_level_1,Unnamed: 283_level_1,Unnamed: 284_level_1,Unnamed: 285_level_1,Unnamed: 286_level_1,Unnamed: 287_level_1,Unnamed: 288_level_1,Unnamed: 289_level_1,Unnamed: 290_level_1,Unnamed: 291_level_1,Unnamed: 292_level_1,Unnamed: 293_level_1,Unnamed: 294_level_1,Unnamed: 295_level_1,Unnamed: 296_level_1,Unnamed: 297_level_1,Unnamed: 298_level_1,Unnamed: 299_level_1,Unnamed: 300_level_1,Unnamed: 301_level_1,Unnamed: 302_level_1,Unnamed: 303_level_1,Unnamed: 304_level_1
--I7YYLada0tSLkORTHb5Q,-0.007499,0.01587,-0.009091,-0.006852,-0.000165,0.024793,0.008644,-0.014792,0.009979,0.003942,0.0041,-0.031356,-0.010247,0.005387,0.011627,0.020621,0.015115,-0.008609,0.002743,-0.010083,-0.006879,-0.005206,0.009593,-0.012116,-0.00858,0.004146,0.017497,-0.010027,-0.002657,-0.003539,-0.003672,-0.003878,-0.019413,-0.004522,0.010478,-0.011138,-0.00152,0.001624,0.002736,-0.001654,0.014157,0.012092,-0.011507,6.5e-05,0.02149,0.012381,-0.010089,-5.6e-05,0.006849,0.012484,0.022499,0.007505,0.001161,0.006571,-0.011028,-0.008634,-0.003242,-0.005032,0.016984,-0.006729,-0.007712,0.016816,-0.009824,-0.01381,-0.018497,0.00596,-0.00389,0.005296,0.001982,0.005957,0.010111,0.011505,-0.009504,0.006016,0.000145,-0.001613,0.003529,-0.009751,-0.001159,0.00135,0.004055,-0.012402,-0.011015,0.001537,0.009277,-0.006952,0.018196,0.011549,-0.003539,0.003471,-0.001982,-0.006949,-0.004149,-0.001219,-0.018856,0.018914,-0.002559,0.020924,-0.02201,-0.016058,-0.019431,0.024031,0.010372,-0.024269,0.001566,0.003882,-0.016965,0.014903,-0.000333,0.005388,-0.004297,-0.022004,0.007121,-0.015847,0.010955,0.008343,0.002458,-0.004447,-0.022745,-0.016351,-0.015083,0.016055,-0.010969,0.004302,-0.002894,-0.003404,0.013917,0.003703,-0.026776,-0.001713,0.000421,-0.011638,0.00956,0.00662,0.002412,0.005148,-0.014819,-0.002384,0.013313,-0.011222,-0.011357,-0.015396,-0.004863,-0.00418,-0.007797,-0.008371,0.005455,0.006528,0.016183,0.014788,0.010562,0.015067,0.008007,0.003813,-0.009594,-0.019872,0.006658,-0.004194,-0.000726,-0.019882,0.003207,0.008335,-0.00353,-0.00852,0.003569,0.014994,0.012201,0.009086,0.00645,-0.01886,-0.002093,0.004624,0.013989,-0.011511,-0.000781,0.017287,0.005218,0.033981,-0.004402,-0.001438,0.023224,-0.008902,0.007502,-0.032931,0.001991,0.006755,-0.008278,-0.004967,-0.016611,0.00286,-0.027153,-0.00717,0.009309,-0.020619,0.000266,-0.001966,0.02771,-0.024209,-0.00121,-0.015006,-0.012201,0.003239,0.010807,-0.020057,-0.004619,-0.006609,0.018128,0.006096,0.006753,0.008909,0.002935,-0.01132,-0.015718,0.009725,0.021607,0.006392,0.020936,0.024687,0.011394,0.020137,-0.002377,0.007439,0.014349,-0.008634,-0.021233,0.001051,0.013405,-0.011792,0.01148,-0.024658,0.012449,0.006896,-0.006742,0.01825,0.01273,0.002604,0.005603,0.013587,-0.015682,-0.001813,0.000986,-0.001446,0.001805,0.000281,-0.006727,0.002159,-0.000811,0.015007,0.011092,0.00152,-0.012781,-0.009524,0.001715,0.013574,0.006682,0.004416,-0.016188,-0.00107,0.02416,-0.007177,0.003758,-0.002125,0.01329,0.009893,0.010383,0.014488,-0.018683,0.005121,0.012145,0.02989,-0.003379,0.013333,-0.012451,0.005107,0.010078,-0.003755,-0.002551,0.024659,-0.006757,-0.001592,0.012685,-0.008899,-0.01607,0.000193,-0.01233,-0.013128,0.018632,-0.013797,-0.017402,0.009316,8.5e-05,-0.000346,0.00496,-0.004972,0.007819,-0.001563,-0.007926,-0.00514,-0.002637,0.004487,0.352941,0.352941,0.823529,3.647059
--U98MNlDym2cLn36BBPgQ,0.001321,0.020397,-0.005849,0.001654,-0.011622,0.018849,0.005494,-0.015423,0.008618,0.007287,0.007399,-0.023791,0.000994,0.002044,-0.006207,0.025938,0.002294,-0.010482,0.00371,-0.009301,0.002224,-0.012484,0.016786,-0.018724,-0.007389,-0.005424,0.009281,0.005434,0.004109,0.009846,-0.006699,-0.003709,-0.005755,0.005977,0.010481,0.001521,0.011386,-0.001846,0.000209,0.016299,0.013629,-0.016856,-0.009517,-0.002898,0.006809,0.00016,-0.011914,-0.001501,0.013424,0.005944,0.02339,0.00945,-0.001096,0.009402,-0.011847,0.001044,-0.00358,0.022533,0.02438,-0.000276,-0.007263,0.015476,-0.003131,-0.009706,-0.016027,-0.002222,-0.00582,0.006852,0.002422,0.001015,0.00543,0.010693,0.007849,0.026505,-3.7e-05,-0.01016,-0.02102,-0.002758,-0.007959,0.003969,-0.006524,-0.010391,-0.012534,0.002711,0.004077,-0.000967,0.00462,-0.011257,-0.009275,-0.001961,0.004368,-0.002741,0.002213,-0.001484,-0.013602,0.005373,-0.001872,0.006827,-0.008297,-0.008082,-0.005082,-0.002831,0.010845,-0.016601,0.001404,-0.009223,-0.009052,0.00487,0.004361,0.006548,0.001582,-0.003647,0.010116,-0.011908,-0.005118,-0.001061,0.017997,0.003667,-0.007199,-0.001373,-0.005872,0.006095,-0.000244,0.002786,-9e-05,0.002776,0.011398,0.012544,-0.00881,-0.003625,0.008335,-0.018189,0.021909,-0.006605,0.001991,0.010614,-0.008953,0.002743,-0.01012,-0.006879,-0.013393,-0.01036,-0.007192,0.006551,-0.003993,-0.012971,0.018007,-0.009865,0.018597,0.006843,-0.002549,0.011944,0.018651,0.008417,-0.007486,-0.004775,0.012657,0.003012,-0.000857,-0.006485,0.002627,0.008823,-0.004679,0.007138,-0.002575,0.009535,0.006989,0.009448,-0.002979,-0.016121,-0.009967,0.001045,0.007279,-0.008831,0.000771,0.034161,-0.006813,0.023669,0.013984,-0.013003,0.009409,0.002028,0.012934,-0.025267,-0.002378,-0.005684,0.007701,-0.00501,-0.002147,0.003824,-0.007827,0.008128,-0.00025,-0.017537,0.004941,0.001875,0.020632,-0.011595,-0.019383,-0.008837,-0.003078,0.006548,0.009257,-0.018428,-0.003066,-0.012686,0.017781,-0.006594,0.002138,0.004893,-0.004178,-0.018483,-0.021718,0.005285,0.002581,0.011907,0.019082,0.019257,0.010664,0.007317,-0.002005,-0.002601,0.004837,-0.006308,-0.01078,0.00151,0.01077,-0.018656,0.010772,-0.013842,0.006009,-0.010805,0.005438,0.015986,0.018399,0.002517,-0.007091,0.006768,-0.002893,-0.001234,0.005622,-0.006657,0.011644,-0.011799,0.006536,0.017361,-0.000455,0.008157,0.011874,-0.019345,-0.016238,-0.006056,-0.01017,0.007585,0.012319,-0.000601,-0.018317,0.007401,0.022479,-0.004789,-0.00281,-0.006522,0.011438,-0.008014,0.030327,0.009807,-0.019028,0.00948,-0.000841,0.019035,0.017731,0.025443,-0.008603,0.009164,0.003628,-0.001672,-0.011721,0.025006,0.016348,0.006681,0.005001,-0.010569,-0.006204,0.000571,-0.00706,0.000485,0.016113,-0.026779,-0.006788,0.016352,0.006836,-0.009109,-0.000684,-0.00864,-0.007533,0.009403,0.006655,0.006638,-8.9e-05,0.023506,0.0,0.0,2.0,3.0
--j-kaNMCo1-DYzddCsA5Q,-0.031706,0.048197,0.002592,-0.014077,0.012455,0.021341,0.006416,0.023812,0.008884,-0.010029,0.026321,-0.035189,-0.029355,-0.003366,0.010322,0.010225,-0.013881,-0.014311,0.009284,-0.004531,0.015355,-0.012754,0.040054,-0.005349,-0.01496,-0.017134,0.015219,-0.005011,-0.005252,-0.018325,0.006987,-0.000175,-0.002259,-0.0018,0.046835,0.004774,-0.005338,0.002812,0.034205,-0.000922,0.041472,0.008703,-0.027083,0.031797,0.002945,-0.00315,-0.037086,0.023052,0.014693,0.006894,0.033686,-0.005447,-0.044836,-0.041553,0.007038,0.033026,0.018956,0.019412,0.040987,-0.002007,-0.044471,-0.004943,0.006001,-0.012231,-0.004839,-0.002395,-0.014001,0.043168,0.011314,-0.010929,0.003008,0.023616,0.035201,0.020261,0.005463,-0.003348,0.03982,0.01225,0.013604,0.017491,-0.029549,0.003226,0.009524,0.011124,0.003292,-0.032335,-0.002262,0.001353,0.020998,0.007193,0.003009,0.000301,0.002265,-0.005233,-0.002113,-0.00373,-0.009042,0.008372,0.012576,-0.011596,-0.024452,0.03889,-0.01924,-0.011677,-0.020417,0.014937,-0.026773,0.022879,-0.02211,-0.002352,-0.007221,-0.008771,-0.005963,0.005788,-0.005936,-0.001119,0.009281,0.00632,-0.032762,-0.018522,0.014365,0.006239,0.007098,0.031014,-0.002293,0.005433,-0.001168,-0.01514,-0.013594,0.027634,0.018961,-0.01302,0.02217,0.035544,0.008864,0.020313,0.00212,-0.014756,0.030835,-0.004879,-0.03069,0.019472,-0.013574,-0.004114,0.01387,0.039049,0.03777,0.00324,0.01691,0.010527,0.042552,0.02334,-0.038049,-0.025515,-0.018268,-0.011539,0.014282,-0.006163,-0.015156,0.004478,0.018854,0.01609,0.005756,-0.023261,-0.001588,0.00746,-0.020568,-0.004637,-0.007909,-0.027837,0.002383,0.033475,-0.042774,-0.02559,0.020001,0.006792,-0.010391,0.034303,0.016683,0.000285,-0.01123,-0.022597,0.027094,-0.030294,-0.020465,0.016756,0.017633,0.013609,0.017103,0.017212,-0.012723,-0.006932,0.023111,0.002712,0.011501,-0.006055,0.020469,-0.024268,-0.002767,-0.036422,0.002019,0.002155,0.013061,-0.008971,-0.015068,0.000662,0.000248,0.014378,-0.040984,0.019256,-0.016069,0.004206,-0.014408,0.005932,0.034386,-0.0332,-0.018477,0.020917,0.039714,-0.015137,-0.005553,0.023126,0.001117,0.007508,0.003806,-0.029266,0.006011,0.005167,0.01324,0.013694,0.020257,0.004307,-0.018121,0.025035,0.001026,-0.013214,-0.016001,-0.009035,0.002485,-0.012465,-0.00023,-0.003857,-0.003385,0.008617,0.004842,0.002352,-0.024324,-0.008021,0.023252,0.051692,0.013226,-0.014763,-0.010242,0.016647,0.015237,0.006865,-0.009102,0.007311,0.029586,0.012973,0.001781,0.000875,0.009083,-0.000437,-0.006084,-0.010229,-0.011938,0.009713,0.017088,0.011857,-0.004196,0.000431,-0.022829,-0.012663,0.020776,-0.042662,0.012982,0.007005,0.006552,0.015705,-0.004542,-0.006312,-0.014624,0.000435,-0.019186,-0.019089,0.004377,-0.015651,-0.032356,-0.02179,-0.019216,0.023988,0.036593,0.010598,0.007846,-0.006808,0.006012,-0.00479,-0.009403,-0.023446,0.0,0.0,0.0,5.0
--wIGbLEhlpl_UeAIyDmZQ,-0.025492,0.019936,-0.00978,0.008989,0.006961,-0.007019,-0.010135,-0.005605,0.006441,0.014264,-0.004153,-0.013604,0.029235,-0.009206,-0.012113,0.015941,-0.023862,-0.004637,0.023066,0.00594,0.001028,-0.011015,0.021822,-0.018162,0.019159,-0.017638,-0.007374,0.008013,0.013594,0.004332,-0.004372,-0.000418,0.007773,0.01345,0.021573,0.014685,0.00535,-0.003031,0.00416,0.030382,0.02766,-0.012948,-0.012107,0.018031,0.009482,0.003252,-0.011,-0.009094,0.024033,0.003302,0.016488,0.008022,-0.008947,-0.017594,-0.012784,0.010665,0.006593,0.008323,0.029553,0.009798,0.005283,0.005461,-0.004438,-0.015892,-0.007072,-0.002535,-0.006522,-0.010301,-0.014766,0.008775,0.000726,-0.025882,0.009679,0.000513,-0.010131,-0.011865,-0.030439,-0.002678,-0.000552,-0.011225,-0.020444,-0.007347,-0.014832,-0.016297,-0.005811,-0.000288,-0.005281,0.003615,-0.00144,0.001941,-0.00617,0.017623,0.013533,0.007779,0.004645,-0.009148,-0.006272,-0.000626,0.013421,-0.003977,0.013309,-0.006342,-0.013755,-0.008102,0.010545,-0.009056,-0.015485,0.00261,-0.005081,-0.00295,0.005824,0.012043,0.008401,0.000881,-0.007029,0.00192,0.009014,-0.003405,-0.014651,-0.002772,0.002065,0.003278,0.021487,0.016243,0.000225,-0.010734,0.006012,9e-05,0.010165,0.011665,0.01584,-0.015023,-0.002547,0.003871,0.022617,0.00047,0.021889,0.000163,-0.016224,0.002425,0.004522,0.021262,0.006554,0.007554,-0.0082,-0.003171,0.024925,-0.014101,-0.01307,0.00566,0.0008,0.005688,0.000536,-0.012746,-0.015648,0.019713,0.007474,-0.014455,-0.004201,0.004916,-0.010583,0.002872,0.008594,0.008084,-0.009078,-0.014606,0.000752,0.003925,-0.004628,0.003482,8.8e-05,0.022214,-0.002939,0.007237,-0.012251,0.005304,-0.023208,-0.00915,0.03032,-0.017759,-0.02806,-0.012879,0.031513,-0.010033,0.002677,0.018922,-0.015384,0.001332,0.022275,0.018732,0.002626,-0.007305,-0.022455,-0.022342,0.002293,-0.00297,0.012568,0.004259,0.002387,-0.023521,-0.014398,0.008061,-0.00713,-0.004541,-0.006768,0.008254,0.002175,0.013936,-0.011764,-0.005329,-0.003379,-0.024168,-0.009096,0.001449,-0.007915,-0.023049,-0.007399,-0.006107,0.029451,-0.035164,-0.013084,-0.0046,0.002214,-0.004734,0.000585,-0.010677,0.02521,-0.008014,-0.00974,-0.003139,4.9e-05,0.001037,0.016224,0.002176,-0.018329,-0.010924,0.006367,-0.020801,0.01834,-0.006561,-0.018475,0.001385,0.001504,-0.005265,0.021937,0.026431,-0.010178,-0.004649,0.014743,-0.006455,-0.000592,0.010101,0.005465,0.016865,0.023015,-0.018072,-0.002269,-0.00815,0.023424,0.008583,-0.008682,0.007142,0.000363,-0.00172,0.022179,-0.006348,-0.001533,0.013606,-0.020222,0.007255,0.002943,0.012594,0.005401,0.004614,0.00883,-0.00246,-0.001011,0.002633,0.009794,0.003525,0.006961,-0.010522,-0.005311,-0.001347,-0.019297,0.006033,-0.013824,-0.000982,0.003249,0.007947,0.003068,0.000488,0.005373,0.015897,0.00668,0.002806,0.013151,0.016076,0.000884,0.006555,0.666667,0.166667,3.0,3.833333
-000aQFeK6tqVLndf7xORg,-0.027738,0.037412,0.007019,0.006407,0.004354,-0.005285,-0.010273,0.004754,0.006071,0.018834,0.012821,-0.021983,-0.002804,0.002488,-0.010496,0.004216,-0.032971,-0.017167,0.01614,0.002577,0.005492,-0.018144,0.031077,-0.024672,0.017904,-0.018859,0.003628,0.025943,0.012707,-0.002589,-0.002832,-0.007103,-0.005866,0.02241,0.013158,0.013854,0.003774,-1e-06,0.015016,0.041297,0.032408,0.003349,-0.026224,0.037172,0.005429,0.007529,-0.021226,-0.007134,0.033146,-0.00593,0.026232,0.000369,-0.011271,-0.015598,-0.010524,0.011385,0.013815,0.011965,0.040931,0.008883,0.000582,-0.003109,-0.004049,-0.022657,0.010396,-0.012263,-0.014782,-0.005251,-0.016635,-0.00386,0.004131,-0.024414,0.003984,-0.003817,-0.023049,-0.012739,-0.019521,0.009405,0.01038,-0.004879,-0.03375,-0.000219,0.00301,-0.017039,0.005289,-0.01204,0.001985,0.003886,0.009272,0.005672,-0.011414,0.020128,-0.004136,0.003024,0.035359,-0.025288,-0.012003,-0.004979,0.01354,-0.00376,0.016228,-0.002336,-0.008641,-0.008916,0.023729,-0.013717,-0.010763,-0.001542,-0.004317,-0.004265,-0.006154,-0.000357,0.007941,0.004022,-0.01004,-0.002143,0.004216,-0.008887,-0.003396,-0.015967,0.004146,-0.005276,0.022121,0.009692,-0.006947,-0.015748,0.003391,-0.006775,0.00755,0.009769,0.028079,-0.035267,-0.004585,0.008808,0.028737,0.018575,0.028816,-0.008854,-0.008475,0.000487,0.004927,0.02862,0.003009,0.00621,-0.004447,0.001013,0.030245,-0.015197,-0.010834,-0.002448,0.013291,0.012359,-0.00825,-0.021378,-0.013719,0.020623,0.009189,-0.017598,-0.009157,-0.004822,0.005702,-0.001931,0.000602,-0.005581,-0.008448,-0.026672,0.001478,-0.002303,0.000887,-0.008978,-0.007218,0.038184,-0.01691,0.017915,-0.011773,0.014865,-0.031074,-0.00053,0.042592,-0.015951,-0.023318,-0.027595,0.034925,-0.015474,0.011361,0.018718,-0.018819,-0.002008,0.041432,0.024755,0.001769,-0.020968,-0.012661,-0.023198,0.010154,-0.02107,0.012445,-0.012889,-0.000791,-0.023533,-0.017947,-0.001886,0.003658,0.000924,-0.017163,0.002154,-0.00072,0.018059,-0.012959,0.011919,-0.00214,-0.035065,-0.016817,0.018926,-0.018268,-0.023407,-0.018184,0.002846,0.036747,-0.052445,-0.015186,0.001086,-0.003678,0.006969,-0.004229,-0.020942,0.022191,-0.01646,-0.013144,-0.000691,0.007086,-0.001749,0.014215,0.005489,-0.016498,-0.015637,0.00889,-0.014859,0.019654,-0.021839,-0.021766,0.000612,-0.000924,-0.004216,0.023697,0.025991,-0.002733,-0.008449,0.012186,-0.004689,0.019142,0.005275,-0.004139,0.00995,0.020918,-0.021781,-0.012863,3.1e-05,0.028667,0.010145,-0.006074,0.007731,-9.3e-05,-0.006233,0.024673,-0.009111,-0.008371,0.019192,-0.004005,0.011155,0.005353,-0.002033,-0.011192,0.008456,0.01359,0.005049,-0.000745,0.009627,0.013333,0.009758,-0.006261,-0.018125,0.000727,-1e-05,-0.029143,0.003974,-0.018425,0.00472,-0.005071,0.005684,0.002536,0.010516,0.005451,0.036543,0.008084,-0.00268,0.008032,0.02726,-0.014735,-0.000253,0.666667,0.0,0.0,5.0


In [21]:
all_features_business.describe()

Unnamed: 0,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars
count,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13942.0,13943.0,13943.0,13943.0,13943.0
mean,-0.00839,0.021778,-0.002635,-0.002136,0.002495,0.014598,0.000629,-0.009871,0.008903,0.005276,0.011162,-0.023475,0.001838,-0.002154,0.00191,0.015653,-0.007991,-0.01254,0.014445,0.002841,0.003373,-0.008364,0.014755,-0.010015,0.004644,-0.010291,0.005193,-0.00089,0.002603,0.000717,-0.005025,-0.002875,0.003705,-0.00022,0.019935,0.009875,0.000631,0.004124,0.010766,0.009432,0.012857,0.006511,-0.017276,0.011432,0.007551,0.00257,-0.011296,0.001541,0.016013,0.000627,0.027226,0.002533,-0.005695,-0.007806,-0.010128,0.006696,0.003569,0.010557,0.02568,0.000845,-0.007403,0.008107,7.9e-05,-0.012628,-0.008367,0.000555,-0.004862,0.006496,-0.003914,0.002045,0.007805,-0.007263,0.00489,0.006333,-0.005577,-0.011072,-0.005269,-0.00342,0.000752,-0.002017,-0.011635,-0.006746,-0.007637,-0.007982,0.003039,-0.002989,0.002398,0.009304,-0.001539,0.001035,-0.001691,0.008997,0.000947,0.00786,-0.000968,-0.003037,-0.008369,0.005652,0.001329,-0.000563,-0.004947,0.012037,-0.002789,-0.00728,-0.00347,-0.001712,-0.015278,0.007212,-0.007401,0.002472,-0.004462,-0.001991,0.005059,-0.00469,0.008913,0.001702,0.011435,-0.000139,-0.014046,-0.008595,-0.002913,0.00469,0.00786,0.008435,-0.005047,-0.006265,0.0062,0.001101,-0.009543,0.005776,0.011811,-0.019563,0.006865,0.006045,0.012196,0.008626,0.000358,-0.000461,0.000691,-0.0033,-0.00746,0.002882,-0.002319,0.002144,-0.005844,0.001782,0.01684,-1.1e-05,0.004583,0.005685,0.00622,0.01359,-0.001143,-0.006433,-0.010743,0.001041,0.00979,-0.007379,-0.005617,-0.002998,-0.000239,0.007415,-0.000915,-0.000115,-0.000593,-0.000379,0.002705,0.00447,0.003321,-0.012935,-0.001253,0.014016,-0.00586,0.000408,-0.000336,0.010505,-0.013399,0.016468,0.009182,-0.00565,-0.006279,-0.012772,0.019707,-0.016116,0.005752,0.014795,-0.005801,-0.00127,0.010733,0.010856,-0.00869,-0.009675,-0.003624,-0.013903,0.004643,-0.004139,0.014025,-0.012617,-0.000399,-0.024004,-0.009386,-0.000926,0.002988,-0.006112,-0.007546,0.003935,0.008558,0.007888,-0.008976,0.006806,-0.004353,-0.016235,-0.011257,0.007753,0.004798,-0.011308,0.000481,0.008569,0.023149,-0.012039,0.000245,0.00362,0.006373,-0.003023,-0.005628,-0.010273,0.016256,-0.008155,0.000721,-0.005937,0.008646,-0.002807,0.004741,0.014019,-0.005149,-0.000434,0.000501,-0.00164,0.003784,-0.000125,-0.007376,0.000858,6.3e-05,0.000686,0.008639,0.013073,-0.005655,0.007299,0.010384,-0.000618,-0.003807,-0.002876,-0.00423,0.011351,0.015766,-0.009672,-0.005806,0.004114,0.021437,0.000971,-0.000953,0.002178,0.00529,-0.00359,0.009683,0.004413,-0.006772,0.010949,0.001884,0.011902,0.00349,0.006632,-0.003758,0.004021,0.008578,-0.00722,0.000116,0.011187,0.011708,0.00415,0.001405,-0.006943,-0.004926,-0.003104,-0.012505,-0.003149,-0.00021,-0.010128,-0.011202,0.004952,0.000954,0.003716,0.005721,0.008624,0.007111,0.000435,0.001807,0.001807,-0.002421,-0.001926,0.486991,0.423987,1.434996,3.615964
std,0.018105,0.014443,0.010845,0.010024,0.016199,0.02045,0.012249,0.018266,0.011159,0.013281,0.010771,0.011207,0.01936,0.011561,0.018003,0.013262,0.016887,0.012253,0.015834,0.009699,0.010173,0.01115,0.013711,0.013004,0.015662,0.014088,0.011984,0.013138,0.010102,0.010146,0.013536,0.010725,0.012797,0.014893,0.012511,0.015251,0.009692,0.012183,0.012224,0.024109,0.013015,0.016098,0.010771,0.020334,0.013161,0.012582,0.010032,0.012983,0.010891,0.01321,0.01107,0.011109,0.013872,0.015102,0.008862,0.017342,0.014598,0.010526,0.014358,0.011059,0.0113,0.013836,0.010519,0.01001,0.0182,0.011605,0.013944,0.018828,0.012088,0.011231,0.011937,0.01848,0.01912,0.014467,0.014168,0.013169,0.018633,0.011258,0.01109,0.010061,0.01783,0.010448,0.014698,0.013898,0.011787,0.012294,0.013023,0.012808,0.011718,0.013227,0.009032,0.011592,0.011428,0.014723,0.01578,0.016802,0.010877,0.018238,0.015955,0.014161,0.014006,0.017083,0.011835,0.011616,0.014981,0.011817,0.011203,0.012411,0.011445,0.010411,0.012365,0.014968,0.013048,0.014611,0.017244,0.01228,0.009884,0.011905,0.012444,0.012995,0.012591,0.011319,0.015136,0.012386,0.012329,0.010905,0.01052,0.013366,0.016101,0.010014,0.012379,0.012294,0.012049,0.017195,0.014218,0.012414,0.017739,0.010728,0.017261,0.012073,0.012221,0.014329,0.01118,0.017375,0.010613,0.013068,0.020524,0.013407,0.01416,0.01112,0.015174,0.010885,0.015431,0.018992,0.013828,0.021067,0.012028,0.010358,0.009523,0.012568,0.009963,0.011182,0.01157,0.010995,0.008643,0.017487,0.011463,0.010699,0.010914,0.011707,0.012031,0.01575,0.016848,0.012617,0.014061,0.018966,0.015232,0.017639,0.014961,0.014424,0.018629,0.015674,0.012718,0.013078,0.01242,0.010133,0.010811,0.011138,0.022075,0.009595,0.016635,0.015016,0.015516,0.009882,0.010354,0.009974,0.012522,0.012988,0.013622,0.011265,0.010175,0.012264,0.010654,0.01187,0.010846,0.012396,0.011592,0.01132,0.014325,0.010456,0.010472,0.012222,0.009263,0.012541,0.020797,0.019828,0.017303,0.016228,0.012885,0.025221,0.013942,0.010696,0.013824,0.011243,0.015756,0.013401,0.009488,0.01084,0.013376,0.014697,0.010068,0.009496,0.015213,0.013978,0.018659,0.015823,0.010723,0.017604,0.015001,0.009948,0.017428,0.010915,0.009706,0.008856,0.013572,0.019784,0.010932,0.014061,0.010266,0.015366,0.016899,0.011935,0.010247,0.009902,0.010873,0.011347,0.011336,0.013197,0.010114,0.01313,0.016211,0.011147,0.011743,0.0141,0.016525,0.011412,0.010838,0.012994,0.015987,0.016147,0.011875,0.013243,0.011206,0.011381,0.010274,0.01331,0.011115,0.013818,0.015339,0.010605,0.014072,0.011888,0.012077,0.01564,0.009653,0.01041,0.01548,0.013077,0.014286,0.01382,0.011769,0.010986,0.01041,0.019643,0.012373,0.011839,0.012024,0.014362,0.012125,0.020088,1.299472,1.070148,2.371442,1.277067
min,-0.099496,-0.070957,-0.074949,-0.060086,-0.106072,-0.083072,-0.060613,-0.146295,-0.060355,-0.059504,-0.03828,-0.083194,-0.108914,-0.066528,-0.118215,-0.049281,-0.11199,-0.073229,-0.058696,-0.049754,-0.04981,-0.070426,-0.054982,-0.080184,-0.057668,-0.123606,-0.064646,-0.091485,-0.05162,-0.060041,-0.096261,-0.065164,-0.072247,-0.0794,-0.039884,-0.054039,-0.076281,-0.068223,-0.073694,-0.076489,-0.065307,-0.074905,-0.07513,-0.091338,-0.068153,-0.061468,-0.067784,-0.093408,-0.050904,-0.112867,-0.045612,-0.055154,-0.087486,-0.072782,-0.060001,-0.104506,-0.065606,-0.04963,-0.054222,-0.061626,-0.08533,-0.083104,-0.05243,-0.080989,-0.081713,-0.079371,-0.075703,-0.09665,-0.084475,-0.055175,-0.058504,-0.116234,-0.100547,-0.07539,-0.071922,-0.071594,-0.134977,-0.07152,-0.055523,-0.059862,-0.075887,-0.06932,-0.084367,-0.070856,-0.055506,-0.071886,-0.071754,-0.045597,-0.062417,-0.096653,-0.051961,-0.051005,-0.075467,-0.069816,-0.060058,-0.071287,-0.096743,-0.078813,-0.076858,-0.07618,-0.103056,-0.108024,-0.05774,-0.094108,-0.094991,-0.052856,-0.081964,-0.077098,-0.06198,-0.060444,-0.084258,-0.059973,-0.093668,-0.07043,-0.060507,-0.083169,-0.051326,-0.097007,-0.089792,-0.067109,-0.074111,-0.057132,-0.064086,-0.053184,-0.097883,-0.065118,-0.041993,-0.064055,-0.08179,-0.04879,-0.050514,-0.106375,-0.106661,-0.084744,-0.071626,-0.0407,-0.075297,-0.078319,-0.063153,-0.074508,-0.09275,-0.083915,-0.065656,-0.096162,-0.069446,-0.07464,-0.123809,-0.074049,-0.06306,-0.069886,-0.072093,-0.052718,-0.080356,-0.076783,-0.098319,-0.100728,-0.082691,-0.076803,-0.063588,-0.075896,-0.057317,-0.075826,-0.056552,-0.061783,-0.052334,-0.077964,-0.057569,-0.053701,-0.063316,-0.066655,-0.09511,-0.071212,-0.099311,-0.055129,-0.068584,-0.148013,-0.097885,-0.069585,-0.067115,-0.078005,-0.084047,-0.088042,-0.040096,-0.071846,-0.047757,-0.055198,-0.079552,-0.071039,-0.080095,-0.043211,-0.079653,-0.083135,-0.075244,-0.077691,-0.059613,-0.061214,-0.045385,-0.126879,-0.111082,-0.118694,-0.07895,-0.069775,-0.065348,-0.057675,-0.13177,-0.055263,-0.0569,-0.065957,-0.085914,-0.044234,-0.084049,-0.096226,-0.055368,-0.107035,-0.07615,-0.11247,-0.074204,-0.060802,-0.039955,-0.112514,-0.059537,-0.058118,-0.063202,-0.055528,-0.124288,-0.078137,-0.04873,-0.056874,-0.052042,-0.098916,-0.034507,-0.050694,-0.079325,-0.05757,-0.081538,-0.064183,-0.066153,-0.07191,-0.065178,-0.059491,-0.099799,-0.079781,-0.054899,-0.06505,-0.059847,-0.083971,-0.091342,-0.078992,-0.037935,-0.089374,-0.098873,-0.076896,-0.058687,-0.068863,-0.053758,-0.062383,-0.064831,-0.071892,-0.029551,-0.080706,-0.069507,-0.069125,-0.077786,-0.090106,-0.100049,-0.061444,-0.08498,-0.06062,-0.09934,-0.064442,-0.075152,-0.06053,-0.07728,-0.052831,-0.043839,-0.094356,-0.073408,-0.050438,-0.093171,-0.064979,-0.068436,-0.096611,-0.071472,-0.139963,-0.075716,-0.072675,-0.078061,-0.097939,-0.073854,-0.071395,-0.065674,-0.047206,-0.055407,-0.063286,-0.055552,-0.076293,-0.064589,-0.073835,-0.066846,-0.10476,0.0,0.0,0.0,1.0
25%,-0.019841,0.01219,-0.009715,-0.008588,-0.006282,0.001508,-0.007413,-0.019872,0.001859,-0.003679,0.004317,-0.030204,-0.010166,-0.009219,-0.010632,0.007949,-0.01985,-0.018969,0.003751,-0.003308,-0.002707,-0.014913,0.005341,-0.018498,-0.006343,-0.017911,-0.002224,-0.009763,-0.003711,-0.004921,-0.012106,-0.009309,-0.004625,-0.009704,0.012195,-0.00188,-0.005126,-0.003445,0.003897,-0.009629,0.004045,-0.003507,-0.023415,-0.002887,-0.000933,-0.005169,-0.016988,-0.006359,0.009402,-0.006116,0.020923,-0.004617,-0.014726,-0.019278,-0.015378,-0.005298,-0.005846,0.004002,0.015551,-0.005883,-0.013416,-7.1e-05,-0.006521,-0.018431,-0.021522,-0.006593,-0.013965,-0.004839,-0.01101,-0.004146,0.001101,-0.019274,-0.007578,-0.002222,-0.013278,-0.019542,-0.016228,-0.010293,-0.005576,-0.008015,-0.025077,-0.013654,-0.016928,-0.017057,-0.004199,-0.010086,-0.006251,0.00115,-0.00918,-0.005589,-0.00703,0.000533,-0.006,0.000381,-0.012711,-0.015851,-0.013682,-0.007716,-0.010759,-0.009386,-0.014265,0.001074,-0.010628,-0.014451,-0.012467,-0.009798,-0.02191,-0.000745,-0.014069,-0.003885,-0.011588,-0.012911,-0.001281,-0.01567,-0.003274,-0.006265,0.006046,-0.005609,-0.021688,-0.01735,-0.010371,-0.003145,-0.003986,0.000216,-0.012036,-0.012705,-0.000406,-0.00689,-0.022027,-0.000293,0.003016,-0.027201,-0.000192,-0.00434,0.003619,0.000892,-0.013813,-0.00651,-0.011069,-0.011329,-0.015751,-0.006887,-0.009408,-0.00876,-0.012145,-0.006364,0.00466,-0.008936,-0.005277,-0.00114,-0.003242,0.007247,-0.01101,-0.019959,-0.019054,-0.014908,0.003488,-0.013281,-0.010708,-0.011486,-0.006148,0.000926,-0.008669,-0.006056,-0.005614,-0.013198,-0.004716,-0.001846,-0.002802,-0.021295,-0.008452,0.003358,-0.017232,-0.008838,-0.009174,0.000684,-0.023433,0.003862,-0.001075,-0.015725,-0.020585,-0.022665,0.010574,-0.024902,-0.001989,0.008251,-0.012019,-0.007979,-0.007642,0.004675,-0.021355,-0.019342,-0.014574,-0.019812,-0.001317,-0.009763,0.005847,-0.019156,-0.007971,-0.029984,-0.015062,-0.007559,-0.003439,-0.013691,-0.013821,-0.004629,0.001382,0.001202,-0.018084,0.000708,-0.010254,-0.023678,-0.016928,0.000646,-0.011025,-0.024718,-0.011994,-0.003217,0.01465,-0.031565,-0.009885,-0.0031,-0.002733,-0.010612,-0.016381,-0.018441,0.010273,-0.014784,-0.009266,-0.015949,0.001867,-0.008457,-0.005014,0.005858,-0.019682,-0.011476,-0.005956,-0.014373,-0.007377,-0.006137,-0.019759,-0.005795,-0.005414,-0.004672,-0.000905,-0.000429,-0.011775,-0.001737,0.004073,-0.010621,-0.01673,-0.010442,-0.010141,0.005599,0.009061,-0.016809,-0.012078,-0.003766,0.0155,-0.008437,-0.011375,-0.004804,-0.001828,-0.012897,-0.000318,-0.002623,-0.012948,0.002053,-0.008579,0.000251,-0.00279,-0.000883,-0.01127,-0.003083,0.002311,-0.015105,-0.006299,0.001148,0.001026,-0.0029,-0.008748,-0.014178,-0.012665,-0.012276,-0.018451,-0.009684,-0.011135,-0.017413,-0.020136,-0.003177,-0.005841,-0.003432,-0.000677,-0.007153,-0.000551,-0.005647,-0.005245,-0.00762,-0.009723,-0.012373,0.0,0.0,0.15251,3.0
50%,-0.008472,0.02284,-0.002609,-0.002318,0.001752,0.013259,0.001191,-0.010197,0.009215,0.005099,0.010841,-0.024437,0.000535,-0.001538,0.002405,0.016529,-0.008088,-0.012443,0.013529,0.002526,0.002844,-0.008044,0.014038,-0.00933,0.004244,-0.008912,0.005523,-0.001102,0.002797,0.001148,-0.004645,-0.003175,0.004327,-0.000556,0.019059,0.011783,0.000626,0.00403,0.010571,0.006996,0.013409,0.005942,-0.016606,0.011134,0.008186,0.002918,-0.011017,0.001604,0.015733,0.001456,0.026522,0.002391,-0.006305,-0.007967,-0.010088,0.008102,0.003394,0.009997,0.025207,0.000108,-0.007014,0.008641,-0.000586,-0.012513,-0.010691,0.000804,-0.004471,0.005398,-0.003447,0.002167,0.008384,-0.006288,0.005462,0.006005,-0.004256,-0.010306,-0.004152,-0.003368,0.00108,-0.001747,-0.014146,-0.006779,-0.008867,-0.007752,0.003158,-0.002606,0.00238,0.00898,-0.002473,0.00121,-0.001671,0.009776,0.000279,0.006759,-0.001883,-0.002184,-0.007716,0.004247,0.003928,-0.001206,-0.005641,0.013154,-0.003761,-0.006792,-0.00448,-0.000962,-0.015505,0.007453,-0.007029,0.002773,-0.003864,-0.002394,0.006014,-0.004629,0.007581,0.001611,0.011493,0.000372,-0.014457,-0.009667,-0.002622,0.004144,0.008955,0.008593,-0.004779,-0.006357,0.006058,0.00194,-0.010117,0.005524,0.011177,-0.018985,0.007229,0.006728,0.013116,0.007954,0.000215,-0.0002,0.000891,-0.003669,-0.008514,0.003831,-0.002805,0.001073,-0.006305,0.001282,0.019326,-0.000127,0.00452,0.005878,0.005507,0.013674,-0.0012,-0.009122,-0.010952,0.002075,0.009949,-0.006521,-0.005487,-0.003348,-0.000166,0.007502,-0.00103,0.000168,-0.000385,-0.000693,0.003361,0.004791,0.003808,-0.013775,-0.001726,0.012196,-0.006829,0.000348,-0.000379,0.012753,-0.011968,0.015122,0.008756,-0.005204,-0.007841,-0.010747,0.019141,-0.016608,0.004563,0.013906,-0.004929,-0.001497,0.013235,0.010291,-0.009957,-0.008124,-0.003959,-0.014034,0.004876,-0.00383,0.014866,-0.012033,-0.000773,-0.023343,-0.009028,-0.000899,0.003517,-0.006748,-0.007307,0.003632,0.008344,0.007409,-0.008337,0.006569,-0.00387,-0.015803,-0.011612,0.007816,0.005258,-0.011658,-0.000735,0.007327,0.022913,-0.014027,0.000544,0.003893,0.006507,-0.00356,-0.006193,-0.009575,0.01612,-0.008942,0.001646,-0.006523,0.008977,-0.002828,0.005108,0.016484,-0.009144,-0.000736,0.000649,-0.002683,0.003964,-0.000222,-0.005025,0.000498,0.000135,0.000608,0.009325,0.011307,-0.004881,0.006784,0.010343,-0.001389,-0.005094,-0.002967,-0.004035,0.011055,0.015596,-0.009138,-0.006097,0.003333,0.021461,0.001162,-0.000731,0.00235,0.005913,-0.00325,0.008983,0.005036,-0.006725,0.010552,0.002563,0.010517,0.004314,0.007528,-0.003701,0.003693,0.008551,-0.006928,7.7e-05,0.010861,0.010957,0.003428,0.002427,-0.00672,-0.004855,-0.003287,-0.012341,-0.003626,-0.001653,-0.011069,-0.013232,0.005632,0.001041,0.00305,0.005099,0.009523,0.006407,0.001295,0.001975,0.000198,-0.002355,0.000316,0.076923,0.0,1.0,4.0
75%,0.002934,0.031469,0.004429,0.004228,0.011726,0.026499,0.008555,0.000854,0.016186,0.014263,0.017433,-0.01784,0.01373,0.00533,0.015306,0.023855,0.00484,-0.006577,0.023466,0.008637,0.009156,-0.001506,0.023306,-0.001169,0.01417,-0.001353,0.012364,0.007969,0.009004,0.006837,0.002729,0.003124,0.012196,0.009202,0.026741,0.020729,0.006494,0.011498,0.018149,0.028007,0.021679,0.015387,-0.010564,0.025068,0.01652,0.010136,-0.005611,0.00941,0.022369,0.008987,0.033084,0.009465,0.002779,0.004094,-0.005008,0.019062,0.013004,0.016868,0.036098,0.006435,-0.000724,0.016767,0.006534,-0.006716,0.004765,0.007817,0.004405,0.01718,0.003346,0.008653,0.014829,0.006103,0.017916,0.014813,0.002747,-0.002169,0.00677,0.003413,0.007101,0.004092,0.001417,5.1e-05,8e-06,0.001224,0.009535,0.004859,0.011064,0.016451,0.005239,0.008123,0.00368,0.017059,0.00787,0.013556,0.009524,0.010362,-0.002254,0.018874,0.012684,0.007904,0.004053,0.02342,0.00522,0.000126,0.00486,0.006358,-0.008701,0.014915,-0.00071,0.009169,0.00338,0.008092,0.012574,0.006105,0.020481,0.009464,0.017195,0.006488,-0.006677,-0.000821,0.005446,0.01264,0.019341,0.01648,0.002503,-8e-05,0.012136,0.008645,0.001691,0.011669,0.02019,-0.01162,0.014044,0.016656,0.021199,0.015636,0.012925,0.005987,0.011955,0.0045,-7.2e-05,0.012793,0.004635,0.013164,0.000252,0.009467,0.029962,0.00795,0.015055,0.012421,0.014447,0.019915,0.008835,0.006463,-0.00334,0.017272,0.016822,-0.000594,-0.000435,0.004492,0.005699,0.01406,0.006654,0.006378,0.004537,0.013016,0.010088,0.011086,0.009913,-0.005061,0.005939,0.023517,0.005871,0.009561,0.008119,0.022681,-0.002161,0.029456,0.019043,0.004393,0.009017,-0.001761,0.02867,-0.00824,0.012687,0.020739,0.000894,0.005294,0.02819,0.016705,0.002938,0.00084,0.007142,-0.008007,0.010904,0.001598,0.022671,-0.005049,0.00763,-0.016984,-0.003363,0.006171,0.009134,0.000687,-0.001164,0.0117,0.016039,0.014392,0.001234,0.012484,0.001912,-0.008683,-0.005814,0.015427,0.021601,0.003285,0.013623,0.020676,0.031399,0.010665,0.010186,0.010376,0.01549,0.004176,0.004192,-0.001428,0.022056,-0.001883,0.010892,0.003926,0.015464,0.002723,0.014932,0.023502,0.011279,0.010233,0.007062,0.01101,0.014694,0.005777,0.006366,0.006914,0.005024,0.005677,0.018278,0.027053,0.001247,0.016827,0.016643,0.008838,0.008483,0.005226,0.001834,0.016802,0.022661,-0.002279,-0.000179,0.01147,0.027563,0.010009,0.008712,0.009324,0.012743,0.005312,0.019253,0.011787,-0.000209,0.019572,0.012669,0.024311,0.010757,0.015109,0.003831,0.010759,0.014653,0.001169,0.00657,0.022149,0.022281,0.010575,0.012021,0.000146,0.003147,0.005262,-0.006365,0.003287,0.011056,-0.004279,-0.003711,0.013563,0.007983,0.010212,0.011895,0.023998,0.014517,0.007427,0.009407,0.010411,0.004723,0.011048,0.555556,0.5,1.833333,5.0
max,0.116077,0.092948,0.061171,0.062565,0.098129,0.191384,0.062707,0.065401,0.085557,0.084326,0.077278,0.067791,0.095307,0.049926,0.081325,0.12979,0.065947,0.088064,0.107934,0.067978,0.104715,0.04843,0.094839,0.054444,0.138499,0.060451,0.068011,0.061564,0.069708,0.052342,0.071625,0.064561,0.080009,0.12619,0.133505,0.096335,0.049399,0.073156,0.074686,0.09792,0.075704,0.124292,0.037642,0.096234,0.060703,0.07074,0.06761,0.066943,0.10094,0.062752,0.082969,0.062117,0.072819,0.062219,0.038836,0.080694,0.069802,0.072575,0.087074,0.104184,0.045232,0.070057,0.051028,0.052553,0.06892,0.068994,0.061181,0.110694,0.055574,0.071316,0.077351,0.071526,0.089958,0.097455,0.07982,0.059923,0.089617,0.057738,0.133575,0.055499,0.099543,0.068226,0.126455,0.078044,0.103558,0.075382,0.075837,0.108137,0.060376,0.095953,0.053373,0.061853,0.069553,0.185667,0.083488,0.078605,0.052815,0.093548,0.071201,0.069393,0.061478,0.100102,0.090095,0.064884,0.055292,0.067387,0.05027,0.078922,0.078024,0.053102,0.063997,0.076342,0.056904,0.065691,0.095476,0.066127,0.070296,0.053285,0.073869,0.064544,0.087484,0.076806,0.081401,0.082052,0.050827,0.062453,0.082856,0.105139,0.07443,0.059377,0.079609,0.085144,0.079892,0.094782,0.090888,0.086033,0.08825,0.059949,0.094705,0.065482,0.062329,0.06734,0.071085,0.081044,0.059412,0.089264,0.089015,0.092912,0.064109,0.087022,0.092685,0.092465,0.072346,0.090889,0.10056,0.084514,0.104154,0.043002,0.046335,0.088478,0.049507,0.094211,0.090065,0.066376,0.049209,0.079852,0.068317,0.081142,0.107361,0.071754,0.06262,0.095875,0.122778,0.079274,0.071158,0.141365,0.047019,0.088459,0.082901,0.105783,0.072548,0.047857,0.098478,0.057289,0.076892,0.136077,0.047199,0.049668,0.089853,0.072013,0.09571,0.063411,0.074416,0.039007,0.068736,0.051489,0.075979,0.063918,0.071883,0.046469,0.062196,0.085334,0.11717,0.06669,0.049012,0.083452,0.066676,0.072407,0.148158,0.077833,0.054531,0.061231,0.047551,0.062982,0.089624,0.066872,0.096662,0.164057,0.118171,0.090443,0.08288,0.061406,0.152755,0.054022,0.120973,0.063569,0.066101,0.046101,0.059193,0.06137,0.10165,0.050854,0.071476,0.103233,0.081677,0.10129,0.064751,0.080342,0.063075,0.102277,0.056264,0.0797,0.086517,0.054548,0.071948,0.101756,0.072855,0.084344,0.098247,0.087882,0.072976,0.053238,0.053074,0.076004,0.078658,0.05568,0.078443,0.087993,0.074559,0.072468,0.100861,0.093074,0.076537,0.068904,0.097604,0.059389,0.051224,0.088514,0.0833,0.100106,0.0636,0.073844,0.049915,0.07702,0.075295,0.081609,0.064351,0.064367,0.087821,0.058437,0.077517,0.067107,0.130079,0.099862,0.044955,0.086514,0.077494,0.126437,0.068946,0.065618,0.081366,0.083463,0.079886,0.083106,0.079304,0.052836,0.058802,0.074491,0.070368,0.076403,56.0,28.0,75.0,5.0


# Create Label y (Business categories)

In [22]:
def load_business_df(): 
    filename = r'../../data/business.json'
    new_list = []
    for line in open(filename):
       new_list.append(json.loads(line))
    return pd.DataFrame.from_records(new_list)

dfbusiness = load_business_df()

In [23]:
dfbusiness.head()

Unnamed: 0,address,attributes,business_id,categories,city,hours,is_open,latitude,longitude,name,postal_code,review_count,stars,state
0,2818 E Camino Acequia Drive,{'GoodForKids': 'False'},1SWheh84yJXfytovILXOAQ,"Golf, Active Life",Phoenix,,0,33.522143,-112.018481,Arizona Biltmore Golf Club,85016,5,3.0,AZ
1,30 Eglinton Avenue W,"{'RestaurantsReservations': 'True', 'GoodForMe...",QXAEGFB4oINsVuTFxEYKFQ,"Specialty Food, Restaurants, Dim Sum, Imported...",Mississauga,"{'Monday': '9:0-0:0', 'Tuesday': '9:0-0:0', 'W...",1,43.605499,-79.652289,Emerald Chinese Restaurant,L5R 3E7,128,2.5,ON
2,"10110 Johnston Rd, Ste 15","{'GoodForKids': 'True', 'NoiseLevel': 'u'avera...",gnKjwL_1w79qoiV3IC_xQQ,"Sushi Bars, Restaurants, Japanese",Charlotte,"{'Monday': '17:30-21:30', 'Wednesday': '17:30-...",1,35.092564,-80.859132,Musashi Japanese Restaurant,28210,170,4.0,NC
3,"15655 W Roosevelt St, Ste 237",,xvX2CttrVhyG2z1dFg_0xw,"Insurance, Financial Services",Goodyear,"{'Monday': '8:0-17:0', 'Tuesday': '8:0-17:0', ...",1,33.455613,-112.395596,Farmers Insurance - Paul Lorenz,85338,3,5.0,AZ
4,"4209 Stuart Andrew Blvd, Ste F","{'BusinessAcceptsBitcoin': 'False', 'ByAppoint...",HhyxOkGAM07SRYtlQ4wMFQ,"Plumbing, Shopping, Local Services, Home Servi...",Charlotte,"{'Monday': '7:0-23:0', 'Tuesday': '7:0-23:0', ...",1,35.190012,-80.887223,Queen City Plumbing,28217,4,4.0,NC


# Join x,y (feature matrix, category) using business_id

In [24]:
dfbusiness.columns

Index(['address', 'attributes', 'business_id', 'categories', 'city', 'hours',
       'is_open', 'latitude', 'longitude', 'name', 'postal_code',
       'review_count', 'stars', 'state'],
      dtype='object')

In [25]:
len(dfbusiness['stars'].unique())

9

In [26]:
# Add business details to features df
keep_cols = ['business_id', 'categories', 'review_count']
all_features_business = all_features_business.merge(dfbusiness[keep_cols], how='left', on='business_id') 

In [27]:
all_features_business.head()

Unnamed: 0,business_id,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars,categories,review_count
0,--I7YYLada0tSLkORTHb5Q,-0.007499,0.01587,-0.009091,-0.006852,-0.000165,0.024793,0.008644,-0.014792,0.009979,0.003942,0.0041,-0.031356,-0.010247,0.005387,0.011627,0.020621,0.015115,-0.008609,0.002743,-0.010083,-0.006879,-0.005206,0.009593,-0.012116,-0.00858,0.004146,0.017497,-0.010027,-0.002657,-0.003539,-0.003672,-0.003878,-0.019413,-0.004522,0.010478,-0.011138,-0.00152,0.001624,0.002736,-0.001654,0.014157,0.012092,-0.011507,6.5e-05,0.02149,0.012381,-0.010089,-5.6e-05,0.006849,0.012484,0.022499,0.007505,0.001161,0.006571,-0.011028,-0.008634,-0.003242,-0.005032,0.016984,-0.006729,-0.007712,0.016816,-0.009824,-0.01381,-0.018497,0.00596,-0.00389,0.005296,0.001982,0.005957,0.010111,0.011505,-0.009504,0.006016,0.000145,-0.001613,0.003529,-0.009751,-0.001159,0.00135,0.004055,-0.012402,-0.011015,0.001537,0.009277,-0.006952,0.018196,0.011549,-0.003539,0.003471,-0.001982,-0.006949,-0.004149,-0.001219,-0.018856,0.018914,-0.002559,0.020924,-0.02201,-0.016058,-0.019431,0.024031,0.010372,-0.024269,0.001566,0.003882,-0.016965,0.014903,-0.000333,0.005388,-0.004297,-0.022004,0.007121,-0.015847,0.010955,0.008343,0.002458,-0.004447,-0.022745,-0.016351,-0.015083,0.016055,-0.010969,0.004302,-0.002894,-0.003404,0.013917,0.003703,-0.026776,-0.001713,0.000421,-0.011638,0.00956,0.00662,0.002412,0.005148,-0.014819,-0.002384,0.013313,-0.011222,-0.011357,-0.015396,-0.004863,-0.00418,-0.007797,-0.008371,0.005455,0.006528,0.016183,0.014788,0.010562,0.015067,0.008007,0.003813,-0.009594,-0.019872,0.006658,-0.004194,-0.000726,-0.019882,0.003207,0.008335,-0.00353,-0.00852,0.003569,0.014994,0.012201,0.009086,0.00645,-0.01886,-0.002093,0.004624,0.013989,-0.011511,-0.000781,0.017287,0.005218,0.033981,-0.004402,-0.001438,0.023224,-0.008902,0.007502,-0.032931,0.001991,0.006755,-0.008278,-0.004967,-0.016611,0.00286,-0.027153,-0.00717,0.009309,-0.020619,0.000266,-0.001966,0.02771,-0.024209,-0.00121,-0.015006,-0.012201,0.003239,0.010807,-0.020057,-0.004619,-0.006609,0.018128,0.006096,0.006753,0.008909,0.002935,-0.01132,-0.015718,0.009725,0.021607,0.006392,0.020936,0.024687,0.011394,0.020137,-0.002377,0.007439,0.014349,-0.008634,-0.021233,0.001051,0.013405,-0.011792,0.01148,-0.024658,0.012449,0.006896,-0.006742,0.01825,0.01273,0.002604,0.005603,0.013587,-0.015682,-0.001813,0.000986,-0.001446,0.001805,0.000281,-0.006727,0.002159,-0.000811,0.015007,0.011092,0.00152,-0.012781,-0.009524,0.001715,0.013574,0.006682,0.004416,-0.016188,-0.00107,0.02416,-0.007177,0.003758,-0.002125,0.01329,0.009893,0.010383,0.014488,-0.018683,0.005121,0.012145,0.02989,-0.003379,0.013333,-0.012451,0.005107,0.010078,-0.003755,-0.002551,0.024659,-0.006757,-0.001592,0.012685,-0.008899,-0.01607,0.000193,-0.01233,-0.013128,0.018632,-0.013797,-0.017402,0.009316,8.5e-05,-0.000346,0.00496,-0.004972,0.007819,-0.001563,-0.007926,-0.00514,-0.002637,0.004487,0.352941,0.352941,0.823529,3.647059,"Nightlife, Sports Bars, Restaurants, Bars, Ame...",96
1,--U98MNlDym2cLn36BBPgQ,0.001321,0.020397,-0.005849,0.001654,-0.011622,0.018849,0.005494,-0.015423,0.008618,0.007287,0.007399,-0.023791,0.000994,0.002044,-0.006207,0.025938,0.002294,-0.010482,0.00371,-0.009301,0.002224,-0.012484,0.016786,-0.018724,-0.007389,-0.005424,0.009281,0.005434,0.004109,0.009846,-0.006699,-0.003709,-0.005755,0.005977,0.010481,0.001521,0.011386,-0.001846,0.000209,0.016299,0.013629,-0.016856,-0.009517,-0.002898,0.006809,0.00016,-0.011914,-0.001501,0.013424,0.005944,0.02339,0.00945,-0.001096,0.009402,-0.011847,0.001044,-0.00358,0.022533,0.02438,-0.000276,-0.007263,0.015476,-0.003131,-0.009706,-0.016027,-0.002222,-0.00582,0.006852,0.002422,0.001015,0.00543,0.010693,0.007849,0.026505,-3.7e-05,-0.01016,-0.02102,-0.002758,-0.007959,0.003969,-0.006524,-0.010391,-0.012534,0.002711,0.004077,-0.000967,0.00462,-0.011257,-0.009275,-0.001961,0.004368,-0.002741,0.002213,-0.001484,-0.013602,0.005373,-0.001872,0.006827,-0.008297,-0.008082,-0.005082,-0.002831,0.010845,-0.016601,0.001404,-0.009223,-0.009052,0.00487,0.004361,0.006548,0.001582,-0.003647,0.010116,-0.011908,-0.005118,-0.001061,0.017997,0.003667,-0.007199,-0.001373,-0.005872,0.006095,-0.000244,0.002786,-9e-05,0.002776,0.011398,0.012544,-0.00881,-0.003625,0.008335,-0.018189,0.021909,-0.006605,0.001991,0.010614,-0.008953,0.002743,-0.01012,-0.006879,-0.013393,-0.01036,-0.007192,0.006551,-0.003993,-0.012971,0.018007,-0.009865,0.018597,0.006843,-0.002549,0.011944,0.018651,0.008417,-0.007486,-0.004775,0.012657,0.003012,-0.000857,-0.006485,0.002627,0.008823,-0.004679,0.007138,-0.002575,0.009535,0.006989,0.009448,-0.002979,-0.016121,-0.009967,0.001045,0.007279,-0.008831,0.000771,0.034161,-0.006813,0.023669,0.013984,-0.013003,0.009409,0.002028,0.012934,-0.025267,-0.002378,-0.005684,0.007701,-0.00501,-0.002147,0.003824,-0.007827,0.008128,-0.00025,-0.017537,0.004941,0.001875,0.020632,-0.011595,-0.019383,-0.008837,-0.003078,0.006548,0.009257,-0.018428,-0.003066,-0.012686,0.017781,-0.006594,0.002138,0.004893,-0.004178,-0.018483,-0.021718,0.005285,0.002581,0.011907,0.019082,0.019257,0.010664,0.007317,-0.002005,-0.002601,0.004837,-0.006308,-0.01078,0.00151,0.01077,-0.018656,0.010772,-0.013842,0.006009,-0.010805,0.005438,0.015986,0.018399,0.002517,-0.007091,0.006768,-0.002893,-0.001234,0.005622,-0.006657,0.011644,-0.011799,0.006536,0.017361,-0.000455,0.008157,0.011874,-0.019345,-0.016238,-0.006056,-0.01017,0.007585,0.012319,-0.000601,-0.018317,0.007401,0.022479,-0.004789,-0.00281,-0.006522,0.011438,-0.008014,0.030327,0.009807,-0.019028,0.00948,-0.000841,0.019035,0.017731,0.025443,-0.008603,0.009164,0.003628,-0.001672,-0.011721,0.025006,0.016348,0.006681,0.005001,-0.010569,-0.006204,0.000571,-0.00706,0.000485,0.016113,-0.026779,-0.006788,0.016352,0.006836,-0.009109,-0.000684,-0.00864,-0.007533,0.009403,0.006655,0.006638,-8.9e-05,0.023506,0.0,0.0,2.0,3.0,"Pizza, Restaurants",4
2,--j-kaNMCo1-DYzddCsA5Q,-0.031706,0.048197,0.002592,-0.014077,0.012455,0.021341,0.006416,0.023812,0.008884,-0.010029,0.026321,-0.035189,-0.029355,-0.003366,0.010322,0.010225,-0.013881,-0.014311,0.009284,-0.004531,0.015355,-0.012754,0.040054,-0.005349,-0.01496,-0.017134,0.015219,-0.005011,-0.005252,-0.018325,0.006987,-0.000175,-0.002259,-0.0018,0.046835,0.004774,-0.005338,0.002812,0.034205,-0.000922,0.041472,0.008703,-0.027083,0.031797,0.002945,-0.00315,-0.037086,0.023052,0.014693,0.006894,0.033686,-0.005447,-0.044836,-0.041553,0.007038,0.033026,0.018956,0.019412,0.040987,-0.002007,-0.044471,-0.004943,0.006001,-0.012231,-0.004839,-0.002395,-0.014001,0.043168,0.011314,-0.010929,0.003008,0.023616,0.035201,0.020261,0.005463,-0.003348,0.03982,0.01225,0.013604,0.017491,-0.029549,0.003226,0.009524,0.011124,0.003292,-0.032335,-0.002262,0.001353,0.020998,0.007193,0.003009,0.000301,0.002265,-0.005233,-0.002113,-0.00373,-0.009042,0.008372,0.012576,-0.011596,-0.024452,0.03889,-0.01924,-0.011677,-0.020417,0.014937,-0.026773,0.022879,-0.02211,-0.002352,-0.007221,-0.008771,-0.005963,0.005788,-0.005936,-0.001119,0.009281,0.00632,-0.032762,-0.018522,0.014365,0.006239,0.007098,0.031014,-0.002293,0.005433,-0.001168,-0.01514,-0.013594,0.027634,0.018961,-0.01302,0.02217,0.035544,0.008864,0.020313,0.00212,-0.014756,0.030835,-0.004879,-0.03069,0.019472,-0.013574,-0.004114,0.01387,0.039049,0.03777,0.00324,0.01691,0.010527,0.042552,0.02334,-0.038049,-0.025515,-0.018268,-0.011539,0.014282,-0.006163,-0.015156,0.004478,0.018854,0.01609,0.005756,-0.023261,-0.001588,0.00746,-0.020568,-0.004637,-0.007909,-0.027837,0.002383,0.033475,-0.042774,-0.02559,0.020001,0.006792,-0.010391,0.034303,0.016683,0.000285,-0.01123,-0.022597,0.027094,-0.030294,-0.020465,0.016756,0.017633,0.013609,0.017103,0.017212,-0.012723,-0.006932,0.023111,0.002712,0.011501,-0.006055,0.020469,-0.024268,-0.002767,-0.036422,0.002019,0.002155,0.013061,-0.008971,-0.015068,0.000662,0.000248,0.014378,-0.040984,0.019256,-0.016069,0.004206,-0.014408,0.005932,0.034386,-0.0332,-0.018477,0.020917,0.039714,-0.015137,-0.005553,0.023126,0.001117,0.007508,0.003806,-0.029266,0.006011,0.005167,0.01324,0.013694,0.020257,0.004307,-0.018121,0.025035,0.001026,-0.013214,-0.016001,-0.009035,0.002485,-0.012465,-0.00023,-0.003857,-0.003385,0.008617,0.004842,0.002352,-0.024324,-0.008021,0.023252,0.051692,0.013226,-0.014763,-0.010242,0.016647,0.015237,0.006865,-0.009102,0.007311,0.029586,0.012973,0.001781,0.000875,0.009083,-0.000437,-0.006084,-0.010229,-0.011938,0.009713,0.017088,0.011857,-0.004196,0.000431,-0.022829,-0.012663,0.020776,-0.042662,0.012982,0.007005,0.006552,0.015705,-0.004542,-0.006312,-0.014624,0.000435,-0.019186,-0.019089,0.004377,-0.015651,-0.032356,-0.02179,-0.019216,0.023988,0.036593,0.010598,0.007846,-0.006808,0.006012,-0.00479,-0.009403,-0.023446,0.0,0.0,0.0,5.0,"Hair Removal, Nail Technicians, Beauty & Spas,...",4
3,--wIGbLEhlpl_UeAIyDmZQ,-0.025492,0.019936,-0.00978,0.008989,0.006961,-0.007019,-0.010135,-0.005605,0.006441,0.014264,-0.004153,-0.013604,0.029235,-0.009206,-0.012113,0.015941,-0.023862,-0.004637,0.023066,0.00594,0.001028,-0.011015,0.021822,-0.018162,0.019159,-0.017638,-0.007374,0.008013,0.013594,0.004332,-0.004372,-0.000418,0.007773,0.01345,0.021573,0.014685,0.00535,-0.003031,0.00416,0.030382,0.02766,-0.012948,-0.012107,0.018031,0.009482,0.003252,-0.011,-0.009094,0.024033,0.003302,0.016488,0.008022,-0.008947,-0.017594,-0.012784,0.010665,0.006593,0.008323,0.029553,0.009798,0.005283,0.005461,-0.004438,-0.015892,-0.007072,-0.002535,-0.006522,-0.010301,-0.014766,0.008775,0.000726,-0.025882,0.009679,0.000513,-0.010131,-0.011865,-0.030439,-0.002678,-0.000552,-0.011225,-0.020444,-0.007347,-0.014832,-0.016297,-0.005811,-0.000288,-0.005281,0.003615,-0.00144,0.001941,-0.00617,0.017623,0.013533,0.007779,0.004645,-0.009148,-0.006272,-0.000626,0.013421,-0.003977,0.013309,-0.006342,-0.013755,-0.008102,0.010545,-0.009056,-0.015485,0.00261,-0.005081,-0.00295,0.005824,0.012043,0.008401,0.000881,-0.007029,0.00192,0.009014,-0.003405,-0.014651,-0.002772,0.002065,0.003278,0.021487,0.016243,0.000225,-0.010734,0.006012,9e-05,0.010165,0.011665,0.01584,-0.015023,-0.002547,0.003871,0.022617,0.00047,0.021889,0.000163,-0.016224,0.002425,0.004522,0.021262,0.006554,0.007554,-0.0082,-0.003171,0.024925,-0.014101,-0.01307,0.00566,0.0008,0.005688,0.000536,-0.012746,-0.015648,0.019713,0.007474,-0.014455,-0.004201,0.004916,-0.010583,0.002872,0.008594,0.008084,-0.009078,-0.014606,0.000752,0.003925,-0.004628,0.003482,8.8e-05,0.022214,-0.002939,0.007237,-0.012251,0.005304,-0.023208,-0.00915,0.03032,-0.017759,-0.02806,-0.012879,0.031513,-0.010033,0.002677,0.018922,-0.015384,0.001332,0.022275,0.018732,0.002626,-0.007305,-0.022455,-0.022342,0.002293,-0.00297,0.012568,0.004259,0.002387,-0.023521,-0.014398,0.008061,-0.00713,-0.004541,-0.006768,0.008254,0.002175,0.013936,-0.011764,-0.005329,-0.003379,-0.024168,-0.009096,0.001449,-0.007915,-0.023049,-0.007399,-0.006107,0.029451,-0.035164,-0.013084,-0.0046,0.002214,-0.004734,0.000585,-0.010677,0.02521,-0.008014,-0.00974,-0.003139,4.9e-05,0.001037,0.016224,0.002176,-0.018329,-0.010924,0.006367,-0.020801,0.01834,-0.006561,-0.018475,0.001385,0.001504,-0.005265,0.021937,0.026431,-0.010178,-0.004649,0.014743,-0.006455,-0.000592,0.010101,0.005465,0.016865,0.023015,-0.018072,-0.002269,-0.00815,0.023424,0.008583,-0.008682,0.007142,0.000363,-0.00172,0.022179,-0.006348,-0.001533,0.013606,-0.020222,0.007255,0.002943,0.012594,0.005401,0.004614,0.00883,-0.00246,-0.001011,0.002633,0.009794,0.003525,0.006961,-0.010522,-0.005311,-0.001347,-0.019297,0.006033,-0.013824,-0.000982,0.003249,0.007947,0.003068,0.000488,0.005373,0.015897,0.00668,0.002806,0.013151,0.016076,0.000884,0.006555,0.666667,0.166667,3.0,3.833333,"Electronics, Professional Services, Local Serv...",14
4,-000aQFeK6tqVLndf7xORg,-0.027738,0.037412,0.007019,0.006407,0.004354,-0.005285,-0.010273,0.004754,0.006071,0.018834,0.012821,-0.021983,-0.002804,0.002488,-0.010496,0.004216,-0.032971,-0.017167,0.01614,0.002577,0.005492,-0.018144,0.031077,-0.024672,0.017904,-0.018859,0.003628,0.025943,0.012707,-0.002589,-0.002832,-0.007103,-0.005866,0.02241,0.013158,0.013854,0.003774,-1e-06,0.015016,0.041297,0.032408,0.003349,-0.026224,0.037172,0.005429,0.007529,-0.021226,-0.007134,0.033146,-0.00593,0.026232,0.000369,-0.011271,-0.015598,-0.010524,0.011385,0.013815,0.011965,0.040931,0.008883,0.000582,-0.003109,-0.004049,-0.022657,0.010396,-0.012263,-0.014782,-0.005251,-0.016635,-0.00386,0.004131,-0.024414,0.003984,-0.003817,-0.023049,-0.012739,-0.019521,0.009405,0.01038,-0.004879,-0.03375,-0.000219,0.00301,-0.017039,0.005289,-0.01204,0.001985,0.003886,0.009272,0.005672,-0.011414,0.020128,-0.004136,0.003024,0.035359,-0.025288,-0.012003,-0.004979,0.01354,-0.00376,0.016228,-0.002336,-0.008641,-0.008916,0.023729,-0.013717,-0.010763,-0.001542,-0.004317,-0.004265,-0.006154,-0.000357,0.007941,0.004022,-0.01004,-0.002143,0.004216,-0.008887,-0.003396,-0.015967,0.004146,-0.005276,0.022121,0.009692,-0.006947,-0.015748,0.003391,-0.006775,0.00755,0.009769,0.028079,-0.035267,-0.004585,0.008808,0.028737,0.018575,0.028816,-0.008854,-0.008475,0.000487,0.004927,0.02862,0.003009,0.00621,-0.004447,0.001013,0.030245,-0.015197,-0.010834,-0.002448,0.013291,0.012359,-0.00825,-0.021378,-0.013719,0.020623,0.009189,-0.017598,-0.009157,-0.004822,0.005702,-0.001931,0.000602,-0.005581,-0.008448,-0.026672,0.001478,-0.002303,0.000887,-0.008978,-0.007218,0.038184,-0.01691,0.017915,-0.011773,0.014865,-0.031074,-0.00053,0.042592,-0.015951,-0.023318,-0.027595,0.034925,-0.015474,0.011361,0.018718,-0.018819,-0.002008,0.041432,0.024755,0.001769,-0.020968,-0.012661,-0.023198,0.010154,-0.02107,0.012445,-0.012889,-0.000791,-0.023533,-0.017947,-0.001886,0.003658,0.000924,-0.017163,0.002154,-0.00072,0.018059,-0.012959,0.011919,-0.00214,-0.035065,-0.016817,0.018926,-0.018268,-0.023407,-0.018184,0.002846,0.036747,-0.052445,-0.015186,0.001086,-0.003678,0.006969,-0.004229,-0.020942,0.022191,-0.01646,-0.013144,-0.000691,0.007086,-0.001749,0.014215,0.005489,-0.016498,-0.015637,0.00889,-0.014859,0.019654,-0.021839,-0.021766,0.000612,-0.000924,-0.004216,0.023697,0.025991,-0.002733,-0.008449,0.012186,-0.004689,0.019142,0.005275,-0.004139,0.00995,0.020918,-0.021781,-0.012863,3.1e-05,0.028667,0.010145,-0.006074,0.007731,-9.3e-05,-0.006233,0.024673,-0.009111,-0.008371,0.019192,-0.004005,0.011155,0.005353,-0.002033,-0.011192,0.008456,0.01359,0.005049,-0.000745,0.009627,0.013333,0.009758,-0.006261,-0.018125,0.000727,-1e-05,-0.029143,0.003974,-0.018425,0.00472,-0.005071,0.005684,0.002536,0.010516,0.005451,0.036543,0.008084,-0.00268,0.008032,0.02726,-0.014735,-0.000253,0.666667,0.0,0.0,5.0,"Automotive, Auto Repair",7


In [28]:
all_features_business['categories'][0]

'Nightlife, Sports Bars, Restaurants, Bars, American (Traditional)'

In [29]:
all_features_business.head()

Unnamed: 0,business_id,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars,categories,review_count
0,--I7YYLada0tSLkORTHb5Q,-0.007499,0.01587,-0.009091,-0.006852,-0.000165,0.024793,0.008644,-0.014792,0.009979,0.003942,0.0041,-0.031356,-0.010247,0.005387,0.011627,0.020621,0.015115,-0.008609,0.002743,-0.010083,-0.006879,-0.005206,0.009593,-0.012116,-0.00858,0.004146,0.017497,-0.010027,-0.002657,-0.003539,-0.003672,-0.003878,-0.019413,-0.004522,0.010478,-0.011138,-0.00152,0.001624,0.002736,-0.001654,0.014157,0.012092,-0.011507,6.5e-05,0.02149,0.012381,-0.010089,-5.6e-05,0.006849,0.012484,0.022499,0.007505,0.001161,0.006571,-0.011028,-0.008634,-0.003242,-0.005032,0.016984,-0.006729,-0.007712,0.016816,-0.009824,-0.01381,-0.018497,0.00596,-0.00389,0.005296,0.001982,0.005957,0.010111,0.011505,-0.009504,0.006016,0.000145,-0.001613,0.003529,-0.009751,-0.001159,0.00135,0.004055,-0.012402,-0.011015,0.001537,0.009277,-0.006952,0.018196,0.011549,-0.003539,0.003471,-0.001982,-0.006949,-0.004149,-0.001219,-0.018856,0.018914,-0.002559,0.020924,-0.02201,-0.016058,-0.019431,0.024031,0.010372,-0.024269,0.001566,0.003882,-0.016965,0.014903,-0.000333,0.005388,-0.004297,-0.022004,0.007121,-0.015847,0.010955,0.008343,0.002458,-0.004447,-0.022745,-0.016351,-0.015083,0.016055,-0.010969,0.004302,-0.002894,-0.003404,0.013917,0.003703,-0.026776,-0.001713,0.000421,-0.011638,0.00956,0.00662,0.002412,0.005148,-0.014819,-0.002384,0.013313,-0.011222,-0.011357,-0.015396,-0.004863,-0.00418,-0.007797,-0.008371,0.005455,0.006528,0.016183,0.014788,0.010562,0.015067,0.008007,0.003813,-0.009594,-0.019872,0.006658,-0.004194,-0.000726,-0.019882,0.003207,0.008335,-0.00353,-0.00852,0.003569,0.014994,0.012201,0.009086,0.00645,-0.01886,-0.002093,0.004624,0.013989,-0.011511,-0.000781,0.017287,0.005218,0.033981,-0.004402,-0.001438,0.023224,-0.008902,0.007502,-0.032931,0.001991,0.006755,-0.008278,-0.004967,-0.016611,0.00286,-0.027153,-0.00717,0.009309,-0.020619,0.000266,-0.001966,0.02771,-0.024209,-0.00121,-0.015006,-0.012201,0.003239,0.010807,-0.020057,-0.004619,-0.006609,0.018128,0.006096,0.006753,0.008909,0.002935,-0.01132,-0.015718,0.009725,0.021607,0.006392,0.020936,0.024687,0.011394,0.020137,-0.002377,0.007439,0.014349,-0.008634,-0.021233,0.001051,0.013405,-0.011792,0.01148,-0.024658,0.012449,0.006896,-0.006742,0.01825,0.01273,0.002604,0.005603,0.013587,-0.015682,-0.001813,0.000986,-0.001446,0.001805,0.000281,-0.006727,0.002159,-0.000811,0.015007,0.011092,0.00152,-0.012781,-0.009524,0.001715,0.013574,0.006682,0.004416,-0.016188,-0.00107,0.02416,-0.007177,0.003758,-0.002125,0.01329,0.009893,0.010383,0.014488,-0.018683,0.005121,0.012145,0.02989,-0.003379,0.013333,-0.012451,0.005107,0.010078,-0.003755,-0.002551,0.024659,-0.006757,-0.001592,0.012685,-0.008899,-0.01607,0.000193,-0.01233,-0.013128,0.018632,-0.013797,-0.017402,0.009316,8.5e-05,-0.000346,0.00496,-0.004972,0.007819,-0.001563,-0.007926,-0.00514,-0.002637,0.004487,0.352941,0.352941,0.823529,3.647059,"Nightlife, Sports Bars, Restaurants, Bars, Ame...",96
1,--U98MNlDym2cLn36BBPgQ,0.001321,0.020397,-0.005849,0.001654,-0.011622,0.018849,0.005494,-0.015423,0.008618,0.007287,0.007399,-0.023791,0.000994,0.002044,-0.006207,0.025938,0.002294,-0.010482,0.00371,-0.009301,0.002224,-0.012484,0.016786,-0.018724,-0.007389,-0.005424,0.009281,0.005434,0.004109,0.009846,-0.006699,-0.003709,-0.005755,0.005977,0.010481,0.001521,0.011386,-0.001846,0.000209,0.016299,0.013629,-0.016856,-0.009517,-0.002898,0.006809,0.00016,-0.011914,-0.001501,0.013424,0.005944,0.02339,0.00945,-0.001096,0.009402,-0.011847,0.001044,-0.00358,0.022533,0.02438,-0.000276,-0.007263,0.015476,-0.003131,-0.009706,-0.016027,-0.002222,-0.00582,0.006852,0.002422,0.001015,0.00543,0.010693,0.007849,0.026505,-3.7e-05,-0.01016,-0.02102,-0.002758,-0.007959,0.003969,-0.006524,-0.010391,-0.012534,0.002711,0.004077,-0.000967,0.00462,-0.011257,-0.009275,-0.001961,0.004368,-0.002741,0.002213,-0.001484,-0.013602,0.005373,-0.001872,0.006827,-0.008297,-0.008082,-0.005082,-0.002831,0.010845,-0.016601,0.001404,-0.009223,-0.009052,0.00487,0.004361,0.006548,0.001582,-0.003647,0.010116,-0.011908,-0.005118,-0.001061,0.017997,0.003667,-0.007199,-0.001373,-0.005872,0.006095,-0.000244,0.002786,-9e-05,0.002776,0.011398,0.012544,-0.00881,-0.003625,0.008335,-0.018189,0.021909,-0.006605,0.001991,0.010614,-0.008953,0.002743,-0.01012,-0.006879,-0.013393,-0.01036,-0.007192,0.006551,-0.003993,-0.012971,0.018007,-0.009865,0.018597,0.006843,-0.002549,0.011944,0.018651,0.008417,-0.007486,-0.004775,0.012657,0.003012,-0.000857,-0.006485,0.002627,0.008823,-0.004679,0.007138,-0.002575,0.009535,0.006989,0.009448,-0.002979,-0.016121,-0.009967,0.001045,0.007279,-0.008831,0.000771,0.034161,-0.006813,0.023669,0.013984,-0.013003,0.009409,0.002028,0.012934,-0.025267,-0.002378,-0.005684,0.007701,-0.00501,-0.002147,0.003824,-0.007827,0.008128,-0.00025,-0.017537,0.004941,0.001875,0.020632,-0.011595,-0.019383,-0.008837,-0.003078,0.006548,0.009257,-0.018428,-0.003066,-0.012686,0.017781,-0.006594,0.002138,0.004893,-0.004178,-0.018483,-0.021718,0.005285,0.002581,0.011907,0.019082,0.019257,0.010664,0.007317,-0.002005,-0.002601,0.004837,-0.006308,-0.01078,0.00151,0.01077,-0.018656,0.010772,-0.013842,0.006009,-0.010805,0.005438,0.015986,0.018399,0.002517,-0.007091,0.006768,-0.002893,-0.001234,0.005622,-0.006657,0.011644,-0.011799,0.006536,0.017361,-0.000455,0.008157,0.011874,-0.019345,-0.016238,-0.006056,-0.01017,0.007585,0.012319,-0.000601,-0.018317,0.007401,0.022479,-0.004789,-0.00281,-0.006522,0.011438,-0.008014,0.030327,0.009807,-0.019028,0.00948,-0.000841,0.019035,0.017731,0.025443,-0.008603,0.009164,0.003628,-0.001672,-0.011721,0.025006,0.016348,0.006681,0.005001,-0.010569,-0.006204,0.000571,-0.00706,0.000485,0.016113,-0.026779,-0.006788,0.016352,0.006836,-0.009109,-0.000684,-0.00864,-0.007533,0.009403,0.006655,0.006638,-8.9e-05,0.023506,0.0,0.0,2.0,3.0,"Pizza, Restaurants",4
2,--j-kaNMCo1-DYzddCsA5Q,-0.031706,0.048197,0.002592,-0.014077,0.012455,0.021341,0.006416,0.023812,0.008884,-0.010029,0.026321,-0.035189,-0.029355,-0.003366,0.010322,0.010225,-0.013881,-0.014311,0.009284,-0.004531,0.015355,-0.012754,0.040054,-0.005349,-0.01496,-0.017134,0.015219,-0.005011,-0.005252,-0.018325,0.006987,-0.000175,-0.002259,-0.0018,0.046835,0.004774,-0.005338,0.002812,0.034205,-0.000922,0.041472,0.008703,-0.027083,0.031797,0.002945,-0.00315,-0.037086,0.023052,0.014693,0.006894,0.033686,-0.005447,-0.044836,-0.041553,0.007038,0.033026,0.018956,0.019412,0.040987,-0.002007,-0.044471,-0.004943,0.006001,-0.012231,-0.004839,-0.002395,-0.014001,0.043168,0.011314,-0.010929,0.003008,0.023616,0.035201,0.020261,0.005463,-0.003348,0.03982,0.01225,0.013604,0.017491,-0.029549,0.003226,0.009524,0.011124,0.003292,-0.032335,-0.002262,0.001353,0.020998,0.007193,0.003009,0.000301,0.002265,-0.005233,-0.002113,-0.00373,-0.009042,0.008372,0.012576,-0.011596,-0.024452,0.03889,-0.01924,-0.011677,-0.020417,0.014937,-0.026773,0.022879,-0.02211,-0.002352,-0.007221,-0.008771,-0.005963,0.005788,-0.005936,-0.001119,0.009281,0.00632,-0.032762,-0.018522,0.014365,0.006239,0.007098,0.031014,-0.002293,0.005433,-0.001168,-0.01514,-0.013594,0.027634,0.018961,-0.01302,0.02217,0.035544,0.008864,0.020313,0.00212,-0.014756,0.030835,-0.004879,-0.03069,0.019472,-0.013574,-0.004114,0.01387,0.039049,0.03777,0.00324,0.01691,0.010527,0.042552,0.02334,-0.038049,-0.025515,-0.018268,-0.011539,0.014282,-0.006163,-0.015156,0.004478,0.018854,0.01609,0.005756,-0.023261,-0.001588,0.00746,-0.020568,-0.004637,-0.007909,-0.027837,0.002383,0.033475,-0.042774,-0.02559,0.020001,0.006792,-0.010391,0.034303,0.016683,0.000285,-0.01123,-0.022597,0.027094,-0.030294,-0.020465,0.016756,0.017633,0.013609,0.017103,0.017212,-0.012723,-0.006932,0.023111,0.002712,0.011501,-0.006055,0.020469,-0.024268,-0.002767,-0.036422,0.002019,0.002155,0.013061,-0.008971,-0.015068,0.000662,0.000248,0.014378,-0.040984,0.019256,-0.016069,0.004206,-0.014408,0.005932,0.034386,-0.0332,-0.018477,0.020917,0.039714,-0.015137,-0.005553,0.023126,0.001117,0.007508,0.003806,-0.029266,0.006011,0.005167,0.01324,0.013694,0.020257,0.004307,-0.018121,0.025035,0.001026,-0.013214,-0.016001,-0.009035,0.002485,-0.012465,-0.00023,-0.003857,-0.003385,0.008617,0.004842,0.002352,-0.024324,-0.008021,0.023252,0.051692,0.013226,-0.014763,-0.010242,0.016647,0.015237,0.006865,-0.009102,0.007311,0.029586,0.012973,0.001781,0.000875,0.009083,-0.000437,-0.006084,-0.010229,-0.011938,0.009713,0.017088,0.011857,-0.004196,0.000431,-0.022829,-0.012663,0.020776,-0.042662,0.012982,0.007005,0.006552,0.015705,-0.004542,-0.006312,-0.014624,0.000435,-0.019186,-0.019089,0.004377,-0.015651,-0.032356,-0.02179,-0.019216,0.023988,0.036593,0.010598,0.007846,-0.006808,0.006012,-0.00479,-0.009403,-0.023446,0.0,0.0,0.0,5.0,"Hair Removal, Nail Technicians, Beauty & Spas,...",4
3,--wIGbLEhlpl_UeAIyDmZQ,-0.025492,0.019936,-0.00978,0.008989,0.006961,-0.007019,-0.010135,-0.005605,0.006441,0.014264,-0.004153,-0.013604,0.029235,-0.009206,-0.012113,0.015941,-0.023862,-0.004637,0.023066,0.00594,0.001028,-0.011015,0.021822,-0.018162,0.019159,-0.017638,-0.007374,0.008013,0.013594,0.004332,-0.004372,-0.000418,0.007773,0.01345,0.021573,0.014685,0.00535,-0.003031,0.00416,0.030382,0.02766,-0.012948,-0.012107,0.018031,0.009482,0.003252,-0.011,-0.009094,0.024033,0.003302,0.016488,0.008022,-0.008947,-0.017594,-0.012784,0.010665,0.006593,0.008323,0.029553,0.009798,0.005283,0.005461,-0.004438,-0.015892,-0.007072,-0.002535,-0.006522,-0.010301,-0.014766,0.008775,0.000726,-0.025882,0.009679,0.000513,-0.010131,-0.011865,-0.030439,-0.002678,-0.000552,-0.011225,-0.020444,-0.007347,-0.014832,-0.016297,-0.005811,-0.000288,-0.005281,0.003615,-0.00144,0.001941,-0.00617,0.017623,0.013533,0.007779,0.004645,-0.009148,-0.006272,-0.000626,0.013421,-0.003977,0.013309,-0.006342,-0.013755,-0.008102,0.010545,-0.009056,-0.015485,0.00261,-0.005081,-0.00295,0.005824,0.012043,0.008401,0.000881,-0.007029,0.00192,0.009014,-0.003405,-0.014651,-0.002772,0.002065,0.003278,0.021487,0.016243,0.000225,-0.010734,0.006012,9e-05,0.010165,0.011665,0.01584,-0.015023,-0.002547,0.003871,0.022617,0.00047,0.021889,0.000163,-0.016224,0.002425,0.004522,0.021262,0.006554,0.007554,-0.0082,-0.003171,0.024925,-0.014101,-0.01307,0.00566,0.0008,0.005688,0.000536,-0.012746,-0.015648,0.019713,0.007474,-0.014455,-0.004201,0.004916,-0.010583,0.002872,0.008594,0.008084,-0.009078,-0.014606,0.000752,0.003925,-0.004628,0.003482,8.8e-05,0.022214,-0.002939,0.007237,-0.012251,0.005304,-0.023208,-0.00915,0.03032,-0.017759,-0.02806,-0.012879,0.031513,-0.010033,0.002677,0.018922,-0.015384,0.001332,0.022275,0.018732,0.002626,-0.007305,-0.022455,-0.022342,0.002293,-0.00297,0.012568,0.004259,0.002387,-0.023521,-0.014398,0.008061,-0.00713,-0.004541,-0.006768,0.008254,0.002175,0.013936,-0.011764,-0.005329,-0.003379,-0.024168,-0.009096,0.001449,-0.007915,-0.023049,-0.007399,-0.006107,0.029451,-0.035164,-0.013084,-0.0046,0.002214,-0.004734,0.000585,-0.010677,0.02521,-0.008014,-0.00974,-0.003139,4.9e-05,0.001037,0.016224,0.002176,-0.018329,-0.010924,0.006367,-0.020801,0.01834,-0.006561,-0.018475,0.001385,0.001504,-0.005265,0.021937,0.026431,-0.010178,-0.004649,0.014743,-0.006455,-0.000592,0.010101,0.005465,0.016865,0.023015,-0.018072,-0.002269,-0.00815,0.023424,0.008583,-0.008682,0.007142,0.000363,-0.00172,0.022179,-0.006348,-0.001533,0.013606,-0.020222,0.007255,0.002943,0.012594,0.005401,0.004614,0.00883,-0.00246,-0.001011,0.002633,0.009794,0.003525,0.006961,-0.010522,-0.005311,-0.001347,-0.019297,0.006033,-0.013824,-0.000982,0.003249,0.007947,0.003068,0.000488,0.005373,0.015897,0.00668,0.002806,0.013151,0.016076,0.000884,0.006555,0.666667,0.166667,3.0,3.833333,"Electronics, Professional Services, Local Serv...",14
4,-000aQFeK6tqVLndf7xORg,-0.027738,0.037412,0.007019,0.006407,0.004354,-0.005285,-0.010273,0.004754,0.006071,0.018834,0.012821,-0.021983,-0.002804,0.002488,-0.010496,0.004216,-0.032971,-0.017167,0.01614,0.002577,0.005492,-0.018144,0.031077,-0.024672,0.017904,-0.018859,0.003628,0.025943,0.012707,-0.002589,-0.002832,-0.007103,-0.005866,0.02241,0.013158,0.013854,0.003774,-1e-06,0.015016,0.041297,0.032408,0.003349,-0.026224,0.037172,0.005429,0.007529,-0.021226,-0.007134,0.033146,-0.00593,0.026232,0.000369,-0.011271,-0.015598,-0.010524,0.011385,0.013815,0.011965,0.040931,0.008883,0.000582,-0.003109,-0.004049,-0.022657,0.010396,-0.012263,-0.014782,-0.005251,-0.016635,-0.00386,0.004131,-0.024414,0.003984,-0.003817,-0.023049,-0.012739,-0.019521,0.009405,0.01038,-0.004879,-0.03375,-0.000219,0.00301,-0.017039,0.005289,-0.01204,0.001985,0.003886,0.009272,0.005672,-0.011414,0.020128,-0.004136,0.003024,0.035359,-0.025288,-0.012003,-0.004979,0.01354,-0.00376,0.016228,-0.002336,-0.008641,-0.008916,0.023729,-0.013717,-0.010763,-0.001542,-0.004317,-0.004265,-0.006154,-0.000357,0.007941,0.004022,-0.01004,-0.002143,0.004216,-0.008887,-0.003396,-0.015967,0.004146,-0.005276,0.022121,0.009692,-0.006947,-0.015748,0.003391,-0.006775,0.00755,0.009769,0.028079,-0.035267,-0.004585,0.008808,0.028737,0.018575,0.028816,-0.008854,-0.008475,0.000487,0.004927,0.02862,0.003009,0.00621,-0.004447,0.001013,0.030245,-0.015197,-0.010834,-0.002448,0.013291,0.012359,-0.00825,-0.021378,-0.013719,0.020623,0.009189,-0.017598,-0.009157,-0.004822,0.005702,-0.001931,0.000602,-0.005581,-0.008448,-0.026672,0.001478,-0.002303,0.000887,-0.008978,-0.007218,0.038184,-0.01691,0.017915,-0.011773,0.014865,-0.031074,-0.00053,0.042592,-0.015951,-0.023318,-0.027595,0.034925,-0.015474,0.011361,0.018718,-0.018819,-0.002008,0.041432,0.024755,0.001769,-0.020968,-0.012661,-0.023198,0.010154,-0.02107,0.012445,-0.012889,-0.000791,-0.023533,-0.017947,-0.001886,0.003658,0.000924,-0.017163,0.002154,-0.00072,0.018059,-0.012959,0.011919,-0.00214,-0.035065,-0.016817,0.018926,-0.018268,-0.023407,-0.018184,0.002846,0.036747,-0.052445,-0.015186,0.001086,-0.003678,0.006969,-0.004229,-0.020942,0.022191,-0.01646,-0.013144,-0.000691,0.007086,-0.001749,0.014215,0.005489,-0.016498,-0.015637,0.00889,-0.014859,0.019654,-0.021839,-0.021766,0.000612,-0.000924,-0.004216,0.023697,0.025991,-0.002733,-0.008449,0.012186,-0.004689,0.019142,0.005275,-0.004139,0.00995,0.020918,-0.021781,-0.012863,3.1e-05,0.028667,0.010145,-0.006074,0.007731,-9.3e-05,-0.006233,0.024673,-0.009111,-0.008371,0.019192,-0.004005,0.011155,0.005353,-0.002033,-0.011192,0.008456,0.01359,0.005049,-0.000745,0.009627,0.013333,0.009758,-0.006261,-0.018125,0.000727,-1e-05,-0.029143,0.003974,-0.018425,0.00472,-0.005071,0.005684,0.002536,0.010516,0.005451,0.036543,0.008084,-0.00268,0.008032,0.02726,-0.014735,-0.000253,0.666667,0.0,0.0,5.0,"Automotive, Auto Repair",7


In [30]:
def stringDFColToBinaryCols(df, series_name):
    # Create list of all categories
    all_cats = []
    for string in df[series_name]:
        string = str(string)
        cats = string.strip().replace(' ', '').split(',')
        for cat in cats:
            if cat not in all_cats:
                all_cats.append(cat)
    # Make binary for each cat for each row
    for cat in all_cats:
        df[cat] = df[series_name].str.strip().str.replace(' ', '').str.contains(cat)
        # This technique will have some problems. 'Golf' may appear in non-Golf categories (ie 'Disc Golf')
        # Can be fixed with regular expressions: ',Golf,' OR 'BOF Golf,' OR ',Golf EOF'
    
    return df, all_cats
        
all_features_business, all_cats = stringDFColToBinaryCols(all_features_business, 'categories')

  if sys.path[0] == '':


In [31]:
print(all_cats)

['Nightlife', 'SportsBars', 'Restaurants', 'Bars', 'American(Traditional)', 'Pizza', 'HairRemoval', 'NailTechnicians', 'Beauty&Spas', 'NailSalons', 'Waxing', 'DaySpas', 'Electronics', 'ProfessionalServices', 'LocalServices', 'ElectronicsRepair', 'Computers', 'Shopping', 'Automotive', 'AutoRepair', 'Chinese', 'EyelashService', 'TobaccoShops', 'VapeShops', 'CarDealers', 'UsedCarDealers', 'Dentists', 'GeneralDentistry', 'CosmeticDentists', 'PediatricDentists', 'Health&Medical', 'Tex-Mex', 'Mexican', 'Arts&Entertainment', 'Festivals', 'Food', 'FoodTrucks', 'FarmersMarket', 'Portuguese', 'Bakeries', 'ChickenShop', 'Barbeque', 'EventPlanning&Services', 'EventPhotography', 'Photographers', 'SessionPhotography', 'SkinCare', 'Antiques', 'IceCream&FrozenYogurt', 'Donuts', 'SpecialtyFood', 'WebDesign', 'GraphicDesign', 'Marketing', 'RecyclingCenter', 'Caterers', 'Southern', 'ComfortFood', 'Breakfast&Brunch', 'French', 'American(New)', 'Burgers', 'Sandwiches', 'Coffee&Tea', 'Brasseries', 'Gyms', '

In [32]:
print(
    len(all_features_business[all_features_business['Golf']==True]), 
    len(all_features_business[all_features_business['DiscGolf']==True]), 
)

61 1


In [33]:
print(all_features_business[all_features_business['DiscGolf']==True]['categories'].values)
print('Should not have a True value for Golf, but does. Problem to deal with in the future.')
print(all_features_business[all_features_business['DiscGolf']==True]['Golf'].values)

['Sporting Goods, Active Life, Bike Rentals, Disc Golf, Shopping']
Should not have a True value for Golf, but does. Problem to deal with in the future.
[True]


In [34]:
all_features_business.head()

Unnamed: 0,business_id,w2v0,w2v1,w2v2,w2v3,w2v4,w2v5,w2v6,w2v7,w2v8,w2v9,w2v10,w2v11,w2v12,w2v13,w2v14,w2v15,w2v16,w2v17,w2v18,w2v19,w2v20,w2v21,w2v22,w2v23,w2v24,w2v25,w2v26,w2v27,w2v28,w2v29,w2v30,w2v31,w2v32,w2v33,w2v34,w2v35,w2v36,w2v37,w2v38,w2v39,w2v40,w2v41,w2v42,w2v43,w2v44,w2v45,w2v46,w2v47,w2v48,w2v49,w2v50,w2v51,w2v52,w2v53,w2v54,w2v55,w2v56,w2v57,w2v58,w2v59,w2v60,w2v61,w2v62,w2v63,w2v64,w2v65,w2v66,w2v67,w2v68,w2v69,w2v70,w2v71,w2v72,w2v73,w2v74,w2v75,w2v76,w2v77,w2v78,w2v79,w2v80,w2v81,w2v82,w2v83,w2v84,w2v85,w2v86,w2v87,w2v88,w2v89,w2v90,w2v91,w2v92,w2v93,w2v94,w2v95,w2v96,w2v97,w2v98,w2v99,w2v100,w2v101,w2v102,w2v103,w2v104,w2v105,w2v106,w2v107,w2v108,w2v109,w2v110,w2v111,w2v112,w2v113,w2v114,w2v115,w2v116,w2v117,w2v118,w2v119,w2v120,w2v121,w2v122,w2v123,w2v124,w2v125,w2v126,w2v127,w2v128,w2v129,w2v130,w2v131,w2v132,w2v133,w2v134,w2v135,w2v136,w2v137,w2v138,w2v139,w2v140,w2v141,w2v142,w2v143,w2v144,w2v145,w2v146,w2v147,w2v148,w2v149,w2v150,w2v151,w2v152,w2v153,w2v154,w2v155,w2v156,w2v157,w2v158,w2v159,w2v160,w2v161,w2v162,w2v163,w2v164,w2v165,w2v166,w2v167,w2v168,w2v169,w2v170,w2v171,w2v172,w2v173,w2v174,w2v175,w2v176,w2v177,w2v178,w2v179,w2v180,w2v181,w2v182,w2v183,w2v184,w2v185,w2v186,w2v187,w2v188,w2v189,w2v190,w2v191,w2v192,w2v193,w2v194,w2v195,w2v196,w2v197,w2v198,w2v199,w2v200,w2v201,w2v202,w2v203,w2v204,w2v205,w2v206,w2v207,w2v208,w2v209,w2v210,w2v211,w2v212,w2v213,w2v214,w2v215,w2v216,w2v217,w2v218,w2v219,w2v220,w2v221,w2v222,w2v223,w2v224,w2v225,w2v226,w2v227,w2v228,w2v229,w2v230,w2v231,w2v232,w2v233,w2v234,w2v235,w2v236,w2v237,w2v238,w2v239,w2v240,w2v241,w2v242,w2v243,w2v244,w2v245,w2v246,w2v247,w2v248,w2v249,w2v250,w2v251,w2v252,w2v253,w2v254,w2v255,w2v256,w2v257,w2v258,w2v259,w2v260,w2v261,w2v262,w2v263,w2v264,w2v265,w2v266,w2v267,w2v268,w2v269,w2v270,w2v271,w2v272,w2v273,w2v274,w2v275,w2v276,w2v277,w2v278,w2v279,w2v280,w2v281,w2v282,w2v283,w2v284,w2v285,w2v286,w2v287,w2v288,w2v289,w2v290,w2v291,w2v292,w2v293,w2v294,w2v295,w2v296,w2v297,w2v298,w2v299,cool,funny,useful,stars,categories,review_count,Nightlife,SportsBars,Restaurants,Bars,American(Traditional),Pizza,HairRemoval,NailTechnicians,Beauty&Spas,NailSalons,Waxing,DaySpas,Electronics,ProfessionalServices,LocalServices,ElectronicsRepair,Computers,Shopping,Automotive,AutoRepair,Chinese,EyelashService,TobaccoShops,VapeShops,CarDealers,UsedCarDealers,Dentists,GeneralDentistry,CosmeticDentists,PediatricDentists,Health&Medical,Tex-Mex,Mexican,Arts&Entertainment,Festivals,Food,FoodTrucks,FarmersMarket,Portuguese,Bakeries,ChickenShop,Barbeque,EventPlanning&Services,EventPhotography,Photographers,SessionPhotography,SkinCare,Antiques,IceCream&FrozenYogurt,Donuts,SpecialtyFood,WebDesign,GraphicDesign,Marketing,RecyclingCenter,Caterers,Southern,ComfortFood,Breakfast&Brunch,French,American(New),Burgers,Sandwiches,Coffee&Tea,Brasseries,Gyms,ChildCare&DayCare,LeisureCenters,Fitness&Instruction,ActiveLife,HardwareStores,Home&Garden,RealEstate,Condominiums,Hotels,HomeServices,ShoppingCenters,Hotels&Travel,HairSalons,EthnicFood,Turkish,InternationalGrocery,TapasBars,ShippingCenters,PrintingServices,Massage,MassageTherapy,Reflexology,Buffets,Korean,SushiBars,Japanese,Cafes,Soup,Golf,Venues&EventSpaces,AutoDetailing,BodyShops,AutoCustomization,Towing,Trainers,WeightLossCenters,FoodDeliveryServices,FastFood,Delis,Ethiopian,Vegetarian,Painters,DrywallInstallation&Repair,StuccoServices,Orthodontists,Periodontists,OralSurgeons,Piercing,Tattoo,Chiropractors,Optometrists,Italian,Couriers&DeliveryServices,PublicServices&Government,SportingGoods,Fashion,GolfEquipment,Bikes,Ski&SnowboardShops,SportsWear,BikeRepair/Maintenance,Filipino,PetGroomers,Veterinarians,PetSitting,Pets,PetServices,AutoGlassServices,RealEstateServices,RealEstateAgents,Pakistani,Indian,CardioClasses,DanceStudios,ChickenWings,Cosmetics&BeautySupply,Desserts,Sewing&Alterations,Arts&Crafts,Wheel&RimRepair,Tires,AutoParts&Supplies,Colonics,Saunas,Doctors,MedicalSpas,Naturopathic/Holistic,MeditationCenters,Reiki,SpiritualShop,Orthopedists,SportsMedicine,Surgeons,Grocery,MedicalCenters,InteriorDesign,Rugs,FurnitureStores,HomeDecor,Mattresses,Women'sClothing,Men'sClothing,ShoeStores,JuiceBars&Smoothies,Acupuncture,LaserHairRemoval,FamilyPractice,UrgentCare,Thai,AsianFusion,Vietnamese,Laotian,HomeCleaning,CarpetCleaning,Accessories,Barbers,Gluten-Free,SpeechTherapists,PhysicalTherapy,OccupationalTherapy,Seafood,Steakhouses,Wholesalers,DiscountStore,PartySupplies,DepartmentStores,...,Gelato,TelevisionServiceProviders,Fences&Gates,MetalFabricators,ScubaDiving,Diving,DiveShops,WatchRepair,Halotherapy,CulturalCenter,Lakes,Macarons,CustomCakes,Aquariums,BusinessConsulting,BotanicalGardens,PaintStores,Moroccan,Persian/Iranian,DataRecovery,Cajun/Creole,PartyEquipmentRentals,CarBrokers,BootCamps,Musicians,PartyCharacters,MusicProductionServices,Cuban,PuertoRican,RVDealers,RVRental,Bowling,Venezuelan,SummerCamps,PetAdoption,RefinishingServices,PublicTransportation,CommercialTruckDealers,CommercialTruckRepair,FoodStands,CommercialRealEstate,OutletStores,Campgrounds,RVParks,Resorts,TalentAgencies,GutterServices,UsedBookstore,AdultEducation,StripteaseDancers,DanceSchools,Wallpapering,GoldBuyers,PawnShops,Videographers,Arabian,DonationCenter,TravelAgents,Basque,Spanish,WaterDelivery,WaterStores,Kosher,SkateParks,Izakaya,Poutineries,BailBondsmen,PressureWashers,Herbs&Spices,PhotoBoothRentals,CannabisDispensaries,Poke,ArtClasses,Teppanyaki,Oncologist,HotPot,Szechuan,IrishPub,CyclingClasses,MountainBiking,ShoeRepair,ShoeShine,Cupcakes,SafeStores,Hunting&FishingSupplies,RehabilitationCenter,BasketballCourts,CountryClubs,Endocrinologists,Neurologist,Irish,PetCremationServices,PersonalInjuryLaw,Divorce&FamilyLaw,BankruptcyLaw,Immunodermatologists,RetirementHomes,Cantonese,PoleDancingClasses,Rodeo,VinylRecords,Props,Delicatessen,EthnicGrocery,GuestHouses,YelpEvents,RestaurantSupplies,PatioCoverings,Masonry/Concrete,DigitizingServices,Framing,TestPreparation,PrivateTutors,Skydiving,HomeHealthCare,MedicalSupplies,Psychologists,ModernEuropean,Shutters,FabricStores,SouvenirShops,Russian,CheeseShops,CarWindowTinting,FireProtectionServices,FacePainting,Tuscan,Gastroenterologist,Butcher,Blood&PlasmaDonationCenters,German,Keys&Locksmiths,DUILaw,CriminalDefenseLaw,Investing,SmogCheckStations,CarInspectors,BrewingSupplies,HongKongStyleCafe,PublicMarkets,VehicleWraps,Airports,TeethWhitening,RVRepair,CountertopInstallation,MortuaryServices,SnowRemoval,EstatePlanningLaw,Wills,Trusts,&Probates,BusinessLaw,Airlines,Estheticians,Engraving,TrophyShops,CandleStores,PopcornShops,Fishing,TrailerDealers,BeachBars,BeachVolleyball,ArtificialTurf,PanAsian,DJs,Paintball,MiniGolf,GoKarts,Wigs,GolfLessons,Opera&Ballet,Jazz&Blues,Waffles,SolarInstallation,HomeEnergyAuditors,CannabisClinics,Uzbek,Prenatal/PerinatalCare,Hypnosis/Hypnotherapy,Eatertainment,Afghan,HealthInsuranceOffices,BeverageStore,Tiling,Sicilian,Bartenders,SpineSurgeons,Carpenters,Singaporean,SkilledNursing,Live/RawFood,SepticServices,PrintMedia,SkatingRinks,InternetCafes,WineTours,Boating,DemolitionServices,ProductDesign,3DPrinting,RoadsideAssistance,Himalayan/Nepalese,Officiants,Kickboxing,Boxing,CookingClasses,CookingSchools,PersonalChefs,Indonesian,AquariumServices,Brazilian,LaboratoryTesting,HockeyEquipment,SkateShops,RealEstatePhotography,Video/FilmProduction,Sandblasting,Perfume,PrivateJetCharter,SoulFood,Bookbinding,TanningBeds,RealEstateLaw,EmergencyPetHospital,BoatCharters,Rafting/Kayaking,BoudoirPhotography,Argentine,SocialClubs,OutdoorFurnitureStores,SouthAfrican,AcaiBowls,LactationServices,PlacentaEncapsulations,Observatories,Ukrainian,Planetarium,Cabaret,Hakka,Sailing,FireplaceServices,Gunsmith,UniversityHousing,IndoorPlaycentre,Embassy,OliveOil,Karate,LocalFishStores,MotorsportVehicleRepairs,Synagogues,GuitarStores,MobileDentRepair,Paddleboarding,Distilleries,PostOffices,PetTransportation,CurrencyExchange,PastaShops,Smokehouse,Hydrotherapy,Pop-upShops,Videos&VideoGameRental,OxygenBars,ExcavationServices,MobileHomeRepair,PickYourOwnFarms,Farms,Scottish,British,Passport&VisaServices,PianoBars,PoliceDepartments,WeddingChapels,RegistrationServices,FloatSpa,DayCamps,TrainStations,Prosthodontists,MedicalCannabisReferrals,Mongolian,Orthotics,ChristmasTrees,ClubCrawl,ScreenPrinting,HazardousWasteDisposal,EnvironmentalAbatement,LawnServices,HennaArtists,KidsHairSalons,Zoos,EmploymentLaw,DebtReliefServices,VehicleShipping,Hats,BusTours,DinnerTheater,EstateLiquidation,GeneralLitigation,Coffee&TeaSupplies,Soccer,TrailerRepair,Awnings,Pretzels,ArtSpaceRentals,EditorialServices,Honduran,Nicaraguan,Marinas,CareerCounseling,TeamBuildingActivities,TownCarService,PayrollServices,AerialFitness,CremationServices,GolfCartRentals,GolfCartDealers,LivestockFeed&Supply,UltrasoundImagingCenters,GrillingEquipment,LightingStores,Donairs,Falafel,CannabisTours,PersonalAssistants,AcneTreatment,Clowns,Magicians,InstallmentLoans,Prosthetics,ParentingClasses,FoodBanks,StreetArt,Buses,DialysisClinics,Newspapers&Magazines,Cideries,AutoSecurity,TrailerRental,TabletopGames,MedicalTransportation,SoftwareDevelopment,HolidayDecoratingServices,HolidayDecorations,Cambodian,BirdShops,LanguageSchools,SeniorCenters,OsteopathicPhysicians,PetHospice,TrafficSchools,TrafficTicketingLaw,Urologists,Taekwondo,FarmEquipmentRepair,Coffeeshops,Sunglasses,AnimalPhysicalTherapy,Rheumatologists,PartyBikeRentals,Bangladeshi,Vocational&TechnicalSchool,PetWasteRemoval,Pathologists,Aestheticians,PsychicMediums,TastingClasses,WineTastingClasses,BodyContouring,PumpkinPatches,GeneratorInstallation/Repair,AddictionMedicine,VacationRentalAgents,AppraisalServices,Snorkeling,Dominican,Gemstones&Minerals,Cryotherapy,Trinidadian,ImmigrationLaw,SupperClubs,Burmese,AssistedLivingFacilities,PianoServices,HomeownerAssociation,ScavengerHunts,WalkingTours,BeerTours,BartendingSchools,Carousels,ConciergeMedicine,Matchmakers,WellDrilling,SriLankan,Trains,FurnitureRental,Badminton,PetPhotography,TitleLoans,DanceWear,IVHydration,CPRClasses,BikeSharing,NannyServices,Cafeteria,MistingSystemServices,HorseBoarding,Recording&RehearsalStudios,DisabilityLaw,SocialSecurityLaw,HabilitativeServices,CSA,RetinaSpecialists,BoatDealers,HearingAidProviders,PowderCoating,CircuitTrainingGyms,RotisserieChicken,EnvironmentalTesting,BingoHalls,ValetServices,SugarShacks,Austrian,Races&Competitions,Anesthesiologists,HouseSitters,TikiBars,CarShareServices,Squash,VisitorCenters,CheeseTastingClasses,FleaMarkets,WorkersCompensationLaw,Mosques,HolisticAnimalCare,Firewood,FoodTours,VascularMedicine,Tableware,Hydroponics,HighFidelityAudioEquipment,BarCrawl,BounceHouseRentals,BuddhistTemples,DIYAutoShop,HerbalShops,LANCenters,ConveyorBeltSushi,Egyptian,ReligiousSchools,HairLossCenters,Armenian,MotorcycleGear,ElderCarePlanning,BoatTours,BusRental,RacingExperience,HomeStaging,ReligiousItems,Ziplining,Colombian,Rolfing,Haitian,WildlifeControl,ConceptShops,DiscGolf,Drive-InTheater,TaiChi,International,TenantandEvictionLaw,Doulas,Neurotologists,Belgian,EthicalGrocery,Shanghainese,Machine&ToolRental,FirstAidClasses,HealthRetreats,Empanadas,AirportTerminals,RoofInspectors,Airsoft,VocalCoach,TelevisionStations,IceDelivery,Gerontologists,CustomsBrokers,MotorsportVehicleDealers,FlightInstruction,Cheerleading,RockClimbing,BalloonServices,ATVRentals/Tours,MassageSchools,Pool&Billiards,PettingZoos,Toxicologists,WaterParks,AirportLounges,Australian
0,--I7YYLada0tSLkORTHb5Q,-0.007499,0.01587,-0.009091,-0.006852,-0.000165,0.024793,0.008644,-0.014792,0.009979,0.003942,0.0041,-0.031356,-0.010247,0.005387,0.011627,0.020621,0.015115,-0.008609,0.002743,-0.010083,-0.006879,-0.005206,0.009593,-0.012116,-0.00858,0.004146,0.017497,-0.010027,-0.002657,-0.003539,-0.003672,-0.003878,-0.019413,-0.004522,0.010478,-0.011138,-0.00152,0.001624,0.002736,-0.001654,0.014157,0.012092,-0.011507,6.5e-05,0.02149,0.012381,-0.010089,-5.6e-05,0.006849,0.012484,0.022499,0.007505,0.001161,0.006571,-0.011028,-0.008634,-0.003242,-0.005032,0.016984,-0.006729,-0.007712,0.016816,-0.009824,-0.01381,-0.018497,0.00596,-0.00389,0.005296,0.001982,0.005957,0.010111,0.011505,-0.009504,0.006016,0.000145,-0.001613,0.003529,-0.009751,-0.001159,0.00135,0.004055,-0.012402,-0.011015,0.001537,0.009277,-0.006952,0.018196,0.011549,-0.003539,0.003471,-0.001982,-0.006949,-0.004149,-0.001219,-0.018856,0.018914,-0.002559,0.020924,-0.02201,-0.016058,-0.019431,0.024031,0.010372,-0.024269,0.001566,0.003882,-0.016965,0.014903,-0.000333,0.005388,-0.004297,-0.022004,0.007121,-0.015847,0.010955,0.008343,0.002458,-0.004447,-0.022745,-0.016351,-0.015083,0.016055,-0.010969,0.004302,-0.002894,-0.003404,0.013917,0.003703,-0.026776,-0.001713,0.000421,-0.011638,0.00956,0.00662,0.002412,0.005148,-0.014819,-0.002384,0.013313,-0.011222,-0.011357,-0.015396,-0.004863,-0.00418,-0.007797,-0.008371,0.005455,0.006528,0.016183,0.014788,0.010562,0.015067,0.008007,0.003813,-0.009594,-0.019872,0.006658,-0.004194,-0.000726,-0.019882,0.003207,0.008335,-0.00353,-0.00852,0.003569,0.014994,0.012201,0.009086,0.00645,-0.01886,-0.002093,0.004624,0.013989,-0.011511,-0.000781,0.017287,0.005218,0.033981,-0.004402,-0.001438,0.023224,-0.008902,0.007502,-0.032931,0.001991,0.006755,-0.008278,-0.004967,-0.016611,0.00286,-0.027153,-0.00717,0.009309,-0.020619,0.000266,-0.001966,0.02771,-0.024209,-0.00121,-0.015006,-0.012201,0.003239,0.010807,-0.020057,-0.004619,-0.006609,0.018128,0.006096,0.006753,0.008909,0.002935,-0.01132,-0.015718,0.009725,0.021607,0.006392,0.020936,0.024687,0.011394,0.020137,-0.002377,0.007439,0.014349,-0.008634,-0.021233,0.001051,0.013405,-0.011792,0.01148,-0.024658,0.012449,0.006896,-0.006742,0.01825,0.01273,0.002604,0.005603,0.013587,-0.015682,-0.001813,0.000986,-0.001446,0.001805,0.000281,-0.006727,0.002159,-0.000811,0.015007,0.011092,0.00152,-0.012781,-0.009524,0.001715,0.013574,0.006682,0.004416,-0.016188,-0.00107,0.02416,-0.007177,0.003758,-0.002125,0.01329,0.009893,0.010383,0.014488,-0.018683,0.005121,0.012145,0.02989,-0.003379,0.013333,-0.012451,0.005107,0.010078,-0.003755,-0.002551,0.024659,-0.006757,-0.001592,0.012685,-0.008899,-0.01607,0.000193,-0.01233,-0.013128,0.018632,-0.013797,-0.017402,0.009316,8.5e-05,-0.000346,0.00496,-0.004972,0.007819,-0.001563,-0.007926,-0.00514,-0.002637,0.004487,0.352941,0.352941,0.823529,3.647059,"Nightlife, Sports Bars, Restaurants, Bars, Ame...",96,True,True,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
1,--U98MNlDym2cLn36BBPgQ,0.001321,0.020397,-0.005849,0.001654,-0.011622,0.018849,0.005494,-0.015423,0.008618,0.007287,0.007399,-0.023791,0.000994,0.002044,-0.006207,0.025938,0.002294,-0.010482,0.00371,-0.009301,0.002224,-0.012484,0.016786,-0.018724,-0.007389,-0.005424,0.009281,0.005434,0.004109,0.009846,-0.006699,-0.003709,-0.005755,0.005977,0.010481,0.001521,0.011386,-0.001846,0.000209,0.016299,0.013629,-0.016856,-0.009517,-0.002898,0.006809,0.00016,-0.011914,-0.001501,0.013424,0.005944,0.02339,0.00945,-0.001096,0.009402,-0.011847,0.001044,-0.00358,0.022533,0.02438,-0.000276,-0.007263,0.015476,-0.003131,-0.009706,-0.016027,-0.002222,-0.00582,0.006852,0.002422,0.001015,0.00543,0.010693,0.007849,0.026505,-3.7e-05,-0.01016,-0.02102,-0.002758,-0.007959,0.003969,-0.006524,-0.010391,-0.012534,0.002711,0.004077,-0.000967,0.00462,-0.011257,-0.009275,-0.001961,0.004368,-0.002741,0.002213,-0.001484,-0.013602,0.005373,-0.001872,0.006827,-0.008297,-0.008082,-0.005082,-0.002831,0.010845,-0.016601,0.001404,-0.009223,-0.009052,0.00487,0.004361,0.006548,0.001582,-0.003647,0.010116,-0.011908,-0.005118,-0.001061,0.017997,0.003667,-0.007199,-0.001373,-0.005872,0.006095,-0.000244,0.002786,-9e-05,0.002776,0.011398,0.012544,-0.00881,-0.003625,0.008335,-0.018189,0.021909,-0.006605,0.001991,0.010614,-0.008953,0.002743,-0.01012,-0.006879,-0.013393,-0.01036,-0.007192,0.006551,-0.003993,-0.012971,0.018007,-0.009865,0.018597,0.006843,-0.002549,0.011944,0.018651,0.008417,-0.007486,-0.004775,0.012657,0.003012,-0.000857,-0.006485,0.002627,0.008823,-0.004679,0.007138,-0.002575,0.009535,0.006989,0.009448,-0.002979,-0.016121,-0.009967,0.001045,0.007279,-0.008831,0.000771,0.034161,-0.006813,0.023669,0.013984,-0.013003,0.009409,0.002028,0.012934,-0.025267,-0.002378,-0.005684,0.007701,-0.00501,-0.002147,0.003824,-0.007827,0.008128,-0.00025,-0.017537,0.004941,0.001875,0.020632,-0.011595,-0.019383,-0.008837,-0.003078,0.006548,0.009257,-0.018428,-0.003066,-0.012686,0.017781,-0.006594,0.002138,0.004893,-0.004178,-0.018483,-0.021718,0.005285,0.002581,0.011907,0.019082,0.019257,0.010664,0.007317,-0.002005,-0.002601,0.004837,-0.006308,-0.01078,0.00151,0.01077,-0.018656,0.010772,-0.013842,0.006009,-0.010805,0.005438,0.015986,0.018399,0.002517,-0.007091,0.006768,-0.002893,-0.001234,0.005622,-0.006657,0.011644,-0.011799,0.006536,0.017361,-0.000455,0.008157,0.011874,-0.019345,-0.016238,-0.006056,-0.01017,0.007585,0.012319,-0.000601,-0.018317,0.007401,0.022479,-0.004789,-0.00281,-0.006522,0.011438,-0.008014,0.030327,0.009807,-0.019028,0.00948,-0.000841,0.019035,0.017731,0.025443,-0.008603,0.009164,0.003628,-0.001672,-0.011721,0.025006,0.016348,0.006681,0.005001,-0.010569,-0.006204,0.000571,-0.00706,0.000485,0.016113,-0.026779,-0.006788,0.016352,0.006836,-0.009109,-0.000684,-0.00864,-0.007533,0.009403,0.006655,0.006638,-8.9e-05,0.023506,0.0,0.0,2.0,3.0,"Pizza, Restaurants",4,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2,--j-kaNMCo1-DYzddCsA5Q,-0.031706,0.048197,0.002592,-0.014077,0.012455,0.021341,0.006416,0.023812,0.008884,-0.010029,0.026321,-0.035189,-0.029355,-0.003366,0.010322,0.010225,-0.013881,-0.014311,0.009284,-0.004531,0.015355,-0.012754,0.040054,-0.005349,-0.01496,-0.017134,0.015219,-0.005011,-0.005252,-0.018325,0.006987,-0.000175,-0.002259,-0.0018,0.046835,0.004774,-0.005338,0.002812,0.034205,-0.000922,0.041472,0.008703,-0.027083,0.031797,0.002945,-0.00315,-0.037086,0.023052,0.014693,0.006894,0.033686,-0.005447,-0.044836,-0.041553,0.007038,0.033026,0.018956,0.019412,0.040987,-0.002007,-0.044471,-0.004943,0.006001,-0.012231,-0.004839,-0.002395,-0.014001,0.043168,0.011314,-0.010929,0.003008,0.023616,0.035201,0.020261,0.005463,-0.003348,0.03982,0.01225,0.013604,0.017491,-0.029549,0.003226,0.009524,0.011124,0.003292,-0.032335,-0.002262,0.001353,0.020998,0.007193,0.003009,0.000301,0.002265,-0.005233,-0.002113,-0.00373,-0.009042,0.008372,0.012576,-0.011596,-0.024452,0.03889,-0.01924,-0.011677,-0.020417,0.014937,-0.026773,0.022879,-0.02211,-0.002352,-0.007221,-0.008771,-0.005963,0.005788,-0.005936,-0.001119,0.009281,0.00632,-0.032762,-0.018522,0.014365,0.006239,0.007098,0.031014,-0.002293,0.005433,-0.001168,-0.01514,-0.013594,0.027634,0.018961,-0.01302,0.02217,0.035544,0.008864,0.020313,0.00212,-0.014756,0.030835,-0.004879,-0.03069,0.019472,-0.013574,-0.004114,0.01387,0.039049,0.03777,0.00324,0.01691,0.010527,0.042552,0.02334,-0.038049,-0.025515,-0.018268,-0.011539,0.014282,-0.006163,-0.015156,0.004478,0.018854,0.01609,0.005756,-0.023261,-0.001588,0.00746,-0.020568,-0.004637,-0.007909,-0.027837,0.002383,0.033475,-0.042774,-0.02559,0.020001,0.006792,-0.010391,0.034303,0.016683,0.000285,-0.01123,-0.022597,0.027094,-0.030294,-0.020465,0.016756,0.017633,0.013609,0.017103,0.017212,-0.012723,-0.006932,0.023111,0.002712,0.011501,-0.006055,0.020469,-0.024268,-0.002767,-0.036422,0.002019,0.002155,0.013061,-0.008971,-0.015068,0.000662,0.000248,0.014378,-0.040984,0.019256,-0.016069,0.004206,-0.014408,0.005932,0.034386,-0.0332,-0.018477,0.020917,0.039714,-0.015137,-0.005553,0.023126,0.001117,0.007508,0.003806,-0.029266,0.006011,0.005167,0.01324,0.013694,0.020257,0.004307,-0.018121,0.025035,0.001026,-0.013214,-0.016001,-0.009035,0.002485,-0.012465,-0.00023,-0.003857,-0.003385,0.008617,0.004842,0.002352,-0.024324,-0.008021,0.023252,0.051692,0.013226,-0.014763,-0.010242,0.016647,0.015237,0.006865,-0.009102,0.007311,0.029586,0.012973,0.001781,0.000875,0.009083,-0.000437,-0.006084,-0.010229,-0.011938,0.009713,0.017088,0.011857,-0.004196,0.000431,-0.022829,-0.012663,0.020776,-0.042662,0.012982,0.007005,0.006552,0.015705,-0.004542,-0.006312,-0.014624,0.000435,-0.019186,-0.019089,0.004377,-0.015651,-0.032356,-0.02179,-0.019216,0.023988,0.036593,0.010598,0.007846,-0.006808,0.006012,-0.00479,-0.009403,-0.023446,0.0,0.0,0.0,5.0,"Hair Removal, Nail Technicians, Beauty & Spas,...",4,False,False,False,False,False,False,True,True,True,True,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
3,--wIGbLEhlpl_UeAIyDmZQ,-0.025492,0.019936,-0.00978,0.008989,0.006961,-0.007019,-0.010135,-0.005605,0.006441,0.014264,-0.004153,-0.013604,0.029235,-0.009206,-0.012113,0.015941,-0.023862,-0.004637,0.023066,0.00594,0.001028,-0.011015,0.021822,-0.018162,0.019159,-0.017638,-0.007374,0.008013,0.013594,0.004332,-0.004372,-0.000418,0.007773,0.01345,0.021573,0.014685,0.00535,-0.003031,0.00416,0.030382,0.02766,-0.012948,-0.012107,0.018031,0.009482,0.003252,-0.011,-0.009094,0.024033,0.003302,0.016488,0.008022,-0.008947,-0.017594,-0.012784,0.010665,0.006593,0.008323,0.029553,0.009798,0.005283,0.005461,-0.004438,-0.015892,-0.007072,-0.002535,-0.006522,-0.010301,-0.014766,0.008775,0.000726,-0.025882,0.009679,0.000513,-0.010131,-0.011865,-0.030439,-0.002678,-0.000552,-0.011225,-0.020444,-0.007347,-0.014832,-0.016297,-0.005811,-0.000288,-0.005281,0.003615,-0.00144,0.001941,-0.00617,0.017623,0.013533,0.007779,0.004645,-0.009148,-0.006272,-0.000626,0.013421,-0.003977,0.013309,-0.006342,-0.013755,-0.008102,0.010545,-0.009056,-0.015485,0.00261,-0.005081,-0.00295,0.005824,0.012043,0.008401,0.000881,-0.007029,0.00192,0.009014,-0.003405,-0.014651,-0.002772,0.002065,0.003278,0.021487,0.016243,0.000225,-0.010734,0.006012,9e-05,0.010165,0.011665,0.01584,-0.015023,-0.002547,0.003871,0.022617,0.00047,0.021889,0.000163,-0.016224,0.002425,0.004522,0.021262,0.006554,0.007554,-0.0082,-0.003171,0.024925,-0.014101,-0.01307,0.00566,0.0008,0.005688,0.000536,-0.012746,-0.015648,0.019713,0.007474,-0.014455,-0.004201,0.004916,-0.010583,0.002872,0.008594,0.008084,-0.009078,-0.014606,0.000752,0.003925,-0.004628,0.003482,8.8e-05,0.022214,-0.002939,0.007237,-0.012251,0.005304,-0.023208,-0.00915,0.03032,-0.017759,-0.02806,-0.012879,0.031513,-0.010033,0.002677,0.018922,-0.015384,0.001332,0.022275,0.018732,0.002626,-0.007305,-0.022455,-0.022342,0.002293,-0.00297,0.012568,0.004259,0.002387,-0.023521,-0.014398,0.008061,-0.00713,-0.004541,-0.006768,0.008254,0.002175,0.013936,-0.011764,-0.005329,-0.003379,-0.024168,-0.009096,0.001449,-0.007915,-0.023049,-0.007399,-0.006107,0.029451,-0.035164,-0.013084,-0.0046,0.002214,-0.004734,0.000585,-0.010677,0.02521,-0.008014,-0.00974,-0.003139,4.9e-05,0.001037,0.016224,0.002176,-0.018329,-0.010924,0.006367,-0.020801,0.01834,-0.006561,-0.018475,0.001385,0.001504,-0.005265,0.021937,0.026431,-0.010178,-0.004649,0.014743,-0.006455,-0.000592,0.010101,0.005465,0.016865,0.023015,-0.018072,-0.002269,-0.00815,0.023424,0.008583,-0.008682,0.007142,0.000363,-0.00172,0.022179,-0.006348,-0.001533,0.013606,-0.020222,0.007255,0.002943,0.012594,0.005401,0.004614,0.00883,-0.00246,-0.001011,0.002633,0.009794,0.003525,0.006961,-0.010522,-0.005311,-0.001347,-0.019297,0.006033,-0.013824,-0.000982,0.003249,0.007947,0.003068,0.000488,0.005373,0.015897,0.00668,0.002806,0.013151,0.016076,0.000884,0.006555,0.666667,0.166667,3.0,3.833333,"Electronics, Professional Services, Local Serv...",14,False,False,False,False,False,False,False,False,False,False,False,False,True,True,True,True,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,-000aQFeK6tqVLndf7xORg,-0.027738,0.037412,0.007019,0.006407,0.004354,-0.005285,-0.010273,0.004754,0.006071,0.018834,0.012821,-0.021983,-0.002804,0.002488,-0.010496,0.004216,-0.032971,-0.017167,0.01614,0.002577,0.005492,-0.018144,0.031077,-0.024672,0.017904,-0.018859,0.003628,0.025943,0.012707,-0.002589,-0.002832,-0.007103,-0.005866,0.02241,0.013158,0.013854,0.003774,-1e-06,0.015016,0.041297,0.032408,0.003349,-0.026224,0.037172,0.005429,0.007529,-0.021226,-0.007134,0.033146,-0.00593,0.026232,0.000369,-0.011271,-0.015598,-0.010524,0.011385,0.013815,0.011965,0.040931,0.008883,0.000582,-0.003109,-0.004049,-0.022657,0.010396,-0.012263,-0.014782,-0.005251,-0.016635,-0.00386,0.004131,-0.024414,0.003984,-0.003817,-0.023049,-0.012739,-0.019521,0.009405,0.01038,-0.004879,-0.03375,-0.000219,0.00301,-0.017039,0.005289,-0.01204,0.001985,0.003886,0.009272,0.005672,-0.011414,0.020128,-0.004136,0.003024,0.035359,-0.025288,-0.012003,-0.004979,0.01354,-0.00376,0.016228,-0.002336,-0.008641,-0.008916,0.023729,-0.013717,-0.010763,-0.001542,-0.004317,-0.004265,-0.006154,-0.000357,0.007941,0.004022,-0.01004,-0.002143,0.004216,-0.008887,-0.003396,-0.015967,0.004146,-0.005276,0.022121,0.009692,-0.006947,-0.015748,0.003391,-0.006775,0.00755,0.009769,0.028079,-0.035267,-0.004585,0.008808,0.028737,0.018575,0.028816,-0.008854,-0.008475,0.000487,0.004927,0.02862,0.003009,0.00621,-0.004447,0.001013,0.030245,-0.015197,-0.010834,-0.002448,0.013291,0.012359,-0.00825,-0.021378,-0.013719,0.020623,0.009189,-0.017598,-0.009157,-0.004822,0.005702,-0.001931,0.000602,-0.005581,-0.008448,-0.026672,0.001478,-0.002303,0.000887,-0.008978,-0.007218,0.038184,-0.01691,0.017915,-0.011773,0.014865,-0.031074,-0.00053,0.042592,-0.015951,-0.023318,-0.027595,0.034925,-0.015474,0.011361,0.018718,-0.018819,-0.002008,0.041432,0.024755,0.001769,-0.020968,-0.012661,-0.023198,0.010154,-0.02107,0.012445,-0.012889,-0.000791,-0.023533,-0.017947,-0.001886,0.003658,0.000924,-0.017163,0.002154,-0.00072,0.018059,-0.012959,0.011919,-0.00214,-0.035065,-0.016817,0.018926,-0.018268,-0.023407,-0.018184,0.002846,0.036747,-0.052445,-0.015186,0.001086,-0.003678,0.006969,-0.004229,-0.020942,0.022191,-0.01646,-0.013144,-0.000691,0.007086,-0.001749,0.014215,0.005489,-0.016498,-0.015637,0.00889,-0.014859,0.019654,-0.021839,-0.021766,0.000612,-0.000924,-0.004216,0.023697,0.025991,-0.002733,-0.008449,0.012186,-0.004689,0.019142,0.005275,-0.004139,0.00995,0.020918,-0.021781,-0.012863,3.1e-05,0.028667,0.010145,-0.006074,0.007731,-9.3e-05,-0.006233,0.024673,-0.009111,-0.008371,0.019192,-0.004005,0.011155,0.005353,-0.002033,-0.011192,0.008456,0.01359,0.005049,-0.000745,0.009627,0.013333,0.009758,-0.006261,-0.018125,0.000727,-1e-05,-0.029143,0.003974,-0.018425,0.00472,-0.005071,0.005684,0.002536,0.010516,0.005451,0.036543,0.008084,-0.00268,0.008032,0.02726,-0.014735,-0.000253,0.666667,0.0,0.0,5.0,"Automotive, Auto Repair",7,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False


In [35]:
# Clean

# Remove rows with NaNs
print('Before: ', len(all_features_business))
all_features_business = all_features_business.dropna(axis=0)
print('After:  ', len(all_features_business))

Before:  13943
After:   13922


In [36]:
# First, shuffle the dataframe 
# and reset the index. (Makes for easier handling of train/test later)
all_features_business = all_features_business.sample(frac=1).reset_index(drop=True)

# Create final y and x 
y_df = all_features_business[all_cats]
x_cols = [ele for ele in all_features_business.columns if ele not in all_cats+['categories', 'business_id']]
# May also want to remove from x_cols: 'cool', 'funny', 'useful', 'stars', 'categories', 'review_count' 
# May also want to drop rows that do not contain more than a threshold number of reviews (20?, 100?)
x_df = all_features_business[x_cols]

# Numpy arrays
x = x_df.values
y = y_df.values

# Classifier wants 1/0, not T/F
y = y.astype(int)

# Split into Train/Test sets
def splitSets(x, y, test_size=0.2):
    test_size_absolute = np.int(test_size * len(x))
    X_test, X_train = x[:test_size_absolute,:], x[test_size_absolute:,:]
    y_test, y_train = y[:test_size_absolute,:], y[test_size_absolute:,:]
    return X_train, X_test, y_train, y_test
    
test_size = 0.2
X_train, X_test, y_train, y_test = splitSets(x, y, test_size=test_size)

In [37]:
y_test

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 1, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

# Category Prediction

In [38]:
# Multilabel Classification
# RandomForestClassifier supports multilabel classification

# Most other classifiers will require use of 
    # sklearn.multioutput.MultiOutputClassifier to run a separate classifier model for each targe
    
from sklearn.ensemble import RandomForestClassifier

In [39]:
rfc = RandomForestClassifier(n_estimators=10, n_jobs=-1)

In [40]:
rfc.fit(X_train,y_train)

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=-1,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)

## Recall (and other classification metrics)

In our case we want a Recall = TPR (True Positive Rate) close to 1 since we want to Recall ALL correct categories. 

The only requirement we have for Precision is that it be less than 1. This is because we want some FPs (False Positives) since these are what WE ARE RECOMMENDING!!

In [41]:
from sklearn.metrics import classification_report

y_predict = rfc.predict(X_test)
print(classification_report(y_test, y_predict, target_names=all_cats))

                                precision    recall  f1-score   support

                     Nightlife       0.83      0.31      0.45       234
                    SportsBars       0.00      0.00      0.00        24
                   Restaurants       0.92      0.88      0.90       983
                          Bars       0.78      0.31      0.44       259
         American(Traditional)       0.00      0.00      0.00         0
                         Pizza       0.97      0.27      0.42       104
                   HairRemoval       0.25      0.03      0.06        59
               NailTechnicians       0.00      0.00      0.00         2
                   Beauty&Spas       0.97      0.60      0.75       250
                    NailSalons       0.94      0.47      0.62        64
                        Waxing       1.00      0.03      0.05        36
                       DaySpas       1.00      0.03      0.06        35
                   Electronics       1.00      0.04      0.07  

  'precision', 'predicted', average, warn_for)
  'recall', 'true', average, warn_for)
  'precision', 'predicted', average, warn_for)


In [42]:
from sklearn.metrics import recall_score 

recall_all_cats = recall_score(y_test, y_predict, average=None)
recall_all_cats

  'recall', 'true', average, warn_for)


array([0.31196581, 0.        , 0.87792472, ..., 0.        , 0.        ,
       0.        ])

## RECOMMENDATIONS

Look at the recommendations. Recommendations are the categories that are predicted, but not present in the data. 

In [43]:
y_proba = rfc.predict_proba(X_test)

In [44]:
print( len(y_proba), ' L')
print( len(y_proba[0]), ' W')
print( len(y_proba[0][0]), " D (0: False prob'y, 1: True prob'y)")

1090  L
2784  W
2  D (0: False prob'y, 1: True prob'y)


In [45]:
y_proba[0][0]

array([1., 0.])

In [46]:
reccs_binary = (y_test == 0) & (y_predict == 1)
reccs_binary

array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]])

In [47]:
all_cats_ser = pd.Series(data=all_cats)

In [48]:
all_cats_true = []
all_cats_recc = []
for biz in range(len(y_test)):
    cats_true = ', '.join(list(all_cats_ser[y_test[biz,:]==1]))
    all_cats_true.append(cats_true)
    
    cats_recc = ', '.join(list(all_cats_ser[reccs_binary[biz,:]==True]))
    all_cats_recc.append(cats_recc)

reccs_df = pd.DataFrame(data=all_cats_true, columns=['Labeled'])
reccs_df['Recommended'] = all_cats_recc
reccs_df.tail()

Unnamed: 0,Labeled,Recommended
2779,"Health&Medical, Massage, MassageTherapy, Chiro...",
2780,"HomeServices, LandscapeArchitects, Landscaping...",
2781,"Restaurants, Food, Vegetarian, Vegan, Live/Raw...",
2782,"ActiveLife, Golf",
2783,"ProfessionalServices, HomeServices, InternetSe...",


In [49]:
list(all_features_business.columns)

['business_id',
 'w2v0',
 'w2v1',
 'w2v2',
 'w2v3',
 'w2v4',
 'w2v5',
 'w2v6',
 'w2v7',
 'w2v8',
 'w2v9',
 'w2v10',
 'w2v11',
 'w2v12',
 'w2v13',
 'w2v14',
 'w2v15',
 'w2v16',
 'w2v17',
 'w2v18',
 'w2v19',
 'w2v20',
 'w2v21',
 'w2v22',
 'w2v23',
 'w2v24',
 'w2v25',
 'w2v26',
 'w2v27',
 'w2v28',
 'w2v29',
 'w2v30',
 'w2v31',
 'w2v32',
 'w2v33',
 'w2v34',
 'w2v35',
 'w2v36',
 'w2v37',
 'w2v38',
 'w2v39',
 'w2v40',
 'w2v41',
 'w2v42',
 'w2v43',
 'w2v44',
 'w2v45',
 'w2v46',
 'w2v47',
 'w2v48',
 'w2v49',
 'w2v50',
 'w2v51',
 'w2v52',
 'w2v53',
 'w2v54',
 'w2v55',
 'w2v56',
 'w2v57',
 'w2v58',
 'w2v59',
 'w2v60',
 'w2v61',
 'w2v62',
 'w2v63',
 'w2v64',
 'w2v65',
 'w2v66',
 'w2v67',
 'w2v68',
 'w2v69',
 'w2v70',
 'w2v71',
 'w2v72',
 'w2v73',
 'w2v74',
 'w2v75',
 'w2v76',
 'w2v77',
 'w2v78',
 'w2v79',
 'w2v80',
 'w2v81',
 'w2v82',
 'w2v83',
 'w2v84',
 'w2v85',
 'w2v86',
 'w2v87',
 'w2v88',
 'w2v89',
 'w2v90',
 'w2v91',
 'w2v92',
 'w2v93',
 'w2v94',
 'w2v95',
 'w2v96',
 'w2v97',
 'w2v98',
 'w2

In [50]:
reccs_df['categories'] = all_features_business['categories'].iloc[:len(reccs_df)]
reccs_df['business_id'] = all_features_business['business_id'].iloc[:len(reccs_df)]
reccs_df.tail()

Unnamed: 0,Labeled,Recommended,categories,business_id
2779,"Health&Medical, Massage, MassageTherapy, Chiro...",,"Massage Therapy, Health & Medical, Chiropractors",dHLFyEOj3CMi-pvrQanzxw
2780,"HomeServices, LandscapeArchitects, Landscaping...",,"Landscaping, Landscape Architects, Patio Cover...",9l09oYjDG0IQtZUitbhEAA
2781,"Restaurants, Food, Vegetarian, Vegan, Live/Raw...",,"Restaurants, Vegan, Live/Raw Food, Vegetarian",R1jJQi2yR44D_2ileqr8kA
2782,"ActiveLife, Golf",,"Active Life, Golf",aIqmckwjedtsvMKwPOEhhw
2783,"ProfessionalServices, HomeServices, InternetSe...",,"Television Service Providers, Home Services, P...",9gKE3S1PB67Ls6tYtoGVKQ


In [51]:
# This is where I need to pick up. I need to match the dataframes 
# so that I can match reviews etc and judge how well the recommender is doing

In [52]:
list(all_features_business['categories'].tail())

['Medical Centers, Doctors, Obstetricians & Gynecologists, Health & Medical',
 'Restaurants, Chinese',
 'Brazilian, Restaurants, Steakhouses',
 'Beer Bar, Bars, Sports Bars, Nightlife',
 'Restaurants, Thai']

In [53]:
len(reccs_df[reccs_df['Recommended']!=''])

273

In [54]:
reccs_df[reccs_df['Recommended']!=''].sort_values(by='Recommended')

Unnamed: 0,Labeled,Recommended,categories,business_id
429,"Beauty&Spas, Tanning",ActiveLife,"Tanning, Beauty & Spas",Cn-tvkG09gaSWGif-lWNqw
757,"ReligiousOrganizations, Churches",Arts&Entertainment,"Religious Organizations, Churches",99HjFNBu3p69JvDV2cpQFw
1409,"EventPlanning&Services, Venues&EventSpaces, Pu...",Arts&Entertainment,"Landmarks & Historical Buildings, Venues & Eve...",70ThpWAeQV8PEyJXJNid4Q
360,"Automotive, CarDealers",AutoRepair,"Car Dealers, Automotive",4oPqNwH6oULYu_X_L4A5xw
257,"Automotive, BodyShops, Towing",AutoRepair,"Towing, Body Shops, Automotive",ML2t7AfdnHNf9df6CT27FA
955,"Automotive, Wheel&RimRepair, Tires, AutoParts&...",AutoRepair,"Automotive, Tires, Wheel & Rim Repair, Auto Pa...",nqgeTj6bfIMY0v2J-vZa8A
379,"Automotive, Tires","AutoRepair, OilChangeStations","Tires, Automotive",sDhwuuRKr7Phz_XDYSleAQ
2730,"HomeServices, Flooring, Carpeting, CarpetInsta...",Automotive,"Home Services, Siding, Carpeting, Flooring, Ca...",0ZplgTLfGZvIKkG63ohfog
2536,"Hotels, Hotels&Travel, CarRental",Automotive,"Car Rental, Hotels & Travel",CAzLsMtoCZAuBeLnnOqmxw
827,"EventPlanning&Services, Hotels, Hotels&Travel,...",Automotive,"Tours, Limos, Hotels & Travel, Party Bus Renta...",GIzHC2PUrWC6_50C8R0CUA


In [55]:
dfreviews.columns

Index(['business_id', 'cool', 'date', 'funny', 'review_id', 'stars', 'text',
       'useful', 'user_id'],
      dtype='object')

In [56]:
pd.set_option('display.max_colwidth',2000)
dfreviews[dfreviews['business_id']=='KN0gPRzDvA6uVYims2KA0w']['text']

6680                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 This cute little place is really nearby my house and they have a lot of nice comfort food here   I ve only ever gone here for brunch and their portions are definitely big enough to share   You can add little things  ie  side of

In [57]:
# Given X_test row, find identical row in all_features_business and use that to find 'business_id'
all_features_business[x_cols].values == X_test[0]

array([[ True,  True,  True, ...,  True,  True,  True],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]])

In [58]:
X_test.shape

(2784, 305)

In [59]:
all_features_business[x_cols].values.shape

(13922, 305)