# Assignment 2
## Lauren Gripenstraw

In this notebook, I create an interactive dashboard using data on Santa Barbara and Goleta restaurants from the Yelp API.

First, I import the necessary packages.

In [None]:
import pandas as pd
import numpy as np
import matplotlib

from ipywidgets import widgets, interactive

The following is my function for reading in the data from the Yelp API. It was modified from the code provided on GitHub by Yelp for this purpose.

In [None]:



from __future__ import print_function

import argparse
import json
import pprint
import requests
import sys
import urllib



try:
    from urllib.error import HTTPError
    from urllib.parse import quote
    from urllib.parse import urlencode
except ImportError:
    from urllib2 import HTTPError
    from urllib import quote
    from urllib import urlencode


API_KEY= '8DTWnf5EuPNKEzu13U1225FpWzLryVmZbbu58HpF52S-kAQUMz1NsLz6GR_M5qxGHqkh2-d9fDIxuXAnNYahODuxnsri_PpGeN5u7OBWr56MrDOJPiQICLrJ_mXfWnYx'





API_HOST = 'https://api.yelp.com/v3'
SEARCH_PATH = '/businesses/search'
BUSINESS_PATH = '/businesses/'  # API_HOST already ends in /v3; not used below




def get_yelp_data(offset):
    """Request one page of restaurant search results, starting at `offset`."""
    url = '{0}{1}'.format(API_HOST, quote(SEARCH_PATH.encode('utf8')))
    headers = {
        'Authorization': 'Bearer %s' % API_KEY
    }

    # All query parameters go here so that requests handles the URL encoding.
    params = {"term": "restaurant",
              "location": "Santa Barbara Goleta",
              "limit": 50,
              "sort": 0,
              "offset": offset
              }

    print(u'Querying {0} ...'.format(url))

    response = requests.request('GET', url, params=params, headers=headers)
    return response.json()
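
To see how the search parameters end up in the request URL without calling the API, here is a minimal offline sketch using only the standard library. `build_search_url` is a hypothetical helper (it is not part of Yelp's sample code), and the parameter values simply mirror the ones used in this notebook.

```python
from urllib.parse import urlencode

API_HOST = 'https://api.yelp.com/v3'
SEARCH_PATH = '/businesses/search'


def build_search_url(offset, limit=50):
    """Return the search URL for one page of results (hypothetical helper)."""
    params = {
        "term": "restaurant",
        "location": "Santa Barbara Goleta",
        "limit": limit,
        "offset": offset,
    }
    # urlencode escapes spaces and other reserved characters.
    return API_HOST + SEARCH_PATH + "?" + urlencode(params)


print(build_search_url(50))
```

Each call differs only in `offset`, which is what lets the loop further down step through the results one page at a time.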


In [None]:
i = 0

The following is a loop to read in as much data as Yelp would allow me.

In [None]:
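while i <= 150:
    new_yelp_data = get_yelp_data(i)
    if i == 0:
        yelp_data_all = new_yelp_data
    else:
        # Append this page's businesses. Merging the two dicts would
        # overwrite the "businesses" list and keep only the last page.
        yelp_data_all["businesses"].extend(new_yelp_data.get("businesses", []))
    i += 50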
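
Paging through results can also be tried out without network access. The sketch below is only an illustration: `fetch_page` is a made-up stand-in for `get_yelp_data` that returns fake records, and `collect_businesses` is a hypothetical helper that gathers the "businesses" list from every page.

```python
def fetch_page(offset, page_size=50, total=120):
    """Stand-in for get_yelp_data: returns a fake page of results."""
    ids = range(offset, min(offset + page_size, total))
    return {"businesses": [{"id": n} for n in ids], "total": total}


def collect_businesses(fetch, max_offset=150, page_size=50):
    """Call fetch once per offset and concatenate the 'businesses' lists."""
    businesses = []
    for offset in range(0, max_offset + 1, page_size):
        page = fetch(offset)
        # A page past the end of the results contributes an empty list.
        businesses.extend(page.get("businesses", []))
    return businesses


all_businesses = collect_businesses(fetch_page)
print(len(all_businesses))
```

Extending a single list keeps every page, whereas merging the response dicts would keep only the most recent "businesses" entry.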

Here I convert the "businesses" component of the gathered data from a list of dicts to a pandas DataFrame.

In [None]:
yelp_data = pd.DataFrame.from_dict(yelp_data_all["businesses"])

Just checking that everything is correct.

In [None]:
yelp_data.head()

Here I select only the columns I am interested in: category, location, name, price, and rating.

In [None]:
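# Select by name rather than position so the result does not depend on column order.
yelp_data_clean = yelp_data[["categories", "location", "name", "price", "rating"]].copy()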

Checking for accuracy again.

In [None]:
yelp_data_clean.head()

Some rows have many entries for "categories." Here I first take the first dict of category alias and title from every row, assuming that to be the most relevant category. Next, from that dict I select the title entry.

In [None]:
yelp_data_clean['categories'] = [d[0] for d in yelp_data_clean.categories]

In [None]:
yelp_data_clean["categories"] = [d["title"] for d in yelp_data_clean.categories]
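
The two steps above assume every row has at least one category. As a hedged alternative, the sketch below does both steps at once and returns `None` for an empty list. `first_category_title` is a hypothetical helper, and the sample rows are made up for illustration.

```python
def first_category_title(categories):
    """Return the title of the first category dict, or None if there is none."""
    if not categories:
        return None
    return categories[0].get("title")


# Made-up rows in the same shape as the Yelp "categories" field.
rows = [
    [{"alias": "mexican", "title": "Mexican"}, {"alias": "tacos", "title": "Tacos"}],
    [],
]
titles = [first_category_title(c) for c in rows]
print(titles)  # ['Mexican', None]
```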

In [None]:
yelp_data_clean.head()

Here I am selecting just the city from each row's location dict.

In [None]:
yelp_data_clean["location"] = [d["city"] for d in yelp_data_clean.location]

In [None]:
yelp_data_clean.head()

A few restaurants that are not in Santa Barbara or Goleta made it into the data, I believe due to Yelp's default radius criteria. I tried to modify my get_yelp_data function to eliminate these, but was unsuccessful. Here I am simply removing them because they do not fit into the categories I will use in my menus.

In [None]:
# .copy() makes the filtered frame independent, so later column assignments do not warn.
yelp_data_clean = yelp_data_clean.loc[yelp_data_clean["location"].isin(["Santa Barbara", "Goleta"])].copy()
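
Before dropping rows, it can help to see which cities would be removed. This is a small sketch on made-up data (the restaurant names and the extra city are invented for illustration) that uses the same `isin` filter as above.

```python
import pandas as pd

# Made-up rows; only the "location" column matters here.
cities = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "location": ["Santa Barbara", "Goleta", "Montecito", "Santa Barbara"],
})

keep = ["Santa Barbara", "Goleta"]
mask = cities["location"].isin(keep)

# Count the rows that fall outside the two cities of interest.
dropped = cities.loc[~mask, "location"].value_counts()
kept = cities.loc[mask].copy()

print(dropped.to_dict())
print(len(kept))
```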

In [None]:
yelp_data_clean.head()

Here I capitalize the column names so that the menu selections look better than all-lowercase labels.

In [None]:
yelp_data_clean.columns = ['Categories', 'Location', 'Name', 'Price', 'Rating']

I encountered an error when building the plots because the price symbol '$$' was mistaken for LaTeX, so here I re-code the price symbols as numeric labels.

In [None]:
yelp_data_clean['Price'] = yelp_data_clean['Price'].map({'$': '1', '$$': '2', '$$$': '3', '$$$$': '4'})
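
One thing worth knowing about `Series.map` with a dict is that any value not found in the dict becomes missing. The sketch below shows this on a made-up series that includes a restaurant with no price listed; the sample values are assumptions for illustration only.

```python
import pandas as pd

price_codes = {'$': '1', '$$': '2', '$$$': '3', '$$$$': '4'}

# Made-up prices, including one missing value.
prices = pd.Series(['$', '$$', None, '$$$$'])
coded = prices.map(price_codes)

# Values that are not keys of price_codes come back as missing.
print(coded.tolist())
print(int(coded.isna().sum()))
```

Rows with a missing price are left out of `value_counts()` by default, so they would not appear in a price bar chart.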


In [None]:
yelp_data_clean.head()

Here are my widgets, which create the drop-down menu selectors for my plots. The first is a location menu which contains "Santa Barbara", "Goleta", or "All". The second menu is a features menu which selects which feature to display. The choices are "Categories", "Price", "Rating", and "Average Rating per Category."

In [None]:
location = widgets.Dropdown(
    options=['All'] + list(yelp_data_clean['Location'].unique()),
    value='All',
    description='Location:',
)

In [None]:
features = widgets.Dropdown(
    options = list(yelp_data_clean[['Categories', 'Price', 'Rating']]) + ['Average Rating per Category'],
    value = 'Categories',
    description='Feature:',
)

This is a plotting function that takes two parameters indicating the selections from the two drop down menus, with if and else statements to determine the correct plot to draw.

In [None]:
def plotit(location, features):
    # Work on a copy so the filter below never touches the original data.
    yelp2 = yelp_data_clean.copy()
    if location != 'All':
        yelp2 = yelp2[yelp2.Location == location]

    if features == 'Categories':
        yelp2['Categories'].value_counts().plot(kind='bar')
    elif features == 'Price':
        yelp2['Price'].value_counts().plot(kind='bar')
    elif features == 'Rating':
        yelp2.groupby('Rating').size().plot(kind='bar')
    else:
        # 'Average Rating per Category'
        yelp2.groupby('Categories')['Rating'].mean().plot(kind='bar')
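
To check the numbers behind each plot without drawing anything, the same branching can be written so that it returns the aggregated Series. This is a sketch only: `summarize` is a hypothetical helper, and `demo` is made-up data with the same column names as `yelp_data_clean`.

```python
import pandas as pd


def summarize(df, location, feature):
    """Return the Series that would be plotted for a given menu selection."""
    if location != 'All':
        df = df[df['Location'] == location]
    if feature in ('Categories', 'Price'):
        return df[feature].value_counts()
    if feature == 'Rating':
        return df.groupby('Rating').size()
    # 'Average Rating per Category'
    return df.groupby('Categories')['Rating'].mean()


# Made-up rows for illustration.
demo = pd.DataFrame({
    'Categories': ['Mexican', 'Mexican', 'Pizza'],
    'Location': ['Goleta', 'Santa Barbara', 'Goleta'],
    'Price': ['1', '2', '2'],
    'Rating': [4.0, 5.0, 3.5],
})

avg = summarize(demo, 'All', 'Average Rating per Category')
print(avg.to_dict())
```

Calling `.plot(kind='bar')` on the returned Series would give the same kind of chart as the plotting function.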

Here is where I call the function and the widgets to make an interactive dashboard.

In [None]:
interactive(plotit, location = location, features = features)
