# Building a Simple GIS with Yelp API and Folium - Lab

## Introduction 

So far we have learned quite a bit about APIs, which have become a big buzzword in the tech industry. Think of an API as a protocol for making requests to, and communicating with, another server. We have seen how to mine Twitter for text data and apply basic frequency-based NLP techniques to gain some insight.

One of the key skills of a data scientist is the ability to learn how a new API works, how to go through its specific authentication process (OAuth, for example) and how to process the data structures returned in response to our requests. It is good practice to spend some time learning the API through the official documentation before sending any requests.

Along these lines, this lab requires you to learn another popular API (Yelp Fusion) by following the detailed online documentation provided. We shall build a simple Geographical Information System (GIS) using data from Yelp.

## Objectives
You will be able to: 
* Successfully sign up for the Yelp API
* Create HTTP requests to get data from the Yelp API
* Parse HTTP responses and perform data analysis on the data returned
* Create a simple geographical information system to view information about selected businesses at a given location


## The Yelp Fusion API - v3


### Point your browser to this [Yelp page](https://www.yelp.com/developers/v3/manage_app) and create an app in order to obtain the `client_id` and `api_key` tokens.

**NOTE:** You will be required to sign up using Google, Facebook, etc. if you don't already have an account.

<img src="yelp_app.png" width=500>



After registration, you'll be presented with your account information and the limits of your access. For Yelp, or any other API for that matter, make sure that you don't exceed your request quota; otherwise you may end up getting banned in some cases. Yelp shows this information as below:

<img src="quota.png" width=500>
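
One way to stay well within the quota while experimenting is to cache each response on disk, so that re-running a notebook cell does not trigger a new request. The following is only a minimal sketch: `cached_call`, its `fetch` argument, and the cache file path are hypothetical names introduced for illustration and are not part of Yelp's API or of any library used in this lab.

```python
import json
import os


def cached_call(cache_path, fetch):
    """Return the cached response if it exists, otherwise call fetch() once and store it.

    `fetch` is any zero-argument function that returns a JSON-serializable
    object, for example a function wrapping a single API request.
    """
    if os.path.exists(cache_path):
        # Reuse the stored response instead of spending another request
        with open(cache_path, 'r', encoding='utf-8') as f:
            return json.load(f)

    data = fetch()
    with open(cache_path, 'w', encoding='utf-8') as f:
        json.dump(data, f)
    return data
```

Wrapping an API call in a `lambda` and passing it as `fetch` means the request is only sent the first time the cell runs; later runs read the saved file.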

### Save your api_key and client_id in the variables below:

In [2]:
# Save your tokens in the following string variables
client_id = 'YOUR_CLIENT_ID'
api_key = 'YOUR_API_KEY'
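
Hard-coding tokens in a notebook makes it easy to publish them by accident. A common alternative is to read them from environment variables. This is a minimal sketch; the variable names `YELP_CLIENT_ID` and `YELP_API_KEY` and the helper `load_credentials` are assumptions chosen for illustration, not names that Yelp requires.

```python
import os


def load_credentials(env=os.environ):
    """Read the Yelp tokens from environment variables (hypothetical names)."""
    client_id = env.get('YELP_CLIENT_ID')
    api_key = env.get('YELP_API_KEY')
    if not client_id or not api_key:
        raise RuntimeError('Set YELP_CLIENT_ID and YELP_API_KEY before running the lab.')
    return client_id, api_key
```

With both variables set, `client_id, api_key = load_credentials()` can stand in for the two assignments above.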

## The `yelpapi` 

`yelpapi` is a pure Python implementation of the Yelp Fusion API (aka Yelp v3 API). The project describes itself as simple, fast, and robust to any changes Yelp may make to the API in the future. See the basic usage of this library in the [official GitHub repo](https://github.com/gfairchild/yelpapi). Other libraries can achieve the same thing, but for this lesson we shall use `yelpapi` for the sake of simplicity.

First, you must `pip install` the library:

In [2]:
!pip install yelpapi

Collecting yelpapi
  Downloading https://files.pythonhosted.org/packages/bb/07/f01be72829a3ce2da71bfde33d4bfe9ce5d8173a5a0470420fcb4dbacdd9/yelpapi-2.3.0-py2.py3-none-any.whl
Installing collected packages: yelpapi
Successfully installed yelpapi-2.3.0




### Import `yelpapi` into the working environment and pass in the api_key as shown in the GitHub repo

In [3]:
# Code here

from yelpapi import YelpAPI

yelp_api = YelpAPI(api_key, timeout_s = 3.0)


### The API request and response

Great, so we can now start making API calls using the format:
```python
response = yelp_api.search_query(term =<search term>, 
                                 location=<search location>, 
                                 sort_by='rating', 
                                 limit=50)
```
We can pass in many more arguments to refine our search. [Here is the complete list of options that the search API provides](https://www.yelp.com/developers/documentation/v3/business_search).
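
Under the hood, a business search is an HTTP GET request that carries the API key in an `Authorization: Bearer` header. The sketch below builds such a request with the standard library without sending it; `build_search_request` is a hypothetical helper written for illustration, and the URL is the Fusion v3 business search endpoint described in Yelp's documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = 'https://api.yelp.com/v3/businesses/search'


def build_search_request(api_key, **params):
    """Build (but do not send) a GET request for the business search endpoint."""
    url = SEARCH_URL + '?' + urlencode(params)
    return Request(url, headers={'Authorization': 'Bearer ' + api_key})


# 'YOUR_API_KEY' is a placeholder, not a working token
request = build_search_request('YOUR_API_KEY', term='Chinese food',
                               location='London', sort_by='rating', limit=50)
print(request.full_url)
```

Passing the object to `urllib.request.urlopen` would send it; `yelpapi` handles this step, and the parsing of the JSON body, for you.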

* Make an API request using simple criteria: a location and a term
* Save the response in a variable (the solution below calls it `results`)
* Inspect the type and contents of the response

In [6]:

# Pass in a specific term and location to make a call.
# For this example, we are looking for Chinese food in London.
term = 'Chinese food'
location = 'London'
results = yelp_api.search_query(term=term,
                                location=location,
                                sort_by='rating',
                                limit=50)




In [7]:
# Inspect the type and contents of the response


type(results)
print(results)

{'businesses': [{'id': 'h2cCoDNQOPd51HFwvuAGNg', 'alias': 'hakkasan-london-3', 'name': 'Hakkasan', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/IcS9yzJnweDO4DsfJ94jAw/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/hakkasan-london-3?adjust_creative=r9Cm6QJgzur1AjJrivvU0A&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=r9Cm6QJgzur1AjJrivvU0A', 'review_count': 219, 'categories': [{'alias': 'cantonese', 'title': 'Cantonese'}], 'rating': 4.0, 'coordinates': {'latitude': 51.5171482803943, 'longitude': -0.13180578932657}, 'transactions': [], 'price': '££££', 'location': {'address1': '8 Hanway Place', 'address2': '', 'address3': '', 'city': 'London', 'zip_code': 'W1T 1HD', 'country': 'GB', 'state': 'XGL', 'display_address': ['8 Hanway Place', 'London W1T 1HD', 'United Kingdom']}, 'phone': '+442079277000', 'display_phone': '+44 20 7927 7000', 'distance': 558.9650039747677}, {'id': 'QE5aa5N-dbfvafLsjn7TVg', 'alias': 'bamboo-flute-fitzrovia', 'name': '

### JSON... again!

We have a nice, nifty little return now! Printed out, the response looks like one long string, but what kind of data structure does it remind you of?

To start, there are the outer curly brackets:

#### {'businesses':

Hopefully you're thinking, "Hey, that's just like a Python dictionary!"

Then within that we have what appears to be a list of dictionaries:

#### {'id': 'h2cCoDNQOPd51HFwvuAGNg',

This response follows the JSON (JavaScript Object Notation) format that we've seen so many times before. `yelpapi` has already parsed it for us, so `results` is a regular Python dictionary that we can process further.
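
As a small illustration of how JSON text maps onto Python objects, the standard `json` module can parse a string shaped like the response above. The string below is a hand-trimmed stand-in built from two fields of the first business, not the full payload.

```python
import json

# Hand-trimmed stand-in for the raw text an API server sends back
raw_text = '{"businesses": [{"name": "Hakkasan", "rating": 4.0}]}'

parsed = json.loads(raw_text)

print(type(parsed))                 # the outer JSON object becomes a dict
print(type(parsed['businesses']))   # the JSON array becomes a list
print(parsed['businesses'][0]['name'])
```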

### Inspect the values for all the keys in the response

In [8]:
# Inspect the key-value pairs to understand the structure of the data

for key in results:
    print(key.values())

AttributeError: 'str' object has no attribute 'values'

Whoops, what's going on here!? Looping over a dictionary yields its keys, and each key here is a string, which has no `.values()` method. Recall from our earlier preview that the response has a hierarchy, so let's investigate further.

First, recall that the overall structure of the response is a dictionary. Let's look at its keys:

In [9]:
results.keys()

dict_keys(['businesses', 'total', 'region'])

Consult the Yelp API documentation to learn what value each key carries.
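
The `AttributeError` above arises because looping over a dictionary yields its keys, which here are plain strings. Below is a sketch of a loop that reaches the values instead, using a trimmed stand-in for `results` (the variable `sample_results` is illustrative and keeps only fields seen in the preview).

```python
# Trimmed stand-in for the real response dictionary
sample_results = {
    'businesses': [{'name': 'Hakkasan', 'rating': 4.0}],
    'region': {'center': {'longitude': -0.135955810546875,
                          'latitude': 51.51283552118349}},
}

for key in sample_results:
    # Each `key` is a str such as 'businesses', so key.values() cannot work
    print(key, type(key))

for key, value in sample_results.items():
    # .items() pairs every key with the value stored under it
    print(key, '->', type(value).__name__)

# Summary of which Python type sits under each top-level key
value_types = {key: type(value).__name__ for key, value in sample_results.items()}
```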

#### Continue to preview these keys to get a little better acquainted.

In [11]:
print('BUSINESS:', results['businesses'][0])

print('REGION:', results['region'])

print('TOTAL :',results['total'])

BUSINESS: {'id': 'h2cCoDNQOPd51HFwvuAGNg', 'alias': 'hakkasan-london-3', 'name': 'Hakkasan', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/IcS9yzJnweDO4DsfJ94jAw/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/hakkasan-london-3?adjust_creative=r9Cm6QJgzur1AjJrivvU0A&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=r9Cm6QJgzur1AjJrivvU0A', 'review_count': 219, 'categories': [{'alias': 'cantonese', 'title': 'Cantonese'}], 'rating': 4.0, 'coordinates': {'latitude': 51.5171482803943, 'longitude': -0.13180578932657}, 'transactions': [], 'price': '££££', 'location': {'address1': '8 Hanway Place', 'address2': '', 'address3': '', 'city': 'London', 'zip_code': 'W1T 1HD', 'country': 'GB', 'state': 'XGL', 'display_address': ['8 Hanway Place', 'London W1T 1HD', 'United Kingdom']}, 'phone': '+442079277000', 'display_phone': '+44 20 7927 7000', 'distance': 558.9650039747677}
REGION: {'center': {'longitude': -0.135955810546875, 'latitude': 51.51283552118349}}

This makes more sense. For our needs, we are mainly interested in `businesses`.
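
Since each entry in `businesses` is itself a nested dictionary, it can help to pull the handful of fields needed later into a flat record. The sketch below does this for the first business previewed above; `summarize_business` is a hypothetical helper, and the use of `.get()` for `price` rests on the assumption that this field may be missing for some businesses.

```python
def summarize_business(business):
    """Flatten the fields used in this lab into a single-level dictionary."""
    coordinates = business['coordinates']
    return {
        'name': business['name'],
        'rating': business['rating'],
        # Assumption: not every business carries a price, so fall back to None
        'price': business.get('price'),
        'latitude': coordinates['latitude'],
        'longitude': coordinates['longitude'],
        'address': ', '.join(business['location']['display_address']),
    }


# Trimmed copy of the first business shown in the preview
hakkasan = {
    'name': 'Hakkasan',
    'rating': 4.0,
    'price': '££££',
    'coordinates': {'latitude': 51.5171482803943, 'longitude': -0.13180578932657},
    'location': {'display_address': ['8 Hanway Place', 'London W1T 1HD', 'United Kingdom']},
}

summary = summarize_business(hakkasan)
print(summary['address'])
```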

### Print the names of the businesses and their ratings

In [13]:
# Code here 
for business in results['businesses']:
    print('Business Name', business['name'])
    print('Business Rating', business['rating'])
    print('\n')

Business Name Hakkasan
Business Rating 4.0


Business Name Bamboo Flute
Business Rating 4.5


Business Name Yauatcha
Business Rating 4.0


Business Name Lanzhou Noodle Bar
Business Rating 4.0


Business Name Hakkasan Mayfair
Business Rating 4.0


Business Name Silk Road
Business Rating 4.5


Business Name Hunan
Business Rating 4.5


Business Name Barshu
Business Rating 4.0


Business Name Bugis Street Brasserie
Business Rating 4.0


Business Name Gold Mine
Business Rating 4.0


Business Name Xi'an Impressions
Business Rating 4.5


Business Name Royal China
Business Rating 4.0


Business Name Chi Noodle
Business Rating 4.0


Business Name Joy King Lau
Business Rating 4.0


Business Name Sichuan Folk
Business Rating 4.0


Business Name Dragon Castle
Business Rating 4.0


Business Name Canton Element
Business Rating 4.5


Business Name Bun House
Business Rating 4.0


Business Name The Duck and Rice
Business Rating 4.0


Business Name Princess Garden Of Mayfair
Business Rating 4.0


Busine

Great, now we are getting somewhere. At this stage it is a good idea to store this information as a dataframe for further processing.

### Create a Pandas dataframe for contents of `businesses`
* Check the number of records in the dataframe
* Inspect the columns and head

In [18]:
import pandas as pd

column_names = results['businesses'][0].keys()

df = pd.DataFrame(results['businesses'], columns = column_names)
df.head()

| | id | alias | name | image_url | is_closed | url | review_count | categories | rating | coordinates | transactions | price | location | phone | display_phone | distance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | h2cCoDNQOPd51HFwvuAGNg | hakkasan-london-3 | Hakkasan | https://s3-media2.fl.yelpcdn.com/bphoto/IcS9yz... | False | https://www.yelp.com/biz/hakkasan-london-3?adj... | 219 | [{'alias': 'cantonese', 'title': 'Cantonese'}] | 4.0 | {'latitude': 51.5171482803943, 'longitude': -0... | [] | ££££ | {'address1': '8 Hanway Place', 'address2': '',... | 442079277000 | +44 20 7927 7000 | 558.965004 |
| 1 | QE5aa5N-dbfvafLsjn7TVg | bamboo-flute-fitzrovia | Bamboo Flute | https://s3-media3.fl.yelpcdn.com/bphoto/kYQZjH... | False | https://www.yelp.com/biz/bamboo-flute-fitzrovi... | 28 | [{'alias': 'chinese', 'title': 'Chinese'}] | 4.5 | {'latitude': 51.5228746, 'longitude': -0.1418939} | [] | £ | {'address1': '145 Cleveland Street', 'address2... | 442073872738 | +44 20 7387 2738 | 1189.508255 |
| 2 | sYwBQ7mJYhB35nn-_SZstQ | yauatcha-london-7 | Yauatcha | https://s3-media1.fl.yelpcdn.com/bphoto/G7Ydt9... | False | https://www.yelp.com/biz/yauatcha-london-7?adj... | 459 | [{'alias': 'dimsum', 'title': 'Dim Sum'}, {'al... | 4.0 | {'latitude': 51.5137076071076, 'longitude': -0... | [] | £££ | {'address1': '15-17 Broadwick Street', 'addres... | 442074948888 | +44 20 7494 8888 | 112.59841 |
| 3 | pdFiFtol9YI__9ROOXUIYA | lanzhou-noodle-bar-london | Lanzhou Noodle Bar | https://s3-media4.fl.yelpcdn.com/bphoto/OMre4T... | False | https://www.yelp.com/biz/lanzhou-noodle-bar-lo... | 312 | [{'alias': 'chinese', 'title': 'Chinese'}, {'a... | 4.0 | {'latitude': 51.5116034713013, 'longitude': -0... | [] | £ | {'address1': '33 Cranbourne Street', 'address2... | 442074674546 | +44 20 7467 4546 | 578.474355 |
| 4 | chEEcQbc8PbidTeXK34H9g | hakkasan-mayfair-london-2 | Hakkasan Mayfair | https://s3-media2.fl.yelpcdn.com/bphoto/LX_W20... | False | https://www.yelp.com/biz/hakkasan-mayfair-lond... | 116 | [{'alias': 'chinese', 'title': 'Chinese'}] | 4.0 | {'latitude': 51.5103202323384, 'longitude': -0... | [] | ££££ | {'address1': '17 Bruton Street', 'address2': '... | 442079071888 | +44 20 7907 1888 | 681.445072 |


This is fantastic. We have successfully learned a new API, made requests to it, received and studied the response, and stored the results in a dataframe, so we can now enjoy all the goodness of Pandas. That's quite a bit of data engineering.
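
Two quick follow-ups on the dataframe are checking its size and unpacking the nested `coordinates` dictionaries into plain numeric columns. The sketch below uses a two-row stand-in (`sample_df`, built from values shown in the table above) so that it runs without an API call; the new column names `latitude` and `longitude` are our own choice.

```python
import pandas as pd

# Two-row stand-in for results['businesses'], trimmed to a few fields
sample_businesses = [
    {'name': 'Hakkasan', 'rating': 4.0,
     'coordinates': {'latitude': 51.5171482803943, 'longitude': -0.13180578932657}},
    {'name': 'Bamboo Flute', 'rating': 4.5,
     'coordinates': {'latitude': 51.5228746, 'longitude': -0.1418939}},
]
sample_df = pd.DataFrame(sample_businesses)

# Number of records (rows) in the dataframe
print(len(sample_df))

# Unpack each coordinates dictionary into two numeric columns
sample_df['latitude'] = sample_df['coordinates'].apply(lambda c: c['latitude'])
sample_df['longitude'] = sample_df['coordinates'].apply(lambda c: c['longitude'])

print(sample_df[['name', 'latitude', 'longitude']])
```

Applied to the real `df`, the same two `apply` lines would give columns that can be passed to a mapping library without further unpacking.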

### Visualize the location from search query
The `region` key in the response carries the geographical information for the region searched.
* Get the latitude / longitude information from `region`
* Create a folium map with these coordinates
* Use a `zoom_start` value of 13

In [20]:
!pip install folium

Collecting folium
  Downloading https://files.pythonhosted.org/packages/55/e2/7e523df8558b7f4b2ab4c62014fd378ccecce3fdc14c9928b272a88ae4cc/folium-0.7.0-py3-none-any.whl (85kB)
Collecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/63/36/1c93318e9653f4e414a2e0c3b98fc898b4970e939afeedeee6075dd3b703/branca-0.3.1-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.3.1 folium-0.7.0




In [47]:
m_long = results['region']['center']['longitude']
m_lat = results['region']['center']['latitude']

In [52]:
import folium

# Start with an empty map centered on the search region
m = folium.Map(location=[m_lat, m_long], zoom_start=13, width=750, height=500)

# Display the map
m



Expected Output:
![](london.png)

Nice. We can now extract the coordinate information for each business and plot it on this map.

### Get the business coordinates from dataframe for each business and plot on the map above

In [59]:
m = folium.Map(location=[m_lat, m_long], zoom_start=13, width=750, height=500)

# Add a marker for every business, with its name as the popup text
for name, coordinate in zip(df['name'], df['coordinates']):
    folium.Marker([coordinate['latitude'], coordinate['longitude']],
                  popup=name,
                  tooltip='click for info').add_to(m)

m

Expected output:
![](markers.png)

Cool, so we have everything in place, but the visualization is still not very *informative*, so to speak. You can't tell at a glance which marker represents which business, and other information such as rating, cost, and links to user reviews is still not visible. So it's geographical, but not exactly an information system yet, as you can't make any decisions based on it. Here's an example of what it could look like:
![](out.png)


For this, you need to understand `folium.Popup`, which lets you click on a marker to show a popup window. This window acts much like an HTML page, so you can easily format the information you present in it using the following values:
* The official business logo/image:  `image_url`
* Name of the Business: `name`
* Price (how expensive): `price`
* Links to user reviews on Yelp: `url`

Doing this in HTML is not required, so we recommend that you first try putting basic information in the popups as plain text. As a next step, you can switch to HTML to make the popups more visually appealing.
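
One way to keep the popup markup manageable is to build it in a small function and escape the text that comes from the API, since business names may contain characters that are special in HTML (the apostrophe in "Xi'an Impressions", for instance). This is a sketch only: `build_popup_html` is a hypothetical helper, and the layout is just one of many possible.

```python
from html import escape


def build_popup_html(name, rating, price, url):
    """Return an HTML snippet with the business name, rating, price and review link."""
    return (
        '<center>'
        f'<h2>{escape(str(name))}</h2>'
        f'<b>Rating: {escape(str(rating))}</b><br>'
        f'<b>Price: {escape(str(price))}</b><br>'
        f'<a href="{escape(str(url))}" target="_blank">Read Reviews</a>'
        '</center>'
    )


# Values taken from the first business in the response (URL without its query string)
snippet = build_popup_html('Hakkasan', 4.0, '££££',
                           'https://www.yelp.com/biz/hakkasan-london-3')
print(snippet)
```

The returned string can be wrapped in a `folium.IFrame` and handed to `folium.Popup`, in the same way as the popup examples linked below.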

### Attempt to recreate the interactive visualization shown above.

Here's a good resource with code examples on [how to create folium popups](https://github.com/python-visualization/folium/blob/master/examples/Popups.ipynb)

In [118]:
from folium import IFrame

In [158]:
# Code here
m = folium.Map(location=[m_lat, m_long], zoom_start=13, width=750, height=500)

for i in range(len(df)):
    coordinate = df['coordinates'][i]

    # Popup content: name, rating, price and a link to the reviews on Yelp
    html = ('<center>'
            '<h2>' + df['name'][i] + '</h2>'
            '<b>Rating: ' + str(df['rating'][i]) + '</b><br>'
            '<b>Price: ' + str(df['price'][i]) + '</b><br>'
            '<b><a href="' + str(df['url'][i]) + '" target="_blank">Read Reviews</a></b>'
            '</center>')

    # Wrap the HTML in an IFrame so the popup renders it as a small page
    iframe = folium.IFrame(html, width=400, height=150)
    popup = folium.Popup(iframe, max_width=3000)
    marker = folium.Marker(location=[coordinate['latitude'], coordinate['longitude']],
                           popup=popup,
                           icon=folium.Icon(icon='info-sign'))
    m.add_child(marker)

m

Wow! An interactive Geographical Information System backed by live data through API calls.

<img src="star.jpg" width=300>

### More APIs to Check Out

* Google Maps
* Twitter
* AWS
* IBM's Watson
* Yelp

## Summary 

In this lab, we learned how to use the Yelp API: authenticating, making calls, understanding the responses, and creating interactive geographical visualizations in Folium. We encourage you to revisit this lab once you have studied some important machine learning algorithms, to make predictions, find similarities, group/cluster businesses, or classify them based on user criteria.