# Building a Simple GIS with Yelp API and Folium - Lab

## Introduction 

We have learned quite a bit about APIs, which are now a big buzzword in the tech industry. Think of an API as a protocol for making requests to, and communicating with, another server. We have seen how to mine Twitter for text data and apply basic frequency-based NLP techniques to gain some insight.

One of the key skills of a data scientist is the ability to learn how a new API works: how to go through its specific authentication process (for example, OAuth) and how to process the data structures returned in response to our requests. It is good practice to spend some time learning the API through the official documentation before sending requests.

Along these lines, this lab requires you to learn another popular API (Yelp Fusion) by following its detailed online documentation. We shall build a simple Geographical Information System (GIS) using data from Yelp.

## Objectives
You will be able to: 
* Successfully sign up for the Yelp API
* Create HTTP requests to get data from the Yelp API
* Parse HTTP responses and perform data analysis on the data returned
* Create a simple geographical information system to view information about selected businesses at a given location


## The Yelp Fusion API - v3


### Point your browser over to this [Yelp page](https://www.yelp.com/developers/v3/manage_app) and create an app in order to obtain `client_id` and `api_key` tokens.

**NOTE:** You will be required to sign up (using Google, Facebook, etc.) if you don't already have an account.

<img src="yelp_app.png" width=500>



After registration, you'll be presented with your account information and the limits of your access. For Yelp, or any other API for that matter, you need to make sure that you don't exceed your request quota; otherwise, you may end up getting banned in some cases. Yelp shows this information as below:

<img src="quota.png" width=500>

### Save your api_key and client_id in the variables below:

In [1]:
# Save your tokens in the following string variables
client_id = 'xI9YHnrKZQ7uuTG3EwYVTw'
api_key = 'mmmRxOAQlnnXPbZjUUnHfQdC9uB5gQ1qSNLiUZKN3SF8aOtVcnxh0Bn6IoLh7MPu4oAKinFMZ8mv5yyISpb6UnZs5twn6jBPKl3lGB3rfxfVss-scq2HnEpIzn9UXHYx'
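
Hardcoding tokens in a notebook makes them easy to leak when the notebook is shared. One possible alternative, sketched below, is to read them from environment variables. The variable names `YELP_CLIENT_ID` and `YELP_API_KEY` and the helper `load_yelp_credentials` are hypothetical choices made for this illustration; they are not part of Yelp's tooling.

```python
import os

def load_yelp_credentials(environ=os.environ):
    """Return (client_id, api_key) read from a mapping of environment variables.

    YELP_CLIENT_ID and YELP_API_KEY are hypothetical names chosen for this
    sketch; use whatever names you export in your own shell.
    """
    client_id = environ.get('YELP_CLIENT_ID', '')
    api_key = environ.get('YELP_API_KEY', '')
    if not api_key:
        raise RuntimeError('Set YELP_API_KEY before running the notebook')
    return client_id, api_key

# Demonstration with a stand-in mapping instead of the real environment
demo_env = {'YELP_CLIENT_ID': 'demo-id', 'YELP_API_KEY': 'demo-key'}
demo_client_id, demo_api_key = load_yelp_credentials(demo_env)
print(demo_client_id, demo_api_key)
```

In a real session you would call `load_yelp_credentials()` with no arguments after exporting both variables in your shell.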

## The `yelpapi` 

`yelpapi` is a pure Python implementation of the Yelp Fusion API (aka Yelp v3 API). It is simple, fast, and robust to any changes Yelp may make to the API in the future. See the basic usage of this library on the [official GitHub repo](https://github.com/gfairchild/yelpapi). You may look for other libraries to achieve this, but for this lesson we shall use it for the sake of simplicity.

First, you must pip install the library:

In [3]:
# !pip install yelpapi

### Import `yelpapi` into the working environment and pass in the api_key as shown in the GitHub repo

In [2]:
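# Code here
# Import the client class and authenticate with the API key
from yelpapi import YelpAPI
yelp_api = YelpAPI(api_key)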

### The Api request and response

Great, so we can now start making API calls using the format:
```python
response = yelp_api.search_query(term =<search term>, 
                                 location=<search location>, 
                                 sort_by='rating', 
                                 limit=50)
```
We can pass in many more arguments to refine our search. [Here is a complete list of options that the search API provides](https://www.yelp.com/developers/documentation/v3/business_search).
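
To make the role of these arguments concrete, here is a minimal sketch of a helper that assembles and validates a parameter dictionary before any request is sent. The helper name `build_search_params` is hypothetical, and the limits it enforces (`limit` between 1 and 50, four `sort_by` values) reflect the business search documentation as we understand it, so check them against the linked page.

```python
# Sort orders listed in the business search documentation (verify against the docs)
VALID_SORTS = {'best_match', 'rating', 'review_count', 'distance'}

def build_search_params(term, location, sort_by='best_match', limit=20, offset=0):
    """Return a dict of search parameters, raising ValueError on bad input."""
    if sort_by not in VALID_SORTS:
        raise ValueError('sort_by must be one of {}'.format(sorted(VALID_SORTS)))
    if not 1 <= limit <= 50:
        raise ValueError('limit must be between 1 and 50')
    if offset < 0:
        raise ValueError('offset cannot be negative')
    return {
        'term': term,
        'location': location,
        'sort_by': sort_by,
        'limit': limit,
        'offset': offset,
    }

params = build_search_params('Chinese food', 'London', sort_by='rating', limit=50)
print(params)
```

The resulting dictionary can be unpacked into the client call, for example `yelp_api.search_query(**params)`, or passed as `params=` to `requests.get`.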

* Make an API request using simple criteria: a location and a term
* Save the response as `response`
* Inspect the type and contents of `response`

In [7]:
import requests
import pandas as pd

In [4]:
## Pass in a specific term and location to make a call.

# For this example, we are looking for Chinese food in London.

term = 'Chinese food'
location = 'London'

In [17]:
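# Make an API call using the chosen term and location

# The Yelp Fusion API is served over HTTPS
url = 'https://api.yelp.com/v3/businesses/search'

# The API key is sent as a bearer token in the Authorization header
headers = {'Authorization': 'Bearer {}'.format(api_key)}

# requests URL-encodes the parameters, so spaces need no manual replacement
p = {'term': term, 'location': location}

response = requests.get(url, headers=headers, params=p)

type(response)  # requests.models.Response (only displayed when last in a cell)
print(response)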


<Response [200]>


### JSON ... again!

We have a nifty little return now! The contents of the response (available as `response.text`) are formatted as a string, but what kind of data structure does this remind you of?

To start, there are the outer curly brackets:

#### {"businesses":   

Hopefully you're thinking, 'Hey, that's just like a Python dictionary!'

Then within that we have what appears to be a list of dictionaries:  

#### {"id": "jeWIYbgBho9vBDhc5S1xvg",

This response is an example of the JSON (JavaScript Object Notation) format that we've seen many times before. Once parsed, we can simply treat it as a dictionary and process it further.
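
As a small illustration of that mapping from JSON text to Python objects, the sketch below parses a hand-written string shaped like a search response. The string is made up for the example and is not real Yelp data.

```python
import json

# A tiny hand-written string shaped like a search response (not real Yelp data)
raw = '{"businesses": [{"id": "abc123", "name": "Example Cafe", "rating": 4.5}], "total": 1}'

parsed = json.loads(raw)

print(type(parsed))                     # <class 'dict'>
print(type(parsed['businesses']))       # <class 'list'>
print(parsed['businesses'][0]['name'])  # Example Cafe
```

The outer curly brackets become a dictionary, and the square brackets become a list of dictionaries, one per business.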

### Inspect the values for all the keys in the response

In [22]:
# Inspect the key-value pairs to understand the structure of the data
response = response.json()

In [26]:
response.keys()

dict_keys(['businesses', 'total', 'region'])

In [28]:
print(len(response['businesses']))
print(response['total'])
print(len(response['region']))

20
1100
1


Whoops, what's going on here!? Recall from our preview of the response that there is a hierarchy within it. Let's investigate further to see what the problem is.

First, recall that the overall structure of the response is a dictionary. Let's look at what the keys are and what type of value each one holds:

In [33]:
print(response.keys())
print()
print(type(response['businesses']))
print(type(response['total']))
print(type(response['region']))

dict_keys(['businesses', 'total', 'region'])

<class 'list'>
<class 'int'>
<class 'dict'>


Consult the Yelp API documentation to learn what value is carried in each key.

#### Continue to preview these keys further to get better acquainted.
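
The counts printed earlier (20 businesses in the response against a `total` of 1100) suggest that one request returns only a single page of results. Here is a minimal sketch of how the offsets for further pages could be computed. The helper `page_offsets` is hypothetical, and the `max_results=1000` ceiling is an assumption about how far the API lets you page, so confirm it in the documentation.

```python
def page_offsets(total, limit=20, max_results=1000):
    """Return the offset to use for each request needed to cover the results.

    max_results is an assumed ceiling on how many results can be paged through.
    """
    reachable = min(total, max_results)
    return list(range(0, reachable, limit))

offsets = page_offsets(total=1100, limit=50)
print(len(offsets), offsets[:3], offsets[-1])  # 20 [0, 50, 100] 950
```

Each offset would be sent as the `offset` parameter of a separate request, which also counts against your daily quota.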

In [34]:
print('BUSINESS:', response['businesses'][0])

print('REGION:', response['region'])

print('TOTAL :',response['total'])

BUSINESS: {'id': 'pdFiFtol9YI__9ROOXUIYA', 'alias': 'lanzhou-noodle-bar-london', 'name': 'Lanzhou Noodle Bar', 'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/OMre4TNCrkDck3wqpUXbjQ/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/lanzhou-noodle-bar-london?adjust_creative=xI9YHnrKZQ7uuTG3EwYVTw&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=xI9YHnrKZQ7uuTG3EwYVTw', 'review_count': 316, 'categories': [{'alias': 'chinese', 'title': 'Chinese'}, {'alias': 'noodles', 'title': 'Noodles'}], 'rating': 4.0, 'coordinates': {'latitude': 51.5116034713013, 'longitude': -0.127834377873126}, 'transactions': [], 'price': '£', 'location': {'address1': '33 Cranbourne Street', 'address2': '', 'address3': '', 'city': 'London', 'zip_code': 'WC2H 7AD', 'country': 'GB', 'state': 'XGL', 'display_address': ['33 Cranbourne Street', 'London WC2H 7AD', 'United Kingdom']}, 'phone': '+442074674546', 'display_phone': '+44 20 7467 4546', 'distance': 578.4743547816091}
REGION: 

This makes more sense. For our needs, we are mainly interested in `businesses`.

### Print the names of businesses and included ratings 

In [36]:
# Code here 
for business in response['businesses']:
    print('name: {}  rating: {} '.format(business['name'],business['rating']))

name: Lanzhou Noodle Bar  rating: 4.0 
name: Yauatcha  rating: 4.0 
name: Bamboo Flute  rating: 4.5 
name: Royal China  rating: 4.0 
name: Sichuan Chef  rating: 4.5 
name: Rabieng  rating: 4.0 
name: Barshu  rating: 4.0 
name: Ma La  rating: 4.0 
name: Sichuan Folk  rating: 4.0 
name: Hakkasan  rating: 4.0 
name: Taiwan Village  rating: 4.5 
name: Chinese Overseas  rating: 4.0 
name: Sichuan Kitchen  rating: 4.5 
name: Bugis Street Brasserie  rating: 4.0 
name: Joy King Lau  rating: 4.0 
name: Busaba Soho  rating: 4.0 
name: Moon House  rating: 4.0 
name: La Mian & Dim Sum Stall  rating: 5.0 
name: Xi'an Impressions  rating: 4.5 
name: Gold Mine  rating: 4.0 


Great, now we are getting somewhere. It is a good idea at this stage to store this information as a dataframe for further processing.

### Create a Pandas dataframe for contents of `businesses`
* Check the number of records in the dataframe
* Inspect the columns and head

In [37]:
# Code here 
df = pd.DataFrame(response['businesses'])
df.head()

| | alias | categories | coordinates | display_phone | distance | id | image_url | is_closed | location | name | phone | price | rating | review_count | transactions | url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | lanzhou-noodle-bar-london | [{'alias': 'chinese', 'title': 'Chinese'}, {'a... | {'latitude': 51.5116034713013, 'longitude': -0... | +44 20 7467 4546 | 578.474355 | pdFiFtol9YI__9ROOXUIYA | https://s3-media4.fl.yelpcdn.com/bphoto/OMre4T... | False | {'address1': '33 Cranbourne Street', 'address2... | Lanzhou Noodle Bar | 442074674546 | £ | 4.0 | 316 | [] | https://www.yelp.com/biz/lanzhou-noodle-bar-lo... |
| 1 | yauatcha-london-7 | [{'alias': 'dimsum', 'title': 'Dim Sum'}, {'al... | {'latitude': 51.5137076071076, 'longitude': -0... | +44 20 7494 8888 | 112.59841 | sYwBQ7mJYhB35nn-_SZstQ | https://s3-media1.fl.yelpcdn.com/bphoto/G7Ydt9... | False | {'address1': '15-17 Broadwick Street', 'addres... | Yauatcha | 442074948888 | £££ | 4.0 | 462 | [] | https://www.yelp.com/biz/yauatcha-london-7?adj... |
| 2 | bamboo-flute-fitzrovia | [{'alias': 'chinese', 'title': 'Chinese'}] | {'latitude': 51.5228746, 'longitude': -0.1418939} | +44 20 7387 2738 | 1189.508255 | QE5aa5N-dbfvafLsjn7TVg | https://s3-media3.fl.yelpcdn.com/bphoto/kYQZjH... | False | {'address1': '145 Cleveland Street', 'address2... | Bamboo Flute | 442073872738 | £ | 4.5 | 28 | [] | https://www.yelp.com/biz/bamboo-flute-fitzrovi... |
| 3 | royal-china-london-2 | [{'alias': 'dimsum', 'title': 'Dim Sum'}] | {'latitude': 51.5105005571439, 'longitude': -0... | +44 20 7221 2535 | 3558.785104 | 2NdnxONuSAsDkQPhU919tQ | https://s3-media3.fl.yelpcdn.com/bphoto/Mpn3L7... | False | {'address1': '13 Queensway', 'address2': '', '... | Royal China | 442072212535 | ££ | 4.0 | 96 | [] | https://www.yelp.com/biz/royal-china-london-2?... |
| 4 | sichuan-chef-london | [{'alias': 'chinese', 'title': 'Chinese'}] | {'latitude': 51.4931335449219, 'longitude': -0... | +44 20 7244 7888 | 4533.461179 | LIA_VK2aBaiNFIXSgwnXvA | https://s3-media3.fl.yelpcdn.com/bphoto/CX65BM... | False | {'address1': '15 Kenway Rd', 'address2': '', '... | Sichuan Chef | 442072447888 | | 4.5 | 9 | [] | https://www.yelp.com/biz/sichuan-chef-london?a... |


This is fantastic. We have successfully learned a new API, made requests to it, received and studied the response, and stored the results in a dataframe, so we can now enjoy all the goodness of Pandas. That's quite a bit of data engineering.
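
Notice that columns such as `coordinates` and `location` still hold nested dictionaries. One possible way to get flat columns, sketched below in plain Python, is to pull the fields of interest out of each business before building the dataframe. The helper `flatten_business` and the choice of fields are illustrative assumptions; the sample values are taken from the first business shown earlier.

```python
def flatten_business(business):
    """Return a flat dict with a few fields pulled out of the nested structure."""
    coords = business.get('coordinates') or {}
    location = business.get('location') or {}
    return {
        'name': business.get('name'),
        'rating': business.get('rating'),
        'price': business.get('price'),  # None when the key is absent
        'latitude': coords.get('latitude'),
        'longitude': coords.get('longitude'),
        'city': location.get('city'),
    }

sample = {
    'name': 'Lanzhou Noodle Bar',
    'rating': 4.0,
    'price': '£',
    'coordinates': {'latitude': 51.5116034713013, 'longitude': -0.127834377873126},
    'location': {'city': 'London'},
}
flat = flatten_business(sample)
print(flat)
```

A flat dataframe could then be built with `pd.DataFrame([flatten_business(b) for b in response['businesses']])`.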

### Visualize the location from search query
The `region` key in the response carries the geographical information for the region searched.
* Get the latitude/longitude information from `region`
* Create a folium map with these coordinates
* Use a `zoom_start` value of 13

In [44]:
import folium

# Code here

# Center of the searched region, as reported in the response
lat = response['region']['center']['latitude']
lon = response['region']['center']['longitude']

yelp = folium.Map([lat, lon], zoom_start=13)

yelp


Expected Output:
![](london.png)

Nice. We can now extract the coordinate information for each business and plot it on this map.
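
Before plotting, it can help to check that every business will actually fall inside the visible map area. The sketch below computes a bounding box from a list of (latitude, longitude) pairs in plain Python. The helper `bounding_box` is hypothetical, and the two sample points are the coordinates of the first and third businesses from the dataframe preview.

```python
def bounding_box(points):
    """Return [[south, west], [north, east]] for an iterable of (lat, lon) pairs."""
    points = list(points)
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return [[min(lats), min(lons)], [max(lats), max(lons)]]

points = [
    (51.5116034713013, -0.127834377873126),  # Lanzhou Noodle Bar
    (51.5228746, -0.1418939),                # Bamboo Flute
]
bounds = bounding_box(points)
print(bounds)
```

The result has the shape expected by folium's `Map.fit_bounds`, so `yelp.fit_bounds(bounds)` would adjust the view to contain all the points.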

### Get the business coordinates from dataframe for each business and plot on the map above

In [63]:
# Add a plain marker to the map for every business in the dataframe
for i in range(len(df)):
    business = df.loc[i]
    b_lat = business['coordinates']['latitude']
    b_lon = business['coordinates']['longitude']
    print(b_lat)
    folium.Marker([b_lat, b_lon]).add_to(yelp)

51.5116034713013
51.5137076071076
51.5228746
51.5105005571439
51.4931335449219
51.539829
51.512835
51.4967879034045
51.520178
51.5171482803943
51.4855449
51.5395926
51.5111313
51.4936621211568
51.511104
51.5137892
51.5747524
51.5227498599062
51.5539006581936
51.5133018493652


Cool, so we have everything in place, but the visualization is still not very *informative*, so to speak. You can't tell which marker represents which business, and other information about each business, such as rating, cost, and links to user reviews, is still not visible. So it's geographical, but not exactly an information system yet, as you can't make any decisions based on it. Here's an example of what it could look like:
![](out.png)


For this, you need to understand `folium.Popup`, which lets you click on a marker to show a pop-up window. This window acts like an HTML page, so you can easily format the information you present in the popup using the following values:
* The official business logo/image:  `image_url`
* Name of the Business: `name`
* Price (how expensive): `price`
* Links to user reviews on Yelp: `url`

Doing this in HTML is not required, so we recommend that you first put basic information in the popups as plain text. As a next stage, you can switch to HTML code to make it more visually appealing.
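
For the HTML stage, one possible approach is to build the popup markup as a string from the four values listed above. The sketch below uses only the standard library. The helper `popup_html` and its layout are illustrative assumptions, and it expects a plain dictionary such as an item of `response['businesses']`.

```python
from html import escape

def popup_html(business):
    """Return a small HTML snippet describing one business dictionary."""
    name = escape(str(business.get('name', '')))
    price = escape(str(business.get('price') or 'n/a'))  # price can be missing
    rating = escape(str(business.get('rating', '')))
    url = escape(str(business.get('url', '')))
    image = escape(str(business.get('image_url', '')))
    return (
        '<h4>{}</h4>'
        '<img src="{}" width="150"><br>'
        'Price: {}<br>Rating: {}<br>'
        '<a href="{}" target="_blank">Reviews on Yelp</a>'
    ).format(name, image, price, rating, url)

snippet = popup_html({'name': "Xi'an Impressions", 'rating': 4.5})
print(snippet)
```

The string can then be wrapped for a marker, for example `folium.Marker([b_lat, b_lon], popup=folium.Popup(popup_html(business), max_width=250))`. Escaping the values keeps characters such as apostrophes or ampersands in business names from breaking the markup.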

### Attempt to recreate the interactive visualization shown above.

Here's a good resource with code examples on [how to create folium popups](https://github.com/python-visualization/folium/blob/master/examples/Popups.ipynb)

In [68]:
# Code here
# Add a marker with a plain-text popup for every business
for i in range(len(df)):
    business = df.loc[i]
    b_lat = business['coordinates']['latitude']
    b_lon = business['coordinates']['longitude']
    information = 'name: {}, price: {}, rating: {}'.format(business['name'], business['price'], business['rating'])
    marker = folium.Marker([b_lat, b_lon], popup=information)
    marker.add_to(yelp)

yelp

Wow! An interactive Geographical Information System backed by live data through API calls.

<img src="star.jpg" width=300>

### More APIs to Check Out

* Google Maps
* Twitter
* AWS
* IBM's Watson
* Yelp

## Summary 

In this lab, we learned how to use the Yelp API: authenticating, making calls, understanding the responses, and creating interactive geographical visualizations in Folium. We encourage you to revisit this lab once you have studied some important machine learning algorithms, so that you can make predictions, find similarities, group/cluster businesses, or classify them based on user criteria.