# Building a Simple GIS with Yelp API and Folium - Lab

## Introduction 

So far we have learned quite a bit about APIs and why they are such a big buzzword in the tech industry. Think of an API as a protocol for making requests to, and communicating with, another server. We have seen how to mine Twitter for text data and apply basic frequency-based NLP techniques to gain some insight.

One of the key aspects of being a data scientist is the ability to learn how a new API works, how to go through its specific authentication process (OAuth) and how to process the data structures that get returned as a response to our requests. It is a good practice to spend some time learning the API through the official documentation before sending in requests. 

Along these lines, this lab asks you to learn another popular API (Yelp Fusion) by following its detailed online documentation. We shall build a simple Geographical Information System (GIS) using data from Yelp.

## Objectives
You will be able to: 
* Successfully sign up for the Yelp API
* Create HTTP requests to get data from the Yelp API
* Parse HTTP responses and perform data analysis on the data returned
* Create a simple geographical information system to view information about selected businesses at a given location


## The Yelp Fusion API - v3


### Point your browser to this [Yelp page](https://www.yelp.com/developers/v3/manage_app) and create an app in order to obtain your `client_id` and `api_key` tokens.

**NOTE:** You will be required to sign up (for example with Google or Facebook) if you don't already have an account.

<img src="yelp_app.png" width=500>



After registration, you'll be presented with your account information and the limits on your access. For Yelp, or any other API for that matter, make sure you don't exceed your request quota; otherwise, you may end up getting banned in some cases. Yelp shows this information as below:

<img src="quota.png" width=500>
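One way to stay under a quota is to count calls locally and stop before the limit is reached. The sketch below is only an illustration: the `QuotaTracker` class and the tiny limit of 3 are hypothetical, and your real limit is the one shown on your Yelp account page.

```python
class QuotaTracker:
    """Count API calls locally and refuse to go past a daily limit."""

    def __init__(self, daily_limit):
        self.daily_limit = daily_limit
        self.calls_made = 0

    def remaining(self):
        """Return how many calls are still allowed today."""
        return self.daily_limit - self.calls_made

    def record_call(self):
        """Register one call, or raise if the quota is already used up."""
        if self.calls_made >= self.daily_limit:
            raise RuntimeError("Daily request quota reached; stop calling the API.")
        self.calls_made += 1


# Hypothetical limit, kept tiny so the behaviour is easy to see
tracker = QuotaTracker(daily_limit=3)
for _ in range(3):
    tracker.record_call()  # call this right before each real API request

print(tracker.remaining())
```

Calling `tracker.record_call()` a fourth time raises a `RuntimeError` instead of letting another request go out.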

### Save your api_key and client_id in the variables below:

In [1]:
# Save your tokens in the following string variables
client_id = 'daXmVQtyzQNoOXYDAOOHmQ'
api_key = 'LG5IBlecRe_ov7wtZtD_fvYZb621Tljkcox4_N3BOtF-ONNLmXHkd5BhOXMPvS04GHUEYs6t1jhvGIDjWVwshSCOJurZJSJpjIvTxyzQttKOGF6a_FwLFHzpl9MmXHYx'
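Hard-coding tokens in a notebook makes them easy to leak when the notebook is shared. A common alternative is to read them from environment variables. The sketch below assumes two variable names, `YELP_CLIENT_ID` and `YELP_API_KEY`, which are our own choice and not something Yelp requires.

```python
import os


def load_yelp_credentials(environ=os.environ):
    """Read the Yelp tokens from environment variables (names are our assumption)."""
    client_id = environ.get("YELP_CLIENT_ID")
    api_key = environ.get("YELP_API_KEY")
    if not api_key:
        raise KeyError("Set YELP_API_KEY before running the notebook.")
    return client_id, api_key


# Demonstration with a plain dictionary standing in for the real environment
demo_env = {"YELP_CLIENT_ID": "my-client-id", "YELP_API_KEY": "my-api-key"}
client_id, api_key = load_yelp_credentials(demo_env)
print(client_id)
```

In a real session you would call `load_yelp_credentials()` with no arguments after exporting the two variables in your shell.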

## The `yelpapi` 

`yelpapi` is a pure Python implementation of the Yelp Fusion API (aka Yelp v3 API). It is described as simple, fast, and robust to any changes Yelp may make to the API in the future. See the basic usage of this library on the [official GitHub repo](https://github.com/gfairchild/yelpapi). Other libraries can achieve the same thing, but for this lesson we shall use `yelpapi` for the sake of simplicity.

First, install the library with pip:

In [2]:
!pip install yelpapi



### Import `yelpapi` into the working environment and pass in the `api_key` as shown in the GitHub repo

In [3]:
from yelpapi import YelpAPI
yelp_api = YelpAPI(api_key)

### The API request and response

Great, so we can now start making API calls using the format:
```python
response = yelp_api.search_query(term =<search term>, 
                                 location=<search location>, 
                                 sort_by='rating', 
                                 limit=50)
```
We can pass in many more arguments to refine our search. [Here is a complete list of the options that the search API provides](https://www.yelp.com/developers/documentation/v3/business_search).
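It helps to see what such a call looks like as a plain HTTP request. The sketch below builds (but does not send) a GET request to the Yelp Fusion business search endpoint using only the standard library. The helper name `build_search_request` is ours, and `yelpapi` presumably does something similar internally, though we have not verified its implementation.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(api_key, term, location, sort_by="rating", limit=50):
    """Build a GET request for the business search endpoint without sending it."""
    params = urlencode(
        {"term": term, "location": location, "sort_by": sort_by, "limit": limit}
    )
    # The API key travels in the Authorization header as a bearer token
    headers = {"Authorization": f"Bearer {api_key}"}
    return Request(f"{SEARCH_URL}?{params}", headers=headers)


request = build_search_request("YOUR_API_KEY", "Chinese food", "London")
print(request.full_url)
print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would actually send it and count against your quota, so the example stops at construction.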

* Make an API request using simple criteria: a location and a term
* Save the response as `response`
* Inspect the type and contents of `response`

In [4]:
# Pass in a specific term and location to make a call.

response = yelp_api.search_query(term ='Ramen',
                                 location='New York', 
                                 sort_by='rating', 
                                 limit=50)

# For this example, we are looking for Chinese food in London.
term = 'Chinese food'
location = 'London'
response

{'businesses': [{'id': 'VaDcSFv8Ir6lZely4BhPOw',
   'alias': 'shinshi-ramen-new-york',
   'name': 'Shinshi Ramen',
   'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/igCloMb68SW2cphCWBPPAg/o.jpg',
   'is_closed': False,
   'url': 'https://www.yelp.com/biz/shinshi-ramen-new-york?adjust_creative=daXmVQtyzQNoOXYDAOOHmQ&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=daXmVQtyzQNoOXYDAOOHmQ',
   'review_count': 171,
   'categories': [{'alias': 'ramen', 'title': 'Ramen'}],
   'rating': 4.5,
   'coordinates': {'latitude': 40.75732, 'longitude': -73.96791},
   'transactions': [],
   'price': '$$',
   'location': {'address1': '235 E 53rd St',
    'address2': '',
    'address3': None,
    'city': 'New York',
    'zip_code': '10022',
    'country': 'US',
    'state': 'NY',
    'display_address': ['235 E 53rd St', 'New York, NY 10022']},
   'phone': '+16466697812',
   'display_phone': '(646) 669-7812',
   'distance': 6179.107838130476},
  {'id': 'FlZ1zdVEKWv7dwqm8Uw8-w'

In [5]:
# Make an API call using chosen term and location

response = yelp_api.search_query(term =term,
                                 location=location, 
                                 sort_by='rating', 
                                 limit=50)
type(response)
print(response)


{'businesses': [{'id': 'h2cCoDNQOPd51HFwvuAGNg', 'alias': 'hakkasan-london-3', 'name': 'Hakkasan', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/IcS9yzJnweDO4DsfJ94jAw/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/hakkasan-london-3?adjust_creative=daXmVQtyzQNoOXYDAOOHmQ&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=daXmVQtyzQNoOXYDAOOHmQ', 'review_count': 218, 'categories': [{'alias': 'cantonese', 'title': 'Cantonese'}], 'rating': 4.0, 'coordinates': {'latitude': 51.5171482803943, 'longitude': -0.13180578932657}, 'transactions': [], 'price': '££££', 'location': {'address1': '8 Hanway Place', 'address2': '', 'address3': '', 'city': 'London', 'zip_code': 'W1T 1HD', 'country': 'GB', 'state': 'XGL', 'display_address': ['8 Hanway Place', 'London W1T 1HD', 'United Kingdom']}, 'phone': '+442079277000', 'display_phone': '+44 20 7927 7000', 'distance': 558.9650039747677}, {'id': 'QE5aa5N-dbfvafLsjn7TVg', 'alias': 'bamboo-flute-fitzrovia', 'name': '

### JSON .. again ! 

We have a nifty little return now! As you can see, the printed response looks like one long string, but what kind of data structure does it remind you of?

To start there's the outer curly brackets:  

#### {"businesses":   

Hopefully you're thinking 'hey that's just like a python dictionary!'

Then within that we have what appears to be a list of dictionaries:  

#### {"id": "jeWIYbgBho9vBDhc5S1xvg",

This response is an example of the JSON (JavaScript Object Notation) format that we've seen so many times before. We can simply treat it as a dictionary and process it further.
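To make the JSON-to-dictionary connection concrete, here is a minimal sketch that parses a small hand-written JSON string with the standard `json` module. The string is a shortened stand-in for a real response, not actual Yelp output.

```python
import json

# A tiny, hand-written JSON document shaped like the Yelp response
raw = '{"businesses": [{"name": "Hakkasan", "rating": 4.0, "is_closed": false}], "total": 1}'

parsed = json.loads(raw)

print(type(parsed).__name__)                 # the outer braces become a dict
print(type(parsed["businesses"]).__name__)   # the square brackets become a list
print(parsed["businesses"][0]["name"])       # nested values are reached key by key
```

Note how JSON's `false` becomes Python's `False`, which is why the printed response earlier shows `'is_closed': False`.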

### Inspect the values for all the keys in the response

In [6]:
# Inspect the key-value pairs to understand the structure of the data
response

{'businesses': [{'id': 'h2cCoDNQOPd51HFwvuAGNg',
   'alias': 'hakkasan-london-3',
   'name': 'Hakkasan',
   'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/IcS9yzJnweDO4DsfJ94jAw/o.jpg',
   'is_closed': False,
   'url': 'https://www.yelp.com/biz/hakkasan-london-3?adjust_creative=daXmVQtyzQNoOXYDAOOHmQ&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=daXmVQtyzQNoOXYDAOOHmQ',
   'review_count': 218,
   'categories': [{'alias': 'cantonese', 'title': 'Cantonese'}],
   'rating': 4.0,
   'coordinates': {'latitude': 51.5171482803943,
    'longitude': -0.13180578932657},
   'transactions': [],
   'price': '££££',
   'location': {'address1': '8 Hanway Place',
    'address2': '',
    'address3': '',
    'city': 'London',
    'zip_code': 'W1T 1HD',
    'country': 'GB',
    'state': 'XGL',
    'display_address': ['8 Hanway Place', 'London W1T 1HD', 'United Kingdom']},
   'phone': '+442079277000',
   'display_phone': '+44 20 7927 7000',
   'distance': 558.9650039747677},


Whoops, what's going on here? Recall from our earlier preview that the response is hierarchical. Let's investigate further to see what the problem is.

First, recall that the overall structure of the response is a dictionary. Let's look at what the keys are:

In [7]:
response.keys()

dict_keys(['businesses', 'total', 'region'])

Consult the Yelp API documentation to learn what value each key carries.

#### Continue to preview these keys to get a little better acquainted.

In [8]:
print('BUSINESS:', response['businesses'][0])
print()
print('REGION:', response['region'])
print()
print('TOTAL :',response['total'])

BUSINESS: {'id': 'h2cCoDNQOPd51HFwvuAGNg', 'alias': 'hakkasan-london-3', 'name': 'Hakkasan', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/IcS9yzJnweDO4DsfJ94jAw/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/hakkasan-london-3?adjust_creative=daXmVQtyzQNoOXYDAOOHmQ&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=daXmVQtyzQNoOXYDAOOHmQ', 'review_count': 218, 'categories': [{'alias': 'cantonese', 'title': 'Cantonese'}], 'rating': 4.0, 'coordinates': {'latitude': 51.5171482803943, 'longitude': -0.13180578932657}, 'transactions': [], 'price': '££££', 'location': {'address1': '8 Hanway Place', 'address2': '', 'address3': '', 'city': 'London', 'zip_code': 'W1T 1HD', 'country': 'GB', 'state': 'XGL', 'display_address': ['8 Hanway Place', 'London W1T 1HD', 'United Kingdom']}, 'phone': '+442079277000', 'display_phone': '+44 20 7927 7000', 'distance': 558.9650039747677}

REGION: {'center': {'longitude': -0.135955810546875, 'latitude': 51.51283552118349}

This makes more sense, so we are mainly interested in the `businesses` for our needs. 
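Since each call returns at most `limit` businesses while `total` can be larger, collecting more results means paging with the search endpoint's `offset` parameter. The sketch below shows the loop with a fake page function so that it runs without touching the API. `fetch_all_businesses`, `fake_fetch_page` and the 120 made-up businesses are all hypothetical.

```python
def fetch_all_businesses(fetch_page, page_size=50, max_results=200):
    """Collect businesses page by page until the results run out."""
    businesses = []
    offset = 0
    while offset < max_results:
        page = fetch_page(limit=page_size, offset=offset)
        batch = page.get("businesses", [])
        businesses.extend(batch)
        offset += page_size
        # Stop on a short page or once we have walked past the reported total
        if len(batch) < page_size or offset >= page.get("total", 0):
            break
    return businesses[:max_results]


def fake_fetch_page(limit, offset):
    """Stand-in for a real API call: serves slices of 120 made-up businesses."""
    data = [{"name": f"Business {i}"} for i in range(120)]
    return {"businesses": data[offset:offset + limit], "total": len(data)}


all_businesses = fetch_all_businesses(fake_fetch_page)
print(len(all_businesses))
```

With the real client, `fetch_page` could be something like `lambda limit, offset: yelp_api.search_query(term=term, location=location, limit=limit, offset=offset)`. Keep in mind that every page is one request against your quota.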

### Print the name and rating of each business

In [9]:
for business in response['businesses']:
    print("Name:",business['name'],",",'Rating:',business['rating'])

Name: Hakkasan , Rating: 4.0
Name: Bamboo Flute , Rating: 4.5
Name: Yauatcha , Rating: 4.0
Name: Lanzhou Noodle Bar , Rating: 4.0
Name: Hakkasan Mayfair , Rating: 4.0
Name: Silk Road , Rating: 4.5
Name: Hunan , Rating: 4.5
Name: Barshu , Rating: 4.0
Name: Bugis Street Brasserie , Rating: 4.0
Name: Gold Mine , Rating: 4.0
Name: Royal China , Rating: 4.0
Name: Chi Noodle , Rating: 4.0
Name: Joy King Lau , Rating: 4.0
Name: Sichuan Folk , Rating: 4.0
Name: Xi'an Impressions , Rating: 4.5
Name: Dragon Castle , Rating: 4.0
Name: Canton Element , Rating: 4.5
Name: Bun House , Rating: 4.0
Name: The Duck and Rice , Rating: 4.0
Name: Princess Garden Of Mayfair , Rating: 4.0
Name: Lotus , Rating: 4.0
Name: Yipin China Restaurant , Rating: 4.0
Name: Good Earth , Rating: 4.0
Name: Ping Pong , Rating: 4.0
Name: A. Wong , Rating: 4.0
Name: Kai Of Mayfair , Rating: 4.0
Name: Pearl Liang , Rating: 3.5
Name: Firecracker , Rating: 4.0
Name: Four Seasons , Rating: 3.5
Name: Yming , Rating: 4.0
Name: Ma L

Great, now we are getting somewhere. It is a good idea at this stage to store this information in a dataframe for further processing.

### Create a Pandas dataframe for contents of `businesses`
* Check the number of records in the dataframe
* Inspect the columns and head

In [10]:
import pandas as pd
df = pd.DataFrame(response['businesses'])
df.head()

Unnamed: 0,alias,categories,coordinates,display_phone,distance,id,image_url,is_closed,location,name,phone,price,rating,review_count,transactions,url
0,hakkasan-london-3,"[{'alias': 'cantonese', 'title': 'Cantonese'}]","{'latitude': 51.5171482803943, 'longitude': -0...",+44 20 7927 7000,558.965004,h2cCoDNQOPd51HFwvuAGNg,https://s3-media2.fl.yelpcdn.com/bphoto/IcS9yz...,False,"{'address1': '8 Hanway Place', 'address2': '',...",Hakkasan,442079277000,££££,4.0,218,[],https://www.yelp.com/biz/hakkasan-london-3?adj...
1,bamboo-flute-fitzrovia,"[{'alias': 'chinese', 'title': 'Chinese'}]","{'latitude': 51.5228746, 'longitude': -0.1418939}",+44 20 7387 2738,1189.508255,QE5aa5N-dbfvafLsjn7TVg,https://s3-media3.fl.yelpcdn.com/bphoto/kYQZjH...,False,"{'address1': '145 Cleveland Street', 'address2...",Bamboo Flute,442073872738,£,4.5,28,[],https://www.yelp.com/biz/bamboo-flute-fitzrovi...
2,yauatcha-london-7,"[{'alias': 'dimsum', 'title': 'Dim Sum'}, {'al...","{'latitude': 51.5137076071076, 'longitude': -0...",+44 20 7494 8888,112.59841,sYwBQ7mJYhB35nn-_SZstQ,https://s3-media1.fl.yelpcdn.com/bphoto/G7Ydt9...,False,"{'address1': '15-17 Broadwick Street', 'addres...",Yauatcha,442074948888,£££,4.0,456,[],https://www.yelp.com/biz/yauatcha-london-7?adj...
3,lanzhou-noodle-bar-london,"[{'alias': 'chinese', 'title': 'Chinese'}, {'a...","{'latitude': 51.5116034713013, 'longitude': -0...",+44 20 7467 4546,578.474355,pdFiFtol9YI__9ROOXUIYA,https://s3-media4.fl.yelpcdn.com/bphoto/IKy91e...,False,"{'address1': '33 Cranbourne Street', 'address2...",Lanzhou Noodle Bar,442074674546,£,4.0,310,[],https://www.yelp.com/biz/lanzhou-noodle-bar-lo...
4,hakkasan-mayfair-london-2,"[{'alias': 'chinese', 'title': 'Chinese'}]","{'latitude': 51.5103202323384, 'longitude': -0...",+44 20 7907 1888,681.445072,chEEcQbc8PbidTeXK34H9g,https://s3-media2.fl.yelpcdn.com/bphoto/LX_W20...,False,"{'address1': '17 Bruton Street', 'address2': '...",Hakkasan Mayfair,442079071888,££££,4.0,117,[],https://www.yelp.com/biz/hakkasan-mayfair-lond...


This is fantastic. We have successfully learned a new API, made requests to it, received and studied the response, and stored the results in a dataframe, so we can now enjoy all the goodness of Pandas. That's quite a bit of data engineering.
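Notice that columns such as `coordinates` and `location` still hold nested dictionaries, which are awkward to filter or plot. A minimal sketch of flattening one business into a single-level record follows. The helper `flatten_business` is our own, and the use of `.get` assumes that some fields (for example `price`) may be missing for some businesses.

```python
def flatten_business(business):
    """Pull the nested fields we care about up to the top level."""
    coordinates = business.get("coordinates") or {}
    location = business.get("location") or {}
    return {
        "name": business.get("name"),
        "rating": business.get("rating"),
        "price": business.get("price"),  # assumed optional, so None if absent
        "latitude": coordinates.get("latitude"),
        "longitude": coordinates.get("longitude"),
        "city": location.get("city"),
        "url": business.get("url"),
    }


# A shortened copy of the first business from the response above
sample = {
    "name": "Hakkasan",
    "rating": 4.0,
    "price": "££££",
    "coordinates": {"latitude": 51.5171482803943, "longitude": -0.13180578932657},
    "location": {"address1": "8 Hanway Place", "city": "London"},
}

flat = flatten_business(sample)
print(flat["name"], flat["latitude"], flat["longitude"])
```

A flat dataframe could then be built with `pd.DataFrame([flatten_business(b) for b in response['businesses']])`. Pandas also offers `pd.json_normalize` for the same purpose, if you prefer not to write the helper yourself.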

### Visualize the location from search query
The `region` key in the response carries the geographical information for the region searched.
* Get the latitude / longitude information from `region`
* Create a folium map with these coordinates. 
* Use a zoom start value = 13

In [44]:
loc = response['region']['center']


In [29]:
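# Install folium once per environment (notebook shell command)
!pip install folium
import folium

# Center the map on the searched region
m = folium.Map(location=[loc['latitude'], loc['longitude']], zoom_start=13)
m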



Expected Output:
![](london.png)

Nice. We can now extract the coordinate information for each business and plot it on this map.

### Get the business coordinates from dataframe for each business and plot on the map above

In [45]:
# Add one marker per business, using the nested coordinates dictionary
for coords in df['coordinates']:
    lat = coords['latitude']
    lon = coords['longitude']
    folium.Marker(location=[lat, lon]).add_to(m)
m

Expected output:
![](markers.png)

Cool, so we have everything in place, but the visualization is still not very *informative*, so to speak. You can't tell which marker represents which business, and other information about each business, such as rating, cost and links to user reviews, is still not visible. So it's geographical, but not exactly an information system yet, as you can't make any decisions based on this information. Here's an example of what it could look like:
![](out.png)


For this you need to understand `folium.Popup`, which lets you click on a marker to show a popup window. This window acts much like an HTML page, so you can easily format the information you present in the popup using the following values:
* The official business logo/image:  `image_url`
* Name of the Business: `name`
* Price (how expensive): `price`
* Links to user reviews on Yelp: `url`

Doing this in HTML is not required, so we recommend that you first try putting basic information in the popups as plain text. As a next stage, you can switch to HTML to make it more visually appealing.
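As a starting point, here is one possible sketch (not the reference solution) of building the popup content from a business dictionary. The helper names `popup_html` and `add_popup_marker` are our own, and the HTML layout is just one choice among many. `folium` is imported inside the second function so that the HTML helper can be tried on its own.

```python
from html import escape


def popup_html(business):
    """Build a small HTML snippet describing one business."""
    parts = []
    if business.get("image_url"):
        parts.append(f'<img src="{escape(business["image_url"])}" width="150">')
    parts.append(f'<b>{escape(business.get("name", "Unknown"))}</b>')
    rating = business.get("rating", "n/a")
    price = escape(business.get("price", "n/a"))
    parts.append(f"Rating: {rating} | Price: {price}")
    if business.get("url"):
        parts.append(f'<a href="{escape(business["url"])}" target="_blank">Reviews on Yelp</a>')
    return "<br>".join(parts)


def add_popup_marker(fmap, business):
    """Add a marker with an HTML popup for one business to a folium map."""
    import folium  # imported lazily; only needed when a map is actually drawn

    coords = business["coordinates"]
    folium.Marker(
        location=[coords["latitude"], coords["longitude"]],
        popup=folium.Popup(popup_html(business), max_width=300),
    ).add_to(fmap)


sample = {
    "name": "Hakkasan",
    "rating": 4.0,
    "price": "££££",
    "url": "https://www.yelp.com/biz/hakkasan-london-3",
}
print(popup_html(sample))
```

To use it on the map, you could loop over `response['businesses']` and call `add_popup_marker(m, business)` for each one. Escaping the values keeps a business name containing characters such as `<` or `&` from breaking the popup markup.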

### Attempt to recreate the interactive visualization shown above.

Here's a good resource with code examples on [how to create folium popups](https://github.com/python-visualization/folium/blob/master/examples/Popups.ipynb)

In [46]:
from folium import IFrame

Wow! An interactive Geographical Information System backed by live data through API calls.

<img src="star.jpg" width=300>

### More APIs to Checkout

* Google Maps
* Twitter
* AWS
* IBM's Watson
* Yelp

## Summary 

In this lab, we learned how to use the Yelp API: authenticating, making calls, understanding the responses and creating interactive geographical visualizations in Folium. We encourage you to revisit this lab once you have studied some important machine learning algorithms, so that you can make predictions, find similarities, group/cluster businesses or classify them based on user criteria.