# JSON and APIs

_November 3, 2020_

Agenda today:
- Introduction to APIs and the remote server model
- Getting data through an API: a case study with the Yelp API

In [25]:
import pandas as pd
import numpy as np
import requests
import json
#from yelp.client import Client
import matplotlib.pyplot as plt
plt.style.use('seaborn')

## Part I. APIs and Remote Server Model
API stands for Application Programming Interface. Many large companies eventually build APIs for their products, either for their clients or for internal use. An API allows the company's application to communicate with other applications. But what _exactly_ is an API?

#### Remote server 
When we think about the web, we can picture it as a collection of _servers_. Servers are computers that store large amounts of data and are optimized to process requests. For example, when you type in www.facebook.com, your browser sends a _request_ to the Facebook server and receives a _response_ from it; the browser then interprets the returned code and displays your homepage.

In this case, your browser is the _client_, and Facebook’s server exposes an API. Broadly speaking, whenever you visit a website, you are interacting with its API. However, an API isn’t the same as the remote server; rather, it is the part of the server that receives __requests__ and sends __responses__.
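
To make the request/response cycle concrete, here is a minimal sketch that runs a tiny server and a client in the same process, using only the standard library. The `HelloHandler` class, the `/greeting` path, and the JSON payload are hypothetical and purely illustrative; a real API such as Yelp's is far more elaborate.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: answers every GET request with a small JSON body."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "message": "hello"}).encode("utf-8")
        self.send_response(200)                      # status code of the response
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                       # body of the response

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def demo_request():
    # Port 0 asks the operating system for any free port
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # The client side: send a GET request and read the response
        conn = HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/greeting")
        resp = conn.getresponse()
        status = resp.status
        payload = json.loads(resp.read().decode("utf-8"))
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return status, payload


status, payload = demo_request()
print(status, payload)
```

The handler plays the role of the API (the part of the server that receives requests and sends responses), while `demo_request` plays the role of the client.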

<img src='status-code.png' width = 500>

## Part II. Getting Data Through APIs

The `get()` function from `requests` sends a request to Yelp's API and returns the result, which we store in a variable called `response`. We can then check the status code to see whether the request succeeded.

#### YELP API
Sometimes you need _authentication_ to get data from a service, in addition to just sending a `GET` request. The Yelp API is a perfect example.

You will need to go to Yelp's developer [website](https://www.yelp.com/developers/v3/manage_app) and request a client ID and API key, which function like a key to a house of data.
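
Because the API key works like a house key, it is usually safer not to paste it into the notebook itself. Below is a minimal sketch, assuming the key has been stored in an environment variable; the variable name `YELP_API_KEY` and the helper `build_auth_headers` are hypothetical names chosen for this example.

```python
import os


def build_auth_headers(env_var="YELP_API_KEY"):
    """Build the Authorization header from a key stored in the environment.

    The variable name YELP_API_KEY is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"Authorization": f"Bearer {api_key}"}


# Demonstration only: fall back to a placeholder (not a real key) if none is set
os.environ.setdefault("YELP_API_KEY", "placeholder-key")
headers = build_auth_headers()
print(list(headers))  # print only the header names, never the key itself
```

The resulting dictionary can be passed as the `headers` argument of `requests.get()`.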

<img src='yelp.png' width = 500>

In [26]:
# let's try to get some data from Yelp!
url = 'https://api.yelp.com/v3/businesses/search'
response = requests.get(url)

In [27]:
# check the status code
response.status_code

# what happened here?

400
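
The 400 above falls in the 4xx range, which signals a problem with the request itself rather than with the server; note that the request was sent without an API key or any search parameters. As a rough aid for reading status codes, the sketch below uses the standard library's `http.HTTPStatus`; the helper name `describe_status` is hypothetical.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description of a standard HTTP status code.

    HTTPStatus(code) raises ValueError for codes it does not know.
    """
    phrase = HTTPStatus(code).phrase
    if 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirection"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "informational"
    return f"{code} {phrase} ({category})"


for code in (200, 400, 401, 404, 429, 500):
    print(describe_status(code))
```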

In [4]:
# now we are ready to get our data 

# services usually limit you to a certain number of API calls. The limit varies from service
# to service, so you have to watch out for it

MY_API_KEY = "0GyN2A4C0TsdKRN_MM0KoSf-mzy5GyJTHVDI0sGlhvuGHRHGlj6tp70l3F-qfyzyM3eKFG9s2fnW5a3Zf9Lxjo4wYL0NaahGkquKWanmMNaxdPCUr-_eob5z65-hX3Yx"


term = 'Axe Throwing'
location = 'Brooklyn'
SEARCH_LIMIT = 30

url = 'https://api.yelp.com/v3/businesses/search'

# optional parameter in the GET request that allows us to use our API key
headers = {
        'Authorization': 'Bearer {}'.format(MY_API_KEY),
    }

url_params = {
                'term': term,          # requests URL-encodes the values, so no manual '+' is needed
                'location': location,
                'limit': SEARCH_LIMIT
            }
response = requests.get(url, headers=headers, params=url_params)

In [28]:
MY_API_KEY = "0GyN2A4C0TsdKRN_MM0KoSf-mzy5GyJTHVDI0sGlhvuGHRHGlj6tp70l3F-qfyzyM3eKFG9s2fnW5a3Zf9Lxjo4wYL0NaahGkquKWanmMNaxdPCUr-_eob5z65-hX3Yx"


term = 'Bakeries'
location = 'Bronx'
SEARCH_LIMIT = 30

url = 'https://api.yelp.com/v3/businesses/search'

# optional parameter in the GET request that allows us to use our API key
headers = {
        'Authorization': 'Bearer {}'.format(MY_API_KEY),
    }

url_params = {
                'term': term,          # requests URL-encodes the values, so no manual '+' is needed
                'location': location,
                'limit': SEARCH_LIMIT
            }
response = requests.get(url, headers=headers, params=url_params)
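
`requests` URL-encodes the `params` dictionary before sending it. The sketch below uses the standard library's `urlencode` to approximate what the resulting query string looks like; it is an illustration of the encoding, not a trace of what `requests` does internally.

```python
from urllib.parse import urlencode

base_url = "https://api.yelp.com/v3/businesses/search"

# Spaces are encoded automatically
params = {"term": "Axe Throwing", "location": "Brooklyn", "limit": 30}
query = urlencode(params)
print(f"{base_url}?{query}")

# A '+' that is already in the value gets escaped as a literal plus sign
pre_replaced = urlencode({"term": "Axe+Throwing"})
print(pre_replaced)
```

In other words, a search term containing spaces can be passed as-is; replacing the spaces by hand would send a literal `+` character instead.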

In [29]:
response.status_code

200

In [35]:
# examine the response object

print(response)


<Response [200]>


In [36]:
# how are we going to parse the response.text object?

response.text

'{"businesses": [{"id": "b9RW2YCriwuiC5x2Bs-E_Q", "alias": "madonia-brothers-bakery-bronx", "name": "Madonia Brothers Bakery", "image_url": "https://s3-media3.fl.yelpcdn.com/bphoto/xZ74D2ReukvMdIPMQyth7A/o.jpg", "is_closed": false, "url": "https://www.yelp.com/biz/madonia-brothers-bakery-bronx?adjust_creative=pn1anYS2nL_8xD8KiFJOiA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=pn1anYS2nL_8xD8KiFJOiA", "review_count": 266, "categories": [{"alias": "bakeries", "title": "Bakeries"}], "rating": 4.5, "coordinates": {"latitude": 40.85438, "longitude": -73.8884099}, "transactions": [], "price": "$", "location": {"address1": "2348 Arthur Ave", "address2": "", "address3": "", "city": "Bronx", "zip_code": "10458", "country": "US", "state": "NY", "display_address": ["2348 Arthur Ave", "Bronx, NY 10458"]}, "phone": "+17182955573", "display_phone": "(718) 295-5573", "distance": 1437.4485410002578}, {"id": "IKkot7qjdVw2tlSJi-LLrQ", "alias": "contis-pastry-shoppe-bronx", "name

In [37]:
# json.load() takes in a file object containing JSON and returns the matching Python object

# json.loads() takes in a JSON string and returns the matching Python object (a dictionary here)
json.loads(response.text)

{'businesses': [{'id': 'b9RW2YCriwuiC5x2Bs-E_Q',
   'alias': 'madonia-brothers-bakery-bronx',
   'name': 'Madonia Brothers Bakery',
   'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/xZ74D2ReukvMdIPMQyth7A/o.jpg',
   'is_closed': False,
   'url': 'https://www.yelp.com/biz/madonia-brothers-bakery-bronx?adjust_creative=pn1anYS2nL_8xD8KiFJOiA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=pn1anYS2nL_8xD8KiFJOiA',
   'review_count': 266,
   'categories': [{'alias': 'bakeries', 'title': 'Bakeries'}],
   'rating': 4.5,
   'coordinates': {'latitude': 40.85438, 'longitude': -73.8884099},
   'transactions': [],
   'price': '$',
   'location': {'address1': '2348 Arthur Ave',
    'address2': '',
    'address3': '',
    'city': 'Bronx',
    'zip_code': '10458',
    'country': 'US',
    'state': 'NY',
    'display_address': ['2348 Arthur Ave', 'Bronx, NY 10458']},
   'phone': '+17182955573',
   'display_phone': '(718) 295-5573',
   'distance': 1437.4485410002578},
  {'i
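
The difference between `json.load()` and `json.loads()` can be seen on a small example. The string below is a heavily trimmed stand-in for `response.text`, and `io.StringIO` stands in for an open file.

```python
import io
import json

# A trimmed stand-in for response.text
raw_text = '{"businesses": [{"name": "Madonia Brothers Bakery", "rating": 4.5, "is_closed": false}]}'

from_string = json.loads(raw_text)            # parse a str
from_file = json.load(io.StringIO(raw_text))  # parse a file-like object

print(from_string == from_file)
print(type(from_string), from_string["businesses"][0]["is_closed"])
```

Note how JSON's `false` becomes Python's `False`. A `requests` response also has a `.json()` method that performs the same parsing in one step.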

In [44]:
# working with JSON

bakeries = response.text
bakeries = json.loads(bakeries)

In [45]:
type(bakeries)

dict

In [46]:
# cleaning and exploring the data
for key in bakeries.keys():
    print(key)

businesses
total
region


In [49]:
bakeries['businesses'][0]

{'id': 'b9RW2YCriwuiC5x2Bs-E_Q',
 'alias': 'madonia-brothers-bakery-bronx',
 'name': 'Madonia Brothers Bakery',
 'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/xZ74D2ReukvMdIPMQyth7A/o.jpg',
 'is_closed': False,
 'url': 'https://www.yelp.com/biz/madonia-brothers-bakery-bronx?adjust_creative=pn1anYS2nL_8xD8KiFJOiA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=pn1anYS2nL_8xD8KiFJOiA',
 'review_count': 266,
 'categories': [{'alias': 'bakeries', 'title': 'Bakeries'}],
 'rating': 4.5,
 'coordinates': {'latitude': 40.85438, 'longitude': -73.8884099},
 'transactions': [],
 'price': '$',
 'location': {'address1': '2348 Arthur Ave',
  'address2': '',
  'address3': '',
  'city': 'Bronx',
  'zip_code': '10458',
  'country': 'US',
  'state': 'NY',
  'display_address': ['2348 Arthur Ave', 'Bronx, NY 10458']},
 'phone': '+17182955573',
 'display_phone': '(718) 295-5573',
 'distance': 1437.4485410002578}

In [52]:
bakeries['region']

{'center': {'longitude': -73.87138366699219, 'latitude': 40.85220853481013}}
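
Since the parsed response is just nested dictionaries and lists, individual values can be reached by chaining keys and indexes. The sketch below works on a trimmed copy of the first business record shown above; `price` is deliberately left out of the copy to show how `.get()` handles a missing key.

```python
# A trimmed copy of the first business record shown above
business = {
    "name": "Madonia Brothers Bakery",
    "rating": 4.5,
    "categories": [{"alias": "bakeries", "title": "Bakeries"}],
    "coordinates": {"latitude": 40.85438, "longitude": -73.8884099},
    "location": {
        "city": "Bronx",
        "zip_code": "10458",
        "display_address": ["2348 Arthur Ave", "Bronx, NY 10458"],
    },
}

city = business["location"]["city"]                            # dict inside a dict
address = ", ".join(business["location"]["display_address"])   # list inside a dict
category_titles = [c["title"] for c in business["categories"]]  # list of dicts
price = business.get("price", "unknown")  # default instead of a KeyError

print(city, "|", address, "|", category_titles, "|", price)
```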

In [53]:
# load the list of businesses into a DataFrame and explore which columns we need
bakeries_df = pd.DataFrame(bakeries['businesses'])
# alternative: pd.DataFrame.from_dict(bakeries['businesses'])
bakeries_df.head()
# bakeries_df.to_csv('bakeries_info.csv')

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,price,location,phone,display_phone,distance
0,b9RW2YCriwuiC5x2Bs-E_Q,madonia-brothers-bakery-bronx,Madonia Brothers Bakery,https://s3-media3.fl.yelpcdn.com/bphoto/xZ74D2...,False,https://www.yelp.com/biz/madonia-brothers-bake...,266,"[{'alias': 'bakeries', 'title': 'Bakeries'}]",4.5,"{'latitude': 40.85438, 'longitude': -73.8884099}",[],$,"{'address1': '2348 Arthur Ave', 'address2': ''...",17182955573,(718) 295-5573,1437.448541
1,IKkot7qjdVw2tlSJi-LLrQ,contis-pastry-shoppe-bronx,Conti's Pastry Shoppe,https://s3-media3.fl.yelpcdn.com/bphoto/6WmIdx...,False,https://www.yelp.com/biz/contis-pastry-shoppe-...,241,"[{'alias': 'bakeries', 'title': 'Bakeries'}]",4.0,"{'latitude': 40.845635, 'longitude': -73.862791}",[],$$,"{'address1': '786 Morris Park Ave', 'address2'...",17182399339,(718) 239-9339,1025.648619
2,vGFCPgKmDqsldeA7qadZbw,terranova-bakery-bronx,Terranova Bakery,https://s3-media1.fl.yelpcdn.com/bphoto/atuNwg...,False,https://www.yelp.com/biz/terranova-bakery-bron...,44,"[{'alias': 'bakeries', 'title': 'Bakeries'}]",4.5,"{'latitude': 40.8544, 'longitude': -73.88481}",[],$$,"{'address1': '691 E 187th St', 'address2': '',...",17183676985,(718) 367-6985,1154.745342
3,VJp6W2UZR9HKRbPrY2ZCZw,zeppieri-and-sons-bakery-bronx,Zeppieri & Sons Bakery,https://s3-media3.fl.yelpcdn.com/bphoto/cbGPAD...,False,https://www.yelp.com/biz/zeppieri-and-sons-bak...,88,"[{'alias': 'bakeries', 'title': 'Bakeries'}]",4.0,"{'latitude': 40.84692, 'longitude': -73.83177}",[],$,"{'address1': '3004 Buhre Ave', 'address2': '',...",17188299111,(718) 829-9111,3375.348046
4,rqr8waPF6HoIm5MurDziLA,sal-and-doms-pastry-shop-bronx,Sal & Dom's Pastry Shop,https://s3-media1.fl.yelpcdn.com/bphoto/5H0is3...,False,https://www.yelp.com/biz/sal-and-doms-pastry-s...,94,"[{'alias': 'bakeries', 'title': 'Bakeries'}]",4.5,"{'latitude': 40.865162, 'longitude': -73.855157}",[],$$,"{'address1': '1108 Allerton Ave', 'address2': ...",17185153344,(718) 515-3344,1985.172565
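
In the table above, columns such as `categories`, `coordinates`, and `location` still hold nested lists and dictionaries. One possible way to clean this up is to flatten each record before building the DataFrame. The helper `flatten_business` and the choice of fields to keep are assumptions made for this sketch; the second sample record omits `location` on purpose, to show that missing keys are tolerated.

```python
def flatten_business(record):
    """Keep the scalar fields and lift a few nested values to top-level keys."""
    flat = {key: value for key, value in record.items()
            if not isinstance(value, (dict, list))}
    coordinates = record.get("coordinates") or {}
    location = record.get("location") or {}
    flat["latitude"] = coordinates.get("latitude")
    flat["longitude"] = coordinates.get("longitude")
    flat["city"] = location.get("city")
    flat["zip_code"] = location.get("zip_code")
    flat["categories"] = ", ".join(c["title"] for c in record.get("categories", []))
    return flat


# Trimmed copies of the first two records from the response
sample = [
    {"name": "Madonia Brothers Bakery", "review_count": 266, "rating": 4.5,
     "categories": [{"alias": "bakeries", "title": "Bakeries"}],
     "coordinates": {"latitude": 40.85438, "longitude": -73.8884099},
     "location": {"city": "Bronx", "zip_code": "10458"}},
    {"name": "Conti's Pastry Shoppe", "review_count": 241, "rating": 4.0,
     "categories": [{"alias": "bakeries", "title": "Bakeries"}],
     "coordinates": {"latitude": 40.845635, "longitude": -73.862791}},
]

rows = [flatten_business(record) for record in sample]
print(rows[0])
```

The flattened `rows` can be passed straight to `pd.DataFrame(rows)`. pandas also offers `pd.json_normalize`, which may be a more convenient route depending on which columns you need.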


In [17]:
# look into .gitignore to specify which files git should ignore. Keep your API key in a separate file and list that file in .gitignore
# look into geopandas

In [18]:
# let's turn the things we need into a pandas dataframe

In [24]:
# you can do some analysis and visualization from here on!

# visualize the review count - what's the appropriate plot?
plt.hist(bakeries_df['review_count'])
plt.title('Review count of bakeries in the Bronx')
plt.show()

In [None]:
# query the name of the business with the highest review count
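
A minimal sketch of this kind of query in plain Python, using the five bakery rows shown earlier as stand-in data, since no axe-throwing results appear in this notebook. "Highest review" is read here as highest review count; ranking by rating is shown as well.

```python
# name, review_count and rating copied from the five rows shown earlier
businesses = [
    {"name": "Madonia Brothers Bakery", "review_count": 266, "rating": 4.5},
    {"name": "Conti's Pastry Shoppe", "review_count": 241, "rating": 4.0},
    {"name": "Terranova Bakery", "review_count": 44, "rating": 4.5},
    {"name": "Zeppieri & Sons Bakery", "review_count": 88, "rating": 4.0},
    {"name": "Sal & Dom's Pastry Shop", "review_count": 94, "rating": 4.5},
]

# Business with the most reviews
most_reviewed = max(businesses, key=lambda b: b["review_count"])

# Rank by rating, breaking ties by review count (both descending)
top_rated = sorted(businesses, key=lambda b: (-b["rating"], -b["review_count"]))

print(most_reviewed["name"])
print([b["name"] for b in top_rated[:3]])
```

With a DataFrame holding the same columns, something like `df.loc[df['review_count'].idxmax(), 'name']` should give the same answer.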


In [None]:
# migrate the cleaned data into a sql db
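
One way to approach the migration is with the standard library's `sqlite3` module. The sketch below uses an in-memory database and the five bakery rows shown earlier; the table name `businesses` and the choice of columns are assumptions made for this example.

```python
import sqlite3

# id, name, review_count, rating, price copied from the five rows shown earlier
rows = [
    ("b9RW2YCriwuiC5x2Bs-E_Q", "Madonia Brothers Bakery", 266, 4.5, "$"),
    ("IKkot7qjdVw2tlSJi-LLrQ", "Conti's Pastry Shoppe", 241, 4.0, "$$"),
    ("vGFCPgKmDqsldeA7qadZbw", "Terranova Bakery", 44, 4.5, "$$"),
    ("VJp6W2UZR9HKRbPrY2ZCZw", "Zeppieri & Sons Bakery", 88, 4.0, "$"),
    ("rqr8waPF6HoIm5MurDziLA", "Sal & Dom's Pastry Shop", 94, 4.5, "$$"),
]

conn = sqlite3.connect(":memory:")  # use a file path instead to persist the data
conn.execute("""
    CREATE TABLE businesses (
        id TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        review_count INTEGER,
        rating REAL,
        price TEXT
    )
""")
# Placeholders (?) let sqlite3 handle quoting, e.g. the apostrophe in "Conti's"
conn.executemany("INSERT INTO businesses VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

# Business with the most reviews
top = conn.execute(
    "SELECT name, review_count FROM businesses ORDER BY review_count DESC LIMIT 1"
).fetchone()

# Number of businesses and average rating per price level
by_price = conn.execute(
    "SELECT price, COUNT(*), AVG(rating) FROM businesses GROUP BY price ORDER BY price"
).fetchall()
conn.close()

print(top)
print(by_price)
```

If the cleaned data already lives in a DataFrame with only scalar columns, `DataFrame.to_sql` can write it to the same kind of connection; nested columns would need to be flattened first.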

In [None]:
# can you do some other queries using sql/pandas?

#### Resources
- [Getting Data from Reddit API](https://www.storybench.org/how-to-scrape-reddit-with-python/)
- [Twitch API](https://dev.twitch.tv/docs)