In [1]:
import matplotlib.pyplot as plt  # needed for the plt.show() calls below
import pandas as pd
import requests

### JavaScript Object Notation (JSON)

1.Common web data format

2.`Not tabular`
  (a).Records don't have to all have the same set of attributes

3.Data organized into `collections` of objects

4.Objects are collections of `attribute-value` pairs

5.`Nested JSON`: objects within objects
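To make these points concrete, here is a minimal sketch using Python's built-in `json` module. The two records and all of their values are made up for illustration; the second record deliberately has a different set of attributes than the first, and `coordinates` holds a nested object.

```python
import json

# Hypothetical records (not real data): the second one lacks "price"
# and adds "phone"; "coordinates" is a nested object in both.
raw = """
[
  {"name": "Cafe A", "price": "$$",
   "coordinates": {"latitude": 40.70, "longitude": -73.99}},
  {"name": "Cafe B", "phone": "+15550100",
   "coordinates": {"latitude": 40.72, "longitude": -73.98}}
]
"""

records = json.loads(raw)

# Each JSON object becomes a dict; attribute sets can differ between records.
attribute_sets = [sorted(record) for record in records]

# Nested values are reached by chaining keys.
first_latitude = records[0]["coordinates"]["latitude"]

print(attribute_sets)
print(first_latitude)
```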

### Reading JSON Data

`read_json()` :

1.Takes a `string path` to JSON _or_ JSON data as a string

2.Specify data types with `dtype` keyword argument

3.`orient` keyword argument to `flag` uncommon JSON data layouts

### Data Orientation

JSON data isn't tabular, so there are several ways to arrange the same records:

1.`Record Orientation` : Most common JSON arrangement

2.`Column Orientation` : More space-efficient than record-oriented JSON
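As a rough sketch of the difference, the same two records are written below in both orientations and loaded with `read_json()`. The values are invented for illustration. In the record-oriented version every object repeats the attribute names; in the column-oriented version each attribute name appears once and maps row labels to values.

```python
import io
import json

import pandas as pd

# Made-up values, shown in two orientations.
record_oriented = [
    {"name": "Cafe A", "review_count": 69},
    {"name": "Cafe B", "review_count": 128},
]
column_oriented = {
    "name": {"0": "Cafe A", "1": "Cafe B"},
    "review_count": {"0": 69, "1": 128},
}

# Wrapping the JSON text in StringIO lets read_json treat it like a file.
df_records = pd.read_json(io.StringIO(json.dumps(record_oriented)),
                          orient="records")
df_columns = pd.read_json(io.StringIO(json.dumps(column_oriented)),
                          orient="columns")

print(df_records)
print(df_columns)
```

Both calls should produce a data frame with the same two columns and two rows.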

### Load JSON data

Many open data portals make available JSON datasets that are particularly easy to parse. They can be accessed directly via `URL`. In these datasets, each `object` is a `record`, all objects have the same set of attributes, and none of the values are nested objects that themselves need to be parsed.

JSON isn't a tabular format, so pandas makes assumptions about its orientation when loading data. Most JSON data you encounter will be in orientations that pandas can automatically transform into a data frame.

Sometimes, like in this `modified` version of the Department of Homeless Services Daily Report, data is `oriented differently`. To reduce the file size, it has been `split` formatted. You'll see what happens when you try to load it normally versus with the orient keyword argument. The `try/except` block will alert you if there are errors loading the data.
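For reference, a split-oriented JSON stores the column names, the index, and the row values as three separate lists. The sketch below builds a tiny example by hand (the column names match the report, but the two rows are made-up numbers) and loads it with `orient="split"`.

```python
import io
import json

import pandas as pd

# Hand-built split-oriented payload; the row values are invented.
split_payload = {
    "columns": ["date_of_census", "total_individuals_in_shelter"],
    "index": [0, 1],
    "data": [["2013-08-21", 100], ["2013-08-22", 105]],
}
split_json = json.dumps(split_payload)

# Tell pandas how the JSON is laid out.
df_split = pd.read_json(io.StringIO(split_json), orient="split")

print(df_split)
```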

In [None]:
### Try loading dhs_report_reformatted.json without any keyword arguments.

try:
    # Load the JSON without keyword arguments
    df = pd.read_json("dhs_report_reformatted.json")
    
    # Plot total population in shelters over time
    df["date_of_census"] = pd.to_datetime(df["date_of_census"])
    df.plot(x="date_of_census", 
            y="total_individuals_in_shelter")
    plt.show()
    
except ValueError:
    print("pandas could not parse the JSON.")

In [None]:
### Load dhs_report_reformatted.json to a data frame with orient specified.

try:
    # Load the JSON with orient specified
    df = pd.read_json("dhs_report_reformatted.json",
                      orient = "split")
    
    # Plot total population in shelters over time
    df["date_of_census"] = pd.to_datetime(df["date_of_census"])
    df.plot(x="date_of_census", 
            y="total_individuals_in_shelter")
    plt.show()
    
except ValueError:
    print("pandas could not parse the JSON.")

### Application Programming Interfaces(APIs)

1.Defines how an application `communicates` with other programs.

2.Way to get `data` from an `application` without knowing database details.

### Requests

1.Send and get data from websites

2.Not tied to a particular API

3.`requests.get()` to get data from a `URL`

### requests.get()

1.`requests.get(url_string)` to get data from a URL

2.Keyword arguments:
  
  a.`params keyword`: takes a `dictionary` of `parameters and values` to customize API request
  
  b.`headers keyword`: takes a `dictionary`, can be used to provide user `authentication` to API

3.Result: a `response` object, containing `data and metadata`
  
  a.`response.json()` will return just the `JSON` data
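To see what `params` and `headers` amount to at the HTTP level, here is a sketch that builds (but never sends) an equivalent request with the standard library's `urllib` instead of `requests`. The URL and the key `MY_API_KEY` are placeholders, not real values.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint and key, used only for illustration.
base_url = "https://api.example.com/v1/search"
params = {"term": "cafe", "location": "NYC"}
headers = {"Authorization": "Bearer {}".format("MY_API_KEY")}

# params end up in the URL's query string...
query_string = urlencode(params)

# ...while headers travel separately from the URL.
# The Request object is only constructed here; nothing is sent.
request = Request(base_url + "?" + query_string, headers=headers)

print(request.full_url)
print(request.get_header("Authorization"))
```

`requests.get(url, params=..., headers=...)` takes care of this assembly for you and then actually sends the request.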
  
### response.json() and pandas

1.`response.json()` returns a `dictionary`

2.`read_json()` expects `strings`, not `dictionaries`, so passing it the result of `response.json()` will give an `error`. Load the response JSON to a `data frame` with `pd.DataFrame()` instead.
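A small sketch of that pattern follows. The dictionary `data` stands in for what `response.json()` might return; its keys and values are invented so the example runs without calling an API.

```python
import pandas as pd

# Stand-in for response.json(): an already-parsed dictionary (made-up values).
data = {
    "businesses": [
        {"name": "Cafe A", "rating": 4.5},
        {"name": "Cafe B", "rating": 5.0},
    ],
    "total": 2,
}

# The records live under one key, so pass that list to pd.DataFrame().
cafes_example = pd.DataFrame(data["businesses"])

print(cafes_example)
```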



### Ex 1: Get data from an API

In this exercise, you'll use `requests.get()` to query the `Yelp Business Search API` for `cafes` in `New York City`. requests.get() needs a URL to get data from. 

`Formatting` parameters to get the data you need is an integral part of working with APIs. These parameters can be passed to the `get()` function's `params` keyword argument as a `dictionary`.

Many `APIs` require users to provide an `API key`, obtained by `registering` for the service. `Keys` are typically passed in the request `header`, rather than as parameters.

The Yelp API also needs search parameters and authorization headers passed to the params and headers keyword arguments, respectively.

The Yelp API requires the `location` parameter be set. It also lets users supply a term to search for. You'll use these parameters to get data about `cafes` in NYC, then process the result to create a data frame.

The Yelp API documentation says "To authenticate API calls with the API Key, set the Authorization HTTP `header` value as Bearer API_KEY." The key is stored in the variable `api_key`.

You'll need to extract the data from the response with its `json()` method, and pass it to pandas's `DataFrame()` function to make a data frame. Note that the necessary data is under the dictionary key `"businesses"`.

In this exercise--

1.Copy the `URL` of Yelp Business Search API and store it in a variable `api_url`

2.Create a `dictionary`, `parameters`, with the `term` and `location` parameters set to search for `"cafe"`s in `"NYC"`.

3.Create a `dictionary`, `headers`, that passes the formatted key string to the `"Authorization"` header value.

4.`Query` the Yelp API `api_url` with `requests`'s `get()` function and the headers and params keyword arguments set. Save the result as `response`. 

5.`Extract` the JSON data from response with the appropriate method. Save the result as `data`.

6.Load the cafe listings to the data frame `cafes` with pandas's DataFrame() function. The listings are under the `"businesses"` key in data.

7.Print the data frame's `dtypes` to see what information you're getting.

8.Print the head of the data frame `cafes`.

9.Print the `name` column of the data frame `cafes`.

In [28]:
# Go to the URL of Yelp Business Search API -- https://www.yelp.com/developers/documentation/v3/business_search

#  Get the API URL from the Request --- https://api.yelp.com/v3/businesses/search

api_url = "https://api.yelp.com/v3/businesses/search"

In [30]:
# Read the key from an environment variable rather than hard-coding it in the notebook.
# YELP_API_KEY is an assumed variable name; set it to your own key before running.
import os
api_key = os.environ["YELP_API_KEY"]

In [31]:
## Create a dictionary, parameters, with the `term` and `location` parameters set to search for "cafe"s in "NYC"

parameters = {"term" : "cafe" , "location" : 'NYC'}

In [32]:
# Create a dictionary, headers, that passes the formatted key string to the "Authorization" header value. The key is api_key

headers = {"Authorization": "Bearer {}".format(api_key)}

In [33]:
# Query the Yelp API api_url with requests's get() function and the headers and params keyword arguments set. 
# Save the result as response.

response = requests.get(api_url, params = parameters, headers = headers)

In [35]:
## Extract the JSON data from response with the appropriate method. Save the result as data.

data = response.json()
data.keys()

dict_keys(['businesses', 'total', 'region'])

In [52]:
## Load the cafe listings to the data frame cafes with pandas's DataFrame() function. 
## The listings are under the "businesses" key in data.

cafes = pd.DataFrame(data["businesses"])

# Print the first rows of cafes.
cafes.head()

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,coordinates,transactions,price,location,phone,display_phone,distance
0,vijwGDNrPBJHEG7_DsjZNw,usagi-ny-dumbo-7,Usagi NY,https://s3-media1.fl.yelpcdn.com/bphoto/6yNqnk...,False,https://www.yelp.com/biz/usagi-ny-dumbo-7?adju...,69,"[{'alias': 'bookstores', 'title': 'Bookstores'...",4.5,"{'latitude': 40.70383, 'longitude': -73.98691}","[delivery, pickup]",$$,"{'address1': '163 Plymouth St', 'address2': ''...",17188018037.0,(718) 801-8037,635.781863
1,pimuUR-TEHIjUla3S3jemQ,coffee-project-new-york-east-village-new-york,Coffee Project New York | East Village,https://s3-media4.fl.yelpcdn.com/bphoto/oTalWA...,False,https://www.yelp.com/biz/coffee-project-new-yo...,695,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.5,"{'latitude': 40.72699, 'longitude': -73.98922}",[delivery],$,"{'address1': '239 E 5th St', 'address2': None,...",12122287888.0,(212) 228-7888,2438.032688
2,KWC_KYP336tSFhAihd8plg,bluestone-lane-dumbo-café-brooklyn,Bluestone Lane DUMBO Café,https://s3-media1.fl.yelpcdn.com/bphoto/AjXvAC...,False,https://www.yelp.com/biz/bluestone-lane-dumbo-...,201,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.0,"{'latitude': 40.70077577888819, 'longitude': -...","[delivery, pickup]",$,"{'address1': '55 Prospect St', 'address2': '',...",17183746858.0,(718) 374-6858,720.691039
3,kpxXi23lUQkeJQH-2BtzDw,qahwah-house-brooklyn,Qahwah House,https://s3-media3.fl.yelpcdn.com/bphoto/hGHi0N...,False,https://www.yelp.com/biz/qahwah-house-brooklyn...,128,"[{'alias': 'cafes', 'title': 'Cafes'}, {'alias...",5.0,"{'latitude': 40.7185628184474, 'longitude': -7...","[delivery, pickup]",$,"{'address1': '162 Bedford Ave', 'address2': ''...",,,3455.274392
4,QgJE4Jfzk7nzmRF_W_6ASA,espresso-bar-brooklyn,Espresso Bar,https://s3-media3.fl.yelpcdn.com/bphoto/LeLZYK...,False,https://www.yelp.com/biz/espresso-bar-brooklyn...,2,"[{'alias': 'cafes', 'title': 'Cafes'}]",5.0,"{'latitude': 40.653054, 'longitude': -73.975948}",[],,"{'address1': '1233 Prospect Ave', 'address2': ...",,,6021.998285


In [26]:
## Print the data frame's dtypes to see what information you're getting.

cafes.dtypes

id                object
alias             object
name              object
image_url         object
is_closed           bool
url               object
review_count       int64
categories        object
rating           float64
coordinates       object
transactions      object
price             object
location          object
phone             object
display_phone     object
distance         float64
dtype: object

In [27]:
## Print the name column of the data frame cafes.

cafes["name"]

0                                   Usagi NY
1     Coffee Project New York | East Village
2                  Bluestone Lane DUMBO Café
3                               Qahwah House
4                               Espresso Bar
5                                  % Arabica
6                                     Butler
7                           Good Thanks Cafe
8                                     Banter
9                             Urban Backyard
10                                  Devocion
11                      Remi Flower & Coffee
12                       Now or Never Coffee
13                              Sweet Moment
14                              Bibble & Sip
15                                      ACRE
16                            AMERICANO CAFE
17                          Hole in the Wall
18                                     Maman
19                          FEED Shop & Cafe
Name: name, dtype: object

### Working with nested JSONs

JSONs contain `objects` with `attribute-value` pairs. A JSON is nested when the `value` itself is an `object`.

### Flatten nested JSONs

A feature of JSON data is that it can be `nested`: that means in this case, an `attribute`'s `value` can consist of `attribute-value` pairs. This nested data is more useful `unpacked, or flattened`, into its own `data frame` columns. The `json_normalize()` function does exactly this. It originally lived in the `pandas.io.json` submodule; since pandas 1.0 it is also available at the top level as `pandas.json_normalize()`, which is the import used in the exercises here.

`pandas.io.json`: 

1.`pandas.io.json` submodule has tools for `reading and writing` JSON. It needs its own `import` statement.

2.`json_normalize()`:
    
  a.Takes a `dictionary/list` of dictionaries (like pd.DataFrame() does)
    
  b.Returns a `flattened` dataframe
    
  c.Default flattened column name pattern: `attribute.nestedattribute`
    
  d.Choose a different separator with the `sep` argument
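Here is a minimal sketch of the default naming pattern and the effect of `sep`. The two records are invented for illustration and only mimic the shape of the Yelp data.

```python
import pandas as pd

# Made-up records with one level of nesting under "coordinates".
businesses = [
    {"name": "Cafe A",
     "coordinates": {"latitude": 40.70, "longitude": -73.99}},
    {"name": "Cafe B",
     "coordinates": {"latitude": 40.72, "longitude": -73.98}},
]

# Default separator: nested attributes become "coordinates.latitude", etc.
default_names = pd.json_normalize(businesses)

# Custom separator: the same columns become "coordinates_latitude", etc.
underscore_names = pd.json_normalize(businesses, sep="_")

print(list(default_names.columns))
print(list(underscore_names.columns))
```

Underscore-separated names are often easier to work with, since a period in a column name rules out attribute-style access such as `df.coordinates_latitude`.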
  

### Ex 2: 

The Yelp API `response` data is nested. Your job is to `flatten` out the next level of data in the `coordinates` and `location` columns.

1.Load the `json_normalize()` function (from `pandas` directly in pandas 1.0 and later, or from the `pandas.io.json` submodule in older versions).

2.Isolate the JSON data from `response` and assign it to `data`

3.Use `json_normalize()` to flatten and load the `businesses` data to a data frame, `cafes`. Set the `sep` argument to use underscores `(_)`, rather than periods.

In [41]:
# Load the json_normalize() function (top-level in pandas 1.0+; formerly in pandas.io.json).
from pandas import json_normalize

In [42]:
# Isolate the JSON data from response and assign it to data

data = response.json()

In [46]:
# Use json_normalize() to flatten and load the businesses data to a data frame, cafes. 
# Set the sep argument to use underscores (_), rather than periods.

cafes = json_normalize(data["businesses"], sep = "_")
#cafes

{'id': 'vijwGDNrPBJHEG7_DsjZNw',
 'alias': 'usagi-ny-dumbo-7',
 'name': 'Usagi NY',
 'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/6yNqnkPDJAZXpMemwCYN3w/o.jpg',
 'is_closed': False,
 'url': 'https://www.yelp.com/biz/usagi-ny-dumbo-7?adjust_creative=kCAx88XpjKpjxe9YMRbcwg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=kCAx88XpjKpjxe9YMRbcwg',
 'review_count': 69,
 'categories': [{'alias': 'bookstores', 'title': 'Bookstores'},
  {'alias': 'cafes', 'title': 'Cafes'},
  {'alias': 'homedecor', 'title': 'Home Decor'}],
 'rating': 4.5,
 'coordinates': {'latitude': 40.70383, 'longitude': -73.98691},
 'transactions': ['delivery', 'pickup'],
 'price': '$$',
 'location': {'address1': '163 Plymouth St',
  'address2': '',
  'address3': '',
  'city': 'Dumbo',
  'zip_code': '11201',
  'country': 'US',
  'state': 'NY',
  'display_address': ['163 Plymouth St', 'Dumbo, NY 11201']},
 'phone': '+17188018037',
 'display_phone': '(718) 801-8037',
 'distance': 635.781863152762}

### Deeply Nested Data

`json_normalize()`: 

 a.`record_path`: string/list of string attributes to nested data
 
 b.`meta`: list of other attributes to load to data frame
 
 c.`meta_prefix`: string to prefix to meta column names
 
Last exercise, you flattened data nested down one level. Here, you'll unpack more deeply nested data. The `categories` attribute in the `Yelp API response` contains `lists` of objects. To flatten this data, you'll employ `json_normalize()` arguments to specify the `path` to `categories` and pick other `attributes` to `include` in the data frame. You should also change the `separator` to facilitate column selection and `prefix` the other attributes to `prevent` column name `collisions`. 
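To illustrate why a prefix matters, the sketch below uses two invented records in which both the business and each of its categories have an `alias` attribute. Without `meta_prefix`, pandas refuses to build the frame because the meta column would clash with a record column; with a prefix, both survive.

```python
import pandas as pd

# Made-up records: "alias" exists at the business level and inside categories.
businesses = [
    {"name": "Cafe A", "alias": "cafe-a",
     "categories": [{"alias": "cafes", "title": "Cafes"},
                    {"alias": "coffee", "title": "Coffee & Tea"}]},
    {"name": "Cafe B", "alias": "cafe-b",
     "categories": [{"alias": "bakeries", "title": "Bakeries"}]},
]

# Without a prefix, the meta column "alias" collides with the category "alias".
try:
    pd.json_normalize(businesses, record_path="categories",
                      meta=["name", "alias"])
    collision_error = None
except ValueError as error:
    collision_error = str(error)

# With a prefix, business-level attributes get distinct column names.
flat = pd.json_normalize(businesses, record_path="categories",
                         meta=["name", "alias"], meta_prefix="cafe_")

print(collision_error)
print(flat)
```

The result has one row per category, with the business-level values repeated on each of that business's rows.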

### Ex 3:

Use `json_normalize()`:

1.to flatten records under the `businesses` key in data, setting `underscores (_)` as separators, 

2.Specify the `record_path` to the `categories` data, 

3.Set the `meta` keyword argument to get business `name`, `alias`, `rating`, and the `attributes` nested under `coordinates: latitude and longitude`, 

4.Add `"cafe_"` as a `meta_prefix` to prevent `duplicate` column names.


In [66]:
data["businesses"][0]

{'id': 'vijwGDNrPBJHEG7_DsjZNw',
 'alias': 'usagi-ny-dumbo-7',
 'name': 'Usagi NY',
 'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/6yNqnkPDJAZXpMemwCYN3w/o.jpg',
 'is_closed': False,
 'url': 'https://www.yelp.com/biz/usagi-ny-dumbo-7?adjust_creative=kCAx88XpjKpjxe9YMRbcwg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=kCAx88XpjKpjxe9YMRbcwg',
 'review_count': 69,
 'categories': [{'alias': 'bookstores', 'title': 'Bookstores'},
  {'alias': 'cafes', 'title': 'Cafes'},
  {'alias': 'homedecor', 'title': 'Home Decor'}],
 'rating': 4.5,
 'coordinates': {'latitude': 40.70383, 'longitude': -73.98691},
 'transactions': ['delivery', 'pickup'],
 'price': '$$',
 'location': {'address1': '163 Plymouth St',
  'address2': '',
  'address3': '',
  'city': 'Dumbo',
  'zip_code': '11201',
  'country': 'US',
  'state': 'NY',
  'display_address': ['163 Plymouth St', 'Dumbo, NY 11201']},
 'phone': '+17188018037',
 'display_phone': '(718) 801-8037',
 'distance': 635.781863152762}

In [70]:
# Flatten the categories lists, carrying business-level attributes along as cafe_* columns
flat_cafes = json_normalize(data["businesses"],
                            sep="_",
                            record_path="categories",
                            meta=["name", "alias", "rating",
                                  ["coordinates", "latitude"],
                                  ["coordinates", "longitude"]],
                            meta_prefix="cafe_")

In [71]:
flat_cafes.head()

Unnamed: 0,alias,title,cafe_name,cafe_alias,cafe_rating,cafe_coordinates_latitude,cafe_coordinates_longitude
0,bookstores,Bookstores,Usagi NY,usagi-ny-dumbo-7,4.5,40.70383,-73.98691
1,cafes,Cafes,Usagi NY,usagi-ny-dumbo-7,4.5,40.70383,-73.98691
2,homedecor,Home Decor,Usagi NY,usagi-ny-dumbo-7,4.5,40.70383,-73.98691
3,coffee,Coffee & Tea,Coffee Project New York | East Village,coffee-project-new-york-east-village-new-york,4.5,40.72699,-73.98922
4,sandwiches,Sandwiches,Coffee Project New York | East Village,coffee-project-new-york-east-village-new-york,4.5,40.72699,-73.98922


### Appending two dataframes

In this exercise, you’ll practice appending records by creating a dataset of the 100 highest-rated cafes in New York City according to Yelp.

APIs often limit the amount of data returned, since sending large datasets can be time- and resource-intensive. The Yelp Business Search API limits the results returned in a call to 50 records. However, the offset parameter lets a user retrieve results starting after a specified number. By modifying the offset, we can get results 1-50 in one call and 51-100 in another. Then, we can append the data frames.

Append the results of the second API call, `next_50_cafes`, to `first_50_cafes`, setting `ignore_index` so rows will be renumbered.
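The paging idea can be sketched without touching the network. In the example below, `fetch_page()` and `ALL_LISTINGS` are hypothetical stand-ins for the API: the function simply slices a made-up list the way `limit` and `offset` would. Stacking the pages with `pd.concat()` and `ignore_index=True` is one way to get a single, renumbered data frame.

```python
import pandas as pd

# Hypothetical stand-in for the API's full result set (made-up names).
ALL_LISTINGS = [{"name": "Cafe {}".format(i)} for i in range(120)]


def fetch_page(params):
    """Mimic an API call: return `limit` records starting at `offset`."""
    start = params.get("offset", 0)
    return ALL_LISTINGS[start:start + params["limit"]]


pages = []
for offset in (0, 50):
    params = {"term": "cafe", "location": "NYC",
              "limit": 50, "offset": offset}
    pages.append(pd.DataFrame(fetch_page(params)))

# Stack the pages; ignore_index renumbers rows 0..99 instead of 0..49 twice.
all_cafes = pd.concat(pages, ignore_index=True)

print(all_cafes.shape)
```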

In [75]:
params = {"term": "cafe", 
          "location": "NYC",
          "sort_by": "rating", 
          "limit": 50,
          }

In [76]:
result = requests.get(api_url, headers=headers, params=params)
first_50_cafes = json_normalize(result.json()["businesses"])

In [83]:
first_50_cafes

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,transactions,...,coordinates.latitude,coordinates.longitude,location.address1,location.address2,location.address3,location.city,location.zip_code,location.country,location.state,location.display_address
0,vijwGDNrPBJHEG7_DsjZNw,usagi-ny-dumbo-7,Usagi NY,https://s3-media1.fl.yelpcdn.com/bphoto/6yNqnk...,False,https://www.yelp.com/biz/usagi-ny-dumbo-7?adju...,69,"[{'alias': 'bookstores', 'title': 'Bookstores'...",4.5,"[delivery, pickup]",...,40.70383,-73.98691,163 Plymouth St,,,Dumbo,11201,US,NY,"[163 Plymouth St, Dumbo, NY 11201]"
1,kpxXi23lUQkeJQH-2BtzDw,qahwah-house-brooklyn,Qahwah House,https://s3-media3.fl.yelpcdn.com/bphoto/hGHi0N...,False,https://www.yelp.com/biz/qahwah-house-brooklyn...,128,"[{'alias': 'cafes', 'title': 'Cafes'}, {'alias...",5.0,"[delivery, pickup]",...,40.718563,-73.95713,162 Bedford Ave,,,Brooklyn,11249,US,NY,"[162 Bedford Ave, Brooklyn, NY 11249]"
2,pimuUR-TEHIjUla3S3jemQ,coffee-project-new-york-east-village-new-york,Coffee Project New York | East Village,https://s3-media4.fl.yelpcdn.com/bphoto/oTalWA...,False,https://www.yelp.com/biz/coffee-project-new-yo...,695,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.5,[delivery],...,40.72699,-73.98922,239 E 5th St,,,New York,10003,US,NY,"[239 E 5th St, New York, NY 10003]"
3,KWC_KYP336tSFhAihd8plg,bluestone-lane-dumbo-café-brooklyn,Bluestone Lane DUMBO Café,https://s3-media1.fl.yelpcdn.com/bphoto/AjXvAC...,False,https://www.yelp.com/biz/bluestone-lane-dumbo-...,201,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.0,"[delivery, pickup]",...,40.700776,-73.988364,55 Prospect St,,,Brooklyn,11201,US,NY,"[55 Prospect St, Brooklyn, NY 11201]"
4,QgJE4Jfzk7nzmRF_W_6ASA,espresso-bar-brooklyn,Espresso Bar,https://s3-media3.fl.yelpcdn.com/bphoto/LeLZYK...,False,https://www.yelp.com/biz/espresso-bar-brooklyn...,2,"[{'alias': 'cafes', 'title': 'Cafes'}]",5.0,[],...,40.653054,-73.975948,1233 Prospect Ave,,,Brooklyn,11218,US,NY,"[1233 Prospect Ave, Brooklyn, NY 11218]"
5,ED7A7vDdg8yLNKJTSVHHmg,arabica-brooklyn,% Arabica,https://s3-media4.fl.yelpcdn.com/bphoto/_3VSEF...,False,https://www.yelp.com/biz/arabica-brooklyn?adju...,72,"[{'alias': 'coffee', 'title': 'Coffee & Tea'}]",4.0,[],...,40.702471,-73.99426,20 Old Fulton St,,,Brooklyn,11201,US,NY,"[20 Old Fulton St, Brooklyn, NY 11201]"
6,-2UtjTxrt1Xzd-HPsLJ7mA,butler-brooklyn-2,Butler,https://s3-media1.fl.yelpcdn.com/bphoto/CU_zfW...,False,https://www.yelp.com/biz/butler-brooklyn-2?adj...,132,"[{'alias': 'bakeries', 'title': 'Bakeries'}, {...",4.5,"[delivery, pickup]",...,40.703267,-73.992242,40 Water St,,,Brooklyn,11201,US,NY,"[40 Water St, Brooklyn, NY 11201]"
7,vZevXSC1w27dEvXc2EHCEg,good-thanks-cafe-new-york,Good Thanks Cafe,https://s3-media4.fl.yelpcdn.com/bphoto/bwiCKL...,False,https://www.yelp.com/biz/good-thanks-cafe-new-...,255,"[{'alias': 'cafes', 'title': 'Cafes'}, {'alias...",4.5,"[delivery, pickup]",...,40.71972,-73.98952,131A Orchard St,,,New York,10002,US,NY,"[131A Orchard St, New York, NY 10002]"
8,Zc7Jbuwe3XO-EaVXIh_TYQ,banter-new-york,Banter,https://s3-media2.fl.yelpcdn.com/bphoto/XCjCFr...,False,https://www.yelp.com/biz/banter-new-york?adjus...,510,"[{'alias': 'breakfast_brunch', 'title': 'Break...",4.5,"[delivery, pickup]",...,40.72787,-74.0011,169 Sullivan St,,,New York,10012,US,NY,"[169 Sullivan St, New York, NY 10012]"
9,HUlbrPvAr6sXuBfp5z1MWA,urban-backyard-new-york,Urban Backyard,https://s3-media2.fl.yelpcdn.com/bphoto/bB8vr7...,False,https://www.yelp.com/biz/urban-backyard-new-yo...,214,"[{'alias': 'coffee', 'title': 'Coffee & Tea'}]",4.5,[delivery],...,40.72077,-73.99646,180 Mulberry St,,,New York,10012,US,NY,"[180 Mulberry St, New York, NY 10012]"


In [94]:
params2 = {"term": "cafe", 
          "location": "NYC",
          "sort_by": "rating", 
          "limit": 50,
          "offset": 50}

In [95]:
result = requests.get(api_url, headers=headers, params=params2)
next_50_cafes = json_normalize(result.json()["businesses"])

In [96]:
next_50_cafes

Unnamed: 0,id,alias,name,image_url,is_closed,url,review_count,categories,rating,transactions,...,coordinates.longitude,location.address1,location.address2,location.address3,location.city,location.zip_code,location.country,location.state,location.display_address,price
0,VLC2DROxvGX-ka_aBmJN9w,social-house-cafe-brooklyn,Social House Cafe,https://s3-media1.fl.yelpcdn.com/bphoto/XHkG8R...,False,https://www.yelp.com/biz/social-house-cafe-bro...,9,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",5.0,"[delivery, pickup]",...,-73.966014,60 Broadway,,,Brooklyn,11249,US,NY,"[60 Broadway, Brooklyn, NY 11249]",
1,XEdGYktN7VVuyjkN5ukr4A,bright-side-brooklyn-3,Bright Side,https://s3-media2.fl.yelpcdn.com/bphoto/cSx5Lj...,False,https://www.yelp.com/biz/bright-side-brooklyn-...,27,"[{'alias': 'cafes', 'title': 'Cafes'}]",4.5,"[delivery, pickup]",...,-73.964017,184 Kent Ave,,,Brooklyn,11249,US,NY,"[184 Kent Ave, Brooklyn, NY 11249]",
2,jniZaiTDALOVQfNkg_S9gg,787-coffee-new-york-5,787 Coffee,https://s3-media3.fl.yelpcdn.com/bphoto/RA6g4b...,False,https://www.yelp.com/biz/787-coffee-new-york-5...,44,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",5.0,"[delivery, pickup]",...,-74.01083,66 Pearl St,,,New York,10004,US,NY,"[66 Pearl St, New York, NY 10004]",
3,blX8Y6O0J4p6fXVcTRmANg,the-elk-new-york,The Elk,https://s3-media3.fl.yelpcdn.com/bphoto/Dv6TT4...,False,https://www.yelp.com/biz/the-elk-new-york?adju...,165,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.0,[delivery],...,-74.007479,128 Charles St,,,New York,10014,US,NY,"[128 Charles St, New York, NY 10014]",$$
4,NJay-xUEFsvWzdeohs9EEA,the-coppola-cafe-new-york,The Coppola Cafe,https://s3-media2.fl.yelpcdn.com/bphoto/gX4aHY...,False,https://www.yelp.com/biz/the-coppola-cafe-new-...,46,"[{'alias': 'cafes', 'title': 'Cafes'}]",4.5,"[delivery, pickup]",...,-74.001612,171 West 4th St,,,New York,10014,US,NY,"[171 West 4th St, New York, NY 10014]",$
5,nI9a2XEe0kM6sEBZr-gxXg,three-owls-market-new-york,Three Owls Market,https://s3-media3.fl.yelpcdn.com/bphoto/lkgnYE...,False,https://www.yelp.com/biz/three-owls-market-new...,35,"[{'alias': 'cafes', 'title': 'Cafes'}, {'alias...",4.5,"[delivery, pickup]",...,-74.0082,800 Washington St,,,New York,10014,US,NY,"[800 Washington St, New York, NY 10014]",$$
6,u0bYCPlbJ85_lDjnZQ_e3w,kaigo-coffee-room-new-york-4,Kaigo Coffee Room,https://s3-media2.fl.yelpcdn.com/bphoto/fpRbgK...,False,https://www.yelp.com/biz/kaigo-coffee-room-new...,159,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",4.5,[delivery],...,-74.000473,120C Lafayette St,,,New York,10013,US,NY,"[120C Lafayette St, New York, NY 10013]",$$
7,1fDtQij01Mk92ZTGn6WHjg,the-little-sweet-cafe-boerum-hill-2,The Little Sweet Cafe,https://s3-media2.fl.yelpcdn.com/bphoto/Pxtbjo...,False,https://www.yelp.com/biz/the-little-sweet-cafe...,155,"[{'alias': 'coffee', 'title': 'Coffee & Tea'}]",4.5,[delivery],...,-73.986732,77B Hoyt St,,,Boerum Hill,11201,US,NY,"[77B Hoyt St, Boerum Hill, NY 11201]",$$
8,xTnivXEdEtXrnhp-z4JGvg,la-parisienne-new-york-5,La Parisienne,https://s3-media2.fl.yelpcdn.com/bphoto/ywbWsB...,False,https://www.yelp.com/biz/la-parisienne-new-yor...,618,"[{'alias': 'cafes', 'title': 'Cafes'}, {'alias...",4.5,"[delivery, pickup]",...,-74.00938,9 Maiden Ln,,,New York,10038,US,NY,"[9 Maiden Ln, New York, NY 10038]",$$
9,1Ik8VQxUa0aDbc1KzrxLRg,caffe-aronne-new-york-3,Caffe Aronne,https://s3-media1.fl.yelpcdn.com/bphoto/06P9Le...,False,https://www.yelp.com/biz/caffe-aronne-new-york...,14,"[{'alias': 'coffee', 'title': 'Coffee & Tea'},...",5.0,"[delivery, pickup]",...,-74.002048,112 Greenwich Ave,,,New York,10011,US,NY,"[112 Greenwich Ave, New York, NY 10011]",


In [98]:
# Append the results, setting ignore_index to renumber rows
# (DataFrame.append() was removed in pandas 2.0; pd.concat() does the same job)
cafes = pd.concat([first_50_cafes, next_50_cafes], ignore_index=True)

In [99]:
# Print shape of cafes
cafes.shape

(100, 24)

### Merge:

In this exercise, you'll combine the cafe listings with two other datasets using the DataFrame `merge()` method. The first, `crosswalk`, is a crosswalk between ZIP codes and Public Use Microdata Areas (PUMAs), which are aggregates of census tracts and correspond roughly to NYC neighborhoods. Then, you'll merge in `pop_data`, which contains 2016 population estimates for each PUMA.

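1.Use the DataFrame method to merge cafes and crosswalk on location.zip_code and zipcode, respectively. Assign the result to cafes_with_pumas. (Because cafes was built above with json_normalize()'s default separator, its ZIP code column is named location.zip_code; it would be location_zip_code only if sep="_" had been set.)

2.Merge pop_data into cafes_with_pumas on their puma fields. Save the result as cafes_with_pop.

In [None]:
# Merge crosswalk into cafes on their zip code fields
# (crosswalk and pop_data must already be loaded; they are not created in this notebook)
cafes_with_pumas = cafes.merge(crosswalk,
                left_on="location.zip_code", 
                right_on="zipcode")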



# Merge pop_data into cafes_with_pumas on puma field
cafes_with_pop = cafes_with_pumas.merge(pop_data, on = "puma")

# View the data
print(cafes_with_pop.head())
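Since `crosswalk` and `pop_data` are not defined in this notebook, the two-step merge above can be tried on toy data. Everything in the sketch below is made up for illustration: the frame names prefixed with `toy_`, the `zip_code` and `total_pop` columns, and all values.

```python
import pandas as pd

# Toy stand-ins for the three datasets (all values invented).
toy_cafes = pd.DataFrame({
    "name": ["Cafe A", "Cafe B", "Cafe C"],
    "zip_code": ["11201", "10003", "11201"],
})
toy_crosswalk = pd.DataFrame({
    "zipcode": ["11201", "10003"],
    "puma": ["4004", "3809"],
})
toy_pop = pd.DataFrame({
    "puma": ["4004", "3809"],
    "total_pop": [1000, 2000],
})

# Step 1: key columns have different names, so use left_on / right_on.
toy_with_pumas = toy_cafes.merge(toy_crosswalk,
                                 left_on="zip_code",
                                 right_on="zipcode")

# Step 2: both frames share the key name, so `on` is enough.
toy_with_pop = toy_with_pumas.merge(toy_pop, on="puma")

print(toy_with_pop)
```

Note that the key columns hold the same type on both sides (strings here); ZIP codes stored as strings in one frame and integers in the other would not match.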