# Running your own API queries

This notebook gives you a template for running your own queries against the Yelp API. As an example, you will request all high-end (`$$$` and `$$$$`) restaurants in Newark, NJ.

Before starting, two important remarks:
- The Yelp API allows a maximum of 5000 free requests per day, per user.
- If you're using the `/businesses/search` endpoint, each response is limited to 50 entries. If you use other endpoints, please check the [Yelp API documentation](https://www.yelp.com/developers/documentation/v3) to find the limit that applies.
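
To get a feel for how these two limits interact, here is a minimal sketch that estimates how many requests a search will need. The function name `requests_needed` and the figure of 120 entries are made up for illustration; only the page size of 50 comes from the remark above.

```python
import math


def requests_needed(total_entries, page_size=50):
    """Return how many paginated requests are needed to cover total_entries."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_entries / page_size)


# Hypothetical search that reports 120 matching businesses
print(requests_needed(120))  # 3 requests: 50 + 50 + 20 entries
```

Each of those requests counts toward your daily allowance, so a quick estimate like this helps you decide how many searches you can afford in one day.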

In this notebook you'll have a chance to edit your request parameters to fit your needs. You'll also learn to run a sequence of requests and accumulate all of their response data together. 

Ready?

### Import packages

In [None]:
import pandas as pd
import requests

### Yelp API Key setup

In [None]:
API_KEY = "PASTE-YOUR-API-KEY-INSIDE-THESE-QUOTATION-MARKS"

### Request setup
Start by setting up the base URL and the headers of your request.

In [None]:
# Define the base URL for the request
base_url = "https://api.yelp.com/v3/businesses/search"

# Set up the request headers -- API key is used here
headers = {"Authorization": "Bearer " + API_KEY}

Now define the request parameters.

In [None]:
# Define the request parameters

# 50 is the maximum number of entries a single request to the businesses/search
# endpoint can return. Start small while testing, then change REQUEST_LENGTH to 50
# once you're confident that your query is correct!

REQUEST_LENGTH = 5  

params = {
    "location": "Newark, NJ",
    "term": "restaurant",
    "price": "3, 4",  # 3 corresponds to $$$, and 4 to $$$$
    "limit": REQUEST_LENGTH,
    "offset": 0
} 

Now execute the request, saving the response in a variable called `response`.

In [None]:
# Execute request
response = requests.get(
    base_url,
    headers=headers,
    params=params
)
    
# Extract data as a pandas dataframe
data_dict = response.json()
data_df = pd.DataFrame(data_dict["businesses"])

In [None]:
# Inspect the data

print("The total number of entries for this search is", data_dict["total"])

In [None]:
# Show the downloaded data in tabular format
data_df
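
If the request fails (for example, because of a wrong API key), the response will not contain a `"businesses"` entry and the extraction step above will raise a `KeyError`. The sketch below shows one way to check the parsed response first. It assumes that a failed request returns a JSON object with an `"error"` entry holding a `"code"` and a `"description"`; treat that shape as an assumption and compare it with what you actually receive. The helper name `extract_businesses` and both sample payloads are made up for illustration.

```python
def extract_businesses(payload):
    """Return the list of businesses, or raise if the payload reports an error."""
    if "error" in payload:
        error = payload["error"]
        raise RuntimeError(
            "API error {code}: {description}".format(
                code=error.get("code", "UNKNOWN"),
                description=error.get("description", "no description"),
            )
        )
    # A missing "businesses" key is treated as an empty result
    return payload.get("businesses", [])


# Hypothetical payloads, made up for illustration
ok_payload = {"total": 1, "businesses": [{"name": "Example Bistro"}]}
bad_payload = {"error": {"code": "EXAMPLE_ERROR", "description": "made-up failure"}}

print(extract_businesses(ok_payload))
try:
    extract_businesses(bad_payload)
except RuntimeError as exc:
    print(exc)
```

In this notebook you could call `extract_businesses(response.json())` before building the dataframe. Calling `response.raise_for_status()` right after the request is another option: it raises an exception for any 4xx or 5xx HTTP status.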

### Make a sequence of requests

Before going further, check that the data above look right. Are you getting results that make sense? A single request will often not return enough data for your project, so once your request parameters are fine-tuned, the next step is to retrieve the remaining entries.

If so, here's what you're going to do next:
- you're going to execute a sequence of API requests
- with each response, you'll extract the data and append it to the previous chunk
- the code stops once it has retrieved every entry in the search (or earlier, with an error, if a request fails, for example because you've exhausted your daily API limit)

This setup will give you a chance to run queries returning more entries than the request limit.
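
The offset logic described above can be tried out without spending any API requests. The sketch below replaces the real request with a stand-in function that serves slices of a made-up list; `fetch_all`, `fake_fetch_page`, and `FAKE_DATA` are hypothetical names, and the response shape (`"total"` plus `"businesses"`) mirrors the fields used earlier in this notebook.

```python
def fetch_all(fetch_page, page_size=50):
    """Collect every entry by calling fetch_page(offset, limit) until done."""
    entries = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        entries.extend(page["businesses"])
        # Stop when the next offset would pass the reported total,
        # or when a page comes back empty (guards against an endless loop)
        offset += page_size
        if offset >= page["total"] or not page["businesses"]:
            break
    return entries


# Hypothetical stand-in for the API: 120 fake businesses served in slices
FAKE_DATA = [{"id": i} for i in range(120)]


def fake_fetch_page(offset, limit):
    return {"total": len(FAKE_DATA), "businesses": FAKE_DATA[offset:offset + limit]}


results = fetch_all(fake_fetch_page)
print(len(results))  # 120
```

Collecting the entries in a plain list and building the dataframe once at the end is a design choice of this sketch. It avoids growing a dataframe inside the loop.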

_Note: the logic in the next cell is a bit more complex than what we've seen so far. Don't worry if you don't understand it fully. Focus on the results, and revisit the code and the comments later!_

In [None]:
# Reset the request length to the maximum allowed by Yelp
REQUEST_LENGTH = 50
params["limit"] = REQUEST_LENGTH

# Initialize loop variables
offset = 0

while True:
    
    # Request data
    response = requests.get(
        base_url,
        headers=headers,
        params=params
    )
    
    
    # Extract data in JSON format
    data_dict = response.json()
    
    # Print out the total number of entries to retrieve
    if offset == 0:
        print("Running a query on {num_entries} entries...".format(num_entries=data_dict["total"]))
    
    # Check if this is the first request in the sequence
    if offset == 0:
        # First request, so create dataframe
        data_df = pd.DataFrame(data_dict["businesses"])
    else:
        # Second or later request, so append data to the existing dataframe
        # (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
        data_df = pd.concat([data_df, pd.DataFrame(data_dict["businesses"])])

    # If not all entries have been retrieved yet
    if data_dict["total"] > offset + REQUEST_LENGTH:
        # Update the offset parameter
        offset = offset + REQUEST_LENGTH
        params["offset"] = offset
        
    else:
        # All entries retrieved, exit loop
        print("Query completed!")
        break


# Reset the dataframe index so it runs sequentially from 0 to the number of entries
data_df = data_df.reset_index(drop=True)

In [None]:
# Display the queried data
data_df

Extract the coordinates as longitude and latitude columns.

In [None]:
# Extract latitude and longitude into new columns
data_df["latitude"] = data_df["coordinates"].apply(lambda x: x["latitude"])
data_df["longitude"] = data_df["coordinates"].apply(lambda x: x["longitude"])

# # An alternative, more generic way of achieving the same thing
# data_df = pd.concat([data_df, data_df["coordinates"].apply(pd.Series)], axis=1)

# Inspect transformed dataframe
data_df
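
If you prefer to flatten every nested field in one go, `pd.json_normalize` is another option. The sketch below applies it to two made-up records shaped like the `"businesses"` entries. The names and coordinates are invented for illustration, and in this notebook you would pass `data_dict["businesses"]` instead.

```python
import pandas as pd

# Hypothetical records shaped like the "businesses" entries used above
records = [
    {"name": "Example Bistro", "coordinates": {"latitude": 40.73, "longitude": -74.17}},
    {"name": "Sample Grill", "coordinates": {"latitude": 40.74, "longitude": -74.18}},
]

# Nested dictionaries become dotted column names such as "coordinates.latitude"
flat_df = pd.json_normalize(records)
print(flat_df.columns.tolist())
```

Note that the resulting columns are named `coordinates.latitude` and `coordinates.longitude` rather than `latitude` and `longitude`, so rename them if later code expects the shorter names.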

Save this dataframe as a CSV file.

In [None]:
# Save requested data as a CSV
data_df.to_csv("./my_requested_data.csv", index=False)  # index=False is used to avoid writing the row index in the file

Verify the CSV contents.

In [None]:
# Print created CSV to screen
!cat ./my_requested_data.csv
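
The `!cat` command relies on a Unix-like shell, so it may not be available on every system. A portable way to verify the file is to read it back with pandas. The sketch below shows the round trip on a small made-up dataframe written to a temporary folder; `sample_df` and the file name `sample.csv` are hypothetical, and in this notebook you would read `./my_requested_data.csv` instead.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row dataframe standing in for data_df
sample_df = pd.DataFrame(
    {"name": ["Example Bistro", "Sample Grill"], "rating": [4.5, 4.0]}
)

# Write to a temporary folder so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "sample.csv")
    sample_df.to_csv(csv_path, index=False)
    loaded_df = pd.read_csv(csv_path)

print(loaded_df.shape)
print(loaded_df.head())
```

Keep in mind that columns holding nested values, such as `coordinates`, are written to the CSV as text, so they come back as strings rather than dictionaries.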

Now that you've seen how this operates, go up to the cell where the `base_url` and `params` are defined and make any changes you need for your questions or projects. Good luck!