# Querying Yelp using GraphQL
The Yelp GraphQL API is a beta program, but unlike the other Yelp APIs, it lets us fully customize our queries and retrieve only the data we need for our analysis.

Below is an example of how to use the Yelp `GraphQL` API to query 10 coffee shops in the Brooklyn 11222 postal code.

For detailed instructions, see [Getting Started with Yelp GraphQL](https://docs.developer.yelp.com/docs/graphql-intro).

In [2]:
# import packages
import requests
import pandas as pd
import sys
sys.path.append('src') # add src folder to path

# import api key
from config import YELP_API
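
The notebook imports `YELP_API` from a `config` module in `src`, but does not show that module. As a minimal sketch of one way to supply the key without hard-coding it, the helper below reads it from an environment variable. The function name `load_api_key` and the variable name `YELP_API_KEY` are assumptions made for illustration; only the name `YELP_API` comes from the notebook.

```python
# Hypothetical contents for src/config.py; not the author's actual file.
import os


def load_api_key(env_var: str = "YELP_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Set the {env_var} environment variable to your Yelp API key.")
    return key


# In config.py this would be followed by:
# YELP_API = load_api_key()
```

Keeping the key outside the repository means the notebook can be shared without exposing credentials.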

First we need to set our `headers` and `url`.<br>
The API key must be approved for beta use, so before making any queries, go to the `Manage Account` section of your Yelp profile and approve it for beta use.

In [3]:
# set up headers and access token
headers = {
    "Authorization": f"Bearer {YELP_API}",
    "Content-Type": "application/json"
}

# set url
url = "https://api.yelp.com/v3/graphql"

Next we perform the business query.<br>
We don't query the reviews yet because businesses and reviews have a **one-to-many relationship**: each business has multiple reviews (we request up to 5 per business in this demo).

In [4]:
# basic query
businesses_query = """
{
  search(location: "Brooklyn 11222", categories: "coffee", limit: 10) {
    business {
      id
      name
      rating
      review_count
    }
  }
}
"""


We make the API call with the `requests` package, passing the query and the headers for authentication.<br>
We then convert the response to JSON, which lets us parse it into lists and, eventually, dataframes.

In [5]:
# api call to fetch businesses
response = requests.post(url, json={"query": businesses_query}, headers=headers)
data = response.json()
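
A GraphQL server can report a problem through an `errors` list in the response body, so indexing straight into `data["data"]` may fail with an unhelpful `KeyError` or `TypeError`. The sketch below shows one way to check the parsed JSON before using it. The helper name `unwrap_graphql` is hypothetical, and the assumption that error entries carry a `message` field follows the general GraphQL response format rather than anything shown in this notebook.

```python
def unwrap_graphql(payload: dict) -> dict:
    """Return the 'data' part of a parsed GraphQL response, or raise on errors."""
    errors = payload.get("errors")
    if errors:
        # Collect human-readable messages; fall back to the raw entry if needed.
        messages = "; ".join(
            str(e.get("message", e)) if isinstance(e, dict) else str(e)
            for e in errors
        )
        # Note: this treats any reported error as fatal, even if partial data exists.
        raise RuntimeError(f"GraphQL request failed: {messages}")
    data = payload.get("data")
    if data is None:
        raise RuntimeError("GraphQL response contained no 'data' field.")
    return data
```

With this in place, the businesses could be extracted as `unwrap_graphql(response.json())["search"]["business"]`.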


In [6]:
# extract info
businesses = data["data"]["search"]["business"]

# data storage lists
business_data = []
review_data = []

# iterate over the businesses and fetch their reviews
for business in businesses:
    # fetch reviews for the current business
    reviews_query = f"""
    {{
      business(id: "{business['id']}") {{
        reviews(limit: 5) {{
          user {{
            name
            id
          }}
          rating
          text
        }}
      }}
    }}
    """
    # api call to fetch reviews
    reviews_response = requests.post(url, json={"query": reviews_query}, headers=headers)
    reviews_data = reviews_response.json()
    
    # extract info
    reviews = reviews_data["data"]["business"]["reviews"]

    # store business data
    business_data.append({
        "business_id": business["id"],
        "business_name": business["name"],
        "rating": business["rating"],
        "review_count": business["review_count"]
    })

    # store review data
    for review in reviews:
        review_data.append({
            "business_id": business["id"],
            "review_user_id": review["user"]["id"],
            "review_user": review["user"]["name"],
            "review_rating": review["rating"],
            "review_text": review["text"]
        })
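
The loop above sends one request per business. GraphQL aliases allow several `business` lookups to share a single query, which is sketched below as an alternative. The function name `build_batched_reviews_query` and the alias scheme `b0`, `b1`, ... are assumptions for illustration, and whether the beta endpoint's rate or complexity limits permit a batch of a given size should be checked before relying on it.

```python
# Fields requested for each business; doubled braces survive str.format().
REVIEW_FIELDS = "reviews(limit: {limit}) {{ rating text user {{ id name }} }}"


def build_batched_reviews_query(business_ids, limit=5):
    """Build one GraphQL query that fetches reviews for several businesses."""
    parts = []
    for index, business_id in enumerate(business_ids):
        fields = REVIEW_FIELDS.format(limit=limit)
        # Aliases must be valid GraphQL names, so use b0, b1, ... rather than
        # the raw Yelp IDs, which can start with a digit or a hyphen.
        parts.append(f'  b{index}: business(id: "{business_id}") {{ id {fields} }}')
    return "{\n" + "\n".join(parts) + "\n}"


# Example with two IDs taken from the results in this notebook.
print(build_batched_reviews_query(["1Q3oaJahyGRogDWgpo7PIw", "kpxXi23lUQkeJQH-2BtzDw"]))
```

Each alias becomes a key in the response's `data` object, and requesting `id` inside each block lets the reviews be matched back to their business.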


Create two dataframes:
- one for businesses
- one for reviews

Having separate dataframes will allow us to explore the **one-to-many** relationship between businesses and reviews.

In [7]:
# business df
business_df = pd.DataFrame(business_data)

# reviews df
review_df = pd.DataFrame(review_data)
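
To illustrate how the one-to-many relationship can be explored, the sketch below joins reviews to their business on `business_id` and summarizes review ratings per business. It uses a small sample copied from the outputs shown in this notebook (two businesses and their review ratings) rather than live API data; the names `business_sample`, `review_sample`, `merged`, and `summary` are introduced here for illustration.

```python
import pandas as pd

# Sample rows taken from the business and review outputs in this notebook.
business_sample = pd.DataFrame({
    "business_id": ["1Q3oaJahyGRogDWgpo7PIw", "kpxXi23lUQkeJQH-2BtzDw"],
    "business_name": ["Five Leaves", "Qahwah House"],
    "rating": [4.0, 4.5],
})
review_sample = pd.DataFrame({
    "business_id": ["1Q3oaJahyGRogDWgpo7PIw"] * 3 + ["kpxXi23lUQkeJQH-2BtzDw"] * 3,
    "review_rating": [4, 1, 4, 5, 5, 5],
})

# Many reviews map to one business; validate= raises if that assumption breaks.
merged = review_sample.merge(
    business_sample, on="business_id", how="left", validate="many_to_one"
)

# Number of fetched reviews and their mean rating per business.
summary = merged.groupby("business_name")["review_rating"].agg(["count", "mean"])
print(summary)
```

The same `merge` call should work on `review_df` and `business_df`, since both carry a `business_id` column.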


Inspect the contents of each dataframe.

In [8]:
business_df

| | business_id | business_name | rating | review_count |
|---|---|---|---|---|
| 0 | 1Q3oaJahyGRogDWgpo7PIw | Five Leaves | 4.0 | 1726 |
| 1 | kpxXi23lUQkeJQH-2BtzDw | Qahwah House | 4.5 | 355 |
| 2 | s1pJHjoce-IbHQiCe4mA3w | Martha's Country Bakery | 4.5 | 993 |
| 3 | UpPXAjKc-CyuCg72chwd3A | Lella Alimentari | 4.5 | 237 |
| 4 | YpGxtJy9ErnjfGG6DXy5uA | Coffee Shop | 4.5 | 376 |
| 5 | VdAVjghEq_Zl-DDte0mjrw | ACRE | 4.5 | 109 |
| 6 | AfZPx4piTmg9dqZpYgYTgg | Moe's Doughs Donut Shop | 4.5 | 374 |
| 7 | QaDOWy4-11982JWIxpImNQ | Little Choc Apothecary | 4.5 | 546 |
| 8 | qXuUBEaib4caLC6WCPhBjA | Patisserie Tomoko | 4.5 | 413 |
| 9 | k17DEW9TqRZf6EFSc49OsA | Bakeri | 4.0 | 407 |


In [9]:
review_df

| | business_id | review_user_id | review_user | review_rating | review_text |
|---|---|---|---|---|---|
| 0 | 1Q3oaJahyGRogDWgpo7PIw | FvHYaxYF6mAw67dv_z35Rg | Ava B. | 4 | I loved the coffee to-go stand here. So glad I... |
| 1 | 1Q3oaJahyGRogDWgpo7PIw | Sk9smcO5GeU1rY91DrFVew | Marykate M. | 1 | Clearly the owners/ management of this establi... |
| 2 | 1Q3oaJahyGRogDWgpo7PIw | 2M2GdA9HzcXI5cqECZDqgg | Brandi I. | 4 | Oh my goodness this place was delicious! The r... |
| 3 | kpxXi23lUQkeJQH-2BtzDw | lGZ1juew09lxLRDYAmdtjg | Sophia C. | 5 | Perfect chill spot to catch up with a friend! ... |
| 4 | kpxXi23lUQkeJQH-2BtzDw | zeTbYRLUK70R_3thy_L26A | Theo W. | 5 | This place lives up to the hype. \n\nI tried a... |
| 5 | kpxXi23lUQkeJQH-2BtzDw | ZdVhjBEiZvuuGcHHjIt6bA | Nichakorn C. | 5 | Adeni Chai (5) - very interesting and flavorfu... |
| 6 | s1pJHjoce-IbHQiCe4mA3w | SYXe1HJXcRKN8lgtBXPhfA | Sofyan S. | 5 | Food (5/5): We ordered the caramel cake and th... |
| 7 | s1pJHjoce-IbHQiCe4mA3w | -CQO2sipbY360zwPU_eJsA | Abiba S. | 4 | This bakery is situated in the heart of Willia... |
| 8 | s1pJHjoce-IbHQiCe4mA3w | 4xxctThQAQYwb8Ntv17JCw | Michele B. | 4 | Vegan cheesecake is non-dairy \nDelicious with... |
| 9 | UpPXAjKc-CyuCg72chwd3A | -yEhhXT6URxh_yxHko5Gzg | Alan S. | 5 | I think this is one of the most underrated bre... |


Pickle dataframes for future analysis.

In [11]:
# pickle df
business_df.to_pickle("./data/business_df.pkl")
review_df.to_pickle("./data/reviews_df.pkl")
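
`to_pickle` does not create missing folders, so the calls above assume a `data` directory already exists. The sketch below shows one way to create the folder first and then confirm the file loads back unchanged. The helper name `save_pickle` and the small sample dataframe are hypothetical and used only for illustration.

```python
from pathlib import Path
import tempfile

import pandas as pd


def save_pickle(df: pd.DataFrame, path) -> Path:
    """Pickle a dataframe, creating the parent directory if it is missing."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_pickle(path)
    return path


# Round-trip check in a temporary folder so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    sample = pd.DataFrame({"business_id": ["1Q3oaJahyGRogDWgpo7PIw"], "rating": [4.0]})
    saved = save_pickle(sample, Path(tmp) / "data" / "business_df.pkl")
    restored = pd.read_pickle(saved)
    print(restored.equals(sample))
```

In a later analysis notebook, `pd.read_pickle("./data/business_df.pkl")` would restore the dataframe with its dtypes intact; pickle files should only be loaded from sources you trust.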