# **Yelp API**

## **Imports**

In [None]:
# Standard Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Additional Imports
import os, json, math, time
from yelpapi import YelpAPI
from tqdm.notebook import tqdm_notebook

In [None]:
# Load API Credentials
with open('/Users/davyd/.secret/yelp_api.json','r') as f:
    login = json.load(f)

In [None]:
login.keys()

In [None]:
dict_keys(['client-id', 'api-key'])

In [None]:
# Instantiate YelpAPI Variable
yelp = YelpAPI(login['api-key'],timeout_s=5.0)

Using the Yelp API involves several steps, including creating a Yelp Developer account, obtaining API keys, and making requests to the API. Here's a step-by-step guide:

**Create a Yelp Developer Account**

- Go to the Yelp for Developers website.
- Sign up for a developer account if you don't have one; otherwise, log in.

**Create a New App**

- Once logged in, go to the Create App page.
- Fill out the required information for your new app (name, description, etc.).
- Agree to the terms of service and click "Create App."

**Get an API Key**

- After creating the app, you'll be redirected to the app's dashboard.
- Locate the "API Keys" section to find your API key. You'll need this key to authenticate your requests.

**Understand the API Endpoints**

- Review the Yelp Fusion API documentation to understand the available endpoints and how to structure your requests.
- Common endpoints include:
  - Business Search: retrieve information about businesses using parameters such as location, category, or name.
  - Business Details: get detailed information about a specific business using its unique ID.
  - Reviews: retrieve reviews for a specific business.
  - Autocomplete: get autocomplete suggestions for search terms.
**Make API Requests**

- Use your preferred programming language to make HTTP requests to Yelp's API endpoints.
- Include your API key in the request headers for authentication.

Example using cURL:

    curl -X GET -H "Authorization: Bearer YOUR_API_KEY" "https://api.yelp.com/v3/businesses/search?location=San+Francisco"

**Handle API Responses**

- Yelp API responses are typically in JSON format. Parse the JSON response to extract the information you need.
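The request and response steps above can be sketched in Python using only the standard library. This is a minimal sketch, not the notebook's own method: the helper names `build_search_request` and `flatten_businesses` are hypothetical, the sample payload is hand-written to resemble a Business Search response (it is not real Yelp data), and only a few commonly returned fields are extracted.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(api_key, **params):
    """Build (but do not send) a Business Search request with Bearer auth."""
    url = f"{SEARCH_URL}?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {api_key}"})


def flatten_businesses(payload):
    """Extract a few fields from each business in a search response dict."""
    rows = []
    for biz in payload.get("businesses", []):
        location = biz.get("location", {})
        rows.append({
            "id": biz.get("id"),
            "name": biz.get("name"),
            "rating": biz.get("rating"),
            "city": location.get("city"),
        })
    return rows


# Hand-written sample shaped like a search response (assumption, not real data).
SAMPLE = json.loads("""
{
  "total": 2,
  "businesses": [
    {"id": "abc123", "name": "Example Cafe", "rating": 4.5,
     "location": {"city": "San Francisco"}},
    {"id": "def456", "name": "Sample Diner", "rating": 3.0,
     "location": {"city": "San Francisco"}}
  ]
}
""")

request = build_search_request("YOUR_API_KEY", location="San Francisco")
rows = flatten_businesses(SAMPLE)
```

Passing `request` to `urllib.request.urlopen` would send it. With the `yelpapi` client instantiated earlier, a call such as `yelp.search_query(location=...)` should return a comparable dict that `flatten_businesses` could consume, though that has not been verified here.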

Handling a dataset with multiple rows for restaurants that had multiple visits can be challenging, but it can also provide valuable insights if done correctly. Here's a step-by-step approach to help you answer your questions and make predictions about future visits:

**Data Exploration**

- Begin by thoroughly understanding your dataset. Examine the columns and their meanings.
- Identify any missing values, outliers, or inconsistencies in the data.
- Plot histograms, scatter plots, or other visualizations to explore data distributions and relationships.

**Data Preprocessing**

- Handle missing data appropriately, either by imputing values or removing rows with missing information.
- Standardize or normalize numerical features if necessary.
- Convert categorical variables into numerical representations using techniques like one-hot encoding or label encoding.

**Feature Engineering**

- Create new features if they might provide valuable information for your analysis. For example, you can calculate the average score for each restaurant from its multiple visits.
- Consider using time-related features to capture trends over time.
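As an illustration of the two feature ideas above, the following sketch computes a per-restaurant average score and a simple time-related feature with pandas. The rows and the column names (`camis`, `inspection_date`, `score`) are made up for the example and are assumptions about how the inspection data might be laid out.

```python
import pandas as pd

# Made-up inspection rows; column names are assumptions for illustration.
visits = pd.DataFrame({
    "camis": [1, 1, 1, 2, 2],
    "inspection_date": pd.to_datetime(
        ["2022-01-10", "2022-02-15", "2023-01-20", "2022-03-01", "2022-04-05"]
    ),
    "score": [20, 12, 10, 30, 14],
})

# Order each restaurant's visits chronologically before deriving features.
visits = visits.sort_values(["camis", "inspection_date"]).reset_index(drop=True)

# Average score across all of a restaurant's visits, repeated on every row.
visits["avg_score"] = visits.groupby("camis")["score"].transform("mean")

# Time-related feature: days since the restaurant's previous visit
# (NaN on the first visit, which has no predecessor).
visits["days_since_prev"] = (
    visits.groupby("camis")["inspection_date"].diff().dt.days
)
```

Note that an average over all visits includes later inspections, so if the feature feeds a model that predicts future visits, an expanding mean over earlier visits only may be the safer choice.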
**Data Splitting**

- Decide whether to split the data into multiple datasets (one for each visit) or keep it together.
- Splitting the data by visit may be useful if you want to model each visit separately and analyze trends between visits. However, this can increase complexity.
- Keeping the data together and adding a visit identifier can make it easier to build predictive models.
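Both layouts can be derived from the same table, so the choice does not have to be made up front. The sketch below, on made-up rows with assumed column names, adds a visit identifier to the combined data, derives a hypothetical `has_next_visit` label from it, and then splits the same frame into one dataset per visit.

```python
import pandas as pd

# Made-up inspection rows; column names are assumptions for illustration.
visits = pd.DataFrame({
    "camis": [1, 1, 1, 2, 2],
    "inspection_date": pd.to_datetime(
        ["2022-01-10", "2022-02-15", "2023-01-20", "2022-03-01", "2022-04-05"]
    ),
    "score": [20, 12, 10, 30, 14],
})
visits = visits.sort_values(["camis", "inspection_date"]).reset_index(drop=True)

# Keep the data together: number each restaurant's visits 1, 2, 3, ...
visits["visit_number"] = visits.groupby("camis").cumcount() + 1

# Label for a classification task: does this restaurant have a later visit?
last_visit = visits.groupby("camis")["visit_number"].transform("max")
visits["has_next_visit"] = visits["visit_number"] < last_visit

# Split by visit: one DataFrame per visit number, keyed by that number.
by_visit = {
    number: group.reset_index(drop=True)
    for number, group in visits.groupby("visit_number")
}
```

Here `by_visit[1]` holds every restaurant's first visit, while `visits` remains a single table in which `visit_number` can be used as a feature or a filter.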
**Exploratory Data Analysis (EDA)**

- Conduct EDA to identify trends or patterns within the data.
- Analyze the distribution of scores, types of violations, cuisine categories, and geographical locations.

**Feature Selection**

- Determine which features are most relevant for predicting subsequent visits. You can use techniques like feature importance or correlation analysis.
- Consider creating new features that capture the history of violations, scores, or other relevant information across visits.

**Model Building**

- Choose appropriate machine learning algorithms for your prediction task. Classification models are a natural fit, since the goal is to predict whether a restaurant will have a second or third visit.
- Split your data into training and testing sets for model evaluation.
- Train and fine-tune your models on the training data, using appropriate evaluation metrics (e.g., accuracy, precision, recall, F1-score).

**Evaluation and Validation**

- Evaluate your model's performance on the testing data to ensure it generalizes well to new data.
- Consider using techniques like cross-validation to assess model stability and detect overfitting.
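To make the evaluation metrics named above concrete, here is a small worked example in plain Python. The labels are invented (1 means the restaurant had a subsequent visit, 0 means it did not), and `classification_metrics` is a hypothetical helper written for this illustration; in practice a library such as scikit-learn provides equivalent functions.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented test-set labels and predictions for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

With these labels there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four metrics come out to 0.75.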
**Interpretability**

- Interpret your model's results to understand which features are driving predictions. This can provide insights into why certain restaurants are more likely to receive subsequent visits.

**Deployment**

- If your model performs well, consider deploying it as a tool to help prioritize restaurant inspections or allocate resources more effectively.

**Continuous Monitoring and Updating**

- Periodically retrain your model with new data to keep it up to date and ensure its performance remains satisfactory.

Whether you choose to split the data by visit or keep it together depends on the specific goals of your analysis and the complexity you're comfortable with. Keeping the data together with appropriate feature engineering and feature selection can often lead to more interpretable models and simpler workflows. However, splitting the data by visit can provide more granular insights into the factors influencing repeat inspections for each visit. It's essential to weigh the trade-offs and choose an approach that aligns with your project's objectives.

# **SQL query for API**

In [None]:
-- Notes:
-- Only some inspection types result in gradable inspections.
-- Scores are provided on initial inspection, but grades are issued only when the score is less than or equal to 13.
-- If a restaurant receives a score > 13 on initial inspection, the current grade still reflects the grade
-- received during the last inspection cycle until the next re-inspection is complete.
-- Download the DOHMH NYC Restaurant Inspection Results data set and save it as a CSV file: NYC_Insp_Results.csv

-- Most recent gradable inspection date for each restaurant (CAMIS)
with RecentInspDate as (
       select CAMIS
              , max([INSPECTION DATE]) as MostRecentInspDate
       from NYC_Insp_Results
       where ([INSPECTION TYPE] in (
                     'Cycle Inspection / Re-inspection'
                   , 'Pre-permit (Operational) / Re-inspection')
              or ([INSPECTION TYPE] in (
                     'Cycle Inspection / Initial Inspection'
                   , 'Pre-permit (Operational) / Initial Inspection')
                  and SCORE <= 13)
              or ([INSPECTION TYPE] in (
                     'Pre-permit (Operational) / Reopening Inspection'
                   , 'Cycle Inspection / Reopening Inspection')))
         and GRADE in ('A', 'B', 'C', 'P', 'Z')  -- values where a grade card or grade pending card is issued
       group by CAMIS)

In [None]:
-- Select restaurant inspection data based on the most recent inspection date.
-- This select completes the "with RecentInspDate" clause; run both as a single statement.
select distinct r.CAMIS
       , DBA as Name
       , r.MostRecentInspDate
       , GRADE
       , [INSPECTION TYPE]
       , SCORE
from NYC_Insp_Results i
join RecentInspDate r
       on r.CAMIS = i.CAMIS
       and r.MostRecentInspDate = i.[INSPECTION DATE]
where [INSPECTION TYPE] in (
              'Cycle Inspection / Re-inspection'
            , 'Pre-permit (Operational) / Re-inspection'          -- re-inspections
            , 'Pre-permit (Operational) / Reopening Inspection'
            , 'Cycle Inspection / Reopening Inspection')          -- re-opening inspections where grade pending is issued
       or ([INSPECTION TYPE] in (
              'Cycle Inspection / Initial Inspection'
            , 'Pre-permit (Operational) / Initial Inspection')
           and SCORE <= 13)                                        -- initial inspections where an A grade is issued
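One way to try the query from Python is to load a few rows into an in-memory SQLite database. This is a sketch under several assumptions: SQLite is a stand-in engine (it accepts the bracketed column names for compatibility), the rows are made up, the helper name `latest_graded_inspections` is hypothetical, and inspection dates are stored as ISO `YYYY-MM-DD` strings so that `max()` orders them chronologically.

```python
import sqlite3

# Made-up rows: (CAMIS, DBA, inspection date, inspection type, score, grade).
ROWS = [
    (1, "CAFE ONE", "2022-01-10", "Cycle Inspection / Initial Inspection", 20, None),
    (1, "CAFE ONE", "2022-02-15", "Cycle Inspection / Re-inspection", 12, "A"),
    (2, "DINER TWO", "2021-06-01", "Cycle Inspection / Initial Inspection", 9, "A"),
    (2, "DINER TWO", "2022-07-01", "Cycle Inspection / Initial Inspection", 25, None),
]

QUERY = """
with RecentInspDate as (
    select CAMIS, max([INSPECTION DATE]) as MostRecentInspDate
    from NYC_Insp_Results
    where ([INSPECTION TYPE] in ('Cycle Inspection / Re-inspection',
                                 'Pre-permit (Operational) / Re-inspection')
           or ([INSPECTION TYPE] in ('Cycle Inspection / Initial Inspection',
                                     'Pre-permit (Operational) / Initial Inspection')
               and SCORE <= 13)
           or [INSPECTION TYPE] in ('Pre-permit (Operational) / Reopening Inspection',
                                    'Cycle Inspection / Reopening Inspection'))
      and GRADE in ('A', 'B', 'C', 'P', 'Z')
    group by CAMIS
)
select distinct r.CAMIS, i.DBA as Name, r.MostRecentInspDate,
       i.GRADE, i.[INSPECTION TYPE], i.SCORE
from NYC_Insp_Results i
join RecentInspDate r
  on r.CAMIS = i.CAMIS and r.MostRecentInspDate = i.[INSPECTION DATE]
where i.[INSPECTION TYPE] in ('Cycle Inspection / Re-inspection',
                              'Pre-permit (Operational) / Re-inspection',
                              'Pre-permit (Operational) / Reopening Inspection',
                              'Cycle Inspection / Reopening Inspection')
   or (i.[INSPECTION TYPE] in ('Cycle Inspection / Initial Inspection',
                               'Pre-permit (Operational) / Initial Inspection')
       and i.SCORE <= 13)
order by r.CAMIS
"""


def latest_graded_inspections(rows):
    """Load rows into an in-memory table and run the query against them."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table NYC_Insp_Results ("
            "CAMIS integer, DBA text, [INSPECTION DATE] text, "
            "[INSPECTION TYPE] text, SCORE integer, GRADE text)"
        )
        conn.executemany(
            "insert into NYC_Insp_Results values (?, ?, ?, ?, ?, ?)", rows
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


latest = latest_graded_inspections(ROWS)
```

For restaurant 2, the later initial inspection scored above 13 and carries no grade, so the query falls back to the earlier graded inspection, matching the behavior described in the notes.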