# STA 141B Data & Web Technologies for Data Analysis

### Lecture 8, 10/26/23, APIs


### Announcements

 - HW 2 due tomorrow

### Today's topics

- Undocumented APIs

### Resources
 - [Yolo County Health Inspections](https://yoloeco.envisionconnect.com/)

### Recap: HTTP

A response to an HTTP request always includes a status code that summarizes whether the request was successful. Wikipedia has a full [list of HTTP status codes](https://en.wikipedia.org/wiki/List_of_HTTP_status_codes). Generally,

* 200-299: Your request succeeded.
* 300-399: You need to take further action to complete the request.
* 400-499: Your request wasn't valid (you made a mistake). You've probably seen 404 before!
* 500-599: Your request failed (the server made a mistake).
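
As a rough illustration of these ranges, the sketch below maps a status code to one of the categories above. The function name `describe_status` is made up for this example; in practice, `response.raise_for_status()` from `requests` does the error check for you.

```python
def describe_status(code):
    """Map an HTTP status code to the broad category it falls in."""
    if 200 <= code <= 299:
        return "success"
    if 300 <= code <= 399:
        return "further action needed"
    if 400 <= code <= 499:
        return "client error"
    if 500 <= code <= 599:
        return "server error"
    return "other"  # e.g. 1xx informational codes, not covered in the list above


for code in (200, 301, 404, 503):
    print(code, describe_status(code))
```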

### API Keys

Many APIs use a _key_ or _token_ to identify the user. For instance, The Guardian, a British newspaper, provides a [web API](https://open-platform.theguardian.com/) to access their news articles. You need an API key to use their web APIs. You can get one for free [here](https://bonobo.capi.gutools.co.uk/register/developer).

#### Storing API Keys

Your API key is private and your responsibility. Treat it like a password. Keep it secret! 

In order to keep your API key separate from your code:
1. Save the API key in a text file.
2. Use Python to load the API key into a variable.

Python's built-in `open()` function opens a file, and the `.readline()` method reads a line from a file. Often you'll see these used with `with`, which automatically closes the file at the end of the block:

In [None]:
def read_key(keyfile):
    with open(keyfile) as f:
        return f.readline().strip("\n")

In [None]:
key = read_key("../keys/guardian.txt") # Don't print out your actual API key

In [None]:
type(key)

Now you can use the `key` variable anywhere you need the actual API key.
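
Another common pattern, sketched here as an assumption rather than something this lecture requires, is to look for the key in an environment variable first and fall back to the key file. The function name `load_key` and the variable name `GUARDIAN_API_KEY` are hypothetical.

```python
import os


def load_key(keyfile, env_var="GUARDIAN_API_KEY"):
    """Return the API key from an environment variable, else from a file."""
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    with open(keyfile) as f:
        # strip() with no argument also removes stray spaces around the key
        return f.readline().strip()
```

Calling `load_key("../keys/guardian.txt")` would then behave like `read_key` whenever the environment variable is not set.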

#### Querying The Guardian

We've got our key, so let's use The Guardian API. 

We want to find out whether Biden or Trump got more newspaper coverage in the days around the 2020 U.S. presidential election. Let's start by trying to get all of the articles about one of the candidates.

In [None]:
import requests

response = requests.get("https://content.guardianapis.com/search", params = {
        "api-key": key,
        "q": "Biden",
        "from-date": "2020-11-01",
        "to-date": "2020-11-10",
        "page-size": 50,
        "order-by": "newest", # The Guardian API spells this parameter with a hyphen
        "page": 1
    }) # try page 12

In [None]:
response.raise_for_status() # the parentheses are needed to actually run the check

In [None]:
response.json()

In [None]:
import time
def get_articles(q, page = 1, from_date = "2020-11-01"):
    time.sleep(0.05) # brief pause so we don't flood the API with requests
    response = requests.get("https://content.guardianapis.com/search", params = {
        "api-key": key,
        "q": q,
        "from-date": from_date,
        "to-date": "2020-11-10",
        "page-size": 50,
        "order-by": "newest", # hyphen, not underscore
        "page": page
    })
    response.raise_for_status()
    return response.json()["response"]

In [None]:
biden = get_articles("Biden")

In [None]:
biden

In [None]:
pages = biden["pages"]
pages

In [None]:
pageSize = biden["pageSize"]
pageSize

In [None]:
currentPage = biden["currentPage"]
currentPage

In [None]:
results = biden["results"]
for p in range(2, pages + 1):
    results += get_articles("Biden", p)["results"] # same query string as the first page

In [None]:
results

In [None]:
type(results)

In [None]:
import pandas as pd

df = pd.DataFrame(results)

In [None]:
df.shape

In [None]:
df.tail()

In [None]:
df["webPublicationDate"] = pd.to_datetime(df["webPublicationDate"])

In [None]:
type(df["webPublicationDate"][0])

In [None]:
df.head()

In [None]:
date = df["webPublicationDate"].dt
date

In [None]:
date.day_name()

In [None]:
dates = pd.DataFrame({"day": date.day, "day_name": date.day_name()})

In [None]:
dates

In [None]:
dates.groupby(["day", "day_name"]).size()

Now let's wrap these steps in functions.

In [None]:
def get_articles(q, page = 1):
    response = requests.get("https://content.guardianapis.com/search", params = {
        "api-key": key,
        "q": q,
        "from-date": "2020-11-01",
        "to-date": "2020-11-10",
        "page-size": 50,
        "page": page
    })
    response.raise_for_status()
    return response.json()["response"]

In [None]:
def get_all_articles(q, time_sleep = 0.05):
    # Get the first page, and find out how many pages there are.
    candidate = get_articles(q)
    pages = candidate["pages"]

    # Loop over remaining pages.
    results = candidate["results"]
    for p in range(2, pages + 1):
        results += get_articles(q, p)["results"]
        time.sleep(time_sleep)

    # Convert the articles to data frame, and the date column to a date.
    df = pd.DataFrame(results)
    df["webPublicationDate"] = pd.to_datetime(df["webPublicationDate"])
    
    # Get the day and day name, then count them.
    date = df["webPublicationDate"].dt
    dates = pd.DataFrame({"day": date.day, "day_name": date.day_name()})
    return dates.groupby(["day", "day_name"]).size()
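
The paging logic inside `get_all_articles` can be tried without network access by swapping in a stand-in for `get_articles`. This is only a sketch: `fake_get_articles` and `collect_all` are hypothetical names, and the fake response keeps just the `pages`, `currentPage`, and `results` fields used above.

```python
def fake_get_articles(q, page=1):
    """Stand-in for get_articles(): returns a response-shaped dict, no network."""
    all_items = [{"webTitle": f"{q} article {i}"} for i in range(1, 8)]
    page_size = 3
    start = (page - 1) * page_size
    return {
        "pages": 3,  # 7 items at 3 per page
        "currentPage": page,
        "results": all_items[start:start + page_size],
    }


def collect_all(fetch, q):
    """Fetch page 1 to learn the page count, then loop over the rest."""
    first = fetch(q)
    results = list(first["results"])  # copy so the first response is not mutated
    for page in range(2, first["pages"] + 1):
        results += fetch(q, page)["results"]
    return results


articles = collect_all(fake_get_articles, "Biden")
print(len(articles))  # 7
```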

In [None]:
biden=get_all_articles("Biden")
biden

In [None]:
biden.head(10)

In [None]:
trump=get_all_articles("Trump")
trump

In [None]:
df = pd.DataFrame([biden,trump]).T
df = df.rename(columns={0: 'Biden', 1: 'Trump'})
df = df.reset_index()
df

In [None]:
df = df.melt(id_vars = ['day', 'day_name'])
df

In [None]:
import plotnine as p9
(
    p9.ggplot(df, p9.aes(x='day',y='value',color='variable')) + 
        p9.geom_line() + 
    p9.labs(color='',x='Day',y='Number of articles')
)

What are some ways this analysis could be improved?

* Check that articles about "Trump" and "Biden" are actually about the two candidates. Some may be about other things -- the English word "trump", "Hunter Biden", etc...
* Check whether the API searches article text or just article titles.
* Use more sources, and use American newspapers (unless the goal was to analyze international news).
* Make visualizations.
* Use a larger time window.
* Use other kinds of data (e.g., poll results) to look for relationships.
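
The first suggestion could start with a simple keyword filter on the headlines. The sketch assumes each result has a `webTitle` field, as in The Guardian's search results. The sample headlines and the matching rule are invented for illustration, and a rule this crude would miss many relevant articles.

```python
import re

# Invented headlines, shaped like entries of the "results" list.
sample_results = [
    {"webTitle": "Joe Biden wins Pennsylvania"},
    {"webTitle": "Hunter Biden laptop story resurfaces"},
    {"webTitle": "Love trumps hate, say marchers"},
    {"webTitle": "Donald Trump refuses to concede"},
]

# Crude rule: keep only titles that name a candidate in full.
candidate_pattern = re.compile(r"\b(Joe Biden|Donald Trump)\b")

about_candidates = [
    r["webTitle"] for r in sample_results if candidate_pattern.search(r["webTitle"])
]
print(about_candidates)
```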

Collecting and cleaning data takes a lot of very technical work, but it's only the first step in the analysis. When you finish data collection and cleaning, it can feel like you're finally done. Take a moment to congratulate yourself and step away from the data, so that when you come back you'll be ready to do a careful statistical analysis.

### OAuth

[OAuth](https://en.wikipedia.org/wiki/OAuth) is a way to give an application access to data on a website or web API.

You might run into OAuth if you use a web API where the data is private. For instance, Twitter provides a [web API](https://developer.twitter.com/en/docs.html) for managing your personal Twitter account. If you want to access the API from a Python script, first you have to use OAuth to tell Twitter that the script has permission to use your data.

OAuth can operate in several different ways. As always, check the documentation for the web API you want to use in order to find out what you need to do.

The simplest case of OAuth requires scripts to have a key or token from the web API provider. This is very similar to using an API key.
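
In that simple case, the token is often sent in an `Authorization` header as a bearer token. Whether a given API expects this is an assumption, so its documentation has the final say. The helper name `bearer_headers` and the URL in the comment are hypothetical.

```python
def bearer_headers(token):
    """Build request headers that carry an OAuth bearer token."""
    return {"Authorization": f"Bearer {token}"}


headers = bearer_headers("my-secret-token")  # placeholder, not a real token
print(headers)

# With requests, the headers would be passed along with the call, e.g.:
# response = requests.get("https://api.example.com/me", headers=headers)
```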

For more complicated cases, the `requests-oauthlib` package ([docs](https://requests-oauthlib.readthedocs.io/en/latest/)) may help.

### Undocumented Web APIs

Many websites use undocumented web APIs to get data. For example:

 - [University of California Compensation](https://ucannualwage.ucop.edu/wage/)
 - [Yolo County Health Inspections](https://yoloeco.envisionconnect.com/)

You can identify these websites by looking at requests in your browser's developer tools. In Firefox and Chrome, open them with <kbd>Ctrl</kbd> + <kbd>Shift</kbd> + <kbd>i</kbd> on Windows or <kbd>&#8984;</kbd> + <kbd>&#8997;</kbd> + <kbd>i</kbd> on macOS, then watch the Network tab while you use the site.

Requests to web APIs almost always return JSON or XML data. By examining the browser requests, you can work out the endpoints and parameters, allowing you to use the API.
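
For instance, a request URL copied from the developer tools can be split into its endpoint and parameters with the standard library. The URL below is the facility search request of the Yolo County site listed above; the variable names are just for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# URL as it might be copied from the Network tab of the developer tools.
copied = (
    "https://yoloeco.envisionconnect.com/api/pressAgentClient/searchFacilities"
    "?PressAgentOid=c08cb189-894c-4c8c-b595-a5ef010226b4"
)

parts = urlsplit(copied)
endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
# parse_qs returns a list per key because a key may repeat in a query string.
params = {k: v[0] for k, v in parse_qs(parts.query).items()}

print(endpoint)
print(params)
```

The resulting `endpoint` and `params` are what you would pass to `requests.get` or `requests.post`.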

**CAUTION:** Web APIs that are undocumented are often undocumented for a reason. Using an undocumented API may make someone angry or get you into legal trouble! Government and quasi-government websites (like the examples above) are probably okay, as long as you cache and rate-limit your requests. For everything else, look for an alternative or get permission first.
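
Rate limiting can be as simple as `time.sleep()` between requests. A slightly more careful sketch, with a hypothetical `RateLimiter` class, only waits for whatever part of the interval has not already passed:

```python
import time


class RateLimiter:
    """Block until at least min_interval seconds have passed since the last call."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


limiter = RateLimiter(0.1)
start = time.monotonic()
for _ in range(3):
    limiter.wait()  # call this right before each requests.get(...)
elapsed = time.monotonic() - start
print(f"3 calls took {elapsed:.2f}s")
```

The first call goes through immediately, and each later call waits until 0.1 seconds have passed since the previous one.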

Let's reverse engineer the Yolo County Health Inspections web API so that we can get data about local restaurants.

In [10]:
import numpy as np
import pandas as pd
import requests
import requests_cache
requests_cache.install_cache("mycache")

In [11]:
url = 'https://yoloeco.envisionconnect.com/api/pressAgentClient/searchFacilities'

In [12]:
result = requests.post(url, params = {
    'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4'
}, 
              data = {
    'FacilityName': "Ali Baba"
})
result.raise_for_status()

Check the [docs](https://requests.readthedocs.io/en/latest/api/?highlight=post#requests.post) for `requests`!

In [13]:
result.url

'https://yoloeco.envisionconnect.com/api/pressAgentClient/searchFacilities?PressAgentOid=c08cb189-894c-4c8c-b595-a5ef010226b4'

In [14]:
result.json()

[{'FacilityId': 'FA0001973',
  'FacilityName': 'ALI BABA RESTAURANT',
  'Address': '220 3RD ST ',
  'CityStateZip': 'DAVIS CA 95616 ',
  'LastScore': 100.0,
  'attachmentId': '1ed0446f-d8a7-4a72-8557-afb800bb6fac'}]

Let's investigate this further. The second request uses the `FacilityId` as a parameter.

In [15]:
url = 'https://yoloeco.envisionconnect.com/api/pressAgentClient/programs'
result = requests.get(url, params = {
    'FacilityId': 'FA0001973', 
    'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4'
})
result.raise_for_status()
result.json()

[{'ProgramId': 'PR0000674',
  'ProgramIdentifier': None,
  'ProgramCategory': 'FOOD INSPECTION',
  'LastScore': 100.0,
  'attachmentId': '1ed0446f-d8a7-4a72-8557-afb800bb6fac'}]

We are interested in the inspection text, for which we have to provide the `ProgramId` parameter.

In [16]:
url = 'https://yoloeco.envisionconnect.com/api/pressAgentClient/inspections'

In [17]:
result = requests.get(url, params = {
    'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4', 
    'ProgramId': 'PR0000674'
})
result.raise_for_status()

In [18]:
results = result.json()
results

[{'activity_date': '2023-02-28T00:00:00',
  'service': 'ROUTINE-INSPECTION',
  'Oid': '722bd292-ad66-46ad-a16b-afb7011a322e',
  'FacilityOid': 'a4602905-d0f7-4fbe-8388-7de1163c420e',
  'score': 100.0,
  'attachmentId': '5323c79d-03c4-418f-bb06-afb800b42912',
  'color': 'Green',
  'violations': [{'violation_description': 'K045 - Floors, walls and ceilings: built, maintained, and clean',
    'v_memo': 'California Retail Food Code §114271. Observed the following surfaces needing to be cleaned: \r\n- walls around the hood and the hood system \r\n- ceiling in food prep area and at the ventilation system. Cardboard and tape must be removed from ventilation system. \r\n- floors under the three compartment sink must be cleaned\r\nAll walls, floors, and ceiling surfaces within a facility shall be clean and in good repair, as well as durable, smooth, and non-absorbent at all times. Correct within 1 week.\r\n',
    'violation_text': 'The walls / ceilings shall have durable, smooth, nonabsorbent, 

In [19]:
results_df = pd.DataFrame(results)
results_df

| | activity_date | service | Oid | FacilityOid | score | attachmentId | color | violations |
|---|---|---|---|---|---|---|---|---|
| 0 | 2023-02-28T00:00:00 | ROUTINE-INSPECTION | 722bd292-ad66-46ad-a16b-afb7011a322e | a4602905-d0f7-4fbe-8388-7de1163c420e | 100.0 | 5323c79d-03c4-418f-bb06-afb800b42912 | Green | [{'violation_description': 'K045 - Floors, wal... |
| 1 | 2022-05-09T00:00:00 | ROUTINE-INSPECTION | 2ff09bcd-9a36-4b8d-ada1-ae90011ebb4d | a4602905-d0f7-4fbe-8388-7de1163c420e | 100.0 | 3fccdc38-5271-4669-bccc-ae9100aa99ee | Green | [{'violation_description': 'K006 - Adequate ha... |
| 2 | 2021-03-12T00:00:00 | ROUTINE-INSPECTION | dd11952d-7c82-4410-b030-ace901237a04 | a4602905-d0f7-4fbe-8388-7de1163c420e | 100.0 | 43e9d9ea-f28d-45e7-963e-ace901277735 | Green | [{'violation_description': 'K005 - Hands clean... |


In [20]:
results_df['violations'][0]

[{'violation_description': 'K045 - Floors, walls and ceilings: built, maintained, and clean',
  'v_memo': 'California Retail Food Code §114271. Observed the following surfaces needing to be cleaned: \r\n- walls around the hood and the hood system \r\n- ceiling in food prep area and at the ventilation system. Cardboard and tape must be removed from ventilation system. \r\n- floors under the three compartment sink must be cleaned\r\nAll walls, floors, and ceiling surfaces within a facility shall be clean and in good repair, as well as durable, smooth, and non-absorbent at all times. Correct within 1 week.\r\n',
  'violation_text': 'The walls / ceilings shall have durable, smooth, nonabsorbent, light-colored, and washable surfaces.  All floor surfaces, other than the customer service areas, shall be approved, smooth, durable and made of nonabsorbent material that is easily cleanable. Approved base coving shall be provided in all areas, except customer service areas and where food is store

In [21]:
results_df['violations'][0][0]['v_memo']

'California Retail Food Code §114271. Observed the following surfaces needing to be cleaned: \r\n- walls around the hood and the hood system \r\n- ceiling in food prep area and at the ventilation system. Cardboard and tape must be removed from ventilation system. \r\n- floors under the three compartment sink must be cleaned\r\nAll walls, floors, and ceiling surfaces within a facility shall be clean and in good repair, as well as durable, smooth, and non-absorbent at all times. Correct within 1 week.\r\n'

In [22]:
len(results_df['violations'][0])

1

In [23]:
violations = [
    results_df['violations'][0][i]['violation_description'] for i in range(len(results_df['violations'][0]))
]
violations

['K045 - Floors, walls and ceilings: built, maintained, and clean']

In [24]:
{'Ali Baba': violations}

{'Ali Baba': ['K045 - Floors, walls and ceilings: built, maintained, and clean']}

How can we generalize this procedure? 

In [25]:
url = 'https://yoloeco.envisionconnect.com/api/pressAgentClient/searchFacilities'

In [26]:
result=requests.post(url, params  = {
    "PressAgentOid": "c08cb189-894c-4c8c-b595-a5ef010226b4"}, 
                     data = {
    "FacilityName": "Ali Baba", 
})
result.raise_for_status()

In [27]:
result.json()

[{'FacilityId': 'FA0001973',
  'FacilityName': 'ALI BABA RESTAURANT',
  'Address': '220 3RD ST ',
  'CityStateZip': 'DAVIS CA 95616 ',
  'LastScore': 100.0,
  'attachmentId': '1ed0446f-d8a7-4a72-8557-afb800bb6fac'}]

In [28]:
result=requests.post(url, params  = {
    "PressAgentOid": "c08cb189-894c-4c8c-b595-a5ef010226b4"}, 
              data = {
    "FacilityName": "a", 
})
result.json()

[{'FacilityId': 'FA0001345',
  'FacilityName': 'A&B LIQUOR',
  'Address': '2328 W CAPITOL AVE ',
  'CityStateZip': 'WEST SACRAMENTO CA 95691 ',
  'LastScore': 100.0,
  'attachmentId': '778c1323-b5bd-4910-bd4a-af6d00f13489'},
 {'FacilityId': 'FA0022329',
  'FacilityName': 'ACE SUSHI @SAVE MART 604',
  'Address': '1900 ANDERSON RD ',
  'CityStateZip': 'DAVIS CA 95616 ',
  'LastScore': 100.0,
  'attachmentId': '12f56aac-cff0-4266-804e-b01c00fb05c4'},
 {'FacilityId': 'FA0019474',
  'FacilityName': 'ACOUSTIC EVENTS',
  'Address': '4467 D ST ',
  'CityStateZip': 'SACRAMENTO CA 95819 ',
  'LastScore': 100.0,
  'attachmentId': '6ad8b394-103b-4f88-8e1c-ae2b01015500'},
 {'FacilityId': 'FA0014014',
  'FacilityName': 'AFC SUSHI / HOT WOK @ BEL AIR #526',
  'Address': '1885 E GIBSON RD ',
  'CityStateZip': 'WOODLAND CA 95776 ',
  'LastScore': 100.0,
  'attachmentId': 'b935b8cb-815a-4c30-8a00-b05701044f13'},
 {'FacilityId': 'FA0014013',
  'FacilityName': "AFC SUSHI / HOT WOK @ RALEY'S #206",
  'Addr

In [29]:
pd.DataFrame(result.json())

| | FacilityId | FacilityName | Address | CityStateZip | LastScore | attachmentId |
|---|---|---|---|---|---|---|
| 0 | FA0001345 | A&B LIQUOR | 2328 W CAPITOL AVE | WEST SACRAMENTO CA 95691 | 100.0 | 778c1323-b5bd-4910-bd4a-af6d00f13489 |
| 1 | FA0022329 | ACE SUSHI @SAVE MART 604 | 1900 ANDERSON RD | DAVIS CA 95616 | 100.0 | 12f56aac-cff0-4266-804e-b01c00fb05c4 |
| 2 | FA0019474 | ACOUSTIC EVENTS | 4467 D ST | SACRAMENTO CA 95819 | 100.0 | 6ad8b394-103b-4f88-8e1c-ae2b01015500 |
| 3 | FA0014014 | AFC SUSHI / HOT WOK @ BEL AIR #526 | 1885 E GIBSON RD | WOODLAND CA 95776 | 100.0 | b935b8cb-815a-4c30-8a00-b05701044f13 |
| 4 | FA0014013 | AFC SUSHI / HOT WOK @ RALEY'S #206 | 367 W MAIN ST | WOODLAND CA 95695 | 100.0 | 389c36bc-45e3-47ef-a36e-b03e010a4c95 |
| 5 | FA0014015 | AFC SUSHI / HOT WOK @ RALEY'S #448 | 1601 W CAPITOL AVE | WEST SACRAMENTO CA 95691 | 100.0 | 54ab2356-9d16-40fa-a278-b0900101971d |
| 6 | FA0010191 | AFC SUSHI @ SAFEWAY #1205 | 1451 W COVELL BLVD | DAVIS CA 95616 | 100.0 | 5258bb8c-f65d-4f7f-8e93-b03e008d93a8 |
| 7 | FA0010190 | AFC SUSHI @ SAFEWAY #1561 | 2121 COWELL BLVD | DAVIS CA 95616 | 100.0 | 9c5a9061-4b83-4ccb-a3d8-b04c0105c1e2 |
| 8 | FA0018619 | AFC TIK TOK WOK @ SAFEWAY #1205 | 1451 W COVELL BLVD | DAVIS CA 95616 | 100.0 | ce5f0c41-1c7a-46a4-aad1-b03e008bcad3 |
| 9 | FA0012949 | AFRIDI FOOD COMPANY | 1250 CHURCHILL DOWNS AVE D STE | WOODLAND CA 95776 | 100.0 | b55fd798-09fb-4bbb-9c4d-aff1010b75ae |


Let's write a pipeline.

In [30]:
def fetch_violations(ProgramId):
    result = requests.get('https://yoloeco.envisionconnect.com/api/pressAgentClient/inspections', 
                          params = {
        'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4', 
        'ProgramId': ProgramId
    })
    result.raise_for_status()
    results = result.json()
    results_df = pd.DataFrame(results)
    violations = [
        results_df['violations'][0][i]['violation_description'] for i in range(len(results_df['violations'][0]))
    ]
    return(violations)

In [31]:
fetch_violations('PR0000674') # for ali baba 

['K045 - Floors, walls and ceilings: built, maintained, and clean']

In [32]:
def fetch_ProgramId(FacilityID):
    result = requests.get('https://yoloeco.envisionconnect.com/api/pressAgentClient/programs', 
                          params = {
        'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4', 
        'FacilityID': FacilityID
    })
    result.raise_for_status()
    ProgramId = result.json()[0]['ProgramId']
    return(ProgramId)

In [33]:
fetch_ProgramId('FA0001973')

'PR0000674'

In [34]:
def fetch_FacilityID(letter):
    # Search facilities by name; a single letter matches names starting with it.
    result = requests.post('https://yoloeco.envisionconnect.com/api/pressAgentClient/searchFacilities', 
                           params  = {
    "PressAgentOid": "c08cb189-894c-4c8c-b595-a5ef010226b4"}, 
                           data = {
    "FacilityName": letter, 
    })
    result.raise_for_status()
    facility_table = pd.DataFrame(result.json())[['FacilityId', 'FacilityName']]
    return(facility_table)

In [35]:
fetch_FacilityID('a')

| | FacilityId | FacilityName |
|---|---|---|
| 0 | FA0001345 | A&B LIQUOR |
| 1 | FA0022329 | ACE SUSHI @SAVE MART 604 |
| 2 | FA0019474 | ACOUSTIC EVENTS |
| 3 | FA0014014 | AFC SUSHI / HOT WOK @ BEL AIR #526 |
| 4 | FA0014013 | AFC SUSHI / HOT WOK @ RALEY'S #206 |
| 5 | FA0014015 | AFC SUSHI / HOT WOK @ RALEY'S #448 |
| 6 | FA0010191 | AFC SUSHI @ SAFEWAY #1205 |
| 7 | FA0010190 | AFC SUSHI @ SAFEWAY #1561 |
| 8 | FA0018619 | AFC TIK TOK WOK @ SAFEWAY #1205 |
| 9 | FA0012949 | AFRIDI FOOD COMPANY |


In [36]:
import time

In [37]:
def get_violations(): 
    violations = {}
    for letter in map(chr, range(97, 99)): # only 'a' and 'b'; range(97, 123) would cover a-z but takes too long
        time.sleep(0.05) # pause before the search request for each letter
        facility_table = fetch_FacilityID(letter)
        for index in range(facility_table.shape[0]): # for all facilities returned for this letter
            FacilityId, FacilityName = facility_table.iloc[index]
            time.sleep(0.1) # sleep again for each individual request
            ProgramId = fetch_ProgramId(FacilityId)
            print(FacilityName)
            violations[FacilityName] = fetch_violations(ProgramId)
    return(violations)

In [38]:
violations = get_violations()

A&B LIQUOR


KeyError: 'violations'

In [39]:
ProgramId = fetch_ProgramId('FA0001345')            
ProgramId

'PR0000623'

In [40]:
fetch_violations('PR0000623')

KeyError: 'violations'

In [41]:
result = requests.get('https://yoloeco.envisionconnect.com/api/pressAgentClient/inspections', params = {
        'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4', 
        'ProgramId': 'PR0000623'
})
result.raise_for_status()

In [42]:
results = result.json()
results

[]
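
The empty list explains the error. A data frame built from an empty list has no columns, so looking up `'violations'` fails. A small check that needs no network access:

```python
import pandas as pd

empty_df = pd.DataFrame([])  # what fetch_violations builds from an empty response
print(empty_df.empty)
print(list(empty_df.columns))

try:
    empty_df["violations"]
except KeyError as err:
    print("KeyError:", err)
```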

Let's check this in the browser!

In [43]:
result = requests.get('https://yoloeco.envisionconnect.com/api/pressAgentClient/programs', params = {
        'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4', 
        'FacilityID': 'FA0001345'
    }).json()
[result[i]['ProgramId'] for i in range(len(result))]

['PR0000623', 'PR0069422']

In [44]:
def fetch_ProgramId(FacilityID):
    result = requests.get('https://yoloeco.envisionconnect.com/api/pressAgentClient/programs', params = {
        'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4', 
        'FacilityID': FacilityID
    }).json()
    ProgramId = [result[i]['ProgramId'] for i in range(len(result))]
    return(ProgramId)

In [45]:
def fetch_violations(ProgramId_list):
    violations = []
    for ProgramId in ProgramId_list: 
        result = requests.get('https://yoloeco.envisionconnect.com/api/pressAgentClient/inspections', params = {
            'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4', 
            'ProgramId': ProgramId
        }).json()
        results_df = pd.DataFrame(result)
        if not results_df.empty: # only append violations if there are any
            violations.extend(
                [results_df['violations'][0][i]['violation_description'] for i in range(len(results_df['violations'][0]))]
            )
    return(violations)

In [46]:
fetch_violations(['PR0000623', 'PR0069422'])

['K021 - Hot and cold water available',
 'K039 - Thermometers provided and accurate']
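
Note that `fetch_violations` only looks at the first inspection in each response (index `0`). If the violations from every inspection were wanted, the nested JSON could be flattened as in this sketch. The sample data is invented but follows the structure of the responses shown earlier, and the helper name `all_violation_descriptions` is hypothetical.

```python
# Invented sample in the shape of the inspections endpoint's JSON.
sample_inspections = [
    {
        "activity_date": "2023-02-28T00:00:00",
        "violations": [{"violation_description": "K045 - Floors, walls and ceilings"}],
    },
    {
        "activity_date": "2022-05-09T00:00:00",
        "violations": [
            {"violation_description": "K006 - Adequate handwashing facilities"},
            {"violation_description": "K033 - Nonfood-contact surfaces clean"},
        ],
    },
    {"activity_date": "2021-03-12T00:00:00", "violations": []},
]


def all_violation_descriptions(inspections):
    """Collect (date, description) pairs across every inspection."""
    return [
        (insp["activity_date"][:10], v["violation_description"])
        for insp in inspections
        for v in insp.get("violations", [])
    ]


for date, description in all_violation_descriptions(sample_inspections):
    print(date, description)
```

Because the helper loops over plain lists, an empty response simply yields an empty list instead of raising a `KeyError`.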

In [47]:
violations = get_violations()

A&B LIQUOR
ACE SUSHI @SAVE MART 604
ACOUSTIC EVENTS
AFC SUSHI / HOT WOK @ BEL AIR #526
AFC SUSHI / HOT WOK @ RALEY'S #206
AFC SUSHI / HOT WOK @ RALEY'S #448
AFC SUSHI @ SAFEWAY #1205
AFC SUSHI @ SAFEWAY #1561
AFC TIK TOK WOK @ SAFEWAY #1205
AFRIDI FOOD COMPANY
AFTER HOURS BOBA & TEA
AGGIE LIQUOR
AGTECH INNOVATION ALLIANCE
AISLE 1 #2576
AJ HUNDAL MART
AKIRA COFFEE & TEA
ALI BABA RESTAURANT
ALL SEASONS ALL REASONS CATERING
ALOHA POKEE & RAMEN
ALYCE NORMAN SCHOOL
AM/PM MINI MARKET #5731- FOOD
AMY LOVES MUSTARD
ANAR PERSIAN KITCHEN
ANDERSON FAMILY CATERING BBQ
ANDERSON GAS & MINI MART - FOOD
ANDERSON ROAD SHELL - FOOD
ANDY'S ARCO - FOOD
ANTOJITOS JAIMITO #4TB1720
APNA BAZAAR
APPLEBEE'S - WOODLAND
ARCO AMPM GAS STATION
ARIANA FOOD MARKET
ARMADILLO MUSIC INC.
ASOCIACION LOS CAPORALES - ARENA CONCESSION
ATLAS CRAFT COFFEE
AUNTIE YASY'S GLUTEN-FREE GOODIES & MEAL DELIVERY LLC
AUTHENTIC INDIA
AVOCADO TOAST
AY! JALISCO TAQUERIA #1
BABES.BUBBLES.BOARDS
BABY O'S DONUTS
BAKLAVA AND COFFEE
BALLAST P

In [48]:
violations

{'A&B LIQUOR': ['K021 - Hot and cold water available',
  'K039 - Thermometers provided and accurate'],
 'ACE SUSHI @SAVE MART 604': [],
 'ACOUSTIC EVENTS': [],
 'AFC SUSHI / HOT WOK @ BEL AIR #526': ['K022 - Sewage and wastewater properly disposed'],
 "AFC SUSHI / HOT WOK @ RALEY'S #206": [],
 "AFC SUSHI / HOT WOK @ RALEY'S #448": ['K045 - Floors, walls and ceilings: built, maintained, and clean'],
 'AFC SUSHI @ SAFEWAY #1205': [],
 'AFC SUSHI @ SAFEWAY #1561': ['K035 - Equipment/Utensils - approved installed clean good repair, capacity'],
 'AFC TIK TOK WOK @ SAFEWAY #1205': ['K027 - Food separated and protected'],
 'AFRIDI FOOD COMPANY': ['K009 - Proper cooling methods',
  'K014 - Food contact surfaces: clean and sanitized'],
 'AFTER HOURS BOBA & TEA': ['K006 - Adequate handwashing facilities supplied & accessible',
  'K033 - Nonfood-contact surfaces clean'],
 'AGGIE LIQUOR': ['K045 - Floors, walls and ceilings: built, maintained, and clean'],
 'AGTECH INNOVATION ALLIANCE': [],
 'AISL

#### Safeway

Check the [docs](https://requests.readthedocs.io/en/latest/api/?requests.get)!

In [49]:
url = 'https://www.safeway.com/abs/pub/xapi/search/products'
params = {
    'request-id': 8261674581248326671,
    'q': 'eggs',
    'rows': 30,
    'start': 0,
    'search-type': 'keyword',
    'storeid': 3132,
    'featured': 'true',
    'url': 'https://www.safeway.com',
    'pageurl': 'https://www.safeway.com', 
    'search-uid': 'uid%3D3640904575678%3Av%3D12.0%3Ats%3D1674581210532%3Ahc%3D3', 
    'pagename': 'search',
    'dvid': 'web-4.1search',
}
header = {
    'ocp-apim-subscription-key': 'e914eec9448c4d5eb672debf5011cf8f', 
}

In [50]:
results = requests.get(url, params=params)
results.json()

{'statusCode': 401,
 'message': 'Access denied due to missing subscription key. Make sure to include subscription key when making requests to an API.'}

In [51]:
results = requests.get(url, params=params, headers=header)
results.json()

{'timestamp': '2023-10-23T06:48:25.987Z',
 'status': 'BAD_REQUEST',
 'message': 'Search encountered a problem. Please try again OSSR0033-R',
 'errors': ['Search encountered a problem. Please try again OSSR0033-R']}

In [52]:
results.raise_for_status

<bound method Response.raise_for_status of <Response [400]>>

### Summary 

- Check the request method, headers, and parameters using the browser's developer tools.
- Often, multiple API requests are made to display one result.