# Practical Pandas and APIs

**OBJECTIVE**

- Read CSV files using URLs and local file paths
- Access data using APIs and the `requests` library
- Parse JSON data returned from APIs

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas.plotting as pdp
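
The first objective, reading CSV files from URLs and local paths, comes down to one call: `pd.read_csv` accepts either. The sketch below writes a small made-up file to a temporary directory so it runs without a network connection; the file name, the columns, and the commented-out URL are all hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up sample data, used only for illustration.
csv_text = "city,population\nAlbany,100\nBuffalo,250\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    local_path = Path(tmp_dir) / "cities.csv"  # hypothetical file name
    local_path.write_text(csv_text)

    # Reading from a local file path.
    cities = pd.read_csv(local_path)

# Reading from a URL uses the same call; this URL is a placeholder, not a real dataset.
# cities = pd.read_csv("https://example.com/cities.csv")

print(cities)
```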

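### APIs

An API (application programming interface) is another source of data: you send an HTTP request to a URL and the server sends back structured data, usually JSON.  Let's consider a basic API that serves jokes from a wide variety of categories. Note that the `Any` category is unfiltered, so responses can include offensive jokes; each response carries `flags` and `safe` fields describing its content.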

- [Documentation](https://v2.jokeapi.dev/)
- Example response [here](https://v2.jokeapi.dev/joke/Any?type=twopart)

In [2]:
import requests

In [3]:
# Make a request to the API (through a URL); the timeout avoids waiting forever
r = requests.get('https://v2.jokeapi.dev/joke/Any?type=twopart', timeout=10)

In [4]:
r

<Response [200]>
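
`<Response [200]>` means the request succeeded. A response can also carry an error status (for example 404 or 500), so it is worth checking before parsing. The helper below is a minimal sketch, not part of `requests`: `fetch_json` is a hypothetical name, and the optional `session` argument exists only so the function can be exercised without a network connection.

```python
import requests


def fetch_json(url, params=None, timeout=10, session=None):
    """Request a URL and return the parsed JSON body, raising on HTTP errors."""
    http = session if session is not None else requests
    response = http.get(url, params=params, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses.
    response.raise_for_status()
    return response.json()


# Example call (needs network access), using the same endpoint as the cell above:
# joke = fetch_json("https://v2.jokeapi.dev/joke/Any", params={"type": "twopart"})
```

Passing query parameters through `params` instead of typing them into the URL lets `requests` handle the encoding.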

In [5]:
print(r.text)

{
    "error": false,
    "category": "Dark",
    "type": "twopart",
    "setup": "What's the difference between an apple and a black guy?",
    "delivery": "The apple will eventually fall from the tree that it's hanging from!",
    "flags": {
        "nsfw": false,
        "religious": false,
        "political": false,
        "racist": true,
        "sexist": false,
        "explicit": true
    },
    "id": 101,
    "safe": false,
    "lang": "en"
}


In [6]:
data_dict = r.json()
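
`r.text` is a plain string, while `r.json()` parses that string into Python dictionaries and lists. The standard library's `json.loads` does the same kind of parsing on any JSON string. The sketch below uses a made-up payload with some of the same keys as the response above so that it runs offline.

```python
import json

# Placeholder payload shaped like the API response; the values are invented.
raw_text = """
{
    "error": false,
    "type": "twopart",
    "setup": "placeholder setup",
    "delivery": "placeholder delivery",
    "flags": {"nsfw": false, "explicit": false},
    "id": 0,
    "safe": true
}
"""

parsed = json.loads(raw_text)

print(type(raw_text).__name__)  # str
print(type(parsed).__name__)    # dict

# JSON booleans become Python booleans, and nested objects become nested dicts.
print(parsed["safe"])
print(parsed["flags"]["nsfw"])
```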

In [7]:
data_dict['setup']

"What's the difference between an apple and a black guy?"
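
Once several responses are parsed, they can be collected into a DataFrame. `pd.json_normalize` flattens nested dictionaries such as `flags` into dotted column names, which makes it straightforward to filter on the `safe` field. The records below are invented placeholders with the same structure as the response above.

```python
import pandas as pd

# Invented records; only the keys mirror the API response.
jokes = [
    {
        "id": 1,
        "category": "Programming",
        "setup": "placeholder setup A",
        "delivery": "placeholder delivery A",
        "flags": {"nsfw": False, "explicit": False},
        "safe": True,
    },
    {
        "id": 2,
        "category": "Dark",
        "setup": "placeholder setup B",
        "delivery": "placeholder delivery B",
        "flags": {"nsfw": True, "explicit": True},
        "safe": False,
    },
]

# Nested keys become columns such as "flags.nsfw" and "flags.explicit".
jokes_df = pd.json_normalize(jokes)
print(jokes_df.columns.tolist())

# Keep only the rows the API marked as safe.
safe_jokes = jokes_df[jokes_df["safe"]]
print(safe_jokes[["id", "category"]])
```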

### Example II: NYC Open Data

Some APIs have supporting libraries that sit between you and the raw HTTP requests. For example, the `sodapy` library wraps the Socrata API that serves the [NYC Citywide Payroll Data](https://dev.socrata.com/foundry/data.cityofnewyork.us/k397-673e).
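
To see what such a library hides, the sketch below builds the kind of URL a raw request would use. It assumes the standard Socrata (SODA) pattern of `https://<domain>/resource/<dataset id>.json` with `$limit`-style query parameters; check the dataset's documentation page before relying on it.

```python
from urllib.parse import urlencode

domain = "data.cityofnewyork.us"
dataset_id = "k397-673e"

# SODA query parameters start with "$"; urlencode percent-encodes it as %24.
query = urlencode({"$limit": 5})
url = f"https://{domain}/resource/{dataset_id}.json?{query}"
print(url)

# With network access, the raw request would look like:
# import requests
# rows = requests.get(url, timeout=10).json()  # a list of dictionaries
```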


In [8]:
# make sure to install these packages before running:
!pip install sodapy

import pandas as pd
from sodapy import Socrata

# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
client = Socrata("data.cityofnewyork.us", None)

# First 2000 results, returned as JSON from API / converted to Python list of
# dictionaries by sodapy.
results = client.get("k397-673e", limit=2000)

# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)
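
The `limit=2000` argument returns only the first 2000 rows. To collect more, a common pattern is to request successive pages with `limit` and `offset`. The generator below is a sketch of that pattern: `iter_pages` and `fetch_page` are hypothetical names, and the fake fetcher stands in for a real API call so the example runs offline.

```python
def iter_pages(fetch_page, page_size=2000, max_pages=None):
    """Yield records page by page until an empty or short page is returned."""
    offset = 0
    pages_seen = 0
    while max_pages is None or pages_seen < max_pages:
        page = fetch_page(limit=page_size, offset=offset)
        if not page:
            break
        yield from page
        # A page shorter than page_size means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size
        pages_seen += 1


# Stand-in for the API: five fake records served in slices.
fake_rows = [{"row": i} for i in range(5)]


def fake_fetch(limit, offset):
    return fake_rows[offset:offset + limit]


records = list(iter_pages(fake_fetch, page_size=2))
print(len(records))
```

With sodapy, `fetch_page` could be `lambda limit, offset: client.get("k397-673e", limit=limit, offset=offset)`; this assumes `client.get` forwards `offset` the same way it forwards `limit`, so confirm against the sodapy documentation.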

In [9]:
results_df.head()

| | fiscal_year | payroll_number | agency_name | last_name | first_name | mid_init | agency_start_date | work_location_borough | title_description | leave_status_as_of_june_30 | base_salary | pay_basis | regular_hours | regular_gross_paid | ot_hours | total_ot_paid | total_other_pay |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2025 | 67 | ADMIN FOR CHILDREN'S SVCS | NARVAEZ | JOSE | I | 1996-06-23T00:00:00.000 | MANHATTAN | CITY LABORER | ACTIVE | 331.92 | per Day | 2080.0 | 75400.0 | 886.5 | 49064.52 | 0.0 |
| 1 | 2025 | 67 | ADMIN FOR CHILDREN'S SVCS | HOWARD | CONNIE | T | 2011-07-04T00:00:00.000 | BRONX | ASSOCIATE YOUTH DEVELOPMENT SPECIALIST | ACTIVE | 81562.0 | per Annum | 1820.0 | 73742.01 | 407.52 | 23908.27 | 18706.99 |
| 2 | 2025 | 67 | ADMIN FOR CHILDREN'S SVCS | CALABRESE | MICHAEL | | 1996-06-23T00:00:00.000 | MANHATTAN | PROGRAM EVALUATOR | ACTIVE | 108739.0 | per Annum | 1820.0 | 105356.95 | 0.0 | 0.0 | 1497.01 |
| 3 | 2025 | 67 | ADMIN FOR CHILDREN'S SVCS | GILL | KAREN | E | 1996-06-23T00:00:00.000 | MANHATTAN | CLERICAL ASSOCIATE MOST MAYORAL AG | ACTIVE | 76460.0 | per Annum | 1820.0 | 74075.15 | 0.0 | 0.0 | 5696.52 |
| 4 | 2025 | 67 | ADMIN FOR CHILDREN'S SVCS | JACKSON | SHELDENE | A | 1998-07-13T00:00:00.000 | QUEENS | CHILD PROTECTIVE SPECIALIST | ACTIVE | 75138.0 | per Annum | 1820.0 | 72794.28 | 56.0 | 2781.22 | 9274.01 |
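
JSON APIs often return every value as a string, so numeric and date columns may need converting before analysis. The sketch below assumes that is the case here (check with `results_df.dtypes`) and uses two small records modeled on the output above, with a subset of its columns and every value stored as a string.

```python
import pandas as pd

# Small stand-in for the API results; values are strings on purpose.
sample_records = [
    {"fiscal_year": "2025", "agency_start_date": "1996-06-23T00:00:00.000",
     "base_salary": "331.92", "pay_basis": "per Day"},
    {"fiscal_year": "2025", "agency_start_date": "2011-07-04T00:00:00.000",
     "base_salary": "81562.0", "pay_basis": "per Annum"},
]
sample_df = pd.DataFrame.from_records(sample_records)

# Convert numeric-looking columns from strings to numbers.
numeric_cols = ["fiscal_year", "base_salary"]
sample_df[numeric_cols] = sample_df[numeric_cols].apply(pd.to_numeric)

# Convert the ISO-formatted timestamp strings to datetimes.
sample_df["agency_start_date"] = pd.to_datetime(sample_df["agency_start_date"])

print(sample_df.dtypes)
```

Note that `base_salary` mixes daily and annual rates (see `pay_basis`), so the column should not be compared across rows without accounting for that.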


### Finding Data

Identify two areas of interest to your group.  Can you find a good source of data on each subject, whether as local data files, URLs, or an API?