# 2-SRM 641 Week 3 Pandas


## Making Pandas DataFrames from API Requests

Many websites expose public APIs that serve data as JSON or in another format. There are several ways to call these APIs from Python; one common choice is the `requests` package, which can be installed with pip or conda: `pip install requests` or `conda install requests`.

In this example, we will use the U.S. Geological Survey's API to grab a JSON object of earthquake data and convert it to a `pandas.DataFrame`.

USGS API: https://earthquake.usgs.gov/fdsnws/event/1/

**Run the following code and save the data in your folder for further analysis.**

In [1]:
# Get Data from API

import datetime as dt
import pandas as pd
import requests

yesterday = dt.date.today() - dt.timedelta(days=1)
api = 'https://earthquake.usgs.gov/fdsnws/event/1/query'
payload = {
    'format': 'geojson',
    'starttime': yesterday - dt.timedelta(days=30),
    'endtime': yesterday
}
response = requests.get(api, params=payload)

# let's make sure the request was OK
response.status_code

200

A status code of 200 means OK, so we can pull the data out of the result. Since we asked the API for a JSON payload, we can extract it from the response with the `json()` method.
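The status check can also be done in code rather than by eye. The following is a minimal sketch, not part of the original notebook: `FakeResponse` and `extract_json` are hypothetical names, and `FakeResponse` only imitates the two pieces of a `requests` response used here (`status_code` and `json()`) so the example runs without a network connection.

```python
class FakeResponse:
    """Hypothetical stand-in for a requests response (offline illustration only)."""

    def __init__(self, status_code, data):
        self.status_code = status_code
        self._data = data

    def json(self):
        # Mimics response.json(): return the parsed JSON body
        return self._data


def extract_json(response):
    """Return the parsed JSON body, or raise if the status code is not 200."""
    if response.status_code != 200:
        raise RuntimeError(f"Request failed with status {response.status_code}")
    return response.json()


# A successful response yields the payload
ok_payload = extract_json(FakeResponse(200, {"type": "FeatureCollection", "features": []}))
print(ok_payload["type"])

# A failed response raises instead of silently returning bad data
try:
    extract_json(FakeResponse(400, {}))
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)
```

With a real `requests` response, calling `response.raise_for_status()` serves a similar purpose: it raises an exception for 4xx and 5xx status codes.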

## Isolate the Data from the JSON Response

We need to inspect the structure of the response to find out where our data is stored.

In [2]:
earthquake_json = response.json()
earthquake_json.keys()

dict_keys(['type', 'metadata', 'features', 'bbox'])

The USGS API provides information about our request under the `metadata` key. Note that your result will differ from the one shown here, regardless of the date range you choose, because the API includes a timestamp for when the data was pulled:

In [3]:
earthquake_json['metadata']

{'generated': 1699835425000,
 'url': 'https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2023-10-12&endtime=2023-11-11',
 'title': 'USGS Earthquakes',
 'status': 200,
 'api': '1.14.0',
 'count': 10087}
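The metadata shown above can be decoded a little further. This sketch assumes that `generated` is a Unix timestamp in milliseconds (the value is consistent with that reading, but treat it as an assumption) and uses only the standard library to recover the query parameters from the `url` field. The dictionary is copied from the output above, trimmed to the fields used.

```python
import datetime as dt
from urllib.parse import urlparse, parse_qs

# Copied from the metadata output above (trimmed to the fields used here)
metadata = {
    'generated': 1699835425000,
    'url': 'https://earthquake.usgs.gov/fdsnws/event/1/query'
           '?format=geojson&starttime=2023-10-12&endtime=2023-11-11',
    'count': 10087,
}

# Assumption: 'generated' is milliseconds since the Unix epoch
generated_at = dt.datetime.fromtimestamp(
    metadata['generated'] / 1000, tz=dt.timezone.utc
)
print(generated_at.isoformat())

# Recover the parameters that were sent with the request
query = parse_qs(urlparse(metadata['url']).query)
print(query)
```

Note that `parse_qs` returns each value as a list, because a query string may repeat a key. The `starttime` and `endtime` values match the dates in the `payload` dictionary, which shows how the date objects were encoded into the request URL.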

Each element in the JSON array `features` is a row of data for our dataframe.

In [4]:
type(earthquake_json['features'])

list

Because the API returns live data, your results will depend on the date you run this. Here is the first entry from our request:

In [5]:
earthquake_json['features'][0]

{'type': 'Feature',
 'properties': {'mag': 4.6,
  'place': '5 km SSE of Reykjanesbær, Iceland',
  'time': 1699659700951,
  'updated': 1699662479040,
  'tz': None,
  'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/us7000la96',
  'detail': 'https://earthquake.usgs.gov/fdsnws/event/1/query?eventid=us7000la96&format=geojson',
  'felt': None,
  'cdi': None,
  'mmi': None,
  'alert': None,
  'status': 'reviewed',
  'tsunami': 0,
  'sig': 326,
  'net': 'us',
  'code': '7000la96',
  'ids': ',us7000la96,',
  'sources': ',us,',
  'types': ',origin,phase-data,',
  'nst': 47,
  'dmin': 0.943,
  'rms': 0.37,
  'gap': 111,
  'magType': 'mb',
  'type': 'earthquake',
  'title': 'M 4.6 - 5 km SSE of Reykjanesbær, Iceland'},
 'geometry': {'type': 'Point', 'coordinates': [-22.5005, 63.958, 10]},
 'id': 'us7000la96'}
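To make the layout of a single feature concrete, here is a small sketch that flattens one entry into a plain dictionary. It uses a trimmed copy of the feature shown above. Two things are assumptions rather than statements from this notebook: that `coordinates` is ordered longitude, latitude, depth (the usual GeoJSON ordering, with depth as the third value), and that `time` is in milliseconds since the Unix epoch. The key names `longitude`, `latitude`, and `depth` are chosen here for illustration.

```python
import datetime as dt

# Trimmed copy of the first feature shown above
feature = {
    'type': 'Feature',
    'properties': {
        'mag': 4.6,
        'place': '5 km SSE of Reykjanesbær, Iceland',
        'time': 1699659700951,
    },
    'geometry': {'type': 'Point', 'coordinates': [-22.5005, 63.958, 10]},
    'id': 'us7000la96',
}

# Assumption: coordinates are ordered [longitude, latitude, depth]
longitude, latitude, depth = feature['geometry']['coordinates']

# Assumption: 'time' is milliseconds since the Unix epoch
event_time = dt.datetime.fromtimestamp(
    feature['properties']['time'] / 1000, tz=dt.timezone.utc
)

# Merge the properties with the location fields and the event id
record = {
    **feature['properties'],
    'longitude': longitude,
    'latitude': latitude,
    'depth': depth,
    'id': feature['id'],
}
print(record)
print(event_time.date())
```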

## Convert to DataFrame

We need to grab the `properties` section out of every entry in the `features` JSON array to create our dataframe.

In [6]:
earthquake_properties_data = [
    quake['properties'] for quake in earthquake_json['features']
]
df = pd.DataFrame(earthquake_properties_data)
df.head()  # show only the first 5 rows

Unnamed: 0,mag,place,time,updated,tz,url,detail,felt,cdi,mmi,...,ids,sources,types,nst,dmin,rms,gap,magType,type,title
0,4.6,"5 km SSE of Reykjanesbær, Iceland",1699659700951,1699662479040,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",us7000la96,",",us,",",origin,phase-data,",47.0,0.943,0.37,111.0,mb,earthquake,"M 4.6 - 5 km SSE of Reykjanesbær, Iceland"
1,-0.17,Alaska Peninsula,1699659556140,1699668903770,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",av91121313,",",av,",",origin,phase-data,",6.0,,0.05,147.0,ml,earthquake,M -0.2 - Alaska Peninsula
2,4.5,Banda Sea,1699658997006,1699660514040,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",us7000la90,",",us,",",origin,phase-data,",29.0,1.57,0.82,54.0,mb,earthquake,M 4.5 - Banda Sea
3,3.15,"4 km WSW of Mayagüez, Puerto Rico",1699657949680,1699659717050,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",pr71431213,",",pr,",",origin,phase-data,",10.0,0.04183,0.19,157.0,md,earthquake,"M 3.2 - 4 km WSW of Mayagüez, Puerto Rico"
4,1.1,"31 km SSW of Goldfield, Nevada",1699656768354,1699659341711,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",nn00868762,",",nn,",",origin,phase-data,",10.0,0.184,0.1333,122.75,ml,earthquake,"M 1.1 - 31 km SSW of Goldfield, Nevada"
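The `time` and `updated` columns in this output are large integers rather than readable dates. As an optional follow-up, and assuming these values are milliseconds since the Unix epoch, they can be converted with `pd.to_datetime`. The sketch below uses three rows copied from the output above instead of calling the API, and `sample_df` is an illustrative name.

```python
import pandas as pd

# Three rows copied from the output above (only a few columns kept)
sample_properties = [
    {'mag': 4.6, 'place': '5 km SSE of Reykjanesbær, Iceland', 'time': 1699659700951},
    {'mag': -0.17, 'place': 'Alaska Peninsula', 'time': 1699659556140},
    {'mag': 4.5, 'place': 'Banda Sea', 'time': 1699658997006},
]
sample_df = pd.DataFrame(sample_properties)

# Assumption: 'time' holds milliseconds since the Unix epoch
sample_df['time'] = pd.to_datetime(sample_df['time'], unit='ms')

print(sample_df.dtypes)
print(sample_df)
```

The same call should work on the full dataframe, for example `df['time'] = pd.to_datetime(df['time'], unit='ms')`.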


In [7]:
# View the whole dataframe (10087 rows, 26 columns)

df

Unnamed: 0,mag,place,time,updated,tz,url,detail,felt,cdi,mmi,...,ids,sources,types,nst,dmin,rms,gap,magType,type,title
0,4.60,"5 km SSE of Reykjanesbær, Iceland",1699659700951,1699662479040,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",us7000la96,",",us,",",origin,phase-data,",47.0,0.94300,0.3700,111.00,mb,earthquake,"M 4.6 - 5 km SSE of Reykjanesbær, Iceland"
1,-0.17,Alaska Peninsula,1699659556140,1699668903770,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",av91121313,",",av,",",origin,phase-data,",6.0,,0.0500,147.00,ml,earthquake,M -0.2 - Alaska Peninsula
2,4.50,Banda Sea,1699658997006,1699660514040,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",us7000la90,",",us,",",origin,phase-data,",29.0,1.57000,0.8200,54.00,mb,earthquake,M 4.5 - Banda Sea
3,3.15,"4 km WSW of Mayagüez, Puerto Rico",1699657949680,1699659717050,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",pr71431213,",",pr,",",origin,phase-data,",10.0,0.04183,0.1900,157.00,md,earthquake,"M 3.2 - 4 km WSW of Mayagüez, Puerto Rico"
4,1.10,"31 km SSW of Goldfield, Nevada",1699656768354,1699659341711,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",nn00868762,",",nn,",",origin,phase-data,",10.0,0.18400,0.1333,122.75,ml,earthquake,"M 1.1 - 31 km SSW of Goldfield, Nevada"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10082,0.95,"3 km ENE of The Geysers, CA",1697069851600,1697077996330,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,1.0,0.0,,...,",nc73946215,",",nc,",",dyfi,nearby-cities,origin,phase-data,scitech-...",8.0,0.01036,0.0200,92.00,md,earthquake,"M 1.0 - 3 km ENE of The Geysers, CA"
10083,1.12,"3 km ENE of The Geysers, CA",1697069809590,1697076372183,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",nc73946210,",",nc,",",nearby-cities,origin,phase-data,scitech-link,",9.0,0.01018,0.0100,92.00,md,earthquake,"M 1.1 - 3 km ENE of The Geysers, CA"
10084,1.40,"115 km NW of Yakutat, Alaska",1697069804213,1698690566413,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",ak023d3c9fpb,",",ak,",",origin,phase-data,",,,0.4200,,ml,ice quake,"M 1.4 Ice Quake - 115 km NW of Yakutat, Alaska"
10085,1.03,"3 km SSW of Cobb, CA",1697069373120,1697072892830,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",nc73946205,",",nc,",",nearby-cities,origin,phase-data,scitech-link,",16.0,0.01684,0.0300,107.00,md,earthquake,"M 1.0 - 3 km SSW of Cobb, CA"


In [8]:
# View the last 5 observations

df.tail()

Unnamed: 0,mag,place,time,updated,tz,url,detail,felt,cdi,mmi,...,ids,sources,types,nst,dmin,rms,gap,magType,type,title
10082,0.95,"3 km ENE of The Geysers, CA",1697069851600,1697077996330,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,1.0,0.0,,...,",nc73946215,",",nc,",",dyfi,nearby-cities,origin,phase-data,scitech-...",8.0,0.01036,0.02,92.0,md,earthquake,"M 1.0 - 3 km ENE of The Geysers, CA"
10083,1.12,"3 km ENE of The Geysers, CA",1697069809590,1697076372183,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",nc73946210,",",nc,",",nearby-cities,origin,phase-data,scitech-link,",9.0,0.01018,0.01,92.0,md,earthquake,"M 1.1 - 3 km ENE of The Geysers, CA"
10084,1.4,"115 km NW of Yakutat, Alaska",1697069804213,1698690566413,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",ak023d3c9fpb,",",ak,",",origin,phase-data,",,,0.42,,ml,ice quake,"M 1.4 Ice Quake - 115 km NW of Yakutat, Alaska"
10085,1.03,"3 km SSW of Cobb, CA",1697069373120,1697072892830,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",nc73946205,",",nc,",",nearby-cities,origin,phase-data,scitech-link,",16.0,0.01684,0.03,107.0,md,earthquake,"M 1.0 - 3 km SSW of Cobb, CA"
10086,0.9,"24 km ESE of Julian, CA",1697068830650,1697131563986,,https://earthquake.usgs.gov/earthquakes/eventp...,https://earthquake.usgs.gov/fdsnws/event/1/que...,,,,...,",ci40581440,",",ci,",",nearby-cities,origin,phase-data,scitech-link,",35.0,0.07059,0.19,71.0,ml,earthquake,"M 0.9 - 24 km ESE of Julian, CA"


Using DataFrame’s `to_csv` method, we can write the data out to a comma-separated file:

In [9]:
# Write to csv - save in your folder

df.to_csv('earthquakes.csv', index=False)
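The `index=False` argument matters when the file is read back later. The sketch below is an illustration on a two-row sample, not part of the original notebook: it writes the same small dataframe with and without the index and compares the columns that `pd.read_csv` recovers. Calling `to_csv` without a path returns the CSV text as a string, which keeps the example from writing any files.

```python
import io
import pandas as pd

# Two rows taken from the earthquake data above
sample_df = pd.DataFrame({
    'mag': [4.6, -0.17],
    'place': ['5 km SSE of Reykjanesbær, Iceland', 'Alaska Peninsula'],
})

# Default behaviour: the row index is written as an extra, unnamed column
csv_with_index = sample_df.to_csv()
columns_with_index = pd.read_csv(io.StringIO(csv_with_index)).columns.tolist()

# index=False: only the data columns are written
csv_without_index = sample_df.to_csv(index=False)
columns_without_index = pd.read_csv(io.StringIO(csv_without_index)).columns.tolist()

print(columns_with_index)
print(columns_without_index)
```

Since the row index here is just a running counter, leaving it out keeps the saved file limited to the actual earthquake fields.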

Resources:

- Stefanie Molin. *Hands-On Data Analysis with Pandas*.