# Week 9 Lecture - Data Formats

Topics
- Getting data from the web
- Application Programming Interfaces (APIs)
- Hypertext Transfer Protocol (HTTP) 
- HTTP in Python with Requests

## Getting Data from the Web

So far we have gotten data into Python in two ways:
- Typing it literally into our code as ints, floats, and strings
- Reading files from disk using `open()`

Examples:
- Project Gutenberg example
- WPRDC example with CSV
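
As a rough illustration of the CSV case, the sketch below parses CSV text the same way whether it came from a file on disk or from a web download. The column layout of `SAMPLE_CSV` is an assumption made for this sketch (the real WPRDC export may differ), and `parse_csv_text` is a hypothetical helper name; the two rows reuse dog license records that appear later in these notes.

```python
import csv
import io

# Sample CSV text standing in for a file downloaded from the web.
# The column layout is an assumption made for illustration.
SAMPLE_CSV = """DogName,Breed,Color,OwnerZip
CURLEY BUD,CHOW CHOW MIX,BLACK,15132
LILO,PUG,TAN,15135
"""


def parse_csv_text(text):
    """Turn CSV text into a list of dicts, one dict per data row."""
    # io.StringIO lets the csv module read a string as if it were a file
    return list(csv.DictReader(io.StringIO(text)))


rows = parse_csv_text(SAMPLE_CSV)
for row in rows:
    print(row["DogName"], "-", row["Breed"])
```

Every value comes back as a string, so a column like `OwnerZip` keeps any leading zeros, while numeric columns would need an explicit `int()` or `float()` conversion.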

## Application Programming Interfaces 


- What is an API
- API documentation
    - National Weather Service - https://www.weather.gov/documentation/services-web-api
- Example APIs
    - Library of Congress https://labs.loc.gov/lc-for-robots/
    - Open Weather https://openweathermap.org/api

### What is an API

APIs are *designed* by humans for humans (well, for software developers).

### Using APIs: Read the documentation

### Example APIs



```
Information Sciences Building, 
135 North Bellefield Avenue, 
Pittsburgh, PA 15213

Latitude: 40.447475 | Longitude: -79.952396
```
Get your GPS coordinates via https://www.gps-coordinates.net 

## Hypertext Transfer Protocol - The Lingua Franca of the Web

HTTP is the protocol, or language, of web browsers and web servers, but also of many other services and applications.

https://developer.mozilla.org/en-US/docs/Web/HTTP

https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview

HTTP is not just for HTML; it can carry data too.

When you know HTTP and can read API documentation, you have the basic knowledge and skills to access all kinds of APIs.
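
To make "the language of the web" a little more concrete, here is a minimal sketch of the text a client sends for a simple HTTP/1.1 request. `format_request` is a hypothetical helper written for these notes, not part of any library, and the `Accept` header is only an illustrative choice. Real clients send more headers and handle many details this sketch ignores.

```python
from urllib.parse import urlsplit


def format_request(method, url, headers=None):
    """Build the raw text of a simple HTTP/1.1 request (no body)."""
    parts = urlsplit(url)
    # the request target is the path plus the optional query string
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [f"{method} {target} HTTP/1.1", f"Host: {parts.netloc}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # header lines end with CRLF and a blank line closes the header section
    return "\r\n".join(lines) + "\r\n\r\n"


raw = format_request(
    "GET",
    "https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview",
    headers={"Accept": "text/html"},
)
print(raw)
```

The first line carries the verb and the path, and the headers follow one per line. The server answers in a similar shape: a status line, response headers, a blank line, and then the body.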

### HTTP Request & Response


### HTTP Verbs

### HTTP Flow

## HTTP in Python with Requests

Python includes an [HTTP module in the standard library](https://docs.python.org/3/library/http.html). However, it is "low level," which means its API design can leave some things to be desired.
- Think of building a house from scratch vs. [buying one from Ikea](https://www.boklok.co.uk)
    
    
Because communicating with APIs via HTTP is so common, a software developer named [Ken Reitz](https://kenreitz.org) created a third-party Python library called [Requests](https://requests.readthedocs.io/en/master/).

"**Requests** is an elegant and simple HTTP library for Python, built for human beings."



In [2]:
import requests

In [3]:
r = requests.get('https://api.github.com/user', auth=('user', 'pass'))

In [4]:
r.status_code

401
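
The 401 above is an HTTP status code: the server rejected the placeholder credentials `('user', 'pass')`. As a small sketch, the standard library's `http.HTTPStatus` can translate a code into its standard phrase, and the first digit gives the general category. `describe_status` and `CATEGORIES` are hypothetical names used only for this illustration.

```python
from http import HTTPStatus

# The first digit of a status code gives its general category
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return a short human-readable description of a status code."""
    phrase = HTTPStatus(code).phrase
    category = CATEGORIES[code // 100]
    return f"{code} {phrase} ({category})"


for code in (200, 401, 404):
    print(describe_status(code))
```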

In [5]:
r.headers

{'Server': 'GitHub.com', 'Date': 'Mon, 15 Mar 2021 13:57:10 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Content-Length': '131', 'X-GitHub-Media-Type': 'github.v3; format=json', 'X-RateLimit-Limit': '60', 'X-RateLimit-Remaining': '59', 'X-RateLimit-Reset': '1615820230', 'X-RateLimit-Used': '1', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security-Policy': "default-src 'none'", 'Vary': 'Accept-Encoding, Accept, X-Requested-With', 'X-GitHub-Request-Id': 'C67D:10F

In [6]:
r.text

'{"message":"Requires authentication","documentation_url":"https://docs.github.com/rest/reference/users#get-the-authenticated-user"}'

In [7]:
r.json()

{'message': 'Requires authentication',
 'documentation_url': 'https://docs.github.com/rest/reference/users#get-the-authenticated-user'}
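
The difference between `r.text` and `r.json()` is that the first is one long string and the second is a Python dictionary. Roughly speaking, `r.json()` parses the response body for you, much like the standard library's `json.loads` does in this sketch, which reuses the response text shown above.

```python
import json

# the response body exactly as r.text returned it: a single string
body = '{"message":"Requires authentication","documentation_url":"https://docs.github.com/rest/reference/users#get-the-authenticated-user"}'

# parse the JSON text into Python data structures
parsed = json.loads(body)

print(type(body).__name__, "->", type(parsed).__name__)
print(parsed["message"])
```

Once the body is a dictionary, individual values can be looked up by key instead of searching through the string.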

Dog licenses: the Allegheny County dog license dataset from the Western Pennsylvania Regional Data Center (WPRDC)
https://data.wprdc.org/dataset/allegheny-county-dog-licenses/resource/e16d4ab3-842a-4f39-9ad7-ce5921002280

Query for dogs named "bud":
https://data.wprdc.org/api/3/action/datastore_search?resource_id=e16d4ab3-842a-4f39-9ad7-ce5921002280&q=bud

In [8]:
# set up our request
query = "bud"
resource_id = "e16d4ab3-842a-4f39-9ad7-ce5921002280"
endpoint_url = "https://data.wprdc.org/api/3/action/datastore_search"

parameters = {"resource_id": resource_id,
              "q": query}

In [9]:
# make the request
response = requests.get(endpoint_url, params=parameters)
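
Passing `params` saves us from gluing the query string together by hand. The sketch below uses the standard library's `urlencode` to show the kind of URL that results; Requests does comparable encoding internally, and `response.url` shows the URL it actually requested. The `tricky` value is a made-up query, included only to show how special characters get escaped.

```python
from urllib.parse import urlencode

endpoint_url = "https://data.wprdc.org/api/3/action/datastore_search"
parameters = {"resource_id": "e16d4ab3-842a-4f39-9ad7-ce5921002280",
              "q": "bud"}

# join the endpoint and the encoded parameters with a question mark
full_url = endpoint_url + "?" + urlencode(parameters)
print(full_url)

# spaces and ampersands must be escaped so they don't break the URL
tricky = urlencode({"q": "rose bud & co"})
print(tricky)
```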

In [11]:
response.status_code

200

In [12]:
response.headers

{'Server': 'nginx/1.10.3 (Ubuntu)', 'Date': 'Mon, 15 Mar 2021 14:05:26 GMT', 'Content-Type': 'application/json;charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Pragma': 'no-cache', 'Cache-Control': 'no-cache', 'X-Cache-Status': 'MISS', 'Content-Encoding': 'gzip'}

In [13]:
response.json()

{'help': 'https://data.wprdc.org/api/3/action/help_show?name=datastore_search',
 'success': True,
 'result': {'include_total': True,
  'resource_id': 'e16d4ab3-842a-4f39-9ad7-ce5921002280',
  'fields': [{'type': 'int', 'id': '_id'},
   {'type': 'text', 'id': 'LicenseType'},
   {'type': 'text', 'id': 'Breed'},
   {'type': 'text', 'id': 'Color'},
   {'type': 'text', 'id': 'DogName'},
   {'type': 'text', 'id': 'OwnerZip'},
   {'type': 'int4', 'id': 'ExpYear'},
   {'type': 'timestamp', 'id': 'ValidDate'},
   {'type': 'float', 'id': 'rank'}],
  'records_format': 'objects',
  'q': 'bud',
  'records': [{'LicenseType': 'Dog Senior Citizen or Disability Neutered Male',
    'ValidDate': '2020-12-18T14:44:46',
    'Color': 'BLACK',
    'Breed': 'CHOW CHOW MIX',
    'rank': 0.0573088,
    'OwnerZip': '15132',
    'ExpYear': 2021,
    '_id': 2180430,
    'DogName': 'CURLEY BUD'},
   {'LicenseType': 'Dog Senior Citizen or Disability Spayed Female',
    'ValidDate': '2020-12-22T10:13:57',
    'Color'

In [15]:
dogs = response.json()['result']['records']
len(dogs)

19

In [20]:
for dog in dogs:
    dog_info = f"{dog['DogName'].capitalize()} is a {dog['Color'].capitalize()} {dog['Breed'].capitalize()}"
    print(dog_info)

Curley bud is a Black Chow chow mix
Rose bud is a Black/silver Poodle mix
Bud is a Black/tan Yorkshire terrier
Bud is a Black/brown Doberman mix
Bud is a White/black Shih tzu mix
Bud rahaw is a Black Labrador retriever
Bud weiser is a Black/tan Ger shepherd
Bud is a White Mixed
Bud is a Black/brown Ger shepherd mix
Bud is a Red/brown Dachshund
Bud is a Brown Labrador retriever
Bud light is a Brown Chihuahua mix
Bud is a Black with white Bord collie mix
Lil bud is a Brown Chihuahua
Bud is a Brown/tan/brindle Bord collie mix
Rose bud is a Spotted Mixed
Bud is a White/black/brown Beagle mix
Bud is a Black/brown Mixed
Bud is a White/black/brown Boxer mix
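
Notice that `capitalize()` only uppercases the first letter of the whole string, which is why the output says "Curley bud" and "Chow chow mix". As a possible alternative, `str.title()` capitalizes every word. `describe_dog` is a hypothetical helper name, and the record is a trimmed copy of the first result above.

```python
def describe_dog(dog):
    """Format one dog record as a sentence, capitalizing every word."""
    name = dog["DogName"].title()
    color = dog["Color"].title()
    breed = dog["Breed"].title()
    return f"{name} is a {color} {breed}"


dog = {"DogName": "CURLEY BUD", "Color": "BLACK", "Breed": "CHOW CHOW MIX"}

print(dog["DogName"].capitalize())  # only the first letter is uppercased
print(describe_dog(dog))
```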


In [21]:
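def pgh_dog_search(query):
    """Return the dog license records that match the search query."""
    resource_id = "e16d4ab3-842a-4f39-9ad7-ce5921002280"
    endpoint_url = "https://data.wprdc.org/api/3/action/datastore_search"

    # build the query string parameters and make the request
    parameters = {"resource_id": resource_id,
                  "q": query}
    response = requests.get(endpoint_url, params=parameters)

    # the matching records are nested inside the JSON response
    return response.json()['result']['records']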


In [23]:
lilos = pgh_dog_search("Lilo")
len(lilos)

16

In [24]:
lilos

[{'LicenseType': 'Dog Individual Spayed Female',
  'ValidDate': '2020-12-11T15:16:35',
  'Color': 'BLONDE',
  'Breed': 'DOGUE DE BORDEAUX',
  'rank': 0.0573088,
  'OwnerZip': '15237',
  'ExpYear': 2021,
  '_id': 2183590,
  'DogName': 'LILO'},
 {'LicenseType': 'Dog Senior Citizen or Disability Spayed Female',
  'ValidDate': '2020-12-28T09:06:46',
  'Color': 'WHITE/TAN',
  'Breed': 'MIXED',
  'rank': 0.0573088,
  'OwnerZip': '15239',
  'ExpYear': 2021,
  '_id': 2185592,
  'DogName': 'LILO'},
 {'LicenseType': 'Dog Individual Spayed Female',
  'ValidDate': '2021-01-13T08:36:57',
  'Color': 'TAN',
  'Breed': 'PUG',
  'rank': 0.0573088,
  'OwnerZip': '15135',
  'ExpYear': 2021,
  '_id': 2185665,
  'DogName': 'LILO'},
 {'LicenseType': 'Dog Individual Spayed Female',
  'ValidDate': '2020-12-14T12:04:06',
  'Color': 'SPOTTED',
  'Breed': 'MIXED',
  'rank': 0.0573088,
  'OwnerZip': '15238',
  'ExpYear': 2021,
  '_id': 2188883,
  'DogName': 'LILO'},
 {'LicenseType': 'Dog Individual Spayed Female'
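
The search function above assumes every request succeeds. Below is a more defensive sketch, under a few assumptions: `pgh_dog_search_checked` is a hypothetical name, the 10 second timeout is an arbitrary choice, and the `error` key in a failed response is assumed rather than shown in these notes. The `http` argument is anything with a Requests-style `get` method, such as the `requests` module itself or a `requests.Session()`.

```python
RESOURCE_ID = "e16d4ab3-842a-4f39-9ad7-ce5921002280"
ENDPOINT_URL = "https://data.wprdc.org/api/3/action/datastore_search"


def pgh_dog_search_checked(http, query, timeout=10):
    """Search the dog license data, failing loudly when something is wrong.

    `http` is expected to offer a Requests-style `get` method, for
    example the `requests` module or a `requests.Session()`.
    """
    parameters = {"resource_id": RESOURCE_ID, "q": query}
    # the timeout stops the call from hanging forever on a dead server
    response = http.get(ENDPOINT_URL, params=parameters, timeout=timeout)
    # raise an exception for 4xx and 5xx status codes
    response.raise_for_status()
    payload = response.json()
    # the API also reports success or failure inside the JSON itself
    if not payload.get("success"):
        raise RuntimeError(f"API reported failure: {payload.get('error')}")
    return payload["result"]["records"]
```

With Requests installed, a call would look like `pgh_dog_search_checked(requests, "bud")` after `import requests`.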

### Getting the Weather


Documentation for [current weather data](https://openweathermap.org/current) from OpenWeather
```
api.openweathermap.org/data/2.5/weather?q={city name}&appid={API key}
```

To use the OpenWeather API you need to [create an account](https://home.openweathermap.org/users/sign_up).

Once you sign up, they will send you an API key via email. This is your special secret key for accessing the API. It needs to be kept secret, so I store mine in a file called `api-key.txt` instead of typing it into the code. You will need to create a text file like this containing your key if you want to run the code below.
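
One possible way to load the key is sketched below: look for an environment variable first and fall back to the `api-key.txt` file. The function name `load_api_key` and the variable name `OPENWEATHER_API_KEY` are assumptions made for this sketch; OpenWeather does not require any particular name.

```python
import os
from pathlib import Path


def load_api_key(env_var="OPENWEATHER_API_KEY", path="api-key.txt"):
    """Return the API key from an environment variable or a text file."""
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    # fall back to the text file; strip() removes the trailing newline
    return Path(path).read_text().strip()
```

Either way, the key stays out of the code itself, so the code can be shared without sharing the key.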


In [34]:
city = "Pittsburgh"
ow_endpoint = "https://api.openweathermap.org/data/2.5/weather"

with open("api-key.txt","r") as fh:
    api_key = fh.read().strip()
# do not display api_key here: notebook output is saved along with the code

In [30]:
parameters = {
    "q":city,
    "appid":api_key
}

response = requests.get(ow_endpoint, params=parameters)
response.status_code

200

In [31]:
response.json()

{'coord': {'lon': -79.9959, 'lat': 40.4406},
 'weather': [{'id': 800,
   'main': 'Clear',
   'description': 'clear sky',
   'icon': '01d'}],
 'base': 'stations',
 'main': {'temp': 270.56,
  'feels_like': 265.16,
  'temp_min': 269.82,
  'temp_max': 271.48,
  'pressure': 1030,
  'humidity': 46},
 'visibility': 10000,
 'wind': {'speed': 3.09, 'deg': 0},
 'clouds': {'all': 1},
 'dt': 1615818005,
 'sys': {'type': 1,
  'id': 3247,
  'country': 'US',
  'sunrise': 1615807873,
  'sunset': 1615850791},
 'timezone': -14400,
 'id': 5206379,
 'name': 'Pittsburgh',
 'cod': 200}
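
A `temp` of 270.56 looks odd until you know that OpenWeather reports temperatures in Kelvin by default. As a quick sanity check, here is a small sketch of the conversions; the function names are hypothetical.

```python
def kelvin_to_celsius(kelvin):
    """Convert a temperature from Kelvin to degrees Celsius."""
    return kelvin - 273.15


def kelvin_to_fahrenheit(kelvin):
    """Convert a temperature from Kelvin to degrees Fahrenheit."""
    return kelvin_to_celsius(kelvin) * 9 / 5 + 32


temp_k = 270.56  # the 'temp' value from the response above
print(f"{temp_k} K = {kelvin_to_celsius(temp_k):.2f} °C "
      f"= {kelvin_to_fahrenheit(temp_k):.2f} °F")
```

So the reading is a little below freezing, about 27 °F.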

In [32]:
parameters = {
    "q": city,
    "appid": api_key,
    "units": "imperial"  # Fahrenheit and mph instead of the default Kelvin and m/s
}

response = requests.get(ow_endpoint, params=parameters)
response.status_code

200

In [33]:
response.json()

{'coord': {'lon': -79.9959, 'lat': 40.4406},
 'weather': [{'id': 800,
   'main': 'Clear',
   'description': 'clear sky',
   'icon': '01d'}],
 'base': 'stations',
 'main': {'temp': 27.66,
  'feels_like': 17.96,
  'temp_min': 26.01,
  'temp_max': 30,
  'pressure': 1030,
  'humidity': 46},
 'visibility': 10000,
 'wind': {'speed': 6.91, 'deg': 0},
 'clouds': {'all': 1},
 'dt': 1615818154,
 'sys': {'type': 1,
  'id': 3247,
  'country': 'US',
  'sunrise': 1615807873,
  'sunset': 1615850791},
 'timezone': -14400,
 'id': 5206379,
 'name': 'Pittsburgh',
 'cod': 200}

That is better: with `units` set to `imperial`, the temperatures are in degrees Fahrenheit instead of Kelvin.
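
The responses above also contain times (`dt`, `sunrise`, `sunset`) as Unix timestamps, meaning seconds since January 1, 1970 UTC, plus a `timezone` field giving the city's offset from UTC in seconds. A minimal sketch for turning these into readable local times, using values copied from the response above (`local_time` is a hypothetical helper name):

```python
from datetime import datetime, timedelta, timezone


def local_time(unix_seconds, utc_offset_seconds):
    """Convert a Unix timestamp to a datetime in the city's local time."""
    tz = timezone(timedelta(seconds=utc_offset_seconds))
    return datetime.fromtimestamp(unix_seconds, tz=tz)


offset = -14400  # 'timezone' from the response: 4 hours behind UTC
observed = local_time(1615818154, offset)  # 'dt'
sunrise = local_time(1615807873, offset)   # 'sys' -> 'sunrise'
sunset = local_time(1615850791, offset)    # 'sys' -> 'sunset'

print("observed:", observed.isoformat())
print("sunrise: ", sunrise.strftime("%H:%M"))
print("sunset:  ", sunset.strftime("%H:%M"))
```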

In [36]:
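def get_temp(city):
    """Return the current temperature in degrees Fahrenheit for a city."""
    ow_endpoint = "https://api.openweathermap.org/data/2.5/weather"

    # read the secret API key from a local file
    with open("api-key.txt", "r") as fh:
        api_key = fh.read().strip()

    parameters = {
        "q": city,
        "appid": api_key,
        "units": "imperial"  # Fahrenheit instead of the default Kelvin
    }

    # make the request and pull the temperature out of the JSON response
    response = requests.get(ow_endpoint, params=parameters)
    return response.json()['main']['temp']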

In [41]:
get_temp("Pittsburgh")

27.66

In [43]:
get_temp("Paris")

50.54

In [42]:
get_temp("Los Angeles")

49.73

In [45]:
get_temp("McMurdo Station")

-1.01
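
`get_temp` works as long as the API recognizes the city. If it does not, the JSON has no `main` section and the lookup fails with a bare `KeyError`. The sketch below shows one way to produce a clearer error. `temp_from_payload` is a hypothetical helper, and the shape of `error_payload` is an assumption about what the API returns for an unknown city, not output captured in these notes.

```python
def temp_from_payload(payload):
    """Pull the temperature out of a weather response dictionary."""
    try:
        return payload["main"]["temp"]
    except KeyError:
        # fall back to a generic message if the API did not supply one
        message = payload.get("message", "no 'main' section in response")
        raise LookupError(f"no temperature available: {message}") from None


# trimmed copy of the successful Pittsburgh response shown earlier
good_payload = {"main": {"temp": 27.66}, "name": "Pittsburgh", "cod": 200}
print(temp_from_payload(good_payload))

# assumed shape of an error response for a city the API does not know
error_payload = {"cod": "404", "message": "city not found"}
try:
    temp_from_payload(error_payload)
except LookupError as err:
    print("Lookup failed:", err)
```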