### Demonstration of how to use the Python requests module 
* pulling down a plain vanilla web page (html, javascript, etc.)
  * checking the return code
  * getting the headers
  * getting the web page content (html, javascript, etc.)
* web API calls
  * passing parameters in JSON format
  * receiving responses in JSON format
* downloading a text file or binary file of any type from the internet
* downloading a zip file from the internet and unzipping it

In [1]:
import requests
import os
import zipfile

#### Plain Vanilla Web Page
We will start with a plain vanilla web page using the google.com landing page.  
The get() method corresponds to an HTTP GET request.

In [2]:
r = requests.get("https://google.com")

Check the status code. A value of 200 means OK; errors show up as codes such as 401 (unauthorized) or 404 (not found).

In [3]:
r.status_code

200
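
As a rough sketch of how status codes group into classes, the helper below uses only the standard library. The function name `describe_status` is made up for this illustration and is not part of requests.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short label such as '404 Not Found (client error)'."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Code is not one of the standard ones known to the http module
        phrase = "Unknown"

    if 100 <= code < 200:
        kind = "informational"
    elif 200 <= code < 300:
        kind = "success"
    elif 300 <= code < 400:
        kind = "redirect"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {phrase} ({kind})"


print(describe_status(200))  # 200 OK (success)
print(describe_status(404))  # 404 Not Found (client error)
```

With a live response you would call `describe_status(r.status_code)`. If you would rather have an error raised for any 4xx or 5xx code, requests also provides `r.raise_for_status()`.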

We can easily retrieve the HTTP headers as a dictionary-like Python object.

In [4]:
r.headers

{'P3P': 'CP="This is not a P3P policy! See g.co/p3phelp for more info."', 'Set-Cookie': '1P_JAR=2020-03-11-01; expires=Fri, 10-Apr-2020 01:51:24 GMT; path=/; domain=.google.com; Secure, NID=199=uluGu95nj5q28mFp8shz_PqjlXBdZ8-_G482xW2lLekcFNqsWVU3rI8imHqAstP8iac-d8JFRwbVBaRflHoJW7G0tAIFRCJgvvDV61LEXOXBsRskmPNUyoH6kUA1W9YjRmCvtrbE_JAb2ExUeM579iXgSD0RFBnm69gqQfgDDdg; expires=Thu, 10-Sep-2020 01:51:24 GMT; path=/; domain=.google.com; HttpOnly', 'Transfer-Encoding': 'chunked', 'Content-Type': 'text/html; charset=ISO-8859-1', 'X-Frame-Options': 'SAMEORIGIN', 'X-XSS-Protection': '0', 'Content-Encoding': 'gzip', 'Expires': '-1', 'Server': 'gws', 'Date': 'Wed, 11 Mar 2020 01:51:24 GMT', 'Cache-Control': 'private, max-age=0'}
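
Header lookups on `r.headers` are case-insensitive, so `r.headers['content-type']` and `r.headers['Content-Type']` return the same value. The sketch below shows one way to split that value into a media type and a charset, using only the standard library. The helper name `parse_content_type` is made up for this illustration, and the header value is copied from the output above.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type header value into (media_type, charset)."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns None when no charset parameter is present
    return msg.get_content_type(), msg.get_content_charset()


# Value copied from the Content-Type header shown above
media_type, charset = parse_content_type("text/html; charset=ISO-8859-1")
print(media_type)  # text/html
print(charset)     # iso-8859-1
```

When a charset is present in this header, requests uses it to decode the raw bytes into `r.text`.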

We can easily retrieve the text of the web page (html, javascript, etc.). With google.com the html is really long, so we will check its length and then look at only the first 500 characters.

In [5]:
len(r.text)

11885

In [6]:
# it's long so let's just see the first 500 characters of the html returned
r.text[0:500]

'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content="Search the world\'s information, including webpages, images, videos and more. Google has many special features to help you find exactly what you\'re looking for." name="description"><meta content="noodp" name="robots"><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title>'

#### Web API Call
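We will now make a web API call. We will send a small payload and receive JSON back.
The post() method corresponds to an HTTP POST. The data keyword argument takes a Python dictionary, which requests sends form-encoded (note the "form" key and the application/x-www-form-urlencoded content type in the response below). To send the dictionary as a JSON body instead, pass it with the json keyword argument.
The json() method parses the response text as JSON and returns a Python dictionary.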

In [7]:
r = requests.post('http://httpbin.org/post', data = {'key':'value'})

In [8]:
r.status_code

200

In [9]:
r.text

'{\n  "args": {}, \n  "data": "", \n  "files": {}, \n  "form": {\n    "key": "value"\n  }, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Content-Length": "9", \n    "Content-Type": "application/x-www-form-urlencoded", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.22.0", \n    "X-Amzn-Trace-Id": "Root=1-5e6845a2-7da774faec9c44d84785eac6"\n  }, \n  "json": null, \n  "origin": "35.230.18.223", \n  "url": "http://httpbin.org/post"\n}\n'

In [10]:
r.json()

{'args': {},
 'data': '',
 'files': {},
 'form': {'key': 'value'},
 'headers': {'Accept': '*/*',
  'Accept-Encoding': 'gzip, deflate',
  'Content-Length': '9',
  'Content-Type': 'application/x-www-form-urlencoded',
  'Host': 'httpbin.org',
  'User-Agent': 'python-requests/2.22.0',
  'X-Amzn-Trace-Id': 'Root=1-5e6845a2-7da774faec9c44d84785eac6'},
 'json': None,
 'origin': '35.230.18.223',
 'url': 'http://httpbin.org/post'}
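
To see the difference between the `data` and `json` keyword arguments without sending anything, the sketch below builds both requests and inspects them before they go on the wire. It assumes a reasonably recent requests release. The variable names are made up for this illustration.

```python
import json

import requests

payload = {"key": "value"}
url = "http://httpbin.org/post"

# prepare() builds the request that would be sent, but does not send it
form_request = requests.Request("POST", url, data=payload).prepare()
json_request = requests.Request("POST", url, json=payload).prepare()

# data= produces a form-encoded body
print(form_request.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_request.body)                     # key=value

# json= serializes the dictionary as a JSON document
print(json_request.headers["Content-Type"])  # application/json
print(json.loads(json_request.body))         # {'key': 'value'}
```

The form body `key=value` is 9 characters long, which matches the Content-Length of 9 that httpbin echoed back above. Calling `requests.post(url, json=payload)` would send the JSON variant, and httpbin should then echo the payload under its "json" key rather than "form".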

#### Another Web API Call Example
Here is another example of a web API call, this time to a website that reports the position of the International Space Station and returns it in JSON format. It does not need parameters, so there is no payload to pass and we will use the get() method.

You may want to check out their website for more API calls to try:

https://wheretheiss.at/w/developer

In [11]:
r = requests.get('https://api.wheretheiss.at/v1/satellites/25544')

In [12]:
r.status_code

200

In [13]:
r.text

'{"name":"iss","id":25544,"latitude":-50.965379254754,"longitude":-161.30051755269,"altitude":436.61434025728,"velocity":27538.367196964,"visibility":"daylight","footprint":4591.0419450701,"timestamp":1583891975,"daynum":2458919.583044,"solar_lat":-3.5805427884919,"solar_lon":152.60569267739,"units":"kilometers"}'

In [14]:
r.json()

{'altitude': 436.61434025728,
 'daynum': 2458919.583044,
 'footprint': 4591.0419450701,
 'id': 25544,
 'latitude': -50.965379254754,
 'longitude': -161.30051755269,
 'name': 'iss',
 'solar_lat': -3.5805427884919,
 'solar_lon': 152.60569267739,
 'timestamp': 1583891975,
 'units': 'kilometers',
 'velocity': 27538.367196964,
 'visibility': 'daylight'}
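
Once the response is a dictionary, individual fields can be read by key. The sketch below works offline on a trimmed copy of the `r.text` output above. It assumes the `timestamp` field is Unix epoch seconds, which is consistent with the dates in this notebook but is not confirmed here.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the r.text output shown above
text = (
    '{"name":"iss","id":25544,"latitude":-50.965379254754,'
    '"longitude":-161.30051755269,"timestamp":1583891975,'
    '"units":"kilometers"}'
)

# json.loads gives a dictionary comparable to what r.json() returns
position = json.loads(text)

# Assumption: timestamp is seconds since the Unix epoch, in UTC
observed_at = datetime.fromtimestamp(position["timestamp"], tz=timezone.utc)

print(position["latitude"], position["longitude"])
print(observed_at.isoformat())  # 2020-03-11T01:59:35+00:00
```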

#### Download an Excel File from the internet 
We will now download an Excel file from the internet. The same method works for any file you want to download: a text file or any binary file, such as Word, PDF, images such as jpeg, videos such as mp4, a zip file, a tarball, etc.

We simply call the get() method, and the r.content attribute will hold the response body as raw bytes. We can then open a file in wb mode (write binary), write the contents, and close the file.

In [15]:
r = requests.get('http://kevincrook.com/ucb/data/coffee_chain.xlsx')

In [16]:
r.status_code

200

In [17]:
r.content[0:100]

b'PK\x03\x04\x14\x00\x06\x00\x08\x00\x00\x00!\x00\x9c\xd7\xfc\xa8^\x01\x00\x00<\x04\x00\x00\x13\x00\xcf\x01[Content_Types].xml \xa2\xcb\x01(\xa0\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

In [18]:
f = open("coffee_chain.xlsx", "wb")

In [19]:
f.write(r.content)

423875

In [20]:
f.close()
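
The open, write, and close steps above can also be written as a single `with` block, which closes the file even if the write fails. This is a minimal sketch. The helper name `save_bytes` and the sample bytes are made up for this illustration, and the sample stands in for `r.content`.

```python
import os
import tempfile


def save_bytes(path, data):
    """Write bytes to path and return the number of bytes written."""
    with open(path, "wb") as f:
        return f.write(data)


# Stand-in for r.content so the example runs without a network connection
sample = b"PK\x03\x04 example bytes"
target = os.path.join(tempfile.mkdtemp(), "sample.bin")

written = save_bytes(target, sample)
print(written == len(sample))  # True
```

With a live response the call would look like `save_bytes("coffee_chain.xlsx", r.content)`.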

#### Download a Zip File from the Internet and Unzip it
We will download a zip file from the internet using the previous method of downloading a binary file.

Using the os module, we will make a temporary directory to hold the zip file.  We will also use this same directory to unzip the contents.

We will use the zipfile module to unzip the file.

In [21]:
r = requests.get('http://kevincrook.com/ucb/data/Hospital_Compare_Data_OpenRefine_Breakout.zip')

In [22]:
r.status_code

200

In [23]:
os.getcwd()

'/home/jupyter/w205/python-requests'

In [24]:
if not os.path.exists("temp_zip"):
    os.mkdir("temp_zip")

In [25]:
dir_file = os.path.join("temp_zip", "Hospital_Compare_Data_OpenRefine_Breakout.zip" )
dir_file

'temp_zip/Hospital_Compare_Data_OpenRefine_Breakout.zip'

In [26]:
f = open(dir_file, "wb")

In [27]:
f.write(r.content)

3149304

In [28]:
f.close()

In [29]:
z = zipfile.ZipFile(dir_file, "r")

In [30]:
z.extractall("temp_zip")

In [31]:
z.close()
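
If the zip file itself does not need to be kept on disk, the downloaded bytes can be unzipped straight from memory. The sketch below builds a tiny archive in memory as a stand-in for `r.content`, so it runs offline. The helper name `unzip_bytes` and the file `hello.txt` are made up for this illustration.

```python
import io
import os
import tempfile
import zipfile


def unzip_bytes(data, dest_dir):
    """Extract a zip archive held in memory into dest_dir; return member names."""
    os.makedirs(dest_dir, exist_ok=True)
    # BytesIO makes the bytes look like a file opened in binary mode
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Build a small archive in memory as a stand-in for r.content
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hello.txt", "hello from the archive")

dest = os.path.join(tempfile.mkdtemp(), "temp_zip")
names = unzip_bytes(buffer.getvalue(), dest)
print(names)  # ['hello.txt']
```

With a live response the call would look like `unzip_bytes(r.content, "temp_zip")`. Only extract archives from sources you trust.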

In [32]:
r = requests.get('http://localhost:5000/')

In [33]:
r = requests.get('http://localhost:5000/purchase_a_sword')

In [34]:
r.status_code

200

35.230.18.223
r = requests.get('http://35.230.18.223:5000/')
r = requests.get('http://35.230.18.223:5000/purchase_a_sword')