# Introduction to Requests: Types, Responses, Use Cases


It is often useful for our code to communicate with other computers and services.

To do this, we use requests!

An HTTP request lets one piece of code get information from, or send information to, another service or computer.


## HTTP

Early on in the semester, we covered some of the basics of HTTP.

It is a request/response protocol: a client sends a request to a server and gets a response back. HTTP messages are usually carried over a TCP connection, which is set up with a handshake before any data is exchanged.

One example here is a web browser. Every time you visit a website, your browser sends a request to that site's server and receives an HTML page as a response, which it then renders.
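
To make the request/response cycle concrete, here is a minimal sketch that uses only Python's standard library (not the requests library covered below). It starts a throwaway server on the local machine, sends it one GET request, and returns the three parts of the response. The handler name `HelloHandler` and the function `fetch_once` are made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>Hello!</body></html>"
        self.send_response(200)                       # status code
        self.send_header("Content-Type", "text/html")  # headers
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                        # body

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once():
    """Start a local server, send one GET request, return (status, content type, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


print(fetch_once())
```

A browser does the same thing on a larger scale: it sends a GET request, reads the status code and headers, and renders the HTML body.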

## Python Requests Library


Today, we'll be covering the Python requests library, which can be installed by running 'pip install requests' on the command line and then imported with the following statement in your code:

In [2]:
import requests

The requests library allows us to send HTTP requests to other servers or services. Each call returns a Python object that contains a status code, headers, and a body.

## Request Methods

- GET:
    - Used to grab data from a website or service
    - This is what browsers use to access a website
    - You can also pass key-value pairs as query parameters, for search terms, authentication, and other arguments (Google searches use a GET request)
    - The following is an example of a GET request to google.com (we get a response with status code 200, meaning that the server processed our request successfully)
    - If you were to print r.text here, you would get a large string of HTML

In [4]:
r = requests.get('https://www.google.com')
r

<Response [200]>
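
The key-value pairs mentioned above go in the `params` argument. As a minimal sketch, the snippet below builds a GET request without sending it, so that we can look at the URL requests would use. The `/search` path and the `q` parameter are assumptions chosen for illustration.

```python
import requests

# Build the request without sending it, so this runs without a network connection.
request = requests.Request(
    'GET',
    'https://www.google.com/search',   # assumed example endpoint
    params={'q': 'python requests'},   # assumed query parameter
)
prepared = request.prepare()

# The dictionary has been encoded into the query string of the URL.
print(prepared.method)
print(prepared.url)
```

Calling `requests.get(url, params=...)` performs the same encoding and then sends the request.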

- POST
    - Used to send a collection of data to a site or service
    - We pass a dictionary to the function call (through the data or json argument) to send this data with our request
    - POST requests are often used by websites to send form data that the user inputs to the server to be processed.
    - The following is an example of a POST request to a demo URL passing in a dictionary

In [14]:
url = 'https://www.w3schools.com/python/demopage.php'
myobj = {'somekey': 'somevalue'}

x = requests.post(url, json = myobj)
x

<Response [200]>
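
requests can encode the dictionary in two different ways, depending on which argument receives it. The sketch below prepares the same POST request both ways, without sending anything, to show the difference in the body and the Content-Type header.

```python
import json
import requests

url = 'https://www.w3schools.com/python/demopage.php'
myobj = {'somekey': 'somevalue'}

# json= serializes the dictionary as a JSON document.
as_json = requests.Request('POST', url, json=myobj).prepare()
# data= form-encodes the dictionary, like an HTML form submission.
as_form = requests.Request('POST', url, data=myobj).prepare()

print(as_json.headers['Content-Type'], json.loads(as_json.body))
print(as_form.headers['Content-Type'], as_form.body)
```

Which one to use depends on what the server expects: APIs usually want JSON, while pages that process HTML forms usually want form-encoded data.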

- Other request methods with helper functions in the requests module (less common methods):
    - DELETE:
        - Asks the server to delete the target resource
    - PATCH:
        - Sends a set of partial modifications to apply to a resource
    - HEAD:
        - Requests only the headers that a GET request would return, without the body
    - PUT:
        - Replaces the target resource with the request payload
    - OPTIONS:
        - Asks which communication options (such as allowed methods) are available for a resource

- Request methods without a dedicated helper function in the requests module (requests.request accepts the method name as a string)
    - CONNECT:
        - Establishes a two-way tunnel between the client and the server, typically through a proxy
    - TRACE:
        - Performs a loop-back test that echoes the request along the path to the server
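
A quick way to see which methods have their own helper function is to ask the module itself. This is only a sketch: it checks for a callable with the lowercase method name, and the example.com URL is a placeholder.

```python
import requests

methods = ['get', 'post', 'put', 'patch', 'delete', 'head', 'options', 'connect', 'trace']

# A method has a helper if requests exposes a callable of the same name.
has_helper = {name: callable(getattr(requests, name, None)) for name in methods}
print(has_helper)

# Any method name can still be used through the generic Request class
# (or requests.request). Here we only prepare the request; nothing is sent.
prepared = requests.Request('TRACE', 'https://example.com').prepare()
print(prepared.method)
```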

## Response Codes


- 100 - 199
    - These status codes are informational. They usually mean that the server has received the request and is still processing it
- 200 - 299
    - A 200 status code means that the request was successful (the server understood the request and returned the expected response)
    - The other codes in this range also indicate success, but they carry extra information about the response (201, for example, means that a resource was created)
- 300 - 399
    - These are redirection codes. They indicate that the requested resource is available at a different URL and the client should look there
- 400 - 499
    - These indicate a problem with the request itself, the most common of which are:
        - 400: Bad Request, meaning the request format is not correct
        - 401: Unauthorized, meaning the client has not provided valid credentials
        - 404: Not Found, meaning the requested resource cannot be found
- 500 - 599
    - 500 indicates an internal server error
    - The rest of these also mean that something went wrong on the server

For an exhaustive list of status codes, including the ones not listed here, see https://developer.mozilla.org/en-US/docs/Web/HTTP/Status
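
The ranges above can be turned into a small lookup. The sketch below uses `http.HTTPStatus` from Python's standard library (not part of requests) for the official reason phrases; the function name `describe_status` is made up for this example.

```python
from http import HTTPStatus


def describe_status(code):
    """Return (category, reason phrase) for an HTTP status code."""
    categories = {
        1: 'informational',
        2: 'success',
        3: 'redirection',
        4: 'client error',
        5: 'server error',
    }
    category = categories.get(code // 100, 'unknown')
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:  # code is not a registered status
        phrase = 'unrecognized status code'
    return category, phrase


for code in (200, 301, 404, 500):
    print(code, describe_status(code))
```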

You can see the status code of a response in Python by accessing the status_code attribute of the response as follows:

In [15]:
url = 'https://www.w3schools.com/python/demopage.php'
myobj = {'somekey': 'somevalue'}

x = requests.post(url, json = myobj)
x.status_code

200
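
Rather than comparing status_code by hand, you can let requests raise an exception for any 4xx or 5xx code with `raise_for_status()`. The sketch below builds two response objects by hand, purely so that it runs without a network connection; in real code the response would come from `requests.get` or `requests.post`. The helper name `check` is made up.

```python
import requests


def check(response):
    """Return the response if it succeeded; report the error and return None otherwise."""
    try:
        response.raise_for_status()  # raises HTTPError for 4xx and 5xx codes
    except requests.exceptions.HTTPError as err:
        print('Request failed:', err)
        return None
    return response


# Hand-built responses, used here only so the sketch runs offline.
ok = requests.Response()
ok.status_code = 200
missing = requests.Response()
missing.status_code = 404

print(check(ok) is ok)
print(check(missing))
```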

## When might we use requests?

There are various reasons that we might want to send requests in our python code.

Programmatically accessing resources on the internet can be very valuable!

### Web scraping

Suppose that there is a website, or a list of websites, with one or more datasets that we are interested in downloading.

We could manually visit each site and copy the values into CSV files, but the requests library gives us a much quicker solution, and one that we can run periodically to update our dataset automatically.

In the following code, we grab the data file from https://dasl.datadescription.com/download/data/3176 and save it to outfile.csv.

In [27]:
r = requests.get('https://dasl.datadescription.com/download/data/3176')


with open('outfile.csv', 'w', encoding="utf-8") as f:
    f.write(r.text)
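
If the download is going to run unattended, it helps to fail loudly when something goes wrong. Here is a slightly more defensive sketch of the same idea. The function name `download` and its `get` parameter are hypothetical; `get` exists only so that the function can be tried out with a stand-in instead of a real network call.

```python
import requests


def download(url, path, get=requests.get):
    """Save the body of `url` to `path` and return the number of bytes written."""
    response = get(url, timeout=10)   # give up instead of waiting forever
    response.raise_for_status()       # stop here on a 4xx or 5xx response
    with open(path, 'wb') as f:       # write the raw bytes, whatever the encoding
        return f.write(response.content)
```

With the defaults, `download('https://dasl.datadescription.com/download/data/3176', 'outfile.csv')` would fetch the same file as the cell above.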

In some cases, the data we want is not available as a download, so we fetch the page's HTML and use a tool like BeautifulSoup4 to parse it and isolate the data that we are interested in. BeautifulSoup relies on an underlying parser, such as lxml or Python's built-in html.parser, and gives us simple methods for navigating the HTML and grabbing specific pieces of data.

Suppose that we want to pull the table from https://www.worldometers.info/coronavirus/ to get the coronavirus cases per country and save it in a CSV file locally. We can use the BeautifulSoup library alongside the requests library to isolate the table, then pass it into a pandas dataframe and save it!

In [43]:
from bs4 import BeautifulSoup
import pandas as pd


# Send the request to get the raw HTML data
url = 'https://www.worldometers.info/coronavirus/'
res = requests.get(url)

# Parse the HTML with the lxml parser into a navigable Python object
soup = BeautifulSoup(res.text, 'lxml')


By inspecting the table on the page, we can find the id of the table that we are interested in is 'main_table_countries_today'. Using this id, we can use the BeautifulSoup find method to isolate the content of the table.

In [44]:
# isolate the table data
table = soup.find('table', id='main_table_countries_today')
table

<table class="table table-bordered table-hover main_table_countries" id="main_table_countries_today" style="width:100%;margin-top: 0px !important;display:none;">
<thead>
<tr>
<th width="1%">#</th>
<th width="100">Country,<br/>Other</th>
<th width="20">Total<br/>Cases</th>
<th width="30">New<br/>Cases</th>
<th width="30">Total<br/>Deaths</th>
<th width="30">New<br/>Deaths</th>
<th width="30">Total<br/>Recovered</th>
<th width="30">New<br/>Recovered</th>
<th width="30">Active<br/>Cases</th>
<th width="30">Serious,<br/>Critical</th>
<th width="30">Tot Cases/<br/>1M pop</th>
<th width="30">Deaths/<br/>1M pop</th>
<th width="30">Total<br/>Tests</th>
<th width="30">Tests/<br/>
<nobr>1M pop</nobr>
</th>
<th width="30">Population</th>
<th style="display:none" width="30">Continent</th>
<th width="30">1 Case<br/>every X ppl</th><th width="30">1 Death<br/>every X ppl</th><th width="30">1 Test<br/>every X ppl</th>
<th width="30">New Cases/1M pop</th>
<th width="30">New Deaths/1M pop</th>
<th width

Clearly, we have now found the table by itself. Now we can get the column names by finding all of the 'th' elements in the table. The find_all method will get all of the elements in the table with the provided tag.

In [48]:
# Obtain every title of columns with tag <th>
headers = []
for i in table.find_all('th'):
    # Remove newlines; replacing the non-breaking space in 'Tot\xa0Cases' with 'al' makes that header read 'TotalCases'
    title = i.text.replace('\n', '').replace('\xa0', 'al')
    headers.append(title)
headers

['#',
 'Country,Other',
 'TotalCases',
 'NewCases',
 'TotalDeaths',
 'NewDeaths',
 'TotalRecovered',
 'NewRecovered',
 'ActiveCases',
 'Serious,Critical',
 'TotalCases/1M pop',
 'Deaths/1M pop',
 'TotalTests',
 'Tests/1M pop',
 'Population',
 'Continent',
 '1 Caseevery X ppl',
 '1 Deathevery X ppl',
 '1 Testevery X ppl',
 'New Cases/1M pop',
 'New Deaths/1M pop',
 'Active Cases/1M pop']

Now let's create a pandas dataframe to store all of our data once we start getting the rows of the table!

In [54]:
# Create a dataframe
df = pd.DataFrame(columns = headers)

Finally, it's time to get each of the rows of the table and populate our dataframe. First we find all of the "tr" tags (which correspond to the rows); within each row, every cell is a "td". We collect these and add them row by row to the DataFrame.

In [56]:
length = len(df)
# Loop over the table rows (skipping the header row) and append each one to df
for j in table.find_all('tr')[1:]:
    row_data = j.find_all('td')
    row = [i.text.replace('\n', '') for i in row_data]
    df.loc[length] = row
    length += 1

print(df)

    #   Country,Other   TotalCases  NewCases TotalDeaths NewDeaths  \
0       North America  119,185,579   +47,497   1,564,392      +206   
1                Asia  199,231,459  +250,199   1,498,223      +413   
2              Europe  238,536,220   +91,733   1,961,163      +217   
3       South America   65,136,665   +41,562   1,335,850      +141   
4             Oceania   12,998,617    +4,705      22,239        +5   
..  ..            ...          ...       ...         ...       ...   
487            Total:   65,136,665   +41,562   1,335,850      +141   
488            Total:   12,998,617    +4,705      22,239        +5   
489            Total:   12,709,355    +1,218     258,080        +1   
490            Total:          721                    15             
491            Total:  647,798,616  +436,914   6,639,962      +983   

    TotalRecovered NewRecovered ActiveCases Serious,Critical  ... TotalTests  \
0      114,616,529      +34,343   3,004,658            8,365  ...              
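
The same steps can be tried without a network connection on a small inline table. This is a sketch with made-up numbers; the names prefixed with `demo_` are hypothetical, and it uses Python's built-in 'html.parser' so that lxml is not required.

```python
from bs4 import BeautifulSoup

# A tiny stand-in for the real page (made-up numbers, for illustration only).
demo_html = """
<table id="demo">
  <thead><tr><th>Country</th><th>Total<br/>Cases</th></tr></thead>
  <tbody>
    <tr><td>A</td><td>1,200</td></tr>
    <tr><td>B</td><td>35</td></tr>
  </tbody>
</table>
"""

demo_soup = BeautifulSoup(demo_html, 'html.parser')  # built-in parser
demo_table = demo_soup.find('table', id='demo')

# Column names come from the <th> cells, data from the <td> cells of each row.
demo_headers = [th.get_text(strip=True) for th in demo_table.find_all('th')]
demo_rows = [
    [td.get_text(strip=True) for td in tr.find_all('td')]
    for tr in demo_table.find_all('tr')[1:]   # skip the header row
]
print(demo_headers)
print(demo_rows)
```

Collecting the rows in a list first also means the dataframe can be built in one step with `pd.DataFrame(demo_rows, columns=demo_headers)`, instead of growing it one row at a time.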

Now we have all of our data in a pandas dataframe, and we know how to work with that! If you want to read more about BeautifulSoup and its use cases, you can check out the documentation here: https://beautiful-soup-4.readthedocs.io/en/latest/
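
One thing to keep in mind is that every scraped value is still a string, such as '119,185,579' or '+47,497'. Below is a minimal sketch of a converter for the number formats visible in the output above; the function name `to_int` is made up, and other formats (for example 'N/A') would raise a ValueError.

```python
def to_int(text):
    """Convert strings such as '119,185,579' or '+47,497' to int; return None for blanks."""
    cleaned = text.strip().replace(',', '').lstrip('+')
    return int(cleaned) if cleaned else None


print(to_int('119,185,579'))
print(to_int('+47,497'))
print(to_int(''))
```

It could then be applied to a column with something like `df['TotalCases'].map(to_int)`.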

Exercise, if you would like to try it: get the GDP per capita of each country from the table on this Wikipedia page https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)_per_capita
and save it into a pandas dataframe using the method above!

In [57]:
# Solution!

### Requesting data from APIs on the internet


There are many resources on the internet which are designed to be accessed programmatically. The use cases of these services vary from providing small amounts of data to running complex functions, implemented by the service, on your inputs.

Let's explore a very simple use case. The https://catfact.ninja/fact endpoint returns a fact about cats on each call. We can use the Python requests module to get one as follows:

In [37]:
r = requests.get('https://catfact.ninja/fact')

print(r.text)

{'fact': 'Cats can predict earthquakes. We humans are not 100% sure how they do it. There are several different theories.', 'length': 111}


The body is JSON text, which looks a lot like a Python dictionary, so we can use the response.json() method to parse the text into a real dictionary and then access the fact itself.

In [39]:
r = requests.get('https://catfact.ninja/fact')

print(r.json())

print(r.json()['fact'])

{'fact': 'Many Egyptians worshipped the goddess Bast, who had a woman’s body and a cat’s head.', 'length': 84}
Many Egyptians worshipped the goddess Bast, who had a woman’s body and a cat’s head.
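
The difference between the text and the parsed result is easy to see on a sample string. The sketch below uses the standard library's `json.loads`, which is roughly what `r.json()` does with the response body; the sample body is a shortened, hand-written example in the same shape as the API's output.

```python
import json

# Hand-written sample in the same shape as the API response.
body = '{"fact": "Cats can predict earthquakes.", "length": 29}'

parsed = json.loads(body)  # roughly what r.json() does with the response body
print(type(body).__name__, type(parsed).__name__)
print(parsed['fact'])

# .get returns a default instead of raising KeyError when a key is missing.
print(parsed.get('source', 'no such key'))
```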


As an exercise, use a for loop to print 10 cat facts from the API:

In [None]:
### Write code here 




Let's check out another example. This API will predict your age based on the name that you provide. We have to pass a key-value pair along with this GET request so that it knows what name to use.

In [64]:
r = requests.get('https://api.agify.io', params = {'name': 'Alex'})
r.json()

{'age': 45, 'count': 411442, 'name': 'Alex'}

We can also take a look at something a little bit larger. Here, we are going to request the stock trades of members of the US Congress from this free dataset hosted on AWS: https://house-stock-watcher-data.s3-us-west-2.amazonaws.com/data/all_transactions.json

We can then load this data into a pandas dataframe, which is easy to export to a CSV file!

In [65]:
res = requests.get('https://house-stock-watcher-data.s3-us-west-2.amazonaws.com/data/all_transactions.json')

d = res.json()

df = pd.DataFrame(d)

df.head()

Unnamed: 0,disclosure_year,disclosure_date,transaction_date,owner,ticker,asset_description,type,amount,representative,district,ptr_link,cap_gains_over_200_usd
0,2021,10/04/2021,2021-09-27,joint,BP,BP plc,purchase,"$1,001 - $15,000",Hon. Virginia Foxx,NC05,https://disclosures-clerk.house.gov/public_dis...,False
1,2021,10/04/2021,2021-09-13,joint,XOM,Exxon Mobil Corporation,purchase,"$1,001 - $15,000",Hon. Virginia Foxx,NC05,https://disclosures-clerk.house.gov/public_dis...,False
2,2021,10/04/2021,2021-09-10,joint,ILPT,Industrial Logistics Properties Trust - Common...,purchase,"$15,001 - $50,000",Hon. Virginia Foxx,NC05,https://disclosures-clerk.house.gov/public_dis...,False
3,2021,10/04/2021,2021-09-28,joint,PM,Phillip Morris International Inc,purchase,"$15,001 - $50,000",Hon. Virginia Foxx,NC05,https://disclosures-clerk.house.gov/public_dis...,False
4,2021,10/04/2021,2021-09-17,self,BLK,BlackRock Inc,sale_partial,"$1,001 - $15,000",Hon. Alan S. Lowenthal,CA47,https://disclosures-clerk.house.gov/public_dis...,False
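
The export step itself is a single call to `to_csv`. Here is a small sketch with two made-up records, assuming the JSON is a list of dictionaries like the rows shown above; an in-memory buffer stands in for a real filename so the result can be printed directly.

```python
import io
import pandas as pd

# Two made-up records in the assumed list-of-dictionaries shape.
records = [
    {'ticker': 'AAA', 'type': 'purchase', 'amount': '$1,001 - $15,000'},
    {'ticker': 'BBB', 'type': 'sale_partial', 'amount': '$15,001 - $50,000'},
]
demo_df = pd.DataFrame(records)

# index=False leaves out the row numbers, so no "Unnamed: 0" column
# appears when the file is read back in.
buffer = io.StringIO()   # stands in for a filename such as 'trades.csv'
demo_df.to_csv(buffer, index=False)
print(buffer.getvalue())
```

For the real data, `df.to_csv('trades.csv', index=False)` would write the file to disk (the filename is just an example).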


### Attacking a website

I'm not saying that you should do this, but using some multithreading and an infinite loop with the requests module, you could try running a Denial of Service attack on a smaller website. The idea is to send so many concurrent requests that the site is too busy processing them to handle any other user traffic.

Most big websites have protections against this kind of thing that make it very difficult to actually take down their site, but the point here is that sending requests with Python can be very powerful.

### Exercises!


1. Use the CoinDesk API at https://api.coindesk.com/v1/bpi/currentprice.json to get the current price of bitcoin!

In [41]:
### Code for exercise 1

2. Get the HTML from https://www.oberlin.edu/ and store it in a file locally

In [None]:
### Code for exercise 2

3. Use the API at https://dog.ceo/api/breeds/image/random to get the URL for an image of a dog, then send a GET request to that URL and save the image locally

In [None]:
### Code for exercise 3