# Introduction to Web Requests with Python

Based on https://www.pluralsight.com/guides/web-scraping-with-request-python

## Introduction

Websites often expose RESTful API endpoints (URLs, URIs) to share data via HTTP requests.

Each HTTP request uses a method such as GET, POST, PUT, or DELETE to access or manipulate a resource.

Often, websites require a registration process to access RESTful APIs or offer no API at all. 

To keep things simple, we can instead download the data as raw text and format it ourselves, for instance to fetch content from a personal blog or the profile information of a GitHub user without any registration.

This guide explains how to make web requests in Python using the `requests` package and its various features.

### Making a GET Request

To make a REST call, the first step is to import the `requests` module into the current environment.

In [None]:
import requests                                         # Make the requests package available in this program

### JSON

JSON is an open standard file and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays.
An example is:
```json
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    }
  ],
  "children": [],
  "spouse": null
}
```

JSON can be easily transformed into native Python objects and vice versa.
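
As a minimal sketch of that conversion, the standard-library `json` module can parse a JSON string into Python objects and serialize them back. The shortened `text` value below is an illustrative assumption based on the example above, not data from a real service.

```python
import json

# A shortened version of the example document above, held as a plain string
text = '{"firstName": "John", "isAlive": true, "age": 27, "children": [], "spouse": null}'

# JSON text -> Python objects (object -> dict, array -> list, true -> True, null -> None)
person = json.loads(text)
print(person["firstName"])                   # John
print(person["isAlive"], person["spouse"])   # True None

# Python objects -> JSON text
encoded = json.dumps(person)
print(encoded)
```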


### First request with JSON

For the first request, we will use the JSONPlaceholder API, which provides JSON responses for items such as posts, todos, and albums. The `/todos/1` endpoint responds with the details of a single TODO item.




In [None]:
url = 'https://jsonplaceholder.typicode.com/todos/1' 
response = requests.get(url)        # Send the GET request
print(response.status_code)     # Print the HTTP status code
print(response.text)

### Previous results

The status code `200` means the request succeeded, and `response.text` holds the JSON response for the TODO item as a string (`response.content` holds the same data as raw bytes).
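
Rather than comparing the status code by hand, the check can be left to `raise_for_status()`, which raises `requests.exceptions.HTTPError` for 4xx and 5xx codes. The sketch below is one possible way to wrap this; the helper name `fetch_json` and the 5-second default timeout are assumptions made for illustration.

```python
import requests


def fetch_json(url, timeout=5.0):
    """Return the parsed JSON body of `url`, raising on HTTP error codes.

    `fetch_json` is a hypothetical helper, not part of the requests package.
    """
    response = requests.get(url, timeout=timeout)
    # Raises requests.exceptions.HTTPError for 4xx and 5xx status codes
    response.raise_for_status()
    return response.json()
```

Calling `fetch_json('https://jsonplaceholder.typicode.com/todos/1')` should return the TODO item as a dictionary, while a missing resource would raise an exception instead of silently returning an error page.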

### POST Request

In a POST request, the data can be passed as a dictionary of key-value pairs in the second parameter (`data`) of the `post` method. A dictionary passed this way is sent as form-encoded data.

In [None]:
data = {'title':'Python Requests','body':'Requests are awesome','userId':1} 
response = requests.post('https://jsonplaceholder.typicode.com/posts', data) 
print(response.status_code) 
print(response.text) 
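
To see how the payload is encoded, a request can be prepared without being sent. The sketch below assumes the same URL and dictionary as above and compares the `data` parameter (form-encoded) with the `json` parameter (JSON-encoded); nothing goes over the network.

```python
import json

import requests

url = "https://jsonplaceholder.typicode.com/posts"
payload = {"title": "Python Requests", "body": "Requests are awesome", "userId": 1}

# Build the requests without sending them, so the encoding can be inspected
form_request = requests.Request("POST", url, data=payload).prepare()
json_request = requests.Request("POST", url, json=payload).prepare()

print(form_request.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_request.body)                     # title=Python+Requests&body=...
print(json_request.headers["Content-Type"])  # application/json
print(json.loads(json_request.body))         # the original dictionary
```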

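### POST request advantages

POST requests carry their data in the request body, so there is no practical restriction on data length, which makes them more suitable for files and images. GET requests carry their data in the URL. The HTTP specification sets no limit on URL length, but browsers and servers commonly cap it, often at around 2 kilobytes (some servers can handle 64 KB), and characters outside the ASCII range must be percent-encoded.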
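
The following sketch illustrates where GET data ends up. It prepares a request with a query parameter but does not send it; the `userId` filter is used here only as an assumed example parameter.

```python
import requests

# Prepare (but do not send) a GET request with one query parameter
prepared = requests.Request(
    "GET",
    "https://jsonplaceholder.typicode.com/posts",
    params={"userId": 1},
).prepare()

print(prepared.url)   # https://jsonplaceholder.typicode.com/posts?userId=1
print(prepared.body)  # None: the data travels in the URL, not in the body
```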

Just like `post`, `requests` also supports other methods such as `put` and `delete`. Any request can be sent without data, and empty arguments can be passed by keyword to make the call more explicit.

In the example below, `data = None` could be omitted, because `None` is already the default value of that parameter.

In [None]:
response = requests.post('https://jsonplaceholder.typicode.com/posts', data = None, json = {}) 
print(response.json())      # output: {'id': 101} 

### Response Types

The response body can be read as a string, as bytes, as parsed JSON, or as a raw stream:


In [None]:
data = {'title':'Python Requests','body':'Requests are awesome','userId':1} 
response = requests.post('https://jsonplaceholder.typicode.com/posts', data) 

print(response.content)           # Print the response body as bytes
print(response.text)              # Print the response body as a decoded string
jsonRes = response.json()         # Parse the JSON body into a Python dictionary
print(jsonRes['title'] , jsonRes['body'], sep = ' : ')  # output: Python Requests : Requests are awesome 

### Headers

Headers carry information about the client (such as the type of browser), the server, the accepted response type, the IP address, and so on. Request headers such as `User-Agent` and `Content-Type` can be customized through the `headers` argument, and the response headers can be viewed using the `headers` property:

In [None]:
headers = {'user-agent': 'customize header string', 'Content-Type': 'application/json; charset=utf-8'}  
response = requests.get(url, headers=headers)   # modify request headers 

In [None]:
print(response.headers)                         # print response headers 

In [None]:
print(response.headers['Content-Type'])         # output: application/json; charset=utf-8 
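
Header names are case-insensitive, and `response.headers` is a `CaseInsensitiveDict`, so any capitalization works for lookups. The sketch below builds such a dictionary directly with an assumed header value, so it runs without a network connection.

```python
from requests.structures import CaseInsensitiveDict

# The same mapping type that response.headers uses; the value is illustrative
headers = CaseInsensitiveDict({"Content-Type": "application/json; charset=utf-8"})

print(headers["content-type"])      # application/json; charset=utf-8
print(headers.get("CONTENT-TYPE"))  # application/json; charset=utf-8
print("Content-type" in headers)    # True
```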

### Timeout

* Timeout: Allows `requests` to abandon a request if there is no response within the set duration, given in seconds. This avoids waiting indefinitely when the server never responds.

In [None]:
requests.get('https://github.com/', timeout=2.00) # 2 sec timeout, should work without problems if connected to the internet

Now with a shorter timeout of 50 ms.

An error is expected here unless github.com can respond in under 50 ms.

In [None]:
requests.get('https://github.com/', timeout=0.050) 
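
In a program, the timeout error would usually be caught rather than left to stop execution. The sketch below shows one way to do that; the helper name `get_or_none` is an assumption for illustration and not part of `requests`.

```python
import requests


def get_or_none(url, timeout):
    """Return the response, or None if the request times out.

    `get_or_none` is a hypothetical helper, not part of the requests package.
    """
    try:
        return requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Raised for both connect timeouts and read timeouts
        print(f"No response from {url} within {timeout} s")
        return None
```

With this helper, `get_or_none('https://github.com/', timeout=0.050)` returns `None` instead of raising when the server does not answer within 50 ms.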