<h3 style="text-align:center;color:cadetblue;">Python <code>requests</code></h3>

1. HTTP Server and Client.

2. HTTP request methods.

3. Get data from web:
    - Getting Started With `requests`
    - The GET Request
    - The Response
    - Status Codes
    - Content
    - Headers
    - Query String Parameters
    - Other HTTP Methods
    - The Message Body

<p style="text-align:center;color:blue;"><b>1. HTTP Server and Client.</b></p>

An **HTTP client** is a program that sends requests to an **HTTP server**, and an HTTP server is software that listens for those requests and sends back responses. HTTP stands for **Hypertext Transfer Protocol**, the application-level protocol that defines how clients and servers format and exchange these messages on the web.

<div style="display:flex;align-items:center;background:white;">
    <img src="images/client-server-model.png" style="width:400px;object-fit:cover;" />
    <img src="images/client-server.png" style="width:400px;object-fit:cover;margin-left:40px;" />
</div>

Here are examples of HTTP clients:

**1. Web Browsers**
- **Google Chrome**, **Mozilla Firefox**, **Safari**, **Microsoft Edge**, etc.: Browsers serve as HTTP clients for accessing web pages.

**2. Command-Line Tools**
- **cURL**: A widely used command-line tool for transferring data using various protocols, including HTTP and HTTPS.
  ```bash
  curl -X GET https://api.example.com/resource
  ```
- **HTTPie**: A user-friendly command-line HTTP client with a modern interface.
  ```bash
  http GET https://api.example.com/resource
  ```


**3. Programming Libraries**
- **Python Requests**: A simple and popular HTTP library.
    ```python
    import requests
    response = requests.get("https://api.example.com/resource")
    print(response.text)
    ```

- **JavaScript Fetch API**: Built into modern browsers for making HTTP requests.
    ```javascript
    fetch("https://api.example.com/resource")
      .then(response => response.json())
      .then(data => console.log(data));
    ```


**4. GUI-Based Clients**
- **Postman**: A graphical tool for testing and developing APIs.
- **Insomnia**: A lightweight, user-friendly REST client.

**5. Specialized Clients**
- **Wget**: A command-line tool for downloading files over HTTP, HTTPS, and FTP.
  ```bash
  wget https://example.com/resource
  ```
- **Paw** (macOS): An advanced API testing tool.

---

Here are examples of HTTP servers, categorized based on their use cases and implementations:

**1. General-Purpose Web Servers**

These servers are used for hosting websites, APIs, and other web services.
- **Apache HTTP Server (Apache)**: A widely used, open-source HTTP server.
- **Nginx**: Known for high performance, scalability, and as a reverse proxy.
- **Microsoft IIS (Internet Information Services)**: A web server for Windows environments.


**2. Application Servers**

These servers are designed to run web applications and APIs.
- **Node.js**: A runtime that allows you to build HTTP servers with JavaScript.
  ```javascript
  const http = require('http');
  const server = http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Hello, World!\n');
  });
  server.listen(3000);
  ```
- **Django (Python)**: Comes with a built-in development HTTP server.
  ```bash
  python manage.py runserver
  ```
- **Flask (Python)**: Lightweight framework with a built-in server for development.
  ```bash
  flask run
  ```
- **Spring Boot (Java)**: Provides an embedded HTTP server using Tomcat, Jetty, or Undertow.

<p style="text-align:center;color:blue;"><b>2. HTTP request methods.</b></p>

An **HTTP request** is a message sent by a client (usually a web browser or an HTTP client like Postman) to a server, asking it to perform an action, such as retrieving a web page, submitting data, or fetching resources.

**Components of an HTTP Request**

An HTTP request consists of several parts:

1. **Request Line**  
   - Specifies the HTTP method, the request target (usually the path part of the URL), and the HTTP version.
     Example:  
     ```
     GET /index.html HTTP/1.1
     ```

2. **Headers**  
   - Key-value pairs providing additional information about the request.  
     Example:  
     ```
     Host: www.example.com
     User-Agent: Mozilla/5.0
     Accept: text/html
     ```

3. **Body**  
   - Optional part that contains data sent to the server (e.g., form data or JSON payload). Used in methods like `POST` or `PUT`.  
     Example:  
     ```
     {
       "username": "johndoe",
       "password": "securepassword123"
     }
     ```
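To make the three parts concrete, here is a minimal sketch that assembles a raw HTTP/1.1 request message by hand. The helper name `build_http_request` and the `/login` path are illustrative assumptions, not part of any library; real clients such as `requests` do this work for you.

```python
import json


def build_http_request(method, path, headers, body=""):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    request_line = f"{method} {path} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # HTTP separates lines with CRLF; an empty line marks the end of the headers.
    return "\r\n".join([request_line, *header_lines, "", body])


payload = json.dumps({"username": "johndoe"})
raw_request = build_http_request(
    "POST",
    "/login",  # hypothetical path, for illustration only
    {
        "Host": "www.example.com",
        "Content-Type": "application/json",
        "Content-Length": str(len(payload.encode("utf-8"))),
    },
    payload,
)
print(raw_request)
```

The empty string in the joined list is what produces the blank line between the headers and the body.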

**HTTP Request Methods**

HTTP defines several methods, each indicating the type of action the client wants the server to perform:

1. **GET**  
   - Requests a resource (e.g., web page or data).  
     Example:  
     ```
     GET /api/users HTTP/1.1
     ```

2. **POST**  
   - Submits data to the server, often resulting in a new resource being created.  
     Example:  
     ```
     POST /api/users HTTP/1.1
     Content-Type: application/json

     { "name": "John", "age": 30 }
     ```

3. **PUT**  
   - Updates an existing resource or creates it if it doesn't exist.  
     Example:  
     ```
     PUT /api/users/1 HTTP/1.1
     Content-Type: application/json

     { "name": "John", "age": 31 }
     ```

4. **DELETE**  
   - Deletes a specified resource.  
     Example:  
     ```
     DELETE /api/users/1 HTTP/1.1
     ```
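In `requests`, each of these methods has a function of the same name (`requests.get`, `requests.post`, `requests.put`, `requests.delete`). The sketch below assumes `requests` is installed and builds the same four requests without sending them, using `requests.Request(...).prepare()`, so the method, URL, and body can be inspected offline. The `api.example.com` host is a placeholder.

```python
import json

import requests

BASE_URL = "https://api.example.com"  # placeholder host, nothing is sent to it

# Describe each request, then prepare it to see what would go over the wire.
planned = [
    requests.Request("GET", f"{BASE_URL}/api/users"),
    requests.Request("POST", f"{BASE_URL}/api/users", json={"name": "John", "age": 30}),
    requests.Request("PUT", f"{BASE_URL}/api/users/1", json={"name": "John", "age": 31}),
    requests.Request("DELETE", f"{BASE_URL}/api/users/1"),
]
prepared = [request.prepare() for request in planned]

for p in prepared:
    # GET and DELETE carry no body here, so p.body is None for them.
    body = json.loads(p.body) if p.body else None
    print(p.method, p.url, body)
```

To actually send one of these, you would call the matching function directly, for example `requests.put(url, json={...})`.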


**How It Works**
1. **Client Initiates the Request:**  
   A web browser or an HTTP client sends the request to the server.

2. **Server Processes the Request:**  
   The server examines the request line, headers, and (if present) body, and takes the appropriate action.

3. **Response:**  
   The server sends back an **HTTP response** containing the requested resource, a status code, and possibly additional headers.
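
The three steps can be watched end to end on a single machine. The following is a rough sketch using only Python's standard library: a tiny server runs in a background thread and a client talks to it over the loopback interface. The handler name `HelloHandler` and the `/api/users` path are made up for the illustration.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a small JSON document."""

    def do_GET(self):
        # 2. The server processes the request (here it only reads the path).
        payload = json.dumps({"path": self.path}).encode("utf-8")
        # 3. The server sends back a status code, headers, and a body.
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# 1. The client initiates the request.
connection = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
connection.request("GET", "/api/users")
response = connection.getresponse()
status = response.status
content_type = response.getheader("Content-Type")
data = json.loads(response.read())
connection.close()

server.shutdown()
server.server_close()
print(status, content_type, data)
```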

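> Note: In API documentation, "URL" and "endpoint" are often used interchangeably. Strictly speaking, an endpoint is one specific URL at which an API exposes a resource, such as `https://api.example.com/api/users`.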

<p style="text-align:center;color:blue;"><b>3. Get data from web</b></p>

**Getting Started With `requests`**

Let’s begin by installing the requests library. To do so, run the following command:
```cmd
pip install requests
```

In [36]:
%pip install requests






In [37]:
import requests

In [38]:
requests.get('https://example.com')

<Response [200]>

In [46]:
res = requests.get('https://example.com')
type(res)

requests.models.Response

In [43]:
res

<Response [200]>

**Status Code**

The first bit of information that you can gather from Response is the **status code**. A status code informs you of the status of the request.

For example, a **200 OK** status means that your request was successful, whereas a **404 NOT FOUND** status means that the resource you were looking for was not found. There are many other possible status codes as well to give you specific insights into what happened with your request.
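
If you want to look up what a numeric code means, Python's standard library already knows the registered status codes. This is a small sketch; the helper name `describe` and the family labels are illustrative choices, not part of `requests`.

```python
from http import HTTPStatus

# The first digit of a status code identifies its family.
FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(status_code):
    """Return a short label such as '404 Not Found (client error)'."""
    status = HTTPStatus(status_code)
    return f"{status.value} {status.phrase} ({FAMILIES[status_code // 100]})"


for code in (200, 301, 404, 500):
    print(describe(code))
```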

In [47]:
res.status_code

200

Sometimes, you might want to use this information to make decisions in your code:

In [48]:
if res.status_code == 200:
    print('Success!')
elif res.status_code == 404:
    print('Not Found.')

Success!



With this logic, if the server returns a 200 status code, your program will print `Success!`. If the result is a 404, your program will print `Not Found.`

`requests` goes one step further in simplifying this process for you. If you use a `Response` instance in a conditional expression, it evaluates to `True` if the status code is less than 400, and `False` otherwise. Keep in mind that this is broader than a check for 200: a 204 No Content or 304 Not Modified response is also truthy.

In [50]:
bool(res)

True

Therefore, you can simplify the last example by rewriting the if statement:

In [52]:
if res:
    print('Success!')
else:
    print('An error has occurred.')

Success!
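
The truthiness rule can be checked without any network access. The sketch below builds bare `Response` objects by hand and sets only their status code; this is a testing trick rather than how responses normally come into existence, and the helper name `fake_response` is hypothetical.

```python
import requests


def fake_response(status_code):
    """Build a bare Response with only a status code (no network involved)."""
    response = requests.Response()
    response.status_code = status_code
    return response


for code in (200, 204, 304, 404, 500):
    print(code, bool(fake_response(code)))
```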


In [54]:
res = requests.get('https://example.com/a')
type(res)

requests.models.Response

In [55]:
res.raise_for_status()

HTTPError: 404 Client Error: Not Found for url: https://example.com/a

Let’s say you don’t want to check the response’s status code in an if statement. Instead, you want to raise an exception if the request was unsuccessful. You can do this using `.raise_for_status()`:

In [56]:
import requests
from requests.exceptions import HTTPError

for url in ['https://api.github.com', 'https://api.github.com/invalid']:
    try:
        response = requests.get(url)

        # If the response was successful, no Exception will be raised
        response.raise_for_status()
    except HTTPError as http_err:
        print(f'HTTP error occurred: {http_err}')
    except Exception as err:
        print(f'Other error occurred: {err}')
    else:
        print('Success!')

Success!
HTTP error occurred: 404 Client Error: Not Found for url: https://api.github.com/invalid


In [57]:
res.ok

False
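
The exception raised by `.raise_for_status()` keeps a reference to the response that caused it, which is handy when you want to branch on the status code inside the `except` block. A minimal offline sketch, assuming a hand-built `Response` that stands in for the 404 reply shown earlier:

```python
import requests
from requests.exceptions import HTTPError

# Hand-built stand-in for a real reply; normally requests.get() creates this.
response = requests.Response()
response.status_code = 404
response.reason = "Not Found"
response.url = "https://example.com/a"

try:
    response.raise_for_status()
except HTTPError as err:
    # err.response is the same Response object that raised the error.
    failed_status = err.response.status_code
    message = str(err)
    print(f"Request failed with status {failed_status}: {message}")
```

`.ok` reports the same condition as a boolean: it is `False` exactly when `.raise_for_status()` would raise.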

**Content**

The response of a GET request often has some valuable information, known as a payload, in the message body. Using the attributes and methods of Response, you can view the payload in a variety of different formats.

To see the response’s content in bytes, you use `.content`:

In [58]:
response = requests.get('https://api.github.com')
response.content

b'{\n  "current_user_url": "https://api.github.com/user",\n  "current_user_authorizations_html_url": "https://github.com/settings/connections/applications{/client_id}",\n  "authorizations_url": "https://api.github.com/authorizations",\n  "code_search_url": "https://api.github.com/search/code?q={query}{&page,per_page,sort,order}",\n  "commit_search_url": "https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}",\n  "emails_url": "https://api.github.com/user/emails",\n  "emojis_url": "https://api.github.com/emojis",\n  "events_url": "https://api.github.com/events",\n  "feeds_url": "https://api.github.com/feeds",\n  "followers_url": "https://api.github.com/user/followers",\n  "following_url": "https://api.github.com/user/following{/target}",\n  "gists_url": "https://api.github.com/gists{/gist_id}",\n  "hub_url": "https://api.github.com/hub",\n  "issue_search_url": "https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}",\n  "issues_url": "https://api.

In [61]:
res = requests.get('https://requests.readthedocs.io/en/latest/')
content = res.content

In [63]:
type(content.decode('utf8'))

str

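While `.content` gives you access to the raw bytes of the response payload, you will often want to convert them into a string using a character encoding such as UTF-8. The `Response` object does that for you when you access `.text`, using the encoding declared in the response headers and falling back to a guess when none is declared: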

In [69]:
text = response.text

In [65]:
response.content.decode('utf8') == response.text

True

In [70]:
import json
info = json.loads(text)

In [72]:
info['current_user_url']

'https://api.github.com/user'

If you take a look at the response, you’ll see that it is actually serialized JSON content. To get a dictionary, you could take the str you retrieved from `.text` and deserialize it using `json.loads()`. However, a simpler way to accomplish this task is to use `.json()`:

In [73]:
response.json()

{'current_user_url': 'https://api.github.com/user',
 'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}',
 'authorizations_url': 'https://api.github.com/authorizations',
 'code_search_url': 'https://api.github.com/search/code?q={query}{&page,per_page,sort,order}',
 'commit_search_url': 'https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}',
 'emails_url': 'https://api.github.com/user/emails',
 'emojis_url': 'https://api.github.com/emojis',
 'events_url': 'https://api.github.com/events',
 'feeds_url': 'https://api.github.com/feeds',
 'followers_url': 'https://api.github.com/user/followers',
 'following_url': 'https://api.github.com/user/following{/target}',
 'gists_url': 'https://api.github.com/gists{/gist_id}',
 'hub_url': 'https://api.github.com/hub',
 'issue_search_url': 'https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}',
 'issues_url': 'https://api.github.com/issues',
 'keys_url': '
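
The three views of a payload (`.content`, `.text`, `.json()`) correspond to three stages of the same conversion. Here is a rough standard-library sketch of those stages, using a shortened, hard-coded payload instead of a live response:

```python
import json

# Stage 1: raw bytes, as in `.content`
raw_bytes = b'{"current_user_url": "https://api.github.com/user"}'

# Stage 2: decoded text, as in `.text`
text = raw_bytes.decode("utf-8")

# Stage 3: parsed Python objects, as in `.json()`
info = json.loads(text)

print(type(raw_bytes).__name__, type(text).__name__, type(info).__name__)
print(info["current_user_url"])

# Decoding with the wrong encoding silently garbles non-ASCII characters.
garbled = "café".encode("utf-8").decode("latin-1")
print(garbled)
```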

**Headers**

The response headers can give you useful information, such as the content type of the response payload and a time limit on how long to cache the response. To view these headers, access `.headers`:

In [74]:
response.headers

{'Date': 'Fri, 10 Jan 2025 15:39:31 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Cache-Control': 'public, max-age=60, s-maxage=60', 'Vary': 'Accept,Accept-Encoding, Accept, X-Requested-With', 'ETag': 'W/"4f825cc84e1c733059d46e76e6df9db557ae5254f9625dfe8e1b09499c449438"', 'X-GitHub-Media-Type': 'github.v3; format=json', 'x-github-api-version-selected': '2022-11-28', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '0', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security

In [76]:
res = requests.get('https://requests.readthedocs.io/en/latest/')
res.headers

{'Date': 'Fri, 10 Jan 2025 15:46:36 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'CF-Ray': '8ffdcd02ec90ecc2-WAW', 'CF-Cache-Status': 'HIT', 'Access-Control-Allow-Origin': '*', 'Age': '3508', 'Cache-Control': 'max-age=1200', 'Content-Encoding': 'gzip', 'ETag': 'W/"715dee4405bf186850fd9935935e0edd"', 'Last-Modified': 'Sat, 02 Nov 2024 20:14:04 GMT', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload', 'Vary': 'Accept-Encoding', 'access-control-allow-methods': 'HEAD, OPTIONS, GET', 'cdn-cache-control': 'public', 'referrer-policy': 'no-referrer-when-downgrade', 'x-amz-id-2': 'tPN+azBYRzK+T/++mkN5MxO9t3XsDCr9Tv/cnzFt7KcPWPn2PnA1jzm46KrOVautxcUi2hFXlle6XhukuXgI/cwA8kjKIRNl', 'x-amz-meta-mtime': '1730578427.924560948', 'x-amz-request-id': 'RCP300JCXSBWC9ET', 'x-amz-server-side-encryption': 'AES256', 'x-backend': 'web-ext-theme-i-0e280d6ae9104d6a1', 'x-content-type-options': 'nosniff', 'x-rtd-domain': 're

In [77]:
type(response.headers)

requests.structures.CaseInsensitiveDict

In [78]:
response.headers['Content-Type']

'application/json; charset=utf-8'

In [79]:
response.headers['content-type']

'application/json; charset=utf-8'
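
The case-insensitive lookup comes from the `CaseInsensitiveDict` class, which you can also try on its own. A short sketch with a single hard-coded header:

```python
from requests.structures import CaseInsensitiveDict

headers = CaseInsensitiveDict({"Content-Type": "application/json; charset=utf-8"})

# Lookups ignore case, but the original spelling of the key is preserved.
print(headers["content-type"])
print(headers.get("CONTENT-TYPE"))
print(list(headers))

# Split the media type from its parameters (a simple split, not a full parser).
media_type, _, parameters = headers["Content-Type"].partition(";")
print(media_type, parameters.strip())
```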

**Query String Parameters**

One common way to customize a GET request is to pass values through query string parameters in the URL. You can write the query string into the URL by hand, or you can pass the values to the `params` argument of `get()` and let `requests` build the query string. For example, you can use GitHub’s Search API to look for the requests library:

In [80]:
res = requests.get('https://api.github.com/search/repositories?q=requests+language:python')
res.headers['content-type']

'application/json; charset=utf-8'

In [82]:
len(res.json()['items'])

30

In [83]:
import requests

# Search GitHub's repositories for requests
response = requests.get(
    'https://api.github.com/search/repositories',
    params={'q': 'requests+language:python'},
)

json_response = response.json()
repository = json_response['items'][0]
print(f'Repository name: {repository["name"]}')
print(f'Repository description: {repository["description"]}')

Repository name: secrules-language-evaluation
Repository description: Set of Python scripts to perform SecRules language evaluation on a given http request.
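
The search above returned an unrelated repository instead of `requests`. A likely explanation (an inference, since the recorded output does not show the final URL) is that values passed through `params` are percent-encoded, so the `+` is sent as a literal plus sign rather than as a space. The sketch below prepares both variants without sending them so the resulting URLs can be compared; the helper name `prepared_url` is hypothetical.

```python
import requests

SEARCH_URL = "https://api.github.com/search/repositories"


def prepared_url(query):
    """Return the final URL requests would use, without sending anything."""
    return requests.Request("GET", SEARCH_URL, params={"q": query}).prepare().url


# A literal "+" in the value is escaped as %2B.
print(prepared_url("requests+language:python"))

# A space in the value is encoded as "+", which matches the hand-written URL.
print(prepared_url("requests language:python"))
```

When using `params`, writing the query with a space (`'requests language:python'`) should therefore produce the same request as the hand-written URL in the earlier cell.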


**The Message Body**

According to the HTTP specification, POST, PUT, and the less common PATCH requests pass their data through the message body rather than through parameters in the query string. Using `requests`, you pass the payload to the corresponding function’s `data` parameter.

In [84]:
requests.post('https://httpbin.org/post', data={'key':'value'})

<Response [200]>

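The same call with `get()` is rejected, because this httpbin endpoint accepts only POST requests and answers anything else with **405 Method Not Allowed**:

In [87]: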
res = requests.get('https://httpbin.org/post', data={'key':'value'})
print(res.text)

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>405 Method Not Allowed</title>
<h1>Method Not Allowed</h1>
<p>The method is not allowed for the requested URL.</p>



You can also send that same data as a list of tuples:

In [89]:
res = requests.post('https://httpbin.org/post', data=[('key', 'value')])

In [90]:
res.json()

{'args': {},
 'data': '',
 'files': {},
 'form': {'key': 'value'},
 'headers': {'Accept': '*/*',
  'Accept-Encoding': 'gzip, deflate',
  'Content-Length': '9',
  'Content-Type': 'application/x-www-form-urlencoded',
  'Host': 'httpbin.org',
  'User-Agent': 'python-requests/2.32.3',
  'X-Amzn-Trace-Id': 'Root=1-67814394-6f8258af4dd5ec4f7f575104'},
 'json': None,
 'origin': '84.54.70.101',
 'url': 'https://httpbin.org/post'}
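
The `Content-Type` and `Content-Length` values in the echo above follow from how the body was encoded. As a sketch, the snippet below prepares the request without sending it and compares `data=` with the related `json=` parameter, which this notebook has not used so far but which `requests` also accepts.

```python
import json

import requests

URL = "https://httpbin.org/post"  # only used to build the request, nothing is sent

form = requests.Request("POST", URL, data={"key": "value"}).prepare()
as_json = requests.Request("POST", URL, json={"key": "value"}).prepare()

# data= produces a form-encoded body.
print(form.headers["Content-Type"], form.headers["Content-Length"], form.body)

# json= serializes the payload as JSON and sets the matching Content-Type.
print(as_json.headers["Content-Type"], json.loads(as_json.body))
```

Which one to use depends on what the server expects: HTML-style form handlers usually want `data=`, while JSON APIs usually want `json=`.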