
# Python's Requests Library


## Introduction

- **Requests** is the de facto standard for making HTTP requests in Python.
- It wraps the lower-level `urllib3` library in a human-friendly API.

## Key Features
- Intuitive HTTP methods: `GET`, `POST`, `PUT`, `DELETE`, `PATCH`, `HEAD`, `OPTIONS`
- Rich response objects with built-in JSON parsing: `response.json()`
- Allows customization with headers, query parameters, and payloads.
- Handles cookies, sessions, and authentication seamlessly.

## Common Use Cases
- RESTful API interactions
- Web scraping
- File downloads/uploads
- Authentication flows (Basic, OAuth, Custom)



## Installation
To install Requests, use the following command:

```bash
pip install requests
```

or

```bash
conda install requests
```

## Reference Documentation

- [Requests Documentation](https://docs.python-requests.org/en/master/)

## Asynchronous Compatibility {.smaller-80}

The `requests` library is synchronous-only: each call blocks until the response arrives, which stalls the event loop in an async application. For async HTTP requests, use alternatives like:

  - `aiohttp`: Full-featured async HTTP client/server framework
  - `httpx`: Modern async/sync HTTP client with Requests-like API
  
:::{.callout-note}
Asynchronous requests do not block the program while it waits for a response, so several requests can be in flight at the same time instead of running one after another.
:::
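
To make the difference concrete, here is a minimal sketch that uses only the standard library. It does not send real HTTP requests: `asyncio.sleep` stands in for network latency, and `fake_request` is a hypothetical helper. With `aiohttp` or `httpx`, the awaited call would be a real request instead.

```python
import asyncio
import time


async def fake_request(name: str, delay: float) -> str:
    """Hypothetical stand-in for an async HTTP call; sleeps instead of using the network."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list:
    # gather() starts all three coroutines and waits for them together,
    # so the total time is close to the slowest call, not the sum.
    return await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.2),
        fake_request("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Three sequential 0.2-second waits would take about 0.6 seconds; run concurrently, they finish in roughly 0.2 seconds.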


## HTTP Requests
- **GET**: Retrieve data from a specified resource.
- **POST**: Send data to create a resource.
- **PUT**: Update a resource with new data.
- **DELETE**: Remove a resource.

```python
import requests

response = requests.get("https://api.github.com")
print(response.status_code)  # Example of GET request
```


## Request {.smaller-80}

An HTTP **request** is a message sent from a client (like a web browser or API client) to a server. Each request contains several key components:

- **Method**: The HTTP method used to make the request (e.g., GET, POST).
- **URL**: The URL of the resource being requested.
- **Headers**: Additional metadata sent with the request (e.g., content type).
- **Body**: Data sent with the request (e.g., form data, JSON payload).
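
These four components can be inspected without sending anything. The sketch below builds a request with `requests.Request` and calls `prepare()` to see what would go on the wire. The URL and payload are placeholders chosen for illustration.

```python
import requests

# Build a request without sending it. The URL and payload are placeholders.
request = requests.Request(
    method="POST",
    url="https://api.example.com/items",
    headers={"Accept": "application/json"},
    json={"name": "widget"},
)
prepared = request.prepare()

print("Method: ", prepared.method)
print("URL:    ", prepared.url)
print("Headers:", dict(prepared.headers))
print("Body:   ", prepared.body)
```

Note that Requests adds a `Content-Type` header on its own because the payload was passed with `json=`.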

## Response 

Every HTTP request returns a **response** object with the following attributes:

- **Status Code**: HTTP status code
- **Headers**: Response headers
- **Body**: Response content (e.g., HTML, JSON, binary data)

## Response Status Codes {.smaller-80}

HTTP status codes indicate the status of the server’s response. Some common status codes are:

- `100 Continue`: Request received, please continue.
- `200 OK`: Request was successful.
- `404 Not Found`: Resource not found.
- `500 Internal Server Error`: Server error.

:::{.callout-note}
Please refer to the [HTTP Status Codes](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) documentation for a complete list of status codes.
:::
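
The first digit of a status code tells you its class. As a small illustration, the sketch below uses the standard library's `http.HTTPStatus` to look up the reason phrase for the codes listed above. The `describe` helper and the class labels are assumptions made for this example.

```python
from http import HTTPStatus

# The first digit of a status code identifies its class.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the standard reason phrase and class for a status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return f"{code} {status.phrase} ({STATUS_CLASSES[code // 100]})"


for code in (100, 200, 404, 500):
    print(describe(code))
```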

## The `GET` Method

The `GET` method indicates that you’re trying to get data from a resource.

```python
import requests

requests.get('https://api.github.com')
```

```text
<Response [200]>
```

### Response Object

The `response` object contains the server’s response to the HTTP request.  If we store the response in a variable, we can use the `response` object to access the data.

```python
import requests

response = requests.get('https://api.github.com')
```

### Response Status Code

By accessing `.status_code`, you can see the status code that the server returned:

```python
if response.status_code == 200:
    print("Success!")
elif response.status_code == 404:
    print("Not Found.")
```

### Truthiness of Response Object

The `response` object has a boolean value, which is `True` if the status code is below 400 (in practice, 200 through 399) and `False` for the 4xx and 5xx error codes.

```python
if response:
    print("Success!")
else:
    print("An error has occurred.")
```

:::{.callout-note}
This truth value test is possible because the `Response` class defines its own `__bool__()` method, which returns the value of `response.ok`.
:::
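
A quick way to see this rule without touching the network is to build bare `Response` objects by hand. This is only a sketch: `make_response` is a hypothetical helper, and real responses come from calls such as `requests.get()`.

```python
import requests


def make_response(status_code: int) -> requests.Response:
    """Build a bare Response offline; real ones come from requests.get() and friends."""
    response = requests.Response()
    response.status_code = status_code
    return response


for code in (200, 204, 302, 404, 500):
    response = make_response(code)
    # bool(response) mirrors response.ok, which is False only for 4xx and 5xx codes.
    print(code, bool(response))
```

A truthy response is therefore not necessarily a `200 OK`: `204 No Content` and redirects such as `302 Found` also count as successful here.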

### Exception Handling {.smaller-70}

The `requests` library treats any status code in the 400 or 500 range as unsuccessful. To simplify error handling, call `raise_for_status()`, which raises an `HTTPError` for such responses. This spares you from writing a chain of `if` statements for individual HTTP status codes.

```python
import requests
from requests.exceptions import HTTPError

URLS = ["https://api.github.com", "https://api.github.com/invalid"]

for url in URLS:
    try:
        response = requests.get(url)
        response.raise_for_status()
    except HTTPError as http_err:
        print(f"HTTP error occurred: {http_err}")
    except Exception as err:
        print(f"Other error occurred: {err}")
    else:
        print("Success!")
```
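
All exceptions that Requests raises explicitly inherit from `requests.exceptions.RequestException`, so you can catch that instead of a bare `Exception` and still tell the cases apart. The sketch below shows one way to do this; the `classify` helper and its labels are assumptions for illustration, and the 404 response is built by hand so that no network access is needed.

```python
import requests
from requests import exceptions as rex


def classify(err: Exception) -> str:
    """Map an exception to a short label, checking the most specific types first."""
    if isinstance(err, rex.HTTPError):
        return "HTTP error status"
    if isinstance(err, rex.Timeout):
        return "timed out"
    if isinstance(err, rex.ConnectionError):
        return "connection problem"
    if isinstance(err, rex.RequestException):
        return "other Requests error"
    return "not raised by Requests"


# Offline demonstration: a hand-built 404 response instead of a real call.
response = requests.Response()
response.status_code = 404

try:
    response.raise_for_status()
except rex.RequestException as err:
    # HTTPError keeps a reference to the response that triggered it.
    print(classify(err), err.response.status_code)
```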

### Response Content {.smaller-70}

The body of the response to a `GET` request contains the retrieved information, commonly referred to as the **payload**. The `Response` object provides several attributes and methods that let you view and work with this payload in different formats.

- **.content**: Raw bytes of the response.
```python
print(response.content)  # Access raw bytes
```

- **.text**: Decoded text of the response.
```python
print(response.text)     # Access decoded text
```

- **.json()**: Convert response to JSON.

```python
print(response.json())   # Access JSON data
```

```text
{'current_user_url': 'https://api.github.com/user', 'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}', 'authorizations_url': 'https://api.github.com/authorizations', 'code_search_url': 'https://api.github.com/search/code?q={query}{&page,per_page,sort,order}', 'commit_search_url': 'https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}', 'emails_url': 'https://api.github.com/user/emails', 'emojis_url': 'https://api.github.com/emojis', 'events_url': 'https://api.github.com/events', 'feeds_url': 'https://api.github.com/feeds', 'followers_url': 'https://api.github.com/user/followers', 'following_url': 'https://api.github.com/user/following{/target}', 'gists_url': 'https://api.github.com/gists{/gist_id}', 'hub_url': 'https://api.github.com/hub', 'issue_search_url': 'https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}', 'issues_url': 'https://api.github.com/issues', 'keys_url': 'https://api.git
```

### Response Content (continued) {.smaller-80}

The `response.json()` method attempts to parse the response content as JSON and, if successful, returns a Python dictionary (or list, if the JSON data is an array). This allows you to interact with the data as you would with any other Python dictionary or list.

```python
import requests

response = requests.get('https://api.github.com')
response_dict = response.json()
response_dict["emojis_url"]
```

```text
'https://api.github.com/emojis'
```

:::{.callout-note}
If the response does not contain valid JSON, calling `response.json()` raises a `JSONDecodeError`.
:::
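
One way to guard against invalid JSON is a small wrapper that falls back to a default value. The sketch below is an offline illustration: `make_response` and `safe_json` are hypothetical helpers, and setting the private `_content` attribute is a demo-only shortcut for giving a hand-built response a body.

```python
import requests


def make_response(body: bytes) -> requests.Response:
    """Build a Response offline. Setting the private _content attribute is a
    demo-only shortcut; real responses get their body from the network."""
    response = requests.Response()
    response.status_code = 200
    response.encoding = "utf-8"
    response._content = body
    return response


def safe_json(response: requests.Response, default=None):
    """Return the parsed JSON body, or `default` if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:  # JSONDecodeError is a subclass of ValueError
        return default


good = make_response(b'{"emojis_url": "https://api.github.com/emojis"}')
bad = make_response(b"<html>Not JSON</html>")

print(type(good.content), type(good.text))  # bytes vs. str
print(safe_json(good)["emojis_url"])
print(safe_json(bad, default={}))
```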


### Response Headers {.hide}

HTTP **response headers** carry metadata about the response, such as the content type, date, and server information. The `.headers` attribute exposes them as a dictionary-like object.

```python
import requests

response = requests.get("https://api.github.com")
print(response.headers["Content-Type"])
```

```text
application/json; charset=utf-8
```


:::{.callout-note}
The headers object is case-insensitive, so you can access headers using any case.
:::
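
The case-insensitive behavior comes from the `CaseInsensitiveDict` class in `requests.structures`. As a brief offline sketch, the example below builds one by hand with the header value shown above instead of reading it from a live response.

```python
from requests.structures import CaseInsensitiveDict

# response.headers is an instance of this class; here we build one by hand.
headers = CaseInsensitiveDict({"Content-Type": "application/json; charset=utf-8"})

print(headers["content-type"])
print(headers.get("CONTENT-TYPE"))
print("Content-type" in headers)
print(list(headers))  # the original casing of the key is preserved
```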

### Request Parameters {.hide .smaller-80}

**Request parameters** allow you to filter, sort, and customize data when making a request. You can include them in the `requests.get()` method by passing a dictionary to the `params` argument.

```python
import requests

response = requests.get(
    "https://api.github.com/search/repositories", 
    params={"q": "language:python", "sort": "stars", "order": "desc"})

dict_response = response.json()
repos = dict_response["items"]
for repo in repos[:5]:
    print(f"Name: {repo['name']}")
    print(f"Stars: {repo['stargazers_count']}")
    print(f"Description: {repo['description']}")
    print("\n")
```

:::{.callout-note}
See the [GitHub API documentation](https://docs.github.com/en/rest/search/search#search-repositories) for more information on query parameters.
:::

### Request Parameters (continued) {.hide .smaller-80}

The `params` argument in the `requests.get()` method accepts:

- A dictionary
- A list of tuples
- A string or bytes

```python
response = requests.get(
    "https://api.github.com/search/repositories", 
    params=[("q", "language:python"), ("sort", "stars"), ("order", "desc")])

response = requests.get(
    "https://api.github.com/search/repositories",
    params=b"q=language:python&sort=stars&order=desc")
```

:::{.callout-note}
The `b` prefix in `b"..."` creates a bytes literal instead of a regular string. The `params` argument accepts either form; Requests appends the value to the URL as the query string.
:::
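
To check that the different forms of `params` describe the same query, you can prepare each request without sending it and compare the resulting URLs. This is a sketch for illustration; it parses the query strings with `urllib.parse` because the exact percent-encoding can differ between the forms.

```python
from urllib.parse import parse_qs, urlsplit

import requests

URL = "https://api.github.com/search/repositories"

variants = {
    "dict": {"q": "language:python", "sort": "stars", "order": "desc"},
    "tuples": [("q", "language:python"), ("sort", "stars"), ("order", "desc")],
    "bytes": b"q=language:python&sort=stars&order=desc",
}

parsed = {}
for label, params in variants.items():
    # prepare() builds the final URL without sending anything.
    prepared = requests.Request("GET", URL, params=params).prepare()
    parsed[label] = parse_qs(urlsplit(prepared.url).query)
    print(f"{label:>6}: {prepared.url}")

# All three forms describe the same query.
print(parsed["dict"] == parsed["tuples"] == parsed["bytes"])
```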

:::{.notes}
- Regular strings (`"..."`) contain Unicode characters.
- Bytes literals (`b"..."`) contain sequences of bytes (values from 0 to 255).
:::

### Request Headers {.hide .smaller-70}

The `requests` library allows you to customize the **request headers** by passing a dictionary to the `headers` argument in the `requests.get()` method.

In this example, the **Accept** header tells the server which content types your application can handle. The value `application/vnd.github.text-match+json` is a GitHub-specific media type that returns JSON with extra text-match metadata (used for match highlighting).

```python
import requests

response = requests.get(
    "https://api.github.com/search/repositories",
    params={"q": "cincinnati hackathon"},
    headers={"Accept": "application/vnd.github.text-match+json"},
)

dict_response = response.json()
first_repository = dict_response["items"][0]
print(first_repository["text_matches"][0]["matches"])
```

:::{.callout-note}
See the [GitHub API documentation](https://docs.github.com/en/rest/reference/search#search-repositories) for more information on the **Accept** header.
:::
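
Custom headers are sent alongside the defaults that Requests adds on its own. The sketch below, which stays offline, sets the same **Accept** header on a `Session` and uses `prepare_request()` to show the merged result. The `X-Demo` header is a made-up name used only to show a per-request header.

```python
import requests

with requests.Session() as session:
    # Headers set on the session are sent with every request made through it.
    session.headers.update({"Accept": "application/vnd.github.text-match+json"})

    request = requests.Request(
        "GET",
        "https://api.github.com/search/repositories",
        params={"q": "cincinnati hackathon"},
        headers={"X-Demo": "1"},  # hypothetical per-request header
    )
    # prepare_request() merges session and per-request headers without sending.
    prepared = session.prepare_request(request)

for name, value in prepared.headers.items():
    print(f"{name}: {value}")
```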

## Other HTTP Methods

We focused on the `GET` method, but the `requests` library supports other HTTP methods as well:

- **POST**: Create a new resource.
- **PUT**: Update an existing resource.
- **DELETE**: Remove a resource.
- **PATCH**: Update a resource partially.
- **HEAD**: Retrieve headers only.
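
Methods that send data, such as `POST`, `PUT`, and `PATCH`, take the payload through the `data` or `json` argument. As a hedged sketch, the example below prepares two requests without sending them to show how each argument shapes the body. The endpoint and payload are placeholders.

```python
import requests

URL = "https://api.example.com/items"  # placeholder endpoint, never contacted
payload = {"name": "widget", "qty": "2"}

# data= sends an HTML-form style body; json= serializes the payload as JSON.
form = requests.Request("POST", URL, data=payload).prepare()
as_json = requests.Request("PATCH", URL, json=payload).prepare()

for prepared in (form, as_json):
    print(prepared.method, prepared.headers["Content-Type"])
    print("  body:", prepared.body)
```

In everyday code, the shortcuts `requests.post()`, `requests.put()`, `requests.patch()`, `requests.delete()`, and `requests.head()` accept the same arguments and send the request immediately.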


## API Authentication {.smaller-80}

API authentication is a way to identify and authorize users to access an API. There are several ways to authenticate with an API, including:

- **Basic Authentication**: Username and password.
- **Token-based Authentication**: Access tokens.
- **API Keys**: Unique keys for an application or user.
- **OAuth**: Authorization framework for third-party applications.

:::{.callout-note}
Whenever you interact with an API, you should always check the API documentation to understand the authentication requirements, and make sure you follow **security** best practices.
:::


## Authentication
- **Basic Authentication**: Pass an `HTTPBasicAuth` object, or a `(username, password)` tuple as shorthand, to the `auth` argument.

```python
import requests
from requests.auth import HTTPBasicAuth

response = requests.get("https://api.github.com/user", auth=HTTPBasicAuth('user', 'pass'))
```

- **Bearer Token Authentication**: Use custom headers.

```python
headers = {"Authorization": "Bearer <token>"}
response = requests.get("https://api.github.com/user", headers=headers)
```
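
It helps to see what Basic Authentication actually puts on the wire. The sketch below prepares a request with the same dummy credentials (`user` / `pass`) without sending it, then decodes the resulting `Authorization` header.

```python
import base64

import requests
from requests.auth import HTTPBasicAuth

# Prepare (but do not send) a request that uses Basic Authentication.
request = requests.Request(
    "GET", "https://api.github.com/user", auth=HTTPBasicAuth("user", "pass")
)
header = request.prepare().headers["Authorization"]
print(header)

# The credentials are only Base64-encoded, so anyone who sees the header can read them.
scheme, encoded = header.split(" ", 1)
print(scheme, base64.b64decode(encoded).decode("utf-8"))
```

Because Base64 is an encoding and not encryption, Basic Authentication should only be used over HTTPS.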


## GitHub API Authentication {.smaller-80}

The GitHub API allows you to authenticate using an **API token**. To authenticate with the GitHub API, you need to pass your token in the `Authorization` header.

First, create a personal access token on GitHub:

1. Go to your GitHub account settings.
2. Click on Developer settings.
3. Click on Personal access tokens.
4. Click on Generate new token.
5. Give your token a name and select the scopes you need.
6. Click on Generate token.

## GitHub API Authentication (continued) {.hide .smaller-80}

After creating your token, you can use it to authenticate with the GitHub API:

```python
import requests

# Replace with your token from GitHub
token = "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"
headers = {
    "Authorization": f"Bearer {token}",
    "Accept": "application/vnd.github+json"
}

# Make authenticated request
response = requests.get(
    "https://api.github.com/user",
    headers=headers
)

if response.status_code == 200:
    user_data = response.json()
    print(f"Authenticated as: {user_data['login']}")
else:
    print(f"Authentication failed: {response.status_code}")
```

## How Does the API Know It's You? {.hide}

**How does it know it's you?**

When you create a Personal Access Token (PAT), GitHub generates and stores a cryptographic hash of your token along with your user ID and permissions. Later, when you make an API request with your token, GitHub hashes the provided token and looks up the result to verify who it belongs to and the granted access rights.
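
The idea of storing only a hash can be sketched as a toy example. This is not GitHub's implementation, whose details are not covered here: the use of SHA-256, the `tok_` prefix, the in-memory dictionary, and the function names are all assumptions made for illustration.

```python
import hashlib
import secrets

# Toy server-side store: maps a token's hash to its owner and scopes.
# The plaintext token is never stored.
token_store = {}


def issue_token(user_id: str, scopes: list) -> str:
    """Create a random token, remember only its SHA-256 hash, and return the token once."""
    token = "tok_" + secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    token_store[digest] = {"user_id": user_id, "scopes": scopes}
    return token


def authenticate(token: str):
    """Hash the presented token and look up who it belongs to (None if unknown)."""
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return token_store.get(digest)


token = issue_token("octocat", ["repo"])
print(authenticate(token))
print(authenticate("tok_wrong"))
```

A design like this means that a leak of the stored table does not reveal usable tokens, which is also why a lost token cannot be shown again and must be regenerated.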

## Is My Approach Secure?

**NO**

You should never hardcode API tokens in your code. Instead, store them outside the codebase, for example in environment variables or in a configuration file that is excluded from version control. This prevents accidental exposure of your tokens when the code is shared or committed.

:::{.callout-note}
Security is often a trade-off between **convenience** and the **level of protection**.
:::

### Using `dotenv` 

One approach is to use a library like `python-dotenv` to load environment variables from a `.env` file:

```python
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from a local .env file into the environment

token = os.getenv("GITHUB_TOKEN")
```
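
Note that `os.getenv` returns `None` when the variable is missing, which would silently produce a header like `Bearer None`. A minimal sketch of a stricter approach, using only the standard library, is shown below; the helper names `get_github_token` and `build_auth_headers` are hypothetical.

```python
import os


def get_github_token(env_var: str = "GITHUB_TOKEN") -> str:
    """Read the token from the environment and fail loudly if it is missing."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return token


def build_auth_headers(token: str) -> dict:
    """Build the headers used for authenticated GitHub API calls."""
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }


try:
    headers = build_auth_headers(get_github_token())
    print("Token loaded; headers are ready.")  # never print the token itself
except RuntimeError as err:
    print(err)
```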

**Is this enough to be secure?**


## SSL Verification 

Requests verifies SSL certificates by default for secure connections. **Do not disable SSL verification** unless you have a good reason to do so, and do not follow advice blindly without knowing the risks.

```python
# Insecure: verify=False turns off certificate checks. Shown only so you can recognize it.
response = requests.get("https://api.github.com", verify=False)
```

:::{.callout-note}
I am calling this out because you will see this advice in many online forums, but it is not recommended to disable SSL verification.
:::


## Performance Optimization {.hide .smaller-70}

The `requests` library provides several options for making HTTP requests faster and more reliable. Which ones to use depends on your use case, particularly on the server you are interacting with.

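- **Timeout**: Limit how long to wait for a response (in seconds). Requests does not apply a timeout by default, so a call without one can hang indefinitely.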

```python
response = requests.get("https://api.github.com", timeout=3)
```

- **Session Object**: Use `Session` for persistent connections.

```python
session = requests.Session()
response = session.get("https://api.github.com")
session.close()
```

- **Retries**: Configure retries using adapters.

```python
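import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
# An integer max_retries applies only to failed DNS lookups, socket connections,
# and connection timeouts, never to requests that already reached the server.
adapter = HTTPAdapter(max_retries=3)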
session.mount("https://", adapter)
response = session.get("https://api.github.com")
```
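
These options are often combined. The sketch below shows one possible setup, not a recommended configuration: it uses urllib3's `Retry` class for backoff and retries on selected status codes. The `build_session` helper, the retry count, the backoff factor, and the list of status codes are all assumptions to adjust for your server.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(total_retries: int = 3, backoff: float = 0.5) -> requests.Session:
    """Return a Session that retries transient failures with exponential backoff."""
    retry = Retry(
        total=total_retries,
        backoff_factor=backoff,                      # wait longer after each failed attempt
        status_forcelist=[429, 500, 502, 503, 504],  # also retry these response codes
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = build_session()
adapter = session.get_adapter("https://api.github.com")
print(adapter.max_retries.total)
session.close()

# Typical use (not executed here because it needs network access):
# with build_session() as session:
#     response = session.get("https://api.github.com", timeout=(3.05, 10))
```

The `timeout` tuple sets separate limits for connecting and for reading the response.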



## The End