# Authentication with the requests library

APIs often **require authentication** to control access, enforce rate limits, and support auditing. When you hit an endpoint that requires authentication without valid credentials, you typically get a **401** or **403**.

This lab covers:
- The difference between **401 Unauthorized** and **403 Forbidden**
- Why authentication is needed (public vs protected endpoints)
- **Basic authentication** using the `auth` keyword argument

## 401 vs 403

| Code | Meaning | When to use (best practice) |
|------|--------|------------------------------|
| **401 Unauthorized** | Authentication failed or missing | No credentials sent, or credentials are **wrong** (e.g. wrong password, invalid or expired token). |
| **403 Forbidden** | Authenticated but not allowed | Credentials are **valid**, but the user/token **does not have permission** to access the requested resource or perform the operation. |

Not every API follows this distinction, and some use 401 and 403 interchangeably. Best practice is to return **401** when authentication itself fails and **403** when the authenticated identity lacks permission. Either way, both codes mean you cannot access the resource as requested, whether because of missing or invalid credentials or because of insufficient permissions.
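
The distinction can be captured in a small helper. This is only a sketch: `describe_auth_failure` is a hypothetical name (not part of `requests`), and the hint texts assume the API follows the best practice described above.

```python
from http import HTTPStatus


def describe_auth_failure(status_code: int) -> str:
    """Return a short hint for an auth-related status code (hypothetical helper)."""
    if status_code == HTTPStatus.UNAUTHORIZED:  # 401
        return "Authentication failed or missing: check that credentials were sent and are valid."
    if status_code == HTTPStatus.FORBIDDEN:  # 403
        return "Authenticated but not permitted: check the permissions of the user or token."
    return "Not an authentication or authorization error."


for code in (401, 403, 200):
    print(code, "->", describe_auth_failure(code))
```

Because some APIs blur the two codes, treat the hint as a starting point for debugging rather than a guarantee.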

## Why authentication matters

Authentication lets the API:
- **Identify** who is making the request
- **Apply authorization rules** (what that identity is allowed to do)
- **Protect resources** that should not be available to everyone
- **Apply different rate limits** (e.g. stricter limits for anonymous requests than for authenticated ones)

## Public vs protected endpoint: seeing 401 in action

We'll send a simple GET request to two GitHub endpoints:
- **Public** – `/zen` (no auth required)
- **Protected** – `/user` (requires authentication; returns the authenticated user's profile)

Without credentials, the protected endpoint returns **401** and a message that authentication is required.

In [1]:
import requests

GITHUB_ENDPOINT = "https://api.github.com"

urls = {
    "public": f"{GITHUB_ENDPOINT}/zen",
    "protected": f"{GITHUB_ENDPOINT}/user",
}

for description, url in urls.items():
    response = requests.get(url, timeout=5)
    print(f"{description} ({url})")
    print(f"  status: {response.status_code}")
    print(f"  body: {response.text[:200]}")
    print()

public (https://api.github.com/zen)
  status: 200
  body: Mind your words, they are important.

protected (https://api.github.com/user)
  status: 401
  body: {
  "message": "Requires authentication",
  "documentation_url": "https://docs.github.com/rest",
  "status": "401"
}



The **public** endpoint returns **200** and a Zen quote. The **protected** endpoint returns **401** and a message that authentication is required.

## Basic authentication

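**Basic auth** sends a **username and password** with each request, Base64-encoded in the **Authorization** header. Base64 is an encoding, not encryption, so the credentials are only protected when the request travels over HTTPS. Basic auth is less secure than token-based authentication (e.g. Bearer tokens), but you'll still encounter it.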

In `requests`, use the **`auth`** keyword with a **tuple** `(username, password)`. The library encodes them and sets the header for you.
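
To make the encoding step concrete, the sketch below builds the same kind of header value by hand using only the standard library. `basic_auth_header` is a hypothetical helper written for illustration. It assumes plain ASCII credentials and is not the library's internal code.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build a Basic Authorization header value by hand (illustrative helper)."""
    # Join the credentials with a colon, then Base64-encode the bytes.
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


header_value = basic_auth_header("my_user", "my_password")
print(header_value)

# Decoding shows the credentials are recoverable by anyone who sees the header.
decoded = base64.b64decode(header_value.split(" ", 1)[1]).decode("utf-8")
print(decoded)
```

In normal use you would not build this yourself; passing `auth=(username, password)` is simpler and less error-prone.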

We'll use **[httpbin.org](https://httpbin.org)** to simulate Basic auth. The path `/basic-auth/<user>/<password>` defines which username/password combination is accepted.

In [2]:
import requests
import json

HTTPBIN_ENDPOINT = "https://httpbin.org"
url = f"{HTTPBIN_ENDPOINT}/basic-auth/my_user/my_password"

response = requests.get(url, auth=("my_user", "my_password"), timeout=10)

print(f"status code: {response.status_code}")
print(json.dumps(response.json(), indent=2))

status code: 200
{
  "authenticated": true,
  "user": "my_user"
}


With the **correct** username and password (matching the URL), we get **200** and a JSON body like `{"authenticated": true, "user": "my_user"}`.

## Wrong credentials: handling 401

If we send the **wrong** password (the URL expects `my_password`, but we send something else), httpbin returns **401**. Calling **`raise_for_status()`** turns that error response into a **`requests.exceptions.HTTPError`**, which we catch and inspect.

In [3]:
url = f"{HTTPBIN_ENDPOINT}/basic-auth/my_user/my_password"

try:
    response = requests.get(url, auth=("my_user", "wrong_password"), timeout=10)
    response.raise_for_status()
    print(json.dumps(response.json(), indent=2))
except requests.exceptions.HTTPError as error:
    print(f"HTTP error: {error}")
    print(f"status: {error.response.status_code}")

HTTP error: 401 Client Error: UNAUTHORIZED for url: https://httpbin.org/basic-auth/my_user/my_password
status: 401


We get **401 Unauthorized**. This demonstrates how to use **Basic auth** with the `auth` keyword and how to handle authentication failures with `raise_for_status()` and `HTTPError`.

## Token-based authentication (API keys and Bearer tokens)

With **modern APIs**, you'll often use **API keys** or **tokens** passed in the **Authorization** header instead of Basic auth:

- **GitHub (Personal Access Token)** – use the **`token`** scheme: `Authorization: token <your_token>`. GitHub also accepts **`Bearer`**.
- **OAuth 2.0 / many other APIs** – use the **Bearer** scheme: `Authorization: Bearer <your_token>`.

Some APIs support **both**; check the **API documentation** for the exact header format. Using the wrong format can lead to **401** or **403**.
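
Since the two formats differ only in the scheme word, one way to keep the choice explicit is a small header builder. This is a sketch under assumptions: `build_auth_headers` is a hypothetical helper, and the two allowed schemes are simply the ones named above.

```python
def build_auth_headers(token: str, scheme: str = "Bearer") -> dict[str, str]:
    """Return an Authorization header for the given scheme (hypothetical helper)."""
    if not token:
        raise ValueError("token is empty; load it from the environment first")
    if scheme not in ("Bearer", "token"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    return {"Authorization": f"{scheme} {token}"}


# "example-token" is a placeholder, not a real credential.
print(build_auth_headers("example-token"))
print(build_auth_headers("example-token", scheme="token"))
```

The returned dictionary can be passed to `requests.get()` through the `headers` keyword argument.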

**Important:** Always load tokens from **environment variables** or **secret managers**. Never hard-code secrets in code.

### Loading the token from the environment

We use **`python-dotenv`** to load variables from a **`.env`** file. By default, `load_dotenv()` leaves already-set environment variables untouched; `load_dotenv(override=True)` makes the values from `.env` take precedence, which helps when a stale value is already set in your shell.

Store your GitHub Personal Access Token (PAT) in `.env` as `GITHUB_PAT=...`. If your `.env` is in a different directory (e.g. one level up), pass its location with `load_dotenv(dotenv_path="...")`. If you haven't set this up yet, see the setup lecture in this section to create a PAT and add it to a `.env` file in this folder. You may also need to install the package: `pip install python-dotenv`.

In [4]:
import requests
import os
from dotenv import load_dotenv

load_dotenv(override=True)

token = os.environ.get("GITHUB_PAT", "")

# Sanity check: GitHub fine-grained PATs start with "github_pat_"
print(f"Token loaded (first 10 chars): {token[:10]}...")

Token loaded (first 10 chars): github_pat...
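
If `GITHUB_PAT` is missing, the cell above silently continues with an empty string, and the failure only shows up later as a 401. A fail-fast check is one possible alternative. The sketch below uses only the standard library; `require_env` and the `EXAMPLE_API_TOKEN` variable are hypothetical names chosen for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file or environment")
    return value


# Demonstration with a placeholder variable so the sketch runs without a real token.
os.environ["EXAMPLE_API_TOKEN"] = "placeholder-value"
print(f"Loaded a value of length {len(require_env('EXAMPLE_API_TOKEN'))}")
```

In the notebook, `token = require_env("GITHUB_PAT")` after `load_dotenv(override=True)` would surface a missing token immediately.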


### Authenticated requests with the Authorization header

We pass the token in the **`Authorization`** header. GitHub accepts either **`token <token>`** or **`Bearer <token>`**. We use **`headers=headers`** in `requests.get()` so the protected `/user` endpoint returns the authenticated user's profile.

The **public** `/zen` endpoint returns **plain text**, not JSON, so calling `response.json()` on it raises a **JSON decode error**. We catch that and fall back to printing `response.text` for non-JSON responses.

In [5]:
GITHUB_ENDPOINT = "https://api.github.com"
urls = {
    "public": f"{GITHUB_ENDPOINT}/zen",
    "protected": f"{GITHUB_ENDPOINT}/user",
}

headers = {
    "Authorization": f"token {token}",  # or f"Bearer {token}" — GitHub supports both
}

for description, url in urls.items():
    print(f"Requesting {description} ({url})")
    try:
        response = requests.get(url, headers=headers, timeout=5)
        response.raise_for_status()
        try:
            data = response.json()
            print(f"  Authenticated user: {data.get('login', data)}")
        except requests.exceptions.JSONDecodeError:
            print("  Invalid JSON in response body, defaulting to text:")
            print(f"  {response.text[:200]}")
    except requests.exceptions.HTTPError as error:
        print(f"  HTTP error: {error}")
    print()

Requesting public (https://api.github.com/zen)
  Invalid JSON in response body, defaulting to text:
  Practicality beats purity.

Requesting protected (https://api.github.com/user)
  Authenticated user: LironeFitoussi



With a valid token, the **protected** `/user` request returns **200** and your username (`login`). The **public** `/zen` request returns 200 with plain text, so we hit the JSON decode branch and print the text instead.
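
Catching the decode error works, but another option is to check the response's `Content-Type` before parsing. The sketch below shows that idea with the standard library only. `parse_body` is a hypothetical helper, and the sample header values and the `octocat` login are placeholders rather than captured GitHub output.

```python
import json


def parse_body(content_type: str, text: str):
    """Parse the body as JSON only when the Content-Type says it is JSON."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        return json.loads(text)
    return text


print(parse_body("application/json; charset=utf-8", '{"login": "octocat"}'))
print(parse_body("text/plain; charset=utf-8", "Practicality beats purity."))
```

With `requests`, this could be called as `parse_body(response.headers.get("Content-Type", ""), response.text)`. It trusts the server to label its responses correctly, so the `try`/`except` approach remains the more defensive choice.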

**If you see 401:** ensure `GITHUB_PAT` is set in your `.env` and that the token is valid and has not expired. A valid token that lacks the required scopes or permissions typically produces **403** instead.

## Pitfalls with token-based auth

| Pitfall | Why it matters | What to do |
|--------|----------------|------------|
| **Wrong header format** | Using `Bearer` when the API expects `token` (or vice versa) can cause **401** or **403**. | Follow the **API documentation**. If the API supports both (like GitHub), either is fine. |
| **Hard-coding secrets** | Tokens in source code risk **accidental exposure** (commits, screenshots, logs). | Always use **environment variables** or **secret managers**; use a **`.env`** file for local development and never commit it. |
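
Exposure through logs can be reduced by masking a token before printing it. The helper below is a minimal sketch; `mask_secret` is a hypothetical name, and the choice of four visible characters is an arbitrary assumption.

```python
def mask_secret(secret: str, visible: int = 4) -> str:
    """Return a log-safe version of a secret, keeping only a short prefix."""
    if not secret:
        return "<empty>"
    if len(secret) <= visible:
        # Too short to show a prefix safely, so hide everything.
        return "*" * len(secret)
    return secret[:visible] + "..."


# "example-token-value" is a placeholder, not a real credential.
print(mask_secret("example-token-value"))
```

Keeping the visible prefix short matters because some token formats start their secret part after only a few characters.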