# Web Requests & APIs (30-Minute Micro-Class)

Welcome! In ~30 minutes you'll learn how to:

- Send HTTP requests with `requests`
- Work with query params, headers, and JSON payloads
- Handle timeouts, errors, and simple retries
- Page through results and respect basic rate limits
- Do a mini-project: fetch data from a public API and analyze it

## Agenda (Suggested Pacing)
1. **HTTP & requests refresher** (3 min)
2. **GET, params, JSON** (7 min)
3. **POST, headers, timeouts, errors** (8 min)
4. **Pagination + simple retries** (6 min)
5. **Mini-project (SWAPI)** (5–6 min)
6. **Stretch exercises** (time-permitting)

> All examples use free, public endpoints: `jsonplaceholder.typicode.com`, `httpbin.org`, `reqres.in`, and `swapi.dev`.

## 0) Setup
Colab already includes `requests`, `pandas`, and `matplotlib`. Let's import what we need.

In [None]:
import requests, time, math, os
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter
import pandas as pd
import matplotlib.pyplot as plt

requests.__version__

## 1) HTTP in 3 Minutes
- **Methods**: `GET` (read), `POST` (create), `PUT/PATCH` (update), `DELETE` (remove)
- **URLs** + optional **query parameters** (e.g. `?q=python&limit=10`)
- **Headers**: metadata (e.g. `Authorization`, `User-Agent`, `Accept`)
- **Body**: often JSON (send/receive)
- **Status codes**: `2xx` success, `4xx` client error, `5xx` server error
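
As a quick illustration of the status-code ranges above, here is a minimal sketch that maps a code to its class from its first digit. The helper name `status_class` is made up for this example and is not part of `requests`.

```python
def status_class(code: int) -> str:
    """Map an HTTP status code to its broad class using its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


for code in (200, 201, 404, 429, 503):
    print(code, "->", status_class(code))
```

In practice you rarely need a helper like this, since `resp.raise_for_status()` already treats `4xx` and `5xx` as errors; the point is only that the first digit carries the meaning.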

## 2) Basic GET and JSON
We'll call a free placeholder API and parse JSON.

In [None]:
url = "https://jsonplaceholder.typicode.com/todos/1"
resp = requests.get(url)
print("Status:", resp.status_code)
data = resp.json()
data

### Query Parameters
Use `params={}` to safely build URLs.

In [None]:
url = "https://jsonplaceholder.typicode.com/posts"
params = {"userId": 1}
resp = requests.get(url, params=params)
posts = resp.json()
print(f"Fetched {len(posts)} posts for userId=1")
posts[:2]  # peek at first two
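
To see why building the query string by hand is risky, the sketch below uses the standard library's `urllib.parse.urlencode`, which performs the same kind of escaping that `params=` relies on. The search term is an invented example value, chosen because it contains a space and an `&`.

```python
from urllib.parse import parse_qs, urlencode

params = {"q": "python & pandas", "limit": 10}

# Escaped: the space and the "&" inside the value are encoded.
query = urlencode(params)
print(query)

# Hand-built: the raw "&" is indistinguishable from a parameter separator.
naive = "q=" + params["q"] + "&limit=" + str(params["limit"])
print(naive)

# Parsing both strings back shows the difference.
print(parse_qs(query))  # value of q survives intact
print(parse_qs(naive))  # value of q is cut off at the raw "&"
```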

### Convert JSON to a DataFrame
Pandas makes quick inspection and filtering easy.

In [None]:
df = pd.DataFrame(posts)
df.head()

## 3) POST, Headers, Timeouts, and Errors
We'll use `httpbin.org` to echo back what we send.

In [None]:
url = "https://httpbin.org/post"
payload = {"name": "Solomon", "role": "AI Developer"}
headers = {"User-Agent": "Colab-Demo/1.0", "Accept": "application/json"}
resp = requests.post(url, json=payload, headers=headers, timeout=10)
print("Status:", resp.status_code)
resp.json()["json"]  # server echoes our JSON payload
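
Passing `json=payload` amounts, roughly, to serializing the dict and labeling the body as JSON. The standard-library sketch below shows that idea in isolation; the helper `encode_json_body` is hypothetical, and `requests` does the equivalent work for you.

```python
import json


def encode_json_body(payload):
    """Return (body_bytes, headers) for sending `payload` as a JSON request body."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return body, headers


body, headers = encode_json_body({"name": "Solomon", "role": "AI Developer"})
print(body)
print(headers)
```

This is also why the echo above appears under the `"json"` key: the server recognizes the body as JSON and parses it back into an object.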

### Handling Errors & Timeouts
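Pass `timeout` so a stalled server cannot hang the notebook, and call `raise_for_status()` inside `try/except` so that 4xx/5xx responses surface as exceptions you can handle.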

In [None]:
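def safe_get(url, **kwargs):
    """GET a URL; return the Response on success, or None after printing the error."""
    try:
        r = requests.get(url, timeout=kwargs.pop("timeout", 10), **kwargs)
        r.raise_for_status()  # raises HTTPError for 4xx/5xx responses
        return r
    except requests.exceptions.Timeout:
        print("Request timed out")
    except requests.exceptions.HTTPError as e:
        print("HTTP error:", e)
    except requests.exceptions.RequestException as e:
        # Catch-all for connection problems and other requests failures
        print("Other error:", e)
    return None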

bad = safe_get("https://httpbin.org/status/429")  # 429 Too Many Requests
good = safe_get("https://httpbin.org/get")
good.json()["url"] if good else None
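
A `429` response may carry a `Retry-After` header that says how long to wait before trying again. Here is a minimal sketch of reading it, assuming the header holds a number of seconds; the helper name `retry_after_seconds` and the default of one second are assumptions made for illustration.

```python
def retry_after_seconds(headers, default=1.0):
    """Return the wait suggested by a numeric Retry-After header, else `default`."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        seconds = float(value)
    except ValueError:
        # Retry-After can also be an HTTP date; this sketch does not parse that form.
        return default
    return max(0.0, seconds)


print(retry_after_seconds({"Retry-After": "5"}))
print(retry_after_seconds({}))
```

With a real response you would pass `resp.headers`, which looks up header names case-insensitively; the plain dicts used here need the exact spelling.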

### Simple Retries (with Backoff)
Transient errors (e.g., 429 or 5xx) are worth retrying, with a growing delay (backoff) between attempts so a struggling server is not hammered.

In [None]:
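def session_with_retries(total=3, backoff=0.5, status_forcelist=(429, 500, 502, 503, 504)):
    # raise_on_status=False hands back the last response once retries run out;
    # the default would raise requests.exceptions.RetryError instead.
    retry = Retry(
        total=total,
        backoff_factor=backoff,
        status_forcelist=status_forcelist,
        allowed_methods=["GET", "POST"],  # caution: retrying POST can repeat side effects
        raise_on_status=False,
    )
    s = requests.Session()
    s.headers.update({"User-Agent": "Colab-Demo/1.0"})
    adapter = HTTPAdapter(max_retries=retry)
    s.mount("http://", adapter)
    s.mount("https://", adapter)
    return s

s = session_with_retries()
r = s.get("https://httpbin.org/status/503", timeout=10)  # retried, then the last response is returned
print("Final status:", r.status_code)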
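
For cases where you want the retry loop in your own code, here is a minimal sketch of exponential backoff using only the standard library. The name `call_with_backoff`, its defaults, and the fake `flaky` operation are assumptions for illustration, and the delay schedule is a generic one that is not guaranteed to match what `Retry` computes internally.

```python
import time


def call_with_backoff(func, retries=3, factor=0.5, cap=8.0, sleep=time.sleep):
    """Call func(); after each failure wait factor * 2**attempt seconds (capped)."""
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:  # real code should catch something narrower
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(min(cap, factor * 2 ** attempt))


# Demo with a fake operation that fails twice, then succeeds.
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

waits = []  # record the delays instead of really sleeping
result = call_with_backoff(flaky, sleep=waits.append)
print(result, waits)
```

Injecting `sleep` keeps the demo instant; leave the default in real use. When wrapping `requests` calls, catching `requests.exceptions.RequestException` rather than `Exception` avoids retrying on programming errors.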

## 4) Pagination Example
The `reqres.in` API demonstrates paged responses.

In [None]:
def fetch_all_users_reqres():
    base = "https://reqres.in/api/users"
    page = 1
    users = []
    while True:
        r = requests.get(base, params={"page": page}, timeout=10)
        r.raise_for_status()
        js = r.json()
        users.extend(js.get("data", []))
        if page >= js.get("total_pages", 1):
            break
        page += 1
        time.sleep(0.25)  # be nice
    return users

users = fetch_all_users_reqres()
pd.DataFrame(users)
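
The paging logic can also be separated from the HTTP call, which makes it easy to try out without touching the network. The sketch below assumes responses shaped like the ones above (`data` and `total_pages` fields); `iter_pages`, `fetch_page`, the `max_pages` guard, and the fake pages are all made up for this example.

```python
def iter_pages(fetch_page, start=1, max_pages=100):
    """Yield items from page-numbered responses until total_pages is reached."""
    page = start
    while page <= max_pages:  # guard against a runaway loop
        js = fetch_page(page)
        yield from js.get("data", [])
        if page >= js.get("total_pages", 1):
            break
        page += 1


# Stand-in for an HTTP call: two fake pages with the same field names.
FAKE_PAGES = {
    1: {"page": 1, "total_pages": 2, "data": [{"id": 1}, {"id": 2}]},
    2: {"page": 2, "total_pages": 2, "data": [{"id": 3}]},
}

fake_users = list(iter_pages(lambda page: FAKE_PAGES[page]))
print(fake_users)
```

To use it against a real API, `fetch_page` would wrap a call such as `requests.get(base, params={"page": page}, timeout=10)` and return the parsed JSON.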

## (Note) Authentication Patterns
- **API key** in header: `Authorization: Bearer <TOKEN>` or custom header like `x-api-key`.
- **Query param** (less secure): `?api_key=...`.
- **OAuth2**: obtain access token, then pass as `Bearer` token.

Below is a *pattern* showing how you might read an API key from an environment variable. (The request itself is commented out, so nothing is sent.)

In [None]:
API_KEY = os.getenv("MY_API_KEY", "<put_your_key_here>")
headers = {"Authorization": f"Bearer {API_KEY}"}
# Example (disabled):
# resp = requests.get("https://api.example.com/data", headers=headers, timeout=10)
# resp.json()
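
One possible refinement is to fail early when the key is missing rather than send a placeholder string to the server. This is a sketch only: the helper `auth_headers` and its `env` parameter (there so the example runs without touching your real environment) are assumptions.

```python
import os


def auth_headers(env=os.environ, var="MY_API_KEY"):
    """Build a Bearer header from an environment variable, or raise if it is unset."""
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var} environment variable before calling the API")
    return {"Authorization": f"Bearer {token}"}


# Demo with plain dicts standing in for the environment.
try:
    auth_headers(env={})
except RuntimeError as e:
    print("Missing key:", e)

print(auth_headers(env={"MY_API_KEY": "demo-token"}))
```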

## 5) Mini-Project: Star Wars People (SWAPI)
Goal: fetch **all people** from SWAPI, analyze species counts, and plot.

In [None]:
def fetch_all_swapi(resource="people"):
    url = f"https://swapi.dev/api/{resource}/"
    results = []
    while url:
        r = requests.get(url, timeout=15)
        r.raise_for_status()
        js = r.json()
        results.extend(js.get("results", []))
        url = js.get("next")
        time.sleep(0.2)
    return results

people = fetch_all_swapi("people")
len(people), people[0]["name"] if people else None

In [None]:
# Build a species label for each person (fetch species name if available)
from functools import lru_cache

@lru_cache(maxsize=64)
def species_name(url):
    if not url:
        return "Unknown"
    r = requests.get(url, timeout=10)
    r.raise_for_status()
    return r.json().get("name", "Unknown")

rows = []
for p in people:
    sp_list = p.get("species", [])
    # Some people have an empty `species` list, so label those explicitly
    sp = species_name(sp_list[0]) if sp_list else "Human? (Unknown in API)"
    rows.append({"name": p.get("name"), "species": sp})

swdf = pd.DataFrame(rows)
swdf.head()

In [None]:
counts = swdf["species"].value_counts()  # already sorted by count, descending
plt.figure(figsize=(8,4))
counts.head(8).plot(kind="bar")
plt.title("Top Species (SWAPI People)")
plt.xlabel("Species")
plt.ylabel("Count")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()
counts.head(10)
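
The same tally can be cross-checked without pandas using `collections.Counter`. The rows below are invented placeholder data, not real SWAPI results.

```python
from collections import Counter

# Placeholder rows shaped like the `rows` list built earlier.
sample_rows = [
    {"name": "person-1", "species": "Droid"},
    {"name": "person-2", "species": "Human"},
    {"name": "person-3", "species": "Droid"},
    {"name": "person-4", "species": "Wookiee"},
    {"name": "person-5", "species": "Droid"},
    {"name": "person-6", "species": "Human"},
]

tally = Counter(row["species"] for row in sample_rows)
print(tally.most_common(2))  # the two most frequent species with their counts
```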

## 6) Stretch Exercises (Try these!)
1. **Error handling**: Wrap the SWAPI fetches in `try/except` so that a failed request or `raise_for_status()` error returns a fallback (e.g., the results collected so far) instead of crashing.
2. **Backoff**: Add exponential backoff to the SWAPI pagination loop on non-200 responses.
3. **Headers**: Add a custom `User-Agent` to your requests session.
4. **Params**: Use query params to filter a different API (e.g., `jsonplaceholder` by `userId`).
5. **Join data**: Combine two endpoints (e.g., SWAPI people + homeworld names) and plot homeworld counts.

## Quick Reference
- GET with params: `requests.get(url, params={"q": "python"})`
- POST JSON: `requests.post(url, json={"a":1})`
- Headers: `requests.get(url, headers={"Authorization": "Bearer ..."})`
- Timeout: `requests.get(url, timeout=10)`
- Errors: `resp.raise_for_status()` inside `try/except`
- Retries: `requests.Session()` + `HTTPAdapter(max_retries=Retry(...))`
- Pagination: loop until `next` is `None` or `page` reaches `total_pages`

## Wrap-Up
You learned how to call REST APIs with `requests`, handle common concerns (params, headers, JSON, timeouts, errors, retries), and page through results. Keep these patterns handy and plug in your target APIs.

**Next steps:**
- Add caching (e.g., `requests-cache`) for faster iteration
- Explore auth with a real API key stored in environment variables
- Build a small helper module wrapping these patterns for your projects