# Timeouts and Retries

When issuing HTTP requests and interacting with **external services**, we should have **timeout** and **retry** mechanisms in place. External services can be **slow or unreliable**, which can cause scripts to **hang** or **fail unexpectedly**.

Timeouts and retries help keep automation scripts **responsive** and **resilient** to flaky external behavior.

## Timeouts in `requests`

You can pass **`timeout`** in two ways:

| Form | Meaning |
|------|--------|
| **Single value** (e.g. `timeout=5`) | Same value used for both **connect** and **read** timeouts. |
| **Tuple** (e.g. `timeout=(2, 10)`) | **`(connect_timeout, read_timeout)`** – control each phase separately. |

- **Connect timeout** – raised if the **connection** to the server cannot be established within the given time (e.g. server unreachable, network slow).
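- **Read timeout** – raised if the server **sends no data** for the given time once the connection is established (e.g. server is slow to start responding). It limits the wait for each piece of data, **not the total duration** of the request, so a slow but steady download does not trigger it.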

So the problem can be either "couldn't connect" (connect timeout) or "connected, but the server stayed silent for too long" (read timeout).
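As a minimal sketch of how the two cases can be told apart in code, the helper below passes a `(connect, read)` tuple and reports which phase timed out. The name `fetch_with_timeouts`, its default values, and the `get` parameter (there only so the HTTP call can be swapped out when testing) are illustrative assumptions, not part of `requests`.

```python
import requests


def fetch_with_timeouts(url, connect_timeout=2, read_timeout=10, get=requests.get):
    """Return (outcome, status_code) for a GET with separate connect/read timeouts.

    Hypothetical helper for illustration; `get` defaults to requests.get.
    """
    try:
        response = get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        # No connection could be established within connect_timeout seconds.
        return "connect_timeout", None
    except requests.exceptions.ReadTimeout:
        # Connected, but the server sent no data for read_timeout seconds.
        return "read_timeout", None
    return "ok", response.status_code
```

Assuming httpbin is reachable, calling `fetch_with_timeouts("https://httpbin.org/delay/5", read_timeout=2)` would be expected to report `"read_timeout"`.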

## Simulating a read timeout with httpbin

**[httpbin.org](https://httpbin.org)** has a **`/delay/<seconds>`** endpoint that waits for the given number of seconds before responding. We'll request **`/delay/5`** (5 second delay) with a **2 second** timeout. The connection is established quickly, but the **response** takes longer than 2 seconds, so we get a **read timeout**.

```python
import requests
import time

HTTPBIN_ENDPOINT = "https://httpbin.org"
delay_url = f"{HTTPBIN_ENDPOINT}/delay/5"

start = time.perf_counter()
try:
    response = requests.get(delay_url, timeout=2)
    elapsed = time.perf_counter() - start
    print(f"Completed in {elapsed:.2f} seconds")
    print(f"Status: {response.status_code}")
except (requests.exceptions.ConnectTimeout, requests.exceptions.ReadTimeout) as timeout_error:
    elapsed = time.perf_counter() - start
    print(f"Timeout after {elapsed:.2f} seconds")
    print(timeout_error)
```

```text
Timeout after 2.73 seconds
HTTPSConnectionPool(host='httpbin.org', port=443): Read timed out. (read timeout=2)
```

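We get a **ReadTimeout**. The issue is with the **operation** (waiting for the response), not with establishing the connection. The connection was made, but the server waits 5 seconds before replying, and our **read timeout** is 2 seconds, so the request times out. The elapsed time is somewhat above 2 seconds because it also includes the time spent establishing the connection before the read timeout starts counting.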

**Try it yourself:** change `timeout=2` to `timeout=6` (or more) and run again—the request should complete and print the status code.

## Retries

**Transient issues** (network glitches, server overload, slow responses) can cause requests to **fail** or **time out**. A **retry mechanism** helps:

- **Retry** on: **server errors (5xx)** and **network exceptions** (e.g. timeouts). These are often temporary.
- **Do not retry** (break out): on **success**, or on **client errors (4xx)**. Client errors usually mean something is wrong with *our* request (bad URL, invalid payload, etc.); retrying the same request typically won't help. There are exceptions, such as **429 Too Many Requests**, where waiting and trying again is appropriate.

You can use a **fixed delay** between retries for simplicity, or **exponential backoff** (with optional jitter) for a more robust approach. Below we use a fixed delay.
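For comparison with the fixed delay, here is a minimal sketch of how an exponential backoff schedule with optional jitter could be computed. The function name `backoff_delay` and its `base` and `cap` defaults are illustrative assumptions; the "full jitter" variant shown is one common choice among several.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, jitter=True):
    """Seconds to wait before retry number `attempt` (1-based).

    Hypothetical helper: the delay doubles each attempt (base, 2*base, 4*base, ...)
    and is capped at `cap`. With jitter, a random value in [0, delay] is returned
    so that many clients do not retry in lockstep.
    """
    delay = min(cap, base * 2 ** (attempt - 1))
    return random.uniform(0, delay) if jitter else delay


# Without jitter the schedule is deterministic.
print([backoff_delay(n, jitter=False) for n in range(1, 7)])
# → [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

In a retry loop, `time.sleep(backoff_delay(attempt))` would take the place of the fixed `time.sleep(delay)`.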

**Important:** Avoid retrying **non-idempotent** operations. An **idempotent** operation can be repeated multiple times with the **same end result** (e.g. GET, or a well-designed PUT). A **non-idempotent** operation can change state each time (e.g. POST that creates a new record). If you retry a failed POST, you might create duplicates. For such cases, handle the error differently instead of blindly retrying.
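One way to guard against accidental retries of non-idempotent requests is to check the HTTP method first. The sketch below uses the methods that the HTTP specification (RFC 9110) defines as idempotent; the helper name `is_retry_safe` is hypothetical, and a particular API may not honor these semantics, so treat the result as a default rather than a guarantee.

```python
# Methods defined as idempotent by the HTTP specification (RFC 9110).
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"})


def is_retry_safe(method):
    """Hypothetical helper: True if the method is idempotent by specification."""
    return method.upper() in IDEMPOTENT_METHODS


print(is_retry_safe("get"))   # → True
print(is_retry_safe("POST"))  # → False
```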

### Simple retry loop with fixed delay

We use httpbin's **`/status/<code>`** endpoint. To simulate flakiness (about **2/3 server error, 1/3 success**), we pick at random one of three URLs: two return **500**, one returns **200**. We make up to **max_retries** attempts with a **fixed delay** between them. We **break** on success or on client errors (4xx); we **retry** on server errors (5xx) and on timeouts.

```python
import requests
import time
import random

HTTPBIN_ENDPOINT = "https://httpbin.org"
# Simulate flakiness: 2/3 chance 500, 1/3 chance 200 (picked at random per request)
flaky_urls = [
    f"{HTTPBIN_ENDPOINT}/status/500",
    f"{HTTPBIN_ENDPOINT}/status/500",
    f"{HTTPBIN_ENDPOINT}/status/200",
]

max_retries = 3
delay = 2  # seconds between retries

for attempt in range(1, max_retries + 1):
    print(f"Attempt {attempt}/{max_retries}")
    try:
        url = random.choice(flaky_urls)
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        print(f"Succeeded with status {response.status_code}")
        break
    except requests.exceptions.HTTPError as error:
        status_code = error.response.status_code
        if status_code < 500:
            print(f"Failed with client error {status_code}, skipping retry")
            break
        print(f"Failed with server error {status_code}")
        if attempt < max_retries:
            print(f"Waiting {delay} seconds before retry...")
            time.sleep(delay)
    except (requests.exceptions.ConnectTimeout, requests.exceptions.ReadTimeout) as timeout_error:
        print(f"Timeout: {timeout_error}")
        if attempt < max_retries:
            print(f"Waiting {delay} seconds before retry...")
            time.sleep(delay)
else:
    print(f"All {max_retries} attempts failed")
```

The **for-else** clause runs only if the loop **completed all iterations without breaking**—i.e. we never succeeded and never bailed out on a client error. So we print "All 3 attempts failed".
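The for-else behavior can be seen in isolation, without any network calls, in the small sketch below. The function `first_success` and its list of booleans (standing in for the outcome of each attempt) are made up for illustration.

```python
def first_success(results):
    """Return the 1-based index of the first True in `results`, or None if none succeed."""
    for attempt, succeeded in enumerate(results, start=1):
        if succeeded:
            break  # leaving via break skips the else block
    else:
        # Reached only when the loop ran out of items without a break.
        return None
    return attempt


print(first_success([False, True, False]))   # → 2
print(first_success([False, False, False]))  # → None
```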

Run the cell multiple times: sometimes you'll succeed on the first attempt (200), sometimes after one or two retries, and in roughly 30% of runs ((2/3)³ = 8/27) all three attempts will hit 500. A single request, by comparison, fails about 67% of the time, which shows how retries help with transient server errors.