# Level 1 - Week 4 Practice (Starter Notebook)

Starter LLM client skeleton: timeouts, retries, caching, and logs.

## References (docs)
- `requests` timeouts: https://requests.readthedocs.io/en/latest/user/quickstart/#timeouts
- Tenacity: https://tenacity.readthedocs.io/
- Python `logging`: https://docs.python.org/3/library/logging.html
- HTTP 429 Too Many Requests (MDN): https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429
- Twelve-Factor App: https://12factor.net/


## Setup

This notebook demonstrates patterns without requiring a real API key.
Replace `fake_provider_call(...)` with a real provider call later.


In [None]:
import hashlib
import json
import logging
import time
from dataclasses import dataclass
from typing import Any, Dict

import requests
from tenacity import retry, stop_after_attempt, wait_exponential


In [None]:
logging.basicConfig(level=logging.INFO, format='%(levelname)s %(message)s')
logger = logging.getLogger('llm_client')

CACHE: Dict[str, Any] = {}


## Stable cache keys

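A cache key should cover every request field that can change the output, such as the model, prompt, and temperature. If a field is left out, two different requests can collide on the same cached response.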


In [None]:
def make_cache_key(payload: dict) -> str:
    raw = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode('utf-8')
    return hashlib.sha256(raw).hexdigest()
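A short check of why `sort_keys=True` matters, as a minimal sketch. `stable_key` is a hypothetical stand-in that mirrors `make_cache_key` so the snippet runs on its own, and the `temperature` field is an assumed example of an output-affecting parameter.

```python
import hashlib
import json


def stable_key(payload: dict) -> str:
    """Hypothetical stand-in that mirrors make_cache_key above."""
    raw = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode('utf-8')
    return hashlib.sha256(raw).hexdigest()


# Same fields, different insertion order: sort_keys=True yields the same key.
key_a = stable_key({'model': 'fake-model', 'prompt': 'hello', 'temperature': 0.0})
key_b = stable_key({'temperature': 0.0, 'prompt': 'hello', 'model': 'fake-model'})

# Changing any field that affects the output yields a different key.
key_c = stable_key({'model': 'fake-model', 'prompt': 'hello', 'temperature': 0.7})

same_key_for_reordered_fields = key_a == key_b
different_key_for_new_temperature = key_a != key_c
```

Without `sort_keys=True`, the JSON text would follow dict insertion order, so logically identical payloads could hash to different keys and miss the cache.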


## Provider call stub

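Simulate provider failures so you can test retries and timeouts. With `force_error` set, the stub raises `requests.Timeout` on every attempt, so retries will not recover from it.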


In [None]:
def fake_provider_call(payload: dict, timeout_s: float) -> dict:
    # timeout_s is unused by the stub; a real call would pass it to the HTTP client.
    if payload.get('force_error'):
        raise requests.Timeout('Simulated timeout')
    time.sleep(0.05)  # simulate a small network delay
    return {
        'text': 'echo: ' + str(payload.get('prompt', '')),
        'model': payload.get('model', 'fake'),
    }
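To exercise a failure that actually clears up after a few tries, one option is a stub that fails a fixed number of times and then succeeds. The sketch below is an assumption, not part of the notebook: `FlakyProvider` and `fail_times` are hypothetical names, and it raises the built-in `TimeoutError` to stay dependency-free where the notebook's stub raises `requests.Timeout`.

```python
class FlakyProvider:
    """Hypothetical stub: fails `fail_times` times, then succeeds."""

    def __init__(self, fail_times: int = 2) -> None:
        self.fail_times = fail_times
        self.calls = 0

    def __call__(self, payload: dict, timeout_s: float) -> dict:
        self.calls += 1
        if self.calls <= self.fail_times:
            # Built-in TimeoutError keeps the sketch dependency-free;
            # the notebook's stub raises requests.Timeout instead.
            raise TimeoutError(f'Simulated timeout on call {self.calls}')
        return {
            'text': 'echo: ' + str(payload.get('prompt', '')),
            'model': payload.get('model', 'fake'),
        }


provider = FlakyProvider(fail_times=2)
outcomes = []
for _ in range(3):
    try:
        outcomes.append(provider({'prompt': 'hello'}, timeout_s=1.0)['text'])
    except TimeoutError:
        outcomes.append('timeout')
```

Because the instance keeps its own call counter, a fresh `FlakyProvider` is needed for each test scenario.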


## LLM client skeleton

Combines a timeout, retries with exponential backoff, an in-memory cache, and latency logging.


In [None]:
@dataclass
class LLMConfig:
    model: str = 'fake-model'
    timeout_s: float = 10.0
    max_retries: int = 3

cfg = LLMConfig()
cfg


In [None]:
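def llm_call(prompt: str, *, config: LLMConfig, force_error: bool = False) -> dict:
    """Return a cached response if available; otherwise call the provider with retries."""
    payload = {
        'model': config.model,
        'prompt': prompt,
        'force_error': force_error,
    }
    cache_key = make_cache_key(payload)
    if cache_key in CACHE:
        logger.info('cache_hit')
        return CACHE[cache_key]

    # stop_after_attempt counts total attempts (first call included), not extra retries.
    # Once attempts are exhausted, tenacity raises RetryError unless reraise=True is set.
    @retry(
        stop=stop_after_attempt(config.max_retries),
        wait=wait_exponential(multiplier=0.5, min=0.5, max=4.0),
    )
    def _call_once():
        t0 = time.perf_counter()  # monotonic clock, suited to measuring durations
        try:
            return fake_provider_call(payload, timeout_s=config.timeout_s)
        finally:
            # Runs on success and on failure, so every attempt logs its latency.
            logger.info('latency_ms=%d', int((time.perf_counter() - t0) * 1000))

    resp = _call_once()
    CACHE[cache_key] = resp  # only successful responses are cached
    return resp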

llm_call('hello', config=cfg)
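To make the retry/backoff step in `llm_call` less opaque, here is a hand-rolled sketch of the same idea using only the standard library. It is not Tenacity's implementation and its wait schedule may differ from `wait_exponential`; `call_with_backoff` and all of its parameters are hypothetical names chosen for illustration. The `sleep` argument is injectable so the delays can be recorded instead of waited out.

```python
import time
from typing import Callable, Tuple, Type


def call_with_backoff(
    fn: Callable[[], dict],
    *,
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    max_delay_s: float = 4.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fn up to max_attempts times, doubling the wait after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the last error
            # Exponential backoff, capped at max_delay_s.
            delay = min(base_delay_s * 2 ** (attempt - 1), max_delay_s)
            sleep(delay)
    raise ValueError('max_attempts must be at least 1')


# Usage: fail twice, then succeed; record delays instead of sleeping.
attempts = {'n': 0}
delays = []


def flaky() -> dict:
    attempts['n'] += 1
    if attempts['n'] < 3:
        raise TimeoutError('simulated')
    return {'text': 'ok'}


result = call_with_backoff(flaky, sleep=delays.append)
```

Restricting retries to the exception types in `retry_on` means that errors unlikely to succeed on a second try, such as a malformed payload, fail immediately instead of consuming the retry budget.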


## TODO

Replace stub with real provider call, then add parsing/validation (Week 3).
