tafreeman · Apr 12, 2026 · Apr 11, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -37,10 +37,11 @@ ExecutionKit is a minimal library for LLM reasoning patterns — it fills the ga
 
 | Module | Role |
 |--------|------|
-| `provider.py` | `LLMProvider` protocol, `Provider` HTTP client, `LLMResponse`, 9-class error hierarchy |
+| `errors.py` | 9-class exception hierarchy (`ExecutionKitError` → `LLMError`, `PatternError` subtrees); extracted from `provider.py` (F-06) |
+| `provider.py` | `LLMProvider` protocol, `Provider` HTTP client, `LLMResponse`; re-exports error classes from `errors.py` for backwards compatibility; `_classify_http_error()` is the single HTTP status→exception mapping point shared by both backends (F-02) |
 | `types.py` | Frozen value types: `TokenUsage`, `PatternResult[T]`, `Tool`, `VotingStrategy`, `Evaluator` |
 | `cost.py` | `CostTracker` — mutable accumulator with two-phase accounting (`reserve_call` + `record_without_call`) |
-| `patterns/base.py` | `checked_complete()` — shared budget guard + retry entry point for all patterns |
+| `patterns/base.py` | `checked_complete()` — shared budget guard + retry entry point; `_check_budget()` helper uses `getattr()` field loop replacing per-field if-chains (F-05/F-08); `_TrackedProvider.supports_tools` delegates to wrapped provider via `getattr` instead of hardcoding `Literal[True]` (F-04) |
 | `patterns/consensus.py` | Parallel sampling, majority/unanimous voting, agreement metadata |
 | `patterns/refine_loop.py` | Iterative improvement with `ConvergenceDetector`; default evaluator uses XML sandboxing |
 | `patterns/react_loop.py` | Think-act-observe loop; validates tool args against JSON Schema; caps context via `max_history_messages` |
@@ -55,7 +56,9 @@ ExecutionKit is a minimal library for LLM reasoning patterns — it fills the ga
 
 **Two-phase cost accounting** — `reserve_call()` pre-increments the call counter before `await` (TOCTOU-safe for concurrent patterns); `record_without_call(response)` adds token counts after success.
 
-**Budget guards** — `checked_complete()` in `patterns/base.py` checks token/call budget before every LLM call and raises `BudgetExhaustedError` (with accumulated cost snapshot) if exceeded.
+**Budget guards** — `checked_complete()` in `patterns/base.py` checks token/call budget before every LLM call and raises `BudgetExhaustedError` (with accumulated cost snapshot) if exceeded. The internal `_check_budget()` helper iterates over field names using `getattr()` rather than repeating an if-block per field (F-05/F-08).
+
+**Centralised HTTP error mapping** — `_classify_http_error()` in `provider.py` is the single function that converts HTTP status codes to the appropriate error subclass. Both the `_post_httpx` and `_post_urllib` backends call it, eliminating the duplicated mapping logic that previously existed in each (F-02).
 
 **Structural typing** — `LLMProvider` and `ToolCallingProvider` are `@runtime_checkable` protocols, not base classes. Any object matching the interface works.
 

diff --git a/docs/api-reference.md b/docs/api-reference.md
@@ -1089,10 +1089,53 @@ Validate that an evaluator score is in [0.0, 1.0] and not NaN.
 
 ---
 
+### `_check_budget()` (internal)
+
+```python
+def _check_budget(
+    budget: TokenUsage,
+    current: TokenUsage,
+    fields: tuple[str, ...],
+    *,
+    sentinel_suffix: str,
+    exceeded_suffix: str,
+) -> None
+```
+
+Internal helper used by `checked_complete()` (F-05/F-08). It iterates over the named `TokenUsage` fields with `getattr()` and raises `BudgetExhaustedError` on the first field that is either sentinel-exhausted (value `-1`, set by `pipe()` propagation) or over its limit. This replaces the previous repeated per-field if-blocks; looking fields up by name in a loop is the same technique CPython's `dataclasses.asdict()` uses.
+
+**Location:** `executionkit/patterns/base.py`
+
+**Raises:** `BudgetExhaustedError` on the first exhausted field.
+
+---
+
+### `_classify_http_error()` (internal)
+
+```python
+def _classify_http_error(
+    status: int,
+    raw: dict[str, Any],
+    retry_after: float,
+    *,
+    cause: BaseException,
+) -> NoReturn
+```
+
+Internal helper in `provider.py` (F-02). Centralises the HTTP status code → exception mapping that is shared by both the `_post_httpx` and `_post_urllib` backends. Raises the correct typed exception — `RateLimitError` for HTTP 429, `PermanentError` for 401/403/404, `ProviderError` for all other non-2xx codes — and chains `cause` as the original exception. Both HTTP backends call this single function rather than duplicating the mapping logic.
+
+**Location:** `executionkit/provider.py`
+
+**Raises:** `RateLimitError`, `PermanentError`, or `ProviderError` (always raises; return type is `NoReturn`).
+
+---
+
 ## Error Hierarchy
 
 All exceptions carry `cost: TokenUsage` and `metadata: dict[str, Any]` attributes set at raise time.
 
+> **Module location (F-06):** The full 9-class hierarchy is defined in `executionkit/errors.py`. `provider.py` re-exports every class under the same name so that `from executionkit.provider import XError` imports remain valid.
+
 ```
 ExecutionKitError
 ├── LLMError                    — provider communication errors

diff --git a/docs/architecture.md b/docs/architecture.md
@@ -37,15 +37,21 @@ shape every design decision:
 executionkit/
 ├── __init__.py          — public API surface; sync wrappers
 ├── types.py             — frozen value types: PatternResult, TokenUsage, Tool, VotingStrategy, Evaluator
+├── errors.py            — 9-class exception hierarchy (F-06: extracted from provider.py)
 ├── provider.py          — LLMProvider protocol, ToolCallingProvider protocol,
-│                          Provider concrete class, LLMResponse, ToolCall,
-│                          and the 9-class error hierarchy
+│                          Provider concrete class, LLMResponse, ToolCall;
+│                          re-exports error classes from errors.py for backwards compatibility;
+│                          _classify_http_error() is the single HTTP status→exception mapping
+│                          point for both urllib and httpx backends (F-02)
 ├── cost.py              — CostTracker mutable accumulator
 ├── compose.py           — pipe() composition helper, PatternStep protocol
 ├── kit.py               — Kit session facade (provider + cumulative usage)
 ├── _mock.py             — MockProvider test double (satisfies both protocols)
 ├── patterns/
-│   ├── base.py          — checked_complete(), validate_score(), _TrackedProvider
+│   ├── base.py          — checked_complete(), validate_score(), _TrackedProvider;
+│   │                      _check_budget() uses getattr() field loop replacing per-field
+│   │                      if-chains (F-05/F-08); _TrackedProvider.supports_tools delegates
+│   │                      to wrapped provider via getattr (F-04)
 │   ├── consensus.py     — parallel majority/unanimous voting
 │   ├── refine_loop.py   — iterative score-guided refinement
 │   └── react_loop.py    — tool-calling think-act-observe loop
@@ -66,7 +72,8 @@ patterns/base    ──► cost, engine/retry, provider, types
 patterns/consensus  ──► cost, engine/parallel, engine/retry, patterns/base, provider, types
 patterns/refine_loop ──► cost, engine/convergence, engine/retry, patterns/base, provider, types
 patterns/react_loop  ──► cost, engine/retry, patterns/base, provider, types
-provider  ──► types
+provider  ──► types, errors  (re-exports all 9 error classes from errors.py)
+errors    ──► types
 cost      ──► types
 engine/*  ──► provider (retry only)
 ```
@@ -172,8 +179,13 @@ directly. Its snapshot is emitted as an immutable `TokenUsage` via `to_usage()`.
 
 ## Error Handling Architecture
 
+The full 9-class exception hierarchy lives in `executionkit/errors.py` (F-06).
+`provider.py` re-exports all nine classes under the same names so that existing
+`from executionkit.provider import XError` imports continue to work without
+modification (in the spirit of PEP 387, CPython's backwards-compatibility policy).
+
 ```
-ExecutionKitError
+ExecutionKitError              ← executionkit/errors.py
 ├── LLMError                  ← provider communication failures
 │   ├── RateLimitError        ← HTTP 429; carries retry_after float
 │   ├── PermanentError        ← HTTP 401/403/404; do not retry
@@ -188,6 +200,12 @@ All errors carry `cost: TokenUsage` so callers can see what was spent before
 the failure. `pipe()` augments errors with the cumulative cross-step cost before
 re-raising.
 
+**HTTP error classification:** `_classify_http_error()` in `provider.py` is the
+single function responsible for mapping HTTP status codes to the correct error
+subclass. Both the `_post_httpx` and `_post_urllib` backends call it, eliminating
+duplicated mapping logic (F-02). This mirrors the pattern used by the Anthropic
+SDK's `_make_status_error()`.
+
 **Retry boundary:** `with_retry()` in `engine/retry.py` only retries
 `RateLimitError` and `ProviderError`. `PermanentError` propagates immediately.
 `asyncio.CancelledError` is always re-raised without retry.

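The retry-boundary paragraph at the end of the architecture.md hunk above can be illustrated with a short sketch. It is not the library's `with_retry()`: the name `retry_sketch`, its `max_attempts` and `delay` parameters, and the exception classes are all hypothetical stand-ins. It shows only the documented rule that `RateLimitError` and `ProviderError` are retried while `PermanentError` and `asyncio.CancelledError` propagate.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class LLMError(Exception):
    """Stand-in base for provider communication errors."""


class RateLimitError(LLMError):
    """Stand-in: retryable."""


class ProviderError(LLMError):
    """Stand-in: retryable."""


class PermanentError(LLMError):
    """Stand-in: never retried."""


async def retry_sketch(
    call: Callable[[], Awaitable[T]],
    *,
    max_attempts: int = 3,  # hypothetical parameter
    delay: float = 0.0,     # hypothetical fixed pause between attempts
) -> T:
    """Retry only the two retryable error types; let everything else escape."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except (RateLimitError, ProviderError):
            if attempt == max_attempts:
                raise
            await asyncio.sleep(delay)
        # PermanentError is not listed above, so it propagates on first raise.
        # asyncio.CancelledError derives from BaseException, so it is never
        # caught here either.
    raise RuntimeError("max_attempts must be at least 1")
```

Centralising the status mapping matters for this boundary: if the two HTTP backends classified a status differently, the same response could be retried on one backend and not on the other.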
diff --git a/docs/c4/c4-code-src-executionkit-patterns.md b/docs/c4/c4-code-src-executionkit-patterns.md
@@ -22,9 +22,9 @@
 - **Raises**: `ValueError` if score is NaN or outside [0.0, 1.0] range
 
 #### `checked_complete(provider: LLMProvider, messages: Sequence[dict[str, Any]], tracker: CostTracker, budget: TokenUsage | None, retry: RetryConfig | None, **kwargs: Any) -> LLMResponse`
-- **Description**: Makes a budget-aware LLM API call with retry logic. Checks token and LLM call budgets before dispatching and records usage in the cost tracker.
+- **Description**: Makes a budget-aware LLM API call with retry logic. Checks token and LLM call budgets before dispatching (via `_check_budget`) and records usage in the cost tracker.
 - **Location**: `base.py:24-55`
-- **Dependencies**: `LLMProvider`, `CostTracker`, `BudgetExhaustedError`, `with_retry`, `DEFAULT_RETRY`, `TokenUsage`, `RetryConfig`, `LLMResponse`
+- **Dependencies**: `LLMProvider`, `CostTracker`, `BudgetExhaustedError`, `with_retry`, `DEFAULT_RETRY`, `TokenUsage`, `RetryConfig`, `LLMResponse`, `_check_budget`, `_BUDGET_FIELD_LABELS`
 - **Parameters**:
   - `provider: LLMProvider` - The LLM provider to use
   - `messages: Sequence[dict[str, Any]]` - Messages to send to the LLM
@@ -35,6 +35,26 @@
 - **Return Type**: `LLMResponse` - Response from the LLM provider
 - **Raises**: `BudgetExhaustedError` if any budget constraint would be exceeded
 
+#### `_check_budget(budget: TokenUsage, current: TokenUsage, fields: tuple[str, ...], *, sentinel_suffix: str, exceeded_suffix: str) -> None`
+- **Description**: Validates selected `TokenUsage` fields by comparing the configured `budget` against the current accumulated `TokenUsage`. Iterates over the supplied `fields` and raises `BudgetExhaustedError` with a descriptive message if a field has reached a sentinel condition or would exceed its allowed limit.
+- **Location**: `base.py`
+- **Dependencies**: `TokenUsage`, `BudgetExhaustedError`, `_BUDGET_FIELD_LABELS`
+- **Parameters**:
+  - `budget: TokenUsage` - Maximum allowed token/call counts
+  - `current: TokenUsage` - Current accumulated token/call usage to validate against the budget
+  - `fields: tuple[str, ...]` - Names of the `TokenUsage` fields to check
+  - `sentinel_suffix: str` - Message suffix used when a budget field is already at its sentinel/exhausted value
+  - `exceeded_suffix: str` - Message suffix used when the current usage would exceed the configured budget
+- **Return Type**: `None`
+- **Raises**: `BudgetExhaustedError` naming the field that hit a sentinel condition or exceeded its budget (e.g., "input_tokens", "llm_calls")
+
+#### `_BUDGET_FIELD_LABELS`
+- **Description**: Module-level dict mapping `TokenUsage` field names to human-readable label strings used in `BudgetExhaustedError` messages. Drives the field-loop in `_check_budget`, making it easy to add new budget dimensions without modifying control flow.
+- **Location**: `base.py`
+- **Type**: `dict[str, str]`
+- **Example entries**: `{"input_tokens": "input tokens", "output_tokens": "output tokens", "llm_calls": "LLM calls"}`
+- **Dependencies**: None
+
 #### `_note_truncation(response: LLMResponse, metadata: dict[str, Any], context: str) -> None`
 - **Description**: Logs a warning and increments truncation counter in metadata if the LLM response was truncated (finish_reason indicates truncation).
 - **Location**: `base.py:58-66`
@@ -185,7 +205,8 @@
   - `_budget: TokenUsage | None` - Optional token budget constraints
   - `_retry: RetryConfig | None` - Retry configuration
   - `_context: str` - Context string for error messages
-  - `supports_tools: bool = True` - Class attribute indicating tool support capability
+- **Properties**:
+  - `supports_tools: bool` - Delegates to the wrapped provider's `supports_tools` attribute rather than hardcoding `Literal[True]`; this allows `_TrackedProvider` to accurately reflect the capabilities of the underlying provider at runtime
 - **Methods**:
   - `__init__(provider: LLMProvider, tracker: CostTracker, metadata: dict[str, Any], *, budget: TokenUsage | None, retry: RetryConfig | None, context: str) -> None` - Initializes the wrapper with dependencies
   - `complete(messages: Sequence[dict[str, Any]], *, temperature: float | None = None, max_tokens: int | None = None, tools: Sequence[dict[str, Any]] | None = None, **kwargs: Any) -> LLMResponse` - Wraps provider.complete() with budget and truncation checks
@@ -228,6 +249,7 @@ None - executionkit has zero external runtime dependencies as specified in `pypr
 ### Standard Library Dependencies
 
 - `asyncio` - For async/await support and task management (react_loop)
+- `logging` - Module-level import in `react_loop.py` for structured diagnostic logging
 - `collections.Counter` - For vote counting in consensus
 - `json` - For serializing tool arguments (react_loop)
 - `math` - For NaN checking in score validation
@@ -270,7 +292,7 @@ classDiagram
 
         class TrackedProvider {
             <<class>>
-            +supports_tools: bool
+            +supports_tools: bool (property, delegates to _provider)
             -_provider: LLMProvider
             -_tracker: CostTracker
             -_metadata: dict
@@ -285,6 +307,8 @@ classDiagram
             <<module>>
             +validate_score(score) float
             +checked_complete(provider, messages, ...) LLMResponse
+            -_check_budget(budget, current, fields, ...) None
+            -_BUDGET_FIELD_LABELS dict
             -_note_truncation(response, metadata, context) void
             -_TrackedProvider TrackedProvider
         }