# 01 - Problem Formulation (Detailed)

This notebook provides a formal, mathematically grounded statement of the tasks the system must solve, the operational constraints, and the evaluation metrics. The goal is to make tradeoffs explicit and to derive metrics that drive model selection and policy thresholds.

## Problem statement

Given a stream of inbound events $x \sim D$ (messages, metadata, signals), produce for each $x$ an action $a \in A$ and an artifact $z$ containing the score, explanation, and provenance. The system should maximize expected operational utility under safety and audit constraints.

## Mathematical objective

Define utility $U(a,x)$ as expected reward minus cost. The production policy $\pi$ chooses $a$ to approximately maximize expected utility:

$$\pi(x) \approx \arg\max_{a \in A} \bigl( \mathbb{E}[R(a,X) \mid X=x] - C(a) \bigr),$$

where $R$ is a stochastic reward (e.g., conversion revenue) and $C$ is the operational cost of pursuing action $a$.
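As a minimal sketch of this objective, assuming a small discrete action set and a reward estimator supplied by the caller, a greedy policy can pick the action with the highest estimated net utility. The names `greedy_policy`, `toy_reward`, and `toy_cost`, the action labels, and all numbers are hypothetical and only illustrate the argmax.

```python
from typing import Callable, Dict, Sequence


def greedy_policy(
    x: dict,
    actions: Sequence[str],
    expected_reward: Callable[[str, dict], float],  # assumed estimate of E[R(a, X) | X = x]
    cost: Dict[str, float],                          # C(a) for each action
) -> str:
    """Return the action maximizing estimated E[R(a, X) | X = x] - C(a)."""
    return max(actions, key=lambda a: expected_reward(a, x) - cost[a])


# Toy reward model with made-up numbers (illustrative only).
def toy_reward(a: str, x: dict) -> float:
    # Pursuing pays off in proportion to conversion probability; ignoring earns nothing.
    return x["p_convert"] * 100.0 if a == "pursue" else 0.0


toy_cost = {"pursue": 20.0, "ignore": 0.0}
actions = ["pursue", "ignore"]

chosen_high = greedy_policy({"p_convert": 0.5}, actions, toy_reward, toy_cost)  # 50 - 20 > 0
chosen_low = greedy_policy({"p_convert": 0.1}, actions, toy_reward, toy_cost)   # 10 - 20 < 0
```

With a larger or continuous action space the exhaustive `max` would need to be replaced by a search or optimization step, which is why the objective is stated as approximate.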

## Evaluation metrics

- Precision@k: $\mathrm{P@k} = \frac{1}{k}\sum_{i=1}^k \mathbf{1}\{\text{item}_i \text{ is a true opportunity}\}$ for ranked lists.
- Recall: fraction of opportunities recovered at a threshold.
- Expected cost per positive: $\mathbb{E}[C \mid \text{pursue}]$.
- Offline policy evaluation: use importance sampling or inverse propensity weighting to estimate counterfactual performance when comparing policies.
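The metrics above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a reference implementation: labels are assumed to be 0/1, the function names are hypothetical, and the off-policy estimator is the basic inverse propensity weighted mean, which assumes the logging policy's action probabilities were recorded and are nonzero.

```python
from typing import Sequence


def precision_at_k(ranked_labels: Sequence[int], k: int) -> float:
    """Fraction of the top-k ranked items that are true opportunities."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_labels[:k]) / k


def recall_at_threshold(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Fraction of true opportunities whose score is at or above the threshold."""
    positives = sum(labels)
    if positives == 0:
        return 0.0
    hits = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    return hits / positives


def ips_value(
    rewards: Sequence[float],
    logged_propensities: Sequence[float],
    target_propensities: Sequence[float],
) -> float:
    """Inverse propensity weighted estimate of the target policy's mean reward.

    Entry i describes the action actually logged for event i: its observed
    reward, the logging policy's probability of taking it, and the target
    policy's probability of taking it.
    """
    weighted = [
        (t / l) * r
        for r, l, t in zip(rewards, logged_propensities, target_propensities)
    ]
    return sum(weighted) / len(weighted)


# Made-up data for illustration.
p_at_3 = precision_at_k([1, 0, 1, 1, 0], k=3)
recall = recall_at_threshold([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0], threshold=0.5)
ips = ips_value([1.0, 0.0, 1.0], [0.5, 0.5, 0.5], [1.0, 0.0, 0.5])
```

The plain IPS estimate can have high variance when logged propensities are small; clipping or self-normalizing the weights is a common mitigation, at the price of some bias.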

## Safety and audit constraints

- Every automated `pursue` decision must include an artifact with the scored features and the exact arithmetic that generated the score.
- Restrict automated actions to those with confidence above conservative thresholds; lower-confidence actions should be flagged for human review.
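One way to satisfy both constraints is to build the audit artifact while the score is being computed. The sketch below assumes a simple linear score and a single threshold; `DecisionArtifact`, `decide`, the feature names, the weights, and the threshold value are all hypothetical and stand in for whatever scoring model the system actually uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DecisionArtifact:
    action: str                  # "pursue" (automated) or "review" (human)
    score: float                 # final score used for the decision
    features: Dict[str, float]   # the scored feature values
    arithmetic: List[str] = field(default_factory=list)  # step-by-step record


def decide(
    features: Dict[str, float],
    weights: Dict[str, float],
    bias: float,
    auto_threshold: float,
) -> DecisionArtifact:
    """Score with an assumed linear model and record every term of the sum."""
    score = bias
    steps = [f"bias = {bias}"]
    for name in sorted(features):  # sorted for a reproducible record
        contribution = weights[name] * features[name]
        score += contribution
        steps.append(f"{name}: {weights[name]} * {features[name]} = {contribution}")
    steps.append(f"score = {score}")
    # Below the conservative threshold, the decision is routed to a human.
    action = "pursue" if score >= auto_threshold else "review"
    return DecisionArtifact(action, score, dict(features), steps)


weights = {"intent": 0.5, "recency": 0.25}
confident = decide({"intent": 1.0, "recency": 1.0}, weights, bias=0.0, auto_threshold=0.7)
uncertain = decide({"intent": 1.0, "recency": 0.0}, weights, bias=0.0, auto_threshold=0.7)
```

Storing the arithmetic as readable steps lets an auditor recompute the score from the artifact alone, without access to the model code.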

## Experimental design notes

- Use randomized rollouts and stratified sampling to measure causal impact on conversion.
- Monitor calibration drift over time; recalibrate scores (via isotonic regression or Platt scaling) when necessary.
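Calibration drift can be tracked with a binned reliability measure such as expected calibration error (ECE), computed on a recent window of scored events with known outcomes. The sketch below is one possible monitor, not a prescribed one: the equal-width binning, the bin count, and the `tolerance` value are assumptions to be tuned.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between predicted probability and observed positive rate per bin."""
    if not probs:
        raise ValueError("need at least one prediction")
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_p = sum(p for p, _ in bucket) / len(bucket)
        positive_rate = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_p - positive_rate)
    return ece


def needs_recalibration(
    probs: Sequence[float], labels: Sequence[int], tolerance: float = 0.05
) -> bool:
    """Flag a window whose ECE exceeds the (assumed) tolerance."""
    return expected_calibration_error(probs, labels) > tolerance


# Made-up windows: the first is well calibrated, the second overpredicts.
ece_ok = expected_calibration_error([0.25, 0.25, 0.25, 0.25], [1, 0, 0, 0])
ece_drift = expected_calibration_error([0.75, 0.75, 0.75, 0.75], [0, 0, 0, 1])
flag = needs_recalibration([0.75, 0.75, 0.75, 0.75], [0, 0, 0, 1])
```

ECE estimates are noisy on small windows, so a drift alarm based on it is usually paired with a minimum sample size per window.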

## Decision thresholds and calibration

Calibration: if a model outputs an unbounded score $s(x)$, convert to probability with a link function $p=\sigma(s)$ or calibrate empirically: learn a monotone mapping $g$ such that $g(s) \approx P(\text{positive} \mid s)$.
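Both options can be sketched without external libraries. The code below shows a logistic link and a monotone step-function fit via the pool-adjacent-violators algorithm, which is the standard way to compute an isotonic regression. The function names are hypothetical, and the sketch makes simplifying assumptions: 0/1 labels, no special handling of tied scores, and no interpolation between steps.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(s: float) -> float:
    """Logistic link p = sigma(s), written to avoid overflow for large |s|."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def fit_isotonic(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Pool-adjacent-violators fit.

    Returns (upper score bound, calibrated probability) pairs with
    non-decreasing probabilities.
    """
    blocks: List[list] = []  # each block: [sum of labels, count, max score]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, total / count) for total, count, top in blocks]


def apply_isotonic(model: List[Tuple[float, float]], s: float) -> float:
    """Return the probability of the first block whose upper bound covers s."""
    for top, prob in model:
        if s <= top:
            return prob
    return model[-1][1]  # scores above every bound get the highest step


# Made-up data: the middle two points violate monotonicity and get pooled.
g = fit_isotonic([0.1, 0.2, 0.3, 0.4], [0, 1, 0, 1])
p_mid = apply_isotonic(g, 0.25)
p_link = sigmoid(0.0)
```

The logistic link is only a valid probability if the score is already on a log-odds scale; the empirical fit makes no such assumption but needs enough labeled data per score region.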

Threshold selection: choose threshold $\tau$ to maximize expected utility given estimated conversion probability $p(x)$ and cost $c$: pursue when $p(x) \cdot V - c > 0$, where $V$ is the expected value of a conversion. For $V > 0$ this is equivalent to pursuing when $p(x) > \tau$ with $\tau = c / V$. This directly ties business economics to the decision rule.
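A small worked example makes the rule concrete. Setting $p \cdot V - c = 0$ gives the break-even probability $c / V$; the sketch below computes it and applies the decision rule. The function names and the figures ($V = 100$, $c = 20$) are made up for illustration.

```python
def break_even_threshold(value: float, cost: float) -> float:
    """Probability at which p * V - c = 0, i.e. c / V."""
    if value <= 0:
        raise ValueError("value must be positive")
    return cost / value


def should_pursue(p: float, value: float, cost: float) -> bool:
    """Pursue only when the expected net value is strictly positive."""
    return p * value - cost > 0


tau = break_even_threshold(value=100.0, cost=20.0)        # 20 / 100
pursue_likely = should_pursue(0.25, value=100.0, cost=20.0)    # 25 - 20 > 0
pursue_unlikely = should_pursue(0.10, value=100.0, cost=20.0)  # 10 - 20 < 0
```

Because the rule depends on $p(x)$ being a well-calibrated probability, a miscalibrated score shifts the effective threshold away from the break-even point.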