# Vector-stabilized CDS ↔ Transition Tilt inside a Vector-like Layer

This notebook demonstrates a practical credit-risk pattern:

1. Link **CDS information** to a **rating transition model** by fitting a scalar tilt parameter $\alpha$ that scales a baseline generator.
2. Stabilize that fit using **vectors/embeddings** (peer prior + shrinkage).
3. Wrap the above inside a **Vector-like automation layer** that routes inbound text, retrieves precedents, enforces exception controls, and reuses last good calibration when market data is missing.

The goal is to show, end-to-end, how “vector” methods help a bank when CDS observations are noisy/sparse and when workflow automation is required.

---

## Per-message processing pipeline

For each inbound message (email/ticket/chat):

1. **Route** the message to a workflow using text embeddings (trade capture vs calibration vs tilt update vs exception).
2. **Extract** structured fields (issuer, tenor, spread, currency).
3. **Retrieve** similar historical cases (precedents) using embeddings.
4. **Control**: if routed to an exception workflow, open a case and **do not update** risk parameters.
5. Otherwise:
   - if a spread is present, compute a CDS-implied default proxy and calibrate $\alpha$,
   - if a spread is missing, reuse the last good issuer calibration; if none exists, fall back to rating mean (naive) or peer prior (vector).

---

## Transition model: CTMC generator and default probability

We represent credit migration with a continuous-time Markov chain (CTMC) over rating states. Let $Q$ be the generator matrix. In this demo we use four states:

- $0 =$ IG  
- $1 =$ BBB-ish  
- $2 =$ HY  
- $3 =$ Default (absorbing)

Given a generator $Q$, the transition probability matrix over horizon $T$ is:
$$
P(T) = \exp(QT).
$$
If the issuer starts in rating state $r$, the transition-model default probability by time $T$ is:
$$
PD_{\mathrm{TM}}(T \mid r) = P(T)_{r, D}.
$$
The code computes $P(T)$ via an eigen-decomposition-based matrix exponential. This assumes the generator is diagonalizable, which is adequate for a small $4\times 4$ demo.

---

## Linking CDS risk to transitions via a scalar tilt $\alpha$

To link market-implied default risk to the transition model, we scale a baseline generator $Q_{\mathrm{base}}$:
$$
Q_{\mathrm{issuer}} = \alpha \, Q_{\mathrm{base}}.
$$
Intuition:
- Larger $\alpha$ increases migration intensities proportionally (including default intensity).
- Smaller $\alpha$ slows migration and reduces default probability.

For any horizon $T$ and start rating $r$:
$$
PD_{\mathrm{TM}}(T \mid r,\alpha) = \left(\exp(\alpha Q_{\mathrm{base}} T)\right)_{r,D}.
$$
Calibration chooses $\alpha$ such that $PD_{\mathrm{TM}}(T)$ matches a market-implied target derived from CDS.
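
As a minimal sketch of the tilted default probability, the snippet below evaluates $\exp(\alpha Q_{\mathrm{base}} T)$ for a made-up generator. The matrix `Q_TOY` and the helpers `expm_taylor` and `pd_tm` are illustrative assumptions, not the notebook's calibrated generator or functions, and the exponential uses a plain scaling-and-squaring Taylor series rather than the eigen-decomposition route.

```python
import numpy as np

# Hypothetical 4-state generator (IG, BBB-ish, HY, Default); each row sums to zero.
Q_TOY = np.array([
    [-0.10,  0.08,  0.02, 0.00],
    [ 0.05, -0.16,  0.10, 0.01],
    [ 0.01,  0.09, -0.21, 0.11],
    [ 0.00,  0.00,  0.00, 0.00],  # default is absorbing
])

def expm_taylor(a, squarings=10, terms=12):
    """Matrix exponential: scale down, sum a truncated Taylor series, square back up."""
    scaled = a / (2.0 ** squarings)
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def pd_tm(q_base, alpha, horizon_years, rating_state, default_state=3):
    """PD_TM(T | r, alpha) = [exp(alpha * Q_base * T)]_{r, D}."""
    p = expm_taylor(alpha * q_base * horizon_years)
    return float(p[rating_state, default_state])

# 5Y default probability for a BBB-ish issuer under three tilts.
pd_low = pd_tm(Q_TOY, 0.5, 5.0, rating_state=1)
pd_base = pd_tm(Q_TOY, 1.0, 5.0, rating_state=1)
pd_high = pd_tm(Q_TOY, 2.0, 5.0, rating_state=1)
print(pd_low, pd_base, pd_high)
```

Because $\alpha$ only ever appears multiplied by $T$, the tilt acts as a time change: $PD_{\mathrm{TM}}(T \mid r,\alpha)$ equals the untilted PD at horizon $\alpha T$. With an absorbing default state, that makes the PD non-decreasing in $\alpha$.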

---

## CDS proxy used in this notebook (simplified)

A full CDS calibration bootstraps a term structure of hazard rates $\lambda(t)$ using multiple maturities and discounting assumptions.

To keep the demo self-contained, we use a simplified approximation:

- Convert spread from bps to decimal: $s = \frac{\text{bp}}{10000}$.
- Assume constant recovery $R$.
- Use the heuristic relationship:
$$
s \approx (1-R)\lambda
\quad\Rightarrow\quad
\lambda \approx \frac{s}{1-R}.
$$
- Assuming constant hazard $\lambda$, compute a single-horizon default proxy:
$$
PD_{\mathrm{CDS}}(T) \approx 1 - e^{-\lambda T}.
$$
This $PD_{\mathrm{CDS}}(T)$ is the target used for fitting $\alpha$.
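
The proxy above fits in a few lines. The sketch below assumes a 40% recovery and a 120 bp 5Y quote purely for illustration; the function name `cds_pd_proxy` is hypothetical.

```python
import math

def cds_pd_proxy(spread_bp, horizon_years, recovery=0.40):
    """Single-horizon PD proxy from a flat hazard implied by s ≈ (1 - R) * lambda."""
    spread = spread_bp / 10_000.0        # bp -> decimal
    hazard = spread / (1.0 - recovery)   # lambda ≈ s / (1 - R)
    return 1.0 - math.exp(-hazard * horizon_years)

pd_5y = cds_pd_proxy(spread_bp=120.0, horizon_years=5.0)
print(round(pd_5y, 6))  # 0.095163
```

Here $\lambda = 0.012 / 0.6 = 0.02$, so $PD_{\mathrm{CDS}}(5) = 1 - e^{-0.1} \approx 9.5\%$.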

---

## Naive calibration of $\alpha$ (pure market fit)

Given $PD_{\mathrm{CDS}}(T)$ and start rating $r$, the naive approach fits $\alpha$ by minimizing squared error:
$$
\alpha_{\mathrm{naive}} = \arg\min_{\alpha}
\left(PD_{\mathrm{TM}}(T \mid r,\alpha) - PD_{\mathrm{CDS}}(T)\right)^2.
$$
In the code, this is solved via a grid search over $\alpha \in [0.01, 5]$.

Strength: market-consistent when quotes are reliable.  
Weakness: unstable when quotes are noisy/illiquid or missing.
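
A minimal sketch of the naive grid search, assuming a made-up generator `Q_TOY` and a synthetic target generated at $\alpha = 1.7$ (both are illustrative, not the notebook's data). It also refits after bumping the target PD by one percentage point to show how directly quote noise passes into $\alpha$.

```python
import numpy as np

# Hypothetical generator (IG, BBB-ish, HY, Default); rows sum to zero.
Q_TOY = np.array([
    [-0.10,  0.08,  0.02, 0.00],
    [ 0.05, -0.16,  0.10, 0.01],
    [ 0.01,  0.09, -0.21, 0.11],
    [ 0.00,  0.00,  0.00, 0.00],
])

def expm_taylor(a, squarings=10, terms=12):
    """Scaling-and-squaring Taylor approximation of exp(a)."""
    scaled = a / (2.0 ** squarings)
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def pd_tm(q_base, alpha, horizon_years, rating_state):
    return float(expm_taylor(alpha * q_base * horizon_years)[rating_state, 3])

def fit_alpha_naive(q_base, pd_target, horizon_years, rating_state, alpha_grid):
    """Pick the grid point minimizing the squared PD error at one horizon."""
    errors = [(pd_tm(q_base, a, horizon_years, rating_state) - pd_target) ** 2
              for a in alpha_grid]
    return float(alpha_grid[int(np.argmin(errors))])

alpha_grid = np.linspace(0.01, 5.0, 500)            # grid step of 0.01
pd_target = pd_tm(Q_TOY, 1.7, 5.0, rating_state=1)  # synthetic "market" PD

alpha_hat = fit_alpha_naive(Q_TOY, pd_target, 5.0, 1, alpha_grid)
alpha_bumped = fit_alpha_naive(Q_TOY, pd_target + 0.01, 5.0, 1, alpha_grid)
print(alpha_hat, alpha_bumped)
```

With a single target and one free parameter, the fit reproduces whatever the quote says, so any noise in the quote lands in $\alpha$ unfiltered.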

---

## “Vector” for risk stabilization: issuer embeddings and peer prior

Here, “vector” means a numerical representation of issuer state that supports similarity-based inference.

Each issuer $i$ has a structured feature vector $x_i$ (e.g., sector, leverage, equity vol, liquidity proxy, rating). After normalization we obtain an embedding $z_i$.

Similarity is measured using cosine similarity:
$$
\mathrm{sim}(i,j) = \frac{z_i^\top z_j}{\|z_i\|\|z_j\|}
$$
We compute an issuer-specific peer prior for $\alpha$ using the $k$ nearest neighbors:
$$
\alpha_{\mathrm{prior}}(i) = \sum_{j \in \mathcal{N}_k(i)} w_{ij}\,\alpha_j
$$
with weights defined by a similarity softmax:
$$
w_{ij} =
\frac{\exp(\mathrm{sim}(i,j)/\tau)}
{\sum_{\ell \in \mathcal{N}_k(i)} \exp(\mathrm{sim}(i,\ell)/\tau)}
$$
This prior is most valuable when market quotes are missing or low quality.
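
The peer prior can be sketched as follows. The two-dimensional embeddings and peer $\alpha$ values are invented for illustration, and `knn_prior_alpha` is a hypothetical helper that mirrors the formulas above (top-$k$ by cosine similarity, then a softmax with temperature $\tau$).

```python
import numpy as np

def knn_prior_alpha(z_i, z_peers, alpha_peers, k=3, tau=0.12):
    """Softmax-weighted average of the alphas of the k most similar peers."""
    sims = (z_peers @ z_i) / (np.linalg.norm(z_peers, axis=1) * np.linalg.norm(z_i))
    top = np.argsort(-sims)[:k]                # indices of the k nearest peers
    logits = sims[top] / tau
    weights = np.exp(logits - logits.max())    # subtract max for numerical stability
    weights /= weights.sum()
    return float(weights @ alpha_peers[top]), top, weights

# Toy data: four peers with known alphas, one target issuer.
z_i = np.array([1.0, 0.0])
z_peers = np.array([[1.0, 0.1], [0.9, -0.2], [0.0, 1.0], [-1.0, 0.0]])
alpha_peers = np.array([1.2, 0.8, 3.0, 0.5])

alpha_prior, neighbors, weights = knn_prior_alpha(z_i, z_peers, alpha_peers, k=2)
print(alpha_prior, neighbors, weights)
```

A smaller $\tau$ concentrates weight on the single closest peer; a larger $\tau$ approaches an unweighted mean over the $k$ neighbors.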

---

## Vector-stabilized calibration (MAP shrinkage)

When a market observation exists, we combine:
- fit-to-market, and
- shrinkage toward the peer prior.

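We solve the following, where $\lambda \ge 0$ denotes the shrinkage strength (not the hazard rate used in the CDS proxy):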
$$
\alpha_{\mathrm{vector}} = \arg\min_{\alpha}
\left(PD_{\mathrm{TM}}(T \mid r,\alpha) - PD_{\mathrm{CDS}}(T)\right)^2
+ \lambda\left(\alpha - \alpha_{\mathrm{prior}}(i)\right)^2
$$
Interpretation:
- The first term enforces market consistency.
- The second term prevents extreme/unstable parameters by pulling toward peers.

To ensure “market-respecting” behavior, the implementation can constrain $\alpha$ to stay within a band of relative half-width $\delta$ around the pure market fit $\alpha_{\mathrm{mkt}}$ (the naive estimate $\alpha_{\mathrm{naive}}$), e.g.
$$
\alpha \in \left[\alpha_{\mathrm{mkt}}(1-\delta),\, \alpha_{\mathrm{mkt}}(1+\delta)\right]
$$
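
A minimal sketch of the penalized fit with the band constraint. To stay dependency-free it swaps the four-state model for a two-state chain (alive, default) with a hypothetical base default intensity of 0.02, for which $PD_{\mathrm{TM}}(T \mid \alpha) = 1 - e^{-\alpha h T}$. The prior, penalty strength, and band width are illustrative assumptions.

```python
import math

def pd_tm_two_state(alpha, horizon_years, base_intensity=0.02):
    """Two-state stand-in for PD_TM: one 'alive' state with default intensity alpha * h."""
    return 1.0 - math.exp(-alpha * base_intensity * horizon_years)

def fit_alpha_map(pd_target, horizon_years, alpha_prior, lam, delta, grid_size=2001):
    """Grid search for the market fit, then the penalized fit inside the band."""
    grid = [0.01 + (5.0 - 0.01) * i / (grid_size - 1) for i in range(grid_size)]

    def market_loss(a):
        return (pd_tm_two_state(a, horizon_years) - pd_target) ** 2

    alpha_mkt = min(grid, key=market_loss)
    lo, hi = alpha_mkt * (1.0 - delta), alpha_mkt * (1.0 + delta)
    band = [a for a in grid if lo <= a <= hi]
    alpha_map = min(band, key=lambda a: market_loss(a) + lam * (a - alpha_prior) ** 2)
    return alpha_mkt, alpha_map

pd_target = pd_tm_two_state(2.0, 5.0)   # synthetic target generated at alpha = 2
alpha_mkt, alpha_map = fit_alpha_map(pd_target, 5.0, alpha_prior=1.0, lam=0.01, delta=0.25)
print(alpha_mkt, alpha_map)
```

In this example the prior sits below the market fit, so the penalized estimate moves down toward it but cannot leave the band $[0.75\,\alpha_{\mathrm{mkt}},\,1.25\,\alpha_{\mathrm{mkt}}]$. Setting `lam=0` recovers the pure market fit.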

---

## Text embeddings: routing and precedent retrieval (Vector-like layer)

Separately from issuer vectors, we embed message text for:

1. **Workflow routing**: pick the workflow whose description embedding is most similar to the inbound message.
2. **Precedent retrieval**: retrieve similar historical cases for context/governance.

With normalized embeddings, inner product corresponds to cosine similarity. If $e$ is the embedding of the inbound message and $v_k$ is the embedding of workflow $k$:
$$
\mathrm{score}(k) = e^\top v_k
$$
and the selected workflow is:
$$
k^* = \arg\max_k \mathrm{score}(k)
$$
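
The routing rule can be illustrated without a real text encoder. The three-dimensional vectors below are hand-made stand-ins for sentence embeddings, and the workflow names are hypothetical labels for the four workflows mentioned earlier.

```python
import numpy as np

def l2_normalize(m):
    """Scale vectors to unit length so that inner product equals cosine similarity."""
    return m / np.linalg.norm(m, axis=-1, keepdims=True)

workflow_names = ["trade_capture", "calibration", "tilt_update", "exception"]
workflow_vecs = l2_normalize(np.array([
    [1.0, 0.0,  0.0],
    [0.0, 1.0,  0.0],
    [0.0, 0.6,  0.8],
    [0.7, 0.0, -0.7],
]))
message_vec = l2_normalize(np.array([0.1, 0.5, 0.9]))  # stand-in for an embedded message

scores = workflow_vecs @ message_vec                   # score(k) = e^T v_k
selected = workflow_names[int(np.argmax(scores))]
print(selected, np.round(scores, 3))
```

In practice a minimum-score threshold or a margin between the top two scores is a common guard against routing ambiguous messages; that is a design option, not something this notebook is confirmed to implement.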

---

## Controls and governance logic

Two production-relevant controls are implemented:

### 1) Exception gating
If the message routes to the exception workflow (e.g., suspected bad tick), we open a case and **do not update** $\alpha$.

This prevents contaminating calibration/risk parameters with anomalous inputs.

### 2) Issuer-level memory for missing quotes
If the spread is missing, we reuse the **last calibrated** $\alpha$ for that issuer (if available). This avoids discontinuities from falling back to rating means or peers when the system simply lacks a fresh quote.

Operationally, this mirrors a common desk approach: “carry forward last good curve/tilt unless we have validated new market information.”
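
The two controls reduce to a small decision function. The sketch below is an assumption about how such logic could be arranged; `decide_alpha`, the action labels, and the `calibrate` callable are hypothetical names, not the notebook's API.

```python
from typing import Callable, Dict, Optional, Tuple

def decide_alpha(
    issuer: str,
    workflow: str,
    spread_bp: Optional[float],
    last_good: Dict[str, float],
    fallback_alpha: float,
    calibrate: Callable[[float], float],
) -> Tuple[Optional[float], str]:
    """Return (alpha to use, action taken) for one inbound message."""
    if workflow == "exception":
        # Exception gating: open a case and leave stored parameters untouched.
        return last_good.get(issuer), "case_opened_no_update"
    if spread_bp is not None:
        alpha = calibrate(spread_bp)
        last_good[issuer] = alpha      # remember the latest good calibration
        return alpha, "calibrated"
    if issuer in last_good:
        return last_good[issuer], "reused_last_good"
    return fallback_alpha, "fallback"  # rating mean (naive) or peer prior (vector)

memory: Dict[str, float] = {}
toy_calibrate = lambda bp: bp / 100.0  # placeholder for the real alpha fit

r1 = decide_alpha("ACME", "tilt_update", 120.0, memory, 1.0, toy_calibrate)
r2 = decide_alpha("ACME", "tilt_update", None, memory, 1.0, toy_calibrate)
r3 = decide_alpha("ACME", "exception", 900.0, memory, 1.0, toy_calibrate)
r4 = decide_alpha("BETA", "calibration", None, memory, 1.0, toy_calibrate)
print(r1, r2, r3, r4)
```

The suspicious 900 bp quote in the third call never reaches the calibrator, so the stored tilt for ACME is unchanged.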

---

In [1]:
import os
import re
import warnings
from typing import Optional, Dict, List, Tuple, Set

import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

# Reduce TensorFlow verbosity if it gets imported transitively by transformers/sentence-transformers.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

# Silence tqdm autonotebook warning in some notebook-like environments.
warnings.filterwarnings("ignore", message="Using `tqdm.autonotebook.tqdm`.*")

# Rating state indices
DEFAULT_STATE = 3  # 0=IG, 1=BBB-ish, 2=HY, 3=Default (absorbing)




## Section 0A — Ratings transition model and continuous-time generator

### Objective
Construct a baseline *continuous-time* credit migration model (with default as an absorbing state) that can generate multi-horizon transition probabilities and default probabilities, later “tilted” by a scalar factor.

### Mathematical framework

#### Discrete-time transition matrix
Let $P \in \mathbb{R}^{K\times K}$ be a one-period transition matrix across rating states, with rows summing to 1. Default is absorbing:
- States: $0=\text{IG}$, $1=\text{BBB-ish}$, $2=\text{HY}$, $3=\text{Default}$
- Absorbing default: $P_{3,3}=1$ and $P_{3,j}=0$ for $j\neq 3$

#### Cohort simulation and empirical estimation
A cohort simulation generates rating paths $\{X_{n,t}\}$ for names $n=1,\dots,N$ and time steps $t=0,\dots,T-1$.

Transition counts are computed as:
$$
n_{ij} = \sum_{n=1}^{N}\sum_{\tau=1}^{T-1}\mathbf{1}\{X_{n,\tau-1}=i,\;X_{n,\tau}=j\}.
$$

The cohort transition matrix estimator is:
$$
\hat P_{ij} = \frac{n_{ij}}{\sum_{k=0}^{K-1} n_{ik}},
$$
with absorbing default enforced explicitly.
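
As a small worked instance of the estimator, the sketch below counts transitions on three hand-written rating paths and row-normalizes. The paths are invented, and `estimate_cohort_matrix` is a hypothetical vectorized variant, not the notebook's loop-based functions.

```python
import numpy as np

def estimate_cohort_matrix(paths, num_states):
    """Count i->j transitions over all names and steps, then row-normalize."""
    counts = np.zeros((num_states, num_states))
    # np.add.at accumulates correctly even when (i, j) index pairs repeat.
    np.add.at(counts, (paths[:, :-1].ravel(), paths[:, 1:].ravel()), 1.0)

    row_totals = counts.sum(axis=1, keepdims=True)
    # Rows without observations fall back to the identity row.
    p_hat = np.where(row_totals > 0, counts / np.maximum(row_totals, 1.0), np.eye(num_states))

    # Enforce absorbing default explicitly.
    p_hat[-1, :] = 0.0
    p_hat[-1, -1] = 1.0
    return counts, p_hat

# Three names observed over four dates (0=IG, 1=BBB-ish, 2=HY, 3=Default).
paths = np.array([
    [0, 0, 1, 1],
    [0, 1, 2, 3],
    [1, 1, 1, 2],
])
counts, p_hat = estimate_cohort_matrix(paths, num_states=4)
print(counts)
print(p_hat)
```

For example, state 1 is left five times (three times to itself, twice to state 2), so $\hat P_{1,1} = 0.6$ and $\hat P_{1,2} = 0.4$.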

#### Continuous-time generator
For a continuous-time Markov chain with generator matrix $Q$, the transition matrix over horizon $t$ is:
$$
P(t) = e^{Qt}.
$$

The script approximates $Q \approx \log(P)$ via a truncated series expansion:
$$
\log(P) = (P-I) - \frac{(P-I)^2}{2} + \frac{(P-I)^3}{3} - \cdots
$$

Because truncation and estimation noise can produce invalid generators, the script applies a “repair”:
- Set negative off-diagonals to zero: $Q_{ij}\leftarrow \max(Q_{ij},0)$ for $i\neq j$
- Enforce row-sum-zero property:
$$
Q_{ii} = -\sum_{j\neq i} Q_{ij}.
$$
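
The series and the repair step can be sketched together. The matrix below is the toy one-period matrix used in this notebook; the function name `generator_from_series` and the four-term truncation are illustrative assumptions.

```python
import numpy as np

P_TOY = np.array([
    [0.9000, 0.0800, 0.0199, 0.0001],
    [0.0500, 0.8500, 0.0900, 0.0100],
    [0.0100, 0.0900, 0.8000, 0.1000],
    [0.0000, 0.0000, 0.0000, 1.0000],
])

def generator_from_series(p, num_terms=4):
    """Truncated log(P) series followed by the generator repair."""
    n = p.shape[0]
    delta = p - np.eye(n)

    raw = np.zeros_like(delta)
    power = np.eye(n)
    for k in range(1, num_terms + 1):
        power = power @ delta                        # (P - I)^k
        raw += ((-1.0) ** (k + 1)) * power / k       # alternating sign, divide by k

    repaired = raw.copy()
    off_diag = ~np.eye(n, dtype=bool)
    repaired[off_diag] = np.maximum(repaired[off_diag], 0.0)   # clip negative intensities
    np.fill_diagonal(repaired, 0.0)
    np.fill_diagonal(repaired, -repaired.sum(axis=1))          # rows sum to zero
    return raw, repaired

q_raw, q_repaired = generator_from_series(P_TOY)
print(np.round(q_raw, 5))
print(np.round(q_repaired, 5))
```

For this matrix the raw IG-to-Default entry comes out slightly negative, because almost all IG defaults within one period pass through lower ratings first. The repair sets it to zero, which is exactly the kind of case the clipping is meant to handle.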

#### Transition-model-implied default probability
Given base generator $Q_{\text{base}}$, scalar tilt $\alpha>0$, horizon $T$, and current rating state $r$:
$$
P_{\alpha}(T) = e^{(\alpha Q_{\text{base}})T},
\qquad
\mathrm{PD}_{\mathrm{TM}}(T \mid r,\alpha) = [P_{\alpha}(T)]_{r,\text{Default}}.
$$

### What the code does / produces
- Produces a baseline generator $Q_{\text{base}}$ from either:
  - a simulated-and-estimated transition matrix $\hat P$, or
  - the invented matrix $P_{\text{true}}$
- Provides a function $\mathrm{PD}_{\mathrm{TM}}(T \mid r,\alpha)$ to map a tilt $\alpha$ into multi-horizon default probabilities.

### Why it is designed this way
- The transition matrix provides a *structural* baseline for credit migration.
- The generator representation enables coherent multi-horizon behavior via matrix exponentiation rather than ad hoc scaling.
- The scalar $\alpha$ provides a simple one-parameter “tilt” knob to align the structural model to market-implied default probabilities.


In [2]:
# ============================================================
# SECTION 0A — Transition matrices + Generator (CTMC)
# ============================================================

def Invented_Transition_Matrix_4() -> np.ndarray:
    """
    Toy 4-state one-period transition matrix with Default absorbing.
    Rows sum to 1, and last row is [0,0,0,1].
    """
    return np.array(
        [
            [0.9000, 0.0800, 0.0199, 0.0001],
            [0.0500, 0.8500, 0.0900, 0.0100],
            [0.0100, 0.0900, 0.8000, 0.1000],
            [0.0000, 0.0000, 0.0000, 1.0000],
        ],
        dtype=float,
    )


def Initialize_Counterparties(num_names: int, start_weights: np.ndarray) -> np.ndarray:
    """
    Sample initial rating states for num_names using categorical weights start_weights.

    start_weights is assumed to be length-4 in this demo (IG, BBB-ish, HY, Default).
    """
    start_state = np.zeros(num_names, dtype=int)
    cum_weights = np.cumsum(start_weights)
    uniforms = np.random.uniform(0.0, 1.0, num_names)

    for n in range(num_names):
        # No lower bound on the first bucket: a draw of exactly 0.0 must map to state 0, not Default.
        if uniforms[n] <= cum_weights[0]:
            start_state[n] = 0
        elif cum_weights[0] < uniforms[n] <= cum_weights[1]:
            start_state[n] = 1
        elif cum_weights[1] < uniforms[n] <= cum_weights[2]:
            start_state[n] = 2
        else:
            start_state[n] = 3
    return start_state


def Transition_Step(current_state: int, transition_matrix: np.ndarray) -> int:
    """
    One-step draw from the categorical distribution in transition_matrix[current_state, :].
    """
    cum_probs = np.cumsum(transition_matrix[current_state, :])
    u = np.random.uniform(0.0, 1.0)
    for j in range(transition_matrix.shape[1]):
        if u <= cum_probs[j]:
            return j
    return transition_matrix.shape[1] - 1


def Simulate_Rating_Data(
    num_names: int,
    num_periods: int,
    transition_matrix: np.ndarray,
    start_weights: np.ndarray,
) -> np.ndarray:
    """
    Simulate discrete-time rating paths:
      data[n, t] = rating state for name n at time t.
    """
    start_state = Initialize_Counterparties(num_names, start_weights)
    data = np.zeros((num_names, num_periods), dtype=int)

    for n in range(num_names):
        data[n, 0] = start_state[n]
        for t in range(1, num_periods):
            data[n, t] = Transition_Step(data[n, t - 1], transition_matrix)
    return data


def Get_Transition_Count(num_states: int, num_periods: int, num_names: int, data: np.ndarray) -> np.ndarray:
    """
    Compute cohort transition counts:
      count[i, j] = number of i->j transitions observed across all names and time steps.
    """
    count_ij = np.zeros((num_states, num_states), dtype=float)
    for n in range(num_names):
        for tau in range(1, num_periods):
            i = int(data[n, tau - 1])
            j = int(data[n, tau])
            count_ij[i, j] += 1.0
    return count_ij


def Estimate_Cohort_Transition_Matrix(num_states: int, count_ij: np.ndarray, period_power: int = 1) -> np.ndarray:
    """
    Estimate a one-period transition matrix by row-normalizing counts, then enforce absorbing default.

    If a row has no observations, fall back to identity on that row.
    period_power allows compounding to multi-period transitions via matrix power.
    """
    est_matrix = np.zeros((num_states, num_states), dtype=float)

    for i in range(num_states):
        denom = np.sum(count_ij[i, :])
        if denom > 0:
            est_matrix[i, :] = count_ij[i, :] / denom
        else:
            est_matrix[i, i] = 1.0

    # Enforce absorbing default explicitly (model governance / sanity constraint)
    est_matrix[-1, :] = 0.0
    est_matrix[-1, -1] = 1.0

    return np.linalg.matrix_power(est_matrix, period_power)


def Find_Generator_Matrix_Series(transition_matrix: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    """
    Approximate generator Q ≈ log(P) using a truncated series expansion around I:
      log(P) = (P-I) - (P-I)^2/2 + (P-I)^3/3 - ...

    IMPORTANT:
    - This truncation does NOT guarantee a valid generator.
    - Empirical P may also violate embeddability (there may not exist a valid Q with exp(Q)=P).

    The code therefore "repairs" Q:
    - Clip negative off-diagonal entries to 0
    - Recompute diagonals to enforce row-sum approximately 0

    Returns:
      tilde_q: raw series approximation
      q_repaired: repaired generator suitable for CTMC usage in this demo
    """
    num_states = transition_matrix.shape[0]
    delta = transition_matrix - np.eye(num_states)

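    tilde_q = delta.copy()
    power = delta.copy()  # running power (P-I)^k

    # Truncate after the k = num_states term (toy choice; adequate for demo scale, not production-grade)
    for k in range(2, num_states + 1):
        power = np.dot(power, delta)
        # k-th series term is (-1)^(k+1) * (P-I)^k / k; dividing a running term by k
        # at each step would compound the divisors into 1/k! instead of 1/k.
        tilde_q += ((-1.0) ** (k + 1)) * power / float(k)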

    q_repaired = tilde_q.copy()

    # Repair step: enforce Q_{ij} >= 0 for i != j and row sums ~ 0
    for i in range(num_states):
        for j in range(num_states):
            if i != j and q_repaired[i, j] < 0.0:
                q_repaired[i, j] = 0.0

        # Set diagonal so row sums to 0: Q_{ii} = -sum_{j!=i} Q_{ij}
        off_diag_sum = np.sum(q_repaired[i, :]) - q_repaired[i, i]
        q_repaired[i, i] = -off_diag_sum

    return tilde_q, q_repaired


def Expm_Via_Eig(matrix_a: np.ndarray) -> np.ndarray:
    """
    Matrix exponential via eigen-decomposition.
    Appropriate for small matrices (4x4 demo). For production, prefer scipy.linalg.expm or scaling-squaring.

    exp(A) = V exp(diag(w)) V^{-1}
    """
    eigen_vals, eigen_vecs = np.linalg.eig(matrix_a)
    inv_eigen_vecs = np.linalg.inv(eigen_vecs)
    return (eigen_vecs @ np.diag(np.exp(eigen_vals)) @ inv_eigen_vecs).real


def Pd_Tm(generator_base: np.ndarray, alpha: float, horizon_years: float, rating_state: int) -> float:
    """
    Transition-model default probability over horizon_years given:
      Q_issuer = alpha * generator_base
      P(T) = exp(Q_issuer * T)

    Returns P(T)[rating_state, DEFAULT_STATE].
    """
    trans_matrix = Expm_Via_Eig((alpha * generator_base) * horizon_years)
    return float(trans_matrix[rating_state, DEFAULT_STATE])


def Build_Q_Base_From_Transition_Data(
    seed: int = 7,
    use_estimated: bool = True,
    num_names: int = 1000,
    num_periods_years: int = 10,
    start_weights: np.ndarray = np.array([0.45, 0.35, 0.20, 0.00]),
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Either:
      - simulate cohort paths from a 'true' P, estimate P_hat, then compute Q_base from P_hat, OR
      - compute Q_base directly from the true P (for deterministic demo).

    Returns:
      p_used: transition matrix used (estimated or true)
      q_base: repaired CTMC generator used downstream
    """
    np.random.seed(seed)
    p_true = Invented_Transition_Matrix_4()

    if use_estimated:
        data = Simulate_Rating_Data(num_names=num_names, num_periods=num_periods_years, transition_matrix=p_true, start_weights=start_weights)
        count_ij = Get_Transition_Count(num_states=4, num_periods=num_periods_years, num_names=num_names, data=data)
        p_hat = Estimate_Cohort_Transition_Matrix(num_states=4, count_ij=count_ij, period_power=1)
        _, q_base = Find_Generator_Matrix_Series(p_hat)
        return p_hat, q_base

    _, q_base = Find_Generator_Matrix_Series(p_true)
    return p_true, q_base


## Section 0B — CDS par spread pricing and piecewise-constant hazard calibration

### Objective
Infer a *hazard rate term structure* from observed CDS spreads, then convert that hazard curve into market-implied default probabilities at given horizons.

### Mathematical framework

#### Discounting
A flat annual discount curve is used:
$$
DF(0,t) = (1+r)^{-t}.
$$

#### Survival probability under piecewise-constant hazard
Let tenors be $0<T_1<\cdots<T_m$. Define piecewise-constant hazards $\gamma_k$ on $(T_{k-1},T_k]$ (with $T_0=0$). The cumulative hazard up to time $t$ is:
$$
\Lambda(t) = \int_0^t \lambda(u)\,du
          = \sum_{k=1}^{m}\gamma_k\cdot\max\bigl(0,\;\min(t,T_k)-T_{k-1}\bigr)
          + \gamma_m\cdot\max(0,t-T_m).
$$

Survival probability:
$$
S(t) = e^{-\Lambda(t)}.
$$
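
A minimal sketch of the cumulative hazard and survival probability under piecewise-constant hazards. The tenors and hazard levels are invented for illustration; segments that start after $t$ contribute nothing, and the last hazard is extended flat beyond the final tenor.

```python
import math

def cumulative_hazard(tenors, gammas, t):
    """Integrate a piecewise-constant hazard from 0 to t."""
    total, prev = 0.0, 0.0
    for tenor, gamma in zip(tenors, gammas):
        # Overlap of [prev, tenor] with [0, t]; zero when the segment starts after t.
        total += gamma * max(0.0, min(t, tenor) - prev)
        prev = tenor
    total += gammas[-1] * max(0.0, t - tenors[-1])  # flat extension past the last tenor
    return total

def survival(tenors, gammas, t):
    return math.exp(-cumulative_hazard(tenors, gammas, t))

tenors = [1.0, 3.0, 5.0]
gammas = [0.01, 0.02, 0.03]

for t in (0.5, 4.0, 6.0):
    print(t, cumulative_hazard(tenors, gammas, t), survival(tenors, gammas, t))
```

At $t=4$ the integral is $0.01\cdot 1 + 0.02\cdot 2 + 0.03\cdot 1 = 0.08$, so $S(4) = e^{-0.08}$.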

#### CDS par spread (discrete quarterly approximation)
With recovery $R$, the script approximates:
- Protection leg (default payment) using survival differences:
$$
\mathrm{ProtLeg} \approx \sum_{j} DF(0,t_j)\,[S(t_{j-1})-S(t_j)]
$$
- Premium leg (spread payments) using expected survival:
$$
\mathrm{PremLeg} \approx \sum_{j} DF(0,t_j)\,S(t_j)\,\Delta t,
\qquad \Delta t = 0.25.
$$

Par spread is:
$$
s^* = (1-R)\frac{\mathrm{ProtLeg}}{\mathrm{PremLeg}},
\qquad \text{and reported in bp as } 10{,}000\,s^*.
$$
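
The discrete par-spread formula can be checked against the heuristic $s \approx (1-R)\lambda$ with a flat hazard. The 3% discount rate, 40% recovery, and 2% hazard below are illustrative assumptions, and `par_spread_bp` is a hypothetical stand-alone helper.

```python
import math

def par_spread_bp(hazard, maturity_years, rate=0.03, recovery=0.40, dt=0.25):
    """Quarterly-discretized par spread (in bp) under a flat hazard and flat discount rate."""
    n_periods = int(round(maturity_years / dt))
    prot_leg = 0.0
    prem_leg = 0.0
    for j in range(1, n_periods + 1):
        t_prev, t_j = (j - 1) * dt, j * dt
        df = (1.0 + rate) ** (-t_j)            # DF(0, t_j)
        s_prev = math.exp(-hazard * t_prev)    # S(t_{j-1})
        s_j = math.exp(-hazard * t_j)          # S(t_j)
        prot_leg += df * (s_prev - s_j)
        prem_leg += df * s_j * dt
    return 10_000.0 * (1.0 - recovery) * prot_leg / prem_leg

spread_5y = par_spread_bp(hazard=0.02, maturity_years=5.0)
print(spread_5y)
```

With a flat hazard, every period satisfies $S(t_{j-1}) - S(t_j) = S(t_j)\,(e^{\lambda\Delta t}-1)$, so the ratio of legs collapses to $(e^{\lambda \Delta t}-1)/\Delta t$ regardless of the discount rate. The result is therefore about 120.3 bp, close to the 120 bp implied by $(1-R)\lambda$.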

#### Bootstrapping piecewise hazards
Given market par spreads $\{s(T_k)\}$ at maturities $\{T_k\}$, the script bootstraps sequentially:
- For each $k$, solve for $\gamma_k$ such that:
$$
\mathrm{ParSpread}\bigl(\gamma_1,\dots,\gamma_k; T_k\bigr) = s(T_k),
$$
using bisection root-finding.
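
The sequential bootstrap can be sketched end to end. The quotes (80, 105, and 120 bp at 1Y, 3Y, and 5Y), the 3% rate, and the helper names are illustrative assumptions; the bisection here assumes the bracket $[10^{-8}, 1]$ contains the root and does not try to expand it.

```python
import math

def survival(tenors, gammas, t):
    """S(t) under piecewise-constant hazards (only used for t up to the last tenor)."""
    total, prev = 0.0, 0.0
    for tenor, gamma in zip(tenors, gammas):
        total += gamma * max(0.0, min(t, tenor) - prev)
        prev = tenor
    return math.exp(-total)

def par_spread_bp(tenors, gammas, maturity, rate=0.03, recovery=0.40, dt=0.25):
    prot = prem = 0.0
    for j in range(1, int(round(maturity / dt)) + 1):
        df = (1.0 + rate) ** (-j * dt)
        s_prev = survival(tenors, gammas, (j - 1) * dt)
        s_j = survival(tenors, gammas, j * dt)
        prot += df * (s_prev - s_j)
        prem += df * s_j * dt
    return 10_000.0 * (1.0 - recovery) * prot / prem

def bisect(f, lo, hi, iters=100):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bootstrap_hazards(spreads_bp_by_tenor):
    """Solve gamma_k tenor by tenor, holding earlier hazards fixed."""
    tenors = sorted(spreads_bp_by_tenor)
    gammas = []
    for k, tenor in enumerate(tenors):
        target = spreads_bp_by_tenor[tenor]
        objective = lambda g: par_spread_bp(tenors[:k + 1], gammas + [g], tenor) - target
        gammas.append(bisect(objective, 1e-8, 1.0))
    return tenors, gammas

quotes = {1.0: 80.0, 3.0: 105.0, 5.0: 120.0}
tenors, gammas = bootstrap_hazards(quotes)
repriced = [par_spread_bp(tenors[:k + 1], gammas[:k + 1], T) for k, T in enumerate(tenors)]
print(gammas, repriced)
```

An upward-sloping spread curve produces increasing forward hazards, and repricing each tenor with the bootstrapped hazards returns the input quotes to within numerical tolerance.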

#### CDS-implied default probability
Once hazards are calibrated, CDS-implied default probability at horizon $T$ is:
$$
\mathrm{PD}_{\mathrm{CDS}}(T) = 1 - S(T) = 1 - e^{-\Lambda(T)}.
$$

### What the code does / produces
- Parses free text to extract spreads by tenor: $\{T_k \mapsto s(T_k)\}$.
- Bootstraps hazards $\{\gamma_k\}$ consistent with those spreads.
- Produces CDS-implied PD targets $\mathrm{PD}_{\mathrm{CDS}}(T_k)$ at quoted tenors.
- If only one tenor is quoted, it also prints a “flat-hazard extrapolation” across reporting horizons.

### Why it is designed this way
- CDS spreads are a market-consistent source of default risk.
- Piecewise-constant hazards are a standard tractable approximation for calibrating term structures.
- Bootstrapping ensures each quoted tenor is fit exactly (within numerical tolerance).


In [3]:
# ============================================================
# SECTION 0B — CDS pricing + piecewise-constant hazard bootstrap
# ============================================================

class Discount_Curve_Base(object):
    def Df(self, t1: float, t2: float) -> float:
        raise NotImplementedError


class Flat_Discount_Curve(Discount_Curve_Base):
    def __init__(self, rate_r: float):
        self.rate_r = float(rate_r)

    def Df(self, t1: float, t2: float) -> float:
        # Annual compounding between t1 and t2: DF(t1,t2)=(1+r)^(-(t2-t1)), so DF(0,t)=(1+r)^(-t)
        return (1.0 + self.rate_r) ** -(t2 - t1)


class Credit_Default_Swap:
    def __init__(self, maturity_years: float, discount_curve: Discount_Curve_Base, recovery_r: float = 0.40):
        self.maturity_years = float(maturity_years)
        self.discount_curve = discount_curve
        self.recovery_r = float(recovery_r)

    def Payment_Dates(self) -> np.ndarray:
        # Quarterly accrual-period start dates: 0, 0.25, ..., maturity-0.25 (each period pays at start + 0.25)
        return np.arange(0.0, self.maturity_years, 0.25)

    def Survival_Probability(self, parameters, t: float) -> float:
        raise NotImplementedError

    def Par_Spread_Bp(self, parameters) -> float:
        """
        Discrete approximation of CDS par spread:
          spread = (1-R) * ProtLeg / PremLeg
        Returned in basis points.
        """
        dates = self.Payment_Dates()
        prot_leg = 0.0
        prem_leg = 0.0

        for date in dates:
            t_start = date
            t_end = date + 0.25

            df = self.discount_curve.Df(0.0, t_end)
            surv_start = self.Survival_Probability(parameters, t_start)
            surv_end = self.Survival_Probability(parameters, t_end)

            # Protection leg approximated by survival drop over accrual: S(t_start)-S(t_end)
            prot_leg += df * (surv_start - surv_end)

            # Premium leg approximated by survival at pay date * accrual
            prem_leg += df * surv_end * 0.25

        spread_decimal = (1.0 - self.recovery_r) * prot_leg / max(prem_leg, 1e-16)
        return float(spread_decimal * 10000.0)


class Ihp_Credit_Default_Swap(Credit_Default_Swap):
    def __init__(self, tenors_years: List[float], **kwargs):
        super().__init__(**kwargs)
        self.tenors_years = list(map(float, tenors_years))

    def _Hazard_Integral(self, gammas: List[float], t: float) -> float:
        """
        Piecewise-constant hazard:
          lambda(t) = gamma_k for t in (T_{k-1}, T_k]
        Then cumulative hazard:
          Λ(t)=∑ gamma_k * length_of_overlap
        """
        total = 0.0
        prev = 0.0
        for idx, tenor in enumerate(self.tenors_years):
            gamma = float(gammas[idx])
            if t >= tenor:
                total += gamma * (tenor - prev)
            else:
                total += gamma * (t - prev)
                return total
            prev = tenor

        # If t extends beyond last tenor, extend with last gamma.
        if t > self.tenors_years[-1]:
            total += float(gammas[-1]) * (t - self.tenors_years[-1])

        return total

    def Survival_Probability(self, gammas: List[float], t: float) -> float:
        return float(np.exp(-self._Hazard_Integral(gammas, float(t))))


def _Bisect_Root(function_f, lo: float, hi: float, max_iter: int = 80, tol: float = 1e-10) -> float:
    """
    Bisection root finder. If f(lo) and f(hi) have same sign, expands hi up to find a bracket.
    """
    f_lo = function_f(lo)
    f_hi = function_f(hi)

    if np.sign(f_lo) == np.sign(f_hi):
        # Try to find a bracket by expanding hi.
        for _ in range(30):
            hi *= 2.0
            f_hi = function_f(hi)
            if np.sign(f_lo) != np.sign(f_hi):
                break
        else:
            # Could not bracket; return lo as safe fallback (demo behavior).
            return lo

    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = function_f(mid)

        if abs(f_mid) < tol:
            return mid

        if np.sign(f_mid) == np.sign(f_lo):
            lo, f_lo = mid, f_mid
        else:
            hi, f_hi = mid, f_mid

    return 0.5 * (lo + hi)


def Parse_Spreads_By_Tenor(message_text: str) -> Dict[float, float]:
    """
    Extract {tenor_years: spread_bp} from free text.
    Supports:
      - "1Y=80bp 3Y=105bp 5Y=120bp"
      - "5Y:120bp"
      - "5Y CDS @ 120bp" / "5Y @ 120bp"
    """
    spreads_by_tenor: Dict[float, float] = {}

    # Pattern A: explicit bindings (1Y=80bp, 3Y:105bp, etc.)
    for match in re.finditer(r"(\d+)\s*Y\s*[:=]\s*(\d+(?:\.\d+)?)\s*bp", message_text, flags=re.IGNORECASE):
        spreads_by_tenor[float(match.group(1))] = float(match.group(2))

    # Pattern B: tenor ... @ spread
    for match in re.finditer(r"(\d+)\s*Y\b[^0-9]{0,30}@\s*(\d+(?:\.\d+)?)\s*bp", message_text, flags=re.IGNORECASE):
        spreads_by_tenor[float(match.group(1))] = float(match.group(2))

    return spreads_by_tenor


def Calibrate_Piecewise_Hazards(
    spreads_by_tenor: Dict[float, float],
    discount_curve: Discount_Curve_Base,
    recovery_r: float = 0.40,
) -> Tuple[List[float], List[float]]:
    """
    Sequential bootstrap:
      For tenor T_k, solve for gamma_k so that model par spread equals market spread at T_k.

    Output:
      tenors_sorted, gammas
    """
    tenors_sorted = sorted(spreads_by_tenor.keys())
    gammas: List[float] = []

    for k, tenor_k in enumerate(tenors_sorted):
        target_bp = float(spreads_by_tenor[tenor_k])
        local_tenors = tenors_sorted[: k + 1]

        def objective(gamma_k: float) -> float:
            gamma_all = gammas + [float(gamma_k)]
            cds = Ihp_Credit_Default_Swap(
                tenors_years=local_tenors,
                maturity_years=tenor_k,
                discount_curve=discount_curve,
                recovery_r=recovery_r,
            )
            return cds.Par_Spread_Bp(gamma_all) - target_bp

        gamma_k = _Bisect_Root(objective, lo=1e-8, hi=1.0)
        gammas.append(float(gamma_k))

    return tenors_sorted, gammas


def Cds_Pd_From_Hazards(tenors_years: List[float], gammas: List[float], horizon_years: float) -> float:
    """
    Convert hazards to CDS-implied PD at horizon:
      PD(T) = 1 - S(T)
    Uses dummy discounting because survival is independent of discounting in this structure.
    """
    dummy_curve = Flat_Discount_Curve(rate_r=0.0)
    cds = Ihp_Credit_Default_Swap(
        tenors_years=tenors_years,
        maturity_years=max(max(tenors_years), horizon_years),
        discount_curve=dummy_curve,
        recovery_r=0.40,
    )
    surv = cds.Survival_Probability(gammas, horizon_years)
    return float(1.0 - surv)

## Section 1 — Structured issuer vectors and multi-horizon $\alpha$ estimation (naive vs vector prior)

### Objective
Estimate a scalar tilt $\alpha$ such that transition-model default probabilities match CDS-implied PDs, while stabilizing estimates for sparse/low-quality data via a peer-conditioned vector prior.

### Mathematical framework

#### Structured issuer feature vectors
Each issuer $i$ has a feature vector $x_i$ including:
- sector one-hot encoding,
- leverage, equity volatility, liquidity proxy,
- rating state (as a numeric feature)

The script standardizes features (z-score):
$$
z_{i,d} = \frac{x_{i,d}-\mu_d}{\sigma_d+\epsilon},
$$
where $(\mu_d,\sigma_d)$ are cross-sectional mean/std.
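
A short sketch of the column-wise z-score on a toy feature matrix. The feature values are invented and `zscore_columns` is a hypothetical helper; the small $\epsilon$ keeps a constant column from causing a division by zero.

```python
import numpy as np

def zscore_columns(x, eps=1e-12):
    """Standardize each feature (column) across issuers (rows)."""
    mu = x.mean(axis=0, keepdims=True)
    sigma = x.std(axis=0, keepdims=True)
    return (x - mu) / (sigma + eps)

# Columns: leverage, equity vol, liquidity proxy, and a constant feature.
features = np.array([
    [2.0, 0.25, 1.2, 1.0],
    [3.5, 0.40, 0.6, 1.0],
    [2.5, 0.30, 1.0, 1.0],
    [4.0, 0.55, 0.4, 1.0],
])
z = zscore_columns(features)
print(np.round(z, 3))
```

Standardizing first matters for cosine similarity: without it, the feature with the largest raw scale (leverage here) would dominate the angle between issuers.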

#### Similarity and KNN prior on $\alpha$
Cosine similarity between issuer $i$ and observed issuers:
$$
\mathrm{sim}(i,j) = \frac{z_i^\top z_j}{\|z_i\|\,\|z_j\|}.
$$

Top-$k$ neighbors define weights via a softmax-like rule:
$$
w_j \propto \exp\Bigl(\frac{\mathrm{sim}(i,j)}{\tau}\Bigr),
\qquad \sum_j w_j = 1,
$$
and prior:
$$
\alpha_{\text{prior}}(i)=\sum_{j\in\mathcal{N}_k(i)} w_j\,\alpha_{\text{anchor}}(j).
$$

Here $\alpha_{\text{anchor}}(j)$ is an “anchor” tilt inferred from a single-horizon observation (5Y PD) for issuers with observed data.

#### Market-fit $\alpha$ via grid search (multi-horizon)
Given PD targets at horizons $\mathcal{T}$, rating state $r$, and model $\mathrm{PD}_{\mathrm{TM}}(T\mid r,\alpha)$, the naive estimate solves:
$$
\alpha_{\text{mkt}} = \arg\min_{\alpha\in\mathcal{G}}
\sum_{T\in\mathcal{T}} \omega_T \left(\mathrm{PD}_{\mathrm{TM}}(T\mid r,\alpha) - \mathrm{PD}_{\mathrm{CDS}}(T)\right)^2.
$$
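
A minimal sketch of the multi-horizon fit, assuming a made-up generator `Q_TOY` and synthetic PD targets (both illustrative). The second fit mixes targets generated at different tilts to show that a single $\alpha$ settles on a compromise across horizons.

```python
import numpy as np

# Hypothetical generator (IG, BBB-ish, HY, Default); rows sum to zero.
Q_TOY = np.array([
    [-0.10,  0.08,  0.02, 0.00],
    [ 0.05, -0.16,  0.10, 0.01],
    [ 0.01,  0.09, -0.21, 0.11],
    [ 0.00,  0.00,  0.00, 0.00],
])

def expm_taylor(a, squarings=10, terms=12):
    """Scaling-and-squaring Taylor approximation of exp(a)."""
    scaled = a / (2.0 ** squarings)
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def pd_tm(q_base, alpha, horizon_years, rating_state):
    return float(expm_taylor(alpha * q_base * horizon_years)[rating_state, 3])

def fit_alpha_multi(q_base, pd_targets, rating_state, alpha_grid, weights=None):
    """Grid point minimizing the weighted sum of squared PD errors over all horizons."""
    horizons = sorted(pd_targets)
    wts = weights or {h: 1.0 for h in horizons}
    losses = [
        sum(wts[h] * (pd_tm(q_base, a, h, rating_state) - pd_targets[h]) ** 2 for h in horizons)
        for a in alpha_grid
    ]
    return float(alpha_grid[int(np.argmin(losses))])

alpha_grid = np.linspace(0.01, 5.0, 500)

# Consistent targets: every horizon generated at alpha = 1.3 for an HY issuer.
targets = {h: pd_tm(Q_TOY, 1.3, h, rating_state=2) for h in (1.0, 3.0, 5.0)}
alpha_consistent = fit_alpha_multi(Q_TOY, targets, 2, alpha_grid)

# Inconsistent targets: the 5Y point is generated at alpha = 2.0 instead.
mixed = dict(targets)
mixed[5.0] = pd_tm(Q_TOY, 2.0, 5.0, rating_state=2)
alpha_mixed = fit_alpha_multi(Q_TOY, mixed, 2, alpha_grid)
print(alpha_consistent, alpha_mixed)
```

The weights $\omega_T$ decide which tenors dominate the compromise, for example up-weighting the most liquid quote.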

#### MAP estimate with prior regularization
The vector-regularized estimate adds a quadratic penalty:
$$
\alpha_{\text{MAP}} = \arg\min_{\alpha\in\mathcal{G}}
\sum_{T\in\mathcal{T}} \omega_T \left(\mathrm{PD}_{\mathrm{TM}}(T\mid r,\alpha) - \mathrm{PD}_{\mathrm{CDS}}(T)\right)^2
+ \lambda(\alpha-\alpha_{\text{prior}})^2.
$$

The script makes $\lambda$ larger when confidence is lower (stronger pull to the prior).

### What the code does / produces
- Constructs a synthetic issuer universe with structured features and partially missing observed PDs.
- Derives “anchor” $\alpha$ values for issuers with observed PD(5Y).
- Produces two estimators:
  - **Naive**: market-fit only, no peer stabilization.
  - **Vector**: market-fit plus KNN prior regularization, scaled by a confidence score.
- Outputs $\alpha_{\text{naive}}$, $\alpha_{\text{vector}}$ and corresponding $\mathrm{PD}_{\mathrm{TM}}(1Y,3Y,5Y)$.

### Why it is designed this way
- A single scalar $\alpha$ is an interpretable “regime tilt” on migration intensity.
- Multi-horizon fitting prevents the model from matching one tenor while distorting others.
- The vector prior stabilizes estimates when quotes are missing, sparse, or low confidence.

In [4]:
# ============================================================
# SECTION 1 — Structured issuer vectors + alpha estimators
# ============================================================

def Normalize_Rows(matrix_x: np.ndarray) -> np.ndarray:
    """
    Z-score each column across rows.
    NOTE: Despite the name, this normalizes features per column (not row norms).
    """
    mean_vec = matrix_x.mean(axis=0, keepdims=True)
    std_vec = matrix_x.std(axis=0, keepdims=True) + 1e-12
    return (matrix_x - mean_vec) / std_vec


def Cosine_Sim(vector_a: np.ndarray, matrix_b: np.ndarray) -> np.ndarray:
    """
    Compute cosine similarity between vector_a and each row of matrix_b.
    """
    a_norm = np.linalg.norm(vector_a) + 1e-12
    b_norm = np.linalg.norm(matrix_b, axis=1) + 1e-12
    return (matrix_b @ vector_a) / (b_norm * a_norm)


def Knn_Prior_Alpha(
    issuer_embed: np.ndarray,
    embeds_obs: np.ndarray,
    alpha_obs: np.ndarray,
    k: int = 20,
    tau: float = 0.12,
) -> float:
    """
    KNN prior:
      - find top-k cosine similarities
      - compute softmax weights exp(sim/tau)
      - return weighted average of observed alpha anchors
    """
    sims = Cosine_Sim(issuer_embed, embeds_obs)
    idx = np.argsort(-sims)[:k]
    top_sims = sims[idx]

    weights = np.exp(top_sims / max(tau, 1e-6))
    weights = weights / (weights.sum() + 1e-12)

    return float((weights * alpha_obs[idx]).sum())


def Build_Structured_Issuer_Universe(generator_base: np.ndarray, seed: int = 7, num_issuers: int = 400) -> Dict[str, np.ndarray]:
    """
    Create a synthetic issuer universe with:
      - categorical sector
      - numeric leverage, equity vol, liquidity proxy
      - discrete rating state
      - partially missing observed PD(5Y) to simulate sparse/illiquid markets
    """
    rng = np.random.default_rng(seed)

    issuer_names = ["ACME", "BETA", "GAMMA", "DELTA", "OMEGA"]
    issuer_names += [f"ISS{i:03d}" for i in range(num_issuers - len(issuer_names))]

    sector = rng.integers(0, 3, size=num_issuers)
    sector_one_hot = np.eye(3)[sector]
    rating_state = rng.choice([0, 1, 2], size=num_issuers, p=[0.45, 0.35, 0.20])

    leverage = rng.normal(loc=2.5, scale=0.7, size=num_issuers).clip(0.5, 6.0)
    equity_vol = rng.normal(loc=0.30, scale=0.10, size=num_issuers).clip(0.05, 0.80)
    liquidity = rng.normal(loc=1.0, scale=0.4, size=num_issuers).clip(0.1, 3.0)

    sector_effect = np.array([0.00, 0.10, 0.25])[sector]
    latent = rng.normal(0.0, 1.0, size=num_issuers)

    # Synthetic "true" alphas; exponential ensures alpha>0
    log_alpha_true = -0.1 + 0.35 * (leverage - 2.5) + 0.9 * (equity_vol - 0.30) + 0.25 * latent + sector_effect
    alpha_true = np.exp(log_alpha_true).clip(0.01, 5.0)

    pd_true_5y = np.array([Pd_Tm(generator_base, alpha_true[i], 5.0, int(rating_state[i])) for i in range(num_issuers)])

    # Observation noise increases as liquidity falls (toy)
    noise_sd = 0.003 + 0.015 * (1.0 / liquidity)
    pd_obs_5y = (pd_true_5y + rng.normal(0.0, noise_sd, size=num_issuers)).clip(1e-5, 0.95)

    # Missingness probability increases as liquidity falls
    p_missing = (0.10 + 0.55 * (1.0 / (liquidity + 0.2))).clip(0.10, 0.80)
    missing_mask = rng.uniform(0.0, 1.0, size=num_issuers) < p_missing
    pd_obs_5y[missing_mask] = np.nan

    feature_matrix = np.column_stack([sector_one_hot, leverage, equity_vol, liquidity, rating_state.astype(float)])
    embeds = Normalize_Rows(feature_matrix)

    return {
        "issuer_names": np.array(issuer_names),
        "rating_state": rating_state,
        "pd_obs_5y": pd_obs_5y,
        "embeds": embeds,
    }


def Fit_Alpha_From_Pds_Multi(
    generator_base: np.ndarray,
    pd_targets: Dict[float, float],
    rating_state: int,
    alpha_grid: np.ndarray,
    weights: Optional[Dict[float, float]] = None,
) -> float:
    """
    Grid-search alpha to minimize multi-horizon SSE:
      sum_T w_T (PD_TM(T|alpha) - PD_target(T))^2
    """
    horizons = sorted(pd_targets.keys())
    wts = weights or {h: 1.0 for h in horizons}

    sse_list: List[float] = []
    for alpha in alpha_grid:
        sse = 0.0
        for horizon in horizons:
            pd_obs = float(np.clip(pd_targets[horizon], 1e-8, 0.999999))
            pd_model = Pd_Tm(generator_base, float(alpha), float(horizon), int(rating_state))
            sse += float(wts.get(horizon, 1.0)) * (pd_model - pd_obs) ** 2
        sse_list.append(float(sse))

    sse_arr = np.array(sse_list, dtype=float)
    return float(alpha_grid[int(np.argmin(sse_arr))])


def Fit_Alpha_Map_Multi(
    generator_base: np.ndarray,
    pd_targets: Dict[float, float],
    rating_state: int,
    alpha_prior: float,
    alpha_grid: np.ndarray,
    lam: float,
    weights: Optional[Dict[float, float]] = None,
) -> float:
    """
    MAP estimate via grid search:
      sum_T w_T (PD_TM - PD_target)^2 + lam (alpha - alpha_prior)^2
    """
    horizons = sorted(pd_targets.keys())
    wts = weights or {h: 1.0 for h in horizons}

    obj_list: List[float] = []
    for alpha in alpha_grid:
        sse = 0.0
        for horizon in horizons:
            pd_obs = float(np.clip(pd_targets[horizon], 1e-8, 0.999999))
            pd_model = Pd_Tm(generator_base, float(alpha), float(horizon), int(rating_state))
            sse += float(wts.get(horizon, 1.0)) * (pd_model - pd_obs) ** 2

        # Quadratic prior penalty
        sse += float(lam) * (float(alpha) - float(alpha_prior)) ** 2
        obj_list.append(float(sse))

    obj_arr = np.array(obj_list, dtype=float)
    return float(alpha_grid[int(np.argmin(obj_arr))])


def Build_Alpha_Estimators(generator_base: np.ndarray, universe: Dict[str, np.ndarray]):
    """
    Returns:
      Estimate_Alpha_Naive(issuer, pd_targets) -> alpha
      Estimate_Alpha_Vector(issuer, pd_targets, confidence) -> alpha
      Get_Rating_State(issuer) -> int
      known_issuers -> set[str]
    """
    issuer_names = universe["issuer_names"]
    rating_state_arr = universe["rating_state"]
    pd_obs_5y = universe["pd_obs_5y"]
    embeds = universe["embeds"]

    alpha_grid = np.linspace(0.01, 5.0, 800)

    # Anchor alphas inferred from PD(5Y) for issuers with observed PD(5Y)
    alpha_anchor = np.full(len(issuer_names), np.nan)
    for i in range(len(issuer_names)):
        if np.isfinite(pd_obs_5y[i]):
            alpha_anchor[i] = Fit_Alpha_From_Pds_Multi(
                generator_base=generator_base,
                pd_targets={5.0: float(pd_obs_5y[i])},
                rating_state=int(rating_state_arr[i]),
                alpha_grid=alpha_grid,
            )

    # Naive fallback: rating-mean alpha when no PD targets are available
    alpha_rating_mean: Dict[int, float] = {}
    for r in [0, 1, 2]:
        mask = np.isfinite(alpha_anchor) & (rating_state_arr == r)
        alpha_rating_mean[r] = float(np.nanmean(alpha_anchor[mask])) if mask.any() else 1.0

    obs_idx = np.where(np.isfinite(alpha_anchor))[0]
    embeds_obs = embeds[obs_idx]
    alpha_obs_anchor = alpha_anchor[obs_idx]

    # Hyperparameters (demo)
    base_weights = {1.0: 0.7, 3.0: 0.9, 5.0: 1.0}
    lam0 = 0.002
    band0 = 0.02
    knn_k = 20
    softmax_tau = 0.12

    issuer_to_idx = {str(n): i for i, n in enumerate(issuer_names)}

    def Estimate_Alpha_Naive(issuer: str, pd_targets: Optional[Dict[float, float]]) -> float:
        idx = issuer_to_idx.get(issuer, None)
        rating_state = int(rating_state_arr[idx]) if idx is not None else 1

        if pd_targets:
            return Fit_Alpha_From_Pds_Multi(generator_base, pd_targets, rating_state, alpha_grid, weights=base_weights)

        return float(alpha_rating_mean[rating_state])

    def Estimate_Alpha_Vector(
        issuer: str,
        pd_targets: Optional[Dict[float, float]],
        confidence: float = 1.0,
        debug: bool = False,
    ) -> float:
        """
        Vector-stabilized alpha:
          - Compute peer prior alpha_prior via KNN in embedding space
          - If PD targets exist: compute market-fit alpha_mkt
          - Solve MAP objective with confidence-scaled weights and confidence-scaled regularization strength
        """
        confidence = float(np.clip(confidence, 1e-3, 1.0))

        idx = issuer_to_idx.get(issuer, None)
        if idx is None:
            return float(np.nanmean(alpha_obs_anchor))

        rating_state = int(rating_state_arr[idx])
        alpha_prior = Knn_Prior_Alpha(embeds[idx], embeds_obs, alpha_obs_anchor, k=knn_k, tau=softmax_tau)

        if not pd_targets:
            if debug:
                print(f"DEBUG alpha_prior={alpha_prior:.4f} (no PD targets)")
            return float(alpha_prior)

        alpha_mkt = Fit_Alpha_From_Pds_Multi(generator_base, pd_targets, rating_state, alpha_grid, weights=base_weights)

        # Down-weight market term when confidence is low, so the prior can matter.
        weights_eff = {T: base_weights.get(T, 1.0) * confidence for T in pd_targets.keys()}

        # Stronger shrinkage when confidence is low (pull harder to peer prior)
        lam = (lam0 / max(len(pd_targets), 1)) * (1.0 / confidence)

        # Constrain search window: avoid extreme alpha jumps; widen window as confidence falls
        # confidence is clipped to (0, 1], so band0 / confidence is never below band0.
        band_eff = min(0.25, band0 / confidence)
        lo = min(alpha_mkt, alpha_prior) * (1.0 - band_eff)
        hi = max(alpha_mkt, alpha_prior) * (1.0 + band_eff)

        local_grid = alpha_grid[(alpha_grid >= lo) & (alpha_grid <= hi)]
        if local_grid.size < 10:
            local_grid = alpha_grid

        alpha_map = Fit_Alpha_Map_Multi(
            generator_base=generator_base,
            pd_targets=pd_targets,
            rating_state=rating_state,
            alpha_prior=float(alpha_prior),
            alpha_grid=local_grid,
            lam=float(lam),
            weights=weights_eff,
        )

        if debug:
            print(
                f"DEBUG alpha_prior={alpha_prior:.4f} alpha_mkt={alpha_mkt:.4f} "
                f"conf={confidence:.2f} band={band_eff:.3f} lam={lam:.6f} alpha_map={alpha_map:.4f}"
            )

        return float(alpha_map)

    def Get_Rating_State(issuer: str) -> int:
        idx = issuer_to_idx.get(issuer, None)
        return int(rating_state_arr[idx]) if idx is not None else 1

    known_issuers = set(map(str, issuer_names))
    return Estimate_Alpha_Naive, Estimate_Alpha_Vector, Get_Rating_State, known_issuers


## Section 2 — “Vector platform” layer: routing and precedent retrieval

### Objective
Given an inbound unstructured message, determine:
1) which workflow to run (routing), and  
2) which prior cases are most similar (precedent retrieval).

### Mathematical framework

#### Text embeddings
Each message $m$ is embedded into a vector $e(m)\in\mathbb{R}^d$ using a sentence transformer.

Embeddings are L2-normalized, so the inner product equals cosine similarity:
$$
\mathrm{sim}(a,b) = e(a)^\top e(b).
$$
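
As a quick check of the normalization claim, the sketch below uses plain Python with two made-up 3-dimensional vectors in place of real sentence embeddings. `l2_normalize` and `dot` are illustrative helpers, not functions from this notebook. It shows that once both vectors have unit length, their inner product matches the cosine similarity of the raw vectors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy stand-ins for e(a) and e(b) before normalization (made-up numbers).
raw_a = [3.0, 4.0, 0.0]
raw_b = [4.0, 3.0, 0.0]

# Cosine similarity computed directly on the raw vectors.
cosine_raw = dot(raw_a, raw_b) / (math.sqrt(dot(raw_a, raw_a)) * math.sqrt(dot(raw_b, raw_b)))

# Inner product after L2 normalization.
inner_normalized = dot(l2_normalize(raw_a), l2_normalize(raw_b))

print(round(cosine_raw, 6), round(inner_normalized, 6))
```

This is why an inner-product index is sufficient for cosine search as long as every vector added to it, and every query, is normalized the same way.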

#### Nearest-neighbor retrieval (FAISS)
Given workflow descriptions $\{w_j\}$ and case texts $\{c_k\}$, retrieval selects top matches:
$$
\hat j = \arg\max_j e(m)^\top e(w_j),
\qquad
\hat k_1,\hat k_2 = \text{top-2 over } e(m)^\top e(c_k).
$$
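
The arg-max and top-2 selections above amount to an exhaustive scan over inner products, which is what a flat (exact) inner-product index computes. The sketch below reproduces that scan in plain Python on made-up 2-dimensional unit vectors; `top_k_inner_product` is a hypothetical helper for illustration, not part of the notebook or of FAISS.

```python
def top_k_inner_product(query, corpus, k):
    """Return [(index, score), ...] for the k corpus rows with the largest inner product."""
    scores = [(i, sum(q * c for q, c in zip(query, row))) for i, row in enumerate(corpus)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Made-up unit vectors standing in for workflow embeddings e(w_j).
workflow_vecs = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.6, 0.8],
    [-1.0, 0.0],
]
# Made-up unit vector standing in for the message embedding e(m).
message_vec = [0.8, 0.6]

ranked = top_k_inner_product(message_vec, workflow_vecs, k=2)
best_index, best_score = ranked[0]  # routing choice; ranked[1] is the runner-up
print(ranked)
```

With only a handful of workflows and cases, an exhaustive scan like this is cheap; an index structure mainly pays off as the case history grows.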

### What the code does / produces
- Builds FAISS indices for:
  - workflow routing labels/descriptions,
  - historical cases.
- For each inbound message:
  - prints top workflow route and a runner-up,
  - prints two most similar cases and similarity scores.

### Why it is designed this way
- It emulates a “routing layer” that decides what downstream processing is appropriate.
- Precedent retrieval provides explainability and governance context (“similar cases were handled this way”).


In [5]:
# ============================================================
# SECTION 2 — Vector-like platform layer: routing + retrieval
# ============================================================

WORKFLOWS = [
    {"name": "CDS_TRADE_CAPTURE", "desc": "Capture CDS trade, extract terms, validate, generate booking payload."},
    {"name": "CDS_CURVE_CALIBRATION", "desc": "Calibrate/update CDS curve; handle missing tenors; assess liquidity; store snapshot."},
    {"name": "TRANSITION_TILT_UPDATE", "desc": "Link CDS-implied PD to transition model; fit/regularize alpha tilt; log evidence."},
    {"name": "CREDIT_RISK_EXCEPTION", "desc": "Handle anomalies: outlier quotes, stale markets, inconsistent curve, escalation and case mgmt."},
]

HISTORICAL_CASES = [
    {"case_id": "C001", "text": "ACME 5Y widened post-earnings; illiquid tape; used peer-regularized tilt; documented override."},
    {"case_id": "C002", "text": "BETA missing 1Y/3Y; inferred short end from similar issuers; curve smoothing applied; governance note filed."},
    {"case_id": "C003", "text": "GAMMA HY; regime stress; transition tilt increased consistent with indices; no exception raised."},
    {"case_id": "C004", "text": "DELTA quote spike vs peers; suspect bad tick; opened exception; requested broker validation."},
]


def Embed_Texts(model, texts: List[str]) -> np.ndarray:
    """
    Return normalized float32 embeddings suitable for inner-product (cosine) search.
    """
    return model.encode(texts, convert_to_numpy=True, normalize_embeddings=True).astype("float32")


def Build_Faiss_Ip_Index(embeddings: np.ndarray) -> faiss.IndexFlatIP:
    """
    Build a FAISS inner-product index:
      similarity(x, y) = x^T y (when embeddings are normalized, this is cosine similarity).
    """
    dim = embeddings.shape[1]
    index = faiss.IndexFlatIP(dim)
    index.add(embeddings)
    return index


def Extract_Message_Fields(message_text: str, known_issuers: Set[str]) -> Dict[str, object]:
    """
    Normalize message fields and include parsed quotes when available.

    Behavior:
      - If quotes exist: tenor_years = max tenor; spread_bp = spread at max tenor; quotes included
      - Else: fallback regex tenor/spread if present
    """
    stop_tokens = {
        "NOTE", "MATRIX", "CASE", "CALIBRATION", "UPDATE", "SPREAD", "CURVE", "PEERS",
        "CDS", "PD", "USD", "EUR", "GBP", "JPY", "PLEASE", "BUY", "SELL"
    }

    tokens = re.findall(r"\b[A-Z][A-Z0-9_\-]{2,}\b", message_text.upper())
    issuer = next((tok for tok in tokens if tok not in stop_tokens and tok in known_issuers), "UNKNOWN")

    ccy_match = re.search(r"\b(USD|EUR|GBP|JPY)\b", message_text.upper())
    ccy = ccy_match.group(1) if ccy_match else "USD"

    spreads_num = Parse_Spreads_By_Tenor(message_text)
    quotes = {f"{int(k)}Y": float(v) for k, v in sorted(spreads_num.items())} if spreads_num else None

    if spreads_num:
        max_tenor = float(max(spreads_num.keys()))
        tenor_years = int(max_tenor)
        spread_bp = float(spreads_num[max_tenor])
    else:
        tenor_match = re.search(r"(\d+)\s*Y", message_text, re.IGNORECASE)
        spread_match = re.search(r"(\d+(?:\.\d+)?)\s*bp", message_text, re.IGNORECASE)
        tenor_years = int(tenor_match.group(1)) if tenor_match else None
        spread_bp = float(spread_match.group(1)) if spread_match else None

    out: Dict[str, object] = {
        "issuer": issuer,
        "tenor_years": tenor_years,
        "spread_bp": spread_bp,
        "ccy": ccy,
        "raw_text": message_text,
    }
    if quotes is not None:
        out["quotes"] = quotes

    return out

## Section 3 — End-to-end orchestration: ingestion → calibration → tilt → governance → reporting

### Objective
Run a unified loop over inbound messages that:
1) routes and retrieves precedents,  
2) extracts structured fields and CDS quotes,  
3) calibrates hazards and PD targets (or reuses memory),  
4) estimates tilts (naive and vector) with governance gating,  
5) reports final multi-horizon transition-model PDs.

### Mathematical framework

#### Field extraction
From message text, extract:
- issuer identifier,
- currency,
- quotes $\{T \mapsto s(T)\}$, and summary fields:
  - $\text{tenor\_years}=\max(T)$ if quotes exist,
  - $\text{spread\_bp}=s(\max(T))$ if quotes exist.
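
A minimal sketch of the quote map and its summary fields, assuming quotes appear in the form `5Y=120bp`. The names `parse_tenor_quotes` and `summarize_quotes` are hypothetical; the notebook's own `Parse_Spreads_By_Tenor` is defined elsewhere and may accept other formats (for example `5Y CDS @ 120bp`).

```python
import re


def parse_tenor_quotes(text):
    """Hypothetical parser: collect 'NY=XXbp' pairs into {tenor_years: spread_bp}."""
    pattern = re.compile(r"(\d+)\s*Y\s*=\s*(\d+(?:\.\d+)?)\s*bp", re.IGNORECASE)
    return {float(tenor): float(spread) for tenor, spread in pattern.findall(text)}


def summarize_quotes(quotes):
    """Return (tenor_years, spread_bp) at the longest quoted tenor, or (None, None)."""
    if not quotes:
        return None, None
    max_tenor = max(quotes)
    return int(max_tenor), quotes[max_tenor]


message = "ACME curve: 1Y=80bp 3Y=105bp 5Y=120bp USD; please update curve + tilt."
quotes = parse_tenor_quotes(message)
tenor_years, spread_bp = summarize_quotes(quotes)
print(quotes, tenor_years, spread_bp)
```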

#### Confidence scoring (heuristic)
A confidence scalar $c\in(0,1]$ is computed based on:
- whether quotes are fresh vs reused,
- whether terms indicate illiquidity/staleness,
- number of tenors observed.

This confidence drives:
- whether parameters are stored (governance),
- prior strength in $\alpha$ estimation,
- clipping band around market-fit $\alpha$.
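
As a worked example, take the constants used in `Confidence_From_Message` in the code cell below: baseline $0.60$, factor $0.25$ for an illiquidity keyword, factor $0.70$ for the word "missing", and $+0.20$ each for reaching two and three tenors. Assuming the parser finds one tenor in the "liquidity poor / missing 1Y" calibration note and three tenors in the full-curve message:

$$
c_{\text{illiquid}} = 0.60 \times 0.25 \times 0.70 = 0.105 < 0.50,
\qquad
c_{\text{3 tenors}} = \min(1,\; 0.60 + 0.20 + 0.20) = 1.00 \ge 0.50.
$$

With the storage threshold $c_{\min}=0.50$, the first message would be calibrated for reporting only, while the second would also update the stored alphas.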

#### Governance gating / “memory”
The script maintains issuer-level memory:
- last hazard curve $(\{T_k\},\{\gamma_k\})$,
- last stored $\alpha_{\text{naive}}$ and $\alpha_{\text{vector}}$.

Policy:
1) If **exception**: open a case and skip all curve/alpha updates.
2) If **no new quotes** but curve memory and stored alphas exist: reuse both (no drift).
3) If **PD targets are available** (fresh quotes, or a reused curve with no stored alphas yet): compute alphas for reporting; store only if $c \ge c_{\min}$.
4) If **no quotes and no curve memory**: fall back to rating mean (naive) / peer prior (vector).
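
The policy can be read as a small decision function. The sketch below is illustrative only: `choose_alpha_action` and its string labels are hypothetical and do not appear in the notebook, and `has_pd_targets` is assumed to be true whenever fresh quotes were calibrated or a remembered curve supplied PD targets.

```python
def choose_alpha_action(is_exception, has_pd_targets, reused_curve, has_stored_alphas,
                        confidence, conf_store_min=0.50):
    """Return a short label for the branch the governance policy would take."""
    if is_exception:
        return "skip_updates"
    if reused_curve and has_stored_alphas:
        return "reuse_stored_alphas"
    if has_pd_targets:
        # Alphas are always computed for reporting; storage is confidence-gated.
        return "calibrate_and_store" if confidence >= conf_store_min else "calibrate_report_only"
    return "fallback_prior"


scenarios = {
    "exception": choose_alpha_action(True, True, False, True, 0.90),
    "fresh quotes, good quality": choose_alpha_action(False, True, False, False, 0.60),
    "fresh quotes, illiquid": choose_alpha_action(False, True, False, True, 0.105),
    "no quotes, memory available": choose_alpha_action(False, True, True, True, 0.60),
    "no quotes, no memory": choose_alpha_action(False, False, False, False, 0.60),
}
for label, action in scenarios.items():
    print(f"{label}: {action}")
```

Checking the exception branch first means a suspect quote can never reach the calibration or storage steps, whatever its confidence score.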

#### Guardrail around market-fit alpha
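When PD targets exist, the vector alpha is clipped to a band around the market-fit estimate $\alpha_{\text{mkt}}$ (the naive fit):
$$
\alpha_{\text{vector}} \leftarrow \min\Bigl(\max(\alpha_{\text{vector}},\;\alpha_{\text{mkt}}(1-b)),\;\alpha_{\text{mkt}}(1+b)\Bigr),
$$
with band $b(c) = b_{\min} + (1-c)\,(b_{\max}-b_{\min})$, which widens linearly as confidence decreases (the demo uses $b_{\min}=0.03$ and $b_{\max}=0.20$).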
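
To make the guardrail concrete, the sketch below evaluates the band and the clip for a few confidence levels. It assumes the linear band used by `Band_From_Confidence` in the code cell below (minimum 0.03, maximum 0.20); the alpha values are made up for illustration, and the lower-case function names are hypothetical.

```python
def band_from_confidence(confidence, band_min=0.03, band_max=0.20):
    """Linear band: band_min at confidence 1, approaching band_max as confidence falls to 0."""
    return band_min + (1.0 - confidence) * (band_max - band_min)


def clip_to_market_band(alpha_vector, alpha_mkt, band):
    """Clamp alpha_vector into [alpha_mkt * (1 - band), alpha_mkt * (1 + band)]."""
    lo, hi = alpha_mkt * (1.0 - band), alpha_mkt * (1.0 + band)
    return min(max(alpha_vector, lo), hi)


alpha_mkt = 1.00         # made-up market-fit alpha
alpha_vector_raw = 1.25  # made-up MAP estimate pulled toward a higher peer prior

for conf in (1.0, 0.6, 0.105):
    band = band_from_confidence(conf)
    clipped = clip_to_market_band(alpha_vector_raw, alpha_mkt, band)
    print(f"conf={conf:.3f} band={band:.4f} alpha_vector={clipped:.4f}")
```

The lower the confidence in the quote, the further the peer prior is allowed to pull the reported alpha away from the market fit, but never beyond the maximum band.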

### What the code does / produces
For each inbound message, it prints a complete trace including:
- routing choice + similar precedents,
- extracted fields (including quotes if present),
- CDS calibration (hazards) and PD targets (or reuse),
- confidence,
- $\alpha_{\text{naive}}$ and $\alpha_{\text{vector}}$ with governance note,
- final $\mathrm{PD}_{\mathrm{TM}}(1Y)$, $\mathrm{PD}_{\mathrm{TM}}(3Y)$, $\mathrm{PD}_{\mathrm{TM}}(5Y)$ for both tilts.

### Why it is designed this way
- It demonstrates an integrated “platform-like” workflow: *interpret → route → calibrate → link models → govern → report*.
- Memory reuse reduces churn and prevents parameter drift in illiquid markets.
- Confidence gating enforces a basic model risk management control: low-quality quotes influence reporting but do not overwrite stored alphas. Note that the hazard-curve memory is refreshed whenever quotes are parsed, regardless of confidence.


In [6]:
# ============================================================
# SECTION 3 — Unified end-to-end run (multi-horizon linkage)
# ============================================================

def Main():
    # ------------------------------------------------------------
    # 0) Transition model base (annual TM -> generator_base)
    # ------------------------------------------------------------
    _, generator_base = Build_Q_Base_From_Transition_Data(use_estimated=True)

    # ------------------------------------------------------------
    # 1) Universe + alpha estimators
    # ------------------------------------------------------------
    universe = Build_Structured_Issuer_Universe(generator_base=generator_base, seed=7, num_issuers=400)
    estimate_alpha_naive, estimate_alpha_vector, get_rating_state, known_issuers = Build_Alpha_Estimators(generator_base, universe)

    # ------------------------------------------------------------
    # 2) Routing + precedent retrieval indices
    # ------------------------------------------------------------
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    workflow_texts = [f"{w['name']}: {w['desc']}" for w in WORKFLOWS]
    workflow_index = Build_Faiss_Ip_Index(Embed_Texts(model, workflow_texts))

    case_texts = [c["text"] for c in HISTORICAL_CASES]
    case_index = Build_Faiss_Ip_Index(Embed_Texts(model, case_texts))

    # ------------------------------------------------------------
    # 3) CDS inputs
    # ------------------------------------------------------------
    discount_curve = Flat_Discount_Curve(rate_r=0.03)
    recovery_r = 0.40

    inbound_messages = [
        "Buy 50mm ACME 5Y CDS @ 120bp USD. Please book and confirm.",
        "Calibration note: ACME curve missing 1Y; 5Y=120bp; liquidity poor.",
        "We need to update transition matrix tilt for ACME using latest 5Y spread; align PD to market.",
        "Spread looks wrong: DELTA quoted 1200bp but peers ~140bp; suspect bad tick. Raise exception case.",
        "Update tilt for BETA but no reliable CDS quote today; use peers/regime-consistent prior and log evidence.",
        "ACME curve: 1Y=80bp 3Y=105bp 5Y=120bp USD; please update curve + tilt.",
    ]

    # ------------------------------------------------------------
    # 4) Memory: last good curve + last approved alphas
    # ------------------------------------------------------------
    last_hazards: Dict[str, Tuple[List[float], List[float]]] = {}
    last_alpha_naive: Dict[str, float] = {}
    last_alpha_vector: Dict[str, float] = {}

    report_horizons = [1.0, 3.0, 5.0]

    # Governance thresholds
    conf_store_min = 0.50
    conf_reused_curve = 0.60

    # Guardrail around market-fit alpha when quote exists
    band_min = 0.03
    band_max = 0.20

    def Band_From_Confidence(confidence: float) -> float:
        return float(band_min + (1.0 - confidence) * (band_max - band_min))

    def Confidence_From_Message(
        msg: str,
        fresh_market: bool,
        reused_curve: bool,
        num_tenors: int,
    ) -> float:
        """
        Heuristic confidence score:
          - reused curve => fixed medium confidence
          - if market quote exists, start at baseline and penalize illiquidity/missing keywords
          - reward more tenors (more constraints => more stable hazard curve)
        """
        if reused_curve:
            return float(conf_reused_curve)
        if not fresh_market:
            return 0.60

        conf = 0.60

        if re.search(r"\billiquid\b|\bliquidity\s+poor\b|\bstale\b|\bthin\b", msg, flags=re.IGNORECASE):
            conf *= 0.25
        if re.search(r"\bmissing\b", msg, flags=re.IGNORECASE):
            conf *= 0.70

        if num_tenors >= 2:
            conf = min(1.0, conf + 0.20)
        if num_tenors >= 3:
            conf = min(1.0, conf + 0.20)

        return float(np.clip(conf, 0.05, 1.00))

    print("\n" + "#" * 98)
    print("# Unified Demo: multi-horizon CDS↔transition tilt, stabilized by vector prior, embedded in routing layer")
    print("#" * 98)

    for msg in inbound_messages:
        # ----------------------------
        # A) Route
        # ----------------------------
        query_emb = Embed_Texts(model, [msg])
        wf_scores, wf_ids = workflow_index.search(query_emb, k=2)

        chosen_workflow = WORKFLOWS[int(wf_ids[0][0])]
        alt_workflow = WORKFLOWS[int(wf_ids[0][1])]

        # ----------------------------
        # B) Extract structured fields
        # ----------------------------
        fields = Extract_Message_Fields(msg, known_issuers)
        issuer = str(fields["issuer"])
        rating_state = int(get_rating_state(issuer))

        # ----------------------------
        # C) Retrieve similar cases
        # ----------------------------
        case_scores, case_ids = case_index.search(query_emb, k=2)

        print("\n" + "=" * 98)
        print("INBOUND:", msg)
        print(f"ROUTE: {chosen_workflow['name']} (score={wf_scores[0][0]:.3f}) | next: {alt_workflow['name']} (score={wf_scores[0][1]:.3f})")

        printable_fields = {k: fields[k] for k in ["issuer", "tenor_years", "spread_bp", "ccy"] if k in fields}
        if "quotes" in fields:
            printable_fields["quotes"] = fields["quotes"]
        print("FIELDS:", printable_fields)

        print("SIMILAR CASES:")
        for rank in range(2):
            case = HISTORICAL_CASES[int(case_ids[0][rank])]
            print(f"  - {case['case_id']} (sim={case_scores[0][rank]:.3f}): {case['text']}")

        # ----------------------------
        # D) Exception gating
        # ----------------------------
        if chosen_workflow["name"] == "CREDIT_RISK_EXCEPTION":
            print("ACTION: exception workflow — do not update curve/alpha; open case and validate the quote.")
            continue

        # ----------------------------
        # E) CDS curve calibration / PD targets
        # ----------------------------
        spreads = Parse_Spreads_By_Tenor(msg)

        # If no explicit quotes were parsed, fall back to extracted tenor/spread (if present).
        if not spreads and fields.get("spread_bp") is not None and fields.get("tenor_years") is not None:
            spreads = {float(fields["tenor_years"]): float(fields["spread_bp"])}

        fresh_market = False
        reused_curve = False
        pd_targets: Optional[Dict[float, float]] = None

        if spreads:
            fresh_market = True
            tenors, gammas = Calibrate_Piecewise_Hazards(spreads, discount_curve=discount_curve, recovery_r=recovery_r)
            last_hazards[issuer] = (tenors, gammas)

            # PD targets are built only at the quoted tenors; PD_TM at the reporting horizons is computed in step H.
            pd_targets = {float(T): Cds_Pd_From_Hazards(tenors, gammas, float(T)) for T in sorted(spreads.keys())}

            curve_str = ", ".join([f"{int(t)}Y:{spreads[t]:.0f}bp" for t in sorted(spreads)])
            print(f"CDS calibration: spreads=({curve_str}) => hazards(gamma)={np.round(gammas, 4).tolist()}")

            if len(spreads) == 1:
                only_T = float(next(iter(spreads.keys())))
                print(f"CDS-implied PD targets: PD({int(only_T)}Y)≈{pd_targets[only_T]:.2%}")

                # Single tenor => one flat hazard; the last gamma also extends beyond the last tenor,
                # so CDS-implied PDs at all reporting horizons follow from it.
                pd_extrap = {T: Cds_Pd_From_Hazards(tenors, gammas, float(T)) for T in report_horizons}
                print("CDS-implied PD (flat-hazard extrapolation): " + ", ".join([f"PD({int(T)}Y)≈{pd_extrap[T]:.2%}" for T in report_horizons]))
            else:
                print("CDS-implied PD targets: " + ", ".join([f"PD({int(T)}Y)≈{pd_targets[T]:.2%}" for T in sorted(pd_targets.keys())]))

        else:
            if issuer in last_hazards:
                reused_curve = True
                tenors, gammas = last_hazards[issuer]

                # When reusing curve, build PD targets for reporting horizons (1/3/5)
                pd_targets = {float(T): Cds_Pd_From_Hazards(tenors, gammas, float(T)) for T in report_horizons}

                print("CDS calibration: MISSING spreads => reused last hazard curve")
                print("CDS-implied PD targets: " + ", ".join([f"PD({int(T)}Y)≈{pd_targets[T]:.2%}" for T in report_horizons]))
            else:
                print("CDS calibration: MISSING spreads => PD targets unavailable (curve memory empty).")

        # ----------------------------
        # F) Confidence score
        # ----------------------------
        conf = Confidence_From_Message(
            msg,
            fresh_market=fresh_market,
            reused_curve=reused_curve,
            num_tenors=len(spreads) if spreads else 0,
        )
        conf_note = " (reused curve)" if reused_curve else ""
        print(f"DATA QUALITY: confidence≈{conf:.2f}{conf_note}")

        # ----------------------------
        # G) Alpha estimation policy (governance + memory)
        # ----------------------------
        if reused_curve and (issuer in last_alpha_naive) and (issuer in last_alpha_vector):
            alpha_naive = float(last_alpha_naive[issuer])
            alpha_vector = float(last_alpha_vector[issuer])
            alpha_note = "Missing quotes => reused last hazard curve AND last stored alphas (no parameter drift)."

        elif pd_targets:
            # Naive is market-fit; used as "mkt" reference for banding
            alpha_naive = float(estimate_alpha_naive(issuer, pd_targets))
            alpha_mkt = float(alpha_naive)

            alpha_vector_raw = float(estimate_alpha_vector(issuer, pd_targets, confidence=conf))

            # Clip to band around market-fit alpha to remain market-respecting.
            band = Band_From_Confidence(conf)
            lo, hi = alpha_mkt * (1.0 - band), alpha_mkt * (1.0 + band)
            alpha_vector = float(np.clip(alpha_vector_raw, lo, hi))

            # Store only if confidence passes governance threshold.
            # PD targets may come from fresh quotes or from a reused curve; label the note accordingly.
            source = "Fresh market quotes" if fresh_market else "Reused hazard curve"
            if conf >= conf_store_min:
                last_alpha_naive[issuer] = alpha_naive
                last_alpha_vector[issuer] = alpha_vector
                alpha_note = f"{source} => calibrated and stored alphas (confidence-gated)."
            else:
                alpha_note = f"{source} (low confidence) => calibrated for reporting, but NOT stored (governance gating)."

        else:
            # No PD targets => fall back (naive: rating mean; vector: peer prior)
            alpha_naive = float(estimate_alpha_naive(issuer, None))
            alpha_vector = float(estimate_alpha_vector(issuer, None, confidence=conf))
            alpha_note = "No quotes and no curve memory => naive rating-mean; vector peer-conditioned prior."

        # ----------------------------
        # H) Report PD_TM under both alphas
        # ----------------------------
        pd_tm_naive = {T: Pd_Tm(generator_base, alpha_naive, float(T), rating_state) for T in report_horizons}
        pd_tm_vector = {T: Pd_Tm(generator_base, alpha_vector, float(T), rating_state) for T in report_horizons}

        print("LINK (Transition Tilt) RESULTS:")
        print(f"  rating_state={rating_state} (0=IG,1=BBB-ish,2=HY)")
        print(f"  alpha_naive={alpha_naive:.3f} | " + ", ".join([f"PD_TM({int(T)}Y)={pd_tm_naive[T]:.2%}" for T in report_horizons]))
        print(f"  alpha_vector={alpha_vector:.3f} | " + ", ".join([f"PD_TM({int(T)}Y)={pd_tm_vector[T]:.2%}" for T in report_horizons]))
        print(f"  Note: {alpha_note}")


if __name__ == "__main__":
    Main()


##################################################################################################
# Unified Demo: multi-horizon CDS↔transition tilt, stabilized by vector prior, embedded in routing layer
##################################################################################################

INBOUND: Buy 50mm ACME 5Y CDS @ 120bp USD. Please book and confirm.
ROUTE: CDS_CURVE_CALIBRATION (score=0.386) | next: CDS_TRADE_CAPTURE (score=0.361)
FIELDS: {'issuer': 'ACME', 'tenor_years': 5, 'spread_bp': 120.0, 'ccy': 'USD', 'quotes': {'5Y': 120.0}}
SIMILAR CASES:
  - C001 (sim=0.430): ACME 5Y widened post-earnings; illiquid tape; used peer-regularized tilt; documented override.
  - C002 (sim=0.186): BETA missing 1Y/3Y; inferred short end from similar issuers; curve smoothing applied; governance note filed.
CDS calibration: spreads=(5Y:120bp) => hazards(gamma)=[0.02]
CDS-implied PD targets: PD(5Y)≈9.49%
CDS-implied PD (flat-hazard extrapolation): PD(1Y)≈1.98%, PD(3Y)≈5.81%, PD(5Y)≈

## How to read the printed output

For each inbound message, the notebook prints:

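- **ROUTE**: which workflow the text embedding selected, the runner-up, and their similarity scores.
- **FIELDS**: extracted issuer/tenor/spread/currency, plus the parsed quotes when present.
- **SIMILAR CASES**: the two most similar precedents by text embedding similarity.
- **CDS calibration / CDS-implied PD targets**: calibrated hazards $\gamma$ and $PD_{\mathrm{CDS}}(T)$ at the quoted tenors when spreads exist, or at 1Y/3Y/5Y when the last hazard curve is reused.
- **DATA QUALITY**: the heuristic confidence score.
- **LINK (Transition Tilt) RESULTS**:
  - `alpha_naive` and resulting $PD_{\mathrm{TM}}(1Y)$, $PD_{\mathrm{TM}}(3Y)$, $PD_{\mathrm{TM}}(5Y)$,
  - `alpha_vector` and resulting $PD_{\mathrm{TM}}(1Y)$, $PD_{\mathrm{TM}}(3Y)$, $PD_{\mathrm{TM}}(5Y)$,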
  - notes indicating whether the system used market calibration, memory reuse, or peer prior.

The intended takeaway is:
- With reliable quotes, vector calibration stays close to market but reduces instability.
- With missing quotes, vectors enable peer-conditioned priors and better defaults than rating means.
- With exceptions, the workflow layer prevents erroneous updates.
