# RFT-Lab — System Metrics & Transparency

A real-world AI system is not trusted
just because it gives an answer.

It is trusted because it can explain:
- how confident it is
- how deeply it reasoned
- how much internal change happened

This notebook builds the **observability layer**
of the Reasoning-First Transformer (RFT).

No model logic is changed here.
We only READ internal signals and expose them.


## Step 0: Imports

We use basic libraries only.
No heavy visualization frameworks yet.


In [3]:
import torch
import numpy as np

## Step 1: What Do We Measure?

We expose three core system metrics:

1. Reasoning Depth  
   → How many reasoning steps were applied

2. Representation Shift  
   → How much internal state changed during reasoning

3. Confidence Score  
   → How stable the final representation is

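These metrics are:
- model-agnostic: they read only the final state and two values reported by the Reasoning Block
- lightweight: one variance and one mean over the state tensor
- real-time safe: they only read signals and never modify the model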


## Step 2: Reasoning Depth

This metric comes directly from the Reasoning Block:
it is the number of reasoning iterations reported in `steps_used`.


In [5]:
def get_reasoning_depth(steps_used):
    # Number of reasoning iterations applied
    return steps_used
# This tells us how much the model actually thought.

## Step 3: Representation Shift

This measures how much the internal state
changed due to reasoning.
The value is computed by the Reasoning Block
and passed through here unchanged.


In [7]:
def get_representation_shift(avg_shift):
    # Higher value means more internal transformation
    return avg_shift
# Large shift = deeper internal change
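
The notebook receives `avg_representation_shift` as a ready-made number, so how the Reasoning Block computes it is not shown here. One plausible definition, offered only as an assumption, is the mean Euclidean distance between consecutive states. The sketch below uses plain Python lists instead of tensors so the arithmetic stays visible; `average_shift` and the toy states are hypothetical and not part of RFT.

```python
import math


def average_shift(states):
    """Mean Euclidean distance between consecutive states.

    `states` is a list of equal-length float vectors, one per reasoning
    step, starting with the state before any reasoning was applied.
    This definition is an assumption made for illustration.
    """
    if len(states) < 2:
        return 0.0  # no reasoning step, so nothing moved

    distances = [
        math.dist(before, after)
        for before, after in zip(states, states[1:])
    ]
    return sum(distances) / len(distances)


# Toy trajectory: the initial state plus two reasoning steps.
toy_states = [
    [0.0, 0.0],
    [3.0, 4.0],  # moved by 5.0
    [3.0, 4.0],  # did not move
]

steps_used = len(toy_states) - 1
avg_shift = average_shift(toy_states)
print(steps_used, avg_shift)  # 2 2.5
```

Under this definition a step that leaves the state unchanged pulls the average down, so a low shift with a high depth would suggest the later steps did little work.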

## Step 4: Confidence Estimation

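Confidence is estimated from representation stability:
the variance across the feature dimension of the final state,
averaged over tokens and mapped to a score with `1 / (1 + variance)`.

- Less variance → higher confidence
- More variance → lower confidence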


In [9]:
def compute_confidence(reasoned_state):
    # Compute variance across feature dimension
    variance = torch.var(reasoned_state, dim=-1)

    # Average variance across tokens
    avg_variance = torch.mean(variance).item()

    # Convert variance to confidence score (0–1)
    confidence = 1 / (1 + avg_variance)

    return round(confidence, 3)

# We derive confidence from internal stability instead of guessing it.
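
To make the mapping concrete, the sketch below mirrors `compute_confidence` in plain Python for a single sequence. It assumes `statistics.variance` (sample variance) as a stand-in for `torch.var`, which also applies Bessel's correction by default; `confidence_from_rows` is a hypothetical helper name. Treat it as a way to check the arithmetic, not as a replacement for the tensor version.

```python
from statistics import mean, variance


def confidence_from_rows(rows):
    """Plain-Python mirror of compute_confidence for one sequence.

    `rows` holds one feature vector per token. Sample variance is taken
    across each vector, averaged over tokens, then mapped to (0, 1].
    """
    avg_variance = mean(variance(row) for row in rows)
    return round(1 / (1 + avg_variance), 3)


flat = [[2.0, 2.0, 2.0], [5.0, 5.0, 5.0]]  # variance 0 for every token
unit = [[0.0, 1.0, 2.0]]                   # variance 1
wide = [[0.0, 3.0, 6.0]]                   # variance 9

print(confidence_from_rows(flat))  # 1.0
print(confidence_from_rows(unit))  # 0.5
print(confidence_from_rows(wide))  # 0.1
```

Unit-variance inputs such as `torch.randn` land near 0.5 under this mapping, so the score reflects feature spread rather than a calibrated probability.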

## Step 5: Risk & Warning Flags

We attach simple warnings
instead of blocking the system.
A warning is raised when depth is at most 1
or confidence is below 0.4.


In [11]:
def generate_warnings(depth, confidence):
    warnings = []

    if depth <= 1:
        warnings.append("Shallow reasoning used")

    if confidence < 0.4:
        warnings.append("Low confidence output")

    return warnings
# The system admits uncertainty instead of hiding it.
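
The thresholds in `generate_warnings` are hard-coded. A minimal variant with the same defaults makes the boundary behavior easy to check. The function name and the `min_depth` and `min_confidence` parameters are hypothetical, introduced only for this sketch, and the depth check assumes integer depths.

```python
def generate_warnings_configurable(depth, confidence,
                                   min_depth=2, min_confidence=0.4):
    """Same checks as generate_warnings, with tunable thresholds.

    For integer depths the defaults reproduce the original rules:
    warn when depth <= 1 or confidence < 0.4.
    Parameter names are illustrative assumptions.
    """
    warnings = []

    if depth < min_depth:
        warnings.append("Shallow reasoning used")

    if confidence < min_confidence:
        warnings.append("Low confidence output")

    return warnings


print(generate_warnings_configurable(1, 0.3))
# ['Shallow reasoning used', 'Low confidence output']
print(generate_warnings_configurable(3, 0.4))
# []  (0.4 is not below the threshold)
```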

## Step 6: Metrics Aggregator

This function collects all metrics
into a single structured output.


In [12]:
def collect_system_metrics(reasoning_result):
    depth = get_reasoning_depth(reasoning_result["steps_used"])
    shift = get_representation_shift(reasoning_result["avg_representation_shift"])
    confidence = compute_confidence(reasoning_result["reasoned_state"])

    warnings = generate_warnings(depth, confidence)

    return {
        "reasoning_depth": depth,
        "avg_representation_shift": round(shift, 4),
        "confidence_score": confidence,
        "warnings": warnings
    }
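
When the Reasoning Block reports plain Python numbers, the aggregator's output contains only ints, floats, strings, and a list, so it can be serialized directly. The sketch below assumes JSON is the desired log format and uses a hand-written dictionary shaped like the aggregator's return value; `to_log_line` is a hypothetical helper.

```python
import json


def to_log_line(metrics):
    """Serialize a metrics dictionary to a single JSON line."""
    return json.dumps(metrics, sort_keys=True)


# Hand-written example shaped like the collect_system_metrics output.
example_metrics = {
    "reasoning_depth": 3,
    "avg_representation_shift": 0.42,
    "confidence_score": 0.508,
    "warnings": [],
}

line = to_log_line(example_metrics)
print(line)

# Round-trip to confirm nothing is lost in serialization.
restored = json.loads(line)
```

If `steps_used` or the shift ever arrive as tensors, they would need converting (for example with `.item()`) before this step.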

## Step 7: End-to-End Test

We simulate output from the Reasoning Block
and compute system metrics.


In [13]:
# Simulated reasoning block output.
# The state is random and unseeded, so confidence_score varies slightly between runs.
reasoning_output = {
    "reasoned_state": torch.randn(1, 6, 128),
    "steps_used": 3,
    "avg_representation_shift": 0.42
}

metrics = collect_system_metrics(reasoning_output)
metrics

{'reasoning_depth': 3,
 'avg_representation_shift': 0.42,
 'confidence_score': 0.508,
 'warnings': []}

## This notebook shows:
- Real-time observability
- Explainable AI metrics
- Honest confidence reporting
- No black-box behavior

**The system does not just answer —
it explains how much it thought and how sure it is.**