# 🔍 Cross-Model Verification (CMVK)

> **Detect hallucinations by comparing outputs across models.**

## Learning Objectives

By the end of this notebook, you will be able to:
1. Understand why cross-model verification matters
2. Use CMVK to detect drift between outputs
3. Compare embeddings and distributions
4. Implement multi-model consensus verification
5. Set up automatic hallucination detection

---

## Why Cross-Model Verification?

**Problem:** LLMs hallucinate: a single model can confidently output wrong information.

**Solution:** Ask several models the same question. If they agree, confidence in the answer increases; if they disagree, flag the answer for review.

```
Single Model:           Cross-Model Verification:

  GPT-4                    GPT-4     Claude     Gemini
    ↓                        ↓         ↓          ↓
"Paris is the            "Paris"   "Paris"    "Paris"
 capital of France"          ↓         ↓          ↓
    ↓                      ─────────────────────────
  Trust it?                    CONSENSUS: 100%
    🤷                         ✅ High confidence
```
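
The idea in the diagram can be sketched in a few lines of plain Python. This is an illustration only, not how CMVK computes consensus: the `agreement` helper and its near-verbatim matching rule are assumptions made for this example.

```python
from collections import Counter

def agreement(answers):
    """Return (most common answer, fraction of models that gave it).

    Hypothetical helper: answers are compared after lowercasing and
    stripping whitespace and trailing periods, so only near-verbatim
    matches count as agreement.
    """
    normalized = [a.strip().rstrip(".").lower() for a in answers.values()]
    top_answer, votes = Counter(normalized).most_common(1)[0]
    return top_answer, votes / len(normalized)

answers = {"GPT-4": "Paris", "Claude": "Paris.", "Gemini": "paris"}
top, share = agreement(answers)
print(f"{top!r} chosen by {share:.0%} of models")
```

String matching like this stops working as soon as models phrase the same fact differently, which is why the steps below work with drift scores instead of string equality.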

---

## Step 1: Install Dependencies

In [None]:
!pip install agent-os[cmvk] --quiet

## Step 2: Basic Drift Detection

Compare two text outputs to measure semantic drift:

In [None]:
from cmvk import verify

# Compare two semantically equivalent texts
text_a = "The capital of France is Paris."
text_b = "Paris is the capital city of France."

score = verify(text_a, text_b)

print("📊 Verification Result")
print("=" * 50)
print(f"Text A: {text_a}")
print(f"Text B: {text_b}")
print(f"\n🎯 Drift Score: {score.drift_score:.3f}")
print(f"   (0.0 = identical, 1.0 = completely different)")
print(f"\n🔒 Confidence: {score.confidence:.3f}")
print(f"📁 Drift Type: {score.drift_type}")

In [None]:
# Compare semantically different texts
text_a = "The capital of France is Paris."
text_c = "The capital of Germany is Berlin."

score = verify(text_a, text_c)

print("📊 Different Texts")
print("=" * 50)
print(f"Text A: {text_a}")
print(f"Text C: {text_c}")
print(f"\n🎯 Drift Score: {score.drift_score:.3f}")
print(f"📁 Drift Type: {score.drift_type}")
print(f"\n⚠️  High drift detected! Outputs disagree.")
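
To build intuition for a score on the 0.0 to 1.0 scale, here is a minimal stand-in that uses word overlap (Jaccard distance). It is not CMVK's scoring method, and `token_drift` is a hypothetical name introduced for this sketch.

```python
import re

def token_drift(text_a, text_b):
    """Jaccard distance between the word sets of two texts (0.0 to 1.0)."""
    tokens_a = set(re.findall(r"\w+", text_a.lower()))
    tokens_b = set(re.findall(r"\w+", text_b.lower()))
    if not tokens_a and not tokens_b:
        return 0.0  # two empty texts are treated as identical
    return 1.0 - len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

same = token_drift("The capital of France is Paris.",
                   "Paris is the capital city of France.")
diff = token_drift("The capital of France is Paris.",
                   "The capital of Germany is Berlin.")
print(f"paraphrase: {same:.3f}, different fact: {diff:.3f}")
```

A bag-of-words measure cannot tell a paraphrase from a contradiction that reuses the same words, so treat it only as intuition for the scale, not as a substitute for a semantic comparison.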

## Step 3: Understanding Drift Types

CMVK classifies drift into categories:

In [None]:
from cmvk import verify, DriftType

examples = [
    # Semantic drift (meaning changed)
    ("The answer is 42", "The answer is 24", "SEMANTIC"),
    
    # Structural drift (format changed)
    ("Name: John, Age: 30", "{\"name\": \"John\", \"age\": 30}", "STRUCTURAL"),
    
    # Numerical drift (numbers changed)
    ("Revenue: $1,000,000", "Revenue: $1,000,001", "NUMERICAL"),
    
    # Lexical drift (wording changed, meaning same)
    ("The quick brown fox", "The fast brown fox", "LEXICAL"),
]

print("📊 Drift Type Examples")
print("=" * 70)

for text_a, text_b, expected_type in examples:
    score = verify(text_a, text_b)
    print(f"\n{expected_type}:")
    print(f"  A: {text_a}")
    print(f"  B: {text_b}")
    print(f"  Detected: {score.drift_type} (drift: {score.drift_score:.3f})")
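
One way to think about the structural category is "same data, different format". The sketch below checks that for the key-value example above by parsing both texts into dictionaries. The helpers `parse_fields` and `is_structural_only` are hypothetical and say nothing about how CMVK classifies drift internally.

```python
import json

def parse_fields(text):
    """Parse a JSON object or 'Key: value, Key: value' text into a dict."""
    try:
        data = json.loads(text)
        if isinstance(data, dict):
            return {str(k).lower(): str(v) for k, v in data.items()}
    except json.JSONDecodeError:
        pass  # not JSON, fall back to the key-value format
    fields = {}
    for part in text.split(","):
        if ":" in part:
            key, value = part.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return fields

def is_structural_only(text_a, text_b):
    """True when the texts differ but carry the same non-empty fields."""
    fields_a, fields_b = parse_fields(text_a), parse_fields(text_b)
    return text_a != text_b and bool(fields_a) and fields_a == fields_b

print(is_structural_only("Name: John, Age: 30", '{"name": "John", "age": 30}'))
print(is_structural_only("Name: John, Age: 30", '{"name": "John", "age": 31}'))
```

The categories can overlap: changing 42 to 24 is both a numerical change and a change of meaning, so a real classifier has to pick a priority order.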

## Step 4: Embedding Verification

Compare vector embeddings directly:

In [None]:
from cmvk import verify_embeddings
import numpy as np

# Simulate embeddings from two models
embedding_a = np.array([0.8, 0.2, 0.5, 0.3, 0.9])
embedding_b = np.array([0.79, 0.21, 0.48, 0.31, 0.88])  # Slightly different
embedding_c = np.array([0.1, 0.9, 0.2, 0.8, 0.1])       # Very different

# Compare similar embeddings
score_ab = verify_embeddings(embedding_a, embedding_b)
print("📊 Similar Embeddings (A vs B)")
print(f"   Drift: {score_ab.drift_score:.4f}")
print(f"   Method: {score_ab.details.get('method', 'cosine')}")

# Compare different embeddings
score_ac = verify_embeddings(embedding_a, embedding_c)
print(f"\n📊 Different Embeddings (A vs C)")
print(f"   Drift: {score_ac.drift_score:.4f}")
print(f"   ⚠️  Significant drift detected!")
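
The cell above falls back to the label `'cosine'` when no method is reported. Assuming the drift is a cosine distance (an assumption, since the notebook does not show the formula), it can be reproduced by hand with the standard library. `cosine_drift` is a name made up for this sketch.

```python
import math

def cosine_drift(u, v):
    """Cosine distance: 1 - cos(angle between u and v)."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

a = [0.8, 0.2, 0.5, 0.3, 0.9]
b = [0.79, 0.21, 0.48, 0.31, 0.88]
c = [0.1, 0.9, 0.2, 0.8, 0.1]
print(f"A vs B: {cosine_drift(a, b):.4f}")
print(f"A vs C: {cosine_drift(a, c):.4f}")
```

Cosine distance is only meaningful when both vectors live in the same embedding space, for example when each model's text output is embedded by one shared encoder. Raw embeddings taken from two different models are generally not comparable.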

## Step 5: Distribution Verification

Compare probability distributions (e.g., token probabilities):

In [None]:
from cmvk import verify_distributions
import numpy as np

# Simulate token probability distributions
dist_a = np.array([0.7, 0.2, 0.1])  # High confidence in first token
dist_b = np.array([0.65, 0.25, 0.1])  # Similar distribution
dist_c = np.array([0.1, 0.2, 0.7])  # Very different!

# Compare distributions using KL divergence
score_ab = verify_distributions(dist_a, dist_b, method="kl")
print("📊 Similar Distributions (KL Divergence)")
print(f"   Drift: {score_ab.drift_score:.4f}")

score_ac = verify_distributions(dist_a, dist_c, method="kl")
print(f"\n📊 Different Distributions (KL Divergence)")
print(f"   Drift: {score_ac.drift_score:.4f}")
print(f"   ⚠️  Models disagree significantly!")
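
For reference, KL divergence itself is short to write out. The sketch below computes it directly, together with the Jensen-Shannon divergence, a symmetric and bounded relative. How CMVK maps these values onto its drift score is not shown in this notebook, so the numbers printed here need not match the cell above.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats. Assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric, between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # midpoint distribution
    to_bits = 1 / math.log(2)
    return 0.5 * to_bits * (kl_divergence(p, m) + kl_divergence(q, m))

dist_a = [0.7, 0.2, 0.1]
dist_b = [0.65, 0.25, 0.1]
dist_c = [0.1, 0.2, 0.7]

print(f"KL(a || b) = {kl_divergence(dist_a, dist_b):.4f}")
print(f"KL(b || a) = {kl_divergence(dist_b, dist_a):.4f}")
print(f"KL(a || c) = {kl_divergence(dist_a, dist_c):.4f}")
print(f"JS(a, c)   = {js_divergence(dist_a, dist_c):.4f}")
```

KL divergence is asymmetric and unbounded, and it becomes infinite when `q` assigns zero probability to an outcome that `p` considers possible. When a score on a fixed 0-to-1 scale is needed, Jensen-Shannon is often the easier choice.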

## Step 6: Multi-Model Consensus

Verify agreement across multiple models:

In [None]:
from cmvk import ConsensusVerifier

# Simulate outputs from multiple models
model_outputs = {
    "gpt-4": "The Great Wall of China is approximately 21,196 km long.",
    "claude-3": "The Great Wall of China stretches about 21,196 kilometers.",
    "gemini-pro": "The total length of the Great Wall is roughly 21,196 km.",
}

# Create consensus verifier
verifier = ConsensusVerifier(threshold=0.9)  # 90% agreement required

# Verify consensus
result = verifier.verify(model_outputs)

print("📊 Multi-Model Consensus")
print("=" * 60)
for model, output in model_outputs.items():
    print(f"  {model}: {output[:50]}...")

print(f"\n🎯 Consensus Score: {result.consensus_score:.2%}")
print(f"✅ Consensus Reached: {result.consensus}")
if result.consensus:
    print(f"📝 Agreed Answer: {result.canonical_answer}")

In [None]:
# Example with disagreement
conflicting_outputs = {
    "gpt-4": "The population of Tokyo is 14 million.",
    "claude-3": "Tokyo has a population of 37 million in the metro area.",
    "gemini-pro": "About 13.96 million people live in Tokyo proper.",
}

result = verifier.verify(conflicting_outputs)

print("📊 Conflicting Outputs")
print("=" * 60)
for model, output in conflicting_outputs.items():
    print(f"  {model}: {output}")

print(f"\n🎯 Consensus Score: {result.consensus_score:.2%}")
print(f"❌ Consensus Reached: {result.consensus}")
print(f"\n⚠️  Models disagree! Pairwise drift scores:")
for pair, score in result.pairwise_scores.items():
    print(f"   {pair}: {score:.3f}")
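
A consensus check of this shape can be approximated with pairwise comparisons. The sketch below is an assumption about the general approach, not `ConsensusVerifier`'s actual algorithm: it scores every pair with `difflib.SequenceMatcher` (character-level similarity, where higher means closer, the opposite direction from drift), averages the scores, and picks the output closest to all the others as the answer. `pairwise_consensus` is a hypothetical name.

```python
from difflib import SequenceMatcher
from itertools import combinations

def pairwise_consensus(outputs, threshold=0.9):
    """Return (mean similarity, reached?, pairwise scores, medoid answer)."""
    if len(outputs) < 2:
        raise ValueError("need at least two model outputs")
    pairs = {
        (m1, m2): SequenceMatcher(None, outputs[m1], outputs[m2]).ratio()
        for m1, m2 in combinations(outputs, 2)
    }
    mean_similarity = sum(pairs.values()) / len(pairs)

    def mean_to_others(model):
        # Average similarity between this model's output and every other one.
        scores = [s for pair, s in pairs.items() if model in pair]
        return sum(scores) / len(scores)

    medoid = max(outputs, key=mean_to_others)
    return mean_similarity, mean_similarity >= threshold, pairs, outputs[medoid]

outputs = {
    "gpt-4": "The population of Tokyo is 14 million.",
    "claude-3": "Tokyo has a population of 37 million in the metro area.",
    "gemini-pro": "About 13.96 million people live in Tokyo proper.",
}
score, reached, pairs, answer = pairwise_consensus(outputs)
print(f"mean similarity {score:.2f}, consensus reached: {reached}")
for pair, s in pairs.items():
    print(f"  {pair}: {s:.3f}")
```

Character-level similarity punishes paraphrases, so a stricter or looser threshold than 0.9 may be appropriate depending on the similarity function used.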

## Step 7: Batch Verification

In [None]:
from cmvk import verify_batch

# Verify multiple pairs at once
pairs = [
    ("2 + 2 = 4", "Two plus two equals four"),
    ("Water boils at 100°C", "Water boils at 212°F"),
    ("Python is a programming language", "Python is a type of snake"),
]

results = verify_batch(pairs)

print("📊 Batch Verification Results")
print("=" * 70)

for (a, b), score in zip(pairs, results):
    status = "✅" if score.drift_score < 0.5 else "⚠️"
    print(f"\n{status} Drift: {score.drift_score:.3f}")
    print(f"   A: {a}")
    print(f"   B: {b}")

## Step 8: Integrate with Agent OS

In [None]:
from agent_os import KernelSpace
from cmvk import verify, ConsensusVerifier

kernel = KernelSpace(policy="strict")
verifier = ConsensusVerifier(threshold=0.85)

@kernel.register
async def verified_agent(question: str):
    """
    An agent that verifies answers across multiple models.
    """
    # Simulate calling multiple models
    # In production, these would be real API calls
    outputs = {
        "model_1": f"Answer to '{question}': Response from model 1",
        "model_2": f"Answer to '{question}': Response from model 1",  # Same
        "model_3": f"Answer to '{question}': Response from model 1",  # Same
    }
    
    # Verify consensus
    result = verifier.verify(outputs)
    
    if result.consensus:
        return {
            "answer": result.canonical_answer,
            "confidence": result.consensus_score,
            "verified": True
        }
    else:
        return {
            "answer": None,
            "confidence": result.consensus_score,
            "verified": False,
            "warning": "Models disagree - human review needed"
        }

# Execute
result = await kernel.execute(verified_agent, "What is the speed of light?")

print("📊 Verified Agent Result")
print("=" * 50)
for k, v in result.items():
    print(f"  {k}: {v}")
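
The agent above hard-codes its three outputs. In production the model calls would usually run concurrently. The following sketch shows one way to fan out with `asyncio.gather`; `fake_model` and `ask_all` are hypothetical stand-ins, and no real model API is called.

```python
import asyncio

async def fake_model(name, question, delay):
    """Hypothetical stand-in for a real model API call."""
    await asyncio.sleep(delay)  # simulates network latency
    return name, f"Answer to '{question}' from {name}"

async def ask_all(question):
    """Query every model concurrently and collect outputs by model name."""
    calls = [fake_model(f"model_{i}", question, 0.01 * i) for i in (1, 2, 3)]
    results = await asyncio.gather(*calls, return_exceptions=True)
    # Drop failed calls so one outage does not block verification.
    return dict(r for r in results if not isinstance(r, Exception))

outputs = asyncio.run(ask_all("What is the speed of light?"))
for model, text in outputs.items():
    print(f"{model}: {text}")
```

Inside a notebook an event loop is already running, so call `await ask_all(...)` there instead of `asyncio.run(...)`. The resulting dictionary has the same shape as the `outputs` passed to `verifier.verify` above.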

## Step 9: Automatic Hallucination Detection

In [None]:
from cmvk import HallucinationDetector

# Create detector with thresholds
detector = HallucinationDetector(
    semantic_threshold=0.3,   # Flag if semantic drift > 30%
    numerical_threshold=0.1,  # Flag if numerical drift > 10%
    confidence_threshold=0.8  # Require 80% confidence
)

# Test cases
test_cases = [
    ("The Earth is 4.5 billion years old", "The Earth is 4.5 billion years old"),
    ("The Earth is 4.5 billion years old", "The Earth is 6,000 years old"),
    ("Water is H2O", "Water is composed of hydrogen and oxygen atoms"),
]

print("🔍 Hallucination Detection")
print("=" * 70)

for source, generated in test_cases:
    result = detector.check(source_text=source, generated_text=generated)
    
    status = "🚨 HALLUCINATION" if result.is_hallucination else "✅ OK"
    print(f"\n{status}")
    print(f"   Source:    {source}")
    print(f"   Generated: {generated}")
    if result.is_hallucination:
        print(f"   Reason: {result.reason}")
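
To make the idea behind a numerical threshold concrete, here is a minimal sketch that extracts the numbers from both texts and compares them by relative difference. It is an assumed interpretation of what a 10% numerical threshold could mean, not `HallucinationDetector`'s implementation; `extract_numbers` and `numeric_drift` are hypothetical names.

```python
import re

# A number that does not start inside a word, so the 2 in "H2O" is skipped.
NUMBER = re.compile(r"(?<![\w.])\d[\d,]*(?:\.\d+)?")

def extract_numbers(text):
    """Return all standalone numbers in the text as floats."""
    return [float(m.replace(",", "")) for m in NUMBER.findall(text)]

def numeric_drift(source, generated):
    """Largest relative difference between numbers at matching positions."""
    a, b = extract_numbers(source), extract_numbers(generated)
    if len(a) != len(b):
        return 1.0  # a number was added or dropped
    drift = 0.0
    for x, y in zip(a, b):
        scale = max(abs(x), abs(y))
        if scale > 0:
            drift = max(drift, abs(x - y) / scale)
    return drift

threshold = 0.1
cases = [
    ("The Earth is 4.5 billion years old", "The Earth is 6,000 years old"),
    ("Revenue: $1,000,000", "Revenue: $1,000,001"),
    ("Water is H2O", "Water is composed of hydrogen and oxygen atoms"),
]
for source, generated in cases:
    d = numeric_drift(source, generated)
    print(f"{'FLAG' if d > threshold else 'ok  '} {d:.6f}  {generated}")
```

The sketch ignores scale words such as "billion" and units, so it reads "4.5 billion" as 4.5. A relative threshold also lets small absolute changes through, as the revenue pair shows, which may or may not be acceptable for a given use case.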

---

## Summary

| Feature | What It Does |
|---------|-------------|
| `verify()` | Compare two texts for drift |
| `verify_embeddings()` | Compare vector embeddings |
| `verify_distributions()` | Compare probability distributions |
| `ConsensusVerifier` | Multi-model agreement |
| `verify_batch()` | Batch verification |
| `HallucinationDetector` | Automatic hallucination flagging |

### Quick Reference

```python
from cmvk import verify, ConsensusVerifier, HallucinationDetector

# Basic verification
score = verify(text_a, text_b)
print(score.drift_score, score.drift_type)

# Multi-model consensus
verifier = ConsensusVerifier(threshold=0.9)
result = verifier.verify({"model1": out1, "model2": out2})

# Hallucination detection
detector = HallucinationDetector(semantic_threshold=0.3)
result = detector.check(source_text=source, generated_text=generated)
```

---

## Next Steps

- [05-multi-agent-coordination](05-multi-agent-coordination.ipynb) - Trust between agents
- [06-policy-engine](06-policy-engine.ipynb) - Deep dive into policies
- [CMVK Documentation](https://github.com/imran-siddique/cmvk)