# AgentRedChain Demo - SIMULATED DATA ONLY
## Sparse Evaluation Framework for Red-Teaming Multi-Agent LLM Systems

**IMPORTANT**: This notebook uses **100% SIMULATED DATA** generated with numpy.random.beta().
No real API calls are made. All results are synthetic and for demonstration purposes only.

This notebook demonstrates the conceptual workflow:
1. Set up multi-agent chains (simulated)
2. Run dense baseline evaluation (simulated)
3. Analyze attack propagation patterns (on fake data)
4. Design informed sparse sampling (theoretical)
5. Compare sampling strategies (using random data)
6. Fit vulnerability model (on synthetic values)
7. Visualize results (of simulated experiments)

**DO NOT** interpret any results as empirically validated performance metrics.
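
To make the nature of the simulated scores concrete, here is a minimal sketch of the kind of draw this notebook relies on. The shape parameters (2, 5) match the `np.random.beta(2, 5, ...)` call in the dense evaluation cell; the matrix size, the seed, and the use of `default_rng` are illustrative assumptions.

```python
import numpy as np

# Illustrative assumption: 6 attacks x 4 agents. The notebook derives the
# real shape from the number of attacks and the agents in the chain.
rng = np.random.default_rng(0)
scores = rng.beta(2, 5, size=(6, 4))

# A Beta(a, b) variable lies in [0, 1] and has mean a / (a + b).
expected_mean = 2 / (2 + 5)
print(f"shape: {scores.shape}")
print(f"sample mean: {scores.mean():.3f} (theoretical mean: {expected_mean:.3f})")
```

Because every score is a random draw of this kind, any structure in later results comes from the simulation settings, not from agent behavior.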

## IMPORTANT: Run This First!

**If you're getting import errors, run the package installer cell below FIRST.**

This notebook requires several Python packages. The cell below will automatically:
1. Check which packages are missing
2. Install them in your current environment
3. Clear the module cache for clean imports

After running the installer cell, you can proceed with the rest of the notebook.

In [1]:
# Package Installation and Environment Setup
# This cell ensures all required packages are installed in the notebook environment

import sys
import subprocess
import importlib

def install_package(package):
    """Install a package using pip"""
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", "--quiet", package])
        return True
    except subprocess.CalledProcessError:
        return False

def check_and_install(package_name, import_name=None):
    """Check if package is installed, install if not"""
    if import_name is None:
        import_name = package_name
    
    try:
        importlib.import_module(import_name)
        print(f"[OK] {package_name}")
        return True
    except ImportError:
        print(f"[INSTALLING] {package_name}...")
        if install_package(package_name):
            print(f"[INSTALLED] {package_name}")
            return True
        else:
            print(f"[FAILED] {package_name}")
            return False

print("=" * 60)
print("AGENTREDCHAIN PACKAGE INSTALLER")
print("=" * 60)
print("\nChecking and installing required packages...")
print("-" * 40)

# Core scientific packages
packages_to_install = [
    ("numpy", None),
    ("matplotlib", None),
    ("seaborn", None),
    ("pandas", None),
    ("scipy", None),
    ("scikit-learn", "sklearn"),
    ("python-dotenv", "dotenv"),
    ("tqdm", None),
]

# LLM and NLP packages
llm_packages = [
    ("langchain", None),
    ("langchain-community", "langchain_community"),
    ("langchain-openai", "langchain_openai"),
    ("langchain-anthropic", "langchain_anthropic"),
    ("openai", None),
    ("anthropic", None),
    ("transformers", None),
    ("sentence-transformers", "sentence_transformers"),
]

# Combine all packages
all_packages = packages_to_install + llm_packages

print("\nCore packages:")
success_count = 0
for package, import_name in packages_to_install:
    if check_and_install(package, import_name):
        success_count += 1

print("\nLLM/NLP packages:")
for package, import_name in llm_packages:
    if check_and_install(package, import_name):
        success_count += 1

print("-" * 40)
print(f"\nPackage installation complete: {success_count}/{len(all_packages)} successful")

# Make newly installed packages importable in this session
print("\nReloading modules...")

# Refresh the import system's finder caches so packages installed above are
# discoverable. Already-imported modules are left in sys.modules: deleting
# an entry such as 'numpy' removes only the top-level name, leaves its
# submodules cached, and can trigger reload warnings on the next import.
importlib.invalidate_caches()

print("Module cache cleared")
print("\nEnvironment setup complete!")
print("=" * 60)
print("\nProceed to the next cell to import packages...")

AGENTREDCHAIN PACKAGE INSTALLER

Checking and installing required packages...
----------------------------------------

Core packages:
[OK] numpy
[OK] matplotlib
[OK] seaborn
[OK] pandas
[OK] scipy
[OK] scikit-learn
[OK] python-dotenv
[OK] tqdm

LLM/NLP packages:
[OK] langchain
[OK] langchain-community
[OK] langchain-openai
[OK] langchain-anthropic
[OK] openai
[OK] anthropic
[OK] transformers
[OK] sentence-transformers
----------------------------------------

Package installation complete: 16/16 successful

Reloading modules...
Module cache cleared

Environment setup complete!

Proceed to the next cell to import packages...


## 1. Setup: Build Multi-Agent Chains

In [2]:
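# Standard imports used throughout the notebook
import os

import numpy as np
import matplotlib.pyplot as plt

# AgentChain and the other AgentRedChain classes used below (InjectionGenerator,
# TVDMIScorer, PropagationAnalyzer, ...) must also be imported from the project
# package. Their module paths are not shown in this notebook; without that
# import this cell fails with a NameError.

# Note: Set your API keys as environment variables
# os.environ['OPENAI_API_KEY'] = 'your-key-here'
# os.environ['ANTHROPIC_API_KEY'] = 'your-key-here'

# For demo purposes, the chains are simulated.
# In production, uncomment the API key setup above.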

# Build research pipeline (linear topology)
linear_chain = AgentChain(topology='linear', model_type='gpt-5')
linear_chain.build_research_pipeline()

print(f"Linear chain built with {len(linear_chain.agents)} agents:")
for i, role in enumerate(linear_chain.agent_roles):
    print(f"  {i}: {role}")

# Save configuration
os.makedirs('../data/agent_chains', exist_ok=True)
linear_chain.save_config('../data/agent_chains/research_pipeline.json')

NameError: name 'AgentChain' is not defined

In [None]:
# Build consensus system (star topology)
star_chain = AgentChain(topology='star', model_type='gpt-5')
star_chain.build_consensus_system()

print(f"Star chain built with {len(star_chain.agents)} agents:")
for i, role in enumerate(star_chain.agent_roles):
    print(f"  {i}: {role}")

star_chain.save_config('../data/agent_chains/consensus_system.json')

In [None]:
# Build hierarchical review (hierarchical topology)
hierarchical_chain = AgentChain(topology='hierarchical', model_type='gpt-5')
hierarchical_chain.build_hierarchical_review()

print(f"Hierarchical chain built with {len(hierarchical_chain.agents)} agents:")
for i, role in enumerate(hierarchical_chain.agent_roles):
    print(f"  {i}: {role}")

hierarchical_chain.save_config('../data/agent_chains/hierarchical_review.json')

## 2. Attack Scenarios

In [None]:
# Initialize injection generator
injector = InjectionGenerator()

# Generate all 10 attacks (2 per category)
all_attacks = injector.generate_all_attacks()

print(f"Generated {len(all_attacks)} attack scenarios:\n")
for i, (category, attack, description) in enumerate(all_attacks):
    print(f"{i+1}. {category}: {description}")
    print(f"   Preview: {attack[:100]}...\n")

# Save attack templates
os.makedirs('../data/attack_scenarios', exist_ok=True)
injector.save_templates('../data/attack_scenarios/attack_templates.json')

In [None]:
# Show attack severity distribution
severity_dist = injector.get_severity_distribution()
print("Attack Severity Distribution:")
for severity, count in severity_dist.items():
    print(f"  {severity}: {count} attacks")

## 3. Dense Baseline Evaluation

**Note**: In a real scenario with API access, this would execute actual LLM calls.
For demo purposes, we'll simulate the TVD-MI scores.
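
This notebook does not show how `TVDMIScorer` derives its score, so the following is only a sketch of the textbook quantity the name suggests: the total variation distance between a discrete joint distribution and the product of its marginals. The function name `tvd_mutual_information` and the example tables are hypothetical and are not part of the AgentRedChain code.

```python
import numpy as np

def tvd_mutual_information(joint):
    """TVD between a discrete joint distribution and the product of its marginals."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()              # normalize counts to probabilities
    px = joint.sum(axis=1, keepdims=True)    # marginal of the row variable
    py = joint.sum(axis=0, keepdims=True)    # marginal of the column variable
    return 0.5 * np.abs(joint - px * py).sum()

# Joint equals the product of its marginals: the variables are independent.
independent = np.outer([0.5, 0.5], [0.3, 0.7])
# Perfectly correlated pair of balanced binary variables.
dependent = np.array([[0.5, 0.0], [0.0, 0.5]])

print(tvd_mutual_information(independent))
print(tvd_mutual_information(dependent))
```

Independent variables score 0, and the perfectly correlated pair scores 0.5, the largest value two balanced binary variables can reach under this definition.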

In [None]:
# Initialize TVD-MI scorer
scorer = TVDMIScorer(model_name='all-mpnet-base-v2')

# For demo: Simulate dense evaluation results
# In production, this would run actual chain.execute() calls

def simulate_dense_evaluation(chain, attacks):
    """Simulate dense evaluation for demonstration."""
    np.random.seed(42)
    num_attacks = len(attacks)
    num_agents = len(chain.agents)
    
    # Simulate a TVD-MI-like matrix from Beta(2, 5) draws (synthetic, not measured)
    matrix = np.random.beta(2, 5, (num_attacks, num_agents))
    
    # Add some structure based on topology
    if chain.topology == 'linear':
        # Increasing vulnerability along the chain
        for i in range(num_agents):
            matrix[:, i] *= (1 + i * 0.2)
    elif chain.topology == 'star':
        # Coordinator more vulnerable
        matrix[:, 0] *= 1.5
    elif chain.topology == 'hierarchical':
        # Senior reviewer most vulnerable
        if num_agents >= 3:
            matrix[:, 2] *= 1.3
    
    # Clip to [0, 1], since the scaling above can push values past 1
    matrix = np.clip(matrix, 0, 1)
    
    return {
        'matrix': matrix,
        'attacks': [attack[0] for attack in attacks],
        'agents': chain.agent_roles,
        'chain_topology': chain.topology,
        'results': {(i, j): matrix[i, j] for i in range(num_attacks) for j in range(num_agents)},
        'critical_paths': [(0, 1), (1, 2)] if chain.topology == 'linear' else []
    }

# Run simulated dense evaluation for linear chain
dense_results_linear = simulate_dense_evaluation(linear_chain, all_attacks[:6])  # Use subset for speed
print("Dense evaluation complete for linear chain")
print(f"Matrix shape: {dense_results_linear['matrix'].shape}")
print(f"Mean TVD-MI: {np.mean(dense_results_linear['matrix']):.3f}")

## 4. Attack Propagation Analysis

In [None]:
# Analyze attack propagation
propagation = PropagationAnalyzer(
    dense_results_linear['matrix'],
    chain_topology='linear'
)

# Compute propagation metrics
analysis = propagation.analyze_chain_weaknesses()

print("Propagation Analysis Summary:")
print(f"  Mean propagation depth: {analysis['summary']['mean_propagation_depth']:.2f}")
print(f"  Max propagation depth: {analysis['summary']['max_propagation_depth']}")
print(f"  Mean amplification: {analysis['summary']['mean_amplification']:.3f}")
print(f"  Critical edges found: {analysis['summary']['num_critical_edges']}")
print(f"  Most vulnerable position: Agent {analysis['summary']['most_vulnerable_position']}")

# Save analysis
os.makedirs('../experiments/dense_analysis', exist_ok=True)
propagation.save_analysis('../experiments/dense_analysis/linear_propagation.json')
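
The summary above reports a "mean amplification" without defining it, and `PropagationAnalyzer` is not shown in this notebook. One plausible reading for a linear chain, sketched here as an assumption, is the average ratio of the last agent's score to the first agent's score across attacks. The function `mean_amplification` and the example matrix are hypothetical.

```python
import numpy as np

def mean_amplification(matrix, eps=1e-9):
    """Average ratio of last-agent score to first-agent score across attacks."""
    matrix = np.asarray(matrix, dtype=float)
    # Guard against division by zero when the first agent's score is 0.
    ratios = matrix[:, -1] / np.maximum(matrix[:, 0], eps)
    return float(ratios.mean())

# Each column is the first column scaled by (1 + 0.2 * position), the same
# scaling factor the linear-chain simulation applies to its random draws.
base = np.array([[0.2], [0.4], [0.1]])
chain_scores = base * (1 + 0.2 * np.arange(4))

print(f"{mean_amplification(chain_scores):.2f}")
```

A value above 1 means scores grow along the chain. In the simulated data that growth is built in by the scaling step, so it says nothing about real agents.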

## 5. Pattern Discovery

In [None]:
# Discover patterns for informed sampling
pattern_analyzer = PatternDiscovery(dense_results_linear)

# Get high-value tests
high_value_tests = pattern_analyzer.identify_high_value_tests(top_percent=0.3)
print(f"Identified {len(high_value_tests)} high-value test positions")

# Get sampling recommendations
recommendations = pattern_analyzer.generate_sampling_recommendations()
print("\nSampling Recommendations:")
print(f"  Strategy: {recommendations['sampling_strategy']['type']}")
print(f"  Description: {recommendations['sampling_strategy']['description']}")
print(f"  Critical positions: {recommendations['critical_positions']}")
print(f"  Coverage needed: {recommendations['high_priority_tests']['coverage_percent']:.1f}%")

## 6. Sparse Sampling Design

In [None]:
# Create sparse sampler
sampler = InformedSampler(pattern_analyzer)

# Generate sampling masks for different strategies and coverage levels
K, J = dense_results_linear['matrix'].shape
coverage_levels = [0.1, 0.2, 0.33]
strategies = ['informed', 'nlogn', 'random']

masks = sampler.create_multiple_masks(K, J, coverage_levels, strategies)

# Display mask statistics
print("Sampling Mask Statistics:\n")
for strategy in strategies:
    for coverage in coverage_levels:
        # round() guards against float truncation (e.g. int(0.29 * 100) == 28)
        mask = masks[strategy][f"{int(round(coverage * 100))}%"]
        print(f"{strategy} @ {coverage:.0%}:")
        print(f"  Total tests: {np.sum(mask)} / {mask.size}")
        print(f"  Actual coverage: {np.mean(mask):.1%}")
        print(f"  Min samples per attack: {np.min(mask.sum(axis=1))}")
        print(f"  Min samples per agent: {np.min(mask.sum(axis=0))}")
        print()
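
The statistics printed above include the minimum number of samples per attack and per agent, which matters because a row or column with no observations cannot be ranked. The sketch below shows one simple way a mask can guarantee that floor. It is not the `InformedSampler` implementation, which this notebook does not show; the function `random_mask_with_floor` is hypothetical.

```python
import numpy as np

def random_mask_with_floor(n_attacks, n_agents, coverage, seed=0):
    """Boolean mask with at least one sample per row and column, filled randomly to `coverage`."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((n_attacks, n_agents), dtype=bool)

    # Guarantee that every attack (row) is observed at least once.
    for k in range(n_attacks):
        mask[k, rng.integers(n_agents)] = True
    # Guarantee that every agent (column) is observed at least once.
    for j in range(n_agents):
        if not mask[:, j].any():
            mask[rng.integers(n_attacks), j] = True

    # Spend any remaining budget uniformly at random. The floor alone may
    # already exceed the target, in which case nothing more is added.
    target = int(round(coverage * mask.size))
    free = np.flatnonzero(~mask.ravel())
    extra = max(0, target - int(mask.sum()))
    if extra > 0 and free.size > 0:
        chosen = rng.choice(free, size=min(extra, free.size), replace=False)
        mask.flat[chosen] = True
    return mask

mask = random_mask_with_floor(6, 4, coverage=0.33)
print(f"Total tests: {mask.sum()} / {mask.size}")
print(f"Min samples per attack: {mask.sum(axis=1).min()}")
print(f"Min samples per agent: {mask.sum(axis=0).min()}")
```

Seeding the floor first means small coverage levels can overshoot their target slightly, which is the trade-off for never leaving an attack or agent unobserved.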

## 7. Sparse Evaluation Experiments

In [None]:
# Run sparse evaluation experiments
sparse_exp = SparseExperiment(dense_results_linear)

# Compare strategies at 33% coverage
comparison_results = {'strategies': {}}

for strategy in strategies:
    mask = masks[strategy]['33%']
    
    # Simulate sparse evaluation
    sparse_results = {
        'results': {k: v for k, v in dense_results_linear['results'].items() 
                   if mask[k[0], k[1]]},
        'matrix': dense_results_linear['matrix'] * mask,
        'mask': mask,
        'n_tests': np.sum(mask),
        'coverage': np.mean(mask),
        'strategy': strategy,
        'execution_time': np.sum(mask) * 0.1,  # Simulated time
        'critical_paths': dense_results_linear['critical_paths']
    }
    
    # Compare to dense
    comparison = sparse_exp.compare_to_dense(sparse_results)
    
    comparison_results['strategies'][strategy] = {
        '33%': {
            'sparse_results': sparse_results,
            'comparison': comparison,
            'efficiency': {
                'tests_run': sparse_results['n_tests'],
                'coverage': sparse_results['coverage'],
                'time_seconds': sparse_results['execution_time'],
                'tests_per_second': sparse_results['n_tests'] / sparse_results['execution_time']
            }
        }
    }
    
    print(f"{strategy} Strategy Results:")
    print(f"  Spearman ρ: {comparison['ranking_correlation']:.3f}")
    print(f"  RMSE: {comparison['vulnerability_rmse']:.3f}")
    print(f"  Tests: {sparse_results['n_tests']} ({sparse_results['coverage']:.1%} coverage)")
    print()
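
The `compare_to_dense` method is not shown in this notebook. One plausible reading of the two reported metrics is sketched below: per-attack mean scores are estimated from the observed cells only, then compared with the dense per-attack means by Spearman rank correlation and RMSE. The function names are hypothetical, and the rank helper ignores ties for brevity (`scipy.stats.spearmanr` handles them properly).

```python
import numpy as np

def _ranks(values):
    """Ordinal ranks (0 = smallest). Ties are not averaged in this sketch."""
    return np.argsort(np.argsort(values))

def compare_rankings(dense, mask):
    """Compare per-attack means from observed cells against the dense per-attack means."""
    dense = np.asarray(dense, dtype=float)
    mask = np.asarray(mask, dtype=bool)

    dense_means = dense.mean(axis=1)
    # Average only over observed cells; treating unobserved cells as 0 would
    # bias the estimate downward. Assumes each attack has >= 1 observed cell.
    observed = np.where(mask, dense, np.nan)
    sparse_means = np.nanmean(observed, axis=1)

    # Spearman correlation is the Pearson correlation of the ranks.
    rho = float(np.corrcoef(_ranks(dense_means), _ranks(sparse_means))[0, 1])
    rmse = float(np.sqrt(np.mean((dense_means - sparse_means) ** 2)))
    return {"ranking_correlation": rho, "vulnerability_rmse": rmse}

rng = np.random.default_rng(42)
dense = rng.beta(2, 5, size=(6, 4))      # simulated scores, as elsewhere in this demo
full = np.ones(dense.shape, dtype=bool)  # full coverage reproduces the dense ranking
result = compare_rankings(dense, full)
print(result)
```

The experiment cell stores `matrix * mask`, which records 0 for every unobserved cell. Whether `compare_to_dense` uses the mask to ignore those zeros is not visible here, and it affects how the RMSE should be read.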

## 8. Vulnerability Model Fitting

In [None]:
# Fit Rasch vulnerability model on sparse data
informed_mask = masks['informed']['33%']
sparse_matrix = dense_results_linear['matrix'] * informed_mask

# Initialize and fit model
vuln_model = VulnerabilityModel(sparse_matrix, informed_mask)
vuln_model.fit(max_iter=50)

# Evaluate fit
fit_metrics = vuln_model.evaluate_fit()
print("Model Fit Evaluation:")
print(f"  MSE: {fit_metrics['mse']:.4f}")
print(f"  MAE: {fit_metrics['mae']:.4f}")
print(f"  Pseudo R²: {fit_metrics['pseudo_r2']:.3f}")
print(f"  Convergence: {fit_metrics['convergence']}")

# Get rankings
agent_ranking = vuln_model.rank_agents()
attack_ranking = vuln_model.rank_attacks()

print(f"\nMost Resistant Agents (by index): {agent_ranking[:3]}")
print(f"Most Severe Attacks (by index): {attack_ranking[:3]}")

In [None]:
# Bootstrap confidence intervals
print("Computing bootstrap confidence intervals...")
bootstrap_results = vuln_model.bootstrap_rankings(n_bootstrap=100)  # Reduced for demo speed

print(f"\nBootstrap Results ({bootstrap_results['n_successful_bootstraps']} successful):")
print(f"  Mean agent rank stability: {bootstrap_results['mean_agent_stability']:.2f}")
print(f"  Mean attack rank stability: {bootstrap_results['mean_attack_stability']:.2f}")
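
The notebook does not define "rank stability", and `bootstrap_rankings` is not shown. As one illustrative assumption, stability can be measured as the fraction of bootstrap replicates whose top-ranked attack matches the top-ranked attack on the full data. The function `bootstrap_top_rank_stability` is hypothetical and uses per-attack mean scores in place of the fitted model.

```python
import numpy as np

def bootstrap_top_rank_stability(matrix, n_bootstrap=100, seed=0):
    """Fraction of bootstrap replicates whose top-scoring row matches the full-data top row."""
    matrix = np.asarray(matrix, dtype=float)
    rng = np.random.default_rng(seed)
    n_cols = matrix.shape[1]
    reference_top = int(np.argmax(matrix.mean(axis=1)))

    matches = 0
    for _ in range(n_bootstrap):
        # Resample agents (columns) with replacement and re-rank the attacks.
        cols = rng.integers(n_cols, size=n_cols)
        replicate_top = int(np.argmax(matrix[:, cols].mean(axis=1)))
        matches += int(replicate_top == reference_top)
    return matches / n_bootstrap

# Row 0 has the highest score in every column, so every replicate agrees.
clear_cut = np.array([[0.90, 0.80, 0.95],
                      [0.10, 0.20, 0.15],
                      [0.30, 0.40, 0.35]])
stability = bootstrap_top_rank_stability(clear_cut)
print(stability)
```

Values near 1 indicate a ranking that survives resampling; values near `1 / n_attacks` indicate a top position that is close to arbitrary.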

## 9. Visualizations

In [None]:
# Initialize visualizers
attack_viz = AttackGraphVisualizer()
vuln_viz = VulnerabilityVisualizer()

# Create output directory for figures
os.makedirs('../figures', exist_ok=True)

In [None]:
# 1. Attack Diffusion Heatmap
fig1 = attack_viz.plot_attack_diffusion(
    dense_results_linear['matrix'],
    dense_results_linear['attacks'][:6],
    dense_results_linear['agents'],
    title="Attack Diffusion - Linear Chain",
    save_path='../figures/attack_diffusion.png'
)
plt.show()

In [None]:
# 2. Vulnerability Rankings
fig2 = vuln_viz.plot_vulnerability_rankings(
    dense_results_linear['agents'],
    vuln_model.agent_resistance,
    dense_results_linear['attacks'][:6],
    vuln_model.attack_severity,
    confidence_intervals=bootstrap_results,
    save_path='../figures/vulnerability_rankings.png'
)
plt.show()

In [None]:
# 3. Sampling Mask Visualization
fig3 = vuln_viz.plot_sampling_mask(
    masks['informed']['33%'],
    'Informed',
    attack_names=dense_results_linear['attacks'][:6],
    agent_names=dense_results_linear['agents'],
    save_path='../figures/sampling_mask.png'
)
plt.show()

In [None]:
# 4. Strategy Comparison
fig4 = vuln_viz.plot_strategy_comparison(
    comparison_results,
    save_path='../figures/strategy_comparison.png'
)
plt.show()

## 10. Results Summary

In [None]:
print("=" * 60)
print("AGENTREDCHAIN SIMULATION RESULTS")
print("=" * 60)

print("\n1. CHAINS TESTED (SIMULATED):")
print(f"   - Linear (research pipeline): {len(linear_chain.agents)} agents")
print(f"   - Star (consensus system): {len(star_chain.agents)} agents")
print(f"   - Hierarchical (review): {len(hierarchical_chain.agents)} agents")

print("\n2. ATTACKS DEPLOYED:")
print(f"   - Total attacks: {len(all_attacks)}")
print("   - Categories: 5 (goal hijacking, data exfiltration, privilege escalation, jailbreak, poisoning)")
print(f"   - Severity distribution: {severity_dist}")

print("\n3. SIMULATED DENSE BASELINE:")
print(f"   - Tests required: {dense_results_linear['matrix'].size}")
print(f"   - Mean vulnerability (simulated): {np.mean(dense_results_linear['matrix']):.3f}")

print("\n4. SIMULATED SPARSE EVALUATION (33% coverage):")
for strategy in strategies:
    results = comparison_results['strategies'][strategy]['33%']
    print(f"   {strategy.capitalize()}:")
    print(f"     - Tests: {results['efficiency']['tests_run']}")
    print(f"     - Spearman ρ (on simulated data): {results['comparison']['ranking_correlation']:.3f}")
    print(f"     - RMSE (on simulated data): {results['comparison']['vulnerability_rmse']:.3f}")

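print("\n5. SIMULATION OBSERVATIONS:")
print(f"   - Theoretical cost reduction: {(1 - 0.33) * 100:.0f}%")
# Report the best strategy from the computed results rather than hardcoding it
best_strategy = max(
    strategies,
    key=lambda s: comparison_results['strategies'][s]['33%']['comparison']['ranking_correlation']
)
print(f"   - In this simulation: {best_strategy.capitalize()} sampling had the highest Spearman ρ")
print("   - Correlation values shown are from random simulated data")
print(f"   - Critical paths identified: {len(analysis['position_analysis']['critical_edges'])}")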

print("\n" + "=" * 60)
print("IMPORTANT: This demo uses simulated data (np.random.beta)")
print("Actual performance will vary with real API calls and data.")
print("=" * 60)

## Next Steps

1. **Production Deployment**:
   - Set up API keys for OpenAI/Anthropic
   - Run actual LLM evaluations
   - Scale to larger agent chains

2. **Extended Analysis**:
   - Test all 3 chain topologies
   - Evaluate all 10 attack scenarios
   - Compare multiple coverage levels

3. **Advanced Features**:
   - Implement adaptive sampling
   - Add real-time monitoring
   - Integrate with CI/CD pipelines

4. **Research Extensions**:
   - Explore other IRT models beyond Rasch
   - Investigate transfer learning across chains
   - Develop attack-specific sampling strategies