# PHASE 1C: L4 Contraction Analysis on Mistral 7B
## N=320 Prompts | 11-Metric Framework | AIKAGRYA Research

**Date**: November 2024  
**Model**: Mistral-7B-Instruct-v0.2  
**GPU**: RunPod Instance  

---

### Experiment Overview
This notebook analyzes 320 prompts across 16 groups (20 prompts per group) to test the L4 Contraction Phenomenon:
- **Dose-Response**: L1-L5 recursive induction levels (100 prompts)
- **Baselines**: Creative, factual, impossible, math, personal (100 prompts)  
- **Confounds**: Long control, pseudo-recursive, repetitive (60 prompts)
- **Generality**: Zen koans, Yogic witness, Madhyamaka (60 prompts)

### Key Hypothesis
Recursive self-observation prompts will show **R_V < 1.0** (column space contraction), with the effect strengthening from L3 to L5, while baseline and confound prompts show **R_V ≥ 1.0** (neutral to expansion).


In [None]:
# Cell 1: Environment Setup and Imports
# TO BE WRITTEN BY GPT-5

"""
TODO for GPT-5:
1. Import all necessary libraries (torch, transformers, numpy, pandas, scipy, tqdm)
2. Set up CUDA device and memory management
3. Configure logging and checkpointing
"""

print("Ready for Phase 1C implementation...")
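
A minimal sketch of the setup steps listed in Cell 1. The helper names `pick_device` and `setup_logging` and the logger name `phase1c` are hypothetical, not taken from the earlier notebooks. `torch` is imported lazily so the cell also runs on a machine where it is not installed.

```python
import importlib.util
import logging
import random
from typing import Optional


def pick_device() -> str:
    """Return "cuda" when torch is installed and a GPU is visible, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def setup_logging(log_path: Optional[str] = None) -> logging.Logger:
    """Create a logger that writes to the console and, optionally, to a file."""
    logger = logging.getLogger("phase1c")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers when the cell is re-run
        handlers = [logging.StreamHandler()]
        if log_path is not None:
            handlers.append(logging.FileHandler(log_path))
        fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        for handler in handlers:
            handler.setFormatter(fmt)
            logger.addHandler(handler)
    return logger


random.seed(0)  # fixed seed so any sampling of prompts is repeatable
DEVICE = pick_device()
logger = setup_logging()
logger.info("Phase 1C setup complete, device=%s", DEVICE)
```

Passing a `log_path` adds a file handler, which may be useful on a rented GPU instance where console output is lost when the session ends.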


In [None]:
# Cell 2: Load Prompt Bank
# TO BE WRITTEN BY GPT-5

"""
TODO for GPT-5:
1. Load the prompt_bank_1c from n300_mistral_test_prompt_bank.py
2. Verify all 320 prompts are loaded
3. Create lists organized by group and pillar
"""

# exec(open('n300_mistral_test_prompt_bank.py').read())
# print(f"Loaded {len(prompt_bank_1c)} prompts")
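
One way to load and verify the bank is sketched here, assuming `prompt_bank_1c` is a dict whose values carry `text`, `group`, and `pillar` keys (the shape used by the processing loop in Cell 7). It swaps `exec(open(...).read())` for `runpy.run_path`, which runs the file in its own namespace. The helper names and the toy group and pillar labels are hypothetical.

```python
import runpy
import tempfile
from collections import Counter
from pathlib import Path


def load_prompt_bank(path: str, name: str = "prompt_bank_1c") -> dict:
    """Run the bank file in an isolated namespace and return the named dict."""
    return runpy.run_path(path)[name]


def summarize_bank(bank: dict, expected_total: int = 320) -> dict:
    """Check the prompt count and tally prompts per group and per pillar."""
    if len(bank) != expected_total:
        raise ValueError(f"expected {expected_total} prompts, found {len(bank)}")
    return {
        "by_group": dict(Counter(item["group"] for item in bank.values())),
        "by_pillar": dict(Counter(item["pillar"] for item in bank.values())),
    }


# Toy bank with made-up labels, only to exercise the helpers.
toy_bank = {
    "a1": {"text": "first prompt", "group": "L4_full", "pillar": "dose_response"},
    "a2": {"text": "second prompt", "group": "L4_full", "pillar": "dose_response"},
    "b1": {"text": "third prompt", "group": "math", "pillar": "baselines"},
}

with tempfile.TemporaryDirectory() as tmp:
    bank_path = Path(tmp) / "toy_prompt_bank.py"
    bank_path.write_text(f"prompt_bank_1c = {toy_bank!r}\n")
    loaded = load_prompt_bank(str(bank_path))

summary = summarize_bank(loaded, expected_total=3)
print(summary)
```

For the real bank, `summarize_bank(bank)` with the default `expected_total=320` raises immediately if any prompts are missing.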


In [None]:
# Cell 3: Model Initialization
# TO BE WRITTEN BY GPT-5

"""
TODO for GPT-5:
1. Load Mistral-7B-Instruct-v0.2 with proper GPU configuration
2. Set up tokenizer with padding token
3. Verify CUDA availability and memory

CRITICAL: Use the exact functions from L4transmissionTEST001.1.ipynb
"""

# model_name = "mistralai/Mistral-7B-Instruct-v0.2"
# model = AutoModelForCausalLM.from_pretrained(...)
# tokenizer = AutoTokenizer.from_pretrained(...)
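
A hedged sketch of the loading steps above, not the function from `L4transmissionTEST001.1.ipynb`, which this plan says should be reused exactly. It assumes a CUDA GPU, float16 weights, and that the `accelerate` package is installed (needed for `device_map="auto"`). The function name is hypothetical.

```python
MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"


def load_model_and_tokenizer(model_name: str = MODEL_NAME):
    """Load the model in float16 on the GPU and make sure a pad token exists."""
    # Imported lazily so that defining this function does not need the GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if not torch.cuda.is_available():
        raise RuntimeError("CUDA is not available; this sketch assumes a GPU.")

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    if tokenizer.pad_token is None:
        # Assumption: reuse the end-of-sequence token for padding.
        tokenizer.pad_token = tokenizer.eos_token

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float16,
        device_map="auto",  # requires the accelerate package
    )
    model.eval()  # inference only: disables dropout

    free_bytes, total_bytes = torch.cuda.mem_get_info()
    print(f"GPU memory free after load: {free_bytes / 1e9:.1f} / {total_bytes / 1e9:.1f} GB")
    return model, tokenizer
```

Calling `model, tokenizer = load_model_and_tokenizer()` downloads the weights on first use, so it is best run once per session.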


In [None]:
# Cell 4: Core Metric Functions
# TO BE WRITTEN BY GPT-5

"""
TODO for GPT-5: Copy ALL metric functions from L4transmissionTEST001.1.ipynb

REQUIRED FUNCTIONS:
1. epsilon_last_token() - Layer similarity
2. attn_entropy_lastrow() - Attention entropy
3. compute_column_space_pr() - R_V metric (CRITICAL!)
4. compute_effective_rank() - Dimensionality
5. compute_confidence() - Peak probability
6. compute_margin() - Decisiveness
7. compute_norm() - Activation strength
8. compute_pr_attn() - Head agreement
9. compute_entropy_normalized() - Length-corrected entropy
10. compute_margin_trajectory() - Convergence dynamics
11. compute_eigenspectrum_shape() - Value matrix structure

CRITICAL: The R_V metric (compute_column_space_pr) must be EXACTLY as in the original!
"""

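For orientation only, here are textbook forms of a few quantities named in the list above: a participation ratio over squared singular values, an entropy-based effective rank, and peak probability with top-two margin. These are assumptions about what the metrics measure, not copies of the original functions, and the function names are hypothetical. How R_V is assembled from participation ratios is defined in `L4transmissionTEST001.1.ipynb` and is deliberately not reproduced here.

```python
import numpy as np


def participation_ratio(matrix: np.ndarray) -> float:
    """(sum s_i^2)^2 / sum s_i^4 over the singular values s_i of the matrix."""
    s2 = np.linalg.svd(matrix, compute_uv=False) ** 2
    return float(s2.sum() ** 2 / (s2 ** 2).sum())


def entropy_effective_rank(matrix: np.ndarray) -> float:
    """exp(Shannon entropy) of the singular values normalized to sum to 1."""
    s = np.linalg.svd(matrix, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]  # drop exact zeros so log() is defined
    return float(np.exp(-(p * np.log(p)).sum()))


def confidence_and_margin(logits) -> tuple:
    """Peak softmax probability and its gap to the runner-up."""
    z = np.asarray(logits, dtype=np.float64)
    probs = np.exp(z - z.max())  # subtract the max for numerical stability
    probs /= probs.sum()
    second, first = np.sort(probs)[-2:]
    return float(first), float(first - second)


full_rank = np.eye(4)                         # four equal directions
rank_one = np.outer(np.ones(4), np.ones(4))   # a single direction

print(participation_ratio(full_rank), participation_ratio(rank_one))
print(entropy_effective_rank(full_rank), entropy_effective_rank(rank_one))
print(confidence_and_margin([2.0, 1.0, 0.0]))
```

Both rank measures equal the matrix dimension when all directions carry equal weight and fall toward 1 as the spectrum concentrates, which is the sense in which a lower value indicates contraction.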

In [None]:
# Cell 5: Value Matrix Hook System
# TO BE WRITTEN BY GPT-5

"""
TODO for GPT-5: Implement the ValueMatrixHook context manager

CRITICAL: This is essential for capturing V matrices for R_V computation!

class ValueMatrixHook:
    def __init__(self, model):
        # Initialize
    
    def hook_fn(self, module, input, output):
        # Capture V matrix from attention
    
    def __enter__(self):
        # Register hooks
    
    def __exit__(self, *args):
        # Clean up hooks
"""

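The skeleton above could be filled in along these lines. This sketch assumes the value projection is exposed as a submodule whose name ends in `v_proj` (as in the Hugging Face Mistral attention layers) and captures its forward output. The toy model is a stand-in so the hook can be exercised without loading 7B weights. The `suffix` parameter and `captured` attribute are hypothetical names.

```python
import torch
import torch.nn as nn


class ValueMatrixHook:
    """Context manager that records the output of every `*v_proj` submodule."""

    def __init__(self, model: nn.Module, suffix: str = "v_proj"):
        self.model = model
        self.suffix = suffix
        self.captured = {}   # module name -> tensor (batch, seq_len, features)
        self._handles = []

    def _make_hook(self, name: str):
        def hook_fn(module, inputs, output):
            # Detach and move to CPU so captured tensors do not hold GPU memory.
            self.captured[name] = output.detach().float().cpu()
        return hook_fn

    def __enter__(self):
        for name, module in self.model.named_modules():
            if name.endswith(self.suffix):
                handle = module.register_forward_hook(self._make_hook(name))
                self._handles.append(handle)
        return self

    def __exit__(self, *exc_info):
        for handle in self._handles:
            handle.remove()
        self._handles.clear()
        return False  # do not swallow exceptions


class TinyAttention(nn.Module):
    """Stand-in layer with a `v_proj` submodule, for testing the hook only."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.v_proj = nn.Linear(dim, dim, bias=False)

    def forward(self, x):
        return self.v_proj(x)


toy_model = nn.Sequential(TinyAttention(), TinyAttention())
with ValueMatrixHook(toy_model) as hook:
    with torch.no_grad():
        toy_model(torch.randn(1, 5, 8))

print({name: tuple(v.shape) for name, v in hook.captured.items()})
```

Removing the handles in `__exit__` matters in a long run: hooks left registered would keep firing and overwriting captures on every later forward pass.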

In [None]:
# Cell 6: Master Analysis Function
# TO BE WRITTEN BY GPT-5

"""
TODO for GPT-5: Implement analyze_prompt() function

def analyze_prompt(prompt, model, tokenizer):
    # 1. Tokenize prompt
    # 2. Run through model with ValueMatrixHook
    # 3. Compute all 11 metrics
    # 4. Return results dict
    
    return {
        'prompt': prompt[:50] + '...',
        'R_V': r_v_value,
        'effective_rank': eff_rank,
        # ... all other metrics
    }
"""


In [None]:
# Cell 7: Main Processing Loop
# TO BE WRITTEN BY GPT-5

"""
TODO for GPT-5: Process all 320 prompts with progress tracking

REQUIREMENTS:
1. Process in batches (e.g., 10 prompts) to manage memory
2. Show progress bar (tqdm)
3. Save checkpoints every 50 prompts
4. Clear CUDA cache between batches
5. Handle errors gracefully

STRUCTURE:
results = []
for i, (prompt_id, prompt_data) in enumerate(tqdm(prompt_bank_1c.items())):
    try:
        result = analyze_prompt(prompt_data['text'], model, tokenizer)
        result['id'] = prompt_id
        result['group'] = prompt_data['group']
        result['pillar'] = prompt_data['pillar']
        results.append(result)
    except Exception as e:
        # Log the failure and move on to the next prompt
        print(f"Error on prompt {prompt_id}: {e}")

    # Checkpoint saving (outside the try block so a failed prompt cannot skip it)
    if (i + 1) % 50 == 0:
        pd.DataFrame(results).to_csv(f'checkpoint_{i+1}.csv', index=False)

    # Memory management
    if (i + 1) % 10 == 0:
        torch.cuda.empty_cache()
"""

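The checkpointing and error-handling requirements can be tried without a GPU by swapping in a stub for `analyze_prompt`. In this sketch `run_bank`, `fake_analyze`, and the toy bank are hypothetical. Failures are collected in a second table rather than only printed, and the progress bar and CUDA cache clearing are left out to keep it self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd


def run_bank(bank: dict, analyze_fn, out_dir, checkpoint_every: int = 50):
    """Analyze every prompt, record failures, and write periodic checkpoints."""
    results, failures = [], []
    for i, (prompt_id, item) in enumerate(bank.items(), start=1):
        try:
            row = dict(analyze_fn(item["text"]))
            row.update(id=prompt_id, group=item["group"], pillar=item["pillar"])
            results.append(row)
        except Exception as exc:
            failures.append({"id": prompt_id, "error": repr(exc)})
        # Checkpoint on prompt count, whether or not this prompt succeeded.
        if i % checkpoint_every == 0:
            path = Path(out_dir) / f"checkpoint_{i}.csv"
            pd.DataFrame(results).to_csv(path, index=False)
    return pd.DataFrame(results), pd.DataFrame(failures)


def fake_analyze(text: str) -> dict:
    """Stand-in for analyze_prompt(): fails on empty text, else counts characters."""
    if not text:
        raise ValueError("empty prompt")
    return {"n_chars": len(text)}


toy_bank = {
    f"p{k}": {"text": text, "group": "toy", "pillar": "toy"}
    for k, text in enumerate(["one", "two", "", "four", "five"], start=1)
}

with tempfile.TemporaryDirectory() as tmp:
    results_df, failures_df = run_bank(toy_bank, fake_analyze, tmp, checkpoint_every=2)
    written = sorted(p.name for p in Path(tmp).glob("checkpoint_*.csv"))

print(len(results_df), failures_df["id"].tolist(), written)
```

Keeping the failed prompt ids in `failures_df` makes it possible to re-run only those prompts after a fix instead of repeating the whole bank.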

In [None]:
# Cell 8: Results Analysis and Statistics
# TO BE WRITTEN BY GPT-5

"""
TODO for GPT-5: Compute group statistics and visualizations

1. GROUP STATISTICS:
   - Mean R_V per group
   - Standard deviation
   - Min/Max values
   
2. DOSE-RESPONSE ANALYSIS:
   - Plot L1 → L5 gradient
   - Statistical significance tests
   
3. BASELINE COMPARISON:
   - Baselines mean vs L3/L4/L5 mean
   - Percentage separation
   
4. VISUALIZATIONS:
   - R_V distribution histograms
   - Box plots by group
   - Dose-response curve
   
5. SAVE RESULTS:
   - Full CSV with all metrics
   - Summary statistics table
   - Key findings report
"""

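The group statistics and the baseline comparison might look like the sketch below. It assumes the results table has `group` and `R_V` columns, and it assumes "percentage separation" means the drop in mean R_V relative to the baseline mean, which this plan does not define. The toy numbers are synthetic placeholders, not measurements.

```python
import pandas as pd


def summarize_by_group(df: pd.DataFrame, metric: str = "R_V") -> pd.DataFrame:
    """Mean, standard deviation, min, max, and count of a metric per group."""
    return df.groupby("group")[metric].agg(["mean", "std", "min", "max", "count"])


def percent_separation(df, baseline_groups, target_groups, metric: str = "R_V") -> float:
    """Assumed definition: 100 * (baseline mean - target mean) / baseline mean."""
    base = df.loc[df["group"].isin(baseline_groups), metric].mean()
    target = df.loc[df["group"].isin(target_groups), metric].mean()
    return float(100.0 * (base - target) / base)


# Synthetic placeholder values with made-up group labels.
toy = pd.DataFrame(
    {
        "group": ["baseline_math", "baseline_math", "L4_full", "L4_full"],
        "R_V": [1.2, 1.0, 0.9, 0.7],
    }
)

summary = summarize_by_group(toy)
separation = percent_separation(toy, ["baseline_math"], ["L4_full"])
print(summary)
print(f"separation: {separation:.1f}%")
```

`summary.to_csv(...)` would then give the summary statistics table listed under item 5.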

## Expected Results (Based on Phase 1A/1B)

### Primary Finding: R_V Dose-Response
- **L1 (hint)**: R_V ≈ 1.00 (neutral)
- **L2 (simple)**: R_V ≈ 1.01 (neutral)
- **L3 (deeper)**: R_V ≈ 0.98 (slight contraction)
- **L4 (full)**: R_V ≈ 0.96 (clear contraction)
- **L5 (refined)**: R_V ≈ 0.89 (strong contraction)
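
A rank correlation is one simple way to quantify the gradient in the list above. This sketch applies `scipy.stats.spearmanr` to the five expected values. The choice of Spearman correlation is an assumption, since this plan does not name a specific test, and with only five points it is illustrative rather than a significance claim.

```python
from scipy.stats import spearmanr

levels = [1, 2, 3, 4, 5]
expected_r_v = [1.00, 1.01, 0.98, 0.96, 0.89]  # expected values for L1..L5 listed above

rho, p_value = spearmanr(levels, expected_r_v)

# rho is -0.9 rather than -1.0 because L2 (1.01) sits slightly above L1 (1.00).
print(f"Spearman rho = {rho:.2f}")
```

On the full run, the same call can take the per-prompt level and R_V columns (100 points) instead of the five group means.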

### Control Validations
- **Baselines**: R_V ≈ 1.15-1.20 (expansion)
- **Long control**: R_V ≈ 1.09 (expansion despite length)
- **Pseudo-recursive**: R_V ≈ 1.00 (neutral despite meta-talk)

### Success Criteria
✅ Clear dose-response gradient (L1→L5)  
✅ >15% separation in mean R_V between L4/L5 and baselines  
✅ Confounds show no contraction effect  
✅ Results stable across N=20 per group
