# AI-Powered Reaction Curation

This notebook demonstrates how to use AICurationUtils for AI-powered analysis and curation of metabolic reactions and compounds.

## Overview

AICurationUtils provides AI-powered tools for:
- Analyzing reaction directionality
- Categorizing reaction stoichiometry
- Evaluating reaction equivalence
- Assessing gene-reaction associations
- Curating biochemical compound records

## Backends

Two AI backends are supported:
- **Argo**: Remote API service (default)
- **Claude Code**: Local CLI execution

## 1. Setup and Configuration

In [None]:
%run util.py

# Check current configuration
print("AI Curation Configuration:")
print("=" * 60)
backend = util.get_config_value("ai_curation.backend", default="argo")
executable = util.get_config_value("ai_curation.claude_code_executable", default="claude-code")
print(f"Backend: {backend}")
print(f"Claude Code executable: {executable}")
print()
print("To change configuration, edit ~/.kbutillib/config.yaml:")
print("  ai_curation:")
print("    backend: 'argo'  # or 'claude-code'")
print("    claude_code_executable: 'claude-code'")
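
The dotted key passed to `util.get_config_value` suggests a lookup inside nested YAML sections. The following is a minimal sketch of that idea, not the kbutillib implementation: the `lookup` helper and `example_config` dictionary are hypothetical names introduced only for illustration.

```python
from typing import Any


def lookup(config: dict, dotted_key: str, default: Any = None) -> Any:
    """Resolve a dotted key such as 'ai_curation.backend' in a nested dict.

    Hypothetical helper: the real get_config_value may behave differently.
    """
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


# Mirrors the structure printed above for ~/.kbutillib/config.yaml
example_config = {"ai_curation": {"backend": "claude-code"}}

backend_value = lookup(example_config, "ai_curation.backend", default="argo")
executable_value = lookup(
    example_config, "ai_curation.claude_code_executable", default="claude-code"
)
print(f"Backend: {backend_value}")
print(f"Executable: {executable_value}")
```

A missing section or key falls back to the default, which matches how the cell above supplies `default="argo"`.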

## 2. Initialize with Argo Backend

First, let's test with the Argo backend (remote API).

In [None]:
from kbutillib import AICurationUtils

# Initialize with Argo backend
util_argo = AICurationUtils(backend="argo", proxy_port=None)

print(f"Backend: {util_argo.ai_backend}")
print(f"Model: {util_argo.model}")
print(f"Environment: {util_argo.env}")

## 3. Initialize with Claude Code Backend

Now let's try the Claude Code backend (local execution).

**Note**: This requires Claude Code CLI to be installed.

In [None]:
# Try to initialize with Claude Code backend
try:
    util_claude = AICurationUtils(backend="claude-code", proxy_port=None)
    print(f"Backend: {util_claude.ai_backend}")
    print(f"Executable: {util_claude.claude_code_executable}")
    print("Claude Code backend is available!")
    claude_available = True
except FileNotFoundError as e:
    print("Claude Code not found:", e)
    print("Install Claude Code to use this backend.")
    print("For this notebook, we'll use only the Argo backend.")
    util_claude = None
    claude_available = False
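
Before constructing the Claude Code backend, it can help to check whether the executable is on `PATH` at all. This is a small sketch using the standard library only; the name `cli_on_path` is hypothetical, and whether `AICurationUtils` performs the same check internally is an assumption.

```python
import shutil


def cli_on_path(executable: str = "claude-code") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


# "claude-code" is the default executable name used in the configuration cell
if cli_on_path("claude-code"):
    print("claude-code found on PATH")
else:
    print("claude-code not found; only the Argo backend can be used")
```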

## 4. Load a Test Model

Load a small metabolic model for testing.

In [None]:
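import cobra

# Newer COBRApy releases expose bundled models through cobra.io.load_model;
# older releases provide cobra.test.create_test_model instead.
try:
    from cobra.io import load_model
    model = load_model("textbook")
except ImportError:
    from cobra.test import create_test_model
    model = create_test_model("textbook")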

print(f"Model: {model.id}")
print(f"Reactions: {len(model.reactions)}")
print(f"Metabolites: {len(model.metabolites)}")
print(f"Genes: {len(model.genes)}")
print()
print("Sample reactions:")
for rxn in model.reactions[:5]:
    print(f"  {rxn.id}: {rxn.reaction}")

## 5. Test Basic Chat Functionality

Test the basic chat interface with both backends.

In [None]:
# Test with Argo backend
print("Testing Argo Backend:")
print("=" * 60)

'''
system_msg = "You are an expert in biochemistry. Respond in JSON format."
prompt = '{"question": "What is ATP?"}'

try:
    response = util_argo.chat(prompt=prompt, system=system_msg)
    print(f"Prompt: {prompt}")
    print(f"Response: {response[:200]}...")
except Exception as e:
    print(f"Error: {e}")
'''

print("Uncomment code above to test Argo chat (requires network access)")
print()

In [None]:
# Test with Claude Code backend (if available)
if claude_available:
    print("Testing Claude Code Backend:")
    print("=" * 60)
    
    '''
    system_msg = "You are an expert in biochemistry. Respond in JSON format."
    prompt = '{"question": "What is NAD+?"}'
    
    try:
        response = util_claude.chat(prompt=prompt, system=system_msg)
        print(f"Prompt: {prompt}")
        print(f"Response: {response[:200]}...")
    except Exception as e:
        print(f"Error: {e}")
    '''
    
    print("Uncomment code above to test Claude Code chat")
else:
    print("Claude Code backend not available")
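
Because the system message asks for JSON and the cells above treat the response as a string, the reply usually needs parsing before use. The sketch below assumes the reply is plain text that may wrap the JSON object in extra prose; `extract_json_object` is a hypothetical helper, not part of AICurationUtils.

```python
import json


def extract_json_object(text: str):
    """Return the first-to-last-brace span of text parsed as JSON, or None."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None


# Illustrative reply string, not real model output
demo_reply = 'Here is the answer: {"answer": "adenosine triphosphate"} Hope this helps.'
parsed_reply = extract_json_object(demo_reply)
print(parsed_reply)
```

Returning `None` instead of raising keeps the calling cell simple: it can skip or retry a reaction when the reply is not valid JSON.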

## 6. Analyze Reaction Directionality

Evaluate the biological directionality of reactions.

In [None]:
# Select a test reaction
test_rxn = model.reactions.PGI  # Glucose-6-phosphate isomerase

print("Test Reaction:")
print("=" * 60)
print(f"ID: {test_rxn.id}")
print(f"Name: {test_rxn.name}")
print(f"Equation: {test_rxn.reaction}")
print(f"Bounds: [{test_rxn.lower_bound}, {test_rxn.upper_bound}]")
print()

In [None]:
# Analyze directionality with Argo backend
print("Analyzing Directionality (Argo Backend):")
print("=" * 60)

'''
try:
    result_argo = util_argo.analyze_reaction_directionality(test_rxn)
    
    if result_argo:
        print(f"Directionality: {result_argo['directionality']}")
        print(f"Confidence: {result_argo['confidence']}")
        print(f"Errors: {result_argo['errors']}")
        print(f"Comments: {result_argo['other_comments']}")
    else:
        print("Skipped (utility reaction)")
except Exception as e:
    print(f"Error: {e}")
'''

print("Uncomment code above to analyze with Argo (may take 30-60 seconds)")
print("Results are cached for future use")
print()

In [None]:
# Analyze directionality with Claude Code backend
if claude_available:
    print("Analyzing Directionality (Claude Code Backend):")
    print("=" * 60)
    
    '''
    try:
        result_claude = util_claude.analyze_reaction_directionality(test_rxn)
        
        if result_claude:
            print(f"Directionality: {result_claude['directionality']}")
            print(f"Confidence: {result_claude['confidence']}")
            print(f"Errors: {result_claude['errors']}")
            print(f"Comments: {result_claude['other_comments']}")
        else:
            print("Skipped (utility reaction)")
    except Exception as e:
        print(f"Error: {e}")
    '''
    
    print("Uncomment code above to analyze with Claude Code")
else:
    print("Claude Code backend not available")
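
One way to use a directionality result is to compare it with the bounds already set in the model. The sketch below assumes the labels are `"forward"`, `"reverse"`, and `"reversible"`; the vocabulary actually returned by `analyze_reaction_directionality` may differ, and `bounds_disagree` is a hypothetical helper.

```python
# Assumed label vocabulary; adjust to whatever the analysis actually returns.
BOUNDS_BY_LABEL = {
    "forward": (0.0, 1000.0),
    "reverse": (-1000.0, 0.0),
    "reversible": (-1000.0, 1000.0),
}


def bounds_disagree(label: str, lower_bound: float, upper_bound: float) -> bool:
    """Return True when the model bounds permit different flux directions than the label."""
    expected = BOUNDS_BY_LABEL.get(label)
    if expected is None:
        raise ValueError(f"Unknown directionality label: {label!r}")
    expected_lower, expected_upper = expected
    reverse_mismatch = (lower_bound < 0) != (expected_lower < 0)
    forward_mismatch = (upper_bound > 0) != (expected_upper > 0)
    return reverse_mismatch or forward_mismatch


# A reaction with bounds [-1000, 1000] checked against two assumed labels
print(bounds_disagree("reversible", -1000.0, 1000.0))
print(bounds_disagree("forward", -1000.0, 1000.0))
```

With a real result, the call would take `test_rxn.lower_bound` and `test_rxn.upper_bound` together with the returned `directionality` value.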

## 7. Analyze Reaction Stoichiometry

Categorize reaction stoichiometry into primary, cofactor, and minor components.

In [None]:
# Select a reaction with cofactors
test_rxn2 = model.reactions.GAPD  # Glyceraldehyde-3-phosphate dehydrogenase

print("Test Reaction:")
print("=" * 60)
print(f"ID: {test_rxn2.id}")
print(f"Name: {test_rxn2.name}")
print(f"Equation: {test_rxn2.reaction}")
print()
print("This reaction involves:")
print("- Primary: Glyceraldehyde-3-phosphate <-> 1,3-bisphosphoglycerate")
print("- Cofactor: NAD+ -> NADH")
print("- Minor: Phosphate, H+")
print()

In [None]:
# Analyze stoichiometry with Argo backend
print("Analyzing Stoichiometry (Argo Backend):")
print("=" * 60)

'''
try:
    result_argo = util_argo.analyze_reaction_stoichiometry(test_rxn2)
    
    if result_argo:
        print("Primary Stoichiometry:")
        for cpd, coef in result_argo['primary_stoichiometry'].items():
            print(f"  {cpd}: {coef}")
        
        print("\nCofactor Stoichiometry:")
        for cpd, coef in result_argo['cofactor_stoichiometry'].items():
            print(f"  {cpd}: {coef}")
        
        print("\nMinor Stoichiometry:")
        for cpd, coef in result_argo['minor_stoichiometry'].items():
            print(f"  {cpd}: {coef}")
        
        print(f"\nPrimary Chemistry: {result_argo['primary_chemistry']}")
        print(f"Confidence: {result_argo['confidence']}")
    else:
        print("Skipped (utility reaction)")
except Exception as e:
    print(f"Error: {e}")
'''

print("Uncomment code above to analyze stoichiometry with Argo")
print()

In [None]:
# Analyze stoichiometry with Claude Code backend
if claude_available:
    print("Analyzing Stoichiometry (Claude Code Backend):")
    print("=" * 60)
    
    '''
    try:
        result_claude = util_claude.analyze_reaction_stoichiometry(test_rxn2)
        
        if result_claude:
            print("Primary Stoichiometry:")
            for cpd, coef in result_claude['primary_stoichiometry'].items():
                print(f"  {cpd}: {coef}")
            
            print("\nCofactor Stoichiometry:")
            for cpd, coef in result_claude['cofactor_stoichiometry'].items():
                print(f"  {cpd}: {coef}")
            
            print("\nMinor Stoichiometry:")
            for cpd, coef in result_claude['minor_stoichiometry'].items():
                print(f"  {cpd}: {coef}")
            
            print(f"\nPrimary Chemistry: {result_claude['primary_chemistry']}")
            print(f"Confidence: {result_claude['confidence']}")
        else:
            print("Skipped (utility reaction)")
    except Exception as e:
        print(f"Error: {e}")
    '''
    
    print("Uncomment code above to analyze stoichiometry with Claude Code")
else:
    print("Claude Code backend not available")
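
A simple sanity check on a stoichiometry result is that the three categories together reproduce the full reaction. The sketch below assumes each category maps metabolite IDs to signed coefficients, which may not match the actual return format; `mock_result` is a hand-written placeholder, not model output, and the helper names are hypothetical.

```python
CATEGORY_KEYS = ("primary_stoichiometry", "cofactor_stoichiometry", "minor_stoichiometry")


def merge_categories(result: dict) -> dict:
    """Sum coefficients across the three stoichiometry categories."""
    merged = {}
    for key in CATEGORY_KEYS:
        for cpd, coef in result.get(key, {}).items():
            merged[cpd] = merged.get(cpd, 0) + coef
    return merged


def categories_cover_reaction(result: dict, full_stoichiometry: dict, tol: float = 1e-9) -> bool:
    """Return True if the merged categories equal the full stoichiometry."""
    merged = merge_categories(result)
    if set(merged) != set(full_stoichiometry):
        return False
    return all(abs(merged[c] - full_stoichiometry[c]) <= tol for c in merged)


# GAPD written out by hand: g3p_c + nad_c + pi_c <=> 13dpg_c + h_c + nadh_c
gapd_stoichiometry = {"g3p_c": -1, "nad_c": -1, "pi_c": -1, "13dpg_c": 1, "h_c": 1, "nadh_c": 1}

# Placeholder split following the categories described above
mock_result = {
    "primary_stoichiometry": {"g3p_c": -1, "13dpg_c": 1},
    "cofactor_stoichiometry": {"nad_c": -1, "nadh_c": 1},
    "minor_stoichiometry": {"pi_c": -1, "h_c": 1},
}

print(categories_cover_reaction(mock_result, gapd_stoichiometry))
```

For a COBRApy reaction, the full stoichiometry can be built with `{m.id: c for m, c in test_rxn2.metabolites.items()}`.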

## 8. Evaluate Reaction Equivalence

Compare two reactions to determine if they are equivalent.

In [None]:
# Select two reactions to compare
rxn1 = model.reactions.PGI  # Glucose-6-phosphate isomerase
rxn2 = model.reactions.PFK  # Phosphofructokinase

print("Comparing Reactions:")
print("=" * 60)
print(f"Reaction 1: {rxn1.id}")
print(f"  {rxn1.reaction}")
print()
print(f"Reaction 2: {rxn2.id}")
print(f"  {rxn2.reaction}")
print()

# Evidence for comparison
comparison_evidence = {
    "note": "Both are glycolysis reactions but catalyze different steps"
}

print(f"Evidence: {comparison_evidence}")
print()

In [None]:
# Evaluate equivalence with Argo backend
print("Evaluating Equivalence (Argo Backend):")
print("=" * 60)

'''
try:
    result_argo = util_argo.evaluate_reaction_equivalence(rxn1, rxn2, comparison_evidence)
    
    if result_argo:
        print(f"Equivalence: {result_argo['equivalence']}")
        print(f"Explanation: {result_argo['explanation']}")
    else:
        print("Skipped (utility reaction)")
except Exception as e:
    print(f"Error: {e}")
'''

print("Uncomment code above to evaluate equivalence with Argo")
print("Expected result: 'different' (different steps in glycolysis)")
print()

In [None]:
# Evaluate equivalence with Claude Code backend
if claude_available:
    print("Evaluating Equivalence (Claude Code Backend):")
    print("=" * 60)
    
    '''
    try:
        result_claude = util_claude.evaluate_reaction_equivalence(rxn1, rxn2, comparison_evidence)
        
        if result_claude:
            print(f"Equivalence: {result_claude['equivalence']}")
            print(f"Explanation: {result_claude['explanation']}")
        else:
            print("Skipped (utility reaction)")
    except Exception as e:
        print(f"Error: {e}")
    '''
    
    print("Uncomment code above to evaluate equivalence with Claude Code")
else:
    print("Claude Code backend not available")
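
Before spending an AI query on an equivalence check, a purely stoichiometric comparison can settle the obvious cases. This is a sketch of such a pre-check; the labels `identical`, `reversed`, and `different` are chosen here for illustration and are not necessarily the values `evaluate_reaction_equivalence` returns.

```python
def compare_stoichiometry(stoich_a: dict, stoich_b: dict) -> str:
    """Classify two {metabolite_id: coefficient} maps by exact comparison."""
    if stoich_a == stoich_b:
        return "identical"
    if stoich_a == {cpd: -coef for cpd, coef in stoich_b.items()}:
        return "reversed"
    return "different"


# Hand-written stoichiometries for the two reactions compared above
pgi_stoichiometry = {"g6p_c": -1, "f6p_c": 1}
pfk_stoichiometry = {"atp_c": -1, "f6p_c": -1, "adp_c": 1, "fdp_c": 1, "h_c": 1}

print(compare_stoichiometry(pgi_stoichiometry, pfk_stoichiometry))
```

Exact comparison misses cases such as differing protonation states or compartment suffixes, which is where the AI-based evaluation is meant to help.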

## 9. Evaluate Gene-Reaction Association

Assess whether a reaction should be associated with a gene based on functional data.

In [None]:
# Select a reaction and gene
test_rxn3 = model.reactions.GAPD

print("Test Reaction:")
print("=" * 60)
print(f"ID: {test_rxn3.id}")
print(f"Name: {test_rxn3.name}")
print(f"Equation: {test_rxn3.reaction}")
print()

# Mock gene data
gene_data = {
    "ID": "b1779",
    "name": "gapA",
    "function": "glyceraldehyde-3-phosphate dehydrogenase",
    "ec_number": "1.2.1.12",
    "description": "Catalyzes the oxidation and phosphorylation of G3P to 1,3-bisphosphoglycerate"
}

print("Gene Data:")
for key, value in gene_data.items():
    print(f"  {key}: {value}")
print()

In [None]:
# Evaluate gene association with Argo backend
print("Evaluating Gene Association (Argo Backend):")
print("=" * 60)

'''
try:
    result_argo = util_argo.evaluate_reaction_gene_association(test_rxn3, gene_data)
    
    if result_argo:
        print(f"Association: {result_argo['association']}")
        print(f"Explanation: {result_argo['explanation']}")
    else:
        print("Skipped (utility reaction)")
except Exception as e:
    print(f"Error: {e}")
'''

print("Uncomment code above to evaluate gene association with Argo")
print("Expected result: 'exact' (perfect match)")
print()

In [None]:
# Evaluate gene association with Claude Code backend
if claude_available:
    print("Evaluating Gene Association (Claude Code Backend):")
    print("=" * 60)
    
    '''
    try:
        result_claude = util_claude.evaluate_reaction_gene_association(test_rxn3, gene_data)
        
        if result_claude:
            print(f"Association: {result_claude['association']}")
            print(f"Explanation: {result_claude['explanation']}")
        else:
            print("Skipped (utility reaction)")
    except Exception as e:
        print(f"Error: {e}")
    '''
    
    print("Uncomment code above to evaluate gene association with Claude Code")
else:
    print("Claude Code backend not available")

## 10. Batch Analysis Example

Analyze multiple reactions in a batch.

In [None]:
# Select reactions for batch analysis
batch_reactions = [
    model.reactions.PGI,   # Isomerase (reversible)
    model.reactions.PFK,   # Kinase (irreversible)
    model.reactions.GAPD,  # Dehydrogenase (reversible)
]

print("Batch Analysis Setup:")
print("=" * 60)
print(f"Analyzing {len(batch_reactions)} reactions")
for rxn in batch_reactions:
    print(f"  {rxn.id}: {rxn.name}")
print()

In [None]:
# Batch analysis with Argo backend
print("Batch Directionality Analysis (Argo Backend):")
print("=" * 60)

'''
results = []
for rxn in batch_reactions:
    try:
        result = util_argo.analyze_reaction_directionality(rxn)
        if result:
            results.append({
                'id': rxn.id,
                'name': rxn.name,
                'directionality': result['directionality'],
                'confidence': result['confidence']
            })
            print(f"{rxn.id}: {result['directionality']} ({result['confidence']} confidence)")
    except Exception as e:
        print(f"{rxn.id}: Error - {e}")

print(f"\nAnalyzed {len(results)} reactions successfully")
'''

print("Uncomment code above to run batch analysis with Argo")
print("Note: Results are cached, so repeat runs are fast")
print()
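
Once the batch loop has filled `results`, a short summary makes it easier to see which reactions need manual review. The sketch below assumes `confidence` is a string label such as `"high"`, which the print statement above suggests but does not confirm; the entries in `mock_results` are placeholders, not real model output.

```python
from collections import Counter


def summarize(results: list) -> tuple:
    """Count results per directionality and list IDs below 'high' confidence."""
    by_direction = Counter(r["directionality"] for r in results)
    needs_review = [r["id"] for r in results if r["confidence"] != "high"]
    return by_direction, needs_review


# Placeholder entries with the same keys as the results list built above
mock_results = [
    {"id": "PGI", "name": "demo", "directionality": "reversible", "confidence": "high"},
    {"id": "PFK", "name": "demo", "directionality": "forward", "confidence": "medium"},
    {"id": "GAPD", "name": "demo", "directionality": "reversible", "confidence": "high"},
]

direction_counts, review_ids = summarize(mock_results)
print(dict(direction_counts))
print(f"Needs review: {review_ids}")
```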

## 11. Curate Biochemical Compound

Validate, correct, and enrich compound records using AI expertise in biochemical databases.

In [None]:
# Select a test compound from the model
test_compound = model.metabolites.atp_c  # ATP in cytosol

print("Test Compound:")
print("=" * 60)
print(f"ID: {test_compound.id}")
print(f"Name: {test_compound.name}")
print(f"Formula: {test_compound.formula}")
print(f"Charge: {test_compound.charge}")
print(f"Compartment: {test_compound.compartment}")
print()

# Show annotations if available
if hasattr(test_compound, 'annotation') and test_compound.annotation:
    print("Annotations:")
    for key, value in test_compound.annotation.items():
        print(f"  {key}: {value}")
print()

In [None]:
# Curate compound with Argo backend
print("Curating Compound (Argo Backend):")
print("=" * 60)

'''
try:
    result_argo = util_argo.curate_biochemical_compound(test_compound)
    
    if result_argo:
        print("Compound Curation Results:")
        print()
        
        # Show any changes
        if 'changes' in result_argo and result_argo['changes']:
            print("Changes:")
            for change in result_argo['changes']:
                print(f"  Field: {change['field']}")
                print(f"    Old: {change['old_value']}")
                print(f"    New: {change['new_value']}")
                print(f"    Reason: {change['reason']}")
        else:
            print("No changes needed")
        print()
        
        # Show errors
        if 'errors' in result_argo and result_argo['errors']:
            print("Errors:")
            for error in result_argo['errors']:
                print(f"  - {error}")
        else:
            print("No errors found")
        print()
        
        # Show comments
        if 'comments' in result_argo and result_argo['comments']:
            print("Comments:")
            for comment in result_argo['comments']:
                print(f"  - {comment}")
        print()
        
        # Show proposed new data
        if 'newdata' in result_argo and result_argo['newdata']:
            print("Proposed New Data:")
            for data in result_argo['newdata']:
                print(f"  Field: {data['field']}")
                print(f"    Value: {data['value']}")
                print(f"    Source: {data['source']}")
                print(f"    Confidence: {data['confidence']}")
    else:
        print("No results returned")
except Exception as e:
    print(f"Error: {e}")
'''

print("Uncomment code above to curate compound with Argo")
print("The AI will validate identity, structure, thermodynamics, and cross-references")
print()

In [None]:
# Curate compound with Claude Code backend (if available)
if claude_available:
    print("Curating Compound (Claude Code Backend):")
    print("=" * 60)
    
    '''
    try:
        result_claude = util_claude.curate_biochemical_compound(test_compound)
        
        if result_claude:
            print("Compound Curation Results:")
            print()
            
            # Show any changes
            if 'changes' in result_claude and result_claude['changes']:
                print("Changes:")
                for change in result_claude['changes']:
                    print(f"  Field: {change['field']}")
                    print(f"    Old: {change['old_value']}")
                    print(f"    New: {change['new_value']}")
                    print(f"    Reason: {change['reason']}")
            else:
                print("No changes needed")
            print()
            
            # Show errors
            if 'errors' in result_claude and result_claude['errors']:
                print("Errors:")
                for error in result_claude['errors']:
                    print(f"  - {error}")
            else:
                print("No errors found")
            print()
            
            # Show comments
            if 'comments' in result_claude and result_claude['comments']:
                print("Comments:")
                for comment in result_claude['comments']:
                    print(f"  - {comment}")
            print()
            
            # Show proposed new data
            if 'newdata' in result_claude and result_claude['newdata']:
                print("Proposed New Data:")
                for data in result_claude['newdata']:
                    print(f"  Field: {data['field']}")
                    print(f"    Value: {data['value']}")
                    print(f"    Source: {data['source']}")
                    print(f"    Confidence: {data['confidence']}")
        else:
            print("No results returned")
    except Exception as e:
        print(f"Error: {e}")
    '''
    
    print("Uncomment code above to curate compound with Claude Code")
else:
    print("Claude Code backend not available")
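
The `changes` entries carry both an old and a new value, so they can be applied defensively: a change is accepted only when its `old_value` still matches the record. This is a sketch under that assumption; `apply_changes`, `demo_record`, and `demo_changes` are hypothetical, and the values are placeholders rather than curation output.

```python
def apply_changes(record: dict, changes: list) -> tuple:
    """Apply changes whose old_value matches the record; return (updated, skipped)."""
    updated = dict(record)  # leave the caller's record untouched
    skipped = []
    for change in changes:
        field = change["field"]
        if updated.get(field) == change["old_value"]:
            updated[field] = change["new_value"]
        else:
            skipped.append(change)
    return updated, skipped


# Placeholder record and change list using the keys printed above
demo_record = {"id": "cpd_demo", "name": "atp", "charge": -4}
demo_changes = [
    {"field": "name", "old_value": "atp", "new_value": "ATP", "reason": "placeholder"},
    {"field": "charge", "old_value": -3, "new_value": -4, "reason": "placeholder"},
]

updated_record, skipped_changes = apply_changes(demo_record, demo_changes)
print(updated_record)
print(f"Skipped: {len(skipped_changes)}")
```

Skipped changes are worth reviewing by hand, since a mismatch means the AI reasoned about a value that differs from the one in the model.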

## 12. Cache Management

View and manage the curation cache.

In [None]:
# Check cache status
print("Cache Management:")
print("=" * 60)

cache_types = [
    "ReactionDirectionality",
    "ReactionStoichiometry",
    "ReactionEquivalence",
    "GeneAssociation",
    "CompoundCuration"
]

for cache_type in cache_types:
    cache = util_argo._load_cached_curation(cache_type)
    print(f"{cache_type}: {len(cache)} cached entries")

print()
print("Cache files are stored in: ~/.kbutillib/")
print("File pattern: AICurationCache<CacheName>.json")
print()
print("Caches are shared between Argo and Claude Code backends")
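
The file pattern above implies one JSON file per cache type. The sketch below shows how such a cache could be read and written; it is not the kbutillib implementation. Keying entries by reaction ID is an assumption, the helper names are hypothetical, and a temporary directory stands in for `~/.kbutillib/`.

```python
import json
import tempfile
from pathlib import Path


def cache_path(cache_dir, cache_name: str) -> Path:
    """Build the file path following the AICurationCache<CacheName>.json pattern."""
    return Path(cache_dir) / f"AICurationCache{cache_name}.json"


def load_cache(cache_dir, cache_name: str) -> dict:
    """Return the cached entries, or an empty dict if no file exists yet."""
    path = cache_path(cache_dir, cache_name)
    if not path.exists():
        return {}
    with path.open() as handle:
        return json.load(handle)


def save_cache(cache_dir, cache_name: str, cache: dict) -> None:
    """Write the cache back to disk as JSON."""
    path = cache_path(cache_dir, cache_name)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as handle:
        json.dump(cache, handle, indent=2)


with tempfile.TemporaryDirectory() as demo_dir:
    demo_cache = load_cache(demo_dir, "ReactionDirectionality")
    demo_cache["PGI"] = {"directionality": "placeholder"}
    save_cache(demo_dir, "ReactionDirectionality", demo_cache)
    reloaded_cache = load_cache(demo_dir, "ReactionDirectionality")
    demo_file_name = cache_path(demo_dir, "ReactionDirectionality").name

print(demo_file_name, reloaded_cache)
```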

## 13. Backend Comparison

Summarize how the two backends differ (if Claude Code is available).

In [None]:
if claude_available:
    print("Backend Comparison:")
    print("=" * 60)
    print("Both backends should produce similar results, but:")
    print("- Argo: Uses remote API, requires network")
    print("- Claude Code: Runs through the locally installed CLI")
    print()
    print("Caching works the same way for both backends")
    print("Results are interchangeable")
else:
    print("Claude Code not available for comparison")
    print("Install Claude Code CLI to test both backends")

## Summary

This notebook demonstrated:

1. **Configuration** - Viewing and setting AI backend configuration
2. **Initialization** - Creating utils with Argo and Claude Code backends
3. **Basic Chat** - Testing the chat interface with both backends
4. **Directionality Analysis** - Evaluating reaction directionality
5. **Stoichiometry Analysis** - Categorizing reaction components
6. **Equivalence Evaluation** - Comparing reactions
7. **Gene Association** - Validating gene-reaction associations
8. **Batch Processing** - Analyzing multiple reactions
9. **Compound Curation** - Validating and enriching compound records
10. **Cache Management** - Understanding curation caches
11. **Backend Comparison** - Comparing Argo vs Claude Code

### Key Takeaways

- **Transparent**: All methods work with both backends
- **Configurable**: Choose backend via config or at runtime
- **Cached**: Results are cached for performance
- **Flexible**: Switch backends without code changes

### Next Steps

- Remove the triple quotes around the disabled examples to run actual AI queries
- Try with your own metabolic models
- Experiment with both backends
- Use cached results for analysis workflows