# Self-RAG Interactive Demo

Interactive demonstration of the complete Self-RAG system for legal analysis.

In [10]:
import sys
sys.path.append('..')

from src.self_rag.inference import load_pipeline_from_config
import json

## 1. Load Complete Self-RAG Pipeline

In [11]:
# Load pipeline
print("Loading Self-RAG pipeline...")

pipeline = load_pipeline_from_config(
    retrieval_config_path='../configs/retrieval_config.yaml',
    generator_config_path='../configs/generator_config.yaml',
    retriever_index_dir='../data/embeddings',
    generator_weights_path='../models/generator_lora/final',
    critic_weights_path='../models/critic_lora/final',
)

print("✅ Pipeline loaded!")

Loading Self-RAG pipeline...
Loading Self-RAG Pipeline...

1. Loading retriever...
Loading embedding model: sentence-transformers/all-mpnet-base-v2
Model loaded on mps
Embedding dimension: 768
   Loading index from ../data/embeddings
Using CPU index
Created IndexFlatIP index with dimension 768
Index loaded from ../data/embeddings/faiss_index.faiss
Total documents in index: 10
Documents loaded from ../data/embeddings/documents.pkl
   Index loaded: 10 documents

2. Loading generator...
Loading generator model: Qwen/Qwen2.5-0.5B-Instruct
Loading LoRA weights from ../models/generator_lora/final
Generator model loaded successfully
MPS cache cleared

3. Loading critic model for reflection tokens...
Loading model: Qwen/Qwen2.5-0.5B-Instruct
Loading LoRA weights from ../models/critic_lora/final
Model loaded successfully
MPS cache cleared
   Critic model loaded successfully

Pipeline loaded successfully!
✅ Pipeline loaded!
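
Loading three models takes a while, so a wrong path is cheaper to catch before the call than partway through it. The sketch below is a minimal pre-flight check that assumes the same relative paths passed to `load_pipeline_from_config` above. `find_missing_paths` is a hypothetical helper written for this demo and is not part of `src.self_rag`.

```python
from pathlib import Path


def find_missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [p for p in paths if not Path(p).exists()]


# Paths copied from the load_pipeline_from_config call above.
required = [
    '../configs/retrieval_config.yaml',
    '../configs/generator_config.yaml',
    '../data/embeddings',
    '../models/generator_lora/final',
    '../models/critic_lora/final',
]

missing = find_missing_paths(required)
if missing:
    print("Missing before load:", missing)
else:
    print("All pipeline inputs found.")
```

Running this cell first reports every missing input at once instead of stopping at the first failed load.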


## 2. Test with Example Questions

In [12]:
def demo_question(question):
    """Demonstrate Self-RAG with a question."""
    print(f"\n{'='*80}")
    print(f"Question: {question}")
    print(f"{'='*80}\n")
    
    # Get answer
    result = pipeline.answer_question(question)
    
    # Display formatted response
    formatted = pipeline.format_response(
        result,
        include_passages=True,
        include_reflection=True
    )
    
    print(formatted)
    return result

In [13]:
# Example 1: Basic negligence question
result1 = demo_question("What are the four elements of negligence?")


Question: What are the four elements of negligence?

Question: What are the four elements of negligence?

Answer: Negligence is a legal concept that requires proof of three elements. These elements include:
- Duty of care
- Breach of duty
- Damages.
The first two elements, "duty of care" and "breach of duty," are required to prove negligence. The third element, "damages," is what the plaintiff seeks in compensation for their injuries or losses.
In order to establish negligence, the plaintiff must show that they were exposed to an unreasonable risk of harm. This means that there was a reasonable person who could have foreseen the risk but failed to take steps to mitigate it. For example, if someone was driving a car and ran a red light, they would be considered negligent because they had no reasonable excuse for not stopping.
It's important to note that negligence does not necessarily mean that the defendant caused the injury or loss. In some cases, negligence may be used as a defense 

In [14]:
# Example 2: Specific doctrine
result2 = demo_question("What is res ipsa loquitur and when does it apply?")


Question: What is res ipsa loquitur and when does it apply?

Question: What is res ipsa loquitur and when does it apply?

Answer: Res ipsa loquitur applies to situations where the plaintiff's damages are caused by an event that was foreseeable but not reasonably anticipated. The doctrine requires proof of foreseeability, which means that the defendant must have known or should have known about the risk involved in the act.
The doctrine has been applied in many jurisdictions, including California, New York, Pennsylvania, and Texas. In some cases, courts may also consider whether the defendant had a duty to warn of the risk involved in the act.
In general, res ipsa loquitur is only applicable if the plaintiff can prove that they were unaware of the risk involved in the act. If the plaintiff can prove that they were aware of the risk, then the defendant cannot be held liable for their damages.
It is important to note that res ipsa loquitur is not always applicable. For example, in certai

In [15]:
# Example 3: Defenses
result3 = demo_question("What is the difference between comparative negligence and assumption of risk?")


Question: What is the difference between comparative negligence and assumption of risk?

Question: What is the difference between comparative negligence and assumption of risk?

Answer: Comparative negligence means that a plaintiff can recover only to the extent that the defendant's fault was more than 50% greater than the plaintiff's. Assumption of risk means that the plaintiff must prove that they were aware of the risks involved in their actions.
The key difference between these two concepts is that comparative negligence applies when one party is found liable for an accident, while assumption of risk applies when one party is not liable for an accident.
In general, it is easier to prove comparative negligence because it requires proving that the defendant's fault was greater than 50%. However, if you are sued for an accident where the other party was at fault, then you may be able to prove assumption of risk by showing that you were aware of the risks involved in your actions.
It 

## 3. Interactive Question & Answer

Uncomment the cell below to enter your own questions.

In [16]:
# # Interactive mode
# your_question = input("Enter your legal question: ")

# if your_question:
#     result = demo_question(your_question)

## 4. Analyze Reflection Tokens

Examine the self-verification in action.

In [17]:
def analyze_reflection(result):
    """Analyze reflection tokens from a result."""
    reflection = result['reflection']
    
    print("\nReflection Token Analysis:")
    print("=" * 50)
    
    print(f"\n📍 Retrieve: {reflection.get('retrieve', 'N/A')}")
    print("   → Did the model decide to retrieve evidence?")
    
    print(f"\n🔍 ISREL (Relevance): {reflection.get('isrel', 'N/A')}")
    print("   → Is the retrieved passage relevant?")
    
    print(f"\n✓ ISSUP (Support): {reflection.get('issup', 'N/A')}")
    print("   → Is the answer supported by evidence?")
    print("   → Hallucination detection!")
    
    print(f"\n⭐ ISUSE (Utility): {reflection.get('isuse', 'N/A')}")
    print("   → Overall response quality (1-5)")
    
    print(f"\n📊 Overall Score: {result['score']:.2f}")
    
    # Hallucination check - Fixed to handle None values
    support = reflection.get('issup') or ''
    if 'No Support' in support:
        print("\n⚠️  WARNING: Potential hallucination detected!")
    elif 'Fully Supported' in support:
        print("\n✅ Response is fully supported by evidence")

# Analyze previous results
analyze_reflection(result1)


Reflection Token Analysis:

📍 Retrieve: [Retrieve]
   → Did the model decide to retrieve evidence?

🔍 ISREL (Relevance): None
   → Is the retrieved passage relevant?

✓ ISSUP (Support): None
   → Is the answer supported by evidence?
   → Hallucination detection!

⭐ ISUSE (Utility): [Utility:5]
   → Overall response quality (1-5)

📊 Overall Score: 1.00
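
In the output above, ISREL and ISSUP are both `None`, so neither branch of the hallucination check fires and nothing is printed about support. The sketch below shows one way to turn the raw token strings into values that are easier to test. It assumes the token format seen in the output (`[Utility:5]`) and the same dictionary keys used by `analyze_reflection`. `parse_utility` and `missing_tokens` are hypothetical helpers and are not part of the pipeline.

```python
import re


def parse_utility(token):
    """Extract the 1-5 rating from a string like '[Utility:5]'; None if absent."""
    if not token:
        return None
    match = re.search(r"\[Utility:(\d)\]", token)
    return int(match.group(1)) if match else None


def missing_tokens(reflection, keys=('retrieve', 'isrel', 'issup', 'isuse')):
    """List the reflection keys whose value is missing or None."""
    return [k for k in keys if reflection.get(k) is None]


# Values copied from the analysis output above.
reflection = {
    'retrieve': '[Retrieve]',
    'isrel': None,
    'issup': None,
    'isuse': '[Utility:5]',
}

print(parse_utility(reflection['isuse']))  # 5
print(missing_tokens(reflection))          # ['isrel', 'issup']
```

A non-empty result from `missing_tokens` means the overall score was produced without a relevance or support judgment, so the score on its own does not show that the answer is grounded in the retrieved passages.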


## 5. Batch Processing

Process multiple questions at once.

In [18]:
# Batch questions
questions = [
    "What is causation in negligence?",
    "What damages can be recovered?",
    "What is professional malpractice?",
]

# Process all
results = pipeline.answer_batch(questions)

# Summary
print("\nBatch Processing Summary:")
print("=" * 80)
for i, (q, r) in enumerate(zip(questions, results), 1):
    print(f"\n{i}. {q}")
    print(f"   Score: {r['score']:.2f}")
    print(f"   Support: {r['reflection'].get('issup', 'N/A')}")
    print(f"   Answer: {r['answer'][:80]}...")


Batch Processing Summary:

1. What is causation in negligence?
   Score: 1.00
   Support: None
   Answer: The concept of causation is a key element in negligence cases. Negligence occurs...

2. What damages can be recovered?
   Score: 1.00
   Support: None
   Answer: Damages are recoverable in a personal injury case if the plaintiff has suffered ...

3. What is professional malpractice?
   Score: 1.00
   Support: None
   Answer: Professional malpractice occurs when a healthcare provider or other health care ...
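
Every result in the batch above reports `Support: None`. For larger batches, a tally of the ISSUP values makes that pattern visible at a glance. The sketch below assumes each result has the same `reflection` dictionary shape used in the loop above. `support_counts` is a hypothetical helper, and `sample_results` is stand-in data that mirrors only the fields printed in the summary.

```python
from collections import Counter


def support_counts(results):
    """Tally ISSUP values across results, labelling None or absent as 'missing'."""
    return Counter(
        (r['reflection'].get('issup') or 'missing') for r in results
    )


# Stand-in data mirroring the three batch results printed above.
sample_results = [
    {'score': 1.0, 'reflection': {'issup': None}},
    {'score': 1.0, 'reflection': {'issup': None}},
    {'score': 1.0, 'reflection': {'issup': None}},
]

for label, count in support_counts(sample_results).items():
    print(f"{label}: {count}")
```

With the real pipeline output, the same call would be `support_counts(results)`.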


## 6. Export Results

In [19]:
# Save results for analysis
output = {
    'questions': questions,
    'results': [
        {
            'question': r['question'],
            'answer': r['answer'],
            'reflection': r['reflection'],
            'score': r['score']
        }
        for r in results
    ]
}

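import os

# Create the output directory if it does not exist yet
os.makedirs('../results', exist_ok=True)

with open('../results/demo_results.json', 'w') as f:
    json.dump(output, f, indent=2)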

print("✅ Results saved to ../results/demo_results.json")

✅ Results saved to ../results/demo_results.json
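
To confirm that the export can be read back for later analysis, a round trip through `json` is usually enough. The sketch below writes stand-in data with the same keys as the export above to a temporary directory and reloads it, so it runs without the pipeline. `load_results` is a hypothetical helper. Note that Python `None` values in `reflection` are stored as JSON `null` and come back as `None`.

```python
import json
import os
import tempfile


def load_results(path):
    """Read an exported results file and return its 'results' list."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)['results']


# Stand-in data using the same keys as the export cell above.
sample = {
    'questions': ['What is causation in negligence?'],
    'results': [
        {
            'question': 'What is causation in negligence?',
            'answer': 'placeholder answer',
            'reflection': {'issup': None, 'isuse': '[Utility:5]'},
            'score': 1.0,
        }
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo_results.json')
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(sample, f, indent=2)
    loaded = load_results(path)

print(loaded[0]['reflection']['issup'])  # None
```

For the real export, the call would be `load_results('../results/demo_results.json')`.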


## Summary

Demo complete!
- ✅ Tested Self-RAG on legal questions
- ✅ Analyzed reflection tokens
- ✅ Demonstrated hallucination detection
- ✅ Processed batch questions
- ✅ Exported results

## Key Takeaways

1. **Adaptive Retrieval**: Model decides when to retrieve evidence
2. **Self-Verification**: Reflection tokens provide quality assessment
3. **Hallucination Detection**: ISSUP token identifies unsupported claims
4. **Transparency**: See exactly why the model made each decision