# HybEx-Law: Getting Started Tutorial

Welcome to HybEx-Law, a hybrid NLP framework for verifying eligibility for legal aid services!

This notebook will walk you through:
1. **Understanding the system architecture**
2. **Processing legal queries**
3. **Extracting legal facts**
4. **Applying legal rules with Prolog**
5. **Evaluating system performance**

## 1. System Setup

First, let's set up the environment and import necessary modules:

In [None]:
import sys
from pathlib import Path

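# Make the project importable (assumes the notebook is run from the repository root).
# The root resolves the `src.` and `data.` imports below; `src` itself is also added
# in case modules inside it import each other without the `src.` prefix.
project_root = Path.cwd()
sys.path.append(str(project_root))
sys.path.append(str(project_root / "src"))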

# Import HybEx-Law components
from src.nlp_pipeline.hybrid_pipeline import HybridLegalNLPPipeline
from src.prolog_engine.legal_engine import LegalAidEngine
from src.evaluation.evaluator import HybExEvaluator
from data.sample_data import SAMPLE_QUERIES

print("✅ HybEx-Law components imported successfully!")
print("📊 Sample data loaded:", len(SAMPLE_QUERIES), "queries")

## 2. Initialize the System

Let's create instances of the main components:

In [None]:
# Initialize the hybrid NLP pipeline
pipeline = HybridLegalNLPPipeline()
print("🔧 NLP Pipeline initialized")

# Initialize the legal reasoning engine
legal_engine = LegalAidEngine()
print("⚖️  Legal Engine initialized")

# Initialize the evaluation framework
evaluator = HybExEvaluator()
print("📈 Evaluator initialized")

print("\n🎉 All systems ready!")

## 3. Process a Legal Query

Let's see how the system processes a real legal aid query:

In [None]:
# Example legal query
query = "I am a woman. My husband beats me and demands dowry. I earn 20000 rupees per month. Can I get legal aid?"

print("🗣️  Legal Query:")
print(f"   '{query}'")
print()

# Extract facts using the NLP pipeline
print("🔍 Stage 1 & 2: Extracting legal facts...")
extracted_facts = pipeline.process_query(query, verbose=True)

print("\n📋 Extracted Facts:")
for i, fact in enumerate(extracted_facts, 1):
    print(f"   {i}. {fact}")

## 4. Apply Legal Rules

Now let's use the Prolog engine to determine eligibility:

In [None]:
# Apply legal rules to determine eligibility
print("⚖️  Applying Legal Rules...")
decision = legal_engine.check_eligibility(extracted_facts)

print("\n🎯 ELIGIBILITY DECISION:")
print(f"   Status: {'✅ ELIGIBLE' if decision['eligible'] else '❌ NOT ELIGIBLE'}")
print(f"   Reason: {decision['explanation']}")

if decision.get('additional_info'):
    print(f"   Additional Info: {decision['additional_info']}")

print(f"\n📊 Facts Used: {len(decision['facts_used'])} legal facts")

## 5. Test Different Scenarios

Let's test the system with various types of legal scenarios:

In [None]:
# Test different scenarios
test_scenarios = [
    {
        "name": "Low Income Case",
        "query": "I lost my job and have no income. My landlord is evicting me."
    },
    {
        "name": "SC/ST Category", 
        "query": "I am from scheduled caste. My neighbor has occupied my land illegally."
    },
    {
        "name": "Excluded Case Type",
        "query": "Someone wrote false things about my business in the newspaper. I want to file defamation case."
    },
    {
        "name": "Minor/Child",
        "query": "I am 16 years old. My father died in accident. Insurance company not paying."
    }
]

print("🧪 Testing Different Scenarios:\n")

for i, scenario in enumerate(test_scenarios, 1):
    print(f"{i}. {scenario['name']}")
    print(f"   Query: {scenario['query']}")
    
    # Process the query
    facts = pipeline.process_query(scenario['query'])
    decision = legal_engine.check_eligibility(facts)
    
    status = "✅ ELIGIBLE" if decision['eligible'] else "❌ NOT ELIGIBLE"
    print(f"   Result: {status}")
    print(f"   Reason: {decision['explanation']}")
    print()
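
The scenarios above suggest a decision cascade: some case types are excluded outright, some applicant categories qualify regardless of income, and everyone else is checked against an income ceiling. The sketch below shows that shape in plain Python. It is not the Prolog knowledge base: the function name, the category labels, the rule order, and the `income_ceiling` parameter are all assumptions made for illustration, and the ceiling used in the demo is an arbitrary placeholder, not a legal threshold.

```python
from typing import Optional, Set, Tuple

# Illustrative labels only (assumptions); the real rules live in the Prolog knowledge base.
PRIORITY_CATEGORIES: Set[str] = {"woman", "child", "scheduled_caste", "scheduled_tribe", "disabled"}
EXCLUDED_CASE_TYPES: Set[str] = {"defamation"}


def sketch_eligibility(
    categories: Set[str],
    case_type: str,
    monthly_income: Optional[int],
    income_ceiling: int,
) -> Tuple[bool, str]:
    """Hypothetical rule cascade: exclusion first, then category, then income."""
    if case_type in EXCLUDED_CASE_TYPES:
        return False, f"case type '{case_type}' is excluded"
    matched = categories & PRIORITY_CATEGORIES
    if matched:
        return True, f"applicant belongs to: {', '.join(sorted(matched))}"
    if monthly_income is not None and monthly_income <= income_ceiling:
        return True, "income is at or below the ceiling"
    return False, "no qualifying category and income above the ceiling (or unknown)"


# The ceiling of 10000 is a placeholder chosen for the demo, not a real threshold.
print(sketch_eligibility({"woman"}, "domestic_violence", 20000, income_ceiling=10000))
print(sketch_eligibility(set(), "defamation", 0, income_ceiling=10000))
```

Comparing the output of a sketch like this with `legal_engine.check_eligibility` on the same scenarios is one way to spot where the knowledge base encodes something subtler than a three-step cascade.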

## 6. Component Analysis

Let's examine how individual components work:

In [None]:
# Test individual extractors
from src.extractors.income_extractor import IncomeExtractor
from src.extractors.case_type_classifier import CaseTypeClassifier
from src.extractors.social_category_extractor import SocialCategoryExtractor

# Initialize extractors
income_extractor = IncomeExtractor()
case_classifier = CaseTypeClassifier()
social_extractor = SocialCategoryExtractor()

test_query = "I am a woman from SC community. I earn 15000 rupees monthly. My landlord is harassing me."

print("🔬 Component Analysis:")
print(f"Query: {test_query}\n")

# Test income extraction
# Note: extract() is called with two text arguments; the same query string is passed for both here
income_facts = income_extractor.extract(test_query, test_query)
print(f"💰 Income Extractor: {income_facts}")

# Test case type classification
case_type = case_classifier.classify_case_type(test_query)
print(f"⚖️  Case Type Classifier: {case_type}")

# Test social category extraction
social_facts = social_extractor.extract(test_query, test_query)
print(f"👥 Social Category Extractor: {social_facts}")
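
To get a feel for what an income extractor has to do, here is a minimal regex-only sketch. It is not the `IncomeExtractor` implementation: the function name `extract_income_amount` and the pattern are assumptions made for illustration. It only recognises a digit amount followed by a rupee word and does not distinguish monthly from annual income.

```python
import re
from typing import Optional

# Hypothetical pattern: an integer amount (commas allowed) followed by a rupee word.
_INCOME_PATTERN = re.compile(r"(\d[\d,]*)\s*(?:rupees|rs\.?|inr)", re.IGNORECASE)


def extract_income_amount(text: str) -> Optional[int]:
    """Return the first rupee amount found in the text, or None if there is none."""
    match = _INCOME_PATTERN.search(text)
    if match is None:
        return None
    # Drop thousands separators such as "1,50,000" before converting.
    return int(match.group(1).replace(",", ""))


print(extract_income_amount("I am a woman from SC community. I earn 15000 rupees monthly."))
print(extract_income_amount("I lost my job and have no income."))
```

A pattern like this misses amounts written in words ("fifteen thousand") and statements such as "no income", which is one reason to compare a regex-only baseline against the full pipeline.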

## 7. Batch Evaluation

Let's evaluate the system on multiple test cases:

In [None]:
# Evaluate on first 5 sample queries
test_data = SAMPLE_QUERIES[:5]

print(f"📊 Evaluating system on {len(test_data)} queries...\n")

results = evaluator.evaluate_pipeline(test_data)

# Display aggregate results
metrics = results['aggregate_metrics']
print("\n📈 EVALUATION RESULTS:")
print(f"   Task Success Rate: {metrics['task_success_rate']:.1%}")
print(f"   Average Fact F1 Score: {metrics['avg_fact_f1']:.3f}")
print(f"   Average Fact Precision: {metrics['avg_fact_precision']:.3f}")
print(f"   Average Fact Recall: {metrics['avg_fact_recall']:.3f}")
print(f"   Average Processing Time: {metrics['avg_processing_time']:.3f}s")

# Display individual results
print("\n📋 Individual Query Results:")
for result in results['individual_results']:
    status = "✅" if result.task_success else "❌"
    print(f"   Query {result.query_id}: {status} (F1: {result.fact_f1:.3f}, Time: {result.processing_time:.3f}s)")
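
The fact-level precision, recall, and F1 reported above can be understood with a small set-based sketch. This assumes facts are compared as exact-match strings, which may differ from how `HybExEvaluator` actually scores them; the function name `fact_prf` and the example predicates are hypothetical.

```python
from typing import Iterable, Tuple


def fact_prf(predicted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 over exact-match fact strings."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical facts: two of the three predicted facts match the gold annotation.
predicted = ["applicant(user)", "income_monthly(user, 15000)", "is_woman(user, true)"]
gold = ["applicant(user)", "income_monthly(user, 15000)", "social_category(user, sc)"]
p, r, f1 = fact_prf(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Exact string matching is strict: a fact that differs only in whitespace or argument formatting counts as a miss, so normalising facts before comparison is usually worthwhile.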

## 8. Knowledge Base Testing

Let's test the Prolog knowledge base directly:

In [None]:
# Test the Prolog knowledge base
print("🧠 Testing Prolog Knowledge Base:\n")

kb_results = legal_engine.test_knowledge_base()

for test_name, result in kb_results.items():
    status = "✅ PASS" if result else "❌ FAIL"
    print(f"   {test_name}: {status}")

# Test custom scenario
print("\n🧪 Custom Test Case:")
custom_facts = [
    'applicant(custom_test)',
    'income_monthly(custom_test, 8000)',
    'is_disabled(custom_test, true)',
    'case_type(custom_test, "consumer_dispute")'
]

custom_decision = legal_engine.check_eligibility(custom_facts)
print("   Facts: Disabled person, 8000/month income, consumer dispute")
print(f"   Decision: {'✅ ELIGIBLE' if custom_decision['eligible'] else '❌ NOT ELIGIBLE'}")
print(f"   Reason: {custom_decision['explanation']}")
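
The custom facts above are plain strings in Prolog term syntax. When debugging, it can help to split them into a predicate and its arguments. The helper below is a rough sketch, not part of HybEx-Law: `parse_fact` is a hypothetical name, and the parser only handles flat facts whose arguments contain no commas or nested terms.

```python
import re
from typing import List, Tuple

# Matches "predicate(args)" where the predicate starts with a lowercase letter.
_FACT_PATTERN = re.compile(r"^\s*([a-z]\w*)\s*\((.*)\)\s*$")


def parse_fact(fact: str) -> Tuple[str, List[str]]:
    """Split 'predicate(arg1, arg2)' into ('predicate', ['arg1', 'arg2'])."""
    match = _FACT_PATTERN.match(fact)
    if match is None:
        raise ValueError(f"Not a simple Prolog-style fact: {fact!r}")
    predicate, raw_args = match.groups()
    # Naive split: assumes arguments contain no commas or nested terms.
    args = [arg.strip().strip('"') for arg in raw_args.split(",")]
    return predicate, args


for fact in [
    'applicant(custom_test)',
    'income_monthly(custom_test, 8000)',
    'case_type(custom_test, "consumer_dispute")',
]:
    print(parse_fact(fact))
```

Arguments come back as strings, so a numeric value such as `8000` still needs an explicit `int(...)` conversion before any comparison.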

## 9. Next Steps

Congratulations! You've successfully explored the HybEx-Law system. Here are some next steps:

### For Development:
1. **Install full dependencies**: `pip install -r requirements.txt`
2. **Install SWI-Prolog** for full Prolog functionality
3. **Create more training data** for better ML models
4. **Train the Stage 1 classifier** with annotated data
5. **Extend the knowledge base** with more legal rules
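
Before relying on the Prolog engine, you may want to confirm that SWI-Prolog is installed. The check below is a minimal sketch that only looks for the `swipl` executable on your `PATH`; how `LegalAidEngine` actually connects to Prolog is not covered in this tutorial, so treat the helper name `swi_prolog_available` as an assumption.

```python
import shutil


def swi_prolog_available() -> bool:
    """Return True if the `swipl` executable can be found on the PATH."""
    return shutil.which("swipl") is not None


if swi_prolog_available():
    print("SWI-Prolog found")
else:
    print("SWI-Prolog not found; install it before relying on the Prolog engine")
```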

### For Research:
1. **Compare with baselines** (regex-only, LLM-only)
2. **Evaluate on larger datasets**
3. **Add more case types and legal scenarios**
4. **Implement active learning** for data annotation
5. **Study real-world deployment** considerations

### For Production:
1. **Optimize performance** for faster processing
2. **Add robust error handling**
3. **Implement a user interface**
4. **Add logging and monitoring**
5. **Ensure legal compliance** and accuracy
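
As a starting point for error handling and logging, the two pipeline stages can be wrapped so that a failure is logged and returned as a structured record instead of crashing the caller. This is a sketch under assumptions: `safe_eligibility_check`, the `"error"` key, and the logger name are hypothetical, and the stub functions stand in for `pipeline.process_query` and `legal_engine.check_eligibility`.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("hybex_law_tutorial")  # hypothetical logger name


def safe_eligibility_check(
    query: str,
    extract_facts: Callable[[str], List[str]],
    check_eligibility: Callable[[List[str]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Run extraction and reasoning, returning an error record instead of raising."""
    try:
        facts = extract_facts(query)
        decision = check_eligibility(facts)
    except Exception as exc:  # broad on purpose: this is the outermost boundary
        logger.exception("Eligibility check failed for a query of length %d", len(query))
        return {"eligible": None, "explanation": f"Processing error: {exc}", "error": True}
    return {**decision, "error": False}


# Stubs standing in for the real components.
def stub_extract(query: str) -> List[str]:
    return ["applicant(user)"]


def stub_check(facts: List[str]) -> Dict[str, Any]:
    return {"eligible": True, "explanation": "stub decision"}


print(safe_eligibility_check("Can I get legal aid?", stub_extract, stub_check))
```

With the real components, the call would look like `safe_eligibility_check(query, pipeline.process_query, legal_engine.check_eligibility)`. Returning `eligible: None` on failure keeps "could not decide" distinct from "not eligible", which matters for a legal aid decision.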

In [None]:
print("🎉 Tutorial completed!")
print("\nThank you for exploring HybEx-Law!")
print("For more information, check the README.md and documentation.")