# NL2SPARQL Quick Start Guide

This notebook walks through the main features of the NL2SPARQL package, which translates natural language questions into SPARQL queries for the LiITA knowledge base.

## 1. Setup and Configuration

First, make sure you have the package installed:

```bash
pip install "liita-nl2sparql[openai]"  # or [anthropic], [mistral], [all]; quotes keep zsh from expanding the brackets
```

In [None]:
# Set your API key (uncomment and modify the one you need)
import os

# os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
# os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-api-key"
# os.environ["MISTRAL_API_KEY"] = "your-mistral-api-key"
# os.environ["GEMINI_API_KEY"] = "your-gemini-api-key"
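
Before creating a translator, it can help to confirm which API keys the kernel can actually see. The sketch below is a minimal check using only the standard library. The helper name `configured_providers` is hypothetical, and the provider-to-variable mapping is inferred from the variable names in the cell above rather than taken from the package.

```python
import os

# Environment variable names as they appear in the setup cell above.
# The provider labels on the left are an assumption based on those names.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "gemini": "GEMINI_API_KEY",
}

def configured_providers(env=None):
    """Return the providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var, "").strip()]

print("Providers with a key set:", configured_providers() or "none")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try without touching the real environment.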

## 2. Basic Translation

The simplest way to translate a natural language question into SPARQL is the module-level `translate()` function.

In [None]:
from nl2sparql import translate

# Italian question
result = translate("Quali lemmi esprimono tristezza?")
print("Generated SPARQL:")
print(result.sparql)

In [None]:
# English question
result = translate("Find all nouns that express joy")
print("Generated SPARQL:")
print(result.sparql)

## 3. Advanced Usage with NL2SPARQL Class

For more control over the translation process, use the `NL2SPARQL` class directly.

In [None]:
from nl2sparql import NL2SPARQL

# Initialize with specific provider and options
translator = NL2SPARQL(
    provider="openai",      # or "anthropic", "mistral", "gemini", "ollama"
    model="gpt-4.1-mini",   # optional: specify model
    validate=True,          # validate generated queries
    fix_errors=True,        # automatically fix invalid queries
    max_retries=3           # retry attempts for fixing
)

print("Translator initialized successfully")

In [None]:
# Translate with full result details
result = translator.translate("Trova le traduzioni siciliane di 'casa'")

print("=" * 60)
print("TRANSLATION RESULT")
print("=" * 60)
print(f"\nDetected patterns: {result.detected_patterns}")
print(f"Confidence: {result.confidence:.2f}")
print("\nValidation:")
print(f"  - Syntax valid: {result.validation.syntax_valid}")
print(f"  - Execution success: {result.validation.execution_success}")
print(f"  - Result count: {result.validation.result_count}")
print("\nGenerated SPARQL:")
print(result.sparql)

In [None]:
# Check if query was auto-fixed
if result.was_fixed:
    print(f"Query was fixed after {result.fix_attempts} attempt(s)")
else:
    print("Query was valid on first attempt")
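
The `fix_errors` and `max_retries` options suggest a generate, validate, repair loop. The sketch below illustrates that general pattern with toy stand-ins. It is not the package's implementation, and `generate_with_repair` and the three `toy_*` callables are hypothetical names introduced only for this example.

```python
from typing import Callable, Optional, Tuple

def generate_with_repair(
    question: str,
    generate: Callable[[str], str],
    validate: Callable[[str], Optional[str]],
    repair: Callable[[str, str], str],
    max_retries: int = 3,
) -> Tuple[str, int, bool]:
    """Return (query, fix_attempts, is_valid) after at most max_retries repairs."""
    query = generate(question)
    attempts = 0
    error = validate(query)  # None means the query passed validation
    while error is not None and attempts < max_retries:
        query = repair(query, error)
        attempts += 1
        error = validate(query)
    return query, attempts, error is None

# Toy stand-ins: the "model" forgets the closing brace and the "repair" adds it.
def toy_generate(question):
    return "SELECT ?s WHERE { ?s ?p ?o"

def toy_validate(query):
    return None if query.count("{") == query.count("}") else "unbalanced braces"

def toy_repair(query, error):
    return query + " }"

query, attempts, ok = generate_with_repair(
    "any question", toy_generate, toy_validate, toy_repair
)
print(f"valid={ok} after {attempts} fix attempt(s): {query}")
```

Capping the loop with `max_retries` bounds the number of LLM calls when a query cannot be repaired.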

## 3.5 Agentic Translation with LangGraph

The agent module provides an advanced agentic workflow using LangGraph. It features:
- Iterative refinement with automatic error correction
- Schema exploration when queries return empty results
- Semantic verification of results

**Installation:**
```bash
pip install "liita-nl2sparql[agent-openai]"  # or [agent-anthropic], [agent-mistral]; quotes keep zsh from expanding the brackets
```

In [None]:
from nl2sparql.agent import NL2SPARQLAgent

# Initialize the agent
agent = NL2SPARQLAgent(
    provider="openai",      # or "anthropic", "mistral", "gemini", "ollama"
    model="gpt-4.1-mini",   # optional: uses provider default if None
)

print("Agent initialized successfully")

In [None]:
# Basic agent translation
result = agent.translate("Quali lemmi esprimono tristezza?")

print("=" * 60)
print("AGENT TRANSLATION RESULT")
print("=" * 60)
print(f"\nDetected patterns: {result['detected_patterns']}")
print(f"Confidence: {result['confidence']:.2f}")
print(f"Attempts: {result['attempts']}")
print(f"Valid: {result['is_valid']}")
print(f"Result count: {result['result_count']}")
print("\nGenerated SPARQL:")
print(result["sparql"])

In [None]:
# Streaming mode - watch the agent work step by step
print("Streaming agent workflow:")
print("=" * 60)

final_state = None
for node_name, state in agent.stream("Trova le traduzioni siciliane di 'casa'"):
    final_state = state
    
    if node_name == "analyze":
        print(f"[analyze] Patterns: {state.get('detected_patterns')}, Complexity: {state.get('complexity')}")
    elif node_name == "plan":
        print(f"[plan] Sub-tasks: {len(state.get('sub_tasks', []))}")
    elif node_name == "retrieve":
        print(f"[retrieve] Examples: {len(state.get('retrieved_examples', []))}")
    elif node_name == "generate":
        print(f"[generate] Attempt: {state.get('generation_attempts')}")
    elif node_name == "execute":
        error = state.get('execution_error')
        if error:
            print(f"[execute] Error: {error}")
        else:
            print(f"[execute] Results: {state.get('result_count')}")
    elif node_name == "verify":
        if state.get('is_valid'):
            print("[verify] Valid!")
        else:
            print(f"[verify] Issues: {state.get('validation_errors')}")
    elif node_name == "refine":
        print("[refine] Retrying...")
    elif node_name == "output":
        print("[output] Done!")

# Get final result from accumulated state
result = agent.get_final_result(final_state)
print(f"\nFinal SPARQL ({result['attempts']} attempt(s), {result['result_count']} results):")
print(result["sparql"])
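
When the per-node printout is more detail than needed, the same `(node_name, state)` pairs can be reduced to a visit count per node, which shows at a glance how often the agent looped through refinement. The sketch below assumes only that the stream yields such pairs, as in the loop above. `summarize_stream` is a hypothetical helper, and the events are hand-written placeholders rather than real agent output.

```python
from collections import Counter

def summarize_stream(events):
    """Count node visits and keep the last state from (node_name, state) pairs."""
    visits = Counter()
    final_state = None
    for node_name, state in events:
        visits[node_name] += 1
        final_state = state
    return visits, final_state

# Hand-written events standing in for agent.stream(...); not real agent output.
fake_events = [
    ("generate", {"generation_attempts": 1}),
    ("execute", {"result_count": 0}),
    ("refine", {}),
    ("generate", {"generation_attempts": 2}),
    ("execute", {"result_count": 5}),
    ("output", {"result_count": 5}),
]

visits, final_state = summarize_stream(fake_events)
print(dict(visits))
print(final_state)
```

With a real agent, the call would presumably be `summarize_stream(agent.stream(question))`, followed by `agent.get_final_result(final_state)` as in the cell above.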

In [None]:
# Verbose mode - built-in progress output
result = agent.translate(
    "Qual è la definizione di 'amore'?",
    verbose=True
)

print(f"\nFinal SPARQL:\n{result['sparql']}")

In [None]:
# Check refinement history (if the agent had to retry)
result = agent.translate("Trova tutti gli iponimi di 'animale'")

if result["refinement_history"]:
    print(f"Agent refined the query {len(result['refinement_history'])} time(s):")
    for i, attempt in enumerate(result["refinement_history"], 1):
        print(f"\n  Attempt {i}:")
        print(f"    Error: {(attempt.get('error') or '')[:100]}...")  # tolerate a missing or None error
        print(f"    Results: {attempt['result_count']}")
else:
    print("Query was valid on first attempt!")

print("\nFinal result:")
print(f"  Valid: {result['is_valid']}")
print(f"  Confidence: {result['confidence']:.2f}")
print(f"  Results: {result['result_count']}")

In [None]:
# Async usage (for async contexts)
import asyncio

async def translate_async():
    result = await agent.atranslate("Lemmi che esprimono rabbia")
    return result

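# Jupyter already runs an event loop, so top-level `await` works here.
# In a plain Python script, use instead: result = asyncio.run(translate_async())
result = await translate_async()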
print(f"Async result: {result['is_valid']}, {result['result_count']} results")

## 4. Working with Retrieved Examples

Inspect which examples the translator retrieved for few-shot learning, together with their retrieval scores.

In [None]:
result = translator.translate("Quali sono gli iperonimi di 'cane'?")

print("Retrieved examples for few-shot learning:")
print("=" * 60)
for i, ex in enumerate(result.retrieved_examples[:3], 1):
    print(f"\n{i}. Score: {ex.score:.3f}")
    print(f"   Question: {ex.example.nl}")
    print(f"   SPARQL preview: {ex.example.sparql[:100]}...")
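
To make the idea of scored example retrieval concrete, the sketch below ranks stored question/query pairs by token overlap (Jaccard similarity) with a new question. This is a deliberately simple stand-in: the notebook does not say how the package scores examples, and it may well use a different method such as embeddings. The names `jaccard` and `top_k` are hypothetical, and the SPARQL strings are elided placeholders.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two strings, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def top_k(question, examples, k=3):
    """Rank (nl, sparql) example pairs by similarity to the question."""
    scored = [(jaccard(question, nl), nl, sparql) for nl, sparql in examples]
    # sorted() is stable, so ties keep their original order
    return sorted(scored, key=lambda item: item[0], reverse=True)[:k]

# Placeholder examples; the SPARQL strings are elided on purpose.
examples = [
    ("Quali lemmi esprimono tristezza?", "SELECT ..."),
    ("Traduzioni siciliane di 'acqua'", "SELECT ..."),
    ("Quali lemmi esprimono paura?", "SELECT ..."),
]

for score, nl, _ in top_k("Quali lemmi esprimono gioia?", examples, k=2):
    print(f"{score:.3f}  {nl}")
```

The highest-scoring pairs are the ones that would be placed in the prompt as few-shot demonstrations.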

## 5. Query Types Examples

Examples of different query types supported by the system.

In [None]:
# Helper function to display results
def show_translation(question: str, language: str = "it"):
    """Translate a question and print the result; `language` is only a display label."""
    print(f"\nQuestion ({language}): {question}")
    print("-" * 60)
    result = translator.translate(question)
    print(f"Patterns: {result.detected_patterns}")
    print(f"Valid: {result.validation.syntax_valid}, Results: {result.validation.result_count}")
    print(f"\nSPARQL:\n{result.sparql}")
    return result

In [None]:
# Emotion query (ELITA)
_ = show_translation("Quali lemmi esprimono paura?")

In [None]:
# Translation query (Sicilian)
_ = show_translation("Traduzioni siciliane di 'acqua'")

In [None]:
# Definition query (CompL-it)
_ = show_translation("Qual è la definizione di 'amore'?")

In [None]:
# Semantic relations (hypernyms)
_ = show_translation("What are the hypernyms of 'dog'?", "en")

In [None]:
# Part of speech filter
_ = show_translation("Find all verbs in LiITA", "en")

In [None]:
# Morphological pattern
_ = show_translation("Lemmi che iniziano con 'pre'")

## 6. Single Model Evaluation

Run a systematic evaluation against the test dataset.

In [None]:
from nl2sparql.evaluation import (
    evaluate_dataset,
    evaluate_single,
    load_test_dataset,
    print_report,
    save_report,
)

In [None]:
# Load and inspect the test dataset
test_data = load_test_dataset()

print(f"Dataset version: {test_data['metadata']['version']}")
print(f"Total test cases: {len(test_data['test_cases'])}")
print(f"Patterns covered: {test_data['metadata']['patterns_covered']}")

In [None]:
# Evaluate a single test case
test_case = test_data["test_cases"][0]
print(f"Test case: {test_case['id']}")
print(f"Question (IT): {test_case['nl_it']}")
print(f"Question (EN): {test_case['nl_en']}")
print(f"Expected patterns: {test_case['patterns']}")

In [None]:
# Run single test
result = evaluate_single(test_case, translator, language="it")

print(f"\nResults for {result.test_id}:")
print(f"  Syntax valid: {result.syntax_valid}")
print(f"  Endpoint valid: {result.endpoint_valid}")
print(f"  Component score: {result.component_score:.2%}")
print(f"  Generation time: {result.generation_time:.2f}s")
if result.missing_components:
    print(f"  Missing components: {result.missing_components}")
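
One plausible reading of `component_score` and `missing_components` is the share of expected query fragments that appear in the generated SPARQL. The sketch below implements that reading as plain substring matching. It is an assumption for illustration, not the package's actual metric, and the function and the toy query are hypothetical.

```python
def component_score(generated: str, expected_components: list[str]):
    """Return (score, missing): the share of expected components found, and those absent."""
    if not expected_components:
        return 1.0, []
    text = " ".join(generated.lower().split())  # normalise case and whitespace
    missing = [
        c for c in expected_components
        if " ".join(c.lower().split()) not in text
    ]
    score = 1 - len(missing) / len(expected_components)
    return score, missing

# Toy query used only as a string; it is never sent to an endpoint.
query = """
SELECT ?lemma WHERE {
  ?lemma a ontolex:LexicalEntry .
} LIMIT 10
"""

score, missing = component_score(query, ["ontolex:LexicalEntry", "FILTER", "LIMIT"])
print(f"{score:.2%}", missing)
```

Substring matching is crude (it cannot tell a keyword from the same text inside a literal), so a real metric would more likely compare parsed query structure.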

In [None]:
# Run evaluation on a subset (single_pattern category only, for speed)
report = evaluate_dataset(
    translator,
    language="it",
    categories=["single_pattern"],  # Filter to single pattern tests
    validate_endpoint=True,
)

print_report(report)

In [None]:
# Save report with generated SPARQL queries
save_report(report, "evaluation_report.json")
print("Report saved to evaluation_report.json")

In [None]:
# Inspect a generated query from the report
import json

with open("evaluation_report.json", "r", encoding="utf-8") as f:
    saved_report = json.load(f)

# Show first test result with its generated SPARQL
first_result = saved_report["test_results"][0]
print(f"Test: {first_result['test_id']}")
print(f"Question: {first_result['question']}")
print("\nGenerated SPARQL:")
print(first_result["generated_sparql"])

## 7. Batch Model Comparison

Compare multiple LLM providers and models.

In [None]:
from nl2sparql.evaluation import (
    ModelConfig,
    run_batch_evaluation,
    create_comparison_report,
    print_comparison,
    PRESETS,
)

# View available presets
print("Available presets:")
for name, configs in PRESETS.items():
    models = [c.name for c in configs]
    print(f"  {name}: {models}")

In [None]:
# Define custom model configurations
custom_configs = [
    ModelConfig("openai", "gpt-4.1-mini", "GPT-4.1-mini"),
    # Add more as needed:
    # ModelConfig("anthropic", "claude-3-5-haiku-20241022", "Claude 3.5 Haiku"),
    # ModelConfig("mistral", "mistral-small-latest", "Mistral Small"),
]

print(f"Testing {len(custom_configs)} model(s)")

In [None]:
# Run batch evaluation (on single_pattern only for speed)
# Note: This will take some time depending on the number of models and tests

results = run_batch_evaluation(
    configs=custom_configs,
    language="it",
    validate_endpoint=True,
    categories=["single_pattern"],  # Subset for demo
    output_dir="./batch_reports",   # Save individual reports
    verbose=True,
)

In [None]:
# Generate and display comparison
comparison = create_comparison_report(results, "model_comparison.json")
print_comparison(comparison)

In [None]:
# Access comparison data programmatically
print("\nProgrammatic access to comparison data:")
for model in comparison["models"]:
    if model.get("syntax_valid_rate") is not None:
        print(f"\n{model['name']}:")
        print(f"  Syntax validity: {model['syntax_valid_rate']:.1%}")
        print(f"  Endpoint success: {model['endpoint_valid_rate']:.1%}")
        print(f"  Component score: {model['avg_component_score']:.1%}")
        print(f"  Avg time: {model['avg_generation_time']:.2f}s")
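
The same dictionary can be used to rank models by a single metric. The sketch below relies only on the keys shown in the cell above (`models`, `name`, `avg_component_score`, `avg_generation_time`). `rank_models` is a hypothetical helper, and the numbers are dummy values for illustration, not measured results.

```python
def rank_models(comparison, metric="avg_component_score", descending=True):
    """Sort model entries by one metric, skipping entries where it is missing."""
    scored = [m for m in comparison["models"] if m.get(metric) is not None]
    return sorted(scored, key=lambda m: m[metric], reverse=descending)

# Dummy numbers for illustration only; they are not measured results.
dummy = {
    "models": [
        {"name": "model-a", "avg_component_score": 0.50, "avg_generation_time": 2.0},
        {"name": "model-b", "avg_component_score": 0.75, "avg_generation_time": 3.0},
        {"name": "model-c", "avg_component_score": None, "avg_generation_time": None},
    ]
}

for m in rank_models(dummy):
    print(m["name"], m["avg_component_score"])
```

For latency, lower is better, so `rank_models(comparison, "avg_generation_time", descending=False)` puts the fastest model first.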

## 8. Testing Generated Queries

Generated queries can also be run directly against the LiITA SPARQL endpoint, either by pasting them into a SPARQL client or programmatically with `SPARQLWrapper` as below.

In [None]:
from SPARQLWrapper import SPARQLWrapper, JSON
from nl2sparql.config import LIITA_ENDPOINT

def execute_sparql(query: str, limit: int = 10):
    """Execute a SPARQL query on the LiITA endpoint."""
    sparql = SPARQLWrapper(LIITA_ENDPOINT)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    sparql.addCustomHttpHeader("Accept", "application/sparql-results+json")
    
    try:
        results = sparql.query().convert()
        bindings = results["results"]["bindings"]
        print(f"Endpoint: {LIITA_ENDPOINT}")
        print(f"Found {len(bindings)} results")
        
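        if bindings:
            # Take column names from the result header, so variables that are
            # unbound in the first row are still listed
            cols = results["head"]["vars"]
            print(f"Columns: {cols}")
            print("-" * 60)
            
            # Display first N results; unbound (OPTIONAL) variables are absent
            # from a row, so fall back to an empty string
            for row in bindings[:limit]:
                values = [row.get(c, {}).get("value", "")[:50] for c in cols]
                print(" | ".join(values))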
                
        return bindings
    except Exception as e:
        print(f"Error: {e}")
        return None

In [None]:
# Generate and execute a query
result = translator.translate("Lemmi che esprimono gioia")
print("Query:")
print(result.sparql)
print("\nResults:")
_ = execute_sparql(result.sparql)
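
In the standard SPARQL JSON results format, `head.vars` lists every selected variable, while each binding omits variables that are unbound in that row. The sketch below flattens such a response into columns and rows without needing the endpoint. `bindings_to_rows` is a hypothetical helper, and the sample response is hand-written with `example.org` placeholder IRIs.

```python
def bindings_to_rows(results, max_width=50):
    """Flatten SPARQL JSON results into (columns, rows), padding unbound variables."""
    cols = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append([binding.get(c, {}).get("value", "")[:max_width] for c in cols])
    return cols, rows

# Hand-written response in the standard JSON results shape; not endpoint output.
sample = {
    "head": {"vars": ["lemma", "label"]},
    "results": {"bindings": [
        {"lemma": {"type": "uri", "value": "http://example.org/lemma/1"},
         "label": {"type": "literal", "value": "gioia"}},
        # Second row has no "label": the variable was unbound for this solution.
        {"lemma": {"type": "uri", "value": "http://example.org/lemma/2"}},
    ]},
}

cols, rows = bindings_to_rows(sample)
print(cols)
for row in rows:
    print(" | ".join(row))
```

Keeping the formatting separate from the network call makes it straightforward to test with recorded responses.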

## 9. Cleanup

In [None]:
# Optional: Clean up generated files
import os
import shutil

files_to_remove = ["evaluation_report.json", "model_comparison.json"]
dirs_to_remove = ["batch_reports"]

for f in files_to_remove:
    if os.path.exists(f):
        os.remove(f)
        print(f"Removed {f}")

for d in dirs_to_remove:
    if os.path.exists(d):
        shutil.rmtree(d)
        print(f"Removed {d}/")

## Summary

This notebook demonstrated:

1. **Basic translation** using the `translate()` function
2. **Advanced usage** with the `NL2SPARQL` class for more control
3. **Agentic translation** with `NL2SPARQLAgent` featuring iterative refinement, streaming, and semantic verification
4. **Retrieved examples** inspection for few-shot learning
5. **Different query types**: emotions, translations, definitions, semantic relations, POS filters, morphological patterns
6. **Single model evaluation** with metrics and reports
7. **Batch model comparison** across multiple LLM providers
8. **Direct query execution** on the LiITA SPARQL endpoint

For more information, see:
- [README.md](../README.md)
- [Evaluation Documentation](../docs/evaluation.md)
- [Architecture Documentation](../docs/architecture.md)