# QIAGEN BKB Text2Cypher Agent - Example Usage

This notebook demonstrates how to use the QIAGEN BKB Text2Cypher Agent for biomedical knowledge discovery.

## Setup

In [None]:
import json
import sys

# Add the parent directory to the import path so that `src` can be imported
sys.path.append('..')

from src.agents import get_intent_classifier
from src.main import BKBQueryAgent
from src.templates import get_all_templates

In [None]:
# Initialize the agent
agent = BKBQueryAgent()
print("Agent initialized successfully!")

## 1. Basic Queries

### Drug Target Interaction

In [None]:
result = agent.query("What drugs target EGFR?")

print(f"Query Type: {result['query_type']}")
print(f"Intent: {result['intent']}")
print(f"Results Found: {result['result_count']}")
print(f"\nAnswer:\n{result['answer']}")
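The cells in this notebook repeat the same handful of `print` calls, and some keys (such as `template_name`) are read with `.get`, which suggests that not every key is present in every result. A small helper can make that pattern reusable. The following is a minimal sketch, not part of the agent's API: `summarize_result` is a hypothetical name, and the sample dictionary is hand-written with illustrative values only.

```python
def summarize_result(result: dict, max_answer_chars: int = 200) -> str:
    """Build a short printable summary, tolerating missing keys."""
    answer = str(result.get("answer", ""))
    # Add an ellipsis only when the answer is actually truncated
    if len(answer) > max_answer_chars:
        answer = answer[:max_answer_chars] + "..."
    lines = [
        f"Query Type: {result.get('query_type', 'N/A')}",
        f"Intent: {result.get('intent', 'N/A')}",
        f"Results Found: {result.get('result_count', 0)}",
        "",
        "Answer:",
        answer,
    ]
    return "\n".join(lines)


# Hand-written stand-in for an agent response (values are illustrative only)
sample = {"query_type": "template", "result_count": 3, "answer": "Example answer text."}
print(summarize_result(sample))
```

With a live agent, the same helper could be called as `print(summarize_result(agent.query("What drugs target EGFR?")))`.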

### Gene-Disease Association

In [None]:
result = agent.query("Find genes associated with breast cancer")

print(f"\nAnswer:\n{result['answer']}")

# Show raw results
print("\nTop 5 Results:")
for i, res in enumerate(result['results'][:5], 1):
    print(f"{i}. {res}")

## 2. Drug Repurposing

### Find Similar Drugs

In [None]:
result = agent.query("Find drugs similar to Imatinib with at least 2 shared targets")

print(f"Query Type: {result['query_type']}")
print(f"Template: {result.get('template_name', 'N/A')}")
print(f"\nCypher Query:\n{result['cypher_query']}")
print(f"\nAnswer:\n{result['answer']}")

### Repurposing for New Disease

In [None]:
result = agent.query("Find existing drugs that could be repurposed for Alzheimer's disease")

print(f"Results: {result['result_count']}")
print(f"\nAnswer:\n{result['answer']}")

## 3. Target Identification

### Find Biomarkers

In [None]:
result = agent.query("What are the biomarkers for lung cancer?")

print(f"\nAnswer:\n{result['answer']}")

### Undrugged Targets

In [None]:
result = agent.query("Find novel undrugged targets for diabetes")

print(f"Found {result['result_count']} potential targets")
print(f"\nAnswer:\n{result['answer']}")

## 4. Indication Expansion

### New Indications for Existing Drug

In [None]:
result = agent.query("What new diseases could Metformin treat?")

print(f"\nAnswer:\n{result['answer']}")

### Orphan Disease Opportunities

In [None]:
result = agent.query("Find orphan disease opportunities for Rapamycin")

print(f"\nAnswer:\n{result['answer']}")

## 5. Complex Queries with Text2Cypher

### Multi-hop Pathway Query

In [None]:
# Force text2cypher for complex query
result = agent.query(
    "Find proteins that interact with TP53 in the apoptosis pathway",
    force_text2cypher=True
)

print(f"Query Type: {result['query_type']}")
print(f"\nGenerated Cypher:\n{result['cypher_query']}")
print(f"\nAnswer:\n{result['answer']}")

## 6. Batch Processing

In [None]:
questions = [
    "What drugs treat Parkinson's disease?",
    "Find genes in the MAPK signaling pathway",
    "Which drugs inhibit VEGF?",
    "What are biomarkers for colorectal cancer?"
]

results = agent.batch_query(questions)

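for i, (question, result) in enumerate(zip(questions, results), 1):
    print(f"\n{'='*60}")
    print(f"Query {i}: {question}")
    print(f"Results: {result['result_count']}")
    # Show a preview of the answer, adding an ellipsis only when it is truncated
    answer = result['answer']
    print(f"Answer: {answer[:200]}{'...' if len(answer) > 200 else ''}")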

## 7. Template Suggestions

In [None]:
suggestions = agent.get_suggestions("Find drugs for cancer treatment")

print("Template Suggestions:")
for i, sug in enumerate(suggestions, 1):
    print(f"\n{i}. {sug['name']}")
    print(f"   Description: {sug['description']}")
    print(f"   Intent: {sug['intent']}")
    print(f"   Example: {sug['example_question']}")

## 8. Exploring Available Templates

In [None]:
templates = get_all_templates()

print(f"Total Templates: {len(templates)}\n")

# Group templates by intent
by_intent = {}
for t in templates:
    by_intent.setdefault(t.intent, []).append(t)

for intent, tmps in by_intent.items():
    print(f"\n{intent.upper()} ({len(tmps)} templates):")
    for t in tmps:
        print(f"  - {t.name}: {t.description}")

## 9. Output Format Options

In [None]:
# Natural language (default)
result = agent.query("Find top 5 drugs that target BCR-ABL", format="natural")
print("Natural Language:")
print(result['answer'])

In [None]:
# JSON format
result = agent.query("Find top 5 drugs that target BCR-ABL", format="json")
print("\nJSON Format:")
print(json.dumps(result['results'][:3], indent=2))

In [None]:
# Table format
result = agent.query("Find top 5 drugs that target BCR-ABL", format="table")
print("\nTable Format:")
print(result['formatted_results'])
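The notebook does not show how `formatted_results` is laid out in table mode. As a rough illustration of what a plain-text table rendering of result rows could look like, here is a minimal sketch. It assumes each result row is a flat dictionary; `rows_to_table` is a hypothetical helper, and the sample rows are hand-written rather than agent output.

```python
from typing import Dict, List


def rows_to_table(rows: List[Dict[str, object]]) -> str:
    """Render a list of flat dictionaries as a plain-text table."""
    if not rows:
        return "(no results)"
    # Column order follows the keys of the first row
    columns = list(rows[0].keys())
    # Each column is as wide as its longest cell or its header
    widths = {
        c: max(len(c), *(len(str(r.get(c, ""))) for r in rows)) for c in columns
    }
    header = " | ".join(c.ljust(widths[c]) for c in columns)
    rule = "-+-".join("-" * widths[c] for c in columns)
    body = [
        " | ".join(str(r.get(c, "")).ljust(widths[c]) for c in columns)
        for r in rows
    ]
    return "\n".join([header, rule, *body])


# Hand-written sample rows for illustration
sample_rows = [
    {"drug": "Imatinib", "target": "BCR-ABL"},
    {"drug": "Dasatinib", "target": "BCR-ABL"},
]
print(rows_to_table(sample_rows))
```

Under the same assumption about row shape, this could also be applied to `result['results']` from the JSON-format query above.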

## 10. Intent Classification

In [None]:
classifier = get_intent_classifier()

test_queries = [
    "What drugs target BRAF?",
    "Find similar compounds to Aspirin",
    "Could Imatinib be used for new indications?",
    "What are biomarkers for prostate cancer?",
    "Find undrugged targets in the EGFR pathway"
]

for query in test_queries:
    intent, confidence_scores = classifier.classify_with_confidence(query)
    print(f"\nQuery: {query}")
    print(f"Intent: {intent}")
    if confidence_scores:
        print(f"Confidence: {confidence_scores[0][1]:.2f}")
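The loop above unpacks an intent together with a sequence whose first entry is indexed as `[0][1]`, which is consistent with a list of `(intent, score)` pairs. To make that shape concrete, here is a toy keyword-based classifier that returns the same kind of structure. It is only a sketch under stated assumptions: `toy_classify`, the intent names, and the keyword lists are all hypothetical, and the real classifier may work quite differently.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical intents and keywords, chosen only for illustration
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "drug_target": ["target", "inhibit"],
    "drug_repurposing": ["similar", "repurpose"],
    "biomarker": ["biomarker"],
}


def toy_classify(query: str) -> Tuple[Optional[str], List[Tuple[str, float]]]:
    """Return the best intent and a ranked list of (intent, score) pairs."""
    text = query.lower()
    # Count how many keywords of each intent occur in the query
    hits = {
        intent: sum(kw in text for kw in keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    total = sum(hits.values())
    if total == 0:
        return None, []
    # Normalize the counts so that the scores sum to 1, then rank them
    ranked = sorted(
        ((intent, count / total) for intent, count in hits.items() if count > 0),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[0][0], ranked


intent, scores = toy_classify("What drugs target BRAF?")
print(intent, scores)
```

The empty-list return for unmatched queries mirrors the `if confidence_scores:` guard used in the cell above.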

## Summary

This notebook demonstrated:
- Basic query execution
- Drug repurposing queries
- Target identification
- Indication expansion
- Complex text2cypher queries
- Batch processing
- Template suggestions
- Exploring available templates
- Multiple output formats
- Intent classification

The hybrid approach is designed to provide:
- **High accuracy** for common patterns (predefined templates)
- **Flexibility** for novel queries (text2cypher fallback)
- **Automatic routing** based on query intent
- **Error recovery** with query refinement
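
The routing idea behind the hybrid approach can be sketched as follows: try the predefined templates first, and hand the question to a text2cypher generator only when no template matches. This is an illustrative sketch, not the agent's implementation. The `TEMPLATES` registry, the `route` function, the regular expression, and the node labels and relationship types in the Cypher string are all hypothetical placeholders rather than the actual BKB schema.

```python
import re
from typing import Callable, Dict

# Hypothetical template registry: name -> (question pattern, parameterized Cypher).
# The labels and relationship types are placeholders, not the BKB schema.
TEMPLATES = {
    "drug_targets": (
        re.compile(r"what drugs target (?P<gene>[\w-]+)", re.IGNORECASE),
        "MATCH (d:Drug)-[:TARGETS]->(g:Gene {name: $gene}) RETURN d.name",
    ),
}


def route(question: str, text2cypher: Callable[[str], str]) -> Dict[str, object]:
    """Use a matching template if there is one, else fall back to text2cypher."""
    for name, (pattern, cypher) in TEMPLATES.items():
        match = pattern.search(question)
        if match:
            return {
                "query_type": "template",
                "template_name": name,
                "cypher_query": cypher,
                "params": match.groupdict(),
            }
    # No template matched: delegate query generation to the fallback
    return {
        "query_type": "text2cypher",
        "template_name": None,
        "cypher_query": text2cypher(question),
        "params": {},
    }


def stub_generator(question: str) -> str:
    """Stand-in for an LLM-backed generator; returns a Cypher comment."""
    return f"// generated for: {question}"


print(route("What drugs target EGFR?", stub_generator))
print(route("Find proteins that interact with TP53", stub_generator))
```

Keeping the fallback as a plain callable makes the router easy to test with a stub, as done here, before a real generator is wired in.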