# 🕸️ Tutorial 4: Building Knowledge Graphs

**Learn to build knowledge graphs from research papers using AI.**

## What You'll Learn:
- Extract entities from papers using AI
- Build knowledge graphs automatically
- Visualize research connections
- Query graphs for insights

**Time:** 15 minutes | **Level:** Beginner

## Step 1: Setup

In [None]:
# Import what we need
import sys
import os

# Make the project root importable, whether running from the repo root or from tutorial/
if os.path.basename(os.getcwd()) == 'tutorial':
    sys.path.insert(0, '..')
else:
    sys.path.insert(0, '.')

# Enable widgets for visualization
try:
    import google.colab
    from google.colab import output
    output.enable_custom_widget_manager()
    print("📱 Google Colab widget support enabled")
except ImportError:
    # Not running in Google Colab; no extra widget setup is needed
    pass

# Import the GraphRAG system
from src.langchain_graph_rag import LangChainGraphRAG

print("✅ Setup complete!")

## Step 2: Create GraphRAG System

In [None]:
# Create knowledge graph system
graph_rag = LangChainGraphRAG(
    llm_model="llama3.1:8b",
    embedding_model="nomic-embed-text"
)

print("🕸️ Knowledge graph system ready!")
print(f"📊 Current papers: {len(graph_rag.get_all_papers())}")

## Step 3: Add Your First Paper

In [None]:
# Sample research paper content
paper_content = """
Machine Learning for Drug Discovery
Authors: Dr. Sarah Chen (MIT), Prof. Michael Torres (Stanford)

This study presents deep learning approaches for molecular property prediction.
We developed ChemNet, a transformer architecture for chemical analysis.
The model was trained on PubChem and ChEMBL datasets with 95% accuracy.
Applications include drug discovery and materials science.
"""

# Add paper to knowledge graph
result = graph_rag.extract_entities_and_relationships(
    paper_content=paper_content,
    paper_title="Machine Learning for Drug Discovery",
    paper_id="paper_1"
)

print("✅ Paper added to knowledge graph!")
print(f"📝 Documents created: {result['documents_added']}")
print(f"🏷️ Entities extracted: {sum(len(names) for names in result['entities'].values())}")
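
The entity count above relies on `result['entities']` being a dictionary that maps each category name to a list of entity strings. The sketch below illustrates that assumed shape with a hand-written dictionary. The category names and values are illustrative assumptions, not actual model output, and `count_entities` is a hypothetical helper rather than part of the library.

```python
# Hypothetical example of the shape assumed for result['entities'];
# real category names and values depend on the library and the LLM.
sample_entities = {
    "authors": ["Sarah Chen", "Michael Torres"],
    "methods": ["ChemNet"],
    "datasets": ["PubChem", "ChEMBL"],
    "concepts": [],
}


def count_entities(entities):
    """Return the total number of entities across all categories."""
    return sum(len(names) for names in entities.values())


total_entities = count_entities(sample_entities)
print(f"Entities extracted: {total_entities}")
```

Empty categories contribute zero to the total, which is why Step 4 skips them when printing.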

## Step 4: See What AI Found

In [None]:
# Show extracted entities
entities = result['entities']

print("🤖 AI Found These Entities:")
print("=" * 30)

for category, entity_list in entities.items():
    if entity_list:
        print(f"\n📋 {category.upper()}:")
        for entity in entity_list:
            print(f"   • {entity}")

## Step 5: Add Another Paper

In [None]:
# Second paper with overlapping entities
paper_content_2 = """
Neural Networks in Chemical Research
Authors: Prof. Michael Torres (Stanford), Dr. Elena Rodriguez (UCSF)

We explore transformer models for pharmaceutical applications.
Our system uses attention mechanisms for molecular analysis.
Training data included PubChem and proprietary datasets.
Results show improved drug candidate generation.
"""

# Add second paper
result_2 = graph_rag.extract_entities_and_relationships(
    paper_content=paper_content_2,
    paper_title="Neural Networks in Chemical Research",
    paper_id="paper_2"
)

print("✅ Second paper added!")
print(f"📚 Total papers: {len(graph_rag.get_all_papers())}")

## Step 6: Find Connections

In [None]:
# Find papers connected by shared authors
connections = graph_rag.find_related_papers("paper_1", "authors")

print("🔗 Paper Connections Found:")
print("=" * 30)

if connections['related_papers']:
    for paper_id, info in connections['related_papers'].items():
        print(f"\n📄 {info['paper_title']}")
        print(f"   🔗 Shared authors: {', '.join(info['shared_entities'])}")
else:
    print("No connections found")
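
One way to think about shared-entity connections is as a set intersection between the entities of two papers. The sketch below is a simplified illustration under that assumption. `find_shared` and the `paper_authors` data are hypothetical and are not how `find_related_papers` is necessarily implemented.

```python
# Hypothetical per-paper author sets; in the tutorial these come from the LLM.
paper_authors = {
    "paper_1": {"Sarah Chen", "Michael Torres"},
    "paper_2": {"Michael Torres", "Elena Rodriguez"},
}


def find_shared(paper_id, entities_by_paper):
    """Return {other_paper_id: sorted shared entities} for overlapping papers."""
    own = entities_by_paper[paper_id]
    related = {}
    for other_id, other in entities_by_paper.items():
        if other_id == paper_id:
            continue  # a paper is not related to itself
        shared = own & other  # set intersection
        if shared:
            related[other_id] = sorted(shared)
    return related


shared_with_paper_1 = find_shared("paper_1", paper_authors)
print(shared_with_paper_1)
```

Exact string matching like this treats "Prof. Michael Torres" and "Michael Torres" as different people, so if no connections appear, check how the extracted names are spelled.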

## Step 7: Query the Knowledge Graph

In [None]:
# Ask questions about your research
query = "machine learning and chemistry"

results = graph_rag.query_graph(query)

print(f"🔍 Query: '{query}'")
print("=" * 40)
print(f"📊 Found {results['papers_found']} relevant papers")

for paper_id, paper_data in results['papers'].items():
    print(f"\n📄 {paper_data['paper_title']}")
    print(f"   💬 Relevant sections: {len(paper_data['chunks'])}")
    
    # Show snippet
    if paper_data['chunks']:
        snippet = paper_data['chunks'][0][:100]
        print(f"   📝 {snippet}...")
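
Because the system is configured with an embedding model, the query is presumably matched against document chunks by vector similarity. The toy sketch below shows the general idea of similarity ranking using plain word counts as a stand-in for embeddings. All names here (`tokenize`, `cosine`, `rank_chunks`) are hypothetical, and this is not the library's retrieval code.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_chunks(query, chunks):
    """Return (paper_id, score) pairs sorted from most to least similar."""
    query_vec = Counter(tokenize(query))
    scores = {
        paper_id: cosine(query_vec, Counter(tokenize(text)))
        for paper_id, text in chunks.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# One sentence from each sample paper stands in for its chunks.
chunks = {
    "paper_1": "This study presents deep learning approaches for molecular property prediction.",
    "paper_2": "We explore transformer models for pharmaceutical applications.",
}

ranking = rank_chunks("deep learning for molecular prediction", chunks)
for paper_id, score in ranking:
    print(f"{paper_id}: {score:.2f}")
```

Word counts only reward exact word overlap. An embedding model can also match related terms such as "chemistry" and "molecular", which is why a query like the one in this step can return papers that never use its exact words.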

## Step 8: Visualize Your Knowledge Graph

In [None]:
from src.notebook_visualization import show_knowledge_graph

print("🎨 Creating knowledge graph visualization...")

# Display the interactive graph. A separate variable name keeps the
# extraction `result` from Step 3 available for re-running earlier cells.
graph_view = show_knowledge_graph(graph_rag)

if graph_view:
    print("\n💡 If you see a graph above, you can:")
    print("   • Drag nodes to rearrange")
    print("   • Use the sidebar to explore properties")
    print("   • Search for specific entities")
    print("   • Zoom and pan to navigate")
else:
    print("\n⚠️ The visualization could not be displayed, but your knowledge graph still works!")
    
    # Show what we built anyway
    summary = graph_rag.get_graph_summary()
    print("\n📊 Your Knowledge Graph:")
    print(f"   📄 Papers: {summary['total_papers']}")
    print(f"   📝 Documents: {summary['total_documents']}")
    print(f"   🏷️ Entities: {sum(summary['unique_entities'].values())}")
    print("\n🎉 Knowledge graph built successfully!")

## Step 9: Explore Your Graph

In [None]:
# Get overview of your knowledge graph
summary = graph_rag.get_graph_summary()

print("📊 Knowledge Graph Summary:")
print("=" * 30)
print(f"📄 Papers: {summary['total_papers']}")
print(f"📝 Document chunks: {summary['total_documents']}")

print("\n🏷️ Unique Entities:")
for entity_type, count in summary['unique_entities'].items():
    if count > 0:
        print(f"   • {entity_type.title()}: {count}")

print("\n🎉 You built a knowledge graph!")
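
To make the summary numbers less opaque, the sketch below builds a tiny paper-to-entity graph from hand-written extraction results and derives similar counts. The `extracted` data and `build_graph` are hypothetical. The keys `total_papers` and `unique_entities` mirror the summary keys used above, but how the library computes them internally is an assumption.

```python
from collections import defaultdict

# Hypothetical extraction results keyed by paper_id.
extracted = {
    "paper_1": {
        "authors": ["Sarah Chen", "Michael Torres"],
        "datasets": ["PubChem", "ChEMBL"],
    },
    "paper_2": {
        "authors": ["Michael Torres", "Elena Rodriguez"],
        "datasets": ["PubChem"],
    },
}


def build_graph(extracted):
    """Return (edges, unique) where edges link papers to entities and
    unique maps each entity type to the set of distinct names seen."""
    edges = []
    unique = defaultdict(set)
    for paper_id, entities in extracted.items():
        for entity_type, names in entities.items():
            for name in names:
                edges.append((paper_id, entity_type, name))
                unique[entity_type].add(name)
    return edges, unique


edges, unique = build_graph(extracted)
summary_sketch = {
    "total_papers": len(extracted),
    "unique_entities": {t: len(names) for t, names in unique.items()},
}
print(summary_sketch)
```

An entity mentioned in both papers, such as PubChem, produces two edges but counts once as a unique entity. Those shared nodes are what link papers together in the visualization.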

## Try It Yourself!

In [None]:
# Add your own paper content here!
your_paper = """
Replace this with content from your research area:
- Title and authors
- Abstract or summary
- Key methods and findings
- Datasets used
"""

# Uncomment and modify to add your paper:
# result = graph_rag.extract_entities_and_relationships(
#     paper_content=your_paper,
#     paper_title="Your Paper Title",
#     paper_id="your_paper"
# )
# print("✅ Your paper added to the knowledge graph!")

print("💡 Add your own research content above to expand the graph!")
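
When adding several papers, it can help to prepare the arguments in a loop instead of copying the call for each one. The sketch below is one possible approach. `make_paper_id` and `prepare_papers` are hypothetical helpers, and only the keyword names (`paper_content`, `paper_title`, `paper_id`) come from the calls used earlier in this tutorial.

```python
import re


def make_paper_id(title):
    """Turn a title into a lowercase, underscore-separated identifier."""
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return slug or "untitled"


def prepare_papers(papers):
    """Map (title, content) pairs to keyword arguments for the extraction call."""
    return [
        {
            "paper_content": content.strip(),
            "paper_title": title,
            "paper_id": make_paper_id(title),
        }
        for title, content in papers
    ]


batch = prepare_papers([
    ("Machine Learning for Drug Discovery",
     "  Deep learning for molecular property prediction.  "),
    ("Neural Networks in Chemical Research",
     "Transformer models for pharmaceutical applications."),
])

for kwargs in batch:
    print(kwargs["paper_id"])
    # With the tutorial's graph_rag object in scope, each entry could be passed as:
    # graph_rag.extract_entities_and_relationships(**kwargs)
```

Deriving the ID from the title keeps IDs readable, but two papers with the same title would collide, so a larger collection may need a counter or DOI-based IDs instead.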

## 🎓 What You Learned

**Congratulations!** You built an AI-powered knowledge graph that:

✅ **Extracts entities** from research papers automatically  
✅ **Finds connections** between different papers  
✅ **Answers questions** about your research  
✅ **Visualizes relationships** in an interactive graph  

### 🚀 Next Steps:
- Add real papers from your research area
- Try different query types
- Explore the interactive visualization
- Scale up to larger paper collections

### 🔗 Real-world Applications:
- **Literature reviews** - Find research gaps
- **Collaboration mapping** - Discover research networks  
- **Citation analysis** - Track research influence
- **Knowledge discovery** - Uncover hidden connections
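
As a small illustration of collaboration mapping, the sketch below counts how often each pair of authors appears on the same paper and how many co-authorship links each author has. The author lists are written by hand from the two sample papers, and `coauthor_counts` is a hypothetical helper, not part of the GraphRAG system.

```python
from collections import Counter
from itertools import combinations

# Author lists taken from the two sample papers in this tutorial.
paper_authors = {
    "paper_1": ["Sarah Chen", "Michael Torres"],
    "paper_2": ["Michael Torres", "Elena Rodriguez"],
}


def coauthor_counts(paper_authors):
    """Count how many papers each (author, author) pair has written together."""
    pairs = Counter()
    for authors in paper_authors.values():
        # Sort so each pair always appears in the same order.
        for a, b in combinations(sorted(set(authors)), 2):
            pairs[(a, b)] += 1
    return pairs


pairs = coauthor_counts(paper_authors)

# Number of co-authorship links per author.
degree = Counter()
for (a, b), n in pairs.items():
    degree[a] += n
    degree[b] += n

for author, links in degree.most_common():
    print(f"{author}: {links}")
```

Authors with the most links sit at the center of the collaboration network. In this sample that is Michael Torres, who appears on both papers.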