# 🕸️ AWS GraphRAG Toolkit Demo

This notebook demonstrates how to use **AWS's GraphRAG Toolkit** for building graph-enhanced Retrieval-Augmented Generation (RAG) applications. The toolkit integrates with Amazon Neptune (graph database), Amazon OpenSearch Serverless (vector store), and Amazon Bedrock (foundation models).

## What is GraphRAG?

GraphRAG is an advanced RAG approach that:
- **Extracts entities and relationships** from documents to build a knowledge graph
- **Creates hierarchical lexical graphs** for structured understanding
- **Combines graph traversal with vector search** for comprehensive retrieval
- **Leverages Amazon Bedrock** for LLM-powered entity extraction and response generation

## AWS Services Used

| Service | Purpose |
|---------|--------|
| **Amazon Neptune** | Graph database for storing entities and relationships |
| **Amazon OpenSearch Serverless** | Vector store for semantic search |
| **Amazon Bedrock** | Foundation models (Claude, Titan) for extraction & generation |

---

## 📦 1. Installation

First, let's install the GraphRAG Toolkit and its dependencies.

In [None]:
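# Install the AWS GraphRAG Toolkit lexical-graph package from GitHub
# (the installable package lives in the repository's lexical-graph subdirectory)
%pip install "git+https://github.com/awslabs/graphrag-toolkit.git#subdirectory=lexical-graph" --quiet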

# Install additional dependencies
%pip install boto3 python-dotenv pyyaml networkx matplotlib pandas --quiet

## 🔧 2. AWS Environment Setup

The GraphRAG Toolkit requires AWS credentials and access to the following services:
- **Amazon Neptune** (Database or Analytics)
- **Amazon OpenSearch Serverless**
- **Amazon Bedrock**

### Prerequisites

1. Set up an Amazon Neptune cluster or Neptune Analytics graph
2. Create an Amazon OpenSearch Serverless collection
3. Enable Amazon Bedrock models in your region (Claude Sonnet, Titan Embeddings)
4. Configure IAM roles with appropriate permissions

In [None]:
import os
import boto3
from pathlib import Path
from dotenv import load_dotenv

# Load environment variables from .env file if it exists
load_dotenv()

# AWS Configuration
AWS_REGION = os.environ.get("AWS_REGION", "us-east-1")

# Neptune Configuration
NEPTUNE_ENDPOINT = os.environ.get("NEPTUNE_ENDPOINT", "<your-neptune-endpoint>")
NEPTUNE_PORT = os.environ.get("NEPTUNE_PORT", "8182")

# OpenSearch Serverless Configuration
OPENSEARCH_ENDPOINT = os.environ.get("OPENSEARCH_ENDPOINT", "<your-opensearch-endpoint>")

# Bedrock Model Configuration
LLM_MODEL_ID = os.environ.get("LLM_MODEL_ID", "anthropic.claude-3-sonnet-20240229-v1:0")
EMBEDDING_MODEL_ID = os.environ.get("EMBEDDING_MODEL_ID", "amazon.titan-embed-text-v2:0")

print(f"🌍 AWS Region: {AWS_REGION}")
print(f"🔗 Neptune Endpoint: {NEPTUNE_ENDPOINT}")
print(f"🔍 OpenSearch Endpoint: {OPENSEARCH_ENDPOINT}")
print(f"🤖 LLM Model: {LLM_MODEL_ID}")
print(f"📊 Embedding Model: {EMBEDDING_MODEL_ID}")
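
The defaults above include placeholder values such as `<your-neptune-endpoint>`. A small check along the following lines can catch settings that were never replaced before any AWS call is made. This is only a sketch: `find_unset_settings` is a hypothetical helper, and the example values (including the OpenSearch URL) are made up for illustration.

```python
def find_unset_settings(settings: dict) -> list:
    """Return the names of settings that are empty or still hold a <placeholder>."""
    unset = []
    for name, value in settings.items():
        text = str(value).strip()
        # Placeholders in this notebook are wrapped in angle brackets
        if not text or (text.startswith("<") and text.endswith(">")):
            unset.append(name)
    return unset


# Example values only; in the notebook these would be the variables defined above
example_settings = {
    "NEPTUNE_ENDPOINT": "<your-neptune-endpoint>",
    "OPENSEARCH_ENDPOINT": "https://example.us-east-1.aoss.amazonaws.com",
    "AWS_REGION": "us-east-1",
}
missing = find_unset_settings(example_settings)
print(f"Unset settings: {missing}")
```

Running the check right after loading the environment keeps a misconfiguration from surfacing later as an opaque connection error.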

In [None]:
# Verify AWS credentials
def verify_aws_credentials():
    """Verify AWS credentials are configured correctly."""
    try:
        sts = boto3.client('sts', region_name=AWS_REGION)
        identity = sts.get_caller_identity()
        print(f"✅ AWS credentials configured")
        print(f"   Account: {identity['Account']}")
        print(f"   User/Role: {identity['Arn']}")
        return True
    except Exception as e:
        print(f"❌ AWS credentials error: {e}")
        print("   Please configure AWS credentials using:")
        print("   - AWS CLI: aws configure")
        print("   - Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY")
        print("   - IAM role (for EC2/Lambda)")
        return False

verify_aws_credentials()

## 📁 3. Project Structure Setup

Let's create the project directory structure for our GraphRAG application.

In [None]:
# Define project directories
PROJECT_DIR = Path.cwd()
INPUT_DIR = PROJECT_DIR / "input"
OUTPUT_DIR = PROJECT_DIR / "output"

# Create directories
INPUT_DIR.mkdir(exist_ok=True)
OUTPUT_DIR.mkdir(exist_ok=True)

print(f"📂 Project directory: {PROJECT_DIR}")
print(f"📂 Input directory: {INPUT_DIR}")
print(f"📂 Output directory: {OUTPUT_DIR}")

## 📄 4. Sample Data Preparation

For this demo, we'll create a sample text document about a fictional tech company (same as the Microsoft GraphRAG demo for comparison).

In [None]:
# Sample text about a fictional tech company for demonstration
sample_text = """
# TechCorp Innovation Report 2025

## Company Overview

TechCorp is a leading technology company founded in 2015 by Sarah Chen and Michael Rodriguez in San Francisco.
The company specializes in artificial intelligence solutions for enterprise customers. With over 5,000 employees
across 20 offices worldwide, TechCorp has become a major player in the AI industry.

## Leadership Team

Sarah Chen serves as the CEO and has led the company through multiple successful funding rounds. She previously
worked at Google and Stanford AI Lab. Michael Rodriguez, the CTO, oversees all technical operations and R&D.
He holds a PhD in Machine Learning from MIT.

The CFO, Jennifer Park, joined in 2019 from Goldman Sachs. She has been instrumental in the company's financial
growth and successful IPO in 2023. David Thompson leads the Sales division and has expanded the customer base
to include Fortune 500 companies like Amazon, Microsoft, and Walmart.

## Products and Services

TechCorp's flagship product, "AIAssist Pro", is an enterprise AI assistant that helps companies automate
customer service operations. It uses advanced natural language processing and has been deployed by over
200 enterprise customers.

"DataSense Analytics" is the company's second major product, offering predictive analytics for supply chain
optimization. Major clients include Walmart and Target, who have reported 30% efficiency improvements.

The newest product, "SecureAI", launched in 2024, focuses on AI-powered cybersecurity. It has already
attracted partnerships with three major banks: JPMorgan Chase, Bank of America, and Wells Fargo.

## Research and Development

TechCorp's R&D division, led by Dr. Emily Watson, has published over 50 papers in top AI conferences.
The team recently made a breakthrough in efficient transformer architectures, reducing compute costs by 40%.

The company collaborates with Stanford University, MIT, and Carnegie Mellon on various research projects.
Dr. Watson's team includes researchers from DeepMind, OpenAI, and Google Brain.

## Financial Performance

In 2024, TechCorp reported revenue of $2.5 billion, a 45% increase from the previous year. The company's
market cap reached $50 billion after the successful IPO. Major investors include Sequoia Capital,
Andreessen Horowitz, and SoftBank Vision Fund.

## Future Plans

TechCorp plans to expand into the healthcare AI market in 2025, with partnerships already in place with
Mayo Clinic and Cleveland Clinic. The company is also developing autonomous systems for logistics,
working with FedEx and UPS on pilot programs.

Sarah Chen announced plans to open new R&D centers in London, Singapore, and Tel Aviv to attract
global talent and serve international customers better.
"""

# Save the sample text to the input directory
input_file = INPUT_DIR / "techcorp_report.txt"
input_file.write_text(sample_text)

print(f"✅ Sample document saved to: {input_file}")
print(f"📝 Document length: {len(sample_text)} characters")

## ⚙️ 5. Configuration

Let's create a configuration for the AWS GraphRAG Toolkit.

In [None]:
import yaml

# GraphRAG Toolkit configuration
config = {
    "graph_store": {
        "type": "neptune",
        "endpoint": NEPTUNE_ENDPOINT,
        "port": int(NEPTUNE_PORT),
        "region": AWS_REGION,
        "iam_auth": True
    },
    "vector_store": {
        "type": "opensearch_serverless",
        "endpoint": OPENSEARCH_ENDPOINT,
        "region": AWS_REGION,
        "index_name": "graphrag-demo-index"
    },
    "llm": {
        "provider": "bedrock",
        "model_id": LLM_MODEL_ID,
        "region": AWS_REGION,
        "max_tokens": 4096,
        "temperature": 0
    },
    "embedding": {
        "provider": "bedrock",
        "model_id": EMBEDDING_MODEL_ID,
        "region": AWS_REGION
    },
    "extraction": {
        "chunk_size": 1200,
        "chunk_overlap": 100,
        "entity_types": ["PERSON", "ORGANIZATION", "PRODUCT", "LOCATION", "EVENT"],
        "max_entities_per_chunk": 20
    },
    "input": {
        "directory": str(INPUT_DIR),
        "file_pattern": "*.txt"
    },
    "output": {
        "directory": str(OUTPUT_DIR)
    }
}

# Save configuration to YAML file
config_file = PROJECT_DIR / "config.yaml"
with open(config_file, 'w') as f:
    yaml.dump(config, f, default_flow_style=False, sort_keys=False)

print(f"✅ Configuration saved to: {config_file}")
print("\n📋 Configuration preview:")
print(yaml.dump(config, default_flow_style=False, sort_keys=False))
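
The `extraction` settings above define a `chunk_size` of 1200 and a `chunk_overlap` of 100, but the notebook never shows how such values split a document. The sketch below illustrates one common interpretation, fixed-size windows with overlap. It assumes the sizes are measured in characters, which the configuration does not state, and `chunk_text` is a hypothetical helper rather than part of the toolkit.

```python
def chunk_text(text: str, chunk_size: int = 1200, chunk_overlap: int = 100) -> list:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new window starts this many characters after the previous one
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


demo_chunks = chunk_text("a" * 2500)
print([len(c) for c in demo_chunks])  # → [1200, 1200, 300]
```

The overlap means the last 100 characters of one chunk reappear at the start of the next, so an entity mention that straddles a boundary is still seen whole in at least one chunk.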

## 🔍 6. Initialize GraphRAG Toolkit

Let's verify the GraphRAG Toolkit is installed correctly and initialize the components.

In [None]:
# Check if we can import graphrag_toolkit
try:
    import graphrag_toolkit
    print(f"✅ GraphRAG Toolkit imported successfully")
    
    # List available modules
    modules = [attr for attr in dir(graphrag_toolkit) if not attr.startswith('_')]
    print(f"\n📦 Available modules: {modules[:10]}..." if len(modules) > 10 else f"\n📦 Available modules: {modules}")
    
except ImportError as e:
    print(f"❌ GraphRAG Toolkit not installed: {e}")
    print("\nRun: pip install 'git+https://github.com/awslabs/graphrag-toolkit.git#subdirectory=lexical-graph'")

## 🏗️ 7. Entity Extraction with Amazon Bedrock

The GraphRAG Toolkit uses Amazon Bedrock to extract entities and relationships from documents.

⚠️ **Note**: This step requires:
- Active AWS credentials with Bedrock access
- Running Neptune database
- Running OpenSearch Serverless collection

In [None]:
# Entity extraction using Amazon Bedrock
import json

def extract_entities_with_bedrock(text: str, model_id: str = LLM_MODEL_ID) -> dict:
    """
    Extract entities and relationships from text using Amazon Bedrock.
    
    This is a simplified illustration of LLM-based extraction: it calls Bedrock
    directly through boto3 rather than going through the GraphRAG Toolkit's pipeline.
    """
    bedrock = boto3.client('bedrock-runtime', region_name=AWS_REGION)
    
    extraction_prompt = f"""Analyze the following text and extract:
1. ENTITIES: People, organizations, products, locations, and events mentioned
2. RELATIONSHIPS: How these entities are connected to each other

Text:
{text}

Return your response in this JSON format:
{{
    "entities": [
        {{"name": "Entity Name", "type": "PERSON|ORGANIZATION|PRODUCT|LOCATION|EVENT", "description": "Brief description"}}
    ],
    "relationships": [
        {{"source": "Entity1", "target": "Entity2", "relationship": "Description of relationship"}}
    ]
}}

Only return valid JSON, no additional text."""
    
    # Call Bedrock with Claude
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 4096,
        "temperature": 0,
        "messages": [
            {"role": "user", "content": extraction_prompt}
        ]
    })
    
    try:
        response = bedrock.invoke_model(
            modelId=model_id,
            body=body,
            contentType='application/json',
            accept='application/json'
        )
        
        response_body = json.loads(response['body'].read())
        result_text = response_body['content'][0]['text']
        
        # Parse JSON response
        return json.loads(result_text)
    
    except Exception as e:
        print(f"❌ Bedrock error: {e}")
        return {"entities": [], "relationships": []}

# Extract entities from sample text
print("🚀 Extracting entities and relationships using Amazon Bedrock...")
print("⏳ This may take a moment...\n")

try:
    extraction_result = extract_entities_with_bedrock(sample_text)
    
    print(f"✅ Extraction complete!")
    print(f"   Entities found: {len(extraction_result.get('entities', []))}")
    print(f"   Relationships found: {len(extraction_result.get('relationships', []))}")
    
except Exception as e:
    print(f"❌ Error during extraction: {e}")
    print("\n💡 Make sure you have:")
    print("   - Valid AWS credentials")
    print("   - Access to Amazon Bedrock in your region")
    print(f"   - Enabled the {LLM_MODEL_ID} model")
    extraction_result = {"entities": [], "relationships": []}
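
The extraction function parses the model's reply with a bare `json.loads`, which fails if the model wraps its JSON in a code fence or adds a sentence before it. A more tolerant parser could look like the sketch below. `parse_llm_json` is a hypothetical helper, and taking the span from the first `{` to the last `}` is a heuristic, not a guarantee of valid JSON.

```python
import json


def parse_llm_json(raw: str) -> dict:
    """Parse a JSON object from model output, tolerating code fences and stray prose."""
    # Keep only the span from the first '{' to the last '}'
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in model output")
    parsed = json.loads(raw[start:end + 1])
    # Guarantee both keys exist so downstream cells can rely on them
    parsed.setdefault("entities", [])
    parsed.setdefault("relationships", [])
    return parsed


# Simulate a reply wrapped in a Markdown code fence (built programmatically)
fence = "`" * 3
wrapped = fence + 'json\n{"entities": [{"name": "TechCorp", "type": "ORGANIZATION"}]}\n' + fence
result = parse_llm_json(wrapped)
print(result["entities"][0]["name"], len(result["relationships"]))  # → TechCorp 0
```

Raising `ValueError` for unparseable output, instead of silently returning empty lists, also makes it easier to tell a parsing problem apart from a Bedrock access problem.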

In [None]:
# Display extracted entities
import pandas as pd

if extraction_result.get('entities'):
    entities_df = pd.DataFrame(extraction_result['entities'])
    print("🏷️ Extracted Entities:")
    display(entities_df)
else:
    print("⚠️ No entities extracted. Check Bedrock configuration.")

In [None]:
# Display extracted relationships
if extraction_result.get('relationships'):
    rels_df = pd.DataFrame(extraction_result['relationships'])
    print("🔗 Extracted Relationships:")
    display(rels_df)
else:
    print("⚠️ No relationships extracted. Check Bedrock configuration.")

## 📊 8. Store in Amazon Neptune

The extracted entities and relationships are stored in Amazon Neptune as a knowledge graph.

⚠️ **Note**: This requires an active Neptune cluster or Neptune Analytics graph.

In [None]:
def store_in_neptune(entities: list, relationships: list, endpoint: str = NEPTUNE_ENDPOINT):
    """
    Store extracted entities and relationships in Amazon Neptune.
    
    This uses the openCypher query language for Neptune.
    """
    from urllib.parse import urlencode
    from botocore.auth import SigV4Auth
    from botocore.awsrequest import AWSRequest
    import requests
    
    neptune_url = f"https://{endpoint}:{NEPTUNE_PORT}/openCypher"
    
    session = boto3.Session()
    credentials = session.get_credentials()
    
    def execute_query(query: str):
        """Execute an openCypher query on Neptune with IAM auth."""
        headers = {'Content-Type': 'application/x-www-form-urlencoded'}
        # URL-encode the query so characters such as '&', '+', and '=' survive the form body
        data = urlencode({'query': query})
        
        # Sign the exact body that will be sent
        request = AWSRequest(method='POST', url=neptune_url, data=data, headers=headers)
        SigV4Auth(credentials, 'neptune-db', AWS_REGION).add_auth(request)
        
        response = requests.post(
            neptune_url,
            data=data,
            headers=dict(request.headers)
        )
        # Raise on HTTP errors so the callers' try/except blocks can report them
        response.raise_for_status()
        return response
    
    # Create entities
    print("📤 Creating entity nodes...")
    for entity in entities:
        name = entity['name'].replace("'", "\\'")
        entity_type = entity.get('type', 'ENTITY')
        description = entity.get('description', '').replace("'", "\\'")
        
        query = f"""
        MERGE (e:{entity_type} {{name: '{name}'}})
        SET e.description = '{description}'
        RETURN e
        """
        
        try:
            execute_query(query)
        except Exception as e:
            print(f"   ⚠️ Failed to create {name}: {e}")
    
    # Create relationships
    print("📤 Creating relationships...")
    for rel in relationships:
        source = rel['source'].replace("'", "\\'")
        target = rel['target'].replace("'", "\\'")
        relationship = rel.get('relationship', 'RELATED_TO').replace("'", "\\'")
        
        query = f"""
        MATCH (s {{name: '{source}'}}), (t {{name: '{target}'}})
        MERGE (s)-[r:RELATED_TO {{description: '{relationship}'}}]->(t)
        RETURN r
        """
        
        try:
            execute_query(query)
        except Exception as e:
            print(f"   ⚠️ Failed to create relationship: {e}")
    
    print(f"✅ Stored {len(entities)} entities and {len(relationships)} relationships")

# Store in Neptune (uncomment when Neptune is configured)
# store_in_neptune(extraction_result.get('entities', []), extraction_result.get('relationships', []))

print("⚠️ Neptune storage is commented out. Uncomment after configuring Neptune endpoint.")
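
The storage function above builds queries by interpolating entity names into the query string, which depends on manual quote escaping. Neptune's openCypher HTTPS endpoint also accepts a `parameters` form field containing a JSON map, so values can be passed separately from the query text. The sketch below builds such a request body. `build_entity_request` and `ALLOWED_LABELS` are hypothetical names, and because node labels cannot be supplied as parameters, the label is checked against a fixed set instead.

```python
import json
from urllib.parse import parse_qs, urlencode

# Assumed label set, mirroring the entity types used in the extraction prompt
ALLOWED_LABELS = {"PERSON", "ORGANIZATION", "PRODUCT", "LOCATION", "EVENT"}


def build_entity_request(entity: dict) -> str:
    """Build a form-encoded openCypher request body for one entity, using parameters."""
    label = str(entity.get("type", "")).upper()
    # Labels cannot be parameterized, so fall back to a generic label if unknown
    if label not in ALLOWED_LABELS:
        label = "ENTITY"
    query = (
        f"MERGE (e:{label} {{name: $name}}) "
        "SET e.description = $description "
        "RETURN e"
    )
    parameters = {
        "name": entity["name"],
        "description": entity.get("description", ""),
    }
    return urlencode({"query": query, "parameters": json.dumps(parameters)})


body = build_entity_request(
    {"name": "O'Brien & Co", "type": "organization", "description": "Example entity"}
)
# Decode the body again to confirm the name survives quoting and encoding intact
decoded = parse_qs(body)
print(json.loads(decoded["parameters"][0])["name"])  # → O'Brien & Co
```

The resulting string can be signed and posted the same way as the `data` value in `execute_query`; no quote escaping of names or descriptions is needed.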

## 🔎 9. Querying the Knowledge Graph

The GraphRAG Toolkit supports various query strategies that combine:
- **Graph traversal** for entity-centric queries
- **Vector search** for semantic similarity
- **LLM generation** for natural language responses

In [None]:
def query_graphrag(question: str, entities: list, relationships: list, model_id: str = LLM_MODEL_ID) -> str:
    """
    Answer questions using the extracted knowledge graph.
    
    This is a simplified illustration: it passes the whole extracted graph to the
    model as context rather than performing graph traversal or vector search.
    """
    bedrock = boto3.client('bedrock-runtime', region_name=AWS_REGION)
    
    # Build context from knowledge graph
    context = "Knowledge Graph Information:\n\n"
    
    context += "ENTITIES:\n"
    for entity in entities:
        context += f"- {entity['name']} ({entity.get('type', 'UNKNOWN')}): {entity.get('description', '')}\n"
    
    context += "\nRELATIONSHIPS:\n"
    for rel in relationships:
        context += f"- {rel['source']} -> {rel['target']}: {rel.get('relationship', '')}\n"
    
    query_prompt = f"""You are a helpful assistant that answers questions based on a knowledge graph.

{context}

Question: {question}

Answer the question based only on the information provided in the knowledge graph. 
If the information is not available, say so.
Be concise but comprehensive."""
    
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 2048,
        "temperature": 0,
        "messages": [
            {"role": "user", "content": query_prompt}
        ]
    })
    
    try:
        response = bedrock.invoke_model(
            modelId=model_id,
            body=body,
            contentType='application/json',
            accept='application/json'
        )
        
        response_body = json.loads(response['body'].read())
        return response_body['content'][0]['text']
    
    except Exception as e:
        return f"Error querying: {e}"

def run_query(question: str):
    """Run a GraphRAG query and display results."""
    print(f"🔍 Query: {question}")
    print("-" * 50)
    
    if extraction_result.get('entities'):
        response = query_graphrag(
            question,
            extraction_result.get('entities', []),
            extraction_result.get('relationships', [])
        )
        print(f"\n📝 Response:\n{response}")
    else:
        print("⚠️ No knowledge graph available. Run entity extraction first.")
    
    print("\n")

### Example Queries

Let's test our GraphRAG system with some questions:

In [None]:
# Entity-specific query
run_query("Who is Sarah Chen and what is her role at TechCorp?")

In [None]:
# Product-focused query
run_query("What products does TechCorp offer?")

In [None]:
# Relationship query
run_query("What universities does TechCorp collaborate with?")

In [None]:
# Synthesis query
run_query("Summarize TechCorp's business strategy and future plans.")
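
Each query above sends every extracted entity and relationship to the model, which stops scaling once the graph grows. An entity-centric alternative is to keep only the entities named in the question plus their direct neighbours. The following is a minimal sketch of that idea, assuming plain case-insensitive substring matching on entity names. `select_relevant_subgraph` is a hypothetical helper and the demo data is a hand-written subset of the sample report.

```python
def select_relevant_subgraph(question: str, entities: list, relationships: list) -> tuple:
    """Keep entities named in the question plus their directly connected neighbours."""
    question_lower = question.lower()
    # Seed set: entities whose (non-empty) name appears verbatim in the question
    seeds = {
        e["name"] for e in entities
        if e["name"] and e["name"].lower() in question_lower
    }

    # Relationships that touch at least one seed entity
    kept_rels = [r for r in relationships if r["source"] in seeds or r["target"] in seeds]

    # Include neighbours reached through those relationships
    names = set(seeds)
    for r in kept_rels:
        names.update((r["source"], r["target"]))
    kept_entities = [e for e in entities if e["name"] in names]
    return kept_entities, kept_rels


demo_entities = [
    {"name": "Sarah Chen", "type": "PERSON"},
    {"name": "TechCorp", "type": "ORGANIZATION"},
    {"name": "Stanford University", "type": "ORGANIZATION"},
    {"name": "FedEx", "type": "ORGANIZATION"},
]
demo_relationships = [
    {"source": "Sarah Chen", "target": "TechCorp", "relationship": "CEO of"},
    {"source": "TechCorp", "target": "Stanford University", "relationship": "collaborates with"},
    {"source": "TechCorp", "target": "FedEx", "relationship": "pilot program with"},
]
ents, rels = select_relevant_subgraph("Who is Sarah Chen?", demo_entities, demo_relationships)
print([e["name"] for e in ents])  # → ['Sarah Chen', 'TechCorp']
```

The filtered lists can be passed to `query_graphrag` in place of the full ones. Questions that name no entity produce an empty subgraph, so a fallback to the full graph or to vector search would be needed for those.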

## 🎨 10. Visualizing the Knowledge Graph

Let's create a visualization of the extracted knowledge graph.

In [None]:
import networkx as nx
import matplotlib.pyplot as plt

def visualize_knowledge_graph(entities: list, relationships: list):
    """Create a visualization of the knowledge graph."""
    if not entities:
        print("⚠️ No entities to visualize. Run extraction first.")
        return
    
    # Create graph
    G = nx.Graph()
    
    # Add nodes
    for entity in entities:
        G.add_node(entity['name'], type=entity.get('type', 'UNKNOWN'))
    
    # Add edges
    for rel in relationships:
        if rel['source'] in G.nodes and rel['target'] in G.nodes:
            G.add_edge(rel['source'], rel['target'])
    
    # Create visualization
    plt.figure(figsize=(16, 12))
    
    # Color nodes by type
    node_types = nx.get_node_attributes(G, 'type')
    unique_types = list(set(node_types.values()))
    
    # Define colors for entity types
    type_colors = {
        'PERSON': '#FF6B6B',
        'ORGANIZATION': '#4ECDC4',
        'PRODUCT': '#95E1D3',
        'LOCATION': '#F38181',
        'EVENT': '#AA96DA',
        'UNKNOWN': '#CCCCCC'
    }
    
    colors = [type_colors.get(node_types.get(node, 'UNKNOWN'), '#CCCCCC') for node in G.nodes()]
    
    # Layout
    pos = nx.spring_layout(G, k=2, iterations=50, seed=42)
    
    # Draw
    nx.draw(G, pos,
            node_color=colors,
            node_size=1500,
            font_size=8,
            font_weight='bold',
            with_labels=True,
            edge_color='#888888',
            alpha=0.9,
            width=1.5)
    
    plt.title("AWS GraphRAG Knowledge Graph", fontsize=16, fontweight='bold', pad=20)
    
    # Add legend
    legend_elements = [plt.Line2D([0], [0], marker='o', color='w',
                                   markerfacecolor=type_colors.get(t, '#CCCCCC'),
                                   markersize=12, label=t)
                       for t in unique_types]
    plt.legend(handles=legend_elements, loc='upper left', title='Entity Types', fontsize=10)
    
    plt.tight_layout()
    plt.savefig(PROJECT_DIR / 'knowledge_graph.png', dpi=150, bbox_inches='tight', facecolor='white')
    plt.show()
    
    print(f"\n📊 Graph Statistics:")
    print(f"   Nodes: {G.number_of_nodes()}")
    print(f"   Edges: {G.number_of_edges()}")
    print(f"   Entity Types: {unique_types}")

# Visualize the knowledge graph
visualize_knowledge_graph(
    extraction_result.get('entities', []),
    extraction_result.get('relationships', [])
)
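
Beyond node and edge counts, a quick way to see which entities anchor the graph is to rank them by the number of relationships they appear in. The sketch below does this with the standard library only. `top_connected_entities` is a hypothetical helper and the demo relationships are hand-written examples.

```python
from collections import Counter


def top_connected_entities(relationships: list, n: int = 3) -> list:
    """Rank entities by how many relationships they take part in."""
    degree = Counter()
    for rel in relationships:
        # Count both endpoints, ignoring edge direction
        degree[rel["source"]] += 1
        degree[rel["target"]] += 1
    return degree.most_common(n)


demo_rels = [
    {"source": "Sarah Chen", "target": "TechCorp"},
    {"source": "TechCorp", "target": "Stanford University"},
    {"source": "TechCorp", "target": "FedEx"},
]
print(top_connected_entities(demo_rels, n=1))  # → [('TechCorp', 3)]
```

Because this counts every listed relationship, the numbers can differ from the plot above, where the undirected `nx.Graph` merges duplicate edges between the same pair of nodes and skips relationships whose endpoints were not extracted as entities.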

## 📈 11. Vector Embeddings with Amazon Titan

For semantic search, the GraphRAG Toolkit generates vector embeddings using Amazon Titan.

In [None]:
def generate_embeddings(texts: list, model_id: str = EMBEDDING_MODEL_ID) -> list:
    """
    Generate vector embeddings using Amazon Titan.
    """
    bedrock = boto3.client('bedrock-runtime', region_name=AWS_REGION)
    embeddings = []
    
    for text in texts:
        body = json.dumps({
            "inputText": text
        })
        
        try:
            response = bedrock.invoke_model(
                modelId=model_id,
                body=body,
                contentType='application/json',
                accept='application/json'
            )
            
            response_body = json.loads(response['body'].read())
            embeddings.append(response_body['embedding'])
            
        except Exception as e:
            print(f"❌ Embedding error for '{text[:30]}...': {e}")
            embeddings.append(None)
    
    return embeddings

# Generate embeddings for entity descriptions
if extraction_result.get('entities'):
    print("🔢 Generating embeddings for entities...")
    
    entity_texts = [f"{e['name']}: {e.get('description', '')}" for e in extraction_result['entities']]
    
    try:
        entity_embeddings = generate_embeddings(entity_texts[:5])  # Limit for demo
        
        valid_embeddings = [e for e in entity_embeddings if e is not None]
        if valid_embeddings:
            print(f"✅ Generated {len(valid_embeddings)} embeddings")
            print(f"   Embedding dimension: {len(valid_embeddings[0])}")
        else:
            print("⚠️ No embeddings generated. Check Bedrock configuration.")
    except Exception as e:
        print(f"❌ Error generating embeddings: {e}")
else:
    print("⚠️ No entities available. Run extraction first.")
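
The embeddings generated above are not used for retrieval anywhere in this notebook. To illustrate how vector search and graph traversal can be combined, the sketch below ranks entities by cosine similarity to a query vector, then adds the one-hop neighbours of the best match. It is an in-memory toy under stated assumptions: the two-dimensional vectors are invented for readability, `hybrid_retrieve` is a hypothetical helper, and a real deployment would delegate the similarity search to the vector store.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_retrieve(query_vec: list, entity_vecs: dict, relationships: list, top_k: int = 1) -> dict:
    """Vector search picks seed entities; one graph hop adds their neighbours."""
    ranked = sorted(
        entity_vecs,
        key=lambda name: cosine_similarity(query_vec, entity_vecs[name]),
        reverse=True,
    )
    seeds = ranked[:top_k]
    neighbours = set()
    for rel in relationships:
        if rel["source"] in seeds:
            neighbours.add(rel["target"])
        if rel["target"] in seeds:
            neighbours.add(rel["source"])
    return {"seeds": seeds, "neighbours": sorted(neighbours - set(seeds))}


# Toy 2-D vectors; real Titan embeddings have far more dimensions
toy_vectors = {
    "AIAssist Pro": [1.0, 0.1],
    "Jennifer Park": [0.1, 1.0],
    "TechCorp": [0.7, 0.7],
}
toy_relationships = [
    {"source": "TechCorp", "target": "AIAssist Pro"},
    {"source": "Jennifer Park", "target": "TechCorp"},
]
result = hybrid_retrieve([0.9, 0.2], toy_vectors, toy_relationships)
print(result)  # → {'seeds': ['AIAssist Pro'], 'neighbours': ['TechCorp']}
```

With the notebook's own data, `entity_vecs` could be built by pairing entity names with the non-`None` entries of `entity_embeddings`, and the query vector by calling `generate_embeddings` on the question text.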

## 🧹 12. Cleanup

Optional: Remove generated files and clean up resources.

In [None]:
import shutil

def cleanup():
    """Remove all generated files."""
    dirs_to_remove = ['input', 'output']
    files_to_remove = ['config.yaml', 'knowledge_graph.png']
    
    for dir_name in dirs_to_remove:
        dir_path = PROJECT_DIR / dir_name
        if dir_path.exists():
            shutil.rmtree(dir_path)
            print(f"🗑️ Removed directory: {dir_path}")
    
    for file_name in files_to_remove:
        file_path = PROJECT_DIR / file_name
        if file_path.exists():
            file_path.unlink()
            print(f"🗑️ Removed file: {file_path}")
    
    print("\n✅ Cleanup complete!")

# Uncomment to run cleanup
# cleanup()

---

## Summary

In this notebook, we covered:

1. **Installation** of AWS GraphRAG Toolkit
2. **AWS Configuration** for Neptune, OpenSearch, and Bedrock
3. **Project Setup** with proper directory structure
4. **Entity Extraction** using Amazon Bedrock (Claude)
5. **Knowledge Graph Storage** in Amazon Neptune
6. **Querying** the knowledge graph with natural language
7. **Visualization** of extracted entities and relationships
8. **Vector Embeddings** with Amazon Titan

## AWS Services Required

| Service | Purpose | Cost Tier |
|---------|---------|----------|
| Amazon Neptune | Graph database | Pay per instance/hour |
| Amazon OpenSearch Serverless | Vector store | Pay per OCU-hour |
| Amazon Bedrock | LLM & Embeddings | Pay per token |

## Resources

- [AWS GraphRAG Toolkit GitHub](https://github.com/awslabs/graphrag-toolkit)
- [Amazon Neptune Documentation](https://docs.aws.amazon.com/neptune/)
- [Amazon Bedrock Documentation](https://docs.aws.amazon.com/bedrock/)
- [Amazon OpenSearch Serverless](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless.html)
- [GraphRAG Toolkit Blog Post](https://aws.amazon.com/blogs/database/introducing-the-graphrag-toolkit/)

## Comparison with Microsoft GraphRAG

| Feature | AWS GraphRAG Toolkit | Microsoft GraphRAG |
|---------|---------------------|-------------------|
| Graph Store | Amazon Neptune | File-based (Parquet) |
| Vector Store | OpenSearch Serverless | Built-in |
| LLM Provider | Amazon Bedrock | OpenAI |
| Embeddings | Amazon Titan | OpenAI |
| Deployment | AWS Infrastructure | Local/Cloud |
| Community Detection | Neptune Analytics | Leiden Algorithm |

## Tips

- Use **Neptune Analytics** for fast, in-memory graph queries and analytics
- Enable **Bedrock model access** in your AWS region before use
- Consider **cost optimization** by using smaller models for development
- Use **IAM roles** for secure access to AWS services