# GraphQL Query Resolvers - Interactive Tutorial

In this notebook, you'll learn how to implement GraphQL query resolvers for the RAG Engine.

## 📚 Learning Objectives

By the end of this notebook, you will:
- Understand GraphQL resolver patterns
- Implement document query resolvers
- Implement chat session resolvers
- Implement query history resolvers
- Learn performance optimization techniques

## 🔧 Prerequisites

Ensure you have the following installed:
- Python 3.11+
- Strawberry GraphQL
- FastAPI
- PostgreSQL

## 📦 Setup

Let's start by importing necessary libraries and setting up our environment.

In [None]:
# Import required libraries
import asyncio
from typing import List, Optional
from datetime import datetime
from dataclasses import dataclass

# GraphQL library
import strawberry
from strawberry.tools import merge_types

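# Print setup confirmation
# Read the installed version from package metadata; this assumes the
# package was installed under its PyPI name, "strawberry-graphql".
from importlib.metadata import version

print("✅ Libraries imported successfully!")
print(f"   - Strawberry version: {version('strawberry-graphql')}")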

## 1. GraphQL Basics

GraphQL is a query language for APIs that allows clients to request exactly the data they need.

### 1.1 Schema Definition

A GraphQL schema defines the types, queries, mutations, and subscriptions available in the API.

In [None]:
# Define GraphQL types
from enum import Enum

# Strawberry enums are standard Python enums decorated with @strawberry.enum
@strawberry.enum
class DocumentStatus(Enum):
    CREATED = "created"
    INDEXED = "indexed"
    FAILED = "failed"

@strawberry.enum
class QuerySortBy(Enum):
    CREATED = "created"
    UPDATED = "updated"
    FILENAME = "filename"
    SIZE = "size"

# Document type
@strawberry.type
class DocumentType:
    id: strawberry.ID
    filename: str
    content_type: str
    size_bytes: int
    status: DocumentStatus
    created_at: datetime
    updated_at: Optional[datetime]

# Facet type for search results
@strawberry.type
class FacetType:
    name: str
    count: int

# Search result type
@strawberry.type
class SearchResultType:
    results: List[DocumentType]
    total: int
    facets: Optional[List[FacetType]]

print("✅ GraphQL types defined successfully!")

### 1.2 Query Resolvers

Query resolvers are functions that fetch data for each field in your GraphQL schema.

When a client executes a query, GraphQL calls the resolver for each field and assembles the response.
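To make the per-field resolution concrete, here is a minimal plain-Python sketch. It is not Strawberry's actual execution engine: the `RESOLVERS` table, the `resolve_selection` helper, and the computed `sizeKb` field are hypothetical names introduced only for illustration.

```python
from typing import Any, Callable, Dict, List

# Hypothetical resolver table: one function per field name.
RESOLVERS: Dict[str, Callable[[dict], Any]] = {
    "id": lambda doc: doc["id"],
    "filename": lambda doc: doc["filename"],
    # A computed field: its value is derived, not stored on the source object.
    "sizeKb": lambda doc: doc["size_bytes"] // 1024,
}


def resolve_selection(source: dict, selection: List[str]) -> Dict[str, Any]:
    """Call the resolver for each requested field and assemble the response."""
    return {field: RESOLVERS[field](source) for field in selection}


doc = {"id": "doc-001", "filename": "research-paper.pdf", "size_bytes": 1048576}

# Only the requested fields are resolved; "id" is never touched here.
response = resolve_selection(doc, ["filename", "sizeKb"])
print(response)
```

The response contains only the fields named in the selection, which is the behavior a GraphQL client relies on when it asks for exactly the data it needs.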

In [None]:
# Mock data for demonstration
MOCK_DOCUMENTS = [
    {
        "id": "doc-001",
        "filename": "research-paper.pdf",
        "content_type": "application/pdf",
        "size_bytes": 1048576,
        "status": "indexed",
        "created_at": datetime(2026, 1, 31, 10, 0, 0),
        "updated_at": datetime(2026, 1, 31, 10, 5, 0),
    },
    {
        "id": "doc-002",
        "filename": "presentation.pptx",
        "content_type": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
        "size_bytes": 5242880,
        "status": "indexed",
        "created_at": datetime(2026, 1, 30, 15, 30, 0),
        "updated_at": None,
    },
    {
        "id": "doc-003",
        "filename": "data-analysis.xlsx",
        "content_type": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        "size_bytes": 262144,
        "status": "created",
        "created_at": datetime(2026, 1, 29, 9, 15, 0),
        "updated_at": None,
    },
]

print(f"✅ Mock data prepared: {len(MOCK_DOCUMENTS)} documents")

## 2. Document Query Resolvers

### 2.1 List Documents Resolver

The `documents` resolver fetches a paginated list of documents for a tenant.

In [None]:
@strawberry.type
class Query:
    @strawberry.field
    def documents(
        self,
        info,
        limit: int = 20,
        offset: int = 0,
        status: Optional[DocumentStatus] = None,
    ) -> List[DocumentType]:
        """
        Query documents with pagination and filtering.

        Args:
            info: GraphQL execution context
            limit: Max results (default: 20, max: 100)
            offset: Pagination offset (default: 0)
            status: Filter by status (optional)

        Returns:
            List of documents
        """
        # Validate inputs
        if limit < 0:
            raise ValueError("limit must be non-negative")
        if limit > 100:
            limit = 100  # Enforce maximum
        if offset < 0:
            raise ValueError("offset must be non-negative")

        # Simulate tenant extraction (in real app: from request headers)
        tenant_id = "demo-tenant-001"

        # Simulate database query: filter first, then paginate, so that
        # limit/offset apply to the matching documents only
        documents = MOCK_DOCUMENTS
        if status is not None:
            documents = [d for d in documents if d["status"] == status.value]
        documents = documents[offset:offset + limit]

        # Convert to GraphQL types
        return [
            DocumentType(
                id=doc["id"],
                filename=doc["filename"],
                content_type=doc["content_type"],
                size_bytes=doc["size_bytes"],
                status=DocumentStatus(doc["status"]),
                created_at=doc["created_at"],
                updated_at=doc.get("updated_at"),
            )
            for doc in documents
        ]

print("✅ documents resolver defined successfully!")
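The order in which filtering and pagination are applied changes what a page contains. The following sketch compares the two orders on a small stand-in data set; `ROWS`, `page_then_filter`, and `filter_then_page` are hypothetical names used only for this comparison.

```python
from typing import List

# Stand-in rows with the same statuses as the tutorial's mock documents.
ROWS = [
    {"id": "doc-001", "status": "indexed"},
    {"id": "doc-002", "status": "indexed"},
    {"id": "doc-003", "status": "created"},
]


def page_then_filter(rows: List[dict], status: str, limit: int, offset: int) -> List[str]:
    """Slice first, filter second: matches outside the slice are lost."""
    page = rows[offset:offset + limit]
    return [r["id"] for r in page if r["status"] == status]


def filter_then_page(rows: List[dict], status: str, limit: int, offset: int) -> List[str]:
    """Filter first, slice second: limit/offset count matching rows only."""
    matching = [r for r in rows if r["status"] == status]
    return [r["id"] for r in matching[offset:offset + limit]]


print(page_then_filter(ROWS, "created", limit=2, offset=0))  # []
print(filter_then_page(ROWS, "created", limit=2, offset=0))  # ['doc-003']
```

With a real database, both steps would normally be pushed into a single query (a `WHERE` clause combined with `LIMIT`/`OFFSET`), which gives the filter-first behavior.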

### 2.2 Test the documents resolver

Let's test our resolver with a GraphQL query.

In [None]:
# Create schema
schema = strawberry.Schema(query=Query)

# Test query: list all documents
query = '''
query ListDocuments($limit: Int, $offset: Int, $status: DocumentStatus) {
  documents(limit: $limit, offset: $offset, status: $status) {
    id
    filename
    contentType
    sizeBytes
    status
    createdAt
    updatedAt
  }
}
'''

# Execute query
result = schema.execute_sync(
    query,
    variable_values={
        "limit": 10,
        "offset": 0,
        "status": None,
    }
)

# Display results
print("📄 GraphQL Query Results:")
print(result.data)

### 2.3 Get Single Document Resolver

The `document` resolver fetches a single document by ID.

In [None]:
@strawberry.type
class Query:
    # ... previous documents resolver ...

    @strawberry.field
    def document(
        self,
        info,
        document_id: strawberry.ID,
    ) -> Optional[DocumentType]:
        """
        Get a single document by ID.

        Args:
            info: GraphQL execution context
            document_id: Document ID to fetch

        Returns:
            Document or None if not found
        """
        # Validate input
        if not document_id:
            raise ValueError("document_id is required")

        # Simulate tenant extraction
        tenant_id = "demo-tenant-001"

        # Simulate database query
        doc = next((d for d in MOCK_DOCUMENTS if d["id"] == str(document_id)), None)

        # Return None if not found
        if not doc:
            return None

        # Convert to GraphQL type
        return DocumentType(
            id=doc["id"],
            filename=doc["filename"],
            content_type=doc["content_type"],
            size_bytes=doc["size_bytes"],
            status=DocumentStatus(doc["status"]),
            created_at=doc["created_at"],
            updated_at=doc.get("updated_at"),
        )

print("✅ document resolver defined successfully!")
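Both resolvers so far hard-code `tenant_id`. As a rough sketch of what tenant extraction and tenant-scoped lookup could look like, the example below reads the tenant from a request object stored in the context. The header name `x-tenant-id`, the `tenant_id` column, and the helpers `get_tenant_id` and `find_document` are assumptions for illustration; the request object is a stand-in, not a real framework request.

```python
from types import SimpleNamespace
from typing import Optional

# Hypothetical header name; the tutorial does not specify one.
TENANT_HEADER = "x-tenant-id"

# Hypothetical rows that carry a tenant_id column.
ROWS = [
    {"id": "doc-001", "tenant_id": "tenant-a", "filename": "a.pdf"},
    {"id": "doc-002", "tenant_id": "tenant-b", "filename": "b.pdf"},
]


def get_tenant_id(context: dict) -> str:
    """Read the tenant from the request headers stored in the context."""
    tenant_id = context["request"].headers.get(TENANT_HEADER)
    if not tenant_id:
        raise PermissionError("missing tenant header")
    return tenant_id


def find_document(context: dict, document_id: str) -> Optional[dict]:
    """Return the document only if it belongs to the caller's tenant."""
    tenant_id = get_tenant_id(context)
    return next(
        (r for r in ROWS if r["id"] == document_id and r["tenant_id"] == tenant_id),
        None,
    )


# Stand-in for a web framework request object.
context = {"request": SimpleNamespace(headers={TENANT_HEADER: "tenant-a"})}

print(find_document(context, "doc-001"))  # found: same tenant
print(find_document(context, "doc-002"))  # None: belongs to another tenant
```

Returning `None` for another tenant's document, exactly as for a missing one, means a caller cannot tell whether the ID exists elsewhere.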

### 2.4 Test the document resolver

In [None]:
# Re-create schema with updated Query type
schema = strawberry.Schema(query=Query)

# Test query: get single document
query = '''
query GetDocument($documentId: ID!) {
  document(documentId: $documentId) {
    id
    filename
    status
    createdAt
  }
}
'''

# Execute query
result = schema.execute_sync(
    query,
    variable_values={
        "documentId": "doc-001",
    }
)

# Display results
print("📄 Single Document Query Results:")
print(result.data)

## 3. Search Documents Resolver

### 3.1 Full-Text Search

The `search_documents` resolver performs full-text search with faceted results.

In [None]:
@strawberry.type
class Query:
    # ... previous resolvers ...

    @strawberry.field
    def search_documents(
        self,
        info,
        query: str,
        k: int = 10,
        sort_by: QuerySortBy = QuerySortBy.CREATED,
        limit: int = 20,
        offset: int = 0,
    ) -> SearchResultType:
        """
        Search documents with full-text search.

        Args:
            info: GraphQL execution context
            query: Search query (required)
            k: Number of top matches to retrieve (default: 10); validated but unused by this mock search
            sort_by: Sort order (default: CREATED)
            limit: Max results (default: 20)
            offset: Pagination offset (default: 0)

        Returns:
            Search results with facets
        """
        # Validate inputs
        if not query or not query.strip():
            raise ValueError("query is required")

        if k < 1 or k > 100:
            raise ValueError("k must be between 1 and 100")

        if limit < 1 or limit > 100:
            raise ValueError("limit must be between 1 and 100")

        # Simulate search (filter by filename contains query)
        search_results = [
            d for d in MOCK_DOCUMENTS
            if query.lower() in d["filename"].lower()
        ]

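        # Sort results
        if sort_by == QuerySortBy.CREATED:
            search_results.sort(key=lambda x: x["created_at"], reverse=True)
        elif sort_by == QuerySortBy.UPDATED:
            # Assumption: documents that were never updated fall back to created_at
            search_results.sort(
                key=lambda x: x["updated_at"] or x["created_at"], reverse=True
            )
        elif sort_by == QuerySortBy.FILENAME:
            search_results.sort(key=lambda x: x["filename"])
        elif sort_by == QuerySortBy.SIZE:
            search_results.sort(key=lambda x: x["size_bytes"], reverse=True)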

        # Paginate
        paginated_results = search_results[offset:offset + limit]

        # Calculate facets
        status_facets = {}
        for doc in search_results:
            status = doc["status"].upper()
            status_facets[status] = status_facets.get(status, 0) + 1

        facets = [
            FacetType(name=status, count=count)
            for status, count in status_facets.items()
        ]

        # Convert to GraphQL types
        return SearchResultType(
            results=[
                DocumentType(
                    id=doc["id"],
                    filename=doc["filename"],
                    content_type=doc["content_type"],
                    size_bytes=doc["size_bytes"],
                    status=DocumentStatus(doc["status"]),
                    created_at=doc["created_at"],
                    updated_at=doc.get("updated_at"),
                )
                for doc in paginated_results
            ],
            total=len(search_results),
            facets=facets,
        )

print("✅ search_documents resolver defined successfully!")
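The facet calculation in the resolver is a manual counting loop. As an alternative sketch, `collections.Counter` from the standard library does the same counting; the `statuses` list below is a stand-in for the `status` values of the full (unpaginated) match set.

```python
from collections import Counter

# Stand-in for the status values of all matching documents (before pagination).
statuses = ["indexed", "indexed", "created"]

# Count each status once, using the upper-cased name as the facet label.
facet_counts = Counter(status.upper() for status in statuses)

# most_common() orders facets by descending count.
facets = [{"name": name, "count": count} for name, count in facet_counts.most_common()]
print(facets)
```

Counting over the full match set, not the current page, keeps the facet totals stable while the client pages through results.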

### 3.2 Test the search_documents resolver

In [None]:
# Re-create schema with updated Query type
schema = strawberry.Schema(query=Query)

# Test query: search documents
query = '''
query SearchDocuments($query: String!, $k: Int, $sortBy: QuerySortBy) {
  searchDocuments(query: $query, k: $k, sortBy: $sortBy) {
    results {
      id
      filename
      status
    }
    total
    facets {
      name
      count
    }
  }
}
'''

# Execute query
result = schema.execute_sync(
    query,
    variable_values={
        "query": "pdf",
        "k": 10,
        "sortBy": "CREATED",
    }
)

# Display results
print("🔍 Search Results:")
print(f"Total: {result.data['searchDocuments']['total']}")
print(f"\nFacets:")
for facet in result.data['searchDocuments']['facets']:
    print(f"  - {facet['name']}: {facet['count']}")
print(f"\nResults:")
for doc in result.data['searchDocuments']['results']:
    print(f"  - {doc['filename']}")

## 4. Chat Session Resolvers

### 4.1 Chat Session Type

First, let's define the chat session type.

In [None]:
@strawberry.type
class ChatSessionType:
    id: strawberry.ID
    title: Optional[str]
    created_at: datetime

# Mock chat sessions
MOCK_CHAT_SESSIONS = [
    {
        "id": "session-001",
        "title": "Research on RAG",
        "created_at": datetime(2026, 1, 31, 10, 0, 0),
    },
    {
        "id": "session-002",
        "title": "Data Analysis",
        "created_at": datetime(2026, 1, 30, 15, 30, 0),
    },
    {
        "id": "session-003",
        "title": None,  # Untitled session
        "created_at": datetime(2026, 1, 29, 9, 15, 0),
    },
]

print("✅ Chat session types and mock data prepared!")

### 4.2 List Chat Sessions Resolver

In [None]:
@strawberry.type
class Query:
    # ... previous resolvers ...

    @strawberry.field
    def chat_sessions(
        self,
        info,
        limit: int = 20,
        offset: int = 0,
    ) -> List[ChatSessionType]:
        """
        Query chat sessions with pagination.

        Args:
            info: GraphQL execution context
            limit: Max results (default: 20)
            offset: Pagination offset (default: 0)

        Returns:
            List of chat sessions
        """
        # Validate inputs
        if limit < 0:
            raise ValueError("limit must be non-negative")
        if limit > 100:
            limit = 100
        if offset < 0:
            raise ValueError("offset must be non-negative")

        # Simulate tenant extraction
        tenant_id = "demo-tenant-001"

        # Simulate database query
        sessions = MOCK_CHAT_SESSIONS[offset:offset + limit]

        # Convert to GraphQL types
        return [
            ChatSessionType(
                id=session["id"],
                title=session.get("title"),
                created_at=session["created_at"],
            )
            for session in sessions
        ]

print("✅ chat_sessions resolver defined successfully!")

## 5. Query History Resolvers

### 5.1 Query History Type

In [None]:
@strawberry.type
class QueryHistoryItemType:
    question: str
    answer: str
    sources: List[str]
    timestamp: datetime

# Mock query history
MOCK_QUERY_HISTORY = [
    {
        "question": "What is RAG?",
        "answer": "RAG stands for Retrieval-Augmented Generation...",
        "sources": ["chunk-123", "chunk-456"],
        "timestamp": datetime(2026, 1, 31, 12, 0, 0),
    },
    {
        "question": "How does vector search work?",
        "answer": "Vector search uses embeddings to find similar...",
        "sources": ["chunk-789", "chunk-012"],
        "timestamp": datetime(2026, 1, 30, 15, 30, 0),
    },
]

print("✅ Query history type and mock data prepared!")

### 5.2 List Query History Resolver

In [None]:
@strawberry.type
class Query:
    # ... previous resolvers ...

    @strawberry.field
    def query_history(
        self,
        info,
        limit: int = 50,
        offset: int = 0,
    ) -> List[QueryHistoryItemType]:
        """
        Get query history with pagination.

        Args:
            info: GraphQL execution context
            limit: Max results (default: 50)
            offset: Pagination offset (default: 0)

        Returns:
            List of query history items
        """
        # Validate inputs
        if limit < 0:
            raise ValueError("limit must be non-negative")
        if limit > 100:
            limit = 100
        if offset < 0:
            raise ValueError("offset must be non-negative")

        # Simulate tenant extraction
        tenant_id = "demo-tenant-001"

        # Simulate database query
        history = MOCK_QUERY_HISTORY[offset:offset + limit]

        # Convert to GraphQL types
        return [
            QueryHistoryItemType(
                question=item["question"],
                answer=item["answer"],
                sources=item["sources"],
                timestamp=item["timestamp"],
            )
            for item in history
        ]

print("✅ query_history resolver defined successfully!")
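The `documents`, `chat_sessions`, and `query_history` resolvers repeat the same limit/offset checks. One possible way to share them is a small helper; `normalize_pagination` and `MAX_LIMIT` are hypothetical names, and the cap of 100 mirrors the value used in the resolvers above.

```python
from typing import Tuple

MAX_LIMIT = 100  # same cap the resolvers above enforce


def normalize_pagination(limit: int, offset: int, max_limit: int = MAX_LIMIT) -> Tuple[int, int]:
    """Reject negative values and clamp limit to max_limit."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return min(limit, max_limit), offset


print(normalize_pagination(500, 0))  # (100, 0)
print(normalize_pagination(20, 5))   # (20, 5)
```

Each resolver could then start with `limit, offset = normalize_pagination(limit, offset)` and keep only its own data access logic.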

## 6. Performance Optimization

### 6.1 The N+1 Query Problem

The N+1 query problem occurs when one query fetches N items and the resolver then issues one additional query per item for a nested field, for N+1 queries in total.

In [None]:
# BAD: N+1 query problem example
@strawberry.type
class BadQuery:
    @strawberry.field
    def documents_with_chunks_bad(self) -> List[DocumentType]:
        """
        BAD: This makes N+1 queries!
        - 1 query to get documents
        - N queries to get chunks for each document
        """
        docs = MOCK_DOCUMENTS  # 1 query
        
        result = []
        for doc in docs:  # N iterations
            # Each iteration makes another query!
            chunks = []  # Hypothetical: chunk_repo.get_chunks(doc["id"])
            result.append({
                "document": doc,
                "chunks": chunks,
            })
        
        return result

print("⚠️  N+1 query problem demonstrated (BAD)")
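To see the cost difference in numbers, the sketch below counts simulated queries. `CountingRepo` and its methods are hypothetical stand-ins for a real repository; each method call represents one database round trip.

```python
from typing import Dict, List


class CountingRepo:
    """Hypothetical in-memory repository that counts simulated queries."""

    def __init__(self) -> None:
        self.query_count = 0

    def list_document_ids(self) -> List[str]:
        self.query_count += 1
        return ["doc-001", "doc-002", "doc-003"]

    def get_chunks(self, doc_id: str) -> List[str]:
        self.query_count += 1
        return [f"chunk-{doc_id}-1"]

    def get_chunks_batch(self, doc_ids: List[str]) -> Dict[str, List[str]]:
        self.query_count += 1
        return {doc_id: [f"chunk-{doc_id}-1"] for doc_id in doc_ids}


# Naive approach: 1 query for the list, then 1 query per document.
naive_repo = CountingRepo()
naive = {d: naive_repo.get_chunks(d) for d in naive_repo.list_document_ids()}

# Batched approach: 1 query for the list, 1 query for all chunks.
batched_repo = CountingRepo()
batched = batched_repo.get_chunks_batch(batched_repo.list_document_ids())

print(naive_repo.query_count)    # 4
print(batched_repo.query_count)  # 2
```

Both approaches return the same data; only the number of round trips differs, and the gap grows linearly with the number of documents.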

### 6.2 DataLoader Solution

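A DataLoader collects the keys requested by individual resolvers and fetches them in one batched query, which replaces the N per-item queries with a single one.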

In [None]:
# GOOD: DataLoader solution
from typing import Dict, List

class ChunkLoader:
    """
    DataLoader for batching chunk queries.
    
    This caches chunks and batches requests.
    """
    
    def __init__(self):
        self._cache: Dict[str, List[str]] = {}
    
    async def load_many(self, document_ids: List[str]) -> Dict[str, List[str]]:
        """
        Load chunks for multiple documents in a single query.
        
        This eliminates N+1 queries.
        """
        # Check cache first (the loop variable avoids shadowing the built-in `id`)
        uncached_ids = [doc_id for doc_id in document_ids if doc_id not in self._cache]
        
        # Batch query (simulated)
        # In real app: chunk_repo.get_chunks_batch(uncached_ids)
        for doc_id in uncached_ids:
            self._cache[doc_id] = [f"chunk-{doc_id}-1", f"chunk-{doc_id}-2"]
        
        # Return results
        return {doc_id: self._cache[doc_id] for doc_id in document_ids}

# Create loader instance
chunk_loader = ChunkLoader()

print("✅ DataLoader solution implemented (GOOD)")
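The `ChunkLoader` above relies on the caller to pass all IDs at once. A full DataLoader goes one step further: resolvers call `load(key)` individually, and the loader coalesces those calls into one batch. The sketch below illustrates that idea with plain `asyncio`. `TinyBatchLoader` and `fetch_chunks` are hypothetical names, and the sketch omits caching and error handling. Strawberry provides its own `DataLoader` in `strawberry.dataloader`, which this example does not use.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class TinyBatchLoader:
    """Coalesces load() calls made in the same event-loop turn into one batch."""

    def __init__(self, batch_fn: Callable[[List[str]], Awaitable[Dict[str, List[str]]]]):
        self._batch_fn = batch_fn
        self._pending: Dict[str, asyncio.Future] = {}
        self._task = None

    def load(self, key: str) -> asyncio.Future:
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._task is None:
            # The task starts only after the current code yields to the loop,
            # so every load() call made before that joins the same batch.
            self._task = loop.create_task(self._dispatch())
        return self._pending[key]

    async def _dispatch(self) -> None:
        pending, self._pending = self._pending, {}
        self._task = None
        results = await self._batch_fn(list(pending))
        for key, future in pending.items():
            future.set_result(results[key])


batch_calls: List[List[str]] = []  # records every simulated batch query


async def fetch_chunks(doc_ids: List[str]) -> Dict[str, List[str]]:
    batch_calls.append(list(doc_ids))
    return {d: [f"chunk-{d}-1", f"chunk-{d}-2"] for d in doc_ids}


async def main() -> List[List[str]]:
    loader = TinyBatchLoader(fetch_chunks)
    # Three independent load() calls, as three resolvers would make.
    return await asyncio.gather(*(loader.load(d) for d in ["doc-001", "doc-002", "doc-003"]))


chunks = asyncio.run(main())
print(batch_calls)  # [['doc-001', 'doc-002', 'doc-003']]
```

In a notebook cell, where an event loop is already running, use `chunks = await main()` instead of `asyncio.run(main())`.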

## 7. Practice Exercise

### Task: Implement a filtered documents resolver

Create a resolver that filters documents by multiple criteria:
- Status (indexed, created, failed)
- Minimum size (in bytes)
- Maximum size (in bytes)
- Content type (pdf, docx, etc.)

In [None]:
# YOUR CODE HERE: Implement filtered documents resolver

@strawberry.type
class Query:
    # ... existing resolvers ...

    @strawberry.field
    def filtered_documents(
        self,
        info,
        status: Optional[DocumentStatus] = None,
        min_size: Optional[int] = None,
        max_size: Optional[int] = None,
        content_type: Optional[str] = None,
    ) -> List[DocumentType]:
        """
        Filter documents by multiple criteria.

        Args:
            status: Filter by status
            min_size: Minimum file size in bytes
            max_size: Maximum file size in bytes
            content_type: Filter by content type

        Returns:
            Filtered list of documents
        """
        # YOUR IMPLEMENTATION
        filtered = MOCK_DOCUMENTS

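        # TODO: Implement filtering logic
        # (compare against None so that a value of 0 is not skipped)
        # if status is not None:
        # if min_size is not None:
        # if max_size is not None:
        # if content_type is not None: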

        return [
            DocumentType(
                id=doc["id"],
                filename=doc["filename"],
                content_type=doc["content_type"],
                size_bytes=doc["size_bytes"],
                status=DocumentStatus(doc["status"]),
                created_at=doc["created_at"],
                updated_at=doc.get("updated_at"),
            )
            for doc in filtered
        ]

print("✅ filtered_documents resolver defined (TODO: implement filtering)")

## 8. Quiz

### Question 1
What is the purpose of GraphQL query resolvers?

A) Define the GraphQL schema
B) Fetch data for each field in the schema
C) Validate GraphQL queries
D) Generate documentation

**Answer:** B - Resolvers fetch data for each field.

---

### Question 2
How do you pass the tenant ID to a GraphQL resolver?

A) Global variable
B) Request context (info.context["request"])
C) Environment variable
D) Query parameter

**Answer:** B - Extract from request context.

---

### Question 3
What should you return when a resource is not found?

A) Raise ValueError
B) Return None or empty list
C) Return 404 error
D) Return null

**Answer:** B - Return None for a single item (serialized as `null` in the response) or an empty list for a list field (GraphQL best practice).

---

### Question 4
How do you prevent N+1 query problems?

A) Limit query size
B) Use DataLoader or batch queries
C) Cache all queries
D) Use async resolvers

**Answer:** B - Use DataLoader for batch fetching.

---

### Question 5
What is cursor-based pagination?

A) Using OFFSET/LIMIT
B) Using a pointer (cursor) to the last item
C) Using page numbers
D) Using random sampling

**Answer:** B - Cursor points to the last fetched item.
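The resolvers in this notebook use offset pagination, so here is a minimal sketch of the cursor-based alternative for comparison. It assumes items are sorted by a unique `id` and uses a base64-encoded ID as an opaque cursor. `ITEMS`, `encode_cursor`, `decode_cursor`, and `fetch_page` are hypothetical names.

```python
import base64
from typing import List, Optional, Tuple

# Stand-in data, already sorted by a unique id.
ITEMS = [{"id": f"doc-{i:03d}"} for i in range(1, 6)]


def encode_cursor(item_id: str) -> str:
    """Wrap the last item's id in an opaque token."""
    return base64.urlsafe_b64encode(item_id.encode()).decode()


def decode_cursor(cursor: str) -> str:
    return base64.urlsafe_b64decode(cursor.encode()).decode()


def fetch_page(first: int, after: Optional[str] = None) -> Tuple[List[dict], Optional[str]]:
    """Return up to `first` items after the cursor, plus the next cursor (or None)."""
    last_id = decode_cursor(after) if after else ""
    remaining = [item for item in ITEMS if item["id"] > last_id]
    page = remaining[:first]
    has_more = len(remaining) > first
    next_cursor = encode_cursor(page[-1]["id"]) if page and has_more else None
    return page, next_cursor


page1, cursor1 = fetch_page(first=2)
page2, cursor2 = fetch_page(first=2, after=cursor1)
page3, cursor3 = fetch_page(first=2, after=cursor2)

print([d["id"] for d in page1])  # ['doc-001', 'doc-002']
print([d["id"] for d in page2])  # ['doc-003', 'doc-004']
print([d["id"] for d in page3], cursor3)  # ['doc-005'] None
```

Because each page is defined relative to the last item seen, rows inserted or deleted before the cursor do not shift later pages, which is the main weakness of offset pagination.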

## 9. Summary

In this notebook, you learned:

1. **GraphQL Basics** - Schema definition and resolver patterns
2. **Document Resolvers** - List, get single, search
3. **Chat Session Resolvers** - List sessions
4. **Query History Resolvers** - Fetch historical queries
5. **Performance Optimization** - DataLoader for N+1 queries
6. **Best Practices** - Validation, error handling, authorization

### 🎯 Key Takeaways

- Resolvers fetch data for each GraphQL field
- Always validate inputs to prevent abuse
- Use DataLoader to solve N+1 query problems
- Return None for not found (don't leak existence)
- Implement tenant isolation for security

### 🚀 Next Steps

1. Implement the resolver code in `src/api/v1/graphql.py`
2. Test resolvers using GraphQL Playground
3. Proceed to Phase 1.2: GraphQL Ask Question Resolver

### 📚 Further Reading

- [Strawberry GraphQL Documentation](https://strawberry.rocks/docs)
- [GraphQL Specification](https://spec.graphql.org/)
- [DataLoader](https://github.com/graphql/dataloader)
- [GraphQL Best Practices](https://graphql.org/learn/best-practices/)

In [None]:
# Print completion message
print("\n🎉 Congratulations!")
print("You've completed the GraphQL Query Resolvers tutorial!")
print("\n📝 Next: Implement these resolvers in the actual codebase.")