## API Key Configuration

**Security Note:** Never hardcode your API key in notebooks or commit it to version control.

This notebook uses an LLM API for:
- Generating hypothetical code snippets (HyDE)
- Generating explanations for topic queries (RAG)

All embeddings are generated locally with a SentenceTransformer model; only text generation goes through the API.

In [1]:
import os
import getpass
from typing import Optional

def load_api_key() -> str:
    """
    Securely load LLM API key from environment variable or user input.
    The key is never printed or displayed in notebook outputs.
    """
    # Try to load from environment variable first
    api_key = os.environ.get('LLM_API_KEY')
    
    if api_key:
        print("✓ API key loaded from environment variable")
        return api_key
    
    # Try to load from .env file
    try:
        from dotenv import load_dotenv
        load_dotenv()
        api_key = os.environ.get('LLM_API_KEY')
        if api_key:
            print("✓ API key loaded from .env file")
            return api_key
    except ImportError:
        pass
    
    # Prompt user for API key (not echoed to terminal)
    print("API key not found in environment.")
    api_key = getpass.getpass("Enter your LLM API key: ")
    
    if not api_key or api_key.strip() == "":
        raise ValueError("API key is required to use this notebook")
    
    print("✓ API key entered manually (not saved)")
    return api_key.strip()

# Load API key securely
API_KEY = load_api_key()
print("\n  API key loaded successfully. It will NOT be displayed in any outputs.")
print("Key length:", len(API_KEY), "characters")

✓ API key loaded from .env file

  API key loaded successfully. It will NOT be displayed in any outputs.
Key length: 56 characters


## 1. Imports and Setup

In [2]:
import os
import sys
from typing import List, Dict, Tuple, Optional
from pathlib import Path
import requests
import time
import re

# Import utils functions
sys.path.append(os.getcwd())
from utils import find_repo_root, list_python_files

import chromadb
from sentence_transformers import SentenceTransformer
from tree_sitter_languages import get_parser
import numpy as np

print("✓ All imports loaded successfully!")

✓ All imports loaded successfully!


## 2. Configuration and LLM API Setup

In [3]:
# Repository and Database Configuration
FLASK_REPO_PATH = "../flask"  # Adjust to your Flask repo location
CHROMA_DB_PATH = "../data/chroma_db_api"
COLLECTION_NAME = "flask_code_api"
EMBEDDING_MODEL = "jinaai/jina-embeddings-v2-base-code"

# LLM API Configuration (Groq)
API_URL = "https://api.groq.com/openai/v1/chat/completions"
MODEL_NAME = "llama-3.3-70b-versatile"

# Initialize parser
parser = get_parser("python")

# Load local embedding model
print("Loading local embedding model...")
try:
    embedding_model = SentenceTransformer(EMBEDDING_MODEL, trust_remote_code=True)
    print("✓ Local embedding model loaded successfully!")
except Exception as e:
    print(f"Error loading embedding model: {e}")
    raise  # Later cells need embedding_model, so stop here instead of continuing



Loading local embedding model...


'(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: 38e4c505-fdc2-41be-8c91-eb91af3c3946)')' thrown while requesting HEAD https://huggingface.co/jinaai/jina-embeddings-v2-base-code/resolve/main/tokenizer_config.json
Retrying in 1s [Retry 1/5].


✓ Local embedding model loaded successfully!


## 3. LLM API Integration

API-based functions for HyDE and RAG workflows.

In [4]:
def call_llm_api(
    prompt: str,
    model: str = MODEL_NAME,
    temperature: float = 0.3,
    timeout: int = 60,
    max_retries: int = 3,
    max_tokens: int = 500
) -> Optional[str]:
    """
    Send a prompt to the LLM API and return the generated response.
    
    Args:
        prompt: The input prompt for the LLM
        model: Model name
        temperature: Sampling temperature
        timeout: Request timeout in seconds
        max_retries: Maximum number of retry attempts
        max_tokens: Maximum tokens in response
    
    Returns:
        The generated text response, or None if the request fails
    """
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful code analysis assistant."},
            {"role": "user", "content": prompt}
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stop": None
    }
    
    for attempt in range(max_retries):
        try:
            response = requests.post(API_URL, json=payload, headers=headers, timeout=timeout)
            response.raise_for_status()
            
            result = response.json()
            
            if 'choices' in result and len(result['choices']) > 0:
                return result['choices'][0]['message']['content'].strip()
            else:
                print(f"  Unexpected API response format")
                return None
                
        except requests.exceptions.Timeout:
            print(f"  LLM request timed out (attempt {attempt + 1}/{max_retries})")
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt
                time.sleep(wait_time)
            else:
                return None
                
        except requests.exceptions.HTTPError as e:
            status_code = e.response.status_code
            
            if status_code == 429:
                print(f" Rate limit hit (attempt {attempt + 1}/{max_retries})")
                if attempt < max_retries - 1:
                    wait_time = 5 * (attempt + 1)
                    time.sleep(wait_time)
                else:
                    return None
            elif status_code == 401:
                print(f" Authentication failed: Invalid API key")
                return None
            else:
                print(f" HTTP error {status_code}: {e}")
                if attempt < max_retries - 1:
                    wait_time = 2 ** attempt
                    time.sleep(wait_time)
                else:
                    return None
                    
        except requests.exceptions.RequestException as e:
            print(f" Request failed: {e}")
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt
                time.sleep(wait_time)
            else:
                return None
    
    return None

def test_api_connection() -> bool:
    """Test if the LLM API is accessible."""
    print("Testing API connection...")
    
    test_prompt = "Respond with only 'OK' if you can read this."
    
    try:
        response = call_llm_api(test_prompt, timeout=30, max_retries=1, max_tokens=10)
        
        if response:
            print(f"✓ API connection successful!")
            print(f"  Model: {MODEL_NAME}")
            return True
        else:
            print(f" API test failed - no response received")
            return False
            
    except Exception as e:
        print(f" API test failed: {e}")
        return False

# Test the API connection
test_api_connection()


Testing API connection...
✓ API connection successful!
  Model: llama-3.3-70b-versatile


True
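
The retry waits in `call_llm_api` follow two schedules: exponential (`2 ** attempt`) for timeouts and other request errors, and linear (`5 * (attempt + 1)`) for HTTP 429. The sketch below only tabulates those waits; `retry_wait_seconds` and `wait_schedule` are hypothetical helper names, not part of the notebook.

```python
def retry_wait_seconds(attempt: int, rate_limited: bool = False) -> int:
    """Wait before the next attempt; `attempt` is zero-based, as in the retry loop."""
    if rate_limited:
        return 5 * (attempt + 1)   # linear: 5s, 10s, 15s, ...
    return 2 ** attempt            # exponential: 1s, 2s, 4s, ...


def wait_schedule(max_retries: int, rate_limited: bool = False) -> list:
    """Waits that are actually slept: there is no wait after the final attempt."""
    return [retry_wait_seconds(a, rate_limited) for a in range(max_retries - 1)]


print(wait_schedule(3))                     # timeouts / generic errors
print(wait_schedule(3, rate_limited=True))  # HTTP 429
```

With the default `max_retries=3`, a call that keeps failing sleeps at most twice before returning `None`.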

## 4. Code Parsing and Chunking

(Same as the original notebook: tree-sitter extracts each function and class as one chunk.)

In [5]:
def extract_code_chunks(file_path: str) -> List[Dict]:
    """Extract functions and classes from a Python file using tree-sitter."""
    chunks = []
    
    try:
        # Read bytes: tree-sitter reports byte offsets, which only match
        # str indices when the file is pure ASCII.
        with open(file_path, 'rb') as f:
            source = f.read()
        
        tree = parser.parse(source)
        
        def node_text(node) -> str:
            """Decode the source text covered by a node."""
            return source[node.start_byte:node.end_byte].decode('utf-8')
        
        def node_docstring(node) -> str:
            """Return the leading string literal of a definition body, if any."""
            body = node.child_by_field_name('body')
            if body and body.child_count > 0:
                first_child = body.children[0]
                if first_child.type == 'expression_statement':
                    expr = first_child.children[0]
                    if expr.type == 'string':
                        # Drop the surrounding quote characters, then whitespace
                        return node_text(expr).strip('"').strip("'").strip()
            return ""
        
        def traverse(node):
            kinds = {'function_definition': 'function', 'class_definition': 'class'}
            if node.type in kinds:
                name_node = node.child_by_field_name('name')
                if name_node:
                    chunk_code = node_text(node)
                    
                    # Limit class code to avoid huge chunks
                    if node.type == 'class_definition' and len(chunk_code) > 2000:
                        chunk_code = chunk_code[:2000] + "\n    # ... (truncated)"
                    
                    chunks.append({
                        'type': kinds[node.type],
                        'name': node_text(name_node),
                        'code': chunk_code,
                        'docstring': node_docstring(node),
                        'file_path': file_path,
                        'start_line': node.start_point[0] + 1,
                        'end_line': node.end_point[0] + 1,
                    })
            
            # Recursively traverse children (nested definitions become chunks too)
            for child in node.children:
                traverse(child)
        
        traverse(tree.root_node)
    except Exception as e:
        print(f"Error parsing {file_path}: {e}")
    
    return chunks

def create_searchable_text(chunk: Dict) -> str:
    """Create searchable text with prioritized metadata for embeddings."""
    parts = []
    
    # Prioritize docstring
    if chunk['docstring']:
        parts.append(f"Documentation: {chunk['docstring']}")
    
    # Add type and name
    parts.append(f"{chunk['type']}: {chunk['name']}")
    
    # Add file path
    file_name = chunk['file_path'].split('/')[-1] if '/' in chunk['file_path'] else chunk['file_path'].split('\\')[-1]
    parts.append(f"File: {file_name}")
    
    # Add limited code
    code_snippet = chunk['code'][:400]
    parts.append(f"Code:\n{code_snippet}")
    
    return "\n\n".join(parts)

print("✓ Code parsing functions loaded successfully!")

✓ Code parsing functions loaded successfully!
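
To see the chunk format without installing tree-sitter, the same idea can be sketched with the standard-library `ast` module. This is a swapped-in technique, not what the notebook runs: `extract_chunks_with_ast` is a hypothetical name, and unlike the tree-sitter version it raises on files with syntax errors and does not truncate long classes.

```python
import ast
from typing import Dict, List


def extract_chunks_with_ast(source: str, file_path: str = "<memory>") -> List[Dict]:
    """Collect every function and class in `source`, including nested ones."""
    chunks = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue
        chunks.append({
            "type": kind,
            "name": node.name,
            "code": ast.get_source_segment(source, node) or "",
            "docstring": ast.get_docstring(node) or "",
            "file_path": file_path,
            "start_line": node.lineno,      # 1-based, like the chunks above
            "end_line": node.end_lineno,
        })
    return chunks


sample = '''
class Greeter:
    """Say hello."""

    def greet(self, name):
        """Return a greeting."""
        return f"Hello, {name}"
'''

for chunk in extract_chunks_with_ast(sample):
    print(chunk["type"], chunk["name"], chunk["start_line"], chunk["end_line"])
```

As in `extract_code_chunks`, a method appears twice in the index: once inside its class chunk and once as its own function chunk.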


## 5. Indexing Pipeline

(Same as original - uses local embeddings)

In [6]:
def index_repository(repo_path: str, force_reindex: bool = False):
    """Index all Python files in the repository."""
    
    # Initialize ChromaDB
    os.makedirs(CHROMA_DB_PATH, exist_ok=True)
    client = chromadb.PersistentClient(path=CHROMA_DB_PATH)
    
    # Get or create collection with COSINE similarity
    try:
        if force_reindex:
            client.delete_collection(name=COLLECTION_NAME)
            print("Deleted existing collection for reindexing.")
    except Exception:
        # Nothing to delete: the collection does not exist yet
        pass
    
    collection = client.get_or_create_collection(
        name=COLLECTION_NAME,
        metadata={
            "description": "Flask repository code chunks with API-based search",
            "hnsw:space": "cosine"
        }
    )
    
    # Check if already indexed
    if collection.count() > 0 and not force_reindex:
        print(f"Repository already indexed with {collection.count()} chunks.")
        return collection
    
    # Get all Python files
    print(f"Finding Python files in {repo_path}...")
    py_files = list_python_files(repo_path)
    print(f"Found {len(py_files)} Python files.")
    
    # Extract and index chunks
    all_chunks = []
    for i, file_path in enumerate(py_files):
        if i % 10 == 0:
            print(f"Processing file {i+1}/{len(py_files)}...")
        
        chunks = extract_code_chunks(file_path)
        all_chunks.extend(chunks)
    
    print(f"Extracted {len(all_chunks)} code chunks.")
    
    if not all_chunks:
        print("No code chunks found!")
        return collection
    
    # Generate embeddings in batches (LOCAL)
    print("Generating embeddings locally...")
    batch_size = 32
    indexed_count = 0
    
    for i in range(0, len(all_chunks), batch_size):
        batch = all_chunks[i:i+batch_size]
        texts = [create_searchable_text(chunk) for chunk in batch]
        
        # Generate embeddings locally
        embeddings = embedding_model.encode(texts, show_progress_bar=False)
        
        # Prepare unique IDs
        ids = [f"{indexed_count + j}:{chunk['file_path']}:{chunk['name']}:{chunk['start_line']}" 
               for j, chunk in enumerate(batch)]
        
        metadatas = [{
            'type': chunk['type'],
            'name': chunk['name'],
            'file_path': chunk['file_path'],
            'start_line': chunk['start_line'],
            'end_line': chunk['end_line'],
            'docstring': chunk['docstring'][:500] if chunk['docstring'] else "",
        } for chunk in batch]
        
        documents = [chunk['code'] for chunk in batch]
        
        # Add to collection
        collection.add(
            ids=ids,
            embeddings=embeddings.tolist(),
            metadatas=metadatas,
            documents=documents
        )
        
        indexed_count += len(batch)
        
        if indexed_count % 100 == 0 or indexed_count == len(all_chunks):
            print(f"Indexed {indexed_count}/{len(all_chunks)} chunks...")
    
    print(f"✓ Indexing complete! Total chunks: {collection.count()}")
    return collection

print("✓ Indexing pipeline ready!")

✓ Indexing pipeline ready!
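
The batching and ID scheme in `index_repository` can be tried in isolation. The sketch below reproduces only that part, with no embedding model or ChromaDB involved; `iter_batches` and the fake chunk data are hypothetical and exist just for illustration.

```python
from typing import Dict, Iterator, List, Tuple


def iter_batches(
    chunks: List[Dict], batch_size: int = 32
) -> Iterator[Tuple[List[str], List[Dict]]]:
    """Yield (ids, batch) pairs; IDs use the same scheme as the indexing cell."""
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        ids = [
            f"{start + offset}:{c['file_path']}:{c['name']}:{c['start_line']}"
            for offset, c in enumerate(batch)
        ]
        yield ids, batch


# Hypothetical stand-in data: 70 fake chunks, just to show the batch sizes.
fake_chunks = [
    {"file_path": "app.py", "name": f"func_{n}", "start_line": n + 1}
    for n in range(70)
]
batches = list(iter_batches(fake_chunks))
print([len(batch) for _, batch in batches])
print(batches[-1][0][-1])
```

The running counter at the front of each ID keeps IDs unique even when two chunks share a file, name, and start line.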


## 6. Query Type Detection

Automatically detect whether a query is topic-focused (RAG) or code-specific (HyDE).

In [7]:
def detect_query_type(query: str) -> str:
    """
    Detect whether the query is a topic query (RAG) or code query (HyDE).
    
    Returns:
        'topic': For high-level questions needing explanations (use RAG)
        'code': For code-specific queries (use HyDE)
    """
    query_lower = query.lower()
    
    # Topic indicators (conceptual questions)
    topic_indicators = [
        'how does', 'how do', 'what is', 'what are', 'why does', 'why do',
        'explain', 'describe', 'tell me about', 'work', 'understand',
        'architecture', 'design', 'pattern', 'concept', 'mechanism'
    ]
    
    # Code indicators (implementation requests)
    code_indicators = [
        'show me', 'find', 'function', 'class', 'method', 'implementation',
        'code for', 'example of', 'sample', 'usage', 'api', 'interface'
    ]
    
    # Check for topic indicators
    topic_score = sum(1 for indicator in topic_indicators if indicator in query_lower)
    
    # Check for code indicators
    code_score = sum(1 for indicator in code_indicators if indicator in query_lower)
    
    # Default to code if ambiguous or no clear indicators
    if topic_score > code_score:
        return 'topic'
    else:
        return 'code'

# Test query type detection
test_queries = [
    "How does Flask handle routing?",
    "Show me the route decorator implementation",
    "What is the request context?",
    "Find error handling functions"
]

print("Query Type Detection Examples:")
for query in test_queries:
    qtype = detect_query_type(query)
    print(f"  '{query}' → {qtype.upper()}")

Query Type Detection Examples:
  'How does Flask handle routing?' → TOPIC
  'Show me the route decorator implementation' → CODE
  'What is the request context?' → TOPIC
  'Find error handling functions' → CODE
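
Because `detect_query_type` uses plain substring tests, a short indicator such as `'work'` also matches inside longer words like "framework". The sketch below contrasts that rule with a word-boundary variant. `count_word_hits` is a hypothetical alternative, not what the notebook uses, and it trades one problem for another: it no longer matches inflected forms such as "works".

```python
import re
from typing import List


def count_substring_hits(query: str, indicators: List[str]) -> int:
    """Same counting rule as detect_query_type: plain substring tests."""
    q = query.lower()
    return sum(1 for ind in indicators if ind in q)


def count_word_hits(query: str, indicators: List[str]) -> int:
    """Hypothetical variant: an indicator must start and end on a word boundary."""
    q = query.lower()
    return sum(
        1 for ind in indicators
        if re.search(r"\b" + re.escape(ind) + r"\b", q)
    )


query = "Find the framework setup function"
print(count_substring_hits(query, ["work"]), count_word_hits(query, ["work"]))
```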


## 7. HyDE Code Search Implementation

Generate hypothetical code via API, embed locally, and search.

In [8]:
def hyde_code_search(query: str, top_k: int = 5) -> List[Dict]:
    """
    HyDE-based code search:
    1. Generate hypothetical code snippet via LLM API
    2. Embed the hypothetical code locally
    3. Search for similar code chunks
    
    Args:
        query: User's code search query
        top_k: Number of results to return
    
    Returns:
        List of relevant code chunks
    """
    print(f"\n HyDE Code Search: '{query}'")
    
    # Step 1: Generate hypothetical code via API
    print("  → Generating hypothetical code via API...")
    hyde_prompt = f"""Generate a Python code snippet that would answer this query: "{query}"

Requirements:
- Write only valid Python code (no explanations)
- Include function/class signatures with docstrings
- Keep it concise (5-15 lines)
- Focus on the core implementation

Code:"""
    
    hypothetical_code = call_llm_api(hyde_prompt, temperature=0.3, max_tokens=300)
    
    if not hypothetical_code:
        print("   Failed to generate hypothetical code, falling back to direct search")
        return direct_search(query, top_k)
    
    print(f"  ✓ Generated {len(hypothetical_code)} characters of hypothetical code")
    
    # Step 2: Embed the hypothetical code locally
    print("  → Embedding hypothetical code locally...")
    hypothetical_embedding = embedding_model.encode([hypothetical_code])[0]
    
    # Step 3: Search for similar code chunks
    print("  → Searching for similar code...")
    client = chromadb.PersistentClient(path=CHROMA_DB_PATH)
    
    try:
        collection = client.get_collection(name=COLLECTION_NAME)
    except Exception:
        print("   Collection not found. Please index the repository first.")
        return []
    
    results = collection.query(
        query_embeddings=[hypothetical_embedding.tolist()],
        n_results=top_k,
        include=['metadatas', 'documents', 'distances']
    )
    
    # Format results
    formatted_results = []
    if results['ids'] and results['ids'][0]:
        for i in range(len(results['ids'][0])):
            distance = results['distances'][0][i]
            similarity = 1 - distance
            
            formatted_results.append({
                'id': results['ids'][0][i],
                'type': results['metadatas'][0][i]['type'],
                'name': results['metadatas'][0][i]['name'],
                'file_path': results['metadatas'][0][i]['file_path'],
                'start_line': results['metadatas'][0][i]['start_line'],
                'end_line': results['metadatas'][0][i]['end_line'],
                'docstring': results['metadatas'][0][i]['docstring'],
                'code': results['documents'][0][i],
                'distance': distance,
                'similarity': similarity,
                'method': 'HyDE'
            })
    
    print(f"  ✓ Found {len(formatted_results)} results\n")
    return formatted_results

print("✓ HyDE code search implementation loaded!")

✓ HyDE code search implementation loaded!
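
The three HyDE steps can be illustrated without an API key or an embedding model. In this toy sketch everything is a stand-in: `toy_embed` counts identifier tokens instead of calling the Jina model, `fake_llm` returns a fixed snippet instead of calling the API, and the two-entry corpus is invented. What it keeps is the core idea: the generated code, not the query text, is what gets embedded and compared.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: lowercase identifier-token counts (not a real model)."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def hyde_rank(query: str, corpus: Dict[str, str],
              generate: Callable[[str], str]) -> List[str]:
    """Rank corpus entries by similarity to generated code, not to the query."""
    hypothetical = generate(query)      # step 1: hypothetical document
    probe = toy_embed(hypothetical)     # step 2: embed it
    return sorted(                      # step 3: nearest entries first
        corpus,
        key=lambda name: cosine(probe, toy_embed(corpus[name])),
        reverse=True,
    )


def fake_llm(query: str) -> str:
    """Hypothetical stand-in for call_llm_api; always returns the same snippet."""
    return (
        "def route(self, rule, **options):\n"
        "    return self.add_url_rule(rule, **options)"
    )


corpus = {
    "route": (
        "def route(self, rule, **options):\n"
        "    def decorator(f):\n"
        "        self.add_url_rule(rule, view_func=f, **options)\n"
        "        return f\n"
        "    return decorator"
    ),
    "jsonify": (
        "def jsonify(*args, **kwargs):\n"
        "    return current_app.json.response(*args, **kwargs)"
    ),
}
print(hyde_rank("Show me the route decorator implementation", corpus, fake_llm))
```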


## 8. RAG Topic Query Implementation

Retrieve relevant context and generate explanations via API.

In [9]:
def rag_topic_query(query: str, top_k: int = 5, context_chunks: int = 3) -> Dict:
    """
    RAG-based topic query:
    1. Retrieve relevant code/doc chunks
    2. Augment the query with retrieved context
    3. Generate explanation via LLM API
    
    Args:
        query: User's topic query
        top_k: Number of chunks to retrieve
        context_chunks: Number of chunks to use in LLM context
    
    Returns:
        Dict with generated answer and supporting code chunks
    """
    print(f"\n RAG Topic Query: '{query}'")
    
    # Step 1: Retrieve relevant chunks using direct semantic search
    print("  → Retrieving relevant code chunks...")
    retrieved_chunks = direct_search(query, top_k)
    
    if not retrieved_chunks:
        print("    No relevant chunks found")
        return {
            'answer': "No relevant information found in the codebase.",
            'sources': []
        }
    
    print(f"  ✓ Retrieved {len(retrieved_chunks)} chunks")
    
    # Step 2: Build context from top chunks
    context_parts = []
    for i, chunk in enumerate(retrieved_chunks[:context_chunks], 1):
        context_parts.append(f"### Source {i}: {chunk['type']} `{chunk['name']}` ({chunk['file_path']})")
        if chunk['docstring']:
            context_parts.append(f"Docstring: {chunk['docstring'][:200]}")
        context_parts.append(f"Code:\n```python\n{chunk['code'][:500]}\n```")
        context_parts.append("")
    
    context = "\n".join(context_parts)
    
    # Step 3: Generate answer via API
    print("  → Generating explanation via API...")
    rag_prompt = f"""Answer this question about the Flask codebase: "{query}"

Use ONLY the information provided below. Be specific and reference the code/functions mentioned.

{context}

Provide a clear, concise explanation (2-4 sentences) that directly answers the question.

Answer:"""
    
    answer = call_llm_api(rag_prompt, temperature=0.3, max_tokens=400)
    
    if not answer:
        print("    Failed to generate answer")
        answer = "Unable to generate answer due to API error."
    else:
        print(f"  ✓ Generated answer ({len(answer)} characters)\n")
    
    return {
        'answer': answer,
        'sources': retrieved_chunks[:context_chunks],
        'query': query,
        'method': 'RAG'
    }

print("✓ RAG topic query implementation loaded!")

✓ RAG topic query implementation loaded!
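
The context-building step is what bounds the prompt size: at most `context_chunks` sources, each contributing up to 200 docstring characters and 500 code characters. The sketch below isolates that step as a pure function so the truncation can be checked on its own. `build_context` and the sample chunk are hypothetical; the formatting mirrors `rag_topic_query`.

```python
from typing import Dict, List


def build_context(chunks: List[Dict], max_chunks: int = 3,
                  doc_chars: int = 200, code_chars: int = 500) -> str:
    """Format the top chunks as in rag_topic_query, with the same limits."""
    fence = "`" * 3  # built at runtime so this example can sit inside a fence
    parts = []
    for i, chunk in enumerate(chunks[:max_chunks], 1):
        parts.append(
            f"### Source {i}: {chunk['type']} `{chunk['name']}` ({chunk['file_path']})"
        )
        if chunk["docstring"]:
            parts.append(f"Docstring: {chunk['docstring'][:doc_chars]}")
        parts.append(f"Code:\n{fence}python\n{chunk['code'][:code_chars]}\n{fence}")
        parts.append("")
    return "\n".join(parts)


# Hypothetical oversized chunk, repeated to exceed max_chunks.
big_chunk = {
    "type": "function",
    "name": "f",
    "file_path": "a.py",
    "docstring": "d" * 300,
    "code": "x" * 2000,
}
context = build_context([big_chunk] * 5)
print(len(context))
```

Note that `top_k` chunks are retrieved but only the first `context_chunks` reach the prompt, so raising `top_k` alone does not give the model more context.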


## 9. Direct Search (Fallback)

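Traditional semantic search without HyDE or RAG. `hyde_code_search` falls back to it and `rag_topic_query` uses it for retrieval, so run this cell before calling either function.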

In [10]:
def direct_search(query: str, top_k: int = 5) -> List[Dict]:
    """
    Direct semantic search using query embedding.
    Fallback when HyDE fails or for baseline comparison.
    """
    client = chromadb.PersistentClient(path=CHROMA_DB_PATH)
    
    try:
        collection = client.get_collection(name=COLLECTION_NAME)
    except Exception:
        print("Collection not found. Please index the repository first.")
        return []
    
    # Generate query embedding
    query_embedding = embedding_model.encode([query])[0]
    
    # Search
    results = collection.query(
        query_embeddings=[query_embedding.tolist()],
        n_results=top_k,
        include=['metadatas', 'documents', 'distances']
    )
    
    # Format results
    formatted_results = []
    if results['ids'] and results['ids'][0]:
        for i in range(len(results['ids'][0])):
            distance = results['distances'][0][i]
            similarity = 1 - distance
            
            formatted_results.append({
                'id': results['ids'][0][i],
                'type': results['metadatas'][0][i]['type'],
                'name': results['metadatas'][0][i]['name'],
                'file_path': results['metadatas'][0][i]['file_path'],
                'start_line': results['metadatas'][0][i]['start_line'],
                'end_line': results['metadatas'][0][i]['end_line'],
                'docstring': results['metadatas'][0][i]['docstring'],
                'code': results['documents'][0][i],
                'distance': distance,
                'similarity': similarity,
                'method': 'Direct'
            })
    
    return formatted_results

print("✓ Direct search implementation loaded!")

✓ Direct search implementation loaded!
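
Both search functions report `similarity = 1 - distance`. That conversion is meaningful here because the collection is created with `"hnsw:space": "cosine"`, where the distance is understood to be one minus the cosine similarity. The sketch below works through that relationship on small vectors in plain Python; `cosine_distance` is a hypothetical helper and not a ChromaDB call.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cos(angle between a and b); assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


examples = [
    ("same direction", [2.0, 0.0]),
    ("orthogonal", [0.0, 3.0]),
    ("opposite", [-1.0, 0.0]),
]
for label, other in examples:
    distance = cosine_distance([1.0, 0.0], other)
    print(f"{label}: distance={distance:.2f}, similarity={1 - distance:.2f}")
```

Vector length does not matter, only direction, and the distance ranges from 0 to 2, so the reported similarity can be negative for vectors that point away from each other.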


## 10. Unified Search Interface

Automatically route queries to HyDE or RAG based on query type.

In [11]:
def smart_search(query: str, top_k: int = 5, force_method: Optional[str] = None) -> Dict:
    """
    Unified search interface that automatically selects HyDE or RAG.
    
    Args:
        query: User's search query
        top_k: Number of results to return
        force_method: Force a specific method ('hyde', 'rag', 'direct')
    
    Returns:
        Dict with results and metadata
    """
    # Detect query type if not forced
    if force_method:
        method = force_method.lower()
    else:
        query_type = detect_query_type(query)
        method = 'hyde' if query_type == 'code' else 'rag'
    
    print(f"Smart Search: {method.upper()} mode")
    print(f"Query: '{query}'")
    
    if method == 'hyde':
        results = hyde_code_search(query, top_k)
        return {
            'method': 'HyDE',
            'query': query,
            'results': results,
            'type': 'code'
        }
    
    elif method == 'rag':
        rag_result = rag_topic_query(query, top_k)
        return {
            'method': 'RAG',
            'query': query,
            'answer': rag_result['answer'],
            'sources': rag_result['sources'],
            'type': 'topic'
        }
    
    else:  # direct
        results = direct_search(query, top_k)
        return {
            'method': 'Direct',
            'query': query,
            'results': results,
            'type': 'code'
        }

print("✓ Unified search interface loaded!")

✓ Unified search interface loaded!
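
`smart_search` treats any `force_method` other than `'hyde'` or `'rag'` as a direct search, so a typo is silently routed to the baseline. One possible stricter variant of the routing decision is sketched below. `choose_method`, `VALID_METHODS`, and `demo_detect` are hypothetical names; in the notebook the detector would be `detect_query_type`.

```python
from typing import Callable, Optional

VALID_METHODS = ("hyde", "rag", "direct")


def choose_method(query: str, detect: Callable[[str], str],
                  force_method: Optional[str] = None) -> str:
    """Return 'hyde', 'rag' or 'direct'; reject unknown forced values."""
    if force_method is not None:
        method = force_method.strip().lower()
        if method not in VALID_METHODS:
            raise ValueError(
                f"force_method must be one of {VALID_METHODS}, got {force_method!r}"
            )
        return method
    # Same rule as smart_search: code queries use HyDE, topic queries use RAG
    return "hyde" if detect(query) == "code" else "rag"


def demo_detect(query: str) -> str:
    """Hypothetical detector used only for this demo."""
    return "topic" if query.lower().startswith("how") else "code"


print(choose_method("How does Flask handle routing?", demo_detect))
print(choose_method("error handling functions", demo_detect, force_method="Direct"))
```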


## 11. Display Functions

Pretty print results for different search modes.

In [12]:
def display_code_results(results: List[Dict]):
    """Display code search results (HyDE or Direct)."""
    if not results:
        print("No results found.")
        return
    
    print(f"Found {len(results)} code results:")
    
    for i, result in enumerate(results, 1):
        print(f"{i}. {result['type'].upper()}: {result['name']}")
        print(f"   File: {result['file_path']}:{result['start_line']}-{result['end_line']}")
        print(f"   Similarity: {result['similarity']:.4f} | Method: {result.get('method', 'N/A')}")
        
        if result['docstring']:
            doc_preview = result['docstring'][:150].replace('\n', ' ')
            print(f"   Doc: {doc_preview}{'...' if len(result['docstring']) > 150 else ''}")
        
        print(f"   Code Preview:")
        code_lines = result['code'].split('\n')[:6]
        for line in code_lines:
            if line.strip():
                print(f"      {line[:100]}")
        
        total_lines = len(result['code'].split('\n'))
        if total_lines > 6:
            print(f"      ... ({total_lines - 6} more lines)")
        print()

def display_topic_result(result: Dict):
    """Display RAG topic query result."""
    print(f"RAG Topic Query Result")
    
    print(f"Query: {result['query']}\n")
    print(f"Answer:\n{result['answer']}\n")
    
    print(f"Supporting Sources ({len(result['sources'])}):")
    
    for i, source in enumerate(result['sources'], 1):
        print(f"{i}. {source['type'].upper()}: {source['name']}")
        print(f"   File: {source['file_path']}:{source['start_line']}-{source['end_line']}")
        print(f"   Similarity: {source['similarity']:.4f}")
        
        if source['docstring']:
            print(f"   Doc: {source['docstring'][:100]}...")
        
        code_preview = source['code'].split('\n')[:4]
        print(f"   Code:")
        for line in code_preview:
            if line.strip():
                print(f"      {line[:80]}")
        print()

def display_search_result(result: Dict):
    """Display any search result based on its type."""
    if result['type'] == 'topic':
        display_topic_result(result)
    else:
        display_code_results(result['results'])

print("✓ Display functions loaded!")

✓ Display functions loaded!


## 12. Index the Repository

Run this once to index your codebase.

In [13]:
# Index the repository
collection = index_repository(FLASK_REPO_PATH, force_reindex=False)

Finding Python files in ../flask...
Found 34 Python files.
Processing file 1/34...
Processing file 11/34...
Processing file 21/34...
Processing file 31/34...
Extracted 448 code chunks.
Generating embeddings locally...


KeyboardInterrupt: 

## 13. Example Searches

Try different types of queries to see HyDE and RAG in action.

### Example 1: Topic Query (RAG) - "How does Flask handle routing?"

In [14]:
# Topic query - will use RAG
query1 = "How does Flask handle routing?"
result1 = smart_search(query1, top_k=5)
display_search_result(result1)

Smart Search: RAG mode
Query: 'How does Flask handle routing?'

 RAG Topic Query: 'How does Flask handle routing?'
  → Retrieving relevant code chunks...
  ✓ Retrieved 5 chunks
  → Generating explanation via API...
  ✓ Generated answer (538 characters)

RAG Topic Query Result
Query: How does Flask handle routing?

Answer:
Flask handles routing through the `url_map` attribute of the `Flask` application object, which is a central registry for URL rules. The `routes_command` function in `cli.py` iterates over the rules in `url_map` using the `iter_rules` method to show all registered routes with endpoints and methods. The `Flask` class in `app.py` acts as a central object that manages these URL rules, among other things. The actual routing is not shown in the provided code, but it is managed by the `Flask` application object and its `url_map` attribute.

Supporting Sources (3):
1. FUNCTION: routes_command
   File: ../flask\src\flask\cli.py:1069-1115
   Similarity: 0.6147
   Doc: Show all 

### Example 2: Code Query (HyDE) - "Show me route decorator implementation"

In [15]:
# Code query - will use HyDE
query2 = "Show me route decorator implementation"
result2 = smart_search(query2, top_k=5)
display_search_result(result2)

Smart Search: HYDE mode
Query: 'Show me route decorator implementation'

 HyDE Code Search: 'Show me route decorator implementation'
  → Generating hypothetical code via API...
  ✓ Generated 479 characters of hypothetical code
  → Embedding hypothetical code locally...
  → Searching for similar code...
  ✓ Found 5 results

Found 5 code results:
1. FUNCTION: decorator
   File: ../flask\src\flask\cli.py:395-400
   Similarity: 0.6156 | Method: HyDE
   Code Preview:
      def decorator(ctx: click.Context, /, *args: t.Any, **kwargs: t.Any) -> t.Any:
              if not current_app:
                  app = ctx.ensure_object(ScriptInfo).load_app()
                  ctx.with_resource(app.app_context())
              return ctx.invoke(f, *args, **kwargs)

2. FUNCTION: decorator
   File: ../flask\src\flask\helpers.py:108-110
   Similarity: 0.5835 | Method: HyDE
   Code Preview:
      def decorator(*args: t.Any, **kwargs: t.Any) -> t.Any:
                  gen = generator_or_function(*args, **kw

### Example 3: Force Direct Search for Comparison

In [16]:
# Force direct search to compare with HyDE
query3 = "error handling functions"
result3_direct = smart_search(query3, top_k=5, force_method='direct')
display_search_result(result3_direct)

Smart Search: DIRECT mode
Query: 'error handling functions'
Found 5 code results:
1. CLASS: UnexpectedUnicodeError
   File: ../flask\src\flask\debughelpers.py:17-20
   Similarity: 0.4215 | Method: Direct
   Doc: Raised in places where we want some better error reporting for     unexpected unicode or binary data.
   Code Preview:
      class UnexpectedUnicodeError(AssertionError, UnicodeError):
          """Raised in places where we want some better error reporting for
          unexpected unicode or binary data.
          """

2. FUNCTION: _called_with_wrong_args
   File: ../flask\src\flask\cli.py:94-117
   Similarity: 0.3834 | Method: Direct
   Doc: Check whether calling a function raised a ``TypeError`` because     the call failed or because something in the factory raised the     error.      :pa...
   Code Preview:
      def _called_with_wrong_args(f: t.Callable[..., Flask]) -> bool:
          """Check whether calling a function raised a ``TypeError`` because
          the call fail

### Example 4: Compare HyDE vs Direct

In [17]:
# Compare HyDE vs Direct for the same query
query4 = "request context management"

print("### HyDE Search ###")
result4_hyde = smart_search(query4, top_k=5, force_method='hyde')
display_search_result(result4_hyde)


print("### Direct Search ###")
result4_direct = smart_search(query4, top_k=5, force_method='direct')
display_search_result(result4_direct)

### HyDE Search ###
Smart Search: HYDE mode
Query: 'request context management'

 HyDE Code Search: 'request context management'
  → Generating hypothetical code via API...
  ✓ Generated 648 characters of hypothetical code
  → Embedding hypothetical code locally...
  → Searching for similar code...
  ✓ Found 5 results

Found 5 code results:
1. CLASS: RequestContext
   File: ../flask\src\flask\ctx.py:287-449
   Similarity: 0.6628 | Method: HyDE
   Doc: The request context contains per-request information. The Flask     app creates and pushes it at the beginning of the request, then pops     it at the...
   Code Preview:
      class RequestContext:
          """The request context contains per-request information. The Flask
          app creates and pushes it at the beginning of the request, then pops
          it at the end of the request. It will create the URL adapter and
          request object for the WSGI environment provided.
      ... (43 more lines)

2. FUNCTION: copy_current_r

## 14. Custom Search Interface

Run your own queries here with automatic mode selection.

In [18]:
# Custom search - modify the query below
custom_query = "What is the Blueprint class used for?"
custom_result = smart_search(custom_query, top_k=5)
display_search_result(custom_result)

Smart Search: HYDE mode
Query: 'What is the Blueprint class used for?'

 HyDE Code Search: 'What is the Blueprint class used for?'
  → Generating hypothetical code via API...
  ✓ Generated 488 characters of hypothetical code
  → Embedding hypothetical code locally...
  → Searching for similar code...
  ✓ Found 5 results

Found 5 code results:
1. CLASS: Blueprint
   File: ../flask\src\flask\blueprints.py:18-128
   Similarity: 0.7347 | Method: HyDE
   Code Preview:
      class Blueprint(SansioBlueprint):
          def __init__(
              self,
              name: str,
              import_name: str,
              static_folder: str | os.PathLike[str] | None = None,
      ... (50 more lines)

2. FUNCTION: __init__
   File: ../flask\src\flask\blueprints.py:19-53
   Similarity: 0.6421 | Method: HyDE
   Code Preview:
      def __init__(
              self,
              name: str,
              import_name: str,
              static_folder: str | os.PathLike[str] | None = None,
         

## 15. Performance Analysis

Compare the top result from HyDE and Direct search for the same query. RAG is also run when the query is detected as a topic query.

In [19]:
def compare_search_methods(query: str, top_k: int = 5):
    """Compare all three search methods for a given query."""
    print(f"Comparing Search Methods")
    print(f"Query: '{query}'")
    
    # Detect suggested method
    suggested_type = detect_query_type(query)
    print(f"Suggested method: {suggested_type.upper()}\n")
    
    # Try HyDE
    print("1. HyDE Code Search")
    hyde_results = hyde_code_search(query, top_k)
    if hyde_results:
        print(f"Top result: {hyde_results[0]['name']} (similarity: {hyde_results[0]['similarity']:.4f})")
    
    # Try Direct
    print("2. Direct Search")
    direct_results = direct_search(query, top_k)
    if direct_results:
        print(f"Top result: {direct_results[0]['name']} (similarity: {direct_results[0]['similarity']:.4f})")
    
    # Try RAG (if topic query)
    if suggested_type == 'topic':
        print("3. RAG Topic Query")
        rag_result = rag_topic_query(query, top_k, context_chunks=3)
        print(f"Answer length: {len(rag_result['answer'])} characters")
        print(f"Answer preview: {rag_result['answer'][:200]}...")

# Test comparison
test_query = "template rendering"
compare_search_methods(test_query, top_k=3)

Comparing Search Methods
Query: 'template rendering'
Suggested method: CODE

1. HyDE Code Search

 HyDE Code Search: 'template rendering'
  → Generating hypothetical code via API...
  ✓ Generated 478 characters of hypothetical code
  → Embedding hypothetical code locally...
  → Searching for similar code...
  ✓ Found 3 results

Top result: index (similarity: 0.4885)
2. Direct Search
Top result: index (similarity: 0.4691)
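
The comparison above prints only the top hit from each method. One rough way to quantify how much two methods agree is the Jaccard overlap of their top-k result sets. The sketch below assumes each result is a dict with a `'name'` key (as used above) and, as an assumption, an optional `'file'` key to tell apart functions with the same name; `topk_overlap` and the sample data are hypothetical and not part of the notebook.

```python
from typing import Dict, List, Set, Tuple


def result_keys(results: List[Dict]) -> Set[Tuple[str, str]]:
    """Identify each result by (file, name); 'file' is an assumed key."""
    return {(r.get("file", ""), r["name"]) for r in results}


def topk_overlap(results_a: List[Dict], results_b: List[Dict]) -> float:
    """Jaccard overlap of two result lists: |A ∩ B| / |A ∪ B|."""
    keys_a, keys_b = result_keys(results_a), result_keys(results_b)
    union = keys_a | keys_b
    if not union:
        return 0.0  # both lists empty: define the overlap as 0
    return len(keys_a & keys_b) / len(union)


# Invented sample results, for illustration only
sample_hyde = [
    {"name": "index", "file": "a.py", "similarity": 0.49},
    {"name": "render", "file": "b.py", "similarity": 0.45},
]
sample_direct = [
    {"name": "index", "file": "a.py", "similarity": 0.47},
    {"name": "stream", "file": "c.py", "similarity": 0.40},
]

overlap = topk_overlap(sample_hyde, sample_direct)
```

A high overlap suggests the hypothetical code added little beyond the raw query for that query; a low overlap flags queries worth inspecting by hand.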


## Summary

This notebook implements semantic code search with:

✅ **Local Embedding**: Fast, offline embedding generation  
✅ **API-Based HyDE**: Generate hypothetical code via API for better search relevance  
✅ **API-Based RAG**: Retrieve context and generate explanations for topic queries  
✅ **Automatic Mode Selection**: Smart detection of query type  
✅ **Modular Design**: Easy to extend and customize  

**Usage:**
1. Index your repository once
2. Use `smart_search()` for automatic mode selection
3. Force a specific method with the `force_method` parameter (e.g. `'hyde'` or `'direct'`)
4. Compare methods with `compare_search_methods()`

**Next Steps:**
- Integrate into a VS Code extension
- Add caching for API responses
- Implement hybrid search (BM25 + semantic)
- Add user feedback collection
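
As one possible starting point for the caching item, the sketch below wraps an API call in a small in-memory cache keyed by a hash of the prompt. `ResponseCache` and `fake_api` are hypothetical names introduced here; in the notebook, the wrapped callable would be whatever function sends the prompt to the LLM API.

```python
import hashlib
from typing import Callable, Dict, List


class ResponseCache:
    """In-memory cache for API responses, keyed by a SHA-256 of the prompt."""

    def __init__(self, call_api: Callable[[str], str]) -> None:
        self._call_api = call_api
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: call the API once and remember the response
        self.misses += 1
        response = self._call_api(prompt)
        self._store[key] = response
        return response


calls: List[str] = []


def fake_api(prompt: str) -> str:
    """Hypothetical stand-in for the real API call; records each invocation."""
    calls.append(prompt)
    return f"# hypothetical code for: {prompt}"


cache = ResponseCache(fake_api)
first = cache.get("template rendering")
second = cache.get("template rendering")  # served from the cache
```

Besides saving API calls, reusing the cached hypothetical code would make repeated HyDE searches for the same query return the same results, assuming the index has not changed. A persistent version could store the same key-value pairs on disk.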