# 🔍 RAG API - Semantic Search System

## 📋 What this notebook does

This notebook implements a **complete RAG (Retrieval-Augmented Generation) system** with hybrid semantic search:

- 🔍 **Vector search** using BAAI/bge-m3 embeddings
- 🏷️ **Metadata filters** (repo, branch, file, date)
- 🎯 **Hybrid search** combining vectors and metadata
- ⚡ **Smart caching** for frequent queries
- 📊 **Reranking** and advanced scoring

## 🌐 Available Endpoints

### Main Search
- `GET /api/v1/search` - Hybrid semantic search
- `POST /api/v1/search` - Search with a complex payload

### Specialized Search
- `GET /api/v1/search/similar/{point_id}` - Similar documents
- `GET /api/v1/search/metadata` - Metadata-only search

### Utilities
- `GET /api/v1/search/stats` - Collection statistics
- `GET /api/v1/search/test` - Connectivity test

---

## 🔧 Configuration and Imports

In [2]:
import os
import json
import hashlib
import time
from pathlib import Path
from datetime import datetime, timedelta
from typing import List, Dict, Optional, Any
from collections import defaultdict
from functools import lru_cache

# Qdrant
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct,
    Filter, FieldCondition, MatchValue, 
    FilterSelector, Range
)

# Embeddings
from sentence_transformers import SentenceTransformer
import numpy as np

# Environment configuration
QDRANT_URL = os.getenv("QDRANT_URL", "http://qdrant.codrstudio.dev:6333")
QDRANT_API_KEY = os.getenv("QDRANT_API_KEY")
COLLECTION_NAME = os.getenv("QDRANT_COLLECTION", "nic")

# Embedding model (same as the pipeline)
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "BAAI/bge-m3")

# Search parameters
DEFAULT_TOP_K = 5
DEFAULT_SCORE_THRESHOLD = 0.7
MAX_TOP_K = 20

# Cache
CACHE_TTL = 300  # 5 minutes
query_cache = {}

print(f"🔧 Configuration:")
print(f"  Qdrant URL: {QDRANT_URL}")
print(f"  Collection: {COLLECTION_NAME}")
print(f"  Model: {EMBEDDING_MODEL}")
print(f"  API Key: ***{QDRANT_API_KEY[-4:] if QDRANT_API_KEY and len(QDRANT_API_KEY) > 4 else '???'}")

🔧 Configuration:
  Qdrant URL: http://qdrant.codrstudio.dev:6333
  Collection: nic
  Model: BAAI/bge-m3
  API Key: ***d857


## 🔗 Qdrant Connection

In [3]:
# Initialize the Qdrant client
try:
    client = QdrantClient(
        url=QDRANT_URL,
        api_key=QDRANT_API_KEY
    )
    
    # Check the collection
    collection_info = client.get_collection(COLLECTION_NAME)
    
    print(f"✅ Connected to Qdrant")
    print(f"📊 Collection '{COLLECTION_NAME}':")
    print(f"  Points: {collection_info.points_count}")
    print(f"  Status: {collection_info.status}")
    print(f"  Vectors: {collection_info.config.params.vectors.size} dimensions")
    
    VECTOR_SIZE = collection_info.config.params.vectors.size
    
except Exception as e:
    print(f"❌ Failed to connect to Qdrant: {e}")
    print(f"   Check the settings in .env")
    raise

✅ Connected to Qdrant
📊 Collection 'nic':
  Points: 644
  Status: green
  Vectors: 1024 dimensions




## 🤖 Embedding Model

In [4]:
# Load the model (cached as a singleton)
_model_instance = None

def get_embedding_model():
    """Return the singleton instance of the embedding model."""
    global _model_instance
    if _model_instance is None:
        print(f"🤖 Loading model {EMBEDDING_MODEL}...")
        _model_instance = SentenceTransformer(EMBEDDING_MODEL)
        print(f"✅ Model loaded: {_model_instance.get_sentence_embedding_dimension()} dimensions")
    return _model_instance

def generate_embedding(text: str) -> List[float]:
    """
    Generate a normalized embedding for a text.
    """
    model = get_embedding_model()
    embedding = model.encode(
        text,
        normalize_embeddings=True,
        show_progress_bar=False
    )
    return embedding.tolist()

# Model test
test_embedding = generate_embedding("teste de embedding")
print(f"📏 Test embedding: {len(test_embedding)} dimensions")
print(f"📊 Magnitude: {np.linalg.norm(test_embedding):.3f}")

🤖 Loading model BAAI/bge-m3...
✅ Model loaded: 1024 dimensions
📏 Test embedding: 1024 dimensions
📊 Magnitude: 1.000
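
The magnitude of 1.000 matters because, for unit-length vectors, the dot product equals the cosine similarity. The sketch below checks that identity with random stand-in vectors (not real bge-m3 output); the helper name `cosine_similarity` is an illustration only and is not used elsewhere in this notebook.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity from its general definition."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two stand-in "embeddings" (random vectors, NOT real model output)
rng = np.random.default_rng(seed=0)
a = rng.normal(size=1024)
b = rng.normal(size=1024)

# L2-normalize each vector, which is what normalize_embeddings=True does
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)

# For unit vectors the plain dot product already is the cosine similarity
dot_of_units = float(np.dot(a_unit, b_unit))
print(abs(dot_of_units - cosine_similarity(a, b)) < 1e-9)
```

This is why a collection configured with cosine distance can compare the stored vectors and the query vector directly.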


## 🏷️ Building Metadata Filters

In [5]:
def build_metadata_filter(filters: Optional[Dict[str, Any]] = None) -> Optional[Filter]:
    """
    Build Qdrant filters from search parameters.
    
    Supported filters:
    - repo: string (GitLab repository)
    - branch: string (Git branch)
    - relpath: string (relative file path)
    - source_document: string (document name)
    - lang: string (language: pt-BR, en)
    - date_from: string (ISO date)
    - date_to: string (ISO date)
    - embed_model_major: string (model version)
    """
    if not filters:
        return None
    
    conditions = []
    
    # Exact string-match filters
    string_fields = ['repo', 'branch', 'relpath', 'source_document', 'lang', 'embed_model_major']
    for field in string_fields:
        if field in filters and filters[field]:
            conditions.append(
                FieldCondition(
                    key=field,
                    match=MatchValue(value=filters[field])
                )
            )
    
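    # Date range filter. ISO date strings need DatetimeRange (qdrant-client >= 1.8);
    # the numeric Range model only accepts numbers.
    if filters.get('date_from') or filters.get('date_to'):
        from qdrant_client.models import DatetimeRange

        date_range = {}
        if filters.get('date_from'):
            date_range['gte'] = filters['date_from']
        if filters.get('date_to'):
            date_range['lte'] = filters['date_to']
        
        conditions.append(
            FieldCondition(
                key='last_updated',
                range=DatetimeRange(**date_range)
            )
        )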
    
    if conditions:
        return Filter(must=conditions)
    
    return None

# Filter test
test_filter = build_metadata_filter({
    'repo': 'nic/documentacao/base-de-conhecimento',
    'branch': 'main'
})
print(f"🏷️ Test filter created: {test_filter is not None}")

🏷️ Test filter created: True
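
Because the conditions are passed as `must`, a point matches only when every condition holds (logical AND). The pure-Python sketch below mimics that behaviour on a plain payload dict so the semantics can be checked without a Qdrant server; `matches_all` is a hypothetical helper for illustration, not part of qdrant-client.

```python
from typing import Any, Dict

def matches_all(payload: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return True when every non-empty filter equals the payload value (AND semantics)."""
    return all(
        payload.get(field) == expected
        for field, expected in filters.items()
        if expected  # empty or None filters are skipped, as in build_metadata_filter
    )

# Example payload shaped like the points stored in the collection
payload = {
    "repo": "nic/documentacao/base-de-conhecimento",
    "branch": "main",
    "lang": "pt-BR",
}

print(matches_all(payload, {"branch": "main", "lang": "pt-BR"}))  # both conditions hold
print(matches_all(payload, {"branch": "main", "lang": "en"}))     # one condition fails
```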


## 🎯 Main Hybrid Search Function

In [6]:
def hybrid_search(
    query: str,
    top_k: int = DEFAULT_TOP_K,
    score_threshold: float = DEFAULT_SCORE_THRESHOLD,
    filters: Optional[Dict[str, Any]] = None,
    include_context: bool = False
) -> Dict[str, Any]:
    """
    Run a hybrid search (vector similarity plus metadata filters) against Qdrant.
    
    Args:
        query: Search text
        top_k: Number of results (max: 20)
        score_threshold: Minimum score (0-1)
        filters: Metadata filters
        include_context: Whether to include adjacent chunks (accepted but not yet used)
    
    Returns:
        Dictionary with results and metadata
    """
    start_time = time.time()
    
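    # Validate parameters: clamp top_k to the range 1..MAX_TOP_K
    top_k = max(1, min(top_k, MAX_TOP_K))
    
    # Check the cache
    cache_key = hashlib.md5(
        f"{query}:{top_k}:{score_threshold}:{json.dumps(filters or {}, sort_keys=True)}".encode()
    ).hexdigest()
    
    if cache_key in query_cache:
        cached_result, cached_time = query_cache[cache_key]
        if time.time() - cached_time < CACHE_TTL:
            # Flag the hit under search_metadata (where callers read it)
            # without mutating the cached entry itself
            return {
                **cached_result,
                'search_metadata': {**cached_result['search_metadata'], 'from_cache': True}
            }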
    
    # Generate the query embedding
    query_embedding = generate_embedding(query)
    
    # Build the filters
    search_filter = build_metadata_filter(filters)
    
    # Search Qdrant (newer qdrant-client releases deprecate `search` in favor of `query_points`)
    search_results = client.search(
        collection_name=COLLECTION_NAME,
        query_vector=query_embedding,
        query_filter=search_filter,
        limit=top_k,
        score_threshold=score_threshold,
        with_payload=True,
        with_vectors=False
    )
    
    # Format the results
    results = []
    for hit in search_results:
        result = {
            'score': round(hit.score, 4),
            'text': hit.payload.get('text', ''),
            'metadata': {
                'chunk_id': hit.payload.get('chunk_id'),
                'chunk_index': hit.payload.get('chunk_index'),
                'source_document': hit.payload.get('source_document'),
                'repo': hit.payload.get('repo'),
                'branch': hit.payload.get('branch'),
                'commit': hit.payload.get('commit'),
                'last_updated': hit.payload.get('last_updated')
            },
            'point_id': hit.id
        }
        
        # Add highlights (query words found in the text)
        highlights = []
        query_words = query.lower().split()
        text_lower = result['text'].lower()
        for word in query_words:
            if len(word) > 2 and word in text_lower:
                # Find the context around the word
                idx = text_lower.index(word)
                start = max(0, idx - 30)
                end = min(len(result['text']), idx + len(word) + 30)
                highlight = result['text'][start:end]
                if start > 0:
                    highlight = '...' + highlight
                if end < len(result['text']):
                    highlight = highlight + '...'
                highlights.append(highlight)
        
        if highlights:
            result['highlights'] = highlights[:3]  # Max 3 highlights
        
        results.append(result)
    
    # Build the response
    response = {
        'query': query,
        'total_results': len(results),
        'results': results,
        'search_metadata': {
            'model': EMBEDDING_MODEL,
            'collection': COLLECTION_NAME,
            'top_k': top_k,
            'score_threshold': score_threshold,
            'filters_applied': filters or {},
            'search_time_ms': round((time.time() - start_time) * 1000, 2),
            'from_cache': False
        }
    }
    
    # Add to the cache
    query_cache[cache_key] = (response, time.time())
    
    # Clean up old cache entries
    if len(query_cache) > 100:
        # Remove the oldest entries
        sorted_cache = sorted(query_cache.items(), key=lambda x: x[1][1])
        for key, _ in sorted_cache[:50]:
            del query_cache[key]
    
    return response

# Search test
test_result = hybrid_search(
    query="self checkout",
    top_k=3
)
print(f"🔍 Search test:")
print(f"  Results: {test_result['total_results']}")
print(f"  Time: {test_result['search_metadata']['search_time_ms']}ms")
if test_result['results']:
    print(f"  Best score: {test_result['results'][0]['score']}")

🔍 Search test:
  Results: 0
  Time: 164.04ms
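
The caching inside `hybrid_search` combines two rules: entries expire after a TTL, and once the store grows past a size limit the oldest half is dropped. A minimal sketch of the same idea as a standalone class follows; `TTLCache` and its injectable `clock` parameter are assumptions made for illustration (the notebook itself uses a plain dict), and the clock makes expiry testable without sleeping.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

class TTLCache:
    """Small in-memory cache with time-based expiry and oldest-first eviction."""

    def __init__(self, ttl_seconds: float, max_entries: int,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._store: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self._clock() - stored_at >= self._ttl:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._store[key] = (value, self._clock())
        if len(self._store) > self._max:
            # Evict the oldest half, mirroring the cleanup in hybrid_search
            oldest_first = sorted(self._store, key=lambda k: self._store[k][1])
            for old_key in oldest_first[: len(oldest_first) // 2]:
                del self._store[old_key]

    def __len__(self) -> int:
        return len(self._store)

# Usage with a fake clock so that time can be advanced by hand
now = [0.0]
cache = TTLCache(ttl_seconds=300, max_entries=2, clock=lambda: now[0])
cache.put("a", {"total_results": 0})
fresh = cache.get("a")      # still within the TTL
now[0] = 301.0
expired = cache.get("a")    # past the TTL, so this is a miss
```

Dropping half of the entries at once keeps the cleanup infrequent, at the cost of discarding some entries that are still within their TTL.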




## 🌐 Endpoint: Main Search

In [7]:
# GET /api/v1/search
import sys
import urllib.parse

# Parse query parameters
def parse_query_params():
    """Parse query parameters from the REQUEST global."""
    try:
        # Jupyter Kernel Gateway injects REQUEST as a JSON string;
        # query-string values arrive under 'args' as lists
        request = json.loads(REQUEST)
        args = request.get('args', {})
        
        # Extract parameters
        query = args.get('q', [''])[0]
        top_k = int(args.get('top_k', [DEFAULT_TOP_K])[0])
        score_threshold = float(args.get('score_threshold', [DEFAULT_SCORE_THRESHOLD])[0])
        
        # Optional filters
        filters = {}
        for key in ['repo', 'branch', 'relpath', 'source_document', 'lang']:
            if key in args:
                filters[key] = args[key][0]
        
        return query, top_k, score_threshold, filters
        
    except (NameError, KeyError):
        # Fallback for local testing (REQUEST is undefined outside the gateway)
        return "self checkout", DEFAULT_TOP_K, DEFAULT_SCORE_THRESHOLD, {}

# Run the search
query, top_k, score_threshold, filters = parse_query_params()

if not query:
    response = {
        'error': 'Query parameter "q" is required',
        'example': '/api/v1/search?q=self+checkout&top_k=5'
    }
else:
    response = hybrid_search(
        query=query,
        top_k=top_k,
        score_threshold=score_threshold,
        filters=filters if filters else None
    )

# Return JSON
print(json.dumps(response, indent=2, ensure_ascii=False))

{
  "query": "self checkout",
  "total_results": 0,
  "results": [],
  "search_metadata": {
    "model": "BAAI/bge-m3",
    "collection": "nic",
    "top_k": 5,
    "score_threshold": 0.7,
    "filters_applied": {},
    "search_time_ms": 138.77,
    "from_cache": false
  }
}
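
In Jupyter Kernel Gateway's notebook-http mode, the handler cell receives the request as a JSON string in the `REQUEST` global, with query-string values under `args` as lists. The sketch below parses a simulated request of that shape; the function name `parse_search_args`, its defaults, and the sample request are assumptions for illustration.

```python
import json
from typing import Dict, Tuple

def parse_search_args(raw_request: str,
                      default_top_k: int = 5,
                      max_top_k: int = 20) -> Tuple[str, int, Dict[str, str]]:
    """Extract q, top_k and string filters from a gateway-style request string."""
    request = json.loads(raw_request)
    args = request.get("args", {})

    # Every query-string value is a list; take the first element
    query = args.get("q", [""])[0]
    top_k = int(args.get("top_k", [default_top_k])[0])
    top_k = max(1, min(top_k, max_top_k))  # clamp to a sane range

    filters = {key: args[key][0] for key in ("repo", "branch") if key in args}
    return query, top_k, filters

# Simulated REQUEST, shaped like what the gateway would pass to the cell
simulated_request = json.dumps({
    "body": "",
    "args": {"q": ["self checkout"], "top_k": ["50"], "branch": ["main"]},
    "path": {},
    "headers": {},
})

query, top_k, filters = parse_search_args(simulated_request)
print(query, top_k, filters)
```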




## 🌐 Endpoint: POST Search

In [8]:
# POST /api/v1/search
import sys

def parse_post_body():
    """Parse the JSON body of a POST request."""
    try:
        # Jupyter Kernel Gateway injects REQUEST as a JSON string
        request = json.loads(REQUEST)
        body = request.get('body', {})
        # The body may arrive already decoded (dict) or as raw JSON text
        return json.loads(body) if isinstance(body, str) else body
    except (NameError, json.JSONDecodeError):
        # Fallback for local testing
        return {
            'query': 'self checkout pagamento',
            'top_k': 5,
            'filters': {'branch': 'main'}
        }

# Parse request body
request_data = parse_post_body()

# Validate required fields
if 'query' not in request_data:
    response = {
        'error': 'Field "query" is required in request body',
        'example': {
            'query': 'search text',
            'top_k': 5,
            'score_threshold': 0.7,
            'filters': {
                'repo': 'nic/documentacao',
                'branch': 'main'
            }
        }
    }
else:
    # Run the search
    response = hybrid_search(
        query=request_data['query'],
        top_k=request_data.get('top_k', DEFAULT_TOP_K),
        score_threshold=request_data.get('score_threshold', DEFAULT_SCORE_THRESHOLD),
        filters=request_data.get('filters'),
        include_context=request_data.get('include_context', False)
    )

print(json.dumps(response, indent=2, ensure_ascii=False))

{
  "query": "self checkout pagamento",
  "total_results": 0,
  "results": [],
  "search_metadata": {
    "model": "BAAI/bge-m3",
    "collection": "nic",
    "top_k": 5,
    "score_threshold": 0.7,
    "filters_applied": {
      "branch": "main"
    },
    "search_time_ms": 102.96,
    "from_cache": false
  }
}
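
The cell above only checks that `query` is present. A slightly stricter validation pass could also check types and ranges before calling `hybrid_search`. The sketch below is one possible shape for that check; `validate_search_body` and its error messages are hypothetical, and the limits mirror the defaults configured earlier in this notebook.

```python
from typing import Any, Dict, List

def validate_search_body(body: Dict[str, Any], max_top_k: int = 20) -> List[str]:
    """Return a list of validation errors; an empty list means the body is usable."""
    errors: List[str] = []

    query = body.get("query")
    if not isinstance(query, str) or not query.strip():
        errors.append('"query" must be a non-empty string')

    top_k = body.get("top_k", 5)
    # bool is a subclass of int in Python, so it is rejected explicitly
    if isinstance(top_k, bool) or not isinstance(top_k, int) or not 1 <= top_k <= max_top_k:
        errors.append(f'"top_k" must be an integer between 1 and {max_top_k}')

    threshold = body.get("score_threshold", 0.7)
    if (isinstance(threshold, bool) or not isinstance(threshold, (int, float))
            or not 0 <= threshold <= 1):
        errors.append('"score_threshold" must be a number between 0 and 1')

    filters = body.get("filters")
    if filters is not None and not isinstance(filters, dict):
        errors.append('"filters" must be an object')

    return errors

print(validate_search_body({"query": "pagamento", "top_k": 5}))
print(validate_search_body({"top_k": 0, "score_threshold": 2}))
```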




## 🔄 Endpoint: Similarity Search

In [9]:
# GET /api/v1/search/similar/:point_id
def find_similar_documents(
    point_id: int,
    top_k: int = DEFAULT_TOP_K,
    score_threshold: float = 0.8
) -> Dict[str, Any]:
    """
    Find documents similar to an existing document.
    """
    start_time = time.time()
    
    try:
        # Fetch the original point
        original_points = client.retrieve(
            collection_name=COLLECTION_NAME,
            ids=[point_id],
            with_vectors=True,
            with_payload=True
        )
        
        if not original_points:
            return {
                'error': f'Point {point_id} not found',
                'point_id': point_id
            }
        
        original = original_points[0]
        
        # Search for similar points using the document's vector
        similar_results = client.search(
            collection_name=COLLECTION_NAME,
            query_vector=original.vector,
            limit=top_k + 1,  # +1 because the document itself will be returned
            score_threshold=score_threshold,
            with_payload=True,
            with_vectors=False
        )
        
        # Filter the document itself out of the results
        results = []
        for hit in similar_results:
            if hit.id != point_id:
                results.append({
                    'score': round(hit.score, 4),
                    'text': hit.payload.get('text', '')[:200] + '...',
                    'metadata': {
                        'chunk_id': hit.payload.get('chunk_id'),
                        'source_document': hit.payload.get('source_document'),
                        'chunk_index': hit.payload.get('chunk_index')
                    },
                    'point_id': hit.id
                })
        
        return {
            'original': {
                'point_id': point_id,
                'text': original.payload.get('text', '')[:200] + '...',
                'source_document': original.payload.get('source_document')
            },
            'similar_documents': results[:top_k],
            'total_similar': len(results),
            'search_metadata': {
                'collection': COLLECTION_NAME,
                'score_threshold': score_threshold,
                'search_time_ms': round((time.time() - start_time) * 1000, 2)
            }
        }
        
    except Exception as e:
        return {
            'error': str(e),
            'point_id': point_id
        }

# Test (use a real ID from the collection)
# similar_result = find_similar_documents(12345, top_k=3)
# print(json.dumps(similar_result, indent=2, ensure_ascii=False))
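
The "query with the point's own vector, then drop the point itself" pattern can be tried offline with NumPy. The sketch below ranks a tiny hand-made matrix by cosine similarity; `top_k_similar` and the four 2-D vectors are illustrative assumptions and do not touch Qdrant.

```python
from typing import List, Tuple

import numpy as np

def top_k_similar(vectors: np.ndarray, index: int, top_k: int = 3,
                  score_threshold: float = 0.0) -> List[Tuple[int, float]]:
    """Rank rows by cosine similarity to row `index`, excluding that row."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = unit @ unit[index]   # cosine similarity of every row to the query row
    ranked = np.argsort(-scores)  # best match first
    hits = [
        (int(i), float(scores[i]))
        for i in ranked
        if i != index and scores[i] >= score_threshold
    ]
    return hits[:top_k]

# Four toy "documents" in 2-D; row 1 points almost the same way as row 0
vectors = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
])

neighbours = top_k_similar(vectors, index=0, top_k=2)
strict = top_k_similar(vectors, index=0, top_k=2, score_threshold=0.8)
print(neighbours)
print(strict)
```

Requesting `top_k + 1` hits from the server, as `find_similar_documents` does, serves the same purpose as the `i != index` check here: the source point always matches itself with the highest score.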

## 📊 Endpoint: Metadata Search

In [10]:
# GET /api/v1/search/metadata
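def search_by_metadata(
    filters: Dict[str, Any],
    limit: int = 20,
    offset: Optional[Any] = None
) -> Dict[str, Any]:
    """
    Search by metadata filters only, without using vectors.
    Useful for listing the documents of a specific repository/branch.

    Note: `offset` is the Qdrant scroll cursor (the ID of the point to resume
    from, as returned in `next_offset`), not a count of rows to skip.
    """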
    start_time = time.time()
    
    if not filters:
        return {
            'error': 'At least one filter is required',
            'available_filters': [
                'repo', 'branch', 'relpath', 'source_document', 
                'lang', 'date_from', 'date_to'
            ]
        }
    
    # Build the filter
    search_filter = build_metadata_filter(filters)
    
    if not search_filter:
        return {
            'error': 'Invalid filters provided',
            'filters': filters
        }
    
    # Fetch with scroll (pagination)
    results, next_offset = client.scroll(
        collection_name=COLLECTION_NAME,
        scroll_filter=search_filter,
        limit=limit,
        offset=offset,
        with_payload=True,
        with_vectors=False
    )
    
    # Group by document
    documents = defaultdict(list)
    for point in results:
        doc_name = point.payload.get('source_document', 'unknown')
        documents[doc_name].append({
            'chunk_index': point.payload.get('chunk_index'),
            'text_preview': point.payload.get('text', '')[:100] + '...',
            'point_id': point.id
        })
    
    # Format the response
    response = {
        'filters': filters,
        'total_points': len(results),
        'documents': [
            {
                'source_document': doc,
                'chunks_count': len(chunks),
                'chunks': sorted(chunks, key=lambda x: x['chunk_index'])[:3]  # First 3 chunks
            }
            for doc, chunks in documents.items()
        ],
        'pagination': {
            'limit': limit,
            'offset': offset,
            'has_more': next_offset is not None,
            'next_offset': next_offset
        },
        'search_metadata': {
            'collection': COLLECTION_NAME,
            'search_time_ms': round((time.time() - start_time) * 1000, 2)
        }
    }
    
    return response

# Test
metadata_result = search_by_metadata(
    filters={'branch': 'main'},
    limit=10
)
print(f"📊 Metadata search:")
print(f"  Total points: {metadata_result.get('total_points', 0)}")
print(f"  Documents: {len(metadata_result.get('documents', []))}")

📊 Metadata search:
  Total points: 10
  Documents: 8
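
Scroll pagination is cursor-based: each page comes back with the ID to resume from, and `None` signals the last page. The sketch below walks all pages against an in-memory stand-in; `fake_scroll`, `scroll_all`, and the seven fake IDs are hypothetical and only imitate the `(points, next_offset)` return shape of `client.scroll`.

```python
from typing import List, Optional, Tuple

POINT_IDS = list(range(1, 8))  # seven fake point IDs standing in for a collection

def fake_scroll(limit: int, offset: Optional[int] = None) -> Tuple[List[int], Optional[int]]:
    """Stand-in for client.scroll: one page plus the ID to resume from (None when done)."""
    start = 0 if offset is None else POINT_IDS.index(offset)
    page = POINT_IDS[start:start + limit]
    next_index = start + limit
    next_offset = POINT_IDS[next_index] if next_index < len(POINT_IDS) else None
    return page, next_offset

def scroll_all(limit: int = 3) -> List[int]:
    """Follow next_offset until the server reports there is nothing left."""
    collected: List[int] = []
    offset: Optional[int] = None
    while True:
        page, offset = fake_scroll(limit=limit, offset=offset)
        collected.extend(page)
        if offset is None:
            break
    return collected

all_ids = scroll_all(limit=3)
print(all_ids)
```

A client of the metadata endpoint would follow the same loop, passing the `next_offset` value from each response back as `offset` in the next request.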


## 📈 Endpoint: Statistics

In [11]:
# GET /api/v1/search/stats
def get_collection_stats() -> Dict[str, Any]:
    """
    Return statistics for the collection and the search system.
    """
    try:
        # Collection information
        collection_info = client.get_collection(COLLECTION_NAME)
        
        # Cache statistics
        cache_stats = {
            'entries': len(query_cache),
            'ttl_seconds': CACHE_TTL,
            'max_entries': 100
        }
        
        # Sample one point to show the available payload fields
        sample_points = client.scroll(
            collection_name=COLLECTION_NAME,
            limit=1,
            with_payload=True,
            with_vectors=False
        )[0]
        
        available_fields = []
        if sample_points:
            available_fields = list(sample_points[0].payload.keys())
        
        return {
            'collection': {
                'name': COLLECTION_NAME,
                'points_count': collection_info.points_count,
                'status': collection_info.status,
                'vector_size': collection_info.config.params.vectors.size,
                'distance_metric': 'COSINE'
            },
            'model': {
                'name': EMBEDDING_MODEL,
                'dimensions': VECTOR_SIZE
            },
            'search_config': {
                'default_top_k': DEFAULT_TOP_K,
                'max_top_k': MAX_TOP_K,
                'default_score_threshold': DEFAULT_SCORE_THRESHOLD
            },
            'cache': cache_stats,
            'available_metadata_fields': available_fields,
            'api_version': '1.0.0',
            'timestamp': datetime.utcnow().isoformat() + 'Z'  # UTC, so the 'Z' suffix is accurate
        }
        
    except Exception as e:
        return {
            'error': str(e),
            'collection': COLLECTION_NAME
        }

# Run
stats = get_collection_stats()
print(json.dumps(stats, indent=2, ensure_ascii=False))

{
  "collection": {
    "name": "nic",
    "points_count": 644,
    "status": "green",
    "vector_size": 1024,
    "distance_metric": "COSINE"
  },
  "model": {
    "name": "BAAI/bge-m3",
    "dimensions": 1024
  },
  "search_config": {
    "default_top_k": 5,
    "max_top_k": 20,
    "default_score_threshold": 0.7
  },
  "cache": {
    "entries": 3,
    "ttl_seconds": 300,
    "max_entries": 100
  },
  "available_metadata_fields": [
    "chunk_id",
    "chunk_index",
    "text",
    "char_count",
    "repo",
    "branch",
    "relpath",
    "source_document",
    "commit",
    "last_updated",
    "embed_model_major",
    "embed_model_full",
    "tokenizer_major",
    "tokenizer_full",
    "embedding_model",
    "content_sha256",
    "lang",
    "processing_date",
    "pipeline_version"
  ],
  "api_version": "1.0.0",
  "timestamp": "2025-08-18T20:55:23.863359Z"
}


## 🧪 Endpoint: Connectivity Test

In [12]:
# GET /api/v1/search/test
def test_connectivity() -> Dict[str, Any]:
    """
    Test connectivity with Qdrant and the basic search functionality.
    """
    tests = {
        'qdrant_connection': False,
        'collection_exists': False,
        'model_loaded': False,
        'embedding_generation': False,
        'search_capability': False
    }
    
    errors = []
    
    # Test 1: Qdrant connection
    try:
        client.get_collections()
        tests['qdrant_connection'] = True
    except Exception as e:
        errors.append(f"Qdrant connection: {str(e)}")
    
    # Test 2: Collection exists
    try:
        client.get_collection(COLLECTION_NAME)
        tests['collection_exists'] = True
    except Exception as e:
        errors.append(f"Collection check: {str(e)}")
    
    # Test 3: Model loaded
    try:
        model = get_embedding_model()
        tests['model_loaded'] = True
    except Exception as e:
        errors.append(f"Model loading: {str(e)}")
    
    # Test 4: Embedding generation
    try:
        test_embedding = generate_embedding("teste")
        if len(test_embedding) == VECTOR_SIZE:
            tests['embedding_generation'] = True
    except Exception as e:
        errors.append(f"Embedding generation: {str(e)}")
    
    # Test 5: Search capability
    try:
        result = hybrid_search("teste", top_k=1)
        if 'results' in result:
            tests['search_capability'] = True
    except Exception as e:
        errors.append(f"Search test: {str(e)}")
    
    # Overall result
    all_passed = all(tests.values())
    
    return {
        'status': 'healthy' if all_passed else 'unhealthy',
        'tests': tests,
        'errors': errors if errors else None,
        'environment': {
            'qdrant_url': QDRANT_URL,
            'collection': COLLECTION_NAME,
            'model': EMBEDDING_MODEL
        },
        'timestamp': datetime.utcnow().isoformat() + 'Z'  # UTC, so the 'Z' suffix is accurate
    }

# Run the test
test_result = test_connectivity()
print(json.dumps(test_result, indent=2, ensure_ascii=False))

{
  "status": "healthy",
  "tests": {
    "qdrant_connection": true,
    "collection_exists": true,
    "model_loaded": true,
    "embedding_generation": true,
    "search_capability": true
  },
  "errors": null,
  "environment": {
    "qdrant_url": "http://qdrant.codrstudio.dev:6333",
    "collection": "nic",
    "model": "BAAI/bge-m3"
  },
  "timestamp": "2025-08-18T20:55:24.104979Z"
}




## 📚 OpenAPI Documentation

In [13]:
# GET /api/v1/search/openapi
def get_openapi_spec() -> Dict[str, Any]:
    """
    Return the OpenAPI specification for the search endpoints.
    """
    return {
        "openapi": "3.0.0",
        "info": {
            "title": "NIC RAG API",
            "description": "Semantic search and retrieval API for the NIC system",
            "version": "1.0.0"
        },
        "paths": {
            "/api/v1/search": {
                "get": {
                    "summary": "Hybrid semantic search",
                    "parameters": [
                        {
                            "name": "q",
                            "in": "query",
                            "required": True,
                            "schema": {"type": "string"},
                            "description": "Search text"
                        },
                        {
                            "name": "top_k",
                            "in": "query",
                            "schema": {"type": "integer", "default": 5, "maximum": 20},
                            "description": "Number of results"
                        },
                        {
                            "name": "score_threshold",
                            "in": "query",
                            "schema": {"type": "number", "default": 0.7, "minimum": 0, "maximum": 1},
                            "description": "Minimum similarity score"
                        },
                        {
                            "name": "repo",
                            "in": "query",
                            "schema": {"type": "string"},
                            "description": "Filter by repository"
                        },
                        {
                            "name": "branch",
                            "in": "query",
                            "schema": {"type": "string"},
                            "description": "Filter by branch"
                        }
                    ],
                    "responses": {
                        "200": {
                            "description": "Search results",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "object",
                                        "properties": {
                                            "query": {"type": "string"},
                                            "total_results": {"type": "integer"},
                                            "results": {
                                                "type": "array",
                                                "items": {
                                                    "type": "object",
                                                    "properties": {
                                                        "score": {"type": "number"},
                                                        "text": {"type": "string"},
                                                        "metadata": {"type": "object"},
                                                        "highlights": {"type": "array", "items": {"type": "string"}}
                                                    }
                                                }
                                            },
                                            "search_metadata": {"type": "object"}
                                        }
                                    }
                                }
                            }
                        }
                    }
                },
                "post": {
                    "summary": "Search with a complex payload",
                    "requestBody": {
                        "required": True,
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "required": ["query"],
                                    "properties": {
                                        "query": {"type": "string"},
                                        "top_k": {"type": "integer"},
                                        "score_threshold": {"type": "number"},
                                        "filters": {"type": "object"},
                                        "include_context": {"type": "boolean"}
                                    }
                                }
                            }
                        }
                    }
                }
            },
            "/api/v1/search/similar/{point_id}": {
                "get": {
                    "summary": "Find similar documents",
                    "parameters": [
                        {
                            "name": "point_id",
                            "in": "path",
                            "required": True,
                            "schema": {"type": "integer"}
                        }
                    ]
                }
            },
            "/api/v1/search/metadata": {
                "get": {
                    "summary": "Metadata-only search"
                }
            },
            "/api/v1/search/stats": {
                "get": {
                    "summary": "Collection statistics"
                }
            },
            "/api/v1/search/test": {
                "get": {
                    "summary": "Connectivity test"
                }
            }
        }
    }

# Show the spec
openapi = get_openapi_spec()
print(f"📚 OpenAPI Spec:")
print(f"  Endpoints: {len(openapi['paths'])}")
for path in openapi['paths']:
    print(f"    - {path}")

📚 OpenAPI Spec:
  Endpoints: 5
    - /api/v1/search
    - /api/v1/search/similar/{point_id}
    - /api/v1/search/metadata
    - /api/v1/search/stats
    - /api/v1/search/test


## 🚀 Usage Examples

In [14]:
# Example queries for testing
print("🧪 RAG API usage examples:\n")

# 1. Simple search
print("1️⃣ Simple search:")
print("   GET /api/v1/search?q=self+checkout\n")

# 2. Search with filters
print("2️⃣ Search with filters:")
print("   GET /api/v1/search?q=pagamento&branch=main&top_k=10\n")

# 3. POST search with a complex payload
print("3️⃣ POST search:")
post_example = {
    "query": "identificação do cliente",
    "top_k": 5,
    "score_threshold": 0.8,
    "filters": {
        "repo": "nic/documentacao/base-de-conhecimento",
        "branch": "main"
    },
    "include_context": True
}
print(f"   POST /api/v1/search")
print(f"   Body: {json.dumps(post_example, indent=4)}\n")

# 4. Similarity search
print("4️⃣ Similar documents:")
print("   GET /api/v1/search/similar/12345\n")

# 5. Metadata search
print("5️⃣ Metadata search:")
print("   GET /api/v1/search/metadata?branch=main&repo=nic/documentacao\n")

# 6. Statistics
print("6️⃣ Statistics:")
print("   GET /api/v1/search/stats\n")

print("✅ RAG API ready to use!")

🧪 RAG API usage examples:

1️⃣ Simple search:
   GET /api/v1/search?q=self+checkout

2️⃣ Search with filters:
   GET /api/v1/search?q=pagamento&branch=main&top_k=10

3️⃣ POST search:
   POST /api/v1/search
   Body: {
    "query": "identifica\u00e7\u00e3o do cliente",
    "top_k": 5,
    "score_threshold": 0.8,
    "filters": {
        "repo": "nic/documentacao/base-de-conhecimento",
        "branch": "main"
    },
    "include_context": true
}

4️⃣ Similar documents:
   GET /api/v1/search/similar/12345

5️⃣ Metadata search:
   GET /api/v1/search/metadata?branch=main&repo=nic/documentacao

6️⃣ Statistics:
   GET /api/v1/search/stats

✅ RAG API ready to use!
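
When calling these endpoints from a script, it is safer to let the standard library encode the query string than to write `+` and `&` by hand. A minimal sketch follows; `BASE_URL` is an assumed local gateway address and `build_search_url` is a hypothetical helper, so adjust both to the actual deployment.

```python
from urllib.parse import urlencode

# Assumption: the notebook is served locally by a gateway on port 8888
BASE_URL = "http://localhost:8888"

def build_search_url(query: str, **params) -> str:
    """Build a GET /api/v1/search URL with a properly encoded query string."""
    return f"{BASE_URL}/api/v1/search?" + urlencode({"q": query, **params})

url = build_search_url("self checkout", branch="main", top_k=10)
print(url)
```

The resulting URL can then be fetched with any HTTP client, for example `urllib.request.urlopen(url)`.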
