# Advanced GraphRAG Search Patterns - Ungraph

This notebook demonstrates the advanced GraphRAG search patterns that require optional modules.

## Objectives

1. **Graph-Enhanced Vector Search** - Combines vector search with the structure of the graph
2. **Local Retriever** - Search within small, focused communities
3. **Community Summary Retriever (GDS)** - Community detection using Graph Data Science
4. **Pattern comparison** - When to use each advanced pattern

## Requirements

- `pip install ungraph[gds]` - For the patterns that use GDS
- Neo4j GDS plugin installed and configured
- Graph data with extracted entities (for Graph-Enhanced Vector Search)

**References:**
- [Advanced Patterns](../../docs/api/advanced-search-patterns.md)
- [GraphRAG Patterns](../../docs/api/search-patterns.md)


In [None]:
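def add_src_to_path(path_folder: str):
    """Find `path_folder` in the working directory or any ancestor and add it to sys.path."""
    import sys
    from pathlib import Path
    base_path = Path().resolve()
    # Walk upward from the working directory until the folder is found
    for parent in [base_path] + list(base_path.parents):
        candidate = parent / path_folder
        if candidate.exists():
            # Make both the folder's parent and the folder itself importable
            parent_dir = candidate.parent
            if str(parent_dir) not in sys.path:
                sys.path.insert(0, str(parent_dir))
            if str(candidate) not in sys.path:
                sys.path.append(str(candidate))
            return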

add_src_to_path(path_folder="src")
add_src_to_path(path_folder="src/utils")
add_src_to_path(path_folder="src/data")


In [None]:
# Import libraries
try:
    import ungraph
except ImportError:
    import src
    ungraph = src

from infrastructure.services.advanced_search_patterns import AdvancedSearchPatterns
from infrastructure.services.gds_service import GDSService
from infrastructure.services.huggingface_embedding_service import HuggingFaceEmbeddingService
from src.utils.graph_operations import graph_session

print(f"📦 Ungraph version: {ungraph.__version__}")

# Check that GDS is available
try:
    gds_service = GDSService()
    if gds_service._check_gds_available():
        print("✅ Graph Data Science (GDS) available")
    else:
        print("⚠️  GDS not available. Some patterns will not work.")
        print("   Install with: pip install ungraph[gds] and configure the Neo4j GDS plugin")
except Exception as e:
    print(f"⚠️  Error checking GDS: {e}")


## Part 1: Graph-Enhanced Vector Search

Combines vector search with graph traversal to find related context.
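
The idea can be illustrated without a database. The sketch below is not Ungraph's implementation (which generates a Cypher query); it is a minimal in-memory analogue in which `chunks`, `edges`, and both function names are hypothetical. It ranks chunks by cosine similarity to the query vector, then expands each hit through the graph up to a maximum depth.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def graph_enhanced_search(query_vec, chunks, edges, limit=5, max_depth=2):
    """Top-`limit` chunks by similarity, each with its neighbors within `max_depth` hops.

    chunks: dict chunk_id -> embedding vector (hypothetical in-memory store)
    edges:  dict chunk_id -> list of neighboring chunk_ids
    """
    ranked = sorted(chunks, key=lambda cid: cosine(query_vec, chunks[cid]), reverse=True)
    results = []
    for cid in ranked[:limit]:
        # Breadth-first traversal bounded by max_depth
        seen, queue, related = {cid}, deque([(cid, 0)]), []
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue
            for nxt in edges.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    related.append(nxt)
                    queue.append((nxt, depth + 1))
        results.append({
            "central_chunk": cid,
            "score": cosine(query_vec, chunks[cid]),
            "related_chunks": related,
        })
    return results


# Toy data: three chunks with 2-dimensional embeddings and a small chain a -> b -> c
chunks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
edges = {"a": ["b"], "b": ["c"], "c": []}
hits = graph_enhanced_search([1.0, 0.1], chunks, edges, limit=1, max_depth=2)
```

With `max_depth=2` the best match `a` pulls in both `b` and `c`; lowering the depth to 1 would return only `b`. This is the trade-off the `max_traversal_depth` parameter controls: more context per hit at the cost of a larger traversal.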


In [None]:
# Generate the query embedding
embedding_service = HuggingFaceEmbeddingService()
query_text = "machine learning"
query_embedding = embedding_service.generate_embedding(query_text)

print(f"🔍 Query: '{query_text}'")
print(f"📊 Embedding: {len(query_embedding.vector)} dimensions")

# Generate the Cypher query for Graph-Enhanced Vector Search
query, params = AdvancedSearchPatterns.graph_enhanced_vector_search(
    query_text=query_text,
    query_vector=query_embedding.vector,
    limit=5,
    max_traversal_depth=2
)

print("\n📝 Generated Cypher query:")
print("=" * 80)
print(query[:500] + "...")
print("=" * 80)


In [None]:
# Run the query (requires entities in the graph)
driver = graph_session()
try:
    with driver.session() as session:
        # Pass parameters as a dict so no key can clash with run()'s own arguments
        result = session.run(query, params)
        records = list(result)

        if records:
            print(f"✅ Found {len(records)} results:\n")
            for i, record in enumerate(records[:3], 1):
                data = record["result"]
                # Fall back to an empty dict if the central chunk is missing or null
                central = data.get('central_chunk') or {}
                print(f"{i}. Central chunk:")
                print(f"   Score: {central.get('score', 'N/A')}")
                print(f"   Content: {central.get('content', '')[:150]}...")
                print(f"   Related chunks: {len(data.get('related_chunks', []))}")
                print()
        else:
            print("⚠️  No results found")
            print("   Note: this pattern requires extracted entities in the graph")
            print("   Run notebook 8 (Inference & Entity Extraction) first")
finally:
    driver.close()


## Part 2: Local Retriever

Search optimized for small, focused communities.
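
As a rough illustration of what "small, focused communities" means here, the sketch below treats the neighborhood around each matching chunk as a community and keeps only those with enough related chunks. It is an in-memory analogue under stated assumptions, not the Cypher that `local_retriever` generates: `local_communities`, `seeds`, and `edges` are hypothetical names, and the threshold is interpreted as a minimum number of related chunks.

```python
def local_communities(seeds, edges, community_threshold=3, max_depth=1):
    """Return seed neighborhoods that contain at least `community_threshold` related chunks.

    seeds: chunk ids matched by the text query (hypothetical input)
    edges: dict chunk_id -> list of directly related chunk_ids
    """
    communities = []
    for seed in seeds:
        frontier, members = {seed}, {seed}
        for _ in range(max_depth):
            # Expand one hop at a time, ignoring chunks already collected
            frontier = {n for c in frontier for n in edges.get(c, [])} - members
            members |= frontier
        related = sorted(members - {seed})
        if len(related) >= community_threshold:
            communities.append({
                "central": seed,
                "community_size": len(related),
                "members": related,
            })
    return communities


# Toy graph: "x" has three direct neighbors, "y" has only one
edges = {"x": ["a", "b", "c"], "y": ["a"], "a": ["d"]}
found = local_communities(["x", "y"], edges, community_threshold=3, max_depth=1)
```

Only `x` survives the threshold in this example. Raising `max_depth` widens each neighborhood, so more seeds pass the filter but the communities become less focused.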


In [None]:
# Generate the query for the Local Retriever
query_text = "deep learning"
query, params = AdvancedSearchPatterns.local_retriever(
    query_text=query_text,
    limit=5,
    community_threshold=3,  # At least 3 related chunks
    max_depth=1  # Relationship traversal depth
)

print(f"🔍 Query: '{query_text}'")
print("📊 Configuration: community_threshold=3, max_depth=1\n")

print("📝 Generated Cypher query:")
print("=" * 80)
print(query[:400] + "...")
print("=" * 80)


In [None]:
# Run the Local Retriever
driver = graph_session()
try:
    with driver.session() as session:
        # Pass parameters as a dict so no key can clash with run()'s own arguments
        result = session.run(query, params)
        records = list(result)

        if records:
            print(f"✅ Found {len(records)} local communities:\n")
            for i, record in enumerate(records[:3], 1):
                data = record["result"]
                print(f"{i}. Community:")
                print(f"   Size: {data.get('community_size', 0)} chunks")
                print(f"   Central score: {data.get('central_score', 'N/A')}")
                print(f"   Central content: {data.get('central_content', '')[:150]}...")
                print(f"   Community summary: {data.get('community_summary', '')[:200]}...")
                print()
        else:
            print("⚠️  No communities met the threshold")
finally:
    driver.close()


## Part 3: Community Summary Retriever (GDS)

Uses Graph Data Science to detect communities and generate summaries.
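
Once a community id has been written onto each chunk, building a per-community view is essentially a group-by. The sketch below shows that step in plain Python under stated assumptions: `summarize_communities` and the dict shape of `chunks` are hypothetical, and the "summary" is a naive concatenation of chunk text rather than whatever summarization Ungraph actually performs.

```python
from collections import defaultdict


def summarize_communities(chunks, min_community_size=5, limit=3, preview_chars=200):
    """Group chunks by community id and build a naive text summary per community.

    chunks: iterable of dicts with "community_id" and "content" keys
            (hypothetical shape; the real data lives on graph nodes)
    """
    groups = defaultdict(list)
    for chunk in chunks:
        groups[chunk["community_id"]].append(chunk["content"])

    # Keep only communities that are large enough, biggest first
    kept = [(cid, texts) for cid, texts in groups.items() if len(texts) >= min_community_size]
    kept.sort(key=lambda item: len(item[1]), reverse=True)

    return [
        {
            "community_id": cid,
            "community_size": len(texts),
            "summary": " ".join(texts)[:preview_chars],
        }
        for cid, texts in kept[:limit]
    ]


# Toy data: community 0 has three chunks, community 1 has a single chunk
chunks = [{"community_id": 0, "content": f"chunk {i}"} for i in range(3)]
chunks += [{"community_id": 1, "content": "lone chunk"}]
summaries = summarize_communities(chunks, min_community_size=2)
```

The single-chunk community is dropped by `min_community_size`, which is why this pattern suits broad questions better than narrow lookups.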


In [None]:
# Detect communities using GDS
try:
    gds_service = GDSService()

    print("🔍 Detecting communities with the Louvain algorithm...\n")
    stats = gds_service.detect_communities(
        graph_name="chunk-graph",
        algorithm="louvain",
        relationship_types=["NEXT_CHUNK", "MENTIONS"],
        write_property="community_id"
    )

    print("✅ Communities detected:")
    print(f"   Algorithm: {stats['algorithm']}")
    print(f"   Number of communities: {stats['community_count']}")
    print(f"   Iterations: {stats['iterations']}")
    print(f"   Converged: {stats['converged']}")
    print(f"   Written property: {stats['write_property']}")

except Exception as e:
    print(f"❌ Error detecting communities: {e}")
    print("   Check that GDS is installed and configured")


In [None]:
# Generate the query for the Community Summary Retriever
query_text = "neural networks"
query, params = AdvancedSearchPatterns.community_summary_retriever_gds(
    query_text=query_text,
    limit=3,
    min_community_size=5,
    algorithm="louvain"
)

print(f"🔍 Query: '{query_text}'")
print("📝 Cypher query for Community Summary (requires detected communities):")
print("=" * 80)
print(query[:500] + "...")
print("=" * 80)


## Comparison of Advanced Patterns

| Pattern | Requirements | Speed | Precision | Recommended Use |
|---------|--------------|-------|-----------|-----------------|
| Graph-Enhanced Vector | Entities in the graph | ⚡ | ⭐⭐⭐⭐ | Semantic searches with structural context |
| Local Retriever | None | ⚡⚡ | ⭐⭐⭐ | Focused exploration, small communities |
| Community Summary (GDS) | GDS plugin | ⚡ | ⭐⭐ | Broad summaries, community-level context |

## Best Practices

1. **Graph-Enhanced Vector**: Requires extracted entities. Use it after running inference.
2. **Local Retriever**: Does not require GDS. Good for quick exploration.
3. **Community Summary**: Requires GDS and prior community detection. Best for broad analysis.
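
These rules of thumb can be turned into a small selection helper. The function below is a hypothetical sketch, not part of Ungraph: `choose_pattern` and its three flags are assumptions made for illustration, and the returned strings simply reuse the `AdvancedSearchPatterns` method names used earlier in this notebook.

```python
def choose_pattern(has_entities: bool, gds_available: bool, broad_question: bool) -> str:
    """Pick a search pattern name from what the graph and environment support."""
    # Broad, summary-style questions benefit from communities, but only if GDS can run
    if broad_question and gds_available:
        return "community_summary_retriever_gds"
    # Entity-aware vector search needs entities extracted by a prior inference step
    if has_entities:
        return "graph_enhanced_vector_search"
    # Fallback with no extra requirements
    return "local_retriever"


choice = choose_pattern(has_entities=False, gds_available=False, broad_question=True)
```

In this example the helper falls back to `local_retriever`, because neither GDS nor extracted entities are available even though the question is broad.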

## References

- [Advanced Patterns](../../docs/api/advanced-search-patterns.md)
- [GraphRAG Patterns](../../docs/api/search-patterns.md)
- [Neo4j GDS Documentation](https://neo4j.com/docs/graph-data-science/)
