# BioLink RAG Pipeline (SQL Server → pgvector → Ollama)
This notebook validates the end-to-end RAG pipeline and demonstrates hybrid retrieval.

## 1) Environment Check
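Ensure SQL Server, pgvector, and Ollama are running. The cell below lists any required environment variables that are unset; an empty list means the configuration is complete.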

In [None]:
import os

required = [
    'SQLSERVER_HOST','SQLSERVER_PORT','SQLSERVER_DB','SQLSERVER_USER','SQLSERVER_PASSWORD',
    'RAG_PG_URL','OLLAMA_BASE_URL','RAG_EMBEDDING_MODEL'
]
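# Collect every required variable that is unset or empty.
missing = [k for k in required if not os.getenv(k)]
missing  # an empty list means all variables are set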
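
The variable check above confirms configuration only, not that the services are actually listening. A minimal sketch of a TCP reachability probe follows; the helper names `is_port_open` and `check_services` are hypothetical and not part of the `app.scripts` package, and a successful TCP connect does not prove the service is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_services(targets: dict) -> dict:
    """Map each service name to its reachability.

    `targets` is a hypothetical mapping of name -> (host, port).
    """
    return {name: is_port_open(host, port) for name, (host, port) in targets.items()}
```

For example, `check_services({'sqlserver': (os.getenv('SQLSERVER_HOST'), int(os.getenv('SQLSERVER_PORT')))})` would probe SQL Server once the variables are set.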

## 2) Stage 1: SQL Server Smoke Test
Verifies the schema and reads sample rows from the EHVol registry.

In [None]:
!python -m app.scripts.sqlserver_smoke_test
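
For reference, the `SQLSERVER_*` variables from the environment check could be combined into an ODBC connection string roughly as sketched here. This is an assumption about how the smoke test connects, not a copy of it: the helper `build_odbc_conn_str` is hypothetical, the driver name assumes Microsoft's ODBC Driver 18 is installed, and passwords containing `;` or braces would need extra escaping.

```python
def build_odbc_conn_str(env, driver="ODBC Driver 18 for SQL Server"):
    """Build an ODBC connection string from a mapping of SQLSERVER_* settings.

    `env` can be os.environ or any dict with the same keys.
    """
    return (
        f"DRIVER={{{driver}}};"
        # SQL Server ODBC drivers expect host and port separated by a comma.
        f"SERVER={env['SQLSERVER_HOST']},{env['SQLSERVER_PORT']};"
        f"DATABASE={env['SQLSERVER_DB']};"
        f"UID={env['SQLSERVER_USER']};"
        f"PWD={env['SQLSERVER_PASSWORD']}"
    )
```

The resulting string could be passed to a client such as `pyodbc.connect(...)`, assuming that library is what the project uses.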

## 3) Stage 2: pgvector Setup
Creates the `vector` extension, the table, and the index.

In [None]:
!python -m app.scripts.stage2_pgvector_setup
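
The three setup steps might correspond to statements like the ones sketched below. The table name `rag_chunks`, its columns, the dimension 768, and the choice of an HNSW cosine index (available in pgvector 0.5.0 and later) are all assumptions for illustration; the actual script may differ.

```python
def build_setup_statements(table="rag_chunks", dim=768):
    """Return the SQL statements for a minimal pgvector setup (hypothetical schema)."""
    if not table.isidentifier():
        # Table names cannot be bound as parameters, so validate before interpolating.
        raise ValueError(f"unsafe table name: {table!r}")
    if dim <= 0:
        raise ValueError("dim must be positive")
    return [
        # 1. Enable pgvector in the current database.
        "CREATE EXTENSION IF NOT EXISTS vector",
        # 2. One row per chunk; the unique pair allows idempotent upserts.
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "id BIGSERIAL PRIMARY KEY, "
        "source_id TEXT NOT NULL, "
        "chunk_index INT NOT NULL, "
        "content TEXT NOT NULL, "
        f"embedding vector({dim}) NOT NULL, "
        "UNIQUE (source_id, chunk_index))",
        # 3. Approximate nearest-neighbour index using cosine distance.
        f"CREATE INDEX IF NOT EXISTS {table}_embedding_idx "
        f"ON {table} USING hnsw (embedding vector_cosine_ops)",
    ]
```

The `dim` value has to match the output size of the model named in `RAG_EMBEDDING_MODEL`, otherwise inserts will fail.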

## 4) Stage 3: Embed Sample Notes
Fetches a few notes, splits them into chunks, embeds each chunk, and upserts the results into pgvector.

In [None]:
!python -m app.scripts.stage3_embed_sample
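
The chunking step can be illustrated with a fixed-size sliding window. This is a sketch under assumptions: `chunk_text` is a hypothetical helper, it splits by character count rather than tokens or sentences, and the default sizes are arbitrary; the stage 3 script may chunk differently.

```python
def chunk_text(text, size=200, overlap=50):
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window reaches the end, so no redundant tail chunk is emitted.
        if start + size >= len(text):
            break
    return chunks


print(chunk_text("abcdefghij", size=4, overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

Overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.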

## 5) Stage 5: RAG Query Test
Runs a vector search with optional SQL filters.

In [None]:
!python -m app.scripts.stage5_rag_query
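
A vector search with optional SQL filters can be expressed as one parameterized query, sketched below. The helper `build_search_query`, the table `rag_chunks`, and its columns are hypothetical; `<=>` is pgvector's cosine distance operator, and the `%s` placeholders assume a psycopg-style driver.

```python
def build_search_query(query_vec, filters=None, table="rag_chunks", top_k=5):
    """Return (sql, params) for a nearest-neighbour search with equality filters."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    # pgvector accepts a text literal of the form '[x1,x2,...]' cast to vector.
    vec_literal = "[" + ",".join(str(float(x)) for x in query_vec) + "]"
    where, params = [], []
    for col, val in (filters or {}).items():
        if not col.isidentifier():
            # Column names are interpolated, so reject anything unusual.
            raise ValueError(f"unsafe column name: {col!r}")
        where.append(f"{col} = %s")
        params.append(val)
    where_sql = f" WHERE {' AND '.join(where)}" if where else ""
    sql = (
        "SELECT source_id, content, embedding <=> %s::vector AS distance "
        f"FROM {table}{where_sql} ORDER BY distance LIMIT %s"
    )
    # Parameter order follows placeholder order: vector, filter values, limit.
    return sql, [vec_literal, *params, top_k]
```

Filtering in the same statement lets PostgreSQL apply the structured conditions and the similarity ranking together, rather than post-filtering results in Python.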