feat(rag): implement PRP-9 RAG Knowledge Base with pgvector #49

Merged
w7-mgfcode merged 5 commits into dev from feat/prp-9-rag-knowledge-base on Feb 1, 2026

w7-mgfcode (Owner) commented Feb 1, 2026

Summary

  • Implement RAG (Retrieval-Augmented Generation) knowledge base feature for semantic document indexing and retrieval
  • Add pgvector-based vector storage in PostgreSQL with HNSW index for fast similarity search
  • Support markdown-aware and OpenAPI-aware document chunking with tiktoken token counting
  • Enable idempotent re-indexing via SHA-256 content hash comparison
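
The idempotent re-indexing in the last bullet can be pictured roughly as follows. This is a minimal sketch, not the code from this PR: `content_hash`, `plan_index_action`, the in-memory `stored_hashes` dict (standing in for a database lookup), and the status labels are all hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return the SHA-256 hex digest of the document text (UTF-8 encoded)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_index_action(source_path: str, text: str, stored_hashes: dict) -> str:
    """Decide what an index request should do for one source.

    stored_hashes maps source path -> hash recorded at the last indexing run.
    It is a hypothetical stand-in for the persisted source record.
    """
    new_hash = content_hash(text)
    old_hash = stored_hashes.get(source_path)
    if old_hash == new_hash:
        # Identical content: skip chunking and embedding entirely.
        return "unchanged"
    # Either a new source or changed content: (re-)chunk and (re-)embed.
    action = "created" if old_hash is None else "updated"
    stored_hashes[source_path] = new_hash
    return action
```

Comparing hashes before doing any work means a repeated index call for unchanged content costs one hash computation and no embedding requests.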

Changes

New RAG Feature (app/features/rag/)

  • models.py: ORM models DocumentSource and DocumentChunk with pgvector Vector column
  • schemas.py: Pydantic v2 schemas with strict validation for all API contracts
  • chunkers.py: MarkdownChunker (heading-aware) and OpenAPIChunker (endpoint-based)
  • embeddings.py: EmbeddingService with async OpenAI API, batch processing, rate limit handling
  • service.py: RAGService orchestrating indexing, retrieval, and source management
  • routes.py: FastAPI endpoints for RAG operations
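
The batch processing and rate limit handling mentioned for embeddings.py might look something like the sketch below. It is an illustration under assumptions, not the PR's implementation: `embed_in_batches`, the `RateLimited` exception, and the `embed_fn` callable are hypothetical, and a real client would raise its own rate-limit error type.

```python
import asyncio


class RateLimited(Exception):
    """Hypothetical error standing in for a provider's HTTP 429 response."""


async def embed_in_batches(texts, embed_fn, batch_size=100, max_retries=3, base_delay=0.5):
    """Embed texts in fixed-size batches, retrying a batch with exponential backoff.

    embed_fn is any async callable that maps a list of strings to a list of vectors.
    """
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for attempt in range(max_retries + 1):
            try:
                vectors.extend(await embed_fn(batch))
                break  # batch succeeded, move on to the next one
            except RateLimited:
                if attempt == max_retries:
                    raise  # give up after the final retry
                # Wait base_delay, 2*base_delay, 4*base_delay, ... before retrying.
                await asyncio.sleep(base_delay * 2 ** attempt)
    return vectors
```

Only the failed batch is retried, so earlier batches are not re-sent after a rate-limit response.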

API Endpoints

  • POST /rag/index - Index documents with automatic chunking and embedding
  • POST /rag/retrieve - Semantic search with cosine similarity and relevance scoring
  • GET /rag/sources - List indexed sources with chunk statistics
  • DELETE /rag/sources/{source_id} - Remove source and cascade delete chunks
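
To make the retrieval step concrete, here is a brute-force sketch of cosine-similarity search with a relevance threshold. It is an assumption-laden illustration: `cosine_similarity`, `retrieve`, and the default `top_k` and `threshold` values are hypothetical, and the real endpoint delegates the search to pgvector rather than scanning in Python. In pgvector the `<=>` operator returns cosine distance, so similarity is `1 - distance`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, top_k=3, threshold=0.7):
    """Return up to top_k (chunk_id, score) pairs with score >= threshold, best first.

    chunks is a list of (chunk_id, embedding) pairs.
    """
    scored = [(chunk_id, cosine_similarity(query_vec, emb)) for chunk_id, emb in chunks]
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]
```

Applying the threshold before truncating to `top_k` means a query with no sufficiently similar chunks returns an empty list instead of weak matches.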

Configuration

  • Added RAG settings to app/core/config.py (embedding, chunking, retrieval, index config)
  • Updated .env.example with RAG environment variables
  • Added dependencies: pgvector, openai, tiktoken, httpx

Database

  • Migration b4c8d9e0f123 creates pgvector extension and RAG tables with HNSW index
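
For readers unfamiliar with pgvector, the statements such a migration typically issues can be sketched as below. This is not the content of migration b4c8d9e0f123: the helper `hnsw_migration_sql`, the column name `embedding`, and the index name are hypothetical, while `m = 16` and `ef_construction = 64` are pgvector's documented defaults for HNSW indexes.

```python
def hnsw_migration_sql(table="document_chunk", column="embedding", m=16, ef_construction=64):
    """Build the SQL statements for enabling pgvector and adding a cosine HNSW index.

    The column and index names are assumptions for illustration only.
    """
    index_name = f"ix_{table}_{column}_hnsw"
    return [
        # The extension must exist before any vector column or index is created.
        "CREATE EXTENSION IF NOT EXISTS vector",
        # vector_cosine_ops makes the index serve cosine-distance (<=>) queries.
        f"CREATE INDEX IF NOT EXISTS {index_name} ON {table} "
        f"USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction})",
    ]
```

In an Alembic migration, each string would usually be passed to `op.execute(...)` inside `upgrade()`.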

Tests

  • 68 unit tests covering schemas, chunkers, embeddings, and service
  • 14 integration tests for API routes (require PostgreSQL with pgvector)

Test plan

  • Run uv run pytest app/features/rag/tests/ -v -m "not integration" - 68 unit tests pass
  • Run uv run mypy app/features/rag/ - 0 issues
  • Run uv run pyright app/features/rag/ - 0 errors
  • Run uv run ruff check app/features/rag/ - All checks passed
  • Run integration tests with PostgreSQL + pgvector: uv run pytest app/features/rag/tests/ -v -m integration
  • Apply migration and smoke test endpoints manually

🤖 Generated with Claude Code

Add RAG (Retrieval-Augmented Generation) knowledge base feature for
semantic document indexing and retrieval using PostgreSQL pgvector.

Key components:
- Document indexing with markdown-aware and OpenAPI-aware chunking
- Semantic retrieval using cosine similarity with configurable thresholds
- Idempotent re-indexing via SHA-256 content hash comparison
- OpenAI text-embedding-3-small for embeddings (1536 dimensions)
- HNSW index for fast approximate nearest neighbor search
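
The markdown-aware chunking listed above can be illustrated with a small heading-based splitter. This is a sketch under assumptions, not the PR's MarkdownChunker: `Chunk`, `chunk_markdown`, and the `heading_path` field are hypothetical, and token-based size limits are left out for brevity.

```python
import re
from dataclasses import dataclass

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


@dataclass
class Chunk:
    heading_path: list  # enclosing headings, outermost first
    text: str


def chunk_markdown(markdown: str) -> list:
    """Split markdown into one chunk per heading section, tracking the heading path."""
    chunks = []
    path = []
    lines = []
    in_fence = False

    def flush():
        body = "\n".join(lines).strip()
        if body:
            chunks.append(Chunk(heading_path=list(path), text=body))
        lines.clear()

    for line in markdown.splitlines():
        # Lines inside fenced code blocks are never treated as headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level, then push this one.
            del path[level - 1:]
            path.append(match.group(2).strip())
        else:
            lines.append(line)
    flush()
    return chunks
```

Keeping the heading path with each chunk lets a retrieval result show where in the document the text came from.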

API endpoints:
- POST /rag/index - Index documents with automatic chunking
- POST /rag/retrieve - Semantic search with relevance scoring
- GET /rag/sources - List indexed sources with statistics
- DELETE /rag/sources/{source_id} - Remove source and chunks

Includes:
- ORM models: DocumentSource, DocumentChunk with Vector column
- Pydantic v2 schemas with strict validation
- 68 unit tests + 14 integration tests
- Migration for pgvector extension and RAG tables
- Examples and environment configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

sourcery-ai Bot left a comment

Sorry @w7-mgfcode, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

coderabbitai Bot commented Feb 1, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

socket-security Bot commented Feb 1, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

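| Diff  | Package        | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
|-------|----------------|-----------------------|---------------|---------|-------------|---------|
| Added | openai@2.16.0  | 96                    | 100           | 100     | 100         | 100     |
| Added | pgvector@0.4.2 | 100                   | 100           | 100     | 100         | 100     |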

w7-mgfcode (Owner, Author) commented

Verification Complete ✅

Migration Applied

  • b4c8d9e0f123_create_rag_tables.py successfully applied
  • Created pgvector extension, document_source and document_chunk tables with HNSW index

Integration Tests

All 14 integration tests passed:

app/features/rag/tests/test_routes.py::TestIndexEndpoint::test_index_markdown_creates_chunks PASSED
app/features/rag/tests/test_routes.py::TestIndexEndpoint::test_index_same_content_returns_unchanged PASSED
app/features/rag/tests/test_routes.py::TestIndexEndpoint::test_index_updated_content_re_indexes PASSED
app/features/rag/tests/test_routes.py::TestIndexEndpoint::test_index_invalid_source_type PASSED
app/features/rag/tests/test_routes.py::TestIndexEndpoint::test_index_file_not_found PASSED
app/features/rag/tests/test_routes.py::TestRetrieveEndpoint::test_retrieve_returns_relevant_chunks PASSED
app/features/rag/tests/test_routes.py::TestRetrieveEndpoint::test_retrieve_respects_threshold PASSED
app/features/rag/tests/test_routes.py::TestRetrieveEndpoint::test_retrieve_empty_database PASSED
app/features/rag/tests/test_routes.py::TestRetrieveEndpoint::test_retrieve_validates_query PASSED
app/features/rag/tests/test_routes.py::TestSourcesEndpoint::test_list_sources_returns_all PASSED
app/features/rag/tests/test_routes.py::TestSourcesEndpoint::test_delete_source_removes_chunks PASSED
app/features/rag/tests/test_routes.py::TestSourcesEndpoint::test_delete_nonexistent_returns_404 PASSED
app/features/rag/tests/test_routes.py::TestSourcesEndpoint::test_source_not_in_list_after_delete PASSED
app/features/rag/tests/test_routes.py::TestOpenAPIIndexing::test_index_openapi_creates_endpoint_chunks PASSED

Smoke Test Results

All endpoints respond correctly at http://localhost:8123/rag/*:

  • GET /rag/sources → Returns source list
  • POST /rag/index → Indexes documents (requires OPENAI_API_KEY)
  • POST /rag/retrieve → Retrieves relevant chunks (requires OPENAI_API_KEY)
  • DELETE /rag/sources/{source_id} → Returns 404 for a nonexistent source
  • ✅ Validation errors return RFC 7807 error responses

Ready for review.

w7-learn and others added 4 commits February 1, 2026 14:04

feat(rag): add Ollama embedding provider with OpenAI-compatible API

- Add EmbeddingProvider abstract base class with provider pattern
- Refactor existing OpenAI code to OpenAIEmbeddingProvider
- Add OllamaEmbeddingProvider using /v1/embeddings endpoint
  - Supports configurable dimensions parameter
  - Uses OpenAI-compatible response format
- Add config settings: rag_embedding_provider, ollama_base_url, ollama_embedding_model
- Add migration for dynamic embedding dimension support
- Update tests for both providers (25 tests)

Enables local/LAN embedding generation without OpenAI API dependency.
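
The provider pattern described in this commit could be sketched as follows. The class names and the `/v1/embeddings` path come from the commit message; everything else is an assumption for illustration: the method names `request_url` and `request_body`, the `make_provider` factory, the default Ollama URL, and the example model name `nomic-embed-text`. No network call is made here.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


class EmbeddingProvider(ABC):
    """Common interface; the method names are illustrative, not the PR's."""

    @abstractmethod
    def request_url(self) -> str: ...

    @abstractmethod
    def request_body(self, texts: list) -> dict: ...


@dataclass
class OpenAIEmbeddingProvider(EmbeddingProvider):
    model: str = "text-embedding-3-small"

    def request_url(self) -> str:
        return "https://api.openai.com/v1/embeddings"

    def request_body(self, texts: list) -> dict:
        return {"model": self.model, "input": texts}


@dataclass
class OllamaEmbeddingProvider(EmbeddingProvider):
    base_url: str = "http://localhost:11434"  # assumed local default
    model: str = "nomic-embed-text"           # example model, an assumption
    dimensions: Optional[int] = None

    def request_url(self) -> str:
        return f"{self.base_url.rstrip('/')}/v1/embeddings"

    def request_body(self, texts: list) -> dict:
        body = {"model": self.model, "input": texts}
        if self.dimensions is not None:
            body["dimensions"] = self.dimensions
        return body


def make_provider(name: str, **kwargs) -> EmbeddingProvider:
    """Pick a provider by the value a setting such as rag_embedding_provider might hold."""
    providers = {"openai": OpenAIEmbeddingProvider, "ollama": OllamaEmbeddingProvider}
    if name not in providers:
        raise ValueError(f"unknown embedding provider: {name!r}")
    return providers[name](**kwargs)
```

Because both providers use the same response format, the calling code only needs the shared interface and never branches on which backend is configured.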

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

docs: add Ollama embedding provider documentation

- Update .env.example with Ollama configuration options
- Add RAG Knowledge Base section to README with:
  - Embedding provider options (OpenAI/Ollama)
  - Example index and retrieve requests
  - Configuration examples for both providers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

docs: add Phase 8 RAG Knowledge Base documentation

- Create docs/PHASE/8-RAG_KNOWLEDGE_BASE.md with full phase details
- Update docs/PHASE-index.md:
  - Mark Phase 8 as Completed in overview table
  - Add Phase 8 summary to Completed Phases section
  - Add entry to Version History

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix(ci): add RAG models import to alembic env and format tests

- Add rag models import to alembic/env.py for schema validation
- Format test_embeddings.py to pass ruff format check

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
w7-mgfcode merged commit 66ca309 into dev on Feb 1, 2026
10 checks passed
w7-mgfcode deleted the feat/prp-9-rag-knowledge-base branch on February 1, 2026 14:21

w7-mgfcode added a commit that referenced this pull request on Feb 1, 2026
* feat(registry): implement model registry for run tracking and deployments (#36)

* docs: expand INITIAL-7 with lifecycle, lineage, and artifact integrity details

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(registry): implement model registry for run tracking and deployments

Add model registry feature (PRP-7) with:

- ORM models: ModelRun with JSONB columns (model_config, metrics, runtime_info),
  DeploymentAlias for mutable deployment pointers
- Storage: LocalFSProvider with SHA-256 integrity verification and path
  traversal prevention, abstract interface for future S3/GCS support
- Service: RegistryService with state machine validation, duplicate
  detection, config hashing, and run comparison
- API endpoints: CRUD for runs and aliases, artifact verification,
  run comparison with config/metrics diffs
- Database: Alembic migration with GIN indexes for JSONB containment queries
- Tests: 103 unit tests (schemas, storage, service) + 24 integration tests
- Example: registry_demo.py demonstrating full workflow

Run lifecycle: PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED
Aliases can only point to SUCCESS runs for deployment safety.
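
The lifecycle and alias rule above amount to a small state machine. The sketch below is illustrative only: `RunStatus`, `ALLOWED_TRANSITIONS`, `transition`, and `can_receive_alias` are hypothetical names, and it assumes that only the arrows shown in the lifecycle line are legal.

```python
from enum import Enum


class RunStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
    ARCHIVED = "archived"


# Assumption: only the transitions drawn in the lifecycle line are permitted.
ALLOWED_TRANSITIONS = {
    RunStatus.PENDING: {RunStatus.RUNNING},
    RunStatus.RUNNING: {RunStatus.SUCCESS, RunStatus.FAILED},
    RunStatus.SUCCESS: {RunStatus.ARCHIVED},
    RunStatus.FAILED: {RunStatus.ARCHIVED},
    RunStatus.ARCHIVED: set(),
}


def transition(current: RunStatus, target: RunStatus) -> RunStatus:
    """Return target if the move is legal, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


def can_receive_alias(status: RunStatus) -> bool:
    """Deployment aliases may only point at successful runs."""
    return status is RunStatus.SUCCESS
```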

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update documentation for model registry implementation

- README.md: Add registry to project structure, API endpoints section,
  and example reference
- docs/ARCHITECTURE.md: Update section 7.6 with full implementation
  details, add registry endpoints to section 8, mark Phase 1 complete
- docs/PHASE-index.md: Mark phases 4-6 as completed, add detailed
  completion entries for Forecasting, Backtesting, and Registry

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add PHASE documentation for forecasting, backtesting, and registry

Create missing phase documentation files to complete the project's
implementation records:

- 4-FORECASTING.md: Model zoo with BaseForecaster interface, train/predict
  endpoints, and joblib persistence
- 5-BACKTESTING.md: Time-series CV with expanding/sliding strategies,
  metrics calculation, and baseline comparisons
- 6-MODEL_REGISTRY.md: Run tracking with state machine, deployment aliases,
  and SHA-256 artifact integrity verification

Update PHASE-index.md to link to the new documentation files.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(registry): resolve type checking issues with Pydantic model_config alias

- Add pydantic.mypy plugin to pyproject.toml for proper Pydantic type checking
- Use model_config_data instead of model_config alias in tests to avoid collision
  with Pydantic's reserved model_config attribute
- Update _model_to_response to use model_validate() for proper alias handling
- Change docker-compose postgres port to 5433 to avoid conflicts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: resolve CI failures for registry PR

- Import registry models in alembic/env.py for schema validation
- Fix import order and remove extraneous f-strings in registry_demo.py
- Add type: ignore comments for frozen model tests with pydantic.mypy plugin

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: prevent db_session fixtures from dropping all tables

The data_platform and root conftest.py db_session fixtures were dropping
all tables after each test, causing subsequent integration tests to fail
when they couldn't find migrated tables.

Changes:
- Remove Base.metadata.drop_all from db_session fixtures
- Tests now rely on migrations for table creation
- Each test just rolls back its own changes

Also fixes ruff format issue in examples/registry_demo.py.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: add proper test data cleanup to db_session fixtures

Update data_platform and ingest test fixtures to clean up test data
explicitly instead of dropping all tables or just rolling back.

- data_platform: delete test stores, products, calendar entries
- ingest: delete test stores, products, sales, calendar entries

This ensures test isolation while preserving migrated tables.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: use separate session for test cleanup to avoid transaction issues

When tests cause integrity errors, the session enters a failed state.
Use a fresh session for cleanup to avoid PendingRollbackError.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: use contextlib.suppress instead of try-except-pass

Replace try-except-pass patterns with contextlib.suppress to satisfy
ruff S110 linting rule.
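
As a small illustration of that replacement, the helper below uses `contextlib.suppress` where a `try`/`except`/`pass` block would otherwise appear. The function `remove_if_exists` is a hypothetical example, not code from the fixtures.

```python
import contextlib
import os


def remove_if_exists(path: str) -> None:
    """Delete a file, ignoring the case where it is already gone."""
    # Equivalent to: try: os.remove(path) / except FileNotFoundError: pass
    with contextlib.suppress(FileNotFoundError):
        os.remove(path)
```

Only the named exception type is swallowed; any other error, such as a permission problem, still propagates.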

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* fix: code improvements and documentation fixes

- Add date range filter to SalesDaily cleanup in ingest tests
- Enforce artifact_hash presence before verification in registry routes
- Compute SHA256 from saved file instead of source in storage
- Fix override_get_db to mirror production transaction semantics
- Filter DeploymentAlias cleanup to only test runs
- Update database port to 5433 in config and .env.example
- Add language identifiers to fenced code blocks (MD040)
- Fix table formatting for markdownlint MD060
- Update PR reference in PHASE/6-MODEL_REGISTRY.md
- Convert bare URLs to markdown links in INITIAL-7.md
- Wrap __init__.py in backticks in PRP-7

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* sync: update dev from phase-6 (#40)

* chore: release v0.2.0 (#37)

* feat(registry): implement model registry for run tracking and deployments (#36)

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* chore(main): release 0.2.0 (#38)

Release-As: 0.2.0

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* chore(main): release 0.2.0 (#39)

* chore(main): release 0.2.0

* chore: trigger CI

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gabe@w7dev <gabor@w7-7.net>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* feat(serving-layer): implement PRP-8 agent-first API design (#42)

* docs(initial-8): expand serving layer requirements

Add specifications for job-driven orchestration, dimension discovery
endpoints, standardized API protocols (filtering/pagination), and
agent-first API design patterns for LLM tool-calling optimization.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp-8): add serving layer implementation spec

Comprehensive PRP for FastAPI serving layer including:
- Dimensions module for store/product discovery endpoints
- Analytics module for KPI/drilldown queries
- Jobs module for async-ready task orchestration
- RFC 7807 problem details for semantic error responses
- OpenAPI export optimization for LLM tool-calling

26 tasks with validation gates and 8.5/10 confidence score.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(serving-layer): implement PRP-8 agent-first API design

Add RFC 7807 Problem Details for semantic error responses:
- ProblemDetail schema with type URIs and error codes
- application/problem+json content type
- Validation exception handler with field-level errors
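
An RFC 7807 response of the kind listed above might be assembled like this. The sketch uses a standard-library dataclass instead of the project's Pydantic schema, and the `validation_problem` helper, the example type URI, and the `errors` extension member are assumptions. The `type`, `title`, `status`, `detail`, and `instance` members and the `application/problem+json` media type are defined by the RFC.

```python
import json
from dataclasses import asdict, dataclass, field

PROBLEM_JSON = "application/problem+json"


@dataclass
class ProblemDetail:
    type: str
    title: str
    status: int
    detail: str = ""
    instance: str = ""
    # Extension member (allowed by RFC 7807) carrying field-level errors.
    errors: list = field(default_factory=list)


def validation_problem(request_path: str, field_errors: dict):
    """Build (status, headers, body) for a validation failure.

    field_errors maps a field name to a human-readable message.
    """
    problem = ProblemDetail(
        type="https://example.com/problems/validation-error",  # placeholder URI
        title="Validation Error",
        status=422,
        detail=f"{len(field_errors)} field(s) failed validation",
        instance=request_path,
        errors=[{"field": name, "message": msg} for name, msg in sorted(field_errors.items())],
    )
    return 422, {"Content-Type": PROBLEM_JSON}, json.dumps(asdict(problem))
```

A stable `type` URI gives a client, or an LLM tool caller, something to branch on without parsing the human-readable `detail` text.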

Add dimensions module for store/product discovery:
- GET /dimensions/stores with pagination, filtering, search
- GET /dimensions/products with pagination, filtering, search
- LLM-optimized Field descriptions for tool-calling

Add analytics module for KPI aggregations:
- GET /analytics/kpis with date range and dimension filters
- GET /analytics/drilldowns for store/product/category/region/date
- Revenue share and ranking calculations

Add jobs module for async-ready task orchestration:
- POST /jobs for train/predict/backtest operations
- Job model with JSONB params/results
- Status transitions: pending → running → completed/failed

Integration:
- New settings: analytics_max_rows, jobs_retention_days
- Register routers in main.py
- Alembic migration for jobs table

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update documentation for PRP-8 serving layer

Update README.md:
- Add dimensions, analytics, jobs modules to project structure
- Document new API endpoints with examples
- Add RFC 7807 error response documentation

Update docs/ARCHITECTURE.md:
- Mark serving layer section as implemented
- Add configuration settings for new modules
- Update roadmap with Phase-2 completion

Update docs/PHASE-index.md:
- Add Phase 7 (Serving Layer) as completed
- Update phase overview table
- Add version history entry

Create docs/PHASE/7-SERVING_LAYER.md:
- Comprehensive phase documentation
- API endpoint specifications
- Database schema and migration details
- Usage examples and test coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: fix ruff formatting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* fix(serving-layer): improve analytics validation and jobs run_id handling

- Add validate_date_range helper to analytics routes for reusable date validation
- Apply date range validation to both get_kpis and get_drilldowns endpoints
- Fix total_revenue_all calculation to use full dataset before limiting
- Add run_id to train job result for downstream predict jobs
- Fix predict job to resolve run_id to model metadata from bundle
- Update test fixtures to use 32-char hex IDs per schema requirements
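
A reusable date-range check of the kind mentioned in the first two bullets could be as small as the sketch below. The helper name comes from the commit message; its signature and error text are assumptions.

```python
from datetime import date


def validate_date_range(start_date: date, end_date: date) -> None:
    """Raise ValueError when the range is inverted; a single-day range is allowed."""
    if start_date > end_date:
        raise ValueError(
            f"start_date ({start_date.isoformat()}) must not be after "
            f"end_date ({end_date.isoformat()})"
        )
```

In a route handler, the `ValueError` would typically be translated into a 4xx problem response rather than allowed to surface as a server error.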

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: format jobs service

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: restructure roadmap into modular three-phase architecture (INITIAL-9/10/11) (#47)

* docs: restructure INITIAL-9 into modular three-phase roadmap

Decompose monolithic INITIAL-9 into three specialized technical phases:

- INITIAL-9: RAG Knowledge Base ("The Memory")
  - pgvector + OpenAI embeddings
  - Markdown/OpenAPI-aware chunking
  - Semantic retrieval endpoints

- INITIAL-10: Agentic Layer ("The Brain")
  - PydanticAI agents (Experiment Orchestrator, RAG Assistant)
  - Tool orchestration with structured outputs
  - Human-in-the-loop approval workflow

- INITIAL-11: ForecastLab Dashboard ("The Face")
  - React 19 + Vite + shadcn/ui
  - TanStack Table/Query for data management
  - Recharts for time series visualization
  - Agent chat interface with streaming

Update PHASE-index.md and DAILY-FLOW.md to align with new structure.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-9 RAG Knowledge Base implementation plan

Comprehensive PRP for INITIAL-9 RAG Knowledge Base feature:

- pgvector + SQLAlchemy 2.0 integration patterns
- Markdown-aware and OpenAPI-aware chunking
- Async OpenAI embeddings with batch processing
- HNSW index for cosine similarity search
- 15 ordered implementation tasks
- 5-level validation loop (syntax → types → unit → integration → smoke)
- Full ORM models and Pydantic schemas
- Known gotchas and anti-patterns documented

Confidence score: 8.5/10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-10 Agentic Layer implementation plan

Comprehensive PRP for INITIAL-10 Agentic Layer feature:

- PydanticAI agent framework integration
- Experiment Orchestrator Agent (backtest → compare → deploy)
- RAG Assistant Agent (query → retrieve → answer with citations)
- Human-in-the-loop approval workflow for sensitive actions
- WebSocket streaming for real-time token delivery
- Session persistence with JSONB message history
- 17 ordered implementation tasks
- Tool definitions for registry, backtesting, forecasting, RAG
- Full Pydantic schemas and ORM models

Confidence score: 7.5/10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-11 ForecastLab Dashboard implementation plan (#48)

Comprehensive PRP for INITIAL-11 (The Face) with:
- 24 implementation tasks across 6 phases
- React 19 + Vite + shadcn/ui + TanStack Table/Query
- TypeScript types matching all backend API schemas
- Reusable DataTable with server-side pagination
- TimeSeriesChart component with Recharts
- WebSocket hook for agent chat streaming
- Complete documentation links and gotchas

Confidence score: 7.5/10 (chat depends on INITIAL-10)

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* feat(rag): implement PRP-9 RAG Knowledge Base with pgvector (#49)

* feat(rag): implement PRP-9 RAG Knowledge Base with pgvector

* feat(rag): add Ollama embedding provider with OpenAI-compatible API

* docs: add Ollama embedding provider documentation

* docs: add Phase 8 RAG Knowledge Base documentation

* fix(ci): add RAG models import to alembic env and format tests

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address code review issues for RAG module and docs

- Make migration deterministic by hardcoding dimension values instead
  of reading from environment (alembic migration)
- Add pyyaml dependency for YAML parsing in OpenAPI chunker
- Fix token count logging to capture original count before truncation
- Add path traversal protection to RAG service _read_content_from_path
  (mirrors registry/storage.py pattern)
- Fix markdown linting issues:
  - Add language tags to fenced code blocks (MD040)
  - Fix table pipe spacing (MD060)
- Fix index_docs.py to treat 200 same as 201 for idempotent responses
- Add test for path traversal protection
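
The path traversal protection mentioned in this list can be sketched as a resolve-then-check helper. This is not the code of `_read_content_from_path`: the function name `resolve_within` and its error message are hypothetical, and `Path.is_relative_to` requires Python 3.9 or newer.

```python
from pathlib import Path


def resolve_within(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path replaces base entirely, and ".."
    # segments are collapsed by resolve(), so both cases are caught below.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Checking the resolved path, not the raw string, also covers symlinks and mixed `..` sequences that a simple substring test would miss.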

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
w7-mgfcode added a commit that referenced this pull request Feb 1, 2026
* feat: RAG Knowledge Base, Serving Layer, and Model Registry (#50)

* feat(registry): implement model registry for run tracking and deployments (#36)

* docs: expand INITIAL-7 with lifecycle, lineage, and artifact integrity details

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(registry): implement model registry for run tracking and deployments

Add model registry feature (PRP-7) with:

- ORM models: ModelRun with JSONB columns (model_config, metrics, runtime_info),
  DeploymentAlias for mutable deployment pointers
- Storage: LocalFSProvider with SHA-256 integrity verification and path
  traversal prevention, abstract interface for future S3/GCS support
- Service: RegistryService with state machine validation, duplicate
  detection, config hashing, and run comparison
- API endpoints: CRUD for runs and aliases, artifact verification,
  run comparison with config/metrics diffs
- Database: Alembic migration with GIN indexes for JSONB containment queries
- Tests: 103 unit tests (schemas, storage, service) + 24 integration tests
- Example: registry_demo.py demonstrating full workflow

Run lifecycle: PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED
Aliases can only point to SUCCESS runs for deployment safety.
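
The lifecycle above can be sketched as a transition table. This is an illustration only: the enum values, `ALLOWED_TRANSITIONS`, `can_transition`, and `can_alias` are hypothetical names, and the table assumes both SUCCESS and FAILED runs may be archived.

```python
from enum import Enum


class RunStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
    ARCHIVED = "archived"


# Assumed transition table; SUCCESS and FAILED are both treated as archivable.
ALLOWED_TRANSITIONS: dict[RunStatus, set[RunStatus]] = {
    RunStatus.PENDING: {RunStatus.RUNNING},
    RunStatus.RUNNING: {RunStatus.SUCCESS, RunStatus.FAILED},
    RunStatus.SUCCESS: {RunStatus.ARCHIVED},
    RunStatus.FAILED: {RunStatus.ARCHIVED},
    RunStatus.ARCHIVED: set(),
}


def can_transition(current: RunStatus, target: RunStatus) -> bool:
    """True when the lifecycle permits moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]


def can_alias(status: RunStatus) -> bool:
    """Deployment aliases may only point at SUCCESS runs."""
    return status is RunStatus.SUCCESS
```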

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update documentation for model registry implementation

- README.md: Add registry to project structure, API endpoints section,
  and example reference
- docs/ARCHITECTURE.md: Update section 7.6 with full implementation
  details, add registry endpoints to section 8, mark Phase 1 complete
- docs/PHASE-index.md: Mark phases 4-6 as completed, add detailed
  completion entries for Forecasting, Backtesting, and Registry

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add PHASE documentation for forecasting, backtesting, and registry

Create missing phase documentation files to complete the project's
implementation records:

- 4-FORECASTING.md: Model zoo with BaseForecaster interface, train/predict
  endpoints, and joblib persistence
- 5-BACKTESTING.md: Time-series CV with expanding/sliding strategies,
  metrics calculation, and baseline comparisons
- 6-MODEL_REGISTRY.md: Run tracking with state machine, deployment aliases,
  and SHA-256 artifact integrity verification

Update PHASE-index.md to link to the new documentation files.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(registry): resolve type checking issues with Pydantic model_config alias

- Add pydantic.mypy plugin to pyproject.toml for proper Pydantic type checking
- Use model_config_data instead of model_config alias in tests to avoid collision
  with Pydantic's reserved model_config attribute
- Update _model_to_response to use model_validate() for proper alias handling
- Change docker-compose postgres port to 5433 to avoid conflicts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: resolve CI failures for registry PR

- Import registry models in alembic/env.py for schema validation
- Fix import order and remove extraneous f-strings in registry_demo.py
- Add type: ignore comments for frozen model tests with pydantic.mypy plugin

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: prevent db_session fixtures from dropping all tables

The data_platform and root conftest.py db_session fixtures were dropping
all tables after each test, causing subsequent integration tests to fail
when they couldn't find migrated tables.

Changes:
- Remove Base.metadata.drop_all from db_session fixtures
- Tests now rely on migrations for table creation
- Each test just rolls back its own changes

Also fixes ruff format issue in examples/registry_demo.py.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: add proper test data cleanup to db_session fixtures

Update data_platform and ingest test fixtures to clean up test data
explicitly instead of dropping all tables or just rolling back.

- data_platform: delete test stores, products, calendar entries
- ingest: delete test stores, products, sales, calendar entries

This ensures test isolation while preserving migrated tables.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: use separate session for test cleanup to avoid transaction issues

When tests cause integrity errors, the session enters a failed state.
Use a fresh session for cleanup to avoid PendingRollbackError.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: use contextlib.suppress instead of try-except-pass

Replace try-except-pass patterns with contextlib.suppress to satisfy
ruff S110 linting rule.
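
For illustration, the replacement looks roughly like the sketch below. The helper `remove_quietly` is hypothetical and is not code from this repository; it only shows `contextlib.suppress` standing in for a `try`/`except`/`pass` block.

```python
import contextlib
import os


def remove_quietly(path: str) -> None:
    """Delete a file, ignoring the case where it does not exist."""
    # Equivalent to try: os.remove(path) / except FileNotFoundError: pass,
    # but without the bare "pass" that S110 flags.
    with contextlib.suppress(FileNotFoundError):
        os.remove(path)
```

Only the named exception type is suppressed; any other error still propagates.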

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* fix: code improvements and documentation fixes

- Add date range filter to SalesDaily cleanup in ingest tests
- Enforce artifact_hash presence before verification in registry routes
- Compute SHA256 from saved file instead of source in storage
- Fix override_get_db to mirror production transaction semantics
- Filter DeploymentAlias cleanup to only test runs
- Update database port to 5433 in config and .env.example
- Add language identifiers to fenced code blocks (MD040)
- Fix table formatting for markdownlint MD060
- Update PR reference in PHASE/6-MODEL_REGISTRY.md
- Convert bare URLs to markdown links in INITIAL-7.md
- Wrap __init__.py in backticks in PRP-7

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* sync: update dev from phase-6 (#40)

* chore: release v0.2.0 (#37)

* chore(main): release 0.2.0 (#38)

Release-As: 0.2.0

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* chore(main): release 0.2.0 (#39)

* chore(main): release 0.2.0

* chore: trigger CI

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gabe@w7dev <gabor@w7-7.net>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* feat(serving-layer): implement PRP-8 agent-first API design (#42)

* docs(initial-8): expand serving layer requirements

Add specifications for job-driven orchestration, dimension discovery
endpoints, standardized API protocols (filtering/pagination), and
agent-first API design patterns for LLM tool-calling optimization.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp-8): add serving layer implementation spec

Comprehensive PRP for FastAPI serving layer including:
- Dimensions module for store/product discovery endpoints
- Analytics module for KPI/drilldown queries
- Jobs module for async-ready task orchestration
- RFC 7807 problem details for semantic error responses
- OpenAPI export optimization for LLM tool-calling

26 tasks with validation gates and 8.5/10 confidence score.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(serving-layer): implement PRP-8 agent-first API design

Add RFC 7807 Problem Details for semantic error responses:
- ProblemDetail schema with type URIs and error codes
- application/problem+json content type
- Validation exception handler with field-level errors
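
A rough sketch of an RFC 7807 response body, assuming a plain dataclass rather than the project's Pydantic `ProblemDetail` schema. The members `type`, `title`, `status`, `detail`, and `instance` come from the RFC; the `errors` extension member, `to_problem_response`, and the example URI in the usage note are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

PROBLEM_JSON = "application/problem+json"


@dataclass
class ProblemDetail:
    type: str
    title: str
    status: int
    detail: str
    instance: Optional[str] = None
    # Extension member for field-level validation errors (assumed name).
    errors: list[dict[str, str]] = field(default_factory=list)


def to_problem_response(problem: ProblemDetail) -> tuple[int, dict[str, str], str]:
    """Return (status code, headers, JSON body), omitting empty optional members."""
    body = {k: v for k, v in asdict(problem).items() if v is not None and v != []}
    return problem.status, {"Content-Type": PROBLEM_JSON}, json.dumps(body)
```

For example, `to_problem_response(ProblemDetail(type="https://example.com/problems/validation", title="Validation failed", status=422, detail="end_date precedes start_date"))` yields a 422 with the `application/problem+json` content type.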

Add dimensions module for store/product discovery:
- GET /dimensions/stores with pagination, filtering, search
- GET /dimensions/products with pagination, filtering, search
- LLM-optimized Field descriptions for tool-calling

Add analytics module for KPI aggregations:
- GET /analytics/kpis with date range and dimension filters
- GET /analytics/drilldowns for store/product/category/region/date
- Revenue share and ranking calculations

Add jobs module for async-ready task orchestration:
- POST /jobs for train/predict/backtest operations
- Job model with JSONB params/results
- Status transitions: pending → running → completed/failed

Integration:
- New settings: analytics_max_rows, jobs_retention_days
- Register routers in main.py
- Alembic migration for jobs table

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update documentation for PRP-8 serving layer

Update README.md:
- Add dimensions, analytics, jobs modules to project structure
- Document new API endpoints with examples
- Add RFC 7807 error response documentation

Update docs/ARCHITECTURE.md:
- Mark serving layer section as implemented
- Add configuration settings for new modules
- Update roadmap with Phase-2 completion

Update docs/PHASE-index.md:
- Add Phase 7 (Serving Layer) as completed
- Update phase overview table
- Add version history entry

Create docs/PHASE/7-SERVING_LAYER.md:
- Comprehensive phase documentation
- API endpoint specifications
- Database schema and migration details
- Usage examples and test coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: fix ruff formatting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* fix(serving-layer): improve analytics validation and jobs run_id handling

- Add validate_date_range helper to analytics routes for reusable date validation
- Apply date range validation to both get_kpis and get_drilldowns endpoints
- Fix total_revenue_all calculation to use full dataset before limiting
- Add run_id to train job result for downstream predict jobs
- Fix predict job to resolve run_id to model metadata from bundle
- Update test fixtures to use 32-char hex IDs per schema requirements
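
A reusable date range check could look something like the sketch below. Only the helper name `validate_date_range` appears in the commit; the signature, the `max_days` limit, and the use of `ValueError` are assumptions.

```python
from datetime import date


def validate_date_range(start_date: date, end_date: date, max_days: int = 366) -> int:
    """Return the inclusive span in days, or raise ValueError for a bad range."""
    if start_date > end_date:
        raise ValueError("start_date must be on or before end_date")
    span = (end_date - start_date).days + 1
    if span > max_days:
        raise ValueError(f"date range spans {span} days; the limit is {max_days}")
    return span
```

Calling one helper from both `get_kpis` and `get_drilldowns` keeps the two endpoints from drifting apart in what they accept.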

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* style: format jobs service

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: restructure roadmap into modular three-phase architecture (INITIAL-9/10/11) (#47)

* docs: restructure INITIAL-9 into modular three-phase roadmap

Decompose monolithic INITIAL-9 into three specialized technical phases:

- INITIAL-9: RAG Knowledge Base ("The Memory")
  - pgvector + OpenAI embeddings
  - Markdown/OpenAPI-aware chunking
  - Semantic retrieval endpoints

- INITIAL-10: Agentic Layer ("The Brain")
  - PydanticAI agents (Experiment Orchestrator, RAG Assistant)
  - Tool orchestration with structured outputs
  - Human-in-the-loop approval workflow

- INITIAL-11: ForecastLab Dashboard ("The Face")
  - React 19 + Vite + shadcn/ui
  - TanStack Table/Query for data management
  - Recharts for time series visualization
  - Agent chat interface with streaming

Update PHASE-index.md and DAILY-FLOW.md to align with new structure.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-9 RAG Knowledge Base implementation plan

Comprehensive PRP for INITIAL-9 RAG Knowledge Base feature:

- pgvector + SQLAlchemy 2.0 integration patterns
- Markdown-aware and OpenAPI-aware chunking
- Async OpenAI embeddings with batch processing
- HNSW index for cosine similarity search
- 15 ordered implementation tasks
- 5-level validation loop (syntax → types → unit → integration → smoke)
- Full ORM models and Pydantic schemas
- Known gotchas and anti-patterns documented

Confidence score: 8.5/10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-10 Agentic Layer implementation plan

Comprehensive PRP for INITIAL-10 Agentic Layer feature:

- PydanticAI agent framework integration
- Experiment Orchestrator Agent (backtest → compare → deploy)
- RAG Assistant Agent (query → retrieve → answer with citations)
- Human-in-the-loop approval workflow for sensitive actions
- WebSocket streaming for real-time token delivery
- Session persistence with JSONB message history
- 17 ordered implementation tasks
- Tool definitions for registry, backtesting, forecasting, RAG
- Full Pydantic schemas and ORM models

Confidence score: 7.5/10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* docs(prp): add PRP-11 ForecastLab Dashboard implementation plan (#48)

Comprehensive PRP for INITIAL-11 (The Face) with:
- 24 implementation tasks across 6 phases
- React 19 + Vite + shadcn/ui + TanStack Table/Query
- TypeScript types matching all backend API schemas
- Reusable DataTable with server-side pagination
- TimeSeriesChart component with Recharts
- WebSocket hook for agent chat streaming
- Complete documentation links and gotchas

Confidence score: 7.5/10 (chat depends on INITIAL-10)

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* feat(rag): implement PRP-9 RAG Knowledge Base with pgvector (#49)

* feat(rag): implement PRP-9 RAG Knowledge Base with pgvector

Add RAG (Retrieval-Augmented Generation) knowledge base feature for
semantic document indexing and retrieval using PostgreSQL pgvector.

Key components:
- Document indexing with markdown-aware and OpenAPI-aware chunking
- Semantic retrieval using cosine similarity with configurable thresholds
- Idempotent re-indexing via SHA-256 content hash comparison
- OpenAI text-embedding-3-small for embeddings (1536 dimensions)
- HNSW index for fast approximate nearest neighbor search
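
The idempotent re-indexing idea can be sketched with an in-memory dictionary standing in for the `DocumentSource` table. The function names and the returned labels are hypothetical; the real service stores hashes in PostgreSQL and also re-chunks and re-embeds changed documents.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the document content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def index_document(store: dict[str, str], source_path: str, text: str) -> str:
    """Record the hash for source_path and report what would need to happen."""
    digest = content_hash(text)
    previous = store.get(source_path)
    if previous == digest:
        # Same content as last time: skip chunking and embedding entirely.
        return "unchanged"
    store[source_path] = digest
    return "created" if previous is None else "updated"
```

Skipping unchanged content matters mostly because embedding calls are the expensive step.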

API endpoints:
- POST /rag/index - Index documents with automatic chunking
- POST /rag/retrieve - Semantic search with relevance scoring
- GET /rag/sources - List indexed sources with statistics
- DELETE /rag/sources/{source_id} - Remove source and chunks
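
As a rough illustration of the relevance scoring behind `POST /rag/retrieve`, the sketch below ranks chunks by cosine similarity and drops those under a threshold. It is a brute-force, pure-Python stand-in: the actual feature delegates the search to pgvector with an HNSW index, and the names `retrieve`, `top_k`, and `threshold` as well as the default values are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query: list[float],
    chunks: dict[str, list[float]],
    top_k: int = 3,
    threshold: float = 0.7,
) -> list[tuple[str, float]]:
    """Return up to top_k (chunk_id, score) pairs with score >= threshold, best first."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in chunks.items()]
    hits = [(chunk_id, score) for chunk_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:top_k]
```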

Includes:
- ORM models: DocumentSource, DocumentChunk with Vector column
- Pydantic v2 schemas with strict validation
- 68 unit tests + 14 integration tests
- Migration for pgvector extension and RAG tables
- Examples and environment configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(rag): add Ollama embedding provider with OpenAI-compatible API

- Add EmbeddingProvider abstract base class with provider pattern
- Refactor existing OpenAI code to OpenAIEmbeddingProvider
- Add OllamaEmbeddingProvider using /v1/embeddings endpoint
  - Supports configurable dimensions parameter
  - Uses OpenAI-compatible response format
- Add config settings: rag_embedding_provider, ollama_base_url, ollama_embedding_model
- Add migration for dynamic embedding dimension support
- Update tests for both providers (25 tests)

Enables local/LAN embedding generation without OpenAI API dependency.
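
The provider pattern can be sketched as follows. Only the class name `EmbeddingProvider` comes from the commit; the `embed` signature, the `FakeEmbeddingProvider` stand-in, the `PROVIDERS` registry, and `get_provider` are hypothetical, and no network call is made.

```python
import asyncio
from abc import ABC, abstractmethod


class EmbeddingProvider(ABC):
    """Common interface so the service does not care which backend is used."""

    @abstractmethod
    async def embed(self, texts: list[str]) -> list[list[float]]:
        ...


class FakeEmbeddingProvider(EmbeddingProvider):
    """Deterministic stand-in for a real OpenAI or Ollama provider."""

    def __init__(self, dimensions: int = 4) -> None:
        self.dimensions = dimensions

    async def embed(self, texts: list[str]) -> list[list[float]]:
        # One vector per input text, filled with the text length.
        return [[float(len(text))] * self.dimensions for text in texts]


PROVIDERS: dict[str, type[EmbeddingProvider]] = {"fake": FakeEmbeddingProvider}


def get_provider(name: str, **kwargs) -> EmbeddingProvider:
    """Look up a provider class by its configured name and instantiate it."""
    if name not in PROVIDERS:
        raise ValueError(f"unknown embedding provider: {name}")
    return PROVIDERS[name](**kwargs)
```

A caller would run something like `asyncio.run(get_provider("fake", dimensions=3).embed(["ab", "c"]))`, with the provider name coming from a setting such as `rag_embedding_provider`.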

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add Ollama embedding provider documentation

- Update .env.example with Ollama configuration options
- Add RAG Knowledge Base section to README with:
  - Embedding provider options (OpenAI/Ollama)
  - Example index and retrieve requests
  - Configuration examples for both providers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add Phase 8 RAG Knowledge Base documentation

- Create docs/PHASE/8-RAG_KNOWLEDGE_BASE.md with full phase details
- Update docs/PHASE-index.md:
  - Mark Phase 8 as Completed in overview table
  - Add Phase 8 summary to Completed Phases section
  - Add entry to Version History

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(ci): add RAG models import to alembic env and format tests

- Add rag models import to alembic/env.py for schema validation
- Format test_embeddings.py to pass ruff format check

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address code review issues for RAG module and docs

- Make migration deterministic by hardcoding dimension values instead
  of reading from environment (alembic migration)
- Add pyyaml dependency for YAML parsing in OpenAPI chunker
- Fix token count logging to capture original count before truncation
- Add path traversal protection to RAG service _read_content_from_path
  (mirrors registry/storage.py pattern)
- Fix markdown linting issues:
  - Add language tags to fenced code blocks (MD040)
  - Fix table pipe spacing (MD060)
- Fix index_docs.py to treat 200 the same as 201 for idempotent responses
- Add test for path traversal protection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* ci: add release-please branch trigger and wire workflow_dispatch ref (#52)

- Add 'release-please--branches--**' pattern to match actual release-please
  branch naming (e.g., release-please--branches--main--components--forecastlabai)
- Add 'ref' input to workflow_dispatch with proper type declaration
- Wire ref input to all checkout steps via CHECKOUT_REF env var
- Use inputs.ref || github.ref for predictable fallback behavior
- Update concurrency group to respect manual ref input

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* chore(main): release 0.2.2 (#51)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gabor Szabo <168316277+w7-mgfcode@users.noreply.github.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>