PR11: Hardening transactional consistency and concurrency (SQL + index) #27

Merged
Intrinsical-AI merged 9 commits into pr/02-2026-10-eval-monitoring-reranker from pr/02-2026-11-hardening-consistency on Feb 16, 2026

@Intrinsical-AI
Owner

Base: pr/02-2026-10 (stacked PR).

Context

A technical audit detected critical consistency risks in concurrent mutating operations, as well as insufficient validation of generation parameters.

Main changes

1) Serialization of multi-store writes (SQL + FAISS)

  • New cross-process/thread lock: src/local_rag_backend/core/services/write_lock.py.
  • The mutating API paths run under a shared lock:
    • POST /api/docs
    • POST /api/docs/upsert
    • POST /api/docs/delete
    • POST /api/docs/delete_by_external_id
    • POST /api/index/rebuild
  • Goal: prevent interleavings that leave drift between SQL and the vector index.
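
As an illustration of the idea above, here is a minimal sketch of a lock that serializes writers both across threads and across processes. It is not the contents of write_lock.py: the names `multistore_write_lock` and `upsert_both_stores` are hypothetical, the two lists stand in for the SQL table and the FAISS index, and the file lock relies on `fcntl.flock`, which is POSIX-only.

```python
import fcntl
import os
import threading
from contextlib import contextmanager

# Serializes threads inside one process (assumption: one lock per process).
_THREAD_LOCK = threading.Lock()


@contextmanager
def multistore_write_lock(lock_path):
    """Hold an in-process lock plus an exclusive advisory file lock (POSIX only)."""
    with _THREAD_LOCK:
        fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            # Blocks until every other process has released the file lock.
            fcntl.flock(fd, fcntl.LOCK_EX)
            try:
                yield
            finally:
                fcntl.flock(fd, fcntl.LOCK_UN)
        finally:
            os.close(fd)


def upsert_both_stores(lock_path, sql_rows, vector_ids, doc_id):
    """Hypothetical mutating path: both stores change under the same lock."""
    with multistore_write_lock(lock_path):
        sql_rows.append(doc_id)    # stand-in for the SQL upsert
        vector_ids.append(doc_id)  # stand-in for the FAISS update
```

Because both appends happen inside one critical section, no other writer can slip in between the SQL step and the vector step, which is the kind of interleaving the PR aims to rule out.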

2) Hardening of generation parameters

  • OpenRouterGenerateRequest now validates ranges:
    • temperature in [0.0, 2.0]
    • top_p in [0.0, 1.0]
    • max_tokens in [1, 4096]
  • AskEvalConfig.top_p is now also validated in [0.0, 1.0].
  • validate_rag_config() includes an explicit top_p check.
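
The ranges listed above can be sketched with a dependency-free dataclass. This is only an illustration: the class name `SamplingParams` and its default values are assumptions, and the project's real request model may well be built with a schema library rather than a dataclass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingParams:
    """Hypothetical container that rejects out-of-range sampling parameters."""

    temperature: float = 0.7  # default is an assumption
    top_p: float = 1.0        # default is an assumption
    max_tokens: int = 256     # default is an assumption

    def __post_init__(self):
        # Chained comparisons also reject NaN, since NaN compares false.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be in [0.0, 2.0]")
        if not 0.0 <= self.top_p <= 1.0:
            raise ValueError("top_p must be in [0.0, 1.0]")
        if not 1 <= self.max_tokens <= 4096:
            raise ValueError("max_tokens must be in [1, 4096]")
```

Validating at construction time means an invalid request fails before any provider call is attempted.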

Tests added/updated

  • New: tests/unit/app/test_multistore_write_lock.py
    • Reproduces forced concurrency and validates final SQL/vector consistency.
  • Updated: tests/unit/app/test_request_limits.py
    • Rejection of invalid sampling params for OpenRouter.
  • Updated: tests/unit/app/test_api_router_more.py
    • Invalid top_p cases in ask_eval.

Verification

  • pytest -q ✅ (255 passed, coverage 85.29%)
  • ruff check src tests
  • black --check src tests
  • mypy src
  • isort --check-only src tests ✅ (fixed and committed afterwards)

Commits included

  • 5fa6c06 fix: prevent SQL/vector drift when embedding fails
    Precompute dense/hybrid embeddings before SQL upsert in API and CLI flows so provider failures cannot leave partially committed SQL without vector updates.
    Add SqlDocumentStorage.get_existing_doc_states_by_external_id to embed only inserts/content updates (preserving idempotent no-op behavior).
    Add regression tests for /api/docs, /api/docs/upsert, rag-upsert-docs, and rag-ingest to verify no SQL persistence on embedding failures.
  • fd412d8 fix: harden rag service reload token writes
    Use unique temp files + fsync when writing .rag_service_reload_token to avoid .tmp path collisions across workers.
    Add regression test ensuring _write_reload_token does not reuse temp source paths across consecutive writes.
  • 9a33cb7 fix(api): serialize multi-store writes and harden generation params
    Prevent concurrent mutating API operations from interleaving SQL and FAISS updates by running write paths under a shared cross-process lock.
    Add strict bounds for generation sampling params (temperature/top_p/max_tokens) in OpenRouter requests and AskEval top_p validation.
    Add regression tests for concurrent upsert consistency and invalid sampling parameter rejection.
  • 69e715a chore: fix import ordering and include roadmap updates
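
The "embed first, write second" ordering described for the embedding-failure fix can be sketched as follows. This is a simplified illustration, not the project's code: `upsert_documents`, `failing_embed`, and the two dictionaries standing in for the SQL store and the vector index are all hypothetical.

```python
def upsert_documents(docs, embed, sql_store, vector_store):
    """Embed first, then write; nothing is persisted if the provider fails."""
    # Only new or changed content needs an embedding (idempotent no-op otherwise).
    changed = {k: text for k, text in docs.items() if sql_store.get(k) != text}
    # Step 1: call the embedding provider before touching any store.
    vectors = {k: embed(text) for k, text in changed.items()}
    # Step 2: every embedding succeeded, so mutate both stores together.
    for k, text in changed.items():
        sql_store[k] = text
        vector_store[k] = vectors[k]
    return sorted(changed)


def failing_embed(text):
    """Stand-in for a provider outage."""
    raise RuntimeError("embedding provider unavailable")
```

If `embed` raises, the exception propagates before the first write, so the SQL stand-in never ends up with rows that have no matching vectors.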
@Intrinsical-AI merged commit 1a56f78 into pr/02-2026-10-eval-monitoring-reranker on Feb 16, 2026
@Intrinsical-AI deleted the pr/02-2026-11-hardening-consistency branch on February 16, 2026, 03:43
Intrinsical-AI added a commit that referenced this pull request Feb 16, 2026
…ndice) (#27)

* fix(consistency): preflight manifest drift before index mutations

* fix(maintenance): raise explicit multi-store inconsistency error on double failure

* ci: add windows smoke tests for faiss persistence and consistency paths

* fix: prevent SQL/vector drift when embedding fails

Precompute dense/hybrid embeddings before SQL upsert in API and CLI flows so provider failures cannot leave partially committed SQL without vector updates.

Add SqlDocumentStorage.get_existing_doc_states_by_external_id to embed only inserts/content updates (preserving idempotent no-op behavior).

Add regression tests for /api/docs, /api/docs/upsert, rag-upsert-docs, and rag-ingest to verify no SQL persistence on embedding failures.

* fix: harden rag service reload token writes

Use unique temp files + fsync when writing .rag_service_reload_token to avoid .tmp path collisions across workers.

Add regression test ensuring _write_reload_token does not reuse temp source paths across consecutive writes.

* docs(pr): add PR11 hardening consistency description

* fix(api): serialize multi-store writes and harden generation params

Prevent concurrent mutating API operations from interleaving SQL and FAISS updates by running write paths under a shared cross-process lock.

Add strict bounds for generation sampling params (temperature/top_p/max_tokens) in OpenRouter requests and AskEval top_p validation.

Add regression tests for concurrent upsert consistency and invalid sampling parameter rejection.

* chore: fix import ordering and include roadmap updates
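
The reload-token hardening mentioned in the commit message above (unique temp files plus fsync) follows a common atomic-write pattern, sketched here. The function name `write_token_atomically` and the temp-file prefix are assumptions; the project's `_write_reload_token` may differ in its details.

```python
import os
import tempfile


def write_token_atomically(path, token):
    """Write `token` to `path` via a uniquely named temp file, fsync, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp picks a fresh name per call, so concurrent workers never share a temp path.
    fd, tmp_path = tempfile.mkstemp(prefix=".reload_token.", suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(token)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)     # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return tmp_path  # returned only so callers or tests can inspect the temp name
```

Readers of `path` see either the old token or the new one, never a partially written file.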
Intrinsical-AI added a commit that referenced this pull request Feb 16, 2026
Intrinsical-AI added a commit that referenced this pull request Feb 16, 2026
Intrinsical-AI added a commit that referenced this pull request Feb 16, 2026
Intrinsical-AI added a commit that referenced this pull request Feb 16, 2026
Intrinsical-AI added a commit that referenced this pull request Feb 17, 2026
Intrinsical-AI added a commit that referenced this pull request Feb 24, 2026
…val, hardening) (#31)

* feat(ops): add readiness/index diagnostics and richer rag-status

* test(ops): cover readiness drift/corruption and cli status

* docs: note strict /api/ready drift detection in dense/hybrid

* updated .gitignore - pr descs into /pr

* feat(sql): add document identity columns + best-effort sqlite migration

* test(sqlite): cover identity column migration, unique external_id, sha256 fill

* docs: document document identity fields (external_id, metadata, hash, timestamps)

* feat(docs): idempotent upsert by external_id (API/CLI)

* test(docs): cover upsert idempotency (api/sql/cli)

* docs: document docs upsert endpoint and rag-upsert-docs

* docs: update roadmap with stabilization/refactor PR and incorporate quick-wins

* refactor(structure): move api schemas/diagnostics and core helpers; keep shims

* refactor(imports): prefer new module locations (app.schemas, core.services, app.diagnostics)

* refactor(structure): drop root shims and enforce app/core module boundaries

* docs: update roadmap and architecture after removing root shims

* feat(ingestion): add txt/md/csv loaders with discovery + sniffing

- Add TextFileLoader and MarkdownLoader
- Add discover_files() limits (max files / bytes) for safe dir ingestion
- Add loader factory with best-effort format detection (extension + bytes heuristics, optional python-magic)
- Extend CSVLoader with delimiter sniffing and row_index metadata

* feat(cli): add rag-ingest (dir/file) with SQL+FAISS consistency

- Add  CLI/entrypoint (files/dirs, limits, dry-run)
- Ingest via external_id upsert per chunk and delete stale chunks by source prefix
- Dense/hybrid: incremental vector updates, fallback rebuild on failure
- Add SqlDocumentStorage.list_ids_by_external_id_prefix() helper
- Add optional extra  for python-magic

* test(ingestion): cover rag-ingest + loader detection/discovery

- CLI ingest: mixed dir, idempotency, and stale-chunk deletion
- Factory detection: prefer content heuristics over misleading extensions
- Discovery: enforce limits and handle broken symlinks
- Update CSVLoader unit tests for row_index metadata

* docs: document rag-ingest and optional python-magic

- README: add rag-ingest to CLI commands and mention magic extra
- custom usage guide: add a CLI ingestion section

* fix(ingest): avoid external_id prefix collisions when syncing stale chunks

Pass a delimiter-suffixed prefix when listing existing external_ids so /tmp/foo does not match /tmp/foo2.
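
A small sketch of why the delimiter matters. The external_id layout `<source path>#<chunk index>` and the `#` delimiter are assumptions made for illustration; the commit only states that a delimiter-suffixed prefix is used.

```python
def ids_under_source(external_ids, source_path, delimiter="#"):
    """Return the external_ids belonging to `source_path`, matching prefix plus delimiter."""
    prefix = source_path + delimiter
    return [eid for eid in external_ids if eid.startswith(prefix)]


def ids_under_source_naive(external_ids, source_path):
    """The buggy variant: a bare prefix also matches sibling paths."""
    return [eid for eid in external_ids if eid.startswith(source_path)]
```

With ids for both `/tmp/foo` and `/tmp/foo2`, the naive variant would treat the chunks of `/tmp/foo2` as stale chunks of `/tmp/foo`, while the delimiter-suffixed variant keeps them apart.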

* feat(ingestion): configurable cleaning + deterministic chunking (chars_v1)

- Add chunk_chars_v1 with stable boundaries (start/end offsets)
- Make preprocess_text configurable via keyword options
- Add settings flags for ingestion cleaning and chunk strategy
- Provide helpers to build preprocess/chunk functions from Settings
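
To make "deterministic chunking with stable boundaries" concrete, here is a minimal fixed-window sketch. It is not chunk_chars_v1 itself: the function name, the default sizes, and the dictionary keys are assumptions, and the real strategy may choose boundaries differently.

```python
def chunk_chars_sketch(text, size=200, overlap=50):
    """Split `text` into fixed-size character windows with recorded offsets."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        chunks.append(
            {"chunk_index": len(chunks), "start": start, "end": end, "text": text[start:end]}
        )
        if end == len(text):
            break  # the last window reached the end of the text
        start += step
    return chunks
```

Because the boundaries depend only on the text and the two parameters, re-ingesting the same file yields the same chunks and offsets.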

* feat(ingest): store chunk metadata (chunk_index/offsets/parent_doc_id)

- rag-ingest now records chunk offsets and parent_doc_id in SQLite metadata
- bootstrap uses Settings-driven cleaning/chunking for determinism

* test(chunking): validate boundaries/overlap and ingestion metadata

- Add unit tests for chunk_chars_v1 determinism, boundaries and overlap
- Extend rag-ingest CLI test to assert prepared metadata (chunk_index, offsets, parent_doc_id)
- Add coverage for preprocess_text options

* feat(dedup): add chunk-level hash column + index and helper

- Add Settings.ingest_chunker_version to version the dedup key
- Add chunk_dedup_sha256 helper (sha256(cleaned_text + chunker_version + embedding_model_name))
- Extend documents schema with chunk_dedup_sha256 and a partial unique index
- Allow upsert to persist chunk_dedup_sha256 when provided
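
The dedup key described above can be sketched like this. The commit states the three inputs; the unit-separator character used to join them and the function name `chunk_dedup_key` are assumptions added here so that field boundaries stay unambiguous.

```python
import hashlib


def chunk_dedup_key(cleaned_text, chunker_version, embedding_model_name):
    """SHA-256 over the cleaned text, the chunker version, and the embedding model name."""
    # Assumption: join with a separator that should not occur in the fields.
    payload = "\x1f".join([cleaned_text, chunker_version, embedding_model_name])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Including the chunker version and model name in the key means the same text is re-ingested as a new row when either of them changes, which matches the behaviour the tests for this feature describe.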

* feat(api): dedup /api/docs ingestion via chunk hash upsert

- Clean + chunk texts using Settings-driven pipeline
- Compute external_id as chunk hash (includes chunker_version + embedding model)
- Upsert into SQLite and update FAISS incrementally in dense/hybrid mode

* test(dedup): cover hash-based /api/docs idempotency and SQL constraint

- Verify /api/docs dedup is idempotent across re-ingests
- Verify changing ingest_chunker_version produces new inserts
- Assert chunk_dedup_sha256 unique index is enforced

* docs: document chunker version and dedup column

- Add chunking strategy/version env vars to README and .env.example
- Mention chunk_dedup_sha256 schema column

* fix(db): run SQLite compatibility migrations for CLI/scripts

CLI/scripts can be executed without starting FastAPI, so ensure the same best-effort SQLite migrations (AUTOINCREMENT + identity columns) run before using ORM-mapped queries.

* feat(delete): add tombstones and delete-by-external-id in SQL repo

- Add DocumentTombstone table (external_id unique)
- Add SqlDocumentStorage delete_by_external_ids + tombstone helpers
- Tombstones prevent deleted identities from reappearing on future ingestions
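
A minimal sketch of the tombstone idea using the standard `sqlite3` module. Table and function names here are hypothetical and much simpler than the project's SQLAlchemy models; the point is only to show how a tombstone row blocks a deleted identity from being re-ingested.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a documents table and a tombstones table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (external_id TEXT PRIMARY KEY, content TEXT NOT NULL)")
    conn.execute("CREATE TABLE tombstones (external_id TEXT PRIMARY KEY)")
    return conn


def delete_by_external_ids(conn, external_ids):
    """Delete documents and record one tombstone per distinct external_id."""
    unique_ids = list(dict.fromkeys(external_ids))  # drop duplicates, keep order
    with conn:  # one transaction: commit on success, roll back on error
        for eid in unique_ids:
            conn.execute("DELETE FROM documents WHERE external_id = ?", (eid,))
            conn.execute("INSERT OR IGNORE INTO tombstones (external_id) VALUES (?)", (eid,))


def ingest(conn, external_id, content):
    """Insert or replace a document unless it is tombstoned; return True when written."""
    row = conn.execute(
        "SELECT 1 FROM tombstones WHERE external_id = ?", (external_id,)
    ).fetchone()
    if row is not None:
        return False
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO documents (external_id, content) VALUES (?, ?)",
            (external_id, content),
        )
    return True
```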

* feat(api): delete by external_id with tombstones

- Add /api/docs/delete_by_external_id endpoint
- Filter tombstoned external_ids from /api/docs ingestion so deletes don't reappear
- Reject tombstoned external_ids on /api/docs/upsert

* feat(cli): rag-delete-external-ids

- Add CLI command delete-external-ids (SQL + FAISS when dense/hybrid)
- Respect tombstones during rag-ingest and block upserts for tombstoned external_ids
- Add project script entrypoint rag-delete-external-ids

* test(delete): cover external_id delete (tombstones) + ask_eval + rebuild

- Sparse: delete external_id prevents re-ingest and removes results from ask_eval
- Dense: delete external_id updates vector store and survives rebuild-index

* docs: document delete-by-external-id (API/CLI)

- Add rag-delete-external-ids and /api/docs/delete_by_external_id to usage guides

* feat(index-manifest): persist manifest and report drift in ready/status

* chore(mypy): avoid optional python-magic typing issues

* test(index-manifest): cover manifest drift and rebuild recovery

* docs(index-manifest): document index_manifest and drift detection

* test(index-manifest): unit coverage for manifest helpers

* feat(reranker): optional overlap reranker behind settings

* feat(observability): structured logs + domain metrics for ingest/query

* feat(eval): add rag-eval CLI with versioned dataset and regression gate

* test(eval,reranker): cover offline eval gate and reranking behavior

* docs(pr10): document rag-eval, monitoring, and optional reranker

* PR11: Hardening transactional consistency and concurrency (SQL + index) (#27)

* fix(consistency): preflight manifest drift before index mutations

* fix(maintenance): raise explicit multi-store inconsistency error on double failure

* ci: add windows smoke tests for faiss persistence and consistency paths

* fix: prevent SQL/vector drift when embedding fails

Precompute dense/hybrid embeddings before SQL upsert in API and CLI flows so provider failures cannot leave partially committed SQL without vector updates.

Add SqlDocumentStorage.get_existing_doc_states_by_external_id to embed only inserts/content updates (preserving idempotent no-op behavior).

Add regression tests for /api/docs, /api/docs/upsert, rag-upsert-docs, and rag-ingest to verify no SQL persistence on embedding failures.

* fix: harden rag service reload token writes

Use unique temp files + fsync when writing .rag_service_reload_token to avoid .tmp path collisions across workers.

Add regression test ensuring _write_reload_token does not reuse temp source paths across consecutive writes.

* docs(pr): add PR11 hardening consistency description

* fix(api): serialize multi-store writes and harden generation params

Prevent concurrent mutating API operations from interleaving SQL and FAISS updates by running write paths under a shared cross-process lock.

Add strict bounds for generation sampling params (temperature/top_p/max_tokens) in OpenRouter requests and AskEval top_p validation.

Add regression tests for concurrent upsert consistency and invalid sampling parameter rejection.

* chore: fix import ordering and include roadmap updates

* fix(api,lock): prevent duplicate ingest pass and fail closed on lock acquisition

* fix(cli): serialize mutating commands with shared multi-store write lock

* docs(pr11): include integral audit findings, fixes, and verification

* fix(faiss): fail closed on missing manifest and lock acquisition

* fix(api,cli): always invalidate cached rag service after mutating attempts

* chore(lint): align isort with ruff and normalize import ordering

* docs(pr11): document new critical audit findings and validations

* fix(delete): deduplicate external_ids to prevent tombstone integrity failures

* fix(openai): enforce request timeouts with compatibility fallback

* chore(format): normalize TYPE_CHECKING import block in evaluation service

* docs(pr11): add critical audit findings for delete dedup and OpenAI timeouts

* fix(security): enforce runtime API-key guard for non-local requests

* fix(coordination): align lock and reload token paths across processes

* docs(pr11): record critical findings 14-15 and runtime security notes

* refactor: centralize dense upsert consistency flow and shared locking/client helpers

* test: add regression coverage for shared dense sync, manifest config, and OpenAI client fallback

* docs: update PR11 audit notes and include curso indices

* fix(security): fail closed when client host is unavailable

* fix(consistency): make dense delete paths lazy-load embedders

* fix(ingest): honor no-follow-symlinks for symlink directory inputs

* refactor(cli): centralize dense embedder construction across commands

* docs: clarify symlink ingest scope and dense delete behavior

* polish

* refactor(core): simplify lock orchestration and dense precompute flow

* test(app): cover blocking worker env parsing and execution path

* perf(ingest): avoid duplicate format detection and reuse precomputed result

* perf(text): precompile whitespace regex in preprocessing

* perf(ingest): batch rag-ingest mutations to reduce upsert/lock churn

* perf(sparse): cache docs in memory and remove per-query SQL get

* chore(lint): move NDArray import under TYPE_CHECKING

* perf(sparse): avoid duplicate SQL loads when building retrievers

* fix(security): enforce API key for forwarded non-local requests

* fix(consistency): harden multi-store script locking and fail-fast build-index

* fix(security): enforce API-key guard for RFC7239 Forwarded hosts

* fix(consistency): preflight vector mutability before SQL deletes

* fix(sqlite): tolerate duplicate-column races in identity migration

* ci: harden workflow gates and pin actions by SHA

* build: make Docker reproducible and enforce strict security target

* docs: document CI gates and contributor verification workflow

* refactor: centralize composition, harden multistore deletes, and split eval layering

* fix(ingest): handle unicode text detection and unreadable files

Treat non-ASCII UTF-8 text as textual content during loader detection by using Unicode-aware printability checks. This prevents valid multilingual .txt files from being misclassified as unknown and silently skipped by rag-ingest.

Also harden detect_file_format to fail soft on unreadable files (e.g. permission errors), returning unknown/read-error instead of raising and aborting the full ingestion run.

Add regression tests for:
- UTF-8 non-ASCII plain text detection
- read-head permission error handling
- end-to-end CLI ingest with non-ASCII text

Update README ingestion notes to document the behavior.
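
The detection behaviour described in this commit can be approximated as below. This is a sketch under assumptions: the function names, the 4096-byte head size, and the 0.95 printable-ratio threshold are hypothetical, and the project's detect_file_format may use different heuristics.

```python
import codecs


def looks_like_text(head, min_printable_ratio=0.95):
    """Heuristic: NUL-free, valid UTF-8, and mostly printable by Unicode rules."""
    if b"\x00" in head:
        return False
    try:
        # final=False tolerates a multi-byte character cut off at the end of the head.
        decoded = codecs.getincrementaldecoder("utf-8")().decode(head, final=False)
    except UnicodeDecodeError:
        return False
    if not decoded:
        return False
    printable = sum(1 for ch in decoded if ch.isprintable() or ch in "\n\r\t")
    return printable / len(decoded) >= min_printable_ratio


def sniff_file(path, head_bytes=4096):
    """Fail soft: unreadable files yield 'read-error' instead of raising."""
    try:
        with open(path, "rb") as handle:
            head = handle.read(head_bytes)
    except OSError:
        return "read-error"
    return "text" if looks_like_text(head) else "unknown"
```

`str.isprintable` follows Unicode character categories, so accented Latin text and CJK text count as printable rather than being mistaken for binary data.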

* refactor(composition): unify runtime wiring policy across api, cli, and scripts

Centralize adapter-selection policy in app.composition and reuse it from API factory, ask_eval retriever wiring, CLI dense embedder paths, and bootstrap/build-index scripts.

Changes:
- Add DEFAULT_DENSE_BACKEND_MESSAGE in composition.
- Make build_dense_embedder_from_settings use the shared default when message is omitted.
- Add build_retriever_with_default_embedder_from_settings to remove duplicated dense-embedder closure wiring.
- Add resolve_preferred_llm_provider and reuse it in app.factory.
- Update API/CLI/scripts to call shared composition policy helpers.
- Add unit tests for new composition helpers and provider policy.

This keeps existing monkeypatch seams in api/factory tests while consolidating policy decisions in one reusable module.

* refactor(app): extract docs and index use-cases from api router

Move business orchestration for /api/docs*, /api/docs/upsert, /api/docs/delete*, and /api/index/rebuild into app-layer services.

Changes:
- Add app/services/docs.py with sync use-cases for ingest/upsert/delete flows and typed summaries.
- Add app/services/index.py with rebuild_index_sync orchestration.
- Keep api_router transport-focused: request normalization, HTTP mapping, observability, and response serialization only.
- Preserve existing monkeypatch seams by injecting adapter factories/callbacks from api_router into services.
- Update architecture docs to reflect app/services docs/index layering.

Validation:
- ruff check .
- mypy .
- pytest -q (335 passed, coverage >= 85%).

* refactor(cli): partition commands by bounded context modules

- split monolithic cli.py into domain command modules: docs, index, eval, server
- keep cli.py as composition root + entrypoint wrappers + shared hooks
- preserve command surface/flags and existing monkeypatch seams
- add registry/entrypoint tests for command wiring
- update architecture docs with CLI layering

* refactor(factory): replace reload token file with DB-backed system_state version

- introduce SystemStateStorage adapter in SQLAlchemy persistence
- persist rag service cache version in system_state (key: rag_service)
- remove filesystem token invalidation from app factory
- keep process-local cache maxsize=1 while enabling cross-process invalidation via DB version
- add focused tests for factory invalidation and system_state storage behavior
- update architecture docs to reflect DB-backed invalidation
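
A sketch of version-based cache invalidation backed by a database row, in the spirit of the change described above. It uses plain `sqlite3` rather than SQLAlchemy, and the class name `VersionedServiceCache` and the table layout are assumptions; only the `system_state` table name and the `rag_service` key come from the commit message.

```python
import sqlite3


class VersionedServiceCache:
    """Process-local cache of one service, rebuilt when the stored version changes."""

    def __init__(self, conn, build_service, key="rag_service"):
        self._conn = conn
        self._build_service = build_service
        self._key = key
        self._cached_version = None  # None means nothing has been built yet
        self._service = None
        conn.execute(
            "CREATE TABLE IF NOT EXISTS system_state "
            "(key TEXT PRIMARY KEY, version INTEGER NOT NULL)"
        )
        conn.execute("INSERT OR IGNORE INTO system_state (key, version) VALUES (?, 0)", (key,))
        conn.commit()

    def _current_version(self):
        row = self._conn.execute(
            "SELECT version FROM system_state WHERE key = ?", (self._key,)
        ).fetchone()
        return row[0]

    def get(self):
        """Return the cached service, rebuilding it if another writer bumped the version."""
        version = self._current_version()
        if self._cached_version != version:
            self._service = self._build_service()
            self._cached_version = version
        return self._service

    def invalidate(self):
        """Bump the shared version so every process rebuilds on its next get()."""
        with self._conn:
            self._conn.execute(
                "UPDATE system_state SET version = version + 1 WHERE key = ?", (self._key,)
            )
```

Each worker keeps its own cache object, but all of them read the same version row, so a mutation in one process is noticed by the others without a token file on disk.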

* refactor(blocking): add task-type pools with explicit pending limits

- partition run_blocking work into task types: default, mutation, network, eval
- add per-type worker and queue limits via env-configurable policies
- enforce max pending per task type to prevent unbounded contention
- route API run_blocking calls with explicit task_type labels
- extend blocking and multistore tests for routing, queue-full behavior, and slot release guarantees
- document async/sync boundary policy in architecture guide
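
One way to picture per-task-type pools with a pending limit is a thread pool guarded by a semaphore. Everything named here (`BoundedPool`, `QueueFullError`, `POOLS`, and the numeric limits) is hypothetical; the commit does not show the actual policy values or class layout.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class QueueFullError(RuntimeError):
    """Raised when a task type already has its maximum number of in-flight tasks."""


class BoundedPool:
    def __init__(self, workers, max_pending):
        self._executor = ThreadPoolExecutor(max_workers=workers)
        # One slot per running task plus one per queued task.
        self._slots = threading.BoundedSemaphore(workers + max_pending)

    def submit(self, fn, *args):
        if not self._slots.acquire(blocking=False):
            raise QueueFullError("too many pending tasks for this task type")
        try:
            future = self._executor.submit(fn, *args)
        except BaseException:
            self._slots.release()
            raise
        # Release the slot whether the task succeeds, fails, or is cancelled.
        future.add_done_callback(lambda _f: self._slots.release())
        return future


# Assumed limits, for illustration only.
POOLS = {
    "default": BoundedPool(workers=4, max_pending=16),
    "mutation": BoundedPool(workers=1, max_pending=8),
    "network": BoundedPool(workers=8, max_pending=32),
    "eval": BoundedPool(workers=1, max_pending=2),
}
```

Rejecting work once the slots are exhausted keeps a burst of one task type from building an unbounded queue or starving the others.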

* test(app): add mandatory chaos scenarios for mutating operations

- add chaos coverage for vector write failure during upsert
- add lock acquisition failure scenario for mutating upsert path
- reproduce crash window: SQL committed while vector upsert+rebuild fail
- add rebuild-index failure scenario with cache invalidation assertion

* refactor(app): introduce docs/index application ports to reduce infra coupling

- add app-level ports module with DocsMutationPorts and IndexMutationPorts contracts
- refactor docs and index use-cases to depend on port bundles instead of infra callables
- centralize docs/index dependency wiring in api_router via _docs_mutation_ports/_index_mutation_ports
- keep existing monkeypatch seams intact by resolving router symbols at runtime
- add wiring tests for docs/index ports binding
- update architecture docs to document new app ports layer

* fix(security): fail closed on ambiguous proxy forwarding chains

* fix(api): return 502 for malformed openrouter responses

* fix(cli): validate duplicate external_id before embedding work

* refactor(app): split transport routers and enforce typed provider errors

Introduce typed cross-layer LLM errors in core and remove FastAPI exceptions from infra adapters.

Add app-layer HTTP error mapping, extract OpenRouter into dedicated router/service, and move docs/index mutation port wiring into reusable builders for API/CLI.

Expand tests for error mapping, OpenRouter service behavior, mutation port builders, and updated adapter/router mappings.

* refactor(cli): reuse app docs services and add multiprocess lock regression

Route docs CLI mutations through app services and shared mutation port builders to remove duplicated orchestration logic.

Add real spawn-based integration coverage for cross-process write-lock serialization and update architecture docs for bounded routers and error-layer boundaries.

* refactor(api): reduce root router to composition and split bounded routers

Extract health, rag, docs and index endpoints into dedicated router modules and move runtime wiring/composition helpers to app/wiring.py.

Migrate ingest schemas into app/schemas and adjust unit tests to patch the new seams (wiring and bounded routers) while preserving endpoint behavior.

Update architecture docs to reflect api_router composition-only role and new router/wiring structure.

* hotfix-refactor: arch

* chore(rescue): snapshot mixed worktree before pr04 split

* update(ci): split branch glob patterns in CI for granular control over triggering; included release** pattern

* add(pre-commit): included pre-commit in  TO PROJ. deps

* config

* ci(security): pin safety v2 and use stable check command

* updated doc, repo structure

* fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution

* fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution

* fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution

* fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution
Intrinsical-AI added a commit that referenced this pull request Mar 3, 2026
* feat(frontend): split inline assets and serve via /assets

* security: protect API and tighten CORS defaults

* security: add request size limits

* persistence: harden FAISS index persistence

* scripts: add repo entrypoints and drop packaged FAQ

* monitoring: reduce metrics label cardinality

* runtime: invalidate cached RAG service across workers

* runtime: avoid thread offload in api endpoints

* docs: update configuration and changelog

* chore: remove obsolete PR description proposal

* feat(prompting): safe prompt template rendering

* feat(app): run blocking work in dedicated thread pool

* feat(security): require API key when binding publicly

* fix(sqlite): prevent document id reuse; migrate legacy schema

* feat(faiss): support delete/rebuild + safe persistence

* fix(etl): rollback vector index on ingestion failure

* feat(api): doc delete + index rebuild; improve readiness/offload

* fix(rag): make history persistence best-effort

* feat(app): enforce safe bind and migrate sqlite schema at startup

* chore(app): tighten rag service cache

* fix(llm): use safe prompt renderer

* docs: document delete/rebuild maintenance flows

* docs: added ROADMAP with next PRs, feats, steps and DOD

* update(llm): default ollama model set to gemma3:4b across codebase

* release(02-2026): merge stabilization (ops, ingestion, consistency, eval, hardening) (#31)

* feat(ops): add readiness/index diagnostics and richer rag-status

* test(ops): cover readiness drift/corruption and cli status

* docs: note strict /api/ready drift detection in dense/hybrid

* updated .gitignore - pr descs into /pr

* feat(sql): add document identity columns + best-effort sqlite migration

* test(sqlite): cover identity column migration, unique external_id, sha256 fill

* docs: document document identity fields (external_id, metadata, hash, timestamps)

* feat(docs): idempotent upsert by external_id (API/CLI)

* test(docs): cover upsert idempotency (api/sql/cli)

* docs: document docs upsert endpoint and rag-upsert-docs

* docs: update roadmap with stabilization/refactor PR and incorporate quick-wins

* refactor(structure): move api schemas/diagnostics and core helpers; keep shims

* refactor(imports): prefer new module locations (app.schemas, core.services, app.diagnostics)

* refactor(structure): drop root shims and enforce app/core module boundaries

* docs: update roadmap and architecture after removing root shims

* feat(ingestion): add txt/md/csv loaders with discovery + sniffing

- Add TextFileLoader and MarkdownLoader
- Add discover_files() limits (max files / bytes) for safe dir ingestion
- Add loader factory with best-effort format detection (extension + bytes heuristics, optional python-magic)
- Extend CSVLoader with delimiter sniffing and row_index metadata

* feat(cli): add rag-ingest (dir/file) with SQL+FAISS consistency

- Add rag-ingest CLI/entrypoint (files/dirs, limits, dry-run)
- Ingest via external_id upsert per chunk and delete stale chunks by source prefix
- Dense/hybrid: incremental vector updates, fallback rebuild on failure
- Add SqlDocumentStorage.list_ids_by_external_id_prefix() helper
- Add optional extra for python-magic

* test(ingestion): cover rag-ingest + loader detection/discovery

- CLI ingest: mixed dir, idempotency, and stale-chunk deletion
- Factory detection: prefer content heuristics over misleading extensions
- Discovery: enforce limits and handle broken symlinks
- Update CSVLoader unit tests for row_index metadata

* docs: document rag-ingest and optional python-magic

- README: add rag-ingest to CLI commands and mention magic extra
- custom usage guide: add a CLI ingestion section

* fix(ingest): avoid external_id prefix collisions when syncing stale chunks

Pass a delimiter-suffixed prefix when listing existing external_ids so /tmp/foo does not match /tmp/foo2.
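
The collision described above can be illustrated with a minimal sketch. The `#` delimiter and the `<source>#<chunk_index>` external_id layout are assumptions made for illustration; the commit does not state the actual delimiter.

```python
# Hypothetical layout: external_id = "<source path>#<chunk index>".
# The "#" delimiter is an assumption, not taken from the repository.
DELIMITER = "#"


def ids_for_source(external_ids, source):
    """Return only the external_ids that belong to exactly this source."""
    prefix = source + DELIMITER  # delimiter-suffixed prefix
    return [eid for eid in external_ids if eid.startswith(prefix)]


existing = ["/tmp/foo#0", "/tmp/foo#1", "/tmp/foo2#0"]

# A bare prefix match would wrongly include the chunk from /tmp/foo2.
naive = [eid for eid in existing if eid.startswith("/tmp/foo")]
scoped = ids_for_source(existing, "/tmp/foo")
```

With the bare prefix, `naive` contains all three ids; with the suffixed prefix, `scoped` keeps only the two chunks of `/tmp/foo`.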

* feat(ingestion): configurable cleaning + deterministic chunking (chars_v1)

- Add chunk_chars_v1 with stable boundaries (start/end offsets)
- Make preprocess_text configurable via keyword options
- Add settings flags for ingestion cleaning and chunk strategy
- Provide helpers to build preprocess/chunk functions from Settings
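
As a rough illustration of deterministic character chunking with stable start/end offsets, here is a minimal sketch. It is not the repository's chunk_chars_v1; the function name, the `Chunk` type and the `size`/`overlap` parameters are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    index: int
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    text: str


def chunk_chars(text: str, size: int, overlap: int = 0) -> list:
    """Split text into fixed-size windows; same input always gives same chunks."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        chunks.append(Chunk(len(chunks), start, end, text[start:end]))
        if end == len(text):
            break  # last window reached the end of the text
        start += step
    return chunks


chunks = chunk_chars("abcdefghij", size=4, overlap=1)
```

Because offsets depend only on the text and the two parameters, re-ingesting unchanged content reproduces the same chunk boundaries.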

* feat(ingest): store chunk metadata (chunk_index/offsets/parent_doc_id)

- rag-ingest now records chunk offsets and parent_doc_id in SQLite metadata
- bootstrap uses Settings-driven cleaning/chunking for determinism

* test(chunking): validate boundaries/overlap and ingestion metadata

- Add unit tests for chunk_chars_v1 determinism, boundaries and overlap
- Extend rag-ingest CLI test to assert prepared metadata (chunk_index, offsets, parent_doc_id)
- Add coverage for preprocess_text options

* feat(dedup): add chunk-level hash column + index and helper

- Add Settings.ingest_chunker_version to version the dedup key
- Add chunk_dedup_sha256 helper (sha256(cleaned_text + chunker_version + embedding_model_name))
- Extend documents schema with chunk_dedup_sha256 and a partial unique index
- Allow upsert to persist chunk_dedup_sha256 when provided
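
A minimal sketch of a versioned dedup key over the three inputs named above. The function name and the length-prefixed framing of each part are assumptions added here to keep part boundaries unambiguous; the commit only states that the hash covers cleaned_text, chunker_version and embedding_model_name.

```python
import hashlib


def chunk_dedup_key(cleaned_text: str, chunker_version: str, embedding_model_name: str) -> str:
    """Hash the three inputs into one hex digest."""
    digest = hashlib.sha256()
    for part in (cleaned_text, chunker_version, embedding_model_name):
        data = part.encode("utf-8")
        # Length prefix (an assumption of this sketch) so that
        # ("ab", "c") and ("a", "bc") cannot produce the same byte stream.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


key_v1 = chunk_dedup_key("hello world", "chars_v1", "model-a")
key_v2 = chunk_dedup_key("hello world", "chars_v2", "model-a")
```

Bumping the chunker version or the embedding model changes the key, so previously ingested chunks are treated as new content rather than duplicates.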

* feat(api): dedup /api/docs ingestion via chunk hash upsert

- Clean + chunk texts using Settings-driven pipeline
- Compute external_id as chunk hash (includes chunker_version + embedding model)
- Upsert into SQLite and update FAISS incrementally in dense/hybrid mode

* test(dedup): cover hash-based /api/docs idempotency and SQL constraint

- Verify /api/docs dedup is idempotent across re-ingests
- Verify changing ingest_chunker_version produces new inserts
- Assert chunk_dedup_sha256 unique index is enforced

* docs: document chunker version and dedup column

- Add chunking strategy/version env vars to README and .env.example
- Mention chunk_dedup_sha256 schema column

* fix(db): run SQLite compatibility migrations for CLI/scripts

CLI/scripts can be executed without starting FastAPI, so ensure the same best-effort SQLite migrations (AUTOINCREMENT + identity columns) run before using ORM-mapped queries.

* feat(delete): add tombstones and delete-by-external-id in SQL repo

- Add DocumentTombstone table (external_id unique)
- Add SqlDocumentStorage delete_by_external_ids + tombstone helpers
- Tombstones prevent deleted identities from reappearing on future ingestions
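
The tombstone behaviour can be sketched with an in-memory stand-in. The class and method names below are hypothetical; the real implementation uses a DocumentTombstone SQL table.

```python
class TombstonedStore:
    """In-memory stand-in for a document table plus a tombstone table."""

    def __init__(self):
        self.docs = {}
        self.tombstones = set()

    def upsert(self, external_id: str, content: str) -> bool:
        # A tombstoned identity is never accepted again.
        if external_id in self.tombstones:
            return False
        self.docs[external_id] = content
        return True

    def delete_by_external_ids(self, external_ids) -> int:
        deleted = 0
        # dict.fromkeys de-duplicates while preserving input order.
        for external_id in dict.fromkeys(external_ids):
            self.tombstones.add(external_id)
            if self.docs.pop(external_id, None) is not None:
                deleted += 1
        return deleted


store = TombstonedStore()
store.upsert("faq-1", "first version")
removed = store.delete_by_external_ids(["faq-1", "faq-1"])
accepted_again = store.upsert("faq-1", "re-ingested")
```

Here the second upsert is rejected, which is the property that keeps deleted identities from reappearing on later ingestions.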

* feat(api): delete by external_id with tombstones

- Add /api/docs/delete_by_external_id endpoint
- Filter tombstoned external_ids from /api/docs ingestion so deletes don't reappear
- Reject tombstoned external_ids on /api/docs/upsert

* feat(cli): rag-delete-external-ids

- Add CLI command delete-external-ids (SQL + FAISS when dense/hybrid)
- Respect tombstones during rag-ingest and block upserts for tombstoned external_ids
- Add project script entrypoint rag-delete-external-ids

* test(delete): cover external_id delete (tombstones) + ask_eval + rebuild

- Sparse: delete external_id prevents re-ingest and removes results from ask_eval
- Dense: delete external_id updates vector store and survives rebuild-index

* docs: document delete-by-external-id (API/CLI)

- Add rag-delete-external-ids and /api/docs/delete_by_external_id to usage guides

* feat(index-manifest): persist manifest and report drift in ready/status

* chore(mypy): avoid optional python-magic typing issues

* test(index-manifest): cover manifest drift and rebuild recovery

* docs(index-manifest): document index_manifest and drift detection

* test(index-manifest): unit coverage for manifest helpers

* feat(reranker): optional overlap reranker behind settings

* feat(observability): structured logs + domain metrics for ingest/query

* feat(eval): add rag-eval CLI with versioned dataset and regression gate

* test(eval,reranker): cover offline eval gate and reranking behavior

* docs(pr10): document rag-eval, monitoring, and optional reranker

* PR11: Transactional consistency and concurrency hardening (SQL + index) (#27)

* fix(consistency): preflight manifest drift before index mutations

* fix(maintenance): raise explicit multi-store inconsistency error on double failure

* ci: add windows smoke tests for faiss persistence and consistency paths

* fix: prevent SQL/vector drift when embedding fails

Precompute dense/hybrid embeddings before SQL upsert in API and CLI flows so provider failures cannot leave partially committed SQL without vector updates.

Add SqlDocumentStorage.get_existing_doc_states_by_external_id to embed only inserts/content updates (preserving idempotent no-op behavior).

Add regression tests for /api/docs, /api/docs/upsert, rag-upsert-docs, and rag-ingest to verify no SQL persistence on embedding failures.
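
The ordering described in this commit can be sketched as follows. The function name and the dict-based stores are illustrative assumptions; the point is only that the fallible embedding call runs before any store is written.

```python
def upsert_with_precomputed_vectors(items, embed, sql_store, vector_store):
    """items: list of (external_id, text); embed: callable on a list of texts."""
    texts = [text for _, text in items]
    # Step 1: embed first. If the provider fails here, nothing has been written.
    vectors = embed(texts)
    if len(vectors) != len(items):
        raise RuntimeError("embedder returned an unexpected number of vectors")
    # Step 2: only now touch the stores.
    for (external_id, text), vector in zip(items, vectors):
        sql_store[external_id] = text
        vector_store[external_id] = vector


def failing_embedder(texts):
    raise ConnectionError("embedding provider unavailable")


sql_store, vector_store = {}, {}
try:
    upsert_with_precomputed_vectors([("doc-1", "hello")], failing_embedder, sql_store, vector_store)
except ConnectionError:
    pass
```

After the failed call both `sql_store` and `vector_store` are still empty, so no drift is introduced.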

* fix: harden rag service reload token writes

Use unique temp files + fsync when writing .rag_service_reload_token to avoid .tmp path collisions across workers.

Add regression test ensuring _write_reload_token does not reuse temp source paths across consecutive writes.
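
A minimal sketch of the unique-temp-file plus fsync pattern, using only the standard library. The function name is hypothetical and this is not the repository's _write_reload_token.

```python
import os
import tempfile


def write_token_atomically(path: str, token: str) -> None:
    """Write token to path via a unique temp file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp picks a fresh name on every call, so concurrent writers
    # never share a temp path.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".token-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(token)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic rename over the destination
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Keeping the temp file in the destination directory matters because `os.replace` is only atomic within a single filesystem.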

* docs(pr): add PR11 hardening consistency description

* fix(api): serialize multi-store writes and harden generation params

Prevent concurrent mutating API operations from interleaving SQL and FAISS updates by running write paths under a shared cross-process lock.

Add strict bounds for generation sampling params (temperature/top_p/max_tokens) in OpenRouter requests and AskEval top_p validation.

Add regression tests for concurrent upsert consistency and invalid sampling parameter rejection.
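
A minimal sketch of bounds checking for sampling parameters, using the ranges listed in the PR description (temperature in [0.0, 2.0], top_p in [0.0, 1.0], max_tokens in [1, 4096]). The class name and default values are assumptions, and a plain dataclass stands in for whatever request model the project actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingParams:
    # Defaults are illustrative assumptions, not the project's values.
    temperature: float = 0.7
    top_p: float = 1.0
    max_tokens: int = 256

    def __post_init__(self):
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be in [0.0, 2.0]")
        if not 0.0 <= self.top_p <= 1.0:
            raise ValueError("top_p must be in [0.0, 1.0]")
        if not 1 <= self.max_tokens <= 4096:
            raise ValueError("max_tokens must be in [1, 4096]")


def is_valid(**kwargs) -> bool:
    """Return True when the given parameters pass validation."""
    try:
        SamplingParams(**kwargs)
    except ValueError:
        return False
    return True
```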

* chore: fix import ordering and include roadmap updates

* fix(api,lock): prevent duplicate ingest pass and fail closed on lock acquisition

* fix(cli): serialize mutating commands with shared multi-store write lock

* docs(pr11): include integral audit findings, fixes, and verification

* fix(faiss): fail closed on missing manifest and lock acquisition

* fix(api,cli): always invalidate cached rag service after mutating attempts

* chore(lint): align isort with ruff and normalize import ordering

* docs(pr11): document new critical audit findings and validations

* fix(delete): deduplicate external_ids to prevent tombstone integrity failures

* fix(openai): enforce request timeouts with compatibility fallback

* chore(format): normalize TYPE_CHECKING import block in evaluation service

* docs(pr11): add critical audit findings for delete dedup and OpenAI timeouts

* fix(security): enforce runtime API-key guard for non-local requests

* fix(coordination): align lock and reload token paths across processes

* docs(pr11): record critical findings 14-15 and runtime security notes

* refactor: centralize dense upsert consistency flow and shared locking/client helpers

* test: add regression coverage for shared dense sync, manifest config, and OpenAI client fallback

* docs: update PR11 audit notes and include curso indices

* fix(security): fail closed when client host is unavailable

* fix(consistency): make dense delete paths lazy-load embedders

* fix(ingest): honor no-follow-symlinks for symlink directory inputs

* refactor(cli): centralize dense embedder construction across commands

* docs: clarify symlink ingest scope and dense delete behavior

* polish

* refactor(core): simplify lock orchestration and dense precompute flow

* test(app): cover blocking worker env parsing and execution path

* perf(ingest): avoid duplicate format detection and reuse precomputed result

* perf(text): precompile whitespace regex in preprocessing

* perf(ingest): batch rag-ingest mutations to reduce upsert/lock churn

* perf(sparse): cache docs in memory and remove per-query SQL get

* chore(lint): move NDArray import under TYPE_CHECKING

* perf(sparse): avoid duplicate SQL loads when building retrievers

* fix(security): enforce API key for forwarded non-local requests

* fix(consistency): harden multi-store script locking and fail-fast build-index

* fix(security): enforce API-key guard for RFC7239 Forwarded hosts

* fix(consistency): preflight vector mutability before SQL deletes

* fix(sqlite): tolerate duplicate-column races in identity migration

* ci: harden workflow gates and pin actions by SHA

* build: make Docker reproducible and enforce strict security target

* docs: document CI gates and contributor verification workflow

* refactor: centralize composition, harden multistore deletes, and split eval layering

* fix(ingest): handle unicode text detection and unreadable files

Treat non-ASCII UTF-8 text as textual content during loader detection by using Unicode-aware printability checks. This prevents valid multilingual .txt files from being misclassified as unknown and silently skipped by rag-ingest.

Also harden detect_file_format to fail soft on unreadable files (e.g. permission errors), returning unknown/read-error instead of raising and aborting the full ingestion run.

Add regression tests for:
- UTF-8 non-ASCII plain text detection
- read-head permission error handling
- end-to-end CLI ingest with non-ASCII text

Update README ingestion notes to document the behavior.
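
One way to make a text heuristic Unicode-aware is sketched below. The function name and the 0.95 threshold are assumptions; the repository's detection logic may differ.

```python
import codecs


def looks_like_text(head: bytes, min_printable_ratio: float = 0.95) -> bool:
    """Guess whether the first bytes of a file are UTF-8 text."""
    if b"\x00" in head:
        return False  # NUL bytes are a strong binary signal
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the end
        # of the sampled head instead of raising.
        text = decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return False
    if not text:
        return False
    printable = sum(1 for ch in text if ch.isprintable() or ch in "\n\r\t")
    return printable / len(text) >= min_printable_ratio
```

`str.isprintable` works on Unicode categories, so accented and CJK characters count as printable where an ASCII-only byte check would reject them.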

* refactor(composition): unify runtime wiring policy across api, cli, and scripts

Centralize adapter-selection policy in app.composition and reuse it from API factory, ask_eval retriever wiring, CLI dense embedder paths, and bootstrap/build-index scripts.

Changes:
- Add DEFAULT_DENSE_BACKEND_MESSAGE in composition.
- Make build_dense_embedder_from_settings use the shared default when message is omitted.
- Add build_retriever_with_default_embedder_from_settings to remove duplicated dense-embedder closure wiring.
- Add resolve_preferred_llm_provider and reuse it in app.factory.
- Update API/CLI/scripts to call shared composition policy helpers.
- Add unit tests for new composition helpers and provider policy.

This keeps existing monkeypatch seams in api/factory tests while consolidating policy decisions in one reusable module.

* refactor(app): extract docs and index use-cases from api router

Move business orchestration for /api/docs*, /api/docs/upsert, /api/docs/delete*, and /api/index/rebuild into app-layer services.

Changes:
- Add app/services/docs.py with sync use-cases for ingest/upsert/delete flows and typed summaries.
- Add app/services/index.py with rebuild_index_sync orchestration.
- Keep api_router transport-focused: request normalization, HTTP mapping, observability, and response serialization only.
- Preserve existing monkeypatch seams by injecting adapter factories/callbacks from api_router into services.
- Update architecture docs to reflect app/services docs/index layering.

Validation:
- ruff check .
- mypy .
- pytest -q (335 passed, coverage >= 85%).

* refactor(cli): partition commands by bounded context modules

- split monolithic cli.py into domain command modules: docs, index, eval, server
- keep cli.py as composition root + entrypoint wrappers + shared hooks
- preserve command surface/flags and existing monkeypatch seams
- add registry/entrypoint tests for command wiring
- update architecture docs with CLI layering

* refactor(factory): replace reload token file with DB-backed system_state version

- introduce SystemStateStorage adapter in SQLAlchemy persistence
- persist rag service cache version in system_state (key: rag_service)
- remove filesystem token invalidation from app factory
- keep process-local cache maxsize=1 while enabling cross-process invalidation via DB version
- add focused tests for factory invalidation and system_state storage behavior
- update architecture docs to reflect DB-backed invalidation

* refactor(blocking): add task-type pools with explicit pending limits

- partition run_blocking work into task types: default, mutation, network, eval
- add per-type worker and queue limits via env-configurable policies
- enforce max pending per task type to prevent unbounded contention
- route API run_blocking calls with explicit task_type labels
- extend blocking and multistore tests for routing, queue-full behavior, and slot release guarantees
- document async/sync boundary policy in architecture guide
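
A minimal sketch of per-task-type pools with a pending limit follows. The class names, the `QueueFullError` exception and the numeric limits are assumptions; only the four task types come from the commit message.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class QueueFullError(RuntimeError):
    """Raised when a task type already has its maximum pending work."""


class BoundedPool:
    def __init__(self, workers: int, max_pending: int):
        self._executor = ThreadPoolExecutor(max_workers=workers)
        self._max_pending = max_pending
        self._pending = 0
        self._lock = threading.Lock()

    def submit(self, fn, *args):
        with self._lock:
            if self._pending >= self._max_pending:
                raise QueueFullError("too many pending tasks")
            self._pending += 1
        try:
            future = self._executor.submit(fn, *args)
        except BaseException:
            self._release()
            raise
        # Release the slot whether the task succeeds or fails.
        future.add_done_callback(lambda _f: self._release())
        return future

    def _release(self):
        with self._lock:
            self._pending -= 1


# Hypothetical limits per task type; real values would come from settings.
POOLS = {
    "default": BoundedPool(workers=4, max_pending=64),
    "mutation": BoundedPool(workers=1, max_pending=8),
    "network": BoundedPool(workers=8, max_pending=64),
    "eval": BoundedPool(workers=1, max_pending=2),
}
```

Rejecting at submit time keeps a burst of one task type from queueing without bound; callers can map `QueueFullError` to a retryable response.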

* test(app): add mandatory chaos scenarios for mutating operations

- add chaos coverage for vector write failure during upsert
- add lock acquisition failure scenario for mutating upsert path
- reproduce crash window: SQL committed while vector upsert+rebuild fail
- add rebuild-index failure scenario with cache invalidation assertion

* refactor(app): introduce docs/index application ports to reduce infra coupling

- add app-level ports module with DocsMutationPorts and IndexMutationPorts contracts
- refactor docs and index use-cases to depend on port bundles instead of infra callables
- centralize docs/index dependency wiring in api_router via _docs_mutation_ports/_index_mutation_ports
- keep existing monkeypatch seams intact by resolving router symbols at runtime
- add wiring tests for docs/index ports binding
- update architecture docs to document new app ports layer

* fix(security): fail closed on ambiguous proxy forwarding chains

* fix(api): return 502 for malformed openrouter responses

* fix(cli): validate duplicate external_id before embedding work

* refactor(app): split transport routers and enforce typed provider errors

Introduce typed cross-layer LLM errors in core and remove FastAPI exceptions from infra adapters.

Add app-layer HTTP error mapping, extract OpenRouter into dedicated router/service, and move docs/index mutation port wiring into reusable builders for API/CLI.

Expand tests for error mapping, OpenRouter service behavior, mutation port builders, and updated adapter/router mappings.

* refactor(cli): reuse app docs services and add multiprocess lock regression

Route docs CLI mutations through app services and shared mutation port builders to remove duplicated orchestration logic.

Add real spawn-based integration coverage for cross-process write-lock serialization and update architecture docs for bounded routers and error-layer boundaries.

* refactor(api): reduce root router to composition and split bounded routers

Extract health, rag, docs and index endpoints into dedicated router modules and move runtime wiring/composition helpers to app/wiring.py.

Migrate ingest schemas into app/schemas and adjust unit tests to patch the new seams (wiring and bounded routers) while preserving endpoint behavior.

Update architecture docs to reflect api_router composition-only role and new router/wiring structure.

* hotfix-refactor: arch

* chore(rescue): snapshot mixed worktree before pr04 split

* update(ci): split branch glob patterns in CI for granular control over triggering; include release** pattern

* add(pre-commit): include pre-commit in the project deps

* config

* ci(security): pin safety v2 and use stable check command

* updated doc, repo structure

* fix(working): fix pre-commit + CI + bandit (sec) triggering and execution

* fix(working): fix pre-commit + CI + bandit (sec) triggering and execution

* fix(working): fix pre-commit + CI + bandit (sec) triggering and execution

* fix(working): fix pre-commit + CI + bandit (sec) triggering and execution

* clean: README

* release: readability/quick improvements (#32)

* refactor: replace assert guards with explicit RuntimeError + low-risk cleanups

- Replace all production `assert x is not None` with explicit `if x is None: raise RuntimeError(...)`
  across sql_.py, index.py, composition.py, factory.py, docs_ingest.py — asserts are
  disabled under `python -O`; the new form is visible in all execution modes.

- Extract `_to_domain_document()` module-level helper in sql_.py; eliminates duplicate
  11-line ORM→domain mapping in `get()` and `get_all_documents()`.

- Replace `getattr(db_doc, "field", None)` fallbacks with direct attribute access in
  sql_.py; all columns are explicitly mapped in Document ORM model — getattr implied
  optional columns that are guaranteed present.

- Add `FaissIndex._locked_write()` context-manager abstracting the repeated pattern
  `with self._state_lock, _exclusive_file_lock(self._lock_path)` (5 call sites → 1 definition).

- Add `Settings.ingest_batch_size` field (default 64, env-overridable via INGEST_BATCH_SIZE);
  removes magic number hardcoded in `_execute_ingest_batches`.

All changes are behaviour-preserving. lint/mypy clean.
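
The assert-to-RuntimeError change in the first bullet can be shown with a small sketch. The function and parameter names are hypothetical.

```python
def require_index(index):
    """Return the index, failing loudly in every interpreter mode."""
    # An `assert index is not None` here would be removed under `python -O`,
    # letting None flow on and fail later with a less helpful error.
    if index is None:
        raise RuntimeError("vector index is not initialised")
    return index


def safe_describe(index) -> str:
    try:
        return f"index ready: {require_index(index)!r}"
    except RuntimeError as exc:
        return f"error: {exc}"
```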

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(sql): extract _DocumentChanges + _detect_document_changes, add change-variant tests

Motivation: the upsert change detection block mixed four independent boolean checks
inline — hard to test in isolation and easy to introduce regressions when adding a
new field.

Changes:
- Add module-level `_DocumentChanges` frozen dataclass with `any_changed` property.
- Add module-level `_detect_document_changes(db_doc, item, new_content, new_sha)`
  function, placed after `SqlDocumentStorage` so `SqlDocumentStorage.UpsertDoc`
  is already defined in scope.
- Replace the 14-line inline detection block in `upsert_documents_by_external_id`
  with a single `changes = _detect_document_changes(...)` call.
- Remove now-redundant local `content_changed` variable; reference `changes.content`.

Tests added (test_sql_upsert_documents_by_external_id.py):
- `test_upsert_metadata_only_change` — metadata diff → action=updated, content_changed=False
- `test_upsert_source_only_change`   — source diff   → action=updated, content_changed=False
- `test_upsert_dedup_only_change`    — dedup diff     → action=updated, content_changed=False

These variants were not covered; they exercise all four branches of `_DocumentChanges`.
lint/mypy clean, 6/6 upsert tests pass.
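
A minimal sketch of the change-detection shape described above. The `content` field and `any_changed` property follow the commit text; the other field names and the dict-based inputs are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentChanges:
    content: bool
    metadata: bool
    source: bool
    dedup: bool

    @property
    def any_changed(self) -> bool:
        return self.content or self.metadata or self.source or self.dedup


def detect_document_changes(stored: dict, incoming: dict) -> DocumentChanges:
    """Compare each field independently (dict keys are hypothetical)."""
    return DocumentChanges(
        content=stored["sha256"] != incoming["sha256"],
        metadata=stored["metadata"] != incoming["metadata"],
        source=stored["source"] != incoming["source"],
        dedup=stored["dedup_sha256"] != incoming["dedup_sha256"],
    )


stored = {"sha256": "a", "metadata": {"k": 1}, "source": "s", "dedup_sha256": "d"}
metadata_only = detect_document_changes(stored, {**stored, "metadata": {"k": 2}})
unchanged = detect_document_changes(stored, dict(stored))
```

Returning a small value object makes each branch testable on its own, which is what the three added test variants rely on.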

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(ingest): replace tuple aliases IngestPlan + BatchSyncResult with frozen dataclasses

Motivation: positional tuple unpacking of 4 and 7 fields respectively makes call sites
opaque — a field reorder would silently produce wrong behaviour.

Changes:
- Add `from dataclasses import dataclass` import.
- Replace `IngestPlan = tuple[Path, str, tuple[Any, ...], tuple[str, ...]]` with
  `@dataclass(frozen=True) class IngestPlan` with named fields
  (file_path, file_prefix, items, desired_external_ids).
- Replace `BatchSyncResult = tuple[int, int, int, bool, int, int, list[...]]` with
  `@dataclass(frozen=True) class BatchSyncResult` with named fields
  (inserted, updated, unchanged, rebuilt, deleted_stale, ingested_chunks, stale_by_file).
- Update `_build_file_ingest_plan`: return `IngestPlan(...)` instead of bare tuple.
- Update `_build_ingest_plans`: `plan[2]` → `plan.items`.
- Update `_collect_batch_items_and_stale`: replace 4-way destructuring `for a,b,c,d in`
  with `for plan in` + attribute access.
- Update `_ingest_batch_sync`: return `BatchSyncResult(...)` instead of 7-tuple.
- Update `_execute_ingest_batches`: replace 7-way destructuring with `batch = ...; batch.field`.

Behaviour: identical. lint/mypy clean. 82 ingest-related unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(factory)+docs(settings): consolidate PR-A/PR-D and runtime defaults

Includes all pending modified files in this branch, as requested.

- PR-A: extract _collect_container_overrides() from _build_container() in app/factory.py while preserving monkeypatch seams, override keys, and resolution order.

- PR-D: formalize INGEST_BATCH_SIZE in .env.example and README (config table + ingestion tuning section).

- Add settings validator tests for ingest_batch_size bounds: accepts 1/512 and rejects 0/513.

- Consolidate Ollama default model/docs updates to lfm2.5-thinking across settings, README, docker-compose comment, and custom usage guide.
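
The ingest_batch_size bounds exercised by the validator tests (1 and 512 accepted, 0 and 513 rejected) can be sketched as below. The function names are hypothetical and a plain function stands in for the settings validator.

```python
MIN_INGEST_BATCH_SIZE = 1
MAX_INGEST_BATCH_SIZE = 512


def validate_ingest_batch_size(value: int) -> int:
    if not MIN_INGEST_BATCH_SIZE <= value <= MAX_INGEST_BATCH_SIZE:
        raise ValueError(
            f"ingest_batch_size must be in [{MIN_INGEST_BATCH_SIZE}, {MAX_INGEST_BATCH_SIZE}]"
        )
    return value


def batched(items: list, batch_size: int) -> list:
    """Split items into consecutive batches of at most batch_size."""
    size = validate_ingest_batch_size(batch_size)
    return [items[i:i + size] for i in range(0, len(items), size)]


def accepts(value: int) -> bool:
    try:
        validate_ingest_batch_size(value)
    except ValueError:
        return False
    return True
```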

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* version: semantic versioning alignment for release v1.0

* fix(core): clean prompt template wording regression

* refactor(core): split service DTOs into types/results

* feat(infra): add sql/vector/shared persistence adapters

* refactor(app): wire vector/sql adapters and drop legacy faiss/sqlalchemy paths

* test(refactor): migrate suites to sql/vector persistence layout

* docs(architecture): update vector/sql module boundaries and usage

* chore(lock): sync uv.lock project version metadata

* chore(docs): normalize whitespace in app architecture note

* fix(ci): updated windows smoke tests paths for ci workflow

* fix(ci): updated windows smoke tests paths for ci workflow

* fix(norecurse): added uv_cache

* refactor(core): introduce DocId and lineage domain contracts

* refactor(core-services): propagate DocId through orchestration and retrieval

* refactor(sql): move to doc_id string primary key and string history source_ids

* chore(sql): enforce fresh-install-only schema bootstrap without legacy migration

* refactor(vector): store DocId strings in id_map and vector repository interfaces

* feat(ingestion): require lineage in LoadedItem and emit lineage across loaders

* refactor(app): expose string document IDs across services, schemas, routers and diagnostics

* refactor(cli): switch docs operations and batch ingestion flows to string DocIds

* test: update suites for DocId string contract and fresh-install SQL behavior

* docs: document fresh-install-only policy, string IDs, and lineage requirement

* fix(sql): preserve input order in document get(ids)
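
A minimal sketch of order-preserving lookup, assuming the database may return rows in any order. The function name, the `fetch_rows` callable and the `doc_id` key are illustrative assumptions.

```python
def get_in_input_order(ids, fetch_rows):
    """Return rows in the order of `ids`, skipping ids that were not found."""
    rows_by_id = {row["doc_id"]: row for row in fetch_rows(ids)}
    return [rows_by_id[doc_id] for doc_id in ids if doc_id in rows_by_id]


def fake_fetch(ids):
    # Simulates an IN (...) query that returns rows sorted by primary key.
    table = {"a": "alpha", "b": "beta", "c": "gamma"}
    return [{"doc_id": k, "content": table[k]} for k in sorted(ids) if k in table]


ordered = get_in_input_order(["c", "a", "missing", "b"], fake_fetch)
```

This matters for retrieval, where the caller's id order usually encodes ranking.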

* build(uv): standardize dependency groups and lock policy

* feat(core): harden write-lock and error guarantees

Introduce explicit write-lock timeout handling and map it to API-level 503 responses.

Key changes:

- add WriteLockTimeoutError to core error taxonomy

- enforce timeout-aware file/multi-store write lock behavior

- expose lock/recovery tuning in settings for deterministic ops behavior

- remove silent rollback swallowing in ETL and surface incomplete rollback as runtime error

- bound ingestion pipeline memory via configurable batched flushes
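
A minimal sketch of a timeout-aware file lock. The `timeout_s` and `poll_s` parameter names and `WriteLockTimeoutError` appear in this log; the `O_EXCL` lock-file technique, the base class and the function name are assumptions chosen for portability, not the repository's implementation.

```python
import contextlib
import os
import time


class WriteLockTimeoutError(TimeoutError):
    """Raised when the write lock cannot be acquired in time."""


@contextlib.contextmanager
def exclusive_file_lock(path, *, timeout_s: float = 5.0, poll_s: float = 0.05):
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # O_EXCL makes creation fail if another holder already made the file.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise WriteLockTimeoutError(f"could not lock {path}") from None
            time.sleep(poll_s)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(path)
```

An API layer can then translate `WriteLockTimeoutError` into a 503 so clients retry instead of waiting indefinitely. Note that a lock file of this kind can be left behind by a crashed holder, which an OS-level lock would avoid.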

* feat(mutation): add durable multi-store saga coordinator

Implement a single mutation coordinator for SQL+vector writes with incremental delta semantics and durable journal state machine.

Key changes:

- add storage profile capabilities and enforce write floor at DURABLE_SAGA

- add file-backed mutation journal with PREPARED/COMMITTED/FAILED recovery states

- extend SQL storage with before-image snapshot and deterministic restore helpers

- extend vector contract with apply_delta_atomic for delete+upsert as one operation

- wire recovery on startup plus background loop, and expose incomplete-journal diagnostics/health warnings
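
The journal state machine can be sketched with one JSON file per mutation. The three state names come from the commit; the class, file layout and method names are assumptions for illustration.

```python
import json
from enum import Enum
from pathlib import Path


class JournalState(str, Enum):
    PREPARED = "PREPARED"
    COMMITTED = "COMMITTED"
    FAILED = "FAILED"


class MutationJournal:
    def __init__(self, directory):
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)

    def record(self, mutation_id: str, state: JournalState) -> None:
        payload = {"id": mutation_id, "state": state.value}
        (self._dir / f"{mutation_id}.json").write_text(json.dumps(payload), encoding="utf-8")

    def incomplete(self) -> list:
        """Mutations still PREPARED: candidates for recovery on startup."""
        pending = []
        for path in sorted(self._dir.glob("*.json")):
            entry = json.loads(path.read_text(encoding="utf-8"))
            if entry["state"] == JournalState.PREPARED.value:
                pending.append(entry["id"])
        return pending
```

A coordinator would record PREPARED before touching either store and COMMITTED or FAILED afterwards, so anything still PREPARED after a crash is known to need repair.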

* refactor(app): collapse orchestration layers and cut legacy write surface

Remove the ambiguous app/services layer and move orchestration to explicit application use-cases + coordinator path.

Key changes:

- replace app/services with app/contracts + app/wiring + application use-case modules

- rename RAG modules by responsibility (rag_query_use_case, rag_router, rag_api_models, rag_runtime)

- make /api/docs/mutate the canonical write endpoint and remove legacy upsert/delete endpoints

- align CLI to rag-mutate-docs + rag-ingest + rag-rebuild-index and drop legacy delete/upsert commands

- update script entry points to match the breaking CLI surface

* test(architecture): enforce layer boundaries and unified mutation path

Update tests to the new write contract and add guardrails so architecture drift fails fast.

Key changes:

- add layer-boundary checks (no app.services imports; no invalid cross-layer dependencies)

- add explicit mutation boundary tests ensuring HTTP/CLI writes terminate in MutationCoordinator.execute

- update endpoint/CLI tests for removed legacy routes and commands

- refresh RAG/ingest/index tests for renamed modules and new contracts

* docs(release): update architecture and usage for v1.0 mutation model

Document the final v1.0 cut: durable saga writes, unified mutation surface, and explicit repair workflows.

Key changes:

- rewrite architecture docs around single application orchestration layer and storage profiles

- update custom usage guide with canonical mutate/ingest/rebuild flows

- align docs navigation and README with breaking HTTP/CLI changes

- ignore local mutation journal artifacts under data/.mutation_journal

* refactor(factory): remove dynamic override collector and wire AppContainer explicitly

* refactor(cli-structure): move docs commands into docs package namespace

* refactor(ingestion): remove legacy script wrappers and call sample_data_ingestion directly

* refactor(index-wiring): drop BuildIndexPorts and deprecated build_index_sync path

* feat!(cli): unify bootstrap/build-index into bootstrap and remove rag-build-index

* chore(app): remove unused build_rag_service re-export from dependencies

* fix(tests): make asgi_client depend on in_memory_sqlite fixture

* fix(router): replace unsafe assert with RuntimeError in openrouter generate

assert statements are stripped by python -O; a failed isinstance check
would produce an AttributeError with no context.  Raise an explicit
RuntimeError with a descriptive message so production logs surface the
real problem.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(mutation-journal): warn before resetting unrecognized saga state to PREPARED

_record_from_dict silently coerced any unknown state value to PREPARED,
which could cause double-execution of a mutation during recovery from a
partially-committed journal.  Add a logger.warning so operators are
alerted to version mismatches or journal corruption.
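
A minimal sketch of the warn-then-coerce behaviour described above. The function name and logger setup are assumptions; it is not the repository's _record_from_dict.

```python
import logging

logger = logging.getLogger("mutation_journal_sketch")

KNOWN_STATES = {"PREPARED", "COMMITTED", "FAILED"}


def parse_state(raw) -> str:
    """Return a known state, warning when an unknown value is coerced."""
    if raw in KNOWN_STATES:
        return raw
    # The coercion is kept, but it is no longer silent.
    logger.warning("Unrecognized journal state %r; treating as PREPARED", raw)
    return "PREPARED"
```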

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(ports): drop 4 dead fields from DocsMutationPorts across app layer

precompute_vectors_fn, sync_dense_fn, delete_docs_fn, and
delete_external_ids_fn were stored on DocsMutationPorts but never read by
MutationCoordinator after the saga refactor.  Remove them from the
dataclass, all construction sites (wiring, container, factory), and update
the two tests that asserted their presence.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(contracts): purge legacy one-shot mutation result DTOs

UpsertDocsSummary, DeleteDocsByExternalIdSummary, and DeleteDocsSummary
pre-date the MutationCoordinator and were never produced or consumed after
the saga refactor.  Remove them from results.py and the contracts __all__.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(cli): remove infrastructure import from docs ingest plan builder

_build_file_ingest_plan was importing SqlDocumentStorage directly to
access SqlDocumentStorage.UpsertDoc, crossing the infrastructure boundary
from the CLI layer.  Thread build_upsert_doc through _build_ingest_plans
and source it from the already-constructed DocsMutationPorts in ingest_cmd.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(scripts,app,cli): replace print with logging, fix dead code, tighten exports

sample_data_ingestion.py:
- print() → logger.info() (progress reporting belongs to the logging framework)
- magic literal 128 → settings_obj.ingest_batch_size (consistent with pipeline)
- move logger = getLogger(__name__) after all imports (fixes ruff E402)

factory.py:
- replace unreachable second if _APP_CONTEXT is None with a local var ctx
  so mypy type-narrows correctly without an impossible branch

runtime.py:
- remove private _reset_rag_service_best_effort from __all__

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(integration): replace capsys with caplog for ingestion confirmation assertions

run_sample_data_ingestion now emits progress via logger.info instead of
print(); update four integration tests to capture log records with caplog
instead of stdout with capsys.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: QW1-QW7

- QW1 (app/composition.py): resolve_preferred_llm_provider now returns "openrouter" when it is the only configured provider
- QW2 (app/application/index.py): rebuild_index_sync passes backend= to vector_repo_factory, consistent with _apply_vector_delta
- QW3 (app/main.py): logger.error(f"...") → logger.error("...", e) (Ruff G004)
- QW4 (app/blocking.py): _pending_snapshot simplified to return state.pending_snapshot(); dead branches removed
- QW5 (app/application/mutations.py): mutation_attempted flag removed from run_api_mutation and run_cli_mutation; reset_after() always runs in finally
- QW6 (app/contracts/ports.py): return type corrected from list[str] to list[DocId] in DocsRepositoryPort
- QW7 (core/services/rag_runtime.py): Spanish string replaced by the constant NO_DOCS_ANSWER = "No documents are indexed..." plus a no_docs_answer parameter in RagService.__init__

* feat: Task 8 - Write lock contract

- Added WriteLockPort Protocol in app/contracts/ports.py with explicit (*, coordination_dir, timeout_s, poll_s) signature; updated DocsMutationPorts.write_lock from Callable[..., Any] to WriteLockPort
- Removed except TypeError fallback from write_lock.py; it now calls _exclusive_file_lock(path, timeout_s=…, poll_s=…) directly
- Simplified _write_lock() in docs_mutation.py: removed try/except, direct call only
- Updated _fake_lock in test_write_lock.py and _broken_lock in test_mutation_chaos.py to accept the full keyword signature

Task 10 - VectorStorage DI

- storage.py: Added settings_obj: Settings | None = None; stores self._settings; removed bare global settings import
- composition.py: Passes settings_obj=settings_obj to vector_repo_factory
- sample_data_ingestion.py: Passes settings_obj=settings_obj to VectorStorage
- test_vector_storage_manifest_guard.py and test_composition.py: Updated assertions/calls accordingly
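The Task 8 write lock contract could look roughly like the sketch below. Only the name WriteLockPort and the keyword-only parameters (coordination_dir, timeout_s, poll_s) come from the commit message; the parameter types, the context-manager return type, and the helpers fake_lock and run_mutation are assumptions made for illustration.

```python
from contextlib import AbstractContextManager, contextmanager
from pathlib import Path
from typing import Iterator, Protocol


class WriteLockPort(Protocol):
    """Callable that returns a context manager guarding multi-store writes."""

    # Parameter and return types are assumed; only the names are from the commit.
    def __call__(
        self, *, coordination_dir: Path, timeout_s: float, poll_s: float
    ) -> AbstractContextManager[None]: ...


@contextmanager
def fake_lock(
    *, coordination_dir: Path, timeout_s: float, poll_s: float
) -> Iterator[None]:
    # Test double: accepts the full keyword signature but takes no real lock.
    yield


def run_mutation(write_lock: WriteLockPort, coordination_dir: Path) -> str:
    # Hypothetical caller: every argument is passed by keyword, so a lock with
    # a mismatched signature fails with TypeError instead of being masked.
    with write_lock(coordination_dir=coordination_dir, timeout_s=5.0, poll_s=0.05):
        return "mutated"
```

With an explicit Protocol, a type checker can reject a lock factory that lacks one of the keywords, which is what makes the former except TypeError fallback unnecessary.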

* fix(lifespan): fix lifespan removing old entries on pytest.fixture to avoid dirty loading

* fix(lifespan): asgi_client now accepts tmp_path and monkeypatch as parameters

Before entering the lifespan it sets settings.data_dir to an absolute per-test tmp_path / "data". Since get_coordination_dir() returns data_dir.resolve() for absolute paths, every test gets a fully isolated journal directory. No deletion, no race condition.

* added push branch to ci file

* revert -.-"

* fix(ports): add ntotal read-only health probe to VectorRepoPort

`delete([])` was a destructive write used as a preflight check in
maintenance operations. Replacing it with a pure read-only `ntotal`
property removes the side-effect and makes the intent explicit in
the port contract.

- VectorRepoPort: declare `@property ntotal() -> int` in the Protocol
- VectorStorage: implement ntotal by proxying vector_index.ntotal
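A minimal sketch of the read-only probe described above. VectorRepoPort, ntotal, and delete() returning int are named in the commit; the ids parameter type, the InMemoryVectorRepo stand-in, its add() method, and the preflight() helper are hypothetical.

```python
from typing import Protocol, Sequence


class VectorRepoPort(Protocol):
    @property
    def ntotal(self) -> int: ...

    # The ids type is an assumption; the commit only states that delete() returns int.
    def delete(self, ids: Sequence[int]) -> int: ...


class InMemoryVectorRepo:
    """Hypothetical stand-in for VectorStorage, backed by a dict."""

    def __init__(self) -> None:
        self._vectors: dict[int, list[float]] = {}
        self.write_calls = 0  # counts mutating calls, for illustration only

    @property
    def ntotal(self) -> int:
        # Pure read: no mutation, safe to use as a health probe.
        return len(self._vectors)

    def add(self, doc_id: int, vector: list[float]) -> None:
        self.write_calls += 1
        self._vectors[doc_id] = vector

    def delete(self, ids: Sequence[int]) -> int:
        self.write_calls += 1
        return sum(1 for i in ids if self._vectors.pop(i, None) is not None)


def preflight(repo: VectorRepoPort) -> int:
    # Reads the index size instead of issuing delete([]) as a probe.
    return repo.ntotal
```

The probe leaves `write_calls` unchanged, which is the property the port change is after: checking that the index is reachable no longer goes through a write path.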

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(maintenance): ntotal preflight, result dataclasses, shared embedder resolver

Three interrelated improvements to delete_documents_multi_store and
delete_external_ids_multi_store:

1. Preflight now reads vec_repo.ntotal instead of calling delete([]),
   eliminating the unintended write side-effect (pairs with ports fix).
   The redundant `if deleted_index_raw is not None` dead branch is
   removed since VectorRepoPort.delete() always returns int.

2. n-tuple returns replaced with frozen dataclasses:
   - MultiStoreDeleteResult(deleted_sql, deleted_index, rebuilt)
   - MultiStoreExternalIdDeleteResult(deleted_sql, deleted_index,
     missing_external_ids, tombstoned, rebuilt)
   Named fields prevent silent positional-ordering bugs at call sites.

3. The embedder resolution pattern (try factory; raise if still None)
   was duplicated 4 times across the two functions. Extracted into
   _resolve_embedder_or_raise() private helper. Error message strings
   promoted to module-level constants to eliminate literal duplication.
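Point 2 can be illustrated with a small sketch. The class name and field names are taken from the commit message; the field types and the delete_documents_stub() function are assumptions.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class MultiStoreDeleteResult:
    # Field names are from the commit; the types are assumed.
    deleted_sql: int
    deleted_index: int
    rebuilt: bool


def delete_documents_stub() -> MultiStoreDeleteResult:
    # Hypothetical stand-in for delete_documents_multi_store.
    return MultiStoreDeleteResult(deleted_sql=3, deleted_index=2, rebuilt=False)


result = delete_documents_stub()

# Call sites read fields by name, so reordering the fields later cannot
# silently swap deleted_sql and deleted_index the way tuple unpacking could.
summary = f"sql={result.deleted_sql} index={result.deleted_index} rebuilt={result.rebuilt}"

try:
    result.deleted_sql = 0  # type: ignore[misc]
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
```

Freezing the dataclass also keeps the result immutable once returned, so a caller cannot adjust the counts after the fact.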

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(ports): replace Any with concrete types in mutation port contracts

Any in frozen dataclass fields is invisible to mypy and defeats the
purpose of the port abstraction. Four fields are now statically typed:

- DocsMutationPorts.vector_repo_factory: Callable[..., Any]
  → Callable[..., VectorRepoPort]
- DocsMutationPorts.storage_profile_registry: Any
  → StorageProfileRegistry
- IndexMutationPorts.doc_repo_factory: Callable[[], Any]
  → Callable[[], DocumentRepoPort]
- IndexMutationPorts.vector_repo_factory: Callable[..., Any]
  → Callable[..., VectorRepoPort]

Builders in mutation_ports.py and container.py updated to match; the
redundant cast("Any", ...) in AppContainer.index_mutation_ports removed.

Also add inline comments to document two intentional design decisions
that were previously silent:
- UpsertDocBuilderPort returns object (not TypeVar) by design
- evaluation.py noqa: F401 import is a required SQLAlchemy side-effect

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(infra): narrow except Exception to specific types where safe

Broad except Exception clauses in infrastructure code mask unexpected
errors and reduce log signal. This commit tightens four files:

factory.py:
- _decode_text_sample(): Exception → UnicodeDecodeError (both branches;
  Latin-1 decode never raises in practice, but annotation stays honest)
- detect_json_export_format(): Exception → json.JSONDecodeError
  (decode uses errors="replace" so only loads() can raise)
- _magic_mime(): kept as Exception with comment — optional C extension
  (libmagic) can raise arbitrary binding errors beyond ImportError/OSError

id_map_json.py:
- load_id_map_json(): Exception → (json.JSONDecodeError, UnicodeDecodeError)
  (the two exact failure modes of json.loads(bytes.decode()))

alchemy_engine.py / index.py:
- Rollback/recovery guards kept as except Exception with explanatory
  comments: they intentionally catch all failures (including Python-level
  AttributeError etc.) so the original exception always propagates cleanly.
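The narrowing applied to load_id_map_json() might be sketched as follows. The function name load_id_map_sketch and the choice to return None on failure are assumptions; the commit only states which exception types are caught.

```python
import json
from pathlib import Path
from typing import Any, Optional


def load_id_map_sketch(path: Path) -> Optional[Any]:
    # OSError (for example a missing file) is deliberately not caught here,
    # so unexpected I/O problems stay visible to the caller.
    raw = path.read_bytes()
    try:
        return json.loads(raw.decode("utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        # The two failure modes of json.loads(bytes.decode()):
        # malformed JSON text or bytes that are not valid UTF-8.
        return None  # assumed fallback; the real function may behave differently
```

Compared with a bare except Exception, a typo inside the try block (say, an AttributeError) would now propagate instead of being reported as a corrupt id map.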

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(deps): bump fastapi 0.111→0.115, anyio 4.3→4.12; add python-multipart explicitly

Security: starlette 0.37.2 had three DoS CVEs in multipart parsing and
Range-header processing. Upgrading fastapi 0.111→0.115 pulls in
starlette 0.46.2, which resolves CVE-2024-47874 (<0.40.0).

anyio 4.3.0 had a thread race condition (PVE-2024-71199) in multi-event-
loop environments, fixed in 4.4.0. The old pin anyio>=4.3,<4.4 explicitly
excluded the patched release; now >=4.4,<5.

python-multipart was a transitive dependency of fastapi 0.111 and is no
longer bundled in 0.115+. Added as an explicit dependency so form/file
upload endpoints continue to work.

starlette 0.37.2 → 0.46.2  (CVE-2024-47874 fixed)
anyio    4.3.0  → 4.12.1   (PVE-2024-71199 fixed)
fastapi  0.111.1 → 0.115.14

465 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(deps): bump fastapi 0.115→0.124 + starlette 0.46→0.50; all CVEs resolved

FastAPI 0.124 widens the starlette constraint to <0.51.0, allowing the
resolver to reach starlette 0.50.0. This resolves the remaining two CVEs
that 0.115→0.46.2 could not address:

  CVE-2024-47874  starlette <0.40.0  → fixed in 0.40+ (was already done)
  CVE-2025-54121  starlette <0.47.2  → fixed in 0.50.0  ✓
  CVE-2025-62727  starlette <0.49.1  → fixed in 0.50.0  ✓

starlette 0.46.2 → 0.50.0 (forced via --upgrade-package starlette)
fastapi   0.115.14 → 0.124.4

httpx pin >=0.27,<0.28 remains compatible: starlette[full] allows 0.27-0.29.
pydantic pin >=2.0,<3.0 remains compatible: FastAPI 0.124 requires >=1.7.4.

465 tests pass, mypy and ruff clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(tooling): add gitleaks + check-json + debug-statements hooks; scaffold mypy test override

- gitleaks v8.24.2: secret/credential detection (gap not covered by bandit)
- check-json: validates JSON files on commit (check-yaml/toml were already present)
- debug-statements: catches stray breakpoint()/pdb imports before they reach CI
- [[tool.mypy.overrides]] for tests.*: lenient scaffolding (disallow_untyped_defs=false)
  for when tests/ gains __init__.py and is removed from the mypy exclude

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: fix stale references after FastAPI bump and refactor

- README.md: update FastAPI badge 0.111 → 0.124
- README.md: add cli_commands/ to project structure tree;
  fix scripts/ description (only sample_data_ingestion.py, not bootstrap)
- docs/USAGE.md: sessionmaker(bind=engine) → sessionmaker(engine)
  (positional form; SQLAlchemy 2.x sessionmaker still accepts bind= as a keyword)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: up-to-date. Polish + fix.

* docs: up-to-date. Polish + fix.

* docs: add transport isolation contract + 1.1.0 changelog entry

architecture.md: document golden rule (delete app/ → core/services still work),
current compliance (ingest/import use cases accept Sequence[str]/bytes),
and the router's responsibility to convert HTTP types before use-case call.

CHANGELOG.md: add [1.1.0] entry covering security CVE fixes, dep bumps,
port contract changes, refactors, and tooling additions from the sprint.
Clear stale [Unreleased] content (app/services/* references predated
the app/contracts/ rename already shipped in 1.0.0).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat!: screaming architecture refactor — eliminate app/, decouple FastAPI

BREAKING CHANGE: the `app/` package is removed entirely.

- Use cases move to `core/use_cases/` (transport-agnostic).
- DI composition root moves to `composition/` (transport-neutral).
- HTTP adapter lives in `http/` (FastAPI optional via `[server]` extra).
- Observability moves to `infrastructure/observability/`.
- Blocking executor moves to `infrastructure/concurrency/`.
- Storage profiles move to `core/domain/profiles.py`.
- ASGI entry point: `local_rag_backend.http.main:app`.
- `AskEvalConfigLike` Protocol replaces concrete Pydantic schema in
  container public signatures.
- `map_runtime_error` moves from `http/error_mapping.py` to
  `core/use_cases/errors.py`.
- Layer dependency rules enforced by architecture guard tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: fix stale app/ references after FastAPI bump and refactor

- Architecture guard tests now scan `http/routers/` (not defunct `app/routers/`).
- `test_http_routers_do_not_import_infrastructure_directly` refined to allow
  cross-cutting infrastructure (concurrency, observability) while still blocking
  direct imports of persistence, retrieval, LLM, and embedding adapters.
- Stale path comment removed from `http/api_router.py`.
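A guard of the kind described above could be sketched with the standard ast module. The allow-list (concurrency, observability) follows the commit message; the function name, the exact module paths under local_rag_backend.infrastructure, and the matching rules are assumptions, and the real test may work differently.

```python
import ast

PKG = "local_rag_backend"
# Cross-cutting infrastructure that routers may import, per the commit message.
ALLOWED_INFRA = ("infrastructure.concurrency", "infrastructure.observability")


def forbidden_infra_imports(source: str) -> list[str]:
    """Return imported infrastructure modules that are not allow-listed."""
    prefix = f"{PKG}.infrastructure"
    offenders: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports are not resolved in this sketch.
            names = [node.module]
        else:
            continue
        for name in names:
            if name != prefix and not name.startswith(prefix + "."):
                continue  # not an infrastructure import
            allowed = any(
                name == f"{PKG}.{a}" or name.startswith(f"{PKG}.{a}.")
                for a in ALLOWED_INFRA
            )
            # Importing the bare infrastructure package is treated as a violation.
            if not allowed:
                offenders.append(name)
    return offenders
```

A test would read each file under http/routers/, pass its text to forbidden_infra_imports(), and assert that the returned list is empty.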

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* stabilize: refactor @architecture.md. polish on http.main

* refactor(mutation): extract coordinator handlers and shared-session uow

* refactor(app): migrate use-cases and transports to port-based wiring

* test(architecture): enforce strict use-case boundaries in CI

* docs(architecture): close phase B and add persistence-topology ADR

* chore(cleanup): remove legacy seams and relocate lock/dataset internals

* refactor(mutation-core): split docs mutation into orchestrator + batch + saga modules

* refactor(sql): split alchemy engine into focused modules with compatibility shim

* refactor(composition): add AppContainer.from_settings while preserving injectable __init__

* refactor(cli-ingest): extract ingestion planner/discovery helpers

* refactor(http-cli): consolidate mutation execution/runtime glue and openrouter wiring

* refactor(core-infra): align dense/eval/vector/journal behavior and hardening tests

* docs(sync): align README and docs with final topology

* refactor(persistence): remove sql legacy module and simplify sqlite bootstrap contract

* refactor(composition): centralize runtime wiring and simplify router/container access

* fix(ingest): enforce strong idempotency and canonical mutation bootstrap flow

* test(integration): add golden invariants suite for F1-F5 contract

* docs(sync): align docs/changelog/license and remove legacy packaging docs

* docs(readme): polish

* release: cut v1.3.0 and close changelog unreleased

* fix(ci): install server extra for typing and test jobs

* docs(readme): hide pypi/download badges for release

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>