PR11: Transactional consistency and concurrency hardening (SQL + index) #27
Merged
Intrinsical-AI merged 9 commits on Feb 16, 2026
Precompute dense/hybrid embeddings before SQL upsert in API and CLI flows so provider failures cannot leave partially committed SQL without vector updates. Add SqlDocumentStorage.get_existing_doc_states_by_external_id to embed only inserts/content updates (preserving idempotent no-op behavior). Add regression tests for /api/docs, /api/docs/upsert, rag-upsert-docs, and rag-ingest to verify no SQL persistence on embedding failures.
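The embed-before-write ordering described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `Doc`, `InMemoryStores`, and `upsert_docs` are hypothetical stand-ins, and the dictionary lookup only approximates what `SqlDocumentStorage.get_existing_doc_states_by_external_id` is described as doing.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class Doc:
    external_id: str
    content: str


@dataclass
class InMemoryStores:
    """Hypothetical stand-ins for the SQL store and the vector index."""

    sql: Dict[str, str] = field(default_factory=dict)
    vectors: Dict[str, List[float]] = field(default_factory=dict)


Embedder = Callable[[Sequence[str]], List[List[float]]]


def upsert_docs(stores: InMemoryStores, docs: Sequence[Doc], embed: Embedder) -> int:
    """Upsert docs into both stores; return how many were written."""
    # 1. Look up existing state so only inserts and content changes are embedded.
    to_embed = [d for d in docs if stores.sql.get(d.external_id) != d.content]
    if not to_embed:
        return 0  # idempotent no-op: nothing changed, no provider call

    # 2. Call the embedding provider BEFORE touching either store.
    #    If this raises, nothing has been persisted yet.
    vectors = embed([d.content for d in to_embed])
    if len(vectors) != len(to_embed):
        raise RuntimeError("embedding provider returned an unexpected batch size")

    # 3. Only now mutate SQL and the vector index.
    for doc, vector in zip(to_embed, vectors):
        stores.sql[doc.external_id] = doc.content
        stores.vectors[doc.external_id] = list(vector)
    return len(to_embed)
```

With this ordering, a failing provider surfaces as an exception before step 3, which is the property the regression tests for `/api/docs`, `/api/docs/upsert`, `rag-upsert-docs`, and `rag-ingest` are described as checking.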
Use unique temp files + fsync when writing .rag_service_reload_token to avoid .tmp path collisions across workers. Add regression test ensuring _write_reload_token does not reuse temp source paths across consecutive writes.
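A minimal sketch of the unique-temp-file plus fsync pattern, using only the standard library. The function name `write_token_atomically` and its return value are assumptions made for illustration; the project's `_write_reload_token` is not shown in this PR and may differ.

```python
import os
import tempfile
from pathlib import Path


def write_token_atomically(target: Path, payload: str) -> str:
    """Write payload to target via a uniquely named temp file.

    Returns the temp path that was used, so callers (or tests) can check
    that consecutive writes never share a temp source path.
    """
    target.parent.mkdir(parents=True, exist_ok=True)
    # mkstemp creates a file with a unique name, so concurrent workers
    # cannot collide on a fixed "<name>.tmp" path.
    fd, tmp_path = tempfile.mkstemp(
        prefix=target.name + ".", suffix=".tmp", dir=str(target.parent)
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, target)  # atomic rename within one filesystem
    except BaseException:
        # Best-effort cleanup of the temp file if anything failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    return tmp_path
```

Creating the temp file in the same directory as the target keeps `os.replace` on a single filesystem, which is what makes the final rename atomic.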
Prevent concurrent mutating API operations from interleaving SQL and FAISS updates by running write paths under a shared cross-process lock. Add strict bounds for generation sampling params (temperature/top_p/max_tokens) in OpenRouter requests and AskEval top_p validation. Add regression tests for concurrent upsert consistency and invalid sampling parameter rejection.
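One way to serialize write paths across processes is an exclusive advisory lock on a shared lock file. The sketch below uses `fcntl.flock`, which is POSIX-only; the PR does not say which locking mechanism the project uses, and the names `multi_store_write_lock` and `upsert_under_lock` are hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager
from typing import Iterator, List


@contextmanager
def multi_store_write_lock(lock_path: str) -> Iterator[None]:
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks until no other process (or open file description) holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def upsert_under_lock(
    lock_path: str, sql_rows: List[str], index_ids: List[str], doc_id: str
) -> None:
    """Apply the SQL write and the index write as one serialized unit."""
    with multi_store_write_lock(lock_path):
        sql_rows.append(doc_id)   # stand-in for the SQL upsert
        index_ids.append(doc_id)  # stand-in for the FAISS update
```

Because both writes happen inside the same critical section, two writers cannot interleave so that the SQL store and the index observe different orderings. This sketch blocks indefinitely while waiting; a real implementation would probably add a timeout and fail if the lock cannot be acquired.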
…-11-hardening-consistency
Intrinsical-AI added a commit that referenced this pull request on Feb 16, 2026

…ndice) (#27)

* fix(consistency): preflight manifest drift before index mutations
* fix(maintenance): raise explicit multi-store inconsistency error on double failure
* ci: add windows smoke tests for faiss persistence and consistency paths
* fix: prevent SQL/vector drift when embedding fails
  Precompute dense/hybrid embeddings before SQL upsert in API and CLI flows so provider failures cannot leave partially committed SQL without vector updates. Add SqlDocumentStorage.get_existing_doc_states_by_external_id to embed only inserts/content updates (preserving idempotent no-op behavior). Add regression tests for /api/docs, /api/docs/upsert, rag-upsert-docs, and rag-ingest to verify no SQL persistence on embedding failures.
* fix: harden rag service reload token writes
  Use unique temp files + fsync when writing .rag_service_reload_token to avoid .tmp path collisions across workers. Add regression test ensuring _write_reload_token does not reuse temp source paths across consecutive writes.
* docs(pr): add PR11 hardening consistency description
* fix(api): serialize multi-store writes and harden generation params
  Prevent concurrent mutating API operations from interleaving SQL and FAISS updates by running write paths under a shared cross-process lock. Add strict bounds for generation sampling params (temperature/top_p/max_tokens) in OpenRouter requests and AskEval top_p validation. Add regression tests for concurrent upsert consistency and invalid sampling parameter rejection.
* chore: fix import ordering and include roadmap updates
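The strict bounds on sampling parameters mentioned in the `fix(api)` commit could be enforced along these lines. The commit message does not state the actual limits, so the ranges here (temperature in [0, 2], top_p in [0, 1], max_tokens between 1 and a hypothetical cap of 4096) are assumptions, as is the `SamplingParams` class itself.

```python
import math
from dataclasses import dataclass
from typing import Optional, Union

# Assumed limits for illustration only; the PR text does not give the real bounds.
MAX_TEMPERATURE = 2.0
MAX_TOKENS_LIMIT = 4096


def _check_range(name: str, value: Optional[Union[int, float]], low: float, high: float) -> None:
    """Reject non-numeric, NaN, or out-of-range values; None means 'not set'."""
    if value is None:
        return
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"{name} must be a number")
    if math.isnan(value) or not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}")


@dataclass(frozen=True)
class SamplingParams:
    """Hypothetical container validated at construction time."""

    temperature: Optional[float] = None
    top_p: Optional[float] = None
    max_tokens: Optional[int] = None

    def __post_init__(self) -> None:
        _check_range("temperature", self.temperature, 0.0, MAX_TEMPERATURE)
        _check_range("top_p", self.top_p, 0.0, 1.0)
        if self.max_tokens is not None:
            if isinstance(self.max_tokens, bool) or not isinstance(self.max_tokens, int):
                raise ValueError("max_tokens must be an integer")
            if not 1 <= self.max_tokens <= MAX_TOKENS_LIMIT:
                raise ValueError(f"max_tokens must be between 1 and {MAX_TOKENS_LIMIT}")
```

Validating at construction means an invalid value is rejected before any request is sent to the provider, which matches the "invalid sampling parameter rejection" tests the commit describes.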
Intrinsical-AI added a commit that referenced this pull request on Feb 17, 2026, with the same squash commit message as the Feb 16, 2026 commit above.
Intrinsical-AI added a commit that referenced this pull request on Feb 24, 2026

…val, hardening) (#31)

* feat(ops): add readiness/index diagnostics and richer rag-status
* test(ops): cover readiness drift/corruption and cli status
* docs: note strict /api/ready drift detection in dense/hybrid
* updated .gitignore - pr descs into /pr
* feat(sql): add document identity columns + best-effort sqlite migration
* test(sqlite): cover identity column migration, unique external_id, sha256 fill
* docs: document document identity fields (external_id, metadata, hash, timestamps)
* feat(docs): idempotent upsert by external_id (API/CLI)
* test(docs): cover upsert idempotency (api/sql/cli)
* docs: document docs upsert endpoint and rag-upsert-docs
* docs: update roadmap with stabilization/refactor PR and incorporate quick-wins
* refactor(structure): move api schemas/diagnostics and core helpers; keep shims
* refactor(imports): prefer new module locations (app.schemas, core.services, app.diagnostics)
* refactor(structure): drop root shims and enforce app/core module boundaries
* docs: update roadmap and architecture after removing root shims
* feat(ingestion): add txt/md/csv loaders with discovery + sniffing
  - Add TextFileLoader and MarkdownLoader
  - Add discover_files() limits (max files / bytes) for safe dir ingestion
  - Add loader factory with best-effort format detection (extension + bytes heuristics, optional python-magic)
  - Extend CSVLoader with delimiter sniffing and row_index metadata
* feat(cli): add rag-ingest (dir/file) with SQL+FAISS consistency
  - Add CLI/entrypoint (files/dirs, limits, dry-run)
  - Ingest via external_id upsert per chunk and delete stale chunks by source prefix
  - Dense/hybrid: incremental vector updates, fallback rebuild on failure
  - Add SqlDocumentStorage.list_ids_by_external_id_prefix() helper
  - Add optional extra for python-magic
* test(ingestion): cover rag-ingest + loader detection/discovery
  - CLI ingest: mixed dir, idempotency, and stale-chunk deletion
  - Factory detection: prefer content heuristics over misleading extensions
  - Discovery: enforce limits and handle broken symlinks
  - Update CSVLoader unit tests for row_index metadata
* docs: document rag-ingest and optional python-magic
  - README: add rag-ingest to CLI commands and mention magic extra
  - custom usage guide: add a CLI ingestion section
* fix(ingest): avoid external_id prefix collisions when syncing stale chunks
  Pass a delimiter-suffixed prefix when listing existing external_ids so /tmp/foo does not match /tmp/foo2.
* feat(ingestion): configurable cleaning + deterministic chunking (chars_v1)
  - Add chunk_chars_v1 with stable boundaries (start/end offsets)
  - Make preprocess_text configurable via keyword options
  - Add settings flags for ingestion cleaning and chunk strategy
  - Provide helpers to build preprocess/chunk functions from Settings
* feat(ingest): store chunk metadata (chunk_index/offsets/parent_doc_id)
  - rag-ingest now records chunk offsets and parent_doc_id in SQLite metadata
  - bootstrap uses Settings-driven cleaning/chunking for determinism
* test(chunking): validate boundaries/overlap and ingestion metadata
  - Add unit tests for chunk_chars_v1 determinism, boundaries and overlap
  - Extend rag-ingest CLI test to assert prepared metadata (chunk_index, offsets, parent_doc_id)
  - Add coverage for preprocess_text options
* feat(dedup): add chunk-level hash column + index and helper
  - Add Settings.ingest_chunker_version to version the dedup key
  - Add chunk_dedup_sha256 helper (sha256(cleaned_text + chunker_version + embedding_model_name))
  - Extend documents schema with chunk_dedup_sha256 and a partial unique index
  - Allow upsert to persist chunk_dedup_sha256 when provided
* feat(api): dedup /api/docs ingestion via chunk hash upsert
  - Clean + chunk texts using Settings-driven pipeline
  - Compute external_id as chunk hash (includes chunker_version + embedding model)
  - Upsert into SQLite and update FAISS incrementally in dense/hybrid mode
* test(dedup): cover hash-based /api/docs idempotency and SQL constraint
  - Verify /api/docs dedup is idempotent across re-ingests
  - Verify changing ingest_chunker_version produces new inserts
  - Assert chunk_dedup_sha256 unique index is enforced
* docs: document chunker version and dedup column
  - Add chunking strategy/version env vars to README and .env.example
  - Mention chunk_dedup_sha256 schema column
* fix(db): run SQLite compatibility migrations for CLI/scripts
  CLI/scripts can be executed without starting FastAPI, so ensure the same best-effort SQLite migrations (AUTOINCREMENT + identity columns) run before using ORM-mapped queries.
* feat(delete): add tombstones and delete-by-external-id in SQL repo
  - Add DocumentTombstone table (external_id unique)
  - Add SqlDocumentStorage delete_by_external_ids + tombstone helpers
  - Tombstones prevent deleted identities from reappearing on future ingestions
* feat(api): delete by external_id with tombstones
  - Add /api/docs/delete_by_external_id endpoint
  - Filter tombstoned external_ids from /api/docs ingestion so deletes don't reappear
  - Reject tombstoned external_ids on /api/docs/upsert
* feat(cli): rag-delete-external-ids
  - Add CLI command delete-external-ids (SQL + FAISS when dense/hybrid)
  - Respect tombstones during rag-ingest and block upserts for tombstoned external_ids
  - Add project script entrypoint rag-delete-external-ids
* test(delete): cover external_id delete (tombstones) + ask_eval + rebuild
  - Sparse: delete external_id prevents re-ingest and removes results from ask_eval
  - Dense: delete external_id updates vector store and survives rebuild-index
* docs: document delete-by-external-id (API/CLI)
  - Add rag-delete-external-ids and /api/docs/delete_by_external_id to usage guides
* feat(index-manifest): persist manifest and report drift in ready/status
* chore(mypy): avoid optional python-magic typing issues
* test(index-manifest): cover manifest drift and rebuild recovery
* docs(index-manifest): document index_manifest and drift detection
* test(index-manifest): unit coverage for manifest helpers
* feat(reranker): optional overlap reranker behind settings
* feat(observability): structured logs + domain metrics for ingest/query
* feat(eval): add rag-eval CLI with versioned dataset and regression gate
* test(eval,reranker): cover offline eval gate and reranking behavior
* docs(pr10): document rag-eval, monitoring, and optional reranker
* PR11: Transactional consistency and concurrency hardening (SQL + index) (#27)
* fix(consistency): preflight manifest drift before index mutations
* fix(maintenance): raise explicit multi-store inconsistency error on double failure
* ci: add windows smoke tests for faiss persistence and consistency paths
* fix: prevent SQL/vector drift when embedding fails
  Precompute dense/hybrid embeddings before SQL upsert in API and CLI flows so provider failures cannot leave partially committed SQL without vector updates. Add SqlDocumentStorage.get_existing_doc_states_by_external_id to embed only inserts/content updates (preserving idempotent no-op behavior). Add regression tests for /api/docs, /api/docs/upsert, rag-upsert-docs, and rag-ingest to verify no SQL persistence on embedding failures.
* fix: harden rag service reload token writes
  Use unique temp files + fsync when writing .rag_service_reload_token to avoid .tmp path collisions across workers. Add regression test ensuring _write_reload_token does not reuse temp source paths across consecutive writes.
* docs(pr): add PR11 hardening consistency description
* fix(api): serialize multi-store writes and harden generation params
  Prevent concurrent mutating API operations from interleaving SQL and FAISS updates by running write paths under a shared cross-process lock. Add strict bounds for generation sampling params (temperature/top_p/max_tokens) in OpenRouter requests and AskEval top_p validation. Add regression tests for concurrent upsert consistency and invalid sampling parameter rejection.
* chore: fix import ordering and include roadmap updates
* fix(api,lock): prevent duplicate ingest pass and fail closed on lock acquisition
* fix(cli): serialize mutating commands with shared multi-store write lock
* docs(pr11): include integral audit findings, fixes, and verification
* fix(faiss): fail closed on missing manifest and lock acquisition
* fix(api,cli): always invalidate cached rag service after mutating attempts
* chore(lint): align isort with ruff and normalize import ordering
* docs(pr11): document new critical audit findings and validations
* fix(delete): deduplicate external_ids to prevent tombstone integrity failures
* fix(openai): enforce request timeouts with compatibility fallback
* chore(format): normalize TYPE_CHECKING import block in evaluation service
* docs(pr11): add critical audit findings for delete dedup and OpenAI timeouts
* fix(security): enforce runtime API-key guard for non-local requests
* fix(coordination): align lock and reload token paths across processes
* docs(pr11): record critical findings 14-15 and runtime security notes
* refactor: centralize dense upsert consistency flow and shared locking/client helpers
* test: add regression coverage for shared dense sync, manifest config, and OpenAI client fallback
* docs: update PR11 audit notes and include curso indices
* fix(security): fail closed when client host is unavailable
* fix(consistency): make dense delete paths lazy-load embedders
* fix(ingest): honor no-follow-symlinks for symlink directory inputs
* refactor(cli): centralize dense embedder construction across commands
* docs: clarify symlink ingest scope and dense delete behavior
* polish
* refactor(core): simplify lock orchestration and dense precompute flow
* test(app): cover blocking worker env parsing and execution path
* perf(ingest): avoid duplicate format detection and reuse precomputed result
* perf(text): precompile whitespace regex in preprocessing
* perf(ingest): batch rag-ingest mutations to reduce upsert/lock churn
* perf(sparse): cache docs in memory and remove per-query SQL get
* chore(lint): move NDArray import under TYPE_CHECKING
* perf(sparse): avoid duplicate SQL loads when building retrievers
* fix(security): enforce API key for forwarded non-local requests
* fix(consistency): harden multi-store script locking and fail-fast build-index
* fix(security): enforce API-key guard for RFC7239 Forwarded hosts
* fix(consistency): preflight vector mutability before SQL deletes
* fix(sqlite): tolerate duplicate-column races in identity migration
* ci: harden workflow gates and pin actions by SHA
* build: make Docker reproducible and enforce strict security target
* docs: document CI gates and contributor verification workflow
* refactor: centralize composition, harden multistore deletes, and split eval layering
* fix(ingest): handle unicode text detection and unreadable files
  Treat non-ASCII UTF-8 text as textual content during loader detection by using Unicode-aware printability checks. This prevents valid multilingual .txt files from being misclassified as unknown and silently skipped by rag-ingest. Also harden detect_file_format to fail soft on unreadable files (e.g. permission errors), returning unknown/read-error instead of raising and aborting the full ingestion run.
  Add regression tests for:
  - UTF-8 non-ASCII plain text detection
  - read-head permission error handling
  - end-to-end CLI ingest with non-ASCII text
  Update README ingestion notes to document the behavior.
* refactor(composition): unify runtime wiring policy across api, cli, and scripts
  Centralize adapter-selection policy in app.composition and reuse it from API factory, ask_eval retriever wiring, CLI dense embedder paths, and bootstrap/build-index scripts.
  Changes:
  - Add DEFAULT_DENSE_BACKEND_MESSAGE in composition.
  - Make build_dense_embedder_from_settings use the shared default when message is omitted.
  - Add build_retriever_with_default_embedder_from_settings to remove duplicated dense-embedder closure wiring.
  - Add resolve_preferred_llm_provider and reuse it in app.factory.
  - Update API/CLI/scripts to call shared composition policy helpers.
  - Add unit tests for new composition helpers and provider policy.
  This keeps existing monkeypatch seams in api/factory tests while consolidating policy decisions in one reusable module.
* refactor(app): extract docs and index use-cases from api router
  Move business orchestration for /api/docs*, /api/docs/upsert, /api/docs/delete*, and /api/index/rebuild into app-layer services.
  Changes:
  - Add app/services/docs.py with sync use-cases for ingest/upsert/delete flows and typed summaries.
  - Add app/services/index.py with rebuild_index_sync orchestration.
  - Keep api_router transport-focused: request normalization, HTTP mapping, observability, and response serialization only.
  - Preserve existing monkeypatch seams by injecting adapter factories/callbacks from api_router into services.
  - Update architecture docs to reflect app/services docs/index layering.
  Validation:
  - ruff check .
  - mypy .
  - pytest -q (335 passed, coverage >= 85%).
* refactor(cli): partition commands by bounded context modules
  - split monolithic cli.py into domain command modules: docs, index, eval, server
  - keep cli.py as composition root + entrypoint wrappers + shared hooks
  - preserve command surface/flags and existing monkeypatch seams
  - add registry/entrypoint tests for command wiring
  - update architecture docs with CLI layering
* refactor(factory): replace reload token file with DB-backed system_state version
  - introduce SystemStateStorage adapter in SQLAlchemy persistence
  - persist rag service cache version in system_state (key: rag_service)
  - remove filesystem token invalidation from app factory
  - keep process-local cache maxsize=1 while enabling cross-process invalidation via DB version
  - add focused tests for factory invalidation and system_state storage behavior
  - update architecture docs to reflect DB-backed invalidation
* refactor(blocking): add task-type pools with explicit pending limits
  - partition run_blocking work into task types: default, mutation, network, eval
  - add per-type worker and queue limits via env-configurable policies
  - enforce max pending per task type to prevent unbounded contention
  - route API run_blocking calls with explicit task_type labels
  - extend blocking and multistore tests for routing, queue-full behavior, and slot release guarantees
  - document async/sync boundary policy in architecture guide
* test(app): add mandatory chaos scenarios for mutating operations
  - add chaos coverage for vector write failure during upsert
  - add lock acquisition failure scenario for mutating upsert path
  - reproduce crash window: SQL committed while vector upsert+rebuild fail
  - add rebuild-index failure scenario with cache invalidation assertion
* refactor(app): introduce docs/index application ports to reduce infra coupling
  - add app-level ports module with DocsMutationPorts and IndexMutationPorts contracts
  - refactor docs and index use-cases to depend on port bundles instead of infra callables
  - centralize docs/index dependency wiring in api_router via _docs_mutation_ports/_index_mutation_ports
  - keep existing monkeypatch seams intact by resolving router symbols at runtime
  - add wiring tests for docs/index ports binding
  - update architecture docs to document new app ports layer
* fix(security): fail closed on ambiguous proxy forwarding chains
* fix(api): return 502 for malformed openrouter responses
* fix(cli): validate duplicate external_id before embedding work
* refactor(app): split transport routers and enforce typed provider errors
  Introduce typed cross-layer LLM errors in core and remove FastAPI exceptions from infra adapters. Add app-layer HTTP error mapping, extract OpenRouter into dedicated router/service, and move docs/index mutation port wiring into reusable builders for API/CLI. Expand tests for error mapping, OpenRouter service behavior, mutation port builders, and updated adapter/router mappings.
* refactor(cli): reuse app docs services and add multiprocess lock regression
  Route docs CLI mutations through app services and shared mutation port builders to remove duplicated orchestration logic. Add real spawn-based integration coverage for cross-process write-lock serialization and update architecture docs for bounded routers and error-layer boundaries.
* refactor(api): reduce root router to composition and split bounded routers
  Extract health, rag, docs and index endpoints into dedicated router modules and move runtime wiring/composition helpers to app/wiring.py. Migrate ingest schemas into app/schemas and adjust unit tests to patch the new seams (wiring and bounded routers) while preserving endpoint behavior. Update architecture docs to reflect api_router composition-only role and new router/wiring structure.
* hotfix-refactor: arch
* chore(rescue): snapshot mixed worktree before pr04 split
* update(ci): split branch globbing patterns on CI for granular control over triggering. Included release** pattern
* add(pre-commit): included pre-commit in TO PROJ. deps
* config
* ci(security): pin safety v2 and use stable check command
* updated doc, repo structure
* fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution
* fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution
* fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution
* fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution
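The `feat(dedup)` commit above describes the dedup key as `sha256(cleaned_text + chunker_version + embedding_model_name)`. A minimal sketch of such a helper follows. The length-prefix framing is an assumption added here so that different field splits cannot produce the same byte stream; the commit message only mentions plain concatenation, and the project's actual `chunk_dedup_sha256` signature is not shown.

```python
import hashlib


def chunk_dedup_sha256(
    cleaned_text: str, chunker_version: str, embedding_model_name: str
) -> str:
    """Return a hex SHA-256 over the three fields that define a chunk's identity."""
    hasher = hashlib.sha256()
    for part in (cleaned_text, chunker_version, embedding_model_name):
        data = part.encode("utf-8")
        # Assumed framing: prefix each field with its byte length so that
        # ("ab", "c") and ("a", "bc") hash differently.
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
    return hasher.hexdigest()
```

Including the chunker version and embedding model in the key means that changing either one yields new hashes, which is consistent with the test noted above that changing `ingest_chunker_version` produces new inserts.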
Intrinsical-AI added a commit that referenced this pull request on Mar 3, 2026

* feat(frontend): split inline assets and serve via /assets
* security: protect API and tighten CORS defaults
* security: add request size limits
* persistence: harden FAISS index persistence
* scripts: add repo entrypoints and drop packaged FAQ
* monitoring: reduce metrics label cardinality
* runtime: invalidate cached RAG service across workers
* runtime: avoid thread offload in api endpoints
* docs: update configuration and changelog
* chore: remove obsolete PR description proposal
* feat(prompting): safe prompt template rendering
* feat(app): run blocking work in dedicated thread pool
* feat(security): require API key when binding publicly
* fix(sqlite): prevent document id reuse; migrate legacy schema
* feat(faiss): support delete/rebuild + safe persistence
* fix(etl): rollback vector index on ingestion failure
* feat(api): doc delete + index rebuild; improve readiness/offload
* fix(rag): make history persistence best-effort
* feat(app): enforce safe bind and migrate sqlite schema at startup
* chore(app): tighten rag service cache
* fix(llm): use safe prompt renderer
* docs: document delete/rebuild maintenance flows
* docs: added ROADMAP with next PRs, feats, steps and DOD
* update(llm): default ollama model set to gemma3:4b across codebase
* release(02-2026): merge stabilization (ops, ingestion, consistency, eval, hardening) (#31)
* feat(ops): add readiness/index diagnostics and richer rag-status
* test(ops): cover readiness drift/corruption and cli status
* docs: note strict /api/ready drift detection in dense/hybrid
* updated .gitignore - pr descs into /pr
* feat(sql): add document identity columns + best-effort sqlite migration
* test(sqlite): cover identity column migration, unique external_id, sha256 fill
* docs: document document identity fields (external_id, metadata, hash, timestamps)
* feat(docs): idempotent upsert by external_id (API/CLI)
* test(docs): cover upsert idempotency (api/sql/cli)
* docs: document docs upsert endpoint and rag-upsert-docs
* docs: update roadmap with stabilization/refactor PR and incorporate quick-wins
* refactor(structure): move api schemas/diagnostics and core helpers; keep shims
* refactor(imports): prefer new module locations (app.schemas, core.services, app.diagnostics)
* refactor(structure): drop root shims and enforce app/core module boundaries
* docs: update roadmap and architecture after removing root shims
* feat(ingestion): add txt/md/csv loaders with discovery + sniffing
  - Add TextFileLoader and MarkdownLoader
  - Add discover_files() limits (max files / bytes) for safe dir ingestion
  - Add loader factory with best-effort format detection (extension + bytes heuristics, optional python-magic)
  - Extend CSVLoader with delimiter sniffing and row_index metadata
* feat(cli): add rag-ingest (dir/file) with SQL+FAISS consistency
  - Add CLI/entrypoint (files/dirs, limits, dry-run)
  - Ingest via external_id upsert per chunk and delete stale chunks by source prefix
  - Dense/hybrid: incremental vector updates, fallback rebuild on failure
  - Add SqlDocumentStorage.list_ids_by_external_id_prefix() helper
  - Add optional extra for python-magic
* test(ingestion): cover rag-ingest + loader detection/discovery
  - CLI ingest: mixed dir, idempotency, and stale-chunk deletion
  - Factory detection: prefer content heuristics over misleading extensions
  - Discovery: enforce limits and handle broken symlinks
  - Update CSVLoader unit tests for row_index metadata
* docs: document rag-ingest and optional python-magic
  - README: add rag-ingest to CLI commands and mention magic extra
  - custom usage guide: add a CLI ingestion section
* fix(ingest): avoid external_id prefix collisions when syncing stale chunks
  Pass a delimiter-suffixed prefix when listing existing external_ids so /tmp/foo does not match /tmp/foo2.
* feat(ingestion): configurable cleaning + deterministic chunking (chars_v1)
  - Add chunk_chars_v1 with stable boundaries (start/end offsets)
  - Make preprocess_text configurable via keyword options
  - Add settings flags for ingestion cleaning and chunk strategy
  - Provide helpers to build preprocess/chunk functions from Settings
* feat(ingest): store chunk metadata (chunk_index/offsets/parent_doc_id)
  - rag-ingest now records chunk offsets and parent_doc_id in SQLite metadata
  - bootstrap uses Settings-driven cleaning/chunking for determinism
* test(chunking): validate boundaries/overlap and ingestion metadata
  - Add unit tests for chunk_chars_v1 determinism, boundaries and overlap
  - Extend rag-ingest CLI test to assert prepared metadata (chunk_index, offsets, parent_doc_id)
  - Add coverage for preprocess_text options
* feat(dedup): add chunk-level hash column + index and helper
  - Add Settings.ingest_chunker_version to version the dedup key
  - Add chunk_dedup_sha256 helper (sha256(cleaned_text + chunker_version + embedding_model_name))
  - Extend documents schema with chunk_dedup_sha256 and a partial unique index
  - Allow upsert to persist chunk_dedup_sha256 when provided
* feat(api): dedup /api/docs ingestion via chunk hash upsert
  - Clean + chunk texts using Settings-driven pipeline
  - Compute external_id as chunk hash (includes chunker_version + embedding model)
  - Upsert into SQLite and update FAISS incrementally in dense/hybrid mode
* test(dedup): cover hash-based /api/docs idempotency and SQL constraint
  - Verify /api/docs dedup is idempotent across re-ingests
  - Verify changing ingest_chunker_version produces new inserts
  - Assert chunk_dedup_sha256 unique index is enforced
* docs: document chunker version and dedup column
  - Add chunking strategy/version env vars to README and .env.example
  - Mention chunk_dedup_sha256 schema column
* fix(db): run SQLite compatibility migrations for CLI/scripts
  CLI/scripts can be executed without starting FastAPI, so ensure the same best-effort SQLite migrations (AUTOINCREMENT + identity columns) run before using ORM-mapped queries.
* feat(delete): add tombstones and delete-by-external-id in SQL repo
  - Add DocumentTombstone table (external_id unique)
  - Add SqlDocumentStorage delete_by_external_ids + tombstone helpers
  - Tombstones prevent deleted identities from reappearing on future ingestions
* feat(api): delete by external_id with tombstones
  - Add /api/docs/delete_by_external_id endpoint
  - Filter tombstoned external_ids from /api/docs ingestion so deletes don't reappear
  - Reject tombstoned external_ids on /api/docs/upsert
* feat(cli): rag-delete-external-ids
  - Add CLI command delete-external-ids (SQL + FAISS when dense/hybrid)
  - Respect tombstones during rag-ingest and block upserts for tombstoned external_ids
  - Add project script entrypoint rag-delete-external-ids
* test(delete): cover external_id delete (tombstones) + ask_eval + rebuild
  - Sparse: delete external_id prevents re-ingest and removes results from ask_eval
  - Dense: delete external_id updates vector store and survives rebuild-index
* docs: document delete-by-external-id (API/CLI)
  - Add rag-delete-external-ids and /api/docs/delete_by_external_id to usage guides
* feat(index-manifest): persist manifest and report drift in ready/status
* chore(mypy): avoid optional python-magic typing issues
* test(index-manifest): cover manifest drift and rebuild recovery
* docs(index-manifest): document index_manifest and drift detection
* test(index-manifest): unit coverage for manifest helpers
* feat(reranker): optional overlap reranker behind settings
* feat(observability): structured logs + domain metrics for ingest/query
* feat(eval): add rag-eval CLI with versioned dataset and regression gate
* test(eval,reranker): cover offline eval gate and reranking behavior
* docs(pr10): document rag-eval, monitoring, and optional reranker
* PR11: Transactional consistency and concurrency hardening (SQL + index) (#27)
* fix(consistency): preflight 
manifest drift before index mutations * fix(maintenance): raise explicit multi-store inconsistency error on double failure * ci: add windows smoke tests for faiss persistence and consistency paths * fix: prevent SQL/vector drift when embedding fails Precompute dense/hybrid embeddings before SQL upsert in API and CLI flows so provider failures cannot leave partially committed SQL without vector updates. Add SqlDocumentStorage.get_existing_doc_states_by_external_id to embed only inserts/content updates (preserving idempotent no-op behavior). Add regression tests for /api/docs, /api/docs/upsert, rag-upsert-docs, and rag-ingest to verify no SQL persistence on embedding failures. * fix: harden rag service reload token writes Use unique temp files + fsync when writing .rag_service_reload_token to avoid .tmp path collisions across workers. Add regression test ensuring _write_reload_token does not reuse temp source paths across consecutive writes. * docs(pr): add PR11 hardening consistency description * fix(api): serialize multi-store writes and harden generation params Prevent concurrent mutating API operations from interleaving SQL and FAISS updates by running write paths under a shared cross-process lock. Add strict bounds for generation sampling params (temperature/top_p/max_tokens) in OpenRouter requests and AskEval top_p validation. Add regression tests for concurrent upsert consistency and invalid sampling parameter rejection. 
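The reload-token hardening described above (unique temp files plus fsync) roughly follows the pattern sketched below. This is not the project's `_write_reload_token`; the function name `write_token_atomically` and its signature are assumptions, and only standard-library calls are used.

```python
import os
import tempfile
from pathlib import Path


def write_token_atomically(target: Path, payload: str) -> None:
    """Write payload to target via a unique temp file, fsync, then rename."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # mkstemp picks a fresh name on every call, so concurrent workers
    # never share one fixed ".tmp" path.
    fd, tmp_name = tempfile.mkstemp(
        prefix=target.name + ".", suffix=".tmp", dir=target.parent
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, target)  # atomic swap on the same filesystem
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Creating the temp file in the target's own directory keeps the final `os.replace` on one filesystem, which is what makes the swap atomic.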
* chore: fix import ordering and include roadmap updates * fix(api,lock): prevent duplicate ingest pass and fail closed on lock acquisition * fix(cli): serialize mutating commands with shared multi-store write lock * docs(pr11): include integral audit findings, fixes, and verification * fix(faiss): fail closed on missing manifest and lock acquisition * fix(api,cli): always invalidate cached rag service after mutating attempts * chore(lint): align isort with ruff and normalize import ordering * docs(pr11): document new critical audit findings and validations * fix(delete): deduplicate external_ids to prevent tombstone integrity failures * fix(openai): enforce request timeouts with compatibility fallback * chore(format): normalize TYPE_CHECKING import block in evaluation service * docs(pr11): add critical audit findings for delete dedup and OpenAI timeouts * fix(security): enforce runtime API-key guard for non-local requests * fix(coordination): align lock and reload token paths across processes * docs(pr11): record critical findings 14-15 and runtime security notes * refactor: centralize dense upsert consistency flow and shared locking/client helpers * test: add regression coverage for shared dense sync, manifest config, and OpenAI client fallback * docs: update PR11 audit notes and include curso indices * fix(security): fail closed when client host is unavailable * fix(consistency): make dense delete paths lazy-load embedders * fix(ingest): honor no-follow-symlinks for symlink directory inputs * refactor(cli): centralize dense embedder construction across commands * docs: clarify symlink ingest scope and dense delete behavior * polish * refactor(core): simplify lock orchestration and dense precompute flow * test(app): cover blocking worker env parsing and execution path * perf(ingest): avoid duplicate format detection and reuse precomputed result * perf(text): precompile whitespace regex in preprocessing * perf(ingest): batch rag-ingest mutations to reduce 
upsert/lock churn * perf(sparse): cache docs in memory and remove per-query SQL get * chore(lint): move NDArray import under TYPE_CHECKING * perf(sparse): avoid duplicate SQL loads when building retrievers * fix(security): enforce API key for forwarded non-local requests * fix(consistency): harden multi-store script locking and fail-fast build-index * fix(security): enforce API-key guard for RFC7239 Forwarded hosts * fix(consistency): preflight vector mutability before SQL deletes * fix(sqlite): tolerate duplicate-column races in identity migration * ci: harden workflow gates and pin actions by SHA * build: make Docker reproducible and enforce strict security target * docs: document CI gates and contributor verification workflow * refactor: centralize composition, harden multistore deletes, and split eval layering * fix(ingest): handle unicode text detection and unreadable files Treat non-ASCII UTF-8 text as textual content during loader detection by using Unicode-aware printability checks. This prevents valid multilingual .txt files from being misclassified as unknown and silently skipped by rag-ingest. Also harden detect_file_format to fail soft on unreadable files (e.g. permission errors), returning unknown/read-error instead of raising and aborting the full ingestion run. Add regression tests for: - UTF-8 non-ASCII plain text detection - read-head permission error handling - end-to-end CLI ingest with non-ASCII text Update README ingestion notes to document the behavior. * refactor(composition): unify runtime wiring policy across api, cli, and scripts Centralize adapter-selection policy in app.composition and reuse it from API factory, ask_eval retriever wiring, CLI dense embedder paths, and bootstrap/build-index scripts. Changes: - Add DEFAULT_DENSE_BACKEND_MESSAGE in composition. - Make build_dense_embedder_from_settings use the shared default when message is omitted. 
- Add build_retriever_with_default_embedder_from_settings to remove duplicated dense-embedder closure wiring. - Add resolve_preferred_llm_provider and reuse it in app.factory. - Update API/CLI/scripts to call shared composition policy helpers. - Add unit tests for new composition helpers and provider policy. This keeps existing monkeypatch seams in api/factory tests while consolidating policy decisions in one reusable module. * refactor(app): extract docs and index use-cases from api router Move business orchestration for /api/docs*, /api/docs/upsert, /api/docs/delete*, and /api/index/rebuild into app-layer services. Changes: - Add app/services/docs.py with sync use-cases for ingest/upsert/delete flows and typed summaries. - Add app/services/index.py with rebuild_index_sync orchestration. - Keep api_router transport-focused: request normalization, HTTP mapping, observability, and response serialization only. - Preserve existing monkeypatch seams by injecting adapter factories/callbacks from api_router into services. - Update architecture docs to reflect app/services docs/index layering. Validation: - ruff check . - mypy . - pytest -q (335 passed, coverage >= 85%). 
* refactor(cli): partition commands by bounded context modules - split monolithic cli.py into domain command modules: docs, index, eval, server - keep cli.py as composition root + entrypoint wrappers + shared hooks - preserve command surface/flags and existing monkeypatch seams - add registry/entrypoint tests for command wiring - update architecture docs with CLI layering * refactor(factory): replace reload token file with DB-backed system_state version - introduce SystemStateStorage adapter in SQLAlchemy persistence - persist rag service cache version in system_state (key: rag_service) - remove filesystem token invalidation from app factory - keep process-local cache maxsize=1 while enabling cross-process invalidation via DB version - add focused tests for factory invalidation and system_state storage behavior - update architecture docs to reflect DB-backed invalidation * refactor(blocking): add task-type pools with explicit pending limits - partition run_blocking work into task types: default, mutation, network, eval - add per-type worker and queue limits via env-configurable policies - enforce max pending per task type to prevent unbounded contention - route API run_blocking calls with explicit task_type labels - extend blocking and multistore tests for routing, queue-full behavior, and slot release guarantees - document async/sync boundary policy in architecture guide * test(app): add mandatory chaos scenarios for mutating operations - add chaos coverage for vector write failure during upsert - add lock acquisition failure scenario for mutating upsert path - reproduce crash window: SQL committed while vector upsert+rebuild fail - add rebuild-index failure scenario with cache invalidation assertion * refactor(app): introduce docs/index application ports to reduce infra coupling - add app-level ports module with DocsMutationPorts and IndexMutationPorts contracts - refactor docs and index use-cases to depend on port bundles instead of infra 
callables - centralize docs/index dependency wiring in api_router via _docs_mutation_ports/_index_mutation_ports - keep existing monkeypatch seams intact by resolving router symbols at runtime - add wiring tests for docs/index ports binding - update architecture docs to document new app ports layer * fix(security): fail closed on ambiguous proxy forwarding chains * fix(api): return 502 for malformed openrouter responses * fix(cli): validate duplicate external_id before embedding work * refactor(app): split transport routers and enforce typed provider errors Introduce typed cross-layer LLM errors in core and remove FastAPI exceptions from infra adapters. Add app-layer HTTP error mapping, extract OpenRouter into dedicated router/service, and move docs/index mutation port wiring into reusable builders for API/CLI. Expand tests for error mapping, OpenRouter service behavior, mutation port builders, and updated adapter/router mappings. * refactor(cli): reuse app docs services and add multiprocess lock regression Route docs CLI mutations through app services and shared mutation port builders to remove duplicated orchestration logic. Add real spawn-based integration coverage for cross-process write-lock serialization and update architecture docs for bounded routers and error-layer boundaries. * refactor(api): reduce root router to composition and split bounded routers Extract health, rag, docs and index endpoints into dedicated router modules and move runtime wiring/composition helpers to app/wiring.py. Migrate ingest schemas into app/schemas and adjust unit tests to patch the new seams (wiring and bounded routers) while preserving endpoint behavior. Update architecture docs to reflect api_router composition-only role and new router/wiring structure. * hotfix-refactor: arch * chore(rescue): snapshot mixed worktree before pr04 split * update(ci): split branch glob patterns in CI for granular control over triggering. 
Included release** pattern * add(pre-commit): included pre-commit in TO PROJ. deps * config * ci(security): pin safety v2 and use stable check command * updated doc, repo structure * fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution * fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution * fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution * fix(working): fixing pre-commit + CI + bandit (sec) triggering and execution * clean: README * release: readability/quick improvements (#32) * refactor: replace assert guards with explicit RuntimeError + low-risk cleanups - Replace all production `assert x is not None` with explicit `if x is None: raise RuntimeError(...)` across sql_.py, index.py, composition.py, factory.py, docs_ingest.py; asserts are disabled under `python -O`, while the new form is visible in all execution modes. - Extract `_to_domain_document()` module-level helper in sql_.py; eliminates duplicate 11-line ORM→domain mapping in `get()` and `get_all_documents()`. - Replace `getattr(db_doc, "field", None)` fallbacks with direct attribute access in sql_.py; all columns are explicitly mapped in the Document ORM model, whereas getattr implied optional columns that are in fact guaranteed present. - Add `FaissIndex._locked_write()` context-manager abstracting the repeated pattern `with self._state_lock, _exclusive_file_lock(self._lock_path)` (5 call sites → 1 definition). - Add `Settings.ingest_batch_size` field (default 64, env-overridable via INGEST_BATCH_SIZE); removes magic number hardcoded in `_execute_ingest_batches`. All changes are behaviour-preserving. lint/mypy clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(sql): extract _DocumentChanges + _detect_document_changes, add change-variant tests Motivation: the upsert change detection block mixed four independent boolean checks inline, which made it hard to test in isolation and easy to introduce regressions when adding a new field. 
Changes: - Add module-level `_DocumentChanges` frozen dataclass with `any_changed` property. - Add module-level `_detect_document_changes(db_doc, item, new_content, new_sha)` function, placed after `SqlDocumentStorage` so `SqlDocumentStorage.UpsertDoc` is already defined in scope. - Replace the 14-line inline detection block in `upsert_documents_by_external_id` with a single `changes = _detect_document_changes(...)` call. - Remove now-redundant local `content_changed` variable; reference `changes.content`. Tests added (test_sql_upsert_documents_by_external_id.py): - `test_upsert_metadata_only_change` — metadata diff → action=updated, content_changed=False - `test_upsert_source_only_change` — source diff → action=updated, content_changed=False - `test_upsert_dedup_only_change` — dedup diff → action=updated, content_changed=False These variants were not covered; they exercise all four branches of `_DocumentChanges`. lint/mypy clean, 6/6 upsert tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(ingest): replace tuple aliases IngestPlan + BatchSyncResult with frozen dataclasses Motivation: positional tuple unpacking of 4 and 7 fields respectively makes call sites opaque — a field reorder would silently produce wrong behaviour. Changes: - Add `from dataclasses import dataclass` import. - Replace `IngestPlan = tuple[Path, str, tuple[Any, ...], tuple[str, ...]]` with `@dataclass(frozen=True) class IngestPlan` with named fields (file_path, file_prefix, items, desired_external_ids). - Replace `BatchSyncResult = tuple[int, int, int, bool, int, int, list[...]]` with `@dataclass(frozen=True) class BatchSyncResult` with named fields (inserted, updated, unchanged, rebuilt, deleted_stale, ingested_chunks, stale_by_file). - Update `_build_file_ingest_plan`: return `IngestPlan(...)` instead of bare tuple. - Update `_build_ingest_plans`: `plan[2]` → `plan.items`. 
- Update `_collect_batch_items_and_stale`: replace 4-way destructuring `for a,b,c,d in` with `for plan in` + attribute access. - Update `_ingest_batch_sync`: return `BatchSyncResult(...)` instead of 7-tuple. - Update `_execute_ingest_batches`: replace 7-way destructuring with `batch = ...; batch.field`. Behaviour: identical. lint/mypy clean. 82 ingest-related unit tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(factory)+docs(settings): consolidate PR-A/PR-D and runtime defaults Includes all pending modified files in this branch, as requested. - PR-A: extract _collect_container_overrides() from _build_container() in app/factory.py while preserving monkeypatch seams, override keys, and resolution order. - PR-D: formalize INGEST_BATCH_SIZE in .env.example and README (config table + ingestion tuning section). - Add settings validator tests for ingest_batch_size bounds: accepts 1/512 and rejects 0/513. - Consolidate Ollama default model/docs updates to lfm2.5-thinking across settings, README, docker-compose comment, and custom usage guide. 
--------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * version: semantic versioning alignment for release v1.0 * fix(core): clean prompt template wording regression * refactor(core): split service DTOs into types/results * feat(infra): add sql/vector/shared persistence adapters * refactor(app): wire vector/sql adapters and drop legacy faiss/sqlalchemy paths * test(refactor): migrate suites to sql/vector persistence layout * docs(architecture): update vector/sql module boundaries and usage * chore(lock): sync uv.lock project version metadata * chore(docs): normalize whitespace in app architecture note * fix(ci): updated windows smoke tests paths for ci workflow * fix(ci): updated windows smoke tests paths for ci workflow * fix(norecurse): added uv_cache * refactor(core): introduce DocId and lineage domain contracts * refactor(core-services): propagate DocId through orchestration and retrieval * refactor(sql): move to doc_id string primary key and string history source_ids * chore(sql): enforce fresh-install-only schema bootstrap without legacy migration * refactor(vector): store DocId strings in id_map and vector repository interfaces * feat(ingestion): require lineage in LoadedItem and emit lineage across loaders * refactor(app): expose string document IDs across services, schemas, routers and diagnostics * refactor(cli): switch docs operations and batch ingestion flows to string DocIds * test: update suites for DocId string contract and fresh-install SQL behavior * docs: document fresh-install-only policy, string IDs, and lineage requirement * fix(sql): preserve input order in document get(ids) * build(uv): standardize dependency groups and lock policy * feat(core): harden write-lock and error guarantees Introduce explicit write-lock timeout handling and map it to API-level 503 responses. 
Key changes: - add WriteLockTimeoutError to core error taxonomy - enforce timeout-aware file/multi-store write lock behavior - expose lock/recovery tuning in settings for deterministic ops behavior - remove silent rollback swallowing in ETL and surface incomplete rollback as runtime error - bound ingestion pipeline memory via configurable batched flushes * feat(mutation): add durable multi-store saga coordinator Implement a single mutation coordinator for SQL+vector writes with incremental delta semantics and durable journal state machine. Key changes: - add storage profile capabilities and enforce write floor at DURABLE_SAGA - add file-backed mutation journal with PREPARED/COMMITTED/FAILED recovery states - extend SQL storage with before-image snapshot and deterministic restore helpers - extend vector contract with apply_delta_atomic for delete+upsert as one operation - wire recovery on startup plus background loop, and expose incomplete-journal diagnostics/health warnings * refactor(app): collapse orchestration layers and cut legacy write surface Remove the ambiguous app/services layer and move orchestration to explicit application use-cases + coordinator path. Key changes: - replace app/services with app/contracts + app/wiring + application use-case modules - rename RAG modules by responsibility (rag_query_use_case, rag_router, rag_api_models, rag_runtime) - make /api/docs/mutate the canonical write endpoint and remove legacy upsert/delete endpoints - align CLI to rag-mutate-docs + rag-ingest + rag-rebuild-index and drop legacy delete/upsert commands - update script entry points to match the breaking CLI surface * test(architecture): enforce layer boundaries and unified mutation path Update tests to the new write contract and add guardrails so architecture drift fails fast. 
Key changes: - add layer-boundary checks (no app.services imports; no invalid cross-layer dependencies) - add explicit mutation boundary tests ensuring HTTP/CLI writes terminate in MutationCoordinator.execute - update endpoint/CLI tests for removed legacy routes and commands - refresh RAG/ingest/index tests for renamed modules and new contracts * docs(release): update architecture and usage for v1.0 mutation model Document the final v1.0 cut: durable saga writes, unified mutation surface, and explicit repair workflows. Key changes: - rewrite architecture docs around single application orchestration layer and storage profiles - update custom usage guide with canonical mutate/ingest/rebuild flows - align docs navigation and README with breaking HTTP/CLI changes - ignore local mutation journal artifacts under data/.mutation_journal * refactor(factory): remove dynamic override collector and wire AppContainer explicitly * refactor(cli-structure): move docs commands into docs package namespace * refactor(ingestion): remove legacy script wrappers and call sample_data_ingestion directly * refactor(index-wiring): drop BuildIndexPorts and deprecated build_index_sync path * feat!(cli): unify bootstrap/build-index into bootstrap and remove rag-build-index * chore(app): remove unused build_rag_service re-export from dependencies * fix(tests): make asgi_client depend on in_memory_sqlite fixture * fix(router): replace unsafe assert with RuntimeError in openrouter generate assert statements are stripped by python -O; a failed isinstance check would produce an AttributeError with no context. Raise an explicit RuntimeError with a descriptive message so production logs surface the real problem. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(mutation-journal): warn before resetting unrecognized saga state to PREPARED _record_from_dict silently coerced any unknown state value to PREPARED, which could cause double-execution of a mutation during recovery from a partially-committed journal. Add a logger.warning so operators are alerted to version mismatches or journal corruption. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(ports): drop 4 dead fields from DocsMutationPorts across app layer precompute_vectors_fn, sync_dense_fn, delete_docs_fn, and delete_external_ids_fn were stored on DocsMutationPorts but never read by MutationCoordinator after the saga refactor. Remove them from the dataclass, all construction sites (wiring, container, factory), and update the two tests that asserted their presence. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(contracts): purge legacy one-shot mutation result DTOs UpsertDocsSummary, DeleteDocsByExternalIdSummary, and DeleteDocsSummary pre-date the MutationCoordinator and were never produced or consumed after the saga refactor. Remove them from results.py and the contracts __all__. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cli): remove infrastructure import from docs ingest plan builder _build_file_ingest_plan was importing SqlDocumentStorage directly to access SqlDocumentStorage.UpsertDoc, crossing the infrastructure boundary from the CLI layer. Thread build_upsert_doc through _build_ingest_plans and source it from the already-constructed DocsMutationPorts in ingest_cmd. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(scripts,app,cli): replace print with logging, fix dead code, tighten exports sample_data_ingestion.py: - print() → logger.info() (progress reporting belongs to the logging framework) - magic literal 128 → settings_obj.ingest_batch_size (consistent with pipeline) - move logger = getLogger(__name__) after all imports (fixes ruff E402) factory.py: - replace unreachable second if _APP_CONTEXT is None with a local var ctx so mypy type-narrows correctly without an impossible branch runtime.py: - remove private _reset_rag_service_best_effort from __all__ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(integration): replace capsys with caplog for ingestion confirmation assertions run_sample_data_ingestion now emits progress via logger.info instead of print(); update four integration tests to capture log records with caplog instead of stdout with capsys. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: QW1 app/composition.py resolve_preferred_llm_provider now returns "openrouter" when it is the only configured provider QW2 app/application/index.py rebuild_index_sync passes backend= to vector_repo_factory, consistent with _apply_vector_delta QW3 app/main.py logger.error(f"...") → logger.error("...", e) (Ruff G004) QW4 app/blocking.py _pending_snapshot simplified to return state.pending_snapshot(); dead branches removed QW5 app/application/mutations.py mutation_attempted flag removed in run_api_mutation and run_cli_mutation; reset_after() always runs in finally QW6 app/contracts/ports.py Return type corrected: list[str] → list[DocId] in DocsRepositoryPort QW7 core/services/rag_runtime.py Spanish string → constant NO_DOCS_ANSWER = "No documents are indexed..." 
+ no_docs_answer parameter in RagService.__init__ * feat: Task 8: Write lock contract. Added WriteLockPort Protocol in app/contracts/ports.py with explicit (*, coordination_dir, timeout_s, poll_s) signature; updated DocsMutationPorts.write_lock from Callable[..., Any] to WriteLockPort. Removed except TypeError fallback from write_lock.py; direct _exclusive_file_lock(path, timeout_s=…, poll_s=…) call. Simplified _write_lock() in docs_mutation.py; removed try/except, direct call only. Updated _fake_lock in test_write_lock.py and _broken_lock in test_mutation_chaos.py to accept the full keyword signature. Task 10: VectorStorage DI. storage.py: Added settings_obj: Settings | None = None; stores self._settings; removed bare global settings import. composition.py: Passes settings_obj=settings_obj to vector_repo_factory. sample_data_ingestion.py: Passes settings_obj=settings_obj to VectorStorage. test_vector_storage_manifest_guard.py and test_composition.py: Updated assertions/calls accordingly * fix(lifespan): fix lifespan removing old entries on pytest.fixture to avoid dirty loading * fix(lifespan): asgi_client now accepts tmp_path and monkeypatch as parameters. Before entering the lifespan it sets settings.data_dir to an absolute per-test tmp_path / "data". Since get_coordination_dir() returns data_dir.resolve() for absolute paths, every test gets a fully isolated journal directory. No deletion, no race condition * added push branch to ci file * revert -.-" * fix(ports): add ntotal read-only health probe to VectorRepoPort `delete([])` was a destructive write used as a preflight check in maintenance operations. Replacing it with a pure read-only `ntotal` property removes the side-effect and makes the intent explicit in the port contract. 
- VectorRepoPort: declare `@property ntotal() -> int` in the Protocol - VectorStorage: implement ntotal by proxying vector_index.ntotal Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(maintenance): ntotal preflight, result dataclasses, shared embedder resolver Three interrelated improvements to delete_documents_multi_store and delete_external_ids_multi_store: 1. Preflight now reads vec_repo.ntotal instead of calling delete([]), eliminating the unintended write side-effect (pairs with ports fix). The redundant `if deleted_index_raw is not None` dead branch is removed since VectorRepoPort.delete() always returns int. 2. n-tuple returns replaced with frozen dataclasses: - MultiStoreDeleteResult(deleted_sql, deleted_index, rebuilt) - MultiStoreExternalIdDeleteResult(deleted_sql, deleted_index, missing_external_ids, tombstoned, rebuilt) Named fields prevent silent positional-ordering bugs at call sites. 3. The embedder resolution pattern (try factory; raise if still None) was duplicated 4 times across the two functions. Extracted into _resolve_embedder_or_raise() private helper. Error message strings promoted to module-level constants to eliminate literal duplication. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(ports): replace Any with concrete types in mutation port contracts Any in frozen dataclass fields is invisible to mypy and defeats the purpose of the port abstraction. Four fields are now statically typed: - DocsMutationPorts.vector_repo_factory: Callable[..., Any] → Callable[..., VectorRepoPort] - DocsMutationPorts.storage_profile_registry: Any → StorageProfileRegistry - IndexMutationPorts.doc_repo_factory: Callable[[], Any] → Callable[[], DocumentRepoPort] - IndexMutationPorts.vector_repo_factory: Callable[..., Any] → Callable[..., VectorRepoPort] Builders in mutation_ports.py and container.py updated to match; the redundant cast("Any", ...) in AppContainer.index_mutation_ports removed. 
Also add inline comments to document two intentional design decisions that were previously silent: - UpsertDocBuilderPort returns object (not TypeVar) by design - evaluation.py noqa: F401 import is a required SQLAlchemy side-effect Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(infra): narrow except Exception to specific types where safe Broad except Exception clauses in infrastructure code mask unexpected errors and reduce log signal. This commit tightens four files: factory.py: - _decode_text_sample(): Exception → UnicodeDecodeError (both branches; Latin-1 decode never raises in practice, but annotation stays honest) - detect_json_export_format(): Exception → json.JSONDecodeError (decode uses errors="replace" so only loads() can raise) - _magic_mime(): kept as Exception with comment — optional C extension (libmagic) can raise arbitrary binding errors beyond ImportError/OSError id_map_json.py: - load_id_map_json(): Exception → (json.JSONDecodeError, UnicodeDecodeError) (the two exact failure modes of json.loads(bytes.decode())) alchemy_engine.py / index.py: - Rollback/recovery guards kept as except Exception with explanatory comments: they intentionally catch all failures (including Python-level AttributeError etc.) so the original exception always propagates cleanly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(deps): bump fastapi 0.111→0.115, anyio 4.3→4.12; add python-multipart explicitly Security: starlette 0.37.2 had three DoS CVEs in multipart parsing and Range-header processing. Upgrading fastapi 0.111→0.115 pulls in starlette 0.46.2, which resolves CVE-2024-47874 (<0.40.0). anyio 4.3.0 had a thread race condition (PVE-2024-71199) in multi-event- loop environments, fixed in 4.4.0. The old pin anyio>=4.3,<4.4 explicitly excluded the patched release; now >=4.4,<5. python-multipart was a transitive dependency of fastapi 0.111 and is no longer bundled in 0.115+. 
Added as an explicit dependency so form/file upload endpoints continue to work. starlette 0.37.2 → 0.46.2 (CVE-2024-47874 fixed) anyio 4.3.0 → 4.12.1 (PVE-2024-71199 fixed) fastapi 0.111.1 → 0.115.14 465 tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(deps): bump fastapi 0.115→0.124 + starlette 0.46→0.50; all CVEs resolved FastAPI 0.124 widens the starlette constraint to <0.51.0, allowing the resolver to reach starlette 0.50.0. This resolves the remaining two CVEs that 0.115→0.46.2 could not address: CVE-2024-47874 starlette <0.40.0 → fixed in 0.40+ (was already done) CVE-2025-54121 starlette <0.47.2 → fixed in 0.50.0 ✓ CVE-2025-62727 starlette <0.49.1 → fixed in 0.50.0 ✓ starlette 0.46.2 → 0.50.0 (forced via --upgrade-package starlette) fastapi 0.115.14 → 0.124.4 httpx pin >=0.27,<0.28 remains compatible: starlette[full] allows 0.27-0.29. pydantic pin >=2.0,<3.0 remains compatible: FastAPI 0.124 requires >=1.7.4. 465 tests pass, mypy and ruff clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore(tooling): add gitleaks + check-json + debug-statements hooks; scaffold mypy test override - gitleaks v8.24.2: secret/credential detection (gap not covered by bandit) - check-json: validates JSON files on commit (check-yaml/toml were already present) - debug-statements: catches stray breakpoint()/pdb imports before they reach CI - [[tool.mypy.overrides]] for tests.*: lenient scaffolding (disallow_untyped_defs=false) for when tests/ gains __init__.py and is removed from the mypy exclude Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: fix stale references after FastAPI bump and refactor - README.md: update FastAPI badge 0.111 → 0.124 - README.md: add cli_commands/ to project structure tree; fix scripts/ description (only sample_data_ingestion.py, not bootstrap) - docs/USAGE.md: sessionmaker(bind=engine) → sessionmaker(engine) (SQLAlchemy 2.x removed the bind= keyword from sessionmaker) Co-Authored-By: Claude 
Sonnet 4.6 <noreply@anthropic.com> * docs: up-to-date. Polish + fix. * docs: up-to-date. Polish + fix. * docs: add transport isolation contract + 1.1.0 changelog entry architecture.md: document golden rule (delete app/ → core/services still work), current compliance (ingest/import use cases accept Sequence[str]/bytes), and the router's responsibility to convert HTTP types before use-case call. CHANGELOG.md: add [1.1.0] entry covering security CVE fixes, dep bumps, port contract changes, refactors, and tooling additions from the sprint. Clear stale [Unreleased] content (app/services/* references predated the app/contracts/ rename already shipped in 1.0.0). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat!: screaming architecture refactor — eliminate app/, decouple FastAPI BREAKING CHANGE: the `app/` package is removed entirely. - Use cases move to `core/use_cases/` (transport-agnostic). - DI composition root moves to `composition/` (transport-neutral). - HTTP adapter lives in `http/` (FastAPI optional via `[server]` extra). - Observability moves to `infrastructure/observability/`. - Blocking executor moves to `infrastructure/concurrency/`. - Storage profiles move to `core/domain/profiles.py`. - ASGI entry point: `local_rag_backend.http.main:app`. - `AskEvalConfigLike` Protocol replaces concrete Pydantic schema in container public signatures. - `map_runtime_error` moves from `http/error_mapping.py` to `core/use_cases/errors.py`. - Layer dependency rules enforced by architecture guard tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: fix stale app/ references after FastAPI bump and refactor - Architecture guard tests now scan `http/routers/` (not defunct `app/routers/`). - `test_http_routers_do_not_import_infrastructure_directly` refined to allow cross-cutting infrastructure (concurrency, observability) while still blocking direct imports of persistence, retrieval, LLM, and embedding adapters. 
- Stale path comment removed from `http/api_router.py`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * stabilize: refactor @architecture.md. polish on http.main * refactor(mutation): extract coordinator handlers and shared-session uow * refactor(app): migrate use-cases and transports to port-based wiring * test(architecture): enforce strict use-case boundaries in CI * docs(architecture): close phase B and add persistence-topology ADR * chore(cleanup): remove legacy seams and relocate lock/dataset internals * refactor(mutation-core): split docs mutation into orchestrator + batch + saga modules * refactor(sql): split alchemy engine into focused modules with compatibility shim * refactor(composition): add AppContainer.from_settings while preserving injectable __init__ * refactor(cli-ingest): extract ingestion planner/discovery helpers * refactor(http-cli): consolidate mutation execution/runtime glue and openrouter wiring * refactor(core-infra): align dense/eval/vector/journal behavior and hardening tests * docs(sync): align README and docs with final topology * refactor(persistence): remove sql legacy module and simplify sqlite bootstrap contract * refactor(composition): centralize runtime wiring and simplify router/container access * fix(ingest): enforce strong idempotency and canonical mutation bootstrap flow * test(integration): add golden invariants suite for F1-F5 contract * docs(sync): align docs/changelog/license and remove legacy packaging docs * docs(readme): polish * release: cut v1.3.0 and close changelog unreleased * fix(ci): install server extra for typing and test jobs * docs(readme): hide pypi/download badges for release --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
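The commit log above mentions replacing n-tuple returns with frozen dataclasses such as `MultiStoreDeleteResult(deleted_sql, deleted_index, rebuilt)`. A minimal sketch of that idea follows. The field names come from the log, but the field types and the `summarize` helper are assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiStoreDeleteResult:
    """Outcome of a multi-store delete; field types are assumed for illustration."""

    deleted_sql: int    # items removed from SQL (assumed int)
    deleted_index: int  # items removed from the vector index (assumed int)
    rebuilt: bool       # whether the index was rebuilt (assumed bool)


def summarize(result: MultiStoreDeleteResult) -> str:
    """Hypothetical call site: fields are read by name, not by tuple position."""
    return (
        f"sql={result.deleted_sql} "
        f"index={result.deleted_index} "
        f"rebuilt={result.rebuilt}"
    )


result = MultiStoreDeleteResult(deleted_sql=3, deleted_index=3, rebuilt=False)
print(summarize(result))
```

With a plain tuple, swapping `deleted_sql` and `deleted_index` at a call site would go unnoticed because both are integers. Named fields make that mistake visible, and `frozen=True` rejects accidental reassignment after construction.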
Base: pr/02-2026-10 (stacked PR).

Context
The technical audit found critical consistency risks in concurrent mutating operations, as well as insufficient validation of generation parameters.

Main changes
1) Serialization of multi-store writes (SQL + FAISS)
- Lock module: `src/local_rag_backend/core/services/write_lock.py`.
- Mutating endpoints:
  - `POST /api/docs`
  - `POST /api/docs/upsert`
  - `POST /api/docs/delete`
  - `POST /api/docs/delete_by_external_id`
  - `POST /api/index/rebuild`

2) Hardening of generation parameters
- `OpenRouterGenerateRequest` now validates ranges:
  - `temperature` in `[0.0, 2.0]`
  - `top_p` in `[0.0, 1.0]`
  - `max_tokens` in `[1, 4096]`
- `AskEvalConfig.top_p` is now also validated in `[0.0, 1.0]`.
- `validate_rag_config()` includes an explicit `top_p` check.

Tests added/updated
- `tests/unit/app/test_multistore_write_lock.py`
- `tests/unit/app/test_request_limits.py`
- `tests/unit/app/test_api_router_more.py` (`top_p` in `ask_eval`)

Verification
- `pytest -q` ✅ (255 passed, coverage 85.29%)
- `ruff check src tests` ✅
- `black --check src tests` ✅
- `mypy src` ✅
- `isort --check-only src tests` ✅ (subsequently fixed and committed)

Commits included
- `5fa6c06` fix: prevent SQL/vector drift when embedding fails
- `fd412d8` fix: harden rag service reload token writes
- `9a33cb7` fix(api): serialize multi-store writes and harden generation params
- `69e715a` chore: fix import ordering and include roadmap updates
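The multi-store write serialization from change 1) could be sketched roughly as follows. This is not the content of `write_lock.py`: the helper name `multi_store_write_lock`, the lock-file technique (atomic creation with `O_EXCL`), and the timeout values are all assumptions chosen so the example runs on the standard library alone.

```python
import os
import threading
import time
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator

# Serializes writers inside one process; the lock file covers other processes.
_THREAD_LOCK = threading.Lock()


@contextmanager
def multi_store_write_lock(
    lock_path: Path, timeout_s: float = 10.0, poll_s: float = 0.05
) -> Iterator[None]:
    """Hold an exclusive lock file while a SQL + index mutation runs (hypothetical)."""
    deadline = time.monotonic() + timeout_s
    with _THREAD_LOCK:
        while True:
            try:
                # O_EXCL makes creation atomic: only one process can succeed.
                fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                break
            except FileExistsError:
                if time.monotonic() >= deadline:
                    raise TimeoutError(f"could not acquire {lock_path}")
                time.sleep(poll_s)
        try:
            yield
        finally:
            # Close before unlinking so the removal also works on Windows.
            os.close(fd)
            os.unlink(lock_path)


def upsert(lock_path: Path, log: list, name: str) -> None:
    """Hypothetical write path: both store updates happen under one lock."""
    with multi_store_write_lock(lock_path):
        log.append((name, "sql"))    # stand-in for the SQL upsert
        log.append((name, "index"))  # stand-in for the vector index update
```

Because each writer holds the lock across both steps, the SQL and index updates of two requests cannot interleave. One caveat of this particular technique: a process that crashes while holding the lock leaves a stale file behind, so an OS-level advisory lock or a dedicated locking library may be a better fit in production.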
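The range checks from change 2) can be illustrated with a small standard-library stand-in. The bounds are the ones stated in the PR description. The class name `GenerationParams`, the default values, and the use of a dataclass are assumptions, since the real `OpenRouterGenerateRequest` may express these bounds through its schema library instead.

```python
from dataclasses import dataclass


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError unless low <= value <= high (NaN is rejected too)."""
    if not (low <= value <= high):
        raise ValueError(f"{name} must be in [{low}, {high}], got {value!r}")


@dataclass(frozen=True)
class GenerationParams:
    """Hypothetical stand-in for the sampling fields of a generation request."""

    temperature: float = 0.7  # default is illustrative only
    top_p: float = 1.0        # default is illustrative only
    max_tokens: int = 512     # default is illustrative only

    def __post_init__(self) -> None:
        # Bounds taken from the PR description.
        _check_range("temperature", self.temperature, 0.0, 2.0)
        _check_range("top_p", self.top_p, 0.0, 1.0)
        _check_range("max_tokens", self.max_tokens, 1, 4096)


try:
    GenerationParams(top_p=1.5)
except ValueError as exc:
    print(f"rejected: {exc}")
```

Validating at construction time means an out-of-range value is rejected before any provider request is built, rather than surfacing later as a remote API error.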