feat: add OTLP-first observability foundation #1702
Merged
Co-authored-by: earayu <earayu@163.com>
…ility-design-dc2b # Conflicts: # aperag/domains/retrieval/pipeline.py Co-authored-by: earayu <earayu@163.com>
earayu added a commit that referenced this pull request on Apr 26, 2026
…ite proposal (#1725)

* docs(indexing): add indexing redesign design pack: first-principles rewrite proposal

Per earayu2 directive (#celery msg=56812dd6 + msg=d8080c08): full redesign of the document indexing system, prioritizing simplicity and reliability over feature breadth, targeting 100 concurrent docs, with hard-cut authorization (pre-launch / no users / no migration).

Design pack contents (1049 lines, 11 sections):

- §A: Current system analysis with file:line evidence (3-layer ownership skew, Python lease thread tied to worker process, graph index NOT replace-idempotent per nebula.py:354 upsert_entities, ~995 lines in tasks.py mixing infra + business)
- §B: First principles (single SoT in DB, idempotent convergence, source/derived/index three-layer separation, concurrency bounded by external capacity, simple > complex)
- §C: Three-layer document model (collections/<id>/documents/<id>/source/ + derived/parse_<v>/{markdown.md, chunks.jsonl, kg.jsonl, summary.json, vision/} + backend index stores)
- §D: Idempotency contract per modality (DELETE-by-(document_id, parse_version) before INSERT for all 5 modalities; fixes graph index append bug)
- §E: Concurrency model decision matrix (HTTP-only / lightweight task / Celery refactor); recommends lightweight Redis-backed asyncio worker pool per modality (5 worker processes, ~80-line reconciler, no Celery / no chord / no Python lease thread)
- §F: State machine + atomic flip (4 status values vs current 6; document.active_parse_version + pending_parse_version with transactional flip; deletion via async cleanup worker)
- §G: Multi-modal unified pipeline (Modality ABC with derive() + sync() contract; collapses earayu2's "Celery tasks go round and round and end up back at the graph index" complaint into 2 functions in 1 file)
- §H: Multi-tenant isolation (recommend simple: required tenant context + bulkheads, defer fairness machinery until observability shows real noisy-neighbor signal)
- §I: Failure recovery (3 modes: worker crash, transient backend, permanent failure; exponential backoff retry; Redis token bucket for LLM rate-limit backpressure)
- §J: Observability (4 SLI: index_lag_seconds, index_failure_rate, queue_depth, worker_utilization; OTLP wire; aligns with PR #1702)
- §K: Migration plan (7 PRs: observability primitives → idempotent indexers → object store layout → worker pool → atomic flip → cutover → availability discriminator; feature-flagged dual-stack during PR-D/E; cutover deletes ~3000 lines of Celery infrastructure)

Net delta: roughly +4150 / -4850 lines across 7 PRs, a net subtraction despite adding functionality. Indexing layer drops from ~2500 lines to ~1500.

Three open decisions deferred to earayu2:

1. Concurrency model: lightweight Redis-asyncio (recommended), HTTP-only, or Celery refactor
2. Atomic flip contract: all-modalities-ACTIVE-required (recommended) vs per-modality independent
3. PR sequence: 7-PR cut (recommended) vs combined

Sibling reference: Bryce msg=791082a4 + msg=38fbf962 first-principles analysis + architect msg=19f283d5 + msg=2ee66c89 4-blind-spot synthesis.

This design pack is the single canonical deliverable per earayu2's owner directive (@符炫炜 sole author of the final design).

* docs(indexing): redesign pack v2: incorporate earayu2 sign-off + answer derived/MinIO questions

Driven by earayu2 msg=cc0a00d7 + PM consolidation msg=32463d64.

- Drop Celery → lock Redis + asyncio (§E, decision matrix removed)
- Drop atomic flip → per-modality independent is_serving cutover (§F). Accept short eventual-consistency window per earayu2 directive.
- Answer derived/parse_<v>/ contents per modality (§C.6): chunks.jsonl shared by vector+fulltext, kg.jsonl, summary.json, vision/manifest.jsonl + images/, markdown.md + outline.json.
- Answer MinIO/object-store suitability (§C.7): ~150 MB / 100-doc burst, trivial; LocalFS / MinIO / S3-compatible all work; small-file + LIST caveats addressed.
- Add §L private/on-premise "deploy-and-forget": Tier 1 inline (SQLite + LocalFS, ~10 docs/hour), Tier 2 docker-compose (~100 concurrent), Tier 3 horizontal scale-out, all on the same code.
- §H tenant_scope_key forward-compat hook for future organization concept; simple Redis token-bucket quota that won't lock future fairness.
- §K restructure: 7 PRs → 3 waves (Foundation / Runtime / Cutover) with per-wave parallel-writability map.
- §G.5 SearchResultMetadata extends w/ parse_version + index_state_per_modality (becomes structurally required under per-modality independent flip).

PR #1725 v2; awaiting earayu2 final sign-off on Wave 1 kickoff.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(indexing): v2 amendment: Bryce review deltas (chunking trade-off / serving invariant / graph entity lineage)

Bryce v2 review msg=7ccb176f surfaced 3 substantive technical deltas, all agreed as must-address by PM msg=fc307bbf. Folded into v2:

§C.6: chunks.jsonl shared-by-vector-and-fulltext is now framed as a conscious trade-off (vector wants larger chunks, fulltext wants smaller) with an explicit shadow-file extension hook (chunks.fulltext.jsonl + namespaced sub-IDs) preserved so a future split is unblocked.

§F.1: partial unique index added at the schema layer:

    CREATE UNIQUE INDEX uniq_document_index_serving
        ON document_index (document_id, modality)
        WHERE is_serving = TRUE;

This makes the "at most one serving row per (doc, modality)" invariant DB-enforced, not orchestrator-enforced. SQLite 3.8+ supports the same syntax (Tier 1 deploy stays consistent).

§D.3: graph entity lineage model rewritten. Cross-document shared entities ("Linus" mentioned in 100 docs) cannot be cleared by simple DELETE-by-doc without losing other docs' contributions.
New model:

- source_lineage: SET<{document_id, parse_version, chunk_ids[]}>
- description_parts: SET<{document_id, parse_version, text}>
- sync = lineage-level DELETE+INSERT, entity GC when lineage empty

Includes per-entity serialization invariant for Nebula (read-modify-write without native list ops). 5-step idempotency self-test extension specified.

PR #1725 v2; ready for earayu2 / Bryce final ack.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(indexing): §D.3.2 amendment: lineage cleanup by document_id only

Bryce v3 implementation review (msg=464d5b70) caught a spec bug in §D.3.2 step 1b: the pseudocode used `(document_id, parse_version)` exact-match for the lineage filter, which contradicts §D.3.6 narrative step 3 ("doc_A v2 written (overwrites doc_A's old lineage)"). Strict exact-match would leave lineage[A,v_old] + lineage[A,v_new] coexisting after a re-parse, violating the expected supersede semantic.

Architect ruling (msg=80c5dc06) is to amend §D.3.2 step 1b to filter by `document_id` only (not parse_version). This makes sync(doc, v_new) self-contained for supersede; the orchestrator does not need to do an explicit clear-then-sync. §D.3.6 narrative remains canonical.

PR #1726 Wave 1 graph implementation follows the corrected algorithm.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(indexing): §J.1 amendment: failure_total + success_total counter pair (was single rate gauge)

huangheng Wave 1 CR (msg=8e67bf0e) flagged §J.1 spec drift: the T1.5 implementation emits an index_failure_total + index_success_total counter pair, not the single index_failure_rate gauge the spec called for.

Architect ruling: amend spec to match implementation. The counter pair is OTLP-idiomatic, preserves raw events, re-aggregates across workers without sliding-window state, and the rate is trivially computable downstream.

§J.1 spec amended; §K Wave 1 acceptance bullet updated.
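The §D.3 lineage model and the §D.3.2 amendment above (filter by `document_id` only) can be illustrated with a minimal in-memory sketch. Everything here is an assumption for illustration: `LineageMember`, `Entity`, `GraphSketch`, `drop_document_lineage` and `sync` are hypothetical names, the store is a plain dict rather than Nebula, and the per-entity serialization invariant is not modelled. `description_parts` would be handled the same way as `source_lineage` and is omitted for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class LineageMember:
    """One document's contribution to an entity (hypothetical shape)."""
    document_id: str
    parse_version: str
    chunk_ids: tuple[str, ...]


@dataclass
class Entity:
    name: str
    source_lineage: set[LineageMember] = field(default_factory=set)


class GraphSketch:
    """In-memory stand-in for the graph backend; not the real Nebula code."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}

    def drop_document_lineage(self, document_id: str) -> None:
        # Filter by document_id only, so every older parse_version of the
        # same document is superseded in one step.
        for name in list(self.entities):
            entity = self.entities[name]
            entity.source_lineage = {
                m for m in entity.source_lineage if m.document_id != document_id
            }
            if not entity.source_lineage:
                # Entity GC: no document contributes to it any more.
                del self.entities[name]

    def sync(self, document_id: str, parse_version: str,
             extracted: dict[str, list[str]]) -> None:
        # Lineage-level DELETE + INSERT; re-running with the same input
        # converges to the same state.
        self.drop_document_lineage(document_id)
        for name, chunk_ids in extracted.items():
            entity = self.entities.setdefault(name, Entity(name))
            entity.source_lineage.add(
                LineageMember(document_id, parse_version, tuple(chunk_ids))
            )
```

In this sketch, re-syncing doc_A at a new parse version replaces only doc_A's lineage member on a shared entity; another document's member on the same entity is left in place, and the entity disappears only once its lineage set is empty.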
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(indexing): §F.1 + §F.5 amendments: collection_id/source_path/tenant_scope_key columns + cleanup Path C

Three Wave 3 amendments folded in to unblock chenyexuan T3.1 (msg=afe345a9):

§F.1 schema:

- Add collection_id VARCHAR(64) NOT NULL (denormalized from document.collection_id, populated by orchestrator at INSERT for self-contained dispatch payload, per huangheng Wave 2 CR finding msg=c94b57fe + architect ruling msg=498b12f0).
- Add source_path TEXT NOT NULL (pointer to source/ artifact, worker derive reads directly without JOIN).
- Add tenant_scope_key VARCHAR(64) NOT NULL (forward-compat for future organization concept per §H.2; required key for §H.5 quota bucket). Was implicit-but-not-listed in the schema before; now explicit.
- Add idx_document_index_collection + idx_document_index_tenant_scope indexes for cleanup / quota scans.

§F.5 cleanup worker:

- Restructure to three paths (A/B/C) with explicit semantics.
- Path A: orphan parse_version GC (existing); now notes graph backend no-op via §D.3 lineage supersede + graph_noop counter for telemetry.
- Path B: single-document deletion cascade, with explicit graph dispatch via remove_entity_lineage_member(document_id) per §D.3 amended canonical (by document_id only).
- Path C (NEW): collection deletion cascade: Collection.deleted_at scan + Path B per child document + final Collection row + storage tree cleanup. Replaces legacy Celery collection_delete task with state-driven recovery (no asyncio.create_task() durability gap).

These amendments unblock T3.1 commit 1+ since chenyexuan needs the spec head to reference for the audit allowlist removal and the caller-migration patterns. Wave 3 task #14 acceptance criteria (per PM msg=5939e394) now references this head.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: 符炫炜 <fuxuanwei@apecloud.io>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
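The §F.1 schema described in the commit message above, including the partial unique index that enforces "at most one serving row per (doc, modality)", can be tried out with Python's built-in `sqlite3`. This is a sketch under assumptions: the `id` and `parse_version` columns, the helpers `make_db`, `add_row` and `flip_serving`, and the sample values are hypothetical and not taken from the design pack. The predicate is written as `is_serving = 1` so the sketch does not depend on SQLite's `TRUE` keyword.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table loosely modelled on the §F.1 description."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE document_index (
            id               INTEGER PRIMARY KEY,   -- assumed surrogate key
            document_id      TEXT NOT NULL,
            modality         TEXT NOT NULL,
            parse_version    TEXT NOT NULL,         -- assumed column
            is_serving       INTEGER NOT NULL DEFAULT 0,
            collection_id    VARCHAR(64) NOT NULL,
            source_path      TEXT NOT NULL,
            tenant_scope_key VARCHAR(64) NOT NULL
        );
        -- Partial unique index: uniqueness applies only to serving rows.
        CREATE UNIQUE INDEX uniq_document_index_serving
            ON document_index (document_id, modality)
            WHERE is_serving = 1;
        """
    )
    return conn


def add_row(conn, document_id, modality, parse_version, is_serving):
    conn.execute(
        "INSERT INTO document_index (document_id, modality, parse_version,"
        " is_serving, collection_id, source_path, tenant_scope_key)"
        " VALUES (?, ?, ?, ?, ?, ?, ?)",
        (document_id, modality, parse_version, int(is_serving),
         "col_1", f"collections/col_1/documents/{document_id}/source/", "tenant_1"),
    )


def flip_serving(conn, document_id, modality, new_version):
    """Per-modality cutover: retire the old serving row, then promote the new one."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE document_index SET is_serving = 0"
            " WHERE document_id = ? AND modality = ? AND is_serving = 1",
            (document_id, modality),
        )
        conn.execute(
            "UPDATE document_index SET is_serving = 1"
            " WHERE document_id = ? AND modality = ? AND parse_version = ?",
            (document_id, modality, new_version),
        )
```

Promoting a second row to serving without first retiring the current one raises `sqlite3.IntegrityError`, so the invariant holds even if the orchestrator has a bug, which is the point the commit message makes about DB enforcement.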
Summary
- … `AGENTS.md` pointer for future agents.
- … `aperag.observability` as the new OTLP-first observability foundation.
- … `APERAG_OBSERVABILITY_MODE=local` with optional OTLP endpoint configuration.
- … `aperag/trace` package.
- … `README-zh.md`.
- … `origin/main` and resolve retrieval pipeline conflicts against the new LLM runtime rerank flow.

Testing
- `python3 -m compileall aperag/observability aperag/app.py config/celery.py`
- `PATH="$HOME/.local/bin:$PATH" make lint`
- `PATH="$HOME/.local/bin:$PATH" uv run python - <<'PY'\nimport aperag.app\nimport config.celery\nfrom aperag.observability import build_observability_config\nprint(aperag.app.app.title)\nprint(config.celery.app.main)\nprint(build_observability_config().mode)\nPY`
- `PATH="$HOME/.local/bin:$PATH" uv run pytest tests/unit_test/tasks/test_document_graph_curation_contract.py tests/unit_test/test_es_p0_contract.py tests/unit_test/vectorstore/test_qdrant_filter_translation.py -q`
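The smoke check above prints `build_observability_config().mode`, and the summary mentions `APERAG_OBSERVABILITY_MODE=local` with an optional OTLP endpoint. A minimal sketch of how such a mode could be resolved from the environment follows. It is not the code in `aperag.observability`: `ObservabilityConfigSketch` and `build_config_sketch` are hypothetical names, treating `local` as the default is an assumption, and reading the endpoint from `OTEL_EXPORTER_OTLP_ENDPOINT` (the standard OpenTelemetry variable) is also an assumption about this project.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ObservabilityConfigSketch:
    """Hypothetical config shape; the real one may differ."""
    mode: str
    otlp_endpoint: Optional[str]


def build_config_sketch(
    env: Mapping[str, str] = os.environ,
) -> ObservabilityConfigSketch:
    # Assumption: "local" is used when the variable is unset or blank.
    mode = env.get("APERAG_OBSERVABILITY_MODE", "").strip().lower() or "local"
    # Assumption: the endpoint is optional and read from the standard
    # OpenTelemetry environment variable.
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip() or None
    return ObservabilityConfigSketch(mode=mode, otlp_endpoint=endpoint)
```

Passing the environment as a mapping keeps the function easy to exercise in a unit test without mutating `os.environ`.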