
feat(task-30-b3): lock graph_extraction_window_size default = 2 (sweet spot) #1925

Merged
earayu merged 3 commits into main from architect/task-30-b3-default-window-size-2 on Apr 30, 2026
Conversation

Collaborator

@earayu earayu commented Apr 30, 2026

Summary

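Task #30 B3 default-value lock: the chief architect picks the sweet spot, per @earayu2 directive (msg=adb0c366):

A slight drop in quality is acceptable. The chief architect should settle on a sweet spot; the default should be at least 2, chosen on cost-effectiveness.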

graph_extraction_window_size default 1 → 2

Sweet spot rationale (B2 full-matrix evidence)

| model | window | calls | wall_s | cost | json_ok | source_valid | entity_hit | relation_hit |
|---|---|---|---|---|---|---|---|---|
| Qwen3 30B | 1 | 12 | 147.2 | $0.0042 | 1.000 | 1.000 | 0.930 | 0.686 |
| Qwen3 30B | 2 | 6 | 82.9 | $0.0031 | 1.000 | 0.992 | 0.860 | 0.714 |
| Qwen3 30B | 3 | 6 | 102.6 | $0.0033 | 1.000 | 1.000 | 0.912 | 0.657 |
| Qwen3 30B | 5 | 3 | 75.2 | $0.0025 | 1.000 | 1.000 | 0.754 | 0.543 |
| Gemini 2.5 Flash | 1 | 12 | 57.3 | $0.0312 | 1.000 | 1.000 | 0.965 | 0.686 |
| Gemini 2.5 Flash | 2 | 6 | 49.6 | $0.0260 | 1.000 | 1.000 | 0.930 | 0.714 |
| Gemini 2.5 Flash | 3 | 6 | 49.7 | $0.0226 | 0.833 ⚠️ | 1.000 | 0.667 | 0.514 |
| Gemini 2.5 Flash | 5 | 3 | 37.2 | $0.0255 | 1.000 | 1.000 | 0.947 | 0.714 |

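Four pieces of evidence:

  1. Stable across models (json_ok=1.0, source_valid≥0.992). window=3 cannot be the default because Gemini's json_ok drifts to 0.833.
  2. The quality drop is acceptable: Qwen net -0.04 / Gemini net -0.01 (entity_hit change plus relation_hit change), within earayu2's "a slight drop is acceptable".
  3. Clear cost-effectiveness gain: calls -50% / Qwen cost -26%, wall -44% / Gemini cost -17%, wall -13%.
  4. Low risk: behavior is consistent across models, whereas window=3 and window=5 are both model-specific and unsuitable as defaults.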
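The percentages in item 3 can be re-derived from the window=1 and window=2 rows of the matrix. The sketch below only restates that arithmetic; the helper names are illustrative and not part of the PR.

```python
# (calls, wall_s, cost_usd) copied from the window=1 and window=2 rows of the B2 matrix.
MATRIX = {
    ("Qwen3 30B", 1): (12, 147.2, 0.0042),
    ("Qwen3 30B", 2): (6, 82.9, 0.0031),
    ("Gemini 2.5 Flash", 1): (12, 57.3, 0.0312),
    ("Gemini 2.5 Flash", 2): (6, 49.6, 0.0260),
}


def pct_change(old: float, new: float) -> int:
    """Relative change from old to new, rounded to a whole percent."""
    return round((new - old) / old * 100)


def window2_vs_window1(model: str) -> dict:
    """Compare the window=2 row against the window=1 baseline for one model."""
    base, cand = MATRIX[(model, 1)], MATRIX[(model, 2)]
    names = ("calls", "wall", "cost")
    return {name: pct_change(b, c) for name, b, c in zip(names, base, cand)}


for model in ("Qwen3 30B", "Gemini 2.5 Flash"):
    print(model, window2_vs_window1(model))
```

Both models halve the number of calls; the wall-time and cost savings are larger on Qwen than on Gemini.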

Changes (3 files)

  1. aperag/indexing/graph_extractor.py:81 _DEFAULT_GRAPH_EXTRACTION_WINDOW_SIZE = 1 → 2 + rationale folded into the docstring
  2. aperag/schema/common.py:167 KnowledgeGraphConfig.graph_extraction_window_size description update + override recommendations
  3. docs/zh-CN/architecture/task-30-graph-chunk-window-spec-v1.md § 4.2 rewritten as the lock section

Sample limitation disclaimer (per Weston msg=4b7f2357 + Planetegg msg=181518f2)

Three benchmark samples are insufficient for a per-model automatic default. Any future change requires ≥10 samples + ≥3 models with no regression on any of them + three-party confirmation from PM, architect, and earayu2.

CR plan

🤖 Generated with Claude Code

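The chief architect picks the sweet spot per earayu2 directive msg=adb0c366: "A slight
drop in quality is acceptable. The chief architect should settle on a sweet spot; the
default should be at least 2, chosen on cost-effectiveness."

B2 evidence-grounded sweet spot analysis (Planetegg msg=096e0089 full
matrix + Weston msg=9ae48560 + Planetegg msg=a33607aa + architect
msg=08ebb696 + msg=f1feb2f1, three-party convergence):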

- window=2 is stable across models (json_ok=1.0, source_valid≥0.992)
- Qwen entity -0.07 + relation +0.028, calls -50%, cost -26%, wall -44%
- Gemini entity -0.035 + relation +0.028, cost -17%, wall -13%
- window=3 is dominated (Gemini json drift 1/6 + Qwen wall +20% + relation drop)
- window=5 is model-specific (good on Gemini, but Qwen entity_hit falls to 0.754)
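To make the `calls` column concrete, here is a minimal sketch of how a window size could translate into extraction calls. It assumes non-overlapping windows that never span documents, and a corpus of three documents with four chunks each. Neither assumption is stated in the PR; the chunk counts are a guess that happens to reproduce the calls reported for window = 1, 2, 3, 5. The function names are hypothetical.

```python
from math import ceil


def window_chunks(chunks: list, window_size: int) -> list:
    """Group consecutive chunks into non-overlapping windows (assumed scheme)."""
    if window_size < 1:
        raise ValueError("window_size must be >= 1")
    return [chunks[i:i + window_size] for i in range(0, len(chunks), window_size)]


def total_calls(chunks_per_doc: list, window_size: int) -> int:
    """One extraction call per window; windows do not cross document boundaries."""
    return sum(ceil(n / window_size) for n in chunks_per_doc)


# Hypothetical corpus: three documents of four chunks each.
DOCS = [4, 4, 4]
for w in (1, 2, 3, 5):
    print(f"window={w}: {total_calls(DOCS, w)} calls")
```

Under these assumptions window=3 saves no calls over window=2, which is consistent with the matrix listing 6 calls for both.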

Changes:
- aperag/indexing/graph_extractor.py: _DEFAULT_GRAPH_EXTRACTION_WINDOW_SIZE
  1 → 2 + docstring fold sweet spot rationale + sample limitation
- aperag/schema/common.py: KnowledgeGraphConfig.graph_extraction_window_size
  description default 1 → 2 + override recommendations (legacy=1 / Gemini=5)
- docs/zh-CN/architecture/task-30-graph-chunk-window-spec-v1.md § 4.2
  rewritten as the lock section + B2 full-matrix data + sweet spot rationale +
  collection-level override recommendations + sample limitation disclaimer
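As a rough illustration of how the locked default and a collection-level override might interact, consider the sketch below. Only the constant name and the field name come from the PR. The dataclass is a hypothetical stand-in for the Pydantic `KnowledgeGraphConfig`, and `resolve_window_size` is an invented helper, so the real resolution logic may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Value locked by this PR; the constant name matches graph_extractor.py.
_DEFAULT_GRAPH_EXTRACTION_WINDOW_SIZE = 2


@dataclass
class KnowledgeGraphConfigSketch:
    """Hypothetical stand-in for KnowledgeGraphConfig (the real model is Pydantic)."""

    graph_extraction_window_size: Optional[int] = None


def resolve_window_size(config: Optional[KnowledgeGraphConfigSketch]) -> int:
    """Prefer the collection-level override, otherwise fall back to the default."""
    override = config.graph_extraction_window_size if config else None
    if override is None:
        return _DEFAULT_GRAPH_EXTRACTION_WINDOW_SIZE
    if override < 1:
        raise ValueError("graph_extraction_window_size must be >= 1")
    return override


print(resolve_window_size(None))                            # locked default
print(resolve_window_size(KnowledgeGraphConfigSketch(1)))   # legacy behavior
print(resolve_window_size(KnowledgeGraphConfigSketch(5)))   # Gemini recommendation
```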

Sample limitation disclaimer: 3 benchmark documents are insufficient for a per-model
automatic default; any future change requires ≥10 samples + ≥3 models with no
regression on any of them + three-party confirmation from PM, architect, and earayu2.
earayu added a commit that referenced this pull request Apr 30, 2026
…audit (#1928)

* docs(task-61): DB adapter compat spec v1 — vector + graph cross-impl audit

Architect spec v1 drafted per earayu2 directive (msg=8b989470 / msg=2bad8e75
/ msg=f26b703e) + PM 不穷 task #72 dispatch.

Streaming evidence integration from 8 lanes:
- huangheng msg=ed2f2973: 3 vector P0 candidates (cross-tenant /
  filter silent / collection init)
- Bryce msg=8e895471 task #69: 11 vector findings (4 P0 + 3 P1 + 4 P2,
  including upgraded score normalization P0-V3/V4)
- 冬柏 msg=3e93bb64 task #67: 3 missing Protocol method tests
  (bulk_upsert_entity_with_lineage_parts P0 + remove_relation_lineage
  P1 + list_entities P1)
- chenyexuan msg=f298011e + PR #1926: workflow paths filter dead
  reference P0-W1 (in flight)
- cuiwenbo msg=dfebf706 task #70: FE/UX 3 candidates (score, viz error
  vs empty, confidence_score)
- Planetegg msg=db7fb085 + msg=41906f4 + msg=41665d7e task #65: alias
  resolution gather P2-S1 + Singapore QDRANT_MULTITENANT=True (no
  hot-fix needed) + env shape verify
- ziang task #64 graph store audit (in_progress, will fold-in)
- dongdong task #71 deploy/typed schema (in_progress, will fold-in)

Spec structure:
- §1 inventory by lane with file:line evidence
- §2 gaps by severity (P0 CRITICAL hot-fix candidate / P0 must-fix / P1
  allowed difference, declared / P2 performance optimization / YAGNI)
- §3 three-layer design direction per Weston msg=85e527e3 framework
- §4 sub-task dispatch (Phase A 8 lane parallel + Phase B per-P0
  three-PR-pattern + Phase C P2 + Phase D PR #1926 unblock)
- §5 acceptance: P0/P1 standards + boundary test gate + e2e + sample
  limitation disclaimer
- §6 CR mandatory checklist citing Lesson #11-#16 family from
  PR #1916/#1924/#1922 sediment + new Lesson #16 candidate (workflow
  paths dead reference)

Sample limitation: spec evidence comes from the streaming surface, not from
the gap list huangzhangshu is collecting; fix-forward amend after the
huangzhangshu lane completes and the Bryce/ziang audit slices are delivered.

Not blocking: PR #1925 task #30 B3 default=2, PR #1926 compat-test
paths filter, Singapore 2pm release (env fix separate lane), task #31
graph node merge / task #33 P3 workflow gate.

* docs(task-61): fix-forward Weston BLOCKER + 5 streaming integration

Weston msg=13dd5e91 BLOCKER (score normalization severity drift):
keep P0-V3+V4 at P0 across §1.1 / §2.2 / §5.3. Score direction is a hard
semantic contract for callers and must not appear reversed between PGVector
and Qdrant. §2.2 adds an explicit P0-V3+V4 row; §5.3 adds a
test_score_normalization_in_vector.py boundary test (parametrized over all
6 metric × adapter cells).
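The contract at stake, that a closer vector always scores higher whichever adapter and metric produced the raw value, can be sketched as follows. The raw conventions encoded here are assumptions for illustration: pgvector returning distances and a negated inner product, and Qdrant returning similarities for cosine and dot but a distance for Euclid. They should be verified against each backend's documentation. All function names are hypothetical and unrelated to the repository's adapters.

```python
import math


def raw_score(adapter: str, metric: str, q: list, v: list) -> float:
    """Raw value under the ASSUMED backend conventions (not verified here)."""
    dot = sum(a * b for a, b in zip(q, v))
    l2 = math.dist(q, v)
    cos = dot / (math.hypot(*q) * math.hypot(*v))
    if adapter == "pgvector":  # assumed: every operator is distance-like
        return {"cosine": 1.0 - cos, "euclid": l2, "dot": -dot}[metric]
    if adapter == "qdrant":  # assumed: similarity except for euclid
        return {"cosine": cos, "euclid": l2, "dot": dot}[metric]
    raise ValueError(f"unknown adapter: {adapter}")


# True where the raw value is "lower is better" and must be flipped.
LOWER_IS_BETTER = {
    ("pgvector", "cosine"): True,
    ("pgvector", "euclid"): True,
    ("pgvector", "dot"): True,
    ("qdrant", "cosine"): False,
    ("qdrant", "euclid"): True,
    ("qdrant", "dot"): False,
}


def normalize(adapter: str, metric: str, raw: float) -> float:
    """Map a raw value to a 'higher is better' score (direction only, not scale)."""
    return -raw if LOWER_IS_BETTER[(adapter, metric)] else raw


def ranks_near_above_far(adapter: str, metric: str) -> bool:
    """Check one adapter/metric cell: the nearer vector must score higher."""
    query, near, far = [1.0, 0.0], [0.9, 0.1], [0.1, 0.9]
    near_score = normalize(adapter, metric, raw_score(adapter, metric, query, near))
    far_score = normalize(adapter, metric, raw_score(adapter, metric, query, far))
    return near_score > far_score
```

A boundary test in this spirit would loop over all six cells and assert `ranks_near_above_far` for each, so that a missed flip in any single cell fails loudly.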

Streaming integrations (5 lane):

1. Bryce msg=23a2f514 first-principles re-classification of P0-V1: Qdrant
   legacy mode tenant isolation happens at the collection-name level, not the
   query filter level (verified at qdrant_connector.py:442-446), so it is
   downgraded to P1-V4 defense-in-depth (candidate follow-up: legacy mode
   deprecation).

2. Bryce msg=8e895471 11 vector findings: 4 P0 (cross-tenant,
   downgraded / filter silent / score V3+V4) + 3 P1 (collection init /
   batch atomicity / filter Or semantics) + 4 P2.

3. dongdong msg=4201465a + PR #1929 + cuiwenbo msg=bcec38ad —
   P0-D1 Helm worker Neo4j env missing (Singapore graph viz
   root-cause); P1-D1 e2e shape matrix gap; P1-D2 Nebula no Helm
   first-class; P1-D3 typed schema lacks vector backend exposure.

4. chenyexuan NIT — Lesson #16 candidate cite added §6.

5. Planetegg msg=eb9de4b0 NIT: P2-S1 quantified as max_nodes*2 default
   1000→2000 / hybrid default 1000 max 5000; msg ID corrections
   §7 (msg=41665d7e Singapore multitenant verify, msg=eb9de4b0
   P2-S1 quantification, dropped invalid msg=ec358a3e).

冬柏 PR #1927 commit b2234ae fold-in §5.3 (38 cases incl
zero-side-effect + replay idempotency post-NIT).

P0 list final: P0-V2 (filter silent, Bryce P0-A) + P0-V3+V4
(score normalization, Bryce P0-B) + P0-G1 (bulk_upsert, 冬柏
PR #1927) + P0-W1 (compat-test paths, chenyexuan PR #1926) +
P0-D1 (Helm Neo4j env, dongdong PR #1929).

* docs(task-61): § 3.1.1 historical residue cleanup per Weston msg=fdf04a69 NIT: strike old P0 hot-fix path (P0-V1 already downgraded to P1-V4 per Bryce first-principles verify)

* docs(task-61): final consistency cleanup per Weston msg=e414d3cf — line 14 count 4+3+4 to 3 P0 + 4 P1 + 4 P2; § 5.1 P0-V1 line removed; § 5.2 P1-V4 defense-in-depth boundary test added
earayu added 2 commits April 30, 2026 13:26
…efer indexing-retrieval-kg.md amend to follow-up; this PR scope = code default + Pydantic Field description + spec § 4.2 lock only
…msg=bf785b12 NIT 1 — schema.d.ts default=2 align + § 3.1.1 line 85 default=1 → default=2 lock per § 4.2 sweet spot
@earayu earayu merged commit 43648f9 into main Apr 30, 2026
10 checks passed
@earayu earayu deleted the architect/task-30-b3-default-window-size-2 branch April 30, 2026 05:37
earayu added a commit that referenced this pull request Apr 30, 2026
…#1932)

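§ 四 (Section 4) adds 8 lesson sediments (cumulative evidence from task #30 B3 and the full task #61 P0 closure) + § 六 (Section 6) sediment references gain 6 PR commit cross-links + § 八 (Section 8) revision history gains this PR's fold trail.

New lessons: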
- Lesson #12 v7.4: external API raw contract verify (task #61 P0-B PR #1930
  Qdrant euclid raw direction first-application + fix-forward 1e30a00)
- Lesson #12 v8 second-application: test docstring fake guardrail (task #61
  P0-G1 PR #1927 missing description_parts assertion, fix-forward 1953933)
- Lesson #12 v9: first-principles verify catch surface signal mistakes
  (task #61 P0-V1 re-classification by Bryce + task #61 P0-B Qdrant euclid
  catch by Weston: two independent sources hitting the same root cause,
  first/second-application)
- Lesson #13 v2.3: deploy manifest dual-side rewrite (task #61 P0-D1 PR #1929
  Helm Neo4j worker env first-application)
- Lesson #13 v3 application demo 2: cross-source default value alignment
  (task #30 B3 PR #1925 commit dae43f5, three sources synchronized,
  first-application)
- Lesson #14 application demo: multi-iteration cleanup of default drift inside
  the spec (task #30 B3 PR #1925 fix-forward dae43f5 § 3.1.1 line 85 cleanup
  second-application demo; first-application was the 6 fix-forward rounds of
  task #35)
- Lesson #16: CI workflow paths filter dead reference anti-pattern (task #61
  P0-W1 PR #1926 first-application demo + Lesson #15 file-move 3-step verify
  upgraded to v2 4-step, adding a grep of .github/workflows/*.yml paths to
  keep them in sync)
- Lesson #17: converging the contract in the backend beats forking in upper
  layers (simple-stable + private-deploy paramount directive earayu2
  msg=1224bec8, applied when designing cross-adapter contracts; task #69 P0-B
  + task #70 P1 candidate 1 converged in one pass across PRs,
  first-application)
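The dead-reference check in Lesson #16 could look roughly like the sketch below: given the `paths:` entries of a workflow and the list of tracked files, report every entry that matches nothing. The inputs are hypothetical, and `fnmatch` only approximates GitHub's path-filter globbing (negated `!` patterns are not handled), so treat it as an illustration rather than the check the team uses.

```python
from fnmatch import fnmatch


def dead_path_filters(patterns: list, repo_files: list) -> list:
    """Return path filters that match no tracked file (likely stale after a move)."""
    return [p for p in patterns if not any(fnmatch(f, p) for f in repo_files)]


# Hypothetical workflow `paths:` entries and tracked files.
patterns = ["tests/unit_test/**", "tests/compat/**"]
repo_files = [
    "tests/unit_test/contracts/test_defaults.py",
    "aperag/schema/common.py",
]
print(dead_path_filters(patterns, repo_files))
```

Here `tests/compat/**` is reported because no tracked file lives under that directory, which is the situation a file move without a workflow update would create.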

Cross-PR catch trail where multiple independent sources converged on the same issue:
- Lesson #12 v9: Bryce msg=23a2f514 + Weston msg=86e05a8e (two independent sources)
- Lesson #16: chenyexuan msg=f298011e + 冬柏 msg=3e93bb64 (two independent sources)
- Lesson #17: cuiwenbo msg=cedc7703 + Bryce msg=9895a148 (two independent sources)
- Lesson #13 v3 application demo 2: huangheng msg=bf785b12 + Planetegg
  msg=c63acbf5 + Weston msg=1e6b0838 (three independent sources)

per architect msg=c4cdf634 + msg=daaeeab5 + msg=03c892e0 sediment dispatch.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 30, 2026
…n_window_size (#1933)

Codify Lesson #13 v3 (cross-source default value alignment) as a CI
unit test gate so future task-#30 B3-class drift is caught by
``cicd-push.yml`` lint+unit instead of by reviewers via fix-forward
rounds.

Background — task #30 B3 (PR #1925, merge ``43648f9``) locked
``graph_extraction_window_size`` default to ``2`` across **four**
sources that all need to agree:

1. ``aperag/indexing/graph_extractor.py``
   ``_DEFAULT_GRAPH_EXTRACTION_WINDOW_SIZE`` (Python const, runtime
   fallback)
2. ``aperag/schema/common.py``
   ``KnowledgeGraphConfig.graph_extraction_window_size`` Pydantic
   ``Field(examples=[N])`` (OpenAPI / TS schema source)
3. ``web/src/api-v2/schema.d.ts`` JSDoc ``@example N`` (frontend client
   surface — committed to repo, can drift if regen skipped)
4. ``docs/zh-CN/architecture/task-30-graph-chunk-window-spec-v1.md``
   § 3.1.1 line 85 ``**B3 lock default `N`**`` + § 4.2
   ``**`graph_extraction_window_size = N`**`` (architectural source of
   truth that PRs CR against)

PR #1925 itself surfaced the drift class:
- Weston ``msg=1b7d9bef`` BLOCKER 1 caught ``schema.d.ts`` still
  carrying default ``1``
- huangheng ``msg=bf785b12`` NIT 1 caught § 3.1.1 line 85 still saying
  default ``1``
Both required a fix-forward commit (``dae43f5``).

Why a unit test (not a boundary test): ``tests/boundaries/`` is not
currently invoked by ``make test-unit`` / ``test-integration`` /
``cicd-push.yml`` (task #33 Layer 1 audit finding).
``tests/unit_test/`` runs on every push via ``make test-unit``. Per
simple-stable directive (earayu2 ``msg=1224bec8``), the cheapest
reliable gate is a unit test in the existing CI lane, not a new
workflow file.

Scope discipline: pins **default value parity** across four sources
only. Does not pin description text, override-recommendation phrasing,
or rationale wording. If a future change moves the default away from
2, the test fails with a list of all observed values per source plus
the procedural reminder (``≥10 samples + ≥3 models 同时不退步 + PM +
architect + earayu2 三方 confirm``).

Tests:

- ``test_graph_extraction_window_size_default_consistent_across_sources``
  — the main gate (asserts all 4 sources agree)
- ``test_graph_extraction_window_size_default_is_positive_integer`` —
  sanity (window assembler math requires ``>= 1``)
- ``test_individual_source_extractor_does_not_raise[*]`` — separates
  "extractor broken" failures from "values drifted" failures so
  operator immediately knows whether to fix test infra or schema
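The gate described above can be sketched as follows. The regexes and inline sample strings are illustrative assumptions, not the repository's actual test code. Only the constant name, the JSDoc `@example N` tag, and the `graph_extraction_window_size = N` spec marker come from this commit message.

```python
import re

# Inline samples stand in for the real files (hypothetical contents).
SOURCES = {
    "python_const": "_DEFAULT_GRAPH_EXTRACTION_WINDOW_SIZE = 2\n",
    "ts_schema": "/** @example 2 */\ngraph_extraction_window_size?: number;\n",
    "spec_4_2": "**`graph_extraction_window_size = 2`**\n",
}

# One extraction pattern per source; each captures the default as group 1.
PATTERNS = {
    "python_const": r"^_DEFAULT_GRAPH_EXTRACTION_WINDOW_SIZE\s*=\s*(\d+)",
    "ts_schema": r"@example\s+(\d+)",
    "spec_4_2": r"`graph_extraction_window_size = (\d+)`",
}


def extract_default(source_name: str, text: str) -> int:
    """Pull the default out of one source; raise if the marker is missing."""
    match = re.search(PATTERNS[source_name], text, re.MULTILINE)
    if match is None:
        raise ValueError(f"no default found in {source_name}")
    return int(match.group(1))


def find_drift(texts: dict) -> dict:
    """Return every observed value when sources disagree, else an empty dict."""
    observed = {name: extract_default(name, text) for name, text in texts.items()}
    return observed if len(set(observed.values())) > 1 else {}


print(find_drift(SOURCES))
```

Keeping extraction failures (`ValueError`) separate from disagreement (a non-empty dict) mirrors the split between "extractor broken" and "values drifted" that the third test exists to preserve.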

Local validation:

- 5/5 pass in clean state
- Synthetic drift on each of (Python const / TS schema / spec § 3.1.1 /
  spec § 4.2) caught with clear actionable error message naming the
  drifting source
- Full ``tests/unit_test/contracts/`` 58/58 pass
- ruff format + ruff check clean

Sediment cross-link: this gate is the codified counterpart to
huangheng PR #1932 § 四 Lesson #13 v3 application demo 2 + Lesson #14
application demo (PR #1925 § 3.1.1 multi-iteration cleanup) — that PR
records the drift class as a CR-checklist lesson; this PR enforces it
mechanically so the lesson does not have to be remembered.

task #33 Layer 2 P3 (chenyexuan claim, in_progress) per PM dispatch
``msg=65465f9e``.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>