docs: dogfood report for v3.10.1-dev.80#1180

Open
carlos-alm wants to merge 6 commits into main from docs/dogfood-report-v3.10.1

@carlos-alm carlos-alm commented May 20, 2026

Summary

Adds the dogfooding report for @optave/codegraph@3.10.1-dev.80 (run on 2026-05-20).

Test plan

claude Bot commented May 20, 2026

Claude finished @carlos-alm's task.


PR Review: Dogfood Report v3.10.1-dev.80

Review Progress

  • Read and analyze the dogfood report file
  • Evaluate against 5-phase review criteria
  • Assess BACKLOG.md compliance
  • Provide final rating and recommendation

Root Cause Analysis

Problem Being Solved: This PR adds comprehensive dogfooding documentation for codegraph v3.10.1-dev.80, providing validation that 11 newly-ported native extractors achieve byte-identical parity with WASM engines and documenting 4 critical bugs discovered during testing.

Why This Approach: Documentation-only PR that follows the established pattern in generated/dogfood/ and correctly separates docs from bug fixes (which are handled in separate PRs per one-PR-one-concern principle).

Risk Assessment: Zero risk - pure documentation with no code changes.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies, docs-only change
  • Foundation-aligned: ✓ — Aligns with P1 (always-current) by validating graph accuracy and P8 (development-focused) by improving tooling quality through dogfooding
  • Problem-fit: 4 — High value for maintaining tool reliability; prevents AI agents from working with incorrect dependency graphs
  • Breaking: No — Documentation-only, no API or behavior changes
  • Tier: 1

Critical Concerns

None identified. This is exemplary documentation:

Strengths

  1. Comprehensive validation coverage: Tests all 464 commands/scenarios across 12 categories
  2. Rigorous engine comparison: Documents native vs WASM parity with specific metrics
  3. Actionable bug discovery: Found and filed 4 real issues, #1174 ("bug: incremental rebuild silently drops 32 import edges (native) / 37 (WASM)") through #1177 ("bug: build command rejects --db flag, breaking workflow with non-default DB locations"), with root cause analysis
  4. Performance benchmarks: Documents 5.4× build speed improvement and other metrics
  5. Quality assessment: Honest 7.5/10 rating with specific strengths/weaknesses

Quality Indicators

  • Follows established patterns: Matches structure of previous dogfood reports in /generated/dogfood/
  • Separation of concerns: Correctly cherry-picked docs from fix branch per CLAUDE.md guidance
  • Testing depth: Engine parity validation for all 11 newly-ported extractors shows 10/11 byte-identical
  • Risk transparency: Documents incremental rebuild edge-loss bug as high priority

Content Validation

The report demonstrates the exact kind of quality validation codegraph needs:

  • Native extractors achieve near-perfect WASM parity (critical for user trust)
  • Performance improvements are real and substantial (5.4× build speed, 52× complexity)
  • Edge cases and error conditions are properly tested
  • Infrastructure (MCP, programmatic API) is validated end-to-end

Value to Project

This documentation:

  1. Proves release quality: Validates that 11 new native extractors work correctly
  2. Prevents regressions: Establishes baseline for future releases
  3. Guides users: Shows comprehensive tool usage patterns
  4. Tracks progress: Documents performance improvements and feature completeness

Final Recommendation

  • Rating: ⭐⭐⭐⭐⭐ (5/5)
  • Action: APPROVE
  • Reasoning: Exceptional quality documentation that validates critical functionality, discovers real bugs, and provides comprehensive release validation. This is exactly the kind of thorough testing and documentation that maintains tool reliability and user trust. Zero risk with high value for project quality assurance.

greptile-apps Bot commented May 20, 2026

Greptile Summary

This PR applies three targeted corrections to the v3.10.1-dev.80 dogfood report, resolving both issues raised in the previous review cycle.

Confidence Score: 5/5

Documentation-only change with no code modifications; safe to merge.

The three edits are precise and internally consistent: the extractor count is now 11 everywhere it appears in the document, and the incomplete benchmark row is clearly marked with a follow-up link and em-dash placeholders. No code paths are touched.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| generated/dogfood/DOGFOOD_REPORT_v3.10.1-dev.80.md | Addresses both prior review findings: corrects the extractor count from 12 to 11 in §6 and §12, and replaces the incomplete jina-base benchmark placeholder with an explicit "not completed" marker linked to follow-up issue #1181, with em-dashes filling the missing columns. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Prior review: two issues flagged] --> B[Issue 1: extractor count 12 vs 11 mismatch]
    A --> C[Issue 2: jina-base benchmark row incomplete]
    B --> D["§6 + §12: '12' → '11'"]
    C --> E["jina-base row: placeholder → explicit 'not completed'\n+ follow-up issue #1181 link\n+ em-dash fills for missing columns"]
    D --> F[Count consistent throughout document]
    E --> F
```

Reviews (7): Last reviewed commit: "Merge branch 'main' into docs/dogfood-re..."


## 6. Release-Specific Tests (changes since v3.10.0)

The v3.10.1-dev.80 series adds 12 native extractor ports (#1097–#1107), several language-specific fixes (#1109, #1122, #1123, #1124, #1127, #1128, #1158), a MCP enhancement (#1149), benchmark-stability fixes (#1119, #1120, #1131, #1133, #1134), and CI hardening (#1146, #1151, #1164).
Contributor

P1 Extractor count mismatch between PR description and report body

The PR description states "12 newly-ported native extractors validated," but the report consistently counts 11 throughout — the §5 table lists 11 languages, the §5 conclusion reads "all 11 newly-ported extractors," and the §6 changelog entry enumerates 11 ports (#1097–#1107). One of these two numbers is wrong and will cause confusion when the report is referenced later. If the correct count is 11, the PR description should be updated to match; if it's 12, the missing language should be added to the §5 table and the §6 test row.

Contributor Author

Fixed in ca4146c. The body is the source of truth: 11 PRs in #1097-#1107, 11 rows in the §5 table, and the §5 conclusion already reads "all 11 newly-ported extractors." The §6 paragraph now says 11, and the PR description has been updated to 11 to match.
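
The count can also be checked mechanically. The following is a minimal sketch, not part of codegraph or of this PR: it computes the size of an inclusive PR range and scans report text for phrases such as "11 newly-ported extractors" to surface disagreeing counts. The function names, the regular expression, and the sample string are assumptions made for illustration.

```python
import re


def inclusive_range_size(first: int, last: int) -> int:
    """Number of integers in [first, last], both ends included."""
    return last - first + 1


# Hypothetical pattern: a number followed, within a few words, by "extractor(s)".
COUNT_PATTERN = re.compile(r"\b(\d+)\s+(?:[\w-]+\s+){0,3}?extractors?\b")


def extractor_counts(text: str) -> set[int]:
    """Collect every distinct count that the text attaches to 'extractor(s)'."""
    return {int(m.group(1)) for m in COUNT_PATTERN.finditer(text)}


# Illustrative sample text paraphrasing the two conflicting statements.
sample = (
    "adds 12 native extractor ports (#1097-#1107). "
    "Parity holds for all 11 newly-ported extractors."
)

print(inclusive_range_size(1097, 1107))  # 11
print(sorted(extractor_counts(sample)))  # [11, 12]
```

More than one distinct value in the result signals that the document disagrees with itself, which is the situation the P1 finding describes.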


- No regressions vs the v3.10.0 baseline in `generated/benchmarks/BUILD-BENCHMARKS.md`. The corpus shrank (745 → 612 files) due to PR #1134's fixture exclusion, but per-file metrics improved on every engine.
- Native fast-skip preflight (#1054) is firing as expected: 16 ms no-op rebuild matches WASM's, validating the `detectNoChanges` short-circuit.
- The 1-file rebuild gap (WASM 45ms vs Native 67ms) is the inverse of full-build performance — WASM's lighter orchestrator setup wins on tiny incremental work.
Contributor

P2 jina-base embedding benchmark published as incomplete

The embedding benchmark table has jina-base (768d) with the value _benchmark still running at report cut_. Publishing a report with a known-pending data point makes the §8 table misleading — reviewers cannot assess whether jina-base regressed, improved, or is even viable for the release. Either wait for the run to finish and fill in the numbers, or explicitly mark this row as "not completed, see follow-up" and omit the empty columns so it's clear no data was collected rather than data being redacted.

Contributor Author

Fixed in ca4146c. Replaced the "benchmark still running at report cut" placeholder with an explicit "not completed in this session" marker linked to follow-up issue #1181, and filled the empty Hit@k columns with em-dashes so it's clear no data was collected (rather than redacted). The follow-up issue tracks finishing the jina-base run and backfilling the numbers.
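
As an illustration of the placeholder convention, here is a small sketch of how a table row with missing measurements could be rendered. It is not the script that generated the report. The `render_row` helper, the number of columns, and the note wording are hypothetical; only the model name and the issue number come from the discussion above.

```python
from typing import Optional, Sequence

EM_DASH = "\u2014"  # placeholder meaning "no data collected"


def render_row(label: str, values: Sequence[Optional[float]], note: str = "") -> str:
    """Render one Markdown table row, substituting an em-dash for missing values."""
    cells = [EM_DASH if v is None else f"{v:.1f}" for v in values]
    if note:
        label = f"{label} ({note})"
    return "| " + " | ".join([label, *cells]) + " |"


# Hypothetical row: three Hit@k columns are assumed, none of them measured.
row = render_row("jina-base (768d)", [None, None, None], note="not completed, see #1181")
print(row)
```

Keeping the column count intact and marking each cell explicitly means the table still parses as Markdown, and a reader can tell an unmeasured value from a measured zero.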

Fix two Greptile P1/P2 findings in the v3.10.1-dev.80 dogfood report:

- §6 changelog said "12 native extractor ports (#1097-#1107)" but the PR
  range and the §5 parity table both enumerate 11. Correct to 11 so the
  body is internally consistent (the §5 table and conclusion already read
  11/11).
- §8 embedding-benchmark table left the jina-base (768d) row as
  "benchmark still running at report cut". Replace the placeholder with
  an explicit "not completed in this session" note and link to the
  follow-up issue #1181 so the missing data is clearly tracked rather
  than reading as redacted.

@carlos-alm
Contributor Author

@greptileai

@carlos-alm
Contributor Author

Fixed in 6d9e1bb. The §12 Overall Assessment opening sentence at line 435 now reads "11 new native extractors" — matching the §5 conclusion, §6 intro, §11 testing plan, and the §12 bullet list. The report is now fully self-consistent.

@carlos-alm
Contributor Author

@greptileai
