chore(bench): B2 follow-up #6 — S5/S7 cross-validation #128
Merged
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Matrix re-run at S5 (streaming updates) + S7 (filter-metadata) for all
four real adapters; populates H9, H13, H14, H15 (previously insufficient).
Run command:
pnpm bench:matrix --project=chromium \
--adapters=pretable,ag-grid,tanstack,mui \
--scenarios=S5,S7 --scripts=scroll,updates \
--scale=hypothesis --repeats=3 --update-rates=1000,25000
Wall-clock ~3.5 min. Milestone:
status/milestones/2026-05-09-b2-s5-s7-cross-validation.hypotheses.json
Status delta:
| H# | Before | After | Notes |
| --- | ------------- | ----------- | --------------------------------------------------------------- |
| H9 | insufficient | satisfied | Mirrors H1 parity story on S7 scroll (9.2ms p95, 0 blank gaps). |
| H13 | insufficient | directional | AG Grid clears the streaming frame budget too (9.2ms vs 9.2ms). |
| H14 | insufficient | directional | AG Grid sustains 25k/sec — no order-of-magnitude gap. |
| H15 | insufficient | directional | AG Grid drift 0 vs pretable drift 1; threshold not exceeded. |
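The frame-time figures in the table are p95 values. As a rough illustration of how such a figure can be derived and compared against a budget, here is a minimal sketch. The helper names, the nearest-rank method, and the 60 Hz budget are assumptions for illustration only; the bench harness may compute its percentile and threshold differently.

```typescript
// Hypothetical helper: nearest-rank percentile over frame durations (ms).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  // 1-based nearest rank: smallest rank covering p percent of the samples.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Assumed budget: one frame at 60 Hz. The real threshold may differ.
const FRAME_BUDGET_MS = 1000 / 60;

// True when the p95 frame duration fits inside the budget.
function clearsFrameBudget(
  frameMs: number[],
  budgetMs: number = FRAME_BUDGET_MS,
): boolean {
  return percentile(frameMs, 95) <= budgetMs;
}
```

Under these assumptions, a run whose p95 is 9.2 ms clears the budget with room to spare, which is why both adapters pass the same bar.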
The streaming-uniqueness wedge (H13/H14/H15) no longer holds numerically at
hypothesis scale: AG Grid Community's native applyTransaction matches
or beats pretable on every measured streaming metric. Pretable's
streaming wedge in the project narrative is integration (the
@pretable/stream-adapter + @cacheplane/json-stream pipeline), not raw
throughput. The editorial homepage refresh based on this finding is a
separate follow-up.
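To make the "integration, not throughput" point concrete, here is a minimal sketch of the kind of glue a consumer would otherwise write by hand: parse an event payload, coalesce updates by row id, and hand a batch to a grid-style transaction call. Every name below (`RowUpdate`, `TransactionSink`, `UpdateBatcher`, `parseEventData`) is hypothetical and is not the @pretable/stream-adapter or @cacheplane/json-stream API. The parser here also simply drops incomplete payloads rather than handling partial JSON.

```typescript
// Hypothetical row patch: an id plus any changed fields.
interface RowUpdate {
  id: string;
  [field: string]: unknown;
}

// Hypothetical sink, shaped like a grid call that accepts a batch of updates.
interface TransactionSink {
  applyTransaction(tx: { update: RowUpdate[] }): void;
}

class UpdateBatcher {
  private pending = new Map<string, RowUpdate>();
  private sink: TransactionSink;
  private maxBatch: number;

  constructor(sink: TransactionSink, maxBatch: number = 500) {
    this.sink = sink;
    this.maxBatch = maxBatch;
  }

  // Merge by id so a burst of patches to one row becomes a single update.
  push(update: RowUpdate): void {
    const prev = this.pending.get(update.id);
    this.pending.set(update.id, prev ? { ...prev, ...update } : update);
    if (this.pending.size >= this.maxBatch) this.flush();
  }

  // Send everything pending as one transaction, then reset.
  flush(): void {
    if (this.pending.size === 0) return;
    const update = Array.from(this.pending.values());
    this.pending.clear();
    this.sink.applyTransaction({ update });
  }
}

// Parse one event payload; returns null unless it is complete JSON
// carrying a string id.
function parseEventData(data: string): RowUpdate | null {
  try {
    const value = JSON.parse(data);
    return typeof value?.id === "string" ? (value as RowUpdate) : null;
  } catch {
    return null;
  }
}
```

In a real page, `flush()` would typically be driven by a timer or animation frame so that high update rates collapse into one transaction per frame; that scheduling is left out here to keep the sketch deterministic.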
S2-dependent hypotheses (H1, H6-H8, H10-H12, H16-H22) remain
insufficient because S2 was not in this matrix; expected. Existing B2
+ B2-with-autosize milestones for S2 are unchanged.
blove added a commit that referenced this pull request May 11, 2026

…5 cross-validation

PR #128's S5/S7 cross-validation matrix surfaced a finding: AG Grid Community matches pretable on every measured streaming numeric (frame p95, 25k/sec envelope, visible-row drift). The homepage's stub-era "purpose-built streaming pipeline" framing, and the implication that pretable is uniquely fast at streaming, is no longer supportable on hypothesis-scale numerics. The honest wedge is package surface: pretable ships the SSE → partial-JSON → batcher → applyTransaction pipeline as a single import; AG Grid expects you to wire it yourself.

Three editorial edits:

- ComparisonTable.tsx: streaming row renamed from "purpose-built streaming pipeline" to "streaming pipeline (SSE → partial JSON → batcher → applyTransaction)"; same yes/n/a/n/a/n/a shape, sharper capability claim. Header docblock updated to cite the S5/S7 cross-validation milestone alongside the existing B2 sources.
- ReceiptsBand.tsx: replaced the "25k/s · max sustained update rate" hero stat (no longer pretable-unique) with "OpenAI · Anthropic · SSE · streaming sources, one import". Added a `compact: true` flag to the Stat interface so the longer label renders at 20–24 px instead of 44–56 px, preserving the four-cell grid without overflowing the hero font scale.
- FeatureGrid.tsx: Stream-aware card: dropped "sustained from 100 to 25,000 updates/sec" tail; rewrote the description around the pipeline that ships as one import.

Test added: ReceiptsBand.test.tsx regression-guards the new capability anchor (`streaming sources` + `openai`).

Repo-memory entry appended (B2 follow-up #7); MEMORY.md index updated; project_b2_followups.md regenerated to reflect everything resolved except item #5 (open comparator interaction scripts).

No source/package changes outside apps/website + the docs entry; all 190+ website tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
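As an illustration of the `compact` flag described in the ReceiptsBand.tsx change, here is a minimal sketch. The interface shape, the `heroFontRange` helper, and the split of the stat text into `value` and `label` are assumptions made for this example; only the flag name and the pixel ranges come from the commit message, and the real component may be structured differently.

```typescript
// Hypothetical shape; the actual Stat interface in ReceiptsBand.tsx may differ.
interface Stat {
  value: string;
  label: string;
  compact?: boolean; // opt in to the smaller hero font scale
}

// Pixel ranges from the commit message: 20–24 px compact, 44–56 px default.
function heroFontRange(stat: Stat): { minPx: number; maxPx: number } {
  return stat.compact
    ? { minPx: 20, maxPx: 24 }
    : { minPx: 44, maxPx: 56 };
}

// Assumed split of the replacement stat into value and label.
const streamingSources: Stat = {
  value: "OpenAI · Anthropic · SSE",
  label: "streaming sources, one import",
  compact: true,
};
```

Making the flag optional keeps the other cells on the default scale without touching their definitions, so only the long text cell opts in.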
blove added a commit that referenced this pull request May 11, 2026

…5 cross-validation (#129)

PR #128's S5/S7 cross-validation matrix surfaced a finding: AG Grid Community matches pretable on every measured streaming numeric (frame p95, 25k/sec envelope, visible-row drift). The homepage's stub-era "purpose-built streaming pipeline" framing, and the implication that pretable is uniquely fast at streaming, is no longer supportable on hypothesis-scale numerics. The honest wedge is package surface: pretable ships the SSE → partial-JSON → batcher → applyTransaction pipeline as a single import; AG Grid expects you to wire it yourself.

Three editorial edits:

- ComparisonTable.tsx: streaming row renamed from "purpose-built streaming pipeline" to "streaming pipeline (SSE → partial JSON → batcher → applyTransaction)"; same yes/n/a/n/a/n/a shape, sharper capability claim. Header docblock updated to cite the S5/S7 cross-validation milestone alongside the existing B2 sources.
- ReceiptsBand.tsx: replaced the "25k/s · max sustained update rate" hero stat (no longer pretable-unique) with "OpenAI · Anthropic · SSE · streaming sources, one import". Added a `compact: true` flag to the Stat interface so the longer label renders at 20–24 px instead of 44–56 px, preserving the four-cell grid without overflowing the hero font scale.
- FeatureGrid.tsx: Stream-aware card: dropped "sustained from 100 to 25,000 updates/sec" tail; rewrote the description around the pipeline that ships as one import.

Test added: ReceiptsBand.test.tsx regression-guards the new capability anchor (`streaming sources` + `openai`).

Repo-memory entry appended (B2 follow-up #7); MEMORY.md index updated; project_b2_followups.md regenerated to reflect everything resolved except item #5 (open comparator interaction scripts).

No source/package changes outside apps/website + the docs entry; all 190+ website tests pass.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
insufficient because the B2 Phase 4 retry was S2-only.
--repeats=3 --update-rates=1000,25000).
Hypothesis status delta
No other hypothesis status changed. S2-dependent hypotheses (H1, H6–H8, H10–H12, H16–H22) remain insufficient because S2 was not in this matrix; expected. Existing B2 + B2-with-autosize S2 milestones are unchanged.
What's NOT in this PR
directional (not satisfied) because AG Grid Community's native streaming clears the same bars; that's the news, not a threshold problem.
Test plan
🤖 Generated with Claude Code