Skip to content

test: add reference BPE comparison tests#22

Closed
AmitMY wants to merge 2 commits intomainfrom
test/reference-comparison
Closed

test: add reference BPE comparison tests#22
AmitMY wants to merge 2 commits intomainfrom
test/reference-comparison

Conversation

@AmitMY
Copy link
Copy Markdown
Contributor

@AmitMY AmitMY commented Apr 8, 2026

Summary

  • Standalone reference BPE and Boundless BPE implementations as ground truth
  • Verify all 3 BPE implementations (graph, fast, boundless) produce identical merges to reference
  • 5 comparison tests across different input sizes

Stacked on #21.

What improved

  • Confidence that all our BPE implementations are correct — verified against independent reference
  • Any future optimization can be validated against this reference

Test plan

  • 5 reference comparison tests pass
  • ruff check . passes

🤖 Generated with Claude Code

@AmitMY AmitMY force-pushed the test/reference-comparison branch 16 times, most recently from 51558ab to f968db6 Compare April 8, 2026 17:27
AmitMY and others added 2 commits April 8, 2026 19:28
- Return self if merged subgraphs are identical to originals
- Avoids creating new tuples and UnconnectedGraphs on no-op merges

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Standalone reference BPE and Boundless BPE implementations as ground
  truth (no graph abstraction, just byte sequences + counters)
- Verify graph BPE, FastBPE, and Boundless BPE all produce identical
  merges to reference on same input
- 5 tests across small, medium, and multi-implementation comparisons

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@AmitMY AmitMY force-pushed the test/reference-comparison branch from f968db6 to 46be990 Compare April 8, 2026 17:28
@AmitMY
Copy link
Copy Markdown
Contributor Author

AmitMY commented Apr 8, 2026

Redundant — BPE vs HuggingFace comparison is already covered in test_bpe.py.

@AmitMY AmitMY closed this Apr 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant