release: v0.2.0 #2
Merged
- Add rivo/uniseg as third comparison library alongside go-runewidth
- Add Uniseg variants for all StringWidth benchmarks (15 new benchmarks)
- Add Complex Unicode section: flags, ZWJ sequences, combined strings (9 benchmarks)
- Update README.md with results tables and library comparison
- Total: 51 benchmarks (was 30)
- Addresses #1
- Replace byte-by-byte isASCIIOnly() with SWAR processing 8 bytes/iter
- Add asciiWidth() with Daniel Lemire's SWAR control char detection
- Add short string fast path (<8 bytes) to avoid SWAR call overhead
- ASCII Short: 9 ns → 6 ns (1.5x faster)
- ASCII Medium: 71 ns → 24 ns (3x faster)
- ASCII Long: 340 ns → 77 ns (4.4x faster)
- Now 46x faster than go-runewidth, 77x faster than uniseg on long ASCII
- Zero allocations maintained for all ASCII paths
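The SWAR ("SIMD within a register") idea from this commit can be sketched as follows. This is a minimal illustration, not the library's actual code: the function bodies, the `load8` helper, and the rule that control characters count as zero columns are assumptions. The control-character test here is a simple add-based variant that is only valid when every byte is below 0x80; the formula the library actually uses is not shown in this PR.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	hiBits = 0x8080808080808080 // high bit of every byte lane
	ones   = 0x0101010101010101 // 0x01 in every byte lane
)

// load8 packs s[i..i+7] into a uint64, little-endian (hypothetical helper).
func load8(s string, i int) uint64 {
	_ = s[i+7] // one bounds check for the whole load
	return uint64(s[i]) | uint64(s[i+1])<<8 | uint64(s[i+2])<<16 |
		uint64(s[i+3])<<24 | uint64(s[i+4])<<32 | uint64(s[i+5])<<40 |
		uint64(s[i+6])<<48 | uint64(s[i+7])<<56
}

// isASCIIOnly reports whether every byte of s is below 0x80,
// testing 8 bytes per iteration and finishing the tail byte by byte.
func isASCIIOnly(s string) bool {
	i := 0
	for ; i+8 <= len(s); i += 8 {
		if load8(s, i)&hiBits != 0 {
			return false
		}
	}
	for ; i < len(s); i++ {
		if s[i] >= 0x80 {
			return false
		}
	}
	return true
}

// asciiWidth counts printable bytes in an ASCII-only string.
// Assumption: control characters (< 0x20 and 0x7F) occupy zero columns.
func asciiWidth(s string) int {
	width, i := 0, 0
	for ; i+8 <= len(s); i += 8 {
		w := load8(s, i)
		// With all lanes < 0x80 neither addition carries into the next lane:
		//   w + 0x60 sets a lane's high bit exactly when the byte is >= 0x20,
		//   w + 0x01 sets a lane's high bit exactly when the byte is 0x7F.
		ctrl := (^(w + 0x60*ones) | (w + ones)) & hiBits
		width += 8 - bits.OnesCount64(ctrl)
	}
	for ; i < len(s); i++ {
		if c := s[i]; c >= 0x20 && c != 0x7F {
			width++
		}
	}
	return width
}

func main() {
	fmt.Println(isASCIIOnly("hello, world"), asciiWidth("hello\tworld"))
}
```

Because each flagged lane contributes exactly one set bit, a population count gives the number of control bytes in the word without a per-byte branch.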
- 3-stage hierarchical table: ROOT(256B) → MIDDLE(17×64) → LEAVES(78×32)
- Total size: 3.8KB with bucket deduplication
- 2-bit width encoding: 0=zero-width, 1=narrow, 2=wide, 3=ambiguous
- Exhaustive verification: all 1,112,064 valid codepoints match
- Coverage improved: 87.1% → 97.6%
- Merged zero-width format chars (0x200B-0x200F) into single range
- Updated generator with buildWidthMap() and buildMultiStageTable()
- Fixed 3 conformance test expectations for Unicode 16.0 data
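A rough sketch of how a three-stage table with these dimensions can be built and queried. The block sizes (256 root entries, 64-entry middle blocks, 32-byte leaves holding 128 code points at 2 bits each) follow the commit message; the bit split (13/7), the type and function names, and the `toyWidth` data are assumptions for illustration and are not the generated tables from this PR.

```go
package main

import "fmt"

// Width classes packed into 2 bits, as listed in the commit message.
const (
	zeroWidth = 0
	narrow    = 1
	wide      = 2
	ambiguous = 3
)

type widthTable struct {
	root   [256]uint8  // code point bits 20..13 -> middle block index
	middle [][64]uint8 // code point bits 12..7  -> leaf index
	leaves [][32]uint8 // 128 code points per leaf, 2 bits each
}

// buildTable fills the three stages from width, storing identical
// middle blocks and leaves only once (bucket deduplication).
func buildTable(width func(rune) uint8) *widthTable {
	tbl := &widthTable{}
	seenMiddle := map[[64]uint8]uint8{}
	seenLeaf := map[[32]uint8]uint8{}
	for r := 0; r < 256; r++ {
		var mid [64]uint8
		for m := 0; m < 64; m++ {
			var leaf [32]uint8
			base := rune(r<<13 | m<<7)
			for i := 0; i < 128; i++ {
				leaf[i>>2] |= width(base+rune(i)) << uint((i&3)*2)
			}
			idx, ok := seenLeaf[leaf]
			if !ok {
				idx = uint8(len(tbl.leaves))
				tbl.leaves = append(tbl.leaves, leaf)
				seenLeaf[leaf] = idx
			}
			mid[m] = idx
		}
		idx, ok := seenMiddle[mid]
		if !ok {
			idx = uint8(len(tbl.middle))
			tbl.middle = append(tbl.middle, mid)
			seenMiddle[mid] = idx
		}
		tbl.root[r] = idx
	}
	return tbl
}

// lookup walks root -> middle -> leaf and extracts the 2-bit width class.
func (tbl *widthTable) lookup(r rune) uint8 {
	if r < 0 || r > 0x10FFFF {
		return narrow
	}
	mid := &tbl.middle[tbl.root[r>>13]]
	leaf := &tbl.leaves[mid[(r>>7)&63]]
	return (leaf[(r&127)>>2] >> uint((r&3)*2)) & 3
}

// size returns the table footprint in bytes.
func (tbl *widthTable) size() int {
	return len(tbl.root) + 64*len(tbl.middle) + 32*len(tbl.leaves)
}

// toyWidth is placeholder data covering only three ranges.
func toyWidth(r rune) uint8 {
	switch {
	case r >= 0x0300 && r <= 0x036F, r >= 0x200B && r <= 0x200F:
		return zeroWidth
	case r >= 0x4E00 && r <= 0x9FFF:
		return wide
	}
	return narrow
}

func main() {
	tbl := buildTable(toyWidth)
	fmt.Println(tbl.lookup('a'), tbl.lookup(0x0301), tbl.lookup('中')) // 1 0 2
	fmt.Println("bytes:", tbl.size())
}
```

With the dimensions quoted in the commit, the same arithmetic gives 256 + 17×64 + 78×32 = 3,840 bytes, which matches the stated 3.8KB.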
- Forward-scan state machine for correct ZWJ width calculation
- Family emoji (👨👩👧👦) now correctly returns width 2, not 8
- Emoji modifier sequences (👍🏽) return width 2, not 4
- Handles profession ZWJ (👩🔬), flags (🏳️🌈), hearts (❤️🔥)
- Zero overhead for ASCII strings (fast paths unchanged)
- Zero allocations for short emoji sequences (stack-allocated)
- Added isExtendedPictographic() and isEmojiModifier() helpers
- 48 new test cases: ZWJ sequences, modifiers, edge cases
- 4 new benchmarks: ZWJ family, couple, modifier, mixed
- Coverage: 96.4%, lint: 0 issues
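One way a forward-scan state machine of this kind could look is sketched below. The helper names match the commit message, but their bodies are deliberately rough stand-ins (a single code point range instead of real Unicode property data), the state names and `stringWidth` are hypothetical, and every rune that is not an emoji counts as one column. It shows the technique only and is not the library's implementation.

```go
package main

import "fmt"

const (
	zwj  = 0x200D // ZERO WIDTH JOINER
	vs16 = 0xFE0F // VARIATION SELECTOR-16 (emoji presentation)
)

// Scanner states (hypothetical names).
const (
	stateNone  = iota // not inside an emoji sequence
	stateEmoji        // just consumed an emoji base, modifier, or VS16
	stateJoin         // just consumed a ZWJ that followed an emoji
)

// isEmojiModifier matches the five skin tone modifiers.
func isEmojiModifier(r rune) bool { return r >= 0x1F3FB && r <= 0x1F3FF }

// isExtendedPictographic is a toy approximation: one block range only.
func isExtendedPictographic(r rune) bool {
	return r >= 0x1F300 && r <= 0x1FAFF && !isEmojiModifier(r)
}

// stringWidth scans left to right, charging 2 columns once per emoji
// sequence and nothing for the runes that extend or join it.
func stringWidth(s string) int {
	width, state := 0, stateNone
	for _, r := range s {
		switch {
		case state == stateEmoji && (isEmojiModifier(r) || r == vs16):
			// Extends the current emoji; no extra columns.
		case state == stateEmoji && r == zwj:
			state = stateJoin
		case state == stateJoin && isExtendedPictographic(r):
			state = stateEmoji // joined to the previous emoji; no extra columns
		case isExtendedPictographic(r), isEmojiModifier(r):
			width += 2 // start of a new emoji sequence
			state = stateEmoji
		case r == zwj || r == vs16:
			state = stateNone // stray joiner or selector: zero width
		default:
			width++ // toy rule: everything else is one column
			state = stateNone
		}
	}
	return width
}

func main() {
	family := "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"
	thumbs := "\U0001F44D\U0001F3FD"
	fmt.Println(stringWidth(family), stringWidth(thumbs)) // 2 2
}
```

A single pass with three states avoids building grapheme clusters, which is consistent with the zero-allocation claim above, although how the library achieves that is not shown here.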
- CHANGELOG.md: add v0.2.0 section (ZWJ, SWAR, 3-stage table)
- README.md: update features, architecture, benchmarks, coverage
- ARCHITECTURE.md: add ZWJ state machine, SWAR, 3-stage table sections
- ROADMAP.md: add public roadmap (Now/Next/Later format)
- tables.go: remove dead code (replaced by tables_generated.go)
- Update GitHub Actions: checkout v6, setup-go v6, codecov v5, golangci-lint v9
- Add benchmarks.yml: regression detection (benchstat) + library comparison table
- Remove develop/release/hotfix branch triggers (main-only + PRs)
- Add concurrency groups to prevent duplicate CI runs
- Fix gofmt formatting in source files
Summary
Release v0.2.0 — major performance and emoji handling improvements.
Performance
- isASCIIOnly() and asciiWidth() process 8 bytes/iter using uint64 word tricks
Features
- isExtendedPictographic() / isEmojiModifier() helpers
CI/CD
- benchmarks.yml: regression detection (benchstat) + three-way library comparison table in PR comments
Documentation
- tables.go removed (186 lines, unused since tables_generated.go)
Benchmarks (i7-1255U)
Test plan
- go test -v: all tests pass
- go test -race (via WSL2): no races
- golangci-lint run: 0 issues
- gofmt -l .: all formatted
- scripts/pre-release-check.sh: green (1 warning: planned TODO)