You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
String-level full-expansion upper/lower case drivers in unicode.casing
(@stable-since: v0.4.1): upperFull{Len,Buffer,Alloc} / lowerFull{Len,Buffer,Alloc} over []const CodePoint and upperFullUtf8{Alloc,Writer} / lowerFullUtf8{Alloc,Writer} over UTF-8,
mirroring the existing foldFull* surface. These apply expanding case
mappings the simple drivers cannot — "straße" upper-cases to "STRASSE", "ff" to "FF". Default root locale (no Turkic tailoring); the lower
drivers use the context-free mapping (no Greek Final_Sigma — use titlecaseUtf8Alloc or the per-scalar API for that).
Grapheme-cluster-aware display width in unicode.width
(@stable-since: v0.4.1): stringWidthGraphemes / stringWidthGraphemesLossy
/ stringWidthGraphemesCodePoints and the per-cluster graphemeClusterWidth.
Unlike the per-scalar stringWidth* estimators (unchanged), these count each
UAX #29 grapheme cluster once, so a ZWJ emoji family (👨👩👧) measures 2
columns instead of 6. Emoji-presentation sequences (VS16) and flags (regional
indicator pairs) are counted as 2.
Infallible lossy decode primitives encoding.utf16.decodeU16CodePointLossy
and encoding.utf32.decodeU32CodePointLossy (@stable-since: v0.4.1),
mirroring utf8.decodeCodePointLossy: malformed units yield U+FFFD with no
error union, preconditions asserted. The UTF-16/UTF-32 lossy iterators now
decode through them, so the "lossy never errors, structurally" guarantee
(previously UTF-8 only) holds for all three codecs — no catch @panic
remains on any lossy path.
Changed
Breaking: the UTF-8 stream's OutputBufferTooSmall error is renamed to BufferTooSmall, the name the encoding, transcoding, casing, and collation
layers already use, so the whole library reports one error for "output
buffer too small" (UTF8Stream.nextCodePoint / nextCodePointLossy). The
sibling NeedMoreBytes / EOFReached keep input-side starvation distinct.
unicode.properties.isAscii now delegates to encoding.isAscii instead of
hardcoding <= 0x7F; every scalar ASCII check in the library now routes
through the single encoding.isAscii predicate. The hex-digit predicate docs
(isHexDigit, isHexDigitWide, isAsciiHexDigit) were clarified so the
ASCII-vs-Unicode distinction is explicit (no renames).
Internal: collation-element generation is unified behind one Collator.recordAt, shared by buildKey and the incremental comparator, so
DUCET record lookup, discontiguous-contraction extension, and implicit
weighting cannot diverge between the sort-key and early-exit paths (closes a
divergence risk from 0.4.0; no API change).
Internal: build.zig no longer imports library source (src/utils/root.zig)
for its some build-option predicate — it carries a local copy, decoupling
the build graph from module layout.
Removed
Breaking: the dead BufferIsEmpty error variant is gone from UTF8Stream.nextCodePoint / nextCodePointLossy — it was declared but
returned by no code path (the same cleanup as 0.4.0's error.Undefined).
Fixed
The unchecked-decode contract ("never panics … undefined in ReleaseFast") is
now honored uniformly: three internal sites used @panic (which traps in all
modes) instead of unreachable — most visibly decodeCodePointReverseUnchecked,
whose doc promised no panic. Replaced with unreachable so behavior matches
the documented contract.