@stevengj

… and that the lower-level functions provide more control over memory allocation. Closes #307.

@stevengj stevengj merged commit ae2f2fb into master Nov 22, 2025
12 checks passed
@stevengj stevengj deleted the map_doc branch November 22, 2025 14:34
dzfrias pushed a commit to dzfrias/utf8proc that referenced this pull request Nov 22, 2025
stevengj added a commit that referenced this pull request Nov 22, 2025
* Fix attempting to combine Hangul Jamo 0x11a7

0x11a7 is not a valid Hangul trailing consonant (T jamo), even though
it is equal to T_BASE. This is because, per the Unicode spec:

  TCount is set to one more than the number of trailing consonants
  relevant to the decomposition algorithm: (0x11C2 - 0x11A8 + 1) + 1

The extra slot stands for "no trailing consonant", so the first valid
T jamo is 0x11a8. Also see
https://www.unicode.org/versions/Unicode17.0.0/core-spec/chapter-3/#G59434
for where the spec describes the use of 0x11a8, not 0x11a7, during
composition.

* document that utf8proc_map simply wraps utf8proc_decompose and utf8proc_reencode (#312)

* test code refactoring (#318)

* Write regression test for #317

---------

Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>
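
The range check described in the first commit item can be sketched as follows. This is a minimal illustration, not the actual utf8proc code: the constant values come from the Hangul composition arithmetic in the Unicode spec, while the `HANGUL_` names and the helper functions are hypothetical and chosen only for this example. The point is that the lower bound on the trailing consonant must be exclusive, so 0x11a7 (equal to T_BASE) is rejected.

```c
#include <stdint.h>

/* Values from the Unicode Hangul composition arithmetic.
   The HANGUL_ names are illustrative, not utf8proc identifiers. */
#define HANGUL_SBASE  0xAC00u  /* first precomposed syllable */
#define HANGUL_TBASE  0x11A7u  /* one below the first trailing consonant */
#define HANGUL_TCOUNT 28u      /* 27 trailing consonants + 1 for "no T" */
#define HANGUL_SCOUNT 11172u   /* total number of precomposed syllables */

/* An LV syllable is a precomposed syllable with no trailing consonant. */
static int is_lv_syllable(uint32_t c) {
    return c >= HANGUL_SBASE && c < HANGUL_SBASE + HANGUL_SCOUNT
        && (c - HANGUL_SBASE) % HANGUL_TCOUNT == 0;
}

/* Valid T jamo are 0x11A8..0x11C2. The lower bound is exclusive:
   0x11A7 would give a T index of 0, which means "no trailing consonant". */
static int is_trailing_consonant(uint32_t c) {
    return c > HANGUL_TBASE && c < HANGUL_TBASE + HANGUL_TCOUNT;
}

/* Returns the composed LVT syllable, or 0 if the pair does not compose. */
static uint32_t compose_lv_t(uint32_t lv, uint32_t t) {
    if (!is_lv_syllable(lv) || !is_trailing_consonant(t))
        return 0;
    return lv + (t - HANGUL_TBASE);
}
```

With an inclusive lower bound (`c >= HANGUL_TBASE`), the pair (0xAC00, 0x11A7) would "compose" to 0xAC00 itself and silently drop the second code point. With the exclusive bound the pair is left alone.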
Successfully merging this pull request may close these issues.

Support using external/static bounded buffers for [de]composing