@stevengj

Cleaned up some repeated code for testing transformations, and improved the output somewhat.

@stevengj stevengj merged commit 3460568 into master Nov 22, 2025
12 checks passed
@stevengj stevengj deleted the test_refactoring branch November 22, 2025 14:40
dzfrias pushed a commit to dzfrias/utf8proc that referenced this pull request Nov 22, 2025
stevengj added a commit that referenced this pull request Nov 22, 2025
* Fix attempting to combine Hangul Jamo 0x11a7

0x11a7 is not a valid Hangul T syllable despite being equal to T_BASE.
This is because, per the Unicode spec:

  TCount is set to one more than the number of trailing consonants
  relevant to the decomposition algorithm: (0x11C2 - 0x11A8 + 1) + 1

So the first valid Hangul T syllable is 0x11a8. Also see
https://www.unicode.org/versions/Unicode17.0.0/core-spec/chapter-3/#G59434
for where the spec describes the usage of 0x11a8, not 0x11a7, during
composition.

* document that utf8proc_map simply wraps utf8proc_decompose and utf8proc_reencode (#312)

* test code refactoring (#318)

* Write regression test for #317

---------

Co-authored-by: Steven G. Johnson <stevenj@alum.mit.edu>
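
The Hangul Jamo fix described in the commit message above can be illustrated with a minimal sketch of pairwise Hangul composition. This is not utf8proc's actual implementation: the function name `hangul_compose_pair` and the enum names are hypothetical, chosen for illustration. The constants are the ones given in the Unicode core specification's Hangul composition algorithm. The point of interest is the trailing-consonant check, which requires the second code point to be strictly greater than `T_BASE` (0x11A7), so the first accepted T jamo is 0x11A8.

```c
#include <stdint.h>

/* Constants from the Unicode Hangul syllable composition algorithm. */
enum {
    S_BASE  = 0xAC00,            /* first precomposed Hangul syllable */
    L_BASE  = 0x1100,            /* first leading consonant */
    V_BASE  = 0x1161,            /* first vowel */
    T_BASE  = 0x11A7,            /* one BELOW the first trailing consonant */
    L_COUNT = 19,
    V_COUNT = 21,
    T_COUNT = 28,                /* (0x11C2 - 0x11A8 + 1) + 1 */
    N_COUNT = V_COUNT * T_COUNT, /* 588 */
    S_COUNT = L_COUNT * N_COUNT  /* 11172 */
};

/* Hypothetical helper (not part of the utf8proc API): returns the composed
 * syllable for the pair (first, second), or -1 if the pair does not compose. */
static int32_t hangul_compose_pair(int32_t first, int32_t second)
{
    /* L + V -> LV syllable */
    if (first >= L_BASE && first < L_BASE + L_COUNT &&
        second >= V_BASE && second < V_BASE + V_COUNT) {
        int32_t l_index = first - L_BASE;
        int32_t v_index = second - V_BASE;
        return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT;
    }

    /* LV + T -> LVT syllable.
     * T index 0 means "no trailing consonant", so T_BASE itself (0x11A7)
     * must be rejected: the lower bound is strict. */
    if (first >= S_BASE && first < S_BASE + S_COUNT &&
        (first - S_BASE) % T_COUNT == 0 &&
        second > T_BASE && second < T_BASE + T_COUNT) {
        return first + (second - T_BASE);
    }

    return -1;
}
```

With this sketch, `hangul_compose_pair(0xAC00, 0x11A8)` yields 0xAC01, while `hangul_compose_pair(0xAC00, 0x11A7)` yields -1. Using `>=` instead of `>` in the lower bound would accept 0x11A7 and return the LV syllable unchanged, silently dropping the second code point.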