Releases: Goutte/godot-addon-unicode-normalizer

Last Minute Changes

28 Nov 14:08

Rejection!

But… we don't need a root .gitignore file! 😢

We added an empty .gitignore file and filled out the whole form for the third time (the first attempt failed a CSRF check, because that form has to be speedrun). 🧂

Breaking

Since no one is using this yet, I allowed myself to break the API (we're below 1.0.0 anyway):

  • UnicodeNormalizerClass → UnicodeNormalizerNode
  • delete_decomposable → remove_decomposable

Easier Extension of Replacements

27 Nov 05:37

Features

  • It is now easier to add custom replacements in NormalizationMapping.

Bug Fixes

  • Fixed a potential edge-case issue with characters at the bottom of the Unicode table.

Initial Release

26 Nov 15:00

UnicodeNormalizer

This singleton helps normalize your Unicode strings by:

  • removing diacritics (decomposing, then keeping only the first character) — "é" → "e"
  • substituting fallback characters — "Æ" → "AE"
  • staying fast, because replacements are looked up with a binary search
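
To illustrate the kind of binary-search lookup mentioned in the list above, here is a minimal sketch. It assumes a hypothetical layout of two parallel arrays sorted by code point; the names `codes`, `replacements` and `find_replacement` are made up for this example and are not the addon's actual API or storage format.

```gdscript
extends RefCounted

# Hypothetical data: code points sorted in ascending order,
# with the replacement for codes[i] stored at replacements[i].
var codes := PackedInt32Array([0x00C6, 0x00E6, 0x00E9])  # Æ, æ, é
var replacements := PackedStringArray(["AE", "ae", "e"])

# Expects a single, precomposed character.
# Returns its replacement, or the character itself when none is known.
func find_replacement(character: String) -> String:
	var code := character.unicode_at(0)
	# bsearch returns the index of the value if present,
	# otherwise the index where it would be inserted.
	var index := codes.bsearch(code)
	if index < codes.size() and codes[index] == code:
		return replacements[index]
	return character
```

With this layout, each lookup takes a number of comparisons that grows with the logarithm of the table size, rather than scanning every entry.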

NormalizationMapping

This Resource is our database of replacements, used by the UnicodeNormalizer.
It is built from the official unicode.org data.

It is only about 16 KiB, and is derived from 1.9 MiB of raw data.

Basic Usage

You can use the normalize method on the autoload singleton UnicodeNormalizer:

UnicodeNormalizer.normalize("Dès Noël, où un zéphyr haï me vêt")
# "Des Noel, ou un zephyr hai me vet"

Advanced Usage

The UnicodeNormalizer is made to be extended, so it can be tailored to your font's capabilities and your needs.

Here, the font supports some French diacritics, but only uppercase characters:

# file "MyFontNormalizer.gd"
extends UnicodeNormalizerClass

var characters_in_my_font := "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ÉÈÊËÀÂÄÔÖÙÛÜÇ"

func should_skip_character(character: String, _character_code: int) -> bool:
	return self.characters_in_my_font.contains(character)  # inefficient

func normalize(some_string: String) -> String:
	return super.normalize(some_string.to_upper())

This is a naive/inefficient implementation to keep the example short and simple.
A more performant implementation would use binary search on a sorted array.
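
One possible sketch of that faster variant follows. It assumes that `character_code` is the Unicode code point of `character`; the `sorted_codes` member and the `_build_sorted_codes` helper are hypothetical names introduced for this example.

```gdscript
# file "MyFontNormalizer.gd"
extends UnicodeNormalizerClass

var characters_in_my_font := "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ÉÈÊËÀÂÄÔÖÙÛÜÇ"

# Hypothetical cache: sorted code points of the characters above.
var sorted_codes := PackedInt32Array()

# Fills the cache from characters_in_my_font and sorts it.
func _build_sorted_codes() -> void:
	for i in characters_in_my_font.length():
		sorted_codes.append(characters_in_my_font.unicode_at(i))
	sorted_codes.sort()

func should_skip_character(_character: String, character_code: int) -> bool:
	# Build the cache lazily, on first use.
	if sorted_codes.is_empty():
		_build_sorted_codes()
	# Binary search: index of the code if present, else its insertion index.
	var index := sorted_codes.bsearch(character_code)
	return index < sorted_codes.size() and sorted_codes[index] == character_code

func normalize(some_string: String) -> String:
	return super.normalize(some_string.to_upper())
```

The cache is built on the first call instead of in `_init`, so the sketch makes no assumption about how the parent class is constructed.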