remove iconv dependence in unicode.c #186

Open
wants to merge 4 commits into
base: master

Commits on Oct 20, 2023

  1. Add a simplified 'make check' to test some library functions and routines
    
    The simplified 'make check' is based on a similar structure I put together
    for libspiro:2013-07-22 'Run "make check" to test spiro.c UNIT TEST'
    and again for libuninameslist 20170319, plus some other projects.
    It works okay with the older configure.ac/Makefile.am found in
    older/mature linux distros, whereas the testsuite.at approach of 2012..2013
    had trouble building with some distros. There is no testsuite.at here.
    JoesCat committed Oct 20, 2023
    6a4abb1

Commits on Oct 21, 2023

  1. Add Unicode >= 2.0 extended chars to ucs2_strlen()

    ucs2_strlen() could only act as a strlen()-type function and was
    inaccurate for chars 0x10000..0x10ffff, which are coded as pairs
    {0xd800..0xdbff}:{0xdc00..0xdfff}: one char that uses two utf-16
    values. The string is tested as utf-16le regardless of CPU endianness.
    
    mode=0: acts like the original ucs2_strlen(), which is still useful
    for getting a count to size buffers.
    
    mode=1: is a hybrid, which counts one char for each code pair used
    to create values 0x10000..0x10ffff, and soft-fails by counting the codes
    as separate values if the pair isn't grouped together. This could
    be useful in some situations.
    
    mode=2: does a strict check and returns length=-1 if a code pair is
    out of sequence. This is useful to get the right char count and verify
    that the extended char code pairs are in the right sequence.
    JoesCat committed Oct 21, 2023
    6273875
  2. utf16_to_utf8() - Remove dependence from iconv or libiconv

    Convert utf16 coded characters to utf8. This also fixes the output buffer
    to allow for a worst-case of 1024 x 4 utf8 chars which is possible if all
    chars are from range 0x10000..0x10ffff.
    This is a subset of a converter made for fontforge, based on utf8_idpb().
    See function: 2013-10-06, Expanded utf8_idpb() to output up to 0x7FFFFFFF
    JoesCat committed Oct 21, 2023
    6595d68
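
For readers following the two Oct 21 commits, here is a minimal sketch of the ideas they describe: a three-mode length count over a NUL-terminated utf-16le string, and a utf16 to utf8 conversion that does not call iconv. This is not the code from the pull request. The names `rd16le`, `sketch_utf16le_strlen`, `sketch_utf16le_to_utf8` and `sketch_selfcheck`, the signatures, and the choice to return -1 on a stray surrogate or a too-small buffer are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read one utf-16le code unit, independent of CPU endianness. */
static uint16_t rd16le(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Assumed semantics, modelled on the commit message:
 * mode 0: count 16-bit code units (plain strlen-style count);
 * mode 1: count a well-formed surrogate pair as one char, stray surrogates singly;
 * mode 2: as mode 1, but return -1 on any stray surrogate. */
long sketch_utf16le_strlen(const unsigned char *s, int mode) {
    long n = 0;
    for (;;) {
        uint16_t u = rd16le(s);
        if (u == 0) return n;
        s += 2;
        if (mode != 0 && u >= 0xD800 && u <= 0xDFFF) {
            uint16_t lo = rd16le(s);   /* at worst this reads the terminator */
            int paired = u <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF;
            if (paired) s += 2;
            else if (mode == 2) return -1;
        }
        n++;
    }
}

/* Convert NUL-terminated utf-16le to utf8. Returns the number of bytes written
 * (excluding the NUL), or -1 on a stray surrogate or if out is too small. */
long sketch_utf16le_to_utf8(const unsigned char *in, char *out, size_t cap) {
    size_t o = 0;
    if (cap == 0) return -1;
    for (;;) {
        uint32_t c = rd16le(in);
        in += 2;
        if (c == 0) break;
        if (c >= 0xD800 && c <= 0xDBFF) {          /* high surrogate: need a low one */
            uint32_t lo = rd16le(in);
            if (lo < 0xDC00 || lo > 0xDFFF) return -1;
            in += 2;
            c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
        } else if (c >= 0xDC00 && c <= 0xDFFF) {   /* low surrogate on its own */
            return -1;
        }
        size_t len = c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
        if (o + len + 1 > cap) return -1;          /* keep room for the NUL */
        if (len == 1) {
            out[o++] = (char)c;
        } else if (len == 2) {
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        } else if (len == 3) {
            out[o++] = (char)(0xE0 | (c >> 12));
            out[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (c & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (c >> 18));
            out[o++] = (char)(0x80 | ((c >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return (long)o;
}

/* Self-check: "A" followed by U+1F600, and "A" followed by a lone low surrogate. */
void sketch_selfcheck(void) {
    const unsigned char ok[]  = { 0x41, 0x00, 0x3D, 0xD8, 0x00, 0xDE, 0x00, 0x00 };
    const unsigned char bad[] = { 0x41, 0x00, 0x00, 0xDE, 0x00, 0x00 };
    char buf[16];
    assert(sketch_utf16le_strlen(ok, 0) == 3);
    assert(sketch_utf16le_strlen(ok, 1) == 2);
    assert(sketch_utf16le_strlen(bad, 1) == 2);
    assert(sketch_utf16le_strlen(bad, 2) == -1);
    assert(sketch_utf16le_to_utf8(ok, buf, sizeof buf) == 5);
    assert(memcmp(buf, "A\xF0\x9F\x98\x80", 6) == 0);
    assert(sketch_utf16le_to_utf8(bad, buf, sizeof buf) == -1);
}
```

In this sketch a char in range 0x10000..0x10ffff becomes four utf8 bytes, which is the case the commit message cites when sizing the output buffer for 1024 x 4 bytes.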

Commits on Oct 24, 2023

  1. utf8_to_utf16() - Remove dependence from iconv or libiconv

    Convert utf8 coded characters to utf16. This also fixes the output buffer
    to allow for a worst-case of 1024 utf16 chars in range 0x10000..0x10ffff.
    This is an optimized subset of a converter made for fontforge with added
    checks for bad utf-8 codes. By default, this function returns a
    zero-length string on error. Added another mode that returns NULL instead,
    since it's possible to receive zero-length strings that are not errors
    (example: empty line).
    JoesCat committed Oct 24, 2023
    44c62c8
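
The error handling described in the Oct 24 commit can be sketched as follows. This is an illustration under stated assumptions, not the function from the pull request: `decode_utf8`, `sketch_utf8_to_utf16` and the `null_on_error` flag are hypothetical names, the output is stored as host-order 16-bit code units (serializing to utf-16le bytes is left out), and the set of rejected sequences (overlong forms, encoded surrogates, values above 0x10ffff, truncated sequences) is one reasonable reading of "bad utf-8 codes".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one utf8 sequence at *pp and advance *pp.
 * Returns the code point, or -1 for a malformed sequence. */
static long decode_utf8(const unsigned char **pp) {
    const unsigned char *p = *pp;
    uint32_t c = *p++;
    uint32_t min = 0;
    int more = 0;
    if (c < 0x80) { /* single byte, nothing more to read */ }
    else if ((c & 0xE0) == 0xC0) { more = 1; min = 0x80;    c &= 0x1F; }
    else if ((c & 0xF0) == 0xE0) { more = 2; min = 0x800;   c &= 0x0F; }
    else if ((c & 0xF8) == 0xF0) { more = 3; min = 0x10000; c &= 0x07; }
    else return -1;                              /* stray continuation or bad lead byte */
    while (more-- > 0) {
        if ((*p & 0xC0) != 0x80) return -1;      /* also stops at the NUL terminator */
        c = (c << 6) | (uint32_t)(*p++ & 0x3F);
    }
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return -1;
    *pp = p;
    return (long)c;
}

/* Convert NUL-terminated utf8 to NUL-terminated utf16 code units.
 * cap is the size of out in 16-bit units. On bad input or overflow:
 * returns NULL if null_on_error is nonzero, else out holding an empty string. */
uint16_t *sketch_utf8_to_utf16(const char *in, uint16_t *out, size_t cap,
                               int null_on_error) {
    const unsigned char *p = (const unsigned char *)in;
    size_t o = 0;
    if (cap == 0) return NULL;
    while (*p) {
        long c = decode_utf8(&p);
        size_t need = c >= 0x10000 ? 2 : 1;
        if (c < 0 || o + need + 1 > cap) {
            out[0] = 0;
            return null_on_error ? NULL : out;
        }
        if (need == 2) {                         /* split into a surrogate pair */
            c -= 0x10000;
            out[o++] = (uint16_t)(0xD800 + (c >> 10));
            out[o++] = (uint16_t)(0xDC00 + (c & 0x3FF));
        } else {
            out[o++] = (uint16_t)c;
        }
    }
    out[o] = 0;
    return out;
}
```

With the flag set, a caller can tell an empty input line (non-NULL result, zero length) apart from a conversion failure (NULL), which is the distinction the commit message gives as the reason for the second mode.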