UTF8 and Unicode #25
* added UTF-8 support as its own standard implementation - http://tools.ietf.org/html/rfc3629
* allows creating a UTF-8 abstraction of byte sequences and decoding it as a 32-bit value
* added some basic unit tests

* added code value description
* updated some basic unit tests

* updated byte sequence detection
* provided new `byteSequenceLengthIndication` functionality

* split single header into header file and compilation unit
* updated expansion functionality to support UTF-8

* provided a new internal helper function to extract a UTF-8 slice of a given source file and the source positions

* added support to represent a UTF-8 byte sequence as a Unicode value and string representation

* initial support of common Unicode planes and block ranges
* added functionality to test if UTF-8 characters are inside certain block ranges
* provided proper unit tests
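The decoding step described in the commits above can be sketched as follows. This is only a minimal illustration based on RFC 3629, not the code from this PR: the name `byteSequenceLengthIndication` is taken from the commit message, but its signature here is an assumption, and `decode` is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Assumed signature: returns the sequence length announced by the leading
// byte, or 0 if the byte cannot start a UTF-8 sequence (RFC 3629).
constexpr std::size_t byteSequenceLengthIndication( const std::uint8_t lead )
{
    if( lead <= 0x7F ) return 1;                  // 0xxxxxxx
    if( lead >= 0xC2 && lead <= 0xDF ) return 2;  // 110xxxxx (0xC0/0xC1 would be overlong)
    if( lead >= 0xE0 && lead <= 0xEF ) return 3;  // 1110xxxx
    if( lead >= 0xF0 && lead <= 0xF4 ) return 4;  // 11110xxx (up to U+10FFFF)
    return 0;
}

// Hypothetical helper: decodes the first UTF-8 sequence in `bytes` into a
// 32-bit code point, or returns std::nullopt for malformed input.
std::optional< std::uint32_t > decode( const std::string_view bytes )
{
    if( bytes.empty() ) return std::nullopt;

    const auto lead = static_cast< std::uint8_t >( bytes[ 0 ] );
    const std::size_t length = byteSequenceLengthIndication( lead );
    if( length == 0 || bytes.size() < length ) return std::nullopt;
    if( length == 1 ) return std::uint32_t{ lead };

    // Keep only the payload bits of the leading byte (5, 4 or 3 bits).
    std::uint32_t code = lead & ( 0x7Fu >> length );
    for( std::size_t i = 1; i < length; ++i )
    {
        const auto next = static_cast< std::uint8_t >( bytes[ i ] );
        if( ( next & 0xC0 ) != 0x80 ) return std::nullopt;  // must be 10xxxxxx
        code = ( code << 6 ) | ( next & 0x3Fu );
    }

    // Reject overlong encodings, surrogates and values beyond U+10FFFF.
    constexpr std::uint32_t minimum[] = { 0, 0, 0x80, 0x800, 0x10000 };
    if( code < minimum[ length ] ) return std::nullopt;
    if( code > 0x10FFFF ) return std::nullopt;
    if( code >= 0xD800 && code <= 0xDFFF ) return std::nullopt;
    return code;
}
```

For example, `decode( "\xF0\x9F\x9A\x80" )` yields `0x1F680`, which lies in the Transport and Map Symbols block.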
        EXPECT_EQ( range.plane(), Plane::SUPPLEMENTARY_MULTILINGUAL );
        break;
    }
    case Block::TRANSPORT_AND_MAP_SYMBOLS:
unreachable?
Better to use multiple test cases (or a parameterized test case) instead of the for loop.
That makes the test easier to follow and makes it easier to identify the faulty block when a test case fails.
@emmanuel099 good catch. I've created an issue that addresses this comment and will fix the problem with the suggested solution in a future PR.
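One way the suggestion from this thread could look is a table of cases, one entry per block, so that a failure names the block directly. The sketch below is framework-free and purely illustrative: `BlockCase`, `planeIndex`, `blockCases` and `failingCases` are hypothetical names, not part of this PR. The block ranges are the ones published by the Unicode Standard.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical test record: one entry per Unicode block under test.
struct BlockCase
{
    const char* name;             // reported when the case fails
    std::uint32_t first;          // first code point of the block
    std::uint32_t last;           // last code point of the block
    std::uint32_t expectedPlane;  // 0 = Basic Multilingual, 1 = Supplementary Multilingual
};

// The plane of a code point is given by the bits above the lower 16.
constexpr std::uint32_t planeIndex( const std::uint32_t code )
{
    return code >> 16;
}

const std::vector< BlockCase > blockCases = {
    { "BASIC_LATIN", 0x0000, 0x007F, 0 },
    { "EMOTICONS", 0x1F600, 0x1F64F, 1 },
    { "TRANSPORT_AND_MAP_SYMBOLS", 0x1F680, 0x1F6FF, 1 },
};

// Runs every case independently and collects the names of the failing ones,
// so a faulty block is identified by name instead of by loop iteration.
std::vector< std::string > failingCases( const std::vector< BlockCase >& cases )
{
    std::vector< std::string > failed;
    for( const auto& c : cases )
    {
        if( planeIndex( c.first ) != c.expectedPlane || planeIndex( c.last ) != c.expectedPlane )
        {
            failed.emplace_back( c.name );
        }
    }
    return failed;
}
```

With GoogleTest, the same table could presumably feed a value-parameterized test via `TEST_P` and `INSTANTIATE_TEST_SUITE_P` with `::testing::ValuesIn`, which reports each block as its own test case.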
* `Unicode`, which provides an alias for `UTF8` and defines the Unicode planes, blocks, and ranges
* `Unicode` consists of helper functions to check if a `UTF8` is inside one or multiple block ranges
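A check against one or multiple block ranges, as described above, might be sketched like this. The types and names (`Range`, `isInside`, `isInsideAny`) are assumptions made for illustration and work on plain code points; the actual interface in this PR may differ.

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical inclusive code point range of a Unicode block.
struct Range
{
    std::uint32_t first;
    std::uint32_t last;
};

// Block ranges as published by the Unicode Standard.
constexpr Range BASIC_LATIN{ 0x0000, 0x007F };
constexpr Range EMOTICONS{ 0x1F600, 0x1F64F };
constexpr Range TRANSPORT_AND_MAP_SYMBOLS{ 0x1F680, 0x1F6FF };

// True if the code point lies inside the given block range.
constexpr bool isInside( const std::uint32_t code, const Range& range )
{
    return range.first <= code && code <= range.last;
}

// True if the code point lies inside at least one of the given block ranges.
inline bool isInsideAny( const std::uint32_t code, const std::initializer_list< Range > ranges )
{
    for( const auto& range : ranges )
    {
        if( isInside( code, range ) )
        {
            return true;
        }
    }
    return false;
}
```

A caller could then write `isInsideAny( code, { EMOTICONS, TRANSPORT_AND_MAP_SYMBOLS } )` to accept code points from either block.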