regcharclass.pl: Improve generated code for EBCDIC
UTF-8 has some desirable characteristics not shared by UTF-EBCDIC.  One
example is that all the continuation bytes are in a single range.
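
To illustrate the single-range property, here is a minimal sketch in C.
The function name is made up for this note and is not taken from the
generated code.  It relies only on the fact that UTF-8 continuation
bytes are exactly 0x80..0xBF, so one mask-and-compare covers them all:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * UTF-8 continuation bytes have the bit pattern 10xxxxxx, which is the
 * single contiguous range 0x80..0xBF, so one conditional suffices. */
static bool is_utf8_continuation(uint8_t b)
{
    return (b & 0xC0) == 0x80;
}
```

A set of bytes that does not form one range needs a separate test for
each piece, which is where the extra conditionals come from.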

By transforming a UTF-EBCDIC byte into I8 (similar to UTF-8), we gain
those characteristics, and may be able to save a conditional or three.

This commit adds a second pass over the bytes that are to be matched,
transforming them into I8.  If that pass results in fewer conditionals
than the traditional, native, generated code, the I8 version is used
instead.
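
The selection between the two passes can be sketched roughly as
follows.  This is not the logic in regcharclass.pl, which is Perl and
counts the conditionals it actually emits.  It is a simplified C model
that assumes each contiguous run of bytes costs one range test.  All
names are hypothetical, and fill_toy_map() is a made-up permutation
standing in for the real native-to-I8 table:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Count maximal runs of consecutive members; in this simplified model
 * each run costs one range test in the generated code. */
static size_t count_ranges(const bool member[256])
{
    size_t runs = 0;
    for (int i = 0; i < 256; i++) {
        if (member[i] && (i == 0 || !member[i - 1]))
            runs++;
    }
    return runs;
}

/* Push a byte set through a byte-to-byte mapping. */
static void map_set(const bool in[256], const uint8_t map[256],
                    bool out[256])
{
    for (int i = 0; i < 256; i++)
        out[i] = false;
    for (int i = 0; i < 256; i++) {
        if (in[i])
            out[map[i]] = true;
    }
}

/* Toy permutation (rotate right by one bit).  NOT the real
 * UTF-EBCDIC to I8 table; it only serves to exercise the comparison. */
static void fill_toy_map(uint8_t map[256])
{
    for (int i = 0; i < 256; i++)
        map[i] = (uint8_t)((i >> 1) | ((i & 1) << 7));
}

/* True if matching on the transformed bytes needs fewer range tests
 * than matching on the native bytes. */
static bool prefer_transformed(const bool native[256],
                               const uint8_t map[256])
{
    bool transformed[256];
    map_set(native, map, transformed);
    return count_ranges(transformed) < count_ranges(native);
}
```

With the toy map, the native set {0x02, 0x04, 0x06} is three separate
runs but transforms to the single run 0x01..0x03, so the transformed
form is preferred.  The set {0x10, 0x11, 0x12} is already one run and
splits into two when transformed, so the native form is kept.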

This saves quite a bit in some of the generated code, enabling the
quotemeta macro to be represented in a single part; previously it had to
be split to avoid compiler macro size limits.
khwilliamson committed Aug 1, 2021
1 parent 13d77cb commit 3de1ada
Showing 2 changed files with 240 additions and 183 deletions.
