Speed up encoding handling in streams. #601
Merged
Conversation
jtv changed the title from "Speed up stream_from decoding" to "Speed up encoding handling in streams." on Sep 23, 2022

Speeds up scanning of text in various encodings in `stream_to` and `stream_from`. (Even `stream_to` needs to be able to do that, because it escapes data for use with `COPY`.) The optimisations are:

1. Inline the glyph-scanning function in the search loop.
2. For "ASCII-safe" encodings, use the "monobyte" search loop.

The inlining optimisation works as follows. Previously the stream classes kept a pointer to a function that figures out glyph boundaries (the byte at which the next character begins in a byte string). It looks up the function specialised for the current kind of encoding: UTF-8, GBK, SJIS, etc., or "monobyte" for single-byte encodings. In libpqxx I call those functions _glyph scanners._ But this way of working is painfully slow: the stream calls that function pointer for every single character it tries to read. Here, I rewrite the loop to use a different specialised function pointer, which works at a higher level: "find any one of these special characters." That moves the inner loop inside the function, rather than having it on the outside calling in, which gives the compiler more of a chance to optimise it.

The other change is based on the fact that many encodings have just two basic kinds of characters: ASCII ones, in the 0..127 range, and non-ASCII ones, which have their high bit set to 1. In such an encoding we can never have the SJIS situation where an ASCII byte value (such as that of a backslash character) also occurs _inside_ a multibyte character. When we know we're in an encoding where that can't ever happen (and UTF-8 is one of those!) we don't need the glyph scanner for that encoding at all. We can just use the simpler "monobyte" glyph scanner, which always returns `offset + 1`.

Neither of these optimisations is particularly powerful on its own. Inlining UTF-8 scanning, for instance, will probably be a bit faster than calling through a function pointer, but it won't make a huge difference. And calling a simpler glyph scanner won't do us much good if it just means we call it three times for a three-byte character. But the two changes work well together: the monobyte scanner can be as simple as an `offset++`.

Unfortunately this _is_ an ABI-breaking change: we're replacing a function-pointer field with a pointer to a different type of function.
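A minimal sketch of the two interfaces may help. The names and signatures here are illustrative, not libpqxx's actual internals: a per-character glyph scanner called through a function pointer, versus a "char finder" that keeps the scan loop inside one specialised function.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical sketch of the old interface: a glyph scanner returns the
// offset just past the glyph that starts at `start`.  The stream calls it
// through a function pointer once per character -- hence the slowness.
using glyph_scanner = std::size_t (*)(std::string_view buf, std::size_t start);

// The "monobyte" scanner: every byte is its own glyph.
std::size_t scan_monobyte(std::string_view, std::size_t start)
{
  return start + 1;
}

// Hypothetical sketch of the new interface: a finder returns the position of
// the next occurrence of any of a fixed set of special characters.  The
// inner loop now lives inside the specialised function, where the compiler
// can see it whole and optimise it.
template<char... SPECIAL>
std::size_t find_char(std::string_view haystack, std::size_t start)
{
  for (std::size_t here{start}; here < haystack.size(); ++here)
    if (((haystack[here] == SPECIAL) or ...)) return here;
  return haystack.size();
}
```

For an ASCII-safe encoding such as UTF-8, every byte of a multibyte glyph has its high bit set, so a special ASCII byte such as a backslash can never occur mid-glyph and a finder like this can safely scan byte by byte. For an encoding like SJIS, the finder would still have to consult the encoding's glyph scanner internally to skip whole glyphs.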
jtv force-pushed the faster-stream_from branch from 828ff5b to ad5dd24 on October 1, 2022, 16:33
This is needed because future changes to array parsing will require that callers be able to specify the characters that the finder looks for.
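That requirement could be met by parameterising the finder on a runtime set of special characters instead of baking them in. A hypothetical sketch (the names are mine, not libpqxx's API):

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical: a finder whose set of special characters is supplied by the
// caller, so that e.g. array parsing can search for ',' '}' '"' while COPY
// handling searches for '\t' '\n' '\\'.
inline std::size_t find_any_of(
  std::string_view haystack, std::size_t start, std::string_view specials)
{
  auto const pos{haystack.find_first_of(specials, start)};
  return (pos == std::string_view::npos) ? haystack.size() : pos;
}
```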