Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always parse as UTF-8. #270

Merged
merged 2 commits into from Nov 19, 2018

Conversation

@dvander
Copy link
Member

dvander commented Nov 19, 2018

This is a ~13% compile time improvement on SourceMod plugins.

Mysteriously, it wasn't enough to parse three times: the compiler
also pre-scans the entire text of source/include files to determine
if each uses UTF-8. This is particularly mysterious as the result
of pre-scanning for UTF-8 is that it will allow parsing more UTF-8
characters. Why bother scanning in the first place?

One reason is to remove the byte order mark (BOM), which this patch
retains by scanning the first three bytes. Otherwise, it seems this code
can be removed. It's possible upstream had more dependent logic and we
removed it long ago.

@dvander dvander force-pushed the split-headers-p2 branch 2 times, most recently from ee26bb3 to 9caa869 Nov 19, 2018
dvander added 2 commits Nov 19, 2018
This is a ~13% compile time improvement on SourceMod plugins.

Mysteriously, it wasn't enough to parse three times: the compiler
also pre-scans the entire text of source/include files to determine
if each uses UTF-8. This is particularly mysterious as the result
of pre-scanning for UTF-8 is that it will allow parsing more UTF-8
characters. Why bother scanning in the first place?

One reason is to remove the byte order mark (BOM), which this patch
retains by scanning the first three bytes. Otherwise, it seems this code
can be removed. It's possible upstream had more dependent logic and we
removed it long ago.

This patch also fixes a long-standing bug where only include files had
BOM correctly skipped, and an additional buffer overflow when parsing
BOMs.
This argument doesn't really make any sense. Anything at the top of
source files intended to be ignored, and not a comment, would be
non-standard.
@dvander dvander force-pushed the split-headers-p2 branch from 9caa869 to 5a0bd34 Nov 19, 2018
@dvander dvander merged commit 2afae87 into master Nov 19, 2018
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@dvander dvander deleted the split-headers-p2 branch Nov 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant
You can’t perform that action at this time.