spacecheck.pl: check for non-ASCII chars, fix fallouts #17247
string(REGEX REPLACE "\\\\\n" "!π!α!" _makefile_inc_text ${_makefile_inc_text})
string(REGEX REPLACE "\\\\\n" "!^!^!" _makefile_inc_text ${_makefile_inc_text})
What is this used for? Is there a significance for the characters?
I think the idea is to use something that cannot occur in the original `Makefile.inc`, nor in the generated `Makefile.inc.cmake`. Anything fitting that would do. Using Unicode here was perhaps overkill. Perhaps `!^!^!` still is!
Another option is to strip the line continuations (backslash followed by newline) without reinserting them after the transformation, with the slight downside of making these interim files less human-readable due to the long lines (libssh2 does it like that):
string(REGEX REPLACE "\\\\\n" "" _makefile_inc_text ${_makefile_inc_text})
string(REGEX REPLACE "([a-zA-Z_][a-zA-Z0-9_]*)[\t ]*=[\t ]*([^\n]*)" "set(\\1 \\2)" _makefile_inc_text ${_makefile_inc_text})
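To make the two options above concrete, here is a minimal Python sketch of the same transformation. It is an illustration only, not the CMake code used by curl or libssh2: the function name `makefile_vars_to_cmake`, the `placeholder` parameter, and the sample input are all hypothetical, and restoring the placeholder as a plain newline is an assumption about what the original code does afterwards.

```python
import re

# Same assignment pattern as in the CMake snippet above: NAME = value
ASSIGNMENT = re.compile(r"([a-zA-Z_][a-zA-Z0-9_]*)[\t ]*=[\t ]*([^\n]*)")


def makefile_vars_to_cmake(text, placeholder=""):
    """Turn `NAME = value` lines into `set(NAME value)` lines.

    With an empty placeholder, continuations (backslash + newline) are
    simply stripped, which yields long single-line values.
    With a non-empty placeholder, the continuation is marked, the
    assignments are rewritten, and the marker is turned back into a
    newline (assumption: the marker never occurs in the input).
    """
    marked = text.replace("\\\n", placeholder)
    converted = ASSIGNMENT.sub(r"set(\1 \2)", marked)
    if placeholder:
        converted = converted.replace(placeholder, "\n")
    return converted


# Hypothetical Makefile.inc fragment with one continued line.
sample = "CSOURCES = a.c \\\n  b.c\nHHEADERS = a.h\n"

print(makefile_vars_to_cmake(sample))           # strip variant
print(makefile_vars_to_cmake(sample, "!^!^!"))  # ASCII placeholder variant
```

The placeholder variant only works if the marker cannot appear in the input, which is why any sufficiently unusual ASCII sequence is as good as a Unicode one.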
Suggested by Dan
Exclude test data files (4 of them) based on existing feature tags: `codeset-utf8` and `Unicode`.

Add a new feature `codeset-non-ascii` to mark remaining exceptions (9 files).

Follow-up to 838dc53 curl#17247
- replace ß (scharfes S) with links.
- replace § (section sign) with links.
- replace 🙏 emoji with `:pray:`. Supported by GitHub, Forgejo/Gitea and most likely GitLab.
- docs/libcurl/curl_mprintf.md: replace Unicode ± with `{+|-}`.
- docs/CIPHERS.md: URL encode Unicode in URLs.
- lib1560: use hex encoding in `räksmörgås.se`.
- unit1307: use hex encoding in `Lindmätarv`.
- drop LATIN SMALL LETTER A WITH ACUTE exception. No longer appears in tests.

This leaves the single character exception: `ö`
And file exceptions holding contributor names.

Follow-up to 9243ed5 #17329
Follow-up to 838dc53 #17247
Closes #17335
Reported-by: James Fuller
Assisted-by: Dan Fandrich
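
For readers unfamiliar with this kind of check, the following Python sketch shows the general idea of flagging non-ASCII characters with an allow-list, and of hex-encoding a string so that the source stays pure ASCII. It is not the logic of `spacecheck.pl`: the names `find_non_ascii` and `hex_escape` are hypothetical, and the `\xNN` escape style is an assumption, since the exact encoding used in lib1560 and unit1307 is not shown here.

```python
def find_non_ascii(text, allowed=frozenset()):
    """Return (line, column, char) for each non-ASCII char not in `allowed`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127 and ch not in allowed:
                hits.append((lineno, col, ch))
    return hits


def hex_escape(text):
    """Replace each non-ASCII char with \\xNN escapes of its UTF-8 bytes."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend(f"\\x{b:02x}" for b in ch.encode("utf-8"))
    return "".join(out)


# The character at line 2, column 6 is reported.
print(find_non_ascii("ok\nLindm\u00e4tarv\n"))

# An allowed character (here the single `ö` exception) is not reported.
print(find_non_ascii("sm\u00f6r", allowed={"\u00f6"}))

# The escaped form contains only ASCII characters.
print(hex_escape("r\u00e4ksm\u00f6rg\u00e5s.se"))
```

Writing the test strings with escapes keeps the checker's exception list short, at the cost of slightly less readable test sources.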