spacecheck.pl: check for non-ASCII chars, fix fallouts #17247

Closed
@vszakats wants to merge 12 commits from the nonascii branch

@vszakats commented May 3, 2025

Reported-by: James Fuller
Assisted-by: Dan Fandrich

@vszakats added the tests, CI (Continuous Integration) and script labels May 3, 2025

-  string(REGEX REPLACE "\\\\\n" "!π!α!" _makefile_inc_text ${_makefile_inc_text})
+  string(REGEX REPLACE "\\\\\n" "!^!^!" _makefile_inc_text ${_makefile_inc_text})

Contributor:

What is this used for? Is there a significance for the characters?

@vszakats (Member, Author):

I think the idea is to use something that cannot occur in the original
Makefile.inc, nor in the generated Makefile.inc.cmake. Anything
fitting that would do. Using Unicode here was perhaps overkill.
Perhaps !^!^! still is!

Another option is to strip the newlines without reinserting them after
the transformation, with the slight downside of making these interim
files less human-readable due to the long lines (libssh2 does it like that):

  string(REGEX REPLACE "\\\\\n" "" _makefile_inc_text ${_makefile_inc_text})
  string(REGEX REPLACE "([a-zA-Z_][a-zA-Z0-9_]*)[\t ]*=[\t ]*([^\n]*)" "set(\\1 \\2)" _makefile_inc_text ${_makefile_inc_text})
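
For illustration, the two approaches discussed in this reply can be sketched in Python (chosen over CMake only so the example runs stand-alone). The sample input, the function names, and the assumption that the placeholder is swapped back for a plain newline after the transformation are hypothetical and not taken from curl's build files. The assignment regex mirrors the one quoted above.

```python
import re

# Hypothetical stand-in for a Makefile.inc fragment with a backslash continuation.
MAKEFILE_INC = "CSOURCES = a.c \\\n  b.c\nHHEADERS = a.h\n"

# Same assignment pattern as the CMake regex quoted above.
ASSIGNMENT = re.compile(r"([a-zA-Z_][a-zA-Z0-9_]*)[\t ]*=[\t ]*([^\n]*)")


def transform_with_placeholder(text, placeholder="!^!^!"):
    """Hide continuations behind a placeholder, convert, then restore newlines."""
    if placeholder in text:
        # The trick only works if the placeholder cannot occur in the input.
        raise ValueError("placeholder already occurs in the input")
    text = text.replace("\\\n", placeholder)
    text = ASSIGNMENT.sub(r"set(\1 \2)", text)
    # Assumption: the placeholder is turned back into a plain newline.
    return text.replace(placeholder, "\n")


def transform_stripping(text):
    """Drop continuations entirely, leaving each assignment on one long line."""
    text = text.replace("\\\n", "")
    return ASSIGNMENT.sub(r"set(\1 \2)", text)


print(transform_with_placeholder(MAKEFILE_INC))
print(transform_stripping(MAKEFILE_INC))
```

Without either step, `[^\n]*` stops at the end of the first physical line, so the continued part of the value would fall outside the generated `set(...)`.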

@vszakats closed this in 838dc53 May 4, 2025
@vszakats deleted the nonascii branch May 4, 2025 15:27
vszakats added a commit to vszakats/curl that referenced this pull request May 12, 2025
Exclude test data files (4 of them) based on existing feature tags:
`codeset-utf8` and `Unicode`.

Add a new feature 'codeset-non-ascii' to mark remaining exceptions
(9 files).

Follow-up to 838dc53 curl#17247
vszakats added a commit that referenced this pull request May 13, 2025
Exclude test data files (4 of them) based on existing feature tags:
`codeset-utf8` and `Unicode`.

Add the new keyword `non-ascii` to mark remaining exceptions (9 files).

Follow-up to 838dc53 #17247

Closes #17329
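
As a rough illustration of a check with tag-based exclusions, here is a minimal Python sketch. It is not how spacecheck.pl is implemented: the function names are hypothetical, and matching the markers as plain substrings is an assumption made to keep the example short. Only the three marker names come from the commit message above.

```python
# Marker names taken from the commit message; matching them as raw
# substrings is an assumption, not the script's actual logic.
EXEMPT_MARKERS = (b"codeset-utf8", b"Unicode", b"non-ascii")


def is_exempt(data: bytes) -> bool:
    """Return True if the file content carries one of the exemption markers."""
    return any(marker in data for marker in EXEMPT_MARKERS)


def non_ascii_lines(data: bytes) -> list[int]:
    """Return 1-based numbers of lines containing a byte above 0x7F."""
    return [
        number
        for number, line in enumerate(data.splitlines(), 1)
        if any(byte > 0x7F for byte in line)
    ]


def check(data: bytes) -> list[int]:
    """Report offending lines, or nothing when the file is exempt."""
    return [] if is_exempt(data) else non_ascii_lines(data)


print(check(b"plain\ncaf\xc3\xa9\n"))
```

A real check would more likely parse the test file's sections than search for substrings, since a marker name could also appear in unrelated text.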
vszakats added a commit that referenced this pull request May 13, 2025
- replace ß (scharfes S) with links.
- replace § (section sign) with links.
- replace 🙏 emoji with `:pray:`.
  Supported by GitHub, Forgejo/Gitea and most likely GitLab.
- docs/libcurl/curl_mprintf.md: replace Unicode ± with `{+|-}`.
- docs/CIPHERS.md: URL encode Unicode in URLs.
- lib1560: use hex encoding in `räksmörgås.se`.
- unit1307: use hex encoding in `Lindmätarv`.
- drop LATIN SMALL LETTER A WITH ACUTE exception.
  No longer appears in tests.

This leaves a single character exception, `ö`, plus file exceptions
for files holding contributor names.

Follow-up to 9243ed5 #17329
Follow-up to 838dc53 #17247

Closes #17335
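
The hex-encoding items in the list above can be illustrated with a short Python sketch. It assumes the strings are UTF-8 encoded and that `\xNN` escapes are the target form. Both are assumptions, since the commit message does not show the resulting test sources. `hex_escape` is a hypothetical helper name.

```python
def hex_escape(text: str) -> str:
    """Replace every non-ASCII UTF-8 byte with a \\xNN escape, keep ASCII as is."""
    out = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            out.append(chr(byte))
        else:
            out.append(f"\\x{byte:02x}")
    return "".join(out)


# The input is written with \u escapes so this example itself stays ASCII-only.
print(hex_escape("r\u00e4ksm\u00f6rg\u00e5s.se"))
```

If the result is pasted into a C string literal, an escape followed directly by a hex digit character would need the literal to be split, because a C `\x` escape consumes all hex digits that follow it. None of the escapes in this example is followed by one.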