tests: fix UTF-8 detection, per-test LC_* settings, CI coverage #17988
vszakats wants to merge 38 commits into curl:master
|
Still thinking how to return to the original state that once settled this nicely: ecd1d02 23208e3 says there was error when setting Lines 3362 to 3385 in 23208e3 This code changed later slightly to allow setting blank values. Deletion required Lines 651 to 676 in 7cf8414 |
No longer necessary after a previous change made sure to strip the '100.0%' number from the result, before checking it. The dot is a regex character catching any decimal character. Follow-up to 17c18fb curl#5194 Ref: curl#2436 Cherry-picked from curl#17988
|
The odd thing happens that after stripping all the per-test |
|
I suspect what's happening is that originally (after ecd1d02), It worked because these tests also set This changed in current master via 0b70b23 #15039, moving these In effect this makes the check test the calling environment. If it has UTF-8 selected, It means that if the calling environment has UTF-8 selected, IDN tests are run, If the env has something else selected, the IDN tests are skipped because the Except for tests doing IDN which didn't require this feature, e.g. test 1560. One possible fix is selecting UTF-8 while testing this feature in |
|
After fixing the detection in runtests, the IDN jobs are failing without LC_* settings, as expected: In Alpine and Slackware by default, and in Linux jobs (I guess those with libidn2). |
|
macOS jobs fail (correctly so) after overriding LC_ALL, not just LC_CTYPE: |
|
I've found it interesting that on Ubuntu specifically, deleting https://github.com/curl/curl/actions/runs/16454137943/job/46506733830?pr=17988 Neither does setting This looks like the issue that led to: What works is setting both |
|
Ubuntu has this locale: C.utf8. Does that one work?
|
I tried Testing now with both set to I'll try |
|
A single |
|
A bunch of tests are set to require codeset-utf8 and set the locale to UTF-8,
The common theme is that no UTF-8 processing is happening in these tests. Dropping these would unlock them in envs without UTF-8 support, I'm not sure if there is a risk in some exotic POSIX environments, but |
This reverts commit bcb9361.
Doesn't look terribly interesting. It's empty on BSDs, Slackware, Alpine, Ubuntu.
macOS:
LC_ALL: 'en_US.UTF-8'
LC_CTYPE: 'en_US.UTF-8'
LC_NUMERIC: ''
Also need these warnings fixed where undefined:
Use of uninitialized value in concatenation (.) or string at ../../tests/runtests.pl line 868.
Use of uninitialized value in concatenation (.) or string at ../../tests/runtests.pl line 868.
Use of uninitialized value in concatenation (.) or string at ../../tests/runtests.pl line 868.
Syncing with autotools, and fixing the `Protocols:` verifier test. Cherry-picked from #17988
Fixing:
```
In file included from lib/vtls/vtls.c:50:
In file included from lib/vtls/../urldata.h:314:
lib/vtls/../curl_sspi.h:41:10: fatal error: 'security.h' file not found
41 | #include <security.h>
| ^~~~~~~~~~~~
1 error generated.
lib/curl_sspi.h:41:10: fatal error: 'security.h' file not found
41 | #include <security.h>
| ^~~~~~~~~~~~
1 error generated.
```
Cherry-picked from #17988
After 7cf8414 #12862, `VAR=` no longer removes the env variable, but sets it to an empty/blank value instead. To remove an env, `VAR` shall be used (without the assignment operator.)

`SSL_CERT_FILE`, `CURL_HOME`, `HOME` and `XDG_CONFIG_HOME` were added before the change above. Make tests unset these envs again, as their commit messages suggest, instead of blanking them. It does not change the outcome of the tests.

Ref: 764e4f0 #8213
Ref: e992770 #6600
Follow-up to 7cf8414 #12862
Cherry-picked from #17988
Closes #17994
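The difference between blanking and unsetting described above can be illustrated with a small sketch. This is not the runtests.pl implementation; `apply_setenv` is a hypothetical helper that only mimics the semantics stated in the commit message (`VAR=value` sets, `VAR=` blanks, bare `VAR` removes).

```python
def apply_setenv(env, line):
    """Apply one setenv-style line to a copy of env and return the copy.

    Hypothetical helper, modelled on the semantics described above:
      VAR=value  -> set VAR to value
      VAR=       -> set VAR to an empty string (still present)
      VAR        -> remove VAR from the environment
    """
    result = dict(env)
    name, sep, value = line.partition("=")
    name = name.strip()
    if sep:
        result[name] = value      # value may be the empty string
    else:
        result.pop(name, None)    # no assignment operator: unset
    return result


base = {"HOME": "/home/user", "CURL_HOME": "/tmp/curl"}
blanked = apply_setenv(base, "HOME=")
removed = apply_setenv(base, "HOME")

print("HOME" in blanked, repr(blanked["HOME"]))
print("HOME" in removed)
```

A program that only checks whether a variable is present treats the two cases differently, which is why restoring the unset behavior matches what the original commit messages intended.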
- runtests: fix `codeset-utf8` feature detection. Before this patch it detected if the calling environment had UTF-8 enabled. If not, UTF-8 tests were all skipped. After this patch, it detects if UTF-8 is supported by the calling environment regardless of what's currently enabled.
  Follow-up to 0b70b23 test: add native features to check for, to reduce need for prechecks #15039
- GHA/linux: sync `codeset-test` to also reset `LC_CTYPE` and `LC_NUMBER`. To give it more spin.
  Follow-up to c221c0e test1560: set locale/codeset with `LC_ALL` (was: `LANG`), test in CI #17938
- GHA/macos: fix to actually enable `codeset-test`. Also set `LC_ALL`, which seems necessary to trigger issues.
  Follow-up to c221c0e test1560: set locale/codeset with `LC_ALL` (was: `LANG`), test in CI #17938
- tests/data: replace `LC_CTYPE` env with `LC_ALL` in all tests requiring a locale. Also to avoid potential issues with a blank or unset `LC_ALL`, as seen earlier. And to ensure that the override works on all platforms (as tested in CI.)
  Slight downside is that this now resets the language/culture to `C`.
  Ref: b4c9982 tests: set LC_ALL in more tests #4743
  Ref: 23208e3 test165: set LC_ALL=en_US.UTF-8 too #4738
- replace `en_US.UTF-8` with `C.UTF-8` to be language/culture-agnostic.
- TEST-SUITE.md: drop `UTF-8` as a requirement for tests. Tests shall work (or at least be skipped) without UTF-8 support.
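The change in `codeset-utf8` detection (from "is UTF-8 enabled" to "is UTF-8 supported") can be sketched as follows. This is an illustration in Python rather than the Perl code in runtests.pl; the function names and the candidate locale list are assumptions made for the example, and results depend on which locales the host has installed.

```python
import locale


def _ctype_codeset_is_utf8():
    """Report whether the active LC_CTYPE locale uses the UTF-8 codeset."""
    if not hasattr(locale, "nl_langinfo"):  # not available on all platforms
        return False
    codeset = locale.nl_langinfo(locale.CODESET)
    return codeset.upper().replace("-", "") == "UTF8"


def utf8_enabled():
    """Old-style check: did the calling environment select UTF-8?"""
    saved = locale.setlocale(locale.LC_CTYPE)  # query only
    try:
        locale.setlocale(locale.LC_CTYPE, "")  # adopt LC_ALL/LC_CTYPE/LANG
        return _ctype_codeset_is_utf8()
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


def utf8_supported(candidates=("C.UTF-8", "en_US.UTF-8")):
    """New-style check: can any candidate UTF-8 locale be selected?

    The candidate names are assumptions for this sketch.
    """
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_CTYPE, name)
            except locale.Error:
                continue  # locale not installed, try the next one
            if _ctype_codeset_is_utf8():
                return True
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


print("enabled:", utf8_enabled(), "supported:", utf8_supported())
```

With the first check, a host running under `LC_ALL=C` reports no UTF-8 and the dependent tests are skipped even if a UTF-8 locale is installed. The second check ignores the caller's selection, so each test can select the locale it needs.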
Tests requiring UTF-8 locale:
165, 962, 963, 964, 965, 966, 967, 1448, 1560, 2046, 2047
Tests requiring UTF-8 locale, but passing without one anyway:
955, 956, 957, 958, 959, 960, 961, 968, 1034, 1035
Spec 1997: https://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html
Spec 2008: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html
Ref: c221c0e #17938
Ref: 7cf8414
Ref: 4c140a5
Ref: 28faaac #2436
Ref: ecd1d02
`LC_NUMBER=` env setting #17993
tests: unset some envs instead of blanking them #17994

`LC_*` values are a bit inconsistent in the rest of tests:

1. `LC_ALL=` & `LC_CTYPE=en_US.UTF-8`
2. `LC_ALL=` & `LC_NUMERIC=en_US.UTF-8`
3. `LC_ALL=en_US.UTF-8` & `LC_CTYPE=en_US.UTF-8`

Method 1 clears `LC_ALL`, to give way to `LC_CTYPE`, and to configure this without touching `LANG`. Later commits mention errors when `LC_ALL` is empty, and fill it with the `LC_CTYPE` value, also keeping `LC_CTYPE`. `LC_CTYPE` is possibly redundant (ignored) when `LC_ALL` is set. `LC_NUMERIC` in method 2 could be replaced with method 1, I think.

I wonder if all could be replaced with a single `LC_ALL=en_US.UTF-8`, as in 1560 after this patch, because the empty value doesn't always work and anything else is ignored if `LC_ALL` is set. [DONE, with `C.UTF-8`]
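The three methods can be compared against the lookup order given in the POSIX specs linked above: `LC_ALL` first, then the category variable, then `LANG`, with an empty value treated like an unset one. The sketch below models only that documented order. `effective_locale` is a hypothetical helper, and, as noted above, real platforms do not always handle an empty `LC_ALL` this way.

```python
def effective_locale(env, category):
    """Return the locale name POSIX precedence would pick for a category.

    Order: LC_ALL, then the category variable (e.g. LC_CTYPE), then LANG.
    Unset and empty variables are skipped. Returns None when nothing is
    set, in which case the implementation default applies.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return None


methods = {
    1: {"LC_ALL": "", "LC_CTYPE": "en_US.UTF-8"},
    2: {"LC_ALL": "", "LC_NUMERIC": "en_US.UTF-8"},
    3: {"LC_ALL": "en_US.UTF-8", "LC_CTYPE": "en_US.UTF-8"},
}
single = {"LC_ALL": "C.UTF-8"}

for number, env in methods.items():
    print(number, effective_locale(env, "LC_CTYPE"))
print("single", effective_locale(single, "LC_CTYPE"))
```

Under this model, methods 1 and 3 give the same `LC_CTYPE`, method 2 leaves `LC_CTYPE` at the default, and a single `LC_ALL` value covers every category without relying on how an empty `LC_ALL` is interpreted.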