Test for and fix encoding errors on Windows #25322
Comments
One problem discovered is that when we spawn a daemon process with
I tried the following:
(Not sure if this is important, but Java uses
All this was tested on Windows 10.0.18363.2274 with PowerShell 5.1.18362.2212. Most importantly, the system encoding is set to CP1252, as shown by querying
Actually, those code points stand for something completely different... 🤔 I guess it's my terminal encoding. But I tried running the child process from Java with
It doesn't matter if I use `new ProcessBuilder("cmd", "/u", "/c", "java", "-cp", ".", "-Dvar=" + value, "Main")` (with or without
Or execute Java directly: `new ProcessBuilder("java", "-cp", ".", "-Dvar=" + value, "Main")`
One option I can imagine working is to encode the parameters ourselves with some ASCII-only encoding. URI encoding comes to mind, but
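As an illustration of the "ASCII-only encoding" idea, here is a minimal sketch that percent-encodes a value before it is placed on the command line and decodes it again on the receiving side. The class `AsciiArgs` and its `encode`/`decode` helpers are hypothetical names, not Gradle code; `-Dvar` is taken from the snippets in the comment. The comment's own reservation about URI encoding is cut off in the source, so this shows only the mechanics, not a recommended fix.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper: keeps process arguments within ASCII by percent-encoding them.
class AsciiArgs {
    // Encodes the value as UTF-8 bytes and escapes every non-ASCII byte as %XX.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Reverses encode(); the child process would call this on the received value.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String value = "te\u015Dt files";
        // Parent side: the argument contains only ASCII characters.
        String argument = "-Dvar=" + encode(value);
        System.out.println(argument);
        // Child side (assumed): decode what System.getProperty("var") would return.
        System.out.println(decode(argument.substring("-Dvar=".length())));
    }
}
```

Note that `URLEncoder` implements HTML form encoding, so a space becomes `+` rather than `%20`; the round trip through `URLDecoder` restores it.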
Also note that if I try to send non-ASCII characters that are part of CP1252, like
Using
… Linux and macOS

The overarching goal here is to shake out more encoding problems throughout Gradle by running all our integration tests in a way that the tested code has to deal with non-ASCII characters in file paths. This PR takes a step towards that goal by forcing all our non-Windows integration tests to use such a path.

To keep the scope manageable, this PR does not force non-ASCII paths for Windows. That needs to be enabled in a followup PR where we can deal with Windows-specific encoding problems.

The idea is similar to how we add a space to the `build/tmp/test files` directory's name where all the test output is typically located; this time we replace the `s` in `test files` with an `ŝ` (see [U+015D](https://www.compart.com/en/unicode/U+015D)). Importantly, this character is not part of ASCII or any ISO-8859-X codepage, and cannot be represented by a single byte. (See [Wikipedia](https://en.wikipedia.org/wiki/ŝ)).

There is also an escape hatch for tests that for some reason can't support Unicode paths; these need to be tagged with `@DoesNotSupportNonAsciiPaths`. We have such offenders today:

- Checkstyle fails because of this bug: checkstyle/checkstyle#13012
- Java 6 wrapper tests fail because Java 6 barfs on non-ASCII characters in the path

Most of the problems that had to be fixed for Unixes come from the fact that `URI.toString()` does not encode non-ASCII characters, and some tools can't parse string representations of URIs with non-ASCII characters in them. So some of the `URI`s are now converted to strings using `toASCIIString()` instead. This is the canonical form of a URI, and is the intended way to go when the string form is passed to places where we can't ensure that it will be read with the right encoding (see https://www.w3.org/Addressing/URL/3_URI_Choices.html)

There are followups:

- #25316
- #25322

This PR is a followup to the daemon encoding fix in:

- #25319

Co-authored-by: Lóránt Pintér <lorant@gradle.com>
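The difference between `URI.toString()` and `URI.toASCIIString()` described in the commit message can be seen in a small sketch. The class `UriForms`, the helper `fileUri`, and the example path are hypothetical and chosen only for illustration; they are not taken from the Gradle code base.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class comparing the two string forms of a URI.
class UriForms {
    // Builds a file: URI from a raw path. The multi-argument constructor
    // quotes illegal characters such as spaces but leaves non-ASCII letters alone.
    static URI fileUri(String path) {
        try {
            return new URI("file", null, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // \u015D is the character ŝ used in the test directory name.
        URI uri = fileUri("/build/tmp/te\u015Dt files/out.txt");
        // Keeps ŝ as a literal character: file:/build/tmp/teŝt%20files/out.txt
        System.out.println(uri.toString());
        // Percent-encodes the UTF-8 bytes of ŝ: file:/build/tmp/te%C5%9Dt%20files/out.txt
        System.out.println(uri.toASCIIString());
    }
}
```

A consumer that reads the second form needs no knowledge of the producer's character encoding, since every character in it is ASCII.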
Certain specific characters in the Windows username break compilation of project (#29213)
This is a followup to #25261.
Let's enable the use of non-ASCII paths for Windows integration tests, and fix any discovered problems.