Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Windows stubs to Unicode #58

Merged
merged 9 commits into from
Mar 9, 2024

Conversation

dra27
Copy link
Contributor

@dra27 dra27 commented Mar 5, 2024

Subsumes #55. Fixes #57. Fixes ocaml/dune#10180 which then fixes ocaml/opam#5861.

Windows OCaml uses UTF-8 internally for environment variables. The spawn_windows stub was using the ANSI version of CreateProcess (CreateProcessA) which causes the environment to become corrupted if the environment contained any non-ASCII characters (i.e. any UTF-8 sequences).

Support for Windows 98 ended in July 2006, so I think we can risk using the Unicode version of CreateProcess 😉 The code mirrors that for Unix.create_process.

emillon added a commit to emillon/dune that referenced this pull request Mar 5, 2024
This updates our spawn vendored version to include janestreet/spawn#58.

Fixes ocaml#10180

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit to emillon/dune that referenced this pull request Mar 5, 2024
This updates our spawn vendored version to include janestreet/spawn#58.

Fixes ocaml#10180

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit to emillon/dune that referenced this pull request Mar 5, 2024
This updates our spawn vendored version to include janestreet/spawn#58.

Fixes ocaml#10180

Signed-off-by: Etienne Millon <me@emillon.org>
src/dune Outdated
(run ocaml %{dep:flags.ml})))

(rule
(with-stdout-to flags.ml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this might add the Flags module to the library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gah, yes indeed! Fix pushed for the original approach, but actually enabled_if is arguably "better", given that it's then a semi-pure Dune solution.

/* Must come before any other caml/ headers are included */
#define CAML_INTERNALS

#ifdef _WIN32

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of changing the dune file:

Suggested change
#ifdef _WIN32
#ifdef _WIN32
#define UNICODE
#define _UNICODE

Based on a comment by @emillon in ocaml/dune#10212 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern is that UNICODE and _UNICODE must either be defined everywhere or nowhere (OCaml headers can pull in windows.h or parts of it as well, ending up with a similar problem to that which we have already had with CAML_INTERNALS).

The other benefit of putting it in flags is that any additional C files correctly inherit it, but that's really never likely to apply here. Given that we already have the _GNU_SOURCE definition (which is where it should go - i.e. right at the top).

I don't mind, though - how strong opinions against doing it "philosophically" correctly, if more complicatedly, in the dune file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it is better to do it in the dune file.

@rgrinberg
Copy link
Collaborator

@nojb could you have a look?

Copy link
Contributor

@nojb nojb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (modulo the tweaks below)

src/spawn.ml Outdated
Buffer.contents buf
if env = [] then
"\000\000"
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you say a word about this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The environment block is supposed to be terminated with two NUL characters - if there are any bindings, that obviously in the other branch (and did in the ANSI version too). When there are no bindings, the ANSI version was passing a pointer to the OCaml string "\000" but that will have had an additional NUL terminator so it quietly worked. With the wide version, the issue was instantly revealed by one of the tests where CreateProcessW will have received two NUL bytes (i.e. a NUL wchar_t) but will have then "segfaulted" reading past the end of that buffer (CreateProcessW actually translated that to an invalid argument, which worryingly suggests that it carried on reading random memory and instead got upset that there was no = character anywhere or some such😱)

I'm not sure if it's worth a comment in the code - CreateProcess does indicate that the block should be a series of nul-terminated strings themselves terminated by another nul, it's just hazy on the case of an empty environment (not surprisingly, really).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had that same shock thought! 🙂 Just awaiting a train, and then I’ll flex some environments…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, OCaml's fine because createprocess.c speculatively adds a nul character. I was speculatively adding that character but not zeroing it. I've fixed it by not adding it, as I think it's nicer that the function in spawn.ml is entirely responsible for producing the UTF-8 version and then the C stub transcodes exactly that string.

/* Must come before any other caml/ headers are included */
#define CAML_INTERNALS

#ifdef _WIN32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it is better to do it in the dune file.

@@ -648,22 +679,22 @@ CAMLprim value spawn_windows(value v_env,
if (!dup2_and_clear_close_on_exec(v_stdin , &si.hStdInput ) ||
!dup2_and_clear_close_on_exec(v_stdout, &si.hStdOutput) ||
!dup2_and_clear_close_on_exec(v_stderr, &si.hStdError )) {
win32_maperr(GetLastError());
caml_win32_maperr(GetLastError());
close_std_handles(&si);
uerror("DuplicateHandle", Nothing);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we error out here, we will leak the strings allocated above. Maybe invert the order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh blimey, good catch. (Another) revised version incoming...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@emillon
Copy link
Contributor

emillon commented Mar 8, 2024

ok, thanks. my only concern with the dune file was that it requires changing a bit our bootstrap system in dune but that's fine with me.

Signed-off-by: David Allsopp <david.allsopp@metastack.com>
Signed-off-by: David Allsopp <david.allsopp@metastack.com>
Signed-off-by: David Allsopp <david.allsopp@metastack.com>
Signed-off-by: David Allsopp <david.allsopp@metastack.com>
Signed-off-by: David Allsopp <david.allsopp@metastack.com>
Signed-off-by: David Allsopp <david.allsopp@metastack.com>
If the duplication of the handles failed, then the memory allocated for
the strings wasn't freed. Duplicate the handles first - the only failure
for the strings is OOM, and all bets are then off.

Signed-off-by: David Allsopp <david.allsopp@metastack.com>
@dra27
Copy link
Contributor Author

dra27 commented Mar 8, 2024

While looking at it again, for some reason the check for nul characters in environment values was only being done in Env_unix so I shifted the function out of there and used it in Env_win32 as well.

Hopefully this is the last version 🙂

Copy link
Contributor

@nojb nojb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (after the latest changes). Thanks!

Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
_
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
@rgrinberg rgrinberg merged commit fa620a8 into janestreet:master Mar 9, 2024
1 check passed
emillon added a commit to emillon/dune that referenced this pull request Mar 11, 2024
This does several things:
- update our spawn vendored version to include janestreet/spawn#58
- pick the dune file
- add new build flags for bootstrapping

Fixes ocaml#10180

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit to emillon/dune that referenced this pull request Mar 11, 2024
This does several things:
- update our spawn vendored version to include janestreet/spawn#58
- pick the dune file
- add new build flags for bootstrapping

Fixes ocaml#10180

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit to emillon/dune that referenced this pull request Mar 11, 2024
This does several things:
- update our spawn vendored version to include janestreet/spawn#58
- pick the dune file
- add new build flags for bootstrapping

Fixes ocaml#10180

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit to ocaml/dune that referenced this pull request Mar 11, 2024
This does several things:
- update our spawn vendored version to include janestreet/spawn#58
- pick the dune file
- add new build flags for bootstrapping

Fixes #10180

Signed-off-by: Etienne Millon <me@emillon.org>
Leonidas-from-XIV pushed a commit to Leonidas-from-XIV/dune that referenced this pull request Mar 11, 2024
This does several things:
- update our spawn vendored version to include janestreet/spawn#58
- pick the dune file
- add new build flags for bootstrapping

Fixes ocaml#10180

Signed-off-by: Etienne Millon <me@emillon.org>
Leonidas-from-XIV added a commit to ocaml/dune that referenced this pull request Mar 11, 2024
This does several things:
- update our spawn vendored version to include janestreet/spawn#58
- pick the dune file
- add new build flags for bootstrapping

Fixes #10180

Signed-off-by: Etienne Millon <me@emillon.org>
Co-authored-by: Etienne Millon <me@emillon.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants