Unicode support for the Windows runtime: Let's do it! #1200

Merged
merged 72 commits into from Sep 18, 2017

@nojb
Contributor

nojb commented Jun 10, 2017

This PR is a follow-up to #153. See MPR#3771 for context.

The original patch (#153) was created by @ygrek and the patch here is the rebase of that made by Clément Franchini (contacted by email, not yet present in GH I think) so that it applies to trunk.

Over the years (the original patch dates from 2012) there has been a lot of interest in getting this code integrated but the considerable amount of work required has meant that each time the effort has run out of steam and the patch has been left to languish.

We (at LexiFi) are interested in getting this patch merged. Also, the consensus in #153 was that the approach taken here (wrapping Windows "wide" functions to translate to and from UTF-8) is the right one. So, let's push to get this merged!

As a first step, I integrated the tests provided by Clément into the OCaml testsuite so that they are run with make test (minus the symbolic link tests, which require fiddling with permissions).

I will be keeping this PR in sync with trunk so that it can be tested easily. Below is a list of issues that still need to be worked on. I will update the list as the discussion progresses.

  • Runtime switch rather than configure-time switch: the functionality is in place; all that is left is to discuss how/if to expose it.
  • Adapting functions dealing with environment variables (GetEnvironmentVariable, ...) and process creation (execv, CreateProcess, ...)
  • Fix the UTF-8 validation function (see comment)
  • Compile with UNICODE and _UNICODE defined (see comment)
  • Handle illegal Unicode: Windows file names are not UTF-16, but sequences of 16-bit values (so that unpaired or mismatched surrogates may appear, see comment). In particular, some file names cannot be represented in valid UTF-8. The approach taken in this patch is as follows: 1) invalid UTF-8 is never generated, and 2) there are four possible settings:
    • disabled
    • non-strict: Unicode translation between Windows (UTF-16) and OCaml (UTF-8) will silently drop illegal characters.
    • strict with fallback: if illegal characters are found when translating to UTF-16, then the argument string is considered to be encoded in the local codepage. This is the key mechanism used for backwards compatibility.
    • strict without fallback: like strict with fallback, except that there is no fallback; it simply fails if faced with illegal characters.
  • Investigate the segfaults in the testsuite: lib-bigarray-file and lib-unix
  • Adapt ocamlrun
  • Update flexlink (see alainfrisch/flexdll#34)
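
The four settings listed under "Handle illegal Unicode" can be read as a small decision table for a narrow string on its way to a Windows "wide" function. The sketch below is only an illustration of that table, not code from the patch: every name in it is hypothetical, and it assumes that "disabled" means the legacy behaviour of treating the bytes as local-codepage text.

```c
#include <assert.h>

/* Hypothetical names for the four settings described above. */
typedef enum {
  MODE_DISABLED,
  MODE_NON_STRICT,
  MODE_STRICT_FALLBACK,
  MODE_STRICT
} unicode_mode;

/* What to do with a narrow (char *) string before calling a wide API. */
typedef enum {
  POLICY_LOCAL_CODEPAGE,     /* interpret the bytes in the local codepage */
  POLICY_UTF8,               /* decode the bytes as UTF-8 */
  POLICY_UTF8_DROP_INVALID,  /* decode as UTF-8, dropping illegal sequences */
  POLICY_FAIL                /* report an error to the caller */
} narrow_policy;

/* is_valid_utf8 is the result of a separate validation pass. */
narrow_policy choose_policy(unicode_mode mode, int is_valid_utf8)
{
  switch (mode) {
  case MODE_DISABLED:        /* assumption: no Unicode translation at all */
    return POLICY_LOCAL_CODEPAGE;
  case MODE_NON_STRICT:
    return is_valid_utf8 ? POLICY_UTF8 : POLICY_UTF8_DROP_INVALID;
  case MODE_STRICT_FALLBACK: /* the backwards-compatibility path */
    return is_valid_utf8 ? POLICY_UTF8 : POLICY_LOCAL_CODEPAGE;
  case MODE_STRICT:
    return is_valid_utf8 ? POLICY_UTF8 : POLICY_FAIL;
  default:
    assert(0 && "unknown mode");
    return POLICY_FAIL;
  }
}
```

Under this reading, a program that still passes local-codepage file names keeps working in "strict with fallback" mode whenever those bytes are not valid UTF-8, which is presumably why it is described as the key to backwards compatibility.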

Any and all comments (as well as help reviewing) very much appreciated. Particularly valuable:

  • Reports from people who are able to test this code
  • Guidance from the core developers as to which points need to be addressed in order to get this to a mergeable state

Thanks!

/cc @alainfrisch @ygrek @dra27

@dbuenzli

dbuenzli Jun 10, 2017

Contributor

> was that the approach taken here (wrapping Windows "wide" functions to translate to- and from- UTF-8) is the right one. So, let's push to get this merged!

It seems this patch is still using its own, wrong, UTF-8 validation function. These comments are not addressed by this PR.

@nojb

nojb Jun 10, 2017

Contributor

Hi @dbuenzli, thanks for the reminder! I have added it to the TODO list and will be looking at that soon.

@dra27

dra27 Jun 10, 2017

Contributor

Thanks for taking this one on, @nojb! There is one further thing which should be on the TODO list, but doesn't necessarily have to be fixed in this PR, which is converting ocamlrun to be built using _UNICODE (see MSDN), i.e. adding -D_UNICODE -DUNICODE to the building of all C files. Note that this has to be done with a certain amount of care - the aim here is to switch the OCaml codebase to build correctly for "modern" Windows (i.e. Windows NT!), but at this stage we wouldn't want -D_UNICODE -DUNICODE to leak to third party C stubs which correctly compile their C files using ocamlopt.

@nojb

nojb Jun 10, 2017

Contributor

Hi @dra27, thanks for the reminder! It's been added.

@shindere

shindere Jun 10, 2017

Contributor

@dra27

dra27 Jun 10, 2017

Contributor

@shindere - thanks for confirming: I had a memory that was something you'd improved, but I didn't check!

@shindere

shindere Jun 10, 2017

Contributor

@nojb

nojb Jun 10, 2017

Contributor

@dbuenzli I switched the UTF-8 validation function to use the Windows API MultiByteToWideChar. There are some warnings in the doc (look under "Remarks") about false positives produced by this function under Windows XP, but if I understand correctly this issue is only present when checking validity of UTF-16, not UTF-8. @dra27, do you agree ?

@nojb

nojb Jun 10, 2017

Contributor

OK, it turns out I was being too optimistic. It seems that MultiByteToWideChar will, under Windows XP, incorrectly accept encoded surrogate code points (which are not valid in UTF-8). So, what to do?

  • Don't do anything.
  • Use an alternative code path for old versions of Windows, or
  • Fix/clean up the hand-written UTF-8 validator in the original code ?

Opinions welcome.
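
For the third option, a hand-written validator has to reject exactly the cases discussed here: encoded surrogates, overlong forms, truncated sequences and values above U+10FFFF. The following is a minimal sketch of such a check under those assumptions. It is not the validator from the original patch, and utf8_is_valid is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the len bytes at s are well-formed UTF-8, 0 otherwise.
   Rejects overlong forms, surrogates (U+D800..U+DFFF), truncated
   sequences and code points above U+10FFFF. */
int utf8_is_valid(const unsigned char *s, size_t len)
{
  size_t i = 0;
  assert(s != NULL || len == 0);
  while (i < len) {
    unsigned char b = s[i];
    size_t need;        /* number of continuation bytes expected */
    unsigned long cp;   /* code point being assembled */
    unsigned long min;  /* smallest code point allowed for this length */

    if (b < 0x80) { i++; continue; }                 /* ASCII */
    else if ((b & 0xE0) == 0xC0) { need = 1; cp = b & 0x1F; min = 0x80; }
    else if ((b & 0xF0) == 0xE0) { need = 2; cp = b & 0x0F; min = 0x800; }
    else if ((b & 0xF8) == 0xF0) { need = 3; cp = b & 0x07; min = 0x10000; }
    else return 0;      /* stray continuation byte or invalid lead byte */

    if (len - i - 1 < need) return 0;                /* truncated */
    for (size_t k = 1; k <= need; k++) {
      unsigned char c = s[i + k];
      if ((c & 0xC0) != 0x80) return 0;              /* not a continuation */
      cp = (cp << 6) | (c & 0x3F);
    }
    if (cp < min) return 0;                          /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;      /* encoded surrogate */
    if (cp > 0x10FFFF) return 0;                     /* beyond Unicode */
    i += need + 1;
  }
  return 1;
}
```

With this check the byte sequence ED A0 80 (an encoded U+D800) is rejected, which is the case the XP code path reportedly gets wrong.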

@nojb

nojb Jun 10, 2017

Contributor

There is a lot of interesting information on this issue here and especially in the linked MSDN article. It turns out that Windows file names are not UTF-16 after all, but just a sequence of WCHARs (16-bit quantities). This means that Windows file names can contain unpaired or mismatched surrogates. If this is the case, then there are filenames that can not be represented in "strict" UTF-8 (i.e. without surrogate characters).

See also this issue from the Rust community.
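
To make the distinction concrete: a sequence of 16-bit units is well-formed UTF-16 only if every high surrogate is immediately followed by a low surrogate and no low surrogate appears on its own. A rough, portable sketch of that test follows; uint16_t stands in for WCHAR and utf16_is_well_formed is a hypothetical name, not something from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the n 16-bit units at w form well-formed UTF-16,
   0 if an unpaired or mismatched surrogate is present. */
int utf16_is_well_formed(const uint16_t *w, size_t n)
{
  assert(w != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    uint16_t u = w[i];
    if (u >= 0xD800 && u <= 0xDBFF) {             /* high surrogate */
      if (i + 1 >= n) return 0;                   /* nothing follows it */
      if (w[i + 1] < 0xDC00 || w[i + 1] > 0xDFFF)
        return 0;                                 /* not followed by a low one */
      i++;                                        /* skip the low half */
    } else if (u >= 0xDC00 && u <= 0xDFFF) {
      return 0;                                   /* low surrogate on its own */
    }
  }
  return 1;
}
```

A file name for which this returns 0 is still a legal Windows name, but it has no strict UTF-8 representation.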

@dra27

dra27 Jun 10, 2017

Contributor

Hmm, at first glance this is making my intuition that we might need a wchar type to do this properly a reality.

Quick thoughts: don't worry about XP for now, unless it's critical to your own objectives (better to worry about the port and then we'll worry about XP later). On the UCS-2 names, my instinct is that we should not generate invalid UTF-8, but possibly raise an exception for invalid pairs (in a similar way to having a filesize which is too large).

@nojb

nojb Jun 10, 2017

Contributor

Personally, introducing any new types is one of the things I would really, really like to avoid (after all, Sys and Unix are precisely useful because they offer a uniform interface to both Linux and Windows).

In any case, agree completely with not worrying about the WinXP situation for now. Just to be clear, for later versions, the current implementation using WideCharToMultiByte and MultiByteToWideChar will only generate and decode "valid" UTF-8. This means that some "valid" Windows file names will not be representable, but we will worry about this later.

I am adding a TODO item to remind us to think about this point and marking the "Fix UTF-8 validator" as done.

@dra27

dra27 Jun 11, 2017

Contributor

@nojb - I agree about the resistance to adding types, but at the moment we don't have uniformity (because a very common kind of filename breaks the Sys and Unix interfaces) and although supporting valid UTF-8-representable filenames on Windows is a vast leap in the right direction, we still won't have uniformity if there are valid Windows filenames for which OCaml will raise an exception when asked to read them! But all that is on top of what you're doing here.

@dra27

Good progress, @nojb! This is only a brief review for now. The way memory is being allocated concerns me - however largely irrelevant it may be for small strings, it does feel daft that Unix will now copy every single string which refers to a PATH and then free it.

On the Windows side, it's a shame that strings end up being copied twice - once to get the UTF-8 form and then again via caml_copy_string. This point, though, I think may be fixed by sorting the conversion functions - WideCharToMultiByte can be called without a buffer to determine the size of the UTF-8 output. At present, the code will fail on certain strings where really it should reallocate - so you could use a heuristic for the buffer size and, on failure, call with no buffer to get the actual size and reallocate - but at this point the string will have been converted three times. Alternatively, use WideCharToMultiByte with no buffer to get the size and use caml_alloc_string to put the UTF-8 output directly into an OCaml string (there will also then be no need to spend time with memset zeroing the memory).
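
The "measure first, then allocate once" idea can be sketched portably. In the block below a single routine plays both roles: called with a null output buffer it only returns the number of bytes needed, much as WideCharToMultiByte does when given a zero-sized buffer. This is an assumption-laden illustration rather than the patch's code: the function names are hypothetical, uint16_t stands in for WCHAR, malloc stands in for caml_alloc_string, and unpaired surrogates are treated as an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Encodes the n UTF-16 units at w as UTF-8. If out is NULL nothing is
   written and only the required size is returned (the measuring pass).
   Returns (size_t)-1 on an unpaired surrogate. */
static size_t u16_to_u8(const uint16_t *w, size_t n, unsigned char *out)
{
  size_t len = 0;
  for (size_t i = 0; i < n; i++) {
    uint32_t cp = w[i];
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < n
        && w[i + 1] >= 0xDC00 && w[i + 1] <= 0xDFFF) {
      /* Combine a surrogate pair into one code point. */
      cp = 0x10000 + ((cp - 0xD800) << 10) + (w[i + 1] - 0xDC00);
      i++;
    } else if (cp >= 0xD800 && cp <= 0xDFFF) {
      return (size_t)-1;
    }
    if (cp < 0x80) {
      if (out) out[len] = (unsigned char)cp;
      len += 1;
    } else if (cp < 0x800) {
      if (out) {
        out[len]     = (unsigned char)(0xC0 | (cp >> 6));
        out[len + 1] = (unsigned char)(0x80 | (cp & 0x3F));
      }
      len += 2;
    } else if (cp < 0x10000) {
      if (out) {
        out[len]     = (unsigned char)(0xE0 | (cp >> 12));
        out[len + 1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[len + 2] = (unsigned char)(0x80 | (cp & 0x3F));
      }
      len += 3;
    } else {
      if (out) {
        out[len]     = (unsigned char)(0xF0 | (cp >> 18));
        out[len + 1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[len + 2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[len + 3] = (unsigned char)(0x80 | (cp & 0x3F));
      }
      len += 4;
    }
  }
  return len;
}

/* Measure, allocate exactly once, then convert. Returns a NUL-terminated
   malloc'ed string, or NULL on invalid input or allocation failure. */
char *u16_to_u8_alloc(const uint16_t *w, size_t n)
{
  size_t len;
  char *buf;
  assert(w != NULL || n == 0);
  len = u16_to_u8(w, n, NULL);          /* pass 1: size only */
  if (len == (size_t)-1) return NULL;
  buf = malloc(len + 1);
  if (buf == NULL) return NULL;
  u16_to_u8(w, n, (unsigned char *)buf); /* pass 2: write the bytes */
  buf[len] = '\0';
  return buf;
}
```

The input is scanned twice but the output is written once into a buffer of exactly the right size, so no intermediate copy and no zero-filling is needed.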

I was briefly concerned, but didn't look further, about the path checking in Unix - are there any implications for that on the Windows side with a UTF-8 encoded path (I can't remember what it does)?

@nojb

nojb Jun 11, 2017

Contributor

Thanks @dra27 for the review! I will be looking at the points raised. I also did a first quick reading and found a couple of places where the rebase seems to have gone bad, which will be fixed shortly.

@shayne-fletcher

shayne-fletcher Jun 11, 2017

@nojb

nojb Jun 11, 2017

Contributor

@dra27 Re Unix path checking: it checks whether the OCaml string has embedded NULLs, so should work OK with UTF-8.
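
The reason a byte-level check is enough: UTF-8 uses the byte 0x00 only to encode U+0000, and every byte of a multi-byte sequence is 0x80 or above, so a valid UTF-8 path contains a NUL byte only if it contains an actual NUL character. A minimal sketch of such a check, with a hypothetical function name (the runtime has its own helper for this):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the len bytes at s contain no NUL byte, i.e. the string
   can be handed to a C function without being silently truncated. */
int path_is_c_safe(const char *s, size_t len)
{
  assert(s != NULL);
  return memchr(s, '\0', len) == NULL;
}
```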

@nojb

nojb Jun 11, 2017

Contributor

Found another bug in the UTF-16 -> UTF-8 conversion and added wrappers for getenv and command in Sys.

@nojb

nojb Jun 11, 2017

Contributor

@dra27 Re the crt-naming issue: if I understand correctly you are suggesting to replace HAS_WINAPI_UTF16 by UNICODE and simply use the names defined in <tchar.h>.

However, my understanding is that we want to have a runtime switch for all this functionality, so that we would want to be able to explicitly refer to both versions (Unicode and ANSI), in which case I think using <tchar.h> wouldn't help us much...

@dra27

dra27 Jun 12, 2017

Contributor

@nojb - the present set-up doesn't help us with being able to use both versions at once either! I've just put some thoughts on Mantis about it.

@nojb

nojb Jun 12, 2017

Contributor

I can't reproduce the AppVeyor error locally. Any ideas ?

@dra27

dra27 Jun 12, 2017

Contributor

@avsm - I think that's transient (well, it's not - I've seen it very, very occasionally, but we can't debug it after the fact) - please could you restart the AppVeyor build?

@ygrek

ygrek Jun 13, 2017

Contributor

What will happen if environment has some not-valid unicode contents? I am not sure what wgetenv does in this case. Will it be impossible to pass arbitrary bytes with Unix.putenv?

@nojb

nojb Jun 13, 2017

Contributor

Hi @ygrek ! My understanding is that Windows actually keeps two copies of the environment (one for Unicode and one for "legacy"). These are kept synchronized in general but I think there are situations where they can become out of sync. If you use _wputenv, then I don't think you can pass arbitrary bytes.

@dra27

dra27 Jun 13, 2017

Contributor

@nojb - _wputenv is not Windows, it's MSVCRT. I haven't checked thoroughly (I don't have access to the machine I have the MSVCRT sources on), but I think that on Windows NT (i.e. everywhere) the environment block is UCS-2 and calling GetEnvironmentVariableA converts the parameter to UCS-2 and then queries the environment block using that converted key. The value returned is then (potentially lossily) converted back to ANSI for the return. MSVCRT sits on top of that process - it caches the entire environment, and does indeed maintain two copies if you call both putenv and _wputenv.

@ygrek - I think it depends on your definition of "not-valid unicode" - nothing's invalid in UCS-2 (I think?). However, your arbitrary bytes should be fine - they'll have been converted to wide characters (so every other byte will be null) and this should successfully convert those normal 16-bit code-points to UTF-8.

@nojb

nojb Jun 13, 2017

Contributor

Thanks for the clarification @dra27 !

An update on this patch. As discussed, I decided to use <tchar.h> in order to get rid of all the renaming in u8tou16.h. This means we compile with _UNICODE when we want the new functionality (this has the nice side-effect of making it a compile time error to call a MSVCRT function with a "narrow" string).

To make this work we need a tiny compatibility header (for now byterun/caml/tchar_compat.h) for those bits of code which are compiled both under Linux and Windows.
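
For readers unfamiliar with <tchar.h>: its generic names (_TCHAR, _T, _tcslen, ...) resolve to the wide or narrow variants depending on whether _UNICODE is defined, so shared code only needs fallback definitions on platforms that lack the header. The following is a guess at the general shape such a compatibility shim could take. It is not the contents of the actual tchar_compat.h, and tchar_length is a hypothetical helper added only to exercise the macros.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <tchar.h>      /* _TCHAR, _T and _tcslen follow _UNICODE */
#else
#include <string.h>
/* Fallbacks for platforms without <tchar.h>: always the narrow variants. */
typedef char _TCHAR;
#define _T(x) x
#define _tcslen strlen
#endif

/* Code written against the generic names compiles unchanged everywhere. */
size_t tchar_length(const _TCHAR *s)
{
  assert(s != NULL);
  return _tcslen(s);
}
```

With _UNICODE defined on Windows, _T("trunk") becomes L"trunk" and _tcslen becomes wcslen; elsewhere both collapse to their char counterparts.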

I also revamped the support code in u8tou16.c. Its API now consists of:

  • _TCHAR* caml_stat_strdup_u16(char *s)

    • With _UNICODE: returns UTF-16 re-encoding of s (assuming UTF-8 and falling back to ANSI if that fails)
    • Without _UNICODE: caml_stat_strdup

    In both cases the returned string should be freed with caml_stat_free.

  • caml_stat_string caml_stat_strdup_u8(_TCHAR *s)

    • With _UNICODE: returns UTF-8 re-encoding of s (a wide string)
    • Without _UNICODE: caml_stat_strdup

    In both cases the returned string should be freed with caml_stat_free.

  • _TCHAR* caml_u16_of_u8(char *s)

    • With _UNICODE: returns UTF-16 re-encoding of s (assuming UTF-8 and falling back to ANSI if that fails)
    • Without _UNICODE: identity

    Since caml_u16_of_u8 may simply return its argument (in the second case), the returned string should be "freed" with a special function caml_stat_free_u (which is a no-op when _UNICODE is not defined).

  • caml_stat_string caml_u8_of_u16(_TCHAR *s)

    Exactly as caml_u16_of_u8 but in the other direction.

  • void caml_stat_free_u(caml_stat_block s)

    • With _UNICODE: caml_stat_free
    • Without _UNICODE: no-op

  • value caml_copy_string_u16(_TCHAR *s)

    • With _UNICODE: returns an OCaml string containing the UTF-8 re-encoding of s.
    • Without _UNICODE: caml_copy_string.

The rules governing the use of these functions are simple:

  • each use of caml_stat_strdup_u{8,16} (which takes the place of a call to caml_stat_strdup in the code prior to this patch) should be paired with a use of caml_stat_free (which already exists in the code prior to this patch).
  • each use of caml_u8_of_u16 and caml_u16_of_u8 should be paired with a use of caml_stat_free_u.
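
The second rule can be illustrated with a self-contained mock of the pattern. None of this is the runtime's code: SKETCH_UNICODE is a hypothetical stand-in for _UNICODE, the sk_ functions are hypothetical stand-ins for caml_u16_of_u8 and caml_stat_free_u, and the "conversion" in the Unicode branch is an ASCII-only widening used purely as a placeholder for a real UTF-8 decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#ifdef SKETCH_UNICODE
typedef wchar_t sk_tchar;
/* Placeholder conversion: ASCII-only widening into a fresh buffer. */
static sk_tchar *sk_u16_of_u8(const char *s)
{
  size_t n = strlen(s);
  sk_tchar *w = malloc((n + 1) * sizeof *w);
  if (w == NULL) return NULL;
  for (size_t i = 0; i <= n; i++) w[i] = (unsigned char)s[i];
  return w;
}
static void sk_free_u(sk_tchar *p) { free(p); }
static size_t sk_tcslen(const sk_tchar *p) { return wcslen(p); }
#else
typedef char sk_tchar;
/* Identity: the argument itself is returned, no copy is made. */
static sk_tchar *sk_u16_of_u8(const char *s) { return (sk_tchar *)s; }
/* No-op: nothing was allocated, so nothing must be released. */
static void sk_free_u(sk_tchar *p) { (void)p; }
static size_t sk_tcslen(const sk_tchar *p) { return strlen(p); }
#endif

/* A call site looks the same in both configurations. */
size_t sk_path_units(const char *path)
{
  sk_tchar *p;
  size_t n;
  assert(path != NULL);
  p = sk_u16_of_u8(path);   /* acquire (may or may not allocate) */
  if (p == NULL) return 0;
  n = sk_tcslen(p);         /* stand-in for a call into the OS or CRT */
  sk_free_u(p);             /* release: always paired with the acquire */
  return n;
}
```

Because the release function mirrors whatever the acquire function did, the call site never needs its own #ifdef, and calling plain free on the result in the non-Unicode configuration (which would free the caller's string) is avoided by construction.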

The idea behind these changes is to make the diff as minimal and easy to reason about as possible, simplifying the review of the patch. IMHO, I think it is much more readable now.

Opinions ?

@nojb

nojb Jun 13, 2017

Contributor

To be clear, the reason to have caml_u8_of_u16, caml_u16_of_u8 and caml_stat_free_u is to avoid making an extra copy of the contents of an OCaml string when it is not necessary (i.e. under Linux or under Windows in non-Unicode mode). If we are willing to accept the extra copy in those contexts, we can get rid of these three functions.

@dra27

dra27 Sep 16, 2017

Contributor

@nojb

nojb Sep 16, 2017

Contributor

Indeed you are right, so I put everything back into a CAML_INTERNALS block and added the missing #defines to otherlibs.

@damiendoligez

Took a look at the diff and didn't see anything amiss. I'm going to trust @dra27 here.

@dra27

dra27 Sep 18, 2017

Contributor

Thank you - I expect it will be easier for everyone if we wait until #681 has been merged?

@damiendoligez damiendoligez merged commit 9fe6d0e into ocaml:trunk Sep 18, 2017

1 of 2 checks passed

continuous-integration/appveyor/pr: AppVeyor build failed
continuous-integration/travis-ci/pr: The Travis CI build passed