Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Microsoft toolchain #685

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

jonahbeckford
Copy link

@jonahbeckford jonahbeckford commented Aug 29, 2021

This PR is incomplete (not all the tests work yet) but builds with the cl.exe Microsoft compiler in a MSYS2 environment. (Specifically on the Windows distribution I announced at https://discuss.ocaml.org/t/ann-windows-friendly-ocaml-4-12-distribution-diskuv-ocaml-0-1-0/8358). I hope it is far enough along I can get a meaningful review of it.

Key changes:

  • cl.exe supports complex numbers but only minimally compliant with C99. That means an implementation that is a C struct with no ability to do arithmetic (ex. a + b). The vast majority of this PR is replacing float _Complex, etc. with compiler appropriate typedefs.
  • cl.exe generates .obj files not .o files
  • A couple places I had to use cygpath to force a Windows path rather than a Unix path. Those are the changes in the Makefiles with MSYS2 ifeq macros.

Outstanding issues:

  • I'm a bit puzzled about libffi. I can get ctypes-foreign to build against MSYS2's libffi, but MSYS2's libffi is dynamically linked against msys-2.0.dll (basically equivalent to cygwin1.dll) which is very problematic. I think, but do not know, that the following line in Makefile.rules:

    OCAMLMKLIB_EXTRA_FLAGS=-ldopt "-link -static-libgcc" # see GPR#1535

    is referring to Remove -static-libgcc link flag from mingw32 port ocaml/ocaml#1535 and probably was placed there so that libffi in particular is statically linked when built from Cygwin. If so then I can do the same (statically link libffi so OCaml code is not linked against MSYS2) or just switch entirely to a native Windows build of libffi. I'm leaning towards the native Windows build even though it is a bit more complicated because I am guessing that forcing static linking is going to cause more problems later. Any thoughts?

  • I stuck in a ctypes_ldouble_complex_div does not exist on current platform error for the Microsoft toolchain (for some bizarre reason Microsoft does not provide complex division in its C library but does in its C++ library) although it is straight-forward enough to provide a logarithm / subtract / exponentiation implementation inside ctypes. Which approach best fits with ctypes?

  • The test in tests/test-ldouble/test_ldouble.ml:

    let test_int_conversions _ =
    begin
      assert_equal max_int (LDouble.to_int
      		    (LDouble.of_int max_int))
        ~printer:string_of_int;
    
      assert_equal min_int (LDouble.to_int
      		    (LDouble.of_int min_int))
        ~printer:string_of_int;
    end

    is failing with:

    expected: 4611686018427387903 but got: -4611686018427387904
    

    I'm actually finding debugging this a bit difficult because I don't know (yet) how to debug mixed OCaml and C code. How are you doing this? (There are likely other test failures because I haven't run through the full test suite)

Thanks!

@yallop yallop requested a review from fdopen August 30, 2021 08:11
@fdopen
Copy link
Contributor

fdopen commented Aug 30, 2021

Ok, I will try to have a closer look at it later this week.

I played around with cl.exe long time ago (-> https://github.com/fdopen/ocaml-ctypes/commits/msvc ).

first issue:
Just compile libffi from source with cl.exe. The msys/mingw platform and msvc use different conventions.
Don't mix libraries/headers between those platforms, unless you know what you are doing.

Back then, I had to patch the source of libffi and compile both libffi and ctypes with special cflags.
Hopefully these problems have been solved by now.

third issue:
On mvsc64 long double is identic to double (that's not the case with mingw64).
Therefore LDouble.of_int max_int isn't a lossless operation.

@jonahbeckford
Copy link
Author

Thanks for the feedback, and triple thanks for maintaining the mingw repository!

Native Windows libffi: I just got the native Windows build of libffi working (they are terrible at documentation, but it builds without patches with the correct magic flags). For now those build instructions are here: https://gitlab.com/diskuv/diskuv-ocaml/-/blob/next/installtime/msys2/compile-native-libffi.sh . I can now test and do a comparison with your earlier Windows branch (which looks comprehensive; thanks!) ... that may take a couple days.

Complex number division: After thinking further, it doesn't make much sense to provide an alternate implementation for MSVC. If someone is doing complex division, they should know how to implement that operation with conjugates+multiplication or log+subtraction+exp. A ctypes implementation would have to be general-purpose (ex. be numerically stable across a wide domain of inputs).

Long double: I totally forgot that long doubles were same width as doubles in MSVC. Since ctypes is exposing the real C implementation, I will just need to adjust the long double test for Windows only.

@jonahbeckford
Copy link
Author

Testing and fixing complete:

  • All tests pass, and the only test that had to be skipped for MSVC is test-builtins which uses the atomic __sync_or_and_fetch that is unavailable in the MSVC runtime libraries.
  • I incorporated many things from your previous implementation. Without it I doubt this would be working; thanks!

I also added Merlin definitions for the tests so I could troubleshoot ctypes better.

There is no urgency to review it (I realize it is big) although this one package is very important to the overall MSVC OCaml ecosystem.

I'll submit this PR as a patch to the next preview release of my Windows distribution, and when this PR is ready I'll backport any changes you may want (or better yet, just use the latest ocaml-ctypes release).

@@ -16,7 +16,8 @@ let read_output program =
let channel = open_out input_filename in
output_string channel program;
close_out channel;
let output_filename = (Filename.chop_suffix input_filename ".c") ^ ".o" in
let obj_ext = if Sys.win32 then ".obj" else ".o" in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never test for Sys.win32, if you work around issues caused by msvc only.
Sys.win32 is also true for mingw and mingw64 who don't suffer from the same shortcomings and use different conventions (e.g. .o object files, not .obj).
The build system already has access to the difference. Just pass the info to the tests and configure scripts by a command line switch or environment variable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will change.

#include <stdint.h>
#include <stdbool.h>
#include <inttypes.h>
#include <caml/mlvalues.h>

#ifdef _MSC_VER
typedef _Lcomplex longdoublecomplex_t;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type names like longdoublecomplex_t are horrible to read. Use _ and prefix them with ctypes_, just follow the usual convention, see e.g: https://github.com/ocamllabs/ocaml-ctypes/blob/261fe071fad17ab323d8d2b82df2aec593e64e3f/src/ctypes/ctypes_complex_compatibility.h#L20 )

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will change.

{ union ctypes_complex_long_double_union u; u.parts[0] = re; u.parts[1] = im; return u.z; }
#undef CTYPES_USE_STRUCT_BUILDER
#endif

// Primitive arithmetic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use /* */ comments in header files. They might be included in foreign source code that need special compile flags that are incompatible with C99 conventions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. Thanks. Will change.

Makefile Outdated
ifeq ($(CCOMP_TYPE),msvc)
OBJ_SUFFIX=.obj
else
OBJ_SUFFIX=.o
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get the right extension by looking at the ext_obj key in ocamlc -config.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will change.

@jonahbeckford
Copy link
Author

Thanks @fdopen and @nojb . I've updated the PR with your requested changes.

@@ -177,12 +178,27 @@ let brew_libffi_version flags =
| { Commands.stdout } ->
String.trim stdout

let msys_env_args () =
let pcp = Sys.getenv_opt "PKG_CONFIG_PATH" in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still support OCaml 4.03. Sys.getenv_opt and Option are not available.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Will fix, and will see about installing a OCaml 4.03 compiler on Linux to regression test.

@jonahbeckford
Copy link
Author

jonahbeckford commented Sep 2, 2021

Latest PR change is "Regression fixes for Linux and 4.03".

Tested the following on Ubuntu 18.04 with OCaml 4.03.0:

$ ocaml --version
The OCaml toplevel, version 4.03.0

$ opam install ./ctypes.opam ./ctypes-foreign.opam --with-test

$ make XEN=false libffi.config ctypes-base ctypes-stubs ctypes-foreign

$ make run-examples

$ _build/ncurses.native 

$ _build/ncurses-cmd.native 

@yallop
Copy link
Owner

yallop commented Sep 2, 2021

I've enabled the CI on this PR, to make testing easier. (The macos failures are due to an unrelated opam issue, and can be ignored).

pow x negone
else
div one x
external inv : t -> t = "ctypes_ldouble_complex_cinvl"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be missing. When you write it, also include a small test so that the function is not overlooked again.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was implemented (CCC_PRIM_INV_OP in src/ctypes/ctypes_complex_compatibility.h) but still better to add tests. PR updated.

@fdopen
Copy link
Contributor

fdopen commented Sep 3, 2021

I've finally tried to compile it. The latest version of libffi seems to support msvc without any hacks.

Did you verify that ctypes_tls_callback is called?
It's not covered by the test suite and it didn't work for me when I've tested it manually (but I've used an older version of cl.exe).
My old solution is way more verbose but works (see fdopen@a2b66a1 )

@jonahbeckford
Copy link
Author

Did you verify that ctypes_tls_callback is called?
It's not covered by the test suite and it didn't work for me when I've tested it manually (but I've used an older version of cl.exe).
My old solution is way more verbose but works (see fdopen@a2b66a1 )

I'll add that as-is into the PR. Thanks.

I suppose I can just print something or look at the debugger to verify that ctypes_tls_callback is called.

I'm going to try (*) to do my last round of testing by going through https://github.com/aantron/luv's test cases, which is a heavy user of ctypes and which is well tested. One fix from this round of testing is 55c05cf

(*) I'm not going to spend too long on that though. Switching compilers from GCC to MSVC for luv may create other issues unrelated to ctypes.

@jonahbeckford
Copy link
Author

Status Update:

  • (not blocking) Ran into problems unrelated to ctypes with my downstream test candidate: Support MSVC compiler aantron/luv#126 . Haven't heard back.
  • Still pending: Verify ctypes_tls_callback (already integrated your change for it). I'm currently delayed on this small task because (unrelated to ctypes) I'm prepping a new Windows DKML release to fix some critical bugs.

I'll release this PR as-is under a highly unstable designation; doing so will make it easier to for downstream package maintainers to validate this PR works for them. After the release I'll come back to this PR and finish the last ctypes_tls_callback task.

* Close file before reading it so works on Windows

* Use cygpath to get Win32 name on MSYS2

* MSYS2's Makefile $(realpath) gave Unix absolute path which was correct
  but inappropriate to give to Microsoft's cl.exe

* Use .obj for Windows object files

* Win32 complex types and POSIX polyfills
* Rename norm to ldouble_norm

* Use typedefs for _Complex in tests
Also use /* */ comments in complex header
@jonahbeckford
Copy link
Author

I ran test-threads after removing all of its test cases except test_register_thread:

visualstudio

There is a small but important diff with ctypes_tls_callback in the PR commit titled Fix bug in ctypes_tls_callback segments.

The only part I don't feel good about is there is no TlsFree to go along with TlsAlloc. I believe the lack of TlsFree will cause TLS_OUT_OF_INDEXES and possible segfaults if dllctypes-foreign_stubs.dll is loaded/unloaded repeatedly. That should be rare but I can imagine it happening in a chain like something.exe -> plugin.dll -> dllctypes-foreign_stubs.dll when the plugin is loaded/unloaded.
But fixing that issue affects the MinGW code as well, so it doesn't belong in this too-big PR.

The testing and review are complete from my side. I am ready for your review. Thanks for the patience!

@yallop
Copy link
Owner

yallop commented Jan 27, 2022

It would be good to have this merged. @fdopen, are you able to (re)review now that the known issues are resolved?

src/ctypes-foreign/ffi_call_stubs.c Outdated Show resolved Hide resolved
Adopt suggested MisterDA's bugfix

Co-authored-by: Antonin Décimo <antonin.decimo@gmail.com>
@yallop
Copy link
Owner

yallop commented Jul 24, 2023

@jonahbeckford: would you like to close this PR? I'd like to have it merged, if possible, so that you don't need to maintain a fork, but from what you've posted elsewhere it sounds like it's abandoned.

@jonahbeckford
Copy link
Author

@jonahbeckford: would you like to close this PR? I'd like to have it merged, if possible, so that you don't need to maintain a fork, but from what you've posted elsewhere it sounds like it's abandoned.

Well, I'm not going to babysit this PR further, but it seems unwise to close it: the underlying issue still exists, it is a major issue, and it needs someone motivated to merge it with the latest code base. That someone motivated will waste a huge amount of time, and perhaps introduce bugs, if they can't find this PR. Perhaps this can get a label like stalled-waiting-for-merge-and-test?

@yallop
Copy link
Owner

yallop commented Jul 24, 2023

It'd be helpful to have someone rebase this against the latest build system changes, but I'd be much happier merging it if there were some way for me to verify both that it works now, and that it'll continue to work in the future. Ideally, we'd have GitHub Actions CI set up for the Microsoft toolchain.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants