Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove function pointer cast emulation #2019

Merged
merged 30 commits into from
Dec 9, 2021
Merged

Conversation

hoodmane
Copy link
Member

@hoodmane hoodmane commented Dec 5, 2021

This is a third attempt at the fpcast removal. The good news is that chrome is working okay here.

I guess this one works.

Closes #2016. Closes #1677. Closes #1577.

@hoodmane hoodmane changed the title fpcast debug Remove function pointer cast emulation Dec 5, 2021
This was referenced Dec 5, 2021
@hoodmane
Copy link
Member Author

hoodmane commented Dec 6, 2021

Remaining test failure replicates running on a copy of chrome 90.0.4430.72 on my local setup. Doesn't replicate on more recent versions of chrome. I am going to xfail it, we can remove the xfail when we update the chrome driver.

@ryanking13
Copy link
Member

ryanking13 commented Dec 6, 2021

Thanks for your effort! I'm pretty sure that this is going to be the largest enhancement for the next Pyodide release (even though end users might not notice).

After the merge, I think it would be awesome if you could provide some details about fpcast removal. A blog post at pyodide-blog would be wonderful, or at a next release note maybe? such as:

  • Why we needed fpcast initially
  • How we removed it
  • What we could achieve through fpcast removal
    • Performance enhancement
    • Fix max recursion errors in packages like jedi, which means we can now implement something like python language server in pyodide maybe?
    • ...

I heard that you are very busy recently teaching students, so if my request is too much burden for you, please gently ignore this :)

@hoodmane
Copy link
Member Author

hoodmane commented Dec 6, 2021

After the merge, I think it would be awesome if you could provide some details about fpcast removal. A blog post at pyodide-blog would be wonderful, or at a next release note maybe?

Yeah that sounds like a good suggestion, will try to write something about it.

I heard that you are very busy recently teaching students.

My second final is tomorrow, after that we have to grade the finals but I should have more time to work on this sort of thing =)

@hoodmane
Copy link
Member Author

hoodmane commented Dec 7, 2021

Hopefully this will also make upgrading Emscripten much easier.

Copy link
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great @hoodmane ! Also much simpler than previous attempts. A few comments are below. I think the only blocker for merging is to add more description to the 0001-call-trampolines-to-handle-fpcast-troubles.patch patch, including either describing the problem it solves or linking to a description in one of the issues/PRs. Similarly, it might be good to mention there were other attempts to fix this at the AST level that you tried.

So this fixes it for Python and all methods calls there. Are we sure that there are no inconsistent function definitions in linked libraries though (e.g. BLAS, libxml etc) which would not be affected by this fix as far as I understand, and that might crash if encountered?

It also needs a changelog entry.

In terms of run time,

benchmarks on main
 -  selenium init
    native: nan  firefox: 4.542443  chrome: 7.181595  
 -  load numpy
    native: nan  firefox: 1.455756  chrome: 4.113080  
 -  pystone
    native: 0.239676  firefox: 0.927000  firefox(w/ ib): 1.217000  chrome: 0.845815  chrome(w/ ib): 1.132980  
 -  allpairs_distances
    native: 0.008091  firefox: 0.015889  firefox(w/ ib): 0.015889  chrome: 0.015847  chrome(w/ ib): 0.015682  
 -  allpairs_distances_loops
    native: 0.136256  firefox: 0.571889  firefox(w/ ib): 0.578444  chrome: 0.543858  chrome(w/ ib): 0.544035  
 -  arc_distance
    native: 0.022029  firefox: 0.063222  firefox(w/ ib): 0.062778  chrome: 0.083156  chrome(w/ ib): 0.082425  
 -  check_mask
    native: 0.025282  firefox: 0.111000  firefox(w/ ib): 0.110778  chrome: 0.109621  chrome(w/ ib): 0.109358  
 -  create_grid
    native: 0.078175  firefox: 0.090889  firefox(w/ ib): 0.090778  chrome: 0.102725  chrome(w/ ib): 0.102730  
 -  cronbach
    native: 0.044831  firefox: 0.080111  firefox(w/ ib): 0.078889  chrome: 0.086279  chrome(w/ ib): 0.086234  
 -  diffusion
    native: 0.199524  firefox: 0.573556  firefox(w/ ib): 0.571556  chrome: 0.642048  chrome(w/ ib): 0.643933  
 -  evolve
    native: 0.141785  firefox: 0.133111  firefox(w/ ib): 0.131333  chrome: 0.144841  chrome(w/ ib): 0.145622  
 -  fdtd
    native: 0.089530  firefox: 0.536667  firefox(w/ ib): 0.556556  chrome: 0.573054  chrome(w/ ib): 0.559818  
 -  grayscott
    native: 0.344450  firefox: 0.859889  firefox(w/ ib): 0.853889  chrome: 0.965238  chrome(w/ ib): 0.966997  
 -  grouping
    native: 0.048837  firefox: 0.299667  firefox(w/ ib): 0.319333  chrome: 0.353859  chrome(w/ ib): 0.356471  
 -  growcut
    native: 0.093585  firefox: 0.554111  firefox(w/ ib): 0.556333  chrome: 0.586661  chrome(w/ ib): 0.602701  
 -  harris
    native: 0.439007  firefox: 0.173889  firefox(w/ ib): 0.172111  chrome: 0.206260  chrome(w/ ib): 0.206258  
 -  hasting
    native: 0.000147  firefox: 0.001000  firefox(w/ ib): 0.001000  chrome: 0.001056  chrome(w/ ib): 0.001059  
 -  julia
    native: 0.013445  firefox: 0.079333  firefox(w/ ib): 0.079000  chrome: 0.088201  chrome(w/ ib): 0.087032  
 -  l2norm
    native: 0.045998  firefox: 0.049000  firefox(w/ ib): 0.049111  chrome: 0.048932  chrome(w/ ib): 0.048892  
 -  large_decimal_list
    native: 0.002985  firefox: 0.016667  firefox(w/ ib): 0.016000  chrome: 0.017215  chrome(w/ ib): 0.017096  
 -  local_maxima
    native: 0.128954  firefox: 0.680333  firefox(w/ ib): 0.678222  chrome: 0.647894  chrome(w/ ib): 0.646208  
 -  log_likelihood
    native: 0.155542  firefox: 0.278000  firefox(w/ ib): 0.281778  chrome: 0.435124  chrome(w/ ib): 0.436340  
 -  lstsqr
    native: 0.247889  firefox: 0.150000  firefox(w/ ib): 0.145778  chrome: 0.148919  chrome(w/ ib): 0.148579  
 -  mandel
    native: 0.237200  firefox: 1.155667  firefox(w/ ib): 1.155778  chrome: 1.257610  chrome(w/ ib): 1.255743  
 -  multiple_sum
    native: 0.087573  firefox: 0.240889  firefox(w/ ib): 0.240667  chrome: 0.255736  chrome(w/ ib): 0.256216  
 -  pairwise_loop
    native: 0.079682  firefox: 0.496667  firefox(w/ ib): 0.494333  chrome: 0.530996  chrome(w/ ib): 0.532673  
 -  periodic_dist
    native: 0.076328  firefox: 0.099556  firefox(w/ ib): 0.099333  chrome: 0.095617  chrome(w/ ib): 0.095839  
 -  repeating
    native: 0.075939  firefox: 0.225556  firefox(w/ ib): 0.223444  chrome: 0.252906  chrome(w/ ib): 0.252736  
 -  reverse_cumsum
    native: 0.108594  firefox: 0.108111  firefox(w/ ib): 0.108222  chrome: 0.108118  chrome(w/ ib): 0.108563  
 -  rosen
    native: 0.402768  firefox: 0.318222  firefox(w/ ib): 0.316222  chrome: 0.346475  chrome(w/ ib): 0.352102  
 -  slowparts
    native: 0.107006  firefox: 0.476444  firefox(w/ ib): 0.476000  chrome: 0.515248  chrome(w/ ib): 0.513597  
 -  smoothing
    native: 0.000019  firefox: 0.000000  firefox(w/ ib): 0.000000  chrome: 0.000093  chrome(w/ ib): 0.000094  
 -  specialconvolve
    native: 0.271563  firefox: 0.191556  firefox(w/ ib): 0.190444  chrome: 0.198698  chrome(w/ ib): 0.198928  
 -  vibr_energy
    native: 0.068417  firefox: 0.171111  firefox(w/ ib): 0.166000  chrome: 0.242397  chrome(w/ ib): 0.244153  
 -  wave
    native: 0.028718  firefox: 0.127333  firefox(w/ ib): 0.127222  chrome: 0.130529  chrome(w/ ib): 0.131443 
benchmarks for this PR
-  selenium init
    native: nan  firefox: 3.657807  chrome: 6.788297  
 -  load numpy
    native: nan  firefox: 0.959720  chrome: 3.891527  
 -  pystone
    native: 0.230358  firefox: 0.757000  firefox(w/ ib): 0.915000  chrome: 0.653320  chrome(w/ ib): 0.787990  
 -  allpairs_distances
    native: 0.008153  firefox: 0.011222  firefox(w/ ib): 0.011556  chrome: 0.011851  chrome(w/ ib): 0.011669  
 -  allpairs_distances_loops
    native: 0.129486  firefox: 0.474778  firefox(w/ ib): 0.475778  chrome: 0.451730  chrome(w/ ib): 0.452252  
 -  arc_distance
    native: 0.021950  firefox: 0.045556  firefox(w/ ib): 0.045556  chrome: 0.064467  chrome(w/ ib): 0.064201  
 -  check_mask
    native: 0.026212  firefox: 0.093778  firefox(w/ ib): 0.094111  chrome: 0.093642  chrome(w/ ib): 0.094030  
 -  create_grid
    native: 0.105310  firefox: 0.128889  firefox(w/ ib): 0.124222  chrome: 0.128520  chrome(w/ ib): 0.125788  
 -  cronbach
    native: 0.049270  firefox: 0.081556  firefox(w/ ib): 0.081889  chrome: 0.087778  chrome(w/ ib): 0.085952  
 -  diffusion
    native: 0.206260  firefox: 0.556889  firefox(w/ ib): 0.550667  chrome: 0.587371  chrome(w/ ib): 0.584415  
 -  evolve
    native: 0.171362  firefox: 0.138333  firefox(w/ ib): 0.135778  chrome: 0.149374  chrome(w/ ib): 0.148715  
 -  fdtd
    native: 0.088260  firefox: 0.403556  firefox(w/ ib): 0.403667  chrome: 0.423388  chrome(w/ ib): 0.421635  
 -  grayscott
    native: 0.344273  firefox: 0.760222  firefox(w/ ib): 0.761000  chrome: 0.877438  chrome(w/ ib): 0.881975  
 -  grouping
    native: 0.052230  firefox: 0.108778  firefox(w/ ib): 0.109333  chrome: 0.106071  chrome(w/ ib): 0.104576  
 -  growcut
    native: 0.093251  firefox: 0.392778  firefox(w/ ib): 0.388556  chrome: 0.438702  chrome(w/ ib): 0.438820  
 -  harris
    native: 0.540719  firefox: 0.198889  firefox(w/ ib): 0.196333  chrome: 0.238007  chrome(w/ ib): 0.240536  
 -  hasting
    native: 0.000153  firefox: 0.000778  firefox(w/ ib): 0.000667  chrome: 0.000747  chrome(w/ ib): 0.000753  
 -  julia
    native: 0.014639  firefox: 0.059889  firefox(w/ ib): 0.060889  chrome: 0.064950  chrome(w/ ib): 0.064777  
 -  l2norm
    native: 0.046940  firefox: 0.049556  firefox(w/ ib): 0.049333  chrome: 0.050471  chrome(w/ ib): 0.049234  
 -  large_decimal_list
    native: 0.003579  firefox: 0.010444  firefox(w/ ib): 0.010778  chrome: 0.014561  chrome(w/ ib): 0.010347  
 -  local_maxima
    native: 0.131609  firefox: 0.545667  firefox(w/ ib): 0.543444  chrome: 0.505154  chrome(w/ ib): 0.509972  
 -  log_likelihood
    native: 0.159741  firefox: 0.286111  firefox(w/ ib): 0.286222  chrome: 0.439895  chrome(w/ ib): 0.446055  
 -  lstsqr
    native: 0.304211  firefox: 0.191778  firefox(w/ ib): 0.188556  chrome: 0.189114  chrome(w/ ib): 0.191963  
 -  mandel
    native: 0.234662  firefox: 0.878556  firefox(w/ ib): 0.726222  chrome: 0.778082  chrome(w/ ib): 0.757526  
 -  multiple_sum
    native: 0.087795  firefox: 0.209222  firefox(w/ ib): 0.208333  chrome: 0.223281  chrome(w/ ib): 0.223394  
 -  pairwise_loop
    native: 0.079285  firefox: 0.354444  firefox(w/ ib): 0.349778  chrome: 0.392214  chrome(w/ ib): 0.390513  
 -  periodic_dist
    native: 0.083134  firefox: 0.098111  firefox(w/ ib): 0.098333  chrome: 0.095462  chrome(w/ ib): 0.095239  
 -  repeating
    native: 0.081325  firefox: 0.223667  firefox(w/ ib): 0.223778  chrome: 0.271918  chrome(w/ ib): 0.259858  
 -  reverse_cumsum
    native: 0.112210  firefox: 0.113444  firefox(w/ ib): 0.113333  chrome: 0.112759  chrome(w/ ib): 0.114435  
 -  rosen
    native: 0.535965  firefox: 0.384444  firefox(w/ ib): 0.377000  chrome: 0.422459  chrome(w/ ib): 0.429678  
 -  slowparts
    native: 0.102569  firefox: 0.368111  firefox(w/ ib): 0.366222  chrome: 0.411823  chrome(w/ ib): 0.412171  
 -  smoothing
    native: 0.000022  firefox: 0.000000  firefox(w/ ib): 0.000000  chrome: 0.000063  chrome(w/ ib): 0.000065  
 -  specialconvolve
    native: 0.371148  firefox: 0.223889  firefox(w/ ib): 0.239889  chrome: 0.238419  chrome(w/ ib): 0.240161  
 -  vibr_energy
    native: 0.073466  firefox: 0.173222  firefox(w/ ib): 0.175333  chrome: 0.261106  chrome(w/ ib): 0.256908  
 -  wave
    native: 0.027753  firefox: 0.103556  firefox(w/ ib): 0.102222  chrome: 0.107931  chrome(w/ ib): 0.106609

it does look like there is on average 8 to 20% speed up both to load pyodide and for individual benchmarks. Overall tests are also faster by a comparable amount. Benchmarks are a bit noisy but it might still be interesting to investigate in detail the impact of this change on individual benchmarks (the json with results is stored in each benchmark CI job).

emsdk/Makefile Show resolved Hide resolved
emsdk/Makefile Show resolved Hide resolved
packages/cffi_example/test_cffi_example.py Outdated Show resolved Hide resolved
@rth rth added this to the 0.19.0 milestone Dec 7, 2021
@hoodmane
Copy link
Member Author

hoodmane commented Dec 7, 2021

Also interesting to benchmark stack size. Running

sys.setrecursionlimit(10_000)
def f(n): print(n); f(n+1)

on various versions gives the following series:

[
    ["x",      "0.16", "0.17", "0.18", "fpcast-removed"],
    ["chrome",    228,    279,   1659, 1946],
    ["firefox",   634,    920,   1991, 4358]

]

Note the significant improvements in stack usage since 0.18 despite the fact that 0.18 had the avoid-fpcalls-in-CALL-FUNCTION which was exactly targeted at improving this particular benchmark by removing all the function pointer casts in Python --> Python calls.

@hoodmane
Copy link
Member Author

hoodmane commented Dec 7, 2021

Are we sure that there are no inconsistent function definitions in linked libraries though (e.g. BLAS, libxml etc) which would not be affected by this fix as far as I understand, and that might crash if encountered?

Don't know, if so we have no test coverage for it. My thought is that we merge this and when someone hits a code path where there is a problem, we can update the tests to cover it and fix. I think it's unlikely that we'll see too much of that though, except in the spots where fortran is linked. In normal C it's not typical to do this function pointer casting stuff, it's just that the unfortunate design of the PyMethodDef and PyGetSetDef structs force casts and the functions are required to take rarely needed or never needed arguments.

Actually, though, having done this I am pretty sure that it's possible to make huge improvements to Emscripten's approach to fpcast emulation.

  1. To start with, you should create a static table of how many arguments each function pointer actually expects. Most of the time functions are being called with the right number of arguments, so before making an indirect call check if this is the case and if so call it like a normal function pointer. This will yield significant speed and stack space improvements.
  2. To improve code size, don't make one byn$fpcast-emu trampoline per function, we only need one per signature. Really we only need one per WASM signature, which is much better -- int f(int, int) is the same as PyObject *f(PyObject*, PyObject*).
  3. The arguments to the trampoline should be taken as an array of int64 which will get passed on the Emscripten stack rather than on the C stack. This might be slower? But it would certainly be better for stack space.

So overall the trampoline argument spec should be:

sig_ret_type trampoline(function_ptr, int num_args, int64 *args);

Using this better implemented function pointer cast emulation would be slower than not using it like what we do in this PR but my suspicion is we would already see the majority of the benefits of removal by improving the general fpcast mechanism.

@hoodmane
Copy link
Member Author

hoodmane commented Dec 8, 2021

I am pretty sure that it's possible to make huge improvements to Emscripten's approach to fpcast emulation.

Okay, I think this design I was thinking of would be an improvement. The problem is, it would require an llvm pass. The information about when a function pointer is taken is destroyed by the time binaryen starts working. We run into the same problem as with clang plugins: to add an llvm pass would require a much more complicated build process.

I think the fpcast emulation they do in Binaryen is the best possible thing you can do if you are only willing to intervene by modifying the wasm after llvm is completely done. But it's really much too late a stage, and that's why the performance is atrocious. If you could do it earlier, the performance, stack space, and code size impact could be reduced a very large amount.

@hoodmane
Copy link
Member Author

hoodmane commented Dec 8, 2021

Okay well, we can improve the speed of these calls by going into Javascript once to look up how many arguments the function pointer takes, and then strategically stowing that information somewhere, then when we are about to call the function pointer, look up how many arguments it actually takes in the spot where we stowed that information and call it with the right number of arguments. This will perform better than these Javascript trampolines.

@rth
Copy link
Member

rth commented Dec 9, 2021

If there are no measurable benefits, how about we merge the previous one since it's simpler (and therefore has fewer ways things could fail) and keep this one in a separate closed PR for future reference if needed?

Thanks for exploring all these possibilities, it's a lot of work!

@hoodmane
Copy link
Member Author

hoodmane commented Dec 9, 2021

Yeah. My intention is to eventually use a custom llvm pass that fixes up all of our function pointer cast troubles. That will use a technique similar to this one. My idea is that we would have a separate Docker image to build our llvm pass and binaryen and then preinstall emscripten into the main Pyodide image directly from the llvm docker image. Then all of the compiler toolchain build and testing can happen in a separate docker image in a separate repo.

@hoodmane
Copy link
Member Author

hoodmane commented Dec 9, 2021

Okay, this is ready. I will merge it assuming the CI passes.

@hoodmane hoodmane merged commit 19261f3 into pyodide:main Dec 9, 2021
@hoodmane hoodmane deleted the fpcast-debug branch December 9, 2021 18:31
@hoodmane
Copy link
Member Author

hoodmane commented Dec 9, 2021

Oops, I guess the "merge when CI passes" option of refined github is not to be trusted...

@hoodmane
Copy link
Member Author

hoodmane commented Dec 9, 2021

Looks like tests are passing on main, but I forgot to remove debug symbols.

hoodmane pushed a commit to hoodmane/pyodide that referenced this pull request Dec 9, 2021
hoodmane pushed a commit that referenced this pull request Dec 10, 2021
hoodmane added a commit that referenced this pull request Jun 2, 2022
This enables WASM_BIGINT while maintaining (hypothetical) Safari 14 support
by shimming BigInt64Array and BigUint64Array if they are missing. I think the
last time we tried to enable WASM_BIGINT was before #2019 so our chances
are significantly better this time.

This will fix dynamic linking bugs and yields a minor reduction in code size.
kleisauke added a commit to kleisauke/libffi-emscripten that referenced this pull request Jun 26, 2022
This reverts commit 593b402 and cbc54da, as it's no longer needed
after PR pyodide/pyodide#2019.
hoodmane pushed a commit to hoodmane/libffi-emscripten that referenced this pull request Sep 20, 2022
This reverts commit 593b402 and cbc54da, as it's no longer needed
after PR pyodide/pyodide#2019.
atgreen pushed a commit to libffi/libffi that referenced this pull request Feb 2, 2023
* added build script

* Apply libffi-emscripten patch

* Some changes to wasm32/ffi.c

* Remove exit(0); from test suites

* Fix LONGDOUBLE argument type

* Use more macros in ffi.c

* Use switch statements instead of if chains

* Implemented struct args

* Finish struct implementation

* Partially working closures

* Got closures working (most of closures test suite passes)

* Revert changes to test suite

* Update .gitignore

* Apply code formatter

* Use stackSave and stackRestore rather than directly adjusting stack pointer

* Add missing break

* Fix visibility of ffi_closure_alloc and ffi_closure_free

* Fix FFI_TYPE_STRUCT and FFI_TYPE_LONGDOUBLE when WASM_BIGINT is not used
sig needs to be vi here for FFI_TYPE_STRUCT and FFI_TYPE_LONGDOUBLE, noticed this while running the test suite without WASM_BIGINT support.

* Always use dynCall rather than direct wasmTable lookup (function pointer cast emulation changes dynCall)

* Prevent closures.c from duplicating symbols

* Try to set up CI

* Add test with bigint

* Make test methods static

* Remove BigInt shorthand because it messes up terser

* Add selenium tests

* Update tests a bit to try to make CI work

* WASM_BIGINT is a linker flag not a compile flag

* Finish getting CI working (#1)

* update gitignore

* Avoid adding "use strict;" to generated JS

This should be controlled by -s STRICT_JS in Emscripten.

* Make JavaScript ES5 compliant

* Remove redundant EXPORTED_RUNTIME_METHODS settings

* Fix definition of DEREF_I16

* Avoid marshalling FFI_TYPE_LONGDOUBLE when WASM_BIGINT is not used

* Add missing FFI_TYPE_STRUCT signature

* Improve test scripts

* Remove redundant EXPORTED_RUNTIME_METHODS settings

* Add missing EOL

* Add struct unpacking tests

* Update ci config to try to actually use WASM_BIGINT

* Revert "Avoid marshalling FFI_TYPE_LONGDOUBLE when WASM_BIGINT is not used"

This reverts commit 61bd5a3.

* Fix single_entry_structs tests

* Fix return from closure call

* Fix 64 bit return from closures

* only allocate as much space on stack for return pointer as needed

* Revert "only allocate as much space on stack for return pointer as needed"

This reverts commit e54a30f.

* xfail two tests

* Fix err_bad_abi test

* Remove test logging junk

* Try to set up long double marshalling for closures

* xfail err_bad_abi

* Fix reference errors in previous commit

* Add missing argument pointer assignment

* Fix signature of function pointer in cls_dbls_struct

* Fix longdouble argument

* Try some changes to bigint handling

* Fix BigInt handling

* Fix cls_longdouble test

* Fix long double closure arg with no WASM_BIGINT

* Use EM_JS to factor out js helpers

* Support for varargs closure calls

* Fix varargs calls

* Fix err_bad_abi test

* Fix typo in previous commit

* Add more assertions to closures test suite

* Fix some asserts

* Add assertions to a few more tests

* Fix some tests

* Fix more floating point assertions

* Update more tests

* Var args for ffi_call

* Don't do node tests

* Macro for allocating on stack

* Add some comments, simplify struct handling

* Try again to fix varargs calls, add comments

* Consolidate WASM_BIGINT conditionals into LOAD_U64 and STORE_U64 macros

* A bit of cleanup

* Fix another typo

* Some fixes to the testsuite

* Another testsuite fix

* Fix varags with closures?

* Another attempt at getting closure varargs to work

* sig is initialized later

* Allow libffi.closures tests to be run

* Improve build script

* Remove redundant semicolons

* Fix a few libffi.closures test failures

* Cleanup

* Legacy dynCall API is no longer used

* Fix FFI_TYPE_LONGDOUBLE offset

* xfail 2 tests for WASM

- closure_loc_fn0; not applicable -- codeloc doesn't point to closure.
- huge_struct; function signature too long.

* Revert some redundant dg-output/printf statements

Helps Node.

* Revert "Don't do node tests"

This reverts commit a341ef4.

* Fix assertions in cls_24byte

* More tiny formating fixes to test suite

* Revert "Revert "Don't do node tests""

This reverts commit 7722e68.

* Fix 64 bit returns when WASM_BIGINT is absent

* Fix print statement in cls_24byte

* Add CALL_FUNC_PTR macro to allow pyodide to define custom calling behavior to handle fpcast

* Update single_entry_structs tests

* More explanations

* Fix compile error in last commit

* Add more support for pyodide fpcast emulation, update CI to try to test it

* Clone via https

* Fix path to pyodide emsdk_env

* Add asserts to the rest of the test suite

* Fix test compile errors

* Fix some tests

* Fix cls_ulonglong

* Fix alignment of <4 byte args

* fix cls_ulonglong again

* Use snprintf instead of sprintf

* Should assert than strncmp returned 0

* Fix va_struct1 and va_struct3

* Change double and long double tests

These tests are failing because of a strange bug with prinft and doubles, but I am not convinced
it necessarily has anything to do with libffi. This version casts the double to int before printing it and avoids the issue

* Enable node tests

* Revert "Change double and long double tests"

This reverts commit 8f3ff89.

* Fix PYODIDE_FPCAST flag

* add conftest.py back in

* Fix emcc error: setting `EXPORTED_FUNCTIONS` expects `<class 'list'>` but got `<class 'str'>`

See discussion on pyodide/pyodide#1596

* Remove test.html

* Remove duplicate test file

* More changes from upstream

* Fix some whitespace

* Add some basic debug logging statements

* Reapply libffi.exp changes

* Don't build docs (#7)

Works around build issue makeinfo: command not found.

* Update long double alignment

Emscripten 2.0.26 reduces the aligmnet of long double to 8. Quoting
from `ChangeLog.md`:

> The alignment of `long double`, which is a 128-bit floating-point
> value implemented in software, is reduced from 16 to 8. The lower
> alignment allows `max_align_t` to properly match the alignment we
> use for malloc, which is 8 (raising malloc's alignment to achieve
> correctness the other way would come with a performance regression).
> (#10072)

* Update long double alignment

Emscripten 2.0.26 reduces the aligmnet of long double to 8. Quoting
from `ChangeLog.md`:

> The alignment of `long double`, which is a 128-bit floating-point
> value implemented in software, is reduced from 16 to 8. The lower
> alignment allows `max_align_t` to properly match the alignment we
> use for malloc, which is 8 (raising malloc's alignment to achieve
> correctness the other way would come with a performance regression).
> (#10072)

* Improve error handling a bit (#8)

* Fix handling of signed arguments to ffi_call (#11)

* Fix struct argument handling in ffi_call (#10)

* Remove fpcast emulation tests

* Align the stack to MAX_ALIGN before making call (#12)

* Increase MAX_ARGS

* Cleanup (#14)

* Fix Closure compiler error with -sASSERTIONS=1 (#15)

* Remove function pointer cast emulation (#13)

This reverts commit 593b402 and cbc54da, as it's no longer needed
after PR pyodide/pyodide#2019.

* Prefer the `__EMSCRIPTEN__` definition over `EMSCRIPTEN` (#18)

"The preprocessor define EMSCRIPTEN is deprecated. Don't pass it to code
in strict mode. Code should use the define __EMSCRIPTEN__ instead."
https://github.com/emscripten-core/emscripten/blob/84a634167a1cd9e8c47d37a559688153a4ceace6/emcc.py#L887-L890

* Install autoconf 2.71

* Try again with installing autoconf 2.71

* Fix compatibility with Emscripten 3.1.28

* CI: remove use of `EM_CONFIG` env

See commit:
emscripten-core/emsdk@3d87d5e

* Fix cls_multi_schar: cast rest_call to signed char

* Remove test xfails (#17)

* Fix long double when used as a varargs argument

* Enable unwindtest and fix it

* Add EM_JS_DEPS

* Also require convertJsFunctionToWasm

* Run tests very very verbose

* Echo the .emscripten file

* Remove --experimental-wasm-bigint insertion

* Build with assertions

* Move verbosity flags back out of LDFLAGS

* Remove debug print statement

* Use up to date pyodide docker image

* Explicitly cast res_call to fix test failure

* Put back name of main function in cls_longdouble_va.c

* Fix alignment

The stack pointer apparently needs to be aligned to 16. There were
some terrible subtle bugs caused by not respecting this. stackAlloc
knows that the stack should be 16 aligned, so we can use stackAlloc(0)
to enforce this. This way if alignment requirements change, as long
as Emscripten updates stackAlloc to continue to enforce them we should
be okay.

* Fix handling of systems with no Js bigint integration

When we run the node tests we use node v14 tests (since node v14 is
vendored with Emscripten). Node v14 has no Js bigint integration
unless the --experimental-wasm-bigint flag is passed. So only the
node tests really notice if we get this right. Turns out, it didn't
work. We can't call a JavaScript function with 64 bit integer arguments
without bigint integration.

In ffi_call, we are trying to call a wasm function that takes 64 bit
integer arguments. dynCall is designed to do this. We need to go back
to tracking the signature when we don't have WASM_BIGINT, and then use
dynCall. This works better now that emscripten can dynamically fill in
extra dynCall wrappers:
emscripten-core/emscripten#17328

On the other hand, for the closures we are not getting a function pointer
as a first argument. We need to make our own wasm legalizer adaptor that
splits 64 bit integer arguments and then calls the JavaScript trampoline,
then the JavaScript trampoline reassembles them, calls the closure, then
splits the result (if it's a 64 bit integer) and the adaptor puts it back
together.

* Improvements to emscripten test shell scripts (#21)

This fixes the C++ unwinding tests and makes other minor improvements
to the Emscripten test shell scripts.

* Rename the test folder and move test files into emscripten test folder

* Use docker image that has autoconf-2.71

* Cleanup

* Pin emscripten 3.1.30

* Fix build.sh path

* Rearrange ci pipeline

* Fix bpo_38748 test

* Cleanup

* Improvements to comments, add static asserts, and update copyright

* Use `*_js` instead of `*_helper` for EM_JS functions (#22)

* Minor code simplification

* Xfail first dejagnu test to work around emscripten cache messages

See emscripten-core/emscripten#18607

* Remove unneeded xfails

* Shorten conftest.py by using pytest-pyodide

* Apply formatters and linters to emscripten directory

* Fix Emscripten xfail hack

* Fix build-tests script

* Patch emscripten to quiet info messages

* Clean up compiler flags in scripts and remove some settings from circleci config

* Rename emscripten quiet script

* Add missing export

* Don't remove go.exp

* Add reference to emscripten logging issue

---------

Co-authored-by: Kleis Auke Wolthuizen <info@kleisauke.nl>
Co-authored-by: Kleis Auke Wolthuizen <github@kleisauke.nl>
Co-authored-by: Christian Heimes <christian@python.org>
Assioftware added a commit to Assioftware/libffi that referenced this pull request Feb 5, 2023
* added build script

* Apply libffi-emscripten patch

* Some changes to wasm32/ffi.c

* Remove exit(0); from test suites

* Fix LONGDOUBLE argument type

* Use more macros in ffi.c

* Use switch statements instead of if chains

* Implemented struct args

* Finish struct implementation

* Partially working closures

* Got closures working (most of closures test suite passes)

* Revert changes to test suite

* Update .gitignore

* Apply code formatter

* Use stackSave and stackRestore rather than directly adjusting stack pointer

* Add missing break

* Fix visibility of ffi_closure_alloc and ffi_closure_free

* Fix FFI_TYPE_STRUCT and FFI_TYPE_LONGDOUBLE when WASM_BIGINT is not used
sig needs to be vi here for FFI_TYPE_STRUCT and FFI_TYPE_LONGDOUBLE, noticed this while running the test suite without WASM_BIGINT support.

* Always use dynCall rather than direct wasmTable lookup (function pointer cast emulation changes dynCall)

* Prevent closures.c from duplicating symbols

* Try to set up CI

* Add test with bigint

* Make test methods static

* Remove BigInt shorthand because it messes up terser

* Add selenium tests

* Update tests a bit to try to make CI work

* WASM_BIGINT is a linker flag not a compile flag

* Finish getting CI working (#1)

* update gitignore

* Avoid adding "use strict;" to generated JS

This should be controlled by -s STRICT_JS in Emscripten.

* Make JavaScript ES5 compliant

* Remove redundant EXPORTED_RUNTIME_METHODS settings

* Fix definition of DEREF_I16

* Avoid marshalling FFI_TYPE_LONGDOUBLE when WASM_BIGINT is not used

* Add missing FFI_TYPE_STRUCT signature

* Improve test scripts

* Remove redundant EXPORTED_RUNTIME_METHODS settings

* Add missing EOL

* Add struct unpacking tests

* Update ci config to try to actually use WASM_BIGINT

* Revert "Avoid marshalling FFI_TYPE_LONGDOUBLE when WASM_BIGINT is not used"

This reverts commit 61bd5a3e20891623715604581b6e872ab3dfab80.

* Fix single_entry_structs tests

* Fix return from closure call

* Fix 64 bit return from closures

* only allocate as much space on stack for return pointer as needed

* Revert "only allocate as much space on stack for return pointer as needed"

This reverts commit e54a30faea3803e7ac33eed191bde9e573850fc1.

* xfail two tests

* Fix err_bad_abi test

* Remove test logging junk

* Try to set up long double marshalling for closures

* xfail err_bad_abi

* Fix reference errors in previous commit

* Add missing argument pointer assignment

* Fix signature of function pointer in cls_dbls_struct

* Fix longdouble argument

* Try some changes to bigint handling

* Fix BigInt handling

* Fix cls_longdouble test

* Fix long double closure arg with no WASM_BIGINT

* Use EM_JS to factor out js helpers

* Support for varargs closure calls

* Fix varargs calls

* Fix err_bad_abi test

* Fix typo in previous commit

* Add more assertions to closures test suite

* Fix some asserts

* Add assertions to a few more tests

* Fix some tests

* Fix more floating point assertions

* Update more tests

* Var args for ffi_call

* Don't do node tests

* Macro for allocating on stack

* Add some comments, simplify struct handling

* Try again to fix varargs calls, add comments

* Consolidate WASM_BIGINT conditionals into LOAD_U64 and STORE_U64 macros

* A bit of cleanup

* Fix another typo

* Some fixes to the testsuite

* Another testsuite fix

* Fix varags with closures?

* Another attempt at getting closure varargs to work

* sig is initialized later

* Allow libffi.closures tests to be run

* Improve build script

* Remove redundant semicolons

* Fix a few libffi.closures test failures

* Cleanup

* Legacy dynCall API is no longer used

* Fix FFI_TYPE_LONGDOUBLE offset

* xfail 2 tests for WASM

- closure_loc_fn0; not applicable -- codeloc doesn't point to closure.
- huge_struct; function signature too long.

* Revert some redundant dg-output/printf statements

Helps Node.

* Revert "Don't do node tests"

This reverts commit a341ef4b.

* Fix assertions in cls_24byte

* More tiny formating fixes to test suite

* Revert "Revert "Don't do node tests""

This reverts commit 7722e685ea04e2420e042886816d8c4dd31f5dcb.

* Fix 64 bit returns when WASM_BIGINT is absent

* Fix print statement in cls_24byte

* Add CALL_FUNC_PTR macro to allow pyodide to define custom calling behavior to handle fpcast

* Update single_entry_structs tests

* More explanations

* Fix compile error in last commit

* Add more support for pyodide fpcast emulation, update CI to try to test it

* Clone via https

* Fix path to pyodide emsdk_env

* Add asserts to the rest of the test suite

* Fix test compile errors

* Fix some tests

* Fix cls_ulonglong

* Fix alignment of <4 byte args

* fix cls_ulonglong again

* Use snprintf instead of sprintf

* Should assert than strncmp returned 0

* Fix va_struct1 and va_struct3

* Change double and long double tests

These tests are failing because of a strange bug with prinft and doubles, but I am not convinced
it necessarily has anything to do with libffi. This version casts the double to int before printing it and avoids the issue

* Enable node tests

* Revert "Change double and long double tests"

This reverts commit 8f3ff89c6577dc99564181cd9974f2f1ba21f1e9.

* Fix PYODIDE_FPCAST flag

* add conftest.py back in

* Fix emcc error: setting `EXPORTED_FUNCTIONS` expects `<class 'list'>` but got `<class 'str'>`

See discussion on pyodide/pyodide#1596

* Remove test.html

* Remove duplicate test file

* More changes from upstream

* Fix some whitespace

* Add some basic debug logging statements

* Reapply libffi.exp changes

* Don't build docs (#7)

Works around build issue makeinfo: command not found.

* Update long double alignment

Emscripten 2.0.26 reduces the aligmnet of long double to 8. Quoting
from `ChangeLog.md`:

> The alignment of `long double`, which is a 128-bit floating-point
> value implemented in software, is reduced from 16 to 8. The lower
> alignment allows `max_align_t` to properly match the alignment we
> use for malloc, which is 8 (raising malloc's alignment to achieve
> correctness the other way would come with a performance regression).
> (#10072)

* Update long double alignment

Emscripten 2.0.26 reduces the aligmnet of long double to 8. Quoting
from `ChangeLog.md`:

> The alignment of `long double`, which is a 128-bit floating-point
> value implemented in software, is reduced from 16 to 8. The lower
> alignment allows `max_align_t` to properly match the alignment we
> use for malloc, which is 8 (raising malloc's alignment to achieve
> correctness the other way would come with a performance regression).
> (#10072)

* Improve error handling a bit (#8)

* Fix handling of signed arguments to ffi_call (#11)

* Fix struct argument handling in ffi_call (#10)

* Remove fpcast emulation tests

* Align the stack to MAX_ALIGN before making call (#12)

* Increase MAX_ARGS

* Cleanup (#14)

* Fix Closure compiler error with -sASSERTIONS=1 (#15)

* Remove function pointer cast emulation (#13)

This reverts commit 593b402 and cbc54da, as it's no longer needed
after PR pyodide/pyodide#2019.

* Prefer the `__EMSCRIPTEN__` definition over `EMSCRIPTEN` (#18)

"The preprocessor define EMSCRIPTEN is deprecated. Don't pass it to code
in strict mode. Code should use the define __EMSCRIPTEN__ instead."
https://github.com/emscripten-core/emscripten/blob/84a634167a1cd9e8c47d37a559688153a4ceace6/emcc.py#L887-L890

* Install autoconf 2.71

* Try again with installing autoconf 2.71

* Fix compatibility with Emscripten 3.1.28

* CI: remove use of `EM_CONFIG` env

See commit:
emscripten-core/emsdk@3d87d5e

* Fix cls_multi_schar: cast rest_call to signed char

* Remove test xfails (#17)

* Fix long double when used as a varargs argument

* Enable unwindtest and fix it

* Add EM_JS_DEPS

* Also require convertJsFunctionToWasm

* Run tests very very verbose

* Echo the .emscripten file

* Remove --experimental-wasm-bigint insertion

* Build with assertions

* Move verbosity flags back out of LDFLAGS

* Remove debug print statement

* Use up to date pyodide docker image

* Explicitly cast res_call to fix test failure

* Put back name of main function in cls_longdouble_va.c

* Fix alignment

The stack pointer apparently needs to be aligned to 16. There were
some terrible subtle bugs caused by not respecting this. stackAlloc
knows that the stack should be 16 aligned, so we can use stackAlloc(0)
to enforce this. This way if alignment requirements change, as long
as Emscripten updates stackAlloc to continue to enforce them we should
be okay.

* Fix handling of systems with no Js bigint integration

When we run the node tests we use node v14 tests (since node v14 is
vendored with Emscripten). Node v14 has no Js bigint integration
unless the --experimental-wasm-bigint flag is passed. So only the
node tests really notice if we get this right. Turns out, it didn't
work. We can't call a JavaScript function with 64 bit integer arguments
without bigint integration.

In ffi_call, we are trying to call a wasm function that takes 64 bit
integer arguments. dynCall is designed to do this. We need to go back
to tracking the signature when we don't have WASM_BIGINT, and then use
dynCall. This works better now that emscripten can dynamically fill in
extra dynCall wrappers:
emscripten-core/emscripten#17328

On the other hand, for the closures we are not getting a function pointer
as a first argument. We need to make our own wasm legalizer adaptor that
splits 64 bit integer arguments and then calls the JavaScript trampoline,
then the JavaScript trampoline reassembles them, calls the closure, then
splits the result (if it's a 64 bit integer) and the adaptor puts it back
together.

* Improvements to emscripten test shell scripts (#21)

This fixes the C++ unwinding tests and makes other minor improvements
to the Emscripten test shell scripts.

* Rename the test folder and move test files into emscripten test folder

* Use docker image that has autoconf-2.71

* Cleanup

* Pin emscripten 3.1.30

* Fix build.sh path

* Rearrange ci pipeline

* Fix bpo_38748 test

* Cleanup

* Improvements to comments, add static asserts, and update copyright

* Use `*_js` instead of `*_helper` for EM_JS functions (#22)

* Minor code simplification

* Xfail first dejagnu test to work around emscripten cache messages

See emscripten-core/emscripten#18607

* Remove unneeded xfails

* Shorten conftest.py by using pytest-pyodide

* Apply formatters and linters to emscripten directory

* Fix Emscripten xfail hack

* Fix build-tests script

* Patch emscripten to quiet info messages

* Clean up compiler flags in scripts and remove some settings from circleci config

* Rename emscripten quiet script

* Add missing export

* Don't remove go.exp

* Add reference to emscripten logging issue

---------

Co-authored-by: Kleis Auke Wolthuizen <info@kleisauke.nl>
Co-authored-by: Kleis Auke Wolthuizen <github@kleisauke.nl>
Co-authored-by: Christian Heimes <christian@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fpcast emulation tracing and removing
3 participants