Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native fiber #1878

Closed
wants to merge 0 commits into from
Closed

Native fiber #1878

wants to merge 0 commits into from

Conversation

ioquatix
Copy link
Member

@ioquatix ioquatix commented Jun 2, 2018

This is an implementation of fibers using hand-optimised assembly.

There are implementations for amd64, win32, win64, arm32, arm64, and I will add i386 soon too.

@ioquatix
Copy link
Member Author

ioquatix commented Jun 2, 2018

@nobu I don't expect to merge this yet, but I would like to continue to work on it.

@ioquatix ioquatix force-pushed the native-fiber branch 2 times, most recently from 0cd0f54 to 0044c3c Compare August 10, 2018 07:43
@ioquatix
Copy link
Member Author

@nobu I believe this is ready for your review/testing.

@ioquatix
Copy link
Member Author

ioquatix commented Nov 2, 2018

I've rebased this on trunk.

@ioquatix
Copy link
Member Author

ioquatix commented Nov 2, 2018

Looks like build failure on Linux. I will review.

@ioquatix
Copy link
Member Author

ioquatix commented Nov 2, 2018

I am going to see if I can get 32-bit linux passing too. It seems like a very minor target, but it's probably good to support it. As long as it's similar to win32 implementation, it should be straight forward.

@ioquatix ioquatix force-pushed the native-fiber branch 6 times, most recently from fb651b9 to 2fd0495 Compare November 3, 2018 02:15
@ioquatix
Copy link
Member Author

ioquatix commented Nov 3, 2018

It seems like windows build might be broken. It's selecting a coroutine implementation (amd64), but I haven't really tested it much. Looks like we might need to fix this so that it uses the existing code path.

compiling ../ruby/cont.c
880../ruby/cont.c: In function 'cont_free':
881../ruby/cont.c:416:6: warning: implicit declaration of function 'munmap' [-Wimplicit-function-declaration]
882      munmap((void*)fib->ss_sp, fib->ss_size);
883      ^~~~~~
884../ruby/cont.c: In function 'fiber_initialize_machine_stack_context':
885../ruby/cont.c:909:11: warning: implicit declaration of function 'fiber_machine_stack_alloc' [-Wimplicit-function-declaration]
886     ptr = fiber_machine_stack_alloc(size);
887           ^~~~~~~~~~~~~~~~~~~~~~~~~
888../ruby/cont.c:909:9: warning: assignment to 'char *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
889     ptr = fiber_machine_stack_alloc(size);
890         ^
891../ruby/cont.c:912:41: warning: passing argument 2 of 'coroutine_initialize' from incompatible pointer type [-Wincompatible-pointer-types]
892     coroutine_initialize(&fib->context, fiber_entry, ptr+size, size);
893                                         ^~~~~~~~~~~
894In file included from ../ruby/cont.c:46:
895../ruby/coroutine/amd64/Context.h:30:18: note: expected 'coroutine_start' {aka '__attribute__((noreturn)) void (*)(struct <anonymous> *, struct <anonymous> *)'} but argument is of type '__attribute__((noreturn)) void (*)(void *)'
896  coroutine_start start,
897  ~~~~~~~~~~~~~~~~^~~~~

@ioquatix ioquatix force-pushed the native-fiber branch 6 times, most recently from 67b4c2f to 86d68b4 Compare November 4, 2018 09:59
@ioquatix
Copy link
Member Author

ioquatix commented Nov 4, 2018

Here is a brief summary of native implementations now working:

  • Darwin (64-bit)
  • Linux (32 & 64 bit)
  • Windows (64-bit MinGW)

I don't know how to make other Windows builds compile the right files.

Other platforms use existing code path.

@ioquatix ioquatix force-pushed the native-fiber branch 4 times, most recently from d2a745f to 99f433d Compare November 20, 2018 21:06
@ioquatix
Copy link
Member Author

@MSP-Greg I am tracking issues here.

c745c56#commitcomment-31373680

It seems odd to me that MSYS build is failing while win32/win64 build is passing. I wonder what configure is doing.

@ioquatix
Copy link
Member Author

[00:04:38] checking native coroutine implementation for x64-mingw32... win64

@MSP-Greg
Copy link
Contributor

MSP-Greg commented Nov 20, 2018

A ruby-loco build I just did passed. See here.

Run on ruby 2.6.0dev (2018-11-21 trunk 65895) [x64-mingw32]

I have the patches from the PR added to it.

EDIT: I just started another build, as failures may be intermittent.

@MSP-Greg
Copy link
Contributor

I've already added it to ruby-loco as a patch, as soon as the current build is done, I'll start it.

@ioquatix
Copy link
Member Author

Here is how we should be allocating stack on Windows.

@ioquatix
Copy link
Member Author

@MSP-Greg Sorry, there are some more changes, according to documentation, to correctly allocate guard page.

@MSP-Greg
Copy link
Contributor

there are some more changes

Thanks. I was just about to push the first patch. That build passed test-all, but spec froze. When Fix missing semi-colon gets a few minutes into the build, I'll push all the patches to ruby-loco.

Sure, do you have google hangouts?

I haven't done much of that, as I haven't replaced the (nice) desktop I had. I'm now on a rather light notebook. I don't even have VS installed.

Re the ruby-loco builds, every build saves an artifact, so you can dl them and run tests with runner.rb. Building is a bit more involved, but not that complicated.

@ioquatix
Copy link
Member Author

It looks like everything is compiling now.

@MSP-Greg
Copy link
Contributor

It looks like everything is compiling now.

I'm building at https://ci.appveyor.com/project/MSP-Greg/ruby-loco/builds/20448476

@ioquatix
Copy link
Member Author

Looks like still got problems :(

@ioquatix
Copy link
Member Author

How did you get stack trace before, did you run it locally?

@ioquatix
Copy link
Member Author

Oh, I know the problem... I'm still using free on virtualalloc memory lol

@MSP-Greg
Copy link
Contributor

How did you get stack trace before, did you run it locally?

It was collected by the ruby-loco build. I hate long CI logs, so everything is saved to a 7z file.

Oh, I know the problem... I'm still using free on virtualalloc memory lol

Cool. I'd never know. I'll stop and re-start. Current build is always at
https://ci.appveyor.com/project/MSP-Greg/ruby-loco

@ioquatix
Copy link
Member Author

Any backtrace you provide is super useful to me since I'm not actually working on Windows and relying on appveyor and documentation to get things working.

@ioquatix
Copy link
Member Author

malloc/free are general purpose memory allocation. They are not designed for stack allocation. But they can work if you align them correctly.

On Windows, I think it's the cause of intermittent failure. Sometimes aligned, sometimes not.

Linux has mmap/mprotect/munmap. It's lower level than malloc/free and provides more control over memory allocation, e.g. readonly, no access, etc.

On a stack, the last (lowest by address) page should be marked read-only or something similar. this is called guard page. If you try to make too many function calls, eventually your stack hits this page and it causes page fault. It kill your program, instead of allowing your program to overwrite arbitrary memory (a security problem).

On windows, it's the same concept, but they use VirtualAlloc/VirtualProtect/VirtualFree. Nothing is really different conceptually. You still allocate some page-aligned memory, put a guard page at the end (bottom) and use that as the stack.

@MSP-Greg
Copy link
Contributor

Thanks for the explanation.

The last two builds of ruby-loco passed, with the patch skipping the two tests. I can't get the test in test_enum.rb to fail locally, but the test in test_enumerable.rb fails consistently.

We've got a long weekend, I may build a repo to allow one to use a pre-existing build and run test-all & test-spec (or parts thereof) a half dozen times. That might at least determine whether there is an intermittent issue.

@ioquatix
Copy link
Member Author

I tried to reproduce bug on my local windows bug but I only managed to reproduce once, I don’t know how to attach debugger to make test-all.

I will try test_enumerable. How do you run just that test?

@MSP-Greg
Copy link
Contributor

From test folder, I use:

ruby --disable=gems runner.rb ruby/test_enumerator.rb

Ran a third build, three in a row green.

@ioquatix ioquatix force-pushed the native-fiber branch 10 times, most recently from 0b5448d to b5813f5 Compare November 23, 2018 12:43
@MSP-Greg
Copy link
Contributor

@ioquatix

This is sort-of passing on Appveyor. The error in test-all is an unrelated error. But, if test-all fails, test-spec doesn't run. Thanks, Greg

@ioquatix
Copy link
Member Author

All builds passing, I merged to trunk.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants