Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calling behaviours from bare functions causes seg fault #3013

Closed
mpbagot opened this issue Feb 18, 2019 · 19 comments
Closed

Calling behaviours from bare functions causes seg fault #3013

mpbagot opened this issue Feb 18, 2019 · 19 comments

Comments

@mpbagot
Copy link
Contributor

mpbagot commented Feb 18, 2019

While working with PortAudio through the FFI, I found that I wasn't able to call behaviours on actors passed in using the userData field. In the past, I've successfully passed objects using userData fields, and called functions on them, but when I tried a similar action using an actor and a behaviour, it resulted in a Segmentation Fault. I've managed to recreate the bug in the following code snippet:

use "lib:portaudio"

primitive _PaStream

actor Main
  new create(env: Env) =>
    @Pa_Initialize[U8]()

    var stream = Pointer[_PaStream]

    // Stream pointer pointer, input channels, output channels, stream format (8 = paInt16), sample rate, frames-per-buffer (0 = auto), callback pointer, user data
    @Pa_OpenDefaultStream[U8](addressof stream, U8(0), U8(2), U32(8), F64(16000), U32(0), addressof this.callback_func, env.out)

    @Pa_StartStream[U8](stream)

    @Pa_Sleep[None](U32(2000))

  fun @callback_func(input: Pointer[U16], output: Pointer[U16], frames: U32, timeInfo: Pointer[U8], flags: U32, data: Any): U8 =>
    try
      var out = data as OutStream
      @printf[None]("About to call a behaviour...\n".cstring())
      out.print("Calling a behaviour.")
      @printf[None]("Behaviour was called.\n".cstring())
    end

    U8(0)

In the above code example, I expect the callback to print both lines using the FFI printf statements, and the "Calling a behaviour" line to be printed shortly after, however, after the first printf is called, the program seg faults and thus never prints the second line.

The exact output is:

ALSA lib pcm_dsnoop.c:618:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1052:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm_dmix.c:1052:(snd_pcm_dmix_open) unable to open slave
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
About to call a behaviour...
Segmentation fault (core dumped)

The output of ponyc -v is:

0.26.0-4a8bc944 [release]
compiled with: llvm 5.0.1 -- cc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
Defaults: pic=true ssl=openssl_1.1.0

Please let me know if there's any further information I can provide.

@jemc
Copy link
Member

jemc commented Feb 19, 2019

The issue here is that your bare function is being called from a non-pony-scheduled thread.

This is okay, but you need to register the thread first, which you can do with this FFI function in libponyrt:

ponyc/src/libponyrt/pony.h

Lines 471 to 479 in c42ccd3

/**
* Call this to create a pony_ctx_t for a non-scheduler thread. This has to be
* done before calling pony_ctx(), and before calling any Pony code from the
* thread.
*
* Threads that call pony_init() or pony_start() are automatically registered.
* It's safe, but not necessary, to call this more than once.
*/
PONY_API void pony_register_thread();

Can you try adding that FFI call to the top of your callback and see if it works?

Once we get your example working we can improve our docs to help with this.

  fun @callback_func(input: Pointer[U16], output: Pointer[U16], frames: U32, timeInfo: Pointer[U8], flags: U32, data: Any): U8 =>
    @pony_register_thread[None]()
    try
      var out = data as OutStream
      @printf[None]("About to call a behaviour...\n".cstring())
      out.print("Calling a behaviour.")
      @printf[None]("Behaviour was called.\n".cstring())
    end

    U8(0)

@mpbagot
Copy link
Contributor Author

mpbagot commented Feb 19, 2019

Unfortunately, adding that call didn't change the outcome at all. The code still seg faults with the behaviour call, and passes without.

EDIT: After attempting the suggested modification in my complete project, I found that it does stop the segmentation fault, if the called behaviour has either no arguments passed to it, or only number (int or float) arguments. Thus, in the above it continues to seg fault, but in my complete project, where I was testing a behaviour without any parameters, it passes successfully.

Is there any reason why this could be happening?

@jemc
Copy link
Member

jemc commented Feb 25, 2019

Hmmm... if I had to guess without seeing your latest example, I'd guess that there is an issue with allocating new objects using the Pony heap from a non-Pony thread. Is it possible to try your use case without allocating any new objects?

@mpbagot
Copy link
Contributor Author

mpbagot commented Feb 26, 2019

The following is the bare function from my complete project. I assume when you say "without allocating any new objects", you are referring to the creation of new variable values (e.g var num: U8 = 0) ? If that is the case, my code below does nothing more than pass provided parameters into the behaviour (aside for the casting of data), but I still encounter the segmentation fault.

fun @def_callback(input_buf: Pointer[I16] iso, output_buf: Pointer[I16], frame_count: U32, timeInfo: Pointer[_PaStreamCallbackTimeInfo], flags: U32, data: Any tag): U8 =>
    @pony_register_thread[None]()
    try
      var l = data as SphinxListener
      // TODO Calling a behaviour from this bare function results in a SegFault if non-number args are used.
      l.process_chunk(consume input_buf, frame_count)
    end

    0

@jemc
Copy link
Member

jemc commented Feb 26, 2019

I assume when you say "without allocating any new objects", you are referring to the creation of new variable values (e.g var num: U8 = 0) ?

Sorry, I should have been more specific - I meant things that call the constructors of classes and actors to allocate new objects on the heap. Declaring a local variable of a non-object type like var num: U8 = 0 will just allocate a place on the stack, which should be fine.

It doesn't look like your current example allocates memory, so now I wonder if it is an issue with tracing the iso argument...

@mpbagot
Copy link
Contributor Author

mpbagot commented Feb 26, 2019

Unfortunately, even if the argument has a tag or val capability, it still causes a segmentation fault. It seems unlikely that would be the result of the iso.

@jemc
Copy link
Member

jemc commented Feb 27, 2019

Well, I guess the next step for troubleshooting this is to construct a pthread-based repro for this that can be tested without involving PortAudio.

@adri326
Copy link
Member

adri326 commented Mar 5, 2019

Can confirm, Arch Linux, x86_64, using pony v27 (llvm 7.0.1), with SDL_mixer's threads

Here's the code that is running into the same issue:
pony-sdl-mixer

@adri326
Copy link
Member

adri326 commented Mar 10, 2019

(@jemc)
Here's a working, clean example using pthread:

https://github.com/adri326/-threader-pony

@SeanTAllen
Copy link
Member

@adri326 I would expect that to crash. You need to call pony_register_thread()

@adri326
Copy link
Member

adri326 commented Mar 10, 2019

Forgot about this function; did a push including it, it does not seem to affect anything though

@SeanTAllen
Copy link
Member

@adri326 you need to call pony_register_thread in thread_entry.

@adri326
Copy link
Member

adri326 commented Mar 11, 2019

@SeanTAllen
What would that change? The bare lambda should be executed in the same thread, unless there are some hidden mechanics at the beginning of the bare lambda

@adri326
Copy link
Member

adri326 commented Mar 11, 2019

I can't try rn

@mfelsche
Copy link
Contributor

Given your small example above, i was able to verify the reason for the segfault:

src/libponyrt/sched/scheduler.c:1147: pony_ctx: Assertion `this_scheduler != NULL` failed.

I was able to run your small example app successfully with the following changes:

 void* thread_entry(void *param) {
+  pony_register_thread();
   printf("B' - 🦓\n");
   if (pony_fn == NULL) {
     printf("No callback registered?\n");
   } else {
     pony_fn();
   }
-  pthread_exit(NULL);
+//  pthread_exit(NULL);
 }

It is important to call this function before calling back into pony code as documented here: https://github.com/ponylang/ponyc/blob/master/src/libponyrt/pony.h#L471:L479

I removed the pthread_exit to have repeatable output. If you want to keep it, you should call pony_unregister_thread() beforehand.

I hope this helps for getting your portaudio code running.

@adri326
Copy link
Member

adri326 commented Mar 11, 2019

Indeed, it does fix the issue. Thanks for the help!

adri326 added a commit to adri326/pony-sdl-mixer that referenced this issue Mar 11, 2019
This was achieved thanks to the guys at Pony:
ponylang/ponyc#3013

Turns out you have to call `pony_register_thread` from within your C
code, once it enters Pony-territory it is too late.
@SeanTAllen
Copy link
Member

SeanTAllen commented Mar 12, 2019

@mpbagot @adri326 can this be closed?

@adri326
Copy link
Member

adri326 commented Mar 12, 2019

I'd say yes

@mfelsche
Copy link
Contributor

Closing.

This is a good topic for the ffi chapter in the tutorial (if it is not covered already).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants