Multiple embedded interpreter instances #68

JulianLeviston · 2018-09-06T10:16:34Z

Hi.

I'm trying to run hint inside itself one or maybe more times if I can. I noticed when I try to do that, it raises Exception: This version of GHC is not thread-safe,can't safely run two instances of the interpreter simultaneously. I'm using GHC 8.4.3 within stack at the moment on MacOS 10.13.6.

My question is, what's a way to do this? Will it work if I fork the process? Is there a version of GHC I can use that will let me do this? I'm using distributed haskell in my actual project, I thought maybe I could spawn another process and it would work, but I guess maybe it'd have to be running on another node for it to work?

I'm not really sure how the internals of hint works to know why this limitation exists.

The use case that I have is that I'm trying to make a system where there's some core piece of code at the top level that doesn't change unless the version of Haskell I'm running on changes, (tho I'd like even that limitation to be removed eventually) and then there are two layers below, the upper of which communicates between the layers and the lower of which executes the core of the code that can be edited from within another copy of the code.

The reason to have the layers is so that I can do proxied restarting without any downtime, ie so each piece is replaceable with hot swapping. I'm so chuffed to be able to use Hint, it's really awesome.

The text was updated successfully, but these errors were encountered:

JulianLeviston · 2018-09-06T11:23:20Z

Okay, so I did a test where I used forkIO and that seemed to do what I wanted. I'll proceed that way. Thanks! :) I think I'll close this. Sorry for not trying it out first. Again, thanks so very much for maintaining hint.

JulianLeviston · 2018-09-06T11:28:38Z

Ah, sorry I see that it will work up to a certain point (2 levels) but when I get to the third it makes the same complaint. Any help would (still) be greatly appreciated.

gelisam · 2018-09-06T11:32:13Z

I'm surprised that forkIO helped; the message specifically says that the code is not thread-safe, so running the code in another thread should make it worse! Perhaps GHC tries to detect if you are using its API improperly, and you found a way to prevent it from detecting that you are using it improperly, but not necessarily a way to use it properly?

JulianLeviston · 2018-09-06T11:35:00Z

Ah really? Darn. I since realised if I put run fork in the interpreted process it actually lets me proceed. I'd thought it was the calling thread that needed to be forked.

What makes it not thread safe? Will references stop working? I'm not actually using any references between my layers — they communicate across the "interpreter divide" via TCP ports.

JulianLeviston · 2018-09-06T11:50:43Z

This issue seems to suggest GHC itself has been threadsafe for a while. https://mail.haskell.org/pipermail/haskell-cafe/2015-August/121130.html

Note that I've not noticed this error message before today — I've been running it on GHC 8.4.3 on a different computer with a different version of hint. (a patched commit from a pre-0.8.0 version - by heinrich apfelmus - the first version that provided runStmt).

gelisam · 2018-09-06T12:00:44Z

What makes it not thread safe?

I don't know why GHC isn't thread-safe; it's the first time I see that message claiming it isn't. The message obviously comes from GHC, not hint, so it's not our code which isn't thread-safe.

The message says that "this" version of GHC isn't thread-safe, so maybe there is a way to recompile GHC with options which makes it slower but thread-safe? GHC has a lot of compile-time options.

Usually, when something isn't thread-safe, it's because there is a global variable which is being mutated without being guarded by a lock. Unfortunately, this means that running two copies of thread-unsafe code on two threads will work most of the time, and then mysteriously fail when the two threads happen to access the global variable at the exact same time. This is bad because your tests might lead you to believe that everything works, and then you'll experience mysterious unexplained failures much later.

I'm not actually using any references between my layers — they communicate across the "interpreter divide" via TCP ports.

That makes things much easier, but I'd have to understand the problem better in order to propose a concrete solution. hint can be used from ghci just fine, so why are you encountering this message when running hint from hint but not from ghci? Should be basically the same thing.

gelisam · 2018-09-06T12:10:29Z

The message obviously comes from GHC, not hint, so it's not our code which isn't thread-safe.

Huh, I googled for the error message, expecting to find it in ghc's source, but it turns out it is hint which is emitting the message after all!

https://github.com/mvdan/hint/blob/8cf48a659e97c55c01150ef6ff9eedf69b84e03f/src/Hint/InterpreterT.hs#L153-L154

gelisam · 2018-09-06T13:15:55Z

Here is the story of this error message, as I have managed to piece together from various places on the internet.

In 2009, @jcpetruzza, the original implementor of hint, noticed that the GHC API was not thread safe and opened a ghc ticket. That ticket is still open.

Still in 2009, @jcpetruzza decided to tackle that ticket himself. He laid out his plan in a wiki page.

I guess this plan didn't work out, because in 2010, he added the error message you saw, which attempts to detect when users try to use hint from several threads at once.

Later, in 2011, Simon Marlow laid out a similar but less detailed plan.

In 2014, Simon Marlow implemented a patch which fixed the problem; then SPJ reverted the patch because it caused a segfault on Windows, and then Simon Marlow fixed the problem again, fixing the Windows segfault. The fix has been released in ghc 7.10.1.

Later, in 2015, in the haskell-cafe message you mentioned, a user asked if the message was still relevant. @jcpetruzza says in that message that he does vaguely recall that the problem might have been fixed in 7.10 (it was), but he wasn't sure, so he asked the user to verify that for him. He didn't, and so the message remains.

Also in 2015, the Nomyx project encountered the error message in hint, but they did not realize that the GHC problem was fixed because they only found SPJ's revert, not Marlow's later fix. Of note is that they quote Simon Marlow as saying that "you cannot have multiple independent GHC API sessions running in different threads, because the runtime linker is global state", even though the runtime linker wasn't a global state anymore in 2015. I could not find the source of the quote, so maybe Simon Marlow said that before he fixed the problem in 2014?

Anyway, my conclusion from all that archeology is that the GHC issue has been fixed since GHC 7.10.1, but for some reason the GHC ticket hasn't been closed and Hint's check hasn't been removed. Since we no longer support GHC 7.10, we can now safely remove the check.

JulianLeviston · 2018-09-06T14:03:31Z

@gelisam you're amazing :) seriously, awesome work. Thank you so much.

cdupont · 2018-09-06T15:18:02Z

Congratulation Gelisam! This message was certainly a pain point when developing Nomyx.
Do you have a test case to reproduce the issue?

simonmar · 2018-09-07T14:37:43Z

Unfortunately this is not correct, GHC still doesn't support multiple linker instances. I think the confusion has arisen because "make the linker API thread-safe" refers only to making it possible to call the RTS linker's C API from multiple C threads simultaneously, which is independent from whether the GHC API supports multiple linker instances. In particular GHC still has a single shared global linker state.

JulianLeviston · 2018-09-08T01:39:37Z

Unfortunately this is not correct, GHC still doesn't support multiple linker instances

Setting aside my personal awe that you responded to an issue I wrote for a moment, Simon, I'm wondering if the thing I'm writing actually needs multiple linker instances (I don't really understand what's possible in this space, I think). I guess I should look into the guts of GHC more? Any suggestions for improving my understanding around this would be greatly valued.

I've got a working local throw-away project on this computer that basically has three modules that wrap each other like this:

module Kernel
    ( start
    ) where

import qualified Language.Haskell.Interpreter as I
import Control.Concurrent (forkIO)

start :: IO ()
start = forkIO forkIt >> return ()


forkIt :: IO ()
forkIt = do
    putStrLn  "Starting Interpreter (Kernel.start)"
    result <- I.runInterpreter hintBody
    putStrLn $ "Ending Interpreter (Kernel.start) " ++ show result
  where
    hintBody = do
      I.set [I.searchPath I.:= ["./src"]]
      I.loadModules ["SubKernel"]
      I.setImportsQ
        [ ("SubKernel", Just "SubKernel") ]
      I.runStmt "SubKernel.start"

... then the final inner piece looks like the following. There are obviously some issues with putStrLn vying for sysout at the one time, but all the output seems to be actually getting there.

module SubSubKernel
    ( start
    ) where

import Control.Concurrent (forkIO)

start :: IO ()
start = forkIO forkIt >> return ()


forkIt :: IO ()
forkIt = do
  putStrLn "This is the start module inside !!!!!!"
  putStrLn "It's very simple and should execute properly !!!!!!"

But I've been having trouble actually putting this pattern into my real project because it's using distributed-process, and is also obviously quite a bit more complicated than the above. I've had some success using spawn instead of fork, but it doesn't seem to let me go three levels deep still. It seems to just hang on the second level. I can get two, though. I probably need four.

Again, if you have any pointers on understanding this stuff, I'm all ears. I'd like to get this base level part of my project to the point where it executes the inner code so I can start writing "application level" code in my system :) ❤️

(Update: I've started digging into this by watching the videos at https://ghc.haskell.org/trac/ghc/wiki/AboutVideos, trying to install the source and reading the wiki.)

JulianLeviston · 2018-09-08T16:21:49Z

Update: I've dug into GHC, built it from scratch and read through the intro document, a large chunk of the wiki and watched several of those videos. I'm still not very clear about what's required to maintain separate state for each instance of the linker, or if that's even required for what I want to achieve. I might need to look into hint's source next, I think, to see how it interacts with GHC's API. I'm a bit lost.

JulianLeviston · 2018-09-09T01:11:10Z

I found this 9 year old ticket... https://ghc.haskell.org/trac/ghc/ticket/3372 (which I realise has this thread and convo linked in it :-)) Update: back to learning about GHC internals.

cdupont · 2018-09-10T08:42:49Z

@JulianLeviston If I understand, what you want to do is to load a program in Hint, that itself calls Hint?
I didn't understand why you wanted to do that?

JulianLeviston · 2018-09-10T11:41:45Z

@cdupont Well, I'm building a system that will allow me to rewrite and experiment on buildling the system itself from inside when it's running.

cdupont · 2018-09-10T11:49:25Z

I used such a technique with Nomyx: https://github.com/nomyx/nomyx-core/blob/master/src/Nomyx/Core/Engine/Interpret.hs
Users can submit pieces of code (Rules) that will be incorporated in the runtime. I don't keep around any instance of Hint: I re-initialize it each time I need it. It's much simpler this way (but a bit slow).
But I didn't experiment with having several level of interpreter...

JulianLeviston · 2018-09-10T12:10:34Z

I read about Nomyx a while ago, it seems interesting :)

What I'm doing has several layers, one of which allows for graceful restarting of servers whose source code it doesn't know in advance, all it knows is that it'll have to start something that listens on a TCP port, and passes the TCP port to it, along with setting up a proxy for the actual port wanted, to allow for starting a second one on another proxied port, then switching the proxy across. (ie graceful shutdown, no downtime hot-swapping upgrades/restarts). That's one piece - the whole system has quite a few more pieces.

I've only been working on the frame of it for the last few months; I've been working on the guts for a lot longer. To be honest, I didn't realise this limitation existed until I got to the end of the development of it, because in the experiments I did at the beginning, it worked fine (I don't think this exception existed on the version I was doing experiments on, or possibly it was because I'm working within Control.Distributed.Process and it wasn't triggering the exception, I'm not sure... I hadn't done anything exceptionally complicated yet, tho in the experiments).

JulianLeviston · 2018-09-27T00:54:18Z

Just an update — my plan is to work on fixing this bug in GHC... I'm in progress getting up to speed with how it works, and fixing a few more trivial bugs before I tackle it.

JulianLeviston · 2019-03-06T01:39:28Z

Another update - I've got a merge request up for this now at https://gitlab.haskell.org/ghc/ghc/merge_requests/388. The "more trivial bug" I fixed ended up taking a few months and touching quite a bit of GHC, and someone else wanted to take it over because it touched Template Haskell. Anyway, with a little bit of luck, this might be hitting GHC sometime in the not too distant future 😄 🤞

JulianLeviston · 2019-05-23T12:20:29Z

The merge request https://gitlab.haskell.org/ghc/ghc/merge_requests/388 has now been merged. It fixes the Haskell portion of the issue (not the C Linker part) It's targeted for 8.10.1. Slated for release in 2 months. Gosh this stuff takes a while :)

cdupont · 2019-05-23T12:31:03Z

@JulianLeviston Wonderful. I admire your persistence :)

JulianLeviston · 2019-05-24T02:58:23Z

@cdupont one of the benefits of doing it is I now understand:

how to work on GHC
how the compiler and GHC dynamic linker work quite well
that the linker is quite a tricky beast when it gets down to the RTS side of things
wayyyy more about how GHC works internally and Haskell's syntax transformations, etc.

It's been quit a journey, but I'm really glad I did it. I'm going to try to continue to keep fixing bugs, too. Slowly.

JulianLeviston · 2019-11-25T04:51:52Z

This is in 8.10.1 which just had alpha1 cut today.

KiaraGrouwstra · 2020-01-31T22:07:49Z

does that mean for that GHC version this restriction could be lifted from hint now?

JulianLeviston · 2020-02-06T00:30:15Z

Yes, I think that's true because 8.10.1 (or the patched one after, possibly) is out now, I think :) @gelisam — what say you? 😁 I'd love to use it soon(ish) so that'd be great to get this rolling if at all possible

gelisam · 2020-02-06T02:26:03Z

Congratulations for getting your patch into ghc!

Sounds like the only thing needed for you to use hint in the way you want is to remove the line which throws an error if hint detects a nested call? For your project, would it be sufficient to link with a locally-patched version of hint which doesn't have this line, it do you need me to release a patched version of hint on hackage?

Unfortunately, I'm pretty busy this month, so it's probably going to take a little while until I find the time to do a proper release.

strake · 2020-02-23T02:40:15Z

I have a WIP port on branch "ghc-8.10", which should allow use in multiple threads. I have yet to properly test it, however.

gelisam · 2020-02-23T03:20:31Z

Great! I think the proper way to test this would be to disable hint's thread detection code, then run threaded hint code on an older ghc in order to see what the problem even looks like, then turn that into a (failing) test. Then, switch to ghc-8.10 to confirm that the test now passes. Finally, re-enable hint's thread-detection code on older ghc versions (which is what I assume your branch does, I haven't looked yet), and disable the test on older ghc versions.

JulianLeviston · 2020-03-22T13:23:19Z

It looks like 8.10.1 is on the verge of being released https://gitlab.haskell.org/ghc/ghc/issues/17214

Not sure why I thought 8.10.1 was released before, I think that was just the alpha or so.

JulianLeviston · 2020-03-25T06:08:29Z

Ok 8.10.1 is out. 🎉 cc: @gelisam :)

gelisam · 2020-04-22T17:39:03Z

I think the proper way to test this would be to disable hint's thread detection code, then run threaded hint code on an older ghc in order to see what the problem even looks like, then turn that into a (failing) test. Then, switch to ghc-8.10 to confirm that the test now passes. Finally, re-enable hint's thread-detection code on older ghc versions, and disable the test on older ghc versions.

@tycho01 and I have implemented a test which satisfies the above (see #93). Unfortunately there is still one last hurdle which prevents a new version of hint to be released, and it's #97, an unrelated regression introduced with ghc-8.10.

gelisam · 2020-04-25T16:07:17Z

That last hurdle has been fixed in #98 , so I am now ready to release a new ghc-8.10-compatible version of hint!

gelisam · 2020-04-25T16:52:11Z

hint-0.9.0.3 is released!

cdupont · 2020-04-25T19:50:14Z

Very nice! You might change the README file also (limitations).

gelisam · 2020-04-25T21:37:42Z

Oh, good catch! Fixed, thanks.

JulianLeviston · 2020-06-10T13:50:34Z

Thanks @gelisam & @tycho01 👏 — now if only stack would release an updated version :)

gelisam · 2020-06-10T14:20:10Z

now if only stack would release an updated version :)

hmm, what's the process for updating the stack version? I remember that the person who maintains the stack version of a package is not necessarily the same as the person who maintains the hackage version. I didn't add any of my packages to stackage, so in theory, I am not responsible for updating the stackage version; but if it's @mvdan who did, I might have implicitly inherited the responsibility but not the permissions nor the know-how. Is there a way to check who is responsible for which stackage package?

mvdan · 2020-06-10T14:22:55Z

Err. I only ever uploaded to hackage, I'm pretty sure: https://hackage.haskell.org/package/hint

I don't really understand what https://www.stackage.org/package/hint is or how it's different fromthe first package. I would assume the versions are mirrored from hackage.

Also, why do the packages still list my personal email as the maintainer? That seems wrong :)

gelisam · 2020-06-16T14:14:30Z

Looks like the answer is that hint is grandfathered-in, so the stackage curators themselves are taking the responsibility of updating stackage's version of hint.

gelisam · 2020-06-16T14:29:15Z

When a new version of a package in Stackage is uploaded to Hackage, we automatically try to include it in Stackage.

(from https://github.com/commercialhaskell/stackage/blob/master/MAINTAINERS.md#uploading-a-new-package-version)

So it seems like we don't have to do anything. It continues:

If there is a failure we temporarily introduce an upper bound, and open a GitHub issue ticket to resolve the issue.

I now remember that we were pinged by the stackage curators last time they encountered issues upgrading to the latest version of hint. So I don't think we need to worry; stackage will upgrade to the latest version of hint, either automatically, or we will hear about it.

gelisam · 2020-06-16T14:33:22Z

In fact, a new LTS has been released today and it does use hint-0.9.0.3, so it did happen automatically already!

gelisam · 2020-06-16T14:45:13Z

why do the packages still list my personal email as the maintainer?

Well, technically, you are still the official maintainer of hint: you asked me if I wanted to become the official maintainer, and I said no. The reason I refused is because I don't have time to give all my open-source projects the attention they deserve. Nevertheless, I am still contributing to hint when I have time, and while I still don't think that's enough time, it's apparently more time than anybody else, since I am the most active contributor. So I guess I am the de-facto maintainer?

gelisam · 2020-12-10T02:09:42Z

Looks like the answer is that hint is grandfathered-in, so the stackage curators themselves are taking the responsibility of updating stackage's version of hint.

Apparently I was wrong about what grandfathered-in means, see #108.

JulianLeviston changed the title ~~Multiple embedded instances~~ Multiple embedded interpreter instances Sep 6, 2018

JulianLeviston closed this as completed Sep 6, 2018

JulianLeviston reopened this Sep 6, 2018

gelisam mentioned this issue Sep 6, 2018

Have several instances of the interpreter nomyx/Nomyx#102

Closed

This was referenced Apr 13, 2020

Hint concurrent instances KiaraGrouwstra/synthesis#34

Closed

Interrupting interpreters #94

Closed

gelisam closed this as completed Apr 25, 2020

gelisam mentioned this issue Jun 16, 2020

I guess I'm the de-facto maintainer now #103

Merged

gelisam mentioned this issue Dec 10, 2020

maintain the stackage entry #108

Closed

Multiple embedded interpreter instances #68

Multiple embedded interpreter instances #68

Comments

JulianLeviston commented Sep 6, 2018 • edited

JulianLeviston commented Sep 6, 2018

JulianLeviston commented Sep 6, 2018

gelisam commented Sep 6, 2018

JulianLeviston commented Sep 6, 2018

JulianLeviston commented Sep 6, 2018 • edited

gelisam commented Sep 6, 2018

gelisam commented Sep 6, 2018 • edited

gelisam commented Sep 6, 2018

JulianLeviston commented Sep 6, 2018

cdupont commented Sep 6, 2018

simonmar commented Sep 7, 2018

JulianLeviston commented Sep 8, 2018 • edited

JulianLeviston commented Sep 8, 2018

JulianLeviston commented Sep 9, 2018 • edited

cdupont commented Sep 10, 2018

JulianLeviston commented Sep 10, 2018

cdupont commented Sep 10, 2018

JulianLeviston commented Sep 10, 2018 • edited

JulianLeviston commented Sep 27, 2018

JulianLeviston commented Mar 6, 2019 • edited

JulianLeviston commented May 23, 2019 • edited

cdupont commented May 23, 2019

JulianLeviston commented May 24, 2019 • edited

JulianLeviston commented Nov 25, 2019

KiaraGrouwstra commented Jan 31, 2020

JulianLeviston commented Feb 6, 2020 • edited

gelisam commented Feb 6, 2020 • edited

strake commented Feb 23, 2020

gelisam commented Feb 23, 2020

JulianLeviston commented Mar 22, 2020 • edited

JulianLeviston commented Mar 25, 2020

gelisam commented Apr 22, 2020 • edited

gelisam commented Apr 25, 2020

gelisam commented Apr 25, 2020

cdupont commented Apr 25, 2020

gelisam commented Apr 25, 2020

JulianLeviston commented Jun 10, 2020

gelisam commented Jun 10, 2020

mvdan commented Jun 10, 2020

gelisam commented Jun 16, 2020

gelisam commented Jun 16, 2020

gelisam commented Jun 16, 2020

gelisam commented Jun 16, 2020 • edited

gelisam commented Dec 10, 2020

JulianLeviston commented Sep 6, 2018 •

edited

JulianLeviston commented Sep 6, 2018 •

edited

gelisam commented Sep 6, 2018 •

edited

JulianLeviston commented Sep 8, 2018 •

edited

JulianLeviston commented Sep 9, 2018 •

edited

JulianLeviston commented Sep 10, 2018 •

edited

JulianLeviston commented Mar 6, 2019 •

edited

JulianLeviston commented May 23, 2019 •

edited

JulianLeviston commented May 24, 2019 •

edited

JulianLeviston commented Feb 6, 2020 •

edited

gelisam commented Feb 6, 2020 •

edited

JulianLeviston commented Mar 22, 2020 •

edited

gelisam commented Apr 22, 2020 •

edited

gelisam commented Jun 16, 2020 •

edited