Interrupting interpreters #94

Closed
poscat0x04 opened this issue Apr 15, 2020 · 13 comments

@poscat0x04

I'm trying to implement an "eval" command for a chatbot and I would like to set a timeout for the interpreter. But unfortunately, async exceptions don't work. For example, the following program will keep running until it gets killed by the oom killer:

module Main where

import Language.Haskell.Interpreter
import System.Timeout

main = do
    r <- timeout 200000 $ runInterpreter $ do
      setImports ["Prelude"]
      eval "sum [1..]"
    print r

I assume this is because the underlying GHC API uses foreign calls that cannot be interrupted. So are there any other ways of interrupting the interpreter?

@poscat0x04
Author

poscat0x04 commented Apr 15, 2020

Using forkIO and killThread explicitly can indeed interrupt the interpreter but will cause memory leaks. Don't know if this is a GHC issue.

@KiaraGrouwstra
Collaborator

KiaraGrouwstra commented Apr 18, 2020

I'm dealing with a similar issue.
It looks like you can just call timeout inside the interpreter instead, so you'd have like eval "timeout 1000 . return $ sum [1..]" then extract the result from there.

@gelisam
Contributor

gelisam commented Apr 18, 2020

the following program will keep running until it gets killed by the oom killer: [...]

Perhaps surprisingly, that's the expected behaviour for that program! Here is an expanded version of that program which explains what is going on:

import Language.Haskell.Interpreter
import System.Timeout

-- |
-- >>> main
-- runInterpreter is done
-- <timedout>
-- timeout while printing infinite thunk
main :: IO ()
main = do
  interpreterResult <- timeout 200000 $ runInterpreter $ do
    setImports ["Prelude"]
    eval "sum [1..]"
  putStrLn "runInterpreter is done"
  case interpreterResult of
    Nothing -> do
      putStrLn "<timedout>"
      putStrLn "timeout during runInterpreter"
    Just (Left err) -> do
      putStrLn "error while evaluating:"
      print err
    Just (Right infiniteThunk) -> do
      maybeUnit <- timeout 200000 $ print infiniteThunk
      case maybeUnit of
        Nothing -> do
          putStrLn "<timedout>"
          putStrLn "timeout while printing infinite thunk"
        Just () -> 
          putStrLn "infinite thunk successfully printed??"

That is, the reason your timeout doesn't interrupt your program is that runInterpreter successfully completes in less than 0.2 seconds (the timeout is 200000 microseconds); it's your print r which loops, but that part is outside of your timeout.
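
The same effect can be reproduced without hint, using only base. The following is a minimal sketch, not code from this thread: timeoutForcing is a hypothetical helper name, and evaluate only forces the value to weak head normal form, which is enough for an Integer but not for a nested structure.

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- | Hypothetical helper: apply the timeout to the *evaluation* of a value,
-- not merely to the action which returns it. Forces to WHNF only.
timeoutForcing :: Int -> a -> IO (Maybe a)
timeoutForcing micros x = timeout micros (evaluate x)

main :: IO ()
main = do
  -- 'pure' returns immediately: the sum is still an unevaluated thunk, so the
  -- timeout has nothing to interrupt. We are careful not to force it here.
  lazy <- timeout 200000 (pure (sum [1 ..] :: Integer))
  putStrLn (maybe "timed out" (const "returned an unevaluated thunk") lazy)

  -- Here the infinite sum is forced inside the timeout, so it can be interrupted.
  forced <- timeoutForcing 200000 (sum [1 ..] :: Integer)
  putStrLn (maybe "timed out" show forced)
```

The first call is expected to return right away and the second to time out, because only the second one does the work inside the timeout.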

@gelisam
Contributor

gelisam commented Apr 18, 2020

And here is a variant which demonstrates that it is indeed possible to interrupt the interpreter using timeout, when the infinite loop runs inside the runInterpreter block:

import Language.Haskell.Interpreter
import System.Timeout

-- |
-- >>> main
-- <timedout>
-- timeout during runInterpreter
main :: IO ()
main = do
  interpreterResult <- timeout 200000 $ runInterpreter $ do
    setImports ["Prelude"]
    infiniteThunk <- eval "sum [1..]"
    lift $ print infiniteThunk
  case interpreterResult of
    Nothing -> do
      putStrLn "<timedout>"
      putStrLn "timeout during runInterpreter"
    Just (Left err) -> do
      putStrLn "error while evaluating:"
      print err
    Just (Right ()) -> do
      putStrLn "runInterpreter terminated successfully??"

@tycho01, did you happen to make the same mistake, or is there another problem with interrupting the interpreter which is not illustrated by @poscat0x04's program?

@gelisam closed this as completed Apr 18, 2020

@KiaraGrouwstra
Collaborator

@gelisam I'd wanna use the above solution, but I'm calling this in a loop, which somehow triggers #68, I think since runInterpreter is called too fast.

So instead, I'd have my whole thing in an Interpreter monad, and would want the timeout on the level of the eval call.
timeout operates on IO monads though; I haven't quite figured out how to use that with eval and the like...

... probably just comes down to #68 I guess? 🤷‍♀️

@gelisam
Contributor

gelisam commented Apr 19, 2020

but I'm calling this in a loop, which somehow triggers #68, I think since runInterpreter is called too fast.

I don't think that's the problem; replicateM_ 5 main runs the above runInterpreter block as fast as possible one after the other, and it runs just fine. Are you using threads somewhere?

timeout operates on IO monads though; I haven't quite figured out how to use that with eval and the like...

You can't use eval since it encodes the result as a String and IO actions can't be serialized, but you can use interpret to evaluate an expression of type IO a inside the interpreter, and then you can run that IO action inside or outside the runInterpreter block just like you would run any other IO action:

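import Control.Monad (void)
import Language.Haskell.Interpreter
import System.Timeout

-- |
-- >>> main
-- hello
main :: IO ()
main = void $ timeout 200000 $ runInterpreter $ do
  setImports ["Prelude"]
  -- interpret returns the IO action as an ordinary Haskell value
  ioAction <- interpret "putStrLn \"hello\"" (as :: IO ())
  lift ioAction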
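
If the whole program lives in the Interpreter monad, the timeout can also be applied to a single interpreted action instead of the whole runInterpreter block. A rough sketch, assuming the hint package is available; runWithTimeout is a hypothetical helper, not part of hint, and I have not checked how the session behaves after an action has been interrupted:

```haskell
import Language.Haskell.Interpreter
import System.Timeout (timeout)

-- | Hypothetical helper: interpret an expression of type IO (), then run the
-- resulting action under a timeout while staying inside the Interpreter monad.
-- Returns Nothing if the action did not finish within @micros@ microseconds.
runWithTimeout :: Int -> String -> Interpreter (Maybe ())
runWithTimeout micros expr = do
  ioAction <- interpret expr (as :: IO ())
  lift (timeout micros ioAction)

main :: IO ()
main = do
  result <- runInterpreter $ do
    setImports ["Prelude"]
    fast <- runWithTimeout 200000 "putStrLn \"hello\""
    slow <- runWithTimeout 200000 "print (sum [1..])"
    pure (fast, slow)
  print result
```

Only the time spent running the action is limited here; the time spent typechecking and compiling the expression inside interpret is not covered by the timeout.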

@KiaraGrouwstra
Collaborator

@gelisam so far I haven't done any threading. I'll try and see if that branch fixes it tho.
extracting the IO from the Interpreter to e.g. use timeout on it is pretty cool.

I have indeed been using interpret over eval, but I fear I'd still simplified a bit -- my issues aren't just in running expressions.
Even if I just run a typeChecks or typeOf on an expression with a non-resolving type, such as div (const const) div, Hint just freezes.

(For context, as I'm doing program synthesis, I'm basically producing a bunch of nonsensical programs that may or may not be well-behaved, such as the above -- that's one of the parts I'm trying to evaluate using hint.)

@gelisam
Contributor

gelisam commented Apr 19, 2020

Even if I just run a typeChecks or typeOf on an expression with a non-resolving type, such as div (const const) div, Hint just freezes.

That's very strange! On my machine, it gives a type error:

import Language.Haskell.Interpreter

-- |
-- >>> main
-- won't compile:
-- <interactive>:1:19: error:
--     • Occurs check: cannot construct the infinite type: a ~ b -> a
--       Expected type: a -> a -> b -> a
--         Actual type: a -> a -> a
--     • In the second argument of ‘div’, namely ‘div’
--       In the expression: div (const const) div
main :: IO ()
main = do
  interpreterResult <- runInterpreter $ do
    setImports ["Prelude"]
    typeOf "div (const const) div"
  case interpreterResult of
    Left (WontCompile [GhcError msg]) -> do
      putStrLn "won't compile:"
      putStrLn msg
    Left err -> do
      putStrLn "unexpected error:"
      print err
    Right type_ -> do
      putStrLn "unexpected success:"
      print type_

Do you have an example program which freezes for you?

@KiaraGrouwstra
Collaborator

I'm getting the impression something may have gone wrong in my code somehow. I'll investigate and report back if it is about Hint. Thanks again.

@poscat0x04
Author

It seems that using forkProcess (from the package "unix") can completely avoid memory leaks.

@gelisam
Contributor

gelisam commented Apr 22, 2020

@poscat0x04 yes, and this works in general, not just with hint: when you fork a new process (as opposed to a thread), all the memory allocated by that new process belongs to that new process, so the amount of memory allocated to the parent process will not increase and killing the child process will free that memory. There are quite a few disadvantages though: it's more heavyweight (forking a million threads is plausible but a million processes is not), and the parent and child processes are isolated (they can communicate via sockets and files but not via IORefs, MVars, Chans, etc.). But if those constraints work for you, great!
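
For reference, the fork-and-kill approach could look roughly like the sketch below. It assumes a POSIX system and the unix package; runInChild is a hypothetical helper, not something provided by hint or unix. It polls with a non-blocking getProcessStatus rather than wrapping a blocking wait in timeout, since a thread blocked in a foreign call is not interrupted by an async exception.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (evaluate)
import Control.Monad (void)
import System.IO (hFlush, stdout)
import System.Posix.Process (forkProcess, getProcessStatus)
import System.Posix.Signals (sigKILL, signalProcess)
import System.Posix.Types (ProcessID)

-- | Hypothetical helper: run an action in a child process and kill the child
-- if it is still running after roughly @micros@ microseconds.
-- Returns True if the child exited on its own, False if it had to be killed.
runInChild :: Int -> IO () -> IO Bool
runInChild micros action = do
  hFlush stdout  -- avoid duplicating buffered output in the child
  pid <- forkProcess action
  poll pid micros
  where
    step = 10000  -- polling interval: 10 ms

    poll :: ProcessID -> Int -> IO Bool
    poll pid remaining = do
      -- Non-blocking check (first argument False): never wait inside waitpid.
      status <- getProcessStatus False False pid
      case status of
        Just _ -> pure True
        Nothing
          | remaining <= 0 -> do
              signalProcess sigKILL pid
              void (getProcessStatus True False pid)  -- reap the killed child
              pure False
          | otherwise -> threadDelay step >> poll pid (remaining - step)

main :: IO ()
main = do
  quick <- runInChild 2000000 (pure ())
  print quick
  -- Stand-in for a runaway interpreted expression.
  runaway <- runInChild 200000 (void (evaluate (sum [1 ..] :: Integer)))
  print runaway
```

This should print True and then False. Getting a result back from the child would need a pipe, a socket or a file, which this sketch leaves out.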

That being said, I have two follow-up questions.

  1. Does the fact that you are resorting to forking and killing a process mean that timeout still isn't working for you, even if you wrap the code where sum [1..] is forced rather than where it is interpreted?
  2. Do you have an example program in which using forkIO and killThread with hint causes a memory leak? Even if the timeout issue is resolved, that memory leak sounds like a separate bug which should also be addressed.

@poscat0x04
Author

It turns out there are no memory leaks :) I thought that performGC would clean all garbage and reduce the memory consumption to near zero if there are no memory leaks but apparently it doesn't (I actually have no idea how garbage collectors work in general).
This is the code I used:

import Language.Haskell.Interpreter
import System.IO (hFlush, stdout)
import System.Mem (performGC)
import System.Timeout (timeout)

hintTest :: IO ()
hintTest = do
  interpreterResult <- timeout 2000000 $ runInterpreter $ do
    setImports ["Prelude"]
    infiniteThunk <- eval "product [1..]"
    lift $ print infiniteThunk
  case interpreterResult of
    Nothing -> do
      putStrLn "<timedout>"
      putStrLn "timeout during runInterpreter"
    Just (Left err) -> do
      putStrLn "error while evaluating:"
      print err
    Just (Right ()) -> do
      putStrLn "runInterpreter terminated successfully??"
  hFlush stdout
  performGC
  -- keep the process alive so its memory usage can be inspected
  _ <- getLine
  pure ()

@gelisam
Contributor

gelisam commented Apr 22, 2020

performGC would clean all garbage and reduce the memory consumption to near zero if there are no memory leaks but apparently it doesn't

performGC does clean all the garbage, but the memory you plan to use afterwards is not garbage, so "if there are no memory leaks" is not sufficient; you would also need "and there is no non-garbage data being held either".

Let's look at an example program:

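-- Note: getRTSStats throws an exception unless RTS statistics are enabled,
-- e.g. by running the program with +RTS -T.
import GHC.Stats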
import System.Mem

printMemoryConsumption :: IO ()
printMemoryConsumption = do
  performGC
  live_bytes <- gcdetails_live_bytes . gc <$> getRTSStats
  putStrLn $ show live_bytes ++ " bytes"

-- |
-- >>> main
-- 53368 bytes
-- 1000000
-- 40079880 bytes
-- 1000000
-- 80136 bytes
main :: IO ()
main = do
  printMemoryConsumption
  let xs = [1..1000000::Int]
  print (last xs)  -- force xs
  printMemoryConsumption
  print (length xs)  -- xs is retained until this line
  printMemoryConsumption

I am using GHC.Stats to examine the memory usage because, once memory is acquired from the OS, I don't think it gets released back to the OS; it is merely made available for later allocations. For this reason, it can be misleading to look at the operating system's report of how much memory your program uses.

The memory usage is much higher at the second printMemoryConsumption call, but that's not because there is a memory leak; it is simply because xs is not garbage at that point, since it is used on the following line. At the third printMemoryConsumption call, however, xs is garbage, so the memory usage goes down again.

As you can see, the baseline is closer to 50 kB (50 MB in ghci) than to zero, because the runtime system needs to keep track of a bunch of things, such as which threads are running and which blocks of memory are available for allocation. And once xs is released, the memory usage doesn't go all the way back to the baseline. I'm not sure why that happens; if I duplicate that block of code several times (being careful to use a different number than 1000000 so that ghc doesn't optimize the other copies away), we soon reach a fixpoint, so maybe the runtime is simply keeping track of more blocks of memory it has acquired from the OS?

Anyway, even a simple eval "1+1" seems to be using more and more memory, so thanks, you did spot a leak! I've opened issue #96 to track it.
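
One way to check whether an action keeps using more and more memory is to run it repeatedly and record the live bytes after each run. This is only a sketch of that kind of measurement, not the exact code I used; liveBytes and liveBytesSeries are hypothetical helper names, and the action passed in main is a placeholder for something like a runInterpreter call.

```haskell
import Control.Monad (replicateM)
import Data.Word (Word64)
import GHC.Stats (gc, gcdetails_live_bytes, getRTSStats, getRTSStatsEnabled)
import System.Mem (performGC)

-- | Live bytes right after a major GC. Requires RTS stats (+RTS -T).
liveBytes :: IO Word64
liveBytes = do
  performGC
  gcdetails_live_bytes . gc <$> getRTSStats

-- | Hypothetical helper: run the action n times and record the live bytes
-- after each run. A series which keeps growing suggests a leak; a series
-- which levels off suggests the memory is merely being retained.
liveBytesSeries :: Int -> IO () -> IO [Word64]
liveBytesSeries n action = replicateM n (action >> liveBytes)

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats are disabled; rerun with +RTS -T"
    else do
      -- Placeholder action; substitute the code under suspicion.
      series <- liveBytesSeries 5 (print (length [1 .. 100000 :: Int]))
      mapM_ (\b -> putStrLn (show b ++ " bytes")) series
```

Compiling with -with-rtsopts=-T (or with -rtsopts and then passing +RTS -T at run time) enables the statistics.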
