Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a flag to control execution length/time #348

Closed
jeroenvandijk opened this issue Jun 11, 2020 · 19 comments
Closed

Add a flag to control execution length/time #348

jeroenvandijk opened this issue Jun 11, 2020 · 19 comments

Comments

@jeroenvandijk
Copy link
Contributor

Problem statement

Sci cannot detect when it is running code that will run forever. Clojure cannot do this either, but the flag :termination-safe suggests that Sci is able to do this. A counter example is:

((fn a [n] (if (#{0 1} n) 1 (+ (a (- n 2)) (a (- n 1)))))  30)

This code takes more than 6 seconds with n=30 via babashka. With n=100 this code will practically never finish.

Why does this matter?

A Babashka user is fully aware of what is being executed and will therefore be responsible for the code that is being run. If it takes to long, one CTRL-C will kill the program. However when you use Sci to run third party code in your JVM or in a web browser things start to change. Think of the following examples:

  • A multi tenant webserver (something like PCP), one mistake can occupy one full cpu process
  • A website that publically hosts Sci scripts (Athens is likely to use Sci for dynamic content. A mistake here can make a browser tab unresponsive)

In the JVM example one can use an approach like Clojail to work around this. In browser there are no threads and this becomes a different story. One needs to hack with webworkers to make it safe (not sure how succesful this will be). In both cases it would be cleaner if Sci had an option to protect us against endless execution. It would allow us to provide better error messages and remove complexity in the hosting code.

Potential solution(s)

  1. Have a counter that tracks the number of executions. Stop when the counter reaches a limit
  2. Stop after a certain expired time. This requires to check every so often (a counter) for the elapsed time. This makes it an extension of 1.

Other requirements of potential solution

When the safety flag is turned off it should not affect performance.

Alternatives to control execution time

  • JVM: use Threads and Thread.stop (deprecated)
  • Browser: use web workers (not sure how safe)

Open questions

  • How do we name this option?
  • Do we want number of iterations or elapsed time or both?

Other examples

Background of this issue can be found here https://clojurians-log.clojureverse.org/babashka/2020-06-10/1591808634.003100

@borkdude
Copy link
Collaborator

One detail concerning 1.: probably that counter should be global, since it will probably be possible to write a doseq (either manually or via a macro) that can still loop lots of times when max-iterations is set to a fixed low number.

@borkdude
Copy link
Collaborator

Another detail concerning 1 and #349: we might want to reset the counter at every new invocation of eval-string* when the user evaluates multiple expressions using the same context (as produced with sci/init).

@borkdude
Copy link
Collaborator

borkdude commented Jul 7, 2020

An alternative to iterate-max could be an interrupt. E.g.:

(def interrupt-state (atom false))
(future (Thread/sleep 10000) (reset! interrupt-state true))
(sci/eval-string "(loop [] (recur))" {:interrupt interrupt-state})

This takes forever, but after 10 seconds the interrupt-state is set to false which will make sci stop execution.
Could also work in JS using setTimeout.

Something like this could work for babashka/babashka.nrepl#23.

Note that this won't work for (sci/eval-string "(Thread/sleep 10000000000)" ) but so won't iterate-max.

@jeroenvandijk
Copy link
Contributor Author

Another option would to be to be able to pass a function which is called on every step and when it returns false iteration is stopped:

(def interrupt-flag (atom true)
(def step-fn (fn [] @interrupt-flag))
(future (Thread/sleep 10000) (reset! interrupt-flag false))
(sci/eval-string "(loop [] (recur))" {:step-fn step-fn})

By making it a function the user can also choose between invocation counting or a timeout or both, or something else.

Regarding

Note that this won't work for (sci/eval-string "(Thread/sleep 10000000000)" ) but so won't iterate-max.

This can easily be done by wrapping the evaluation in a future and call future-cancel after a timeout:`

(time (let [fut (future (Thread/sleep 1000000000))] (future-cancel fut) (try @fut (catch Exception e))))
"Elapsed time: 1.756845 msecs"
nil

@borkdude
Copy link
Collaborator

See https://stackoverflow.com/questions/11520394/why-do-cancelled-clojure-futures-continue-using-cpu why future-cancel is not a proper solution in all cases.

@jeroenvandijk
Copy link
Contributor Author

jeroenvandijk commented Jul 16, 2020

@borkdude yeah I mostly meant that future-cancel can solve the case(s) that an interruption flow of Sci can't and vice versa. So it needs to be a combination of both (when one lets the bindings to allow Thread/sleep that is).

@borkdude
Copy link
Collaborator

borkdude commented Sep 2, 2020

@jeroenvandijk I merged the PR #349 branch to a branch in sci and made it up to date: iterate-max-b and invoke-callback.

Right now the counter is global and that may be problematic:

Clojure 1.10.1
user=> (require '[sci.core :as sci])
nil
user=> (def base (sci/init {:iterate-max 20}))
#'user/base
user=> (sci/eval-string* (sci/fork base) "(loop [i 0] (if (< i 10) (recur (inc i)) i))")
10
user=> (sci/eval-string* (sci/fork base) "(loop [i 0] (if (< i 10) (recur (inc i)) i))")
Execution error (ExceptionInfo) at sci.impl.fns/parse-fn-args+body$increase-count (fns.cljc:28).
Maximum number of iterations reached

Maybe we should make the counter a per-function thing? Or not explicitly use a counter at all, but a callback, so the user can do whatever they want:

user=> (def base (sci/init {:invoke-callback (fn [] (prn @cnt) (when (zero? (swap! cnt dec)) (throw (ex-info "No!" {}))))}))
#'user/base
user=> (sci/eval-string* (sci/fork base) "(loop [i 0] (if (< i 10) (recur (inc i)) i))")
20
19
18
17
16
15
14
13
12
11
10
10
user=> (sci/eval-string* (sci/fork base) "(loop [i 0] (if (< i 10) (recur (inc i)) i))")
9
8
7
6
5
4
3
2
1
Execution error (ExceptionInfo) at user/fn (REPL:1).
No!
user=> (reset! cnt 20)
20
user=> (sci/eval-string* (sci/fork base) "(loop [i 0] (if (< i 10) (recur (inc i)) i))")
20
19
18
17
16
15
14
13
12
11
10
10

I think it's best to make the invoke-callback take a map with whatever options we need in the future, e.g the function name, arguments, etc.

@borkdude
Copy link
Collaborator

borkdude commented Sep 2, 2020

@jeroenvandijk A thing that crossed my mind with this example:

"((fn a [n] (if (#{0 1} n) 1 (+ (a (- n 2)) (a (- n 1))))) 30)"

Since we forbid recur, we could also forbid self-referential functions like these in :termination-safe.

But unfortunately I found an example in which that doesn't work:

(letfn [(a1 [n] (if (#{0 1} n) 1 (+ (a2 (- n 2)) (a3 (- n 1)))))
        (a2 [n] (a1 n))
        (a3 [n] (a1 n))]
  (a1 30))

So probably the simplest option is to have a (global) interruption hook/callback like above. Still needs a bit of hammock time...

@jeroenvandijk
Copy link
Contributor Author

@borkdude Yeah I think I agree that there is no perfect solution for every case. A hook/callback will allow any runtime (babashka, sci in js, sci in the jvm) to make their own trade offs and solutions to this problem.

So the counter i proposed in #349 is probably a less ideal solution than having a hook that would allow you to implement this same counter

@borkdude
Copy link
Collaborator

borkdude commented Sep 2, 2020

@jeroenvandijk Ok, I made a branch invoke-callback and ported your tests to it. I think I'll change the name, but the basic idea is there. One can also implement the timeout option with it.

@borkdude
Copy link
Collaborator

Another example which can take quite a while:

(sci/eval-string "(count (for [i (range 100) j (range 100) k (range 100)] [i j k]))" {:realize-max 1000000})

but for lower values of :realize-max it's ok.

Another example that might not be easy to catch:

(sci/eval-string "(vec (byte-array 1000000000))")

@borkdude
Copy link
Collaborator

borkdude commented Sep 15, 2020

Another example in which neither :realize-max nor :invoke-callback helps, since mapv is just a built-in function so it never reaches our callback:

(sci/eval-string "(mapv inc (vec (byte-array 1000000)))" {:realize-max 1000 :invoke-callback (fn [_] (prn :hello))})

The expression still terminates to the name :termination-safe is still fitting, but it can take a long while (depending on the number used in the above example). Maybe there's not much we can do in general about this and sci should be treated as safe in the sense that a user can't execute certain functions by limiting or forbidding certain vars and not allowing infinite seqs like (range), but not safe in the sense of execution time length.

On the JVM this can be wrapped in a thread which you can forcibly kill. On the JS side I'm not certain what's a more fundamental way to solve this.

@borkdude
Copy link
Collaborator

Another example in https://babashka.org/xterm-sci/:

(count (vec (make-array js/Object 1000000000)))

@borkdude
Copy link
Collaborator

I just checked this example (count (for [i (range 100) j (range 100) k (range 100)] [i j k])) and it's going through loop and loop is implemented using a recursive function, that's why invoke-callback will see it and therefore you will be able to stop it. But these might be implementation details.

@borkdude
Copy link
Collaborator

As for byte-array, vec, etc, people could re-implement those, to make them safer (and calling back into invoke-callback) at the cost of a performance penalty. This is more work for the user and it will require going through all of core's functions to see if there are potential holes.

@borkdude
Copy link
Collaborator

For now I removed support for :realize-max and :preset termination-safe via #429.

@borkdude
Copy link
Collaborator

Interestingly GraalVM (Enterprise?) 20.3.0 has added support to run certain stuff for a certain amount of time:

Screenshot 2020-11-18 at 20 23 37
Screenshot 2020-11-18 at 20 24 31

@borkdude
Copy link
Collaborator

@jeroenvandijk I don't know if you are using SCI on Node.js but there I found a proper solution for this problem, using vm.

You can see this in action when you type npx nbb, then run (range) in the REPL and then press ctrl-C.

@jeroenvandijk
Copy link
Contributor Author

@borkdude Not using Sci on Node.js yet, but this looks interesting. Thanks for the heads up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants