Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a runtime timeout using context #26

Closed
wants to merge 3 commits into from

Conversation

lateefj
Copy link

@lateefj lateefj commented May 14, 2019

For embedded use a runtime timeout would make running Awk much safer IMO. I have read using context does impact performance however it doesn't leak goroutines. This was the simplest way I could think of to add a timeout to the runtime of the script.

@lateefj
Copy link
Author

lateefj commented May 14, 2019

Might be worth exposing the context from an API perspective as this Lua implementation has.

@benhoyt
Copy link
Owner

benhoyt commented May 15, 2019

This is a good idea, thank you. I'll take a look in the next week or so.

@benhoyt
Copy link
Owner

benhoyt commented May 21, 2019

Hi @lateefj, thanks for this, I like this idea for safety in an embedded context (eg, a denial-of-service attack caused by a user-submitted BEGIN { while (1); } script or something).

I've played with the idea a bit, and the thing I don't like about this implementation is that if the script does time out, it'll return to the calling code, but the goroutine running execProgram will continue to run merrily in the background, eating up CPU cycles and leaking a goroutine.

So I think we need a way for the interpreter to check for the timeout at various points and terminate if it's timed out. yuin/gopher-lua does this by checking ctx.Done() every time around the bytecode loop, but it adds a lot of overhead. I think in GoAWK we'd need to check in ForStmt, WhileStmt, DoWhileStmt, and the constructs that call os/exec (pass the ctx down). Everything else should terminate ... hmm, but maybe I'm thinking about this wrong.

Also, I can't decide on whether the API should be adding a config.Timeout field as you've done, or adding a new ExecProgramContext function that takes a ctx is its first argument (and you can specify a timeout using context.WithTimeout if you want). I'm leaning towards the latter as it seems more Go-ish (and we could pass the ctx down to os/exec).

Repository owner deleted a comment from lateefj May 21, 2019
@lateefj
Copy link
Author

lateefj commented May 22, 2019

I tried to use the context approach to extend the API but I couldn't seem to make it work without leaking coroutines. I created another implementation here for comparison.

DOH: I intended to put this in another branch but late night coding...

@benhoyt
Copy link
Owner

benhoyt commented Jun 11, 2019

Hi @lateefj, I've been playing with this and different approaches. I have a version of it in my working copy, but unfortunately it slows things down by 10% or so even with the timeout turned off (and much more with it turned on). gopher-lua handles this by duplicating the main VM dispatcher loop, but it's only a few lines of code. With GoAWK I'd have to duplicate a lot of code to make this fast. I guess I might be able to use a constant with a build flag to enable it, but I'm not too familiar with that approach.

The other concern I have here is just because a script has a timeout doesn't make it safe for the script to come from user input, because you can fairly easily write an AWK script which uses hundreds of GB of memory and kill resources that way instead (memory instead of time). In general, protecting against resource usage is a hard problem especially in a GC'd language. If you need safety for user scripts (think Go Playground), it's safest to run the whole process in a sandboxed process with OS-enforced limits.

Overall I think I'd rather have this driven off a real use case. I don't have one at this point. Is yours a real use case where you needed a timeout? If so, what is the use case?

@lateefj
Copy link
Author

lateefj commented Jun 15, 2019

@benhoyt I think I generally agree with your assessment. To sandbox it would need to be an upfront design decision. I wanted to experiment with adding json as an input type and use AWK as a general purpose tool for processing it. Build tags are a good idea but not sure how they would work as a library.

@benhoyt
Copy link
Owner

benhoyt commented Jun 19, 2019

Going to close this for now, per discussion above.

@benhoyt benhoyt closed this Jun 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants