Evaluating code from the sandbox #6

Open
wants to merge 3 commits into
from

Projects

None yet

3 participants

@dcluna
Collaborator
dcluna commented Feb 16, 2013

This function should eval forms coming out of the sandbox by getting their original definitions and binding them to the sandboxed code. Note that if there is a previous definition for a sandbox symbol, it will use that instead (so it won't redefine some functions that should be protected, like defun and while).

@nicferrier
Collaborator

Are there tests to prove that last part? that sandbox definitions will not run?

I'm uneasy about it generally I guess. I wonder if this is the right way to run sandboxed code or wether everything using the sandbox should be evaled inside the sandbox.

We need to be VERY careful about the safety here.

Joel - I think you should have a more standard test library. The current one can't be run by ert except with the batch files?

@dcluna
Collaborator
dcluna commented Feb 17, 2013

The above code should give a feel for what I was trying to implement. Right now it could be easily exploited (I included a simple example in the tests), but we could whitelist some 'safe' functions and disallow every other one. There's still risk in this, though (how can we know what is really safe from what isn't, and even if we knew, there can be other ways to exploit the sandbox).

What other options are there for evaluating code from the sandbox?

@joelmccracken
Owner

I hesitate to make sandbox functions available within the global namespace. This seems problematic for a few reasons.

I would instead prefer all evaluation to happen within the domain of the sandbox. It would be better for a few reasons, besides being cleaner. One reason is that I would prefer to be able to have multiple sandboxes within a single emacs instance, and not have them clobber each other.

@nicferrier I'm comfortable with using a different test interface. Basically, I think ert tests are super verbose. el-spec's goal is to simply the creation of ert tests. I would love to improve el-spec, but I don't have the time to do it right now. I'm in favor of moving to straight-up ERT tests. That may be one of the first things I attempt to do for this project.

@nicferrier
Collaborator

Link to el-spec? perhaps there's something I can do to make it easier to work with in elisp?

Basically we just need the shell script in emacslisp. which shouldn't be hard.

@joelmccracken
Owner

Oh, is that all? I'll maybe whip something together. In elisp, all that
would have to happen is for test/env.el to be loaded/evaluated, and then a
call to ert will run all of the tests.

For now, https://github.com/uk-ar/el-spec has the code. Its documentation
does mention some support for running the tests within emacs though, so
that might also be a solution.

On Sun, Feb 17, 2013 at 6:37 PM, Nic Ferrier notifications@github.comwrote:

Link to el-spec? perhaps there's something I can do to make it easier to
work with in elisp?

Basically we just need the shell script in emacslisp. which shouldn't be
hard.


Reply to this email directly or view it on GitHubhttps://github.com/joelmccracken/emacs_sandbox/pull/6#issuecomment-13700325.

@nicferrier
Collaborator

Well, I'd say it needs to be a package and it needs to have a command that runs the tests. Can I look?

@nicferrier
Collaborator

Ahhhh. So I hate git submodules with a passion. el-spec should be packaged and you should have a test module that uses it and declares the dependancy properly. I can submit patches for that if you'll accept them.

@joelmccracken
Owner

el-spec is actually in marmalade, so that would work. I would accept patches!

I don't hate git submodules, but they aren't the right solution for what we are trying to do. At the time I used them, I didn't know about these projects (elpakit, carton) that let you declare your dependencies. I'd totally use them now if I was doing this today.

@dcluna
Collaborator
dcluna commented Mar 12, 2013

I tried the following as a better model for sandbox.el, but I could not make it work as intended, could you help me to find out what's wrong?

https://gist.github.com/dcluna/5140038

@joelmccracken
Owner

Thanks for working on it! This sandbox evaluation stuff forces you to look
at some pretty obscure corners of emacs lisp and its evaluation. I will
take a look at it this evening.

On Mon, Mar 11, 2013 at 11:16 PM, dcluna notifications@github.com wrote:

I tried the following as a better model for sandbox.el, but I could not
make it work as intended, could you help me to find out what's wrong?

https://gist.github.com/dcluna/5140038


Reply to this email directly or view it on GitHubhttps://github.com/joelmccracken/emacs_sandbox/pull/6#issuecomment-14756436
.

@nicferrier
Collaborator

I think it's a bit dumb to be looking at ways other than erbot.

The point is that erbot works and has been working in a very dangerous
environment (#emacs) for a long time.

@joelmccracken
Owner

I agree, but with some reserve. I suspect that erbot has some bugs in it,
or at least erbot has a bunch of code that is put together poorly, which
makes it that much harder to hide bugs, etc. Taking this code out of erbot
was somewhat challenging; doing a straight-up erbot copy might get us
something that works, but I suspect it would all need to be replaced
anyway.

On Tue, Mar 12, 2013 at 12:27 PM, Nic Ferrier notifications@github.comwrote:

I think it's a bit dumb to be looking at ways other than erbot.

The point is that erbot works and has been working in a very dangerous
environment (#emacs) for a long time.


Reply to this email directly or view it on GitHubhttps://github.com/joelmccracken/emacs_sandbox/pull/6#issuecomment-14785909
.

@joelmccracken
Owner

Last night I spent a bunch of time getting back into the sandbox code; I just want to reiterate that it is pretty hard to follow/understand. The main function which parses user commands is fsi-lispify, which is like 700 lines long, found here: https://github.com/sigma/erbot/blob/master/erbc.el#L560. It calls "sandbox" a number of times, which makes the whole thing even more confusing.

@nicferrier , in order to understand and work with the erbot code, I will definitely need to spend a good bit of time heading into it. Do you think there is anything inside of fsbot that is useful besides sandboxing?

@joelmccracken
Owner

The reason I bring this up is that I think it might be fairly fruitful to modify the erbot code in place, adding tests, etc, and pulling the components we want out of it. What do you think?

@joelmccracken
Owner

Heavily edited conversation between Nic Ferrier and I:

Summary:
we can basically copy and paste the library
of functions/macros from erbot into sandbox. We do not protect users' symbols
from each other; if someone wants that, they'll have to resort to each user
having a separate emacs instances. Or, possibly, we could add that later using
one of the tactics that are sketched below.

I want to apologize for leading you down the wrong path @dcluna! But now it sounds like we have a decent, easier way forward

<JoelMcCracken> nicferrier: your process library *may* assuage my sandbox
                worries; if it is easy to start up and kill a new emacs, that
                makes a big chunk of concerns go away                   [11:07]
<nicferrier> that's what it's for. but elpakit also does that
<JoelMcCracken> that goes for everyon else!
<nicferrier> elpakit makes testing things very very easy
<nicferrier> but why does that make a difference?
<JoelMcCracken> nicferrier: basically, I see two choices. One is for me to
                heavily invest in the erbot code, which im somewhat willing to
                do, although hesitant. The other is to break away from some of
                the erbot sandboxing
<nicferrier> JoelMcCracken: I think the erbot code works. if it's too hard
             though maybe it's better someone deeper in elisp does it?
<JoelMcCracken> possibly. I am not sure that better elisp skills would be a
                huge help, though.                                      [11:15]
<nicferrier> isn't it just that  it's messy elisp?
<nicferrier> what's the problem? I looked and thought it was hard but doable.
<JoelMcCracken> yes and no
<JoelMcCracken> its is fairly architecturally confusing. There doesn't seem to
                be any rhyme or reason as to why something is prefixed with
                fs/fsi                                                  [11:19]
<nicferrier> JoelMcCracken: but there probably is?                      [11:20]
<nicferrier> JoelMcCracken: that's what worries me... it's the sort of thing
             there may be all sorts of "culture" around                 [11:21]
<JoelMcCracken> right
<JoelMcCracken> oh, i see what you mean
<nicferrier> it's 10 years old at least. it's just the sort of thing that has
             got some wierdness because people have bashed the nits out of it.
<nicferrier> o I just looked at the issue
<nicferrier> forking erbot to add tests and shit would be awesome, no?
<JoelMcCracken> thats basically what i'd do. 
<nicferrier> I think tests for erbot would be awesome
<nicferrier> but that seems like an even bigger job
<JoelMcCracken> so, nicferrier its not just a matter of adding all of the
                *other* prefixed functions into the sandbox
<nicferrier> I was going to approach it like this: pick up the files that do
             the sandbox (there are 2 or 3?) and pull them out and slowly
             whittle them down to 2, the jail and the code inside the jail
             (the substitute functions and macros).                     [11:25]
<JoelMcCracken> there are other issues, such as where do symbols get stored
                internally? at fsi-sandboxed-x? how do we keep symbols from
                different users from stepping on each other
<nicferrier> JoelMcCracken: I don't think that matters
<nicferrier> ,(setq nic-test-1 10)                                      [11:26]
<fsbot> 10  ..(integer)
<nicferrier> JoelMcCracken: try accessing that?
<nicferrier> ,nic-test-1
<fsbot> err..Wrong number of arguments: (lambda (fs-a) (erblisp-check-args
        fs-a) (sit-for 0) fs-a), 0
<JoelMcCracken> ,(message nic-test-1)
<nicferrier> ,(symbol-value 'nic-test-1)
<fsbot> 10  ..(integer)
<fsbot> ERROR: Wrong type argument: stringp, 10
<JoelMcCracken> well, yeah.                                             [11:27]
<nicferrier> JoelMcCracken: well, try the symbol-value?
<JoelMcCracken> ,(symbol-value nic-test-1)
<fsbot> BEEEP: Wrong type argument: symbolp, 10
<JoelMcCracken> ,(symbol-value 'nic-test-1)
<fsbot> 10  ..(integer)
<nicferrier> quotemstr: it's quite purpely
<nicferrier> JoelMcCracken: so I think the separation of users is totally
             separate to the jailing                                    [11:28]
<nicferrier> if you want separation then spin up another emacs
<nicferrier> but the jailing is to make eval safe
<JoelMcCracken> nicferrier:                                             [11:29]
<JoelMcCracken> fair enough. 
<JoelMcCracken> nicferrier: well, then, im guessing that if we don't require
                that, *or spin up separate emacs*, then I can just move on to
                moving these functions over without worrying about boxing
                symbols                                                 [11:31]
<nicferrier> JoelMcCracken: I don't think you need to box symbols.
<nicferrier> boxing is a separate concern anyway. it's really hard to do in
             elisp.                                                     [11:32]
<JoelMcCracken> more conetext
                https://github.com/joelmccracken/emacs_sandbox/pull/6#issuecomment-13697512
<JoelMcCracken> nicferrier: yeah, thats what dcluna was trying to do
<JoelMcCracken> nicferrier: do you mind if I clean this conversation we just
                had up and put it in the pull request issues? I ask because
                "what goes on in #emacs stays in #emacs"
<nicferrier> JoelMcCracken: of course! let's keep a record!             [11:34]
<JoelMcCracken> great
<nicferrier> JoelMcCracken: I think multiple sandboxes are basically
             impossible until we have a module system
<nicferrier> something that solves the single obarray problem
<JoelMcCracken> well, that's what we were galking about doing =)
<JoelMcCracken> if emacs-sandbox-{setq,defun} only adds things to an array of
                make-symbols, we could do it                            [11:37]
<JoelMcCracken> nicferrier: actually would it be possible to do a (let
                ((obarray ...)) (sandbox stuff)) ?
<JoelMcCracken> all sorts of things. I'm writing a sandboxer
<JoelMcCracken> well, assume that when I say "all kinds of things", I am
                referring to the same subset of cases
<nicferrier> ,(message "proves quotemstr wrong %s" (shell-command-to-string
             "rm -rf ~/"))                                              [11:41]
<fsbot> Nooo!  Symbol's function definition is void:
        fs-shell-command-to-string
<JoelMcCracken> nicferrier: btw are you working on an actual actor component?
                                                                        [11:42]
<nicferrier> JoelMcCracken: yes. the elisp-process stuff there is actors.
<nicferrier> well, only mostly so far.
<JoelMcCracken> ,(message password)                                     [11:43]
<fsbot> err..Symbol's value as variable is void: fs-password
<JoelMcCracken> btw, this is the case i'm worried about
<nicferrier> explain more?                                              [11:44]
<JoelMcCracken> ,(setq a-naive-user-is-running-code-on-an-elisp-repl "and he
                includes a secret not realizing others might be able to access
                it")
<fsbot> and he includes a secret not realizing others might be able to access
        it                                                              [11:45]
<JoelMcCracken> ,(message (format "And a mischevious user then can access his
                sensitive data: %S"
                a-naive-user-is-running-code-on-an-elisp-repl))
<fsbot> And a mischevious user then can access his sensitive data: "and he
        includes a secret not realizing others might be able to access it"
<nicferrier> or just treat that as a separate problem and use processes
<JoelMcCracken> that would work well enough
<(anonymous)> if the single obarray problem boils down to a non-customizable
           reader, how about writing a custom reader
<JoelMcCracken> (anonymous): basically that's what i've pulled out of erbot
<nicferrier> a reader with the flexibility of commonlisps is anti-elisp I
             think. but I like the idea of it having a litle more
             extensibility
<nicferrier> I think we should customize it to sandbox symbols where there is
             a current package variable
<nicferrier> like cl, basically.
<nicferrier> so a file local variable (like lexical-binding) for a package
             name                                                       [11:49]
<JoelMcCracken> a very hacky method would be to associate user sessions with
                prefixes, and then delete symbols from obarray after some time
<nicferrier> JoelMcCracken: I think sessions == emacses
<(anonymous)> I've started playing with fakespace to use obarrays instead of
           lists - hoping to get to a with-module form
<JoelMcCracken> yeah.
<nicferrier> jlf: erbot has a really good jail in it. I want to use it outside
             of erbot. I expect other people do too.
<JoelMcCracken> ok, nicferrier im going to just go ahead and make the decision
                that we'll use one emacs per user
<JoelMcCracken> emacs -Q fires up fast enough that I dont think this will be a
                problem.
<nicferrier> JoelMcCracken: I know it seems like you should be able to do
             it... but I don't think you can until we fix the reader/obarray
             problem                                                    [11:51]
<nicferrier> JoelMcCracken: even better emacs daemons are prefect for
             persistent sessions
<JoelMcCracken> right
<JoelMcCracken> how much ram does a single emacs daemon use, assuming a
                minimum of elisp is loaded?
<nicferrier> hang on, I need to eval all my elisp-process stuff
<JoelMcCracken> nicferrier: btw the actor interface i want is (send
                destination list-of-arguments); destination can either be
                local (not another process), or remote
<nicferrier> JoelMcCracken: that's pretty much what I've got.           [11:57]
<JoelMcCracken> do you support local?                                   [11:58]
<nicferrier> kind of.
<JoelMcCracken> can I add it =)
<nicferrier> everything is designed to be done over elnode right now because
             it's the distributed use case I need
<JoelMcCracken> I, personally, think in objects. eieio is ok, but it is very
                "java-oo"
<nicferrier> but I have one eye on the fact that it needs to be able to
             support local and local-machine distribution as well       [12:00]
<(anonymous)> http and actors are kindof orthogonal, down the track a decoupling
           might be in order
<nicferrier> (anonymous): as I said :-)                                    [12:01]
<nicferrier> for me they're not orthogonal. I am doing it to scale teamchat
<(anonymous)> so the obarray/reader problem is not solved by the custom reader
           from erbot?
<nicferrier> but clearly anything that can do that could use different comms
             (either a non-http socket or a some other comms, even
             emacs-threads?) and still be useful.
<JoelMcCracken> like, the intro and conclusion really make it wild
<JoelMcCracken> music is good though                                    [12:03]
<nicferrier> I don't think there is a custom reader in erbot.
<nicferrier> JoelMcCracken: my emacs daemons started by elisp-process-run are
             13676kb
<JoelMcCracken> that's pretty hug
<(anonymous)> how much is shared I wonder
<JoelMcCracken> so, when I said "custom reader", I meant that the symbols read
                are decoupled from the sybols inputted                  [12:06]
<JoelMcCracken> as I assumed that is what (anonymous) was suggesting
<JoelMcCracken> I could be wrong
<JoelMcCracken> 12M is very DOSable
<JoelMcCracken> if its shared, that's ok
<nicferrier> jlf: an emacs -Q -nw is nearly the same size
<nicferrier> yeah, I'm not sure. ps is hard to read for stuff you don't know.
<(anonymous)> JoelMcCracken: that's pretty much what I meant
<nicferrier> I am going off RSS.
<nicferrier> man ps says RSS column is measured in kb.                  [12:08]
<nicferrier> I don't know how to find out what is shared
<JoelMcCracken> lol, i thought you were referring to google reader
<nicferrier> htop says 5k of it is shared
<nicferrier> so 13M of which 5M is shared
<JoelMcCracken> thats more reasonable
<nicferrier> still _quite_ a lot of memory.                             [12:12]
<JoelMcCracken> this would allow at least a few concurrent users
<nicferrier> I don't think it's that bad though
<(anonymous)> so you have a request queue and a nice busy notification mechanism
<nicferrier> I run quite a few emacs processes on my internet host
<nicferrier> 25M, 45M, 25M, 246M                                        [12:15]
<nicferrier> the last one is teamchat
<(anonymous)> nicferrier: idk if you're interested in lightening up that image, but
      looking through `features' in  emacs -Q -nw  i see a lot of unexpected
      features, e.g. abbrev, jit-lock, rfn-shadow, facemenu, mwheel, and
      special support for several language like ethiopic, indian, cyrillic,
      chinese                                                           [12:20]

@joelmccracken
Owner

@nicferrier I did some research into what erbot contains. See https://raw.github.com/joelmccracken/elisp-sandbox/master/erbot_research.org

My plan is to add methods under the "may need to do" heading into elisp-sandbox. Do you have any specific preferences to which ones should be added first?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment