
Implement an efficient process pool for sandboxing evaluation of foreign computations #24

Closed
pchiusano opened this issue Jul 28, 2015 · 9 comments

Comments

@pchiusano
Member

This is a fun, important, fairly self-contained project that we aren't blocked on right now, and it requires minimal background. If you'd like to get involved in Unison development, read on and see if you'd like to take the lead on an implementation!

This project will be an important component of the distributed systems API. Reading or at least skimming that post is probably good background but isn't strictly necessary.

When a Unison node receives a computation to evaluate from another node (a "foreign computation"), we currently evaluate it in the same process as the node itself. This is bad for a few reasons:

  • What if the foreign computation is an infinite loop? (Or a computation that provably terminates but which takes longer than the age of the universe to do so...)
  • What if the foreign computation deliberately leaks a huge amount of memory?
  • There's also the concern that the foreign computation will do something evil like delete random files on our filesystem. That level of sandboxing is handled as a separate layer. I'll touch on how this relates to this project a bit later.

Since we don't necessarily trust the foreign computation with the full set of CPU and memory resources available to a Unison node, we need to run foreign computations in some sort of sandbox. Here's the API (subject to tweaking, but this is probably a good start):

module Unison.Runtime.ProcessPool where

import Data.Bytes.Serial
import System.Process (ProcessHandle)

newtype TimeBudget = Seconds Int
newtype SpaceBudget = Megabytes Int
type Budget = (TimeBudget, SpaceBudget)
newtype MaxProcesses = MaxProcesses Int
data Err = TimeExceeded | SpaceExceeded | Killed | InsufficientProcesses
type ID = Int

data Pool a b = Pool {
  -- | Evaluate the thunk in a separate process, with the given budget. 
  -- If there is no available process for that `Budget`, create a new one, 
  -- unless that would exceed the `MaxProcesses` bound, in which case
  -- fail fast with `Left InsufficientProcesses`.
  evaluate :: Budget -> MaxProcesses -> ID -> a -> IO (Either Err b),

  -- | Forcibly kill the process associated with an ID. Any prior `evaluate` for
  -- that `ID` should complete with `Left Killed`.
  kill :: ID -> IO (),

  -- | Shutdown the entire pool. After this completes, no other processes should be running
  shutdown :: IO ()
}

pool :: (Serial a, Serial b) => IO ProcessHandle -> IO (Pool a b)
pool createWorker = _todo

That is the full API. The implementation should be backed by a growable pool of processes. (If Haskell threads could specify a max heap size on startup, we could do everything in-process, but unfortunately, that isn't supported and it doesn't look like it's happening anytime soon.)

Here's a simple sketch of an implementation:

  • When the pool is created, launch a local (in-process) thread. Conceptually, this thread keeps a few pieces of state:
    • available :: Map (TimeBudget, SpaceBudget) [ProcessHandle], which is the list of free worker processes ("workers") associated with each budget. We don't literally want to spin up a new OS process every time evaluate gets called.
    • running :: Map (TimeBudget, SpaceBudget) [ProcessHandle], which is the list of processes that are currently running a call to evaluate.
    • ids :: Map ID [ProcessHandle], storing the mapping from ID to processes with that ID.
  • When evaluate gets called, serialize the thunk using the argument passed to pool. Look up in available to see if there's an existing process configured with that budget which happens to be free:
    • If there isn't, check that creating a new process wouldn't exceed the maximum number of processes. If it would, fail fast with Left InsufficientProcesses. If not, spin up a new process with that budget, add it to the available map, and move to the next step.
    • If so, send the serialized thunk to that process using inter-process communication (or some socket abstraction that uses IPC when on the same machine) and wait for the reply, which is deserialized to an Either Err b.
    • When results come back, update the available, running, and ids state accordingly.
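The bookkeeping in the steps above can be sketched as a pure state transition. This is a minimal sketch, not the real implementation: it uses a hypothetical stand-in for ProcessHandle so it is self-contained, and it only shows the "claim a free worker" move from available to running.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- hypothetical stand-in for System.Process.ProcessHandle, so the
-- sketch is self-contained
newtype ProcessHandle = ProcessHandle Int deriving (Eq, Show)

newtype TimeBudget = Seconds Int deriving (Eq, Ord, Show)
newtype SpaceBudget = Megabytes Int deriving (Eq, Ord, Show)
type Budget = (TimeBudget, SpaceBudget)
type ID = Int

-- the three pieces of state from the sketch above
data PoolState = PoolState
  { available :: Map Budget [ProcessHandle]  -- free workers per budget
  , running   :: Map Budget [ProcessHandle]  -- busy workers per budget
  , ids       :: Map ID [ProcessHandle]      -- evaluation ID -> its processes
  } deriving Show

-- claim one free worker for a budget, moving it from available to running;
-- Nothing means the caller must spin up a new process (or fail fast)
checkout :: Budget -> ID -> PoolState -> Maybe (ProcessHandle, PoolState)
checkout b i st = case Map.findWithDefault [] b (available st) of
  []       -> Nothing
  (p:rest) -> Just ( p
                   , st { available = Map.insert b rest (available st)
                        , running   = Map.insertWith (++) b [p] (running st)
                        , ids       = Map.insertWith (++) i [p] (ids st)
                        } )
```

The inverse move (returning a finished worker to available and clearing its ID) would follow the same shape.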

Note: Any restriction of privileges other than time / space budgeting will be handled before a call to evaluate. So for instance, if we want to disallow write access to the node's local data store, this would be implemented by inspecting the term, and making sure it cannot reference any such functions. We'll call this a "capability failure" vs a "resource budget failure" caused by a computation exceeding its time or space budget.

The pool is backed by a number of worker processes (or just "workers"). A worker process will be initialized with a CPU and space budget (probably via command line flags), and its main logic will be some function of type a -> IO b:

module Unison.Runtime.Worker where

worker :: (Serial a, Serial b) => (a -> IO b) -> IO ()
worker eval = ...

main :: IO ()
main = worker _todo

The time budget will be handled internally by the Haskell code, but the memory budget will have to be handled via an RTS flag. It looks like myprog +RTS -M1024m will limit myprog to a maximum heap of 1024 megabytes. The time budget should be handled internally so that the same worker can be reused, rather than having to spin up a new process every time. It will be quite common to have lots of sequential requests with the same budget.
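The in-process time budget could be sketched with System.Timeout, which is the standard way to bound an IO action in GHC. The Err type and the wrapper name here are illustrative, not part of the proposed API:

```haskell
import System.Timeout (timeout)

data Err = TimeExceeded | SpaceExceeded | Killed deriving (Eq, Show)

-- run one evaluation under a time budget, in-process, so the same
-- worker can be reused afterwards; the memory budget is assumed to be
-- fixed once at worker startup via the RTS flag (+RTS -M<n>m)
withTimeBudget :: Int -> IO a -> IO (Either Err a)
withTimeBudget seconds io = do
  r <- timeout (seconds * 1000000) io  -- timeout takes microseconds
  pure (maybe (Left TimeExceeded) Right r)
```

Note that timeout can only interrupt Haskell code that allocates; a tight non-allocating loop may need extra care.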

If you are interested in this project and have questions (or suggestions), please post them here, or come discuss in the chat room.

@sfultong
Member

I'll take this on, if no one else is working on it

@pchiusano
Member Author

Sure, though can you prioritize finishing up the key-value store bindings, at least when you aren't bottlenecked by me needing to review/merge stuff?


@sfultong
Member

Sure thing

@tmciver
Contributor

tmciver commented Jun 9, 2016

I started looking into this and have a few questions:

  • Why does the evaluate function take a MaxProcesses? I would think MaxProcesses would be an attribute of a Pool and therefore be a parameter of the pool function.
  • How does a client of Pool get access to process IDs (which are needed by evaluate and kill)? Seems to me that process IDs should be internal to a Pool.

I think it would help me to see how some client code would use a Pool. In particular, I'd like to see the interaction between a Pool and its worker processes.

@pchiusano
Member Author

@tmciver

Why does the evaluate function take a MaxProcesses? I would think MaxProcesses would be an attribute of a Pool and therefore be a parameter of the pool function.

I think you're right. Not sure what I was thinking there. Let's go with that.

How does a client of Pool get access to process IDs (which are needed by evaluate and kill)? Seems to me that process IDs should be internal to a Pool.

With the API I gave, the client of Pool just makes them up before calling evaluate. But now that you mention it I think it would be simpler for Pool to just invent them itself and return the ID it picks immediately:

evaluate :: Budget -> a -> IO (ID, IO (Either Err b)),

So the outer IO (ID, ..) completes immediately, and the inner IO (Either Err b) completes whenever the worker finishes.
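A usage sketch of the revised signature might look like this. The toyPool here is a hypothetical in-process stub (no real worker processes) purely to show the two-stage shape of evaluate; the names toyPool and client are not part of the proposal:

```haskell
-- minimal stand-ins so the sketch compiles on its own
data Err = TimeExceeded | SpaceExceeded | Killed deriving (Eq, Show)
newtype TimeBudget = Seconds Int
newtype SpaceBudget = Megabytes Int
type Budget = (TimeBudget, SpaceBudget)
type ID = Int

data Pool a b = Pool
  { evaluate :: Budget -> a -> IO (ID, IO (Either Err b))
  , kill     :: ID -> IO ()
  }

-- a toy in-process pool: evaluates immediately, for illustration only
toyPool :: (a -> b) -> Pool a b
toyPool f = Pool
  { evaluate = \_ a -> pure (0, pure (Right (f a)))
  , kill     = \_ -> pure ()
  }

client :: IO (ID, Either Err Int)
client = do
  let p = toyPool (+ (1 :: Int))
  (i, wait) <- evaluate p (Seconds 5, Megabytes 100) 41  -- ID comes back immediately
  r <- wait                                              -- blocks until the worker finishes
  pure (i, r)
```

With this shape the client can hold on to the ID and call kill while the inner IO is still pending.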

I think it would help me to see how some client code would use a Pool. In particular, I'd like to see the interaction between a Pool and its worker processes.

I'm a bit hazy on this myself, but client code would just call the evaluate function, and maybe kill if a computation was taking too long. What the evaluate function computes is going to be determined by whatever the worker processes do.

The worker process would just read a stream of Packet a from standard in (you could use some other IPC mechanism, but standard in / standard out seems reasonable) using deserialize, call its evaluate function (catching any exceptions), and write results incrementally to standard out using serialize. Packet might look something like:

data Packet a = Kill | Done | Eval a deriving (Generic)
instance Serial a => Serial (Packet a)

Kill would indicate that the most recently started computation should be killed, Done indicates the overall process should be shut down following completion of any outstanding evaluation, and Eval a is a new computation to evaluate. Within a worker process, only one thing will ever be getting evaluated at a time.
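The control logic just described can be modeled purely, ignoring IPC and serialization. This is only a sketch of the packet interpretation, treating each evaluation as instantaneous, so Kill has nothing mid-flight to stop; the real worker would interleave this with deserialize/serialize on standard in/out:

```haskell
data Packet a = Kill | Done | Eval a deriving Show

-- interpret packets one at a time, as the real worker would after
-- deserializing them from standard in
stepPackets :: (a -> b) -> [Packet a] -> [b]
stepPackets eval (Eval a : ps) = eval a : stepPackets eval ps
stepPackets _    (Done : _)    = []                   -- shut down; ignore the rest
stepPackets eval (Kill : ps)   = stepPackets eval ps  -- nothing running in this sequential model
stepPackets _    []            = []
```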

So the implementation of Pool is going to spin up several worker processes using the provided action, keep track of their state (whether busy or idle), and send them work to do over their standard in. The workers will receive a stream of work to do / control messages, will do work, and serialize results to standard out where they'll be read by the process where the Pool lives.

Do you think that gives you enough to go on, or do you have more questions? Obviously as stuff comes up we can talk more. :)

tmciver added a commit to tmciver/unison that referenced this issue Jun 15, 2016
unisonweb#24) with some minor changes to make it compile.
@tmciver
Contributor

tmciver commented Jun 15, 2016

@pchiusano Thanks for elaborating. I've started a branch in my fork for this work (there's no implementation yet): https://github.com/tmciver/unison/tree/process-pool.

A few more questions:

  • You've changed the return type of evaluate to contain the ID of the process being used for the evaluation, but what should be returned for the ID in the case where previously we would have failed fast with an InsufficientProcesses error?

  • In the description of the API at the top of the page you say

    When evaluate gets called, serialize the thunk using the argument passed to pool.

    What is the thunk? Is it the value of type a? Isn't a thunk a function?

  • Probably related to the above question: what is the function a -> IO b passed to the worker function? At first I thought it would be a function given to us by the client that we would then evaluate. But then I thought workers were an implementation detail of a Pool.

@pchiusano
Member Author

@tmciver Okay, I have been giving this some thought, and realized that sandboxing can be done in a way that is more elegant and makes this process pool unneeded. So I'm going to close this issue. Really sorry to pull the rug out from under you. :( I'll be on the lookout for other bite-sized projects.

Briefly, the new design (which I'll write up in a post) is that nodes are much lighter weight things (many nodes can be running on a single machine), and you can even spawn nodes dynamically, using a new primitive like:

-- | Create a new local node, running in the specified sandbox
spawn : Budget -> Sandbox -> Remote! Node

Where Budget specifies an amount of compute resources (memory and/or time) and Sandbox is some description of the capabilities the returned Node is allowed access to. Some details to be worked out but you get the idea.

I think this is a more elegant approach. Any pooling of nodes can be done just using pure Unison code. And when sharing node references with others, you can choose to share a sandboxed node that has more limited capabilities.

For the implementation, each node will be backed by a single OS process if it's 'awake', or if it's 'asleep' it will just be backed by some bits on disk.

I'm going to do some prototyping of this architecture before committing to it, but once it is more solid I think there will be some good bite-sized pieces for others to work on.

@tmciver
Contributor

tmciver commented Jun 15, 2016

OK, no problem. Let me know when there's more work to do; I'm eager to contribute to this project and hone my skills!

@pchiusano
Member Author

Cool. :)
