Building tasks with dynamic outputs using the restarting scheduler #19
Comments
A note about so-called forward-defined build systems, like Fabricate: if you have a forward-defined build system, you have a total order on build tasks, which automatically prevents cyclic dependencies. Furthermore, it means that you never need to block tasks: by the time you reach a task in the working queue, all of its dependencies must have been either skipped or rebuilt, so you can decide on the spot whether to rebuild it. In this way, build systems like Fabricate can be thought of as a special (trivial, really) case when the scheduler is just …
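To make the "trivial scheduler" reading concrete, here is a minimal sketch (hypothetical types and names, not Fabricate's actual API): tasks come in a total order in which every task's dependencies appear before it, so a single left-to-right fold can decide each task on the spot without ever blocking.

```haskell
import qualified Data.Set as Set

-- A forward-defined task: a target and the keys it depends on, with the
-- invariant that dependencies always appear earlier in the task list.
data FwdTask k = FwdTask { target :: k, deps :: [k] }

-- Rebuild a task iff any of its dependencies changed or was itself rebuilt.
-- Thanks to the total order, one pass suffices and no task is ever blocked.
forwardPass :: Ord k => Set.Set k -> [FwdTask k] -> Set.Set k
forwardPass changed = foldl step changed
  where
    step rebuilt t
      | any (`Set.member` rebuilt) (deps t) = Set.insert (target t) rebuilt
      | otherwise                           = rebuilt

main :: IO ()
main = print . Set.toList $
    forwardPass (Set.fromList ["a.c"])      -- "a.c" has changed
                [ FwdTask "a.o" ["a.c"]     -- compile
                , FwdTask "app" ["a.o"] ]   -- link
```

Running `main` prints `["a.c","a.o","app"]`: the change to `a.c` propagates through the fixed order, and at no point does the fold need to defer a task.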
Interesting, my simpler and less formal way of saying it is:
The interesting thing here is that it assumes a finite set of rules. Note that Shake doesn't have a finite set of rules, as something like a rule producing …
Agreed on the Fabricate remark: it feels like the degenerate case of this example, but it does capture the essence to some degree, which all our previous models failed to do. Maybe the powerful thing about Fabricate is actually that the rules are "finite", or more precisely calculated from the set of produced files? …
@ndmitchell Agreed, this is an interesting aspect that I'm not sure how to deal with yet. In our current model, the map
Indeed, Fabricate seems to be different from other build systems in that all build rules are "singular" (i.e. not "template") and are given exactly as a finite list (if we assume one can't write an infinite Fabricate script with some kind of recursion).
Even if they were applicative, the fact that you have an infinite number of them still seems problematic. And if they are finite, you don't need the applicative. You could write an infinite Fabricate script, but assuming it terminates, it will only be able to go down one path. It's a weird kind of finite, but definitely related.
Here is an example of how one could go about compiling a collection of files with read/write tasks:

```haskell
type Get k f = forall a. k a -> f a
type Put k f = forall a. k a -> f a -> f a

type Task c k a = forall f. c f => Get k f -> Put k f -> f a

data Key a where
    Dir  :: FilePath -> Key [FilePath]
    File :: FilePath -> Key String

compileAllCFiles :: Task Monad Key ()
compileAllCFiles get put = do
    files <- get (Dir "src/c/")
    srcs  <- traverse (get . File) files
    let objs = [ (File (f ++ ".o"), compileC o) | (f, o) <- zip files srcs ]
    void $ traverse (uncurry put) objs
  where
    compileC = pure . id -- insert a C compiler here
```

An important aspect here is that … Note also that … We could put this …
To elaborate the above example a bit more:

```haskell
type Get k f = forall a. k a -> f a
type Put k f = forall a. k a -> f a -> f a

type Task c k a = forall f. c f => Get k f -> Put k f -> f a

data Key a where
    Dir  :: FilePath -> Key [FilePath]
    File :: FilePath -> Key String

compileAllCFiles :: Task Monad Key ()
compileAllCFiles get put = do
    srcs <- get (Dir "src/c/")
    void $ traverse (\src -> compileC src get put) srcs -- independent/parallel

compileC :: FilePath -> Task Monad Key ()
compileC cFile get put = do
    let objFile = cFile ++ ".o"
    src  <- get (File cFile)
    deps <- traverse (get . File) (cDependencies src)
    void $ put (File objFile) (pure $ compile src deps)
  where
    cDependencies _src = [] -- insert dependency analysis here
    compile src _deps  = src -- insert a C compiler here
```
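For concreteness, here is one way to actually run the elaborated example: a self-contained sketch where `Get`/`Put` are interpreted in `IO` over an `IORef` holding an in-memory map from paths to file contents. The `Get`/`Put`/`Task`/`Key` definitions and `compileAllCFiles`/`compileC` are taken from the thread; the `get'`/`put'` interpretation and the toy store are our own assumptions.

```haskell
{-# LANGUAGE ConstraintKinds, GADTs, RankNTypes #-}

import Control.Monad (void)
import Data.IORef
import Data.List (isPrefixOf, isSuffixOf)
import qualified Data.Map as Map

-- Definitions from the discussion above
type Get k f = forall a. k a -> f a
type Put k f = forall a. k a -> f a -> f a
type Task c k a = forall f. c f => Get k f -> Put k f -> f a

data Key a where
    Dir  :: FilePath -> Key [FilePath]
    File :: FilePath -> Key String

compileAllCFiles :: Task Monad Key ()
compileAllCFiles get put = do
    srcs <- get (Dir "src/c/")
    void $ traverse (\src -> compileC src get put) srcs

compileC :: FilePath -> Task Monad Key ()
compileC cFile get put = do
    let objFile = cFile ++ ".o"
    src  <- get (File cFile)
    deps <- traverse (get . File) (cDependencies src)
    void $ put (File objFile) (pure $ compile src deps)
  where
    cDependencies _src = [] -- insert dependency analysis here
    compile src _deps  = src -- insert a C compiler here

-- A toy in-memory "file system": a map from paths to contents held in an
-- IORef, with Get/Put interpreted in IO (our own assumption, not part of
-- the thread's code).
main :: IO ()
main = do
    store <- newIORef (Map.fromList [("src/c/a.c", "int main(){}")])
    let get' :: Get Key IO
        get' (Dir d)  = filter (\p -> d `isPrefixOf` p && ".c" `isSuffixOf` p)
                            . Map.keys <$> readIORef store
        get' (File p) = Map.findWithDefault "" p <$> readIORef store

        put' :: Put Key IO
        put' (File p) act = do v <- act
                               modifyIORef store (Map.insert p v)
                               pure v
        put' (Dir _)  act = act
    compileAllCFiles get' put'
    final <- readIORef store
    print (Map.lookup "src/c/a.c.o" final)
```

Running `main` prints `Just "int main(){}"`: the key `File "src/c/a.c.o"` now exists in the store even though no rule for it was declared up front, which is exactly the "dynamic outputs" point of this issue.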
So the claim is that if the final step in a monadic dependency chain is an Applicative, we can separate it and do partial recomputation? I'm not convinced that's true. Imagine we did a traverse with an index, so compiled files could see if they were the first/last file in the directory. Now you have dependencies that aren't fine-grained. There is some level of isolation, but it's a lot more subtle. What if you keep running the …
Yes!
Not sure what exactly you mean. Something like this?

```haskell
data Key a where
    Dir  :: FilePath -> Key [(FilePath, Int)] -- We need to depend on the index
    File :: FilePath -> Key String

compileAllCFiles :: Task Monad Key ()
compileAllCFiles get put = do
    srcs <- get (Dir "src/c/")
    void $ traverse (\src -> compileC src get put) srcs -- independent/parallel

compileC :: (FilePath, Int) -> Task Monad Key ()
compileC (cFile, index) get put = do
    let objFile = cFile ++ ".o"
    src  <- get (File cFile)
    deps <- traverse (get . File) (cDependencies src)
    void $ put (File objFile) (pure $ compile src index deps)
  where
    cDependencies _src = [] -- insert dependency analysis here
    compile src _index _deps = src -- insert a C compiler here
```

This doesn't seem to change anything. If this is not what you meant, could you give an example?
I think in this case the corresponding …
This is an issue to discuss how the restarting scheduler can be used to build tasks with dynamic inputs and outputs, as opposed to tasks with dynamic inputs that are covered by the Build Systems à la Carte paper.
I'll start by sketching a proof that the restarting scheduler works for tasks with dynamic outputs. First of all, we need to assume that all build tasks are finite, i.e. that they terminate and have a finite number of input dependencies, which in turn guarantees that the restarting scheduler terminates. Why? Because it always makes some progress by either removing a task from the working queue, or unblocking one of the blocked tasks, in which case the latter is one step closer to completion.
Let's run the restarting algorithm with a working queue containing all build tasks.
When it terminates, we have one of two cases:

1. All tasks have completed and the build has succeeded.
2. The working queue is empty, but there is a non-empty set `T` of tasks that are still blocked.

Now we can argue that in the second case the target key cannot be built due to one of the two reasons:

1. There is a cyclic dependency among the blocked tasks.
2. No task can produce the target key.

Let `k` denote the target key. All tasks that are not in `T` have completed and did not produce `k` (the build failed), hence we know that all tasks that could possibly build `k` must be in `T`. Let `t` denote one such task (if there is no such `t`, then this is the case (2) above). Since `t ∈ T`, it is blocked by some key `b`, and we can repeat our argument by taking `b` as the target key: by doing this we will eventually either hit the case (2), or will circle back to a key we already examined (since `T` is finite), which would indicate the case (1).

The proof is non-constructive in the sense that we don't know which `t` could actually produce `k` and hence lead to a cycle. All we can say is that either such `t` does not exist (2), or it does exist but it will inevitably either lead to a cycle (1) or hit a dead end (2).
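The progress argument above can be made concrete with a toy model of the restarting loop (hypothetical names, not the paper's implementation): each task either produces its key's value or reports the key it is blocked on, and the loop re-queues blocked tasks only while some task completed in the previous round, so it either finishes or provably stops making progress.

```haskell
import qualified Data.Map as Map

-- A task outcome: either a value for the task's key, or the dependency
-- the task is blocked on.
data Result k v = Done v | BlockedOn k

-- Run the queue; restart the deferred (blocked) tasks only if this round
-- made progress. This is the progress measure from the termination proof:
-- every round either shrinks the queue or the loop stops.
restarting :: Ord k => [(k, Map.Map k v -> Result k v)] -> Map.Map k v -> Map.Map k v
restarting queue0 store0 = go queue0 [] store0 False
  where
    go []       []       s _     = s  -- everything completed
    go []       deferred s True  = go (reverse deferred) [] s False  -- restart
    go []       _        s False = s  -- stuck: a cycle or a missing input
    go ((k, task) : rest) deferred s progressed = case task s of
        Done v      -> go rest deferred (Map.insert k v s) True
        BlockedOn _ -> go rest ((k, task) : deferred) s progressed

main :: IO ()
main = print . Map.toList $ restarting tasks Map.empty
  where
    -- "b" needs "a" but is queued first, so it blocks once and is restarted
    tasks = [ ("b", \s -> maybe (BlockedOn "a") (Done . (+ 1)) (Map.lookup "a" s))
            , ("a", \_ -> Done 1) ]
```

Running `main` prints `[("a",1),("b",2)]`: task `"b"` blocks once, `"a"` completes (progress), and the restart lets `"b"` finish. If a round ends with no progress, the deferred queue is exactly the stuck set `T` from the proof.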