How to "dirty" a forward-defined rule inside of cacheAction #734

Open
ChrisPenner opened this issue Dec 5, 2019 · 3 comments

ChrisPenner commented Dec 5, 2019

Hi! I've been running shake's forward mode as the primary mode for slick and it's been working pretty well!

I'm still a little confused by cacheAction (not a lot of documentation yet), but I think I've got the gist of it; please correct me if I'm wrong about the following assumptions.

So far as I can tell it runs the action once and associates any "needed" files with that key, so it'll only re-run on subsequent executions if any "needed" files have changed.
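For concreteness, here is a minimal sketch of that reading of cacheAction. The countLines helper, its key, and the posts/hello.md path are made up for illustration; only cacheAction, shakeForward, forwardOptions, need and readFile' come from Shake. The caching behaviour described in the comments is the assumption stated above, not something confirmed.

```haskell
module Main (main) where

import Development.Shake
import Development.Shake.Forward (cacheAction, forwardOptions, shakeForward)

-- Pure part of the work, kept separate so it is easy to check on its own.
lineCount :: String -> Int
lineCount = length . lines

-- Hypothetical cached step. Assumption: the result is stored under the key,
-- and the body only reruns if the file it "needs" has changed.
countLines :: FilePath -> Action Int
countLines src = cacheAction ("countLines", src) $ do
  need [src]
  lineCount <$> readFile' src

main :: IO ()
main = shakeForward (forwardOptions shakeOptions) $ do
  n <- countLines "posts/hello.md"  -- hypothetical input file
  liftIO (print n)
```

The key only needs Typeable, Binary and Show instances, so a tuple of strings is enough here.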

The part I don't understand is how this interacts with pieces I'm feeding in from earlier actions. This is likely best explained with a simple example:

loadPosts :: Action [Post]
loadPosts = do
  postFiles <- getDirectoryFiles "posts" ["//*.md"]
  forP postFiles $ \p -> cacheAction (srcLocation p) $ do
    -- load markdown file, then write html result (using shake combinators)

-- Each Tag carries the posts that belong to it.
collectTags :: [Post] -> [Tag]
collectTags = -- pure function to collect all posts with each respective tag.

writeTags :: [Tag] -> Action ()
writeTags tags = void $ forP tags $ \t -> cacheAction (tagName t) $ do
  -- write each tag to a separate html index

buildRules :: Action ()
buildRules = do
  posts <- loadPosts
  let tags = collectTags posts  -- collectTags is pure, so bind it with let
  writeTags tags

This mostly works well! The problem is that writeTags uses the list of tags as its entire input. Writing each tag "depends" only on the tag object itself; it doesn't read any files, and therefore doesn't "need" anything. Ideally it would rebuild a tag when its "object" changes (perhaps detected via checksum or binary representation).

I used to implement this sort of thing with oracles in older versions of slick, but this seems considerably more awkward in the forward feeding style. Is it considered bad practice to compute the md5 and just use that as part of the cacheAction key so it rebuilds on changes? Curious what the "idiomatic" approach is here 😄

It seems like it would be nice to have either a makeDirty <key> action so I can manually dirty tags, or even better: the option to provide a value argument to cacheAction which will force a re-run of the action if the provided argument value changes.

Love using shake, it makes building my site pretty easy, cheers!

ChrisPenner changed the title from "Custom deps for cacheAction rules" to "How to "dirty" a forward-defined rule inside of cacheAction" on Dec 5, 2019
@ndmitchell
Owner

The Forward stuff is definitely undocumented! It's also poorly understood - I think there is a paper to be written on forward build systems, but we haven't figured out what it contains yet, or what idiomatic looks like! Thanks for your investigations.

As to your immediate problems, the two things I can think of are:

  • Remove the cacheAction (tagName t) entirely. Just always rerun the action. How much slower does that make things?
  • Change to cacheAction t - that means if anything in Tag changes then an action reruns, which might match what you want. Or it might pollute the Shake database too much. I'm honestly not totally sure. If the side effects are too big, we can probably figure out a better combinator to optimise that.

@ChrisPenner
Author

In this case I can likely get away with rebuilding tags every time, so this isn't a "critical" problem (I'll survive 😄 ). But it would be nice to understand how this works a little better, and there are certain transformations that could be a bit more time intensive.

I gave the cacheAction t a try, and it almost works. It will re-run for all new tag 'states' (e.g. any tags which don't have their binary version stored), BUT, if a tag ever reverts back to a previously known state it will forgo running the action entirely (it detects that nothing "needed" in the action has changed, and even though there's a writeFile' it decides it doesn't need to re-run). This means that the writeFile' doesn't trigger, and results in a tricky bug that's tough to track down.

To clarify: if I have a tag tagOne = Tag{tagName="haskell", posts=["1"]}, and then change it to state two, tagTwo = Tag{tagName="haskell", posts=["2"]}, it WILL rebuild and properly change to the new associated posts. BUT if I then revert the tag back to the tagOne state, the file system will still reflect the stale tagTwo state.

Make sense?
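The sequence above can be reproduced with a toy model. To be clear, this is not how Shake is implemented; World, buildTag and the fake file map are all invented here, and the model only encodes the behaviour described above: an action keyed on the whole tag, with no other dependencies, runs once per distinct key and never again.

```haskell
module Main (main) where

import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-in for the Tag type in the example above.
data Tag = Tag { tagName :: String, posts :: [String] }
  deriving (Show, Eq, Ord)

-- Toy build state: the cache keys seen so far, plus a fake file system.
data World = World
  { seenKeys :: Set.Set Tag
  , files    :: Map.Map FilePath [String]
  } deriving (Show, Eq)

emptyWorld :: World
emptyWorld = World Set.empty Map.empty

-- Models `cacheAction t (writeFile' ...)`: the write is skipped whenever the
-- key has been seen before, because the action has nothing else to depend on.
buildTag :: Tag -> World -> World
buildTag t w
  | t `Set.member` seenKeys w = w
  | otherwise = World
      { seenKeys = Set.insert t (seenKeys w)
      , files    = Map.insert (tagName t) (posts t) (files w)
      }

tagOne, tagTwo :: Tag
tagOne = Tag "haskell" ["1"]
tagTwo = Tag "haskell" ["2"]

main :: IO ()
main = do
  -- Three successive builds: state one, state two, then back to state one.
  let final = foldl (flip buildTag) emptyWorld [tagOne, tagTwo, tagOne]
  print (Map.lookup "haskell" (files final))  -- prints: Just ["2"]
```

The third build is skipped because tagOne is already a known key, so the "file" keeps the contents written for tagTwo.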

I suspect a workaround would be to write out each tag to a file, then need that file inside the cacheAction; but that definitely seems a bit awkward, probably the wrong trick going forwards.

Something like a rerunOnValueChange :: Name -> v -> Action r -> Action r could be nice in a forward-feeding system. It could even be implemented via the "write to file" trick behind the scenes if necessary, but it would be handy to hide that from the user.

E.g. (in pseudo-code)

rerunOnValueChange name val action = do
  -- writeFileChanged leaves the file untouched when the contents are identical
  writeFileChanged ("cache" </> name) (toBinary val)
  cacheAction name $ need ["cache" </> name] *> action

@ndmitchell
Owner

I just pushed cacheActionWith. If you have a closure which captures information, you should probably pass that information as the argument. It's basically rerunOnValueChange. It doesn't use the write-to-file trick, but uses cacheAction in a nested way. While it works, I couldn't write a formal proof of why, so I'm keen to see if it works for you in practice.
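A rough sketch of how the tag example might use the new combinator. This assumes it is exported from Development.Shake.Forward as cacheActionWith, taking the key first and the captured value second; the Tag type, tagPath and the out/tags layout are hypothetical and only mirror the example earlier in the thread.

```haskell
{-# LANGUAGE DeriveGeneric #-}
module Main (main) where

import Control.Monad (void)
import Data.Binary (Binary)
import Development.Shake
import Development.Shake.FilePath ((</>), (<.>))
import Development.Shake.Forward (cacheActionWith, forwardOptions, shakeForward)
import GHC.Generics (Generic)

-- Hypothetical Tag type, modelled on the example earlier in the thread.
data Tag = Tag { tagName :: String, posts :: [String] }
  deriving (Show, Eq, Generic)

instance Binary Tag  -- generic instance, so the Tag can be stored as the argument

-- Hypothetical output location for a tag's index page.
tagPath :: Tag -> FilePath
tagPath t = "out" </> "tags" </> tagName t <.> "html"

-- Key the cache on the tag name and pass the whole Tag as the value the
-- closure captures. Assumption: the action reruns when that value changes.
writeTag :: Tag -> Action ()
writeTag t = cacheActionWith (tagName t) t $
  writeFile' (tagPath t) (unlines (posts t))

main :: IO ()
main = shakeForward (forwardOptions shakeOptions) $
  void $ forP [Tag "haskell" ["1"]] writeTag
```

Keeping the key stable (just the tag name) and moving the changing data into the second argument is what separates this from cacheAction t, where every tag state becomes its own key.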
