How to "dirty" a forward-defined rule inside of `cacheAction` #734

ChrisPenner · 2019-12-05T22:30:40Z

Hi! I've been running shake's forward mode as the primary mode for slick and it's been working pretty well!

I'm still a little confused by cacheAction (not a lot of documentation yet), but I think I've got the gist of it; please correct me if I'm wrong about the following assumptions.

So far as I can tell it runs the action once and associates any "needed" files with that key, so it'll only re-run on subsequent executions if any "needed" files have changed.

The part I don't understand is how this interacts with values I'm feeding in from earlier actions. This is probably best explained with a simple example:

loadPosts = do
  postFiles <- getDirectoryFiles "posts" ["//*.md"]
  forP postFiles $ \p -> cacheAction (srcLocation p) $ do
    -- load markdown file, then write html result (using shake combinators)

collectTags :: [Post] -> [Tag]
collectTags = -- pure function to collect all posts with each respective tag.

writeTags :: [Tag] -> Action ()
writeTags tags = void $ forP tags $ \t -> cacheAction (tagName t) $ do
  -- write each tag to a separate html index

buildRules :: Action ()
buildRules = do
  posts <- loadPosts
  let tags = collectTags posts  -- collectTags is pure, so bind it with let
  writeTags tags

This mostly works well! The problem is that writeTags uses the list of tags as its entire input: writing each tag "depends" only on the tag object itself. It doesn't read any files, and therefore doesn't "need" anything. Ideally it would rebuild a tag when its "object" changes (perhaps detected via a checksum or its binary representation).

I used to implement this sort of thing with oracles in older versions of slick, but this seems considerably more awkward in the forward feeding style. Is it considered bad practice to compute the md5 and just use that as part of the cacheAction key so it rebuilds on changes? Curious what the "idiomatic" approach is here 😄

It seems like it would be nice to have either a makeDirty <key> action so I can manually dirty tags, or even better: the option to provide a value argument to cacheAction which will force a re-run of the action if the provided argument value changes.

Love using shake, it makes building my site pretty easy, cheers!


ndmitchell · 2019-12-07T14:59:55Z

The Forward stuff is definitely undocumented! It's also poorly understood - I think there is a paper to be written on forward build systems, but we haven't figured out what it contains yet, or what idiomatic looks like! Thanks for your investigations.

As to your immediate problems, the two things I can think of are:

- Remove the cacheAction (tagName t) entirely. Just always rerun the action. How much slower does that make things?
- Change to cacheAction t - that means if anything in Tag changes then an action reruns, which might match what you want. Or it might pollute the Shake database too much. I'm honestly not totally sure. If the side effects are too big, we can probably figure out a better combinator to optimise that.

ChrisPenner · 2019-12-07T20:33:23Z

In this case I can likely get away with rebuilding tags every time, so this isn't a "critical" problem (I'll survive 😄 ). But it would be nice to understand how this works a little better, and there are certain transformations that could be a bit more time intensive.

I gave the cacheAction t a try, and it almost works. It will re-run for all new tag 'states' (e.g. any tags which don't have their binary version stored), BUT, if a tag ever reverts back to a previously known state it will forgo running the action entirely (it detects that nothing "needed" in the action has changed, and even though there's a writeFile' it decides it doesn't need to re-run). This means that the writeFile' doesn't trigger, and results in a tricky bug that's tough to track down.

To clarify: if I have a tag tagOne = Tag{tagName="haskell", posts=["1"]} and then change it to a second state, tagTwo = Tag{tagName="haskell", posts=["2"]}, it WILL rebuild and properly switch to the new associated posts, BUT if I then revert the tag back to the tagOne state, the file system will still reflect the stale tagTwo state.

Make sense?
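The revert behaviour described above can be reproduced in a small pure model. This is only a sketch of the observed behaviour, not Shake's actual implementation: it assumes the cache simply remembers every key it has ever built and skips the action on a hit, since the action "needs" no files. The `World` type, `buildTag`, and the cut-down `Tag` record are hypothetical names introduced for illustration.

```haskell
import qualified Data.Set as Set
import Data.List (foldl')

-- Hypothetical stand-in for the Tag type discussed above.
data Tag = Tag { tagName :: String, posts :: [String] }
  deriving (Eq, Ord, Show)

-- Simplified model state: every key the cache has ever seen, plus the
-- current contents of the single output file for the "haskell" tag.
data World = World { seenKeys :: Set.Set Tag, outputFile :: [String] }
  deriving (Eq, Show)

-- Model of `cacheAction t (write t)` where the action needs no files:
-- the action runs only the first time a given key is seen.
buildTag :: World -> Tag -> World
buildTag w t
  | t `Set.member` seenKeys w = w  -- cache hit: the write is skipped
  | otherwise = World (Set.insert t (seenKeys w)) (posts t)  -- runs and writes

tagOne, tagTwo :: Tag
tagOne = Tag "haskell" ["1"]
tagTwo = Tag "haskell" ["2"]

-- Three successive builds: tagOne, then tagTwo, then back to tagOne.
finalWorld :: World
finalWorld = foldl' buildTag (World Set.empty []) [tagOne, tagTwo, tagOne]
```

In this model `outputFile finalWorld` is still the tagTwo content after the third build, because tagOne is already a known key and its action is never re-run.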

I suspect a workaround would be to write out each tag to a file, then need that file inside the cacheAction; but that definitely seems a bit awkward, probably the wrong trick going forwards.

Something like a rerunOnValueChange :: Name -> v -> Action r -> Action r could be nice in a forward-feeding system. It could even be implemented via the "write to file" trick behind the scenes if necessary, but it would be handy to hide that from the user.

E.g. (in pseudo-code)

rerunOnValueChange name val action = do
  -- writeFileChanged only touches the file when its contents actually differ
  writeFileChanged ("cache" </> name) (toBinary val)
  cacheAction name $ need ["cache" </> name] *> action

ndmitchell · 2019-12-14T19:56:00Z

I just pushed cacheActionWith. If you have a closure which captures information, you should probably pass that information as the argument. It's basically rerunOnValueChange. It doesn't use a write-to-file trick, but uses cacheAction in a nested way. While it works, I couldn't write a formal proof of why, so I'm keen to see if it works for you in practice.
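For the tag example from earlier in the thread, usage might look roughly like the sketch below. It assumes cacheActionWith takes the key first, then the value to watch, then the action; the `Tag` alias, the `writeTag` helper, and the output path are hypothetical and only there to make the example self-contained.

```haskell
import Control.Monad (void)
import Development.Shake
import Development.Shake.FilePath ((</>), (<.>))
import Development.Shake.Forward

-- Hypothetical simplified tag: (tag name, ids of the posts carrying it).
type Tag = (String, [String])

-- The tag name is the cache key; the whole tag is passed as the argument,
-- so the action is expected to re-run when the tag's contents change.
writeTag :: Tag -> Action ()
writeTag tag@(name, postIds) =
  cacheActionWith name tag $
    -- Hypothetical output location for the per-tag index page.
    writeFile' ("out" </> "tags" </> name <.> "html") (unlines postIds)

-- Write all tag pages in parallel, discarding the list of unit results.
writeTags :: [Tag] -> Action ()
writeTags tags = void $ forP tags writeTag
```

This would be driven from a forward-mode entry point such as `shakeArgsForward shakeOptions (writeTags tags)`. Whether it also covers the revert-to-a-previous-state case described above is worth checking in practice.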

ChrisPenner changed the title ~~Custom deps for cacheAction rules~~ How to "dirty" a forward-defined rule inside of cacheAction Dec 5, 2019

ndmitchell added a commit that referenced this issue Dec 14, 2019

#734, add Forward.cacheActionWith and tests

9d75ebf

srid referenced this issue in srid/rib Dec 30, 2019

Use cacheActionWith to prevent rebuilds

c7468fc

srid mentioned this issue Feb 28, 2020

Shake spends predominant time reading its database srid/rib#108

Closed
