Cache being busted after restoring directory #5125

Closed
aschmois opened this issue Dec 19, 2019 · 6 comments

@aschmois
Contributor

aschmois commented Dec 19, 2019

Stack version

Tried both the latest release (2.1.3) and the latest HEAD.

Method of installation

Downloaded the Linux stack binary, then ran stack upgrade --git.

Problem/Question

We are using Semaphore CI to compile our project and have not been able to get the project's compilation cached. We run stack inside a Docker container (for all intents and purposes, it's a regular stack build --test, just inside Docker).

We cache the stack-root directory and restore it without issues: GHC is not re-downloaded and dependency packages are not recompiled.

We also cache the .stack-work directory and restore it, but our build command always rebuilds as if there were no cache. The relevant parts of the log below suggest that stack sees the cache but then ignores it completely.

Stack unregisters the package because of a single changed file (changed deliberately as a test), but then all 818 modules are recompiled:

xxx-22: unregistering (local file changes: library/Executable/xxx.hs)
--
  | xxx> configure (lib + exe + test)
  | Configuring xxx-22...
  | xxx> build (lib + exe + test)
  | Preprocessing library for xxx-22..
  | Building library for xxx-22..
  | [  2 of 818] Compiling Environment.xxx

The SHAs in the install paths below are identical to the SHAs in the previously cached build.

xxx> copy/register
--
  | Installing library in /work/.stack-work/install/x86_64-linux/01c3488eaf4ca467d4cf5a081d8270e39a6956b896e87ed70692ef5026aa5899/8.8.1/lib/x86_64-linux-ghc-8.8.1/xxx-22-B7rEdpkV2M7EtGsQY02An4
  | Installing executable xxx in /work/.stack-work/install/x86_64-linux/01c3488eaf4ca467d4cf5a081d8270e39a6956b896e87ed70692ef5026aa5899/8.8.1/bin
  | Registering library for xxx-22..

My assumption is that the cache restore is altering some file inside .stack-work, and that this is what busts the cache. My question is: which file(s) is it, and is there any way to force stack not to do that?
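One way to test that assumption is to snapshot each source file's modification time and content digest in two consecutive CI runs and compare the results. The sketch below does this in a throwaway directory. snapshot_sources is a hypothetical helper (not part of stack), the second touch merely simulates sources that were checked out again, and GNU stat, touch, and sha256sum are assumed.

```shell
#!/bin/sh
# Hypothetical helper, not part of stack.
# Prints "<mtime-epoch> <sha256> <path>" for every .hs file under $1.
snapshot_sources() {
  find "$1" -name '*.hs' -type f | sort | while IFS= read -r f; do
    printf '%s %s %s\n' \
      "$(stat -c %Y "$f")" \
      "$(sha256sum "$f" | cut -d' ' -f1)" \
      "$f"
  done
}

# Demo in a throwaway directory: same content, different mtime.
demo=$(mktemp -d)
mkdir -p "$demo/src"
echo 'module Main where' > "$demo/src/Main.hs"

touch --date '2020-01-01 00:00:00' "$demo/src/Main.hs"
snapshot_sources "$demo/src" > "$demo/before.txt"

# Simulate the next CI run: content untouched, timestamp rewritten.
touch --date '2020-01-02 00:00:00' "$demo/src/Main.hs"
snapshot_sources "$demo/src" > "$demo/after.txt"

# Column 1 is the mtime, column 2 is the digest.
mtime_before=$(cut -d' ' -f1 "$demo/before.txt")
mtime_after=$(cut -d' ' -f1 "$demo/after.txt")
hash_before=$(cut -d' ' -f2 "$demo/before.txt")
hash_after=$(cut -d' ' -f2 "$demo/after.txt")

echo "digest same: $([ "$hash_before" = "$hash_after" ] && echo yes || echo no)"
echo "mtime same:  $([ "$mtime_before" = "$mtime_after" ] && echo yes || echo no)"
```

In a real pipeline, the two snapshots would come from separate CI runs and could be compared with diff. If only the first column differs, the timestamps rather than the file contents are what changed between runs.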

aschmois changed the title from "CI cache issues" to "Cache being busted after restoring directory" on Jan 2, 2020
@simonmichael
Contributor

Could be https://stackoverflow.com/a/61178945/84401 ?

@aschmois
Contributor Author

aschmois commented Jul 24, 2020

The current workaround is to run the following in your CI steps before any stack commands:

find ./your ./src ./directories -name "*.hs" -print0 | xargs -0 -I '{}' \
  touch \
  --no-create \
  --reference '{}' \
  --date "1000 days ago" \
  '{}'

This makes every file's modification time differ from the cached one on each run (so each file is checked against its digest and marked dirty only if needed), and it places that time far in the past, so that the preBuildTime check doesn't count the files as recently added and mark everything as dirty. I think stack should have an option to ignore timestamps when checking whether a file is dirty. The check for newly added files may also need some work. I tried changing the code a little, with success, but the change is very hacky, and I don't know enough about stack's code style to make it PR-ready.

Also worth noting: calculating the digest during the build step is very fast! So removing the mod time check and always checking the digest could be an option as well. Maybe make that the default, and offer the mod time check as an optional optimization for when you know the file system will stay clean (locally, for example).
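The "mod time first, digest second" check described above can be modeled roughly as follows. This is only a reading of the comment, not stack's actual code: is_dirty and its arguments are hypothetical, the preBuildTime part of the explanation is not modeled, and GNU coreutils is assumed.

```shell
#!/bin/sh
# is_dirty FILE CACHED_MTIME CACHED_SHA
# Hypothetical model: a matching mtime short-circuits to "clean";
# only when the mtime differs is the digest computed and compared.
is_dirty() {
  file=$1; cached_mtime=$2; cached_sha=$3
  if [ "$(stat -c %Y "$file")" = "$cached_mtime" ]; then
    return 1   # same mtime: assumed unchanged, digest never computed
  fi
  [ "$(sha256sum "$file" | cut -d' ' -f1)" != "$cached_sha" ]
}

work=$(mktemp -d)
echo 'main = pure ()' > "$work/A.hs"
touch --date '2020-01-01 00:00:00' "$work/A.hs"

# What a previous build would have recorded for this file.
cached_mtime=$(stat -c %Y "$work/A.hs")
cached_sha=$(sha256sum "$work/A.hs" | cut -d' ' -f1)

# Apply the workaround: the mtime moves, the content does not.
touch --no-create --reference "$work/A.hs" --date '1000 days ago' "$work/A.hs"
if is_dirty "$work/A.hs" "$cached_mtime" "$cached_sha"; then r1=dirty; else r1=clean; fi

# Now actually edit the file.
echo 'main = print 1' > "$work/A.hs"
if is_dirty "$work/A.hs" "$cached_mtime" "$cached_sha"; then r2=dirty; else r2=clean; fi

echo "touched only: $r1"
echo "edited:       $r2"
```

Under this model a changed timestamp alone never makes a file dirty; it only forces the digest comparison, which is the effect the workaround relies on.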
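As an illustration of digest-only change detection, here is a minimal sketch built on a sha256sum manifest. The manifest name (.digests) and the directory layout are assumptions made for the example, and this is not necessarily how stack stores or compares digests. GNU coreutils is assumed.

```shell
#!/bin/sh
src=$(mktemp -d)
echo 'module A where' > "$src/A.hs"
echo 'module B where' > "$src/B.hs"

# After a successful build: record one "<sha256>  <path>" line per source file.
(cd "$src" && find . -name '*.hs' -type f -exec sha256sum {} + > .digests)

# Next run: A only gets a new timestamp, B is edited, C is added.
touch "$src/A.hs"
echo 'module B (b) where' > "$src/B.hs"
echo 'module C where' > "$src/C.hs"

# Changed files: the recorded digest no longer verifies.
changed=$(cd "$src" && sha256sum --check --quiet .digests 2>/dev/null | sed 's/: FAILED$//')

# New files: present on disk but absent from the manifest.
new=$(cd "$src" && find . -name '*.hs' -type f | sort | while IFS= read -r f; do
  grep -qF "  $f" .digests || echo "$f"
done)

echo "changed: $changed"
echo "new:     $new"
```

A.hs is not reported even though its timestamp changed, because timestamps play no part in the comparison. New files are found by comparing the file list against the manifest rather than by looking at modification times.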

@mbj
Contributor

mbj commented Jul 24, 2020

OT: I'd love to see stack change to a content-addressed caching system. Filesystem metadata is typically a weak signal. In all the custom build systems I write, I only use it as a tie breaker (if at all).

@aschmois
Contributor Author

The hack unfortunately doesn't work when adding new files. Because the touch command sets their timestamps to before the build time, they will not be marked as dirty. I'm working on a PR now to see how people feel about removing the mod time check altogether.
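The failure mode can be reproduced with a purely time-based "new file" check. In this sketch the build-marker file stands in for the previous build time and find -newer stands in for the check; both are assumptions for illustration, not stack's implementation. GNU touch is assumed.

```shell
#!/bin/sh
dir=$(mktemp -d)

# Stands in for the time of the previous build.
touch "$dir/build-marker"

# A source file added after that build...
echo 'module New where' > "$dir/New.hs"
# ...then back-dated by the workaround.
touch --no-create --reference "$dir/New.hs" --date '1000 days ago' "$dir/New.hs"

# A time-based check only sees files modified after the marker.
found=$(find "$dir" -name '*.hs' -newer "$dir/build-marker")
echo "files seen as new: ${found:-none}"
```

Because New.hs now appears older than the marker, the check reports nothing, even though the file did not exist at the previous build.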

snoyberg added a commit that referenced this issue Aug 6, 2020
Remove ModTime check during build (#5125)
@aschmois
Contributor Author

aschmois commented Aug 6, 2020

Fixed with #5351

@snoyberg
Contributor

snoyberg commented Aug 6, 2020

Thanks!

snoyberg closed this as completed on Aug 6, 2020