Should the cache directory live in $HOME by default? #22

oconnor663 · 2014-07-23T19:04:51Z

As I'm writing the README, I find myself telling new users to set the $PERU_CACHE variable to avoid recloning things after they clean. When new users need to configure some random setting, that's usually a sign that the default is bad. Should we be storing the cache in a centralized spot by default?
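
To make the question concrete, here is a minimal sketch of how the cache location could be resolved. Only the `PERU_CACHE` variable comes from this thread; the function name, the per-project path `.peru/cache`, the shared path `~/.peru-cache`, and the `shared_default` switch are hypothetical stand-ins, not peru's actual code.

```python
import os
from pathlib import Path


def resolve_cache_dir(project_root, environ=None, shared_default=False):
    """Pick a cache directory (illustrative only, not peru's real logic)."""
    environ = os.environ if environ is None else environ

    # An explicit PERU_CACHE setting always wins.
    override = environ.get("PERU_CACHE")
    if override:
        return Path(override).expanduser()

    # Hypothetical centralized default under $HOME, the option being debated.
    if shared_default:
        return Path(environ.get("HOME", "~")).expanduser() / ".peru-cache"

    # Hypothetical per-project default: removed by a full clean of the tree.
    return Path(project_root) / ".peru" / "cache"
```

The only thing the proposal would change is which branch runs when `PERU_CACHE` is unset; users who set the variable see the same behavior either way.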

Pros:

- This is what Maven and Ivy do.
- This makes the fastest setup the default for new users.
- Different projects with the same dependencies would share their networking and disk space by default.

Cons:

- This is what Maven and Ivy do.
- This is not what git does by default, even though it can be configured to.
- This default might be bad for complicated disk setups.
  - Say your build machine uses an NFS mount for /home, and uses /var/local or something for the actual local disk. You might want to do all your clones and builds in /var/local for speed, but behind the scenes peru is doing big git operations over the network in /home.
- This can be confusing for modules without an explicit rev.
  - If two projects use the same dependency, the rev that one of them is getting will be affected by the other. A "new" dependency could be very stale because the other caller cached it in the past. We will probably also have a --skipcache flag or something in the future to force plugin fetches, and doing that would update the cache for all callers.
  - Similar problem for nondeterministic build commands.
- This is a band-aid for our slow, serial plugin fetching. We should make it faster instead.
- When we start actually using locks for our cache writes, this default could create more lock contention and stale lock issues.
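
The stale-rev concern can be illustrated with a toy model. This is a sketch under loud assumptions: `ToyCache`, its `fetch` signature, and the idea of keying entries on `(url, rev)` are all hypothetical and say nothing about how peru's cache is really structured. `skip_cache` stands in for the possible future --skipcache flag.

```python
class ToyCache:
    """Toy shared cache keyed by (url, rev); not peru's real cache."""

    def __init__(self):
        self._entries = {}

    def fetch(self, url, upstream_head, rev=None, skip_cache=False):
        # With no explicit rev, every caller shares the (url, None) entry.
        key = (url, rev)
        if skip_cache or key not in self._entries:
            # A "fetch": pin to rev if given, else whatever upstream has now.
            self._entries[key] = rev if rev is not None else upstream_head
        return self._entries[key]


shared = ToyCache()
url = "https://example.com/dep.git"  # placeholder URL

# Project A fetches first and populates the shared entry.
project_a = shared.fetch(url, upstream_head="abc123")

# Upstream moves on, but project B still receives A's old result.
project_b = shared.fetch(url, upstream_head="def456")

# Forcing a fetch from B refreshes the entry for every caller, A included.
forced_b = shared.fetch(url, upstream_head="def456", skip_cache=True)
project_a_again = shared.fetch(url, upstream_head="def456")
```

In this model a module with an explicit `rev` gets its own entry, so only unpinned modules are affected by what other projects fetched earlier.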

olson-sean-k · 2014-07-23T20:12:14Z

In general, this seems to add complexity (see list of cons) for the sake of speed (see "list" of pros). As already mentioned, there are other battles we can fight to win speed; I'm not convinced this is the one to focus on, especially since it will likely burden users with complexity rather than just our code.

The behavior today is understandable, unsurprising, and tends to work well. A shared cache makes "building in space" harder, and today may actually break that mantra, no?

Personally, I haven't experienced any notable pain yet using a per-repo cache, but I also do not execute cleans very often. How much of an impact do we expect this to have on our most compelling use cases?

oconnor663 · 2014-07-23T20:38:03Z

My workflows tend to be trigger-happy with git clean, so maybe I'm overestimating how often a normal user is going to blow away their cache. I think if we both set PERU_CACHE in our shellrcs that would be a red flag, but if you don't set it I'm less concerned.

olson-sean-k · 2014-07-23T20:42:23Z

I do not set PERU_CACHE. I've done it by hand to experiment, but I don't script it.

There are clear benefits (speed!), so it is worth mentioning in README.md, but I'm not sure it actually needs a lot of attention.

oconnor663 closed this as completed Jul 23, 2014
