Should the cache directory live in $HOME by default? #22

Closed
oconnor663 opened this issue Jul 23, 2014 · 3 comments

@oconnor663
Member

As I'm writing the README, I find myself telling new users to set the $PERU_CACHE variable to avoid recloning things after they clean. When new users need to configure some random setting, that's usually a sign that the default is bad. Should we be storing the cache in a centralized spot by default?
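To make the question concrete, here is a minimal sketch of the two candidate defaults. Only the `PERU_CACHE` variable comes from the discussion; the function name `resolve_cache_dir`, the `centralized` switch, and the directory names `.peru/cache` and `.peru-cache` are assumptions made for illustration, not peru's actual code or layout.

```python
import os
from pathlib import Path


def resolve_cache_dir(project_root, environ=None, centralized=False):
    """Pick a cache directory. Hypothetical helper, not peru's real logic."""
    env = os.environ if environ is None else environ

    # An explicit PERU_CACHE setting always wins over either default.
    override = env.get("PERU_CACHE")
    if override:
        return Path(override)

    if centralized:
        # Proposed default: one cache under $HOME shared by all projects.
        # The ".peru-cache" name is an assumption for this sketch.
        home = env.get("HOME") or os.path.expanduser("~")
        return Path(home) / ".peru-cache"

    # Per-project default: the cache lives inside the project tree, so
    # cleaning the tree also discards it. The ".peru/cache" name is assumed.
    return Path(project_root) / ".peru" / "cache"
```

With `centralized=False`, a user who cleans the project tree loses the cache unless they have exported `PERU_CACHE`; with `centralized=True`, the cache survives a clean without any configuration.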

Pros:

  • This is what Maven and Ivy do.
  • This makes the fastest setup the default for new users.
  • Different projects with the same dependencies would share their networking and disk space by default.

Cons:

  • This is what Maven and Ivy do.
  • This is not what git does by default, even though it can be configured to.
  • This default might be bad for complicated disk setups.
    • Say your build machine uses an NFS mount for /home, and uses /var/local or something for the actual local disk. You might want to do all your clones and builds in /var/local for speed, but behind the scenes peru is doing big git operations over the network in /home.
  • This can be confusing for modules without an explicit rev.
    • If two projects use the same dependency, the rev that one of them is getting will be affected by the other. A "new" dependency could be very stale because the other caller cached it in the past. We will probably also have a --skipcache flag or something in the future to force plugin fetches, and doing that would update the cache for all callers.
    • Similar problem for nondeterministic build commands.
  • This is a band-aid for our slow, serial plugin fetching. We should make it faster instead.
  • When we start actually using locks for our cache writes, this default could create more lock contention and stale lock issues.
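
The stale-rev concern from the list above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyCache`, its `get_rev` method, the `skip_cache` argument (standing in for the possible future `--skipcache` flag), and the example URL are all hypothetical, and real fetching is replaced by storing a string.

```python
class ToyCache:
    """Maps a module URL to the rev stored at fetch time. Illustrative only."""

    def __init__(self):
        self._revs = {}

    def get_rev(self, url, upstream_head, skip_cache=False):
        # "Fetch" only on a cache miss or when the caller forces it.
        if skip_cache or url not in self._revs:
            self._revs[url] = upstream_head
        return self._revs[url]


def demo():
    url = "https://example.com/dep.git"  # hypothetical dependency, no explicit rev
    shared = ToyCache()

    # Project A fetches while upstream is at rev1.
    a = shared.get_rev(url, "rev1")

    # Upstream moves to rev2. Project B is brand new, but with a shared
    # cache it silently gets the rev that project A cached earlier.
    b_shared = shared.get_rev(url, "rev2")

    # With its own per-project cache, project B would get the current head.
    b_own = ToyCache().get_rev(url, "rev2")

    # A forced fetch by one caller changes what every other caller sees.
    shared.get_rev(url, "rev3", skip_cache=True)
    a_after = shared.get_rev(url, "rev3")

    return a, b_shared, b_own, a_after
```

Calling `demo()` returns `("rev1", "rev1", "rev2", "rev3")`: the new project sees a stale rev through the shared cache, and a forced fetch by one project moves the rev for the other.
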

@olson-sean-k
Member

In general, this seems to add complexity (see list of cons) for the sake of speed (see "list" of pros). As already mentioned, there are other battles we can fight to win speed; I'm not convinced this is the one to focus on, especially since it will likely burden users with complexity rather than just our code.

The behavior today is understandable, unsurprising, and tends to work well. A shared cache makes "building in space" harder, and may actually break that mantra as things stand today, no?

Personally, I haven't experienced any notable pain yet using a per-repo cache, but I also do not execute cleans very often. How much of an impact do we expect this to have on our most compelling use cases?

@oconnor663
Member Author

My workflows tend to be trigger-happy with git clean, so maybe I'm overestimating how often a normal user is going to blow away their cache. I think if we both set PERU_CACHE in our shell rc files that would be a red flag, but if you don't set it I'm less concerned.

@olson-sean-k
Member

I do not set PERU_CACHE. I've done it by hand to experiment, but I don't script it.

There are clear benefits (speed!), so it is worth mentioning in README.md, but I'm not sure it actually needs a lot of attention.
