Skip to content

Don't *always* fetch remote repositories of git-dependencies #4667

Closed
wants to merge 1 commit into from

5 participants

@guybrush

this PR trys to fix #4191

currently when you do npm i and you have a git-dependency
it will create a local cache of the repo if not already present.
then it will do git archive on that local cached repo to create
a tarball for the git-ref you defined in the package.json
(which is master by default). now when you run npm i again it
will fetch the remote repo to update the cache, no matter what!
even if you did not change the git-ref.

so can we just skip the fetching-part once we have a local cached
repo and we didnt even change the git-ref? the problem is:
how do we know if the git-ref you put in your dependency-url is
a commit-hash or branch/tag-name?

consider following package.json:

{ "name" : "foo"
, "version" : "0.0.0"
, "dependencies" : { "bla" : "ghUser/ghRepo#blub" }
}

how can we know if blub is a commit-hash, a branch or a tag? we cant.

an idea would be to introduce another seperator for the git-url:
ghUser/ghRepo@branchOrTag which indicates that we mean a branch and
not a commit-hash?

this PR currently implements: just look in the locally cached repo for a commit
with the hash blub and if we find any we just assume that it must
be a commit-hash. then we just skip fetching the remote repo and
just pull out a tarball of the cached repo.

also i am currently only skipping the fetch if the specified git-ref is longer
than 4 chars (because what if someone defines ghUser/ghRepo#a).

further improvement would be to even skip creating the tarball if
the hash in the current installed package's package.json points already
to blub. but i did not look into the relevant codepath yet, so no idea.

@domenic
npm member
domenic commented Feb 13, 2014

This looks pretty good to me but I would need another reviewer to be sure. Also, I am not sure whether there are any tests you could use as a template, but we really prefer changes to come with tests.

how can we know if blub is a commit-hash, a branch or a tag? we cant. an idea would be to introduce another seperator for the git-url

I see the problem, and like this idea a lot. It may have trickier implications though.

@guybrush

thanks for reply!

i really want a test too (a lot), will take a look at the other tests and tinker arround, and i would like to have lots of people who think about the problem and maybe find a failure in the current proposed solution or even propose a better solution.

by introducing another seperator we have to be carefull to not break backward-compatibility and in general i dont like the idea actually. i prefer the way as it is solved right now in this PR, the only thing i dont like is the hardcoded length (4) of the (possible) commit hash.

but right now i dont have much time. so maybe this has to wait until tomorrow or something unless someone else wants to solve it.

@isaacs
npm member
isaacs commented Feb 15, 2014

I'd like to not add any new separators or magic package characters for this.

I think the approach mostly seems fine. A new tap test is absolutely going to be a condition of merging this, though.

@guybrush

i am sorry, but this will take at least another week (no promises).

i want to write a good tap-test without relying on external services by creating and hosting a local git-repo.

feel free to close this issue, i will open again upon progress

@robertkowalski
npm member

#4108 would add the ability to use local git repositories. There is also a test included which shows how to test with these local git repositories.

@guybrush

sorry for the late response here

@robertkowalski is this still relevant?

i really like the approach of splitting out the cache into seperate modules like you do it in #4824 but if its not going that way i would go on and provide a test for this PR here

@robertkowalski
npm member

It's still relevant! Test is highly welcome!

@guybrush

im not sure if the test fits your style so please comment stuff that is not ok or if something is missing etc

also note that following tests are failing for me (but they also fail without my patch :D ):

  • test/tap/ignore-install-link.js
  • test/tap/lifecycle-signal.js
  • test/tap/peer-deps.js

https://gist.github.com/guybrush/9759715

@guybrush

also im not sure if that is an issue, but a git-dependency with an url like git+http://foo.com will throw an error because parsed.pathname is null here https://github.com/npm/npm/blob/8553e91/lib/cache.js#L423 , so the url must be at least git+http://foo.com/.

but i did not bother with this in the current PR, if you think this is an issue i can open another PR

@guybrush guybrush commented on the diff Mar 27, 2014
lib/cache.js
- var tmp
-
- exec(git, archive, {cwd: p, env: env}, function (er, stdout, stderr) {
- stdout = (stdout + "\n" + stderr).trim()
- if (er) {
- log.error("git fetch -a origin ("+u+")", stdout)
- return cb(er)
- }
- log.verbose("git fetch -a origin ("+u+")", stdout)
- tmp = path.join(npm.tmp, Date.now()+"-"+Math.random(), "tmp.tgz")
- verifyOwnership()
+ // we look in the locally cached repo for `co`, if we dont find a
+ // commit with a hash that fits `co` we need to fetch the remote repo
+ // otherwise just create a tarball from the cached repo.
+ exec(git, argsResolve, {cwd: p, env: env}, function(er, stdout, stderr){
+ // when `git rev-list` fails, we assume its because the hash is not in the cache
@guybrush
guybrush added a note Mar 27, 2014

im not sure here, maybe we should check for stderr to be some specific message? but then we would have to look if the message looks exactly the same in all the git-versions? so i think it is ok how i am doing it here

i could not find any other solution to "check if a commit with given hash exists" than git rev-list, maybe some git-wizard knows a better solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@othiym23

Oof. The first big cache refactor last year totally wiped out this change, and further tweaks to how git remotes are handled have made it pretty much un-transferrable to the new code base. I'm sorry. We should have merged this sooner. I'm going to close this out (with my apologies for letting it rot like this), but I will say that the entire cache portion of the code base will probably get rewritten at some point this year (it's on the roadmap!), at which point we can revisit how to reduce how much work git does to stay current.

Again, my apologies for letting this get this far from master.

@othiym23 othiym23 closed this Feb 27, 2015
@guybrush
guybrush commented Mar 3, 2015

@othiym23 no worries! you are doing a superb job, thanks for that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.