This repository has been archived by the owner. It is now read-only.

Don't *always* update Git dependencies. #4191

Open
domenic opened this Issue Nov 26, 2013 · 27 comments

Comments

Projects
None yet
@domenic
Member

domenic commented Nov 26, 2013

In #4104, @robertkowalski fixed #1727 by making Git dependencies always update (i.e., always show as outdated). This is a big improvement over where we were, when they would never update! But it could be better.

Ideally:

  • If package.json points to a commit hash and the local copy is already at that commit hash, do no HTTP requests.
  • If package.json points to a tag or branch name, go fetch the metadata for that tag/branch from the repo, and check if the commit hash matches the local one. If so, do a full tarball download and unpack; otherwise, do no further HTTP.
  • If package.json points to no tag (i.e. master), go fetch the metadata for master, and proceed as in the previous case.

I'm not sure how much of this is possible given what information and technology we have available. (How do you tell what commit the local copy is on? What does 'getting metadata" mean? Are there easy ways to do this stuff for GitHub, but not for arbitrary Git servers?) I would bet, though, that Git-based package managers like Bower or Volo have already figured this out.

@guybrush

This comment has been minimized.

Show comment
Hide comment
@guybrush

guybrush Dec 20, 2013

Contributor

also:

  • If package.json points to a commit hash and the local copy is not at that commit, check if the commit exists in the local cache of the repository. if it exists pull out the tarball, otherwise fetch the remote and try to pull out tarball afterwards

and in general check if the target-commit (if provided) exists in the cache before fetching

Contributor

guybrush commented Dec 20, 2013

also:

  • If package.json points to a commit hash and the local copy is not at that commit, check if the commit exists in the local cache of the repository. if it exists pull out the tarball, otherwise fetch the remote and try to pull out tarball afterwards

and in general check if the target-commit (if provided) exists in the cache before fetching

@vjpr

This comment has been minimized.

Show comment
Hide comment
@vjpr

vjpr Jan 23, 2014

Is there some way to prevent all the dependencies of git packages being re-downloaded when a package changes?

Say I split my project into modules and deploy to Heroku, node_modules is cached for my main module, but other modules as git dependencies have all their dependencies re-downloaded and rebuilt.

Maybe an option to tell npm not to clobber nested node_modules when a git dep changes. This would allow a custom build script to handle the nested node_modules caching and invalidation.

vjpr commented Jan 23, 2014

Is there some way to prevent all the dependencies of git packages being re-downloaded when a package changes?

Say I split my project into modules and deploy to Heroku, node_modules is cached for my main module, but other modules as git dependencies have all their dependencies re-downloaded and rebuilt.

Maybe an option to tell npm not to clobber nested node_modules when a git dep changes. This would allow a custom build script to handle the nested node_modules caching and invalidation.

@guybrush

This comment has been minimized.

Show comment
Hide comment
@guybrush

guybrush Feb 13, 2014

Contributor

ok i tried to fix it and i am hanging at a problem where i dont see a good solution right now: how do we distinguish a commit-hash from a branch/tag-name? because without knowing if a commit-hash is meant or a git-ref we have to fetch the origin to be sure that whatever is up-to-date when you define a git-url-dependency: "foo" : "git/url#whatever".

so the first idea that pops up in my head would be to introduce another seperator (besides #) which indicates that you mean a commit-hash and no git-ref (which seems a bad idea..).

another idea would be, to look for a commit in the local repo with that hash. if it is there just use it and assume that whatever is a commit-hash

thoughts?

(if the way it works now sucks really hard for you, i recommend to bundle the dependencies)

Contributor

guybrush commented Feb 13, 2014

ok i tried to fix it and i am hanging at a problem where i dont see a good solution right now: how do we distinguish a commit-hash from a branch/tag-name? because without knowing if a commit-hash is meant or a git-ref we have to fetch the origin to be sure that whatever is up-to-date when you define a git-url-dependency: "foo" : "git/url#whatever".

so the first idea that pops up in my head would be to introduce another seperator (besides #) which indicates that you mean a commit-hash and no git-ref (which seems a bad idea..).

another idea would be, to look for a commit in the local repo with that hash. if it is there just use it and assume that whatever is a commit-hash

thoughts?

(if the way it works now sucks really hard for you, i recommend to bundle the dependencies)

@qraynaud

This comment has been minimized.

Show comment
Hide comment
@qraynaud

qraynaud Feb 13, 2014

Once you have downloaded the ref once, you can know. And since you need to download it at least the first time that's not really a problem (you just have to checkout the repository, pull to the indicated ref and compare the current hash to the provided reference).

Or am I missing something?

qraynaud commented Feb 13, 2014

Once you have downloaded the ref once, you can know. And since you need to download it at least the first time that's not really a problem (you just have to checkout the repository, pull to the indicated ref and compare the current hash to the provided reference).

Or am I missing something?

@scottnonnenberg

This comment has been minimized.

Show comment
Hide comment
@scottnonnenberg

scottnonnenberg Mar 23, 2014

+1 for a fix. This is slowing down installs for any package with git dependencies, even if the references are using tags. Just reverted to npm@1.3.11. :0(

scottnonnenberg commented Mar 23, 2014

+1 for a fix. This is slowing down installs for any package with git dependencies, even if the references are using tags. Just reverted to npm@1.3.11. :0(

@guybrush

This comment has been minimized.

Show comment
Hide comment
@guybrush

guybrush Mar 25, 2014

Contributor

@scottnonnenberg with my PR #4667 it will skip fetching but only when you use commit-hashes (since we cant be sure if tags/branches haven't moved). also the defined commit-hash has to be at least 4 characters (currently).

Contributor

guybrush commented Mar 25, 2014

@scottnonnenberg with my PR #4667 it will skip fetching but only when you use commit-hashes (since we cant be sure if tags/branches haven't moved). also the defined commit-hash has to be at least 4 characters (currently).

@scottnonnenberg

This comment has been minimized.

Show comment
Hide comment
@scottnonnenberg

scottnonnenberg Mar 25, 2014

@guybrush Glad to see some movement on this stuff. However, restricting the right behavior to commit-hashes seems like we're optimizing for the degenerative case - it's a bad idea idea to change published history, and that includes tags: http://sethrobertson.github.io/GitBestPractices/#pubonce

Seems to me, if a tag changes, it's reasonable to ask folks to manually remove that dependency and reinstall it. Bower behaves this way - it's entirely built on git tags, and keeps a local cache of that tag for future installs of that version until you manually remove it.

Then you're only left with branch vs. tag differentiation. If you're pointing to a branch, yes, reinstall all the time. If a tag, leave it.

scottnonnenberg commented Mar 25, 2014

@guybrush Glad to see some movement on this stuff. However, restricting the right behavior to commit-hashes seems like we're optimizing for the degenerative case - it's a bad idea idea to change published history, and that includes tags: http://sethrobertson.github.io/GitBestPractices/#pubonce

Seems to me, if a tag changes, it's reasonable to ask folks to manually remove that dependency and reinstall it. Bower behaves this way - it's entirely built on git tags, and keeps a local cache of that tag for future installs of that version until you manually remove it.

Then you're only left with branch vs. tag differentiation. If you're pointing to a branch, yes, reinstall all the time. If a tag, leave it.

@domenic

This comment has been minimized.

Show comment
Hide comment
@domenic

domenic Mar 25, 2014

Member

it's reasonable to ask folks to manually remove that dependency and reinstall it.

-1, this is unreasonable. The fact that bower does this simply means that bower is conflating a version control system with a versioned registry.

Member

domenic commented Mar 25, 2014

it's reasonable to ask folks to manually remove that dependency and reinstall it.

-1, this is unreasonable. The fact that bower does this simply means that bower is conflating a version control system with a versioned registry.

@scottnonnenberg

This comment has been minimized.

Show comment
Hide comment
@scottnonnenberg

scottnonnenberg Mar 25, 2014

@domenic I think you've hit the nail on the head. The problem is that we're acting like we can be 100% clean with a dependency straight to git. But git has behaviors that don't map directly to the npm registry. For example, a commit hash can be removed outright from a remote repository. It's another bad practice, but it can be done.

The point is that we need to design a system that works as most people expect it would, and works consistently. Given the definition of a tag, and the fact that it takes interesting longhand commands to change a tag in a remote repository, I think we'll continue to butt up against user expectation without a change.

scottnonnenberg commented Mar 25, 2014

@domenic I think you've hit the nail on the head. The problem is that we're acting like we can be 100% clean with a dependency straight to git. But git has behaviors that don't map directly to the npm registry. For example, a commit hash can be removed outright from a remote repository. It's another bad practice, but it can be done.

The point is that we need to design a system that works as most people expect it would, and works consistently. Given the definition of a tag, and the fact that it takes interesting longhand commands to change a tag in a remote repository, I think we'll continue to butt up against user expectation without a change.

@guybrush

This comment has been minimized.

Show comment
Hide comment
@guybrush

guybrush Mar 26, 2014

Contributor

when you remove a git-hash like you said, it will just not install anything and throw an error (i.e. npm). it wont just install something.

also see http://blog.npmjs.org/post/77758351673/no-more-npm-publish-f

Even if a package is 100% unpublished by the author, and you publish a new version of a brand new thing, you won’t be able to use the version numbers that the previous author used.

The only way around this will be for a server admin to manually intervene.

The net benefit is that you won’t be surprised by having different things show up when you and someone else both have foo@1.2.3 installed. Either it’ll work, or it won’t, but it won’t ever be a completely different thing.

which comes pretty close to a git-hash, when you put a hash in your package.json you can be pretty sure you get what you meant (or nothing)

also my PR doesnt prevent you from using tags, its just pessimistic about tags/branches since they can move and thus the git-repo must be fetched every time

Contributor

guybrush commented Mar 26, 2014

when you remove a git-hash like you said, it will just not install anything and throw an error (i.e. npm). it wont just install something.

also see http://blog.npmjs.org/post/77758351673/no-more-npm-publish-f

Even if a package is 100% unpublished by the author, and you publish a new version of a brand new thing, you won’t be able to use the version numbers that the previous author used.

The only way around this will be for a server admin to manually intervene.

The net benefit is that you won’t be surprised by having different things show up when you and someone else both have foo@1.2.3 installed. Either it’ll work, or it won’t, but it won’t ever be a completely different thing.

which comes pretty close to a git-hash, when you put a hash in your package.json you can be pretty sure you get what you meant (or nothing)

also my PR doesnt prevent you from using tags, its just pessimistic about tags/branches since they can move and thus the git-repo must be fetched every time

@scottnonnenberg

This comment has been minimized.

Show comment
Hide comment
@scottnonnenberg

scottnonnenberg Apr 29, 2014

Hey guys. I appreciate that you took the time to talk through this with me thus far.

Full disclosure: I am using git+ssh dependencies with semver tags as a kind of local npm registry. It worked until that recent npm change, so I'm on npm@1.3.11 until either the previous tag behavior is restored, or I find a lightweight private npm registry I can use. I've found Reggie and sinopia so far - perhaps you guys have recommendations?

One final bit of support for the previous behavior: There's a lot of history on the side of caching tags - the second-most popular answer at this StackOverflow question on private npm modules, from 2011, talks about using git+ssh/tags. I'm pretty sure that the majority of the people using this technique either haven't upgraded to the latest node/npm yet, or just haven't noticed that their deploys are taking longer. They will, as I did, and consider it a breaking change.

So if we don't go back to the previous behavior, we should at least have a good set of alternative solutions for the private modules scenario.

scottnonnenberg commented Apr 29, 2014

Hey guys. I appreciate that you took the time to talk through this with me thus far.

Full disclosure: I am using git+ssh dependencies with semver tags as a kind of local npm registry. It worked until that recent npm change, so I'm on npm@1.3.11 until either the previous tag behavior is restored, or I find a lightweight private npm registry I can use. I've found Reggie and sinopia so far - perhaps you guys have recommendations?

One final bit of support for the previous behavior: There's a lot of history on the side of caching tags - the second-most popular answer at this StackOverflow question on private npm modules, from 2011, talks about using git+ssh/tags. I'm pretty sure that the majority of the people using this technique either haven't upgraded to the latest node/npm yet, or just haven't noticed that their deploys are taking longer. They will, as I did, and consider it a breaking change.

So if we don't go back to the previous behavior, we should at least have a good set of alternative solutions for the private modules scenario.

@othiym23 othiym23 added this to the cache rewrite milestone Oct 3, 2014

@sandinmyjoints

This comment has been minimized.

Show comment
Hide comment
@sandinmyjoints

sandinmyjoints Feb 17, 2015

What if there were a flag --no-refetch-git that skipped refetching already-installed git-specified deps entirely? Then using checked-in node_modules/ with git+ssh private repos as a poor man's private registry could use tags and branches, instead of only being able to use commit hashes. (Can't wait until private npm packages are available without having to run your own registry!)

sandinmyjoints commented Feb 17, 2015

What if there were a flag --no-refetch-git that skipped refetching already-installed git-specified deps entirely? Then using checked-in node_modules/ with git+ssh private repos as a poor man's private registry could use tags and branches, instead of only being able to use commit hashes. (Can't wait until private npm packages are available without having to run your own registry!)

@rluba

This comment has been minimized.

Show comment
Hide comment
@rluba

rluba Mar 11, 2015

This issue slows us down massively.

If you have a Git dependency that has any child dependency with native extensions, those extensions are recompiled upon every npm install. For example depending on bunyan or restify causes dtrace-provider to be compiled every time, karma recompiles fsevents, socket.io recompiles ws and so on.

Even fetching the branch/tag every time to check if the commit hash has changed before removing and reinstalling a Git dependency would already be a huge improvement.

rluba commented Mar 11, 2015

This issue slows us down massively.

If you have a Git dependency that has any child dependency with native extensions, those extensions are recompiled upon every npm install. For example depending on bunyan or restify causes dtrace-provider to be compiled every time, karma recompiles fsevents, socket.io recompiles ws and so on.

Even fetching the branch/tag every time to check if the commit hash has changed before removing and reinstalling a Git dependency would already be a huge improvement.

@aearly

This comment has been minimized.

Show comment
Hide comment
@aearly

aearly Mar 19, 2015

👍 for a npm flag that skips re-checking git dependencies if the tag/branch id hasn't changed.

aearly commented Mar 19, 2015

👍 for a npm flag that skips re-checking git dependencies if the tag/branch id hasn't changed.

@estani

This comment has been minimized.

Show comment
Hide comment
@estani

estani Apr 29, 2015

"--cache-only" or something that does not check anything (not only git) provided the already cached versions satisfied the requirements.
This also implies that once built, it will always build as long as the main dependencies are not modifed.

estani commented Apr 29, 2015

"--cache-only" or something that does not check anything (not only git) provided the already cached versions satisfied the requirements.
This also implies that once built, it will always build as long as the main dependencies are not modifed.

@gumallett

This comment has been minimized.

Show comment
Hide comment
@gumallett

gumallett Jul 21, 2015

The documentation for npm install should at least note this "quirk". The algorithm list here: https://docs.npmjs.com/cli/install, explicitly states that the dependency is only installed if not already in node_modules.

gumallett commented Jul 21, 2015

The documentation for npm install should at least note this "quirk". The algorithm list here: https://docs.npmjs.com/cli/install, explicitly states that the dependency is only installed if not already in node_modules.

@wmertens

This comment has been minimized.

Show comment
Hide comment
@wmertens

wmertens Nov 18, 2015

Contributor

So… Any news on one of the "light" solutions proposed? This is a huge waste of resources, for everyone.

Contributor

wmertens commented Nov 18, 2015

So… Any news on one of the "light" solutions proposed? This is a huge waste of resources, for everyone.

@jordwalke

This comment has been minimized.

Show comment
Hide comment
@jordwalke

jordwalke Nov 25, 2015

I would really love to see this finally addressed.

jordwalke commented Nov 25, 2015

I would really love to see this finally addressed.

@benjarwar

This comment has been minimized.

Show comment
Hide comment
@benjarwar

benjarwar Feb 24, 2016

+1 for @sandinmyjoints's idea of adding a flag.

benjarwar commented Feb 24, 2016

+1 for @sandinmyjoints's idea of adding a flag.

@wmertens

This comment has been minimized.

Show comment
Hide comment
@wmertens

wmertens Mar 29, 2016

Contributor

AFAICT, this seems to be solved in NPMv3. Simply running npm i does not update the git-based deps.

Contributor

wmertens commented Mar 29, 2016

AFAICT, this seems to be solved in NPMv3. Simply running npm i does not update the git-based deps.

@jordwalke

This comment has been minimized.

Show comment
Hide comment
@jordwalke

jordwalke Mar 29, 2016

@wmertens: But you do sometimes want to update the git dependencies, if the hash installed is not the same as the hash specified.

jordwalke commented Mar 29, 2016

@wmertens: But you do sometimes want to update the git dependencies, if the hash installed is not the same as the hash specified.

@wmertens

This comment has been minimized.

Show comment
Hide comment
@wmertens

wmertens Mar 30, 2016

Contributor

I assume that is exactly what happens, I haven't tested that yet.

Wout.
(typed on mobile, excuse terseness)

Contributor

wmertens commented Mar 30, 2016

I assume that is exactly what happens, I haven't tested that yet.

Wout.
(typed on mobile, excuse terseness)

@danmazzella

This comment has been minimized.

Show comment
Hide comment
@danmazzella

danmazzella Apr 22, 2016

Now my Git repo is never updated after performing "npm i". Is this the desired outcome?

danmazzella commented Apr 22, 2016

Now my Git repo is never updated after performing "npm i". Is this the desired outcome?

@aarononeal

This comment has been minimized.

Show comment
Hide comment
@aarononeal

aarononeal May 13, 2016

Seems like we've gone from "don't always" to "never". npm install is never updating my git repo dependencies and npm outdated doesn't show them. I have to specifically call npm install <mymodule> to force the update which invalidates caching and triggers build updates even when there are none. I assume that's a bug and not by design?

aarononeal commented May 13, 2016

Seems like we've gone from "don't always" to "never". npm install is never updating my git repo dependencies and npm outdated doesn't show them. I have to specifically call npm install <mymodule> to force the update which invalidates caching and triggers build updates even when there are none. I assume that's a bug and not by design?

@wmertens

This comment has been minimized.

Show comment
Hide comment
@wmertens

wmertens May 17, 2016

Contributor

Yes, I can confirm that this is the current behavior, even if you specify
tags and change them (imho indicating that you are very interested in an
update)

Ideally, npm should check if the git repo changed tag and rebuild.

Contributor

wmertens commented May 17, 2016

Yes, I can confirm that this is the current behavior, even if you specify
tags and change them (imho indicating that you are very interested in an
update)

Ideally, npm should check if the git repo changed tag and rebuild.

@dgreensp

This comment has been minimized.

Show comment
Hide comment
@dgreensp

dgreensp May 26, 2016

Is there any way to change a git URL so that npm install will cause an update? Commit hash to a different commit hash with a different semver version doesn't seem to trigger an update.

dgreensp commented May 26, 2016

Is there any way to change a git URL so that npm install will cause an update? Commit hash to a different commit hash with a different semver version doesn't seem to trigger an update.

@genzgd

This comment has been minimized.

Show comment
Hide comment
@genzgd

genzgd Mar 30, 2017

Same issue -- I would really like to have some way so that npm update will actually update when the git commit hash changes on the reference branch.

genzgd commented Mar 30, 2017

Same issue -- I would really like to have some way so that npm update will actually update when the git commit hash changes on the reference branch.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.