Cache git remotes #2482

Closed
wants to merge 1 commit into from

4 participants

@guybrush

right now npm install git+ssh://foo.com/bar.git#v0.0.1 will (see https://github.com/isaacs/npm/blob/master/lib/cache.js#L280-326 ):

  • clone the git-repo into /tmp/random
  • then it will checkout the gitref v0.0.1
  • then it will run addLocalDirectory(/tmp/random)

this patch will make it do (u is the git-remote):

  1. path.exists(cache+'/.gitRemotes/'+sha1(u)) ? 3. : 2.
  2. git clone --bare u cache/.gitRemotes/sha1(u)
  3. cd cache/.gitRemotes/sha1(u) && git fetch
  4. git clone cache/.gitRemotes/sha1(u) /tmp/random
  5. cd /tmp/random && git checkout gitref
  6. addLocalDirectory(/tmp/random)

that way npm only needs to clone repositories (per git-remote) 1 time and will use git fetch after that
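
The cache directory in step 1 depends only on the remote URL, so the same remote always maps to the same folder. A minimal sketch of that mapping, assuming Node's built-in crypto module; the helper name remoteCacheDir and the example paths are made up for illustration and are not part of the patch:

```javascript
const crypto = require("crypto");
const path = require("path");

// Map a git remote URL to a stable directory inside the npm cache.
// The ".gitRemotes" folder name follows the description above;
// `remoteCacheDir` is a hypothetical helper name.
function remoteCacheDir(cacheRoot, remoteUrl) {
  const digest = crypto.createHash("sha1").update(remoteUrl).digest("hex");
  return path.join(cacheRoot, ".gitRemotes", digest);
}

const first = remoteCacheDir("/home/me/.npm", "git://foo.com/bar.git");
const again = remoteCacheDir("/home/me/.npm", "git://foo.com/bar.git");
const other = remoteCacheDir("/home/me/.npm", "git://foo.com/baz.git");
console.log(first === again); // true: same remote, same directory
console.log(first === other); // false: different remote, different directory
```

Because the directory name is a hash of the URL, two spellings of the same repository (for example with and without a trailing slash) would end up in separate cache folders.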

@guybrush

note that since ca15583 npm install git+ssh://git@github.com:isaacs/npm.git does not work anymore. this PR depends on ca15583

git supports (see http://www.kernel.org/pub/software/scm/git/docs/v1.5.6/git-fetch.html#_git_urls_a_id_urls_a )

  • rsync://host.xz/path/to/repo.git/
  • http://host.xz/path/to/repo.git/
  • https://host.xz/path/to/repo.git/
  • git://host.xz/path/to/repo.git/
  • git://host.xz/~user/path/to/repo.git/
  • ssh://[user@]host.xz[:port]/path/to/repo.git/
  • ssh://[user@]host.xz/path/to/repo.git/
  • ssh://[user@]host.xz/~user/path/to/repo.git/
  • ssh://[user@]host.xz/~/path/to/repo.git

and

  • [user@]host.xz:/path/to/repo.git/
  • [user@]host.xz:~user/path/to/repo.git/
  • [user@]host.xz:path/to/repo.git

so npm should only allow

  • npm install git+ssh://git@github.com/isaacs/npm.git
  • npm install git://git@github.com/isaacs/npm.git
  • npm install git@github.com:isaacs/npm.git

that is, the colon (scp-style) works only without defining a protocol

i might be wrong, any thoughts on that?
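
To make the distinction concrete, here is a rough sketch that sorts a remote string into the forms listed above. The function name classifyGitRemote and the "mixed" label are assumptions for illustration, and the regular expressions are deliberately simplistic (no IPv6 literals, for example); this is not how npm parses remotes.

```javascript
// Classify a git remote string:
//   "url"   - explicit protocol, optional numeric port (ssh://host:22/path)
//   "mixed" - explicit protocol combined with an scp-style colon path
//   "scp"   - [user@]host:path with no protocol
//   "other" - anything else, e.g. a local path
function classifyGitRemote(remote) {
  const withProtocol = /^[a-z][a-z0-9+.-]*:\/\/([^\/]*)/i.exec(remote);
  if (withProtocol) {
    const authority = withProtocol[1];
    // Drop an optional "user@" prefix before looking for a colon.
    const hostPort = authority.slice(authority.lastIndexOf("@") + 1);
    const colon = hostPort.indexOf(":");
    // A colon is fine when it introduces a numeric port.
    if (colon !== -1 && !/^\d*$/.test(hostPort.slice(colon + 1))) return "mixed";
    return "url";
  }
  if (/^([^@\/:]+@)?[^@\/:]+:/.test(remote)) return "scp";
  return "other";
}

[
  "git+ssh://git@github.com/isaacs/npm.git",
  "git@github.com:isaacs/npm.git",
  "git+ssh://git@github.com:isaacs/npm.git",
  "ssh://foo:22/some/path"
].forEach(function (remote) {
  console.log(remote, "->", classifyGitRemote(remote));
});
```

Under this sketch, the form that stopped working (a protocol plus a colon-separated path) is the one reported as "mixed".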

related: #2453

@guybrush

i found out about require('zlib') :)

so now it is

  1. path.exists(cache+'/.gitRemotes/'+sha1(u)) ? 3. : 2.
  2. git clone --bare u cache/.gitRemotes/sha1(u)
  3. cd cache/.gitRemotes/sha1(u) && git fetch
  4. git archive /tmp/random.tgz
  5. addLocalTarball(/tmp/random.tgz)
@guybrush

did a git rebase to master

@scriby scriby referenced this pull request
Closed

git URL update speed #2855

lib/cache.js
((6 lines not shown))
log.verbose("addRemoteGit", [u, co])
- var tmp = path.join(npm.tmp, Date.now()+"-"+Math.random())
- mkdir(path.dirname(tmp), function (er) {
- if (er) return cb(er)
- exec( npm.config.get("git"), ["clone", u, tmp], null, false
- , function (er, code, stdout, stderr) {
- stdout = (stdout + "\n" + stderr).trim()
- if (er) {
- log.error("git clone " + u, stdout)
- return cb(er)
- }
- log.verbose("git clone "+u, stdout)
- exec( npm.config.get("git"), ["checkout", co], null, false, tmp
+ var p = path.join(npm.config.get("cache"), ".gitRemotes", v)
+ path.exists(p, function (exists) {
@isaacs Owner
isaacs added a note

Don't use path.exists. Use fs.stat, and verify that it's actually the sort of thing you want it to be. (Ie, st.isDirectory() etc.)

what should i do if the path exists but is not a directory, just delete it?
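
A minimal sketch of the fs.stat check suggested here, assuming only Node's built-in fs module. The helper name inspectCachePath and its three result strings are invented for illustration; what to do in the "other" case (for example removing the path and cloning again) is left to the caller.

```javascript
const fs = require("fs");

// Report what currently lives at `p`: "missing", "directory" or "other".
// `inspectCachePath` is a hypothetical name, not an npm function.
function inspectCachePath(p, cb) {
  fs.stat(p, function (er, st) {
    // ENOENT means nothing is there yet; other errors are passed on.
    if (er && er.code === "ENOENT") return cb(null, "missing");
    if (er) return cb(er);
    cb(null, st.isDirectory() ? "directory" : "other");
  });
}

inspectCachePath(process.cwd(), function (er, kind) {
  if (er) throw er;
  console.log(process.cwd(), "is", kind);
});
```

Unlike path.exists, this distinguishes a stray file at the cache path from a usable directory.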

lib/cache.js
((6 lines not shown))
log.verbose("addRemoteGit", [u, co])
- var tmp = path.join(npm.tmp, Date.now()+"-"+Math.random())
- mkdir(path.dirname(tmp), function (er) {
- if (er) return cb(er)
- exec( npm.config.get("git"), ["clone", u, tmp], null, false
- , function (er, code, stdout, stderr) {
- stdout = (stdout + "\n" + stderr).trim()
- if (er) {
- log.error("git clone " + u, stdout)
- return cb(er)
- }
- log.verbose("git clone "+u, stdout)
- exec( npm.config.get("git"), ["checkout", co], null, false, tmp
+ var p = path.join(npm.config.get("cache"), ".gitRemotes", v)
@isaacs Owner
isaacs added a note

The folder should not start with a dot, or contain capital letters. _git-remotes would be better.

lib/cache.js
((6 lines not shown))
log.verbose("addRemoteGit", [u, co])
- var tmp = path.join(npm.tmp, Date.now()+"-"+Math.random())
- mkdir(path.dirname(tmp), function (er) {
- if (er) return cb(er)
- exec( npm.config.get("git"), ["clone", u, tmp], null, false
- , function (er, code, stdout, stderr) {
- stdout = (stdout + "\n" + stderr).trim()
- if (er) {
- log.error("git clone " + u, stdout)
- return cb(er)
- }
- log.verbose("git clone "+u, stdout)
- exec( npm.config.get("git"), ["checkout", co], null, false, tmp
+ var p = path.join(npm.config.get("cache"), ".gitRemotes", v)
+ path.exists(p, function (exists) {
+ if (exists) return archive()
@isaacs Owner
isaacs added a note

If the path exists and is a folder, then you still ought to do a git fetch -a origin in it, so that you pick up new refs.

I do git fetch in the archive() function, will change it to git fetch -a origin.

@scriby

I just wanted to say thanks for working on this! It will really help out on my project.

@guybrush

oh i just noticed the comments from isaacs, sweet! will try to fix the things later today and rebase, this would really help me too :)

@guybrush

to sum up what i changed (its all only in addRemoteGit()):

  • added gitEnv() to all the git-exec()'s
  • .gitRemotes -> _git-remotes
  • path.exists(cacheDir) -> check if all of the following is true, if not rm(cacheDir) and do a fresh clone (i am not sure about this)
    • fs.stat(cacheDir) && stats.isDirectory()
    • exec("git config --get remote.origin.url") == u
  • git fetch -> git fetch -a origin
  • joined stdout and stderr for logging after all the exec's like everywhere else (stdout = (stdout + "\n" + stderr).trim())
  • added some "("+u+")" to some log-messages, to give a hint about the repo
  • listen for close on the git archive process before doing addLocalTarball to make sure the gzip-stream ended (i am not sure about this)
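
On the last point, one possible alternative is to wait for the output file itself to finish rather than for the child process to close. The sketch below is not what the patch does: it assumes a Node.js version that ships stream.pipeline (much newer than the Node of this PR), and the helper name gzipToFile is made up. In the real flow the source would be the stdout of the git archive child process; here a small in-memory stream stands in for it.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const zlib = require("zlib");
const { pipeline, Readable } = require("stream");

// Gzip a readable stream into `dest` and call back only once the whole
// chain, including the file write, has finished or failed.
// `gzipToFile` is a hypothetical helper name.
function gzipToFile(source, dest, cb) {
  pipeline(source, zlib.createGzip(), fs.createWriteStream(dest), cb);
}

// Stand-in for `child.stdout` of a spawned `git archive` process.
const fakeTar = Readable.from([Buffer.from("pretend this is tar data")]);
const demoFile = path.join(os.tmpdir(), "gzip-demo-" + process.pid + ".tgz");

gzipToFile(fakeTar, demoFile, function (er) {
  if (er) throw er;
  // Safe to read now: the write stream has flushed everything.
  console.log(zlib.gunzipSync(fs.readFileSync(demoFile)).toString());
});
```

The callback plays the role that addLocalTarball(tmp, cb) has in the patch: it only runs after the gzip stream and the file write have both ended.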

so with this patch npm caches git-remotes like this:

  1. cacheDir = path.join(cache,'_git-remotes',sha1(u))
  2. checkGitDir(cacheDir) ? 4. : 3. (rm cacheDir if necessary)
  3. git clone --mirror u cacheDir
  4. cd cacheDir && git fetch -a origin
  5. git archive /tmp/random.tgz --format=tar --prefix=package/
  6. addLocalTarball(/tmp/random.tgz)
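
The branching in steps 2 to 4 can be sketched as a small pure function. This is only an illustration of the decision logic described above: the name planCacheUpdate, the shape of the state object, and the step labels are all assumptions, and the real patch performs these checks asynchronously inside addRemoteGit().

```javascript
// Decide which steps are needed for a remote's cache directory.
// `state` is a hypothetical summary gathered beforehand:
//   exists      - something is present at the cache path
//   isDirectory - that something is a directory
//   originUrl   - output of `git config --get remote.origin.url`, or null
function planCacheUpdate(state, remoteUrl) {
  const reusable =
    state.exists && state.isDirectory && state.originUrl === remoteUrl;
  if (reusable) return ["fetch", "archive", "addLocalTarball"];

  const fresh = ["clone --mirror", "fetch", "archive", "addLocalTarball"];
  // Anything unusable at the cache path is removed before cloning.
  return state.exists ? ["rm"].concat(fresh) : fresh;
}

const url = "git://github.com/guybrush/npm.git";
console.log(planCacheUpdate({ exists: false }, url));
console.log(planCacheUpdate({ exists: true, isDirectory: true, originUrl: url }, url));
console.log(planCacheUpdate({ exists: true, isDirectory: false, originUrl: null }, url));
```

Keeping the decision separate from the git calls makes the three cases (no cache, valid cache, broken cache) easy to check in isolation.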

since i am very unsure where/how to put tests in npm i made a simple test-module :D

@scriby

poke

@guybrush

@scriby until @isaacs responds you can use the feature by installing the branch with npm install -g git://github.com/guybrush/npm.git#cacheGitRemotes, I will try to keep the branch up2date/rebased to master. at least I am using it in production to install my stuff and it works fine so far - though I dont have a huge server-farm or something where this is super-critical.

I understand that this patch is somewhat critical and does not add a popularly demanded feature (and is not very clean yet?). so it may take some time to convince @isaacs to even spend some time in reviewing this or even implement this himself, i guess :)

also I will try to poke @isaacs in irc from time to time :p

@guybrush guybrush referenced this pull request in substack/rolling-reduce
Closed

Add component.json #1

@domenic
Collaborator

+1

@domenic domenic referenced this pull request in requirejs/text
Closed

Publish to npm #32

@isaacs
Owner

Landed on master with a little bit of fixup. Thanks, this is a big improvement!

@isaacs isaacs closed this
@scriby

Thanks!!!

Commits on Oct 30, 2012
  1. @guybrush

    cache git-dependencies

    guybrush authored
Showing with 80 additions and 15 deletions.
  1. +80 −15 lib/cache.js
95 lib/cache.js
@@ -77,6 +77,7 @@ var mkdir = require("mkdirp")
, lockFile = require("lockfile")
, crypto = require("crypto")
, retry = require("retry")
+ , zlib = require("zlib")
cache.usage = "npm cache add <tarball file>"
+ "\nnpm cache add <folder>"
@@ -341,9 +342,12 @@ function addRemoteTarball (u, shasum, name, cb_) {
})
}
-// For now, this is kind of dumb. Just basically treat git as
-// yet another "fetch and scrub" kind of thing.
-// Clone to temp folder, then proceed with the addLocal stuff.
+// 1. cacheDir = path.join(cache,'_git-remotes',sha1(u))
+// 2. checkGitDir(cacheDir) ? 4. : 3. (rm cacheDir if necessary)
+// 3. git clone --mirror u cacheDir
+// 4. cd cacheDir && git fetch -a origin
+// 5. git archive /tmp/random.tgz <gitref> --format=tar --prefix=package/
+// 6. addLocalTarball(/tmp/random.tgz)
function addRemoteGit (u, parsed, name, cb_) {
if (typeof cb_ !== "function") cb_ = name, name = null
@@ -360,11 +364,13 @@ function addRemoteGit (u, parsed, name, cb_) {
})
}
+ var p, co // cachePath, git-ref we want to check out
+
lock(u, function (er) {
if (er) return cb(er)
// figure out what we should check out.
- var co = parsed.hash && parsed.hash.substr(1) || "master"
+ co = parsed.hash && parsed.hash.substr(1) || "master"
// git is so tricky!
// if the path is like ssh://foo:22/some/path then it works, but
// it needs the ssh://
@@ -378,32 +384,91 @@ function addRemoteGit (u, parsed, name, cb_) {
u = u.replace(/^ssh:\/\//, "")
}
+ var v = crypto.createHash("sha1").update(u).digest("hex")
+
log.verbose("addRemoteGit", [u, co])
- var tmp = path.join(npm.tmp, Date.now()+"-"+Math.random())
- mkdir(path.dirname(tmp), function (er) {
+ p = path.join(npm.config.get("cache"), "_git-remotes", v)
+
+ checkGitDir()
+ })
+
+ function checkGitDir () {
+ fs.stat(p, function(er, s) {
+ if (er) return cloneRemote()
+ if (!s.isDirectory()) return rm(p, function(er){
+ if (er) return cb(er)
+ cloneRemote()
+ })
+ exec( npm.config.get("git"), ["config", "--get", "remote.origin.url"]
+ , gitEnv(), false, p, function(er, code, stdout, stderr) {
+ var stdoutTrimmed = (stdout + "\n" + stderr).trim()
+ if (er) {
+ log.warn( "git config --get remote.origin.url returned wrong result "
+ + "("+u+")", stdoutTrimmed )
+ return rm(p, function(err){
+ if (err) return cb(err)
+ cloneRemote()
+ })
+ }
+ log.verbose("git config --get remote.origin.url ("+u+")", stdoutTrimmed)
+ if (u != stdout.trim()) return rm(p, function(err){
+ if (err) return cb(err)
+ cloneRemote(p)
+ })
+ archiveRemote()
+ })
+ })
+ }
+
+ function cloneRemote () {
+ mkdir(p, function (er) {
if (er) return cb(er)
- exec( npm.config.get("git"), ["clone", u, tmp], gitEnv(), false
+ exec( npm.config.get("git"), ["clone", "--mirror", u, p], gitEnv(), false
, function (er, code, stdout, stderr) {
stdout = (stdout + "\n" + stderr).trim()
if (er) {
log.error("git clone " + u, stdout)
return cb(er)
}
- log.verbose("git clone "+u, stdout)
- exec( npm.config.get("git"), ["checkout", co], gitEnv(), false, tmp
- , function (er, code, stdout, stderr) {
- stdout = (stdout + "\n" + stderr).trim()
+ log.verbose("git clone " + u, stdout)
+ archiveRemote(p)
+ })
+ })
+ }
+
+ function archiveRemote () {
+ exec( npm.config.get("git"), ["fetch", "-a", "origin"], gitEnv(), false, p
+ , function (er, code, stdout, stderr) {
+ stdout = (stdout + "\n" + stderr).trim()
+ if (er) {
+ log.error("git fetch -a origin ("+u+")", stdout)
+ return cb(er)
+ }
+ log.verbose("git fetch -a origin ("+u+")", stdout)
+ var tmp = path.join(npm.tmp, Date.now()+"-"+Math.random(), "tmp.tgz")
+ mkdir( path.dirname(tmp), function (er) {
+ if (er) return cb(er)
+ var gzip = zlib.createGzip()
+ var out = fs.createWriteStream(tmp)
+ var cp = exec( npm.config.get("git")
+ , ["archive", co, "--format=tar", "--prefix=package/"]
+ , gitEnv(), false, p
+ , function (er, code, stdout, stderr) {
if (er) {
- log.error("git checkout " + co, stdout)
+ log.error( "git archive " + co + " ("+u+")"
+ , (stdout + "\n" + stderr).trim())
return cb(er)
}
- log.verbose("git checkout " + co, stdout)
- addLocalDirectory(tmp, cb)
+ log.verbose("git archive " + co + " ("+u+")")
+ cp.on('close', function(){
+ addLocalTarball(tmp, cb)
+ })
})
+ cp.stdout.pipe(gzip).pipe(out)
})
})
- })
+ }
}