How to get the zip of tag #231

Closed
xuehang1 opened this issue Dec 30, 2020 · 7 comments
Labels
stale Issues/PRs that are marked for closure due to inactivity

Comments

@xuehang1

I first created a tag locally (CreateTag). Now I want to get a zip archive of that tag, with an effect similar to “git archive --format=zip --output=tag_1.0.zip tag_1.0”. Does anyone know how to implement this with go-git?

@xuehang1 xuehang1 reopened this Dec 30, 2020
@lmas

lmas commented Jan 19, 2021

Looks like you'll have to do the archiving manually yourself, using archive/zip and looping over the file blobs pointed at by the tag.

@abrander

Coincidentally, I just had to solve this exact issue. Here's how I solved it.

import (
	"archive/zip"
	"io"

	"github.com/go-git/go-git/v5"
	"github.com/go-git/go-git/v5/plumbing"
	"github.com/go-git/go-git/v5/plumbing/object"
)

// WriteZip will stream the content of the repository to w as a zip file.
// revision can be anything supported by ResolveRevision(), but please
// note that short hashes are not supported for repositories opened using
// PlainOpen(). See: https://github.com/go-git/go-git/issues/148
func WriteZip(repo *git.Repository, w io.Writer, revision string) error {
	hash, err := repo.ResolveRevision(plumbing.Revision(revision))
	if err != nil {
		return err
	}

	// Load the commit object that the revision resolves to.
	obj, err := repo.CommitObject(*hash)
	if err != nil {
		return err
	}

	// Let's have a look at the tree at that commit.
	tree, err := repo.TreeObject(obj.TreeHash)
	if err != nil {
		return err
	}

	z := zip.NewWriter(w)

	addFile := func(f *object.File) error {
		fw, err := z.Create(f.Name)
		if err != nil {
			return err
		}

		fr, err := f.Reader()
		if err != nil {
			return err
		}
		// Close the blob reader on every path, including a failed copy.
		defer fr.Close()

		_, err = io.Copy(fw, fr)
		return err
	}

	err = tree.Files().ForEach(addFile)
	if err != nil {
		return err
	}

	return z.Close()
}

@lmas

lmas commented Feb 19, 2021

Very nice solution! Done any benchmarking tho? Curious how performance would be (hmm maybe with a large repo size like linux kernel?)

@abrander

abrander commented Feb 21, 2021

> Very nice solution! Done any benchmarking tho? Curious how performance would be (hmm maybe with a large repo size like linux kernel?)

Well. How about we simply measure it. Let's do something like this:

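import (
	"fmt"
	"io"
	"os"
	"time"

	"github.com/go-git/go-git/v5"
)

func main() {
	repo, err := git.PlainOpen("/dev/shm/linux-kernel")
	if err != nil {
		panic(err)
	}

	t0 := time.Now()
	// Discard the archive; only the time needed to produce it matters here.
	// io.Discard needs Go 1.16 or newer (ioutil.Discard on older versions).
	err = WriteZip(repo, io.Discard, "HEAD")
	if err != nil {
		panic(err)
	}

	fmt.Fprintf(os.Stderr, "Well, it took %s\n", time.Since(t0).Round(time.Second).String())
}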

First run on my measly i3-7100U:

Well, it took 1m21s

Maybe we can do better. Let's replace archive/zip with the excellent drop-in package github.com/klauspost/compress/zip from Klaus Post:

Well, it took 1m5s

Maybe we can do better without trying too hard. Let's make it concurrent:

// WriteZip will stream the content of the repository to w as a zip file.
// revision can be anything supported by ResolveRevision(), but please
// note that short hashes are not supported for repositories opened using
// PlainOpen(). See: https://github.com/go-git/go-git/issues/148
func WriteZip(repo *git.Repository, w io.Writer, revision string) error {
	hash, err := repo.ResolveRevision(plumbing.Revision(revision))
	if err != nil {
		return err
	}

	// Load the commit object that the revision resolves to.
	obj, err := repo.CommitObject(*hash)
	if err != nil {
		return err
	}

	// Let's have a look at the tree at that commit.
	tree, err := repo.TreeObject(obj.TreeHash)
	if err != nil {
		return err
	}

	type carrier struct {
		name string
		r    io.ReadCloser
	}

	files := make(chan carrier, 1000)
	// errgroup is golang.org/x/sync/errgroup.
	g := &errgroup.Group{}

	g.Go(func() error {
		z := zip.NewWriter(w)

		for c := range files {
			fw, err := z.Create(c.name)
			if err != nil {
				return err
			}

			_, err = io.Copy(fw, c.r)
			if err != nil {
				return err
			}

			err = c.r.Close()
			if err != nil {
				return err
			}
		}

		return z.Close()
	})

	addFile := func(f *object.File) error {
		fr, err := f.Reader()
		if err != nil {
			return err
		}

		files <- carrier{f.Name, fr}

		return nil
	}

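	err = tree.Files().ForEach(addFile)

	// Close the channel even if iteration failed, so the writer
	// goroutine can drain what was queued and exit.
	close(files)

	if werr := g.Wait(); err == nil {
		err = werr
	}

	return err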
}

Well, it took 53s

And then my kids came nagging me. I'm sure there's room for much improvement. Good luck, soldier.

@abrander

Curiosity got me.

(0) [    7ms] /dev/shm/linux-kernel [git origin/master f40ddce88593]$ time git archive --format=zip -o /dev/null HEAD
real	0m40.256s
user	0m39.805s
sys	0m0.448s
(0) [40263ms] /dev/shm/linux-kernel [git origin/master f40ddce88593]$ 

git uses 40 seconds to do the same. I think 53 seconds is okay considering everything.

@lmas
Copy link

lmas commented Feb 22, 2021

Aha thanks for testing that too! Was guessing worst case it would be more like +10 minutes or something, so under a minute is great


github-actions bot commented Jan 5, 2024

To help us keep things tidy and focus on the active tasks, we've introduced a stale bot to spot issues/PRs that haven't had any activity in a while.

This particular issue hasn't had any updates or activity in the past 90 days, so it's been labeled as 'stale'. If it remains inactive for the next 30 days, it'll be automatically closed.

We understand everyone's busy, but if this issue is still important to you, please feel free to add a comment or make an update to keep it active.

Thanks for your understanding and cooperation!

@github-actions github-actions bot added the stale label Jan 5, 2024
@github-actions github-actions bot closed this as not planned Feb 10, 2024

3 participants