Git pull failing with "git-remote-keybase error: (1) packfile not found" #11366

Closed
telotortium opened this issue Apr 11, 2018 · 19 comments
@telotortium

This has been happening on one of my private git repos since around the beginning of this week (only on macOS 10.13.3):

$ git pull
Initializing Keybase... done.
Syncing with Keybase... done.
git-remote-keybase error: (1) packfile not found

The first time this happened, I upgraded Keybase to Version 1.0.47-20180410052738+f705a9510f. However, I'm continuing to see this happen. I saw #11358 and tried to run keybase git delete $REPO, since I have the repo cloned locally, but that command hangs.

@strib
Contributor

strib commented Apr 11, 2018

@telotortium please run keybase log send and we can take a look. Having some downtime right now that might be affecting your ability to delete the repo.

@telotortium
Author

Log ID: d0b6431d457ef26c9c6b771c

@strib strib added the acked label Apr 11, 2018
@strib
Contributor

strib commented Apr 11, 2018

It's dying looking for pack-0000000000000000000000000000000000000000.idx, which seems like a problem to me, since that's not a likely packfile name (usually it's named by the hash).

@telotortium how did you create and initialize this repo? Did you push --all an existing repo, or have you built it up commit-by-commit?

(Our downtime is over now, in case you want to retry deleting and recreating the repo.)

@telotortium
Author

telotortium commented Apr 11, 2018

I started a new repo on my workstation, created the repo in Keybase, manually set the remote in my local repo to point to Keybase, and then ran git push to push changes to it. Usually I use git push --force-with-lease. I have only the master branch, so using git push --all would make no difference.

@strib
Contributor

strib commented Apr 11, 2018

I'm just wondering if git-remote-keybase initialized the repo on the first push from a repo with a long history already, in which case the bug might have already existed in that initial repo (unrelated to KBFS). But if you started with an empty repo and built it up commit by commit, this isn't likely.

Are you willing to share the repo with me for debugging purposes? If so, and it's not too big, please do this:

cp -r /keybase/private/telotortium/.kbfs_git/<repo-name> /keybase/private/telotortium,strib/

(substituting your repo name for <repo-name>, I don't want to expose it publicly here.) Then I can dig into it locally.

@telotortium
Author

I've shared the repo with you.

@strib
Contributor

strib commented Apr 11, 2018

Thanks. I played around with it and am not having any issues cloning, pulling or pushing. Hrm. Are you still getting the packfile failure on a git pull? If so, in your local repo where you're doing the pull from, what's the HEAD commit ID?

@telotortium
Author

Locally the head commit is ec55898. On KBFS HEAD is at be300d2 and has the local HEAD as a direct ancestor. Should I just try to clone the repo elsewhere locally?

@strib
Contributor

strib commented Apr 11, 2018

Yeah a full clone from what is in KBFS worked for me, so you might want to try that. But that's not very satisfying, I'll try to repro your issue on my own too.

@telotortium
Author

telotortium commented Apr 13, 2018

I performed a full clone 2 days ago. Now I'm seeing the issue again. I know that I recently updated the repo from a Linux workstation, so maybe that has something to do with it. The version on the workstation is 1.0.45-20180313160419.f0de728311 - I'll try upgrading to the most recent version (currently 1.0.47-20180403160041.242bcf96eb).

@strib
Contributor

strib commented Apr 13, 2018

Hrm, odd. I don't think we've done any major git-related changes since that last release, but yeah it's always good to try upgrading.

I was not able to reproduce the pull issue with the repo you shared with me, even when making sure my local checkout only included up to commit ec55898.

It might be worth doing a log send from your Linux workstation, so we can check if any obvious errors happened during the push. But right now I'm still stumped.

@telotortium
Author

Log ID from Linux: 446816aef643c4996ebee11c. However, I restarted Keybase since the push that seems to have caused the problems on macOS (due to upgrading on Linux), so I'm not sure how useful it is. Also, I ran git push from Linux just now, but I'm still seeing the same issues on macOS.

@telotortium
Author

Another thought: pack-0000000000000000000000000000000000000000.idx looks a lot like a filename that would result if the code had forgotten to set a variable or field that should hold a Git object ref, because Go zero-initializes all fields and variables.
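To make the zero-value hypothesis concrete, here is a minimal Go sketch. The `Hash` type and `packIndexName` helper are illustrative assumptions, not code taken from go-git or Keybase; the point is only that an unassigned 20-byte array formats as forty zeros.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// Hash mirrors the shape of a SHA-1 object ID: 20 raw bytes.
// (Illustrative type for this sketch, not copied from go-git.)
type Hash [20]byte

// packIndexName builds the index filename for a pack with the given hash.
// (Hypothetical helper.)
func packIndexName(h Hash) string {
	return fmt.Sprintf("pack-%s.idx", hex.EncodeToString(h[:]))
}

func main() {
	var unset Hash // never assigned, so Go sets every byte to 0
	fmt.Println(packIndexName(unset))
}
```

Any code path that builds a filename from a hash it failed to populate would produce a name of this shape.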

@strib
Contributor

strib commented Apr 13, 2018

Oh, I might see the problem. It looks like, at the KBFS level, you once had two clients (or maybe two simultaneous git commands on the same client) updating the repo at the same time. This led KBFS to do conflict resolution, which led to special "conflict" files in your repo/objects/pack directory. It seems like the git library we're using isn't correctly ignoring these files, and is instead trying to treat them like regular files, leading to the zero hash.

If I'm right, this should clean it up for you (where $REPONAME is set to your repo's name):

rm /keybase/private/telotortium/.kbfs_git/$REPONAME/objects/pack/*conflicted*

Can you give that a try and see if it helps? (Though I'm still not sure why I wasn't able to repro it myself.)

@telotortium
Author

That works! It's quite possible that two clients on different computers could have tried to access the repo at the same time (I'd think less likely than 2 git commands on the same computer, because Git normally locks the repo while performing operations, doesn't it?). I wonder what could be done to prevent this in the future.

@strib
Contributor

strib commented Apr 13, 2018

Great! I have a fix in our code so this won't happen again, I'll put up a PR soon.

And no, git doesn't prevent you from running git push (or whatever) command more than once on the same client. The locking happens on the remote side, but the way Keybase git works is that the lock only protects a very small part of the operation (for performance reasons). We let the clients upload multiple copies of the same packfile if they try to do so.

Thanks for your help catching this!

strib added a commit to keybase/go-git that referenced this issue Apr 13, 2018
For both packfiles and object files.

Issue: keybase/client#11366
strib added a commit to keybase/kbfs that referenced this issue Apr 13, 2018
If conflict files ended up in `objects` or `objects/pack`, go-git
would still try to treat their full names as hashes, and end up trying
to access a file named by the 0 hash, which would fail.  Instead, just
skip files like that.

Issue: keybase/client#11366
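
The "skip files like that" idea can be sketched as follows. This is not the actual go-git patch: `packHash` is a hypothetical helper, and the conflict filename in `main` is an invented example, since the thread only shows that such names contain "conflicted".

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// packHash extracts the hash from a name like "pack-<40 hex chars>.idx".
// It reports ok=false for anything else, so the caller can skip the file
// instead of falling back to a zero hash.
func packHash(name string) (string, bool) {
	if !strings.HasPrefix(name, "pack-") || !strings.HasSuffix(name, ".idx") {
		return "", false
	}
	h := strings.TrimSuffix(strings.TrimPrefix(name, "pack-"), ".idx")
	if len(h) != 40 {
		return "", false
	}
	if _, err := hex.DecodeString(h); err != nil {
		return "", false
	}
	return h, true
}

func main() {
	names := []string{
		"pack-0123456789abcdef0123456789abcdef01234567.idx",
		// Hypothetical conflict-file name, for illustration only.
		"pack-0123456789abcdef0123456789abcdef01234567.conflicted-copy.idx",
	}
	for _, n := range names {
		if h, ok := packHash(n); ok {
			fmt.Println("use ", h)
		} else {
			fmt.Println("skip", n)
		}
	}
}
```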
strib added a commit to keybase/go-git that referenced this issue Apr 13, 2018
For both packfiles and object files.

Issue: keybase/client#11366
Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>
strib added a commit to keybase/kbfs that referenced this issue Apr 16, 2018
If conflict files ended up in `objects` or `objects/pack`, go-git
would still try to treat their full names as hashes, and end up trying
to access a file named by the 0 hash, which would fail.  Instead, just
skip files like that.

Issue: keybase/client#11366
@strib
Contributor

strib commented Apr 16, 2018

Merged to master, shouldn't happen again after the next release.

@strib strib closed this as completed Apr 16, 2018
Snehal1112 added a commit to Snehal1112/go-git that referenced this issue Dec 1, 2018
* plumbing: format: pktline, Accept oversized pkt-lines up to 65524 bytes

The canonical Git client successfully decodes sideband packets up to
65524 bytes in length (4-byte header + 65520-byte payload). The Git
protocol documentation was updated in August 2016 to reduce the maximum
payload size to 65516 bytes, however old implementations still exist in
the wild emitting 65520-byte payloads.

As there is no technical difficulty with accepting (not emitting) larger
payload sizes, this change adjusts the limit check to allow successful
decoding of packets up to 65524 bytes. This change increases
compatibility with the current canonical Git implementation.

Doc changes from August 2016:
  git/git@7841c48#diff-52695c8fe91b78b70cea44562ae28297L67

Current packet buffer size is still LARGE_PACKET_MAX (+1 null):
  https://github.com/git/git/blob/468165c1d8a442994a825f3684528361727cd8c0/sideband.c#L24
  https://github.com/git/git/blob/468165c1d8a442994a825f3684528361727cd8c0/sideband.c#L36

LARGE_PACKET_MAX definition:
  https://github.com/git/git/blob/468165c1d8a442994a825f3684528361727cd8c0/pkt-line.h#L100

Signed-off-by: Joseph Vusich <jvusich@amazon.com>
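
The size arithmetic in the pkt-line entry above can be checked with a small Go sketch. The constant and function names are assumptions made for illustration, and the handling of lengths 1 to 3 as invalid is a simplification; this is not go-git's pktline code.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

const (
	headerLen     = 4                         // four hex digits giving the total packet length
	maxPayloadLen = 65520                     // largest payload accepted on decode
	maxPacketLen  = headerLen + maxPayloadLen // 65524 bytes in total
)

var errTooLong = errors.New("pkt-line too long")

// payloadLen parses the 4-character hex length header and returns the size
// of the payload that follows it. The header counts its own 4 bytes.
func payloadLen(header string) (int, error) {
	if len(header) != headerLen {
		return 0, errors.New("pkt-line header must be 4 hex digits")
	}
	n, err := strconv.ParseUint(header, 16, 16)
	if err != nil {
		return 0, err
	}
	total := int(n)
	switch {
	case total == 0:
		return 0, nil // "0000" is the flush packet and carries no payload
	case total < headerLen:
		return 0, errors.New("invalid pkt-line length") // simplification in this sketch
	case total > maxPacketLen:
		return 0, errTooLong
	}
	return total - headerLen, nil
}

func main() {
	for _, h := range []string{"fff4", "fff5"} {
		n, err := payloadLen(h)
		fmt.Println(h, n, err)
	}
}
```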

* add PlainOpen variant to find .git in parent dirs

This is the git tool's behavior that people are used to; if one runs a
git command in a repository's subdirectory, git still works.

Fixes src-d#765.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
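
The parent-directory search described above can be sketched with the standard library alone. `findDotGit` and its `exists` parameter are hypothetical names chosen for this example and do not reflect how go-git implements the lookup.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findDotGit walks from start toward the filesystem root and returns the
// first directory for which exists(dir/.git) is true. The exists parameter
// is a hypothetical hook that keeps the walk testable without touching disk.
func findDotGit(start string, exists func(string) bool) (string, bool) {
	dir := filepath.Clean(start)
	for {
		if exists(filepath.Join(dir, ".git")) {
			return dir, true
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the root (or "." for a relative path)
			return "", false
		}
		dir = parent
	}
}

// onDisk reports whether path exists on the real filesystem.
func onDisk(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	wd, err := os.Getwd()
	if err != nil {
		fmt.Println("cannot determine working directory:", err)
		return
	}
	if root, ok := findDotGit(wd, onDisk); ok {
		fmt.Println("repository root:", root)
	} else {
		fmt.Println("not inside a repository")
	}
}
```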

* use bsd superset for conditional compilation

Signed-off-by: wardn <wardn@users.noreply.github.com>

* config: adds branches to config for tracking branches against remotes, updates clone to track when cloning a branch. Fixes src-d#313

Signed-off-by: Jeremy Chambers <jeremy@thehipbot.com>

* dotgit: ignore filenames that don't match a hash

For both packfiles and object files.

Issue: keybase/client#11366
Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* storage: dotgit, init fixtures in benchmark. Fixes src-d#770

fixtures is not initialized in BenchmarkRefMultipleTimes and caused
panic.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git: remote, Add shallow commits instead of substituting. Fixes src-d#412

updateShallow substituted the previous shallow list with the one
returned by the UploadPackResponse. If the repository had previous
shallow commits these are deleted from the list.

This change adds the new shallow hashes to the old ones.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* dotgit: add test for bad file in pack directory

Suggested by mcuadros.

Issue: src-d#807
Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* Resolve full commit sha to plumbing hash

Signed-off-by: antham <hamonanth@gmail.com>

* storage: filesystem, close shallow file when read

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* git: worktree, Skip special git directory. Fixes src-d#814

Signed-off-by: kuba-- <kuba@sourced.tech>

* travis: dropping 1.8.x support due to golang.org/x/crypto/ssh requirement

* Use remote name in fetch while clone

Fixes src-d#827

Signed-off-by: Dustin Frisch <fooker@lab.sh>

* Worktree: Provide ability to add excludes  (src-d#825)

Worktree: Provide ability to add excludes

* Teach ResolveRevision how to look up annotated tags

Signed-off-by: Mike Lundy <mike@fluffypenguin.org>

* git: remote, Do not iterate all references on update.

The current code iterates all the references in the remote to check if
they match the refspec. This is OK when the refspec is a wildcard but
is a waste of time when they are not.

A hash with references is generated for fast access before starting the
update and used only when the refspec is not a wildcard.

In a repository with 7800 references this meant 7800 * 7800 checks. With
the current code it took 8m30s to update the references. With the new
code it takes less than 0.5s.

References are already extensively tested in remote_test.go.

Signed-off-by: Javi Fontan <jfontan@gmail.com>
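
A minimal sketch of the lookup-table idea from the entry above: index the remote references by name once, then resolve each non-wildcard refspec with a single map lookup instead of a scan over all references. The `Ref` type and both function names are simplified stand-ins, not go-git's types.

```go
package main

import "fmt"

// Ref is a simplified stand-in for a Git reference.
type Ref struct {
	Name string
	Hash string
}

// indexRefs builds a name-keyed map once, so each exact-name refspec costs
// one lookup rather than a pass over every reference.
func indexRefs(refs []Ref) map[string]Ref {
	byName := make(map[string]Ref, len(refs))
	for _, r := range refs {
		byName[r.Name] = r
	}
	return byName
}

// matchExact resolves non-wildcard refspec sources against the index.
func matchExact(byName map[string]Ref, wanted []string) []Ref {
	var out []Ref
	for _, name := range wanted {
		if r, ok := byName[name]; ok {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	refs := []Ref{
		{Name: "refs/heads/master", Hash: "be300d2"},
		{Name: "refs/heads/dev", Hash: "ec55898"},
	}
	idx := indexRefs(refs)
	fmt.Println(matchExact(idx, []string{"refs/heads/dev", "refs/heads/missing"}))
}
```

With n references and n exact refspecs this does roughly n lookups instead of n * n comparisons, which is the kind of saving the commit message reports.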

* idxfile: optimise allocations in readObjectNames

This makes all the required Entry allocations in one go,
instead of huge amounts of small individual allocations.

Signed-off-by: David Symonds <dsymonds@golang.org>

* packfile: improve Index memory representation to be more compact

Instead of using a map for offset indexing, use a sorted slice.
Binary searching is fast, and a slice is much more compact.
This has a negligible hit on speed, but has a significant impact on
memory usage, especially for larger repos.

benchmark                         old ns/op     new ns/op     delta
BenchmarkIndexConstruction-12     15506506      14056098      -9.35%

benchmark                         old allocs     new allocs     delta
BenchmarkIndexConstruction-12     60764          60385          -0.62%

benchmark                         old bytes     new bytes     delta
BenchmarkIndexConstruction-12     4318145       3913169       -9.38%

Signed-off-by: David Symonds <dsymonds@golang.org>
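
The trade-off described above (a sorted slice with binary search in place of a map) can be sketched like this. The `entry` and `offsetIndex` types are assumptions for illustration and are not the structures used in go-git's packfile package.

```go
package main

import (
	"fmt"
	"sort"
)

// entry pairs a packfile offset with the hash of the object stored there.
type entry struct {
	Offset uint64
	Hash   string
}

// offsetIndex is a slice of entries kept sorted by Offset.
type offsetIndex []entry

// newOffsetIndex copies the entries and sorts the copy by offset.
func newOffsetIndex(entries []entry) offsetIndex {
	idx := append(offsetIndex(nil), entries...)
	sort.Slice(idx, func(i, j int) bool { return idx[i].Offset < idx[j].Offset })
	return idx
}

// lookup binary-searches for an exact offset match.
func (idx offsetIndex) lookup(off uint64) (string, bool) {
	i := sort.Search(len(idx), func(i int) bool { return idx[i].Offset >= off })
	if i < len(idx) && idx[i].Offset == off {
		return idx[i].Hash, true
	}
	return "", false
}

func main() {
	idx := newOffsetIndex([]entry{{Offset: 300, Hash: "c"}, {Offset: 12, Hash: "a"}, {Offset: 150, Hash: "b"}})
	fmt.Println(idx.lookup(150))
	fmt.Println(idx.lookup(151))
}
```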

* config: modules, Ignore submodules with dotdot '..' path components. Fixes CVE-2018-11235

References:
 * https://blogs.msdn.microsoft.com/devops/2018/05/29/announcing-the-may-2018-git-security-vulnerability/
 * https://security-tracker.debian.org/tracker/CVE-2018-11235
 * git/git@0383bbb

Signed-off-by: Joseph Vusich <jvusich@amazon.com>
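
As a rough illustration of the check named in the entry above, the following Go sketch flags any path containing a `..` component. `hasDotDot` is a hypothetical helper; the real go-git and Git fixes may differ in which field they inspect and in how they split it.

```go
package main

import (
	"fmt"
	"strings"
)

// hasDotDot reports whether any slash- or backslash-separated component of p
// is exactly "..". Names that merely contain two dots (e.g. "a..b") pass.
func hasDotDot(p string) bool {
	parts := strings.FieldsFunc(p, func(r rune) bool { return r == '/' || r == '\\' })
	for _, part := range parts {
		if part == ".." {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"libs/foo", "../../evil", "a..b/c"} {
		fmt.Printf("%-12s ignore=%v\n", p, hasDotDot(p))
	}
}
```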

* worktree: Don't allow .gitmodules to be a symlink. Fixes CVE-2018-11235

References:
 * https://blogs.msdn.microsoft.com/devops/2018/05/29/announcing-the-may-2018-git-security-vulnerability/
 * https://security-tracker.debian.org/tracker/CVE-2018-11235
 * git/git@10ecfa7

Signed-off-by: Joseph Vusich <jvusich@amazon.com>

* dotgit: Move package outside internal.

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* Remove println

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* plumbing: object, adds tree path cache to trees. Fixes src-d#793

The cache is used in Tree.FindEntry for faster path search.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: packfile, Don't push empty objects. Fixes src-d#840

Signed-off-by: kuba-- <kuba@sourced.tech>

* storage: filesystem, make ObjectStorage constructor public

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing/transport: http, Adds token authentication support [Fixes src-d#858]

Signed-off-by: Eric Billingsley <ebilling@babrains.com>

* Fix documentation for Notes

It previously said that it returned all references that are branches, but that's not true.

Signed-off-by: Morgan Bazalgette <the@howl.moe>

* packfile: optimise NewIndexFromIdxFile for a very common case

Loading from an on-disk idxfile will usually already have the idxfile
entries in order, so check that before wasting time on sorting.

Signed-off-by: David Symonds <dsymonds@golang.org>
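
The "check before sorting" optimisation above can be sketched with the standard `sort` package. `sortIfNeeded` is a hypothetical helper operating on bare offsets rather than go-git's index entries.

```go
package main

import (
	"fmt"
	"sort"
)

// sortIfNeeded sorts offsets in place only when they are not already in
// ascending order, and reports whether a sort was performed. The sortedness
// check is a single linear pass, which is cheaper than an unconditional sort
// when the input is usually ordered.
func sortIfNeeded(offsets []uint64) bool {
	less := func(i, j int) bool { return offsets[i] < offsets[j] }
	if sort.SliceIsSorted(offsets, less) {
		return false
	}
	sort.Slice(offsets, less)
	return true
}

func main() {
	ordered := []uint64{12, 150, 300}
	shuffled := []uint64{300, 12, 150}
	fmt.Println(sortIfNeeded(ordered), ordered)
	fmt.Println(sortIfNeeded(shuffled), shuffled)
}
```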

* Remote.Fetch: error on missing remote reference

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* storage/filesystem: avoid norwfs build flag

The norwfs build flag was used to work on filesystems that support neither opening a file in read/write mode nor renaming a file (e.g. sivafs).

This had two problems:

- go-git could not be compiled to work properly both with regular filesystems and limited filesystems at the same time.
- the norwfs trick was not available on Windows.

This PR removes the norwfs build flag, as well as the windows conditional flag on the dotgit package.

For the file open mode, we use the new billy capabilities, to check at runtime if the filesystem supports opening a file in read/write mode or not.

For the renaming, we just try and fallback to alternative methods if billy.ErrNotSupported is returned.

Signed-off-by: Santiago M. Mola <santi@mola.io>

* utils: diff, skip useless rune->string conversion

According to library documentation :
https://github.com/sergi/go-diff/blob/master/diffmatchpatch/diff.go#L391

Signed-off-by: Marc Barussaud <marc.barussaud@orange.com>

* plumbing: add context to allow cancel on diff/patch computing

Signed-off-by: Marc Barussaud <marc.barussaud@orange.com>

* worktree: add test for correct tree sorting (issue src-d#881)

Signed-off-by: Mark Bartel <github@spottybenny.ca>

* worktree: sort the tree object.  Fixes src-d#881

Signed-off-by: Mark Bartel <github@spottybenny.ca>

* worktree: address PR comments: sort imports appropriately

Signed-off-by: Mark Bartel <github@spottybenny.ca>

* plumbing: object, expose ErrEntryNotFound in FindEntry. Fixes src-d#883

FindEntry will return ErrDirNotFound if the directory doesn't exist. But
it doesn't return a public error if the entry itself is missing.  This
exposes the internal error ErrEntryNotFound, so users can
programmatically check for this condition.

Signed-off-by: James Ravn <james@r-vn.org>

* plumbing/transport/internal: common, add support of Gogs for ErrRepositoryNotFound, avoiding to get an 'unknown error: '. Add some tests for existing supported services (github, gitlab, etc...) too.

Signed-off-by: Jerome Doucet <jerdct@gmail.com>

* plumbing/object: fix pgp signature encoder/decoder

The way of reading pgp signatures was searching for pgp begin line in
the header. This caused problems when this string appeared and was not
part of the signature. For example if it appears in the message as an
example or is part of the author name the decoder starts treating it as
a signature. In this state the code was not able to notice when the
header ended so it entered in an infinite loop searching for pgp end
string.

Now it uses the same method as original git. Searches for gpgsig section
in header and starts getting all lines until the next part.

In encoder the string used to add signatures was incorrect. It is now
changed to the proper "gpgsig" string instead of "pgpsig".

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/format/idxfile: add new Index and MemoryIndex

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing/packfile: add new packfile parser

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: disable lookup by offset

In one case it disables the cache and the other disables lookup when
the scanner is not seekable. Could be added back later.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: idxfile, add idxfile.Writer with Observer interface

It's still not complete:

* 64 bit offsets
* IdxChecksum

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: use Entry to hold object data

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: support offset64 generating indexes

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: preallocate memory in PatchDelta

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: fix bug searching in MemoryIndex

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: add offset/hash mapping to index

This functionality may be moved elsewhere in the future but is needed
now to fit filesystem.ObjectStorage and the new index.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: index is created only once and retrieved with Index

Index is also automatically generated when OnFooter is called.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing, storage: integrate new index

Now dotgit.PackWriter uses the new packfile.Parser and index.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: packfile, new Packfile representation

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing, packfile: delete index_test as is no longer used

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: fix two errors in idxfile and packfile decoder

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: add back IndexStorage

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: packfile, lazy object reads with DiskObjects

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing/idxfile: test FindHash and writer with 64 bit offsets

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: remove duplicated IndexStorage

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: add index generation to decoder

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* Fix wrong godoc on Tags() method.

Reword Tags() method documentation. Point to TagObjects() method to get all the tags on a repository.

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* plumbing: packfile, fix package tests

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* *: use parser to populate non writable storages and bug fixes

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* Fixed cloning of a single tag

Relates to src-d#870

Signed-off-by: Fedor Korotkov <fedor.korotkov@gmail.com>

* plumbing: packfile, allow non-seekable sources on Parser

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: packfile, add Parse benchmark

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: packfile, read object content only once

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* storage: filesystem, benchmark PackfileIter

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: packfile, rename DiskObject to FSObject

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* storage: filesystem, close Packfile after iterating objects

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* storage: filesystem, add PackfileIter benchmark reading object content

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: packfile, open and close packfile on FSObject reads

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* git: add benchmark for iterating repository objects

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: idxfile, Crc32 to CRC32 and return ok from findHashIndex

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: add buffer cache and use it in packfile parser

It uses less memory and is faster as slices don't have to be converted
from/to MemoryObject and they are indexed by offset.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: tidy up objectInfo struct

* a new hasher is created when needed
* delete unused fields
* base content is no longer kept in memory

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: do not compute sha1 for already undeltified objects

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* added hook support

Signed-off-by: noxora <ldecker@indeed.com>

trying a possible fix to the delete test

Signed-off-by: noxora <ldecker@indeed.com>

still trying to fix this test

Signed-off-by: noxora <ldecker@indeed.com>

fixes did not work, seems to be a windows env problem

Signed-off-by: noxora <ldecker@indeed.com>

* plumbing: object, Don't add new line at end of commit signature

The way that commit signatures were being written out was causing an
extra newline to be written at the end of the commit when the message
encoding was already taking care of this. Ultimately, this results in a
corrupt object, rendering the object unverifiable with the signature in
the commit.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Add ability to PGP sign commits

This adds the ability to sign commits by adding the SignKey field to
CommitOptions. If present, the commit will be signed during the
WorkTree.Commit call.

The supplied SignKey must already be decrypted by the caller.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Remove use of strings.Builder

This was added in Go 1.10 and is not supported on Go 1.9. Switched to
bytes.Buffer to ensure compatibility.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Remove old hash validation code

This will not work for a signed commit as with the GPG signature being a
part of the commit, the hash is now non-deterministic.

Verification of the commit is done through the validation of the
signature.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Add extra test for testing bad key error case

I'm hoping this helps get codecov to a tolerable delta. :)

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* dotgit: fix object delete test

Signed-off-by: Santiago M. Mola <santi@mola.io>

* object: fix panic when reading object header

When the first line of the pgp signature is an empty line or some header
is malformed it crashes as there's no data for the header element. For
example, if author name is "\n".

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* Fixed an edge case for .gitignore

Fixes src-d#923

Signed-off-by: Fedor Korotkov <fedor.korotkov@gmail.com>

* plumbing/idxfile: object iterators returns entries in offset order

In the latest change the order was changed from offset order in
packfiles to hash order. This makes reading all the objects not as
efficient as before. It also created problems when the previous order
was expected.

Also added EntriesByOffset to indexes.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git: Add tagging support

This adds a few methods:

* CreateTag, which can be used to create both lightweight and annotated
tags with a supplied TagObjectOptions struct. PGP signing is possible as
well.
* Tag, to fetch a single tag ref. As opposed to Tags or TagObjects, this
will also fetch the tag object if it exists and return it along with the
output. Lightweight tags just return the object as nil.
* DeleteTag, to delete a tag. This simply deletes the ref. The object is
left orphaned to be GCed later.

I'm not 100% sure if DeleteTag is the correct behavior - looking for
details on exactly *what* happens to a tag object if you delete the ref
and not the tag were sparse, and grokking the Git source did not really
produce much insight to the untrained eye. This may be something that
comes up in review. If deletion of the object is necessary, the
in-memory storer may require some updates to allow DeleteLooseObject to
be supported.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* plumbing: object, correct tag PGP encoding

As with the update in ec3d2a8, tag encoding needed to be corrected to
ensure extra newlines were not being added in during tag object
encoding, so that it did not corrupt the object for verification.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* plumbing: object, don't add extra newline on PGP signature

Tag encoding/decoding seems to be a lot more sensitive to requiring the
exact expected format in the object, which generally includes messages
canonicalized so that they have a newline on the end (even if they
didn't before).

As such, the message should be written with the newline (no need for an
extra), and the PGP signature right after that, which will be newline
split already, so there's no need to split it again.

All of this means it's very important for the caller to send the message
in the correct format - which I'm correcting in the next commit.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Canonicalize incoming annotated tag messages

Tag messages are highly sensitive to being in the expected format,
especially when encoding/decoding for PGP verification.

As such, we do a simple trimming of whitespace on the incoming message
and add a newline on the end, to ensure there are no surprises here.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>
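
The canonicalization step described above (trim surrounding whitespace, then end with one newline) is small enough to sketch directly. `canonicalize` is a hypothetical name; the entry does not show the function go-git actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalize trims leading and trailing whitespace from a tag message and
// appends exactly one newline, so encoding and decoding see the same bytes.
func canonicalize(msg string) string {
	return strings.TrimSpace(msg) + "\n"
}

func main() {
	fmt.Printf("%q\n", canonicalize("  release v1.0  \n\n"))
	fmt.Printf("%q\n", canonicalize("release v1.0"))
}
```

Both calls yield the same string, which is the property a signature check over the encoded tag depends on.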

* git: Replace test signing key with one with longer expiry

The old one was created with defaults, which would have caused CI
failures in 2 years.

The new one is valid for 10 years:

> gpg --list-secret-keys
/root/.gnupg/pubring.kbx
------------------------
sec   rsa4096 2018-08-22 [SC] [expires: 2028-08-19]
      93A17FF01E54328546087C8E029395402EFCCD53
uid           [ unknown] foo bar <foo@foo.foo>

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* plumbing, storage: add bases to the common cache

After clone only resolved deltas were added to the cache. This caused
slowdowns in small repositories where most objects can be held in cache.

It also makes packfiles reuse delta cache from the store. Previously it
created a new delta cache each time a packfile object was created. This
also slowed down a bit accessing objects and had an impact on memory
consumption when bases are added to the cache.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git: Discern tag target type from supplied hash

I figured there was a way to do this without having to have
TagObjectOptions supply this in - there is.

Added support for this and removed the object type from
TagObjectOptions.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Don't return tag object with Tag, adjust docs for Tag and Tags

I've mainly noticed that in using the current Tag function, that there
were lots of times that I was ignoring the ref or the object, depending
on what I needed. This was evident in the tests as well. As such, I
think it just makes more sense for a singular tag fetcher to return just
a ref, through which the caller may grab the annotation if they need it,
and if it exists.

Also, contrary to the docs on Tags, all tags have a ref, even if they
are annotated. The difference between a lightweight tag and an annotated
tag is the presence of the tag object, which the ref will point to if
the tag is annotated. As such I've adjusted the docs with an example as
to how one can get the annotation for a tag ref through the iterator.

Source: https://git-scm.com/book/en/v2/Git-Internals-Git-References,
tags section.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* storage/dotgit: search for incoming dir only once

Search for incoming object directory was done once each time objects
were accessed. This means a ReadDir of the objects path that is
expensive. Now incoming directory is searched the first time an object
is accessed and its name kept in DotGit to be reused.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/dotgit: use HasPrefix instead of Split

Also reformatted function comment and fixed some typos.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* Remove empty dirs when cleaning with Dir opt.

Signed-off-by: kuba-- <kuba@sourced.tech>

* Add Status.IsUntracked function

Signed-off-by: kuba-- <kuba@sourced.tech>

* plumbing: object: Clamp object timestamps before unix epoch to unix epoch

Signed-off-by: Taru Karttunen <taruti@taruti.net>

* config: add commentChar to core config struct

Signed-off-by: Zaq? Wiedmann <zaquestion@gmail.com>

* git: add Static option to PlainOpen

Also adds Static configuration to Storage and DotGit. This option means
that the git repository is not expected to be modified while open and
enables some optimizations.

Each time a file is accessed the storer tries to open an object file for
the requested hash. When this is done for a lot of objects it is
expensive. With Static option a list of object files is generated the
first time an object is accessed and used to check if exists instead of
using system calls.

A similar optimization is done for packfiles.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git, storer: use a common storer.Options for storer and PlainOpen

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* dotgit: fix typo in comment

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git: do not expose storage options in PlainOpen

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/storer: rename Static option to ExclusiveAccess

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: make Storage options private

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: move Options to filesystem and dotgit

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/dotgit: add ExclusiveAccess tests in dotgit

This functionality was already tested in storage/filesystem.
The coverage tool only takes into account files from the same
package of the test.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/dotgit: add KeepDescriptors option

This option keeps packfile file descriptors open after reading
objects from them. It improves performance because packfiles do not
have to be reopened each time an object is needed.

Also adds Close to EncodedObjectStorer to close all the files manually.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/storer: do not expose Close in EncodedObjectStorer interface

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: add KeepDescriptors test

Also delete Close from MockObjectStorage and memory storer.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: compare files using offset in test

Comparing files with equals uses diff, which can potentially consume
lots of RAM. Changed the comparison to use file offsets. If the
descriptor is reused, the offset is maintained.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/transport: ssh check if list of known_hosts files is empty

Signed-off-by: kuba-- <kuba@sourced.tech>

* Fix fatal corrupt patch in unified diff format

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* Expose Storage cache.

Signed-off-by: kuba-- <kuba@sourced.tech>

* git: s/fetch/returns/ on Tag function doc

This is to avoid any ambiguity with the act of "fetching" in git in
general.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Add Tag objects to the list of supported objects for walking

This is necessary to support pruning on Tag objects.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Don't touch tag objects orphaned by tag deletion

Deleting a tag ref for an annotated tag in normal git behavior does not
delete the tag object right away. This is handled by the normal GC
process.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Add some tests for annotated tag deletion

Added a couple of tests for annotated tag deletion:

* The first one is a general test and should work regardless of the
fixture used - the tag object could possibly be packed, so we do a prune
*and* a repack operation before testing to see if the object was GCed
correctly.

* The second one actually creates the tag to be deleted, so that the tag
object gets created as a loose, unpacked object. This is so we can
effectively test that pruning unpacked objects is now working 100%
correctly (this was failing before because tag objects were not
supported for walking).

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: s/TagObjectOptions/CreateTagOptions/

Just renaming the TagObjectOptions type to CreateTagOptions so that it's
consistent with the other option types.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* repository: fix test for new Storage constructor

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* *: go modules support

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* travis: drop go1.9 add go1.11

* Fix potential LRU cache size issue.

Signed-off-by: kuba-- <kuba@sourced.tech>

* Remove empty space to trigger windows build.

Signed-off-by: kuba-- <kuba@sourced.tech>

* storage/filesystem: keep packs open in PackfileIter

PackfileIter was not taking into account the option KeepDescriptors
and was always closing the file. This caused "file already closed"
errors when iterating packfiles with KeepDescriptors active.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: add more doc to NewPackfileIter

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* all: remove extra 's' in "mismatch"

Signed-off-by: Jongmin Kim <jmkim@pukyong.ac.kr>

* test: improve test for urlencoded user:pass

Signed-off-by: Santiago M. Mola <santi@mola.io>

* use time.IsZero in Prune

Signed-off-by: u5surf <u5.horie@gmail.com>

* Add test for Windows local paths.

Signed-off-by: Filip Navara <navara@emclient.com>

* git: Fix Status.IsClean() documentation

The documentation of the IsClean Method contained a negation, so it was
describing the opposite of its actual behavior.

Fixes src-d#838

Signed-off-by: David Url <david@urld.io>

* Plumbing: object, Add support for Log with filenames. Fixes src-d#826 (src-d#979)

plumbing: object, Add support for Log with filenames. Fixes src-d#826

* object: get object size without reading whole object

Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* tree: add a Size() method for getting plaintext size

Without reading the entire object into memory.

Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* filesystem: add a new test for EncodedObjectSize

Suggested by taruti.

Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* repository: allow open non-bare repositories as bare

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* use remote name in fetch while clone, test

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* references: sort: compare author timestamps when commit timestamps are equal, test

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* teach ResolveRevision how to look up annotated tags, test

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* teach ResolveRevision how to look up annotated tags, test

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* packfile: add comment on GetSizeByOffset

Suggested by mcuadros.

Issue: src-d#982
Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* blame: fix edge case with missing \n in content length causing mismatched length error

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* repository: improve CheckoutOption.Hash doc

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* remote: use reference deltas on push when the remote server does not
support offset deltas

Signed-off-by: Benjamin Ash <bash@intelerad.com>

* Fixed a typo. (src-d#989)

README: Fixed a typo.

* Enables building on openbsd, dragonfly bsd and solaris

Signed-off-by: Yuce Tekol <yucetekol@gmail.com>

* plumbing/format/packfile: Fix broken "thin" packfile support. Fixes src-d#991

Signed-off-by: Javier Peletier <jm@epiclabs.io>

* plumbing: ReferenceName constructors

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* examples  & documentation: PlainClone with Basic Authentication (Password & Access Token) (src-d#990)

examples: PlainClone with Basic Authentication (Password & Access Token)

* add StackOverflow to support channels

Since we are now redirecting users to StackOverflow for support
questions, it makes sense to add it to the official support channels.

Signed-off-by: Santiago M. Mola <santi@mola.io>

* plumbing: transport/http, Add missing host/port on redirect. Fixes src-d#820

Signed-off-by: Dave Henderson <dhenderson@gmail.com>

* Fix spelling and grammar in docs and example

Signed-off-by: Lukasz Kokot <lukasz@kumojin.com>

* update gcfg dependency to v1.4.0

Signed-off-by: Dave Henderson <dhenderson@gmail.com>

* repository: added cleanup for the PlainCloneContext()

Signed-off-by: Bartek Jaroszewski <jaroszewskibartek@gmail.com>

* improve cleanup implementation, add more tests

Signed-off-by: Santiago M. Mola <santi@mola.io>

* Update LICENSE

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* http: improve TokenAuth documentation

Users are often confused by TokenAuth, since it might look like it
should be used with GitHub's OAuth tokens. But that is not the case.

TokenAuth implements HTTP bearer authentication. Most git servers will
use HTTP basic authentication (user+passwords) even for OAuth tokens.

Signed-off-by: Santiago M. Mola <santi@mola.io>

* plumbing: ssh, Fix flaky test TestAdvertisedReferencesNotExists. Fixes src-d#969

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* repository: Fix RefSpec for a single tag. Fixes src-d#960

Signed-off-by: Fedor Korotkov <fedor.korotkov@gmail.com>

* storage/filesystem: Added reindex method to  reindex packfiles

Signed-off-by: Javier Peletier <jm@epiclabs.io>

* plumbing/format/packfile: Added thin pack test

Signed-off-by: Javier Peletier <jm@epiclabs.io>

* Remove unused method (src-d#1022)

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* plumbing: format/index: support for EOIE extension, by default on git v2.20.0

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* repository: fix plain clone error handling regression

PR src-d#1008 introduced a regression by changing the errors returned by
PlainClone when a repository did not exist.

This change goes back to returned errors as they were in v4.7.0.

Fixes src-d#1027

Signed-off-by: Santiago M. Mola <santi@mola.io>

* plumbing: format/packfile, performance optimizations for reading large commit histories (src-d#963)

Signed-off-by: Filip Navara <navara@emclient.com>
Snehal1112 added a commit to Snehal1112/go-git that referenced this issue Dec 2, 2018
* plumbing: format: pktline, Accept oversized pkt-lines up to 65524 bytes

The canonical Git client successfully decodes sideband packets up to
65524 bytes in length (4-byte header + 65520-byte payload). The Git
protocol documentation was updated in August 2016 to reduce the maximum
payload size to 65516 bytes, however old implementations still exist in
the wild emitting 65520-byte payloads.

As there is no technical difficulty with accepting (not emitting) larger
payload sizes, this change adjusts the limit check to allow successful
decoding of packets up to 65524 bytes. This change increases
compatibility with the current canonical Git implementation.

Doc changes from August 2016:
  git/git@7841c48#diff-52695c8fe91b78b70cea44562ae28297L67

Current packet buffer size is still LARGE_PACKET_MAX (+1 null):
  https://github.com/git/git/blob/468165c1d8a442994a825f3684528361727cd8c0/sideband.c#L24
  https://github.com/git/git/blob/468165c1d8a442994a825f3684528361727cd8c0/sideband.c#L36

LARGE_PACKET_MAX definition:
  https://github.com/git/git/blob/468165c1d8a442994a825f3684528361727cd8c0/pkt-line.h#L100

Signed-off-by: Joseph Vusich <jvusich@amazon.com>
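
The length check described in this commit message can be illustrated with a small sketch. This is not go-git's actual decoder; the function name `payloadLen` and the constant names are made up for the example. It parses the four hex digits of a pkt-line header and accepts total lengths up to 65524 bytes (4-byte header plus 65520-byte payload), treating `0000` as a flush packet.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

const (
	headerLen        = 4     // a pkt-line starts with four hex digits
	maxPayloadCompat = 65520 // payload size still emitted by older implementations
)

// errInvalidLength is a hypothetical error value for this sketch.
var errInvalidLength = errors.New("invalid pkt-line length")

// payloadLen parses a pkt-line header and returns the payload length.
// The header value counts the four header bytes themselves.
func payloadLen(header string) (int, error) {
	if len(header) != headerLen {
		return 0, errInvalidLength
	}
	n, err := strconv.ParseUint(header, 16, 16)
	if err != nil {
		return 0, errInvalidLength
	}
	total := int(n)
	if total == 0 {
		return 0, nil // "0000" is a flush packet with no payload
	}
	if total < headerLen || total > maxPayloadCompat+headerLen {
		return 0, errInvalidLength
	}
	return total - headerLen, nil
}

func main() {
	for _, h := range []string{"fff4", "fff5", "0000"} {
		n, err := payloadLen(h)
		fmt.Println(h, n, err)
	}
}
```

The sketch only relaxes what is accepted when decoding; an encoder would presumably keep emitting payloads within the smaller documented limit.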

* add PlainOpen variant to find .git in parent dirs

This is the git tool's behavior that people are used to; if one runs a
git command in a repository's subdirectory, git still works.

Fixes src-d#765.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>

* use bsd superset for conditional compilation

Signed-off-by: wardn <wardn@users.noreply.github.com>

* config: adds branches to config for tracking branches against remotes, updates clone to track when cloning a branch. Fixes src-d#313

Signed-off-by: Jeremy Chambers <jeremy@thehipbot.com>

* dotgit: ignore filenames that don't match a hash

For both packfiles and object files.

Issue: keybase/client#11366
Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>
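
As a rough illustration of "ignore filenames that don't match a hash", the sketch below accepts only names of the form `pack-<40 hex digits>.pack`. The helper names `isHex40` and `packHash` are hypothetical and this is not the code from the commit, just one way such a filter might look.

```go
package main

import (
	"fmt"
	"strings"
)

// isHex40 reports whether s is exactly 40 lowercase hex digits,
// the textual form of a SHA-1 hash.
func isHex40(s string) bool {
	if len(s) != 40 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		isDigit := c >= '0' && c <= '9'
		isLowerHex := c >= 'a' && c <= 'f'
		if !isDigit && !isLowerHex {
			return false
		}
	}
	return true
}

// packHash extracts the hash from a packfile name. It returns false
// for any name that is not "pack-<hash>.pack", so stray files in the
// pack directory are skipped instead of being treated as packfiles.
func packHash(name string) (string, bool) {
	if !strings.HasPrefix(name, "pack-") || !strings.HasSuffix(name, ".pack") {
		return "", false
	}
	h := strings.TrimSuffix(strings.TrimPrefix(name, "pack-"), ".pack")
	if !isHex40(h) {
		return "", false
	}
	return h, true
}

func main() {
	names := []string{
		"pack-0123456789abcdef0123456789abcdef01234567.pack",
		"pack-tmp_1234.pack",
		".DS_Store",
	}
	for _, n := range names {
		h, ok := packHash(n)
		fmt.Printf("%q -> %q %v\n", n, h, ok)
	}
}
```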

* storage: dotgit, init fixtures in benchmark. Fixes src-d#770

fixtures is not initialized in BenchmarkRefMultipleTimes and caused
panic.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git: remote, Add shallow commits instead of substituting. Fixes src-d#412

updateShallow substituted the previous shallow list with the one
returned by the UploadPackResponse. If the repository had previous
shallow commits these are deleted from the list.

This change adds the new shallow hashes to the old ones.

Signed-off-by: Javi Fontan <jfontan@gmail.com>
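
The difference between substituting and adding can be sketched as a simple union of hash lists. The function name `mergeShallow` and the use of plain strings for hashes are assumptions made for the example, not go-git's types.

```go
package main

import "fmt"

// mergeShallow returns the previous shallow hashes followed by any
// incoming hashes that were not already present, preserving order.
func mergeShallow(previous, incoming []string) []string {
	seen := make(map[string]bool, len(previous)+len(incoming))
	merged := make([]string, 0, len(previous)+len(incoming))
	for _, list := range [][]string{previous, incoming} {
		for _, h := range list {
			if !seen[h] {
				seen[h] = true
				merged = append(merged, h)
			}
		}
	}
	return merged
}

func main() {
	fmt.Println(mergeShallow([]string{"aaa", "bbb"}, []string{"bbb", "ccc"}))
}
```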

* dotgit: add test for bad file in pack directory

Suggested by mcuadros.

Issue: src-d#807
Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* Resolve full commit sha to plumbing hash

Signed-off-by: antham <hamonanth@gmail.com>

* storage: filesystem, close shallow file when read

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* git: worktree, Skip special git directory. Fixes src-d#814

Signed-off-by: kuba-- <kuba@sourced.tech>

* travis: dropping 1.8.x support due to golang.org/x/crypto/ssh requirement

* Use remote name in fetch while clone

Fixes src-d#827

Signed-off-by: Dustin Frisch <fooker@lab.sh>

* Worktree: Provide ability to add excludes  (src-d#825)

Worktree: Provide ability to add excludes

* Teach ResolveRevision how to look up annotated tags

Signed-off-by: Mike Lundy <mike@fluffypenguin.org>

* git: remote, Do not iterate all references on update.

The current code iterates all the references in the remote to check if
they match the refspec. This is OK when the refspec is a wildcard but
is a waste of time when they are not.

A hash with references is generated for fast access before starting the
update and used only when the refspec is not a wildcard.

In a repository with 7800 references this meant 7800 * 7800 checks. With
the current code it took 8m30s to update the references. With the new
code it takes less than 0.5s.

References are already extensively tested in remote_test.go.

Signed-off-by: Javi Fontan <jfontan@gmail.com>
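
The idea of using a hash of references for non-wildcard refspecs can be sketched as follows. This is a simplified model, not the code in remote.go: `matchRefs` is a hypothetical name, references are plain strings, and only a trailing `*` wildcard is handled.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// matchRefs returns the reference names selected by pattern.
// An exact pattern is a single map lookup; only a wildcard pattern
// needs to walk every reference.
func matchRefs(refs map[string]string, pattern string) []string {
	if !strings.Contains(pattern, "*") {
		if _, ok := refs[pattern]; ok {
			return []string{pattern}
		}
		return nil
	}
	// Simplification: treat the pattern as "<prefix>*".
	prefix := strings.TrimSuffix(pattern, "*")
	var out []string
	for name := range refs {
		if strings.HasPrefix(name, prefix) {
			out = append(out, name)
		}
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out
}

func main() {
	refs := map[string]string{
		"refs/heads/master": "1111",
		"refs/heads/dev":    "2222",
		"refs/tags/v1.0":    "3333",
	}
	fmt.Println(matchRefs(refs, "refs/heads/master"))
	fmt.Println(matchRefs(refs, "refs/heads/*"))
}
```

With N exact refspecs against N references, the lookup path does N map accesses instead of N * N comparisons, which matches the shape of the speedup reported above.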

* idxfile: optimise allocations in readObjectNames

This makes all the required Entry allocations in one go,
instead of huge amounts of small individual allocations.

Signed-off-by: David Symonds <dsymonds@golang.org>

* packfile: improve Index memory representation to be more compact

Instead of using a map for offset indexing, use a sorted slice.
Binary searching is fast, and a slice is much more compact.
This has a negligible hit on speed, but has a significant impact on
memory usage, especially for larger repos.

benchmark                         old ns/op     new ns/op     delta
BenchmarkIndexConstruction-12     15506506      14056098      -9.35%

benchmark                         old allocs     new allocs     delta
BenchmarkIndexConstruction-12     60764          60385          -0.62%

benchmark                         old bytes     new bytes     delta
BenchmarkIndexConstruction-12     4318145       3913169       -9.38%

Signed-off-by: David Symonds <dsymonds@golang.org>
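
A minimal sketch of offset indexing with a sorted slice and binary search, as opposed to a map. The types `entry` and `offsetIndex` are invented for the example and hashes are plain strings; the real Index type is more involved.

```go
package main

import (
	"fmt"
	"sort"
)

// entry pairs a packfile offset with an object hash.
type entry struct {
	offset int64
	hash   string
}

// offsetIndex is a slice of entries kept sorted by offset.
type offsetIndex []entry

// newOffsetIndex copies the entries and sorts them once.
func newOffsetIndex(entries []entry) offsetIndex {
	idx := append(offsetIndex(nil), entries...)
	sort.Slice(idx, func(i, j int) bool { return idx[i].offset < idx[j].offset })
	return idx
}

// find looks up an offset with binary search in O(log n).
func (idx offsetIndex) find(offset int64) (string, bool) {
	i := sort.Search(len(idx), func(i int) bool { return idx[i].offset >= offset })
	if i < len(idx) && idx[i].offset == offset {
		return idx[i].hash, true
	}
	return "", false
}

func main() {
	idx := newOffsetIndex([]entry{{300, "c"}, {12, "a"}, {150, "b"}})
	fmt.Println(idx.find(150))
	fmt.Println(idx.find(13))
}
```

A slice stores the entries contiguously with no per-bucket overhead, which is where the memory saving over a map would come from.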

* config: modules, Ignore submodules with dotdot '..' path components. Fixes CVE-2018-11235

References:
 * https://blogs.msdn.microsoft.com/devops/2018/05/29/announcing-the-may-2018-git-security-vulnerability/
 * https://security-tracker.debian.org/tracker/CVE-2018-11235
 * git/git@0383bbb

Signed-off-by: Joseph Vusich <jvusich@amazon.com>
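
A check for dotdot path components might look like the sketch below. The helper name `hasDotDot` is hypothetical, and treating both `/` and `\` as separators is an assumption of this example rather than something stated in the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// hasDotDot reports whether any component of p is exactly "..".
// Names that merely contain two dots, such as "lib..a", are allowed.
func hasDotDot(p string) bool {
	isSep := func(r rune) bool { return r == '/' || r == '\\' }
	for _, part := range strings.FieldsFunc(p, isSep) {
		if part == ".." {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"../../modules/evil", "vendor/lib..a", `a\..\b`} {
		fmt.Printf("%q -> %v\n", p, hasDotDot(p))
	}
}
```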

* worktree: Don't allow .gitmodules to be a symlink. Fixes CVE-2018-11235

References:
 * https://blogs.msdn.microsoft.com/devops/2018/05/29/announcing-the-may-2018-git-security-vulnerability/
 * https://security-tracker.debian.org/tracker/CVE-2018-11235
 * git/git@10ecfa7

Signed-off-by: Joseph Vusich <jvusich@amazon.com>

* dotgit: Move package outside internal.

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* Remove println

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* plumbing: object, adds tree path cache to trees. Fixes src-d#793

The cache is used in Tree.FindEntry for faster path search.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: packfile, Don't push empty objects. Fixes src-d#840

Signed-off-by: kuba-- <kuba@sourced.tech>

* storage: filesystem, make ObjectStorage constructor public

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing/transport: http, Adds token authentication support [Fixes src-d#858]

Signed-off-by: Eric Billingsley <ebilling@babrains.com>

* Fix documentation for Notes

It previously said that it returned all references that are branches, but that's not true.

Signed-off-by: Morgan Bazalgette <the@howl.moe>

* packfile: optimise NewIndexFromIdxFile for a very common case

Loading from an on-disk idxfile will usually already have the idxfile
entries in order, so check that before wasting time on sorting.

Signed-off-by: David Symonds <dsymonds@golang.org>
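
The "check before sorting" pattern can be shown with the standard `sort` package. This is a generic sketch over a slice of offsets; `sortIfNeeded` is a made-up name and the real code works on index entries rather than bare integers.

```go
package main

import (
	"fmt"
	"sort"
)

// sortIfNeeded sorts values in place only when they are not already
// in order, and reports whether a sort was performed. The check is a
// single linear pass, which is cheaper than sorting sorted input.
func sortIfNeeded(values []uint64) bool {
	less := func(i, j int) bool { return values[i] < values[j] }
	if sort.SliceIsSorted(values, less) {
		return false
	}
	sort.Slice(values, less)
	return true
}

func main() {
	ordered := []uint64{1, 2, 3}
	shuffled := []uint64{3, 1, 2}
	fmt.Println(sortIfNeeded(ordered), ordered)
	fmt.Println(sortIfNeeded(shuffled), shuffled)
}
```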

* Remote.Fetch: error on missing remote reference

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* storage/filesystem: avoid norwfs build flag

The norwfs build flag was used to work on filesystems that support neither opening a file in read/write mode nor renaming a file (e.g. sivafs).

This had two problems:

- go-git could not be compiled to work properly both with regular filesystems and limited filesystems at the same time.
- the norwfs trick was not available on Windows.

This PR removes the norwfs build flag, as well as the windows conditional flag on the dotgit package.

For the file open mode, we use the new billy capabilities, to check at runtime if the filesystem supports opening a file in read/write mode or not.

For the renaming, we just try and fallback to alternative methods if billy.ErrNotSupported is returned.

Signed-off-by: Santiago M. Mola <santi@mola.io>

* utils: diff, skip useless rune->string conversion

According to library documentation :
https://github.com/sergi/go-diff/blob/master/diffmatchpatch/diff.go#L391

Signed-off-by: Marc Barussaud <marc.barussaud@orange.com>

* plumbing: add context to allow cancel on diff/patch computing

Signed-off-by: Marc Barussaud <marc.barussaud@orange.com>

* worktree: add test for correct tree sorting (issue src-d#881)

Signed-off-by: Mark Bartel <github@spottybenny.ca>

* worktree: sort the tree object.  Fixes src-d#881

Signed-off-by: Mark Bartel <github@spottybenny.ca>

* worktree: address PR comments: sort imports appropriately

Signed-off-by: Mark Bartel <github@spottybenny.ca>

* plumbing: object, expose ErrEntryNotFound in FindEntry. Fixes src-d#883

FindEntry will return ErrDirNotFound if the directory doesn't exist. But
it doesn't return a public error if the entry itself is missing.  This
exposes the internal error ErrEntryNotFound, so users can
programmatically check for this condition.

Signed-off-by: James Ravn <james@r-vn.org>

* plumbing/transport/internal: common, add Gogs support for ErrRepositoryNotFound, to avoid getting an 'unknown error: '. Also add some tests for already supported services (github, gitlab, etc.).

Signed-off-by: Jerome Doucet <jerdct@gmail.com>

* plumbing/object: fix pgp signature encoder/decoder

The way of reading pgp signatures was searching for pgp begin line in
the header. This caused problems when this string appeared and was not
part of the signature. For example if it appears in the message as an
example or is part of the author name the decoder starts treating it as
a signature. In this state the code was not able to notice when the
header ended, so it entered an infinite loop searching for the pgp end
string.

Now it uses the same method as original git: it searches for the gpgsig
section in the header and reads all lines until the next part.

In encoder the string used to add signatures was incorrect. It is now
changed to the proper "gpgsig" string instead of "pgpsig".

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/format/idxfile: add new Index and MemoryIndex

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing/packfile: add new packfile parser

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: disable lookup by offset

In one case it disables the cache and the other disables lookup when
the scanner is not seekable. Could be added back later.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: idxfile, add idxfile.Writer with Observer interface

It's still not complete:

* 64 bit offsets
* IdxChecksum

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: use Entry to hold object data

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: support offset64 generating indexes

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: preallocate memory in PatchDelta

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: fix bug searching in MemoryIndex

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: add offset/hash mapping to index

This functionality may be moved elsewhere in the future but is needed
now to fit filesystem.ObjectStorage and the new index.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/idxfile: index is created only once and retrieved with Index

Index is also automatically generated when OnFooter is called.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing, storage: integrate new index

Now dotgit.PackWriter uses the new packfile.Parser and index.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: packfile, new Packfile representation

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing, packfile: delete index_test as is no longer used

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: fix two errors in idxfile and packfile decoder

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: add back IndexStorage

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing: packfile, lazy object reads with DiskObjects

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing/idxfile: test FindHash and writer with 64 bit offsets

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: remove duplicated IndexStorage

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: add index generation to decoder

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* Fix wrong godoc on Tags() method.

Reword Tags() method documentation. Point to TagObjects() method to get all the tags on a repository.

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* plumbing: packfile, fix package tests

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* *: use parser to populate non writable storages and bug fixes

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* Fixed cloning of a single tag

Relates to src-d#870

Signed-off-by: Fedor Korotkov <fedor.korotkov@gmail.com>

* plumbing: packfile, allow non-seekable sources on Parser

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: packfile, add Parse benchmark

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: packfile, read object content only once

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* storage: filesystem, benchmark PackfileIter

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: packfile, rename DiskObject to FSObject

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* storage: filesystem, close Packfile after iterating objects

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* storage: filesystem, add PackfileIter benchmark reading object content

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: packfile, open and close packfile on FSObject reads

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* git: add benchmark for iterating repository objects

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: idxfile, Crc32 to CRC32 and return ok from findHashIndex

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>

* plumbing: add buffer cache and use it in packfile parser

It uses less memory and is faster as slices don't have to be converted
from/to MemoryObject and they are indexed by offset.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: tidy up objectInfo struct

* a new hasher is created when needed
* delete unused fields
* base content is no longer kept in memory

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/packfile: do not compute sha1 for already undeltified objects

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* added hook support

Signed-off-by: noxora <ldecker@indeed.com>

trying a possible fix to the delete test

Signed-off-by: noxora <ldecker@indeed.com>

still trying to fix this test

Signed-off-by: noxora <ldecker@indeed.com>

fixes did not work, seems to be a windows env problem

Signed-off-by: noxora <ldecker@indeed.com>

* plumbing: object, Don't add new line at end of commit signature

The way that commit signatures were being written out was causing an
extra newline to be written at the end of the commit when the message
encoding was already taking care of this. Ultimately, this results in a
corrupt object, rendering the object unverifiable with the signature in
the commit.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Add ability to PGP sign commits

This adds the ability to sign commits by adding the SignKey field to
CommitOptions. If present, the commit will be signed during the
WorkTree.Commit call.

The supplied SignKey must already be decrypted by the caller.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Remove use of strings.Builder

This was added in Go 1.10 and is not supported on Go 1.9. Switched to
bytes.Buffer to ensure compatibility.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Remove old hash validation code

This will not work for a signed commit as with the GPG signature being a
part of the commit, the hash is now non-deterministic.

Verification of the commit is done through the validation of the
signature.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Add extra test for testing bad key error case

I'm hoping this helps get codecov to a tolerable delta. :)

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* dotgit: fix object delete test

Signed-off-by: Santiago M. Mola <santi@mola.io>

* object: fix panic when reading object header

When the first line of the pgp signature is an empty line or some header
is malformed it crashes as there's no data for the header element. For
example, if author name is "\n".

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* Fixed an edge case for .gitignore

Fixes src-d#923

Signed-off-by: Fedor Korotkov <fedor.korotkov@gmail.com>

* plumbing/idxfile: object iterators returns entries in offset order

In the latest change the order was changed from offset order in
packfiles to hash order. This makes reading all the objects not as
efficient as before. It also created problems when the previous order
was expected.

Also added EntriesByOffset to indexes.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git: Add tagging support

This adds a few methods:

* CreateTag, which can be used to create both lightweight and annotated
tags with a supplied TagObjectOptions struct. PGP signing is possible as
well.
* Tag, to fetch a single tag ref. As opposed to Tags or TagObjects, this
will also fetch the tag object if it exists and return it along with the
output. Lightweight tags just return the object as nil.
* DeleteTag, to delete a tag. This simply deletes the ref. The object is
left orphaned to be GCed later.

I'm not 100% sure if DeleteTag is the correct behavior - details on
exactly *what* happens to a tag object if you delete the ref and not
the tag were sparse, and grokking the Git source did not really
produce much insight to the untrained eye. This may be something that
comes up in review. If deletion of the object is necessary, the
in-memory storer may require some updates to allow DeleteLooseObject to
be supported.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* plumbing: object, correct tag PGP encoding

As with the update in ec3d2a8, tag encoding needed to be corrected to
ensure extra newlines were not being added in during tag object
encoding, so that it did not corrupt the object for verification.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* plumbing: object, don't add extra newline on PGP signature

Tag encoding/decoding seems to be a lot more sensitive to requiring the
exact expected format in the object, which generally includes messages
canonicalized so that they have a newline on the end (even if they
didn't before).

As such, the message should be written with the newline (no need for an
extra), and the PGP signature right after that, which will be newline
split already, so there's no need to split it again.

All of this means it's very important for the caller to send the message
in the correct format - which I'm correcting in the next commit.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Canonicalize incoming annotated tag messages

Tag messages are highly sensitive to being in the expected format,
especially when encoding/decoding for PGP verification.

As such, we do a simple trimming of whitespace on the incoming message
and add a newline on the end, to ensure there are no surprises here.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>
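
The canonicalization described here, trimming whitespace and adding a single trailing newline, fits in one line of Go. The function name `canonicalize` is an assumption for this sketch and not necessarily what the commit uses.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalize trims leading and trailing whitespace from a tag
// message and ends it with exactly one newline.
func canonicalize(msg string) string {
	return strings.TrimSpace(msg) + "\n"
}

func main() {
	fmt.Printf("%q\n", canonicalize("  release v1.0\n\n\n"))
	fmt.Printf("%q\n", canonicalize("no trailing newline"))
}
```

Because the function is idempotent, a message that is already canonical passes through unchanged.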

* git: Replace test signing key with one with longer expiry

The old one was created with defaults, which would have caused CI
failures in 2 years.

The new one is valid for 10 years:

> gpg --list-secret-keys
/root/.gnupg/pubring.kbx
------------------------
sec   rsa4096 2018-08-22 [SC] [expires: 2028-08-19]
      93A17FF01E54328546087C8E029395402EFCCD53
uid           [ unknown] foo bar <foo@foo.foo>

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* plumbing, storage: add bases to the common cache

After clone only resolved deltas were added to the cache. This caused
slowdowns in small repositories where most objects can be held in cache.

It also makes packfiles reuse delta cache from the store. Previously it
created a new delta cache each time a packfile object was created. This
also slowed down a bit accessing objects and had an impact on memory
consumption when bases are added to the cache.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git: Discern tag target type from supplied hash

I figured there was a way to do this without having to have
TagObjectOptions supply this in - there is.

Added support for this and removed the object type from
TagObjectOptions.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Don't return tag object with Tag, adjust docs for Tag and Tags

I've mainly noticed that in using the current Tag function, that there
were lots of times that I was ignoring the ref or the object, depending
on what I needed. This was evident in the tests as well. As such, I
think it just makes more sense for a singular tag fetcher to return just
a ref, through which the caller may grab the annotation if they need it,
and if it exists.

Also, contrary to the docs on Tags, all tags have a ref, even if they
are annotated. The difference between a lightweight tag and an annotated
tag is the presence of the tag object, which the ref will point to if
the tag is annotated. As such I've adjusted the docs with an example as
to how one can get the annotation for a tag ref through the iterator.

Source: https://git-scm.com/book/en/v2/Git-Internals-Git-References,
tags section.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* storage/dotgit: search for incoming dir only once

The incoming object directory was searched each time objects were
accessed. This requires a ReadDir of the objects path, which is
expensive. Now the incoming directory is searched the first time an
object is accessed and its name is kept in DotGit to be reused.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/dotgit: use HasPrefix instead of Split

Also reformatted function comment and fixed some typos.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* Remove empty dirs when cleaning with Dir opt.

Signed-off-by: kuba-- <kuba@sourced.tech>

* Add Status.IsUntracked function

Signed-off-by: kuba-- <kuba@sourced.tech>

* plumbing: object: Clamp object timestamps before unix epoch to unix epoch

Signed-off-by: Taru Karttunen <taruti@taruti.net>

* config: add commentChar to core config struct

Signed-off-by: Zaq? Wiedmann <zaquestion@gmail.com>

* git: add Static option to PlainOpen

Also adds Static configuration to Storage and DotGit. This option means
that the git repository is not expected to be modified while open and
enables some optimizations.

Each time a file is accessed the storer tries to open an object file for
the requested hash. When this is done for a lot of objects it is
expensive. With the Static option, a list of object files is generated
the first time an object is accessed and is used to check whether an
object exists, instead of making system calls.

A similar optimization is done for packfiles.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git, storer: use a common storer.Options for storer and PlainOpen

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* dotgit: fix typo in comment

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* git: do not expose storage options in PlainOpen

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/storer: rename Static option to ExclusiveAccess

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: make Storage options private

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: move Options to filesystem and dotgit

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/dotgit: add ExclusiveAccess tests in dotgit

This functionality was already tested in storage/filesystem.
The coverage tool only takes into account files from the same
package of the test.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/dotgit: add KeepDescriptors option

This option keeps packfile file descriptors open after reading
objects from them. It improves performance because packfiles do not
have to be reopened each time an object is needed.

Also adds Close to EncodedObjectStorer to close all the files manually.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/storer: do not expose Close in EncodedObjectStorer interface

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: add KeepDescriptors test

Also delete Close from MockObjectStorage and memory storer.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: compare files using offset in test

Comparing files with equals uses diff, which can potentially consume
lots of RAM. Changed the comparison to use file offsets: if the
descriptor is reused, the offset is maintained.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* plumbing/transport: ssh check if list of known_hosts files is empty

Signed-off-by: kuba-- <kuba@sourced.tech>

* Fix fatal corrupt patch in unified diff format

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* Expose Storage cache.

Signed-off-by: kuba-- <kuba@sourced.tech>

* git: s/fetch/returns/ on Tag function doc

This is to avoid any ambiguity with the act of "fetching" in git in
general.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Add Tag objects to the list of supported objects for walking

This is necessary to support pruning on Tag objects.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Don't touch tag objects orphaned by tag deletion

Deleting a tag ref for an annotated tag in normal git behavior does not
delete the tag object right away. This is handled by the normal GC
process.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: Add some tests for annotated tag deletion

Added a couple of tests for annotated tag deletion:

* The first one is a general test and should work regardless of the
fixture used - the tag object could possibly be packed, so we do a prune
*and* a repack operation before testing to see if the object was GCed
correctly.

* The second one actually creates the tag to be deleted, so that the tag
object gets created as a loose, unpacked object. This is so we can
effectively test that pruning unpacked objects is now working 100%
correctly (this was failing before because tag objects were not
supported for walking).

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* git: s/TagObjectOptions/CreateTagOptions/

Just renaming the TagObjectOptions type to CreateTagOptions so that it's
consistent with the other option types.

Signed-off-by: Chris Marchesi <chrism@vancluevertech.com>

* repository: fix test for new Storage constructor

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* *: go modules support

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* travis: drop go1.9 add go1.11

* Fix potential LRU cache size issue.

Signed-off-by: kuba-- <kuba@sourced.tech>

* Remove empty space to trigger windows build.

Signed-off-by: kuba-- <kuba@sourced.tech>

* storage/filesystem: keep packs open in PackfileIter

PackfileIter was not taking into account the option KeepDescriptors
and was always closing the file. This caused "file already closed"
errors when iterating packfiles with KeepDescriptors active.

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* storage/filesystem: add more doc to NewPackfileIter

Signed-off-by: Javi Fontan <jfontan@gmail.com>

* all: remove extra 's' in "mismatch"

Signed-off-by: Jongmin Kim <jmkim@pukyong.ac.kr>

* test: improve test for urlencoded user:pass

Signed-off-by: Santiago M. Mola <santi@mola.io>

* use time.IsZero in Prune

Signed-off-by: u5surf <u5.horie@gmail.com>

* Add test for Windows local paths.

Signed-off-by: Filip Navara <navara@emclient.com>

* git: Fix Status.IsClean() documentation

The documentation of the IsClean Method contained a negation, so it was
describing the opposite of its actual behavior.

Fixes src-d#838

Signed-off-by: David Url <david@urld.io>

* Plumbing: object, Add support for Log with filenames. Fixes src-d#826 (src-d#979)

plumbing: object, Add support for Log with filenames. Fixes src-d#826

* object: get object size without reading whole object

Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* tree: add a Size() method for getting plaintext size

Without reading the entire object into memory.

Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* filesystem: add a new test for EncodedObjectSize

Suggested by taruti.

Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* repository: allow open non-bare repositories as bare

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* use remote name in fetch while clone, test

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* references: sort: compare author timestamps when commit timestamps are equal, test

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* teach ResolveRevision how to look up annotated tags, test

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* packfile: add comment on GetSizeByOffset

Suggested by mcuadros.

Issue: src-d#982
Signed-off-by: Jeremy Stribling <strib@alum.mit.edu>

* blame: fix edge case with missing \n in content length causing mismatched length error

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* repository: improve CheckoutOption.Hash doc

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* remote: use reference deltas on push when the remote server does not
support offset deltas

Signed-off-by: Benjamin Ash <bash@intelerad.com>

* Fixed a typo. (src-d#989)

README: Fixed a typo.

* Enables building on openbsd, dragonfly bsd and solaris

Signed-off-by: Yuce Tekol <yucetekol@gmail.com>

* plumbing/format/packfile: Fix broken "thin" packfile support. Fixes src-d#991

Signed-off-by: Javier Peletier <jm@epiclabs.io>

* plumbing: ReferenceName constructors

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* examples & documentation: PlainClone with Basic Authentication (Password & Access Token) (src-d#990)

examples: PlainClone with Basic Authentication (Password & Access Token)

* add StackOverflow to support channels

Since we are now redirecting users to StackOverflow for support
questions, it makes sense to add it to the official support channels.

Signed-off-by: Santiago M. Mola <santi@mola.io>

* plumbing: transport/http, Add missing host/port on redirect. Fixes src-d#820

Signed-off-by: Dave Henderson <dhenderson@gmail.com>

* Fix spelling and grammar in docs and example

Signed-off-by: Lukasz Kokot <lukasz@kumojin.com>

* update gcfg dependency to v1.4.0

Signed-off-by: Dave Henderson <dhenderson@gmail.com>

* repository: added cleanup for the PlainCloneContext()

Signed-off-by: Bartek Jaroszewski <jaroszewskibartek@gmail.com>

* improve cleanup implementation, add more tests

Signed-off-by: Santiago M. Mola <santi@mola.io>

* Update LICENSE

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* http: improve TokenAuth documentation

Users are often confused by TokenAuth, since it might look like it
should be used with GitHub's OAuth tokens. But that is not the case.

TokenAuth implements HTTP bearer authentication. Most git servers will
use HTTP basic authentication (user+password) even for OAuth tokens.

Signed-off-by: Santiago M. Mola <santi@mola.io>
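
The difference between the two schemes is visible at the HTTP header level. The sketch below uses only `net/http` rather than go-git's auth types. The URL, user name, and token values are placeholders, and which user name a given server expects alongside a token is server-specific.

```go
package main

import (
	"fmt"
	"net/http"
)

// basicAuthRequest builds a request that carries credentials as HTTP basic
// authentication: "Authorization: Basic base64(user:password)".
func basicAuthRequest(url, user, password string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, password)
	return req, nil
}

// bearerAuthRequest builds a request that carries a token as HTTP bearer
// authentication: "Authorization: Bearer <token>".
func bearerAuthRequest(url, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// Placeholder URL and credentials, for illustration only.
	const url = "https://example.com/repo.git/info/refs"

	basic, err := basicAuthRequest(url, "some-user", "oauth-token-value")
	if err != nil {
		panic(err)
	}
	bearer, err := bearerAuthRequest(url, "oauth-token-value")
	if err != nil {
		panic(err)
	}

	fmt.Println(basic.Header.Get("Authorization"))
	fmt.Println(bearer.Header.Get("Authorization"))
}
```

A server that only accepts basic authentication will reject the second request even though the token itself is valid, which matches the confusion this commit describes.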

* plumbing: ssh, Fix flaky test TestAdvertisedReferencesNotExists. Fixes src-d#969

Signed-off-by: Colton McCurdy <mccurdyc22@gmail.com>

* repository: Fix RefSpec for a single tag. Fixes src-d#960

Signed-off-by: Fedor Korotkov <fedor.korotkov@gmail.com>

* storage/filesystem: Added reindex method to reindex packfiles

Signed-off-by: Javier Peletier <jm@epiclabs.io>

* plumbing/format/packfile: Added thin pack test

Signed-off-by: Javier Peletier <jm@epiclabs.io>

* Remove unused method (src-d#1022)

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>

* plumbing: format/index: support for EOIE extension, by default on git v2.20.0

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>

* repository: fix plain clone error handling regression

PR src-d#1008 introduced a regression by changing the errors returned by
PlainClone when a repository did not exist.

This change restores the returned errors to what they were in v4.7.0.

Fixes src-d#1027

Signed-off-by: Santiago M. Mola <santi@mola.io>

* plumbing: format/packfile, performance optimizations for reading large commit histories (src-d#963)

Signed-off-by: Filip Navara <navara@emclient.com>
@programehr

programehr commented Dec 19, 2018

I updated keybase today and now have this problem.

@strib
Contributor

strib commented Dec 19, 2018

Please open a new issue and send logs.
