http protocol error: git fetch_pack: expected ACK/NAK, got 'dul-daemon says what' #88

Closed
asaladin opened this Issue Apr 9, 2013 · 6 comments

3 participants

@asaladin

Hi,

This description follows the previous bug report on launchpad: https://bugs.launchpad.net/dulwich/+bug/953916

I've been able to reproduce the same error on a public git repository, so it can now easily be reproduced.
I can reproduce this problem with at least two versions of dulwich: the current pypi version and the latest github master version at 2a8548b (Dec 04, 2012).

It happens during a git pull operation of a repository served by http.
Here are the steps to reproduce:

$ cd /tmp 
$ git clone https://github.com/asaladin/testdul_repo.git ptools
$ git clone https://github.com/asaladin/testdul_git2.git git2

$ cat test_dulwich/test_dulwich.py
##################
from dulwich.repo import Repo
from dulwich.web import HTTPGitApplication
from dulwich import server as dulserver

repo_path = "/tmp/ptools"

_d = {'/repo' : Repo(repo_path)}
backend = dulserver.DictBackend(_d)
gitserve = HTTPGitApplication(backend)

def app(environ, start_response):
   return gitserve(environ, start_response)
##################

#run on another shell with: $ gunicorn test_dulwich:app
$ cd git2
$ git pull http://localhost:8000/repo 
fatal: git fetch_pack: expected ACK/NAK, got 'dul-daemon says what'

cheers,

@jelmer
Owner
@bbqsrc

Just like to note that this is still broken unfortunately.

@jelmer
Owner
@bbqsrc

If you could prod me into a direction to look, I could have a go at it.

@jelmer
Owner
@jelmer jelmer added the bug label Nov 15, 2014
@metatoaster metatoaster added a commit to metatoaster/dulwich that referenced this issue Mar 25, 2015
@metatoaster metatoaster Test case and data for reproduction of issue #88
- Updated import_repo to allow fast-import to make a working copy as an
  option.
- Added two fast-export test files along with test case that reliably
  reproduce this issue over http.
fccf9fd
@metatoaster metatoaster added a commit to metatoaster/dulwich that referenced this issue Mar 27, 2015
@metatoaster metatoaster Fixes issue #88 in a brute manner
- Just to get the core details done to ensure the issued is nailed down
  before the cleanup.

- `MultiAckDetailedGraphWalkerImpl` rebuilt to ensure output generated
  matches the reference implementation (i.e. follow the behavior of
  git-http-backend).

- While I would like to split out above into separate commit, the pieces
  needed seems to be scattered all over the place so I will leave this
  unisolated simply this change does not make sense without the full
  correction in place (also this was already done to isolate the actual
  root cause and building the fix).  This also required minor changes to
  the related TEST case.

- Another issue that is complicit to issue 88 is that the http server
  behaves almost as if no-done was implemented, however that capability
  is not reported to the client, i.e. the PACK section was sent without
  the expected done, which the client dutifully sends in a subsequent
  HTTP request, but instead it already got the PACK and thus silently
  errors out without completing the pull request, even with the first
  issue corrected.  If the no-done capability was specified the client
  will automatically call merge if configured to do so.

- There is another issue where attempts to pull an unrelated repo, the
  various combination of the previous flaws made it difficult to send
  the correct number of NAKs.  Removing the walker.send_ack call from
  the walker implementation and just ensure the ack is the only method
  that will call send_ack simplifies the process of correcting this.

- A state variable is needed to track whether the client expects the
  server to be processing the have lines as the reference implementation
  errors out (with `expected ACK/NAK got` something else error).  This
  was triggered by `dulwich.object_store.MissingObjectFinder.next`, and
  the flag controls whether `progress` reports progress.  This was the
  issue that was isolated by correcting the above (or some other way
  around, correction to this implementation has been rather difficult
  due to lack of discipline on where things are called, which leads
  to the next point).

- Finally, server needs to send the final ACK obj-id before the PACK
  iff client has already send done.  Otherwise this causes similar
  issues as a misplaced PACK.  Unfortunately due to poor cohesion
  between the walker and handler this was implemented in a hacky
  manner which must  be fixed in a subsequent changeset.  Also the
  definition of scoping and object ownership (e.g. certain method calls
  should not be done at where they are) needs to be corrected (i.e
  define a more common methods across all implementations).
145c6dd
@jelmer jelmer added a commit that referenced this issue Apr 13, 2015
@metatoaster metatoaster Test case and data for reproduction of issue #88
- Updated import_repo to allow fast-import to make a working copy as an
  option.
- Added two fast-export test files along with test case that reliably
  reproduce this issue over http.
4a44e7a
@jelmer jelmer added a commit that referenced this issue Apr 13, 2015
@metatoaster metatoaster Fixes issue #88 in a brute manner
- Just to get the core details done to ensure the issued is nailed down
  before the cleanup.

- `MultiAckDetailedGraphWalkerImpl` rebuilt to ensure output generated
  matches the reference implementation (i.e. follow the behavior of
  git-http-backend).

- While I would like to split out above into separate commit, the pieces
  needed seems to be scattered all over the place so I will leave this
  unisolated simply this change does not make sense without the full
  correction in place (also this was already done to isolate the actual
  root cause and building the fix).  This also required minor changes to
  the related TEST case.

- Another issue that is complicit to issue 88 is that the http server
  behaves almost as if no-done was implemented, however that capability
  is not reported to the client, i.e. the PACK section was sent without
  the expected done, which the client dutifully sends in a subsequent
  HTTP request, but instead it already got the PACK and thus silently
  errors out without completing the pull request, even with the first
  issue corrected.  If the no-done capability was specified the client
  will automatically call merge if configured to do so.

- There is another issue where attempts to pull an unrelated repo, the
  various combination of the previous flaws made it difficult to send
  the correct number of NAKs.  Removing the walker.send_ack call from
  the walker implementation and just ensure the ack is the only method
  that will call send_ack simplifies the process of correcting this.

- A state variable is needed to track whether the client expects the
  server to be processing the have lines as the reference implementation
  errors out (with `expected ACK/NAK got` something else error).  This
  was triggered by `dulwich.object_store.MissingObjectFinder.next`, and
  the flag controls whether `progress` reports progress.  This was the
  issue that was isolated by correcting the above (or some other way
  around, correction to this implementation has been rather difficult
  due to lack of discipline on where things are called, which leads
  to the next point).

- Finally, server needs to send the final ACK obj-id before the PACK
  iff client has already send done.  Otherwise this causes similar
  issues as a misplaced PACK.  Unfortunately due to poor cohesion
  between the walker and handler this was implemented in a hacky
  manner which must  be fixed in a subsequent changeset.  Also the
  definition of scoping and object ownership (e.g. certain method calls
  should not be done at where they are) needs to be corrected (i.e
  define a more common methods across all implementations).
9e17d13
@jelmer jelmer added a commit that referenced this issue Apr 26, 2015
@metatoaster metatoaster Test case and data for reproduction of issue #88
- Updated import_repo to allow fast-import to make a working copy as an
  option.
- Added two fast-export test files along with test case that reliably
  reproduce this issue over http.
e0d56cf
@jelmer jelmer added a commit that referenced this issue Apr 26, 2015
@metatoaster metatoaster Fixes issue #88 in a brute manner
- Just to get the core details done to ensure the issued is nailed down
  before the cleanup.

- `MultiAckDetailedGraphWalkerImpl` rebuilt to ensure output generated
  matches the reference implementation (i.e. follow the behavior of
  git-http-backend).

- While I would like to split out above into separate commit, the pieces
  needed seems to be scattered all over the place so I will leave this
  unisolated simply this change does not make sense without the full
  correction in place (also this was already done to isolate the actual
  root cause and building the fix).  This also required minor changes to
  the related TEST case.

- Another issue that is complicit to issue 88 is that the http server
  behaves almost as if no-done was implemented, however that capability
  is not reported to the client, i.e. the PACK section was sent without
  the expected done, which the client dutifully sends in a subsequent
  HTTP request, but instead it already got the PACK and thus silently
  errors out without completing the pull request, even with the first
  issue corrected.  If the no-done capability was specified the client
  will automatically call merge if configured to do so.

- There is another issue where attempts to pull an unrelated repo, the
  various combination of the previous flaws made it difficult to send
  the correct number of NAKs.  Removing the walker.send_ack call from
  the walker implementation and just ensure the ack is the only method
  that will call send_ack simplifies the process of correcting this.

- A state variable is needed to track whether the client expects the
  server to be processing the have lines as the reference implementation
  errors out (with `expected ACK/NAK got` something else error).  This
  was triggered by `dulwich.object_store.MissingObjectFinder.next`, and
  the flag controls whether `progress` reports progress.  This was the
  issue that was isolated by correcting the above (or some other way
  around, correction to this implementation has been rather difficult
  due to lack of discipline on where things are called, which leads
  to the next point).

- Finally, server needs to send the final ACK obj-id before the PACK
  iff client has already send done.  Otherwise this causes similar
  issues as a misplaced PACK.  Unfortunately due to poor cohesion
  between the walker and handler this was implemented in a hacky
  manner which must  be fixed in a subsequent changeset.  Also the
  definition of scoping and object ownership (e.g. certain method calls
  should not be done at where they are) needs to be corrected (i.e
  define a more common methods across all implementations).
3d5b8a8
@jelmer jelmer added this to the 0.11 milestone May 12, 2015
@jelmer jelmer added a commit that referenced this issue May 24, 2015
@metatoaster metatoaster Fix handling of 'done' in graph walker and implement the 'no-done' ca…
…pability. (#88)

The gist of this gnarly problem is that it looks like the implementation over HTTP behaves as-if the `no-done` capability was enabled but unspecified so the reference client would make a second round trip back with more `haves` but then it doesn't know what to do with the `PACK` that was already sent. Alternatively there are cases where the side-band was used when it was expecting ACK/NAK. There were multiple problems with the code base that made this issue rather hard to stomp out, and here are the fixes (some verbatim from commit messages).

- `MultiAckDetailedGraphWalkerImpl` rebuilt to ensure output generated
  matches the reference implementation, like the git-http-backend. (Even though this is not strictly necessary, repeatedly calling `all_wants_specified` doesn't really change much given that there's only one place where the commits can be added in, which is by the object store through the `ack` method.

- Correction to the communication after all `have` lines have been processed. As a freebie I got `no-done` capability implemented since that's what it looked like before.

- There is another issue where attempts to pull an unrelated repo, the
  various combination of the previous flaws made it difficult to send
  the correct number of NAKs.  Removing the walker.send_ack call from
  the walker implementation and just ensure the ack is the only method
  that will call send_ack simplifies the process of correcting this.

- Added various state variables to the handler to track whether the `done` token is expected.

- Added a couple methods to deal with the handling of the above state variables, which the walker implementations themselves have access to through a method provided by the generic protocol walker class

- Added a method to the graph walker that takes in the state variables provided by the handler to delegate the dealing of the final ACK/NAK line that ends the section to permit the PACK section to begin, which then gets delegated to the Impl instances (which also have been normalized to address this).

- Finally, cleaned up the test cases and added further tests where relevant. Naturally
one of the earlier commits introduce some test examples that demonstrates this problem.
2cf0c06
@jelmer jelmer added a commit that referenced this issue May 24, 2015
@metatoaster metatoaster Fix handling of 'done' in graph walker and implement the 'no-done' ca…
…pability. (#88)

The gist of this gnarly problem is that it looks like the implementation over HTTP behaves as-if the `no-done` capability was enabled but unspecified so the reference client would make a second round trip back with more `haves` but then it doesn't know what to do with the `PACK` that was already sent. Alternatively there are cases where the side-band was used when it was expecting ACK/NAK. There were multiple problems with the code base that made this issue rather hard to stomp out, and here are the fixes (some verbatim from commit messages).

- `MultiAckDetailedGraphWalkerImpl` rebuilt to ensure output generated
  matches the reference implementation, like the git-http-backend. (Even though this is not strictly necessary, repeatedly calling `all_wants_specified` doesn't really change much given that there's only one place where the commits can be added in, which is by the object store through the `ack` method.

- Correction to the communication after all `have` lines have been processed. As a freebie I got `no-done` capability implemented since that's what it looked like before.

- There is another issue where attempts to pull an unrelated repo, the
  various combination of the previous flaws made it difficult to send
  the correct number of NAKs.  Removing the walker.send_ack call from
  the walker implementation and just ensure the ack is the only method
  that will call send_ack simplifies the process of correcting this.

- Added various state variables to the handler to track whether the `done` token is expected.

- Added a couple methods to deal with the handling of the above state variables, which the walker implementations themselves have access to through a method provided by the generic protocol walker class

- Added a method to the graph walker that takes in the state variables provided by the handler to delegate the dealing of the final ACK/NAK line that ends the section to permit the PACK section to begin, which then gets delegated to the Impl instances (which also have been normalized to address this).

- Finally, cleaned up the test cases and added further tests where relevant. Naturally
one of the earlier commits introduce some test examples that demonstrates this problem.
b3445db
@jelmer
Owner

Fixed in master.

@jelmer jelmer closed this May 24, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment