WIP - Unsymlinked #878

wants to merge 1 commit into


None yet
5 participants

nulltoken commented Aug 15, 2012

This PR is something between a WIP and some R&D. It originates back to libgit2/libgit2sharp#196 and libgit2/libgit2sharp#201.

Curiously, it looks like the .Net tests pass on Mono, Windows x86 but fail on Windows amd64

This PR brings the C version of one of the .Net test in order to try to troubleshoot this at the libgit2 level.

This pull request fails (merged 04c9b9c4 into fc1826d).

This pull request fails (merged fa967951 into fc1826d).

This pull request fails (merged 0e7635d7 into fc1826d).

This pull request passes (merged cc94c113 into fc1826d).


nulltoken commented Aug 15, 2012

@arrbee @aroben Would you be so kind as to run this test on a 64bits Windows platform?

/cc @carlosmn @dahlbyk


nulltoken commented Aug 15, 2012

@arrbee Could you please peek at ed04f77? test_diff_tree__options() refuses to pass if it isn't running in sandboxed mode. And I can't figure out why....


carlosmn commented Aug 15, 2012

On my amd64 VM, index_0 and index_range both fail with 1 != 0, and then after tree_1 I get a debug assertion with "Expression: (unsigned)(c+1) <= 256", which presumably happens inside tree_2.


carlosmn commented Aug 15, 2012

Igore that, I ran the wrong test (ctest seems to choose the wrong binary).

Now properly: the test fails with 2 != 3 at line 295.


nulltoken commented Aug 18, 2012

Is there a branch / commit I should clone to get the failing test into my tree so I can experiment a little bit?

@arrbee Thanks for this. It's part of the the WIP #878.

@arrbee arrbee commented on an outdated diff Aug 21, 2012

@@ -88,7 +88,10 @@ int diff_line_fn(
- assert(0);
+ /*
+ * FIXME: What should we do with the following assertion

arrbee Aug 21, 2012


This constant is deprecated, so it would be an error to return it. I thought assert(0) wasn't a horrible solution, but it does probably need a comment explaining why.


arrbee commented Aug 21, 2012

Okay, I figured out the bug that is being shown the unit test...

On platforms that don't have symlinks, if a tree or index is being diffed against the working directory and we see a symlink that has been converted into a regular file, then we ignore the fact that it was "converted" and keep it as a symlink. Of course, we should really only do this if the new file info is coming from the filesystem. Line 471-474 of diff.c:maybe_modified() needs to be changed to:

    /* on platforms with no symlinks, preserve mode of existing symlinks */
    if (S_ISLNK(omode) && S_ISREG(nmode) &&
        !(diff->diffcaps & GIT_DIFFCAPS_HAS_SYMLINKS) &&
        new_iter->type == GIT_ITERATOR_WORKDIR)
        nmode = omode;

With that change, I think we should get consistent behavior across platforms.

By the way, I think we only have to do this for comparison with the WORKDIR because of the index_merge_mode function in index.c that preserves the symlink-ness of a file in the index when a git_index_add is being done. However, there are probably situations where this could get us in trouble. If so, then instead of doing new_iter->type == GIT_ITERATOR_WORKDIR we would need to do new_iter->type != GIT_ITERATOR_TREE. I think that is not the right thing, however.

@arrbee arrbee added a commit that referenced this pull request Aug 22, 2012

@arrbee arrbee Minor bug fixes in diff code
In looking at PR #878, I found a few small bugs in the diff code,
mostly related to work that can be avoided when processing tree-
to-tree diffs that was always being carried out.  This commit has
some small fixes in it.

arrbee commented Aug 22, 2012

Okay, so @nulltoken I think I know what's going here except for the amd64 problem.

When you don't run sandboxed, the "gitattributes" file is not renamed to ".gitattributes" and as a result there is no .gitattributes file and the file that should be treated as a binary file is not, hence the number of hunks and the number of additions don't match up. That is why the test values don't match expected.

In going through this code, however, I found several small bugs in the diff code. One is the problem mentioned above that the platform-specific symlink capabilities are effecting tree-to-tree diffs which they should not do. Another was an unnecessary recomputation of SHAs for tree-to-tree diffs because of a bug that was clearing the "OID is valid" flag in the diff.

Since those bug fixes are independent of this PR, I just put them all into 5fdc41e and pushed them to development. If you rebase this PR, then you'll get those fixes. Of course, they won't actually matter for the different values between sandboxed / not sandboxed because that depends on the worktree based renaming of "gitattributes".

This pull request passes (merged 3b0b32f7 into 5fdc41e).

This pull request passes (merged c0a6b0a2 into 5fdc41e).


nulltoken commented Aug 23, 2012

@arrbee I've rebased the PR and fixed the tests.

Below, some things I'd like yo interest you in ;-)

  • The assert cannot be currently uncommented. Doing so makes one of the tests fail (cf the comment in the code)
  • The tree-to-tree test no longer differentiates the platform it's being run on. This passes on Windows and Linux
  • I've created a new workdir-to-tree test which compares the tree containing the symlinked blob against a plain file in the workdir. This test currently fails on Windows (cf the comment in the code)
  • Those haven't been run on a Windows amd64 platform yet.

arrbee commented Aug 23, 2012

@nulltoken Thanks! I will look into what's happening with the assert. I think I know what's happening, but I think we should probably be returning GIT_DIFF_LINE_ADDITION in the particular case and this is another bug. I'll go into the debugger and see what it looks like.

The Windows failure in that case is a little odd to me, but I suspect it is due to the path munging that our Windows implementation of p_readlink is doing. I realize now that assuming that the in-memory representation of a the link is the same number of bytes as the on-disk file is probably not safe. The logic for buffer allocation in diff_output.c probably needs to be tweaked...


arrbee commented Aug 24, 2012

Looking over the use of p_readlink throughout the code base, I'm surprised that any use of symlinks on Windows works at all. Everywhere that calls that function assumes that a buffer equal to the file size will be sufficient to hold the data, and that is not guaranteed with the win32 implementation.

I think this is a) not tested much in the library, b) not activated in most cases because we try to skip this code on Windows usually. I think that's the issue in this case with diff output - because we copy the mode bits from the index, this is one of the few places where we ever try to invoke the Windows p_readlink implementation. I actually think that may be the real bug here. Again, need to see how msysgit handles it.


nulltoken commented Sep 18, 2012

I've rebased this on top of development.


nulltoken commented Feb 1, 2013

Rebased on top of development.

  • The tree-to-tree test now passes
  • The tree-to-workdir still fails on Windows with the following message: "Failed to read symlink 'include/Nu/Nu.h': Not enough storage is available to process this command."

carlosmn commented Dec 18, 2013

Does this still happen? The diff code has probably gone though a couple of overhauls by now.


nulltoken commented Dec 21, 2013

@carlosmn: Rebased on latest tip. Still failing, but the error now originates back from git_fidff_foreach() and no longer from git_diff_tree_to_workdir().

diff::workdir::symlink_blob_mode_changed_to_regular_file [D:\Dropbox\Dropbox\lib
  Function call failed: git_diff_foreach( diff, diff_file_cb, diff_hunk_cb, diff
_line_cb, &exp)
  error -1 - Failed to read symlink 'include/Nu/Nu.h': Not enough storage is ava
ilable to process this command.

@phatblat phatblat pushed a commit to phatblat/libgit2 that referenced this pull request Sep 13, 2014

@arrbee arrbee Minor bug fixes in diff code
In looking at PR #878, I found a few small bugs in the diff code,
mostly related to work that can be avoided when processing tree-
to-tree diffs that was always being carried out.  This commit has
some small fixes in it.

ethomson commented Nov 3, 2015

Thanks for the unit test here - after digging in to this again, I notice that we shouldn't be trying to read a symlink here at all! MIgrated this over to #3498

ethomson closed this Nov 3, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment