git svn fetch fails with: index uses e 2. extension, which we do not understand #3780

Open
fmuntean opened this issue Apr 9, 2022 · 27 comments

@fmuntean

fmuntean commented Apr 9, 2022

I am trying to clone a huge repo using git svn, and I get the following error when running git svn fetch:

git svn fetch -q -r 64480
error: index uses e 2. extension, which we do not understand
fatal: index file corrupt
write-tree: command returned error: 128

I have the SVN repo locally and am serving it with the svnserve command from the local computer, so networking is not an issue.

svnadmin verify reports no issues.

Setup

  • Which version of Git for Windows are you using? Is it 32-bit or 64-bit?
    -[System Info]
    git version:
    git version 2.35.1.windows.2
    cpu: x86_64
    built from commit: 5437f0f
    sizeof-long: 4
    sizeof-size_t: 8
    shell-path: /bin/sh
    feature: fsmonitor--daemon
    uname: Windows 10.0 19044
    compiler info: gnuc: 11.2
    libc info: no libc information available
    $SHELL (typically, interactive shell): C:\Program Files\Git\usr\bin\bash.exe
$ git --version --build-options

git version 2.35.1.windows.2
cpu: x86_64
built from commit: 5437f0fd368c7faf1a0b5e1fef048232c1f2a3e6
sizeof-long: 4
sizeof-size_t: 8
shell-path: /bin/sh
feature: fsmonitor--daemon


$ cmd.exe /c ver

I am using mingw64, a.k.a. the Git shell.

 - What options did you set as part of the installation? Or did you choose the
   defaults?

I used defaults

One of the following:

type "C:\Program Files\Git\etc\install-options.txt"
type "C:\Program Files (x86)\Git\etc\install-options.txt"
type "%USERPROFILE%\AppData\Local\Programs\Git\etc\install-options.txt"
$ cat /etc/install-options.txt

Editor Option: Notepad++
Custom Editor Path:
Default Branch Option:
Path Option: Cmd
SSH Option: OpenSSH
Tortoise Option: false
CURL Option: WinSSL
CRLF Option: CRLFCommitAsIs
Bash Terminal Option: MinTTY
Git Pull Behavior Option: Rebase
Use Credential Manager: Enabled
Performance Tweaks FSCache: Enabled
Enable Symlinks: Disabled
Enable Pseudo Console Support: Disabled
Enable FSMonitor: Disabled

  • Any other interesting things about your environment that might be related
    to the issue you're seeing?

nothing special

Details

  • Which terminal/shell are you running Git from? e.g Bash/CMD/PowerShell/other

mingw64

git svn fetch 
  • What did you expect to occur after running these commands?

to work and clone the entire project

  • What actually happened instead?

error: index uses e 2. extension, which we do not understand
fatal: index file corrupt
write-tree: command returned error: 128

@rimrul
Member

rimrul commented Apr 11, 2022

fatal: index file corrupt

That seems to be the cause of your issue: your index file got corrupted. How that happened is anyone's guess, but you seem to have non-extension data in parts of the index file where Git expects index extensions.

git fsck might help.

@fmuntean
Author

I ran git fsck, with no luck.
I also ran git svn gc and git gc and still have the same issue.
Please advise where to look. It is not easy to tell whether this is a Git index or a git svn index file.
Where in the *.pm Perl code can I add more logging to find out exactly which file is considered corrupted?

@rimrul
Member

rimrul commented Apr 11, 2022

I think the only occurrence of that error message is in read-cache.c, so you'll probably need to add that logging to the C code.
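
For readers following along: the index-format documentation says that an extension whose four-byte signature starts with 'A'..'Z' is optional and may be ignored, while any other unknown signature is an error. The sketch below illustrates that rule in Python. It is not Git's C code, and the names `KNOWN_EXTENSIONS` and `classify_extension` are assumptions made for illustration (the set of known signatures may be incomplete).

```python
# Illustrative sketch only; names are hypothetical and this is not Git's code.
# Signatures listed in the index-format documentation (possibly incomplete).
KNOWN_EXTENSIONS = {
    b"TREE", b"REUC", b"link", b"UNTR",
    b"FSMN", b"EOIE", b"IEOT", b"sdir",
}


def classify_extension(signature: bytes) -> str:
    """Classify a 4-byte extension signature read from an index file."""
    if len(signature) != 4:
        raise ValueError("an extension signature is exactly 4 bytes")
    if signature in KNOWN_EXTENSIONS:
        return "known"
    # Unknown signatures starting with 'A'..'Z' are optional and can be skipped.
    if b"A" <= signature[:1] <= b"Z":
        return "optional, ignored"
    # Anything else is treated as an extension the reader does not understand.
    return "not understood"


print(classify_extension(b"e 2."))
```

A signature such as "e 2." falls into the last branch, which is consistent with the reader having landed on arbitrary bytes rather than on a real extension header.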

@fmuntean
Author

I think the only occurrence of that error message is in read-cache.c, so you'll probably need to add that logging to the C code.

That would require a full rebuild vs. a small change in the Perl files. Is there any other way to determine which file it is choking on?

@fmuntean
Author

fmuntean commented Apr 11, 2022

The repo was cloned using git svn, so any extension should have been created by the tool itself.

@fmuntean
Author

Line 70: #define CACHE_ENTRY_PATH_LENGTH 80 drew my attention.
Could this be a limitation where long paths are not supported?
The index file under the .git directory seems to be in a binary format.
Is there any documentation on the format, so I can try to parse it manually and see where the problem resides?

@rimrul
Member

rimrul commented Apr 12, 2022

Is there any documentation on the format so I can try to parse it manually to see where the problem could reside?

Yes:

https://git-scm.com/docs/index-format

@fmuntean
Author

From the documentation it looks like the extensions have their size in their header, but the index entries do not have their size mentioned.
It seems that only a count is provided in the main header.
The "object name" does not seem to have a length specified. I see some comments about padding to a multiple of 8 bytes, but it is not clear where the padding goes.

Once I run git svn gc and rerun the fetch for just that specific SVN revision, I get a new index file under the SVN tag folder that is 4GB in size.
The revision seems to be about tagging in SVN.

Please advise.

@rimrul
Member

rimrul commented Apr 12, 2022

From the documentation it looks like the extensions have their size in their header, but the index entries do not have their size mentioned.

They have a variable size.

It seems that only a count is provided in the main header.
The "object name" does not seem to have a length specified.

That's an SHA-1 (usually), so 20 bytes.

I see some comments about padding to 8 byte multiple but not clear where they are.

after the Entry path name.

Once I run git svn gc and rerun the fetch for just that specific SVN revision, I get a new index file under the SVN tag folder that is 4GB in size.

That's interesting. This could be related to #2179.
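
The entry layout described in the replies above (a 20-byte SHA-1, a variable-length path, and NUL padding after the path name) can be sketched in Python. This is a minimal sketch for index versions 2 and 3 only, based on the index-format documentation; `entry_size` and `parse_entry` are hypothetical helper names, and version 4's path compression is not handled.

```python
import struct

# 40 bytes of stat data + 20-byte SHA-1 + 2-byte flags field.
FIXED_FIELDS = 62


def entry_size(path_len: int, extended: bool = False) -> int:
    """On-disk size of one v2/v3 index entry for a path of path_len bytes."""
    fixed = FIXED_FIELDS + (2 if extended else 0)
    # 1 to 8 NUL bytes pad the entry to a multiple of 8 while keeping the
    # path NUL-terminated.
    return (fixed + path_len + 8) & ~7


def parse_entry(buf: bytes, offset: int = 0):
    """Return (path, sha1_hex, offset_of_next_entry) for the entry at offset."""
    fields = struct.unpack_from(">10I20sH", buf, offset)
    sha1, flags = fields[10], fields[11]
    extended = bool(flags & 0x4000)   # v3+: 2 more bytes of flags follow
    name_len = flags & 0x0FFF         # 0xFFF means "0xFFF bytes or longer"
    start = offset + FIXED_FIELDS + (2 if extended else 0)
    if name_len < 0x0FFF:
        path = buf[start:start + name_len]
    else:
        path = buf[start:buf.index(b"\0", start)]
    return path, sha1.hex(), offset + entry_size(len(path), extended)
```

Because each entry's length depends on its path, a parser has to walk the entries one by one from the end of the 12-byte header; a single mis-sized entry shifts everything after it.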

@fmuntean
Author

I checked that issue; it is 3 years old and not fixed.
Looking at the index header, it generates v2. How can I force it to use a v4 index, which seems to use some compression, so that I might get just under the current limit and continue?
From the header it also seems that there are 2.5 million entries in the index. I am not sure whether there is a bug there, as I do not expect that many files, so I wonder about duplicates.
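
Reading the version and entry count from the header, as described above, can be sketched as follows. This is an illustrative Python sketch based on the index-format documentation (12-byte header: the signature "DIRC", a 4-byte version, and a 4-byte entry count, all big-endian); `read_index_header` is a hypothetical helper name.

```python
import struct


def read_index_header(data: bytes):
    """Return (version, entry_count) from the first 12 bytes of an index file."""
    signature, version, count = struct.unpack_from(">4sII", data, 0)
    if signature != b"DIRC":
        raise ValueError("not a Git index file: bad signature %r" % signature)
    return version, count


# Usage, assuming `index_path` points at the index file to inspect:
#     with open(index_path, "rb") as f:
#         print(read_index_header(f.read(12)))
```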

@rimrul
Member

rimrul commented Apr 12, 2022

For new repos you can use the index.version config option.

@rimrul
Member

rimrul commented Apr 12, 2022

2.5 million entries sounds like a lot, but would by my rough napkin math only account for somewhere between 200 and 800 megabytes of the index file.
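
The napkin math above can be reproduced with a short Python sketch. It assumes the v2 entry layout from the index-format documentation (62 fixed bytes plus the path, padded to a multiple of 8) and ignores extensions and the trailing checksum; the average path lengths of 20 and 250 bytes are assumptions chosen to bracket the estimate, and `estimated_index_bytes` is a hypothetical name.

```python
def estimated_index_bytes(entries: int, avg_path_len: int) -> int:
    """Rough v2 index size: 12-byte header plus padded fixed-size entries."""
    per_entry = (62 + avg_path_len + 8) & ~7
    return 12 + entries * per_entry


for avg_path_len in (20, 250):  # assumed short and long average paths
    size = estimated_index_bytes(2_500_000, avg_path_len)
    print(f"avg path {avg_path_len:>3} bytes: about {size / 1e6:.0f} MB")
```

The estimate scales linearly with the entry count, so the same function can be reused for other counts.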

@fmuntean
Author

2.5 million entries sounds like a lot, but would by my rough napkin math only account for somewhere between 200 and 800 megabytes of the index file.

My bad! I read it wrong; the header says 25 million.

@rimrul
Member

rimrul commented Apr 12, 2022

My bad! I read it wrong; the header says 25 million.

That would do it.

git update-index --index-version=4 updates an existing index to version 4, but I recommend a backup beforehand, especially seeing as your file seems to be corrupt. I'm not quite sure what results to expect on your file.

@fmuntean
Author

The index file I am talking about is under the SVN subfolder, and git svn gc seems to delete it anyway.
I was trying to parse the index file; however, the header size seems to fluctuate by 4 bytes from the 62 bytes I counted before the file path name.
Is there a condition where these extra 32 bits are required, based on some flags not currently documented?
Where is the implementation of read_index_from()? Code search does not seem to work for me in this repo.

@rimrul
Member

rimrul commented Apr 12, 2022

read_index_from() is implemented in read-cache.c. GitHub's code search doesn't work on this repo because it's forked from git/git. You can often find these things by searching in git/git, but for some reason that doesn't currently find read_index_from either.

@fmuntean
Author

My bad! I read it wrong; the header says 25 million.

That would do it.

git update-index --index-version=4 updates an existing index to version 4, but I recommend a backup beforehand, especially seeing as your file seems to be corrupt. I'm not quite sure what results to expect on your file.

Running git update-index did not help; I still get the same error.
After that I ran 'git svn gc', which removes all the SVN index files, and then ran 'git config index.version 4', which added that setting to the Git config file.
'git svn fetch ..' now produced a 1.9GB index file, but I get the following error:
0 [main] perl 5185 child_info_fork::abort: address space needed by 'msys-svn_wc-1-0.dll' (0x82A2C0000) is already occupied
Can't fork, trying again in 5 seconds at C:/Program Files/Git/mingw64/share/perl5/Git.pm line 1647.

perl.exe is using 1.3GB of RAM, and there is still plenty left on the system.

@fmuntean
Author

fmuntean commented Apr 14, 2022

Any help with this error ?
[main] perl 5185 child_info_fork::abort: address space needed by 'msys-svn_wc-1-0.dll' (0x82A2C0000) is already occupied
Can't fork, trying again in 5 seconds at C:/Program Files/Git/mingw64/share/perl5/Git.pm line 1647.

Seems that no matter how many times I restart the 'git svn fetch' using the index v4 I get this error while there is plenty of free memory on the system.

@rimrul added the git-svn label Apr 14, 2022
@dscho
Member

dscho commented Apr 19, 2022

[main] perl 5185 child_info_fork::abort: address space needed by 'msys-svn_wc-1-0.dll' (0x82A2C0000) is already occupied
Can't fork, trying again in 5 seconds at C:/Program Files/Git/mingw64/share/perl5/Git.pm line 1647.

This would suggest an i686 version of Git might be used, and https://github.com/git-for-windows/git/wiki/32-bit-issues might help resolve it. (It might even help resolve it if the Git in question is of the x86_64 flavor.)

@fmuntean
Author

I am already using the 64-bit version of Git.

@fmuntean
Author

fmuntean commented Apr 20, 2022

I skipped that specific revision by running the fetch with -r set to the next revision, and the process progressed for a while.
Now I am getting the following error:
Checking svn:mergeinfo changes since r72502: 298 sources, 1 changed
open: No such file or directory at /usr/share/perl5/core_perl/Memoize.pm line 264.

Not sure what to do next. Can anyone help please ?
FYI: I am using Windows, so the path in the error does not even make sense.

@rimrul
Member

rimrul commented Apr 20, 2022

open: No such file or directory at /usr/share/perl5/core_perl/Memoize.pm line 264.
FYI: I am using windows so the path in the error does not even make sense.

You're using MSYS2 Perl, so the path makes perfect sense.

Converted to a Windows path, that would be C:\Program Files\Git\usr\share\perl5\core_perl\Memoize.pm.
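
The conversion above can be sketched in Python. This is only an illustration of the mapping for paths that live under the Git for Windows installation directory: `GIT_ROOT` is an assumed install location, `msys_to_windows` is a hypothetical helper, and special mounts (for example drive paths such as /c/...) are not handled. Inside the Git shell, the cygpath utility is the usual tool for this.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Assumed installation directory; adjust to the actual install location.
GIT_ROOT = PureWindowsPath(r"C:\Program Files\Git")


def msys_to_windows(posix_path: str, root: PureWindowsPath = GIT_ROOT) -> str:
    """Map an absolute MSYS2-style path onto the Windows install directory."""
    parts = PurePosixPath(posix_path).parts
    if not parts or parts[0] != "/":
        raise ValueError("expected an absolute POSIX-style path")
    # Drop the leading "/" and re-root the remaining components.
    return str(root.joinpath(*parts[1:]))


print(msys_to_windows("/usr/share/perl5/core_perl/Memoize.pm"))
```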

@fmuntean
Author

fmuntean commented Apr 20, 2022

OK. So how do I actually solve this issue? The error does not show which file it is trying to open.

The line reported in the error is: my @q = &{$info->{U}}(@_);

@fmuntean
Author

I am stuck right now with these multiple issues and I have a huge repo to convert. Any help is appreciated.

@fmuntean
Author

Any update on this issue? Anyone have any idea on how to move this forward ? I am currently stuck on migrating an SVN repo to GIT.

@dscho
Member

dscho commented May 6, 2022

Any update on this issue? Anyone have any idea on how to move this forward ? I am currently stuck on migrating an SVN repo to GIT.

If you want to perform a one-time migration, it might make sense to use WSL to do it. Historically, Git for Windows' git svn has been a prolific source of bugs.

@rhuijben

@fmuntean you might want to look at https://gitlab.com/esr/reposurgeon; from what I have heard, it should support many scenarios that git-svn doesn't.

I think your git-svn misdetected your tags directory and tries to map all of them in your working copy instead of as branches/tags. You might be able to fix that in other ways.
