Git reports errors in the repository #2278

Open
alex-ilin opened this issue Apr 13, 2020 · 5 comments

@alex-ilin
Member

When I try to git-clone the Factor repository, it fails with an error.

C:\Programs\Dev\factor-github>git clone git@github.com:factor/factor.git
Cloning into 'factor'...
remote: Enumerating objects: 85, done.
remote: Counting objects: 100% (85/85), done.
remote: Compressing objects: 100% (58/58), done.
error: object 571928545d9f2c920eb892b3da0679885b3c3945: zeroPaddedFilemode: contains zero-padded file modes
fatal: fsck error in packed object
fatal: index-pack failed

C:\Programs\Dev\factor-github>git --version
git version 2.25.0.windows.1

C:\Programs\Dev\factor-github>git config transfer.fsckobjects
true

Searching through the issues, I found that this problem was already mentioned in #2197.
This seems to be a serious problem, because I can't clone the official repository from GitHub unless I disable the transfer.fsckobjects setting.
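As a narrower alternative to disabling transfer.fsckobjects altogether, Git's configuration documents a per-message override, fetch.fsck.<msg-id>, which can downgrade a single fsck check during fetches. The sketch below only builds such a command line. The helper name clone_command is made up for illustration, and whether this override is enough to clone this particular repository has not been verified here.

```python
def clone_command(url, ignore_msg_ids=("zeroPaddedFilemode",)):
    """Build a `git clone` argv that ignores selected fsck messages for this run only.

    Hypothetical helper: it does not run git, it only assembles the arguments.
    """
    argv = ["git"]
    for msg_id in ignore_msg_ids:
        # `-c key=value` sets a config value for a single invocation,
        # leaving the global transfer.fsckobjects setting untouched.
        argv += ["-c", f"fetch.fsck.{msg_id}=ignore"]
    argv += ["clone", url]
    return argv
```

Passing the result to subprocess.run (or joining it with spaces and pasting it into a shell) would attempt the clone with only that one check relaxed, while all other fsck checks stay active.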

In my local repository, where I normally work, git-fsck completes without errors. I can also clone my own fork from git@github.com:AlexIljin/factor.git without any errors.

Here's the full list of the bad objects:

04844ac88050a91b0be01a982069f56b68ec8fc3
19a3572612d83fabcd651a83283f3d055c07667a
2f52cab4f0d9e4796a4090e7302acde044906d1a
34ee26da67e16ab5cd67a431e3f2457694b101be
3e353c1ffd2ba87158ec0585aaabfff667111933
3f6d86a70b28f9f113ee99cd8ed2a36100cf81ad
41a975d250b999b67f7e978f89510d37e2b1af92
46179fe46976c370dd5f717dc56e1200a70f55a2
487823fcf3e3ef004e2a605a53371dc08845e736
4ce4dc447af5e1166d28501c0aebe72f2a9c240a
509747a51338e7ee33a108e3d2e7fdd4830a1c49
5444f37f20906a33abc37d281a52328b46267445
571928545d9f2c920eb892b3da0679885b3c3945
6aec4b919334fe5f5fe22acedbf3ee71765c5c61
78e4700154d0d569e7741881ba8b3bbc140608ac
79c29f256fe9c6ad0bf31b1be50332a23985050f
8aaeef26ddf8a5e8686a0d3188d5ec48bd1596b7
91ac0bf89d022369d8e315d8d7b1edde078210d7
9661fb9a7e6aaf99361447e5e64b5b0234b8a8e4
aadb6d1f086da69dd3cc6e92a258b3cfd7d01fd6
abd2b75f960ad2abc2febd59ee949a31c7731c13
b6e38dd97ded04ae4f9f25550f62b99dcef8e7f4
d6908b7db93330b2e4e25849c6e6b898d32b6973
db1b2c59dc674ce46d94fc2f9292aa258ef3824f
e97749e9e2fed26f44b0d30a63f4213991caf60a
e97e99cb528db90eb9172b1aac6c1f8dfad90915
f436eace0f3e7d9c13e91efff62e2eeef12b1c18
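
For context on what the check means: zeroPaddedFilemode is reported for tree objects in which an entry's mode is written with a leading zero (for example 040000 instead of 40000). A minimal sketch of that test on the raw body of a tree object follows. It assumes SHA-1 (20-byte) object IDs, and the function name zero_padded_entries is made up for illustration; it is not how Git itself is implemented, only the same idea.

```python
def zero_padded_entries(tree_body: bytes):
    """Return the names of tree entries whose mode starts with '0'.

    `tree_body` is the raw content of a tree object (without the
    "tree <size>\\0" header). Each entry is laid out as:
        <octal mode in ASCII> SP <name> NUL <20-byte binary SHA-1>
    """
    bad = []
    pos = 0
    while pos < len(tree_body):
        space = tree_body.index(b" ", pos)
        nul = tree_body.index(b"\0", space + 1)
        mode = tree_body[pos:space]
        name = tree_body[space + 1:nul]
        if mode.startswith(b"0"):
            bad.append(name.decode("utf-8", "replace"))
        pos = nul + 1 + 20  # skip the binary object ID by offset
    return bad
```

Such trees are still readable by Git, which is consistent with the local repository working normally; the failure only appears when objects are checked strictly on transfer.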
@nomennescio
Contributor

I checked #2190 today, and saw it's still not finished. After it's finished, I am willing to start a controlled rewrite of the Git history, based on the archive in GitHub, as I mentioned in #2197. I suspect this will also solve the issue you encountered.

Btw, I was able to clone the repo today using both git version 2.21.0.windows.1 and Cygwin's git version 2.21.0

@nomennescio
Contributor

nomennescio commented Jun 28, 2022

Did a check today; the same errors still occur. At least no new ones were introduced in the past two years.

@mrjbq7
Member

mrjbq7 commented Jun 28, 2022

Yeah, at some point some kind of git-fu messed up the repo around 0.96 or 0.97 which probably caused that.

@nomennescio
Contributor

Did another check today: still the same errors, no signs of further corruption.

@nomennescio
Contributor

nomennescio commented Jan 6, 2023

For completeness: when I recreated Factor's paleo-history (#2323), which is also where I created the release branch, I had to find the commits that corresponded to each release snapshot archive. The file git-id, which records the commit used to create such an archive, was added to the repo only quite late in the history. But even then I could not find exact matches for some releases. I will list these here:

e39e4c58a9 import-0.91 parent 3f22d61a0482bb5f6b60d0252e28598bbc5b0fba (no git-id available)
a34afc9c34 import-0.92 parent a495f8e0998de845a82c9a7b905cff2e18686bfa (git-id NOT found)
4989da35f2 import-0.93 parent 58bf727ac52508ec787e2d8779e7ba35b370b287 (git-id NOT found)
ba92cccef1 import-0.94 git-id db359d69dfe0f24613f9a8ec4f6ac3a0b3d87980 NOT FOUND
d9f6a668ca import-0.95 git-id e4cc936c55d9946698abd266f673ba8c06b5e19e NOT FOUND
ce08c23447 import-0.96 git-id 2a8af325347d5e90ce874f706f5746cd0ddaac9b NOT FOUND
bd240ff97b import-0.97 parent eb3ca179740e6cfba696b55a999caa13369e6182 (=git-id)
f09eb73ae2 import-0.98 parent 7999e72aecc3c5bc4019d43dc4697f49678cc3b4 (=git-id)

So around 0.94 git-id was introduced, but releases 0.94, 0.95, and 0.96 refer to commit SHAs that are NOT in the Factor repo!
That means the commits these releases were based on are lost. I did find the commits that most closely resembled the state of each release. Because I made an artificial "merge" by manually constructing Git objects (yes, this is deeper Git magic), you can look at the diffs in gitk between the selected merge point and the actual content of the archive, and you will see minor differences; just point gitk to the release branch and go from there. For releases with no clearly associated commit, I did not create a "merge", but just added the release by itself. Those diffs still show the differences between two releases, which are typically large.
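
One way to repeat the "is this git-id in the repo?" check in a clone is `git cat-file -e`, which exits with status zero only if the named object exists. The sketch below wraps that call. The function names and the `run` parameter (there so the check can be exercised without a real repository) are made up for illustration, and this is not necessarily how the list above was produced. The SHAs are the git-id values quoted above for 0.94 through 0.98.

```python
import subprocess

# git-id values as listed above for the releases that ship a git-id file
GIT_IDS = {
    "0.94": "db359d69dfe0f24613f9a8ec4f6ac3a0b3d87980",
    "0.95": "e4cc936c55d9946698abd266f673ba8c06b5e19e",
    "0.96": "2a8af325347d5e90ce874f706f5746cd0ddaac9b",
    "0.97": "eb3ca179740e6cfba696b55a999caa13369e6182",
    "0.98": "7999e72aecc3c5bc4019d43dc4697f49678cc3b4",
}

def commit_exists(sha, repo=".", run=subprocess.run):
    """Return True if `sha` names a commit present in the repository at `repo`."""
    result = run(
        # "^{commit}" asks git to resolve the name to a commit object
        ["git", "-C", repo, "cat-file", "-e", f"{sha}^{{commit}}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def missing_git_ids(git_ids, repo=".", run=subprocess.run):
    """Return the releases whose recorded git-id is not a commit in `repo`."""
    return [rel for rel, sha in git_ids.items()
            if not commit_exists(sha, repo, run)]
```

Calling missing_git_ids(GIT_IDS, repo="path/to/factor") in a full clone would list the releases whose recorded commit cannot be found there.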

If you look at the differences, you will notice that all releases differ in files

.cvskeywords
boot.image.be32
boot.image.be64
boot.image.le32
boot.image.le64

where at some point the names of the boot images change from release to release. This is because these files are in the archives only, which is good. At least it makes all boot files available in the Factor repo.
However, you will also notice quite a few other differences. These are differences between the release and the repo commit coupled to it, which means no close match for that release is present in the repo.

(I added the .cvskeywords file; it records the original CVS keywords present in source files. I then changed all source files by replacing the expanded CVS keywords with the unexpanded ones. I did this to give meaningful diffs. The original files could be recreated using .cvskeywords.)

The releases with non-trivial differences between the archive and the release are:

0.98
0.92
0.91
0.85
0.82
0.79
0.78
0.71
0.69
0.68
0.67 (very many different .png files, so probably an issue with the created release archive)
0.66

Releases with no corresponding commit can be easily spotted: they lack the additional merge arrow in gitk.

All this does show that archiving and release discipline has been somewhat lacking. It also shows that the repo has been corrupted by incorrectly forced pushes in the past.

A Git repo fix can solve most issues, but cannot recreate missing commits out of thin air.

(To be explicit about it, there are still large parts of history missing, including official release archives, but most importantly, CVS and Darcs history. These are to be located between tags common-root and first-cvs-commit, between last-cvs-commit and first-darcs-commit, and between last-darcs-commit and first-git-commit.)
