This repository has been archived by the owner on Aug 24, 2021. It is now read-only.

Wrong entry ordering in tree serialization #44

Closed
osarrouy opened this issue Mar 20, 2019 · 6 comments

@osarrouy

Hi everyone,

I'm working on pando - a decentralized git-remote based on IPFS, Ethereum and Aragon - and I think I've spotted an error in how this resolver handles tree serialization.

If you have, say, a git repo containing both a test.md file and a test folder, as in this example repo, git orders them this way:

git cat-file -p 66fdaafa5fac5abca298f9822bd2071d0dc258d2
100644 blob 557db03de997c86a4a028e1ebd3a1ceb225be238    test.md
040000 tree d564d0bc3dd917926892c55e3706cc116d5b165e    test

Now js-ipld-git orders them the other way around:

{"test":{"hash":{"/":"z8mWaJC5XCFN365khYYEdFVkdHVnMMJ37"},"mode":"40000"},
"test.md":{"hash":{"/":"z8mWaGQjCphPSZpWxcC2gH3KeoCXCp1HH"},"mode":"100644"}}

This leads to an error where the result of the shaToCid function and the CID of the actually serialized tree object are not the same - thus breaking the whole commit tree.
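
To illustrate why the ordering matters, here is a minimal sketch (not js-ipld-git code) that hashes the two entries above in both orders. It assumes git's tree layout of a `tree <length>\0` header followed by `<mode> <name>\0<20 raw hash bytes>` per entry; `treeSha1` is a hypothetical helper name. In a real repository the first value would be expected to correspond to the id git reports for the tree, but that is not verified here.

```javascript
const crypto = require('crypto')

// Serialize entries in the order given and return the SHA-1 of the
// resulting tree object as a hex string.
function treeSha1 (entries) {
  const body = Buffer.concat(entries.map(({ mode, name, sha }) =>
    Buffer.concat([
      Buffer.from(`${mode} ${name}\0`, 'utf8'), // "<mode> <name>\0"
      Buffer.from(sha, 'hex') // 20 raw bytes of the entry's hash
    ])
  ))
  const header = Buffer.from(`tree ${body.length}\0`, 'utf8')
  return crypto.createHash('sha1').update(header).update(body).digest('hex')
}

const file = { mode: '100644', name: 'test.md', sha: '557db03de997c86a4a028e1ebd3a1ceb225be238' }
const dir = { mode: '40000', name: 'test', sha: 'd564d0bc3dd917926892c55e3706cc116d5b165e' }

const gitOrder = treeSha1([file, dir]) // order shown by git cat-file
const swappedOrder = treeSha1([dir, file]) // order produced by the resolver

console.log(gitOrder)
console.log(swappedOrder)
console.log(gitOrder === swappedOrder) // false: same entries, different bytes
```

Because the hash covers the serialized bytes, any change in entry order yields a different object id, and therefore a different CID.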

This comes from the sorting function implemented here.

Do you think you guys could have a look or do you prefer me to dive deeper in it and open a PR?

@vmx
Member

vmx commented Mar 20, 2019

@magik6k Could you have a look at this issue, or help @osarrouy with the PR?

@magik6k
Contributor

magik6k commented Mar 20, 2019

So git entries have to be sorted, but the sorting function appears to be weird:

$ git cat-file -p 9d844d66513148068ef778f7beb593fd907b6f21
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	a
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	a.a
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	aa
040000 tree 496d6428b9cf92981dc9495211e6e1120fb6f2ba	b
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	c
040000 tree 496d6428b9cf92981dc9495211e6e1120fb6f2ba	d
100644 blob 557db03de997c86a4a028e1ebd3a1ceb225be238	test.md
040000 tree d564d0bc3dd917926892c55e3706cc116d5b165e	test

(note how a.a is after a, but test.md is before test)

@magik6k
Contributor

magik6k commented Mar 20, 2019

Might have found the issue - from http://git.661346.n2.nabble.com/In-tree-object-Must-the-td7446900.html#a7447657

Yes, the entries in the tree are alpha-sorted. The exception are trees,
where you have to pretend that there is a trailing slash. In other
words, the order is the same as you'd see in the index (as there, the
test/ directory in your example would be stored with a slash and the
name of the subdirs and files in it).

@osarrouy
Author

@magik6k That does sound like the source of the issue.

What would be the best solution, then? Given how the sorting function is implemented here, maybe the best way would be to check if mode === '040000' and add a trailing slash to the entry name on a copy of the entry mapping before running the comparison function?
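
A rough sketch of that idea, assuming names are compared byte-wise and that tree entries carry mode '40000' (as in the JSON above) or '040000' (as printed by git cat-file). `sortKey` and `gitTreeOrder` are hypothetical names, not part of js-ipld-git, and the entries are taken from the listing earlier in the thread:

```javascript
// Name that a tree entry is compared by: trees get an implied trailing '/'.
function sortKey (name, mode) {
  const isTree = mode === '40000' || mode === '040000'
  return Buffer.from(isTree ? name + '/' : name, 'utf8')
}

// Return the entry names of a { name: { mode, ... } } mapping in serialization
// order. The sort keys live in a temporary array, so the mapping is not mutated.
function gitTreeOrder (entries) {
  return Object.keys(entries)
    .map((name) => ({ name, key: sortKey(name, entries[name].mode) }))
    .sort((a, b) => Buffer.compare(a.key, b.key)) // byte-wise comparison
    .map(({ name }) => name)
}

const entries = {
  test: { mode: '40000' },
  'test.md': { mode: '100644' },
  d: { mode: '40000' },
  c: { mode: '100644' },
  b: { mode: '40000' },
  aa: { mode: '100644' },
  'a.a': { mode: '100644' },
  a: { mode: '100644' }
}

console.log(gitTreeOrder(entries))
// [ 'a', 'a.a', 'aa', 'b', 'c', 'd', 'test.md', 'test' ]
```

With the slash appended, 'test/' sorts after 'test.md' because '.' (0x2e) is lower than '/' (0x2f), while 'a' still sorts before 'a.a' because it is a plain file and gets no slash.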

How would you like to proceed? Should I let you update the sorting function, or do you want me to push a PR?

@magik6k
Contributor

magik6k commented Mar 20, 2019

How would you like to proceed

If you have some code, open a PR with it; it will be faster/easier to iterate on that.

@osarrouy
Author

I don't have any yet. I will try to work on it tonight or tomorrow, and then open a PR.
