Naming blocks in the datastore #242

Open
Stebalien opened this issue Mar 12, 2020 · 16 comments
Labels
P1 High: Likely tackled by core team if no one steps up

Comments

@Stebalien
Member

Stebalien commented Mar 12, 2020

Context: https://github.com/ipfs/ipfs/issues/337

Currently, both js-ipfs and go-ipfs index blocks in the datastore by CID. Unfortunately, it's possible for a single block to have multiple CIDs due to (a) different CID versions and (b) different multicodecs (e.g., dag-cbor vs. cbor vs. raw).

The primary concern is (a). As we switch to returning CIDv1 (for multibase support), we still want to be able to look up blocks that were fetched/added as CIDv0.

Currently, when looking up a CID, both go-ipfs and js-ipfs will first attempt to look up the block under the original CID, then under the other CID version. However, this costs us two look-ups.
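
For concreteness, the fallback looks roughly like this (an illustrative sketch only: the lookup callback is hypothetical, not the actual go-ipfs/js-ipfs blockstore API, though the go-cid calls are real):

```go
package sketch

import (
	cid "github.com/ipfs/go-cid"
	mh "github.com/ipfs/go-multihash"
)

// otherVersion derives the alternate CID version for c, when one exists.
func otherVersion(c cid.Cid) (cid.Cid, bool) {
	p := c.Prefix()
	switch c.Version() {
	case 0:
		// A CIDv0 is always dag-pb + sha2-256; its v1 form just adds the prefix.
		return cid.NewCidV1(cid.DagProtobuf, c.Hash()), true
	case 1:
		// Only dag-pb/sha2-256 blocks have a v0 form.
		if p.Codec == cid.DagProtobuf && p.MhType == mh.SHA2_256 && p.MhLength == 32 {
			return cid.NewCidV0(c.Hash()), true
		}
	}
	return cid.Undef, false
}

// getWithFallback tries the requested CID first, then the alternate version:
// the "two look-ups" cost described above.
func getWithFallback(get func(cid.Cid) ([]byte, bool), c cid.Cid) ([]byte, bool) {
	if data, ok := get(c); ok {
		return data, true
	}
	if alt, ok := otherVersion(c); ok {
		return get(alt)
	}
	return nil, false
}
```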


Primary use-case:

Ensure that CIDv1 and CIDv0 can be used interchangeably, especially in the gateway.

Proposals:

  1. Index blocks by multihash only.
  2. Index blocks by CIDv1.
  3. Double indirection: map each CID to a locally chosen hash, which in turn maps to the block.
  4. Store extra metadata along with the block (known codecs, refcounts, etc.).
    • Unclear how to do this efficiently with a datastore (would likely require a bunch of separate writes/keys).

Desired properties (the numbers indicate which proposals provide each property):

  1. No duplicate blocks when...
    a. CID Versions differ: 1, 2 & 3
    b. CID Codecs differ: 1 & 3.
    c. Hash functions differ: 3.
  2. No discarded structural information (keep codecs): 2 & 3.
  3. Viable migration path: 1, 2 & 3.
  4. Fast migration path: 1 & 3.
  5. Zero Overhead:
    a. Time: 1, 2, & 4.
    b. Space: 1 & 2.

The current consensus is option 1 (multihash) (ipfs/kubo#6815, ipfs/js-ipfs#2415). The proposal here is to consider option 2 (and at least look at the others).
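
To make the trade-off concrete, here is a hedged sketch of what options 1 and 2 mean for datastore keys. The actual go-ipfs key encoding differs; the `/blocks/` prefix and base32 alphabet here are illustrative only:

```go
package sketch

import (
	"encoding/base32"

	cid "github.com/ipfs/go-cid"
)

// Option 1: key by multihash only. The codec is discarded; every CID with the
// same hash maps to the same key.
func keyByMultihash(c cid.Cid) string {
	return "/blocks/" + base32.RawStdEncoding.EncodeToString(c.Hash())
}

// Option 2: key by the CIDv1 form. CIDv0 and CIDv1 collapse to one key, but
// CIDs that differ only in codec (e.g. dag-cbor vs. raw) stay distinct.
func keyByCidV1(c cid.Cid) string {
	v1 := c
	if c.Version() == 0 {
		v1 = cid.NewCidV1(cid.DagProtobuf, c.Hash())
	}
	return "/blocks/" + base32.RawStdEncoding.EncodeToString(v1.Bytes())
}
```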

However, option 1 doesn't give us property 2, as it discards the codec on write. @jbenet has objected strongly to this, as we would be discarding structural information about the data. This information will still be stored in the pin set, but we'll lose it for unpinned data.

We have also run into this same issue when trying to write a reverse datastore migration, migrating back from multihashes to CIDs: we need to somehow recover the codecs. The reverse migration in Option 2 would simply be: do nothing.

We need to consider the pros/cons of switching to option 1 before proceeding.


Side note: why should we even care about (b), multiple codecs for the same block?

IPLD objects as "files"

  • I might want to add one or more CBOR-IPLD objects to an IPFS directory for legacy applications, and address them in IPLD.
  • I might want to address a yaml/json object in an IPFS directory as an IPLD object.

Dumb block transport

I might want to take an unbalanced DAG (e.g., a blockchain), a DAG with unusual codecs (git, eth, etc.), and so on, and sync it with another node. It would be nice if I could take such a DAG, treat all the nodes as raw nodes, and then build a well-balanced "overlay DAG" using simple, well-supported codecs.

This might, for example, be useful for storing DAGs with custom codecs in pinning services.
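
A minimal sketch of that re-addressing step, assuming sha2-256 and the raw codec (the function name is made up; only the go-cid/go-multihash calls are real):

```go
package sketch

import (
	cid "github.com/ipfs/go-cid"
	mh "github.com/ipfs/go-multihash"
)

// asRawCid re-addresses an arbitrary block payload as a raw-codec CIDv1, so a
// generic overlay DAG can reference it without understanding the original codec.
func asRawCid(block []byte) (cid.Cid, error) {
	h, err := mh.Sum(block, mh.SHA2_256, -1)
	if err != nil {
		return cid.Undef, err
	}
	return cid.NewCidV1(cid.Raw, h), nil
}
```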

@achingbrain
Member

achingbrain commented Mar 13, 2020

If we were to start storing metadata with a block, part of that could be whether it's pinned (and how - direct, indirect, multiples thereof, etc), which would make the pinning code a lot simpler and (arguably) gc way faster as it would be trivial to parallelise.

@hsanjuan
Contributor

If we were to start storing metadata with a block, part of that could be whether it's pinned (and how - direct, indirect, multiples thereof, etc), which would make the pinning code a lot simpler and (arguably) gc way faster as it would be trivial to parallelise.

That makes pinning maintenance horrible because every time you unpin something you need to check if the item is pinned by something else, update all the items etc.

@hsanjuan
Contributor

I would like to stress that proposal 1 does not change much from the current defaults. We are effectively indexing by multihash already when using CIDv0.

If we were indexing using CIDv1s, we would also have a situation where the codecs don't give us meaningful information for most of the blocks in the store (raw leaves, dag-pb chunk leaves).

In the end, any content used on IPFS needs an address; that's the CID, and you can derive everything from there. I don't see much value in being able to know the codec for every block in the datastore, especially since we already have a list of root CIDs stored separately that we can derive information from (the pinset/MFS).

So what do we expect to get from having that structural information for everything? If we choose that path, it should be because it is needed for a very specific feature that we want to support (option 1 is there to support the switch to base32).
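
As an illustration of "effectively indexing by multihash already": a CIDv0's binary form is exactly its multihash, so for dag-pb/sha2-256 blocks the two keying schemes coincide (a small standalone example using go-cid and go-multihash, not code from either implementation):

```go
package main

import (
	"bytes"
	"fmt"

	cid "github.com/ipfs/go-cid"
	mh "github.com/ipfs/go-multihash"
)

func main() {
	h, _ := mh.Sum([]byte("hello"), mh.SHA2_256, -1)

	// A CIDv0's byte representation is the bare multihash.
	v0 := cid.NewCidV0(h)
	fmt.Println(bytes.Equal(v0.Bytes(), []byte(h))) // true

	// The CIDv1 form of the same block prepends version and codec bytes.
	v1 := cid.NewCidV1(cid.DagProtobuf, h)
	fmt.Println(bytes.Equal(v1.Bytes(), []byte(h))) // false
}
```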

@rklaehn

rklaehn commented Mar 16, 2020

Why not store the blobs by raw multihash and have an additional table that holds just the CIDs? That way you don't burden the blob store with unnecessary information but still retain all the metadata.

@achingbrain
Member

That makes pinning maintenance horrible because every time you unpin something you need to check if the item is pinned by something else, update all the items etc.

At the moment, when we want to GC something, we have to check every direct pin and every child of every recursive pin to see if the block is pinned, which seems quite similar to the above.

My assumption is that GC operations will run more frequently than unpinning so it might make sense to optimise for that use case.

@ianopolous
Member

My assumption is that GC operations will run more frequently than unpinning so it might make sense to optimise for that use case.

In our usage, unpinning is much, much more common than GC. We periodically call GC, whereas every time any user modifies any file they pin their new root (with a pin update) and then unpin the old one.

@Stebalien
Member Author

@rklaehn

Why not store the blobs by raw multihash and have an additional table that holds just the CIDs? That way you don't burden the blob store with unnecessary information but still retain all the metadata.

That's a variant of solution 4. The concern there is that we'd now have multiple writes under different keys, every time we write a block.
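
Concretely, that variant would look something like the following (a sketch against a hypothetical key/value interface, not the real go-datastore API; key prefixes are made up):

```go
package sketch

import (
	"encoding/base32"

	cid "github.com/ipfs/go-cid"
)

// kv stands in for whatever datastore/transaction API would actually be used.
type kv interface {
	Put(key string, value []byte) error
}

// putBlock stores the payload under its multihash and records the CID in a
// side table: two writes under two different keys for every block.
func putBlock(store kv, c cid.Cid, data []byte) error {
	mhKey := "/blocks/" + base32.RawStdEncoding.EncodeToString(c.Hash())
	cidKey := "/cids/" + c.String()

	if err := store.Put(mhKey, data); err != nil {
		return err
	}
	// Empty value: the key itself carries the codec information.
	return store.Put(cidKey, nil)
}
```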


@achingbrain

If we were to start storing metadata with a block, part of that could be whether it's pinned (and how - direct, indirect, multiples thereof, etc), which would make the pinning code a lot simpler and (arguably) gc way faster as it would be trivial to parallelise.

I agree we'll need to do something like this eventually, but ideally only for pinned blocks. That is, I'm more fine with having this kind of overhead when pinning, but less fine with it when just adding random blocks (though maybe it's still fine there).


@hsanjuan

If we were to start storing metadata with a block, part of that could be whether it's pinned (and how - direct, indirect, multiples thereof, etc), which would make the pinning code a lot simpler and (arguably) gc way faster as it would be trivial to parallelise.

That makes pinning maintenance horrible because every time you unpin something you need to check if the item is pinned by something else, update all the items etc.

There are ways to optimize this. When we unpin, we'd need to traverse all newly-unreferenced blocks, and when we pin, we'd need to traverse all newly-referenced blocks. However, we can probably do this asynchronously by recording the work we need to do in the datastore and making sure we work through the backlog before we GC.
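
Roughly, the asynchronous bookkeeping could be structured like this (made-up types; just a sketch of persisting a backlog and draining it before GC runs):

```go
package sketch

import "context"

// pinTask records a pin or unpin whose reference updates haven't been applied yet.
type pinTask struct {
	op   string // "pin" or "unpin"
	root string // CID of the (un)pinned root, as a string
}

// pinQueue is a hypothetical backlog persisted in the datastore.
type pinQueue interface {
	enqueue(ctx context.Context, t pinTask) error // called from pin/unpin
	drain(ctx context.Context) error              // traverse roots, update refcounts
}

// gc refuses to sweep until the backlog has been worked through, so the
// reference information it relies on is consistent.
func gc(ctx context.Context, q pinQueue, sweep func(context.Context) error) error {
	if err := q.drain(ctx); err != nil {
		return err
	}
	return sweep(ctx)
}
```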

@Stebalien
Member Author

If we were indexing using CIDv1s, we would also have a situation where the codecs don't give us meaningful information for most of the blocks in the store (raw leaves, dag-pb chunk leaves).

I agree for dag-pb. I think the main concern here is that, when we start getting more and more CBOR blocks, CIDs start becoming more useful.

However, in my personal opinion, everything that is unpinned is just "cached" and being able to enumerate a cache is not a desirable feature.

@hsanjuan
Contributor

However, in my personal opinion, everything that is unpinned is just "cached" and being able to enumerate a cache is not a desirable feature.

Put another way, we could support an "everything-pinned" mode, i.e. an additional pinset that everything written goes into. It would not necessarily need to be enabled by default, only for the cases where cache enumeration is important. Moving to a datastore-backed pinset should enable us to do this more or less easily.

I guess what I mean is that we can separate this into two problems and solve the problem of cache enumeration on top of storing raw multihashes, and do it at a later point in time, rather than now (and that complexity may be better managed with that approach).

@Stebalien
Member Author

I guess what I mean is that we can separate this into two problems and solve the problem of cache enumeration on top of storing raw multihashes, and do it at a later point in time, rather than now (and that complexity may be better managed with that approach).

I agree. This would actually be really nice because we'd be able to GC better. That is, we'd be able to:

  1. Look at the "cache pins".
  2. Determine which pin we can remove to free the most space (i.e., which pin has the most unique data).
  3. Remove that pin.

@hsanjuan
Contributor

hsanjuan commented Apr 8, 2020

I think this is stale at this point: we have surfaced the concerns and identified how to potentially address them later as needed. Can we add this to the 0.6.0 milestone?

@Stebalien
Member Author

Resolution

  1. Store blocks via multihash (proposal 1).
    • Cut a release that reads blocks using the "raw multihash", regardless of codec. That way, users can downgrade to this release and everything will "just work" even if the reverse migration doesn't catch everything.
    • Cut another release with the actual migration.
  2. Create a better way to manage local files:
    • Use MFS by default.
    • Possibly a history mechanism.
  3. Let the rust-ipfs folks explore this design space.

@ShadowJonathan

ShadowJonathan commented Aug 9, 2020

My opinion: since I started getting involved with IPFS, CIDs were always touted as content addresses: an interpretive codec followed by a raw data identifier. My assumption was that CIDs only existed above the actual storage layer and would mostly be used for "user-facing" data-address resolution and interpretation.

Using CIDs in the blockstore would attach unnecessary "addressing flair" (the codec, plus future additions to CIDvX) to the raw data, and the same raw data could exist in duplicate under different CIDs with the same internal multihash.

My assumption was thus that multihashes were used from the start, as any other alternative would seem unreasonable given how the whole of IPFS works. I was surprised to see that CIDs were used instead, so my opinion is to immediately make plans to migrate this. I'm honestly a bit confused as to why this was implemented this way in the first place.

@Stebalien
Member Author

I'm honestly a bit confused as to why this was implemented this way in the first place.

There was a desire to not lose information. That is, being able to list and understand all blocks in a datastore as structured data is nice. See the first post for the "desired properties".

On the other hand, I completely agree that referencing by multihash at this layer makes the most sense.

@ShadowJonathan

There was a desire to not lose information. That is, being able to list and understand all blocks in a datastore as structured data is nice. See the first post for the "desired properties".

If that's true, then an extra list or "store" of the CID keys ever seen across the whole client (and network) could help with this problem. CIDs could then act as "headers" or indexes, the way filesystems work today, while the raw multihash-referenced blocks act as the "raw data" on "disk".

If all CIDs seen across the network were stored, a protocol could later be specified that "repairs" or performs a "reverse lookup" over a raw set of blocks to work out how the data fit together, with the help of the network: asking other nodes (and possibly the DHT) whether they have seen CIDs containing a given multihash.

@lidel
Member

lidel commented Dec 9, 2021

go-ipfs 0.12 is when the low-level datastore switches to multihash keys: ipfs/kubo#6816 / ipfs/kubo#8344

Do we have any specs or docs that need updating, or can this issue be closed?
