Open FIXMEs in sharding #496

hsanjuan opened this issue Aug 9, 2018 · 4 comments


hsanjuan commented Aug 9, 2018

As we come closer to merging the sharding branch and officially supporting local adding through IPFS, this issue tracks the open items remaining in the sharding code that are needed to actually support sharding.

  • ipfs-cluster-ctl pin ls and status should probably filter out all shard and cluster-DAG pins, displaying only data-pins and meta-pins. Extra flags should allow displaying the others.
  • when checking a Pin of ShardType for errors, we have no way to know whether the MaxDepth should be 1 or 2 (if we added one extra level), or more (if we supported many extra levels). We should decide what to check once we know which MaxDepths are possible.
  • Sharding two files which share significant similarities may produce some shards which are identical for both. Unpinning one of the files will then unpin a shard that is still used by the other. We handled this problem by tracking Parents per shard, but this requires patching the shards when the cluster DAG is known, updating them when a parent is unpinned, etc. It is probably easier to make each shard pin "unique" to a ClusterDAG, but then an already-allocated shard cannot be re-used. We might also just load all clusterDAGs when unpinning one of them, and check that each shard is not part of any other clusterDAG, a bit like garbage collection on ipfs.
  • IPFSPinStatus type should probably make distinctions for Recursive and Depth Limited pins. This depends on how IPFS presents that.
  • IPFSPinStatus.IsPinned(depth) should return true only if the status is pinned to the given depth. Related to above.
  • ipfs-cluster-ctl should enable sharding flags for adding when it's supported
  • Is it possible to display total progress while adding? Things don't look pretty on ipfs on this front.
  • We have a 4MB limitation on the size of blocks. We should keep track of how big our shard block is and flush it before it hits the limit. There is some commented code.
  • Related to above: we should use custom IPLD block format(s) for cluster. This would allow recovering cluster DAGs from IPFS without cluster, and give us further control over the data format (currently it's a CBOR map via ipld-cbor).
  • We could dynamically adjust the shard size to the smallest available space among the allocations. But this would require using disk-informer metrics and nothing else, and either dropping the whole informer abstraction or requesting those metrics on the side. It's also not clear how useful this is, as it probably results in really small shards as space runs out and doesn't fix the fact that there is no space.
  • Our sharding dag service carries a cid.Set to avoid putting the same block on several shards because the ipfs adder tends to send the same block several times or simply because the chunker (say rabin) outputs the same blocks. This means that we keep all the sharded object CIDs in memory. Maybe there is a better way to determine if we have already processed a Cid (and skip it then), at least with some probability.
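The "flush before the 4MB limit" item above can be sketched as follows. This is a minimal illustration, not cluster's actual code: `maxBlockSize` and `countFlushes` are hypothetical names, and the real shard accumulator tracks serialized block data rather than plain sizes.

```go
package main

import "fmt"

// maxBlockSize mirrors IPFS's 4 MiB block size limit (hypothetical constant name).
const maxBlockSize = 4 * 1024 * 1024

// countFlushes simulates adding blocks of the given sizes to a shard,
// flushing (finalizing the current shard) whenever the next block would
// push the running total past the limit. Returns how many flushes happened.
func countFlushes(blockSizes []int) int {
	flushes, size := 0, 0
	for _, n := range blockSizes {
		if size+n > maxBlockSize {
			flushes++ // finalize the shard before it exceeds 4 MiB
			size = 0
		}
		size += n
	}
	return flushes
}

func main() {
	// Ten 1 MiB chunks: a shard fills up at 4 MiB, so two flushes mid-stream.
	sizes := make([]int, 10)
	for i := range sizes {
		sizes[i] = 1024 * 1024
	}
	fmt.Println(countFlushes(sizes)) // 2
}
```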
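For the last item, one classic "with some probability" replacement for the in-memory cid.Set is a Bloom filter: constant memory, at the cost of occasional false positives (a block wrongly skipped as already processed). A minimal sketch, with illustrative (untuned) sizes and string keys standing in for CIDs:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a minimal Bloom filter: a fixed-size bit set probed at k positions.
type bloom struct {
	bits []uint64
	k    int
}

func newBloom(mBits, k int) *bloom {
	return &bloom{bits: make([]uint64, (mBits+63)/64), k: k}
}

// positions derives k probe positions from one 64-bit FNV hash (double hashing).
func (b *bloom) positions(s string) []uint {
	h := fnv.New64a()
	h.Write([]byte(s))
	sum := h.Sum64()
	h1, h2 := uint(uint32(sum)), uint(uint32(sum>>32))
	m := uint(len(b.bits) * 64)
	out := make([]uint, b.k)
	for i := 0; i < b.k; i++ {
		out[i] = (h1 + uint(i)*h2) % m
	}
	return out
}

// TestAndAdd reports whether s was (probably) seen before, and records it.
func (b *bloom) TestAndAdd(s string) bool {
	seen := true
	for _, p := range b.positions(s) {
		word, bit := p/64, p%64
		if b.bits[word]&(1<<bit) == 0 {
			seen = false
			b.bits[word] |= 1 << bit
		}
	}
	return seen
}

func main() {
	f := newBloom(1<<16, 4)
	fmt.Println(f.TestAndAdd("QmExampleCid")) // false: first time, process block
	fmt.Println(f.TestAndAdd("QmExampleCid")) // true: duplicate, skip it
}
```

The trade-off matters here: a false positive means silently dropping a block, so this would only be acceptable if the filter is sized generously for the expected number of blocks.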

All of these show up as FIXMEs and TODOs in the code.

@hsanjuan hsanjuan added the topic/meta Topic meta label Aug 9, 2018
@hsanjuan hsanjuan self-assigned this Aug 9, 2018

hsanjuan commented Aug 22, 2018

  • The operations which involve the pintrackers re-queuing Pin objects (Recover, at least) must set the right PinOptions on them. Currently they just do PinCid(cid), but recovering should set MaxDepth correctly in the case of non-standard pins.
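The bug can be illustrated with simplified, hypothetical stand-ins for the real Pin type (field names here are illustrative, not cluster's actual API):

```go
package main

import "fmt"

// Pin is a hypothetical, simplified stand-in for cluster's pin type.
type Pin struct {
	Cid      string
	MaxDepth int // -1 = fully recursive; shard pins use limited depths
}

// pinCid models the buggy path: re-queuing by CID alone loses the
// original options, so MaxDepth falls back to fully recursive.
func pinCid(c string) Pin { return Pin{Cid: c, MaxDepth: -1} }

// recoverPin models the fix: re-queue the original Pin object so its
// PinOptions (including MaxDepth) are preserved.
func recoverPin(original Pin) Pin { return original }

func main() {
	shardPin := Pin{Cid: "QmShard", MaxDepth: 1}
	fmt.Println(pinCid(shardPin.Cid).MaxDepth) // -1: depth lost
	fmt.Println(recoverPin(shardPin).MaxDepth) // 1: depth preserved
}
```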


hsanjuan commented Oct 1, 2018

  • We have a problem in the sharding DagService because it is automatically wrapped in a Batch DagService by the importers, which results in: a) concurrent calls to Add (which causes concurrent map writes); b) a broken adder, since we might Finalize shards before the importer has Added() the last blocks, as it adds them asynchronously after closing a batch. I was fixing a), but it's probably worth taking this to the importers project and fixing it there. Started discussion here: Importers: DAGService should not be batched internally by default ipfs/go-unixfs#27
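The fix for a) amounts to guarding the DagService's internal map with a mutex. A minimal sketch with a hypothetical stand-in type (the real DagService stores IPLD nodes, not sizes):

```go
package main

import (
	"fmt"
	"sync"
)

// shardingDAG is a hypothetical stand-in for cluster's sharding DAGService.
// The importers may wrap it in a batching layer that calls Add concurrently,
// so the internal map must be guarded against concurrent writes.
type shardingDAG struct {
	mu     sync.Mutex
	blocks map[string]int // cid -> block size (illustrative)
}

func (d *shardingDAG) Add(cid string, size int) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.blocks[cid] = size
}

// addConcurrently fires n concurrent Adds, as a batching importer might.
func addConcurrently(d *shardingDAG, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			d.Add(fmt.Sprintf("Qm%d", i), i)
		}(i)
	}
	wg.Wait()
}

func main() {
	d := &shardingDAG{blocks: make(map[string]int)}
	addConcurrently(d, 100)
	fmt.Println(len(d.blocks)) // 100: no racing map writes
}
```

Note this only addresses a); problem b) (Finalize racing with asynchronous Adds) needs the batching behavior itself to change, hence the upstream issue.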

hsanjuan added a commit that referenced this issue Oct 19, 2018
As documented in #496, unpinning a shard which is part of multiple
cluster-DAGs was not handled.

Before, we had a back link to the ClusterDAGs (.Parents) in the shards,
but this implied patching (re-pinning) all shard objects created at the
end of the adding process when the ClusterDAGPin is created and known.

Alternatively, we could make shards unique, but this defeats the property
that if a shard is already allocated somewhere, we should not re-store its
parts on a potentially different allocation because we cannot distinguish
they are the same.

Another approach, taken here, is to count all the times shard is part of a
clusterDAG when unpinning. This requires reading all the ClusterDAGs blocks and
writing down their shards counting how many times they are referenced.
**This may not scale well**. The main part of this PR is the end-to-end
tests for sharding, which should still work regardless of the approach.

Issues with this approach: unpinning will be slow when:

* The state has many items (we need to loop over all of them and pick out ClusterDAG pins).
* The state has many clusterDAG pins (we need to read the blocks from IPFS,
  parse them, and loop over their Links(), i.e. the shards).
* The clusterDAGs have many shards. We loop the shard CIDs and put them
  into a map to count how many times they are referenced. Many shards
  will potentially make a big map.
* Unpinning multiple things affecting the same shards at the same time may conflict; might as well leave
  things pinned.

Issues with the having-a-Parents-reference-in-every-shard approach:

* We need to re-pin shards at the end of adding
* Potentially a big mess if things go wrong.
* Unpinning is easier

Other approaches:

* Counting the references in the pin (carrying a number instead of Parents)
  * Would not need shard re-pinning (pinning a shard-pin that exists just increases count + 1)
  * Would not need looping the state for unpin
  * Adds one more field to the pin type (space).

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
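The counting approach the commit describes can be sketched as follows. This is an in-memory illustration with hypothetical names; the real implementation reads clusterDAG blocks from IPFS and parses their links rather than consulting a prepared map.

```go
package main

import "fmt"

// refCounts walks every clusterDAG in the state (modeled here as a map from
// clusterDAG id to its shard CIDs) and counts how many DAGs reference each
// shard, mirroring the unpin-time scan described above.
func refCounts(clusterDAGs map[string][]string) map[string]int {
	counts := make(map[string]int)
	for _, shards := range clusterDAGs {
		for _, s := range shards {
			counts[s]++
		}
	}
	return counts
}

// canUnpinShards returns the shards of dag that no other clusterDAG uses,
// and which are therefore safe to unpin along with it.
func canUnpinShards(clusterDAGs map[string][]string, dag string) []string {
	counts := refCounts(clusterDAGs)
	var free []string
	for _, s := range clusterDAGs[dag] {
		if counts[s] == 1 { // only referenced by the DAG being unpinned
			free = append(free, s)
		}
	}
	return free
}

func main() {
	state := map[string][]string{
		"dagA": {"shard1", "shard2"},
		"dagB": {"shard2", "shard3"}, // shard2 is shared with dagA
	}
	fmt.Println(canUnpinShards(state, "dagA")) // [shard1]: shard2 stays pinned
}
```

The scaling concerns listed above are visible here: the map of counts grows with the total number of shards, and every unpin pays for a full scan of all clusterDAGs.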
hsanjuan added a commit that referenced this issue Oct 29, 2018

0zAND1z commented Oct 2, 2019

I am very much interested in the sharding support for ipfs-cluster.

From the roadmap, it looks like there is a plan to tackle this in the next 6 to 8 months.

Are there any active PRs or discourse threads working on this objective? Happy to collaborate.


hsanjuan commented Oct 3, 2019

@0zAND1z this is blocked on the fact that ipfs can only pin fully recursively or directly, but not to a certain max-depth as cluster needs. Unfortunately, a fix for this falls in go-ipfs land and is not likely to come anytime soon.
