This repository was archived by the owner on Feb 8, 2023. It is now read-only.

flatfs observations #5

@jbenet

some random collection of observations.

fanout

6k entries in top level

> ls ~/.ipfs/blocks/ | wc -l
6497

not yet sure whether this size (4 bytes of fanout) is better than nesting multiple fanout levels. we'll need real benchmarks on the various platforms to find out.

  • currently have 4 bytes of fanout, and 1 level.
  • but 2 bytes are multihash prefix (mostly useless), so only 2 bytes of fanout.
  • git uses 1 byte of fanout (and 1 level?)
  • git uses pack files for storing most objects
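to make the single-level vs nested trade-off concrete, here is a rough sketch of the arithmetic. it assumes the fanout bytes are uniformly random, and `fanoutStats` and the block count are hypothetical names and numbers used only for illustration. it is not a benchmark and says nothing about how any particular filesystem behaves.

```go
package main

import "fmt"

// fanoutStats is a hypothetical helper. For a layout with `levels` nested
// directory levels and `bytesPerLevel` uniformly random bytes per level,
// it reports:
//   - childrenPerDir: the maximum number of subdirectories in one directory
//   - leafDirs: the total number of possible leaf directories
//   - avgPerLeaf: the average number of block files per leaf for n blocks
func fanoutStats(n uint64, bytesPerLevel, levels uint) (childrenPerDir, leafDirs uint64, avgPerLeaf float64) {
	childrenPerDir = 1 << (8 * bytesPerLevel)
	leafDirs = 1 << (8 * bytesPerLevel * levels)
	avgPerLeaf = float64(n) / float64(leafDirs)
	return
}

func main() {
	const blocks = 1_000_000 // assumed repo size, for illustration only

	layouts := []struct{ bytesPerLevel, levels uint }{
		{1, 1}, // 1 byte, 1 level
		{2, 1}, // 2 random bytes, 1 level
		{1, 2}, // 1 byte per level, 2 levels
	}
	for _, l := range layouts {
		c, d, avg := fanoutStats(blocks, l.bytesPerLevel, l.levels)
		fmt.Printf("bytes/level=%d levels=%d children/dir=%d leaf dirs=%d avg files/leaf=%.1f\n",
			l.bytesPerLevel, l.levels, c, d, avg)
	}
}
```

2 bytes in 1 level and 1 byte in 2 levels give the same number of leaf directories. the difference is how many entries any single directory has to hold, which is the part that needs real benchmarks.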

multihash prefix in fanout

entries include the multihash prefix. we could split on the multihash prefix (2 bytes) first, so that there's one dir per multihash combination. that works because multihash combinations are very few (not uniform), and it would avoid repeating the prefix in every fanout dir name. so maybe:

/Users/jbenet/.ipfs/blocks/
├── 1220
│   └── 0006
│       └── 12200006a05b63b68a923a5ad57c0830459697ef8922d69fb31a3efe72e563727b24.data
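
the proposed layout could be computed roughly like this. this is a minimal sketch, not the flatfs code: `shardedPath` and both constants are hypothetical, and it assumes keys arrive hex-encoded with the 2-byte multihash prefix first, as in the tree above.

```go
package main

import (
	"fmt"
	"path/filepath"
)

const (
	prefixHexLen = 4 // 2 bytes of multihash prefix, hex-encoded (assumption)
	fanoutHexLen = 4 // 2 bytes of uniformly distributed digest, hex-encoded (assumption)
)

// shardedPath is a hypothetical helper that maps a hex-encoded key to
// <root>/<multihash prefix>/<fanout>/<full key>.data
func shardedPath(root, hexKey string) (string, error) {
	if len(hexKey) < prefixHexLen+fanoutHexLen {
		return "", fmt.Errorf("key %q is too short to shard", hexKey)
	}
	prefix := hexKey[:prefixHexLen]
	fanout := hexKey[prefixHexLen : prefixHexLen+fanoutHexLen]
	return filepath.Join(root, prefix, fanout, hexKey+".data"), nil
}

func main() {
	key := "12200006a05b63b68a923a5ad57c0830459697ef8922d69fb31a3efe72e563727b24"
	p, err := shardedPath("/Users/jbenet/.ipfs/blocks", key)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p)
}
```

the file name still carries the full key, as in the tree above, so a block can be identified without looking at its parent dirs.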

observations from @tv42

18:27 <Tv`> ipfs keys are "/", 2 bytes from multihash, and then uniform distribution
18:27 <Tv`> when i picked that value 4, that lead to a little higher than 256 way split
18:28 <Tv`> after that, a commit was added that stripped the "/" (and made assumptions i wasn't ready to make; if there ever is a key without that, it screws up; and if it's always there, what's the value of)
18:28 <Tv`> so now the sharding data is 2 non-random bytes, 2 random bytes
18:29 <Tv`> actually, let's see whether it shards before or after the skipping of "/" was added...
18:29 <Tv`> after
18:29 <Tv`> so what i said above holds, afaik
18:30 <Tv`> so by that logic, now the tunable 4 should result in up to about 65k top-level dirs in ~/.ipfs/blocks
18:30 <Tv`> which is fine, the dirs will be a little empty for small setups, which is mostly a waste of inodes, but not a huge concern
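
the "up to about 65k" figure is 256^2, the number of values 2 random bytes can take. a small sketch of that arithmetic follows, along with the expected number of non-empty dirs for a given block count. it assumes the random bytes are uniformly distributed, as described in the log above. `possibleDirs` and `expectedNonEmpty` are hypothetical names, and the block counts are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// possibleDirs returns how many distinct directory names can be formed
// from the given number of uniformly random bytes.
func possibleDirs(randomBytes uint) float64 {
	return math.Pow(256, float64(randomBytes))
}

// expectedNonEmpty returns the expected number of directories holding at
// least one block when `blocks` blocks are spread uniformly over `dirs`
// directories: dirs * (1 - (1 - 1/dirs)^blocks).
func expectedNonEmpty(dirs, blocks float64) float64 {
	if blocks == 0 {
		return 0
	}
	// -Expm1(x) computes 1 - e^x without losing precision for small x.
	return dirs * -math.Expm1(blocks*math.Log1p(-1/dirs))
}

func main() {
	dirs := possibleDirs(2) // 2 random bytes in the sharding prefix
	fmt.Printf("possible top-level dirs: %.0f\n", dirs)

	// Illustrative block counts, not measurements.
	for _, n := range []float64{1_000, 10_000, 100_000, 1_000_000} {
		fmt.Printf("blocks=%.0f expected non-empty dirs=%.0f\n", n, expectedNonEmpty(dirs, n))
	}
}
```

for small repos most of the 65k possible dirs are never created, and the ones that do exist hold about one block each, which matches the "dirs will be a little empty" remark.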
