Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sector bitfield proposal #133

Closed
whyrusleeping opened this issue Feb 7, 2019 · 30 comments · Fixed by #251
Closed

sector bitfield proposal #133

whyrusleeping opened this issue Feb 7, 2019 · 30 comments · Fixed by #251
Assignees

Comments

@whyrusleeping
Copy link
Member

Problem

Filecoin miners need a compact way to specify which of a (large) set of sectors is either bad or complete. The number of sectors in the set may be in the tens of millions, and a bitset selecting a subset of them must be much much smaller than that.

If sectors are selected randomly, then it is unlikely that we can gain much over one bit per sector. However, it is very likely that miners will fill disks up one at a time, and so sectors added around the same time would likely be physically near (on the same disk). If a disk failure occurred, then an entire range of sectors would be lost, which is really easy to represent with something like run length encoding.

Proposal 0 - Simple RLE

Encode the bitset as follows:

func encodeBitset(bitsetQ) []byte {
  var out []byte
  
  var countType bool
  for !bitsetQ.Empty() {
    var count int
    for bitsetQ.Pop() == countType {
      count++
    }
    out = append(out, varint.Encode(count))
  }
  return out
} 

This works pretty well in most ideal cases, but breaks down if bits are set randomly (or in some pattern, like 011011011011).

Proposal 1 - RLE + Extension

To address the cases in which the previous proposal is inefficient, we can give the encoding the ability to specify bitfields directly. To do this, we can say that while reading the encoding, if a negative varint is parsed, that many bytes should be read and treated as the exact bitfield for the duration.

For example:

000000011111100000011111110110110001010101
would encode as
varint<7> varint<6> varint<6> varint<7> varint<-2>[]byte{108, 85}

(implementation note: it is sufficient just to implement a correct parser for this, and leave the optimization of the encoding up to the individuals)

@porcuquine
Copy link
Contributor

porcuquine commented Feb 7, 2019

I have a few questions/observations, mostly as checks of whether I understand the problem and proposals.

  • Is it a goal that this representation be efficiently updatable? (I assume yes.)
  • Is it okay to require that 'subsets' be contiguous? (I assume yes.)
  • Must all subsets be of the same size?
  • Do we need to be able to randomly index into the total set to choose subsets?

What I'm getting at is that both proposals require sequential scanning/parsing of the data. Based on my guesses about how this needs to be used, it seems that we would need some amount of metadata or additional structure. Essentially, we would chunk the total data into groups with known parity (say, always beginning with a run of zeroes — though there are other strategies). The borders of these chunks would have known indices (known how depends on the strategy). If we needed to select beginning somewhere other than at a chunk boundary, we would have to scan forward from the beginning of the chunk.

Does that sound right?

@dignifiedquire
Copy link
Contributor

Looking at http://roaringbitmap.org/ it is more complicated but the roaring+run paper (linked at the top) suggests much better compression than rle, as well as more efficient ways to get all set values.

@whyrusleeping
Copy link
Member Author

@dignifiedquire thanks for the link, though this in the summary of the paper makes me wonder

Yet there are cases where run-length encoded bitmaps are smaller than the original Roaring bitmaps—typically when the data is sorted so that the bitmaps containlong compressible runs.

@whyrusleeping
Copy link
Member Author

@porcuquine

  • sets do not have to be efficiently updateable
  • not sure what you mean by requiring subsets be contiguous
  • also not sure what you mean by subsets must be the same size
  • yes, random indexing is necessary, but some preprocessing to speed up the random indexing is totally fine.

The borders of these chunks would have known indices (known how depends on the strategy). If we needed to select beginning somewhere other than at a chunk boundary, we would have to scan forward from the beginning of the chunk.

Does that sound right?

Yeah, that seems like an approach. I'm not sure exactly what youre proposing though

@dignifiedquire
Copy link
Contributor

dignifiedquire commented Feb 7, 2019 via email

@porcuquine
Copy link
Contributor

@whyrusleeping

  • not sure what you mean by requiring subsets be contiguous
  • also not sure what you mean by subsets must be the same size

I was trying to unpack what you meant by this:

The number of sectors in the set may be in the tens of millions, and a bitset selecting a subset of them must be much much smaller than that.

I thought you were talking about a way to record the whole set, with the ability to extract (select) subsets of the whole thing.

I now think you might mean that the bitfield would only be used for the much smaller subsets. Is that right?

@whyrusleeping
Copy link
Member Author

No, i'm just saying that the bitfield to select 100 items from 1000000 should take less space to store than a 1000000 bit long bitfield.

@whyrusleeping
Copy link
Member Author

If someone could benchmark various bitfield constructions under different scenarios, that would be super helpful.

@porcuquine
Copy link
Contributor

Thanks for the clarification.

I agree that some tests on representative data should be the next step. If it looks like RLE might really be better for our use case we can spend more time fiddling with the details.

@dignifiedquire
Copy link
Contributor

I believe CONCISE is a more formal version of the suggested RLE + Extension, which we should test as well.

@dignifiedquire
Copy link
Contributor

dignifiedquire commented Feb 8, 2019

Started writing some code for this here: https://github.com/filecoin-project/bitsets

Status:

Done:

  • Formats:
    • Raw
    • Roaring
    • RLE
  • Inputs:
    • random uniform selected samples
    • 2 contiguous selections of sectors
  • Output:
    • Dumb stdout print of each result

Missing

  • Formats:
    • RLE + extension
    • CONCISE
  • Inputs:
    • Validate and/or improve generated raw data to match more closely to what we expect the inputs to be
  • Outputs:
    • Statistical analysis
    • [Bonus] Make pretty graphs

@whyrusleeping
Copy link
Member Author

@dignifiedquire if you wanted to continue with these benchmarks i wouldnt be upset ;)

@dignifiedquire
Copy link
Contributor

dignifiedquire commented Feb 21, 2019 via email

@dignifiedquire
Copy link
Contributor

Current example numbers.

random selection: (200 count)
  raw:     62000 bytes (avg)
  rle:     74724.28 bytes (avg)
  roaring: 49363.02 bytes (avg)
  concise: 128267.32 bytes (avg)
  gz:      27012.03 bytes (avg)
  zlib:    27000.03 bytes (avg)
contingous slices: (200 count)
  raw:     62000 bytes (avg)
  rle:     11.825 bytes (avg)
  roaring: 72.27 bytes (avg)
  concise: 33.92 bytes (avg)
  gz:      113.68 bytes (avg)
  zlib:    101.68 bytes (avg)

@whyrusleeping
Copy link
Member Author

@dignifiedquire how many random bits are being set in the random selection one? I'd like to see an array of say 100,000ish bits with 100 random bits set, 1000 random bits set, 10,000 random bits set.

Then i'd also like to see the same size of thing with N contiguous runs of 100 sectors, for N={1,2,3,5,10,20}

@dignifiedquire
Copy link
Contributor

dignifiedquire commented Feb 22, 2019

-- Random selections (up to 2%) --
sectors: 2700.7/10000
  raw:     1250 bytes (avg)
  rle:     3859.8 bytes (avg)
  roaring: 4653.6 bytes (avg)
  concise: 8835.2 bytes (avg)
  gz:      1053.2 bytes (avg)
  zlib:    1041.2 bytes (avg)
-- Random selections (up to 5%) --
sectors: 1167/10000
  raw:     1250 bytes (avg)
  rle:     2066.2 bytes (avg)
  roaring: 2173.2 bytes (avg)
  concise: 3812 bytes (avg)
  gz:      714 bytes (avg)
  zlib:    702 bytes (avg)
-- Random selections (up to 10%) --
sectors: 624.6/10000
  raw:     1250 bytes (avg)
  rle:     1247.3 bytes (avg)
  roaring: 1213 bytes (avg)
  concise: 2053.2 bytes (avg)
  gz:      542.5 bytes (avg)
  zlib:    530.5 bytes (avg)
-- Random selections (up to 2%) --
sectors: 28194.2/100000
  raw:     12500 bytes (avg)
  rle:     37077.8 bytes (avg)
  roaring: 15654 bytes (avg)
  concise: 92555.2 bytes (avg)
  gz:      9650.2 bytes (avg)
  zlib:    9638.2 bytes (avg)
-- Random selections (up to 5%) --
sectors: 10551.3/100000
  raw:     12500 bytes (avg)
  rle:     19773.1 bytes (avg)
  roaring: 13778.6 bytes (avg)
  concise: 34690.4 bytes (avg)
  gz:      6784.8 bytes (avg)
  zlib:    6772.8 bytes (avg)
-- Random selections (up to 10%) --
sectors: 6317.3/100000
  raw:     12500 bytes (avg)
  rle:     12747.7 bytes (avg)
  roaring: 10810.4 bytes (avg)
  concise: 20655.6 bytes (avg)
  gz:      5101.1 bytes (avg)
  zlib:    5089.1 bytes (avg)
-- Random selections (up to 2%) --
sectors: 259792.4/1000000
  raw:     125000 bytes (avg)
  rle:     361039 bytes (avg)
  roaring: 128196.6 bytes (avg)
  concise: 852353.6 bytes (avg)
  gz:      94311.4 bytes (avg)
  zlib:    94299.4 bytes (avg)
-- Random selections (up to 5%) --
sectors: 85746.7/1000000
  raw:     125000 bytes (avg)
  rle:     163321.1 bytes (avg)
  roaring: 111617.8 bytes (avg)
  concise: 281396 bytes (avg)
  gz:      57639.6 bytes (avg)
  zlib:    57627.6 bytes (avg)
-- Random selections (up to 10%) --
sectors: 46409.1/1000000
  raw:     125000 bytes (avg)
  rle:     95577.5 bytes (avg)
  roaring: 78511.6 bytes (avg)
  concise: 152282.4 bytes (avg)
  gz:      39526.4 bytes (avg)
  zlib:    39514.4 bytes (avg)
-- Multiple 2 contigous selections (each up to 2%) --
sectors: 5387.2/10000
  raw:     1250 bytes (avg)
  rle:     6.4 bytes (avg)
  roaring: 16.6 bytes (avg)
  concise: 22 bytes (avg)
  gz:      49.2 bytes (avg)
  zlib:    37.2 bytes (avg)
-- Multiple 2 contigous selections (each up to 5%) --
sectors: 1871.5/10000
  raw:     1250 bytes (avg)
  rle:     7.5 bytes (avg)
  roaring: 17.8 bytes (avg)
  concise: 22.8 bytes (avg)
  gz:      51.2 bytes (avg)
  zlib:    39.2 bytes (avg)
-- Multiple 2 contigous selections (each up to 10%) --
sectors: 976.3/10000
  raw:     1250 bytes (avg)
  rle:     8.8 bytes (avg)
  roaring: 19 bytes (avg)
  concise: 28.8 bytes (avg)
  gz:      53 bytes (avg)
  zlib:    41 bytes (avg)
-- Multiple 5 contigous selections (each up to 2%) --
sectors: 14392.4/10000
  raw:     1250 bytes (avg)
  rle:     7.2 bytes (avg)
  roaring: 17.4 bytes (avg)
  concise: 39.2 bytes (avg)
  gz:      50.3 bytes (avg)
  zlib:    38.3 bytes (avg)
-- Multiple 5 contigous selections (each up to 5%) --
sectors: 4235.3/10000
  raw:     1250 bytes (avg)
  rle:     15.3 bytes (avg)
  roaring: 25.4 bytes (avg)
  concise: 52.8 bytes (avg)
  gz:      65.2 bytes (avg)
  zlib:    53.2 bytes (avg)
-- Multiple 5 contigous selections (each up to 10%) --
sectors: 2471.4/10000
  raw:     1250 bytes (avg)
  rle:     15.5 bytes (avg)
  roaring: 26.2 bytes (avg)
  concise: 52.8 bytes (avg)
  gz:      64.4 bytes (avg)
  zlib:    52.4 bytes (avg)
-- Multiple 10 contigous selections (each up to 2%) --
sectors: 24878.2/10000
  raw:     1250 bytes (avg)
  rle:     6 bytes (avg)
  roaring: 16.2 bytes (avg)
  concise: 69.6 bytes (avg)
  gz:      46 bytes (avg)
  zlib:    34 bytes (avg)
-- Multiple 10 contigous selections (each up to 5%) --
sectors: 10435.9/10000
  raw:     1250 bytes (avg)
  rle:     14.8 bytes (avg)
  roaring: 25 bytes (avg)
  concise: 92.4 bytes (avg)
  gz:      63.6 bytes (avg)
  zlib:    51.6 bytes (avg)
-- Multiple 10 contigous selections (each up to 10%) --
sectors: 5112.2/10000
  raw:     1250 bytes (avg)
  rle:     25.4 bytes (avg)
  roaring: 35.8 bytes (avg)
  concise: 100.4 bytes (avg)
  gz:      77 bytes (avg)
  zlib:    65 bytes (avg)
-- Multiple 2 contigous selections (each up to 2%) --
sectors: 50592.8/100000
  raw:     12500 bytes (avg)
  rle:     7.8 bytes (avg)
  roaring: 23.2 bytes (avg)
  concise: 21.2 bytes (avg)
  gz:      59.9 bytes (avg)
  zlib:    47.9 bytes (avg)
-- Multiple 2 contigous selections (each up to 5%) --
sectors: 25118.2/100000
  raw:     12500 bytes (avg)
  rle:     9.2 bytes (avg)
  roaring: 20.8 bytes (avg)
  concise: 20 bytes (avg)
  gz:      63.8 bytes (avg)
  zlib:    51.8 bytes (avg)
-- Multiple 2 contigous selections (each up to 10%) --
sectors: 11531.6/100000
  raw:     12500 bytes (avg)
  rle:     9.6 bytes (avg)
  roaring: 22.8 bytes (avg)
  concise: 23.6 bytes (avg)
  gz:      65.5 bytes (avg)
  zlib:    53.5 bytes (avg)
-- Multiple 5 contigous selections (each up to 2%) --
sectors: 122273.4/100000
  raw:     12500 bytes (avg)
  rle:     9.5 bytes (avg)
  roaring: 26 bytes (avg)
  concise: 43.6 bytes (avg)
  gz:      63.2 bytes (avg)
  zlib:    51.2 bytes (avg)
-- Multiple 5 contigous selections (each up to 5%) --
sectors: 52841.7/100000
  raw:     12500 bytes (avg)
  rle:     16.5 bytes (avg)
  roaring: 31.8 bytes (avg)
  concise: 50 bytes (avg)
  gz:      78.3 bytes (avg)
  zlib:    66.3 bytes (avg)
-- Multiple 5 contigous selections (each up to 10%) --
sectors: 23563.8/100000
  raw:     12500 bytes (avg)
  rle:     19.8 bytes (avg)
  roaring: 35.4 bytes (avg)
  concise: 51.6 bytes (avg)
  gz:      86.9 bytes (avg)
  zlib:    74.9 bytes (avg)
-- Multiple 10 contigous selections (each up to 2%) --
sectors: 253926.6/100000
  raw:     12500 bytes (avg)
  rle:     7.7 bytes (avg)
  roaring: 26.6 bytes (avg)
  concise: 76.4 bytes (avg)
  gz:      60.9 bytes (avg)
  zlib:    48.9 bytes (avg)
-- Multiple 10 contigous selections (each up to 5%) --
sectors: 94972.6/100000
  raw:     12500 bytes (avg)
  rle:     17.6 bytes (avg)
  roaring: 33.4 bytes (avg)
  concise: 84.8 bytes (avg)
  gz:      79.7 bytes (avg)
  zlib:    67.7 bytes (avg)
-- Multiple 10 contigous selections (each up to 10%) --
sectors: 53349/100000
  raw:     12500 bytes (avg)
  rle:     23.9 bytes (avg)
  roaring: 39.8 bytes (avg)
  concise: 94 bytes (avg)
  gz:      92 bytes (avg)
  zlib:    80 bytes (avg)
-- Multiple 2 contigous selections (each up to 2%) --
sectors: 521459.1/1000000
  raw:     125000 bytes (avg)
  rle:     8.1 bytes (avg)
  roaring: 114.9 bytes (avg)
  concise: 22 bytes (avg)
  gz:      166.1 bytes (avg)
  zlib:    154.1 bytes (avg)
-- Multiple 2 contigous selections (each up to 5%) --
sectors: 163846.9/1000000
  raw:     125000 bytes (avg)
  rle:     12.4 bytes (avg)
  roaring: 50 bytes (avg)
  concise: 24.8 bytes (avg)
  gz:      175 bytes (avg)
  zlib:    163 bytes (avg)
-- Multiple 2 contigous selections (each up to 10%) --
sectors: 64772.9/1000000
  raw:     125000 bytes (avg)
  rle:     12.4 bytes (avg)
  roaring: 36.2 bytes (avg)
  concise: 25.6 bytes (avg)
  gz:      176.3 bytes (avg)
  zlib:    164.3 bytes (avg)
-- Multiple 5 contigous selections (each up to 2%) --
sectors: 1143903.5/1000000
  raw:     125000 bytes (avg)
  rle:     11.6 bytes (avg)
  roaring: 172.3 bytes (avg)
  concise: 39.6 bytes (avg)
  gz:      171.5 bytes (avg)
  zlib:    159.5 bytes (avg)
-- Multiple 5 contigous selections (each up to 5%) --
sectors: 568391/1000000
  raw:     125000 bytes (avg)
  rle:     19.1 bytes (avg)
  roaring: 141.6 bytes (avg)
  concise: 51.2 bytes (avg)
  gz:      185.6 bytes (avg)
  zlib:    173.6 bytes (avg)
-- Multiple 5 contigous selections (each up to 10%) --
sectors: 272885.8/1000000
  raw:     125000 bytes (avg)
  rle:     26.2 bytes (avg)
  roaring: 112.1 bytes (avg)
  concise: 50.4 bytes (avg)
  gz:      194 bytes (avg)
  zlib:    182 bytes (avg)
-- Multiple 10 contigous selections (each up to 2%) --
sectors: 2274281/1000000
  raw:     125000 bytes (avg)
  rle:     8.6 bytes (avg)
  roaring: 197.8 bytes (avg)
  concise: 80 bytes (avg)
  gz:      166.8 bytes (avg)
  zlib:    154.8 bytes (avg)
-- Multiple 10 contigous selections (each up to 5%) --
sectors: 1094097.3/1000000
  raw:     125000 bytes (avg)
  rle:     19.8 bytes (avg)
  roaring: 178.8 bytes (avg)
  concise: 83.2 bytes (avg)
  gz:      187.3 bytes (avg)
  zlib:    175.3 bytes (avg)
-- Multiple 10 contigous selections (each up to 10%) --
sectors: 527079.5/1000000
  raw:     125000 bytes (avg)
  rle:     36.4 bytes (avg)
  roaring: 164.6 bytes (avg)
  concise: 98.4 bytes (avg)
  gz:      213.5 bytes (avg)
  zlib:    201.5 bytes (avg)

@dignifiedquire
Copy link
Contributor

@whyrusleeping this should give you good overview of things (each section is the average over 10 runs)

@whyrusleeping
Copy link
Member Author

@dignifiedquire great results! thank you! which rle is that? the one I described? or something else?

@dignifiedquire
Copy link
Contributor

dignifiedquire commented Feb 23, 2019 via email

@dignifiedquire
Copy link
Contributor

dignifiedquire commented Feb 23, 2019

I implemented an improved rle, called rle+ which I really like from the characteristics:

@whyrusleeping
Copy link
Member Author

@dignifiedquire i feel like something must be wrong with the results. The 2% bits set shows rle taking up 3x the space as the data itself? 2% bits set implies an average of 50 empty bits for every set bit, which gives plenty of room to compress via rle.

@dignifiedquire
Copy link
Contributor

I agree, this does seem strange. Will investigate further.

@dignifiedquire
Copy link
Contributor

Many bugs, little wow in my code.

Better results are here now: https://gist.github.com/dignifiedquire/2ee6d850105d2706bf1db5ba6091120f

@dignifiedquire
Copy link
Contributor

rle+ still looks like a winner on all fronts though

@dignifiedquire
Copy link
Contributor

and now in a much nicer format: https://gist.github.com/dignifiedquire/76d98419f003e4015402c810af7ab400

@dignifiedquire dignifiedquire self-assigned this Apr 12, 2019
@whyrusleeping
Copy link
Member Author

Cool, i'm liking rle+. We should write a decoder and fuzz it a bit, but I think this is good enough for now. We probably want to put a marker in front of the serialized bitmap to serialize the type.

@dignifiedquire
Copy link
Contributor

Alright: decoder + tests are up and running: https://github.com/filecoin-project/bitsets/blob/master/src/rleplus.rs

@dignifiedquire
Copy link
Contributor

I also wrote a basic spec describing the format here: https://github.com/filecoin-project/bitsets/blob/master/src/rleplus.rs#L1

@whyrusleeping
Copy link
Member Author

@dignifiedquire this is great!

@whyrusleeping
Copy link
Member Author

Could we copy that specification into a document in this repo, and then also add a prefix marker to the serialized set so that we could hypothetically change it in the future?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants