WriteTo() variant that exports only the bitset #46
Is |
@db7 as I understand the GoDoc |
Yes, you're right. But you can simply make a slice and pass it to the So this...
data := make([]uint64, m / 64) // perhaps you'd like to ceil the division.
bf := bloom.From(data, k)
should be equivalent to this...
bf := bloom.New(m, k) |
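The remark about taking the ceiling of the division can be made concrete. The following is a minimal sketch, not part of the library: `wordsFor` is a hypothetical helper name, and it assumes `m` is small enough that `m + 63` does not overflow.

```go
package main

import "fmt"

// wordsFor returns how many 64-bit words are needed to hold m bits.
// It rounds up, so a trailing partial word is not dropped the way it
// would be with a plain m / 64.
// Hypothetical helper for illustration; assumes m+63 does not overflow.
func wordsFor(m uint) uint {
	return (m + 63) / 64
}

func main() {
	for _, m := range []uint{1, 64, 65, 1000} {
		fmt.Printf("m=%d bits needs %d words\n", m, wordsFor(m))
	}
}
```

With plain `m / 64`, any `m` that is not a multiple of 64 would get a slice one word too short.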
@db7 you're again referring to deserialisation or am I missing something? |
@maciej This is how we work with and serialize BFs:
data := make([]uint64, m / 64)
// store reference to data slice somewhere
// whenever we need to update or check the BF, we create a BF object with the slice.
bf := bloom.From(data, k)
bf.AddString("whatever")
bf.TestString("whatever")
...
// whenever we need to serialize the BF, we use the data slice.
buf, err := serialize(data)
To make it a bit cleaner, one could create a wrapper for the BF+data slice. It won't take more space in memory since the data slice is not copied.
type SerializableBF struct {
	*bloom.BloomFilter
	data []uint64
}
func NewSerializableBF(m int, k uint) *SerializableBF {
	// round up so that a partial trailing word is not lost
	data := make([]uint64, (m+63)/64)
	return &SerializableBF{bloom.From(data, k), data}
}
func (s *SerializableBF) Serialize() ([]byte, error) {
	// serialize s.data in the format you want as buf byte slice
	return buf, nil
}
I hope I am not completely missing your issue... and perhaps this is not the most elegant solution either. |
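The `serialize(data)` step above is left open in the comment. One possible shape for it, as a sketch only: the functions below are hypothetical, use only the standard library, and pick little-endian byte order arbitrarily. A real custom format may well differ.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// serialize packs each 64-bit word as 8 little-endian bytes.
// Hypothetical helper; the byte order is an arbitrary choice.
func serialize(data []uint64) ([]byte, error) {
	buf := make([]byte, 8*len(data))
	for i, w := range data {
		binary.LittleEndian.PutUint64(buf[8*i:], w)
	}
	return buf, nil
}

// deserialize is the inverse of serialize. It rejects buffers whose
// length is not a whole number of 64-bit words.
func deserialize(buf []byte) ([]uint64, error) {
	if len(buf)%8 != 0 {
		return nil, errors.New("buffer length is not a multiple of 8")
	}
	data := make([]uint64, len(buf)/8)
	for i := range data {
		data[i] = binary.LittleEndian.Uint64(buf[8*i:])
	}
	return data, nil
}

func main() {
	data := []uint64{1, 0xdeadbeef, 1 << 63}
	buf, _ := serialize(data)
	back, err := deserialize(buf)
	fmt.Println(len(buf), back, err)
}
```

The slice returned by `deserialize` could then be handed back to `bloom.From` together with the same `k` to rebuild the filter, as in the comment above.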
@maciej The current format saves It is a rather thin format. Admittedly using 64 bits per parameter is a bit wasteful but it does not seem like a big deal. Furthermore, if your false-positive and capacities are fixed, you can omit these parameters... We are talking about 16 bytes... which we could reduce to 8 bytes easily... If that is a large fraction of your storage... I'd be curious about your use case? Can you elaborate? I am totally open to proposing something finer. |
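To illustrate the size argument, here is a sketch of what an 8-byte parameter header could look like if each of the two parameters were stored in 32 bits instead of 64. This is not the library's format: `encodeParams` and `decodeParams` are hypothetical names, big-endian is an arbitrary choice, and the sketch assumes both values fit in a `uint32`.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeParams stores m and k as two 32-bit big-endian integers,
// giving an 8-byte header instead of 16 bytes.
// Hypothetical helper; assumes both parameters fit in 32 bits.
func encodeParams(m, k uint32) []byte {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint32(buf[0:4], m)
	binary.BigEndian.PutUint32(buf[4:8], k)
	return buf
}

// decodeParams reads the two parameters back from the header.
func decodeParams(buf []byte) (m, k uint32, err error) {
	if len(buf) < 8 {
		return 0, 0, errors.New("header shorter than 8 bytes")
	}
	return binary.BigEndian.Uint32(buf[0:4]), binary.BigEndian.Uint32(buf[4:8]), nil
}

func main() {
	hdr := encodeParams(1000, 7)
	m, k, err := decodeParams(hdr)
	fmt.Println(len(hdr), m, k, err)
}
```

If the parameters are fixed for a given deployment, the header could be dropped entirely, as the comment suggests.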
m := uint(b.GetM())
k := uint(b.GetK())
return bloom.FromWithM(b.GetBitSet().GetSet(), m, k) |
Since we are exposing the bitset (@paralin), I think that this issue is resolved. Please open a new issue if the underlying problem remains. |
Currently there is no elegant solution to export only the bloom filter bitset. That makes it difficult to engineer custom serialization formats.
I'm eager to submit a PR to change that if we could agree on the design.
Off the top of my head, we could do one of these things:
- BloomFilter.b: the simplest solution. However, it will allow developers to shoot themselves in the foot.
- BloomFilter.b.
- *BloomFilter.BisetCopy?
- *BloomFilter.WriteBitSetTo() function?
- *BloomFilter.BitSetWriter() function that would return a
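To make the `WriteBitSetTo()` option more tangible, here is a rough sketch of what such a method might look like. It is written against a hypothetical stand-in type (`bitsetHolder`), not the real `BloomFilter`, and the signature, return values, and little-endian byte order are all assumptions chosen for illustration.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// bitsetHolder is a hypothetical stand-in for a filter that owns its words.
type bitsetHolder struct {
	words []uint64
}

// WriteBitSetTo writes only the raw words to w, without any parameters,
// and returns the number of bytes written.
// The signature is an assumption, loosely modelled on io.WriterTo.
func (h *bitsetHolder) WriteBitSetTo(w io.Writer) (int64, error) {
	var total int64
	var scratch [8]byte
	for _, word := range h.words {
		binary.LittleEndian.PutUint64(scratch[:], word)
		n, err := w.Write(scratch[:])
		total += int64(n)
		if err != nil {
			return total, err
		}
	}
	return total, nil
}

// bitsetBytes is a small convenience wrapper that collects the output
// of WriteBitSetTo in memory.
func bitsetBytes(h *bitsetHolder) ([]byte, error) {
	var buf bytes.Buffer
	if _, err := h.WriteBitSetTo(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	h := &bitsetHolder{words: []uint64{1, 2, 3}}
	out, err := bitsetBytes(h)
	fmt.Println(len(out), err)
}
```

A writer-based method like this avoids exposing the internal slice, which is the foot-gun mentioned for the first option, at the cost of fixing a byte layout inside the library.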