Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libs/bitarray: Use []uint32 instead of []uint64 for the bits #2077

Closed
ValarDragon opened this issue Jul 26, 2018 · 5 comments
Closed

libs/bitarray: Use []uint32 instead of []uint64 for the bits #2077

ValarDragon opened this issue Jul 26, 2018 · 5 comments
Labels
S:proposal Status: Proposal T:enhancement Type: Enhancement T:perf Type: Performance
Milestone

Comments

@ValarDragon
Copy link
Contributor

ValarDragon commented Jul 26, 2018

Currently we use []uint64 to store the bits of the bit array. Instead we should use []byte, for spatial efficiency. (Plus using []byte for bit arrays is standard)

With an []uint64 the number of elements in the list will be 8 times less than the number of elements in a list of []byte. This means that the prefix when amino encoded will require 3 less bits. (Perhaps 4 less bits, due to uint prefixing).

However if we wanted a random number of bits, we would expect on average that we would only need half of the last element to store it. So with an []uint64, this means that the last element uses 64 bits to specify what we could with 32 bits. Or in other words, we are wasting 4 bytes on average. With an []byte we are using a single byte to represent 4 bits, which means we are wasting 4 bits on average here.

This means that if we used []byte in encoding, we would save 3 bytes on average relative to the []uint64 case. Thus we should change the bit array to use []byte instead of []uint64.

  • Note, I am assuming that there is no per-element bytes added for every element of the slice. I vaguely recall that such a thing may exist due to protobuf3 compatibility. If this is the case, we should keep on doing what we are doing, and just write this reasoning in the godoc or in an ADR.
@ValarDragon ValarDragon added T:perf Type: Performance S:proposal Status: Proposal labels Jul 26, 2018
@xla xla added the T:enhancement Type: Enhancement label Jul 27, 2018
@xla xla added this to the post-launch milestone Jul 27, 2018
@ValarDragon
Copy link
Contributor Author

Talked to Jae a bit about this. He made the point that arithmetic operations using []uint64's are significantly faster than []byte, due to the processor being able to perform arithmetic 64 bits at a time, as opposed to 8. I do agree, and I don't think the spatial efficiency gains here (an additive constant) is worth the loss of a multiplicative factor for processing time gains.

He did suggest that we consider using uint32 instead of uint64 for better javascript compatibility. (Javascript doesn't have a 64 bit object, so uint64s are big ints) He had suggested this as something we relook at soon before launch if we have time.

@xla xla changed the title cmn/bitarray: Use []bytes instead of []uint64 for the bits. cmn/bitarray: Use []uint32 instead of []uint64 for the bits. Jul 27, 2018
@xla xla changed the title cmn/bitarray: Use []uint32 instead of []uint64 for the bits. cmn/bitarray: Use []uint32 instead of []uint64 for the bits Jul 27, 2018
@melekes
Copy link
Contributor

melekes commented Aug 1, 2018

Java does not have uint64 too.

@ebuchman
Copy link
Contributor

As far as I can tell, the arithmetic ops (Or, And, Not, Sub) are only being used in the consensus reactor for tracking peer state (ie. how many and which votes and block parts peers have), and are not used in any essential way in marshalable form (ie over RPC, we show the bitarray string like __xx_xxx__), so it shouldn't be an issue for javascript. Please correct me if I'm wrong.

We are starting to use bitarrays for crypto though and for those we should use the more efficient form with bytes since we're not trying to do math. This includes multisig and the invalid txs bit array we want to put in the header.

@tessr tessr changed the title cmn/bitarray: Use []uint32 instead of []uint64 for the bits libs/bitarray: Use []uint32 instead of []uint64 for the bits Nov 16, 2020
@tessr tessr added this to Untriaged 🥚 in Tendermint Core Project Board via automation Nov 16, 2020
@tessr tessr moved this from Untriaged 🥚 to Icebox 🧊 in Tendermint Core Project Board Nov 16, 2020
@creachadair
Copy link
Contributor

@ValarDragon Is this issue still relevant? The existing uses in the consensus engine are for "small" vectors, where space won't matter. Arguably anything that has more specific constraints should probably use its own bit vector implementation rather than depending on the TM one.

There's at least one generated file in the Cosmos SDK that still cares about the protobuf representation of this file, but I don't see any meaningful use of it in a cursory GitHub search. Would anything be meaningfully impacted if we made this package internal?

@ValarDragon
Copy link
Contributor Author

ValarDragon commented Jan 31, 2022

Agreed, I don't think this really matters either. Theres other places to get much larger space savings

Tendermint Core Project Board automation moved this from Icebox 🧊 to Done 🎉 Jan 31, 2022
firelizzard18 pushed a commit to AccumulateNetwork/tendermint that referenced this issue Feb 1, 2024
Nit: upgrade alpine linux in localnode Dockerfile
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S:proposal Status: Proposal T:enhancement Type: Enhancement T:perf Type: Performance
Projects
No open projects
Development

No branches or pull requests

5 participants