libs/bitarray: Use []uint32 instead of []uint64 for the bits #2077
Comments
Talked to Jae a bit about this. He made the point that arithmetic operations on []uint64 are significantly faster than on []byte, because the processor can operate on 64 bits at a time instead of 8. I agree, and I don't think the space savings here (an additive constant) are worth losing a multiplicative factor in processing time. He did suggest that we consider using uint32 instead of uint64 for better JavaScript compatibility. (JavaScript doesn't have a native 64-bit integer type, so uint64s have to be handled as big ints.) He suggested we revisit this before launch if we have time.
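To make the word-size argument concrete, here is a minimal sketch of the same OR operation over both backing types. The function names `orWords` and `orBytes` are hypothetical and are not taken from libs/bitarray. The point is only that the `[]uint64` loop runs one eighth as many iterations for the same number of bits. The actual speedup depends on the compiler and hardware and is not measured here.

```go
package main

import "fmt"

// orWords ORs two bit arrays stored as 64-bit words.
// Each loop iteration combines 64 bits. (Hypothetical helper.)
func orWords(a, b []uint64) []uint64 {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	out := make([]uint64, n)
	for i := 0; i < n; i++ {
		out[i] = a[i] | b[i]
	}
	return out
}

// orBytes ORs two bit arrays stored as bytes.
// Each loop iteration combines only 8 bits. (Hypothetical helper.)
func orBytes(a, b []byte) []byte {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	out := make([]byte, n)
	for i := 0; i < n; i++ {
		out[i] = a[i] | b[i]
	}
	return out
}

func main() {
	// 128 bits take 2 iterations as words and 16 iterations as bytes.
	words := orWords(make([]uint64, 2), []uint64{0x0F, 0xF0})
	bytes := orBytes(make([]byte, 16), make([]byte, 16))
	fmt.Println(len(words), len(bytes))
}
```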
Java does not have uint64 either.
As far as I can tell, the arithmetic ops (Or, And, Not, Sub) are only being used in the consensus reactor for tracking peer state (i.e. how many and which votes and block parts peers have), and are not used in any essential way in marshalable form (i.e. over RPC, we show the bitarray string like We are starting to use bitarrays for crypto though, and for those we should use the more efficient form with bytes since we're not trying to do math. This includes multisig and the invalid txs bit array we want to put in the header.
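A byte-backed bit array of the kind suggested for the crypto use cases could look roughly like the sketch below. The type `compactBits`, its methods, and the most-significant-bit-first ordering within each byte are all assumptions made for illustration; none of them come from the existing package.

```go
package main

import "fmt"

// compactBits is a hypothetical byte-backed bit array intended for
// marshaling rather than arithmetic. Bit 0 is the most significant bit
// of data[0]; that ordering is an assumption of this sketch.
type compactBits struct {
	n    int    // number of valid bits
	data []byte // ceil(n/8) bytes of storage
}

func newCompactBits(n int) *compactBits {
	return &compactBits{n: n, data: make([]byte, (n+7)/8)}
}

// Set writes bit i and reports whether i was in range.
func (c *compactBits) Set(i int, v bool) bool {
	if i < 0 || i >= c.n {
		return false
	}
	mask := byte(1) << uint(7-i%8)
	if v {
		c.data[i/8] |= mask
	} else {
		c.data[i/8] &^= mask
	}
	return true
}

// Get returns bit i, or false if i is out of range.
func (c *compactBits) Get(i int) bool {
	if i < 0 || i >= c.n {
		return false
	}
	return c.data[i/8]&(byte(1)<<uint(7-i%8)) != 0
}

func main() {
	b := newCompactBits(10) // 10 bits fit in 2 bytes
	b.Set(0, true)
	b.Set(9, true)
	fmt.Printf("%d bytes: %08b\n", len(b.data), b.data)
}
```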
@ValarDragon Is this issue still relevant? The existing uses in the consensus engine are for "small" vectors, where space won't matter. Arguably anything that has more specific constraints should probably use its own bit vector implementation rather than depending on the TM one. There's at least one generated file in the Cosmos SDK that still cares about the protobuf representation of this file, but I don't see any meaningful use of it in a cursory GitHub search. Would anything be meaningfully impacted if we made this package internal?
Agreed, I don't think this really matters either. There are other places to get much larger space savings.
Currently we use `[]uint64` to store the bits of the bit array. Instead we should use `[]byte`, for spatial efficiency. (Plus, using `[]byte` for bit arrays is standard.)

With a `[]uint64`, the number of elements in the list will be 8 times smaller than the number of elements in a `[]byte`. This means that the prefix when amino encoded will require 3 fewer bits. (Perhaps 4 fewer bits, due to uint prefixing.)

However, if we wanted a random number of bits, we would expect on average that we would only need half of the last element to store it. So with a `[]uint64`, the last element uses 64 bits to specify what we could with 32 bits. In other words, we are wasting 4 bytes on average. With a `[]byte` we are using a single byte to represent 4 bits, which means we are wasting 4 bits on average here.

This means that if we used `[]byte` in encoding, we would save 3 bytes on average relative to the `[]uint64` case. Thus we should change the bit array to use `[]byte` instead of `[]uint64`.