math/bits: an integer bit twiddling library #18616

Closed
brtzsnr opened this Issue Jan 11, 2017 · 166 comments

Comments

Projects
None yet
Contributor

brtzsnr commented Jan 11, 2017 edited

Previous discussions at #17373 and #10757.

Abstract

This proposal introduces a set of API for integer bit twiddling.

Background

This proposal introduces a set of API for integer bit twiddling. For this proposal we are interested in the following functions:

  • ctz - count trailing zeros.
  • clz - count leading zeros; log_2.
  • popcnt - count population; hamming distance; integer parity.
  • bswap - reverses the order of bytes.

These functions were picked by surveying:

We limited ourselves to these four functions because other twiddling
tricks are very simple to implement using the proposed library,
or already available Go constructs.

We found implementations for a subset of the selected twiddling functions
in many packages including runtime, compiler and tools:

package clz ctz popcnt bswap
math/big X X
runtime/internal/sys X X X
tools/container/intsets X X
cmd/compile/internal/ssa X X
code.google.com/p/intmath X X
github.com/hideo55/go-popcount X (asm)
github.com/RoaringBitmap/roaring X X (asm)
github.com/tHinqa/bitset X X X
github.com/willf/bitset X X (asm)
gopl.io/ch2/popcount X
GCC builtins X X X X

Many other packages implement a subset of these functions:

Similarly hardware providers have recognized the importance
of such functions and included machine level support.
Without hardware support these operations are very expensive.

arch clz ctz popcnt bswap
AMD64 X X X X
ARM X X ? X
ARM64 X X ? X
S390X X X ? X

All bit twiddling functions, except popcnt, are already implemented by runtime/internal/sys and receive special support from the compiler in order to "to help get the very best performance". However, the compiler support is limited to the runtime package and other Golang users have to reimplement the slower variant of these functions.

Proposal

We introduce a new std library math/bits with the following external API, to provides compiler / hardware optimized implementations of clz, ctz, popcnt and bswap functions.

package bits

// SwapBytes16 reverses the order of bytes in a 16-bit integer.
func SwapBytes16(uint16) uint16
// SwapBytes32 reverses the order of bytes in a 32-bit integer.
func SwapBytes32(uint32) uint32
// SwapBytes64 reverses the order of bytes in a 64-bit integer.
func SwapBytes64(uint64) uint64

// TrailingZeros32 counts the number of trailing zeros in a 32-bit integer, and if all are zero, then 32.
func TrailingZeros32(uint32) uint
// TrailingZeros64 counts the number of trailing zeros in a 64-bit integer, and if all are zero, then 64.
func TrailingZeros64(uint64) uint

// LeadingZeros32 counts the number of trailing zeros in a 32-bit integer, and if all are zero, then 32.
func LeadingZeros32(uint32) uint
// LeadingZeros64 counts the number of trailing zeros in a 64-bit integer, and if all are zero, then 64.
func LeadingZeros64(uint64) uint

// Ones32 counts the number of bits set in a 32-bit integer.
func Ones32(uint32) uint
// Ones64 counts the number of bits set in a 64-bit integer.
func Ones64(uint64) uint

Rationale

Alternatives to this proposal are:

  • Bit twiddling functions are implemented in an external library not supported by the compiler. This approach works and is the current state of affairs. Runtime uses the compiler supported methods, while Golang users continue to use the slower implementations.
  • The external library is supported by the compiler. Given that we expect this library to replace runtime/internal/sys it means that this library must be in lock with the compiler and live in the standard library.

Compatibility

This proposal does not change or breaks any existing stdlib API and it conforms to compatibly guidelines.

Implementation

SwapBytes, TrailingZeros and LeadingZeros are already implemented. The only missing function is Ones which can be implemented similarly to the other functions. If this proposal is accepted it can be implemented in time for Go1.9.

Open issues (if applicable)

Names are hard, bike shed is in the comments.

Please suggest additional functions to be included in the comments. Ideally, please include where such function is used in stdlib (e.g. math/big), tools or popular packages.

So far the following functions have been proposed and are under consideration:

  • RotateLeft / RotateRight
    • Pros: &63 is no longer need to compile a rotate with non-const argument to a single instruction on x86, x86-64..
    • Cons: very short/simple to implement and inline.
    • Used: crypto/ uses the constant rotate which is handled properly by the compiler.
  • ReverseBits
    • Pros: ?
    • Cons: ?
    • Used: ?
  • Add/Sub with carry return
    • Pros: Expensive otherwise
    • Cons: ?
    • Used: math/big

History

14.Jan: Clarified the output of TrailingZeros and LeadingZeros when the argument is 0.
14.Jan: Renamed methods: CountTrailingZeros -> TrailingZeros, CountLeadingZeros -> LeadingZeros, CountOnes -> Ones.
13.Jan: Fixed architecture name.
11.Jan: Initial proposal opened to public.

cespare added the Proposal label Jan 11, 2017

Contributor

cespare commented Jan 11, 2017

@brtzsnr perhaps you should submit this document to the proposal repo as outlined in the proposal process steps?

Since it's already markdown following the template, it should be easy to copy-paste into a CL creating a file design/18616-bit-twiddling.md (or whatever).

Contributor

brtzsnr commented Jan 11, 2017

@cespare from https://github.com/golang/proposal "if the author wants to write a design doc, then they can write one". It started as a design doc, if there is strong feeling that I should submit this, I'm totally fine.

Contributor

griesemer commented Jan 11, 2017

I'd be ok with this, it's common enough functionality used in many algorithmic libraries and math/bits seems like an appropriate place.

(For one, math/big also implements nlz (== clz).)

There's probably some bike shedding about the names. I for one would prefer the functions to say what they return rather than what they do; which in turn may lead to shorter names. For instance:

bits.TrailingZeros64(x) rather than bits.CountTrailingZeros64(x)

and so forth.

Contributor

griesemer commented Jan 11, 2017 edited

The proposal seems pretty clear and minimal - a design doc seems overkill. I think a CL would be more appropriate at this point.

(That is a CL with API and basic implementation - for purpose of discussion in place of a design doc. We still need to decide if this proposal should be accepted or not.)

Contributor

cespare commented Jan 11, 2017

@brtzsnr has already written the design document: it's in the issue description and it follows the the template. I assumed that there was some value in having these documents all in one location.

Contributor

cespare commented Jan 11, 2017

The last arch listed in the hardware support table is "BSWAP" -- typo?

Contributor

josharian commented Jan 12, 2017

Thanks for writing this up.

The doc string for ctz and clz should specify the result when passed 0.

I also prefer (e.g.) TrailingZeros32 to CountTrailingZeros32. I'd also be happy with Ctz32. It is concise, familiar to most, and easily googleable for the rest.

Contributor

cherrymui commented Jan 12, 2017

Thanks for the proposal.
I just want to add that we probably also want to focus on the use, instead of focusing only on the instructions. For example, @dr2chase once found there are several log2 functions in the compiler/assembler. It is close to CLZ but not the same. Functions like this should probably also in the bit twiddling package. And maybe also include useful functions that are not related to these instructions.

bradfitz added this to the Proposal milestone Jan 13, 2017

Member

minux commented Jan 13, 2017

Member

dsnet commented Jan 13, 2017 edited

@minux, fortunately, every bit twiddling function I've needed so far is exactly the ones that are in this proposal.

Contributor

dr2chase commented Jan 13, 2017

Following Hacker's Delight has the advantage that we don't need to waste time arguing about the names.

Member

minux commented Jan 13, 2017 edited by josharian

Contributor

josharian commented Jan 13, 2017

Relatedly, functions to check whether an add/multiply will overflow.

A related data point: there are various issues that have been filed for faster cgo. In one such example (proposal #16051), the fact that fast implementation of bsr/ctz/etc. might happen was mentioned as hopefully chipping away at the set of use cases where people writing go are tempted to use cgo.

@aclements commented on Jul 1, 2016:

@eloff, there's been some discussion (though no concrete proposals to my knowledge) to add functions for things like popcount and bsr that would be compiled as intrinsics where supported (like math.Sqrt is today). With 1.7 we're trying this out in the runtime, which now has a ctz intrinsic available to it on amd64 SSA. Obviously that doesn't solve the overall problem, but it would chip away at one more reason to use cgo in a low-overhead context.

Many people (myself included) are attracted to go because of the performance, so things like this current proposal for bit twiddling would help. (And yes, cgo is also now faster in 1.8, which is nice as well).

Contributor

brtzsnr commented Jan 14, 2017

RotateLeft/Right (can be expanded inline with two shifts, but the compiler
can't always do the transformation due to shift range issues)

@minux Can you elaborate what is the problem?

Contributor

randall77 commented Jan 14, 2017

@brtzsnr : I think what Minux is referring to is that when you write (x << k) | (x >> (64-k)), you know you're using 0 <= k < 64, but the compiler can't read your mind, and it is not obviously derivable from the code. If we had the function

func leftRot(x uint64, k uint) uint64 {
   k &= 63
   return (x << k) | (x >> (64-k))
}

Then we can make sure (via the &63) that the compiler knows that the range of k is bounded.
So if the compiler can't prove the input is bounded, then we need an extra AND. That's better than not generating the rotate assembly at all.

Member

minux commented Jan 14, 2017

opennota commented Jan 14, 2017 edited

How about byte and bit shuffling (and unshuffling) functions that are used by the blosc compression library? The slides (the shuffling starts from the slide 17). These functions can be SSE2/AVX2 accelerated.

Member

minux commented Jan 14, 2017

Contributor

brtzsnr commented Jan 14, 2017

The current proposed functions have a Go native implementation much larger and disproportionally more expensive than optimum. On the other hand rotate is easy to write inline in a way that the compiler can recognize.

@minux and also everyone else: Do you know where rotate left/right is used with a non-constant number of rotated bits? crypto/sha256 uses rotate for example, but with constant number of bits.

Contributor

josharian commented Jan 14, 2017

rotate is easy to write inline in a way that the compiler can recognize.

It is easy for those who are familiar with the compiler's internals. Putting it in a math/bits package makes it easy for everyone.

Do you know where rotate left/right is used with a non-constant number of rotated bits?

Here's an example from #9337:

https://play.golang.org/p/rmDG7MR5F9

In each invocation it is a constant number of rotated bits each time, but the function itself is not currently inlined, so it compiles without any rotate instructions. A math/bits library function would definitely help here.

Member

minux commented Jan 14, 2017 edited by josharian

Contributor

brtzsnr commented Jan 17, 2017

@josharian The example looks like a bad inliner decision if rot is not inlined. Did you try to write the function as func rot(x, n) instead of rot = func(x, n)?

@minux: I agree with you. I'm not trying to tie the API to a particular instruction set; the hardware support is a nice bonus. My main focus is to find usages in the real code (not toy code) to understand the context, what is the best signature and how important is to provide everyone's favorite function. Compatibility promise will bite us later if we don't this properly now.

For example: What should be the signature of add with carry return? Add(x, y uint64) (c, s uint64)? Looking at math/big we probably need Add(x, y uintptr) (c, s uintptr) too.

Contributor

josharian commented Jan 17, 2017

The example looks like a bad inliner decision if rot is not inlined.

Yes. It is part of a bug complaining about inlining. :)

Did you try to write the function as func rot(x, n) instead of rot = func(x, n)?

It's not my code--and that's part of the point. And anyway, it is reasonable code.

Member

mundaym commented Jan 19, 2017

It would be nice to guarantee (in the package documentation?) that the rotate and byte-swap functions are constant-time operations so that they can be safely used in crypto algorithms. Possibly something to think about for other functions too.

Member

minux commented Jan 19, 2017

Contributor

randall77 commented Jan 19, 2017

I'm inclined to agree with @minux. If you want constant-time crypto primitives, they should live in crypto/subtle. crypto/subtle can easily redirect to math/bits on platforms where those implementations have been verified. They can do something else if a slower but constant-time implementation is required.

rsc changed the title from proposal: an integer bit twiddling library to proposal: math/bits: an integer bit twiddling library Jan 23, 2017

Contributor

rsc commented Jan 23, 2017

This seems worth doing. Leaving for @griesemer and @randall77 to vet. The next step seems like a design doc checked into the usual place (I do see the sketch above).

Contributor

griesemer commented Jan 24, 2017 edited

Now that this proposal can go forward, we should agree on the details of the API. Questions I see at the moment:

  • which functions to include?
  • which types to handle (uint64, uint32 only, or uint64, uint32, uint16, uint8, or uintptr)?
  • signature details (e.g. TrailingZeroesxx(x uintxx) uint, with xx = 64, 32, etc., vs TrailingZeroes(x uint64, size uint) uint where size is one of 64, 32, etc. - the latter would lead to a smaller API and may still be fast enough dep. on implementation)
  • some name bike shedding (I like the "Hacker's Delight" terminology but spelling it out may be more appropriate)
  • is there a better place than math/bits for this package.

@brtzsnr Would you like to continue spearheading this effort? I'm ok if you want to take your initial design and iterate upon it in this issue or if you prefer setting up a dedicated design doc. Alternatively, I can pick this up and push it forward. I like to see this getting in during the 1.9 phase.

griesemer self-assigned this Jan 24, 2017

I prefer the spelled out xx names, personally: bits.TrailingZeroes(uint64(x), 32) is more cumbersome and less readable than bits.TrailingZeroes32(x), imo, and that naming scheme would work with the platform-variable-sized uintptr more easily (xx = Ptr?).

It would also make it more apparent if you, say, didn't include uint8 in the first release but added it later—suddenly there'd be a lot of new xx = 8 functions instead of a bunch of updated doc comments that add 8 to the list of valid sizes with a note in the release docs that's easy to miss.

If the spelled out names are used, I think the Hacker's delight terms should be included in the descriptions as an "also known as" so that searches (in page or via search engine) find the canonical name easily if you search for the wrong spelling.

I think math/bits is a fine place—it's math with bits.

Member

dsnet commented Jan 24, 2017 edited

I should note that SwapBytes{16,32,64} is more consistent with functions in sync/atomic than SwapBytes(..., bitSize uint).

Although the strconv package does use the ParseInt(..., bitSize int) pattern, there is more relation between bits and atomic, than there is with strconv.

Member

minux commented Jan 24, 2017

Contributor

brtzsnr commented Jan 24, 2017

@griesemer Robert, please take over the proposal. I will have time to push this proposal forward in one month, but progress should not stall till then.

Contributor

griesemer commented Jan 25, 2017 edited

It looks like there's a clear preference for signatures of the form:

func fffNN(x uintNN) uint

which leaves open which NN to support. Let's focus on NN=64 for the purpose of the continuing discussion. It will be straight-forward to add the remaining types as needed (incl. uintptr).

Which leads us straight back to the original proposal by @brtzsnr and the following functions (and their respective variations for different types), plus what people have suggested in the meantime:

// LeadingZeros64 returns the number of leading zero bits in x.
// The result is 64 if x == 0.
func LeadingZeros64(x uint64) uint

// TrailingZeros64 returns the number of trailing zero bits in x.
// The result is 64 if x == 0.
func TrailingZeros64(x uint64) uint

// Ones64 returns the number of bits set in x.
func Ones64(x uint64) uint

// RotateLeft64 returns the value of x rotated left by n%64 bits.
func RotateLeft64(x uint64, n uint) uint64

// RotateRight64 returns the value of x rotated right by n%64 bits.
func RotateRight64(x uint64, n uint) uint64

We may have a set of Swap functions as well. Personally I am skeptical about these: 1) I've never needed a SwapBits, and most uses of SwapBytes are due to code that is endian-aware when it shouldn't be. Comments?

// SwapBits64 reverses the order of the bits in x.
func SwapBits64(x uint64) uint64

// SwapBytes64 reverses the order of the bytes in x.
func SwapBytes64(x uint64) uint64

Then there may be a set of integer operations. They would only be in this package because some of them (e.g. Log2) may be close to other functions in this package. These functions could be elsewhere. Perhaps the package name bits needs to be adjusted. Comments?

// Log2 returns the integer binary logarithm of x.
// The result is the integer n for which 2^n <= x < 2^(n+1).
// If x == 0, the result is -1.
func Log2(x uint64) int

// Sqrt returns the integer square root of x.
// The result is the value n such that n^2 <= x < (n+1)^2.
func Sqrt(x uint64) uint64

Finally, @minux suggested operations such as AddUint32 etc. I'm going to leave those out for now because I think they are more tricky to specify correctly (and there's more variety). We can get back to them later. Let's get to some consensus on the above.

Questions:

  • Any opinions about function names?
  • Any opinions about the integer ops?
  • Any opinions about the package name/location?
Owner

bradfitz commented Jan 25, 2017

I'd prefer long functions names. Go avoids abbreviating in function names. The comments can say their nicknames ("nlz", "ntz", etc) for people searching the package that way.

Member

dsnet commented Jan 25, 2017 edited

@griesemer, you seem to have typos in your message. The comments don't match the function signatures.

I use SwapBits in compression applications. That said, do the major architectures provide any ability to do bit reversal efficiently? I was planning on just doing in my own code:

v := bits.SwapBytes64(v)
v = (v&0xaaaaaaaaaaaaaaaa)>>1 | (v&0x5555555555555555)<<1
v = (v&0xcccccccccccccccc)>>2 | (v&0x3333333333333333)<<2
v = (v&0xf0f0f0f0f0f0f0f0)>>4 | (v&0x0f0f0f0f0f0f0f0f)<<4
Contributor

griesemer commented Jan 25, 2017

@bradfitz , @dsnet : Updated signatures (fixed typos). Also updated questions (people prefer explicit names of shortcuts).

Member

dsnet commented Jan 25, 2017 edited

Unless there's native support for bit-swapping, I would vote to leave out SwapBits. It can be a little ambiguous by name alone whether bits are only swapped within each byte, or across the entire uint64.

Member

minux commented Jan 25, 2017

@griesemer As far as the type signatures of functions like

func Ones64(x uint64) uint
func RotateLeft64(x uint64, n uint) uint64

Does RotateLeftN etc take uints because that's necessary for easy intrinisifying (when possible) or just because that's the domain? If there's not a strong technical need, there's a stronger precedent for taking ints, even if negative values don't make sense. If there is a strong technical need, should they be uintN for the appropriate N for regularity?

Regardless OnesN and its ilk should return ints unless it's so these operations can be composed with other bit operations in which case they should return a uintN for the appropriate N.

Contributor

griesemer commented Jan 25, 2017

@jimmyfrasche Because it's the domain. The cpu doesn't care. Whether somethings is an int or a uint is irrelevant for most operations except comparisons. That said, if we make it an int, we have to explain that it is never negative (result), or that it must not be negative (argument), or specify what it's supposed to do when it's negative (argument for rotation). We might be able to get away with just one Rotate in that case which would be nice.

I'm not convinced that using int makes things easier. If properly designed, uints will flow to uints (that's the case in math/big) and no conversions are needed. But even if a conversion is needed, it's going to be free in terms of CPU cost (though not in terms of readability).

But I'm happy to hear/see convincing arguments otherwise.

Member

minux commented Jan 25, 2017

Contributor

griesemer commented Jan 25, 2017 edited

@minux There's no ambiguity when Rotate(x, n) is defined to rotate x by n mod 64 bits: Rotate left by 10 bits is the same as rotate right by 64-10 = 54 bits, or (if we allow int arguments) by -10 bits (assuming a positive value means rotate left). -10 mod 64 = 54. Most possibly we get the effect for free even with a CPU instruction. For instance, the 32bit x86 ROL/ROR instructions already only look at the bottom 5 bits of the CL register, effectively performing mod 32 for 32 bit rotates.

The only real argument is regularity with the rest of the stdlib. There are numerous places where uints make a lot of sense but ints are used instead.

Since math/big is a notable exception, I don't have a problem with this being another, but it is something to consider.

I assume @minux meant ambiguity for someone reading/debugging code rather than ambiguity in the implementation, which is a persuasive argument for it taking uint.

as commented Jan 25, 2017

Should bits.SwapBits really be named with a bits suffix when the name of the package is already called bits? How about bits.Swap?

Another idea is a bits.SwapN(v uint64, n int) rather than bits.SwapBytes where n represents the number of bits to group by for the swap. When n=1 the bits are reversed, when n=8, the bytes are swapped. One benefit is that bits.SwapBytes no longer has to exist, although I've never seen a necessity for any other value of n, maybe others will disagree.

Contributor

josharian commented Jan 25, 2017

As specified above, LeadingZeros64, TrailingZeros64, Ones64 all LGTM.

A single Rotate64 with a signed rotate amount is appealing. Most rotates will be by a constant amount, too, so (if/when this becomes an intrinsic) there'll be no runtime cost to not having separate Right/Left functions.

I have no strong opinion about bit/byte shuffling/swapping/reversing. For reversing bits, bits.Reverse64 seems like a better name. And perhaps bits.ReverseBytes64? I find Swap64 a bit unclear. I see the appeal of having Swap64 take a group size, but what is the behavior when the group size doesn't evenly divide 64? If the only group sizes that matter are 1 and 8, it'd be simpler to just give them separate functions. I also wonder whether some kind of general bit field manipulation might be better, maybe looking at arm64 and amd64 BMI2 instructions for inspiration.

I'm not convinced that integer math functions belong in this package. They seem more like they belong in package math, where they could use bits functions in their implementation. Related, why not func Sqrt(x uint64) uint32 (instead of returning uint64)?

Though AddUint32 and friends are complicated, I hope we do revisit them eventually. Along other things, MulUint64 would provide access to HMUL, which even the super-minimalist RISC-V ISA provides as an instruction. But yes, let's get something in first; we can always expand later.

bits seems clearly like the right package name. I'm tempted to say the import path should be just bits, not math/bits, but I don't feel strongly.

Member

dsnet commented Jan 25, 2017 edited

bits seems clearly like the right package name. I'm tempted to say the import path should be just bits, not math/bits, but I don't feel strongly.

IMO, if it were just bits, (without reading the package docs) I would expect it to be somewhat similar to package bytes. That is, in addition to functions to operate on bits, I would expect to find a Reader and Writer for reading and writing bits to/from a byte stream. Making it math/bits makes it more obvious its just a set of stateless functions.

(I'm not proposing that we add Reader/Writer for bit streams)

Member

mdlayher commented Jan 27, 2017

A single Rotate64 with a signed rotate amount is appealing. Most rotates will be by a constant amount, too, so (if/when this becomes an intrinsic) there'll be no runtime cost to not having separate Right/Left functions.

I can see how it could be convenient to have a slightly smaller package API by combining rotate left/right into a single Rotate, but then there is also the matter of effectively documenting the n parameter to indicate that:

  • negative n results in a left rotate
  • zero n results in no change
  • positive n results in a right rotate

To me, the added mental and documentation overhead doesn't seem to justify combining the two into a single Rotate. Having an explicit RotateLeft and RotateRight where n is a uint feels more intuitive to me.

Contributor

griesemer commented Jan 27, 2017 edited

@mdlayher Keep in mind that for an n-bit word rotating by k bits to the left is the same as rotating n-k bits to the right. Even if you separate this in two functions you still need to understand that rotating by k bits to the left always means rotating by (k mod n) bits (if you rotate by more than n bits, you start again). Once you do that, you can just say that the rotate function always rotates by k mod n bits and you're done. There's no need to explain the negative value or a second function. It just works out. It is actually simpler.

On top of that, hardware (eg on x86) even does this for you automatically, even for non-constant k.

Member

aclements commented Jan 27, 2017

@griesemer, a downside of Rotate is that it doesn't say which direction is positive. Does Rotate32(1, 1) equal 2 or 0x80000000? For example, if I were reading code that used that, I would expect the result to be 2, but apparently @mdlayher would expect it to be 0x80000000. On the other hand, RotateLeft and RotateRight are unambiguous names, independent of whether they take a signed or unsigned argument. (I disagree with @minux that it's ambiguous whether RotateRight by -2 is equal to RotateLeft 2. These seem obviously equivalent to me and I don't see how else you could specify them.)

Contributor

josharian commented Jan 27, 2017

@aclements that could be fixed by having just one function called RotateLeft64 and using negative values to rotate right. Or with docs. (FWIW, in your example, I would also expect 2, not 0x800000000.)

Member

aclements commented Jan 27, 2017

@josharian, I agree, though it seems a bit odd to have a RotateLeft without also having a RotateRight. The consistency of having a signed rotation seems potentially valuable if the rotation is computed, but for constant rotations I'd much rather read code that says "rotate right by two bits" than "rotate left by just kidding negative two bits".

The problem with fixing this with docs is that docs don't help the readability of code that calls the function.

Contributor

ianlancetaylor commented Jan 27, 2017

People writing code want to rotate left or rotate right. They don't want to rotate either left or right, depending. If we only provide rotate-left, then anybody who wants to rotate right has to write an expression to convert their right rotation into a left rotation. This is the kind of expression that computers can do trivially but programmers will make mistakes on. Why make programmers write it themselves?

Contributor

josharian commented Jan 27, 2017

I'm convinced.

Contributor

griesemer commented Jan 27, 2017

@aclements @ianlancetaylor Point taken. (That said, the implementation of RotateLeft/Right will probably just call one implementation of rotate.)

Contributor

btracey commented Jan 27, 2017

A name idea is to put the type in the package name rather than the signature name. bits64.Ones rather than bits.Ones64.

Owner

bradfitz commented Jan 27, 2017

@btracey, I threw up in my mouth a little bit there. :) We don't have flag64.Int or math64.Floatfrombits or sort64.Floats or atomic64.StoreInt.

iand commented Jan 27, 2017

@bradfitz golang.org/x/image/math/f64 and golang.org/x/image/math/f32 exist though

Contributor

btracey commented Jan 27, 2017

I hadn't thought about atomic as a precedent (not in my normal Go usage, thankfully). The other examples are different because the packages are bigger than a set of functions repeated for a number of different types.

The documentation is easier to look at (say, on godoc), when you go to bits64 and see

Ones
LeadingZeros
TrailingZeros

Instead of going to bits and seeing

Ones8
Ones16
Ones32
Ones64
LeadingZeros8
...
Owner

bradfitz commented Jan 27, 2017

@iand, that's also pretty gross. I guess the 32 and 64 didn't look nice to all the existing numeric suffixes for Aff, Mat, and Vec. Still, I'd say that's an exception rather than the normal Go style.

Contributor

griesemer commented Jan 27, 2017

There's pretty strong precedent in Go to use the pkg.foo64 naming style rather than pkg64.foo .

The package API does get n times bigger if we have n types, but the API complexity does not. A better approach to solve this might be through documentation and how the API is presented via a tool such as go doc. For instance, instead of:

// TrailingZeros16 returns the number of trailing zero bits in x.
// The result is 16 if x == 0.
func TrailingZeros16(x uint16) uint

// TrailingZeros32 returns the number of trailing zero bits in x.
// The result is 32 if x == 0.
func TrailingZeros32(x uint32) uint

// TrailingZeros64 returns the number of trailing zero bits in x.
// The result is 64 if x == 0.
func TrailingZeros64(x uint64) uint

which clearly adds mental overhead when looking at the API, we could present it as:

// TrailingZerosN returns the number of trailing zero bits in a uintN value x for N = 16, 32, 64.
// The result is N if x == 0.
func TrailingZeros16(x uint16) uint
func TrailingZeros32(x uint32) uint
func TrailingZeros64(x uint64) uint

godoc could be smart about functions that are similar in that sense and present them as a group with a single doc string. This would be helpful in other packages as well.

To summarize, I think the proposed naming scheme seems fine and in the Go tradition. The presentation issue is a separate question that can be discussed elsewhere.

Contributor

rogpeppe commented Jan 28, 2017 edited

+1 on putting the bit size in the package name.

bits64.Log2 reads better than bits.Log264 (I feel Log2 does belong there - and package naming isn't a good reason for keeping it out)

In the end, it's the same API for each type "parameterised" by scalar type, so if the functions have the same name across types, the code is easier to refactor with separate packages - just change the import path (whole-word replacements are trivial with gofmt -r but custom suffix transformations are more awkward).

I agree with @griesmeyer that the bitsize suffix on the function name can be idiomatic, but I feel that's worth it only if there's non-trivial code in the package that's independent of the type.

Contributor

nerdatmath commented Jan 28, 2017

We could steal a play from encoding/binary, and create vars named Uint64, Uint32, ... that would have methods accepting the associated predefined types.

bits.Uint64.RightShift
bits.Uint8.Reverse
bits.Uintptr.Log2
...

@griesemer how would it know how to group them? One doc string with the function name ending in N in the documentation instead of the actual name and then finding a common prefix among funcs without docstrings that matches the incongruity? How does that generalize? It seems like a complicated special case to serve very few packages.

@rogpeppe it could be bits.Log64 and the docs say it was base 2 (what else would it be in a bits package?) Though as messy as having of 8/16/32/64/ptr packages are on one level it does make them each cleaner individually. (Also it looks like your transmission got cut off midsentence.)

@nerdatmath this would be slightly different since they would have the same interface in the colloquial sense but not in the Go sense, as is the case in encoding/binary (they both implement binary.ByteOrder), so the variables would just be for namespacing and godoc wouldn't pick up any of the methods (#7823) unless the types were exported, which is messy in a different way.

I'm fine with the size suffixes, but, all in all, I would prefer the separate packages. Not enough to bother pushing it past this comment, though.

Contributor

josharian commented Jan 28, 2017

@nerdatmath if they are vars, then they are mutable (you could do bits.Uint64 = bits.Uint8), which would prevent the compiler from treating them as intrinsics, which was one of the motivations for (at least part of) the package.

Owner

bradfitz commented Jan 28, 2017

Variables aren't really mutable if they're of an unexported type and have no state (unexported empty struct). But without a common interface the godoc (today) would be gross. Fixable though if that's the answer.

If y'all do end up going with multiple packages, though, if there's enough repetition and volume to warrant it, at least maybe name the packages like "X/bits/int64s", "X/bits/ints", "X/bits/int32s" to match "errors", "strings", "bytes". Still seems like a lot of package explosion.

Contributor

ulikunitz commented Jan 28, 2017

I don't think we should go the path to multiple type-specific packages, having a single package bits is simple and doesn't inflate the number of packages in the Go library.

I don't think that Ones64 and TrailingZeroes64 are good names. They don't inform that they return a count. Adding the prefix Count makes the names even longer. In my programs those functions are often embedded in larger expressions, so short function names will increase readability. Though I used the names from the book "Hacker's Delight" I suggest to follow the Intel assembler mnemonics: Popcnt64, Tzcnt64 and Lzcnt64. The names are short, follow a consistent pattern, inform that a count is returned and are already established.

Usually counts in Go are returned as ints, even when a count can never be negative. The prime example is the builtin function len, but Read, Write, Printf, etc. all return integers as well. I find the use of uint value for a count as quite surprising and suggest to return an int value.

If the choice is between (to abuse notation) math/bits/int64s.Ones and math/bits.Ones64, then I'd definitely prefer a single package. It's more important to have bits in the package identifier than to partition by size. Having bits in the package identifier makes reading the code easier, which is more important than the minor inconvenience of repetition in the package's documentation.

@ulikunitz You can always do ctz := bits.TrailingZeroes64, and the like, in code that uses bits extensively. That strikes a balance between having globally self-documenting names and locally compact names for complex code. The opposite transformation, countTrailingZeroes := bits.Ctz64, is less useful since the nice global properties are already lost.

I mess with stuff at the bit level very rarely. I have to look a lot of this stuff up each time, just because it's been years since I had to think about it, so I find all the Hacker's Delight/asm-style names infuriatingly cryptic. I'd rather it just said what it did so I can spot the one I need and get back to programming.

as commented Jan 28, 2017

I agree with @ulikunitz the long names are an eye sore. The first time you see bits.TrailingZeroes64, it might be self documenting. Every subsequent time, it's annoying. I can see why people would do ctz := bits.TrailingZeroes64, but I think a name that needs to be assigned to a shorter name is a bad name. There are places where Go abbreviates, for example,

init() instead of initialize(), 
func instead of function
os instead of operatingsystem
proc instead of process
chan instead of channel

Additionally, I would challenge that bits.TrailingZeroes64 is self documenting. What does it mean for a zero to be trailing? The self documentation property exists with an assumption that the user knows something about the semantics of the documentation to begin with. If not using Hacker's Delight for consistency, why not just call it ZeroesRight64 and ZeroesLeft64 to match RotateRight64 and RotateLeft64?

To clarify, I didn't mean to imply that any time you use it the first thing you should do is create a short alias and you should never use the given name. I meant if you're using it repeatedly, you could alias it, if that improves the readability of the code.

For example, if I'm calling strings.HasSuffix I use the given name most of the time because I'm calling it once or twice, but, every now and then, in a one-off ETL script or the like I'll end up calling it a bunch so I'll do ends := strings.HasSuffix which is shorter and makes the code clearer. It's fine because the definition of the alias is close at hand.

Abbreviated names are fine for things that are extremely common. Plenty of non-programmers know what OS means. Anyone reading code, even if they don't know Go, is going to get that func is short for function. Channels are fundamental to Go and widely used, so chan is fine. The bits functions are never going to be extremely common.

TrailingZeroes may not be able to tell you exactly what's going on if you're unfamiliar with the concept but it gives you a much better idea of what's going on, at least roughly, than Ctz.

Contributor

griesemer commented Jan 30, 2017 edited

@jimmyfrasche Regarding #18616 (comment): It would be pretty easy for go/doc to recognize a contiguous sequence of functions in the same file that have names that all start with the same prefix and have a suffix that is just a sequence of numbers and/or perhaps a word that is "short" relative to the prefix. I'd combine it with the fact that only the first of these functions has a doc string and the other with matching prefix don't. I think that could work quite well in more general settings. That said, let's discuss this elsewhere to not hijack this proposal.

Member

mdlayher commented Jan 30, 2017

FYI, created #18858 to discuss the go/doc changes and keep that conversation separate from this one.

Contributor

griesemer commented Jan 30, 2017

@mdlayher thanks for doing that.

Contributor

griesemer commented Jan 30, 2017

@rogpeppe Putting the size into the package name sounds intriguing, but judging from the comments and given existing Go style, I think the preference is to put the size with the function name. With respect to Log2, I'd say, let's leave the 2 away. (Also, please repeat if there was anything else you wanted to add, your comments appears cut off mid-sentence.)

@ulikunitz I'm usually one of the first to vote for short names; especially in local context (and in global context for extremely frequently used names). But here we have a package API, and (at least in my experience) the functions will not show up pervasively in clients. For instance, math/big makes uses LeadingZeros, Log2, etc. but it's just one or two calls. I think a longer descriptive name is ok, and judging from the discussion, a majority of comments is in favor of that. If you call a function frequently, it's cheap to wrap it inside a function with short name. The additional call will be inlined away. FWIW, the Intel mnemonics are not great either, their emphasis is on "count" (cnt) and then you are left with 2 letters which need to be decrypted.

Contributor

randall77 commented Jan 30, 2017

I agree that the 2 in Log2 isn't necessary. bits.Log implies a base of 2.

Contributor

rogpeppe commented Jan 31, 2017

bits.Log64 SGTM.

@griesemer My cut-off sentence was probably an artifact of clumsy editing. I just removed it.

Contributor

ulikunitz commented Feb 1, 2017 edited

@griesemer I wonder whether it is worth to continue the name bike shedding. I keep it short: I had two arguments: shortness and clarity about returning a number. Package big uses nlz, not LeadingZeros, internally. I agree the number of uses for these functions is small.

Contributor

griesemer commented Feb 1, 2017

@ulikunitz I think the majority of feedback here is in favor for clear function names that are not shortcuts.

Owner

bradfitz commented Feb 1, 2017

@ulikunitz, additionally:

Package big uses nlz, not LeadingZeros, internally.

Internal private unexported names have no influence here.

All that matters is consistency and standard feel of the public API.

Bikeshedding might be annoying but names matter when we're stuck with them forever.

CL https://golang.org/cl/36315 mentions this issue.

Contributor

griesemer commented Feb 3, 2017 edited

I've uploaded https://go-review.googlesource.com/#/c/36315/ as an initial (partially tested) API (with preliminary implementation) for review (in lieu of a design document).

We're still making sure the API is correct, so please comment only on the API (bits.go) for now. For minor issues (typos, etc.) please comment on the CL. For design issues, please comment on this issue. I will keep updating the CL until we're happy with the interface.

Specifically, don't worry about the implementation. It is basic, and only partially tested. Once we're happy with the API it will be easy to expand on the implementation.

Contributor

ericlagergren commented Feb 4, 2017

@griesemer I'm sure you're fine, but if it's useful, I've a CLZ assembly implementation with tests https://github.com/ericlagergren/decimal/tree/ba42df4f517084ca27f8017acfaeb69629a090fb/internal/arith that you're free to steal/fiddle with/whatever.

Contributor

griesemer commented Feb 4, 2017

@ericlagergren Thanks for the link. Once the API has settled and we have tests in place everywhere, we can start optimizing the implementation. I'll keep this in mind. Thanks.

Member

mdlayher commented Feb 4, 2017

@griesemer, it looks like your proposed API documentation depends on go/doc being able to adequately document functions with the same prefix but an integer suffix.

Would the plan be to work on that before 1.9 as well?

Member

mdlayher commented Feb 4, 2017

#18858, that is!

Contributor

griesemer commented Feb 4, 2017

@mdlayher That would be ideal. But it wouldn't be a showstopper since it affects "only" documentation, not the API (which must remain backward-compatible going forward).

Contributor

ulikunitz commented Feb 5, 2017 edited

@griesemer @bradfitz I spend some time to rethink my position regarding the function names. An important objective of choosing names must be Readability of the code using the API.

The following code is indeed easy to understand:

n := bits.LeadingZeros64(x)

I'm still a little bit in doubt about the clarity of Ones64(x), but it is consistent with LeadingZeros and TrailingZeros. So I rest my case and support the current proposal. As an experiment I used my existing code for an experimental pure Go implementation of the package including test cases.

Repository: https://github.com/ulikunitz/bits
Documentation: https://godoc.org/github.com/ulikunitz/bits

Member

mundaym commented Feb 5, 2017

We may have a set of Swap functions as well [...] most uses of SwapBytes are due to code that is endian-aware when it shouldn't be. Comments?

I'd like to keep SwapBytes out of the API for this reason. I worry that people will start to use it in non-portable ways because it is simpler (or assumed to be faster) than binary/encoding.BigEndian.

Contributor

ericlagergren commented Feb 5, 2017 edited

I'd like to keep SwapBytes out of the API for this reason. I worry that people will start to use it in non-portable ways because it is simpler (or assumed to be faster) than binary/encoding.BigEndian.

@mundaym are these routines are ever going to be intrinsified? Letting SwapBytes64 compile directly as BSWAP might outweigh people using it improperly, imo.

Member

dsnet commented Feb 5, 2017

I worry that people will start to use it in non-portable ways because it is simpler (or assumed to be faster) than binary/encoding.BigEndian

I believe use of ReverseBytesis only non-portable if used in conjunction with unsafe where you access data at the low-level in the way the machine lays them out in memory. However, use of encoding.{Little,Big}Endian.X is just as non-portable when used in that way. I look forward to ReverseBytes for use in compression and would be sad if it were not included.

Member

mundaym commented Feb 5, 2017

@ericlagergren

are these routines are ever going to be intrinsified?

Yes, the functions in encoding/binary that perform byte reversals use code patterns that are recognized by the compiler and are (or could be) optimized to single instructions (e.g. BSWAP) on platforms that support unaligned data accesses (e.g. 386, amd64, s390x, ppc64le and probably others).

Letting SwapBytes64 compile directly as BSWAP might outweigh people using it improperly, imo.

I'm inclined to disagree. There may be legitimate uses of SwapBytes in endianness-unaware code, but I suspect it will be misused more often than it is used correctly.

Member

aclements commented Feb 5, 2017

Note that runtime/internal/sys already has optimized Go, assembly, and compiler intrinsic versions of trailing zeros and byte swap, so we can reuse those implementations.

Member

mundaym commented Feb 6, 2017

@dsnet Interesting, do you have something specific in mind?

@aclements Do you know where the byte swap intrinsic is used in the runtime? I haven't found any uses outside of the tests.

Member

aclements commented Feb 6, 2017

@mundaym, AFAIK it's not used in the runtime. We don't have to be as careful with internal APIs, so I think we just added it when adding ctz because it was easy. :) (Popcount would have been a better choice.)

Contributor

griesemer commented Feb 6, 2017 edited

All: Just a reminder to please concentrate on the API for now. The implementation in the CL is simply a trivial and slow implementation that allows us to actually use these functions and test them. Once we are happy with the API we can tune the implementation.

On that note: We probably should include versions for uintptr since it's not guaranteed (though likely) that it's the same size as an uint32 or uint64. (Note that uint is always either uint32 or uint64). Opinions? Leave it for later? What would be good names? (LeadingZerosPtr?).

Contributor

josharian commented Feb 6, 2017

We should do uintptr now, otherwise math/big can't use it. Suffix Ptr SGTM. I think we should also do uint as well, otherwise all calling code needs to write conditional code around int size to make the conversion-free call. No naming ideas.

Ptr for uintptr, no suffix for uint. Ones, Ones8, Ones16, Ones32, Ones64, OnesPtr, etc.

Member

minux commented Feb 6, 2017

Contributor

josharian commented Feb 6, 2017

math/big exports type Word uintptr. I agree that it might have been a mistake, but I believe that compatibility requires that that remain. So if we want to use math/bits to implement math/big, which I think we do, we need uintptr APIs.

Wandering a bit off topic, but isn't the native unsigned word size just uint? In which case, having no suffix for uint seems about right.

I wouldn't want a single type, Word, to be different on different architectures. That seems like introducing lots of complexity for the caller and the docs. Sticking to existing types means you can simply use it. Having a type that varies across architectures means designing all your code around that type or having a bunch of adapters in build-tag protected files.

Contributor

griesemer commented Feb 6, 2017

FWIW, the implementation of math/big could still use a uint32/64 based bits API (if there was no uintptr version of the functions).

@minux I'm not against the 2-result add,sub,mul,div - but let's do that in a 2nd step. We still need assembly code if we want decent performance though. But I'm happy to be convinced otherwise by the compiler...

I've added the uint and uintptr versions to the API. Have a look at the CL and comment. I'm not too happy with the proliferation of functions, though.

@josharian The native uint may be a 32bit value even on a 64bit machine. There's no guarantee. I do realize that uintptr doesn't guarantee machine word size either; but at the time it seemed the more sensible choice, if wrong in retrospect. Perhaps we really do need a Word type.

If the only legitimate need for the uintptr functions is to support math/big, maybe the implementation of math/bits could go in an internal package: only use the uintptr stuff in math/big and expose the rest.

Contributor

ericlagergren commented Feb 6, 2017

@jimmyfrasche, @griesemer how would that affect big.Int.Bits? i.e. will it still be zero-alloc?

Member

minux commented Feb 6, 2017

Contributor

griesemer commented Feb 6, 2017

math/big defines and exports Word, but it's a) not directly compatible with uintptr (it's a different type), and b) code that properly uses Word and doesn't make assumptions about its implementations will be able with any kind of Word implementation. We can change it. I don't see why it should affect the API compatibility guarantee (haven't tried though). Anyway, let's leave this aside. We can handle that separately.

Can we simply do all the explicitly sized uint types? If we know the uint/uintptr size as a constant, we can trivially write an if-statement that calls the right sized function. That if statement will be constant-folded away, and the respective wrapper function simply calls the sized function which will be inlined.

Finally, what's the right solution regarding the Word type (independent of math/big)? Of course we could define it in math/bits but is that the right place? It's a platform-specific type. Should it be a predeclared type 'word' ? And why not?

Contributor

rsc commented Feb 6, 2017

Having any function signatures (or names) here that mention uintptr directly seems like a mistake. People are not supposed to do this kind of bit twiddling on Go pointers.

If there should be functions for a natural machine word type, I think those can refer to int/uint now. We didn't use uint in math/big because it was 32 bits on 64-bit systems, but we've since fixed that. And while math/big is a coherent world unto itself, where it makes sense to have its own type big.Word, this package is more a collection of pick-and-choose utility routines. Defining a new type in this context is less compelling.

I don't know whether we need int/uint variants at all. If they are commonly needed, defining them in this package makes more sense than forcing all callers to write if statements. But I don't know if they will be commonly needed.

Contributor

rsc commented Feb 6, 2017

To address the potential mismatch with math/big, there are at least two solutions that don't involve mentioning uintptr in the API here.

  1. Define files word32.go and word64.go in math/big with build tags and uintptr-accepting wrappers that redirect appropriately (and make sure the compiler inlining does the right thing).

  2. Change the definition of big.Word to uint. The exact definition is an internal implementation detail, even though it is exposed in the API. Changing it can't break any code except on amd64p32, and even there it seems very unlikely to matter outside math/big.

Contributor

griesemer commented Feb 6, 2017

@rsc: "We didn't use uint in math/big because it was 32 bits on 64-bit systems, but we've since fixed that. " Ah yes, I completely forgot about the reason for the choice of uintptr. The whole point of the Go uint/int types was to have integer types that reflected the natural register size of a machine. I will try changing big.Word to uint and see what happens.

Contributor

griesemer commented Feb 6, 2017

All: Updated https://go-review.googlesource.com/#/c/36315/ to exclude uintptr function versions per above discussion.

Contributor

griesemer commented Feb 6, 2017

I've tentatively updated math/big to use math/bits to see the effect (https://go-review.googlesource.com/36328).

A few comments:

  1. I'm leaning towards making the results of the Leading/TrailingZeros functions uint rather than int. These results tend to be used to shift a value accordingly; and math/big evidences this. I'm also in favor of making Ones return a uint, for consistency. Counter arguments?

  2. I'm not a fan of the name One, however consistent with Leading/TrailingZeros. How about Population?

  3. Some people have complained that Log doesn't fit into this package. Log happens to be equivalent to the index of the msb (with -1 for x == 0). So we could call it MsbIndex (though Log seems nicer). Alternatively, we could have a Len function which is the length of an x in bits; i.e., `Log(x) == Len(x)-1).

Comments?

Contributor

ericlagergren commented Feb 6, 2017

I'm not a fan of the name One, however consistent with Leading/TrailingZeros. How about Population?

Why not the traditional Popcount?

Alternatively, we could have a Len function which is the length of an x in bits; i.e., `Log(x) == Len(x)-1).

Len or Length sounds like a good name and good idea.

Member

aclements commented Feb 6, 2017

@aclements Count what?

Population may be the better name for historical reasons but it doesn't really say more than Ones if you're not familiar with the term and has the same problem is just Count: Population of what?

It's somewhat unidiomatic, even within the package, but maybe CountOnes to give it a clear, obvious name.

Member

aclements commented Feb 6, 2017

@aclements Count what?

In a package called "bits", I'm not sure what else you could be counting except set bits, but CountOnes is also fine and obviously more explicit. Adding "Ones" also parallels the "Zeros" in LeadingZeros/TrailingZeros.

That's the most obvious interpretation but there's still ambiguity. It could be counting unset bits or total bits (the last interpretation would be extremely unlikely, but still a potential trap for someone reading code using bits who's unfamiliar with the concepts involved and might think the suffix-free Count is like unsafe.SizeOf)

Maybe something like AllOnes or TotalOnes to mirror Trailing/LeadingZeroes but specify that, unlike those, the position is not taken into account.

Contributor

rsc commented Feb 7, 2017

AllOnes sounds like it returns all the ones (in a bitmask maybe?), or maybe a word that is all ones.

CountOnes and TotalOnes seem about the same, but since this operation's most commonly known name is "population count", CountOnes seems like it would be preferable.

Contributor

griesemer commented Feb 10, 2017 edited

I've uploaded a new version (https://go-review.googlesource.com/#/c/36315/) with a couple of changes:

  1. I renamed Ones to PopCount: Most people were not too excited about the consistent but somewhat drab name Ones. Everybody who is going to use this package will know exactly what PopCount does: it's the name by which this function is commonly known. It's slightly inconsistent with the rest but it's meaning has become so much clearer. I'm going with Ralph Waldo Emerson on this one ("A foolish consistency is the hobgoblin of little minds...").

  2. I renamed Log to Len as in bits.Len. The bit length of a number x is the number of bits it takes to represent x in binary representation (https://en.wikipedia.org/wiki/Bit-length). Seems fitting, and removes the need to have Log here which has a non-"bit-fiddly" quality. I believe this is also a win because Len(x) == Log(x) + 1 given how we had Log defined. It furthermore has the advantage that the result now is always >= 0, and it removes some +/-1 corrections in the (trivial) implementation.

Overall I'm pretty happy with this API at this point (we may want to add more functions later). The only other thing I think we might want to seriously consider is whether all results should be unsigned. As I have pointed out before, the Trailing/Leading Zero function results tend to be inputs to shift operations which must be unsigned. That only leaves Len and PopCount which arguably could return unsigned values as well.

Comments?

Contributor

rsc commented Feb 10, 2017

My experience with math/big has been frustration at being forced into uint mode by functions when I'm not shifting. In math/big/prime.go, I wrote

for i := int(s.bitLen()); i >= 0; i-- {

even though s.bitLen() returns int not uint, because I was unsure without looking, and was I also unsure that some future CL might not change it to return uint, thereby turning that for loop into an infinite loop. Needing to be this defensive suggests there is a problem.

Uints are much more error-prone than ints, which is why we discourage them in most APIs. I think it would be much nicer if every function in math/bits returned int and left the conversion to uint to be done in the shift expression, as is necessary in most shift expressions anyway.

(I'm not proposing a change, but I think shift requiring uint may have been a mistake in retrospect. Maybe we can fix it at some point. It would be nice if it didn't hurt our other APIs.)

Contributor

ulikunitz commented Feb 10, 2017

I'm in support of returning int after grepping my code for calls of the trailing zeros and popcount functions. The majority of calls are in short variable declarations and comparisons of type int. Calls in shifts require of course the uint type, but are surprisingly rare.

Contributor

griesemer commented Feb 11, 2017

Uploaded minor tweaks. Per +1's and comments let's leave the results of counts as int.

https://go-review.googlesource.com/36315

I left the input counts for RotateLeft/Right as uint otherwise we will need specify what happens when the value is negative or disallow it.

Finally, we might even leave away Len after all since LenN(x) == N - LeadingZerosN(x). Opinions?

If there's no significant feedback anymore at this point, I'd suggest we move forward with this API and commit an initial implementation after review. After that we can start tweaking the implementation.

I a next step we may want to discuss if and which other functions we might want to include, specifically Add/Sub/ etc. that consume 2 arguments and carry, and produce a result and carry.

Contributor

ericlagergren commented Feb 11, 2017 edited

Contributor

ericlagergren commented Feb 11, 2017

Also. Pardon me if I'm getting out of scope of this issue but I'd be in favor of checked arithmetic. Eg AddN(x, y T) (sum T, overflow bool)

Member

minux commented Feb 11, 2017

Contributor

griesemer commented Feb 11, 2017

@ericlagergren The (base 2) Len function, while almost log2, is still in essence a bit counting function. A base 10 Len function is really a log function. There's no denying that it's useful, but it seems less appropriate for this package.

Yes, I think it would be nice to get checked arithmetic in. I just wanted to get closure on the current API first so we can make progress instead of going back and forth. It does seem most people who commented so far seem happy with it.

Regarding the Add, Sub functions: I agree with @minux that the carry should be of the same type as the argument. Also, I'm not convinced that we need anything else besides uint arguments for these functions: The point is that (in most cases) uint has the platform's natural register size, and that's the level at which these operations make the most sense.

A math/checked package could be a better home for those. checked.Add is clearer than bits.Add.

Contributor

ericlagergren commented Feb 11, 2017

@minux I was thinking signed checked arithmetic, so overflow.

@griesemer re: base 10 Len: makes sense!

If you want I can submit a CL for checked arithmetic. I like @jimmyfrasche's idea of having it be under a separate package name.

Member

aclements commented Feb 13, 2017

Add/sub/mul may or may not belong in bits, but checked math isn't the only use for these operations. More generally, I would say these are "wide" arithmetic operations. For add/sub there's little difference between the one bit of carry/overflow result and a boolean indicator of overflow, but we would probably also want to provide add-with-carry and subtract-with-borrow to make these chainable. And for wide mul there's far more information in the additional result than overflow.

Contributor

rsc commented Feb 13, 2017

It's a good idea to remember the checked/wide arithmetic operations, but let's leave them for round two.

Contributor

rogpeppe commented Feb 13, 2017

"Everybody who is going to use this package will know exactly what PopCount does"

I've used that functionality before and somehow I wasn't familiar with the PopCount name (although I should have because I'm sure I pinched the implementation off Hacker's Delight, which uses "pop").

I know I'm to the party, but "OnesCount" seems considerably more obvious to me, and if the word "PopCount" is mentioned in the doc comment, people looking for that will find it anyway.

Contributor

josharian commented Feb 13, 2017

Related for checked/wide arithmetic: #6815. But yes, let's get round one in!

Contributor

seh commented Feb 13, 2017

@griesemer wrote:

The (base 2) Len function, while almost log2, is still in essence a bit counting function. A base 10 Len function is really a log function. There's no denying that it's useful, but it seems less appropriate for this package.

I wrote a benchmark to compare a few approaches to the "decimal digit length" problem last October, which I posted in a gist.

Contributor

rsc commented Feb 13, 2017

Accepted as a proposal; there will probably be minor tweaks moving forward, and that's fine.

@rsc rsc added Proposal-Accepted and removed Proposal labels Feb 13, 2017

rsc changed the title from proposal: math/bits: an integer bit twiddling library to math/bits: an integer bit twiddling library Feb 13, 2017

@rsc rsc modified the milestone: Go1.9, Proposal Feb 13, 2017

Contributor

griesemer commented Feb 16, 2017 edited

@rogpeppe: Changed PopCount to OnesCount as it also received 4 thumbs-up (like my suggestion to use PopCount). Referred to "population count" in doc string.

Per @rsc, leaving checked/wide arithmetic operations out for now.

Also per @rsc, all counting functions return int values, and for ease of use (and with an eye on #19113), use int values for rotation counts.

Left LenN functions in even though they are simply N - LeadingZerosN. But a similar symmetry exists for RotateLeft/Right and we have both.

Added slightly faster implementation for TrailingZeroes and completed tests.

At this point I believe we have a usable first implementation. Please review the code at https://go-review.googlesource.com/36315, especially the API. I like to get this submitted if we're all happy.

Next steps:

  • faster implementation (primary)
  • adding additional functionality (secondary)

We're designing a new package

@minux You mean new math/big? Is it possible to follow the process somewhere?

Contributor

griesemer commented Feb 16, 2017

@TuomLarsen: @minux referred to math/bits with "new package". He mentioned math/big as a case where math/bits would be used. (In the future, please be more specific with your comments so we don't have to search and guess what you are referring to - thanks).

@gopherbot gopherbot pushed a commit that referenced this issue Feb 16, 2017

@griesemer griesemer math/bits: added package for bit-level counting and manipulation
Initial platform-independent implementation.

For #18616.

Change-Id: I4585c55b963101af9059c06c1b8a866cb384754c
Reviewed-on: https://go-review.googlesource.com/36315
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
661e217

CL https://golang.org/cl/37140 mentions this issue.

@gopherbot gopherbot pushed a commit that referenced this issue Feb 17, 2017

@griesemer griesemer math/bits: expand doc strings for all functions
Follow-up on https://go-review.googlesource.com/36315.
No functionality change.

For #18616.

Change-Id: Id4df34dd7d0381be06eea483a11bf92f4a01f604
Reviewed-on: https://go-review.googlesource.com/37140
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
81acd30
Contributor

cespare commented Feb 17, 2017

Will there be any compiler-assisted intrinsification happening in math/bits for Go 1.9?

Contributor

griesemer commented Feb 17, 2017

@cespare Depends on whether we get to it (@khr?). Independent of that, we want to have a decent platform-independent implementation. (One of the reasons we don't want to move math/big entirely over to using math/bits is that currently we have some platform-specific assembly in math/big which is faster.)

Contributor

randall77 commented Feb 17, 2017

zephyr commented Feb 23, 2017 edited

This article about the fastest way to implement population count on x86-64 may be helpful: Hand coded assembly beats intrinsics in speed and simplicity – Dan Luu, Oct 2014

CL https://golang.org/cl/38155 mentions this issue.

@gopherbot gopherbot pushed a commit that referenced this issue Mar 16, 2017

@randall77 randall77 cmd/compile: intrinsics for math/bits.TrailingZerosX
Implement math/bits.TrailingZerosX using intrinsics.

Generally reorganize the intrinsic spec a bit.
The instrinsics data structure is now built at init time.
This will make doing the other functions in math/bits easier.

Update sys.CtzX to return int instead of uint{64,32} so it
matches math/bits.TrailingZerosX.

Improve the intrinsics a bit for amd64.  We don't need the CMOV
for <64 bit versions.

Update #18616

Change-Id: Ic1c5339c943f961d830ae56f12674d7b29d4ff39
Reviewed-on: https://go-review.googlesource.com/38155
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
d5dc490

CL https://golang.org/cl/38166 mentions this issue.

@gopherbot gopherbot pushed a commit that referenced this issue Mar 16, 2017

@randall77 @randall77 randall77 + randall77 cmd/compile: intrinsify math/bits.ReverseBytes
Update #18616

Change-Id: I0c2d643cbbeb131b4c9b12194697afa4af48e1d2
Reviewed-on: https://go-review.googlesource.com/38166
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
dd9892e

CL https://golang.org/cl/38311 mentions this issue.

@gopherbot gopherbot pushed a commit that referenced this issue Mar 16, 2017

@randall77 @randall77 randall77 + randall77 cmd/compile: intrinsics for math/bits.{Len,LeadingZeros}
name              old time/op  new time/op  delta
LeadingZeros-4    2.00ns ± 0%  1.34ns ± 1%  -33.02%  (p=0.000 n=8+10)
LeadingZeros16-4  1.62ns ± 0%  1.57ns ± 0%   -3.09%  (p=0.001 n=8+9)
LeadingZeros32-4  2.14ns ± 0%  1.48ns ± 0%  -30.84%  (p=0.002 n=8+10)
LeadingZeros64-4  2.06ns ± 1%  1.33ns ± 0%  -35.08%  (p=0.000 n=8+8)

8-bit args is a special case - the Go code is really fast because
it is just a single table lookup.  So I've disabled that for now.
Intrinsics were actually slower:
LeadingZeros8-4   1.22ns ± 3%  1.58ns ± 1%  +29.56%  (p=0.000 n=10+10)

Update #18616

Change-Id: Ia9c289b9ba59c583ea64060470315fd637e814cf
Reviewed-on: https://go-review.googlesource.com/38311
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
495b167

CL https://golang.org/cl/38320 mentions this issue.

CL https://golang.org/cl/38323 mentions this issue.

@gopherbot gopherbot pushed a commit that referenced this issue Mar 17, 2017

@randall77 @randall77 randall77 + randall77 cmd/compile: intrinsic for math/bits.Reverse on ARM64
I don't know that it exists for any other architectures.

Update #18616

Change-Id: Idfe5dee251764d32787915889ec0be4bebc5be24
Reviewed-on: https://go-review.googlesource.com/38323
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
42e9746

@gopherbot gopherbot pushed a commit that referenced this issue Apr 4, 2017

@randall77 @randall77 randall77 + randall77 cmd/compile: intrinsics for math/bits.OnesCount
Popcount instructions on amd64 are not guaranteed to be
present, so we must guard their call.  Rewrite rules can't
generate control flow at the moment, so the intrinsifier
needs to generate that code.

name           old time/op  new time/op  delta
OnesCount-8    2.47ns ± 5%  1.04ns ± 2%  -57.70%  (p=0.000 n=10+10)
OnesCount16-8  1.05ns ± 1%  0.78ns ± 0%  -25.56%    (p=0.000 n=9+8)
OnesCount32-8  1.63ns ± 5%  1.04ns ± 2%  -35.96%  (p=0.000 n=10+10)
OnesCount64-8  2.45ns ± 0%  1.04ns ± 1%  -57.55%   (p=0.000 n=6+10)

Update #18616

Change-Id: I4aff2cc9aa93787898d7b22055fe272a7cf95673
Reviewed-on: https://go-review.googlesource.com/38320
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
5cadc91
Contributor

randall77 commented Apr 4, 2017

Some more discussion:

Me:

math/bits.RotateLeft is currently defined to panic if its argument is less than zero.
I'd like to change that to define RotateLeft to do a right rotate if its arg is less than zero.

Having a base routine like this one panic seems unnecessarily harsh. I'd argue that a rotate by a negative amount is more analogous to a shift by larger than the word size (which doesn't panic) instead of a divide by zero (which does panic). Divide-by-zero really has to panic, as there is no reasonable result you would return instead. Rotates by negative amounts have a perfectly well-defined result that we can return.

I think we should keep RotateLeft and RotateRight as separate functions, even though one could now be implemented with the other. Standard usage will still be with nonnegative arguments.

If we're going to do something here, we should do it by the freeze. Once go 1.9 is out we can't change our minds.

Rob:

If you really want something like this, I'd just have one function, Rotate, that takes a positive for left and a negative for right.

Me:

The problem is
bits.Rotate(x, -5)

It isn't clear when reading this code whether we end up rotating left or right.
bits.RotateRight(5)
is a lot clearer, even if it means there are twice as many Rotate* functions in math/bits.

Michael Jones:

Signed rotate means jumps in the code, yes? Seems a pity.

Me:

No, the implementation is actually simpler with the proposed semantics. Mask out the low 6 bits (for 64-bit rotates) and it just works.

Contributor

robpike commented Apr 4, 2017

I prefer the single Rotate with a signed argument because it means half the number of functions and can be done very efficiently in all cases without a panic or even a conditional branch.

Rotate is a specialist function anyways, so the people who use it will surely be comfortable with being asked to remember that a positive argument shifts left and a negative argument shifts right.

Member

bcmills commented Apr 4, 2017

You could always split the difference and provide only RotateLeft with a signed argument. That gives a handy mnemonic for the direction but avoids duplicating functions.

Contributor

josharian commented Apr 4, 2017

@bcmills @robpike see also previous discussion of this exact topic starting at #18616 (comment) and continuing for a few comments

Contributor

robpike commented Apr 4, 2017

@josharian I have seen the comments and still prefer my version. This can be bikeshedded forever but I am trying to get to simple to implement, simple to define, simple to understand, and simple to document. I believe a single Rotate function with a signed argument satisfies all of that except for the need to define the sign, but that left is positive should be intuitive to anyone capable of using a rotate instruction.

Member

aclements commented Apr 4, 2017

but that left is positive should be intuitive to anyone capable of using a rotate instruction.

I like to consider myself capable of using a rotate instruction and my intuition is "Geez, why doesn't this say what direction it is? It's probably left, but I'll have to look at the documentation to be sure." I agree it's intuitive that if you were going to pick a positive direction it would be left rotation, but there's a much higher threshold for something to be so obviously the right answer that it's clear at every call site what direction you're rotating without saying it.

Contributor

nathany commented Apr 4, 2017 edited

As far as readability, how about something along the lines of the time.Duration API:

const RotateRight = -1

bits.Rotate(x, 5 * RotateRight)

Maybe a constant defined by bits or maybe an exercise for the reader (call site)?

Contributor

robpike commented Apr 4, 2017

@aclements And so you end up with two (times N types) functions with identical capability whose only difference is a sign in the argument. Now we tolerate it for Add and Sub but that's the only example I can think of.

Contributor

cznic commented Apr 4, 2017

On the number axis positive numbers increase to the right, so I expect a direction-defined-by-sign rotate/shift to move the bits to the right for positive numbers.

I have no problem if it's going to be the [documented] opposite, but I'd not call that intuitive.

Contributor

nathany commented Apr 4, 2017

@cznic bits are written right-to-left though

Contributor

griesemer commented Apr 4, 2017 edited

I'm also in favor of just Rotate (#18616 (comment)) if we make it work in both directions.

As a counter-argument to @aclements concern about the direction: Providing a RotateLeft that works also when rotating right can also provide a false sense of security: "It says RotateLeft so surely it's not rotating right!". In other words, if it says RotateLeft it better doesn't do anything else.

Also, the use of bits.Rotate is really in specialist code only. This is not a function that is used by a large number of programmers. Users that really need it will understand the symmetry of rotations.

Contributor

cznic commented Apr 4, 2017

@nathany

bits are written right-to-left though

Bits are just binary digits. Digits in number in any base are written left to right - even in most, if not all, right-to-left writing systems. 123 is one-hundred-and-twenty-three, not three-hundred-and-twenty-one.

That the power of the multiplicand for the digit decreases to the right is a different thing.

Once again: I don't care about the direction, it's just that the intuitive one is a matter of personal imagination.

Contributor

rogpeppe commented Apr 4, 2017 edited

Contributor

rsc commented Apr 5, 2017

Please keep both RotateLeft and RotateRight instead of doing something half of developers will misremember. It seems fine to handle negative numbers though.

Contributor

robpike commented Apr 5, 2017

99% of developers will never program a rotate instruction, so the need for unambiguous direction is weak at best. A single call is enough.

The problem that reawakened this discussion is that having both requires arguing about whether negative values are OK, and if not, what to do about them. By having only one, that whole argument falls away. It's cleaner design.

ct1n commented Apr 5, 2017

I somewhat sympathize with the argument about clean design but it seems weird that you have to remove "Right" from "RotateRight", while keeping the same implementation, in order to achieve it. In practical terms the only way it seems to answer questions is by forcing people who see it to read the documentation, by way of the questions the name raises.
In the end it's a matter of unambiguous direction vs unambiguous behavior for negative values. Negative values should probably be less of a concern in the common case.

What I'm saying is that Rotate raises one question for all people, answering it indirectly through documentation.
RotateRight raises one question for very few people who can (and should) read the documentation if they care about it.

On the other hand Rotate would probably prevent people from writing if n < 0 { RotateLeft(...) } else { RotateRight(...) }.

Contributor

rsc commented Apr 10, 2017

@golang/proposal-review discussed this and ended up at having just one function, but naming it RotateLeft, not just Rotate, to be extra clear at call sites. Negative numbers rotate right, and documentation will make this clear.

@gopherbot gopherbot pushed a commit that referenced this issue Apr 11, 2017

@griesemer griesemer math/bits: support negative rotation count and remove RotateRight
For details see the discussion on the issue below.

RotateLeft functions can now be inlined because the don't panic
anymore for negative rotation counts.

name            old time/op  new time/op  delta
RotateLeft-8    6.72ns ± 2%  1.86ns ± 0%  -72.33%  (p=0.016 n=5+4)
RotateLeft8-8   4.41ns ± 2%  1.67ns ± 1%  -62.15%  (p=0.008 n=5+5)
RotateLeft16-8  4.46ns ± 6%  1.65ns ± 0%  -63.06%  (p=0.008 n=5+5)
RotateLeft32-8  4.50ns ± 5%  1.67ns ± 1%  -62.86%  (p=0.008 n=5+5)
RotateLeft64-8  4.54ns ± 1%  1.85ns ± 1%  -59.32%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20170411.4

(Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.)

For #18616.

Change-Id: I0828d80d54ec24f8d44954a57b3d6aeedb69c686
Reviewed-on: https://go-review.googlesource.com/40394
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
9d01def

@lparth lparth added a commit to lparth/go that referenced this issue Apr 13, 2017

@randall77 @lparth randall77 + lparth cmd/compile: intrinsics for math/bits.OnesCount
Popcount instructions on amd64 are not guaranteed to be
present, so we must guard their call.  Rewrite rules can't
generate control flow at the moment, so the intrinsifier
needs to generate that code.

name           old time/op  new time/op  delta
OnesCount-8    2.47ns ± 5%  1.04ns ± 2%  -57.70%  (p=0.000 n=10+10)
OnesCount16-8  1.05ns ± 1%  0.78ns ± 0%  -25.56%    (p=0.000 n=9+8)
OnesCount32-8  1.63ns ± 5%  1.04ns ± 2%  -35.96%  (p=0.000 n=10+10)
OnesCount64-8  2.45ns ± 0%  1.04ns ± 1%  -57.55%   (p=0.000 n=6+10)

Update #18616

Change-Id: I4aff2cc9aa93787898d7b22055fe272a7cf95673
Reviewed-on: https://go-review.googlesource.com/38320
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
8e57590

@lparth lparth added a commit to lparth/go that referenced this issue Apr 13, 2017

@griesemer @lparth griesemer + lparth math/bits: support negative rotation count and remove RotateRight
For details see the discussion on the issue below.

RotateLeft functions can now be inlined because the don't panic
anymore for negative rotation counts.

name            old time/op  new time/op  delta
RotateLeft-8    6.72ns ± 2%  1.86ns ± 0%  -72.33%  (p=0.016 n=5+4)
RotateLeft8-8   4.41ns ± 2%  1.67ns ± 1%  -62.15%  (p=0.008 n=5+5)
RotateLeft16-8  4.46ns ± 6%  1.65ns ± 0%  -63.06%  (p=0.008 n=5+5)
RotateLeft32-8  4.50ns ± 5%  1.67ns ± 1%  -62.86%  (p=0.008 n=5+5)
RotateLeft64-8  4.54ns ± 1%  1.85ns ± 1%  -59.32%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20170411.4

(Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.)

For #18616.

Change-Id: I0828d80d54ec24f8d44954a57b3d6aeedb69c686
Reviewed-on: https://go-review.googlesource.com/40394
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
ef77a71

CL https://golang.org/cl/40394 mentions this issue.

CL https://golang.org/cl/41630 mentions this issue.

Contributor

griesemer commented Apr 27, 2017

The original proposal plus a few extra functions have been designed and implemented at this point. We may add to this library over time, but it seems reasonably "complete" for now.

Most notably, we have not decided upon or implemented functionality to:

  • test whether add/mul/etc overflow
  • provide functions to implement add/mul/etc that accept an incoming carry and produce a result plus carry (or higher word)

Personally I'm not convinced those belong into a "bits" package (maybe the tests do). Functions to implement multi-precision add/sub/mul would allow a pure Go implementation of some of the math/big kernels, but I don't believe the granularity is right: What we want there is optimized kernels working on vectors, and maximum performance for those kernels. I don't believe we can achieve that with Go code depending on add/sub/mul "intrinsics" alone.

Thus, for now I like to close this issue as "done" unless there are major objections. Please speak up over the next week or so if you are against closing this.

I'm in favor of adding functions along those lines.

I strongly believe that they belong in their own package, if for no other reason than to give it a name that better reflects their collective functionality.

👍 on closing this issue and ❤️ for the work done so far.

Contributor

griesemer commented May 5, 2017

Closing since there were no objections.

griesemer closed this May 5, 2017

@gopherbot gopherbot pushed a commit that referenced this issue May 10, 2017

@laboger laboger cmd/compile: ppc64x intrinsics for math/bits
This adds math/bits intrinsics for OnesCount, Len, TrailingZeros on
ppc64x.

benchmark                       old ns/op     new ns/op     delta
BenchmarkLeadingZeros-16        4.26          1.71          -59.86%
BenchmarkLeadingZeros16-16      3.04          1.83          -39.80%
BenchmarkLeadingZeros32-16      3.31          1.82          -45.02%
BenchmarkLeadingZeros64-16      3.69          1.71          -53.66%
BenchmarkTrailingZeros-16       2.55          1.62          -36.47%
BenchmarkTrailingZeros32-16     2.55          1.77          -30.59%
BenchmarkTrailingZeros64-16     2.78          1.62          -41.73%
BenchmarkOnesCount-16           3.19          0.93          -70.85%
BenchmarkOnesCount32-16         2.55          1.18          -53.73%
BenchmarkOnesCount64-16         3.22          0.93          -71.12%

Update #18616

I also made a change to bits_test.go because when debugging some failures
the output was not quite providing the right argument information.

Change-Id: Ia58d31d1777cf4582a4505f85b11a1202ca07d3e
Reviewed-on: https://go-review.googlesource.com/41630
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
8304d10

glycerine referenced this issue in RoaringBitmap/roaring Jun 15, 2017

Open

new go1.9 math/bits package #101

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment