math/bits: an integer bit twiddling library #18616

Closed
brtzsnr opened this Issue Jan 11, 2017 · 168 comments

Comments

Projects
None yet
@brtzsnr
Contributor

brtzsnr commented Jan 11, 2017

Previous discussions at #17373 and #10757.

Abstract

This proposal introduces a set of API for integer bit twiddling.

Background

This proposal introduces a set of API for integer bit twiddling. For this proposal we are interested in the following functions:

  • ctz - count trailing zeros.
  • clz - count leading zeros; log_2.
  • popcnt - count population; hamming distance; integer parity.
  • bswap - reverses the order of bytes.

These functions were picked by surveying:

We limited ourselves to these four functions because other twiddling
tricks are very simple to implement using the proposed library,
or already available Go constructs.

We found implementations for a subset of the selected twiddling functions
in many packages including runtime, compiler and tools:

package clz ctz popcnt bswap
math/big X X
runtime/internal/sys X X X
tools/container/intsets X X
cmd/compile/internal/ssa X X
code.google.com/p/intmath X X
github.com/hideo55/go-popcount X (asm)
github.com/RoaringBitmap/roaring X X (asm)
github.com/tHinqa/bitset X X X
github.com/willf/bitset X X (asm)
gopl.io/ch2/popcount X
GCC builtins X X X X

Many other packages implement a subset of these functions:

Similarly hardware providers have recognized the importance
of such functions and included machine level support.
Without hardware support these operations are very expensive.

arch clz ctz popcnt bswap
AMD64 X X X X
ARM X X ? X
ARM64 X X ? X
S390X X X ? X

All bit twiddling functions, except popcnt, are already implemented by runtime/internal/sys and receive special support from the compiler in order to "to help get the very best performance". However, the compiler support is limited to the runtime package and other Golang users have to reimplement the slower variant of these functions.

Proposal

We introduce a new std library math/bits with the following external API, to provides compiler / hardware optimized implementations of clz, ctz, popcnt and bswap functions.

package bits

// SwapBytes16 reverses the order of bytes in a 16-bit integer.
func SwapBytes16(uint16) uint16
// SwapBytes32 reverses the order of bytes in a 32-bit integer.
func SwapBytes32(uint32) uint32
// SwapBytes64 reverses the order of bytes in a 64-bit integer.
func SwapBytes64(uint64) uint64

// TrailingZeros32 counts the number of trailing zeros in a 32-bit integer, and if all are zero, then 32.
func TrailingZeros32(uint32) uint
// TrailingZeros64 counts the number of trailing zeros in a 64-bit integer, and if all are zero, then 64.
func TrailingZeros64(uint64) uint

// LeadingZeros32 counts the number of trailing zeros in a 32-bit integer, and if all are zero, then 32.
func LeadingZeros32(uint32) uint
// LeadingZeros64 counts the number of trailing zeros in a 64-bit integer, and if all are zero, then 64.
func LeadingZeros64(uint64) uint

// Ones32 counts the number of bits set in a 32-bit integer.
func Ones32(uint32) uint
// Ones64 counts the number of bits set in a 64-bit integer.
func Ones64(uint64) uint

Rationale

Alternatives to this proposal are:

  • Bit twiddling functions are implemented in an external library not supported by the compiler. This approach works and is the current state of affairs. Runtime uses the compiler supported methods, while Golang users continue to use the slower implementations.
  • The external library is supported by the compiler. Given that we expect this library to replace runtime/internal/sys it means that this library must be in lock with the compiler and live in the standard library.

Compatibility

This proposal does not change or breaks any existing stdlib API and it conforms to compatibly guidelines.

Implementation

SwapBytes, TrailingZeros and LeadingZeros are already implemented. The only missing function is Ones which can be implemented similarly to the other functions. If this proposal is accepted it can be implemented in time for Go1.9.

Open issues (if applicable)

Names are hard, bike shed is in the comments.

Please suggest additional functions to be included in the comments. Ideally, please include where such function is used in stdlib (e.g. math/big), tools or popular packages.

So far the following functions have been proposed and are under consideration:

  • RotateLeft / RotateRight
    • Pros: &63 is no longer need to compile a rotate with non-const argument to a single instruction on x86, x86-64..
    • Cons: very short/simple to implement and inline.
    • Used: crypto/ uses the constant rotate which is handled properly by the compiler.
  • ReverseBits
    • Pros: ?
    • Cons: ?
    • Used: ?
  • Add/Sub with carry return
    • Pros: Expensive otherwise
    • Cons: ?
    • Used: math/big

History

14.Jan: Clarified the output of TrailingZeros and LeadingZeros when the argument is 0.
14.Jan: Renamed methods: CountTrailingZeros -> TrailingZeros, CountLeadingZeros -> LeadingZeros, CountOnes -> Ones.
13.Jan: Fixed architecture name.
11.Jan: Initial proposal opened to public.

@cespare cespare added the Proposal label Jan 11, 2017

@cespare

This comment has been minimized.

Show comment
Hide comment
@cespare

cespare Jan 11, 2017

Contributor

@brtzsnr perhaps you should submit this document to the proposal repo as outlined in the proposal process steps?

Since it's already markdown following the template, it should be easy to copy-paste into a CL creating a file design/18616-bit-twiddling.md (or whatever).

Contributor

cespare commented Jan 11, 2017

@brtzsnr perhaps you should submit this document to the proposal repo as outlined in the proposal process steps?

Since it's already markdown following the template, it should be easy to copy-paste into a CL creating a file design/18616-bit-twiddling.md (or whatever).

@brtzsnr

This comment has been minimized.

Show comment
Hide comment
@brtzsnr

brtzsnr Jan 11, 2017

Contributor

@cespare from https://github.com/golang/proposal "if the author wants to write a design doc, then they can write one". It started as a design doc, if there is strong feeling that I should submit this, I'm totally fine.

Contributor

brtzsnr commented Jan 11, 2017

@cespare from https://github.com/golang/proposal "if the author wants to write a design doc, then they can write one". It started as a design doc, if there is strong feeling that I should submit this, I'm totally fine.

@griesemer

This comment has been minimized.

Show comment
Hide comment
@griesemer

griesemer Jan 11, 2017

Contributor

I'd be ok with this, it's common enough functionality used in many algorithmic libraries and math/bits seems like an appropriate place.

(For one, math/big also implements nlz (== clz).)

There's probably some bike shedding about the names. I for one would prefer the functions to say what they return rather than what they do; which in turn may lead to shorter names. For instance:

bits.TrailingZeros64(x) rather than bits.CountTrailingZeros64(x)

and so forth.

Contributor

griesemer commented Jan 11, 2017

I'd be ok with this, it's common enough functionality used in many algorithmic libraries and math/bits seems like an appropriate place.

(For one, math/big also implements nlz (== clz).)

There's probably some bike shedding about the names. I for one would prefer the functions to say what they return rather than what they do; which in turn may lead to shorter names. For instance:

bits.TrailingZeros64(x) rather than bits.CountTrailingZeros64(x)

and so forth.

@griesemer

This comment has been minimized.

Show comment
Hide comment
@griesemer

griesemer Jan 11, 2017

Contributor

The proposal seems pretty clear and minimal - a design doc seems overkill. I think a CL would be more appropriate at this point.

(That is a CL with API and basic implementation - for purpose of discussion in place of a design doc. We still need to decide if this proposal should be accepted or not.)

Contributor

griesemer commented Jan 11, 2017

The proposal seems pretty clear and minimal - a design doc seems overkill. I think a CL would be more appropriate at this point.

(That is a CL with API and basic implementation - for purpose of discussion in place of a design doc. We still need to decide if this proposal should be accepted or not.)

@cespare

This comment has been minimized.

Show comment
Hide comment
@cespare

cespare Jan 11, 2017

Contributor

@brtzsnr has already written the design document: it's in the issue description and it follows the the template. I assumed that there was some value in having these documents all in one location.

Contributor

cespare commented Jan 11, 2017

@brtzsnr has already written the design document: it's in the issue description and it follows the the template. I assumed that there was some value in having these documents all in one location.

@cespare

This comment has been minimized.

Show comment
Hide comment
@cespare

cespare Jan 11, 2017

Contributor

The last arch listed in the hardware support table is "BSWAP" -- typo?

Contributor

cespare commented Jan 11, 2017

The last arch listed in the hardware support table is "BSWAP" -- typo?

@josharian

This comment has been minimized.

Show comment
Hide comment
@josharian

josharian Jan 12, 2017

Contributor

Thanks for writing this up.

The doc string for ctz and clz should specify the result when passed 0.

I also prefer (e.g.) TrailingZeros32 to CountTrailingZeros32. I'd also be happy with Ctz32. It is concise, familiar to most, and easily googleable for the rest.

Contributor

josharian commented Jan 12, 2017

Thanks for writing this up.

The doc string for ctz and clz should specify the result when passed 0.

I also prefer (e.g.) TrailingZeros32 to CountTrailingZeros32. I'd also be happy with Ctz32. It is concise, familiar to most, and easily googleable for the rest.

@cherrymui

This comment has been minimized.

Show comment
Hide comment
@cherrymui

cherrymui Jan 12, 2017

Contributor

Thanks for the proposal.
I just want to add that we probably also want to focus on the use, instead of focusing only on the instructions. For example, @dr2chase once found there are several log2 functions in the compiler/assembler. It is close to CLZ but not the same. Functions like this should probably also in the bit twiddling package. And maybe also include useful functions that are not related to these instructions.

Contributor

cherrymui commented Jan 12, 2017

Thanks for the proposal.
I just want to add that we probably also want to focus on the use, instead of focusing only on the instructions. For example, @dr2chase once found there are several log2 functions in the compiler/assembler. It is close to CLZ but not the same. Functions like this should probably also in the bit twiddling package. And maybe also include useful functions that are not related to these instructions.

@bradfitz bradfitz added this to the Proposal milestone Jan 13, 2017

@minux

This comment has been minimized.

Show comment
Hide comment
@minux

minux Jan 13, 2017

Member
Member

minux commented Jan 13, 2017

@dsnet

This comment has been minimized.

Show comment
Hide comment
@dsnet

dsnet Jan 13, 2017

Member

@minux, fortunately, every bit twiddling function I've needed so far is exactly the ones that are in this proposal.

Member

dsnet commented Jan 13, 2017

@minux, fortunately, every bit twiddling function I've needed so far is exactly the ones that are in this proposal.

@dr2chase

This comment has been minimized.

Show comment
Hide comment
@dr2chase

dr2chase Jan 13, 2017

Contributor

Following Hacker's Delight has the advantage that we don't need to waste time arguing about the names.

Contributor

dr2chase commented Jan 13, 2017

Following Hacker's Delight has the advantage that we don't need to waste time arguing about the names.

@minux

This comment has been minimized.

Show comment
Hide comment
@minux

minux Jan 13, 2017

Member
Member

minux commented Jan 13, 2017

@josharian

This comment has been minimized.

Show comment
Hide comment
@josharian

josharian Jan 13, 2017

Contributor

Relatedly, functions to check whether an add/multiply will overflow.

Contributor

josharian commented Jan 13, 2017

Relatedly, functions to check whether an add/multiply will overflow.

@thepudds

This comment has been minimized.

Show comment
Hide comment
@thepudds

thepudds Jan 14, 2017

A related data point: there are various issues that have been filed for faster cgo. In one such example (proposal #16051), the fact that fast implementation of bsr/ctz/etc. might happen was mentioned as hopefully chipping away at the set of use cases where people writing go are tempted to use cgo.

@aclements commented on Jul 1, 2016:

@eloff, there's been some discussion (though no concrete proposals to my knowledge) to add functions for things like popcount and bsr that would be compiled as intrinsics where supported (like math.Sqrt is today). With 1.7 we're trying this out in the runtime, which now has a ctz intrinsic available to it on amd64 SSA. Obviously that doesn't solve the overall problem, but it would chip away at one more reason to use cgo in a low-overhead context.

Many people (myself included) are attracted to go because of the performance, so things like this current proposal for bit twiddling would help. (And yes, cgo is also now faster in 1.8, which is nice as well).

A related data point: there are various issues that have been filed for faster cgo. In one such example (proposal #16051), the fact that fast implementation of bsr/ctz/etc. might happen was mentioned as hopefully chipping away at the set of use cases where people writing go are tempted to use cgo.

@aclements commented on Jul 1, 2016:

@eloff, there's been some discussion (though no concrete proposals to my knowledge) to add functions for things like popcount and bsr that would be compiled as intrinsics where supported (like math.Sqrt is today). With 1.7 we're trying this out in the runtime, which now has a ctz intrinsic available to it on amd64 SSA. Obviously that doesn't solve the overall problem, but it would chip away at one more reason to use cgo in a low-overhead context.

Many people (myself included) are attracted to go because of the performance, so things like this current proposal for bit twiddling would help. (And yes, cgo is also now faster in 1.8, which is nice as well).

@brtzsnr

This comment has been minimized.

Show comment
Hide comment
@brtzsnr

brtzsnr Jan 14, 2017

Contributor

RotateLeft/Right (can be expanded inline with two shifts, but the compiler
can't always do the transformation due to shift range issues)

@minux Can you elaborate what is the problem?

Contributor

brtzsnr commented Jan 14, 2017

RotateLeft/Right (can be expanded inline with two shifts, but the compiler
can't always do the transformation due to shift range issues)

@minux Can you elaborate what is the problem?

@randall77

This comment has been minimized.

Show comment
Hide comment
@randall77

randall77 Jan 14, 2017

Contributor

@brtzsnr : I think what Minux is referring to is that when you write (x << k) | (x >> (64-k)), you know you're using 0 <= k < 64, but the compiler can't read your mind, and it is not obviously derivable from the code. If we had the function

func leftRot(x uint64, k uint) uint64 {
   k &= 63
   return (x << k) | (x >> (64-k))
}

Then we can make sure (via the &63) that the compiler knows that the range of k is bounded.
So if the compiler can't prove the input is bounded, then we need an extra AND. That's better than not generating the rotate assembly at all.

Contributor

randall77 commented Jan 14, 2017

@brtzsnr : I think what Minux is referring to is that when you write (x << k) | (x >> (64-k)), you know you're using 0 <= k < 64, but the compiler can't read your mind, and it is not obviously derivable from the code. If we had the function

func leftRot(x uint64, k uint) uint64 {
   k &= 63
   return (x << k) | (x >> (64-k))
}

Then we can make sure (via the &63) that the compiler knows that the range of k is bounded.
So if the compiler can't prove the input is bounded, then we need an extra AND. That's better than not generating the rotate assembly at all.

@minux

This comment has been minimized.

Show comment
Hide comment
@minux

minux Jan 14, 2017

Member
Member

minux commented Jan 14, 2017

@opennota

This comment has been minimized.

Show comment
Hide comment
@opennota

opennota Jan 14, 2017

How about byte and bit shuffling (and unshuffling) functions that are used by the blosc compression library? The slides (the shuffling starts from the slide 17). These functions can be SSE2/AVX2 accelerated.

opennota commented Jan 14, 2017

How about byte and bit shuffling (and unshuffling) functions that are used by the blosc compression library? The slides (the shuffling starts from the slide 17). These functions can be SSE2/AVX2 accelerated.

@minux

This comment has been minimized.

Show comment
Hide comment
@minux

minux Jan 14, 2017

Member
Member

minux commented Jan 14, 2017

@brtzsnr

This comment has been minimized.

Show comment
Hide comment
@brtzsnr

brtzsnr Jan 14, 2017

Contributor

The current proposed functions have a Go native implementation much larger and disproportionally more expensive than optimum. On the other hand rotate is easy to write inline in a way that the compiler can recognize.

@minux and also everyone else: Do you know where rotate left/right is used with a non-constant number of rotated bits? crypto/sha256 uses rotate for example, but with constant number of bits.

Contributor

brtzsnr commented Jan 14, 2017

The current proposed functions have a Go native implementation much larger and disproportionally more expensive than optimum. On the other hand rotate is easy to write inline in a way that the compiler can recognize.

@minux and also everyone else: Do you know where rotate left/right is used with a non-constant number of rotated bits? crypto/sha256 uses rotate for example, but with constant number of bits.

@josharian

This comment has been minimized.

Show comment
Hide comment
@josharian

josharian Jan 14, 2017

Contributor

rotate is easy to write inline in a way that the compiler can recognize.

It is easy for those who are familiar with the compiler's internals. Putting it in a math/bits package makes it easy for everyone.

Do you know where rotate left/right is used with a non-constant number of rotated bits?

Here's an example from #9337:

https://play.golang.org/p/rmDG7MR5F9

In each invocation it is a constant number of rotated bits each time, but the function itself is not currently inlined, so it compiles without any rotate instructions. A math/bits library function would definitely help here.

Contributor

josharian commented Jan 14, 2017

rotate is easy to write inline in a way that the compiler can recognize.

It is easy for those who are familiar with the compiler's internals. Putting it in a math/bits package makes it easy for everyone.

Do you know where rotate left/right is used with a non-constant number of rotated bits?

Here's an example from #9337:

https://play.golang.org/p/rmDG7MR5F9

In each invocation it is a constant number of rotated bits each time, but the function itself is not currently inlined, so it compiles without any rotate instructions. A math/bits library function would definitely help here.

@minux

This comment has been minimized.

Show comment
Hide comment
@minux

minux Jan 14, 2017

Member
Member

minux commented Jan 14, 2017

@brtzsnr

This comment has been minimized.

Show comment
Hide comment
@brtzsnr

brtzsnr Jan 17, 2017

Contributor

@josharian The example looks like a bad inliner decision if rot is not inlined. Did you try to write the function as func rot(x, n) instead of rot = func(x, n)?

@minux: I agree with you. I'm not trying to tie the API to a particular instruction set; the hardware support is a nice bonus. My main focus is to find usages in the real code (not toy code) to understand the context, what is the best signature and how important is to provide everyone's favorite function. Compatibility promise will bite us later if we don't this properly now.

For example: What should be the signature of add with carry return? Add(x, y uint64) (c, s uint64)? Looking at math/big we probably need Add(x, y uintptr) (c, s uintptr) too.

Contributor

brtzsnr commented Jan 17, 2017

@josharian The example looks like a bad inliner decision if rot is not inlined. Did you try to write the function as func rot(x, n) instead of rot = func(x, n)?

@minux: I agree with you. I'm not trying to tie the API to a particular instruction set; the hardware support is a nice bonus. My main focus is to find usages in the real code (not toy code) to understand the context, what is the best signature and how important is to provide everyone's favorite function. Compatibility promise will bite us later if we don't this properly now.

For example: What should be the signature of add with carry return? Add(x, y uint64) (c, s uint64)? Looking at math/big we probably need Add(x, y uintptr) (c, s uintptr) too.

@josharian

This comment has been minimized.

Show comment
Hide comment
@josharian

josharian Jan 17, 2017

Contributor

The example looks like a bad inliner decision if rot is not inlined.

Yes. It is part of a bug complaining about inlining. :)

Did you try to write the function as func rot(x, n) instead of rot = func(x, n)?

It's not my code--and that's part of the point. And anyway, it is reasonable code.

Contributor

josharian commented Jan 17, 2017

The example looks like a bad inliner decision if rot is not inlined.

Yes. It is part of a bug complaining about inlining. :)

Did you try to write the function as func rot(x, n) instead of rot = func(x, n)?

It's not my code--and that's part of the point. And anyway, it is reasonable code.

@mundaym

This comment has been minimized.

Show comment
Hide comment
@mundaym

mundaym Jan 19, 2017

Member

It would be nice to guarantee (in the package documentation?) that the rotate and byte-swap functions are constant-time operations so that they can be safely used in crypto algorithms. Possibly something to think about for other functions too.

Member

mundaym commented Jan 19, 2017

It would be nice to guarantee (in the package documentation?) that the rotate and byte-swap functions are constant-time operations so that they can be safely used in crypto algorithms. Possibly something to think about for other functions too.

@minux

This comment has been minimized.

Show comment
Hide comment
@minux

minux Jan 19, 2017

Member
Member

minux commented Jan 19, 2017

@randall77

This comment has been minimized.

Show comment
Hide comment
@randall77

randall77 Jan 19, 2017

Contributor

I'm inclined to agree with @minux. If you want constant-time crypto primitives, they should live in crypto/subtle. crypto/subtle can easily redirect to math/bits on platforms where those implementations have been verified. They can do something else if a slower but constant-time implementation is required.

Contributor

randall77 commented Jan 19, 2017

I'm inclined to agree with @minux. If you want constant-time crypto primitives, they should live in crypto/subtle. crypto/subtle can easily redirect to math/bits on platforms where those implementations have been verified. They can do something else if a slower but constant-time implementation is required.

@rsc rsc changed the title from proposal: an integer bit twiddling library to proposal: math/bits: an integer bit twiddling library Jan 23, 2017

@rsc

This comment has been minimized.

Show comment
Hide comment
@rsc

rsc Jan 23, 2017

Contributor

This seems worth doing. Leaving for @griesemer and @randall77 to vet. The next step seems like a design doc checked into the usual place (I do see the sketch above).

Contributor

rsc commented Jan 23, 2017

This seems worth doing. Leaving for @griesemer and @randall77 to vet. The next step seems like a design doc checked into the usual place (I do see the sketch above).

@griesemer

This comment has been minimized.

Show comment
Hide comment
@griesemer

griesemer Jan 24, 2017

Contributor

Now that this proposal can go forward, we should agree on the details of the API. Questions I see at the moment:

  • which functions to include?
  • which types to handle (uint64, uint32 only, or uint64, uint32, uint16, uint8, or uintptr)?
  • signature details (e.g. TrailingZeroesxx(x uintxx) uint, with xx = 64, 32, etc., vs TrailingZeroes(x uint64, size uint) uint where size is one of 64, 32, etc. - the latter would lead to a smaller API and may still be fast enough dep. on implementation)
  • some name bike shedding (I like the "Hacker's Delight" terminology but spelling it out may be more appropriate)
  • is there a better place than math/bits for this package.

@brtzsnr Would you like to continue spearheading this effort? I'm ok if you want to take your initial design and iterate upon it in this issue or if you prefer setting up a dedicated design doc. Alternatively, I can pick this up and push it forward. I like to see this getting in during the 1.9 phase.

Contributor

griesemer commented Jan 24, 2017

Now that this proposal can go forward, we should agree on the details of the API. Questions I see at the moment:

  • which functions to include?
  • which types to handle (uint64, uint32 only, or uint64, uint32, uint16, uint8, or uintptr)?
  • signature details (e.g. TrailingZeroesxx(x uintxx) uint, with xx = 64, 32, etc., vs TrailingZeroes(x uint64, size uint) uint where size is one of 64, 32, etc. - the latter would lead to a smaller API and may still be fast enough dep. on implementation)
  • some name bike shedding (I like the "Hacker's Delight" terminology but spelling it out may be more appropriate)
  • is there a better place than math/bits for this package.

@brtzsnr Would you like to continue spearheading this effort? I'm ok if you want to take your initial design and iterate upon it in this issue or if you prefer setting up a dedicated design doc. Alternatively, I can pick this up and push it forward. I like to see this getting in during the 1.9 phase.

@griesemer griesemer self-assigned this Jan 24, 2017

@jimmyfrasche

This comment has been minimized.

Show comment
Hide comment
@jimmyfrasche

jimmyfrasche Jan 24, 2017

Member

I prefer the spelled out xx names, personally: bits.TrailingZeroes(uint64(x), 32) is more cumbersome and less readable than bits.TrailingZeroes32(x), imo, and that naming scheme would work with the platform-variable-sized uintptr more easily (xx = Ptr?).

It would also make it more apparent if you, say, didn't include uint8 in the first release but added it later—suddenly there'd be a lot of new xx = 8 functions instead of a bunch of updated doc comments that add 8 to the list of valid sizes with a note in the release docs that's easy to miss.

If the spelled out names are used, I think the Hacker's delight terms should be included in the descriptions as an "also known as" so that searches (in page or via search engine) find the canonical name easily if you search for the wrong spelling.

I think math/bits is a fine place—it's math with bits.

Member

jimmyfrasche commented Jan 24, 2017

I prefer the spelled out xx names, personally: bits.TrailingZeroes(uint64(x), 32) is more cumbersome and less readable than bits.TrailingZeroes32(x), imo, and that naming scheme would work with the platform-variable-sized uintptr more easily (xx = Ptr?).

It would also make it more apparent if you, say, didn't include uint8 in the first release but added it later—suddenly there'd be a lot of new xx = 8 functions instead of a bunch of updated doc comments that add 8 to the list of valid sizes with a note in the release docs that's easy to miss.

If the spelled out names are used, I think the Hacker's delight terms should be included in the descriptions as an "also known as" so that searches (in page or via search engine) find the canonical name easily if you search for the wrong spelling.

I think math/bits is a fine place—it's math with bits.

@dsnet

This comment has been minimized.

Show comment
Hide comment
@dsnet

dsnet Jan 24, 2017

Member

I should note that SwapBytes{16,32,64} is more consistent with functions in sync/atomic than SwapBytes(..., bitSize uint).

Although the strconv package does use the ParseInt(..., bitSize int) pattern, there is more relation between bits and atomic, than there is with strconv.

Member

dsnet commented Jan 24, 2017

I should note that SwapBytes{16,32,64} is more consistent with functions in sync/atomic than SwapBytes(..., bitSize uint).

Although the strconv package does use the ParseInt(..., bitSize int) pattern, there is more relation between bits and atomic, than there is with strconv.

@minux

This comment has been minimized.

Show comment
Hide comment
@minux

minux Jan 24, 2017

Member
Member

minux commented Jan 24, 2017

@brtzsnr

This comment has been minimized.

Show comment
Hide comment
@brtzsnr

brtzsnr Jan 24, 2017

Contributor

@griesemer Robert, please take over the proposal. I will have time to push this proposal forward in one month, but progress should not stall till then.

Contributor

brtzsnr commented Jan 24, 2017

@griesemer Robert, please take over the proposal. I will have time to push this proposal forward in one month, but progress should not stall till then.

@griesemer

This comment has been minimized.

Show comment
Hide comment
@griesemer

griesemer Jan 25, 2017

Contributor

It looks like there's a clear preference for signatures of the form:

func fffNN(x uintNN) uint

which leaves open which NN to support. Let's focus on NN=64 for the purpose of the continuing discussion. It will be straight-forward to add the remaining types as needed (incl. uintptr).

Which leads us straight back to the original proposal by @brtzsnr and the following functions (and their respective variations for different types), plus what people have suggested in the meantime:

// LeadingZeros64 returns the number of leading zero bits in x.
// The result is 64 if x == 0.
func LeadingZeros64(x uint64) uint

// TrailingZeros64 returns the number of trailing zero bits in x.
// The result is 64 if x == 0.
func TrailingZeros64(x uint64) uint

// Ones64 returns the number of bits set in x.
func Ones64(x uint64) uint

// RotateLeft64 returns the value of x rotated left by n%64 bits.
func RotateLeft64(x uint64, n uint) uint64

// RotateRight64 returns the value of x rotated right by n%64 bits.
func RotateRight64(x uint64, n uint) uint64

We may have a set of Swap functions as well. Personally I am skeptical about these: 1) I've never needed a SwapBits, and most uses of SwapBytes are due to code that is endian-aware when it shouldn't be. Comments?

// SwapBits64 reverses the order of the bits in x.
func SwapBits64(x uint64) uint64

// SwapBytes64 reverses the order of the bytes in x.
func SwapBytes64(x uint64) uint64

Then there may be a set of integer operations. They would only be in this package because some of them (e.g. Log2) may be close to other functions in this package. These functions could be elsewhere. Perhaps the package name bits needs to be adjusted. Comments?

// Log2 returns the integer binary logarithm of x.
// The result is the integer n for which 2^n <= x < 2^(n+1).
// If x == 0, the result is -1.
func Log2(x uint64) int

// Sqrt returns the integer square root of x.
// The result is the value n such that n^2 <= x < (n+1)^2.
func Sqrt(x uint64) uint64

Finally, @minux suggested operations such as AddUint32 etc. I'm going to leave those out for now because I think they are more tricky to specify correctly (and there's more variety). We can get back to them later. Let's get to some consensus on the above.

Questions:

  • Any opinions about function names?
  • Any opinions about the integer ops?
  • Any opinions about the package name/location?
Contributor

griesemer commented Jan 25, 2017

It looks like there's a clear preference for signatures of the form:

func fffNN(x uintNN) uint

which leaves open which NN to support. Let's focus on NN=64 for the purpose of the continuing discussion. It will be straight-forward to add the remaining types as needed (incl. uintptr).

Which leads us straight back to the original proposal by @brtzsnr and the following functions (and their respective variations for different types), plus what people have suggested in the meantime:

// LeadingZeros64 returns the number of leading zero bits in x.
// The result is 64 if x == 0.
func LeadingZeros64(x uint64) uint

// TrailingZeros64 returns the number of trailing zero bits in x.
// The result is 64 if x == 0.
func TrailingZeros64(x uint64) uint

// Ones64 returns the number of bits set in x.
func Ones64(x uint64) uint

// RotateLeft64 returns the value of x rotated left by n%64 bits.
func RotateLeft64(x uint64, n uint) uint64

// RotateRight64 returns the value of x rotated right by n%64 bits.
func RotateRight64(x uint64, n uint) uint64

We may have a set of Swap functions as well. Personally I am skeptical about these: 1) I've never needed a SwapBits, and most uses of SwapBytes are due to code that is endian-aware when it shouldn't be. Comments?

// SwapBits64 reverses the order of the bits in x.
func SwapBits64(x uint64) uint64

// SwapBytes64 reverses the order of the bytes in x.
func SwapBytes64(x uint64) uint64

Then there may be a set of integer operations. They would only be in this package because some of them (e.g. Log2) may be close to other functions in this package. These functions could be elsewhere. Perhaps the package name bits needs to be adjusted. Comments?

// Log2 returns the integer binary logarithm of x.
// The result is the integer n for which 2^n <= x < 2^(n+1).
// If x == 0, the result is -1.
func Log2(x uint64) int

// Sqrt returns the integer square root of x.
// The result is the value n such that n^2 <= x < (n+1)^2.
func Sqrt(x uint64) uint64

Finally, @minux suggested operations such as AddUint32 etc. I'm going to leave those out for now because I think they are more tricky to specify correctly (and there's more variety). We can get back to them later. Let's get to some consensus on the above.

Questions:

  • Any opinions about function names?
  • Any opinions about the integer ops?
  • Any opinions about the package name/location?
@bradfitz

This comment has been minimized.

Show comment
Hide comment
@bradfitz

bradfitz Jan 25, 2017

Member

I'd prefer long functions names. Go avoids abbreviating in function names. The comments can say their nicknames ("nlz", "ntz", etc) for people searching the package that way.

Member

bradfitz commented Jan 25, 2017

I'd prefer long functions names. Go avoids abbreviating in function names. The comments can say their nicknames ("nlz", "ntz", etc) for people searching the package that way.

@dsnet

This comment has been minimized.

Show comment
Hide comment
@dsnet

dsnet Jan 25, 2017

Member

@griesemer, you seem to have typos in your message. The comments don't match the function signatures.

I use SwapBits in compression applications. That said, do the major architectures provide any ability to do bit reversal efficiently? I was planning on just doing in my own code:

v := bits.SwapBytes64(v)
v = (v&0xaaaaaaaaaaaaaaaa)>>1 | (v&0x5555555555555555)<<1
v = (v&0xcccccccccccccccc)>>2 | (v&0x3333333333333333)<<2
v = (v&0xf0f0f0f0f0f0f0f0)>>4 | (v&0x0f0f0f0f0f0f0f0f)<<4
Member

dsnet commented Jan 25, 2017

@griesemer, you seem to have typos in your message. The comments don't match the function signatures.

I use SwapBits in compression applications. That said, do the major architectures provide any ability to do bit reversal efficiently? I was planning on just doing in my own code:

v := bits.SwapBytes64(v)
v = (v&0xaaaaaaaaaaaaaaaa)>>1 | (v&0x5555555555555555)<<1
v = (v&0xcccccccccccccccc)>>2 | (v&0x3333333333333333)<<2
v = (v&0xf0f0f0f0f0f0f0f0)>>4 | (v&0x0f0f0f0f0f0f0f0f)<<4
@gopherbot

This comment has been minimized.

Show comment
Hide comment
@gopherbot

gopherbot Mar 17, 2017

CL https://golang.org/cl/38323 mentions this issue.

CL https://golang.org/cl/38323 mentions this issue.

gopherbot pushed a commit that referenced this issue Mar 17, 2017

cmd/compile: intrinsic for math/bits.Reverse on ARM64
I don't know that it exists for any other architectures.

Update #18616

Change-Id: Idfe5dee251764d32787915889ec0be4bebc5be24
Reviewed-on: https://go-review.googlesource.com/38323
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>

gopherbot pushed a commit that referenced this issue Apr 4, 2017

cmd/compile: intrinsics for math/bits.OnesCount
Popcount instructions on amd64 are not guaranteed to be
present, so we must guard their call.  Rewrite rules can't
generate control flow at the moment, so the intrinsifier
needs to generate that code.

name           old time/op  new time/op  delta
OnesCount-8    2.47ns ± 5%  1.04ns ± 2%  -57.70%  (p=0.000 n=10+10)
OnesCount16-8  1.05ns ± 1%  0.78ns ± 0%  -25.56%    (p=0.000 n=9+8)
OnesCount32-8  1.63ns ± 5%  1.04ns ± 2%  -35.96%  (p=0.000 n=10+10)
OnesCount64-8  2.45ns ± 0%  1.04ns ± 1%  -57.55%   (p=0.000 n=6+10)

Update #18616

Change-Id: I4aff2cc9aa93787898d7b22055fe272a7cf95673
Reviewed-on: https://go-review.googlesource.com/38320
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
@randall77

This comment has been minimized.

Show comment
Hide comment
@randall77

randall77 Apr 4, 2017

Contributor

Some more discussion:

Me:

math/bits.RotateLeft is currently defined to panic if its argument is less than zero.
I'd like to change that to define RotateLeft to do a right rotate if its arg is less than zero.

Having a base routine like this one panic seems unnecessarily harsh. I'd argue that a rotate by a negative amount is more analogous to a shift by larger than the word size (which doesn't panic) instead of a divide by zero (which does panic). Divide-by-zero really has to panic, as there is no reasonable result you would return instead. Rotates by negative amounts have a perfectly well-defined result that we can return.

I think we should keep RotateLeft and RotateRight as separate functions, even though one could now be implemented with the other. Standard usage will still be with nonnegative arguments.

If we're going to do something here, we should do it by the freeze. Once go 1.9 is out we can't change our minds.

Rob:

If you really want something like this, I'd just have one function, Rotate, that takes a positive for left and a negative for right.

Me:

The problem is
bits.Rotate(x, -5)

It isn't clear when reading this code whether we end up rotating left or right.
bits.RotateRight(5)
is a lot clearer, even if it means there are twice as many Rotate* functions in math/bits.

Michael Jones:

Signed rotate means jumps in the code, yes? Seems a pity.

Me:

No, the implementation is actually simpler with the proposed semantics. Mask out the low 6 bits (for 64-bit rotates) and it just works.

Contributor

randall77 commented Apr 4, 2017

Some more discussion:

Me:

math/bits.RotateLeft is currently defined to panic if its argument is less than zero.
I'd like to change that to define RotateLeft to do a right rotate if its arg is less than zero.

Having a base routine like this one panic seems unnecessarily harsh. I'd argue that a rotate by a negative amount is more analogous to a shift by larger than the word size (which doesn't panic) instead of a divide by zero (which does panic). Divide-by-zero really has to panic, as there is no reasonable result you would return instead. Rotates by negative amounts have a perfectly well-defined result that we can return.

I think we should keep RotateLeft and RotateRight as separate functions, even though one could now be implemented with the other. Standard usage will still be with nonnegative arguments.

If we're going to do something here, we should do it by the freeze. Once go 1.9 is out we can't change our minds.

Rob:

If you really want something like this, I'd just have one function, Rotate, that takes a positive for left and a negative for right.

Me:

The problem is
bits.Rotate(x, -5)

It isn't clear when reading this code whether we end up rotating left or right.
bits.RotateRight(5)
is a lot clearer, even if it means there are twice as many Rotate* functions in math/bits.

Michael Jones:

Signed rotate means jumps in the code, yes? Seems a pity.

Me:

No, the implementation is actually simpler with the proposed semantics. Mask out the low 6 bits (for 64-bit rotates) and it just works.

@robpike

This comment has been minimized.

Show comment
Hide comment
@robpike

robpike Apr 4, 2017

Contributor

I prefer the single Rotate with a signed argument because it means half the number of functions and can be done very efficiently in all cases without a panic or even a conditional branch.

Rotate is a specialist function anyways, so the people who use it will surely be comfortable with being asked to remember that a positive argument shifts left and a negative argument shifts right.

Contributor

robpike commented Apr 4, 2017

I prefer the single Rotate with a signed argument because it means half the number of functions and can be done very efficiently in all cases without a panic or even a conditional branch.

Rotate is a specialist function anyways, so the people who use it will surely be comfortable with being asked to remember that a positive argument shifts left and a negative argument shifts right.

@bcmills

This comment has been minimized.

Show comment
Hide comment
@bcmills

bcmills Apr 4, 2017

Member

You could always split the difference and provide only RotateLeft with a signed argument. That gives a handy mnemonic for the direction but avoids duplicating functions.

Member

bcmills commented Apr 4, 2017

You could always split the difference and provide only RotateLeft with a signed argument. That gives a handy mnemonic for the direction but avoids duplicating functions.

@josharian

This comment has been minimized.

Show comment
Hide comment
@josharian

josharian Apr 4, 2017

Contributor

@bcmills @robpike see also previous discussion of this exact topic starting at #18616 (comment) and continuing for a few comments

Contributor

josharian commented Apr 4, 2017

@bcmills @robpike see also previous discussion of this exact topic starting at #18616 (comment) and continuing for a few comments

@robpike

This comment has been minimized.

Show comment
Hide comment
@robpike

robpike Apr 4, 2017

Contributor

@josharian I have seen the comments and still prefer my version. This can be bikeshedded forever but I am trying to get to simple to implement, simple to define, simple to understand, and simple to document. I believe a single Rotate function with a signed argument satisfies all of that except for the need to define the sign, but that left is positive should be intuitive to anyone capable of using a rotate instruction.

Contributor

robpike commented Apr 4, 2017

@josharian I have seen the comments and still prefer my version. This can be bikeshedded forever but I am trying to get to simple to implement, simple to define, simple to understand, and simple to document. I believe a single Rotate function with a signed argument satisfies all of that except for the need to define the sign, but that left is positive should be intuitive to anyone capable of using a rotate instruction.

@aclements

This comment has been minimized.

Show comment
Hide comment
@aclements

aclements Apr 4, 2017

Member

but that left is positive should be intuitive to anyone capable of using a rotate instruction.

I like to consider myself capable of using a rotate instruction and my intuition is "Geez, why doesn't this say what direction it is? It's probably left, but I'll have to look at the documentation to be sure." I agree it's intuitive that if you were going to pick a positive direction it would be left rotation, but there's a much higher threshold for something to be so obviously the right answer that it's clear at every call site what direction you're rotating without saying it.

Member

aclements commented Apr 4, 2017

but that left is positive should be intuitive to anyone capable of using a rotate instruction.

I like to consider myself capable of using a rotate instruction and my intuition is "Geez, why doesn't this say what direction it is? It's probably left, but I'll have to look at the documentation to be sure." I agree it's intuitive that if you were going to pick a positive direction it would be left rotation, but there's a much higher threshold for something to be so obviously the right answer that it's clear at every call site what direction you're rotating without saying it.

@nathany

This comment has been minimized.

Show comment
Hide comment
@nathany

nathany Apr 4, 2017

Contributor

As far as readability, how about something along the lines of the time.Duration API:

const RotateRight = -1

bits.Rotate(x, 5 * RotateRight)

Maybe a constant defined by bits or maybe an exercise for the reader (call site)?

Contributor

nathany commented Apr 4, 2017

As far as readability, how about something along the lines of the time.Duration API:

const RotateRight = -1

bits.Rotate(x, 5 * RotateRight)

Maybe a constant defined by bits or maybe an exercise for the reader (call site)?

@robpike

This comment has been minimized.

Show comment
Hide comment
@robpike

robpike Apr 4, 2017

Contributor

@aclements And so you end up with two (times N types) functions with identical capability whose only difference is a sign in the argument. Now we tolerate it for Add and Sub but that's the only example I can think of.

Contributor

robpike commented Apr 4, 2017

@aclements And so you end up with two (times N types) functions with identical capability whose only difference is a sign in the argument. Now we tolerate it for Add and Sub but that's the only example I can think of.

@cznic

This comment has been minimized.

Show comment
Hide comment
@cznic

cznic Apr 4, 2017

Contributor

On the number axis positive numbers increase to the right, so I expect a direction-defined-by-sign rotate/shift to move the bits to the right for positive numbers.

I have no problem if it's going to be the [documented] opposite, but I'd not call that intuitive.

Contributor

cznic commented Apr 4, 2017

On the number axis positive numbers increase to the right, so I expect a direction-defined-by-sign rotate/shift to move the bits to the right for positive numbers.

I have no problem if it's going to be the [documented] opposite, but I'd not call that intuitive.

@nathany

This comment has been minimized.

Show comment
Hide comment
@nathany

nathany Apr 4, 2017

Contributor

@cznic bits are written right-to-left though

Contributor

nathany commented Apr 4, 2017

@cznic bits are written right-to-left though

@griesemer

This comment has been minimized.

Show comment
Hide comment
@griesemer

griesemer Apr 4, 2017

Contributor

I'm also in favor of just Rotate (#18616 (comment)) if we make it work in both directions.

As a counter-argument to @aclements concern about the direction: Providing a RotateLeft that works also when rotating right can also provide a false sense of security: "It says RotateLeft so surely it's not rotating right!". In other words, if it says RotateLeft it better doesn't do anything else.

Also, the use of bits.Rotate is really in specialist code only. This is not a function that is used by a large number of programmers. Users that really need it will understand the symmetry of rotations.

Contributor

griesemer commented Apr 4, 2017

I'm also in favor of just Rotate (#18616 (comment)) if we make it work in both directions.

As a counter-argument to @aclements concern about the direction: Providing a RotateLeft that works also when rotating right can also provide a false sense of security: "It says RotateLeft so surely it's not rotating right!". In other words, if it says RotateLeft it better doesn't do anything else.

Also, the use of bits.Rotate is really in specialist code only. This is not a function that is used by a large number of programmers. Users that really need it will understand the symmetry of rotations.

@cznic

This comment has been minimized.

Show comment
Hide comment
@cznic

cznic Apr 4, 2017

Contributor

@nathany

bits are written right-to-left though

Bits are just binary digits. Digits in number in any base are written left to right - even in most, if not all, right-to-left writing systems. 123 is one-hundred-and-twenty-three, not three-hundred-and-twenty-one.

That the power of the multiplicand for the digit decreases to the right is a different thing.

Once again: I don't care about the direction, it's just that the intuitive one is a matter of personal imagination.

Contributor

cznic commented Apr 4, 2017

@nathany

bits are written right-to-left though

Bits are just binary digits. Digits in number in any base are written left to right - even in most, if not all, right-to-left writing systems. 123 is one-hundred-and-twenty-three, not three-hundred-and-twenty-one.

That the power of the multiplicand for the digit decreases to the right is a different thing.

Once again: I don't care about the direction, it's just that the intuitive one is a matter of personal imagination.

@rogpeppe

This comment has been minimized.

Show comment
Hide comment
@rogpeppe

rogpeppe Apr 4, 2017

Contributor
Contributor

rogpeppe commented Apr 4, 2017

@rsc

This comment has been minimized.

Show comment
Hide comment
@rsc

rsc Apr 5, 2017

Contributor

Please keep both RotateLeft and RotateRight instead of doing something half of developers will misremember. It seems fine to handle negative numbers though.

Contributor

rsc commented Apr 5, 2017

Please keep both RotateLeft and RotateRight instead of doing something half of developers will misremember. It seems fine to handle negative numbers though.

@robpike

This comment has been minimized.

Show comment
Hide comment
@robpike

robpike Apr 5, 2017

Contributor

99% of developers will never program a rotate instruction, so the need for unambiguous direction is weak at best. A single call is enough.

The problem that reawakened this discussion is that having both requires arguing about whether negative values are OK, and if not, what to do about them. By having only one, that whole argument falls away. It's cleaner design.

Contributor

robpike commented Apr 5, 2017

99% of developers will never program a rotate instruction, so the need for unambiguous direction is weak at best. A single call is enough.

The problem that reawakened this discussion is that having both requires arguing about whether negative values are OK, and if not, what to do about them. By having only one, that whole argument falls away. It's cleaner design.

@ct1n

This comment has been minimized.

Show comment
Hide comment
@ct1n

ct1n Apr 5, 2017

Contributor

I somewhat sympathize with the argument about clean design but it seems weird that you have to remove "Right" from "RotateRight", while keeping the same implementation, in order to achieve it. In practical terms the only way it seems to answer questions is by forcing people who see it to read the documentation, by way of the questions the name raises.
In the end it's a matter of unambiguous direction vs unambiguous behavior for negative values. Negative values should probably be less of a concern in the common case.

What I'm saying is that Rotate raises one question for all people, answering it indirectly through documentation.
RotateRight raises one question for very few people who can (and should) read the documentation if they care about it.

On the other hand Rotate would probably prevent people from writing if n < 0 { RotateLeft(...) } else { RotateRight(...) }.

Contributor

ct1n commented Apr 5, 2017

I somewhat sympathize with the argument about clean design but it seems weird that you have to remove "Right" from "RotateRight", while keeping the same implementation, in order to achieve it. In practical terms the only way it seems to answer questions is by forcing people who see it to read the documentation, by way of the questions the name raises.
In the end it's a matter of unambiguous direction vs unambiguous behavior for negative values. Negative values should probably be less of a concern in the common case.

What I'm saying is that Rotate raises one question for all people, answering it indirectly through documentation.
RotateRight raises one question for very few people who can (and should) read the documentation if they care about it.

On the other hand Rotate would probably prevent people from writing if n < 0 { RotateLeft(...) } else { RotateRight(...) }.

@rsc

This comment has been minimized.

Show comment
Hide comment
@rsc

rsc Apr 10, 2017

Contributor

@golang/proposal-review discussed this and ended up at having just one function, but naming it RotateLeft, not just Rotate, to be extra clear at call sites. Negative numbers rotate right, and documentation will make this clear.

Contributor

rsc commented Apr 10, 2017

@golang/proposal-review discussed this and ended up at having just one function, but naming it RotateLeft, not just Rotate, to be extra clear at call sites. Negative numbers rotate right, and documentation will make this clear.

gopherbot pushed a commit that referenced this issue Apr 11, 2017

math/bits: support negative rotation count and remove RotateRight
For details see the discussion on the issue below.

RotateLeft functions can now be inlined because the don't panic
anymore for negative rotation counts.

name            old time/op  new time/op  delta
RotateLeft-8    6.72ns ± 2%  1.86ns ± 0%  -72.33%  (p=0.016 n=5+4)
RotateLeft8-8   4.41ns ± 2%  1.67ns ± 1%  -62.15%  (p=0.008 n=5+5)
RotateLeft16-8  4.46ns ± 6%  1.65ns ± 0%  -63.06%  (p=0.008 n=5+5)
RotateLeft32-8  4.50ns ± 5%  1.67ns ± 1%  -62.86%  (p=0.008 n=5+5)
RotateLeft64-8  4.54ns ± 1%  1.85ns ± 1%  -59.32%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20170411.4

(Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.)

For #18616.

Change-Id: I0828d80d54ec24f8d44954a57b3d6aeedb69c686
Reviewed-on: https://go-review.googlesource.com/40394
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

lparth added a commit to lparth/go that referenced this issue Apr 13, 2017

cmd/compile: intrinsics for math/bits.OnesCount
Popcount instructions on amd64 are not guaranteed to be
present, so we must guard their call.  Rewrite rules can't
generate control flow at the moment, so the intrinsifier
needs to generate that code.

name           old time/op  new time/op  delta
OnesCount-8    2.47ns ± 5%  1.04ns ± 2%  -57.70%  (p=0.000 n=10+10)
OnesCount16-8  1.05ns ± 1%  0.78ns ± 0%  -25.56%    (p=0.000 n=9+8)
OnesCount32-8  1.63ns ± 5%  1.04ns ± 2%  -35.96%  (p=0.000 n=10+10)
OnesCount64-8  2.45ns ± 0%  1.04ns ± 1%  -57.55%   (p=0.000 n=6+10)

Update #18616

Change-Id: I4aff2cc9aa93787898d7b22055fe272a7cf95673
Reviewed-on: https://go-review.googlesource.com/38320
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>

lparth added a commit to lparth/go that referenced this issue Apr 13, 2017

math/bits: support negative rotation count and remove RotateRight
For details see the discussion on the issue below.

RotateLeft functions can now be inlined because the don't panic
anymore for negative rotation counts.

name            old time/op  new time/op  delta
RotateLeft-8    6.72ns ± 2%  1.86ns ± 0%  -72.33%  (p=0.016 n=5+4)
RotateLeft8-8   4.41ns ± 2%  1.67ns ± 1%  -62.15%  (p=0.008 n=5+5)
RotateLeft16-8  4.46ns ± 6%  1.65ns ± 0%  -63.06%  (p=0.008 n=5+5)
RotateLeft32-8  4.50ns ± 5%  1.67ns ± 1%  -62.86%  (p=0.008 n=5+5)
RotateLeft64-8  4.54ns ± 1%  1.85ns ± 1%  -59.32%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20170411.4

(Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.)

For #18616.

Change-Id: I0828d80d54ec24f8d44954a57b3d6aeedb69c686
Reviewed-on: https://go-review.googlesource.com/40394
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot

This comment has been minimized.

Show comment
Hide comment
@gopherbot

gopherbot Apr 21, 2017

CL https://golang.org/cl/40394 mentions this issue.

CL https://golang.org/cl/40394 mentions this issue.

@gopherbot

This comment has been minimized.

Show comment
Hide comment
@gopherbot

gopherbot Apr 25, 2017

CL https://golang.org/cl/41630 mentions this issue.

CL https://golang.org/cl/41630 mentions this issue.

@griesemer

This comment has been minimized.

Show comment
Hide comment
@griesemer

griesemer Apr 27, 2017

Contributor

The original proposal plus a few extra functions have been designed and implemented at this point. We may add to this library over time, but it seems reasonably "complete" for now.

Most notably, we have not decided upon or implemented functionality to:

  • test whether add/mul/etc overflow
  • provide functions to implement add/mul/etc that accept an incoming carry and produce a result plus carry (or higher word)

Personally I'm not convinced those belong into a "bits" package (maybe the tests do). Functions to implement multi-precision add/sub/mul would allow a pure Go implementation of some of the math/big kernels, but I don't believe the granularity is right: What we want there is optimized kernels working on vectors, and maximum performance for those kernels. I don't believe we can achieve that with Go code depending on add/sub/mul "intrinsics" alone.

Thus, for now I like to close this issue as "done" unless there are major objections. Please speak up over the next week or so if you are against closing this.

Contributor

griesemer commented Apr 27, 2017

The original proposal plus a few extra functions have been designed and implemented at this point. We may add to this library over time, but it seems reasonably "complete" for now.

Most notably, we have not decided upon or implemented functionality to:

  • test whether add/mul/etc overflow
  • provide functions to implement add/mul/etc that accept an incoming carry and produce a result plus carry (or higher word)

Personally I'm not convinced those belong into a "bits" package (maybe the tests do). Functions to implement multi-precision add/sub/mul would allow a pure Go implementation of some of the math/big kernels, but I don't believe the granularity is right: What we want there is optimized kernels working on vectors, and maximum performance for those kernels. I don't believe we can achieve that with Go code depending on add/sub/mul "intrinsics" alone.

Thus, for now I like to close this issue as "done" unless there are major objections. Please speak up over the next week or so if you are against closing this.

@jimmyfrasche

This comment has been minimized.

Show comment
Hide comment
@jimmyfrasche

jimmyfrasche Apr 27, 2017

Member

I'm in favor of adding functions along those lines.

I strongly believe that they belong in their own package, if for no other reason than to give it a name that better reflects their collective functionality.

👍 on closing this issue and ❤️ for the work done so far.

Member

jimmyfrasche commented Apr 27, 2017

I'm in favor of adding functions along those lines.

I strongly believe that they belong in their own package, if for no other reason than to give it a name that better reflects their collective functionality.

👍 on closing this issue and ❤️ for the work done so far.

@griesemer

This comment has been minimized.

Show comment
Hide comment
@griesemer

griesemer May 5, 2017

Contributor

Closing since there were no objections.

Contributor

griesemer commented May 5, 2017

Closing since there were no objections.

@griesemer griesemer closed this May 5, 2017

gopherbot pushed a commit that referenced this issue May 10, 2017

cmd/compile: ppc64x intrinsics for math/bits
This adds math/bits intrinsics for OnesCount, Len, TrailingZeros on
ppc64x.

benchmark                       old ns/op     new ns/op     delta
BenchmarkLeadingZeros-16        4.26          1.71          -59.86%
BenchmarkLeadingZeros16-16      3.04          1.83          -39.80%
BenchmarkLeadingZeros32-16      3.31          1.82          -45.02%
BenchmarkLeadingZeros64-16      3.69          1.71          -53.66%
BenchmarkTrailingZeros-16       2.55          1.62          -36.47%
BenchmarkTrailingZeros32-16     2.55          1.77          -30.59%
BenchmarkTrailingZeros64-16     2.78          1.62          -41.73%
BenchmarkOnesCount-16           3.19          0.93          -70.85%
BenchmarkOnesCount32-16         2.55          1.18          -53.73%
BenchmarkOnesCount64-16         3.22          0.93          -71.12%

Update #18616

I also made a change to bits_test.go because when debugging some failures
the output was not quite providing the right argument information.

Change-Id: Ia58d31d1777cf4582a4505f85b11a1202ca07d3e
Reviewed-on: https://go-review.googlesource.com/41630
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>

@glycerine glycerine referenced this issue in RoaringBitmap/roaring Jun 15, 2017

Closed

new go1.9 math/bits package #101

@dskinner

This comment has been minimized.

Show comment
Hide comment
@dskinner

dskinner Aug 17, 2017

Member

This is a comment for future decisions regarding API, I understand this particular one is set.

Rotate is a specialist function; LTR or RTL is only relevant given the context. @aclements brought up a valid question, not a valid ever-extending point. His question could have been answered as "it's RTL, just as integers increment"; easy, right?

But instead, cleverness ensues.

What "a specialist function" means is such a simple thing, it was likely dismissed to quickly. Given a code sample, one likely already understands the rotate to occur and the direction even before encountering the line of code. Such code is usually already preceded by illustrative ascii documentation as it is.

What's mentally turbulent is not that Go could have simply chosen RTL as a standard way of interpreting bits from an API perspective, but rather, I first pulled up the changes of 1.9 and find a RotateLeft with no counterpart and the doc giving an example of a negative stride. This is a mind-numbing committee-like decision that is very unfortunate to be landing in 1.9.

I only plead to stick to context of usage for the future. All of this should have been self-evident with questions like, "why are we not providing a counterpart to RotateLeft, why are we panic'ing on negative strides or debating int vs uint for a stride"; ultimately, because I think what "a specialist function" means was simply dismissed to easily for not being clever.

Let us please avoid cleverness in our justification of APIs. It shows in this 1.9 update.

Member

dskinner commented Aug 17, 2017

This is a comment for future decisions regarding API, I understand this particular one is set.

Rotate is a specialist function; LTR or RTL is only relevant given the context. @aclements brought up a valid question, not a valid ever-extending point. His question could have been answered as "it's RTL, just as integers increment"; easy, right?

But instead, cleverness ensues.

What "a specialist function" means is such a simple thing, it was likely dismissed to quickly. Given a code sample, one likely already understands the rotate to occur and the direction even before encountering the line of code. Such code is usually already preceded by illustrative ascii documentation as it is.

What's mentally turbulent is not that Go could have simply chosen RTL as a standard way of interpreting bits from an API perspective, but rather, I first pulled up the changes of 1.9 and find a RotateLeft with no counterpart and the doc giving an example of a negative stride. This is a mind-numbing committee-like decision that is very unfortunate to be landing in 1.9.

I only plead to stick to context of usage for the future. All of this should have been self-evident with questions like, "why are we not providing a counterpart to RotateLeft, why are we panic'ing on negative strides or debating int vs uint for a stride"; ultimately, because I think what "a specialist function" means was simply dismissed to easily for not being clever.

Let us please avoid cleverness in our justification of APIs. It shows in this 1.9 update.

@gopherbot

This comment has been minimized.

Show comment
Hide comment
@gopherbot

gopherbot Jan 30, 2018

Change https://golang.org/cl/90835 mentions this issue: cmd/compile: arm64 intrinsics for math/bits.OnesCount

Change https://golang.org/cl/90835 mentions this issue: cmd/compile: arm64 intrinsics for math/bits.OnesCount

gopherbot pushed a commit that referenced this issue Feb 15, 2018

cmd/compile: arm64 intrinsics for math/bits.OnesCount
This adds math/bits intrinsics for OnesCount on arm64.

name         old time/op  new time/op  delta
OnesCount    3.81ns ± 0%  1.60ns ± 0%  -57.96%  (p=0.000 n=7+8)
OnesCount8   1.60ns ± 0%  1.60ns ± 0%     ~     (all equal)
OnesCount16  2.41ns ± 0%  1.60ns ± 0%  -33.61%  (p=0.000 n=8+8)
OnesCount32  4.17ns ± 0%  1.60ns ± 0%  -61.58%  (p=0.000 n=8+8)
OnesCount64  3.80ns ± 0%  1.60ns ± 0%  -57.84%  (p=0.000 n=8+8)

Update #18616

Conflicts:
	src/cmd/compile/internal/gc/asm_test.go

Change-Id: I63ac2f63acafdb1f60656ab8a56be0b326eec5cb
Reviewed-on: https://go-review.googlesource.com/90835
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment