Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: hash: make the Sum method less confusing #21070

Open
bcmills opened this issue Jul 18, 2017 · 7 comments

Comments

@bcmills
Copy link
Member

commented Jul 18, 2017

The Go 1 hash.Hash interface has a Sum method:

    // Sum appends the current hash to b and returns the resulting slice.
    // It does not change the underlying hash state.
    Sum(b []byte) []byte

Individual packages (such as md5) contain Sum functions with very similar signatures but totally different meaning:

  // Sum returns the MD5 checksum of the data.
  func Sum(data []byte) [Size]byte

This makes (hash.Hash).Sum confusing, as illustrated in the review for http://golang.org/cl/49030.

I can see two possible improvements.

  1. We could rename the method to AppendSum (along the lines of the strconv Append functions). Adding such a method to the implementations (but not the Hash interface itself) would be backward-compatible with Go 1.
    // AppendSum appends the current hash to b and returns the resulting slice.
    // It does not change the underlying hash state.
    AppendSum(b []byte) []byte
  1. We could change Sum to not accept a parameter and always return a new slice. If we pair that with appropriate inlining and devirtualization optimizations, it could theoretically be as efficient as appending to an existing slice.
    // Sum returns a new slice containing the current hash.
    // It does not change the underlying hash state.
    Sum() []byte

I recommend option (1), because I think it would migrate more smoothly in binaries that mix Go 1 and Go 2.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Jul 18, 2017

always return a new slice. If we pair that with appropriate inlining and devirtualization optimizations, it could theoretically be as efficient as appending to an existing slice.

I don't see that happening anytime soon. I think AppendSum and being explicit about memory is still best until we've proven we can make a compiler and/or GC smart enough to not worry about allocation costs.

@bradfitz bradfitz added the Go2 label Jul 18, 2017

@ianlancetaylor ianlancetaylor changed the title proposal (Go 2): hash: make the Sum method less confusing proposal: Go 2: hash: make the Sum method less confusing Feb 27, 2018

@gopherbot gopherbot added this to the Proposal milestone Feb 27, 2018

@gopherbot gopherbot added the Proposal label Feb 27, 2018

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Feb 27, 2018

I agree that renaming the Sum method to AppendSum seems like an appropriate change.

@RaduBerinde

This comment has been minimized.

Copy link
Contributor

commented Oct 2, 2018

In go 1, we could rename the argument b to something more explicit like appendTo:

Sum(appendTo []byte) []byte

This is just a cosmetic, backwards compatible change but it can help avoid the confusion.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Nov 20, 2018

This may not be worth the pain. We'd need to make a new v2 hash package, and then either make v2s of all implementations, or add an additional AppendSum methods to all types alongside their old Sum methods to implement the old [v1] hash package. Both seem kinda gross.

@as

This comment has been minimized.

Copy link
Contributor

commented Nov 22, 2018

Why does the method need to be called AppendSum? What other Append operations could it possibly be referring to? If have an Appender interface type with the signature Append([]byte) []byte, naming the method with the extra suffix AppendSum prevents the standard library from accruing implicitly satisfied Appender types.

The original name Sum should have just been Append. The documentation for the method set seems to agree.

@akavel

This comment has been minimized.

Copy link
Contributor

commented Dec 11, 2018

@bradfitz Because of the "pain" you mention, I believe if this might ever be done, "Go2" is the only point where it could be. Please don't casually dismiss it too easily. Please note there's also a huge pain in the current state of events. Because of this very issue, I personally have a huge yellow virtual "WARNING" sign attached to the whole hash package in my mind, with the small print saying: "never touch the hash package without cautiously reading the godoc; there's some well-known minefield somewhere there". And I'm certainly not a noob to Go, being in the community since the 2009. Notably, I do also believe that the current shape of the interface can be unnecessarily risky with regards to use in security-related applications (there's a reason why many Hash implementations are located in the crypto package), and in this domain I believe it's generally better to be safe than sorry, including in API design. For the sake of making the virtual world somewhat safer.

I understand this may not be a "prime time" candidate for the "Go2" process. But once some of the necessary Go2 refactoring approaches/mechanisms get tested and proven on some more "attractive" packages/issues, and people get used to them in the Go 2 transition period, I don't see why the "smaller papercuts" issues like this one cannot get pulled onto the bandwagon as well.

@detailyang

This comment has been minimized.

Copy link

commented May 10, 2019

Any plan? it's highly useful to minimize memory overhead when the go program will do the sum

BTW: Append API naming style is idiomatic:)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.