This repository has been archived by the owner on Nov 29, 2018. It is now read-only.

Go 1.5 already includes the SSE 4.2 updateCastagnoli() function #3

Closed
wscott opened this issue Nov 13, 2015 · 6 comments

Comments

@wscott

wscott commented Nov 13, 2015

I was playing with this repo mainly as a way to teach myself how to write code that includes optional assembly versions of some routines. Very nice.

So I extended your benchmarks to test the Castagnoli version as well, and found that it wasn't any faster than the standard library hash. Sure enough, the Go 1.5 version already includes the SSE 4.2 code that uses the new crc32c opcode. The IEEE crc32 is still slow.

It might be a good idea to note that in the README.

@klauspost
Owner

I am not sure what you mean. I have adjusted the documentation: your CPU must support carry-less multiplication (the PCLMULQDQ instruction), not only SSE 4.2.

@wscott
Author

wscott commented Nov 13, 2015

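For example, I added some benchmarks like this:

// Castagnoli using this package's implementation.
func BenchmarkCCrc1KB(b *testing.B) {
    benchmark(b, New(MakeTable(Castagnoli)), 1024)
}

// Castagnoli using the standard library (hash/crc32) for comparison.
func BenchmarkCStdCrc1KB(b *testing.B) {
    benchmark(b, crc32.New(crc32.MakeTable(crc32.Castagnoli)), 1024)
}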

I found that for Castagnoli the standard library ran at the same speed, and looking at the source I see it already includes that support: https://golang.org/src/hash/crc32/crc32_amd64.s

I assumed that you wrote all of the assembly files and the std library just copied part of them. Now I am guessing they always had the fast Castagnoli version and you extended it to include the SSE code for the IEEE crc as well.
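For readers unfamiliar with the two polynomials discussed here, the following sketch shows how both are selected through the standard `hash/crc32` API. The helper name `checksums` is made up for this example; the input is the conventional "123456789" check string.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// checksums is a hypothetical helper that returns the CRC-32 of data
// under both the IEEE and the Castagnoli polynomial.
func checksums(data []byte) (ieee, castagnoli uint32) {
	ieee = crc32.ChecksumIEEE(data)
	castagnoli = crc32.Checksum(data, crc32.MakeTable(crc32.Castagnoli))
	return ieee, castagnoli
}

func main() {
	ieee, c := checksums([]byte("123456789"))
	fmt.Printf("IEEE:       %08x\n", ieee) // cbf43926
	fmt.Printf("Castagnoli: %08x\n", c)    // e3069283
}
```

Whether a given call is hardware accelerated depends on the Go version and the CPU; the API is the same either way.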

@klauspost
Owner

Yes, I started this as a copy of the standard library.

This is the current "tip" version, which includes my code: https://tip.golang.org/src/hash/crc32/crc32_amd64.s

@wscott
Author

wscott commented Nov 13, 2015

BTW, another optimization I notice isn't included: the slicingBy8 optimization can still be used in the Castagnoli case.

@klauspost
Owner

Oh yes, that might be nice for other platforms.

@klauspost
Owner

Added in PR #5
