Restore AVX/SSE code for tensorf32 #1

Closed
chewxy opened this issue Sep 15, 2016 · 1 comment

@chewxy

chewxy commented Sep 15, 2016

Something went wrong with the transfer to this repository, and all the assembly files for Float32 operations failed to pass the tests. Figure out what's wrong and fix it.

@chewxy chewxy self-assigned this Sep 15, 2016
chewxy added a commit that referenced this issue Sep 20, 2016
On a Core i7-2600K at stock (unoverclocked) speed, for float32:

$ go test -tags=avx -bench=.
BenchmarkVecAdd-8              	200000000	         9.06 ns/op
BenchmarkVanillaVecAdd-8       	30000000	        49.8 ns/op
BenchmarkVecSub-8              	200000000	         9.13 ns/op
BenchmarkVanillaVecSub-8       	50000000	        38.6 ns/op
BenchmarkVecMul-8              	200000000	         9.04 ns/op
BenchmarkVanillaVecMul-8       	30000000	        46.6 ns/op
BenchmarkVecDiv-8              	50000000	        27.6 ns/op
BenchmarkVanillaVecDiv-8       	20000000	       101 ns/op
BenchmarkVecSqrt-8             	20000000	        76.2 ns/op
BenchmarkVanillaVecSqrt-8      	20000000	       113 ns/op
BenchmarkVecInvSqrt-8          	10000000	       124 ns/op
BenchmarkVanillaVecInvSqrt-8   	10000000	       207 ns/op
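
For readers unfamiliar with the naming, the "Vanilla" rows above presumably measure plain Go loops, while the other rows measure the AVX assembly versions. The sketch below shows what such a pure-Go baseline and its benchmark might look like. The function name `vanillaVecAdd`, the in-place semantics, and the vector length are all assumptions made for illustration; the issue does not state how the real benchmarks are set up.

```go
package main

import (
	"fmt"
	"testing"
)

// vanillaVecAdd is a hypothetical pure-Go baseline (not the repository's
// actual function): it adds b to a element-wise, storing the result in a.
func vanillaVecAdd(a, b []float32) {
	if len(b) < len(a) {
		panic("vanillaVecAdd: b is shorter than a")
	}
	for i := range a {
		a[i] += b[i]
	}
}

func main() {
	a := []float32{1, 2, 3, 4}
	b := []float32{10, 20, 30, 40}
	vanillaVecAdd(a, b)
	fmt.Println(a) // [11 22 33 44]

	// testing.Benchmark runs a benchmark function outside of `go test`.
	// The vector length here is an arbitrary assumption.
	const n = 16
	res := testing.Benchmark(func(bm *testing.B) {
		x := make([]float32, n)
		y := make([]float32, n)
		bm.ResetTimer()
		for i := 0; i < bm.N; i++ {
			vanillaVecAdd(x, y)
		}
	})
	fmt.Println(res) // iteration count and ns/op; values vary by machine
}
```

In a real package, the benchmark body would more typically live in a `_test.go` file as `func BenchmarkVanillaVecAdd(b *testing.B)` and be run with `go test -bench=.`, as in the output above.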
@chewxy

chewxy commented Sep 20, 2016

tf32 and tf64 are at feature parity wrt assembly functions. There's still a bunch of fastmath stuff that hasn't been ported over.
