Go implementation of BLAS (Basic Linear Algebra Subprograms)
.gitignore
LICENSE
README.md
common.go
d_test.go
dasum.go
dasum_amd64.s
daxpy.go
daxpy_amd64.s
dcopy.go
dcopy_amd64.s
ddot.go
ddot_amd64.s
dgemv.go
dnrm2.go
dnrm2_amd64.s
doc.go
drot.go
drot_amd64.s
drotg.go
drotg_amd64.s
drotmg.go
dscal.go
dscal_amd64.s
dswap.go
dswap_amd64.s
idamax.go
idamax_amd64-simd_broken
idamax_amd64.s
isamax.go
isamax_amd64.s
s_test.go
sasum.go
sasum_amd64.s
saxpy.go
saxpy_amd64.s
scopy.go
scopy_amd64.s
sdot.go
sdot_amd64.s
sdsdot.go
sdsdot_amd64.s
simd.txt
snrm2.go
snrm2_amd64.s
srot.go
srot_amd64.s
srotg.go
srotg_amd64.s
sscal.go
sscal_amd64.s
sswap.go
sswap_amd64.s
stubs.bash
stubs_386.s
stubs_arm.s

README.md


Every function is implemented in generic Go and, where justified, is also optimized for AMD64 using SSE2 instructions.

The AMD64 implementation uses MOVUPS/MOVUPD instructions when all strides are equal to 1, so it runs fast on Nehalem, Sandy Bridge and newer processors but relatively slowly on older processors.
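
To illustrate the distinction between the unit-stride case and the general strided case, here is a minimal pure Go sketch of a dot product. The name `ddotGeneric` and its signature are hypothetical and chosen for illustration only; this is not the package's code, and it assumes positive strides.

```go
package main

import "fmt"

// ddotGeneric is a hypothetical reference routine (not the package's code).
// It returns the sum of x[i*incX] * y[i*incY] for i in [0, n).
// Assumption: incX and incY are positive.
func ddotGeneric(n int, x []float64, incX int, y []float64, incY int) float64 {
	var sum float64
	if incX == 1 && incY == 1 {
		// Unit strides: both operands are contiguous, which is the case
		// an assembly version could serve with unaligned vector loads.
		for i, v := range x[:n] {
			sum += v * y[i]
		}
		return sum
	}
	// General case: advance each index by its own stride.
	for i, xi, yi := 0, 0, 0; i < n; i, xi, yi = i+1, xi+incX, yi+incY {
		sum += x[xi] * y[yi]
	}
	return sum
}

func main() {
	x := []float64{1, 2, 3, 4}
	y := []float64{5, 6, 7, 8}
	fmt.Println(ddotGeneric(4, x, 1, y, 1)) // all four elements
	fmt.Println(ddotGeneric(2, x, 2, y, 2)) // elements 0 and 2 only
}
```

With the inputs above, the first call sums 1*5 + 2*6 + 3*7 + 4*8 and the second sums 1*5 + 3*7.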

Every implemented function has its own unit test and benchmark.
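
As a rough sketch of what pairing a correctness check with a benchmark can look like, the program below checks a reference routine and then times it with `testing.Benchmark` from the standard library. The routine `dasumRef`, the helper `checkDasum` and the test data are hypothetical stand-ins; the package's own tests live in its `_test.go` files and may be structured differently.

```go
package main

import (
	"fmt"
	"math"
	"testing"
)

// dasumRef is a hypothetical stand-in for the routine under test.
// It returns the sum of |x[i*incX]| for i in [0, n), assuming incX > 0.
func dasumRef(n int, x []float64, incX int) float64 {
	var sum float64
	for i, xi := 0, 0; i < n; i, xi = i+1, xi+incX {
		sum += math.Abs(x[xi])
	}
	return sum
}

// checkDasum plays the role of a unit test: it compares results
// against hand-computed values for several lengths and strides.
func checkDasum() error {
	x := []float64{1, -2, 3, -4, 5, -6}
	cases := []struct {
		n, inc int
		want   float64
	}{
		{6, 1, 21}, // every element
		{3, 2, 9},  // elements 0, 2 and 4
		{0, 1, 0},  // empty vector
	}
	for _, c := range cases {
		if got := dasumRef(c.n, x, c.inc); got != c.want {
			return fmt.Errorf("n=%d inc=%d: got %v, want %v", c.n, c.inc, got, c.want)
		}
	}
	return nil
}

var sink float64 // keeps the compiler from discarding the benchmarked call

// benchDasum times the routine on a 1000-element vector (an arbitrary size).
func benchDasum(b *testing.B) {
	x := make([]float64, 1000)
	for i := range x {
		x[i] = float64(i%7) - 3
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sink = dasumRef(len(x), x, 1)
	}
}

func main() {
	if err := checkDasum(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	res := testing.Benchmark(benchDasum)
	fmt.Printf("dasumRef: %d ns/op\n", res.NsPerOp())
}
```

In a real package the same two pieces would normally be written as `TestXxx` and `BenchmarkXxx` functions and run with `go test -bench .`.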

Implemented functions

Level 1

Sdsdot, Sdot, Ddot, Snrm2, Dnrm2, Sasum, Dasum, Isamax, Idamax, Sswap, Dswap, Scopy, Dcopy, Saxpy, Daxpy, Sscal, Dscal, Srotg, Drotg, Srot, Drot
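
For readers new to BLAS naming, the sketch below spells out what two of the Level 1 operations compute: axpy updates y with alpha*x + y, and nrm2 returns the Euclidean norm. The functions `daxpyRef` and `dnrm2Ref` are hypothetical reference versions written for this illustration, assuming positive strides; they are not the package's API, and the naive norm here does no scaling to guard against overflow.

```go
package main

import (
	"fmt"
	"math"
)

// daxpyRef is a hypothetical reference version of the axpy operation:
// y[i*incY] += alpha * x[i*incX] for i in [0, n). Assumes positive strides.
func daxpyRef(n int, alpha float64, x []float64, incX int, y []float64, incY int) {
	for i, xi, yi := 0, 0, 0; i < n; i, xi, yi = i+1, xi+incX, yi+incY {
		y[yi] += alpha * x[xi]
	}
}

// dnrm2Ref is a hypothetical, naive Euclidean norm: sqrt(sum of squares).
// It does not rescale intermediate values, so very large inputs can overflow.
func dnrm2Ref(n int, x []float64, incX int) float64 {
	var sum float64
	for i, xi := 0, 0; i < n; i, xi = i+1, xi+incX {
		sum += x[xi] * x[xi]
	}
	return math.Sqrt(sum)
}

func main() {
	x := []float64{3, 4}
	y := []float64{1, 1}
	daxpyRef(2, 2, x, 1, y, 1) // y becomes 2*x + y
	fmt.Println(y)
	fmt.Println(dnrm2Ref(2, x, 1)) // norm of (3, 4)
}
```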

Level 2

not implemented

Level 3

not implemented

Example benchmarks

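| Function | Generic Go | Optimized for AMD64 |
|----------|------------|---------------------|
| Ddot     | 2825 ns/op | 895 ns/op           |
| Dnrm2    | 2787 ns/op | 597 ns/op           |
| Dasum    | 3145 ns/op | 560 ns/op           |
| Sdsdot   | 3133 ns/op | 1733 ns/op          |
| Sdot     | 2832 ns/op | 508 ns/op           |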

Documentation

http://godoc.org/github.com/ziutek/blas
