This repository has been archived by the owner. It is now read-only.
Provide Golang native SIMD intrinsics on x86/amd64 platform

intrinsic


Provide Golang native SIMD intrinsics on x86/amd64 platform

  • SSE2 godoc reference
  • SSE3 godoc reference
  • SSSE3 godoc reference
  • SSE41 godoc reference
  • SSE42 godoc reference

Usage

package main

import (
    "fmt"

    "github.com/mengzhuo/intrinsic/sse2"
)

func main() {
    src := []float32{3.14, 2.17}
    dst := []float32{2.17, 3.15}
    // The result is written back into src.
    sse2.MAXSDm64float32(src, dst)
    fmt.Print(src, dst) // prints [2.17 3.15] [2.17 3.15]
}
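
The output above can look surprising, since 3.14 is the largest element yet does not survive. A plausible reading, based on the semantics of the MAXSD instruction rather than on the package source, is that each two-element float32 slice is treated as one 64-bit scalar double and the larger double is stored into src. The sketch below emulates that in pure Go. The names float32PairToFloat64 and emulateMAXSD are hypothetical and are not part of the intrinsic package.

```go
package main

import (
	"fmt"
	"math"
)

// float32PairToFloat64 reinterprets two float32 values as the low and high
// 32-bit halves of a single float64 bit pattern (little-endian, as on x86).
func float32PairToFloat64(p []float32) float64 {
	lo := uint64(math.Float32bits(p[0]))
	hi := uint64(math.Float32bits(p[1]))
	return math.Float64frombits(hi<<32 | lo)
}

// emulateMAXSD is a hypothetical pure-Go stand-in for the call in the usage
// example. Assumption: the result is written back into src, as the printed
// output in the README suggests.
func emulateMAXSD(src, dst []float32) {
	a := float32PairToFloat64(src)
	b := float32PairToFloat64(dst)
	// MAXSD keeps the first operand only if it is strictly greater;
	// otherwise (including NaN) it returns the second operand.
	if !(a > b) {
		a = b
	}
	bits := math.Float64bits(a)
	src[0] = math.Float32frombits(uint32(bits))
	src[1] = math.Float32frombits(uint32(bits >> 32))
}

func main() {
	src := []float32{3.14, 2.17}
	dst := []float32{2.17, 3.15}
	emulateMAXSD(src, dst)
	fmt.Println(src, dst) // [2.17 3.15] [2.17 3.15]
}
```

Because the second float32 occupies the high half of the double (sign, exponent, and top mantissa bits), it dominates the comparison: 3.15 in the high half of dst outweighs 2.17 in the high half of src, so the whole of dst wins.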

Benchmarks

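The SSE2 intrinsics provide roughly a 6x-7x speedup over the equivalent general Go code. In the sample run below the ratio is about 6x (for example, 2.65 ns/op versus 15.8 ns/op for PMINUB).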

BenchmarkPMINUBByte-4         	1000000000	         2.65 ns/op	       0 B/op	       0 allocs/op
BenchmarkGeneralPMINUBByte-4   	100000000	        15.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkPAND-4               	1000000000	         2.61 ns/op	       0 B/op	       0 allocs/op
BenchmarkGeneralAND-4         	100000000	        15.4 ns/op	       0 B/op	       0 allocs/op
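
The README does not show the "General" baselines used in these benchmarks. As a rough sketch of what such scalar loops might look like, the code below implements byte-wise unsigned minimum (the operation PMINUB performs) and byte-wise AND (PAND) in plain Go. The function names generalPMINUB and generalPAND are hypothetical, and writing the result into the first argument is an assumption chosen to mirror the usage example.

```go
package main

import "fmt"

// generalPMINUB is a hypothetical scalar baseline: for each index it stores
// the smaller of the two unsigned bytes into a. Only the overlapping prefix
// of the two slices is processed.
func generalPMINUB(a, b []byte) {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if b[i] < a[i] {
			a[i] = b[i]
		}
	}
}

// generalPAND is a hypothetical scalar baseline: byte-wise AND of a and b,
// stored into a.
func generalPAND(a, b []byte) {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		a[i] &= b[i]
	}
}

func main() {
	a := []byte{1, 200, 30, 255}
	b := []byte{2, 100, 30, 0}
	generalPMINUB(a, b)
	fmt.Println(a) // [1 100 30 0]

	x := []byte{0xF0, 0xFF}
	y := []byte{0x3C, 0x0F}
	generalPAND(x, y)
	fmt.Println(x) // [48 15]
}
```

A loop like this touches one byte per iteration, whereas a single SSE2 instruction operates on 16 bytes at once, which is consistent with the gap the benchmark numbers show.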

Development

All code in the subdirectories is generated by scanner.go; see the Makefile for details.

x86.csv and x86desc.csv come from a separate repository: https://github.com/mengzhuo/x86data

TODO

  • Resolve code generation for opcodes that take an immediate operand
  • SSE2 gen=80, total=141, ratio=56.74%
  • SSE3 gen=6, total=10, ratio=60.00%
  • SSSE3 gen=15, total=32, ratio=46.88%
  • SSE4_1 gen=26, total=49, ratio=53.06%
  • SSE4_2 gen=1, total=5, ratio=20.00%
  • AVX gen=66, total=378, ratio=17.46%
  • AVX2 gen=8, total=159, ratio=5.03%
  • FMA