gonnx is a small Go package for shipping and using ONNX Runtime from Go
without cgo in your application code. It builds on
onnxruntime-purego, which loads the ONNX Runtime shared library using purego (an amazing alternative to cgo), and adds
packaging helpers for applications that want a fast, batteries-included ONNX Runtime embedded in the Go
binary. There are other non-cgo, non-purego options, but I found their inference to be orders
of magnitude slower. The same packaging helpers also make it easy to embed models
into your binary.

Benefits:
- platform-specific ONNX Runtime bundles for Linux, macOS, and Windows
- blank-import runtime packages, so binaries only include the platforms you choose
- extraction of embedded shared libraries to a SHA-256 checked temp cache
- simple `Open`, `OpenReader`, and `OpenBundle` helpers for models
- small tensor helpers for common input/output handling
- optional bundled model packages under `github.com/mackross/gonnx/models`

```go
import (
	"github.com/mackross/gonnx"
	_ "github.com/mackross/gonnx/runtimes/linuxamd64"
)

sess, err := gonnx.Open("model.onnx", gonnx.WithThreads(1))
if err != nil {
	return err
}
defer sess.Close()
```

Importing a runtime package embeds that ONNX Runtime build in your binary. At
runtime, gonnx extracts the registered shared libraries into a checked cache
under the system temp directory, then loads ONNX Runtime via
onnxruntime-purego.
If you do not want bundled runtime assets, use onnxruntime-purego directly or
pass your own runtime with gonnx.WithRuntime(rt).
The repository includes ONNX Runtime 1.23.x assets, matching the supported
version range of onnxruntime-purego.
Available runtime packages:

- `github.com/mackross/gonnx/runtimes/linuxamd64`
- `github.com/mackross/gonnx/runtimes/linuxarm64`
- `github.com/mackross/gonnx/runtimes/darwinamd64`
- `github.com/mackross/gonnx/runtimes/darwinarm64`
- `github.com/mackross/gonnx/runtimes/windowsamd64`
- `github.com/mackross/gonnx/runtimes/windowsarm64`

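
For readers unfamiliar with blank imports, here is a rough sketch of how a package can register itself from `init()`, the same pattern `database/sql` drivers use. How gonnx registers its runtimes internally is not shown in this README, so `Register`, `Lookup`, and the registry layout below are assumptions made only for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// registry maps "GOOS/GOARCH" to embedded library bytes (hypothetical layout).
var (
	mu       sync.RWMutex
	registry = map[string][]byte{}
)

// Register is what a platform package would call from its init function.
func Register(goos, goarch string, lib []byte) {
	mu.Lock()
	defer mu.Unlock()
	registry[goos+"/"+goarch] = lib
}

// Lookup returns the library registered for the given platform, if any.
func Lookup(goos, goarch string) ([]byte, bool) {
	mu.RLock()
	defer mu.RUnlock()
	lib, ok := registry[goos+"/"+goarch]
	return lib, ok
}

// In a real layout this init would live in its own package, and a blank import
// of that package would be the only thing needed to trigger the registration.
func init() {
	Register("linux", "amd64", []byte("stand-in for embedded library bytes"))
}

func main() {
	if _, ok := Lookup(runtime.GOOS, runtime.GOARCH); !ok {
		fmt.Printf("no runtime registered for %s/%s\n", runtime.GOOS, runtime.GOARCH)
		return
	}
	fmt.Printf("runtime registered for %s/%s\n", runtime.GOOS, runtime.GOARCH)
}
```

Because only imported packages are linked, platforms you do not import contribute nothing to the binary size.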
Bundled model packages live outside this repository under the
gonnx-models GitHub organization. The root
gonnx module intentionally stays small and only provides ONNX Runtime loading,
asset extraction, chunk-joining, and tensor helpers.
Users should import individual model modules directly:

| Import path | Task |
|---|---|
| `github.com/gonnx-models/silero` | Silero voice activity detection for 16 kHz PCM. |
| `github.com/gonnx-models/smartturn` | Smart Turn voice turn-completion detection. |
| `github.com/gonnx-models/neurobert` | Tiny English NER. |
| `github.com/gonnx-models/distilbertcased` | Cased English DistilBERT NER. |
| `github.com/gonnx-models/distilbertuncased` | Uncased English DistilBERT NER. |
| `github.com/gonnx-models/bertcased` | Cased English BERT-base NER. |
| `github.com/gonnx-models/bertuncased` | Uncased English BERT-base NER. |
| `github.com/gonnx-models/multidistilbert` | Multilingual DistilBERT NER. |

The umbrella management repository is
github.com/gonnx-models/models. It
contains the model repositories as git submodules plus workspace/docs for
maintainers; it is not the import path users normally need.
Model packages expose high-level Open(opts ...gonnx.Option) helpers and hide
shared implementation utilities from callers. Large ONNX files are split into
chunks and committed as ordinary Git files rather than stored in Git LFS, so go get can fetch usable embedded
assets. ModelBundle.ModelParts reconstructs chunked models in the local gonnx
cache before opening an ONNX Runtime session.
Example:

```go
import (
	"context"

	"github.com/gonnx-models/bertuncased"
	"github.com/mackross/gonnx"
	_ "github.com/mackross/gonnx/runtimes/linuxamd64"
)

recognizer, err := bertuncased.Open(gonnx.WithThreads(1))
if err != nil {
	return err
}
defer recognizer.Close()

entities, err := recognizer.Recognize(context.Background(), "Barack Obama worked at Microsoft in Seattle.")
if err != nil {
	return err
}
```

Open, OpenReader, OpenBundle, and NewRuntime use the same option style:

```go
sess, err := gonnx.Open("model.onnx",
	gonnx.WithThreads(1),
	gonnx.WithLogLevel(onnxruntime.LoggingLevelWarning),
)
```

Useful options:

- `WithRuntime(rt)` uses an existing `*onnxruntime.Runtime`
- `WithAPIVersion(version)` overrides the default ONNX Runtime C API version, currently `23`
- `WithLogID(id)` sets the ONNX Runtime environment log ID
- `WithLogLevel(level)` sets the ONNX Runtime log level, including verbose
- `WithSessionOptions(options)` passes raw `onnxruntime.SessionOptions`
- `WithThreads(n)` sets `SessionOptions.IntraOpNumThreads`

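
The option style above is Go's functional options pattern. A minimal sketch of how it generally works follows. The `config` type, its fields, and the `"gonnx"` default log ID are hypothetical and do not mirror gonnx's internals; only the default API version of 23 is taken from the list above.

```go
package main

import "fmt"

// config holds the settings that options can change (hypothetical fields).
type config struct {
	threads    int    // 0 stands for "unset" in this sketch
	apiVersion uint32 // default taken from the README's "currently 23"
	logID      string
}

// Option mutates a config; each With... constructor returns one.
type Option func(*config)

func WithThreads(n int) Option       { return func(c *config) { c.threads = n } }
func WithAPIVersion(v uint32) Option { return func(c *config) { c.apiVersion = v } }
func WithLogID(id string) Option     { return func(c *config) { c.logID = id } }

// newConfig starts from defaults and applies options in order, so a later
// option overrides an earlier one that touches the same field.
func newConfig(opts ...Option) config {
	c := config{apiVersion: 23, logID: "gonnx"}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newConfig(WithThreads(1), WithLogID("my-app"))
	fmt.Printf("threads=%d apiVersion=%d logID=%s\n", c.threads, c.apiVersion, c.logID)
}
```

Because every entry point accepts the same variadic option list, new settings can be added later without changing any function signature.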

```go
// Four float32 values passed with the dimensions 1, 4.
input, err := gonnx.Tensor(sess.Runtime, []float32{1, 2, 3, 4}, 1, 4)
if err != nil {
	return err
}
defer input.Close()

// Feed the tensor to the model's first input.
outputs, err := sess.Run(ctx, map[string]*onnxruntime.Value{
	sess.InputNames()[0]: input,
})
if err != nil {
	return err
}
defer outputs[sess.OutputNames()[0]].Close()

// Read the first output back as a flat slice plus its shape.
data, shape, err := gonnx.TensorData[float32](outputs[sess.OutputNames()[0]])
if err != nil {
	return err
}
```

Models and sidecar files can use the same extract-and-cache pattern as runtime libraries:

```go
import "embed"

//go:embed models/model.onnx models/vocab.txt
var modelFS embed.FS

sess, err := gonnx.OpenBundle(gonnx.ModelBundle{
	Name:      "my-model",
	FS:        modelFS,
	ModelRel:  "models/model.onnx",
	ExtraRels: []string{"models/vocab.txt"},
}, gonnx.WithThreads(1))
```

OpenBundle automatically prepares the model bundle. PrepareModelBundle is
available if you want to prewarm the extraction cache or pass the extracted path
to lower-level APIs.
See examples/bert_ner for a live named-entity recognition
example using an ONNX export of
dslim/bert-base-NER-uncased.
Run the live test with:

```sh
GONNX_LIVE_BERT_NER=1 go test -v ./examples/bert_ner -run TestLiveBertNERRecognizesEntities
```

Run the rest of the tests with:

```sh
go test ./...
cd models && go test ./...
```