emb

redis-cli EMB minilm "hello world"
→ \x7c\x8e\x80\xbd...   (384 float32s × 4 bytes)

Install

curl -fsSL https://github.com/elcuervo/emb/raw/main/install.sh | sh

Installs to /usr/local/bin. Set EMB_INSTALL_DIR to change the target:

curl -fsSL https://github.com/elcuervo/emb/raw/main/install.sh | EMB_INSTALL_DIR=~/.local/bin sh

Platforms: macOS (Apple Silicon), Linux (amd64, arm64).

Quick start

# Auto-downloads a model from HuggingFace and starts the server
emb -model-repo Xenova/all-MiniLM-L6-v2

# In another terminal:
redis-cli EMB minilm "hello world"
→ \x7c\x8e\x80\xbd...   (384 float32s × 4 bytes)

Features

Redis protocol: any Redis client works (redis-cli, redis-py, redis-rb, etc.)
ONNX Runtime: fast CPU/GPU inference via CGo bindings
HuggingFace integration: auto-downloads models and auto-detects dim, max_length, output tensor, and pooling strategy from the ONNX graph and config.json
Multi-model queries: EMB.MULTI calls different models in one command (MGET-style partial failures)

Quick start

One-liner (no config file)

# Auto-downloads a model from HuggingFace and starts the server
emb -model-repo Xenova/all-MiniLM-L6-v2

# In another terminal:
redis-cli EMB model "hello world"

Two models inline

emb \
  -model minilm -model-onnx ./models/minilm/model.onnx -model-tokenizer ./models/minilm/tokenizer.json \
  -model bge   -model-repo Xenova/bge-small-en-v1.5

redis-cli EMB.MULTI minilm "hello" bge "world"

Local development (with config file)

# Download a model from HuggingFace
just download-model

# Start the server
just dev

# In another terminal:
redis-cli EMB minilm "hello world"

Commands

Command	Description
`EMB <model> <text> [text...]`	Embed one or more texts. Single text → bulk string, multiple → array of bulk strings
`EMB.MODELS`	List loaded models with dimensions and status
`EMB.INFO <model>`	Model details: dim, workers, requests served, avg latency
`EMB.STATS`	Server statistics: uptime, total requests, per-model breakdown
`EMB.MULTI <model> <text> [<model> <text>...]`	Embed texts across different models in one call
`EMB.HELP`	Command reference
`PING`	PONG
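
Because `EMB` returns a bulk string for one text and an array of bulk strings for several, client code has to handle both reply shapes. The following is a minimal Python sketch, assuming the client returns `bytes` for a bulk string and a `list` of `bytes` for an array (as redis-py does with its default settings). `decode_embedding` and `decode_emb_reply` are hypothetical helper names, not part of emb, and the replies are packed locally so the sketch runs without a server:

```python
import struct


def decode_embedding(raw: bytes) -> list[float]:
    """Decode one bulk string of little-endian float32 values."""
    if len(raw) % 4 != 0:
        raise ValueError(f"expected a multiple of 4 bytes, got {len(raw)}")
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def decode_emb_reply(reply):
    """Return one vector for a single-text reply, or a list of vectors for an array reply."""
    if isinstance(reply, (bytes, bytearray)):
        return decode_embedding(bytes(reply))
    return [decode_embedding(item) for item in reply]


# Simulated replies (hypothetical 3-dimensional vectors, not real model output).
single = struct.pack("<3f", 0.5, -1.0, 2.0)
many = [single, struct.pack("<3f", 1.0, 0.0, 0.25)]

print(decode_emb_reply(single))  # one vector
print(decode_emb_reply(many))    # list of vectors
```

The length check catches truncated replies early: a 384-dimensional model should always yield 384 × 4 = 1536 bytes per text.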

EMB.MULTI example

redis-cli EMB.MULTI minilm "hello" siglip2 "a photo of a cat"
1) \x7c\x8e\x80\xbd...   (minilm, 384 floats)
2) \x4a\x9f\x31\xc2...   (siglip2, 768 floats)
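
Since each model in an `EMB.MULTI` reply can have a different dimension, it helps to pair every reply element with the model that produced it. The Python sketch below assumes that a failed entry arrives as a nil (`None`) element, by analogy with `MGET`. The text above only says partial failures are MGET-style, so treat that as an assumption. `decode_multi` is a hypothetical helper, and the reply is simulated locally:

```python
import struct


def decode_multi(models, reply):
    """Pair each model name with its decoded vector, or None if that entry failed.

    Assumption: a failed entry is a nil element, as in MGET.
    """
    if len(models) != len(reply):
        raise ValueError("reply length does not match the number of model/text pairs")
    result = []
    for name, raw in zip(models, reply):
        if raw is None:
            result.append((name, None))
        else:
            result.append((name, list(struct.unpack(f"<{len(raw) // 4}f", raw))))
    return result


# Simulated reply: the first entry succeeded with 2 floats, the second failed.
reply = [struct.pack("<2f", 0.5, -0.5), None]
for name, vec in decode_multi(["minilm", "siglip2"], reply):
    print(name, vec)
```

The helper returns a list of pairs rather than a dict because the same model may appear more than once in a single `EMB.MULTI` call.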

Configuration

listen: ":6379"

models:
  minilm:
    onnx: ./models/minilm/model.onnx

  siglip2:
    onnx: ./models/siglip2/text_model.onnx
    tokenizer: ./models/siglip2/tokenizer.json
    output_tensor: pooler_output
    pooling: none
    normalize: true
    dim: 768

  # Auto-download from HuggingFace
  e5:
    model_repo: intfloat/e5-small-v2
    pooling: none
    normalize: false

Model options

Field	Default	Description
`onnx`	—	Path to ONNX model file
`tokenizer`	`<model-dir>/tokenizer.json`	Path to HuggingFace tokenizer JSON
`model_repo`	—	HuggingFace repo (auto-downloads ONNX + tokenizer)
`dim`	auto-detected	Embedding dimension
`max_length`	auto-detected (or 512)	Max token sequence length
`pooling`	auto-detected	`mean` (3D output) or `none` (2D pre-pooled)
`normalize`	`false`	L2-normalize the output
`output_tensor`	auto-detected	ONNX output tensor name
`preload`	`false`	Load model at startup instead of on first request
`pad_output`	`false`	Pad sequences to `max_length` with trailing zeros (for compatibility with legacy implementations that don't pass an attention mask)
`workers`	auto-tuned	Number of worker goroutines
`batching`	`{timeout: 1, max_batch: 32}`	Smart batching settings (set `timeout: 0` to disable)
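
To make the `pooling` and `normalize` options concrete, here is a minimal pure-Python sketch of what mean pooling and L2 normalization conventionally mean. It illustrates the standard technique for a single sequence (one slice of the 3D output), assuming padding positions are excluded via the attention mask. It is not a description of emb's internals, and `mean_pool` and `l2_normalize` are hypothetical names:

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sequence, skipping positions where the mask is 0."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [value / count for value in total]


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(value * value for value in vec))
    return list(vec) if norm == 0.0 else [value / norm for value in vec]


# One sequence of three token vectors (dim 2); the last position is padding.
tokens = [[1.0, 5.0], [5.0, 3.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)   # [3.0, 4.0]
print(l2_normalize(pooled))        # [0.6, 0.8]
```

With `pooling: none` the model output is already one vector per text, so only the normalization step would apply.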

Clients

The response is raw little-endian float32 bytes. Any Redis client works.

Ruby:

require "redis_client"

redis = RedisClient.new(port: 6379)
raw = redis.call("EMB", "minilm", "hello world")
emb = raw.unpack("e*")

Or use the emb gem:

require "emb"

Emb[:minilm]["hello world"]
# => [0.0123, -0.0456, 0.0789, ...]

Python:

import struct
import redis

client = redis.Redis(port=6379)
raw = client.execute_command("EMB", "minilm", "hello world")
emb = list(struct.unpack(f"<{len(raw)//4}f", raw))

Go:

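vec := make([]float32, len(raw)/4) // binary.Read fills only existing elements, so size the slice first
if err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, vec); err != nil {
	return err
}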

Ruby Gems

Ruby gems for emb:

emb: Client library with connection pooling, proxy, and multi-model support. Auto-decodes float32 responses.
emb-server: Precompiled server binary. Install and run emb directly.

Development

Commands

just format          # Format all Go code
just lint            # Run linters
just test            # Run tests
just bench           # Run benchmarks
just build           # Build the emb binary
just dev             # Build and run the server
just download-model  # Download a model from HuggingFace

Nix

A flake.nix is provided for reproducible development shells:

nix develop

This provides Go, ONNX Runtime, golangci-lint, just, and all CGo configuration.

Docker

# Run with a model mounted:
docker run -v ./models:/models elcuervo/emb \
  -config /models/config.yaml
