@kessler/gemma-embedding

Local vector embeddings using Google's EmbeddingGemma 300M model via ONNX. Works in Node.js and the browser.

Install

npm install @kessler/gemma-embedding

In Node.js, onnxruntime-node is automatically installed for faster native inference. Browser environments use WASM instead.

Usage

import { GemmaEmbedding, cosine } from '@kessler/gemma-embedding'

const embedding = new GemmaEmbedding()
await embedding.load()

// Embed documents
const doc1 = await embedding.embed('The cat sat on the mat')
const doc2 = await embedding.embed('A kitten was resting on the rug')
const doc3 = await embedding.embed('Quantum physics explains entanglement')

// Compute similarity
cosine(doc1, doc2) // ~0.85 (related)
cosine(doc1, doc3) // ~0.25 (unrelated)

// Query embedding (asymmetric — use for search queries)
const query = await embedding.embed('small pet on furniture', 'query')
cosine(query, doc1) // high similarity

// Batch embedding
const vectors = await embedding.embedBatch(['hello', 'world'])

// Cleanup
await embedding.unload()
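
For readers who want to see what the similarity scores above measure, here is a minimal sketch of cosine similarity and a top-k ranking built on it. The names `cosineSketch` and `rankBySimilarity` are hypothetical and are not part of this package; the package's own `cosine` may be implemented differently. Toy 3-dimensional arrays stand in for real embedding vectors.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Illustrative only: not the package's implementation of `cosine`.
function cosineSketch(a, b) {
  if (a.length !== b.length) throw new Error('Vectors must have the same length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Hypothetical helper: score every document against the query vector,
// sort by descending score, and keep the best `topK`.
function rankBySimilarity(queryVector, docs, topK = 3) {
  return docs
    .map(({ id, vector }) => ({ id, score: cosineSketch(queryVector, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
}

// Toy vectors in place of real embeddings.
const ranked = rankBySimilarity([1, 0, 0], [
  { id: 'a', vector: [0, 1, 0] },
  { id: 'b', vector: [2, 0, 0] },
  { id: 'c', vector: [1, 1, 0] },
], 2)
console.log(ranked.map((r) => r.id)) // [ 'b', 'c' ]
```

In real use, the document vectors would come from `embed` or `embedBatch`, and the query vector from `embed(text, 'query')`.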

Query vs Document Embedding

The model uses asymmetric prefixes for retrieval tasks:

  • Document ('document', default): "title: none | text: <your text>" — use when indexing content
  • Query ('query'): "task: search result | query: <your text>" — use when searching

Embed your corpus with 'document' mode, then embed search queries with 'query' mode for best retrieval quality.
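
To make the two prefixes concrete, the sketch below builds the prompt strings listed above. `withPrefix` is a hypothetical helper written for illustration; how the package applies the prefixes internally is an assumption, and callers of `embed` should not need to add them by hand.

```javascript
// Prompt templates copied from the list above.
// Hypothetical helper: shows the string the model would see for each mode.
const PREFIXES = new Map([
  ['document', (text) => `title: none | text: ${text}`],
  ['query', (text) => `task: search result | query: ${text}`],
])

function withPrefix(text, mode = 'document') {
  const format = PREFIXES.get(mode)
  if (!format) throw new Error(`Unknown mode: ${mode}`)
  return format(text)
}

console.log(withPrefix('The cat sat on the mat'))
// title: none | text: The cat sat on the mat
console.log(withPrefix('small pet on furniture', 'query'))
// task: search result | query: small pet on furniture
```

Because the same text produces a different input string in each mode, a document vector and a query vector for identical text are not expected to match exactly.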

Options

const embedding = new GemmaEmbedding({
  // Load from a local path instead of HuggingFace
  modelPath: '/path/to/local/model',

  // Device: 'cpu' (Node.js default), 'wasm' (browser default), 'webgpu'
  device: 'cpu',

  // Quantization: omit for environment default (fp32 on cpu, q8 on wasm)
  // Options: 'fp32', 'q8', 'q4', 'q4f16' (WebGPU-only)
  dtype: 'q8',

  // Progress callback during model download
  onProgress: (info) => {
    if (info.status === 'loading') console.log(`${info.progress}%`)
    if (info.status === 'ready') console.log('Model ready')
    if (info.status === 'error') console.error(info.error)
  },
})
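
The defaults described in the comments above can be summarized in a small sketch. `resolveOptions` is a hypothetical function, not part of the package, and the environment check via `typeof window` is an assumption about how a browser might be detected. No default dtype is stated for 'webgpu', so the sketch leaves it undefined rather than guessing.

```javascript
const DEVICES = ['cpu', 'wasm', 'webgpu']
const DTYPES = ['fp32', 'q8', 'q4', 'q4f16']

// Hypothetical: mirrors the documented defaults, not the package's real logic.
function resolveOptions({ device, dtype } = {}, isBrowser = typeof window !== 'undefined') {
  // Documented defaults: 'cpu' in Node.js, 'wasm' in the browser.
  const resolvedDevice = device ?? (isBrowser ? 'wasm' : 'cpu')
  if (!DEVICES.includes(resolvedDevice)) throw new Error(`Unknown device: ${resolvedDevice}`)

  // Documented defaults: fp32 on cpu, q8 on wasm. Nothing is stated for webgpu.
  const defaultDtype = resolvedDevice === 'cpu' ? 'fp32'
    : resolvedDevice === 'wasm' ? 'q8'
    : undefined
  const resolvedDtype = dtype ?? defaultDtype
  if (resolvedDtype !== undefined && !DTYPES.includes(resolvedDtype)) {
    throw new Error(`Unknown dtype: ${resolvedDtype}`)
  }
  // 'q4f16' is documented as WebGPU-only.
  if (resolvedDtype === 'q4f16' && resolvedDevice !== 'webgpu') {
    throw new Error("dtype 'q4f16' requires device 'webgpu'")
  }
  return { device: resolvedDevice, dtype: resolvedDtype }
}

console.log(resolveOptions({}, false)) // { device: 'cpu', dtype: 'fp32' }
console.log(resolveOptions({}, true)) // { device: 'wasm', dtype: 'q8' }
```

Passing `isBrowser` explicitly keeps the sketch testable in either environment.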

Model Details

License

Apache-2.0
