GitHub - themaximalist/embeddings.js: Simple text embeddings library for Node.js (OpenAI, Mistral, Local)

Embeddings.js

Embeddings.js is a simple way to get text embeddings in Node.js. Embeddings are useful for text similarity search using a vector database.

await embeddings("Hello World!"); // embedding array

Easy to use
Works with any vector database
Supports multiple embedding models with the same simple interface
- Local with Xenova/all-MiniLM-L6-v2
- OpenAI with text-embedding-ada-002
- Mistral with mistral-embed
Caches embeddings
MIT license

Install

npm install @themaximalist/embeddings.js

To use local embeddings, also install Transformers.js, the package that runs the model locally:

npm install @xenova/transformers

Configure

Embeddings.js works out of the box with local embeddings, but if you use the OpenAI or Mistral embeddings you'll need an API key in your environment.

export OPENAI_API_KEY=<your-openai-api-key>
export MISTRAL_API_KEY=<your-mistral-api-key>
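
To fail early when a key is missing, you could check the environment before requesting embeddings. This is a minimal sketch, assuming the Mistral key is read from MISTRAL_API_KEY; the `API_KEY_ENV` map and `assertApiKey` helper are hypothetical and not part of Embeddings.js.

```javascript
// Hypothetical helper, not part of Embeddings.js.
// Maps each remote service to the environment variable it is assumed to read.
const API_KEY_ENV = new Map([
    ["openai", "OPENAI_API_KEY"],
    ["mistral", "MISTRAL_API_KEY"], // assumed spelling
]);

function assertApiKey(service, env = process.env) {
    const name = API_KEY_ENV.get(service);
    if (!name) return; // local embeddings need no API key
    if (!env[name]) {
        throw new Error(`Missing ${name} for the "${service}" service`);
    }
}
```

Calling `assertApiKey("openai")` before the first request turns a confusing API failure into a clear configuration error.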

Usage

Using Embeddings.js is as simple as calling a function with any string.

import embeddings from "@themaximalist/embeddings.js";

// defaults to local embeddings
const embedding = await embeddings("Hello World!");
// 384 dimension embedding array

Switching embedding models is easy:

// openai
const openaiEmbedding = await embeddings("Hello World", {
    service: "openai"
});
// 1536 dimension embedding array

// mistral
const mistralEmbedding = await embeddings("Hello World", {
    service: "mistral"
});
// 1024 dimension embedding array
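
Because every service shares the same call signature, one small helper can embed a list of texts with whichever service you pick. The sketch below is an illustration, not part of the library: `embedAll` is a hypothetical name, and the `embed` argument is assumed to be the default export of @themaximalist/embeddings.js (or any function with the same shape).

```javascript
// Hypothetical helper: embed several texts with the same options.
// `embed` is any async function of the form embed(text, options).
async function embedAll(embed, texts, options = {}) {
    const vectors = [];
    for (const text of texts) {
        // Sequential on purpose, to stay gentle on remote API rate limits.
        vectors.push(await embed(text, options));
    }
    return vectors;
}
```

With the library imported as `embeddings`, `await embedAll(embeddings, ["Hello", "World"], { service: "openai" })` would return one vector per input text.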

Cache

Embeddings.js caches by default, but you can disable it by passing cache: false as an option.

// don't cache (on by default)
const embedding = await embeddings("Hello World!", {
    cache: false
});

The cache file is written to .embeddings.cache.json. You can delete this file to reset the cache.
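
Resetting the cache can also be done from a script. This is a minimal sketch using only Node.js built-ins; `resetCache` is a hypothetical helper, and it assumes the cache lives in a single file at the given path.

```javascript
// Hypothetical helper: delete the cache file so embeddings are recomputed.
// The dynamic import keeps the snippet usable from both ESM and CommonJS.
async function resetCache(cacheFile = ".embeddings.cache.json") {
    const { rm } = await import("node:fs/promises");
    // force: true means a missing file is not treated as an error
    await rm(cacheFile, { force: true });
}
```

If you pass a custom cache_file option, pass the same path to `resetCache`.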

API

The Embeddings.js API is a simple function you call with your text, with an optional config object.

await embeddings(
    input, // Text input to compute embeddings
    {
        service: "openai", // Embedding service
        model: "text-embedding-ada-002", // Embedding model
        cache: true, // Cache embeddings
        cache_file: ".embeddings.cache.json", // Cache file
    }
);

Options

service <string>: Embedding service provider. Default is transformers, a local embedding provider.
model <string>: Embedding service model. If no model is provided, the default for the selected service is used. For the default local service this is Xenova/all-MiniLM-L6-v2.
cache <bool>: Cache embeddings. Default is true.
cache_file <string>: Cache file. Default is .embeddings.cache.json.
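
If options come from user input or a config file, a small type check can catch mistakes before any request is made. The sketch below relies only on the option types documented above; `checkOptions` is a hypothetical helper, and keys it does not know about are passed through untouched in case the library accepts more than is listed here.

```javascript
// Hypothetical validator based on the documented option types.
const OPTION_TYPES = new Map([
    ["service", "string"],
    ["model", "string"],
    ["cache", "boolean"],
    ["cache_file", "string"],
]);

function checkOptions(options = {}) {
    for (const [key, value] of Object.entries(options)) {
        const expected = OPTION_TYPES.get(key);
        if (!expected) continue; // undocumented key: leave it alone
        if (typeof value !== expected) {
            throw new TypeError(`Option "${key}" must be a ${expected}`);
        }
    }
    return options;
}
```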

Response

Embeddings.js returns a float[] — an array of floating-point numbers.

[ -0.011776604689657688,   0.024298833683133125,  0.0012317118234932423, ... ]

The length of the array is the dimensions of the embedding. When performing text similarity, you'll want to know the dimensions of your embeddings to use them in a vector database.

Embedding Dimensions

Local: 384
OpenAI: 1536
Mistral: 1024

The Embeddings.js API ensures you have a simple way to use embeddings from multiple providers.
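
Since a vector database index is usually created for one fixed dimension, it can help to verify an embedding's length before storing it. This sketch assumes the default model for each service, using the dimensions listed above; `DIMENSIONS` and `assertDimensions` are hypothetical names.

```javascript
// Dimensions for each service's default model, keyed by the `service` option.
const DIMENSIONS = new Map([
    ["transformers", 384],
    ["openai", 1536],
    ["mistral", 1024],
]);

function assertDimensions(embedding, service = "transformers") {
    const expected = DIMENSIONS.get(service);
    if (expected === undefined) {
        throw new Error(`Unknown service: ${service}`);
    }
    if (embedding.length !== expected) {
        throw new Error(
            `Expected ${expected} dimensions for ${service}, got ${embedding.length}`
        );
    }
    return embedding;
}
```

If you pass a non-default model, its dimension may differ, so adjust the map accordingly.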

Debug

Embeddings.js uses the debug npm module with the embeddings.js namespace.

View debug logs by setting the DEBUG environment variable.

> DEBUG=embeddings.js* node src/get_embeddings.js
# debug logs
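
Debug logs can also be switched on from code. The debug module reads the DEBUG environment variable when it is first loaded, so the variable has to be set before the library is imported. The sketch below assumes nothing else has loaded debug earlier in the process; `withDebugNamespace` and `loadEmbeddingsWithDebug` are hypothetical helpers.

```javascript
// Add the embeddings.js namespace to a DEBUG value without dropping existing ones.
function withDebugNamespace(current, namespace = "embeddings.js*") {
    const parts = current ? current.split(",") : [];
    if (!parts.includes(namespace)) parts.push(namespace);
    return parts.join(",");
}

// Set DEBUG first, then load the library with a dynamic import.
// A static import would be hoisted and run before DEBUG is set.
async function loadEmbeddingsWithDebug() {
    process.env.DEBUG = withDebugNamespace(process.env.DEBUG);
    const { default: embeddings } = await import("@themaximalist/embeddings.js");
    return embeddings;
}
```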

Vector Database

Embeddings can be used with any vector database, such as Pinecone, Chroma, or pgvector.

For a local vector database that runs in-memory and uses Embeddings.js internally, check out VectorDB.js.
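
To show what a vector database does with these arrays, here is a minimal in-memory similarity search. It is an illustrative sketch, not how VectorDB.js or any particular database is implemented; `cosineSimilarity` and `nearest` are hypothetical helpers, and the entries are assumed to be `{ text, vector }` objects you built from earlier embedding calls.

```javascript
// Cosine similarity between two vectors of equal length.
// Returns NaN if either vector is all zeros.
function cosineSimilarity(a, b) {
    if (a.length !== b.length) throw new Error("Dimension mismatch");
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored entries by similarity to the query vector and keep the top k.
function nearest(queryVector, entries, k = 3) {
    return entries
        .map((entry) => ({
            text: entry.text,
            score: cosineSimilarity(queryVector, entry.vector),
        }))
        .sort((x, y) => y.score - x.score)
        .slice(0, k);
}
```

A linear scan like this is fine for a few thousand entries; dedicated vector databases use indexes to stay fast at larger scale.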

Projects

Embeddings.js is currently used in the following projects:

AI.js — simple AI library
VectorDB.js — local text similarity search
HyperType — knowledge graph toolkit
HyperTyper — multidimensional mind mapping

License

MIT

Author

Created by The Maximalist. See our open-source projects.
