Skip to content

yangbooom/sqlite-wasm-vec

Repository files navigation

SQLite Wasm with sqlite-vec

SQLite Wasm with sqlite-vec extension for vector similarity search, conveniently wrapped as an ES Module.

This is a fork of @sqlite.org/sqlite-wasm that includes the sqlite-vec extension compiled into the WASM build, enabling vector operations directly in the browser.

Features

  • 🚀 Full SQLite with WASM support (SQLite 3.51.0)
  • 🔍 Vector similarity search using sqlite-vec extension (v0.1.7-alpha.2)
  • 📦 ~1.5MB WASM file (optimized with -O2)
  • 🌐 Browser & Node.js compatible
  • 💾 OPFS support for persistent storage
  • 🎯 TypeScript definitions included
  • 100% API compatible with @sqlite.org/sqlite-wasm

Node.js support

Warning

Node.js is currently only supported for in-memory databases without persistence.

Installation

npm install sqlite-wasm-vec

sqlite-vec Features

All sqlite-vec functions are available out of the box. Here are comprehensive examples:

Basic Setup and Version Check

import sqlite3InitModule from 'sqlite-wasm-vec';

// Initialize SQLite with vec extension
const sqlite3 = await sqlite3InitModule();
const db = new sqlite3.oo1.DB(':memory:'); // or use a file path

// Check sqlite-vec version
const vecVersion = db.selectValue('SELECT vec_version()');
console.log('sqlite-vec version:', vecVersion); // e.g., "v0.1.7-alpha.2"

Creating Vector Tables

// Create a virtual table for storing vectors
// Specify the vector dimension (e.g., 384 for all-MiniLM-L6-v2 embeddings)
db.exec(`
  CREATE VIRTUAL TABLE vec_items USING vec0(
    embedding float[384]
  )
`);

// You can also create tables with different vector types
db.exec(`
  CREATE VIRTUAL TABLE vec_int8_items USING vec0(
    embedding int8[128]  -- 8-bit integer vectors
  )
`);

db.exec(`
  CREATE VIRTUAL TABLE vec_bit_items USING vec0(
    embedding bit[1024]  -- Binary vectors for hamming distance
  )
`);

Inserting Vectors

// Method 1: Using prepared statements (recommended for multiple inserts)
const vectors = [
  { id: 1, data: [0.1, 0.2, 0.3, 0.4] },
  { id: 2, data: [0.5, 0.6, 0.7, 0.8] },
  { id: 3, data: [0.9, 1.0, 1.1, 1.2] },
];

const stmt = db.prepare(
  'INSERT INTO vec_items(rowid, embedding) VALUES (?, ?)',
);
for (const vector of vectors) {
  const float32Array = new Float32Array(vector.data);
  stmt.bind(vector.id, float32Array.buffer).stepReset();
}
stmt.finalize();

// Method 2: Direct insert with typed arrays
const embedding = new Float32Array(384);
// ... fill embedding with your data ...
db.exec({
  sql: 'INSERT INTO vec_items(rowid, embedding) VALUES (?, ?)',
  bind: [100, embedding.buffer],
});

Vector Similarity Search

// Perform k-NN search using MATCH operator
const queryVector = new Float32Array([0.2, 0.3, 0.4, 0.5]);

// Find 5 most similar vectors
const results = db.selectArrays(
  'SELECT rowid, distance FROM vec_items WHERE embedding MATCH ? ORDER BY distance LIMIT 5',
  queryVector.buffer,
);

// Results format: [[rowid1, distance1], [rowid2, distance2], ...]
results.forEach(([id, distance]) => {
  console.log(`Item ${id}: distance = ${distance}`);
});

// With additional filtering
const filtered = db.selectObjects(
  `SELECT
     vi.rowid,
     vi.distance,
     items.title,
     items.category
   FROM vec_items vi
   JOIN items ON items.id = vi.rowid
   WHERE
     embedding MATCH ? AND
     items.category = 'electronics'
   ORDER BY distance
   LIMIT 10`,
  queryVector.buffer,
);

Using Distance Functions

// Store vectors for comparison
db.exec('CREATE TEMP TABLE test_vectors (id INTEGER PRIMARY KEY, vec BLOB)');

const v1 = new Float32Array([1.0, 2.0, 3.0]);
const v2 = new Float32Array([4.0, 5.0, 6.0]);

const stmt = db.prepare('INSERT INTO test_vectors(id, vec) VALUES (?, ?)');
stmt.bind(1, v1.buffer).stepReset();
stmt.bind(2, v2.buffer).stepReset();
stmt.finalize();

// Calculate different distance metrics
const l2Distance = db.selectValue(`
  SELECT vec_distance_l2(
    (SELECT vec FROM test_vectors WHERE id = 1),
    (SELECT vec FROM test_vectors WHERE id = 2)
  )
`);

const cosineDistance = db.selectValue(`
  SELECT vec_distance_cosine(
    (SELECT vec FROM test_vectors WHERE id = 1),
    (SELECT vec FROM test_vectors WHERE id = 2)
  )
`);

const l1Distance = db.selectValue(`
  SELECT vec_distance_l1(
    (SELECT vec FROM test_vectors WHERE id = 1),
    (SELECT vec FROM test_vectors WHERE id = 2)
  )
`);

console.log('L2 distance:', l2Distance);
console.log('Cosine distance:', cosineDistance);
console.log('L1 distance:', l1Distance);

Vector Operations

// Vector normalization
const normalized = db.selectValue(
  'SELECT vec_normalize(?)',
  new Float32Array([3.0, 4.0]).buffer,
);

// Vector addition and subtraction
const sum = db.selectValue(
  'SELECT vec_add(?, ?)',
  new Float32Array([1, 2, 3]).buffer,
  new Float32Array([4, 5, 6]).buffer,
);

// Get vector metadata
const length = db.selectValue(
  'SELECT vec_length(?)',
  new Float32Array([1, 2, 3, 4, 5]).buffer,
); // Returns 5

// Convert vector to JSON for inspection
const jsonArray = db.selectValue(
  'SELECT vec_to_json(?)',
  new Float32Array([1.5, 2.5, 3.5]).buffer,
); // Returns "[1.5,2.5,3.5]"

Quantization for Memory Efficiency

// Quantize float32 vectors to int8 (reduces memory by 4x)
const float32Vec = new Float32Array([0.1, 0.5, 0.9, -0.3]);
const int8Vec = db.selectValue(
  'SELECT vec_quantize_int8(?)',
  float32Vec.buffer,
);

// Quantize to binary vectors (reduces memory by 32x)
const binaryVec = db.selectValue(
  'SELECT vec_quantize_binary(?)',
  float32Vec.buffer,
);

// Use quantized vectors for approximate search
db.exec(`
  CREATE VIRTUAL TABLE vec_binary USING vec0(
    embedding bit[1024]
  )
`);

Practical Example: Building a Semantic Search

import sqlite3InitModule from 'sqlite-wasm-vec';

async function buildSemanticSearch() {
  const sqlite3 = await sqlite3InitModule();
  const db = new sqlite3.oo1.DB(':memory:');

  // Create tables for documents and their embeddings
  db.exec(`
    CREATE TABLE documents (
      id INTEGER PRIMARY KEY,
      title TEXT,
      content TEXT,
      metadata JSON
    );

    CREATE VIRTUAL TABLE doc_embeddings USING vec0(
      embedding float[384]  -- Using 384-dim embeddings
    );
  `);

  // Insert sample documents with their embeddings
  const documents = [
    {
      id: 1,
      title: 'Introduction to AI',
      content: 'AI is transforming...',
      embedding: [
        /* ... */
      ],
    },
    {
      id: 2,
      title: 'Machine Learning Basics',
      content: 'ML algorithms learn...',
      embedding: [
        /* ... */
      ],
    },
    // ... more documents
  ];

  const insertDoc = db.prepare('INSERT INTO documents VALUES (?, ?, ?, ?)');
  const insertEmb = db.prepare(
    'INSERT INTO doc_embeddings(rowid, embedding) VALUES (?, ?)',
  );

  for (const doc of documents) {
    insertDoc
      .bind(doc.id, doc.title, doc.content, JSON.stringify({}))
      .stepReset();
    insertEmb.bind(doc.id, new Float32Array(doc.embedding).buffer).stepReset();
  }

  insertDoc.finalize();
  insertEmb.finalize();

  // Search function
  function search(queryEmbedding, limit = 5) {
    return db.selectObjects(
      `
      SELECT
        d.id,
        d.title,
        d.content,
        e.distance
      FROM doc_embeddings e
      JOIN documents d ON d.id = e.rowid
      WHERE e.embedding MATCH ?
      ORDER BY e.distance
      LIMIT ?
    `,
      new Float32Array(queryEmbedding).buffer,
      limit,
    );
  }

  // Perform a search
  const queryVector = [
    /* your query embedding */
  ];
  const results = search(queryVector, 10);

  return { db, search };
}

Available Functions

All sqlite-vec functions are compiled and available:

Core Functions:

  • vec_version() - Returns the version of the sqlite-vec extension
  • vec_debug() - Debugging information about the extension

Vector Creation and Manipulation:

  • vec_f32(values...) - Create a float32 vector from values
  • vec_int8(values...) - Create an int8 vector from values
  • vec_bit(values...) - Create a bit vector from values
  • vec_slice(vector, start, end) - Extract a slice of a vector
  • vec_length(vector) - Get the number of dimensions in a vector
  • vec_type(vector) - Get the type of a vector (f32, int8, bit)
  • vec_to_json(vector) - Convert vector to JSON array representation

Vector Operations:

  • vec_add(a, b) - Element-wise addition of two vectors
  • vec_sub(a, b) - Element-wise subtraction of two vectors
  • vec_normalize(vector) - Normalize a vector to unit length

Distance Functions:

  • vec_distance_l2(a, b) - Euclidean/L2 distance between vectors
  • vec_distance_cosine(a, b) - Cosine distance (1 - cosine similarity)
  • vec_distance_l1(a, b) - Manhattan/L1 distance between vectors
  • vec_distance_hamming(a, b) - Hamming distance for bit vectors

Quantization Functions:

  • vec_quantize_binary(vector) - Quantize to binary (bit) representation
  • vec_quantize_int8(vector) - Quantize float32 to int8 representation

Usage

There are three ways to use SQLite Wasm:

Only the worker versions allow you to use the origin private file system (OPFS) storage back-end.

In a wrapped worker (with OPFS if available):

Warning

For this to work, you need to set the following headers on your server:

Cross-Origin-Opener-Policy: same-origin

Cross-Origin-Embedder-Policy: require-corp

import { sqlite3Worker1Promiser } from 'sqlite-wasm-vec';

const log = console.log;
const error = console.error;

const initializeSQLite = async () => {
  try {
    log('Loading and initializing SQLite3 module...');

    const promiser = await new Promise((resolve) => {
      const _promiser = sqlite3Worker1Promiser({
        onready: () => resolve(_promiser),
      });
    });

    log('Done initializing. Running demo...');

    const configResponse = await promiser('config-get', {});
    log('Running SQLite3 version', configResponse.result.version.libVersion);

    const openResponse = await promiser('open', {
      filename: 'file:mydb.sqlite3?vfs=opfs',
    });
    const { dbId } = openResponse;
    log(
      'OPFS is available, created persisted database at',
      openResponse.result.filename.replace(/^file:(.*?)\?vfs=opfs$/, '$1'),
    );
    // Your SQLite code here.
  } catch (err) {
    if (!(err instanceof Error)) {
      err = new Error(err.result.message);
    }
    error(err.name, err.message);
  }
};

initializeSQLite();

The promiser object above implements the Worker1 API.

In a worker (with OPFS if available):

Warning

For this to work, you need to set the following headers on your server:

Cross-Origin-Opener-Policy: same-origin

Cross-Origin-Embedder-Policy: require-corp

// In `main.js`.
const worker = new Worker('worker.js', { type: 'module' });
// In `worker.js`.
import sqlite3InitModule from 'sqlite-wasm-vec';

const log = console.log;
const error = console.error;

const start = (sqlite3) => {
  log('Running SQLite3 version', sqlite3.version.libVersion);
  const db =
    'opfs' in sqlite3
      ? new sqlite3.oo1.OpfsDb('/mydb.sqlite3')
      : new sqlite3.oo1.DB('/mydb.sqlite3', 'ct');
  log(
    'opfs' in sqlite3
      ? `OPFS is available, created persisted database at ${db.filename}`
      : `OPFS is not available, created transient database ${db.filename}`,
  );
  // Your SQLite code here.
};

const initializeSQLite = async () => {
  try {
    log('Loading and initializing SQLite3 module...');
    const sqlite3 = await sqlite3InitModule({ print: log, printErr: error });
    log('Done initializing. Running demo...');
    start(sqlite3);
  } catch (err) {
    error('Initialization error:', err.name, err.message);
  }
};

initializeSQLite();

The db object above implements the Object Oriented API #1.

In the main thread (without OPFS):

import sqlite3InitModule from 'sqlite-wasm-vec';

const log = console.log;
const error = console.error;

const start = (sqlite3) => {
  log('Running SQLite3 version', sqlite3.version.libVersion);
  const db = new sqlite3.oo1.DB('/mydb.sqlite3', 'ct');
  // Your SQLite code here.
};

const initializeSQLite = async () => {
  try {
    log('Loading and initializing SQLite3 module...');
    const sqlite3 = await sqlite3InitModule({
      print: log,
      printErr: error,
    });
    log('Done initializing. Running demo...');
    start(sqlite3);
  } catch (err) {
    error('Initialization error:', err.name, err.message);
  }
};

initializeSQLite();

The db object above implements the Object Oriented API #1.

Usage with vite

If you are using vite, you need to add the following config option in vite.config.js:

import { defineConfig } from 'vite';

export default defineConfig({
  server: {
    headers: {
      'Cross-Origin-Opener-Policy': 'same-origin',
      'Cross-Origin-Embedder-Policy': 'require-corp',
    },
  },
  optimizeDeps: {
    exclude: ['sqlite-wasm-vec'],
  },
});

Check out a sample project that shows this in action.

Demo

See the demo folder for examples of how to use this in the main thread and in a worker. (Note that the worker variant requires special HTTP headers, so it can't be hosted on GitHub Pages.) An example that shows how to use this with vite is available on StackBlitz.

Projects using this package

See the list of npm dependents for this package.

Deploying a new version

(These steps can only be executed by maintainers.)

  1. Update the version number in package.json reflecting the current SQLite version number and add a build identifier suffix like -build1. The complete version number should read something like 3.41.2-build1.
  2. Run npm run build to build the ES Module. This downloads the latest SQLite Wasm binary and builds the ES Module.
  3. Run npm run deploy to commit the changes, push to GitHub, and publish the new version to npm.

License

Apache 2.0.

Acknowledgements

This project is based on SQLite Wasm, which it conveniently wraps as an ES Module and publishes to npm as @sqlite.org/sqlite-wasm.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

No packages published

Contributors 21