VGI — Vector Gateway Interface for DuckDB

VGI (Vector Gateway Interface) is an Apache Arrow–based protocol for extending DuckDB using any language — no C++/C/Rust/Zig and no compilation or linking required.

This repository contains the DuckDB extension (the C++ side of VGI). It loads into DuckDB and lets you ATTACH a worker — a program written in Python, TypeScript, Go, or anything that speaks the protocol — and then call the scalar, table, and aggregate functions that worker exposes as if they were native DuckDB functions. Data moves between DuckDB and the worker over Apache Arrow IPC, across a subprocess pipe, an HTTP connection, or a Unix domain socket.

The reference worker SDK is vgi-python — pip install vgi.

Created by Query.Farm.

Why VGI?

Traditional DuckDB extensions	VGI workers
C/C++ compilation required	Any language — write a script or ship a binary
Tied to a specific DuckDB version	Version independent
Complex build / release cycle	Ship a script or executable
Runs in-process	Process isolation
Single-threaded by default	Parallel pooled workers

Use cases: call REST APIs from SQL, run ML inference (PyTorch, scikit-learn), process data with Python/pandas/numpy, build custom ETL transforms, or expose external data sources as queryable tables and views.

Quick Start

Install the extension and attach a worker:

-- First time only
INSTALL vgi FROM community;
LOAD vgi;

-- Attach a worker as a catalog
ATTACH 'my_funcs' (TYPE vgi, LOCATION './my_worker.py');

-- Call the functions it exposes
SELECT upper_case(name) FROM users;
SELECT * FROM my_table_function('arg');

A minimal Python worker (using vgi-python):

# my_worker.py
from typing import Annotated
import pyarrow as pa
import pyarrow.compute as pc
from vgi import ScalarFunction, Param, Returns, Worker


class UpperCase(ScalarFunction):
    """Convert string values to uppercase."""

    @classmethod
    def compute(
        cls,
        value: Annotated[pa.StringArray, Param(doc="String value to uppercase")],
    ) -> Annotated[pa.StringArray, Returns()]:
        return pc.utf8_upper(value)


class MyWorker(Worker):
    catalog_name = "my_funcs"
    functions = [UpperCase]


if __name__ == "__main__":
    MyWorker().run()

Features

Function shapes — scalar, table, table-in-out (streaming), buffered table (see-every-row-before-output), aggregate, and windowed aggregate functions, all defined in the worker and surfaced as native DuckDB functions.
Full catalog integration — workers expose schemas, tables, views, and functions; DuckDB lazily loads catalog metadata and column statistics, and supports multi-branch (UNION-ALL) tables.
Multiple transports — LOCATION accepts a bare command (subprocess, pooled per DuckDB process), http(s):// (HTTP), unix:///path/to.sock (AF_UNIX), or launch:<argv> (launcher-managed shared worker).
Pushdown — projection, filter, ORDER BY + LIMIT, and join-key pushdown to workers that opt in.
Worker pooling — subprocess workers are pooled and reused across queries, with diagnostics (vgi_worker_subprocess_pool(), vgi_worker_pool_stats()).
Performance — optional POSIX shared-memory transport for zero-copy batch transfer.
Auth — per-catalog OAuth / bearer tokens; OIDC identity introspection via vgi_catalog_identity().
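
The LOCATION forms listed under "Multiple transports" can be told apart by prefix. The following is an illustrative sketch only: classify_location and its return labels are hypothetical, and the extension's actual parsing rules (implemented in C++) may differ in details this README does not state.

```python
def classify_location(location: str) -> str:
    """Map a LOCATION string to a transport label (hypothetical helper)."""
    # http:// and https:// select the HTTP transport.
    if location.startswith(("http://", "https://")):
        return "http"
    # unix:///path/to.sock selects a Unix domain socket.
    if location.startswith("unix://"):
        return "unix"
    # launch:<argv> selects a launcher-managed shared worker.
    if location.startswith("launch:"):
        return "launch"
    # Anything else is assumed to be a bare command run as a pooled subprocess.
    return "subprocess"


if __name__ == "__main__":
    # The example values below are placeholders, not real endpoints.
    for loc in ["./my_worker.py", "https://example.com/vgi",
                "unix:///path/to.sock", "launch:my_worker"]:
        print(f"{loc!r} -> {classify_location(loc)}")
```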

See the docs/ directory for deep dives on multi-branch tables, the launcher protocol, catalog profiling, and more.

Building

This extension uses vcpkg for dependency management and is built from multiple modules, so set USE_MERGED_VCPKG_MANIFEST=1 when building. The Makefile auto-detects the vcpkg toolchain from vcpkg/ in the project tree.

git clone --recurse-submodules https://github.com/Query-farm/vgi.git
cd vgi

# Debug build
USE_MERGED_VCPKG_MANIFEST=1 GEN=ninja make debug

# Release build
USE_MERGED_VCPKG_MANIFEST=1 GEN=ninja make release

Installing ccache and ninja is strongly recommended for fast incremental rebuilds.

The build produces:

./build/release/duckdb                                    # DuckDB shell with the extension preloaded
./build/release/test/unittest                             # test runner
./build/release/extension/vgi/vgi.duckdb_extension        # the loadable extension

Testing

The test suite covers two transports, subprocess and HTTP; subprocess is the faster default. Tests prefer the release build:

# Subprocess transport (default, faster)
make test_subprocess

# HTTP transport
make test_http

# Both
make test_all

The VGI_TEST_WORKER environment variable controls which worker is used and defaults to the vgi-python fixture worker. See the project root CLAUDE.md for the full matrix of test targets, debug builds, and environment variables.

License

This project is licensed under the Query Farm Source-Available License, Version 1.0 (Licensor: Query Farm LLC). Non-production use is freely permitted, as is production use — except for offering a Competing Offering or operating a Commercial Marketplace built on the Licensed Work, which require a separate commercial license. Each version converts to Apache 2.0 on its Change Date (the tenth anniversary of that version's first public release). See LICENSE for the full terms.

For commercial or custom licensing, contact hello@query.farm.
