The ADBC Scanner extension by Query.Farm enables DuckDB to connect to external databases using Arrow Database Connectivity (ADBC), a column-oriented API standard for database access. ADBC provides efficient data transfer using Apache Arrow's columnar format.
INSTALL adbc_scanner FROM community;
LOAD adbc_scanner;-- Connect to a SQLite database using driver name (requires installed manifest)
SET VARIABLE conn = (SELECT adbc_connect({
'driver': 'sqlite',
'uri': ':memory:'
}));
-- Or connect with explicit driver path
SET VARIABLE conn = (SELECT adbc_connect({
'driver': '/path/to/libadbc_driver_sqlite.dylib',
'uri': ':memory:'
}));
-- Query data
SELECT * FROM adbc_scan(getvariable('conn')::BIGINT, 'SELECT * FROM my_table');
-- Execute DDL/DML
SELECT adbc_execute(getvariable('conn')::BIGINT, 'CREATE TABLE users (id INT, name TEXT)');
SELECT adbc_execute(getvariable('conn')::BIGINT, 'INSERT INTO users VALUES (1, ''Alice'')');
-- Disconnect when done
SELECT adbc_disconnect(getvariable('conn')::BIGINT);Creates a connection to an external database via ADBC.
adbc_connect(options) -> BIGINTParameters:
options: A STRUCT or MAP containing connection options
Required Options:
driver: Driver name (e.g.,'sqlite','postgresql'), path to shared library, or path to manifest file (.toml)
Connection Options:
uri: Connection URI (driver-specific)username: Database usernamepassword: Database password
Driver Resolution Options:
entrypoint: Custom driver entrypoint function name (rarely needed)search_paths: Additional paths to search for driver manifests (colon-separated on Unix, semicolon on Windows)use_manifests: Enable/disable manifest search (default:'true'). Set to'false'to only use direct library paths.
Returns: A connection handle (BIGINT) used with other ADBC functions.
Examples:
-- Using driver name (requires installed manifest)
SELECT adbc_connect({
'driver': 'sqlite',
'uri': '/path/to/database.db'
});
-- Using explicit driver path
SELECT adbc_connect({
'driver': '/path/to/libadbc_driver_sqlite.dylib',
'uri': '/path/to/database.db'
});
-- PostgreSQL with credentials
SELECT adbc_connect({
'driver': 'postgresql',
'uri': 'postgresql://localhost:5432/mydb',
'username': 'user',
'password': 'pass'
});
-- Using MAP syntax
SELECT adbc_connect(MAP {
'driver': 'postgresql',
'uri': 'postgresql://localhost:5432/mydb',
'username': 'user',
'password': 'pass'
});
-- With custom search paths for driver manifests
SELECT adbc_connect({
'driver': 'sqlite',
'uri': ':memory:',
'search_paths': '/opt/adbc/drivers:/custom/path'
});
-- Disable manifest search (only use direct library paths)
SELECT adbc_connect({
'driver': '/explicit/path/to/driver.dylib',
'uri': ':memory:',
'use_manifests': 'false'
});
-- Store connection handle in a variable for reuse
SET VARIABLE conn = (SELECT adbc_connect({
'driver': 'sqlite',
'uri': ':memory:'
}));Closes an ADBC connection and releases resources.
adbc_disconnect(connection_id) -> BOOLEANParameters:
connection_id: Connection handle fromadbc_connect
Returns: true on success.
Example:
SELECT adbc_disconnect(getvariable('conn')::BIGINT);Enables or disables autocommit mode on a connection. When autocommit is disabled, changes require an explicit adbc_commit() call.
adbc_set_autocommit(connection_id, enabled) -> BOOLEANParameters:
connection_id: Connection handle fromadbc_connectenabled:trueto enable autocommit,falseto disable
Returns: true on success.
Example:
-- Disable autocommit to start a transaction
SELECT adbc_set_autocommit(getvariable('conn')::BIGINT, false);
-- Make changes...
SELECT adbc_execute(getvariable('conn')::BIGINT, 'INSERT INTO users VALUES (1, ''Alice'')');
-- Commit or rollback
SELECT adbc_commit(getvariable('conn')::BIGINT);
-- Re-enable autocommit
SELECT adbc_set_autocommit(getvariable('conn')::BIGINT, true);Commits the current transaction. Only applicable when autocommit is disabled.
adbc_commit(connection_id) -> BOOLEANParameters:
connection_id: Connection handle fromadbc_connect
Returns: true on success.
Example:
SELECT adbc_commit(getvariable('conn')::BIGINT);Rolls back the current transaction, discarding all uncommitted changes. Only applicable when autocommit is disabled.
adbc_rollback(connection_id) -> BOOLEANParameters:
connection_id: Connection handle fromadbc_connect
Returns: true on success.
Example:
-- Start a transaction
SELECT adbc_set_autocommit(getvariable('conn')::BIGINT, false);
-- Make changes
SELECT adbc_execute(getvariable('conn')::BIGINT, 'DELETE FROM users WHERE id = 1');
-- Oops, rollback!
SELECT adbc_rollback(getvariable('conn')::BIGINT);
-- Re-enable autocommit
SELECT adbc_set_autocommit(getvariable('conn')::BIGINT, true);Executes a SELECT query and returns the results as a table.
adbc_scan(connection_id, query, [params := row(...)]) -> TABLEParameters:
connection_id: Connection handle fromadbc_connectquery: SQL SELECT query to executeparams(optional): Query parameters as a STRUCT created withrow(...)
Returns: A table with columns matching the query result.
Examples:
-- Simple query
SELECT * FROM adbc_scan(getvariable('conn')::BIGINT, 'SELECT * FROM users');
-- Query with parameters
SELECT * FROM adbc_scan(
getvariable('conn')::BIGINT,
'SELECT * FROM users WHERE id = ? AND status = ?',
params := row(42, 'active')
);
-- Aggregate results
SELECT COUNT(*), AVG(price)
FROM adbc_scan(getvariable('conn')::BIGINT, 'SELECT * FROM orders');Executes DDL or DML statements (CREATE, INSERT, UPDATE, DELETE).
adbc_execute(connection_id, statement) -> BIGINTParameters:
connection_id: Connection handle fromadbc_connectstatement: SQL DDL or DML statement to execute
Returns: Number of rows affected (or 0 if not reported by driver).
Examples:
-- Create a table
SELECT adbc_execute(getvariable('conn')::BIGINT,
'CREATE TABLE products (id INT PRIMARY KEY, name TEXT, price DECIMAL)');
-- Insert data
SELECT adbc_execute(getvariable('conn')::BIGINT,
'INSERT INTO products VALUES (1, ''Widget'', 9.99), (2, ''Gadget'', 19.99)');
-- Update data
SELECT adbc_execute(getvariable('conn')::BIGINT,
'UPDATE products SET price = 14.99 WHERE id = 1');
-- Delete data
SELECT adbc_execute(getvariable('conn')::BIGINT,
'DELETE FROM products WHERE id = 2');Bulk insert data from a DuckDB query into a table via ADBC. This is more efficient than executing individual INSERT statements.
adbc_insert(connection_id, table_name, <table>, [mode:=]) -> TABLE(rows_inserted BIGINT)Parameters:
connection_id: Connection handle fromadbc_connecttable_name: Target table name in the remote database<table>: A subquery providing the data to insertmode(optional): Insert mode, one of:'create': Create the table; error if it exists (default for new tables)'append': Append to existing table; error if table doesn't exist'replace': Drop and recreate the table if it exists'create_append': Create if doesn't exist, append if it does
Returns: A table with a single row containing the number of rows inserted.
Examples:
-- Create a new table and insert data
SELECT * FROM adbc_insert(getvariable('conn')::BIGINT, 'users',
(SELECT id, name, email FROM local_users),
mode := 'create');
-- Append data to an existing table
SELECT * FROM adbc_insert(getvariable('conn')::BIGINT, 'users',
(SELECT id, name, email FROM new_users),
mode := 'append');
-- Replace an existing table with new data
SELECT * FROM adbc_insert(getvariable('conn')::BIGINT, 'users',
(SELECT * FROM updated_users),
mode := 'replace');Output:
┌───────────────┐
│ rows_inserted │
├───────────────┤
│ 1000 │
└───────────────┘
Returns driver and database information for a connection.
adbc_info(connection_id) -> TABLE(info_name VARCHAR, info_value VARCHAR)Parameters:
connection_id: Connection handle fromadbc_connect
Returns: A table with info_name and info_value columns.
Common Info Names:
vendor_name: Database vendor (e.g., "SQLite", "PostgreSQL")vendor_version: Database versiondriver_name: ADBC driver namedriver_version: Driver versiondriver_arrow_version: Arrow version used by driver
Example:
SELECT * FROM adbc_info(getvariable('conn')::BIGINT);Output:
┌──────────────────────┬────────────────────┐
│ info_name │ info_value │
├──────────────────────┼────────────────────┤
│ vendor_name │ SQLite │
│ vendor_version │ 3.50.4 │
│ driver_name │ ADBC SQLite Driver │
│ driver_version │ (unknown) │
│ driver_arrow_version │ 0.7.0 │
└──────────────────────┴────────────────────┘
Lists tables in the connected database with optional filtering.
adbc_tables(connection_id, [catalog:=], [schema:=], [table_name:=]) -> TABLEParameters:
connection_id: Connection handle fromadbc_connectcatalog(optional): Filter by catalog name patternschema(optional): Filter by schema name patterntable_name(optional): Filter by table name pattern
Returns: A table with columns:
catalog_name: Catalog containing the tableschema_name: Schema containing the tabletable_name: Name of the tabletable_type: Type (e.g., "table", "view")
Examples:
-- List all tables
SELECT * FROM adbc_tables(getvariable('conn')::BIGINT);
-- Filter by table name pattern
SELECT * FROM adbc_tables(getvariable('conn')::BIGINT, table_name := 'user%');
-- Filter by schema
SELECT * FROM adbc_tables(getvariable('conn')::BIGINT, schema := 'public');Returns the types of tables supported by the database (e.g., "table", "view").
adbc_table_types(connection_id) -> TABLE(table_type VARCHAR)Parameters:
connection_id: Connection handle fromadbc_connect
Returns: A table with a single table_type column listing supported types.
Example:
SELECT * FROM adbc_table_types(getvariable('conn')::BIGINT);Output:
┌────────────┐
│ table_type │
├────────────┤
│ table │
│ view │
└────────────┘
Returns column metadata for tables in the connected database.
adbc_columns(connection_id, [catalog:=], [schema:=], [table_name:=], [column_name:=]) -> TABLEParameters:
connection_id: Connection handle fromadbc_connectcatalog(optional): Filter by catalog name patternschema(optional): Filter by schema name patterntable_name(optional): Filter by table name patterncolumn_name(optional): Filter by column name pattern
Returns: A table with columns:
catalog_name: Catalog containing the tableschema_name: Schema containing the tabletable_name: Name of the tablecolumn_name: Name of the columnordinal_position: Column position (1-based)remarks: Database-specific descriptiontype_name: Data type name (e.g., "INTEGER", "TEXT")is_nullable: Whether the column allows NULL values
Examples:
-- Get all columns for a specific table
SELECT * FROM adbc_columns(getvariable('conn')::BIGINT, table_name := 'users');
-- Get specific column
SELECT * FROM adbc_columns(getvariable('conn')::BIGINT,
table_name := 'users',
column_name := 'email');
-- List all columns in the database
SELECT table_name, column_name, type_name
FROM adbc_columns(getvariable('conn')::BIGINT)
ORDER BY table_name, ordinal_position;Output:
┌──────────────┬─────────────┬────────────┬─────────────┬──────────────────┬─────────┬───────────┬─────────────┐
│ catalog_name │ schema_name │ table_name │ column_name │ ordinal_position │ remarks │ type_name │ is_nullable │
├──────────────┼─────────────┼────────────┼─────────────┼──────────────────┼─────────┼───────────┼─────────────┤
│ main │ NULL │ users │ id │ 1 │ NULL │ INTEGER │ true │
│ main │ NULL │ users │ name │ 2 │ NULL │ TEXT │ true │
│ main │ NULL │ users │ email │ 3 │ NULL │ TEXT │ true │
└──────────────┴─────────────┴────────────┴─────────────┴──────────────────┴─────────┴───────────┴─────────────┘
Returns the Arrow schema for a specific table, showing field names, Arrow data types, and nullability.
adbc_schema(connection_id, table_name, [catalog:=], [schema:=]) -> TABLEParameters:
connection_id: Connection handle fromadbc_connecttable_name: Name of the table to get the schema forcatalog(optional): Catalog containing the tableschema(optional): Database schema containing the table
Returns: A table with columns:
field_name: Name of the field/columnfield_type: Arrow data type (e.g., "int64", "utf8", "float64", "timestamp[us]")nullable: Whether the field allows NULL values
Example:
SELECT * FROM adbc_schema(getvariable('conn')::BIGINT, 'users');Output:
┌────────────┬────────────┬──────────┐
│ field_name │ field_type │ nullable │
├────────────┼────────────┼──────────┤
│ id │ int64 │ true │
│ name │ utf8 │ true │
│ email │ utf8 │ true │
│ created_at │ timestamp │ true │
└────────────┴────────────┴──────────┘
Note: The field_type shows Arrow types, which may differ from the SQL types defined in the table. The mapping depends on the ADBC driver implementation.
ADBC drivers are available for many databases. When using driver manifests (see below), you can reference drivers by their short name:
| Driver Name | Database | Developer |
|---|---|---|
bigquery |
Google BigQuery | ADBC Driver Foundry |
duckdb |
DuckDB | DuckDB Foundation |
flightsql |
Apache Arrow Flight SQL | Apache Software Foundation |
mssql |
Microsoft SQL Server | Columnar |
mysql |
MySQL | ADBC Driver Foundry |
postgresql |
PostgreSQL | Apache Software Foundation |
redshift |
Amazon Redshift | Columnar |
snowflake |
Snowflake | Apache Software Foundation |
sqlite |
SQLite | Apache Software Foundation |
There are a few options for installing drivers:
- Columnar's
dbcis a command-line tool that makes installing and managing ADBC drivers easy. It automatically creates driver manifests for you. - ADBC drivers can be installed from the Apache Arrow ADBC releases or built from source.
- On macOS with Homebrew:
brew install apache-arrow-adbc
Driver manifests allow you to reference ADBC drivers by name (e.g., 'sqlite') instead of specifying the full path to the shared library. A manifest is a TOML file that contains metadata about the driver and the path to its shared library.
Example manifest file (sqlite.toml):
[driver]
name = "sqlite"
description = "ADBC SQLite Driver"
library = "/usr/local/lib/libadbc_driver_sqlite.dylib"Manifest Search Locations:
The extension searches for driver manifests in these locations (in order):
macOS:
ADBC_DRIVER_PATHenvironment variable (colon-separated paths)$VIRTUAL_ENV/etc/adbc/drivers(if in a Python virtual environment)$CONDA_PREFIX/etc/adbc/drivers(if in a Conda environment)~/Library/Application Support/ADBC/Drivers/etc/adbc/drivers
Linux:
ADBC_DRIVER_PATHenvironment variable (colon-separated paths)$VIRTUAL_ENV/etc/adbc/drivers(if in a Python virtual environment)$CONDA_PREFIX/etc/adbc/drivers(if in a Conda environment)~/.config/adbc/drivers/etc/adbc/drivers
Windows:
ADBC_DRIVER_PATHenvironment variable (semicolon-separated paths)- Registry:
HKEY_CURRENT_USER\SOFTWARE\ADBC\Drivers\{name} %LOCAL_APPDATA%\ADBC\Drivers- Registry:
HKEY_LOCAL_MACHINE\SOFTWARE\ADBC\Drivers\{name}
You can also specify additional search paths using the search_paths option in adbc_connect().
-- Load the extension
LOAD adbc_scanner;
-- Connect to SQLite using driver name (requires installed manifest)
SET VARIABLE sqlite_conn = (SELECT adbc_connect({
'driver': 'sqlite',
'uri': '/tmp/example.db'
}));
-- Or connect with explicit driver path
-- SET VARIABLE sqlite_conn = (SELECT adbc_connect({
-- 'driver': '/opt/homebrew/lib/libadbc_driver_sqlite.dylib',
-- 'uri': '/tmp/example.db'
-- }));
-- Check connection info
SELECT * FROM adbc_info(getvariable('sqlite_conn')::BIGINT);
-- Create a table
SELECT adbc_execute(getvariable('sqlite_conn')::BIGINT,
'CREATE TABLE IF NOT EXISTS employees (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
department TEXT,
salary REAL
)');
-- Insert data
SELECT adbc_execute(getvariable('sqlite_conn')::BIGINT,
'INSERT INTO employees VALUES
(1, ''Alice'', ''Engineering'', 95000),
(2, ''Bob'', ''Sales'', 75000),
(3, ''Charlie'', ''Engineering'', 105000)');
-- Query with DuckDB operations
SELECT department, AVG(salary) as avg_salary, COUNT(*) as count
FROM adbc_scan(getvariable('sqlite_conn')::BIGINT, 'SELECT * FROM employees')
GROUP BY department
ORDER BY avg_salary DESC;
-- List tables
SELECT * FROM adbc_tables(getvariable('sqlite_conn')::BIGINT);
-- List supported table types
SELECT * FROM adbc_table_types(getvariable('sqlite_conn')::BIGINT);
-- Parameterized query
SELECT * FROM adbc_scan(
getvariable('sqlite_conn')::BIGINT,
'SELECT * FROM employees WHERE department = ? AND salary > ?',
params := row('Engineering', 90000.0)
);
-- Transaction control
SELECT adbc_set_autocommit(getvariable('sqlite_conn')::BIGINT, false); -- Start transaction
SELECT adbc_execute(getvariable('sqlite_conn')::BIGINT,
'INSERT INTO employees VALUES (4, ''Diana'', ''Marketing'', 85000)');
SELECT adbc_commit(getvariable('sqlite_conn')::BIGINT); -- Commit changes
-- Or use: SELECT adbc_rollback(getvariable('sqlite_conn')::BIGINT); -- To discard changes
SELECT adbc_set_autocommit(getvariable('sqlite_conn')::BIGINT, true); -- Back to autocommit
-- Clean up
SELECT adbc_disconnect(getvariable('sqlite_conn')::BIGINT);ADBC functions throw exceptions on errors with descriptive messages:
-- Invalid connection handle
SELECT * FROM adbc_scan(12345, 'SELECT 1');
-- Error: Invalid Input Error: adbc_scan: Invalid connection handle: 12345
-- SQL syntax error
SELECT adbc_execute(getvariable('conn')::BIGINT, 'INVALID SQL');
-- Error: adbc_execute: Failed to prepare statement: ... [Query: INVALID SQL]
-- Missing driver option
SELECT adbc_connect({'uri': ':memory:'});
-- Error: Invalid Input Error: adbc_connect: 'driver' option is required- ADBC connections are not automatically closed; always call
adbc_disconnect()when done - The
rows_affectedcount fromadbc_executedepends on driver support; some drivers return 0 for all operations - Parameterized queries in
adbc_scanrequire theparamsnamed parameter syntax
See the development documentation for build instructions.
# Clone with submodules
git clone --recurse-submodules https://github.com/your-repo/adbc_scanner.git
# Set up vcpkg
export VCPKG_TOOLCHAIN_PATH=/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake
# Build
GEN=ninja make
# Test
make test