Data exists everywhere: on your laptop, in Postgres and Snowflake, and as files in S3. It comes in various formats such as Parquet, CSV and JSON. Wherever it lives, getting the insights you need usually takes multiple steps spanning several destinations.
GlareDB is designed to query your data wherever it lives using SQL that you already know.
Install/update glaredb in the current directory:
curl https://glaredb.com/install.sh | sh
It may be helpful to install the binary in a location on your PATH, for example ~/.local/bin.
If you prefer manual installation, download, extract and run the GlareDB binary from a release on our releases page.
After installing, get up and running with any of the following:
To start a local session, run the binary:
./glaredb
Or, you can execute SQL and immediately return (try it out!):
# Query a CSV on Hugging Face
./glaredb --query "SELECT * FROM \
'https://huggingface.co/datasets/fka/awesome-chatgpt-prompts/raw/main/prompts.csv';"
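The one-shot query above can also be launched from a script. The following is a minimal sketch, assuming the binary was installed in the current directory as `./glaredb`; `build_query_command` is a hypothetical helper, not part of GlareDB, and the `LIMIT 5` is added only to keep the output short. Passing the arguments as a list avoids shell quoting issues around the SQL string.

```python
import os
import subprocess

# Assumed location: the binary installed in the current directory.
GLAREDB_BIN = "./glaredb"


def build_query_command(sql, binary=GLAREDB_BIN):
    """Return the argument list for a one-shot query (hypothetical helper)."""
    return [binary, "--query", sql]


csv_url = (
    "https://huggingface.co/datasets/fka/awesome-chatgpt-prompts"
    "/raw/main/prompts.csv"
)
command = build_query_command(f"SELECT * FROM '{csv_url}' LIMIT 5;")

if os.path.isfile(GLAREDB_BIN):
    # Run the query and raise if the process exits with a non-zero status.
    subprocess.run(command, check=True)
else:
    print("glaredb binary not found; would run:", command)
```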
To see all options, use --help:
./glaredb --help
- Sign up at https://console.glaredb.com for a free fully-managed deployment of GlareDB
- Copy the connection string from GlareDB Cloud and pass it to the CLI, for example:
./glaredb --cloud-url="glaredb://user:pass@host:port/deployment"
# or
./glaredb
> \open "glaredb://user:pass@host:port/deployment"
Read our announcement on Hybrid Execution for more information.
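Before using a connection string, it can help to check that it has the expected shape. The sketch below assumes the string follows generic URL syntax, as the `glaredb://user:pass@host:port/deployment` pattern suggests. `describe_connection_string` is a hypothetical helper, and the host and port values are made-up placeholders.

```python
from urllib.parse import urlsplit


def describe_connection_string(url):
    """Split a glaredb:// connection string into its parts (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme != "glaredb":
        raise ValueError(f"expected a glaredb:// URL, got scheme {parts.scheme!r}")
    return {
        "user": parts.username,
        # Report only whether a password is present, never the value itself.
        "password_set": parts.password is not None,
        "host": parts.hostname,
        "port": parts.port,
        "deployment": parts.path.lstrip("/"),
    }


# Placeholder values for illustration only; use the string from your deployment.
info = describe_connection_string(
    "glaredb://user:pass@example.invalid:6443/deployment"
)
print(info)
```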
- Install the official GlareDB Python library:
pip install glaredb
- Import and use glaredb:
import glaredb
con = glaredb.connect()
con.sql("select 'hello world';").show()
To use Hybrid Execution, sign up at https://console.glaredb.com and use the connection string for your deployment. For example:
import glaredb
con = glaredb.connect("glaredb://user:pass@host:port/deployment")
con.sql("select 'hello hybrid exec';").show()
GlareDB works with Pandas and Polars DataFrames out of the box:
import glaredb
import polars as pl
df = pl.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "fruits": ["banana", "banana", "apple", "apple", "banana"],
        "B": [5, 4, 3, 2, 1],
        "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
    }
)
con = glaredb.connect()
df = con.sql("select * from df where fruits = 'banana'").to_polars()
print(df)
The server subcommand can be used to launch a server process for GlareDB:
./glaredb server
To see all options for running in server mode, use --help:
./glaredb server --help
When launched as a server process, GlareDB can be reached on port 6543 using a Postgres client. The following example uses psql to connect to a locally running server:
psql "host=localhost user=glaredb dbname=glaredb port=6543"
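The argument passed to psql above is a keyword/value connection string. A minimal sketch of building one from Python follows, which can be handy when the host or port comes from configuration. `build_conninfo` and `quote_conninfo_value` are hypothetical helpers; the quoting follows the libpq convention of wrapping values that are empty or contain spaces in single quotes and escaping quotes and backslashes with a backslash.

```python
def quote_conninfo_value(value):
    """Quote a value for a keyword/value connection string when needed."""
    text = str(value)
    needs_quotes = (
        text == ""
        or any(ch.isspace() for ch in text)
        or "'" in text
        or "\\" in text
    )
    if not needs_quotes:
        return text
    # Escape backslashes first so the escapes added for quotes are not doubled.
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_conninfo(**params):
    """Join keyword=value pairs in the order given (hypothetical helper)."""
    return " ".join(
        f"{key}={quote_conninfo_value(value)}" for key, value in params.items()
    )


# Same settings as the psql example for a locally running server.
conninfo = build_conninfo(host="localhost", user="glaredb", dbname="glaredb", port=6543)
print(conninfo)
```

The resulting string can be passed to psql as its single argument.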
You can use a demo Postgres instance at pg.demo.glaredb.com. Adding this Postgres instance as a data source is as easy as running the following command:
CREATE EXTERNAL DATABASE my_pg
FROM postgres
OPTIONS (
host = 'pg.demo.glaredb.com',
port = '5432',
user = 'demo',
password = 'demo',
database = 'postgres',
);
Once the data source has been added, it can be queried using fully qualified table names:
SELECT *
FROM my_pg.public.lineitem
WHERE l_shipdate <= date '1998-12-01' - INTERVAL '90'
LIMIT 5;
Check out the docs to learn about all supported data sources. Many data sources can be connected to the same GlareDB instance.
Done with this data source? Remove it with the following command:
DROP DATABASE my_pg;
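When several data sources need to be added and removed, the statements above can be generated rather than typed by hand. The sketch below only renders SQL text and does not talk to GlareDB. `create_external_database`, `drop_database`, `sql_literal` and `check_identifier` are hypothetical helpers, and the restriction to simple unquoted identifiers is an assumption made to keep the example safe.

```python
import re


def sql_literal(value):
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def check_identifier(name):
    """Accept only simple unquoted identifiers (an assumption of this sketch)."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def create_external_database(name, source, options):
    """Render a CREATE EXTERNAL DATABASE statement from a dict of options."""
    option_lines = ",\n".join(
        f"    {check_identifier(key)} = {sql_literal(value)}"
        for key, value in options.items()
    )
    return (
        f"CREATE EXTERNAL DATABASE {check_identifier(name)}\n"
        f"FROM {check_identifier(source)}\n"
        f"OPTIONS (\n{option_lines}\n);"
    )


def drop_database(name):
    """Render the matching DROP DATABASE statement."""
    return f"DROP DATABASE {check_identifier(name)};"


# Options for the demo Postgres instance used in the example above.
demo_options = {
    "host": "pg.demo.glaredb.com",
    "port": "5432",
    "user": "demo",
    "password": "demo",
    "database": "postgres",
}
create_sql = create_external_database("my_pg", "postgres", demo_options)
drop_sql = drop_database("my_pg")
print(create_sql)
print(drop_sql)
```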
Source | Read | INSERT INTO | COPY TO | Table Function | External Table | External Database |
---|---|---|---|---|---|---|
Databases | -- | -- | -- | -- | -- | -- |
MySQL | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
PostgreSQL | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
MariaDB (via mysql) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
MongoDB | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Microsoft SQL Server | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
Snowflake | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
BigQuery | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
Cassandra/ScyllaDB | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
ClickHouse | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
Oracle | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
ADBC | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
ODBC | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
Database Files | -- | -- | -- | -- | -- | -- |
SQLite | ✅ | ✅ | 🚧 | ✅ | ✅ | ✅ |
Microsoft Excel | ✅ | 🚧 | 🚧 | ✅ | ✅ | ➖ |
DuckDB | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
File Formats | -- | -- | -- | -- | -- | -- |
Apache Arrow | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
Apache Parquet | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
CSV | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
JSON | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
BSON | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
Apache Avro | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ➖ |
Apache ORC | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ➖ |
Table Formats | -- | -- | -- | -- | -- | -- |
Lance | ✅ | ✅ | ✅ | ✅ | ✅ | ➖ |
Delta | ✅ | ✅ | ✅ | ✅ | ✅ | ➖ |
Iceberg | ✅ | 🚧 | 🚧 | ✅ | ✅ | ➖ |
✅ = Supported ➖ = Not Applicable 🚧 = Not Yet Supported
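The matrix above can also be queried programmatically, for example to list the sources that accept writes. The following is a small sketch over an excerpt of the table, with rows copied from it; `parse_matrix` and `sources_supporting` are hypothetical helpers.

```python
# Excerpt of the support matrix, copied from the table above.
MATRIX_EXCERPT = """\
Source | Read | INSERT INTO | COPY TO | Table Function | External Table | External Database
PostgreSQL | ✅ | ✅ | ✅ | ✅ | ✅ | ✅
Snowflake | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅
SQLite | ✅ | ✅ | 🚧 | ✅ | ✅ | ✅
CSV | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖
Delta | ✅ | ✅ | ✅ | ✅ | ✅ | ➖
"""

SUPPORTED = "✅"


def parse_matrix(text):
    """Map each source name to a {capability: status} dict."""
    rows = [
        [cell.strip() for cell in line.split("|")]
        for line in text.strip().splitlines()
    ]
    header, *body = rows
    return {row[0]: dict(zip(header[1:], row[1:])) for row in body}


def sources_supporting(matrix, capability):
    """Return the sources marked as supported for the given capability."""
    return [
        source
        for source, capabilities in matrix.items()
        if capabilities.get(capability) == SUPPORTED
    ]


matrix = parse_matrix(MATRIX_EXCERPT)
print(sources_supporting(matrix, "INSERT INTO"))
print(sources_supporting(matrix, "COPY TO"))
```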
Building GlareDB requires Rust/Cargo to be installed. Check out rustup for an easy way to install Rust on your system.
Running the following command, which uses the just command runner, builds a release binary:
just build --release
The compiled release binary can be found in target/release/glaredb.
Browse the GlareDB documentation at docs.glaredb.com.
Contributions welcome! Check out CONTRIBUTING.md for how to get started.
See LICENSE. Unless otherwise noted, this license applies to all files in this repository.
GlareDB is proudly powered by Apache DataFusion and Apache Arrow. We are grateful for the work of the Apache Software Foundation and the community around these projects.