Glue Query Runner

Run SQL queries against Delta Lake tables in AWS Glue Data Catalog with DataFusion.

Why?

Query your Glue-registered Delta tables without Spark. Fast, lightweight, Rust-native.

Quick Start

# Run a query
cargo run --release --bin sql-runner -- \
  --file query.sql \
  --database my_database

# With debug tracing
RUST_LOG=glue_query_runner=debug cargo run --release --bin sql-runner -- \
  --file query.sql \
  --database my_database

Features

Builder API - Ergonomic, fluent configuration
Automatic caching - Table metadata cached with TTL (5min default)
Structured tracing - Optional debug logging with RUST_LOG
Batch processing - Process entire folders of SQL files
Query plans - --explain flag for optimization insights

Library Usage

Simple (one-liner setup)

use glue_query_runner::GlueContextExt;
use datafusion::prelude::*;

let aws_config = aws_config::load_defaults(aws_config::BehaviorVersion::latest()).await;
let ctx = SessionContext::new()
    .with_glue_database(&aws_config, "my_database")
    .await?;

let df = ctx.sql("SELECT * FROM my_table LIMIT 10").await?;
df.show().await?;

Advanced (builder pattern)

use glue_query_runner::{GlueSchemaProvider, catalog::GlueCatalog};
use std::sync::Arc;

let glue_catalog = Arc::new(GlueCatalog::new(&aws_config));

let schema = GlueSchemaProvider::builder("my_database")
    .catalog(glue_catalog)
    .cache_ttl(Duration::from_secs(600))  // 10 min cache
    .build()
    .await?;

CLI Options

# Single file
--file query.sql --database my_db

# Batch folder
--input-folder ./queries --output-folder ./results --database my_db

# With execution plans
--file query.sql --database my_db --explain

SQL Format

Queries separated by semicolons, comments with --:

-- Get top customers
SELECT customer_id, SUM(amount) as total
FROM orders
GROUP BY customer_id
ORDER BY total DESC
LIMIT 10;

-- Daily revenue
SELECT DATE(order_date), SUM(amount)
FROM orders
GROUP BY DATE(order_date);

Features (Cargo)

[features]
default = ["tracing", "cache"]
tracing = ["dep:tracing", "dep:tracing-subscriber"]
cache = ["dep:moka"]

Disable with --no-default-features.

Tracing Example

RUST_LOG=glue_query_runner=debug cargo run --release -- --file query.sql

Output:

DEBUG Fetching table list from Glue catalog
INFO  Retrieved table names from Glue count=100
DEBUG Table cache miss
INFO  Resolved table location location=s3://bucket/table
INFO  Delta table loaded files=21
DEBUG Table cached
DEBUG Table cache hit  # <-- Second query instant!

How It Works

GlueCatalog - Fetches table locations from AWS Glue
GlueSchemaProvider - Lazy-loads Delta tables, caches results
DataFusion - Executes SQL with full query optimization
Moka cache - In-memory TTL cache for table metadata

AWS Setup

Uses default AWS credential chain:

export AWS_REGION=us-east-1
export AWS_PROFILE=your_profile
# or AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY

Performance

First query: ~3s (Glue lookup + Delta load)
Cached query: ~instant (cache hit)
Cache TTL: 5 minutes (configurable)

Dependencies

deltalake 0.28 - Delta Lake with S3
datafusion 49.0 - Query engine
aws-sdk-glue 1.73 - Glue API
moka 0.12 - Async cache (optional)
tracing 0.1 - Structured logging (optional)

License

Apache-2.0

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
src		src
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Glue Query Runner

Why?

Quick Start

Features

Library Usage

Simple (one-liner setup)

Advanced (builder pattern)

CLI Options

SQL Format

Features (Cargo)

Tracing Example

How It Works

AWS Setup

Performance

Dependencies

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Glue Query Runner

Why?

Quick Start

Features

Library Usage

Simple (one-liner setup)

Advanced (builder pattern)

CLI Options

SQL Format

Features (Cargo)

Tracing Example

How It Works

AWS Setup

Performance

Dependencies

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages