jakeswenson
Stars

Data

25 repositories

Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks.

Rust 1,574 127 Updated Jun 18, 2024

Rust-based WebAssembly bindings to read and write Apache Parquet data

Rust 635 29 Updated Jan 13, 2026

A native storage format for Apache Arrow

Rust 82 11 Updated Oct 18, 2023

Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, a…

Rust 5,949 523 Updated Jan 17, 2026

A RocksDB-compliant, high-performance, scalable embedded key-value store

C++ 1,003 80 Updated Jun 12, 2024

Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.

Go 4,147 226 Updated Aug 30, 2025

Apache DataFusion Comet Spark Accelerator

Scala 1,104 270 Updated Jan 16, 2026

🌎 Polars H3 Geospatial Plugin

Python 113 1 Updated Aug 17, 2025

Extremely fast Query Engine for DataFrames, written in Rust

Rust 37,019 2,561 Updated Jan 16, 2026

Embeddable stream processing engine based on Apache DataFusion

Rust 373 11 Updated Dec 18, 2024

🦀 event stream processing for developers to collect and transform data in motion to power responsive data intensive applications.

Rust 5,144 522 Updated Jan 8, 2026

Distributed stream processing engine in Rust

Rust 4,758 333 Updated Jan 14, 2026

The Data Change Processing platform

C# 1,199 57 Updated Jan 9, 2026

The Feldera Incremental Computation Engine

Rust 1,759 93 Updated Jan 17, 2026

Fast web applications through dynamic, partially-stateful dataflow

Rust 5,216 250 Updated Oct 30, 2021

Web UI for Noria clusters

JavaScript 73 15 Updated Jul 22, 2019

The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL.

Rust 6,209 487 Updated Jan 17, 2026

Apache DataFusion SQL Query Engine

Rust 8,276 1,882 Updated Jan 17, 2026

Apache DataFusion Ballista Distributed Query Engine

Rust 1,950 259 Updated Jan 11, 2026

The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing.

Rust 1,684 204 Updated Jan 17, 2026

A data visualization and analytics component, especially well-suited for large and/or streaming datasets.

C++ 10,161 1,276 Updated Dec 18, 2025

Remote shuffle service for Apache Spark to store shuffle data on remote servers.

Java 336 99 Updated Sep 29, 2023

An extensible, state-of-the-art columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LFAI&Data, part of the Linux Foundation.

Rust 2,498 115 Updated Jan 16, 2026

The lightweight, fault-tolerant database built on SQLite. Designed to keep your data highly available with minimal effort.

Go 17,241 760 Updated Jan 17, 2026