From f36580eb0c5e614bea86801826096bdd1910f76a Mon Sep 17 00:00:00 2001
From: blaginin
Date: Mon, 29 Sep 2025 19:24:11 +0100
Subject: [PATCH 1/4] Cleanup usages + Add Vortex

---
 docs/source/user-guide/introduction.md | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/docs/source/user-guide/introduction.md b/docs/source/user-guide/introduction.md
index ef82de9a24b3..577d60e57ebd 100644
--- a/docs/source/user-guide/introduction.md
+++ b/docs/source/user-guide/introduction.md
@@ -78,7 +78,7 @@ Here are some example systems built using DataFusion:
 - Specialized Analytical Database systems such as [HoraeDB] and more general Apache Spark like system such as [Ballista]
 - New query language engines such as [prql-query] and accelerators such as [VegaFusion]
 - Research platform for new Database Systems, such as [Flock]
-- SQL support to another library, such as [dask sql]
+- SQL support to another library, such as [Vortex]
 - Streaming data platforms such as [Synnada]
 - Tools for reading / sorting / transcoding Parquet, CSV, AVRO, and JSON files such as [qv]
 - Native Spark runtime replacement such as [Auron]
@@ -101,11 +101,9 @@ Here are some active projects using DataFusion:
 - [CnosDB] Open Source Distributed Time Series Database
 - [Comet](https://github.com/apache/datafusion-comet) Apache Spark native query execution plugin
 - [Cube Store] Cube’s universal semantic layer platform is the next evolution of OLAP technology for AI, BI, spreadsheets, and embedded analytics
-- [Dask SQL] Distributed SQL query engine in Python
 - [datafusion-dft](https://github.com/datafusion-contrib/datafusion-dft) Batteries included CLI, TUI, and server implementations for DataFusion.
 - [dbt Fusion engine](https://github.com/dbt-labs/dbt-fusion) The dbt Fusion engine, written in Rust, designed for speed and correctness with a native SQL understanding across DWH SQL dialects.
 - [delta-rs] Native Rust implementation of Delta Lake
-- [Exon](https://github.com/wheretrue/exon) Analysis toolkit for life-science applications
 - [Feldera](https://github.com/feldera/feldera) Fast query engine for incremental computation
 - [Funnel](https://funnel.io/) Data Platform powering Marketing Intelligence applications.
 - [GlareDB](https://github.com/GlareDB/glaredb) Fast SQL database for querying and analyzing distributed data.
@@ -125,12 +123,12 @@ Here are some active projects using DataFusion:
 - [Restate](https://github.com/restatedev) Easily build resilient applications using distributed durable async/await
 - [ROAPI] Create full-fledged APIs for slowly moving datasets without writing a single line of code
 - [Sail](https://github.com/lakehq/sail) Unifying stream, batch and AI workloads with Apache Spark compatibility
-- [Seafowl] CDN-friendly analytical database
 - [SedonaDB](https://github.com/apache/sedona-db) A single-node analytical database engine with geospatial as a first-class citizen
 - [Sleeper](https://github.com/gchq/sleeper) Serverless, cloud-native, log-structured merge tree based, scalable key-value store
 - [Spice.ai] Building blocks for data-driven AI applications
 - [Synnada] Streaming-first framework for data products
 - [VegaFusion] Server-side acceleration for the [Vega](https://vega.github.io/) visualization grammar
+- [Vortex] An extensible, state of the art columnar file format
 - [Telemetry](https://telemetry.sh/) Structured logging made easy
 - [Xorq](https://github.com/xorq-labs/xorq/) Xorq is a multi-engine batch transformation framework built on Ibis, DataFusion and Arrow
 
@@ -146,7 +144,6 @@ Here are some less active projects that used DataFusion:
 [cloudfuse buzz]: https://github.com/cloudfuse-io/buzz-rust
 [cnosdb]: https://github.com/cnosdb/cnosdb
 [cube store]: https://github.com/cube-js/cube.js/tree/master/rust
-[dask sql]: https://github.com/dask-contrib/dask-sql
 [datafusion-tui]: https://github.com/datafusion-contrib/datafusion-tui
 [delta-rs]: https://github.com/delta-io/delta-rs
 [flock]: https://github.com/flock-lab/flock
@@ -159,7 +156,6 @@ Here are some less active projects that used DataFusion:
 [prql-query]: https://github.com/prql/prql-query
 [qv]: https://github.com/timvw/qv
 [roapi]: https://github.com/roapi/roapi
-[seafowl]: https://github.com/splitgraph/seafowl
 [spice.ai]: https://github.com/spiceai/spiceai
 [synnada]: https://synnada.ai/
 [tensorbase]: https://github.com/tensorbase/tensorbase

From 6406813348e3115979be3dd669140d86df0a00bf Mon Sep 17 00:00:00 2001
From: blaginin
Date: Mon, 29 Sep 2025 23:04:00 +0100
Subject: [PATCH 2/4] Move to `less active projects`

---
 docs/source/user-guide/introduction.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/docs/source/user-guide/introduction.md b/docs/source/user-guide/introduction.md
index 577d60e57ebd..a9c63824264f 100644
--- a/docs/source/user-guide/introduction.md
+++ b/docs/source/user-guide/introduction.md
@@ -104,6 +104,7 @@ Here are some active projects using DataFusion:
 - [datafusion-dft](https://github.com/datafusion-contrib/datafusion-dft) Batteries included CLI, TUI, and server implementations for DataFusion.
 - [dbt Fusion engine](https://github.com/dbt-labs/dbt-fusion) The dbt Fusion engine, written in Rust, designed for speed and correctness with a native SQL understanding across DWH SQL dialects.
 - [delta-rs] Native Rust implementation of Delta Lake
+- [EDB Postgres Lakehouse] built with [Seafowl]
 - [Feldera](https://github.com/feldera/feldera) Fast query engine for incremental computation
 - [Funnel](https://funnel.io/) Data Platform powering Marketing Intelligence applications.
 - [GlareDB](https://github.com/GlareDB/glaredb) Fast SQL database for querying and analyzing distributed data.
@@ -138,14 +139,19 @@ Here are some less active projects that used DataFusion:
 - [Cloudfuse Buzz]
 - [Flock]
 - [Tensorbase]
+- [Dask SQL] Distributed SQL query engine in Python
+- [Exon] Analysis toolkit for life-science applications
 
 [ballista]: https://github.com/apache/datafusion-ballista
 [auron]: https://github.com/apache/auron
 [cloudfuse buzz]: https://github.com/cloudfuse-io/buzz-rust
 [cnosdb]: https://github.com/cnosdb/cnosdb
 [cube store]: https://github.com/cube-js/cube.js/tree/master/rust
+[dask sql]: https://github.com/dask-contrib/dask-sql
 [datafusion-tui]: https://github.com/datafusion-contrib/datafusion-tui
 [delta-rs]: https://github.com/delta-io/delta-rs
+[EDB Postgres Lakehouse]: https://www.enterprisedb.com/products/analytics
+[exon]: https://github.com/wheretrue/exon
 [flock]: https://github.com/flock-lab/flock
 [kamu]: https://github.com/kamu-data/kamu-cli
 [greptimedb]: https://github.com/GreptimeTeam/greptimedb
@@ -156,6 +162,7 @@ Here are some less active projects that used DataFusion:
 [prql-query]: https://github.com/prql/prql-query
 [qv]: https://github.com/timvw/qv
 [roapi]: https://github.com/roapi/roapi
+[seafowl]: https://github.com/splitgraph/seafowl
 [spice.ai]: https://github.com/spiceai/spiceai
 [synnada]: https://synnada.ai/
 [tensorbase]: https://github.com/tensorbase/tensorbase

From c4d71b6a2d1409f7201c91ef7bb66db3bc7079b9 Mon Sep 17 00:00:00 2001
From: blaginin
Date: Mon, 29 Sep 2025 23:07:32 +0100
Subject: [PATCH 3/4] Sort

---
 docs/source/user-guide/introduction.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/source/user-guide/introduction.md b/docs/source/user-guide/introduction.md
index a9c63824264f..e228415d8ab8 100644
--- a/docs/source/user-guide/introduction.md
+++ b/docs/source/user-guide/introduction.md
@@ -137,10 +137,10 @@ Here are some less active projects that used DataFusion:
 
 - [bdt](https://github.com/datafusion-contrib/bdt) Boring Data Tool
 - [Cloudfuse Buzz]
-- [Flock]
-- [Tensorbase]
 - [Dask SQL] Distributed SQL query engine in Python
 - [Exon] Analysis toolkit for life-science applications
+- [Flock]
+- [Tensorbase]
 
 [ballista]: https://github.com/apache/datafusion-ballista
 [auron]: https://github.com/apache/auron
@@ -166,7 +166,8 @@ Here are some less active projects that used DataFusion:
 [spice.ai]: https://github.com/spiceai/spiceai
 [synnada]: https://synnada.ai/
 [tensorbase]: https://github.com/tensorbase/tensorbase
-[vegafusion]: https://vegafusion.io/ "if you know of another project, please submit a PR to add a link!"
+[vegafusion]: https://vegafusion.io/
+[vortex]: https://vortex.dev/ "if you know of another project, please submit a PR to add a link!"
 
 ## Integrations and Extensions
 

From fcc45c966167058a22dd8f78509a024e065cef9e Mon Sep 17 00:00:00 2001
From: blaginin
Date: Mon, 29 Sep 2025 23:11:26 +0100
Subject: [PATCH 4/4] fmt

---
 docs/source/user-guide/introduction.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/user-guide/introduction.md b/docs/source/user-guide/introduction.md
index e228415d8ab8..51f025d2790c 100644
--- a/docs/source/user-guide/introduction.md
+++ b/docs/source/user-guide/introduction.md
@@ -150,7 +150,7 @@ Here are some less active projects that used DataFusion:
 [dask sql]: https://github.com/dask-contrib/dask-sql
 [datafusion-tui]: https://github.com/datafusion-contrib/datafusion-tui
 [delta-rs]: https://github.com/delta-io/delta-rs
-[EDB Postgres Lakehouse]: https://www.enterprisedb.com/products/analytics
+[edb postgres lakehouse]: https://www.enterprisedb.com/products/analytics
 [exon]: https://github.com/wheretrue/exon
 [flock]: https://github.com/flock-lab/flock
 [kamu]: https://github.com/kamu-data/kamu-cli