Merged
14 changes: 9 additions & 5 deletions docs/source/user-guide/introduction.md
@@ -78,7 +78,7 @@ Here are some example systems built using DataFusion:
- Specialized Analytical Database systems such as [HoraeDB] and more general Apache Spark like system such as [Ballista]
- New query language engines such as [prql-query] and accelerators such as [VegaFusion]
- Research platform for new Database Systems, such as [Flock]
- SQL support to another library, such as [dask sql]
Contributor

Maybe we can also move Dask SQL to the less active projects.

- SQL support to another library, such as [Vortex]
- Streaming data platforms such as [Synnada]
- Tools for reading / sorting / transcoding Parquet, CSV, AVRO, and JSON files such as [qv]
- Native Spark runtime replacement such as [Auron]
@@ -101,11 +101,10 @@ Here are some active projects using DataFusion:
- [CnosDB] Open Source Distributed Time Series Database
- [Comet](https://github.com/apache/datafusion-comet) Apache Spark native query execution plugin
- [Cube Store] Cube’s universal semantic layer platform is the next evolution of OLAP technology for AI, BI, spreadsheets, and embedded analytics
- [Dask SQL] Distributed SQL query engine in Python
- [datafusion-dft](https://github.com/datafusion-contrib/datafusion-dft) Batteries included CLI, TUI, and server implementations for DataFusion.
- [dbt Fusion engine](https://github.com/dbt-labs/dbt-fusion) The dbt Fusion engine, written in Rust, designed for speed and correctness with a native SQL understanding across DWH SQL dialects.
- [delta-rs] Native Rust implementation of Delta Lake
- [Exon](https://github.com/wheretrue/exon) Analysis toolkit for life-science applications
Contributor

Can we move Exon to the "less active" projects below instead?

Here are some less active projects that used DataFusion:

Collaborator Author

sure!

- [EDB Postgres Lakehouse] built with [Seafowl]
- [Feldera](https://github.com/feldera/feldera) Fast query engine for incremental computation
- [Funnel](https://funnel.io/) Data Platform powering Marketing Intelligence applications.
- [GlareDB](https://github.com/GlareDB/glaredb) Fast SQL database for querying and analyzing distributed data.
@@ -125,19 +124,21 @@ Here are some active projects using DataFusion:
- [Restate](https://github.com/restatedev) Easily build resilient applications using distributed durable async/await
- [ROAPI] Create full-fledged APIs for slowly moving datasets without writing a single line of code
- [Sail](https://github.com/lakehq/sail) Unifying stream, batch and AI workloads with Apache Spark compatibility
- [Seafowl] CDN-friendly analytical database
Contributor

Seafowl is now EDB's analytics engine -- maybe we can update the name / link instead:

https://www.enterprisedb.com/blog/analytics-query-goes-6x-faster-edb-postgres-distributeds-new-analytics-engine

- [SedonaDB](https://github.com/apache/sedona-db) A single-node analytical database engine with geospatial as a first-class citizen
- [Sleeper](https://github.com/gchq/sleeper) Serverless, cloud-native, log-structured merge tree based, scalable key-value store
- [Spice.ai] Building blocks for data-driven AI applications
- [Synnada] Streaming-first framework for data products
- [VegaFusion] Server-side acceleration for the [Vega](https://vega.github.io/) visualization grammar
- [Vortex] An extensible, state of the art columnar file format
- [Telemetry](https://telemetry.sh/) Structured logging made easy
- [Xorq](https://github.com/xorq-labs/xorq/) Xorq is a multi-engine batch transformation framework built on Ibis, DataFusion and Arrow

Here are some less active projects that used DataFusion:

- [bdt](https://github.com/datafusion-contrib/bdt) Boring Data Tool
- [Cloudfuse Buzz]
- [Dask SQL] Distributed SQL query engine in Python
- [Exon] Analysis toolkit for life-science applications
- [Flock]
- [Tensorbase]

@@ -149,6 +150,8 @@ Here are some less active projects that used DataFusion:
[dask sql]: https://github.com/dask-contrib/dask-sql
[datafusion-tui]: https://github.com/datafusion-contrib/datafusion-tui
[delta-rs]: https://github.com/delta-io/delta-rs
[edb postgres lakehouse]: https://www.enterprisedb.com/products/analytics
[exon]: https://github.com/wheretrue/exon
[flock]: https://github.com/flock-lab/flock
[kamu]: https://github.com/kamu-data/kamu-cli
[greptimedb]: https://github.com/GreptimeTeam/greptimedb
@@ -163,7 +166,8 @@ Here are some less active projects that used DataFusion:
[spice.ai]: https://github.com/spiceai/spiceai
[synnada]: https://synnada.ai/
[tensorbase]: https://github.com/tensorbase/tensorbase
[vegafusion]: https://vegafusion.io/ "if you know of another project, please submit a PR to add a link!"
[vegafusion]: https://vegafusion.io/
[vortex]: https://vortex.dev/ "if you know of another project, please submit a PR to add a link!"

## Integrations and Extensions
