Stars
Business intelligence as code: build fast, interactive data visualizations in SQL and markdown
Lakekeeper is an Apache-licensed, secure, fast, and easy-to-use Apache Iceberg REST Catalog written in Rust.
Home of the Open Data Contract Standard (ODCS).
data load tool (dlt) is an open-source Python library that makes data loading easy 🛠️
A cross-platform way to express data transformation, relational algebra, standardized record expression and plans.
This is a repo with links to everything you'd ever want to learn about data engineering
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
DuckDB-powered data lake analytics from Postgres
sqlfmt formats your dbt SQL files so you don't have to
Scalable and efficient data transformation framework - backwards compatible with dbt.
The data-validation toolkit for enhanced dbt (data build tool) PR review
dbt adapter for SQL Server and Azure SQL
A curated list of awesome ETL frameworks, libraries, and software.
Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database.
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊
A curated list of data engineering tools for software developers
Dagster Labs' open-source data platform, built with Dagster.
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
An orchestration platform for the development, production, and observation of data assets.
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
End-to-end encrypted platform for photos, videos and 2FA secrets.
Embedded property graph database built for speed. Vector search and full-text search built in. Implements Cypher.
Custom Dashboards for Beancount in Fava