Data
A DSL for data-driven computational pipelines
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve various data quality problems ca…
An orchestration platform for the development, production, and observation of data assets.
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
JSON Hero is an open-source, beautiful JSON explorer for the web that lets you browse, search and navigate your JSON files at speed. 🚀. Built with 💜 by the Trigger.dev team.
Search Google and download specific file types
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
A high-performance observability data pipeline.
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
Elyra extends JupyterLab with an AI-centric approach.
High-Performance Serverless event and data processing platform
Apache Atlas - Open Metadata Management and Governance capabilities across the Hadoop platform and beyond
Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…
A curated list of open source tools used in analytics platforms and the data engineering ecosystem
Grist is the evolution of spreadsheets.
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL
Streamlit — A faster way to build and share data apps.
A terminal spreadsheet multitool for discovering and arranging data
💾 peer-to-peer sharing & live synchronization of files via command line
World's largest contributor-driven code dataset | Used in Quark Search Engine, @OpenGenus IQ, OpenGenus Visual Project




