awesome-data-management

A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀

Amundsen: Data discovery and metadata engine that improves productivity when working with data.
Apache Atlas: Provides open metadata management and governance capabilities to build a data catalog.
CKAN: Open-source DMS (data management system) for powering data hubs and data portals.
DataHub: LinkedIn's generalized metadata search & discovery tool.
Datatile: A library for managing, validating, summarizing, and visualizing data.
Delta Lake: Storage layer that brings scalable, ACID transactions to Apache Spark and other engines.
Dolt: SQL database that you can fork, clone, branch, merge, push and pull just like a git repository.
DVC: Management and versioning of datasets and machine learning models.
Hub: A dataset format for creating, storing, and collaborating on AI datasets of any size.
Intake: A lightweight package for finding, investigating, loading and disseminating data.
Quilt: A self-organizing data hub with S3 support.
lakeFS: Repeatable, atomic and versioned data lake on top of object storage.
Magda: A federated, open-source data catalog for all your big data and small data.
Marquez: Collect, aggregate, and visualize a data ecosystem's metadata.
Metacat: Unified metadata exploration API service for Hive, RDS, Teradata, Redshift, S3 and Cassandra.
Milvus: An open-source embedding vector similarity search engine powered by Faiss, NMSLIB and Annoy.
OpenMetadata: A single place to discover, collaborate, and get your data right.
Spark: Unified analytics engine for large-scale data processing.
