The Metadata Platform for your Data Stack
Updated Nov 1, 2024 - Java
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information.
Historical metadata of your data warehouse is a treasure trove for discovering insights not just about changing data patterns, but also about data quality and user behaviour. Because Data Catalog keeps only the latest version of metadata for fast searchability, this solution records Data Catalog Tags history in BigQuery.
A system for managing files and file replicas across many diverse sites
Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabytes of data and make it accessible for data processing and analytical purposes by any cloud compute platform.