Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.
Updated May 27, 2024 - Java
OpenMetadata is a unified platform for discovery, observability, and governance powered by a central metadata repository, in-depth lineage, and seamless team collaboration.
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Possibly the fastest DataFrame-agnostic quality check library in town.
DataSanity contains Instance Scan checks that verify data quality.
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Always know what to expect from your data.
DataOps TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution through data profiling, new dataset screening and hygiene review, algorithmic generation of data quality validation tests, ongoing testing of new data refreshes, and continuous data anomaly monitoring.
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Enhance your data testing seamlessly with this Dataform package, featuring robust common assertions to ensure the accuracy and integrity of your warehouse data.
Open Source Data Quality Monitoring.
Datapilot-cli is the command line interface for accessing the AI teammate for engineers to ensure best practices in their SQL and dbt projects.
Compare tables within or across databases
Library for Semi-Automated Data Science
ML powered analytics engine for outlier detection and root cause analysis.
re_data - fix data issues before your users & CEO would discover them 😊
The main code repository of Referencing Quality Scoring System metrics. Paper: https://www.semantic-web-journal.net/system/files/swj3593.pdf