⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Updated May 30, 2024 - Python
Data governance and data quality checking/monitoring platform (Django+jQuery+MySQL)
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
Possibly the fastest DataFrame-agnostic quality check library in town.
Swiple enables you to easily observe, understand, validate and improve the quality of your data
Lightweight library to write, orchestrate and test your SQL ETL. Writing ETL with data integrity in mind.
Safety net for machine learning pipelines. Plays nice with sklearn and pandas.
🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎
The script reads the dataset at the given path, selects the columns passed as arguments for the specified dates, and saves the report to the specified HDFS path.
hooqu is a library built on top of Pandas-like DataFrames for defining "unit tests for data". This is a spiritual port of Apache Deequ to Python.
Schedule, automate, and monitor data pipelines using Apache Airflow. Run data quality checks, track data lineage, and work with data pipelines in production.
An open source Python interface to the quality control of ocean in-situ observations
A library for authoring DLT pipelines via meta-programming patterns and deploying to Databricks workspaces.
Automatically validate datasets, poll task status, and display validation results in a GitHub pull request using Swiple.
Profile tabular datasets, manage automatic validation for new datasets, and automatically handle quality issues.
Capstone project from CESAR SCHOOL focused on evaluating data quality tools.
Validate tabular data in Python
Small tool to validate a folder of XML files against an XML schema
Framework to Automatically Determine the Quality of Open Data Catalogs