dataquality
Here are 4 public repositories matching this topic...
Simple Spark wrapper for validating data
Updated Oct 17, 2020 - Scala
Huemul BigDataGovernance is a framework that runs on top of Spark, Hive, and HDFS. It supports a corporate single-source-of-truth data strategy based on Data Governance best practices. It lets you implement tables with Primary Key and Foreign Key control when inserting and updating data through the library, null validation, …
Updated Apr 21, 2023 - Scala
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Updated Jun 19, 2024 - Scala
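To make the idea of "unit tests for data" concrete, here is a minimal sketch in plain Python. It is not Deequ's API and does not use Spark; the names `Check`, `is_complete`, `is_unique`, `has_size`, and `run_checks` are hypothetical helpers invented for this illustration, and the rows are made-up sample data. It only shows the general pattern of declaring constraints on a dataset and evaluating them all in one pass.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# A row is modeled as a plain dict; None stands for a missing value.
Row = Dict[str, Optional[object]]


@dataclass
class Check:
    """A named constraint evaluated against the whole dataset (hypothetical type)."""
    name: str
    predicate: Callable[[List[Row]], bool]


def is_complete(column: str) -> Check:
    # Passes when no row has a missing (None) value in the column.
    return Check(
        f"is_complete({column})",
        lambda rows: all(r.get(column) is not None for r in rows),
    )


def is_unique(column: str) -> Check:
    # Passes when every value in the column appears exactly once.
    def predicate(rows: List[Row]) -> bool:
        values = [r.get(column) for r in rows]
        return len(values) == len(set(values))

    return Check(f"is_unique({column})", predicate)


def has_size(assertion: Callable[[int], bool]) -> Check:
    # Passes when the row count satisfies the supplied assertion.
    return Check("has_size", lambda rows: assertion(len(rows)))


def run_checks(rows: List[Row], checks: List[Check]) -> Dict[str, bool]:
    # Evaluate every check and report pass/fail by check name.
    return {c.name: c.predicate(rows) for c in checks}


# Made-up sample data: one missing name and one duplicated id.
rows: List[Row] = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": None},
    {"id": 2, "name": "c"},
]

results = run_checks(
    rows,
    [
        has_size(lambda n: n == 3),
        is_complete("id"),
        is_complete("name"),
        is_unique("id"),
    ],
)

for name, passed in results.items():
    print(f"{name}: {'passed' if passed else 'FAILED'}")
```

A library built on Spark would compute the same kinds of metrics in a distributed way over large datasets; the sketch above only illustrates the declare-then-verify shape of such checks.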