hive_compared_bq compares and validates two SQL-like tables and graphically shows the rows and columns that differ.
Updated Dec 13, 2017 - Python
This script creates the RA and CM "Missing Entry Data Reports", formerly called the "Demographics Data Quality Reports".
A script for identifying likely errors in the housing move-in date.
Repository hosting the LAPD project: data quality analysis and extraction, reports, and import to MongoDB.
Migrated to: https://gitlab.com/Oslandia/osm-data-classification
Airflow plug-in that allows you to automate robust Data Quality checks for BigQuery
Scripts I wrote at my job which could be helpful to others
A library of helpful pyspark functions
Using Great Expectations and Notion's API, this repo aims to provide data quality for our databases in Notion.
This project repository provides a headless module to enrich location data in a database table using the Google Maps Geocode API.
This repo shows how you can utilise CI/CD to speed up data development. Video link below.
This is a website for each registered member to assess the quality of their data based on five metrics that we have designed from the literature.
This code can be used to find the angle by which an image has been rotated. It has been tested especially on satellite data, where geo-referencing rotates the image.
Soda Spark is a PySpark library that helps you test your data in Spark DataFrames.
🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎
A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in Airflow.
A collection of scripts written to complete DQLab Data Analyst Career Track 📊