microsoft/Auto-Validate-by-History

Auto-Validate by-History (AVH)

This repository contains a Python implementation of AVH and the baselines reported in the paper "Auto-Validate by-History: Auto-Program Data Quality Constraints to Validate Recurring Data Pipelines". Follow the steps below to reproduce the results.

Please contact Dezhan Tu (dztu AT g.ucla.edu) and Yeye He (yeyehe AT microsoft.com) for questions or feedback.

Dependencies

  • Ubuntu 18.04, Anaconda 3.5+
  • Tested on Python 3.8.0
  • Download and install arrayfire, e.g., pip install arrayfire-3.8.0-cp38-cp38-linux_x86_64.whl
  • All other required Python packages can be installed from the provided requirements.txt (run pip install -r requirements.txt)
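
After installing the dependencies above, a quick sanity check of the environment can save a failed run later. The snippet below is a minimal sketch and is not part of the repository: the helper names (python_matches, missing_packages, environment_report) are hypothetical, and it assumes the arrayfire wheel is importable under the top-level name arrayfire. Extend REQUIRED_PACKAGES with the import names that correspond to requirements.txt.

```python
import importlib.util
import sys

# Assumption: top-level import names to probe. Only "arrayfire" is listed here;
# add the import names that correspond to the entries in requirements.txt.
REQUIRED_PACKAGES = ("arrayfire",)


def python_matches(version_info=sys.version_info, major=3, minor=8):
    """Return True if the interpreter belongs to the Python series the README was tested on."""
    return (version_info[0], version_info[1]) == (major, minor)


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the top-level package names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def environment_report():
    """Summarize the interpreter check and any packages that were not found."""
    return {"python_ok": python_matches(), "missing": missing_packages()}
```

Calling environment_report() returns a dictionary with a python_ok flag and a list of missing package names. A different Python version is not necessarily a failure; it only means the setup differs from the one the code was tested on.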

Reproduce paper results in notebooks

The following Jupyter notebooks show and reproduce the results reported in the paper:

  • The notebook visualization.ipynb shows the main comparison results in our paper
  • The notebook sensitivity.ipynb shows all sensitivity and ablation results in our paper

Run AVH from the beginning to reproduce results in the paper

  • Run python avh_with_stationary.py
  • Run python avh_no_stationary.py
  • AVH results are stored in the ./result folder (consumed by the Jupyter notebooks above)
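
The steps above can also be driven from a single Python entry point. The following is a minimal sketch under stated assumptions, not a script shipped with the repository: the helper names (build_commands, list_results, reproduce) are hypothetical, the scripts are assumed to take no command-line arguments (as in the commands listed above), and the run order simply follows the order in which they are listed.

```python
import subprocess
import sys
from pathlib import Path

# Script names and result folder as given in the README.
SCRIPTS = ("avh_with_stationary.py", "avh_no_stationary.py")
RESULT_DIR = Path("./result")


def build_commands(scripts=SCRIPTS, python=sys.executable):
    """Build one command per script, using the given Python interpreter."""
    return [[python, script] for script in scripts]


def list_results(result_dir=RESULT_DIR):
    """Return the sorted file names in the result folder, or [] if it does not exist."""
    result_dir = Path(result_dir)
    if not result_dir.is_dir():
        return []
    return sorted(p.name for p in result_dir.iterdir() if p.is_file())


def reproduce(scripts=SCRIPTS, result_dir=RESULT_DIR):
    """Run each script in turn, stopping at the first failure, then list the outputs."""
    for command in build_commands(scripts):
        subprocess.run(command, check=True)  # raises CalledProcessError on a non-zero exit
    return list_results(result_dir)
```

Calling reproduce() from the repository root runs both scripts and returns the names of the files found in ./result, which gives a quick way to confirm that outputs exist before opening the notebooks.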

Code Overview: Walkthrough of each .py file

Utility Tool
