Test the output of the ETL #11

Closed
aguynamedryan opened this issue Jun 3, 2015 · 2 comments
Comments

@aguynamedryan
Contributor

We should gather a handful of rows from each source file and hand-convert them into CDMv5, then feed the same source rows through our ETL and verify that it generates exactly the same rows. This will test that our ETL process is functioning properly.

As we hit odd rows in the raw data, we can add them to our set of test rows and verify that we're covering all the weird edge cases we discover while implementing the ETL.
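
Here's a rough sketch of what that comparison could look like in Python. Everything in it is made up for illustration: `toy_etl` is a stand-in for the real ETL, and the source columns (`id`, `sex`), the sex codes, and the sample rows are hypothetical, not taken from the actual source files. The idea is just to diff the ETL output against the hand-converted rows and report each mismatch.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def diff_rows(expected, actual):
    """Return (index, expected_row, actual_row) for every position that differs.

    A row missing on one side is reported as None.
    """
    mismatches = []
    for i in range(max(len(expected), len(actual))):
        exp = expected[i] if i < len(expected) else None
        act = actual[i] if i < len(actual) else None
        if exp != act:
            mismatches.append((i, exp, act))
    return mismatches


def toy_etl(source_rows):
    """Hypothetical stand-in for the real ETL: maps an invented sex code
    to a gender_concept_id, falling back to "0" for unknown codes."""
    gender_map = {"1": "8507", "2": "8532"}  # illustrative mapping only
    return [
        {
            "person_id": row["id"],
            "gender_concept_id": gender_map.get(row["sex"], "0"),
        }
        for row in source_rows
    ]


# A few source rows, including one odd row (unknown sex code 9).
SOURCE_CSV = "id,sex\n1,1\n2,2\n3,9\n"

# The same rows converted to the target layout by hand.
EXPECTED_CSV = "person_id,gender_concept_id\n1,8507\n2,8532\n3,0\n"

mismatches = diff_rows(read_rows(EXPECTED_CSV), toy_etl(read_rows(SOURCE_CSV)))
for index, exp, act in mismatches:
    print(f"row {index}: expected {exp}, got {act}")
```

Whenever we find a new odd row, we'd append it to the source sample and add its hand-converted counterpart to the expected sample, so the edge case stays covered from then on.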

@aguynamedryan
Contributor Author

We're working on hand-converting a few rows right now. If anyone else has already hand-converted some data, please let me know!

@ChristopheLambert
Contributor

We checked our output against the hand-coded source files and everything checks out. We have also tested the output of the ETL by loading it all into the CDM v5 and running Achilles Heel. After many iterations we removed all errors, with some remaining warnings that are well-understood and documented in the python_etl/README.md file.
