This repository was archived by the owner on May 17, 2024. It is now read-only.
DuckDB is awesome for a few reasons that apply to data-diff. Namely, you can query raw CSV/TXT/Parquet files directly, as though they were tables (e.g. select posting_date, count(*) as r_count from '/Users/me/data.csv' group by posting_date).
We use this ability to load PROD and UAT files from our system and compare their output. Being able to pass this across to data-diff would be incredible.
Whilst just being able to reference CSV files in data-diff might be another option, doing this via DuckDB would allow you to perform some basic transformations on the way, such as renaming fields or selecting a reduced range.
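
To make the workflow concrete, here is a rough sketch of the kind of comparison described above, done with the `duckdb` Python package on its own (not through data-diff, which is what this request is asking for). The file layout, the column names `posting_date`, `post_dt` and `amount`, and the helper names `diff_query`, `rows_only_in_prod` and `demo` are all made up for illustration.

```python
import tempfile
from pathlib import Path


def diff_query(prod_csv: str, uat_csv: str) -> str:
    """Build SQL that returns rows present in the PROD file but not in UAT.

    Both files are read directly by path. The UAT column `post_dt`
    (a hypothetical name) is renamed to `posting_date` on the way in.
    Paths are interpolated as-is, so they must not contain single quotes.
    """
    return f"""
        SELECT posting_date, amount FROM read_csv_auto('{prod_csv}')
        EXCEPT
        SELECT post_dt AS posting_date, amount FROM read_csv_auto('{uat_csv}')
        ORDER BY posting_date
    """


def rows_only_in_prod(prod_csv: str, uat_csv: str) -> list:
    """Run the comparison in an in-memory DuckDB database."""
    import duckdb  # third-party package: pip install duckdb

    con = duckdb.connect()  # in-memory database, nothing is persisted
    try:
        return con.execute(diff_query(prod_csv, uat_csv)).fetchall()
    finally:
        con.close()


def demo() -> list:
    """Write two tiny sample files (made-up data) and compare them."""
    with tempfile.TemporaryDirectory() as tmp:
        prod = Path(tmp) / "prod.csv"
        uat = Path(tmp) / "uat.csv"
        prod.write_text("posting_date,amount\n2024-01-01,10\n2024-01-02,20\n")
        uat.write_text("post_dt,amount\n2024-01-01,10\n")
        return rows_only_in_prod(prod.as_posix(), uat.as_posix())
```

Calling `demo()` should return the rows that exist only in the PROD sample. The rename and column selection happen inside the query itself, which is the advantage over pointing a diff tool at the raw CSV files.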