Great Expectations - Data Quality limitation #9794

Closed
nayanex opened this issue Apr 23, 2024 · 1 comment

Comments


nayanex commented Apr 23, 2024

Great Expectations can't handle checks that span two or more tables.

For example, a date column in table A should be greater than a date column in table B, where table A and table B can be joined on an identifier.

A classic example is loan and loanpart, where loan holds the information about mortgages and loanpart holds the information about mortgage payments. The date in loanpart should always be greater than the date in loan, since a mortgage payment can only be made for an active mortgage.

What Great Expectations can do is check whether the date in the loan table is less than or greater than UTCDATE; it cannot compare dates between tables as described above.

What approach would you suggest to overcome this limitation?

I only tried to use Great Expectations

@rachhouse
Contributor

Hi @nayanex, thanks for your question!

You are correct that the majority of GX Expectations are built to run on a single Data Asset. To use GX to test columns in separate tables, you have two possible options:

1. Transform the data outside of your GX workflow so that it can be treated as a single Table Asset
With this approach, you would first create a SQL view based on a query that joins table A and table B on the shared identifier column. You would then create a GX Table Asset from the SQL view (rather than from the original table A and table B) and run the Expectations of interest against it, for example: expect_column_pair_values_a_to_be_greater_than_b.
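
As a rough illustration of the view-based approach, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in database. The column names (loan_id, start_date, payment_date) and the view name loan_with_payments are assumptions made for this example, not taken from the original schema. The final query only mimics what a column-pair comparison would assert on the view; in a real workflow the view would be registered as a GX Table Asset and the Expectation run there instead.

```python
import sqlite3

# In-memory database standing in for the real warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schemas for the loan / loanpart example.
    CREATE TABLE loan (loan_id INTEGER PRIMARY KEY, start_date TEXT);
    CREATE TABLE loanpart (
        loanpart_id INTEGER PRIMARY KEY,
        loan_id INTEGER,
        payment_date TEXT
    );
    INSERT INTO loan VALUES (1, '2024-01-01'), (2, '2024-03-01');
    INSERT INTO loanpart VALUES (10, 1, '2024-02-01'), (11, 2, '2024-02-15');

    -- The view joins both tables on the shared identifier so that the two
    -- dates sit side by side in one row, like columns of a single table.
    CREATE VIEW loan_with_payments AS
    SELECT lp.loanpart_id,
           l.loan_id,
           l.start_date AS loan_date,
           lp.payment_date
    FROM loanpart AS lp
    JOIN loan AS l ON l.loan_id = lp.loan_id;
    """
)

# ISO-8601 date strings compare correctly as text, so a plain comparison
# is enough to find payments that are not later than the loan date.
violations = conn.execute(
    "SELECT loanpart_id FROM loan_with_payments "
    "WHERE NOT payment_date > loan_date"
).fetchall()
print(violations)
```

Because the view exposes loan_date and payment_date in the same row, a single-asset column-pair Expectation can compare them without knowing that they came from different tables.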

2. Create a Custom Expectation that implements your desired check
GX supports the ability to create Custom Expectations - you could create a Custom Query Expectation to implement the behavior you are looking for. There are currently two contributed custom Query Expectations that might provide a helpful reference if you decide to build your own:
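
Independently of those references, the kind of check a Custom Query Expectation would wrap can be sketched in plain Python. This is not the GX API: the function name, the result dictionary, and the table and column names below are all hypothetical, and sqlite3 is used only so the example is self-contained. The idea is that the query returns the rows that break the rule, and the check succeeds when that result is empty.

```python
import sqlite3

# Query that returns "unexpected" rows: payments dated on or before the
# start of the loan they belong to. Table and column names are assumptions.
UNEXPECTED_ROWS_QUERY = """
SELECT lp.loanpart_id, lp.payment_date, l.start_date
FROM loanpart AS lp
JOIN loan AS l ON l.loan_id = lp.loan_id
WHERE lp.payment_date <= l.start_date
ORDER BY lp.loanpart_id
"""


def check_payments_after_loan_start(conn: sqlite3.Connection) -> dict:
    """Run the cross-table query and summarise it (hypothetical result shape)."""
    rows = conn.execute(UNEXPECTED_ROWS_QUERY).fetchall()
    return {"success": len(rows) == 0, "unexpected_rows": rows}


# Small in-memory dataset to exercise the check.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE loan (loan_id INTEGER PRIMARY KEY, start_date TEXT);
    CREATE TABLE loanpart (
        loanpart_id INTEGER PRIMARY KEY,
        loan_id INTEGER,
        payment_date TEXT
    );
    INSERT INTO loan VALUES (1, '2024-01-01'), (2, '2024-03-01');
    INSERT INTO loanpart VALUES (10, 1, '2024-02-01'), (11, 2, '2024-02-15');
    """
)

result = check_payments_after_loan_start(conn)
print(result["success"], result["unexpected_rows"])
```

A real Custom Query Expectation would hand a query like this to GX and turn the returned rows into a validation result; the exact base class and hooks depend on the GX version in use.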

Our Discourse forum is a great place to reach out if you have any future questions on general GX usage.
