
Feature Request: run EXPERIMENTAL expectation (from great_expectations_experimental library) from Airflow? #109

Open
kujaska opened this issue Apr 24, 2023 · 1 comment

Comments


kujaska commented Apr 24, 2023

Hi!
Is it possible to run:

1) an EXPERIMENTAL expectation (from the great_expectations_experimental library) from Airflow?

Example: expect_queried_column_values_to_exist_in_second_table_column

Simply importing it in the DAG does not help (a sketch of the attempted setup follows after these questions):

from great_expectations_experimental.expectations.expect_queried_column_values_to_exist_in_second_table_column import ExpectQueriedColumnValuesToExistInSecondTableColumn

  • after the DAG run, Data Docs shows this text instead of the expectation result:

expect_queried_column_values_to_exist_in_second_table_column(**{'batch_id': '0120cd462e58ed32be35bc92c0ae', 'template_dict': {'condition': '1=1', 'first_table_column': 'PROV_ID', 'second_table_column': 'PROV_ID', 'second_table_full_name': 'LINC'}})

2) a custom expectation from the great_expectations/plugins/expectations folder?
Can it be run from Airflow? How?
https://docs.greatexpectations.io/docs/guides/expectations/creating_custom_expectations/how_to_use_custom_expectations/
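
For context, this is roughly the DAG setup that was attempted with airflow-provider-great-expectations (a minimal sketch; the DAG id, root dir, and checkpoint name are placeholders, and the operator arguments assume the GreatExpectationsOperator from the provider):

from datetime import datetime

from airflow import DAG
from great_expectations_provider.operators.great_expectations import GreatExpectationsOperator

# import the experimental expectation at module level of the DAG file,
# hoping it gets registered before the checkpoint runs
from great_expectations_experimental.expectations.expect_queried_column_values_to_exist_in_second_table_column import (
    ExpectQueriedColumnValuesToExistInSecondTableColumn,
)

with DAG(
    dag_id="ge_experimental_expectation",  # placeholder
    start_date=datetime(2023, 4, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    validate = GreatExpectationsOperator(
        task_id="validate_with_experimental_expectation",
        data_context_root_dir="your_root_dir",       # placeholder
        checkpoint_name="your_checkpoint_name",      # placeholder
    )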


kujaska commented Jun 30, 2023

The same is needed to run a custom expectation from the plugins folder.

With pure Great Expectations this can be done with an additional import, but there is no way to do it through airflow-provider-great-expectations.

Meanwhile, it looks like the following works:

in your DAG:

def ge_run_func():
    import great_expectations as gx
    from great_expectations.checkpoint.types.checkpoint_result import CheckpointResult

    data_context = gx.get_context(context_root_dir="your_root_dir")

    # import your custom expectation plugin from the plugins folder here,
    # so that it is registered before the checkpoint runs
    from expectations.expect_column_values_to_be_alphabetical import ExpectColumnValuesToBeAlphabetical

    result: CheckpointResult = data_context.run_checkpoint(
        checkpoint_name="your_checkpoint_name"
    )
    return result.success
  • and run ge_run_func in the DAG with a PythonOperator (see the sketch below)
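
A minimal sketch of wiring ge_run_func into a DAG with PythonOperator (the DAG id, start date, and task id are placeholders):

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

with DAG(
    dag_id="ge_custom_expectation_checkpoint",  # placeholder
    start_date=datetime(2023, 6, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    run_checkpoint = PythonOperator(
        task_id="run_ge_checkpoint",
        python_callable=ge_run_func,  # the function defined above
    )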
