### Integrating with Apache Airflow
**Description**: Integrate Great Expectations with Apache Airflow to run data quality checks automatically in your DAG.

**Steps**:
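1. Install Airflow (if you haven't already), for example with `pip install apache-airflow`.
2. Airflow DAG Integration:
    - Create a DAG file that runs your Great Expectations checkpoint from a `PythonOperator` (see the code cell below).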
3. Deploy and Test:
    - Place this file in your Airflow DAGs directory and start your Airflow scheduler.
    - Open the Airflow UI and trigger the DAG to see it run your expectations.

In [1]:
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime, timedelta
import great_expectations as ge

def run_data_quality_checks():
    # Load the project's Data Context; get_context() replaces the legacy
    # DataContext() constructor, which newer releases no longer provide
    context = ge.get_context()

    # Define your checkpoint name created in GE (replace with your checkpoint)
    checkpoint_name = "your_checkpoint_name"

    # Optionally define a batch request, if needed by your checkpoint
    batch_request = {
        "datasource_name": "your_datasource_name",
        "data_connector_name": "your_data_connector",
        "data_asset_name": "your_data_asset_name",
        "limit": 1000,  # Optional limit on rows
    }

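    # Run the checkpoint with the batch request. run_checkpoint() is the pre-1.0 API;
    # on GX 1.x the assumed equivalent is context.checkpoints.get(checkpoint_name).run()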
    checkpoint_result = context.run_checkpoint(
        checkpoint_name=checkpoint_name,
        batch_request=batch_request,
    )

    # Check if validation succeeded
    if not checkpoint_result["success"]:
        raise ValueError("Data Quality Check Failed!")

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'email_on_failure': False,
    'start_date': datetime(2025, 1, 1),
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

with DAG(
    'great_expectations_data_quality',
    default_args=default_args,
    description='Run Great Expectations data quality checks',
    schedule=timedelta(days=1),  # 'schedule' replaces 'schedule_interval' (Airflow 2.4+)
    catchup=False,
) as dag:

    run_quality_checks = PythonOperator(
        task_id='run_data_quality_checks',
        python_callable=run_data_quality_checks,
    )
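
Before triggering the DAG, the failure path of a callable like `run_data_quality_checks` can be exercised without a scheduler. The following is a minimal sketch, not part of the original notebook: `FakeContext`, `check_quality`, and `task_fails` are hypothetical names, and the fake only mimics a `run_checkpoint` call that returns a mapping with a `success` key.

```python
class FakeContext:
    """Hypothetical stand-in for a Great Expectations data context."""

    def __init__(self, success):
        self.success = success
        self.calls = []

    def run_checkpoint(self, checkpoint_name, batch_request=None):
        # Record the call and return a dict shaped like the result read in the DAG.
        self.calls.append(checkpoint_name)
        return {"success": self.success}


def check_quality(context, checkpoint_name):
    """Run a checkpoint and raise if validation did not succeed."""
    result = context.run_checkpoint(checkpoint_name=checkpoint_name)
    if not result["success"]:
        raise ValueError("Data Quality Check Failed!")
    return result


def task_fails(context):
    """Return True if the check raises, which is what would fail the Airflow task."""
    try:
        check_quality(context, "your_checkpoint_name")
    except ValueError:
        return True
    return False
```

Passing the context in as an argument is a design choice made for this sketch: it lets the same function run against a real context inside the DAG and against a fake one in a quick local check.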


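**Note**: Passing `schedule_interval` to `DAG(...)` on an Airflow release that has removed the argument (Airflow 3) raises `TypeError: DAG.__init__() got an unexpected keyword argument 'schedule_interval'`. Use the `schedule` argument instead; it is accepted from Airflow 2.4 onward.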
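
If one DAG file has to load on both older and newer Airflow releases, the keyword behind the `TypeError` above can be chosen at import time. This is a rough sketch under the assumption that the DAG constructor exposes the keyword in its signature; `schedule_kwarg`, `OldStyleDag`, and `NewStyleDag` are hypothetical names used only for illustration, and the two classes stand in for the real `DAG` class.

```python
import inspect
from datetime import timedelta


def schedule_kwarg(dag_cls, value):
    """Return the schedule keyword that dag_cls accepts, mapped to value."""
    params = inspect.signature(dag_cls).parameters
    # Prefer the newer 'schedule' name; fall back to 'schedule_interval'.
    name = "schedule" if "schedule" in params else "schedule_interval"
    return {name: value}


# Hypothetical stand-ins for an older and a newer DAG constructor signature.
class OldStyleDag:
    def __init__(self, dag_id, schedule_interval=None):
        self.dag_id = dag_id


class NewStyleDag:
    def __init__(self, dag_id, schedule=None):
        self.dag_id = dag_id


old_kwargs = schedule_kwarg(OldStyleDag, timedelta(days=1))
new_kwargs = schedule_kwarg(NewStyleDag, timedelta(days=1))
```

With Airflow installed, the result could be unpacked into the constructor, for example `DAG('great_expectations_data_quality', default_args=default_args, **schedule_kwarg(DAG, timedelta(days=1)))`. When only one Airflow version is targeted, writing `schedule=` directly is simpler.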