Skip to content

Common-SQL Operators Various Bugs #25163

@denimalpaca

Description

@denimalpaca

Apache Airflow Provider(s)

common-sql

Versions of Apache Airflow Providers

apache-airflow-providers-common-sql==1.0.0

Apache Airflow version

2.3.3 (latest released)

Operating System

macOS Monterey 12.3.1

Deployment

Astronomer

Deployment details

No response

What happened

  • SQLTableCheckOperator builds multiple checks in such a way that if two or more checks are given, and one is not a fully aggregated statement, then the SQL fails as it is missing a GROUP BY clause.
  • SQLColumnCheckOperator provides only the last SQL query built from the columns, so when a check fails, it will only give the correct SQL in the exception statement by coincidence.

What you think should happen instead

  • Multiple checks should not need a GROUP BY clause
  • Either the correct SQL statement, or no SQL statement, should be returned in the exception message.

How to reproduce

For the SQLTableCheckOperator, using the operator like so:

table_cheforestfire_costs_table_checkscks = SQLTableCheckOperator(
            task_id="forestfire_costs_table_checks",
            table=SNOWFLAKE_FORESTFIRE_COST_TABLE,
            checks={
                "row_count_check": {"check_statement": "COUNT(*) = 9"},
                "total_cost_check": {"check_statement": "land_damage_cost + property_damage_cost + lost_profits_cost = total_cost"}
            }
        )

For the SQLColumnCheckOperator, using the operator like so:

cost_column_checks = SQLColumnCheckOperator(
    task_id="cost_column_checks",
    table=SNOWFLAKE_COST_TABLE,
    column_mapping={
        "ID": {"null_check": {"equal_to": 0}},
        "LAND_DAMAGE_COST": {"min": {"geq_to": 0}},
        "PROPERTY_DAMAGE_COST": {"min": {"geq_to": 0}},
        "LOST_PROFITS_COST": {"min": {"geq_to": 0}},
    }
)

and ensuring that any of the ID, LAND_DAMAGE_COST, or PROPERTY_DAMAGE_COST checks fail.

An example DAG with the correct environment and data can be found here.

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions