
BigQueryInsertJobOperator creates invalid job_ids #11856

@nathadfield

Description

Apache Airflow version: 1.10.*
Backport Packages version: 2020.10.29rc1

What happened:

BigQueryInsertJobOperator tries to start a BigQuery job by specifying a job_id which is a combination of dag_id, task_id, execution_date and an additional uniqueness_suffix.

https://github.com/PolideaInternal/airflow/blob/e5307990adb49a5a1c1c88a029940682efee0f9e/airflow/providers/google/cloud/operators/bigquery.py#L2073

However, it would appear that the regex pattern that is supposed to remove the + character is not correct, as this character still remains in the job_id after re.sub is applied.

https://github.com/PolideaInternal/airflow/blob/e5307990adb49a5a1c1c88a029940682efee0f9e/airflow/providers/google/cloud/operators/bigquery.py#L2073

Invalid job ID "airflow_my_test_dag_v1_test_2020_10_01T00_00_00+00_00_27f69de5cd17a012f3c89d414a70fe64". Job IDs must be alphanumeric (plus underscores and dashes) and must be at most 1024 characters long.
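
For illustration, here is a minimal sketch of the failure mode; the variable names mirror the description above rather than the exact operator source, and the final substitution is only one possible fix:

```python
import re
import uuid

# Sketch of the job_id construction described above (names are illustrative).
dag_id = "my-test-dag-v1"
task_id = "test"
execution_date = "2020-10-01T00:00:00+00:00"  # execution_date.isoformat()
uniqueness_suffix = uuid.uuid4().hex

job_id = f"airflow_{dag_id}_{task_id}_{execution_date}_{uniqueness_suffix}"

# A substitution that only covers ":" and "-" leaves the timezone "+" in place,
# which reproduces the rejected job ID shown above.
print(re.sub(r"[:\-]", "_", job_id))
# airflow_my_test_dag_v1_test_2020_10_01T00_00_00+00_00_<suffix>

# One possible fix: also replace "+" (and "." for sub-second timestamps) so
# only characters BigQuery accepts (alphanumerics, "_" and "-") remain.
print(re.sub(r"[:\-+.]", "_", job_id))
# airflow_my_test_dag_v1_test_2020_10_01T00_00_00_00_00_<suffix>
```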

How to reproduce it:

  • Create a DAG with the name my-test-dag-v1.
  • Run any query against BigQuery using this operator (a minimal DAG sketch follows below).
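
A minimal DAG along these lines should reproduce the error (a sketch only; the query is a placeholder and a default GCP connection is assumed to be configured):

```python
from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
from airflow.utils.dates import days_ago

# Minimal reproduction sketch: a dag_id containing dashes plus any query job.
with DAG(
    dag_id="my-test-dag-v1",
    start_date=days_ago(1),
    schedule_interval=None,
) as dag:
    BigQueryInsertJobOperator(
        task_id="test",
        configuration={
            "query": {
                "query": "SELECT 1",
                "useLegacySql": False,
            }
        },
    )
```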

Labels

kind:bug
