BigQuery operators in deferrable mode fail if the location is not US #29307
Comments
Thanks for opening your first issue here! Be sure to follow the issue template!
@lwyszomi can you take a look?
Hi Team!
But specifying the correct location of the job in the URL solves this problem:
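For example (placeholder project and job IDs; asia-south1 is just the example region used elsewhere in this issue, not necessarily the exact URL originally posted here), the jobs.get call succeeds once the location query parameter is appended:

```
GET https://www.googleapis.com/bigquery/v2/projects/<project-id>/jobs/<job-id>?location=asia-south1
```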
The current implementation of the methods on the Job object from the gcloud.aio.bigquery package doesn't support passing location as a parameter. I have created an issue in the GitHub repo for the gcloud.aio.bigquery package with an example and reproduction steps, but it may take some time for them to resolve it on their side. As a workaround, I can implement async methods for the BigQuery calls and stop using the Job object from gcloud.aio.bigquery, but that would require changing all the other classes that use the current implementation of BigQueryAsyncHook and call the Job object's methods: BigQueryIntervalCheckTrigger, BigQueryValueCheckTrigger, BigQueryCheckTrigger, BigQueryGetDataTrigger, and all operators that use those triggers. Please let me know which option you prefer, thanks :)
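For reference, a minimal sketch of what such an async call could look like, bypassing the Job object and forwarding the location explicitly. The function name, session/token wiring, and endpoint host here are illustrative assumptions, not the provider's or gcloud.aio.bigquery's actual API:

```python
import aiohttp


async def get_job_status(
    session: aiohttp.ClientSession,
    token: str,
    project_id: str,
    job_id: str,
    location: str,
) -> dict:
    """Fetch a BigQuery job via the REST API, passing the location.

    Illustrative sketch only: it calls jobs.get directly instead of going
    through gcloud.aio.bigquery's Job object, which does not forward the
    location and therefore 404s for jobs outside the US/EU multi-regions.
    """
    url = (
        f"https://bigquery.googleapis.com/bigquery/v2/"
        f"projects/{project_id}/jobs/{job_id}"
    )
    headers = {"Authorization": f"Bearer {token}"}
    # The location query parameter is the piece missing from the current calls.
    async with session.get(url, headers=headers, params={"location": location}) as resp:
        resp.raise_for_status()
        return await resp.json()
```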
I don't think Airflow should work around bugs/issues of upstream packages (at least not this kind of bug). I would suggest linking the upstream repo's bug report here; users who hit this issue can find the workaround you suggested and comment on the upstream package's repo to increase awareness of the bug.
Okay, sure. I have updated the previous comment with the link to the created issue.
@VladaZakharova so if I'm reading this right, we don't have a task on this issue and it should be fixed upstream? In that case I think we can close this one?
@eladkal Yes, you are right, you can close it. I have placed the solution in the comment and in the issue.
Apache Airflow Provider(s)
google
Versions of Apache Airflow Providers
apache-airflow-providers-google==8.8.0
Apache Airflow version
2.5.1
Operating System
Mac
Deployment
Astronomer
Deployment details
No response
What happened
While using BigQueryInsertJobOperator in deferrable mode, it fails.
Logs -
airflow.exceptions.AirflowException: 404, message='Not Found: {\n "error": {\n "code": 404,\n "message": "Not found: Job dev-data-platform-294611:airflow_derived_tables_all_tasks_group_temp_user_seg_all_create_stage_table_2023_02_01T04_30_00_00_00_0f2853ad8762909d41067023ddb3c6d8",\n "errors": [\n {\n "message": "Not found: Job dev-data-platform-294611:airflow_derived_tables_all_tasks_group_temp_user_seg_all_create_stage_table_2023_02_01T04_30_00_00_00_0f2853ad8762909d41067023ddb3c6d8",\n "domain": "global",\n "reason": "notFound"\n }\n ],\n "status": "NOT_FOUND"\n }\n}\n', url=URL('https://www.googleapis.com/bigquery/v2/projects/dev-data-platform-294611/jobs/airflow_derived_tables_all_tasks_group_temp_user_seg_all_create_stage_table_2023_02_01T04_30_00_00_00_0f2853ad8762909d41067023ddb3c6d8')
What you think should happen instead
The BigQuery insert job should succeed.
I debugged this, and the error happens because the gcloud.aio.bigquery library does not pass the location parameter when making the GET job API call to BigQuery.
According to the docs, when using any region besides the US and EU multi-regions, this location must be passed.
As seen in the logs, it uses the global domain instead of the location that is passed to it.
How to reproduce
In BigQueryInsertJobOperator, set a location that is not in the US (e.g. asia-south1) and set deferrable=True; a minimal sketch follows.
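A minimal sketch under assumed defaults (the DAG id, query, and schedule are illustrative placeholders; any query job reproduces it as long as the location is not a US/EU multi-region):

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="bq_deferrable_location_repro",  # illustrative DAG id
    start_date=datetime(2023, 2, 1),
    schedule=None,
    catchup=False,
) as dag:
    # Any query works; the point is the non-US location combined with deferrable mode.
    insert_job = BigQueryInsertJobOperator(
        task_id="create_stage_table",
        configuration={
            "query": {
                "query": "SELECT 1",
                "useLegacySql": False,
            }
        },
        location="asia-south1",  # any region outside the US/EU multi-regions
        deferrable=True,         # fails with a 404 on the job poll; works with deferrable=False
    )
```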
Anything else
No response
Are you willing to submit PR?
Code of Conduct