[Bug] many timeouts with priority: interactive config #1246

Open
2 tasks done
dataders opened this issue May 20, 2024 · 7 comments
Labels
bug Something isn't working triage

Comments

@dataders
Contributor

dataders commented May 20, 2024

Is this a new bug in dbt-bigquery?

  • I believe this is a new bug in dbt-bigquery
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Community Slack thread in #db-bigquery

Runtime Error in test [TEST NAME] (path/to/test/file.yml)
  Operation did not complete within the designated timeout of 300 seconds. 

I'm seeing the same thing the past 2 days -- it seems to be the same timeout issue but the model / test that fails hasn't been consistent

Had to dbt retry the same job 5 times this AM; each time it would fail on a timeout in a different test or model build. In BQ, I'm seeing a query that uses 0.03 seconds of slot time take 3 minutes and 10 seconds in duration.

[normally], we had 2 issues in 200 runs, and since last night 6 of 7 runs have timed out.

Expected Behavior

Should not time out

Steps To Reproduce

Relevant log output

No response

Environment

- OS:
- Python:
- dbt-core:
- dbt-bigquery:

Additional Context

No response

@dataders dataders added bug Something isn't working triage labels May 20, 2024
@chase-jones

Adding a comment here: this is not something that we can consistently reproduce. However, we are noticing GCP queries that use very little slot time (e.g. 0.03s of slot time) taking 3+ mins to complete.

We are using interactive mode to submit these queries.

We can also look at the query hashes for long-running queries and see that those same query hashes did not take nearly as much time in prior runs.
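To make the "tiny slot time, long duration" pattern easier to spot, here is a minimal sketch in Python. It assumes job records have already been exported as dictionaries whose keys mirror the `job_id`, `start_time`, `end_time`, and `total_slot_ms` columns of BigQuery's `INFORMATION_SCHEMA.JOBS` view. The function name `find_stalled_jobs` and both thresholds are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta


def find_stalled_jobs(jobs, min_duration_s=60.0, max_slot_ratio=0.01):
    """Return (job_id, slot_seconds, duration_seconds) for suspicious jobs.

    A job is flagged when it ran for at least `min_duration_s` seconds while
    its slot time was at most `max_slot_ratio` of that duration.
    Both thresholds are illustrative assumptions, not BigQuery defaults.
    """
    stalled = []
    for job in jobs:
        duration_s = (job["end_time"] - job["start_time"]).total_seconds()
        slot_s = job["total_slot_ms"] / 1000.0
        if duration_s >= min_duration_s and slot_s <= duration_s * max_slot_ratio:
            stalled.append((job["job_id"], slot_s, duration_s))
    return stalled


# Made-up records shaped like the numbers reported in this thread.
_start = datetime(2024, 5, 20, 9, 0, 0)
example_jobs = [
    # 0.03 s of slot time, 3 min 10 s of duration: should be flagged.
    {"job_id": "job_a", "start_time": _start,
     "end_time": _start + timedelta(seconds=190), "total_slot_ms": 30},
    # 5 s of slot time, 6 s of duration: normal.
    {"job_id": "job_b", "start_time": _start,
     "end_time": _start + timedelta(seconds=6), "total_slot_ms": 5000},
]
stalled_example = find_stalled_jobs(example_jobs)
```

If the delay is actually queueing before execution starts, measuring from the job's creation time instead of its start time may be the more useful comparison.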

@chase-jones

chase-jones commented May 20, 2024

An automatic dbt retry with exponential backoff would be nice to have as a way of combating this issue from the dbt side of things. Manually triggering the retry will often work, but in one instance this AM, I had to dbt retry (from dbt Cloud) the same job 4 times before it went through.
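Until something like this exists in dbt itself, the idea can be approximated with an external wrapper. The following is a rough sketch, not a dbt feature: `backoff_delays` and `run_with_retries` are hypothetical helper names, and the delay values are arbitrary. It assumes a dbt version that provides the `dbt retry` command and that a non-zero exit code means the run failed.

```python
import random
import subprocess
import time


def backoff_delays(max_retries, base_seconds=30.0, cap_seconds=600.0):
    """Yield one wait per retry: base * 2**n, capped at cap_seconds."""
    for attempt in range(max_retries):
        yield min(cap_seconds, base_seconds * (2 ** attempt))


def run_with_retries(first_cmd, retry_cmd, max_retries=4, base_seconds=30.0,
                     jitter=0.1, runner=subprocess.run, sleep=time.sleep):
    """Run first_cmd once, then retry_cmd with exponential backoff.

    Returns True as soon as a command exits with code 0, and False once
    all retries are used up. `runner` and `sleep` are injectable so the
    logic can be exercised without invoking dbt or actually waiting.
    """
    if runner(first_cmd).returncode == 0:
        return True
    for delay in backoff_delays(max_retries, base_seconds):
        # Small random jitter so parallel jobs do not retry in lockstep.
        sleep(delay + random.uniform(0, jitter * delay))
        if runner(retry_cmd).returncode == 0:
            return True
    return False
```

A call such as `run_with_retries(["dbt", "build"], ["dbt", "retry"])` would run the build once and then retry from the point of failure up to four times, waiting roughly 30, 60, 120, and 240 seconds between attempts.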

@dougscc

dougscc commented May 20, 2024

Noticed the same issue starting with our May 19, 2024, 11:45 PM EDT "production" run. Same issue on our morning run.

The first two times it happened, we were able to recover automatically (we use Airflow to get exponential backoff, as @chase-jones mentioned above).

If an automated retry function is added, it would be friendly to add an option to run the retry with --select result:error

@dataders
Contributor Author

For due diligence, another thing I noticed is that a new minor release of the BigQuery Python client, 3.23.0, came out four days ago.

However, I don't see anything there that could affect this, which further leads me to think the issue is with BigQuery itself.

@vincentchangsnt

Yes, this is happening to us as well. We are encountering a timeout issue that we have not encountered previously. We've tried to rerun the job and it's failing at a different stage each time. Our pipeline ran fine all weekend and unexpectedly started failing this morning with no rhyme or reason.

@wazi55
Contributor

wazi55 commented May 20, 2024

Do you also see the same timeout error when you run the query interactively in the BigQuery console? If so, I would recommend creating a ticket directly with the Google Cloud Support team.

@aleksandrpolyak-bicyclehealth

We ran the same dbt pipeline using GCP Cloud Build, and it executed successfully without any slowdowns.


6 participants