Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Query Cancellation #918

Closed
wants to merge 2 commits into from
Closed

Add Query Cancellation #918

wants to merge 2 commits into from

Conversation

d-cole
Copy link
Contributor

@d-cole d-cole commented Sep 8, 2023

resolves # #917
docs dbt-labs/docs.getdbt.com/# N/A

Problem

The BigQueryConnectionManager cannot cancel queries.

Solution

This PR adds query cancellation by:

  • Predetermining and storing job_ids on a per-thread basis in BigQueryConnectionManager.job_ids_by_thread
  • Implements cancel_open to attempt to cancel any running (or previously run) job ids associated with each thread before closing the connection. This is done by using the existing _retry_and_handle to call client.cancel_job.

Why predetermine job_ids?
Initially I considered persisting job_ids after client.query is called. However, this could lead to a race condition if a job has been started and a termination is sent before the job_id was stored, leading to a failure to cancel the job. By predetermining job_ids (uuid4), we can persist the job_id before the job has been kicked off. Doing this, the race condition only leads to attempting to cancel a job that doesn't exist.

Alternatives considered:
I considered using sessions similar to the Snowflake adapter. This doesn't seem feasible as the python api currently does not support session termination.

Questions for reviewers:

  • Are there concerns around modifying jobs_by_thread without locking?
  • What (if any) are the implications of this for python models?
  • When a job has completed it is not removed from jobs_by_thread. This means upon cancelation it will attempt to cancel jobs that have already completed. Will this cause any issues?
  • What logging should be added upon the successful cancelation of a job?
  • What integration tests (if any) are needed?

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

Copy link
Contributor

github-actions bot commented Apr 4, 2024

This PR has been marked as Stale because it has been open with no activity as of late. If you would like the PR to remain open, please comment on the PR or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Apr 4, 2024
Copy link
Contributor

Although we are closing this PR as stale, it can still be reopened to continue development. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this Apr 11, 2024
@d-cole
Copy link
Contributor Author

d-cole commented Apr 29, 2024

Reopening this PR as supporting query cancelation would be valuable and bring dbt-bigquery inline with cancelation support in other adapters. Benefits of this feature are discussed here: #917

This will benefit users of dbt-bigquery in two scenarios:

  • When executions triggered during local development are interrupted, this feature will avoid wasting compute. This will help reduce cost, especially when users are using on-demand billing. Additionally, it will save users time as they will not need to use the BQ ui and api to identify and cancel query jobs.
  • This will allow for orchestrator-configured timeouts (e.g. airflow) to safely cancel model execution. This will save compute and ensure retries don't rerun a query that was unknowingly completed or still running.

If this feature will be addressed by a different approach/PR please let me know and we can close this one.

@jtcohen6 jtcohen6 reopened this May 16, 2024
@jtcohen6
Copy link
Contributor

jtcohen6 commented May 16, 2024

Thanks @d-cole - we can try taking another look here. The changes in this PR are fairly intricate, so it will require careful review, especially around how we want to functionally test this.

@jtcohen6 jtcohen6 mentioned this pull request May 31, 2024
@colin-rogers-dbt colin-rogers-dbt self-assigned this Jun 5, 2024
@colin-rogers-dbt colin-rogers-dbt added the community This PR is from a community member label Jun 20, 2024
@colin-rogers-dbt
Copy link
Contributor

closing in favor of #1251

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla:yes community This PR is from a community member ok to test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants