Add backoff and retry when KC BQ hits quota limit #14

Closed
C0urante opened this issue Aug 23, 2016 · 1 comment
Comments

@C0urante
Collaborator

(Migrated from internal Jira issue DI-448)

We currently have a hard-coded sleep of 1000ms (BigQuerySinkTask.TABLE_WRITE_INTERVAL) for writes to each table. This penalty is expensive when bootstrapping a large amount of data (i.e., when we're behind in the log). During load testing, I saw 30000 rows (212 bytes each) take > 3 seconds to write to BigQuery using the streaming API. Single-writer performance with the BigQuery streaming API appears to be in the 1-2 MB/sec range.

Eliminating this delay will expose us to quota issues with BigQuery, which only allows 100,000 rows/sec/table. Tuning this properly via configuration will be very difficult in a distributed environment, since we'll have some number of writers spread across multiple machines. I think the right approach is to add backoff-and-retry logic when we receive a quota_exceeded error. The tasks will then handle quota-exceeded errors automatically, and we can go full throttle when bootstrapping data.
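The backoff-and-retry approach described above can be sketched as follows. This is a hypothetical illustration, not the connector's actual code: `QuotaExceededError`, `write_with_backoff`, and all parameter names are assumptions, and a real implementation would catch BigQuery's quotaExceeded error response instead.

```python
import random

class QuotaExceededError(Exception):
    """Hypothetical stand-in for BigQuery's quotaExceeded error response."""

def write_with_backoff(write_fn, max_retries=5, base_wait_ms=1000, sleep_fn=None):
    """Call write_fn; on QuotaExceededError, wait and retry with
    exponential backoff plus jitter, up to max_retries retries.

    sleep_fn takes a wait time in milliseconds (injectable for testing);
    by default it does nothing, so callers should pass a real sleep.
    """
    sleep_fn = sleep_fn or (lambda ms: None)
    for attempt in range(max_retries + 1):
        try:
            return write_fn()
        except QuotaExceededError:
            if attempt == max_retries:
                raise  # give up after the final retry
            # Exponential backoff: base * 2^attempt, plus up to 1s of jitter
            # so that distributed writers don't retry in lockstep.
            wait_ms = base_wait_ms * (2 ** attempt) + random.randint(0, 1000)
            sleep_fn(wait_ms)
```

Unlike the fixed TABLE_WRITE_INTERVAL sleep, this only pauses when BigQuery actually pushes back, so bootstrap writes run at full speed until the quota is hit.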

mtagle added a commit that referenced this issue Sep 14, 2016
Resolve Issue #14 
Remove the hard-coded throttling between requests that was used to stay under quota limits.
If a request attempt returns a quotaExceeded error, pause and then retry the request, similar to the 500/503 exception-handling logic.
@mtagle
Contributor

mtagle commented Sep 14, 2016

Fixed in PR #40

@mtagle mtagle closed this as completed Sep 14, 2016
aakashnshah pushed a commit to aakashnshah/kafka-connect-bigquery that referenced this issue Sep 18, 2020
FMC-463: Fail task if error is encountered when writing to BigQuery