feat: Configurable batch size and max wait limit for targets #1626

Open
Tracked by #1
edgarrmondragon opened this issue Apr 20, 2023 · 1 comment · May be fixed by #1876

Comments

@edgarrmondragon
Collaborator

edgarrmondragon commented Apr 20, 2023

Feature scope

Targets (data type handling, batching, SQL object generation, etc.)

Description

Prior art

Many targets in the Singer ecosystem (target-snowflake is one) support the following settings or some variant of them:

| Property | Type | Description |
| --- | --- | --- |
| batch_size_rows | Integer | (Default: 100000) Maximum number of rows in each batch. At the end of each batch, the rows in the batch are loaded into Snowflake. |
| batch_wait_limit_seconds | Integer | (Default: None) Maximum time to wait for batch to reach batch_size_rows. |

Current SDK support

The Singer SDK currently only allows target developers to configure the maximum size of a batch via an attribute:

MAX_SIZE_DEFAULT = 10000

By making the batch size configurable at runtime, users would have better control over resource usage in constrained environments (e.g. low memory).
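
For context, here is a minimal sketch of what tuning the limit looks like today. `SinkStandIn` and `LowMemorySink` are hypothetical stand-ins written for illustration (not the SDK's actual `Sink` implementation), so the snippet runs without `singer_sdk` installed. The point is that the limit lives in code, so only the target developer can change it, not the user running the target.

```python
class SinkStandIn:
    """Hypothetical stand-in for the SDK's Sink class; not the real implementation."""

    MAX_SIZE_DEFAULT = 10000

    def __init__(self) -> None:
        self.current_size = 0  # number of records in the pending batch

    @property
    def is_full(self) -> bool:
        # The batch is considered full once it reaches the class-level limit.
        return self.current_size >= self.MAX_SIZE_DEFAULT


class LowMemorySink(SinkStandIn):
    # A developer can lower the limit, but only by editing the target's code.
    MAX_SIZE_DEFAULT = 500


sink = LowMemorySink()
sink.current_size = 500
print(sink.is_full)
```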

Proposal

  • Add a new pair of built-in settings batch_size_rows and batch_wait_limit_seconds
  • Update Sink.is_full to use batch_size_rows and fall back to Sink.MAX_SIZE_DEFAULT if batch_size_rows is not configured
  • Track the start time of each batch and implement Sink.is_too_old to check against batch_wait_limit_seconds
  • Update the TargetBase class to make use of the new method that checks the batch TTL
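
The proposal above could be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's code: `ConfigurableSink`, `add_record`, `reset`, `should_drain`, and the injectable `clock` argument are all hypothetical names introduced here. Only `batch_size_rows`, `batch_wait_limit_seconds`, `MAX_SIZE_DEFAULT`, `is_full`, and `is_too_old` come from the issue text.

```python
import time
from typing import Any, Callable, Mapping, Optional


class ConfigurableSink:
    """Hypothetical sketch of the proposed behavior; not the SDK's actual code."""

    MAX_SIZE_DEFAULT = 10000

    def __init__(
        self,
        config: Mapping[str, Any],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.config = config
        self._clock = clock  # injectable so the age check is easy to test
        self.current_size = 0
        self._batch_start: Optional[float] = None

    @property
    def max_size(self) -> int:
        # Use batch_size_rows when configured, else fall back to the class default.
        configured = self.config.get("batch_size_rows")
        return self.MAX_SIZE_DEFAULT if configured is None else int(configured)

    @property
    def is_full(self) -> bool:
        return self.current_size >= self.max_size

    @property
    def is_too_old(self) -> bool:
        # No limit configured, or no batch in progress: never too old.
        limit = self.config.get("batch_wait_limit_seconds")
        if limit is None or self._batch_start is None:
            return False
        return self._clock() - self._batch_start >= limit

    @property
    def should_drain(self) -> bool:
        # What the target would check after each record (assumed integration point).
        return self.is_full or self.is_too_old

    def add_record(self) -> None:
        # Start the batch timer on the first record of each batch.
        if self._batch_start is None:
            self._batch_start = self._clock()
        self.current_size += 1

    def reset(self) -> None:
        # Called after the batch has been drained.
        self.current_size = 0
        self._batch_start = None


# Usage with a fake clock, so no real waiting is needed.
now = [0.0]
sink = ConfigurableSink(
    {"batch_size_rows": 3, "batch_wait_limit_seconds": 5},
    clock=lambda: now[0],
)
sink.add_record()
now[0] = 6.0  # six "seconds" pass with only one record buffered
print(sink.is_full, sink.is_too_old, sink.should_drain)
```

Using a monotonic clock rather than wall-clock time is a design choice made in this sketch so that system clock adjustments cannot affect the batch age; the issue itself does not specify which clock to use.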

Related

@tayloramurphy
Collaborator

Requested by multiple people in Slack. We may want to do this as part of v1.
