
Get larger batch of input files when using native batch with google cloud#9307

Merged
jihoonson merged 1 commit into apache:master from zachjsh:IMPLY-2030
Feb 4, 2020

Conversation

Contributor

@zachjsh zachjsh commented Feb 4, 2020

Description

By default, native batch ingestion fetched only a batch of 10
files at a time when used with Google Cloud. The default for other
cloud providers is 1024, and Google Cloud should match. The low
batch size was caused by a typo. This change updates the batch
size to 1024 when using Google Cloud.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.
  • added integration tests.
  • been tested in a test Druid cluster.

…loud

Member

@clintropolis clintropolis left a comment


thanks for fixing 👍


public class GoogleCloudStorageInputSourceTest extends InitializedNullHandlingTest
{
  private static final long EXPECTED_MAX_LISTING_LENGTH = 1024L;
Member


nit: this should probably just use GoogleCloudStorageInputSource.MAX_LISTING_LENGTH directly in the event it ever changes
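The nit above could be applied roughly like this. This is a minimal standalone sketch, not the actual Druid source: `MaxListingLengthSketch` and its constant are hypothetical stand-ins for `GoogleCloudStorageInputSource.MAX_LISTING_LENGTH`, showing why a test should reference the production constant rather than hard-coding 1024.

```java
// Sketch only: names here are illustrative stand-ins, not Druid code.
public class MaxListingLengthSketch
{
    // Stand-in for GoogleCloudStorageInputSource.MAX_LISTING_LENGTH after the
    // fix (the typo'd value was 10; other cloud input sources use 1024).
    static final long MAX_LISTING_LENGTH = 1024L;

    public static void main(String[] args)
    {
        // Per the review nit: compare against the production constant rather
        // than a hard-coded literal, so the test cannot drift from the source
        // if the default ever changes again.
        long expected = MAX_LISTING_LENGTH;
        if (expected != 1024L) {
            throw new AssertionError("unexpected max listing length: " + expected);
        }
        System.out.println(expected); // prints 1024
    }
}
```

The design point is simply that a test asserting against the shared constant verifies wiring (the input source uses the constant) instead of duplicating the value in two places.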

@jihoonson jihoonson merged commit 768d60c into apache:master Feb 4, 2020
zachjsh added a commit to zachjsh/druid that referenced this pull request Feb 6, 2020
…loud (apache#9307)

zachjsh added a commit to implydata/druid-public that referenced this pull request Feb 6, 2020
…loud (apache#9307)

zachjsh added a commit to implydata/druid-public that referenced this pull request Feb 7, 2020
…loud (apache#9307) (#52)

@jihoonson jihoonson added this to the 0.18.0 milestone Mar 26, 2020