Adding push job type of segment metadata only mode #5967

Merged: xiangfu0 merged 4 commits into master from segment_metadata_push on Sep 30, 2020

xiangfu0 (Contributor) commented Sep 3, 2020

Description

  • Add a new FileUploadType, METADATA.
  • Add a /segments/metadata endpoint for uploading a segment in METADATA-only mode.
  • Add new Pinot push job types: SegmentMetadataPush and SegmentCreationAndMetadataPush.
  • These jobs upload the Pinot segment metadata along with the download URI, bypassing the code path in which the controller downloads the segment.
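
To illustrate the idea behind the list above, here is a minimal toy model, not Pinot's actual implementation. Only the name METADATA comes from this PR; the SEGMENT member, the PushRequest class, the bytes_through_controller function, and the example URI and sizes are hypothetical names and values chosen for illustration. The sketch only shows why a metadata-only push moves far less data through the controller.

```python
from dataclasses import dataclass
from enum import Enum


class FileUploadType(Enum):
    # METADATA is the type added in this PR. SEGMENT is an assumed stand-in
    # for any existing mode that moves the whole segment through the controller.
    SEGMENT = "SEGMENT"
    METADATA = "METADATA"


@dataclass(frozen=True)
class PushRequest:
    """Hypothetical description of one segment push."""
    upload_type: FileUploadType
    download_uri: str      # where servers can later fetch the segment
    metadata_bytes: int    # size of the segment metadata alone
    segment_bytes: int     # size of the full segment


def bytes_through_controller(req: PushRequest) -> int:
    """Bytes the toy controller has to handle to register this segment."""
    if req.upload_type is FileUploadType.METADATA:
        # Metadata-only mode: the controller reads the metadata and records
        # the download URI; it never fetches the segment itself.
        return req.metadata_bytes
    # Full mode: the whole segment passes through the controller.
    return req.segment_bytes


uri = "hdfs://example/segments/seg_0.tar.gz"  # made-up location
full = PushRequest(FileUploadType.SEGMENT, uri, 50_000, 500_000_000)
meta = PushRequest(FileUploadType.METADATA, uri, 50_000, 500_000_000)

print(bytes_through_controller(full))
print(bytes_through_controller(meta))
```

In this model the controller's work for a metadata-only push depends on the metadata size only, which is the property the new job types rely on.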

xiangfu0 force-pushed the segment_metadata_push branch 2 times, most recently from e3095b2 to cb159a3, September 3, 2020 23:29

yupeng9 (Contributor) commented Sep 3, 2020

Curious, why is this mode needed? Could this lead to inconsistency between metadata and segments?

xiangfu0 (Contributor, Author) commented Sep 4, 2020

This mode should be used with caution.
For some users, we observed that pushing 20 TB of segments takes 10 hours, because the controller has to download every segment, load its metadata, and add the segment.
Metadata-only push is mostly useful for data bootstrapping, where it reduces the total push time.
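
As a rough back-of-the-envelope check on those numbers, the sketch below computes the effective end-to-end rate implied by 20 TB in 10 hours. It assumes decimal units (1 TB = 10^12 bytes) and treats the 10 hours as covering download, metadata loading, and segment addition together; the function name is made up for illustration.

```python
TB = 10**12  # decimal terabyte, an assumption about the units used above
MB = 10**6


def effective_throughput_mb_per_s(total_bytes: int, hours: float) -> float:
    """Average rate in MB/s needed to move total_bytes in the given hours."""
    return total_bytes / (hours * 3600) / MB


rate = effective_throughput_mb_per_s(20 * TB, 10)
print(f"{rate:.1f} MB/s")  # prints "555.6 MB/s"
```

That is roughly 2 TB per hour sustained through the controller, which is the cost the metadata-only mode is meant to avoid.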

yupeng9 (Contributor) commented Sep 4, 2020

I see. This could be a useful tool for issue mitigation. It would be good to describe the scenarios it targets in the PR description.

Jackie-Jiang (Contributor) left a comment

LGTM otherwise. Please add a test if possible

xiangfu0 force-pushed the segment_metadata_push branch 3 times, most recently from 2d32aaf to dc89494, September 29, 2020 21:33
xiangfu0 merged commit 4f2e767 into master, Sep 30, 2020
xiangfu0 deleted the segment_metadata_push branch, September 30, 2020 04:47
xiangfu0 added the release-notes label, Sep 30, 2020