Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor Realtime Segment Commit Upload Improvements #10725

Merged
merged 6 commits into from May 10, 2023

Conversation

ankitsultana
Copy link
Contributor

@ankitsultana ankitsultana commented May 5, 2023

  • Allow configuring timeout for segment uploader to deep-store. This is a backwards incompatible change since the default value will change from 10s to 300s. This is done to keep the config for upload timeout same.
  • Track latency for uploading segments from the server.
  • Track how many segment uploads timeout, fail or succeed.
  • Track how many segments have missing deep-store links.

@codecov-commenter
Copy link

codecov-commenter commented May 5, 2023

Codecov Report

Merging #10725 (aace5c3) into master (53469c0) will increase coverage by 41.12%.
The diff coverage is 96.66%.

@@              Coverage Diff              @@
##             master   #10725       +/-   ##
=============================================
+ Coverage     27.51%   68.63%   +41.12%     
- Complexity       58     6464     +6406     
=============================================
  Files          2127     2143       +16     
  Lines        114728   115193      +465     
  Branches      17296    17354       +58     
=============================================
+ Hits          31563    79062    +47499     
+ Misses        80019    30553    -49466     
- Partials       3146     5578     +2432     
Flag Coverage Δ
integration1 24.07% <53.33%> (-0.15%) ⬇️
integration2 ?
unittests1 68.08% <92.59%> (?)
unittests2 13.75% <6.89%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...altime/ServerSegmentCompletionProtocolHandler.java 51.88% <ø> (ø)
.../data/manager/realtime/PinotFSSegmentUploader.java 75.00% <91.66%> (+17.42%) ⬆️
...g/apache/pinot/common/metrics/ControllerMeter.java 100.00% <100.00%> (ø)
...a/org/apache/pinot/common/metrics/ServerMeter.java 98.78% <100.00%> (+0.04%) ⬆️
...a/org/apache/pinot/common/metrics/ServerTimer.java 88.88% <100.00%> (+12.41%) ⬆️
.../core/realtime/PinotLLCRealtimeSegmentManager.java 75.03% <100.00%> (+13.12%) ⬆️
...data/manager/realtime/SegmentCommitterFactory.java 100.00% <100.00%> (+13.63%) ⬆️
...ger/realtime/Server2ControllerSegmentUploader.java 77.14% <100.00%> (+11.62%) ⬆️
...server/starter/helix/HelixInstanceDataManager.java 68.32% <100.00%> (-5.25%) ⬇️

... and 1589 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ankitsultana ankitsultana changed the title Realtime Segment Commit Upload Minor Realtime Segment Commit Upload Improvements May 5, 2023
@ankitsultana ankitsultana marked this pull request as ready for review May 5, 2023 23:28
_serverMetrics.addMeteredTableValue(segmentName.getTableName(), ServerMeter.SEGMENT_UPLOAD_TIMEOUT, 1);
LOGGER.warn("Timed out waiting to upload segment: {} for table: {}",
segmentName.getSegmentName(), segmentName.getTableName());
_serverMetrics.addMeteredTableValue(rawTableName, ServerMeter.SEGMENT_UPLOAD_TIMEOUT, 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitsultana Should we also emit metric for segment upload success / failure overall? This is not related to this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you are right. Done. Also added these metrics for Server2Controller segment uploader.

Comment on lines +611 to +613
if (StringUtils.isBlank(committingSegmentDescriptor.getSegmentLocation())) {
_controllerMetrics.addMeteredTableValue(realtimeTableName, ControllerMeter.SEGMENT_MISSING_DEEP_STORE_LINK, 1);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitsultana will this not be equal to the failure metric which we are emitting from SegmentUploader classes? Can there be a scenario where these 2 differ? If not, do we need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be different. In Split commit servers upload the file to a temp location, and the file is atomically moved to the final location in the controller. It could be that different replicas try to upload the segment to the deep-store.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm makes sense!

@walterddr
Copy link
Contributor

walterddr commented May 9, 2023

tagging @npawar, @snleee and @klsince to take a look

@chenboat chenboat merged commit 1763e70 into apache:master May 10, 2023
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants