Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 2570: Update segment metrics when a transaction merge happens #3239

Merged
merged 6 commits into from Jan 11, 2019

Conversation

kevinhan88
Copy link
Contributor

@kevinhan88 kevinhan88 commented Jan 4, 2019

Signed-off-by: kevinhan88 kevinhan88@gmail.com

Change log description
Transaction segment write_bytes and write_events will only be added into parent segment metrics when transaction segments merge happens.

Purpose of the change
Fixes #2570

What the code does
Inside StreamSegmentContainer.java when mergeStreamSegment() is called, if the sourceSegment is not empty and is a transaction segment, then:
.the length of source segment is added into the WRITE_BYTES metrics of target segment
.the event count of source segment is added into the WRITE_EVENTS metrics of target segment

How to verify it
Integration unit test included.
Can also be observed in live metrics: after txn commit, the metrics of parent segment changes; if txn is aborted, no change to the parent segment.

@fpj fpj requested a review from RaulGracia January 4, 2019 10:37
Copy link
Contributor

@RaulGracia RaulGracia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Travis build is failing (StreamSegmentContainerTests).

@codecov
Copy link

codecov bot commented Jan 4, 2019

Codecov Report

Merging #3239 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #3239      +/-   ##
============================================
- Coverage     78.84%   78.81%   -0.03%     
+ Complexity     8016     8015       -1     
============================================
  Files           609      609              
  Lines         31476    31479       +3     
  Branches       3041     3042       +1     
============================================
- Hits          24816    24810       -6     
- Misses         4870     4876       +6     
- Partials       1790     1793       +3
Impacted Files Coverage Δ Complexity Δ
...mentstore/server/host/handler/AppendProcessor.java 76.14% <100%> (+0.5%) 34 <0> (+1) ⬆️
...e/server/host/handler/PravegaRequestProcessor.java 71.81% <100%> (+2.39%) 65 <0> (+3) ⬆️
...ntstore/server/reading/StorageReadResultEntry.java 66.66% <0%> (-25%) 2% <0%> (ø)
...main/java/io/pravega/controller/task/TaskBase.java 77.57% <0%> (-6.55%) 26% <0%> (-3%)
...a/io/pravega/controller/metrics/StreamMetrics.java 74.13% <0%> (-5.18%) 10% <0%> (-1%)
...erver/logs/operations/CheckpointOperationBase.java 90.9% <0%> (-4.55%) 4% <0%> (-1%)
.../client/stream/impl/ControllerResolverFactory.java 82.56% <0%> (-3.67%) 7% <0%> (ø)
...eventProcessor/requesthandlers/SealStreamTask.java 90.9% <0%> (-3.64%) 23% <0%> (-1%)
...ega/controller/store/task/ZKTaskMetadataStore.java 82.6% <0%> (-2.9%) 12% <0%> (ø)
...ga/controller/store/client/StoreClientFactory.java 79.59% <0%> (-2.05%) 10% <0%> (-1%)
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 289d02d...bc73bda. Read the comment docs.

@@ -493,6 +504,13 @@ public boolean isOffline() {
// the Merge right after the Seal.
result = CompletableFuture.allOf(result,
this.durableLog.add(new MergeSegmentOperation(targetSegmentId, sourceSegmentId), timer.getRemaining()));
//if segment is a transaction, add its length onto metrics of target segment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a good idea to place this kind of metrics logic in upper layers whenever possible. For this reason, it would be nice to explore if it is possible to report these metrics in PravegaRequestProcessor.mergeSegments (there is a private method recordStatForTransaction in which you can add your reporting logic).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@andreipaduroiu
Copy link
Member

I am OK with this, but just FYI that we use that metric to measure ingestion throughput. Since the data for transactions has already been reported, we will get throughput spikes for every transaction merge, even if we did not actually ingest any data.

@kevinhan88
Copy link
Contributor Author

@andreipaduroiu yes this approach will cause spike for segment level throughput. Hopefully the situation is mitigated by the global WRITE_BYTES and WRITE_EVENTS metrics as these two are reported regardless.

@kevinhan88 kevinhan88 force-pushed the issue-2570 branch 3 times, most recently from 522e94e to 2107d98 Compare January 8, 2019 18:03
@fpj fpj changed the title Issue-2570: Update segment metrics when a transaction merge happens Issue 2570: Update segment metrics when a transaction merge happens Jan 9, 2019
Copy link
Contributor

@RaulGracia RaulGracia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (just a couple of minor comments).

@Slf4j
public class TransactionMetricsTest {

private final int controllerPort = TestUtils.getAvailableListenPort();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: indentation in this class seems to be the half of what it should be, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -623,6 +625,13 @@ private void recordStatForTransaction(SegmentProperties sourceInfo, String targe
if (statsRecorder != null) {
statsRecorder.merge(targetSegmentName, len, numOfEvents, creationTime);
}

//if source segment is a transaction, add its length and event count onto metrics of target segment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, according to our guidelines, comments start by space and capitals.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Signed-off-by: kevinhan88 <kevinhan88@gmail.com>
…since some ad-hoc code may skip it

Signed-off-by: kevinhan88 <kevinhan88@gmail.com>
Signed-off-by: kevinhan88 <kevinhan88@gmail.com>
Signed-off-by: kevinhan88 <kevinhan88@gmail.com>
Signed-off-by: kevinhan88 <kevinhan88@gmail.com>
field.set(null, newValue);
}

SegmentProperties createSegmentProperty(String streamSegmentName, UUID txnId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use StreamSegmentInformation.builder(). …. build()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

statsProvider.start();
log.info("Metrics Stats provider is started");

executor = Executors.newSingleThreadScheduledExecutor();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Derive this class from ThreadPooledTestSuite and override getThreadPoolSize. That will take care of creating executors and properly shutting them down for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire test has been replaced by light weight mock tests.

updatedCounter = true;
} catch (AssertionError e) {
log.info("Metric not updated in the cache. Retrying.", e);
Exceptions.handleInterrupted(() -> Thread.sleep(100));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no better way of doing this than using a busy-wait loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we shouldn't have too many unit integration tests - as of today it already takes over 40 minutes to finish all unit tests.
I've deleted this integration test, and added some mock tests.

@@ -1,20 +1,23 @@
/**
* Copyright (c) 2017 Dell Inc., or its subsidiaries. All Rights Reserved.
*
* <p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, remove these markers from the license header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

Signed-off-by: kevinhan88 <kevinhan88@gmail.com>
adrianmo added a commit to adrianmo/pravega that referenced this pull request Jan 23, 2019
* master: (31 commits)
  Issue 3279: Client is performing blocking operations inside Netty Event Loop (pravega#3280)
  Issue 3200: Document Review: wire-protocol.md (pravega#3201)
  Issue 2933: (SegmentStore) testEndToEndWithFencing fixes (pravega#3294)
  Issue pravega#2768: Rename PravegaCredsWrapper class and its member variable (pravega#3302)
  Issue pravega#3299: Reduce dependencies in client (pravega#3300)
  Issue 2895: Wire Protocol implementation for TableStore (pravega#3283)
  Issue-3239: Make DYNAMIC_LOGGER non-static (pravega#3293)
  Issue 2956: (SegmentStore) TableSegment.deleteIfEmpty (pravega#3263)
  Issue pravega#3289: Use compile-time dependencies where possible (pravega#3290)
  Issue 3223: RetentionTest failure (pravega#3272)
  Issue 2001: Make a defensive copy of streamCut's position map (pravega#3281)
  Issue 3287: Exclude "passwd" extensions from rat. (pravega#3288)
  Issue 3285: Changed image pull policy. (pravega#3286)
  Issue 3266: Ensure bookkeeper.bkAckQuorumSize is 3 for BookieFailover test. (pravega#3267)
  Issue 3264: Return a SegmentHandle from Storage create method. (pravega#3276)
  Issue 3086: Controller Diagrams(new) included (pravega#3097)
  Issue 2570: Update segment metrics when a transaction merge happens (pravega#3239)
  Issue 3149: (SegmentStore) Pinned Segments (pravega#3215)
  Issue 3269: Fix invalid initialization order in AbstractScaleTests. (pravega#3270)
  Issue 3148: (Segment Store) Segment Metadata Store (pravega#3206)
  ...
@kevinhan88 kevinhan88 deleted the issue-2570 branch June 1, 2020 18:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants