Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Transform] provide exponential_avg* stats for batch transforms #52041

Merged
merged 8 commits into from
Feb 12, 2020

Conversation

hendrikmuhs
Copy link
Contributor

@hendrikmuhs hendrikmuhs commented Feb 7, 2020

provide exponential_avg* stats for batch transforms, avoids confusion why those values are all 0 otherwise

Dear reviewer, please ignore the 1st commit and start with: 7d76136

Closes #52037

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was quick! LGTM

The client defaults these settings to 0.0 if not set.

this.expAvgCheckpointDurationMs = expAvgCheckpointDurationMs == null ? 0.0 : expAvgCheckpointDurationMs;

Are you happy with that?

@hendrikmuhs hendrikmuhs changed the title [Transform] omit continuous stats for batch transforms [Transform] provide exponential_avg* stats for batch transforms Feb 10, 2020
Copy link
Contributor Author

@hendrikmuhs hendrikmuhs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidkyle

I completely reworked this change. Now it reports the exp averages even for checkpoint 1, that means for batch, too.

Unfortunately the PR got a bit messy, I therefore annotated the relevant bits. Can you review it?

@@ -361,9 +361,8 @@ protected void onFinish(ActionListener<Void> listener) {
if (progress != null && progress.getPercentComplete() != null && progress.getPercentComplete() < 100.0) {
progress.incrementDocsProcessed(progress.getTotalDocs() - progress.getDocumentsProcessed());
}
// If the last checkpoint is now greater than 1, that means that we have just processed the first
// continuous checkpoint and should start recording the exponential averages
if (lastCheckpoint != null && lastCheckpoint.getCheckpoint() > 1) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ this is basically the main change

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😁 ha it took a lot of reviewing to get here and it's a one liner

this(numPages, numInputDocuments, numOutputDocuments, numInvocations, indexTime, searchTime, indexTotal, searchTotal,
indexFailures, searchFailures, 0.0, 0.0, 0.0);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ removed this constructor

@@ -69,7 +69,7 @@ public void testBwcWith73() throws IOException {
STARTED,
randomBoolean() ? null : randomAlphaOfLength(100),
randomBoolean() ? null : NodeAttributeTests.randomNodeAttributes(),
new TransformIndexerStats(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
new TransformIndexerStats(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0.0, 0.0, 0.0),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not get confused, testBwcWith73: this test BWC with 7.3 where we had no exponential averages, so its ok to construct the stats with 0.0

};
TransformIndexerStats.EXPONENTIAL_AVG_CHECKPOINT_DURATION_MS.getPreferredName(),
TransformIndexerStats.EXPONENTIAL_AVG_DOCUMENTS_INDEXED.getPreferredName(),
TransformIndexerStats.EXPONENTIAL_AVG_DOCUMENTS_PROCESSED.getPreferredName(), };
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ a bug uncovered by this change: no telemetry for the exponential averages

10, // searchFailures
11.0, // exponential_avg_checkpoint_duration_ms
12.0, // exponential_avg_documents_indexed
13.0 // exponential_avg_documents_processed
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ foolish test, due to the alternative constructor, the test was incomplete

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

assertThat(transformStats.get("documents_processed"), equalTo(1000));
assertThat(transformStats.get("documents_indexed"), equalTo(27));
assertThat(
"exponential_avg_checkpoint_duration_ms is not 0.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the message should change '.. is not > 0.0'

@@ -361,9 +361,8 @@ protected void onFinish(ActionListener<Void> listener) {
if (progress != null && progress.getPercentComplete() != null && progress.getPercentComplete() < 100.0) {
progress.incrementDocsProcessed(progress.getTotalDocs() - progress.getDocumentsProcessed());
}
// If the last checkpoint is now greater than 1, that means that we have just processed the first
// continuous checkpoint and should start recording the exponential averages
if (lastCheckpoint != null && lastCheckpoint.getCheckpoint() > 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😁 ha it took a lot of reviewing to get here and it's a one liner

@@ -115,8 +130,16 @@ public void testUsageDisabled() throws IOException, InterruptedException, Execut
when(licenseState.isTransformAllowed()).thenReturn(true);
Settings.Builder settings = Settings.builder();
settings.put("xpack.transform.enabled", false);
var usageAction = new TransformUsageTransportAction(mock(TransportService.class), null, null,
mock(ActionFilters.class), null, settings.build(), licenseState, mock(Client.class));
var usageAction = new TransformUsageTransportAction(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The var is great I'm looking forward to using it. Expecting a backport problem I looked for this class in the 7.x branch and couldn't find it, instead there is TransformFeatureSetTests which looks very similar. Up to you but you might want to sync the 2 branches to make future backports easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file/class does not exist in 7.x, its based on a re-factoring that was only done on master: #43563

I stumbled upon this several times (e.g. when renaming the feature to transform) and its indeed a troublemaker for backports. The code for 7.x is different, every time I change something here, I need to re-work the PR for 7.x. Fortunately this file doesn't change that often.

@davidkyle
Copy link
Member

@hendrikmuhs sorry for the slow review I missed the notification at first

@hendrikmuhs hendrikmuhs merged commit 34734ae into elastic:master Feb 12, 2020
hendrikmuhs pushed a commit to hendrikmuhs/elasticsearch that referenced this pull request Feb 13, 2020
…tic#52041)

provide exponential_avg* stats for batch transforms, avoids confusion
why those values are all 0 otherwise
hendrikmuhs pushed a commit that referenced this pull request Feb 14, 2020
…52041) (#52323)

provide exponential_avg* stats for batch transforms, avoids confusion
why those values are all 0 otherwise
@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ML] Transform exponential_avg* stats not updated in batch mode
4 participants