Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

add async trainModel #81

Merged
merged 1 commit into from
Apr 10, 2020
Merged

add async trainModel #81

merged 1 commit into from
Apr 10, 2020

Conversation

wnbts
Copy link
Member

@wnbts wnbts commented Apr 8, 2020

This change adds a new async trainModel implementation with the same business logic to replace the current synchronous implementation.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@wnbts wnbts marked this pull request as ready for review April 8, 2020 23:20
}
}

private void trainModelForIteration(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iteration means some repeated steps. Suggest to rename to step.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chagned

Comment on lines +623 to +633
Entry<Integer, Integer> partitionResults = getPartitionedForestSizes(
RandomCutForest
.builder()
.dimensions(rcfNumFeatures)
.sampleSize(rcfNumSamplesInTree)
.numberOfTrees(rcfNumTrees)
.outputAfter(rcfNumSamplesInTree)
.parallelExecutionEnabled(false)
.build(),
anomalyDetector.getDetectorId()
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this in another PR: https://github.com/opendistro-for-elasticsearch/anomaly-detection/pull/83/files#diff-0ba3da6c04a6db2df8146de98b12d850

This is to have a single place to get the number of partitioned forests. Previously, we have redundant code in both ModelManager and ADStateManager.

If you agree, please use the changed getPartitionedForestSizes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes in that pr are currently unavailable in dev branch. if that is checked in first, this pr can be updated based on that. Or if this pr is checked in first, the refactoring can be done in a separate pr.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough.

* onFailure is called IllegalArgumentException when training data is invalid
* onFailure is called LimitExceededException when a limit for training is exceeded
*/
public void trainModel(AnomalyDetector anomalyDetector, double[][] dataPoints, ActionListener<Void> listener) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just transform the sync method to callback, not change any logic, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, changed to async inside out

Copy link
Contributor

@ylwu-amzn ylwu-amzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the change.

@wnbts wnbts merged commit f851f1a into opendistro-for-elasticsearch:development Apr 10, 2020
kaituo pushed a commit to kaituo/anomaly-detection that referenced this pull request Apr 13, 2020
@wnbts wnbts deleted the mm1-train branch September 11, 2020 01:52
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants