-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IGNITE-9412: GDB convergence by error support #4670
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like the algorithmic improvement, but sometimes the code looks like is not very supportable from my side, for example.
@@ -95,7 +98,7 @@ public static void main(String... args) throws InterruptedException { | |||
@NotNull private static CacheConfiguration<Integer, double[]> createCacheConfiguration() { | |||
CacheConfiguration<Integer, double[]> trainingSetCfg = new CacheConfiguration<>(); | |||
trainingSetCfg.setName("TRAINING_SET"); | |||
trainingSetCfg.setAffinity(new RendezvousAffinityFunction(false, 10)); | |||
trainingSetCfg.setAffinity(new RendezvousAffinityFunction(false, 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it have 1 partition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It remained after debugging. I forget revert.
I will fix
* | ||
* @param factory Factory. | ||
*/ | ||
public GDBLearningStrategy withCheckConvergenceStgyFactory(ConvergenceCheckStrategyFactory factory) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike the class which has in name and behaviour Strategy + Factory, is it required to use here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe I should use another name, but factory need to create checker on runtime (like sample size) and it need to avoid additional parameters (like DatasourceBuilder) for user api
I will rename CheckConvergenceStgy to ConvergenceChecker to avoid missunderstanding
ModelsComposition mdl); | ||
|
||
/** | ||
* Compute error on one element of dataset. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe: Compute error for the specific vector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is
Thank you
*/ | ||
public class FeatureMatrixWithLabelsOnHeapData implements AutoCloseable { | ||
/** Matrix with features. */ | ||
private final double[][] features; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike this class which handles only double features. Maybe is better to use more widely used class like https://github.com/apache/ignite/blob/master/modules/ml/src/main/java/org/apache/ignite/ml/structures/LabeledVectorSet.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand you, but
I need to use such structure for DecisionTrees compaptibility in Gradient Boosting. If we implement Vector-structures support in DT later, then we can rewrite this part to your data structure
* @param features Features. | ||
* @param labels Labels. | ||
*/ | ||
public FeatureMatrixWithLabelsOnHeapData(double[][] features, double[] labels) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #4670 (comment)
/** Vector with labels. */ | ||
private final double[] labels; | ||
|
||
public class DecisionTreeDataWithIndex extends FeatureMatrixWithLabelsOnHeapData implements AutoCloseable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we have in database DTDataWihoutIndex? Should it be renamed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right
|
||
MeanAbsValueCheckConvergenceStgyFactory factory = new MeanAbsValueCheckConvergenceStgyFactory(0.1); | ||
ModelsComposition notConvergedModel = new ModelsComposition(Collections.emptyList(), null) { | ||
@Override public Double apply(Vector features) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please explain the meaning of the apply implentation here, in test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will move it to test-class fields
|
||
/** Mean error more sensitive to anomalies in data */ | ||
@Test | ||
public void testComputing2() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename test to something meaningful (like the comment above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
public class MedianOfMedianConvergenceCheckStrategyTest { | ||
/** */ | ||
@Test | ||
public void testComputing() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename the test to the something meaningful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
…-9412 # Conflicts: # examples/src/main/java/org/apache/ignite/examples/ml/tree/boosting/GDBOnTreesClassificationTrainerExample.java # examples/src/main/java/org/apache/ignite/examples/ml/tree/boosting/GDBOnTreesRegressionTrainerExample.java # modules/ml/src/test/java/org/apache/ignite/ml/composition/boosting/GDBTrainerTest.java
…-9412 # Conflicts: # examples/src/main/java/org/apache/ignite/examples/ml/tree/boosting/GDBOnTreesClassificationTrainerExample.java # examples/src/main/java/org/apache/ignite/examples/ml/tree/boosting/GDBOnTreesRegressionTrainerExample.java # modules/ml/src/test/java/org/apache/ignite/ml/composition/boosting/GDBTrainerTest.java
No description provided.