Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IGNITE-9412: GDB convergence by error support #4670

Closed
wants to merge 19 commits into from

Conversation

avplatonov
Copy link
Contributor

No description provided.

Copy link
Member

@zaleslaw zaleslaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like the algorithmic improvement, but sometimes the code looks like is not very supportable from my side, for example.

@@ -95,7 +98,7 @@ public static void main(String... args) throws InterruptedException {
@NotNull private static CacheConfiguration<Integer, double[]> createCacheConfiguration() {
CacheConfiguration<Integer, double[]> trainingSetCfg = new CacheConfiguration<>();
trainingSetCfg.setName("TRAINING_SET");
trainingSetCfg.setAffinity(new RendezvousAffinityFunction(false, 10));
trainingSetCfg.setAffinity(new RendezvousAffinityFunction(false, 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it have 1 partition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It remained after debugging. I forget revert.
I will fix

*
* @param factory Factory.
*/
public GDBLearningStrategy withCheckConvergenceStgyFactory(ConvergenceCheckStrategyFactory factory) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike the class which has in name and behaviour Strategy + Factory, is it required to use here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe I should use another name, but factory need to create checker on runtime (like sample size) and it need to avoid additional parameters (like DatasourceBuilder) for user api

I will rename CheckConvergenceStgy to ConvergenceChecker to avoid missunderstanding

ModelsComposition mdl);

/**
* Compute error on one element of dataset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe: Compute error for the specific vector?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is
Thank you

*/
public class FeatureMatrixWithLabelsOnHeapData implements AutoCloseable {
/** Matrix with features. */
private final double[][] features;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike this class which handles only double features. Maybe is better to use more widely used class like https://github.com/apache/ignite/blob/master/modules/ml/src/main/java/org/apache/ignite/ml/structures/LabeledVectorSet.java

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand you, but
I need to use such structure for DecisionTrees compaptibility in Gradient Boosting. If we implement Vector-structures support in DT later, then we can rewrite this part to your data structure

* @param features Features.
* @param labels Labels.
*/
public FeatureMatrixWithLabelsOnHeapData(double[][] features, double[] labels) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/** Vector with labels. */
private final double[] labels;

public class DecisionTreeDataWithIndex extends FeatureMatrixWithLabelsOnHeapData implements AutoCloseable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we have in database DTDataWihoutIndex? Should it be renamed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right


MeanAbsValueCheckConvergenceStgyFactory factory = new MeanAbsValueCheckConvergenceStgyFactory(0.1);
ModelsComposition notConvergedModel = new ModelsComposition(Collections.emptyList(), null) {
@Override public Double apply(Vector features) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain the meaning of the apply implentation here, in test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will move it to test-class fields


/** Mean error more sensitive to anomalies in data */
@Test
public void testComputing2() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename test to something meaningful (like the comment above)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

public class MedianOfMedianConvergenceCheckStrategyTest {
/** */
@Test
public void testComputing() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename the test to the something meaningful

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

…-9412

# Conflicts:
#	examples/src/main/java/org/apache/ignite/examples/ml/tree/boosting/GDBOnTreesClassificationTrainerExample.java
#	examples/src/main/java/org/apache/ignite/examples/ml/tree/boosting/GDBOnTreesRegressionTrainerExample.java
#	modules/ml/src/test/java/org/apache/ignite/ml/composition/boosting/GDBTrainerTest.java
…-9412

# Conflicts:
#	examples/src/main/java/org/apache/ignite/examples/ml/tree/boosting/GDBOnTreesClassificationTrainerExample.java
#	examples/src/main/java/org/apache/ignite/examples/ml/tree/boosting/GDBOnTreesRegressionTrainerExample.java
#	modules/ml/src/test/java/org/apache/ignite/ml/composition/boosting/GDBTrainerTest.java
@asfgit asfgit closed this in 6225c56 Sep 6, 2018
SGrimstad pushed a commit to gridgain/apache-ignite that referenced this pull request Sep 11, 2018
akalash pushed a commit to gridgain/apache-ignite that referenced this pull request Nov 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants