Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-4998][MLlib]delete the "train" function #3836

Closed
wants to merge 1 commit into from

Conversation

ljzzju
Copy link
Contributor

@ljzzju ljzzju commented Dec 30, 2014

To make the functions with the same in "object" effective, specially when using java reflection.
As the "train" function defined in "class DecisionTree" will hide the functions with the same name in "object DecisionTree".

JIRA[SPARK-4998]

To make the functions with the same in "object" effective, specially when using java reflection.
As the "train" function defined in "class DecisionTree" will hide the functions with the same name in "object DecisionTree".

JIRA[SPARK-4998]
@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@srowen
Copy link
Member

srowen commented Dec 30, 2014

Although this is an experimental class, and so API methods could be removed, I think you'd still want a decent reason to remove this method at this point, even if it's deprecated.

Here, there is no method with the same signature in the object though, so I'm not sure what the problem is. It's common to have many methods with the same name in a class anyway. With reflection you differentiate them by their method signature, so this shouldn't be an obstacle. (I think the object even appears as a separate class DecisionTree$ in the JVM?)

Last you should connect your existing JIRA to this PR by putting SPARK-4998 in the title. https://issues.apache.org/jira/browse/SPARK-4998

@ljzzju ljzzju changed the title [MLlib]delete the "train" function [SPARK-4998][MLlib]delete the "train" function Dec 30, 2014
@mengxr
Copy link
Contributor

mengxr commented Dec 30, 2014

@srowen The Scala compiler doesn't generate the static train method under DecisionTree if there is a train method under class DecisionTree, regardless of the method signature. That's why we deprecated this method. From javap:

public class org.apache.spark.mllib.tree.DecisionTree implements scala.Serializable,org.apache.spark.Logging {
  public static scala.Option<org.apache.spark.mllib.tree.impl.NodeIdCache> findBestSplits$default$10();
  public static org.apache.spark.mllib.tree.impl.TimeTracker findBestSplits$default$9();
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainRegressor(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint>, java.util.Map<java.lang.Integer, java.lang.Integer>, java.lang.String, int, int);
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int);
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainClassifier(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint>, int, java.util.Map<java.lang.Integer, java.lang.Integer>, java.lang.String, int, int);
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int);
  public org.slf4j.Logger org$apache$spark$Logging$$log_();
  public void org$apache$spark$Logging$$log__$eq(org.slf4j.Logger);
  public java.lang.String logName();
  public org.slf4j.Logger log();
  public void logInfo(scala.Function0<java.lang.String>);
  public void logDebug(scala.Function0<java.lang.String>);
  public void logTrace(scala.Function0<java.lang.String>);
  public void logWarning(scala.Function0<java.lang.String>);
  public void logError(scala.Function0<java.lang.String>);
  public void logInfo(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logDebug(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logTrace(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logWarning(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logError(scala.Function0<java.lang.String>, java.lang.Throwable);
  public boolean isTraceEnabled();
  public org.apache.spark.mllib.tree.model.DecisionTreeModel run(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>);
  public org.apache.spark.mllib.tree.model.DecisionTreeModel train(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>);
  public org.apache.spark.mllib.tree.DecisionTree(org.apache.spark.mllib.tree.configuration.Strategy);
}

One way to call this static train method in Java is quite ugly:

DecisionTree$.MODULE$.train(...)

@srowen
Copy link
Member

srowen commented Dec 30, 2014

Ah, I see the problem now. That's surprising. Yes I'm used to using MODULE$ from Java where necessary, so I think there's a fairly straightforward workaround. But this seems to be a legitimate argument for removing the deprecated method sooner than later indeed.

@mengxr
Copy link
Contributor

mengxr commented Dec 30, 2014

ok to test

@mengxr
Copy link
Contributor

mengxr commented Dec 30, 2014

add to whitelist

@SparkQA
Copy link

SparkQA commented Dec 30, 2014

Test build #24909 has started for PR 3836 at commit 4e13133.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Dec 30, 2014

Test build #24910 has started for PR 3836 at commit 4e13133.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Dec 30, 2014

Test build #24909 has finished for PR 3836 at commit 4e13133.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24909/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Dec 30, 2014

Test build #24910 has finished for PR 3836 at commit 4e13133.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24910/
Test PASSed.

@mengxr
Copy link
Contributor

mengxr commented Dec 30, 2014

LGTM. Merged into master. Thanks!

@asfgit asfgit closed this in 035bac8 Dec 30, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants