[SPARK-11723] [ML] [Doc] Use LibSVM data source rather than MLUtils.loadLibSVMFile to load DataFrame #9690

Closed
yanboliang wants to merge 5 commits

yanboliang

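Use the LibSVM data source rather than MLUtils.loadLibSVMFile to load DataFrames. This includes:

* Use the LibSVM data source for all example code under examples/ml, and remove unused imports.
* Use the LibSVM data source for the user guides under ml-*** that were omitted by #8697.
* Fix a bug: on the Java side the call should be `sqlContext.read().format("libsvm").load(path)`, but the API doc and user guides mistakenly show `sqlContext.read.format("libsvm").load(path)`.
* Code cleanup.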

@mengxr

      case _ =>
        throw new IllegalArgumentException("Algo ${params.algo} not supported.")
    }
  }
Contributor Author

StringIndexer automatically casts the label column to String before indexing it, so this code snippet is no longer needed.
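
To illustrate the behaviour this comment relies on, here is a minimal stdlib-only Java sketch of "cast to string, then index". It is not Spark's StringIndexer implementation: the class name `LabelIndexSketch`, the method `fit`, and the alphabetical tie-break are assumptions made for illustration. The only behaviour taken from StringIndexer is that numeric labels are treated as strings and that the most frequent label receives index 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in only: this is NOT Spark's StringIndexer.
final class LabelIndexSketch {

    // Casts each numeric label to a String, then maps every distinct string
    // to an index in [0, numLabels), most frequent label first.
    // Assumption: ties are broken alphabetically so the result is deterministic.
    static Map<String, Integer> fit(double[] labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (double label : labels) {
            counts.merge(String.valueOf(label), 1, Integer::sum);
        }

        List<String> ordered = new ArrayList<>(counts.keySet());
        ordered.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });

        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < ordered.size(); i++) {
            index.put(ordered.get(i), i);
        }
        return index;
    }

    public static void main(String[] args) {
        double[] labels = {1.0, 0.0, 1.0, 1.0, 0.0, 2.0};
        System.out.println(fit(labels));
    }
}
```

Because the cast happens inside the indexing step, the caller does not need a separate branch that converts numeric labels by hand, which is the point of the comment above.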


SparkQA commented Nov 13, 2015

Test build #45858 has finished for PR 9690 at commit 83a0b2c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -82,7 +82,7 @@ private[libsvm] class LibSVMRelation(val path: String, val numFeatures: Int, val
   *   .load("data/mllib/sample_libsvm_data.txt")
   *
   * // Java
-  * DataFrame df = sqlContext.read.format("libsvm")
+  * DataFrame df = sqlContext.read().format("libsvm")

Thanks for catching this!
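
For readers unfamiliar with the kind of file loaded in the snippet above: each line of a LibSVM text file holds a label followed by `index:value` pairs, with one-based feature indices. The stdlib-only Java sketch below parses one such line. It is not the parser used by Spark's libsvm data source; the class `LibSvmLineSketch` and its fields are hypothetical names, and the input is assumed to be well formed. Indices are shifted to zero-based here, which is how Spark's vectors are indexed.

```java
import java.util.Arrays;

// Illustrative stand-in only: not the parser behind format("libsvm").
final class LibSvmLineSketch {
    final double label;
    final int[] indices;   // zero-based feature indices
    final double[] values; // feature values, aligned with indices

    LibSvmLineSketch(double label, int[] indices, double[] values) {
        this.label = label;
        this.indices = indices;
        this.values = values;
    }

    // Parses "label index1:value1 index2:value2 ..." (assumes well-formed input).
    static LibSvmLineSketch parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        double label = Double.parseDouble(tokens[0]);

        int n = tokens.length - 1;
        int[] indices = new int[n];
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            String[] pair = tokens[i + 1].split(":");
            indices[i] = Integer.parseInt(pair[0]) - 1; // file indices are one-based
            values[i] = Double.parseDouble(pair[1]);
        }
        return new LibSvmLineSketch(label, indices, values);
    }

    public static void main(String[] args) {
        LibSvmLineSketch row = parse("1 3:2.5 10:-1.0");
        System.out.println(row.label + " "
            + Arrays.toString(row.indices) + " "
            + Arrays.toString(row.values));
    }
}
```

The label and the sparse index/value pairs correspond to the label and features columns of the loaded DataFrame.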

asfgit pushed a commit that referenced this pull request Nov 13, 2015
…dLibSVMFile to load DataFrame

Use LibSVM data source rather than MLUtils.loadLibSVMFile to load DataFrame, include:
* Use libSVM data source for all example codes under examples/ml, and remove unused import.
* Use libSVM data source for user guides under ml-*** which were omitted by #8697.
* Fix bug: We should use ```sqlContext.read().format("libsvm").load(path)``` at Java side, but the API doc and user guides misuse as ```sqlContext.read.format("libsvm").load(path)```.
* Code cleanup.

mengxr

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #9690 from yanboliang/spark-11723.

(cherry picked from commit 99693fe)
Signed-off-by: Xiangrui Meng <meng@databricks.com>

mengxr commented Nov 13, 2015

LGTM. Merged into master and branch-1.6. Thanks!

asfgit closed this in 99693fe Nov 13, 2015
dskrvk pushed a commit to dskrvk/spark that referenced this pull request Nov 13, 2015
yanboliang deleted the spark-11723 branch November 16, 2015 02:17