[SPARK-9888][MLlib]User guide for new LDA features #8254

feynmanliang · 2015-08-17T22:22:05Z

* Adds two new sections to LDA's user guide, one for each optimizer/model
* Documents new features added to LDA (e.g. topXXXperXXX, asymmetric priors, hyperparameter optimization)
* Cleans up a TODO and sets a default parameter in LDA code

@jkbradley @hhbyyh

SparkQA · 2015-08-17T23:09:00Z

Test build #41059 has finished for PR 8254 at commit b8b9f9a.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

feynmanliang · 2015-08-21T20:56:31Z

@jkbradley do you mind reviewing?

jkbradley · 2015-08-25T06:58:15Z

docs/mllib-clustering.md

-
-*Note*: LDA is a new feature with some missing functionality.  In particular, it does not yet
-support prediction on new documents, and it does not have a Python API.  These will be added in the future.
+* `LDAOptimizer`: Optimizer to use for learning the LDA model, either


Actually just called "optimizer" in public API

jkbradley · 2015-08-25T18:04:18Z

docs/mllib-clustering.md

-
-* Topics: Inferred topics, each of which is a probability distribution over terms (words).
-* Topic distributions for documents: For each non empty document in the training set, LDA gives a probability distribution over topics. (EM only). Note that for empty documents, we don't create the topic distributions. (EM only)
+* Topics correspond to cluster centers, and documents correspond to


FYI, for the future, try not to change formatting in Markdown unnecessarily since it makes reviewing harder. There aren't style guidelines for Markdown.

SparkQA · 2015-08-25T18:39:51Z

Test build #41541 has finished for PR 8254 at commit 7401012.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

See [discussion](#8254 (comment)) CC jkbradley Author: Feynman Liang <fliang@databricks.com> Closes #8422 from feynmanliang/SPARK-10230. (cherry picked from commit 881208a) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>

jkbradley · 2015-08-25T23:34:24Z

LGTM pending tests

SparkQA · 2015-08-25T23:41:56Z

Test build #41569 has finished for PR 8254 at commit c8a1013.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

jkbradley · 2015-08-26T00:38:51Z

Merging with master and branch-1.5

* Adds two new sections to LDA's user guide, one for each optimizer/model
* Documents new features added to LDA (e.g. topXXXperXXX, asymmetric priors, hyperparameter optimization)
* Cleans up a TODO and sets a default parameter in LDA code

jkbradley hhbyyh

Author: Feynman Liang <fliang@databricks.com>

Closes #8254 from feynmanliang/SPARK-9888.

(cherry picked from commit 125205c)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>

Adds new LDA features to user guide

b8b9f9a

jkbradley mentioned this pull request Aug 25, 2015

[Minor] [Doc] Fix LDA user guide issue #8410

Closed

jkbradley reviewed Aug 25, 2015

Code review comments

7401012

feynmanliang mentioned this pull request Aug 25, 2015

[SPARK-10230][MLlib]Rename optimizeAlpha to optimizeDocConcentration #8422

Closed

jkbradley reviewed Aug 25, 2015

asfgit pushed a commit that referenced this pull request Aug 25, 2015

[SPARK-10230] [MLLIB] Rename optimizeAlpha to optimizeDocConcentration

881208a

See [discussion](#8254 (comment)) CC jkbradley Author: Feynman Liang <fliang@databricks.com> Closes #8422 from feynmanliang/SPARK-10230.

Code review changes

c8a1013

asfgit closed this in 125205c Aug 26, 2015

feynmanliang deleted the SPARK-9888 branch August 26, 2015 02:36
