[SPARK-8536][MLlib]Generalize OnlineLDAOptimizer to asymmetric document-topic Dirichlet priors #7575

feynmanliang
Contributor

Modify LDA to take asymmetric document-topic prior distributions and OnlineLDAOptimizer to use the asymmetric prior during variational inference.

This PR only generalizes OnlineLDAOptimizer and the associated LocalLDAModel; EMLDAOptimizer and DistributedLDAModel still only support symmetric alpha (checked during EMLDAOptimizer.initialize).
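To make the difference between a symmetric and an asymmetric document-topic prior concrete, here is a minimal sketch of the textbook variational update for one document's topic parameters, gamma(t) = alpha(t) + sum over words w of counts(w) * phi(w)(t), with the responsibilities phi held fixed. It is plain Scala and is not the code from this PR; `AsymmetricPriorSketch`, `symmetricPrior`, and `updateGamma` are hypothetical names chosen for illustration.

```scala
// Illustrative sketch only: NOT Spark's OnlineLDAOptimizer code.
// All names here are hypothetical and chosen for this example.
object AsymmetricPriorSketch {

  /** Expand a symmetric scalar prior into a k-dimensional vector of equal entries. */
  def symmetricPrior(alpha: Double, k: Int): Array[Double] = Array.fill(k)(alpha)

  /**
   * One gamma update for a single document, holding phi fixed:
   *   gamma(t) = alpha(t) + sum_w counts(w) * phi(w)(t)
   *
   * @param alpha  per-topic prior (length k); entries may differ per topic
   * @param counts count of each distinct word in the document
   * @param phi    phi(w)(t) = responsibility of topic t for word w
   */
  def updateGamma(
      alpha: Array[Double],
      counts: Array[Double],
      phi: Array[Array[Double]]): Array[Double] = {
    require(phi.length == counts.length, "phi needs one row per distinct word")
    require(phi.forall(_.length == alpha.length), "each phi row needs one entry per topic")
    alpha.indices.map { t =>
      // Prior mass for topic t plus the expected number of words assigned to it.
      alpha(t) + counts.indices.map(w => counts(w) * phi(w)(t)).sum
    }.toArray
  }
}
```

With a symmetric prior every topic receives the same pseudo-count, so `updateGamma(symmetricPrior(0.5, 2), counts, phi)` shifts all topics equally; passing a vector such as `Array(0.1, 2.0)` instead biases the document toward the second topic before any words are seen. Treating the prior as a vector throughout, and expanding a scalar into a constant vector, is one simple way to support both cases with a single code path.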

@SparkQA

SparkQA commented Jul 21, 2015

Test build #37978 has finished for PR 7575 at commit 58f1d7b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@feynmanliang feynmanliang changed the title from "[SPARK-8536][MLlib]Generalize OnlineLDAOptimizer to asymmetric document-topic Dirichlet priors" to "[SPARK-8536][MLlib]Generalize OnlineLDAOptimizer to asymmetric document-topic Dirichlet priors" Jul 22, 2015
@jkbradley
Member

Looks good except for the merge conflicts

Feynman Liang added 2 commits July 22, 2015 14:13
…metric-priors

* apache/master:
  [SPARK-9224] [MLLIB] OnlineLDA Performance Improvements
  [SPARK-9024] Unsafe HashJoin/HashOuterJoin/HashSemiJoin
  [SPARK-9165] [SQL] codegen for CreateArray, CreateStruct and CreateNamedStruct
  [SPARK-9082] [SQL] Filter using non-deterministic expressions should not be pushed down
  [SPARK-9254] [BUILD] [HOTFIX] sbt-launch-lib.bash should support HTTP/HTTPS redirection
  [SPARK-4233] [SPARK-4367] [SPARK-3947] [SPARK-3056] [SQL] Aggregation Improvement
  [SPARK-9232] [SQL] Duplicate code in JSONRelation
  [SPARK-9121] [SPARKR] Get rid of the warnings about `no visible global function definition` in SparkR
  [SPARK-9154][SQL] Rename formatString to format_string.
  [SPARK-9154] [SQL] codegen StringFormat
  [SPARK-9206] [SQL] Fix HiveContext classloading for GCS connector.
  [SPARK-8906][SQL] Move all internal data source classes into execution.datasources.
  [SPARK-8357] Fix unsafe memory leak on empty inputs in GeneratedAggregate
  Revert "[SPARK-9154] [SQL] codegen StringFormat"
@SparkQA

SparkQA commented Jul 22, 2015

Test build #38111 has finished for PR 7575 at commit af8fbb7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CreateArray(children: Seq[Expression]) extends Expression
    • case class CreateStruct(children: Seq[Expression]) extends Expression
    • case class CreateNamedStruct(children: Seq[Expression]) extends Expression

@jkbradley
Member

LGTM, merging with master
Thanks!

@asfgit asfgit closed this in 1aca9c1 Jul 22, 2015
@feynmanliang feynmanliang deleted the SPARK-8536-LDA-asymmetric-priors branch July 22, 2015 23:07