
Bayer-Group/chinese-restaurant-process

chinese-restaurant-process

A highly optimized Scala implementation of nonparametric Bayesian clustering based on the Chinese Restaurant Process.

Include in your project

Add the following to your SBT dependencies:

"com.monsanto.stats" %% "chinese-restaurant-process" % "1.0.2"

For older versions of SBT you may also need to add:

resolvers += Resolver.bintrayRepo("monsanto", "maven")

Basic Usage

import com.monsanto.stats.tables._
import com.monsanto.stats.tables.clustering._

val cannedAllTopicVectorResults: Vector[TopicVectorInput] = MnMGen.cannedData
val cannedCrp = new CRP(ModelParams(5, 2, 2), cannedAllTopicVectorResults)
val crpResult = cannedCrp.findClusters(200, RealRandomNumGen, cannedCrp.selectCluster)

Sample output from a run:

Iteration 1: cluster count was 365, reseat: 35, score: -29578.83920*
Iteration 2: cluster count was 118, reseat: 15, score: -29111.34349*
Iteration 3: cluster count was 61, reseat: 7, score: -28919.62995*
Iteration 4: cluster count was 40, reseat: 6, score: -28852.91482*
Iteration 5: cluster count was 29, reseat: 6, score: -28804.38123*
Iteration 6: cluster count was 24, reseat: 5, score: -28741.68993*
Iteration 7: cluster count was 16, reseat: 5, score: -28734.04974*
Iteration 8: cluster count was 14, reseat: 6, score: -28742.16624
Iteration 9: cluster count was 12, reseat: 5, score: -28739.19560
Iteration 10: cluster count was 10, reseat: 5, score: -28738.64498
...
Iteration 190: cluster count was 4, reseat: 10, score: -28724.77273
Iteration 191: cluster count was 3, reseat: 11, score: -28724.77273
Iteration 192: cluster count was 3, reseat: 10, score: -28724.77273
Iteration 193: cluster count was 3, reseat: 10, score: -28724.77273
Iteration 194: cluster count was 3, reseat: 10, score: -28724.77273
Iteration 195: cluster count was 3, reseat: 10, score: -28724.77273
Iteration 196: cluster count was 3, reseat: 10, score: -28724.77273
Iteration 197: cluster count was 3, reseat: 11, score: -28724.77273
Iteration 198: cluster count was 3, reseat: 10, score: -28724.77273
Iteration 199: cluster count was 3, reseat: 13, score: -28724.77273
Iteration 200: cluster count was 3, reseat: 12, score: -28724.77273
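
For readers unfamiliar with the Chinese Restaurant Process that gives the library its name, the seating rule of the CRP prior can be sketched as below. This is a minimal, self-contained illustration and not the library's implementation: it simulates only the prior (no data likelihood, no reseating iterations), and the names `CrpSketch`, `seatCustomers`, and `alpha` are hypothetical and do not come from the `com.monsanto.stats.tables` API.

```scala
import scala.util.Random

// Illustrative sketch only: simulates the Chinese Restaurant Process prior.
// `CrpSketch`, `seatCustomers`, and `alpha` are hypothetical names, not library API.
object CrpSketch {

  /** Seats `n` customers one at a time and returns the resulting table sizes.
    *
    * With `seated` customers already placed, the next customer joins existing
    * table k with probability size(k) / (seated + alpha) and opens a new table
    * with probability alpha / (seated + alpha).
    */
  def seatCustomers(n: Int, alpha: Double, rng: Random): Vector[Int] = {
    require(n >= 0 && alpha > 0.0, "n must be non-negative and alpha positive")
    (0 until n).foldLeft(Vector.empty[Int]) { (tables, seated) =>
      // Draw a point in [0, seated + alpha); the first `seated` units of
      // probability mass belong to the existing tables, the rest to a new one.
      val draw = rng.nextDouble() * (seated + alpha)
      // Running totals of table sizes, used to locate the table the draw hits.
      val cumulative = tables.scanLeft(0)(_ + _).tail
      val k = cumulative.indexWhere(draw < _)
      if (k >= 0) tables.updated(k, tables(k) + 1) // join existing table k
      else tables :+ 1                              // open a new table
    }
  }
}
```

Calling `CrpSketch.seatCustomers(1000, 2.0, new Random(42))` returns a vector of table sizes that sums to 1000. Larger values of `alpha` tend to produce more tables, which is why the number of clusters does not need to be fixed in advance.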
