updated readme
jashmenn committed May 27, 2010
1 parent 0f31883 commit 88c1fb1
Showing 2 changed files with 18 additions and 15 deletions.
30 changes: 18 additions & 12 deletions README.md
@@ -1,11 +1,26 @@
## communitize
## SSimilarity

### What this does
### What this is

Runs a distributed computation of the cosine distance of the item-vecotrs of a user-item matrix
Runs a distributed computation of the cosine distance of the item-vectors of a
user-item matrix. This value can be used as the basis for an item-based
recommendation system.

Benefits:

* It's fully distributed
* It handles string Ids (unlike Mahout)
* It's short: only about 300 lines of scala code.

### Prereq

* Hadoop 0.20.x
* Apache Buildr

### How

bu clean package
rake clean default

### Input

Expand Down Expand Up @@ -47,15 +62,6 @@ Phase 3: Compute the pairwise cosine similarity for all item pairs that have bee
where [1] is the sum of the product of the co-rated pref values (dot product) and
[2] is the product of the vector lengths
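
The ratio described above can be sketched in a few lines of plain Scala, without Hadoop. This is an illustrative sketch only, not the code in SSimilarity.scala: the `CosineSketch` object, the `Prefs` type alias, and the `cosine` helper are hypothetical names, and it assumes that each vector length in [2] is taken over the item's full preference vector rather than only the co-rated entries.

```scala
// Hypothetical, self-contained sketch of the Phase 3 ratio [1] / [2].
// None of these names come from SSimilarity.scala.
object CosineSketch {
  // One item vector: userId -> preference value.
  type Prefs = Map[String, Double]

  // Euclidean length of an item vector (assumption: computed over all of its entries).
  private def length(v: Prefs): Double =
    math.sqrt(v.values.iterator.map(x => x * x).sum)

  // Returns None when too few users rated both items, or when a vector has zero length.
  def cosine(a: Prefs, b: Prefs, minimumCoratedCount: Int = 1): Option[Double] = {
    val corated = (a.keySet intersect b.keySet).toSeq
    if (corated.size < minimumCoratedCount) None
    else {
      val numerator = corated.map(u => a(u) * b(u)).sum // [1] dot product of co-rated prefs
      val denominator = length(a) * length(b)           // [2] product of the vector lengths
      if (denominator == 0.0) None else Some(numerator / denominator)
    }
  }
}
```

For example, `CosineSketch.cosine(Map("u1" -> 1.0, "u2" -> 2.0), Map("u1" -> 2.0, "u2" -> 4.0))` yields a value of approximately 1.0, since the two vectors point in the same direction, while two items with no users in common yield `None`.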




### Output

Item-to-item similarity

xxx

### Credit

* Outline originally based on ItemSimilarityJob in the Apache Mahout project.
3 changes: 0 additions & 3 deletions src/main/scala/SSimilarity.scala
@@ -203,7 +203,6 @@ object SSimilarity extends HadoopInterop {
numerator += value
actualCoratedCount += 1
}
println("min: %d act: %d", minimumCoratedCount, actualCoratedCount)

if(actualCoratedCount >= minimumCoratedCount) {
var (idA, idB, length) = pair.parsedItemPairLength
Expand Down Expand Up @@ -245,8 +244,6 @@ object SSimilarity extends HadoopInterop {
def truncate(d : Double, places : int) : Double = {
return new java.math.BigDecimal(d, new java.math.MathContext(places)).doubleValue
}



}
