Update to version 0.1.4
johnynek committed Dec 13, 2012
1 parent 0805d6f commit f742879
Showing 3 changed files with 57 additions and 2 deletions.
14 changes: 14 additions & 0 deletions CHANGES.md
@@ -1,5 +1,19 @@
# Algebird #


### Version 0.1.4 ###
* Count-min-sketch (with Monoid)
* Added Bloom Filter (with Monoid)
* HyperLogLog now uses Murmur128 (should be faster)
* Max/Min/First/Last Monoids
* VectorSpace trait (implementations for Maps/Vector)
* DecayedVector for efficient exponential moving average on vectors
* Metric trait
* Approximate[Numeric]/Boolean to track error in approximations
* Adds Semigroup and implicits for usual primitives and collections
* Fixes EitherMonoid to have a zero
* Add MinPlus algebra for shortest path calculations
* Lots of code cleanups

### Version 0.1.2 ###
* Improves speed of HyperLogLog.
* Refactoring of RightFolded Monoid
43 changes: 42 additions & 1 deletion README.md
@@ -2,10 +2,43 @@


Abstract algebra for Scala. This code is targeted at building aggregation systems (via [Scalding](https://github.com/twitter/scalding) or [Storm](https://github.com/nathanmarz/storm)). It was originally developed as part of Scalding's Matrix API, where Matrices had values which are elements of Monoids, Groups, or Rings. Subsequently, it was clear that the code had broader application within Scalding and on other projects within Twitter.


## What can you do with this code?

```scala
Welcome to Scala version 2.9.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_07).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import com.twitter.algebird._
import com.twitter.algebird._

scala> import com.twitter.algebird.Operators._
import com.twitter.algebird.Operators._

scala> Map(1 -> Max(2)) + Map(1 -> Max(3)) + Map(2 -> Max(4))
res1: scala.collection.immutable.Map[Int,com.twitter.algebird.Max[Int]] = Map(2 -> Max(4), 1 -> Max(3))
```
In the above, the class Max[T] signifies that the + operator should actually be max (this is
accomplished by providing an implicit instance of a typeclass for Max that handles +).
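
The typeclass mechanism described above can be sketched in a few self-contained lines. This is only an illustration, not Algebird's source: the `Semigroup` trait, the `Max` wrapper, the `PlusOp` enrichment, and the `|+|` operator are hypothetical stand-ins (a distinct operator is used here so it cannot be confused with `Map`'s own `+`), and the sketch assumes Scala 2.10 or later for implicit classes.

```scala
// Hypothetical stand-ins for illustration; not Algebird's actual definitions.
object MaxSketch {
  // A semigroup knows how to combine two values of type T.
  trait Semigroup[T] { def plus(a: T, b: T): T }

  // Wrapper type whose "sum" is the larger of the two wrapped values.
  case class Max[T](get: T)

  // Implicit instance: combining two Max[T] keeps the larger one.
  implicit def maxSemigroup[T](implicit ord: Ordering[T]): Semigroup[Max[T]] =
    new Semigroup[Max[T]] {
      def plus(a: Max[T], b: Max[T]): Max[T] =
        if (ord.gteq(a.get, b.get)) a else b
    }

  // Maps combine key by key, using the value semigroup when keys collide.
  implicit def mapSemigroup[K, V](implicit sg: Semigroup[V]): Semigroup[Map[K, V]] =
    new Semigroup[Map[K, V]] {
      def plus(a: Map[K, V], b: Map[K, V]): Map[K, V] =
        b.foldLeft(a) { case (acc, (k, v)) =>
          acc.updated(k, acc.get(k).map(sg.plus(_, v)).getOrElse(v))
        }
    }

  // Adds a |+| operator to any type that has a Semigroup in implicit scope.
  implicit class PlusOp[T](left: T) {
    def |+|(right: T)(implicit sg: Semigroup[T]): T = sg.plus(left, right)
  }

  // Same shape as the REPL example: key 1 keeps Max(3), key 2 keeps Max(4).
  val combined: Map[Int, Max[Int]] =
    Map(1 -> Max(2)) |+| Map(1 -> Max(3)) |+| Map(2 -> Max(4))
}
```

The point of the design is that the choice of operation lives in the type: wrapping a value in `Max` selects the instance, and the map instance simply delegates to whatever instance its values have.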

* Model a wide class of "reductions" as a sum over an iterator of a particular value type.
Examples include average, moving average, max/min, set
union, approximate set size (in much less memory, with HyperLogLog), and approximate item counting
(using CountMinSketch).
* All of these combine naturally in tuples, vectors, maps, options, and other standard Scala classes.
* Implementations of Monoids for interesting approximation algorithms, such as Bloom filter,
HyperLogLog, and CountMinSketch. These let you treat such sophisticated operations the way
you would numbers, adding them up in Hadoop or online to produce powerful statistics and
analytics.
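
As a rough sketch of the "reduction as a sum" idea in the list above, the following self-contained example computes a set of distinct users and an average latency in a single pass by summing tuples. Everything here is an assumption made for illustration (the `Monoid` trait, the `Averaged` class, the `sum` helper, and the sample data are hypothetical and are not taken from Algebird's API).

```scala
// Hypothetical stand-ins for illustration; not Algebird's actual definitions.
object AverageSketch {
  // A monoid is an associative plus together with an identity element (zero).
  trait Monoid[T] {
    def zero: T
    def plus(a: T, b: T): T
  }

  // An average is carried as (count, total) so that partial results can be added.
  case class Averaged(count: Long, total: Double) {
    def value: Double = if (count == 0L) 0.0 else total / count
  }

  implicit val averagedMonoid: Monoid[Averaged] = new Monoid[Averaged] {
    val zero: Averaged = Averaged(0L, 0.0)
    def plus(a: Averaged, b: Averaged): Averaged =
      Averaged(a.count + b.count, a.total + b.total)
  }

  // Set union is a monoid whose zero is the empty set.
  implicit def setMonoid[A]: Monoid[Set[A]] = new Monoid[Set[A]] {
    val zero: Set[A] = Set.empty[A]
    def plus(a: Set[A], b: Set[A]): Set[A] = a ++ b
  }

  // A pair of monoids is a monoid: combine each component independently.
  implicit def pairMonoid[A, B](implicit ma: Monoid[A], mb: Monoid[B]): Monoid[(A, B)] =
    new Monoid[(A, B)] {
      val zero: (A, B) = (ma.zero, mb.zero)
      def plus(x: (A, B), y: (A, B)): (A, B) =
        (ma.plus(x._1, y._1), mb.plus(x._2, y._2))
    }

  // The whole reduction is just a sum over an iterator.
  def sum[T](items: Iterator[T])(implicit m: Monoid[T]): T =
    items.foldLeft(m.zero)(m.plus)

  // Made-up sample data: (user, latency in milliseconds).
  val events = List(("alice", 10.0), ("bob", 20.0), ("alice", 30.0))

  // One pass yields both the distinct users and the average latency.
  val (users, latency) =
    sum(events.iterator.map { case (user, ms) => (Set(user), Averaged(1L, ms)) })
}
```

Because `plus` is associative, partial sums computed on separate machines or in separate batches can be added together afterwards, which is what makes this shape a good fit for Hadoop-style or online aggregation.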

## Maven
-Current version is 0.1.2. groupid="com.twitter" artifact="algebird_2.9.2".
+Current version is 0.1.4. groupid="com.twitter" artifact="algebird_2.9.2".


## Questions
> Why not use spire?
We didn't know about it when we started this code, but it seems like we're more focused on
large scale analytics.

> Why not use Scalaz's [Monoid](https://github.com/scalaz/scalaz/blob/master/core/src/main/scala/scalaz/Monoid.scala) trait?
The answer is a mix of the following:
@@ -18,7 +51,15 @@ The answer is a mix of the following:
## Authors


* Oscar Boykin <http://twitter.com/posco>
* Avi Bryant <http://twitter.com/avibryant>
* Edwin Chen <http://twitter.com/echen>
* ellchow <http://github.com/ellchow>
* Mike Gagnon <https://twitter.com/MichaelNGagnon>
* Moses Nakamura <https://twitter.com/mnnakamura>
* Steven Nobel <http://twitter.com/snoble>
* Sam Ritchie <http://twitter.com/sritchie>
* Ashutosh Singhal <http://twitter.com/daashu>
* Argyris Zymnis <http://twitter.com/argyris>


## License
Copyright 2012 Twitter, Inc.
2 changes: 1 addition & 1 deletion build.sbt
@@ -1,6 +1,6 @@
name := "algebird"


-version := "0.1.4-SNAPSHOT"
+version := "0.1.4"


organization := "com.twitter"

