Commits on Jul 30, 2013
  1. @cburroughs
  2. @cburroughs
  3. @abramsm

    Merge branch 'serialization_improvements'

    abramsm authored
    Conflicts:
    	src/main/java/com/clearspring/analytics/stream/cardinality/HyperLogLogPlus.java
  4. @abramsm

    Merge branch 'serialization_improvements'

    abramsm authored
    Conflicts:
    	src/main/java/com/clearspring/analytics/stream/cardinality/HyperLogLogPlus.java
Commits on Jul 23, 2013
  1. @abramsm
  2. @abramsm

    fixes #44

    abramsm authored
Commits on Jul 19, 2013
  1. @cykl

    Faster addAll when other is in sparse mode.

    cykl authored
    addAll is much faster in sparse mode since only existing indexes have to be
    updated. We can be smarter than creating a copy of other and converting it
    to normal mode when other is in sparse mode. We can traverse the other sparseSet
    and only update the relevant indexes.
    
    This patch speeds up the scenario where many small-cardinality HLL++ instances are
    aggregated into more coarse-grained buckets.
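The idea in the commit above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the HyperLogLogPlus code: the class name, the `int[]` register array, and the `Map` used as the sparse representation are all hypothetical stand-ins. It shows why walking the sparse side touches far fewer registers than expanding it to normal mode first.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model: dense registers are an int[], the sparse side is an
// index -> value map. Neither type is taken from stream-lib.
final class SparseAddAllSketch {

    // Update only the registers that the sparse side actually holds,
    // keeping the larger value per register. The sparse side is not modified.
    static void addAllSparse(int[] registers, Map<Integer, Integer> sparseOther) {
        for (Map.Entry<Integer, Integer> entry : sparseOther.entrySet()) {
            int index = entry.getKey();
            registers[index] = Math.max(registers[index], entry.getValue());
        }
    }

    public static void main(String[] args) {
        int[] registers = new int[1024];
        registers[5] = 4;

        Map<Integer, Integer> sparseOther = new TreeMap<>();
        sparseOther.put(5, 2);
        sparseOther.put(9, 3);

        // Two register updates instead of a pass over all 1024 registers.
        addAllSparse(registers, sparseOther);
        System.out.println(registers[5] + " " + registers[9]);
    }
}
```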
Commits on Jul 18, 2013
  1. @abramsm

    Merge pull request #43 from mspiegel/master

    abramsm authored
    Adding support for insertions and lookups for Strings in CountMinSketch
  2. Clarify that CountMinSketch and associated unit test are licensed

    Michael Spiegel authored
    under the Apache License, Version 2.0.
Commits on Jul 17, 2013
  1. @abramsm

    Merge pull request #41 from cykl/travis-ci

    abramsm authored
    Enable Travis CI
  2. @cykl

    Ensure that all the cardinality estimators follow the same merge sema…

    cykl authored
    …ntics
    
    - A new estimator is always created
    - Neither this nor the estimators are ever modified
  3. @cykl

    Add addAll method to HLL & HLL++

    cykl authored
    - addAll is similar to Set.addAll. It performs a mutable union. The current
      HLL(++) is modified: the elements of the other HLL(++) are added to the
      current HLL(++). Other is never modified.
      We cannot reuse the merge name since it would break compatibility.
    
    - Updated merge to have a consistent behavior. A new HLL(++) is always created.
      Neither this nor the estimators are ever modified. The issues were that:
    
        - this was returned when estimators was null or empty. This is an unsafe
          behavior. It could easily lead to corruption since the user has no way
          to know whether it is safe to modify the returned instance or not.
    
        - The HLL++ implementation was modifying this. This behavior is not
          consistent with HLL and is unsafe.
    
        - The HLL++ implementation was converting estimators to normal mode
          if this is in normal mode. This can be avoided, and is costly both
          in time and memory.
  4. Adding support for insertions and lookups for String keys in CountMin…

    Michael Spiegel authored
    …Sketch class.
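The addAll and merge contract described in the Jul 17 commits above can be sketched on a toy type. The class below is hypothetical and is not the HyperLogLog or HyperLogLogPlus API. It models only the mutability rules from the commit message: addAll changes the receiver and leaves the argument alone, while merge always returns a fresh instance, including when it is given no estimators or null.

```java
// Hypothetical toy type: only the mutability contract is modelled, not
// cardinality estimation. Names are illustrative, not stream-lib's.
final class ToyRegisters {
    private final int[] registers;

    ToyRegisters(int size) {
        registers = new int[size];
    }

    void set(int index, int value) {
        registers[index] = Math.max(registers[index], value);
    }

    int get(int index) {
        return registers[index];
    }

    // Mutable union: this is modified, other is never modified.
    void addAll(ToyRegisters other) {
        if (other.registers.length != registers.length) {
            throw new IllegalArgumentException("size mismatch");
        }
        for (int i = 0; i < registers.length; i++) {
            registers[i] = Math.max(registers[i], other.registers[i]);
        }
    }

    // Immutable union: a new instance is always returned, even for null or
    // empty input, so the caller can safely modify the result.
    ToyRegisters merge(ToyRegisters... estimators) {
        ToyRegisters merged = new ToyRegisters(registers.length);
        merged.addAll(this);
        if (estimators != null) {
            for (ToyRegisters estimator : estimators) {
                merged.addAll(estimator);
            }
        }
        return merged;
    }
}
```

Returning a copy in the empty case costs one allocation, but it removes the ambiguity the commit message calls out: the caller never has to guess whether the returned object aliases the receiver.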
Commits on Jul 16, 2013
  1. @cykl

    Update mvn profile to enable gpg signing on demand.

    cykl authored
    Use "mvn -Pgpg" to enable gpg for a specific build. By default gpg is disabled.
  2. @abramsm
Commits on Jul 15, 2013
  1. @abramsm

    - ensure tmpIndex is reset when the tmpSet is merged with the sparseSet

    abramsm authored
    - only sort values in the encodedSet up to a specified index to prevent uninitialized values from polluting the result
    - fix failing tests and add a new test of a single element
  2. @cykl

    Travis CI does not provide oraclejdk6

    cykl authored
    "Travis CI provides OpenJDK 6, OpenJDK 7 and Oracle JDK 7. Sun JDK 6 is not provided and because it is EOL as of November 2012."
  3. @cykl

    Add a profile to enable gpg signing only for releases.

    cykl authored
    Update travis configuration file, -Dgpg.skip=true no longer required.
  4. @cykl

    Add Travis CI configuration file

    cykl authored
Commits on Jul 14, 2013
  1. @abramsm

    convert sparseSet from a List of byte arrays to an int array. this im…

    abramsm authored
    …proves memory efficiency. This commit does introduce a few more member variables that track HyperLogLogPlus' state. These member variables, such as the sparseSet index and tmpSet index, will need to be synchronized in some way to make the class thread-safe.
Commits on Jul 12, 2013
  1. @abramsm

    Merge pull request #40 from cykl/fix_java6

    abramsm authored
    Fix java 6 regression introduced by 0fcb805
  2. @cykl
Commits on Jul 10, 2013
  1. @abramsm

    add version to the codec and automatically degrade to legacy decoding…

    abramsm authored
    … when the version number is not present in the first byte of the stream. All encodes will use the new encoding scheme, so the result should be a one-time transformation from the legacy to the new encoding format.
  2. @abramsm
  3. @abramsm

    - get more accurate sparseSet size when comparing to sort threshold t…

    abramsm authored
    …o prevent sparseSet from growing too large
    
    - improve space efficiency for encoding/decoding Sparse and Normal representations of HLL.  Significant space savings but this is a breaking change and not compatible with previous encoding schemes
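The codec versioning commit in the Jul 10 list above describes falling back to legacy decoding when no version is found in the first byte. A minimal sketch of that pattern follows. Everything in it is an assumption made for illustration: the marker value, the one-int payload, and the premise that a legacy stream can never begin with the marker byte. The real HyperLogLogPlus wire format is not shown here.

```java
import java.nio.ByteBuffer;

// Hypothetical codec: payload is a single non-negative int.
final class VersionedCodecSketch {
    // Assumption: legacy streams never start with this byte. That holds here
    // because a non-negative big-endian int has a first byte in 0..127.
    static final byte VERSION_MARKER = (byte) -2;

    // New format: marker byte followed by the payload.
    static byte[] encode(int value) {
        return ByteBuffer.allocate(5).put(VERSION_MARKER).putInt(value).array();
    }

    // Legacy format (hypothetical): payload only, no marker.
    static byte[] encodeLegacy(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("legacy payload must be non-negative");
        }
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Decode either format: skip the marker when present, otherwise treat
    // the whole buffer as a legacy payload.
    static int decode(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        if (data[0] == VERSION_MARKER) {
            buffer.get(); // consume the marker
        }
        return buffer.getInt();
    }
}
```

Because encode always writes the new format, decoding a legacy value and encoding it again upgrades it once, which matches the one-time transformation the commit message describes.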
Commits on Jun 20, 2013
  1. @abramsm

    add missing imports

    abramsm authored
  2. @abramsm
  3. @abramsm

    Merge branch 'master' of https://github.com/eric-vlaanderen/stream-lib

    abramsm authored
    …into eric-vlaanderen-master
Commits on Jun 14, 2013
  1. Improvements to prevent corruption of the minimum value,

    Eric authored
    Combined redundant classes (ScoredItem and ErrorAndCount),
    Prevent ever-increasing variable "size".
Commits on Jun 13, 2013
  1. Fix.

    Eric authored
  2. Implement ConcurrentStreamSummary

    Eric authored
Commits on Jun 5, 2013
  1. @abramsm

    Merge pull request #36 from cykl/faster_hll_card

    abramsm authored
    Faster hll card
Commits on Jun 3, 2013
  1. @cykl

    Faster HLL cardinality for small sets

    cykl authored
    Avoid traversing the register set twice for small sets.
    This does not slow down large sets.
  2. @cykl

    Faster HLL cardinality

    cykl authored
    Math.pow(2, (-1 * X)) is the same as 1.0 / (1 << X) but much slower.
    
    The new implementation is from 2 to 50 times faster depending on the set cardinality.
  3. @abramsm

    Merge pull request #35 from cykl/faster_hll

    abramsm authored
    Faster hll
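The equivalence claimed in the "Faster HLL cardinality" commit above can be checked directly. The class and method names below are hypothetical. The sketch assumes X stays between 0 and 30, because an int shift by 31 gives a negative number and larger shift counts wrap around. It demonstrates only that the two forms agree, and makes no attempt to measure the speed difference.

```java
// Hypothetical helper comparing the two ways of computing 2^(-x).
final class InversePowerOfTwo {

    // Straightforward form using Math.pow.
    static double viaPow(int x) {
        return Math.pow(2, -1 * x);
    }

    // Shift form. Only valid for 0 <= x <= 30 with an int shift.
    static double viaShift(int x) {
        return 1.0 / (1 << x);
    }

    public static void main(String[] args) {
        for (int x = 0; x <= 30; x++) {
            if (viaPow(x) != viaShift(x)) {
                throw new AssertionError("mismatch at x = " + x);
            }
        }
        System.out.println("both forms agree for x in [0, 30]");
    }
}
```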