Releases: simple-statistics/simple-statistics

Combinatorics and improved Ckmeans

20 Oct 16:34
  • Improves the Ckmeans algorithm, following the updated R project,
    which dramatically increases performance.
  • Adds permutationHeap for computing all permutations of an array.
  • Adds combinations for combinations without replacement.
  • Adds combinationsReplacement for combinations with replacement.
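
To make the difference between the two combination methods concrete, here is a minimal standalone sketch. It reuses the names from the notes above but is an illustrative re-implementation, not the library's source, and the ordering of the library's output may differ.

```javascript
// Standalone sketch (assumption: not the simple-statistics implementation).

// Combinations without replacement: every k-element selection in which
// each input position is used at most once.
function combinations(items, k) {
  if (k === 0) return [[]];
  const result = [];
  for (let i = 0; i <= items.length - k; i++) {
    // Fix items[i] as the first element, then choose k - 1 from what follows.
    for (const tail of combinations(items.slice(i + 1), k - 1)) {
      result.push([items[i], ...tail]);
    }
  }
  return result;
}

// Combinations with replacement: an element may be chosen again,
// so the recursion restarts at index i instead of i + 1.
function combinationsReplacement(items, k) {
  if (k === 0) return [[]];
  const result = [];
  for (let i = 0; i < items.length; i++) {
    for (const tail of combinationsReplacement(items.slice(i), k - 1)) {
      result.push([items[i], ...tail]);
    }
  }
  return result;
}

console.log(combinations([1, 2, 3], 2));
// [ [ 1, 2 ], [ 1, 3 ], [ 2, 3 ] ]
console.log(combinationsReplacement([1, 2], 2));
// [ [ 1, 1 ], [ 1, 2 ], [ 2, 2 ] ]
```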

v2.0.0

02 Jun 14:56

New features:

  • product: returns the product of a series of numbers.
  • medianSorted: exposes the internal method of median
    that only operates on sorted arrays and works in constant time.
  • modeSorted: exposes the internal method of mode
    that only operates on sorted arrays and works in linear time.
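
The complexity claims above can be illustrated with a short sketch. These are standalone functions written for this example, not the library's code. The NaN return for empty input and the tie-breaking rule in modeSorted are assumptions made here.

```javascript
// Standalone sketches (assumption: not the simple-statistics source).

// Product of a series of numbers.
function product(values) {
  let result = 1;
  for (const v of values) result *= v;
  return result;
}

// Median of an already sorted array: only the middle one or two
// elements are read, so the work does not grow with the array length.
function medianSorted(sorted) {
  const n = sorted.length;
  if (n === 0) return NaN; // assumption: empty input yields NaN
  const mid = Math.floor(n / 2);
  return n % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Mode of an already sorted array: equal values are adjacent, so a single
// pass that tracks the longest run is enough (linear time).
// Assumption: on ties, the first value to reach the longest run wins.
function modeSorted(sorted) {
  let best = NaN;
  let bestCount = 0;
  let run = 0;
  for (let i = 0; i < sorted.length; i++) {
    run = i > 0 && sorted[i] === sorted[i - 1] ? run + 1 : 1;
    if (run > bestCount) {
      bestCount = run;
      best = sorted[i];
    }
  }
  return best;
}

console.log(product([2, 3, 4]));              // 24
console.log(medianSorted([1, 2, 3, 4]));      // 2.5
console.log(modeSorted([1, 2, 2, 3, 3, 3]));  // 3
```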

Specifications:

  • Adds Flow annotations to all methods, allowing
    up-front typechecking if you use Flow in your application.

Changes:

  • Invalid input now uniformly produces NaN, where previously it
    produced a mix of null and undefined.
  • The method sortedUniqueCount is now called uniqueCountSorted to
    match the other sorted methods, medianSorted and modeSorted.
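
For callers, the NaN convention means one check covers all invalid-input cases. The sketch below uses a hypothetical stand-in `mean` function and a hypothetical `formatMean` helper, both written for this example and not taken from the library, to show the check. NaN is not equal to itself, so `Number.isNaN` is the reliable test.

```javascript
// Hypothetical stand-in for a statistic following the NaN convention
// (assumption: illustrative only, not the simple-statistics source).
function mean(values) {
  if (!Array.isArray(values) || values.length === 0) return NaN;
  let sum = 0;
  for (const v of values) sum += v;
  return sum / values.length;
}

// Hypothetical caller: a single Number.isNaN check replaces separate
// checks for null and undefined.
function formatMean(values) {
  const m = mean(values);
  return Number.isNaN(m) ? "no data" : "mean = " + m;
}

console.log(formatMean([]));        // no data
console.log(formatMean([1, 2, 3])); // mean = 2
```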

Fixes:

  • equalIntervalBreaks was not exported by index.js, and now is.

Minor fixes & bundle size improvements

30 Nov 21:34

Fixes:

Housekeeping:

  • Add keywords to package. Fixes #120
  • Standardize indentation, add example for epsilon
  • Browser testing with Sauce Labs

Bundle size optimizations:

  • Add external sourcemaps for minified and unminified standalone bundles
  • Use bundle-collapser for smaller bundles
  • Indicate numericSort as an internal method.

simple-statistics 1.0.0

10 Aug 04:28

This is the first major release of simple-statistics. It represents the work of 18 contributors, along with many improvements and battle-testing over more than 3 years of development.

There are also several major changes that happened in this release.

In the course of many great additions, the monolithic simple_statistics.js file grew to over 1,500 lines, making it relatively unwieldy to develop. This release adopts the node require() method to split the methods into small focused files. This doesn't mean that node is required: a robust browser distribution is now provided in uncompressed & compressed forms under the /dist directory, and tools like browserify and webpack make it very convenient to use this kind of module unchanged.

The original simple-statistics absorbed one of my transgressions against JavaScript code style: underscore naming instead of camelCase naming. v1.0.0 reverses this decision, adopting the near-universal camelCase naming style for all methods.

v1.0.0 keeps all of the literate documentation that helps make tricky statistical methods understandable, but adds a big new documentation component: JSDoc, interpreted through documentation.js. documentation.js has been a side project of mine for quite a while now and this API-style documentation has proven to be a more scannable & uniformly styled counterpart to literate documentation. This also means that we're showing many more code samples inline with documentation to make it immediately clear what input and output types are used with each method.

On the algorithmic side, an implementation of Ckmeans clustering by Haizhou Wang and Mingzhou Song replaces the implementation of Jenks clustering ported from the original Fortran. A big thanks to Mason Lai for pointing me to this great research, which establishes a more consistent, better explained, and faster solution to the clustering problem.

Full changelog to follow:


1.0.0

Breaking Changes

  • Removed the .m() and .b() shortcuts from the linear regression
    class. Use .mb().b and .mb().m instead.
  • linearRegression is now a function, and linearRegressionLine is a separate
    function.

UPGRADING

Linear Regression

Before:

var l = ss.linear_regression().data([[0, 0], [1, 1]]);
l.line()(0); // 0

After:

var line = ss.linearRegressionLine(ss.linearRegression([[0, 0], [1, 1]]));
line(0); // 0

Jenks -> ckmeans

The implementation of Jenks natural breaks was removed and replaced by
an implementation of Ckmeans, an improvement on its technique. Ckmeans
should work better for nearly all Jenks use cases.

Before:

ss.jenks([1, 2, 4, 5, 7, 9, 10, 20], 3) //= [1, 7, 20, 20]

After:

ss.ckmeans([1, 2, 4, 5, 7, 9, 10, 20], 3)
//= [ [ 1,  2,  4,  5, 7, 9 ],  [ 10 ],  [ 20 ] ]

Instead of class breaks, ckmeans returns clustered data. Class breaks
can be derived by taking the first value from each cluster:

var breaks = ss.ckmeans([1, 2, 4, 5, 7, 9, 10, 20], 3).map(function(cluster) {
  return cluster[0];
});
  • BayesModel is now a class
  • PerceptronModel is now a class, and the weights and bias members
    are accessible as properties rather than methods.
  • All multi-word method names are now camelCase rather than underscore_cased:
    this means that a method like ss.r_squared is now accessible as ss.rSquared

New Features

  • Ckmeans replaces Jenks
  • sortedUniqueCount provides an extremely fast method for counting
    unique values of sorted arrays.
  • sumNthPowerDeviations is now exposed, providing a simple way to calculate
    the fundamental aspect of measures like variance and skewness.
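
As a rough illustration of why the sum of nth power deviations is the building block for these measures, the sketch below re-implements it standalone and derives population variance from it. This is an assumption-laden example, not the library's source, and the `variance` helper here is defined only for this illustration.

```javascript
// Standalone sketch (assumption: not the simple-statistics source).
// Sum of (x - mean)^n over all values.
function sumNthPowerDeviations(values, n) {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  return values.reduce((acc, v) => acc + Math.pow(v - mean, n), 0);
}

// Population variance is the sum of squared deviations divided by the count.
function variance(values) {
  return sumNthPowerDeviations(values, 2) / values.length;
}

console.log(sumNthPowerDeviations([1, 2, 3], 2)); // 2
console.log(variance([1, 2, 3, 4]));              // 1.25
```

Skewness follows the same pattern with n = 3 in the numerator, which is why exposing the shared helper is convenient.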

Non-Breaking Changes

  • JSDoc documentation throughout
  • Each function is now its own file, and simple-statistics
    is assembled with CommonJS-style require() statements. simple-statistics can
    still be used in a browser with browserify.
  • The standard normal table is now calculated using the cumulative distribution
    function, rather than hardcoded.
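
Generating the table from the cumulative distribution function might look roughly like the sketch below. It evaluates the standard normal CDF with the series Φ(z) = 1/2 + φ(z)(z + z³/3 + z⁵/(3·5) + ...) and fills a table in steps of 0.01. The function name, the fixed term count, and the table range of 0 to 3.09 are assumptions made for this example, not details confirmed by the notes above.

```javascript
// Standalone sketch (assumption: not the simple-statistics source).
// Standard normal CDF via the series
//   Phi(z) = 1/2 + phi(z) * sum_{k>=0} z^(2k+1) / (1 * 3 * 5 * ... * (2k+1))
function cumulativeDistribution(z) {
  let term = z; // the k = 0 term of the series
  let sum = z;
  // 100 terms is far more than needed for |z| up to about 3.
  for (let k = 1; k < 100; k++) {
    term *= (z * z) / (2 * k + 1);
    sum += term;
  }
  return 0.5 + (sum / Math.sqrt(2 * Math.PI)) * Math.exp(-(z * z) / 2);
}

// Build the table from an integer index to avoid accumulating
// floating-point error in the step (assumed range: z = 0.00 to 3.09).
const standardNormalTable = [];
for (let i = 0; i <= 309; i++) {
  standardNormalTable.push(cumulativeDistribution(i / 100));
}

console.log(standardNormalTable[0]);   // 0.5
console.log(standardNormalTable[196]); // approximately 0.975
```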