
Renamed package from DSNA to Bottleneck.

Commit 9df171924c67e04a969f1f975ed8e713abac5f32 (1 parent: b5877c2), committed by @kwgoodman on Nov 28, 2010
Showing with 4,378 additions and 4,271 deletions.
  1. +1 −1 LICENSE
  2. +4 −4 MANIFEST.in
  3. +63 −68 README.rst
  4. +4 −4 RELEASE.rst
  5. +8 −7 {dsna → bottleneck}/LICENSE
  6. +3 −3 {dsna → bottleneck}/__init__.py
  7. 0 {dsna/testing → bottleneck/bench}/__init__.py
  8. 0 {dsna → bottleneck}/bench/autotimeit.py
  9. +20 −20 {dsna → bottleneck}/bench/bench.py
  10. +2 −2 {dsna → bottleneck}/src/Makefile
  11. +2,630 −2,630 {dsna → bottleneck}/src/func/func.c
  12. 0 {dsna → bottleneck}/src/func/func.pyx
  13. 0 {dsna → bottleneck}/src/func/header.pyx
  14. +12 −12 {dsna → bottleneck}/src/func/max.pyx
  15. +14 −14 {dsna → bottleneck}/src/func/mean.pyx
  16. +12 −12 {dsna → bottleneck}/src/func/min.pyx
  17. +5 −5 {dsna → bottleneck}/src/func/setup.py
  18. +12 −12 {dsna → bottleneck}/src/func/std.pyx
  19. +14 −13 {dsna → bottleneck}/src/func/sum.pyx
  20. +10 −10 {dsna → bottleneck}/src/func/var.pyx
  21. +1,083 −1,067 {dsna → bottleneck}/src/group/group.c
  22. 0 {dsna → bottleneck}/src/group/group.pyx
  23. 0 {dsna → bottleneck}/src/group/group_header.pyx
  24. +2 −2 {dsna → bottleneck}/src/group/group_mapper.pyx
  25. +9 −8 {dsna → bottleneck}/src/group/group_mean.pyx
  26. +5 −5 {dsna → bottleneck}/src/group/setup.py
  27. +253 −253 {dsna → bottleneck}/src/move/move.c
  28. 0 {dsna → bottleneck}/src/move/move.pyx
  29. 0 {dsna → bottleneck}/src/move/move_header.pyx
  30. +7 −7 {dsna → bottleneck}/src/move/move_sum.pyx
  31. +5 −5 {dsna → bottleneck}/src/move/setup.py
  32. 0 {dsna/bench → bottleneck/testing}/__init__.py
  33. 0 {dsna → bottleneck}/testing/group_validator.py
  34. 0 {dsna → bottleneck}/testing/move_validators.py
  35. +10 −10 {dsna → bottleneck}/tests/func_test.py
  36. +5 −5 {dsna → bottleneck}/tests/group_test.py
  37. +4 −4 {dsna → bottleneck}/tests/move_test.py
  38. +1 −1 {dsna → bottleneck}/version.py
  39. +3 −3 doc/doc_howto
  40. +5 −5 doc/source/conf.py
  41. +4 −4 doc/source/index.rst
  42. +1 −1 doc/source/license.rst
  43. +18 −18 doc/source/reference.rst
  44. +149 −56 setup.py
@@ -1 +1 @@
-The license file is one level down from this file: dsna/LICENSE.
+The license file is one level down from this file: bottleneck/LICENSE.
@@ -1,6 +1,6 @@
-include LICENSE README.rst RELEASE.rst dsna/LICENSE
-include dsna/src/MakeFile
-include dsna/src/func/setup.py
-recursive-include dsna/src/func *.pyx
+include LICENSE README.rst RELEASE.rst bottleneck/LICENSE
+include bottleneck/src/MakeFile
+include bottleneck/src/func/setup.py
+recursive-include bottleneck/src/func *.pyx
recursive-include doc *
recursive-exclude doc/build *
@@ -1,78 +1,77 @@
-====
-DSNA
-====
+==========
+Bottleneck
+==========
Introduction
============
-DSNA uses the magic of Cython to give you fast, NaN-aware descriptive
-statistics of NumPy arrays.
+Bottleneck is a collection of fast, NumPy array functions written in Cython.
+
+The three categories of Bottleneck functions:
+
+- Faster, drop-in replacement for NaN functions in NumPy and SciPy
+- Moving window functions
+- Group functions that bin calculations by like-labeled elements
-The functions in dsna fall into three categories:
+Function signatures (using mean as an example):
-=============== ===============================
- General sum(arr, axis=None)
- Moving window move_sum(arr, window, axis=0)
- Group by group_sum(arr, label, axis=0)
-=============== ===============================
+=============== ============================================
+ NaN functions mean(arr, axis=None)
+ Moving window move_mean(arr, window, axis=0)
+ Group by group_mean(arr, label, order=None, axis=0)
+=============== ============================================
-For example, create a NumPy array::
+Let's give it a try. Create a NumPy array::
>>> import numpy as np
>>> arr = np.array([1, 2, np.nan, 4, 5])
-Then find the sum::
+Find the sum::
- >>> import dsna as ds
- >>> ds.sum(arr)
+ >>> import bottleneck as bn
+ >>> bn.sum(arr)
12.0
Moving window sum::
- >>> ds.move_sum(arr, window=2)
+ >>> bn.move_sum(arr, window=2)
array([ nan, 3., 2., 4., 9.])
Group mean::
>>> label = ['a', 'a', 'b', 'b', 'a']
- >>> ds.group_mean(arr, label)
+ >>> bn.group_mean(arr, label)
(array([ 2.66666667, 4. ]), ['a', 'b'])
- >>> ds.group_mean(arr, label, order=['b', 'a'])
- (array([ 4. , 2.66666667]), ['b', 'a'])
- >>> ds.group_mean(arr, label, order=['b'])
- (array([ 4.]), ['b'])
Fast
====
-DNSA is fast::
+Bottleneck is fast::
- >>> import dsna as ds
- >>> import numpy as np
- >>> arr = np.random.rand(100, 100)
-
+ >>> arr = np.random.rand(100, 100)
>>> timeit np.nansum(arr)
10000 loops, best of 3: 68.4 us per loop
- >>> timeit ds.sum(arr)
+ >>> timeit bn.sum(arr)
100000 loops, best of 3: 17.7 us per loop
Let's not forget to add some NaNs::
>>> arr[arr > 0.5] = np.nan
>>> timeit np.nansum(arr)
1000 loops, best of 3: 417 us per loop
- >>> timeit ds.sum(arr)
+ >>> timeit bn.sum(arr)
10000 loops, best of 3: 64.8 us per loop
-DSNA comes with a benchmark suite that compares the performance of the dsna
-functions that have a NumPy/SciPy equivalent. To run the benchmark::
+Bottleneck comes with a benchmark suite that compares the performance of the
+bottleneck functions that have a NumPy/SciPy equivalent. To run the
+benchmark::
- >>> ds.benchit(verbose=False)
- DSNA performance benchmark
- DSNA 0.0.1dev
- Numpy 1.5.1
- Scipy 0.8.0
- Speed is numpy (or scipy) time divided by dsna time
+ >>> bn.benchit(verbose=False)
+ Bottleneck performance benchmark
+ Bottleneck 0.1.0dev
+ Numpy 1.5.1
+ Scipy 0.8.0
+ Speed is numpy (or scipy) time divided by Bottleneck time
NaN means all NaNs
Speed Test Shape dtype NaN?
4.8103 nansum(a, axis=-1) (500,500) int64
@@ -119,54 +118,49 @@ functions that have a NumPy/SciPy equivalent. To run the benchmark::
Faster
======
-Under the hood dsna uses a separate Cython function for each combination of
-ndim, dtype, and axis. A lot of the overhead in ds.max, for example, is
-in checking that your axis is within range, converting non-array data to an
+Under the hood Bottleneck uses a separate Cython function for each combination
+of ndim, dtype, and axis. A lot of the overhead in bn.max(), for example, is
+in checking that the axis is within range, converting non-array data to an
array, and selecting the function to use to calculate the maximum.
You can get rid of the overhead by doing all this before you, say, enter
an inner loop::
>>> arr = np.random.rand(10,10)
- >>> func, a = ds.func.max_selector(arr, axis=0)
+ >>> func, a = bn.func.max_selector(arr, axis=0)
>>> func
<built-in function max_2d_float64_axis0>
-Let's see how much faster than runs::
+Let's see how much faster than runs::
>> timeit np.nanmax(arr, axis=0)
10000 loops, best of 3: 25.7 us per loop
- >> timeit ds.max(arr, axis=0)
+ >> timeit bn.max(arr, axis=0)
100000 loops, best of 3: 5.25 us per loop
>> timeit func(a)
100000 loops, best of 3: 2.5 us per loop
-Note that ``func`` is faster than the Numpy's non-nan version of max::
+Note that ``func`` is faster than Numpy's non-NaN version of max::
>> timeit arr.max(axis=0)
100000 loops, best of 3: 3.28 us per loop
-So adding NaN protection to your inner loops has a negative cost!
+So adding NaN protection to your inner loops comes at a negative cost!
Functions
=========
-DSNA is in the prototype stage.
+Bottleneck is in the prototype stage.
-DSNA contains the following functions (an asterisk means not yet complete):
+Bottleneck contains the following functions:
========= ============== ===============
-sum* move_sum* group_sum*
-mean move_mean* group_mean*
-var move_var* group_var*
-std move_std* group_std*
-min move_min* group_min*
-max move_max* group_max*
-median* move_median* group_median*
-zscore* move_zscore* group_zscore*
-ranking* move_ranking* group_ranking*
-quantile* move_quantile* group_quantile*
-count* move_count* group_count*
+sum move_sum
+mean group_mean
+var
+std
+min
+max
========= ============== ===============
Currently only 1d, 2d, and 3d NumPy arrays with dtype int32, int64, and
@@ -175,23 +169,24 @@ float64 are supported.
License
=======
-DSNA is distributed under a Simplified BSD license. Parts of NumPy and Scipy,
-which both have BSD licenses, are included in dsna. See the LICENSE file,
-which is distributed with dsna, for details.
+Bottleneck is distributed under a Simplified BSD license. Parts of NumPy,
+Scipy and numpydoc, all of which have BSD licenses, are included in
+Bottleneck. See the LICENSE file, which is distributed with Bottleneck, for
+details.
-Install
-=======
+Download and install
+====================
-You can grab dsna from http://github.com/kwgoodman/dsna
+You can grab Bottleneck from http://github.com/kwgoodman/bottleneck
**GNU/Linux, Mac OS X, et al.**
-To install dsna::
+To install Bottleneck::
$ python setup.py build
$ sudo python setup.py install
-Or, if you wish to specify where dsna is installed, for example inside
+Or, if you wish to specify where Bottleneck is installed, for example inside
``/usr/local``::
$ python setup.py build
@@ -210,10 +205,10 @@ commands::
**Post install**
-After you have installed dsna, run the suite of unit tests::
+After you have installed Bottleneck, run the suite of unit tests::
- >>> import dsna
- >>> dsna.test()
+ >>> import bottleneck as bn
+ >>> bn.test()
<snip>
Ran 10 tests in 13.756s
OK
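The README examples in this diff pin down how NaNs are handled in the three function categories. As a rough cross-check, the sketch below reproduces those three outputs with plain NumPy. The names `move_sum_ref` and `group_mean_ref` are hypothetical and not part of Bottleneck. Counting NaN as zero inside a window and returning labels in sorted order are assumptions inferred from the outputs shown; the README does not show what happens for an all-NaN window.

```python
import numpy as np


def move_sum_ref(arr, window):
    """1d moving-window sum (hypothetical reference, not Bottleneck code).

    Assumption: NaNs count as zero and the first ``window - 1`` outputs
    are NaN, which is what the README example suggests.
    """
    arr = np.asarray(arr, dtype=float)
    filled = np.where(np.isnan(arr), 0.0, arr)
    # Prefix sums with a leading zero so each window is a difference.
    csum = np.concatenate(([0.0], np.cumsum(filled)))
    out = np.full(arr.shape, np.nan)
    out[window - 1:] = csum[window:] - csum[:-window]
    return out


def group_mean_ref(arr, label):
    """NaN-aware mean per label for a 1d array (hypothetical reference).

    Assumption: groups are returned in sorted label order.
    Requires NumPy 1.8 or later for ``np.nanmean``.
    """
    arr = np.asarray(arr, dtype=float)
    order = sorted(set(label))
    label = np.asarray(label)
    means = [np.nanmean(arr[label == key]) for key in order]
    return np.array(means), order


arr = np.array([1, 2, np.nan, 4, 5])
label = ['a', 'a', 'b', 'b', 'a']

print(np.nansum(arr))              # NaN-aware sum, the NumPy counterpart of bn.sum
print(move_sum_ref(arr, window=2))
print(group_mean_ref(arr, label))
```

A pure-NumPy reference like this is slow, but it is handy for checking the Cython functions on small inputs.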
@@ -4,11 +4,11 @@ Release Notes
=============
These are the major changes made in each release. For details of the changes
-see the commit log at http://github.com/kwgoodman/dsna
+see the commit log at http://github.com/kwgoodman/bottleneck
-dsna 0.1.0
-==========
+Bottleneck 0.1.0
+================
*Release date: Not yet released, in development*
-The first release of dsna (descriptive statistics of NumPy arrays).
+The first release of Bottleneck.
@@ -2,11 +2,12 @@
License
=======
-DSNA is distributed under a Simplified BSD license. Parts of NumPy, SciPy and
-and numpydoc, which all have BSD licenses, are included in dsna.
+Bottleneck is distributed under a Simplified BSD license. Parts of NumPy,
+SciPy and and numpydoc, which all have BSD licenses, are included in
+Bottleneck.
-DSNA license
-============
+Bottleneck license
+==================
Copyright (c) 2010, Archipel Asset Management AB.
All rights reserved.
@@ -37,8 +38,8 @@ POSSIBILITY OF SUCH DAMAGE.
Other licenses
==============
-DSNA contains doc strings from NumPy and SciPy and Sphinx extensions from
-numpydoc.
+Bottleneck contains doc strings from NumPy and SciPy and Sphinx extensions
+from numpydoc.
NumPy license
@@ -115,4 +116,4 @@ DAMAGE.
numpydoc license
----------------
-The numpydoc license is in dsna/doc/sphinxext/LICENSE.txt
+The numpydoc license is in bottleneck/doc/sphinxext/LICENSE.txt
@@ -3,12 +3,12 @@
from move import move_sum
from group import group_mean
-from dsna.version import __version__
-from dsna.bench.bench import *
+from bottleneck.version import __version__
+from bottleneck.bench.bench import *
try:
from numpy.testing import Tester
test = Tester().test
del Tester
except (ImportError, ValueError):
- print "No dsna unit testing available."
+ print "No Bottleneck unit testing available."
File renamed without changes.
File renamed without changes.
@@ -1,7 +1,7 @@
import numpy as np
import scipy
-import dsna as ds
+import bottleneck as bn
from autotimeit import autotimeit
__all__ = ['benchit']
@@ -35,20 +35,20 @@ def suite():
statements = {}
setups = {}
- setups['(10000,) float64'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=10000; a = geta((N,), 'float64')"
- setups['(500,500) float64'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=500; a = geta((N, N), 'float64')"
- setups['(10000,) float64 NaN'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=10000; a = geta((N,), 'float64', True)"
- setups['(500,500) float64 NaN'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=500; a = geta((N, N), 'float64', True)"
- setups['(10000,) int32'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=10000; a = geta((N,), 'int32')"
- setups['(500,500) int32'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=500; a = geta((N, N), 'int32')"
- setups['(10000,) int64'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=10000; a = geta((N,), 'int64')"
- setups['(500,500) int64'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=500; a = geta((N, N), 'int64')"
+ setups['(10000,) float64'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=10000; a = geta((N,), 'float64')"
+ setups['(500,500) float64'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=500; a = geta((N, N), 'float64')"
+ setups['(10000,) float64 NaN'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=10000; a = geta((N,), 'float64', True)"
+ setups['(500,500) float64 NaN'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=500; a = geta((N, N), 'float64', True)"
+ setups['(10000,) int32'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=10000; a = geta((N,), 'int32')"
+ setups['(500,500) int32'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=500; a = geta((N, N), 'int32')"
+ setups['(10000,) int64'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=10000; a = geta((N,), 'int64')"
+ setups['(500,500) int64'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=500; a = geta((N, N), 'int64')"
- # DSNA
- s = ['ds.sum(a, axis=-1)', 'ds.max(a, axis=-1)',
- 'ds.min(a, axis=-1)', 'ds.mean(a, axis=-1)',
- 'ds.std(a, axis=-1)']
- statements['dsna'] = s
+ # Bottleneck
+ s = ['bn.sum(a, axis=-1)', 'bn.max(a, axis=-1)',
+ 'bn.min(a, axis=-1)', 'bn.mean(a, axis=-1)',
+ 'bn.std(a, axis=-1)']
+ statements['bottleneck'] = s
# Numpy
s = ['np.nansum(a, axis=-1)', 'np.nanmax(a, axis=-1)',
@@ -60,14 +60,14 @@ def suite():
def display(results):
results = list(results)
- na = [i for i in results if i[0].startswith('ds.')]
+ na = [i for i in results if i[0].startswith('bn.')]
nu = [i for i in results if i[0].startswith('np.') or
i[0].startswith('sp.')]
- print 'DSNA performance benchmark'
- print "\tDSNA %s" % ds.__version__
- print "\tNumpy %s" % np.__version__
- print "\tScipy %s" % scipy.__version__
- print "\tSpeed is numpy (or scipy) time divided by dsna time"
+ print 'Bottleneck performance benchmark'
+ print "\tBottleneck %s" % bn.__version__
+ print "\tNumpy %s" % np.__version__
+ print "\tScipy %s" % scipy.__version__
+ print "\tSpeed is numpy (or scipy) time divided by Bottleneck time"
print "\tNaN means all NaNs"
print " Speed Test Shape dtype NaN?"
for nai in na:
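The eight setup strings in `suite()` differ only in shape, dtype, and the NaN flag, so they could be generated from one template. A possible sketch, assuming the strings should stay character-for-character identical to the ones in this commit; `TEMPLATE`, `CASES`, `SHAPES`, and `make_setups` are hypothetical names, not part of bench.py:

```python
# Hypothetical template; each piece mirrors the literal setup strings in bench.py.
TEMPLATE = ("import numpy as np; import scipy.stats as sp; "
            "import bottleneck as bn; "
            "from bottleneck.bench.bench import geta; "
            "N=%(n)d; a = geta(%(shape)s, '%(dtype)s'%(nan)s)")

# (dtype, include NaNs?) pairs used by the benchmark in this commit.
CASES = [('float64', False), ('float64', True),
         ('int32', False), ('int64', False)]

# (key prefix, N, shape expression) for the 1d and 2d arrays.
SHAPES = [('(10000,)', 10000, '(N,)'), ('(500,500)', 500, '(N, N)')]


def make_setups():
    """Return a dict equivalent to the hand-written ``setups`` dict."""
    setups = {}
    for dtype, nan in CASES:
        for prefix, n, shape in SHAPES:
            key = '%s %s%s' % (prefix, dtype, ' NaN' if nan else '')
            setups[key] = TEMPLATE % {'n': n, 'shape': shape,
                                      'dtype': dtype,
                                      'nan': ', True' if nan else ''}
    return setups
```

With a template, a future rename would touch one line instead of eight.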
@@ -31,10 +31,10 @@ groups:
mv group.so ../group.so
test:
- python -c "import dsna; dsna.test()"
+ python -c "import bottleneck; bottleneck.test()"
bench:
- python -c "import dsna; dsna.benchit()"
+ python -c "import bottleneck; bottleneck.benchit()"
# Phony targets for cleanup and similar uses
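The selector idea described in the README's "Faster" section (resolve ndim, dtype, and axis once, then call the specialised function inside the loop) can be sketched in pure Python. This is only an illustration under stated assumptions: `max_selector_sketch` and `_MAX_TABLE` are hypothetical names, and the NumPy lambdas stand in for Bottleneck's per-combination Cython functions such as `max_2d_float64_axis0`.

```python
import numpy as np

# Hypothetical lookup table keyed on (ndim, dtype name, axis). The NumPy
# callables are stand-ins for the specialised Cython functions.
_MAX_TABLE = {
    (2, "float64", 0): lambda a: np.nanmax(a, axis=0),
    (2, "float64", 1): lambda a: np.nanmax(a, axis=1),
}


def max_selector_sketch(arr, axis):
    """Do the argument checks and dispatch once; return (func, array)."""
    a = np.asarray(arr)           # convert non-array input up front
    if axis < 0:
        axis += a.ndim            # normalise negative axes
    if not 0 <= axis < a.ndim:
        raise ValueError("axis out of range")
    key = (a.ndim, a.dtype.name, axis)
    try:
        return _MAX_TABLE[key], a
    except KeyError:
        raise TypeError("unsupported ndim/dtype/axis combination: %r" % (key,))


arr = np.random.rand(10, 10)
func, a = max_selector_sketch(arr, axis=0)   # pay the dispatch cost once
results = [func(a) for _ in range(1000)]     # the inner loop calls func directly
```

The caller takes over responsibility for passing `func` an array of the same ndim and dtype it was selected for; that is the trade made for skipping the checks.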