Renamed package from DSNA to Bottleneck.

commit 9df171924c67e04a969f1f975ed8e713abac5f32 1 parent b5877c2
kwgoodman authored
Showing with 4,378 additions and 4,271 deletions.
  1. +1 −1  LICENSE
  2. +4 −4 MANIFEST.in
  3. +63 −68 README.rst
  4. +4 −4 RELEASE.rst
  5. +8 −7 {dsna → bottleneck}/LICENSE
  6. +3 −3 {dsna → bottleneck}/__init__.py
  7. 0  {dsna/testing → bottleneck/bench}/__init__.py
  8. 0  {dsna → bottleneck}/bench/autotimeit.py
  9. +20 −20 {dsna → bottleneck}/bench/bench.py
  10. +2 −2 {dsna → bottleneck}/src/Makefile
  11. +2,630 −2,630 {dsna → bottleneck}/src/func/func.c
  12. 0  {dsna → bottleneck}/src/func/func.pyx
  13. 0  {dsna → bottleneck}/src/func/header.pyx
  14. +12 −12 {dsna → bottleneck}/src/func/max.pyx
  15. +14 −14 {dsna → bottleneck}/src/func/mean.pyx
  16. +12 −12 {dsna → bottleneck}/src/func/min.pyx
  17. +5 −5 {dsna → bottleneck}/src/func/setup.py
  18. +12 −12 {dsna → bottleneck}/src/func/std.pyx
  19. +14 −13 {dsna → bottleneck}/src/func/sum.pyx
  20. +10 −10 {dsna → bottleneck}/src/func/var.pyx
  21. +1,083 −1,067 {dsna → bottleneck}/src/group/group.c
  22. 0  {dsna → bottleneck}/src/group/group.pyx
  23. 0  {dsna → bottleneck}/src/group/group_header.pyx
  24. +2 −2 {dsna → bottleneck}/src/group/group_mapper.pyx
  25. +9 −8 {dsna → bottleneck}/src/group/group_mean.pyx
  26. +5 −5 {dsna → bottleneck}/src/group/setup.py
  27. +253 −253 {dsna → bottleneck}/src/move/move.c
  28. 0  {dsna → bottleneck}/src/move/move.pyx
  29. 0  {dsna → bottleneck}/src/move/move_header.pyx
  30. +7 −7 {dsna → bottleneck}/src/move/move_sum.pyx
  31. +5 −5 {dsna → bottleneck}/src/move/setup.py
  32. 0  {dsna/bench → bottleneck/testing}/__init__.py
  33. 0  {dsna → bottleneck}/testing/group_validator.py
  34. 0  {dsna → bottleneck}/testing/move_validators.py
  35. +10 −10 {dsna → bottleneck}/tests/func_test.py
  36. +5 −5 {dsna → bottleneck}/tests/group_test.py
  37. +4 −4 {dsna → bottleneck}/tests/move_test.py
  38. +1 −1  {dsna → bottleneck}/version.py
  39. +3 −3 doc/doc_howto
  40. +5 −5 doc/source/conf.py
  41. +4 −4 doc/source/index.rst
  42. +1 −1  doc/source/license.rst
  43. +18 −18 doc/source/reference.rst
  44. +149 −56 setup.py
2  LICENSE
@@ -1 +1 @@
-The license file is one level down from this file: dsna/LICENSE.
+The license file is one level down from this file: bottleneck/LICENSE.
8 MANIFEST.in
@@ -1,6 +1,6 @@
-include LICENSE README.rst RELEASE.rst dsna/LICENSE
-include dsna/src/MakeFile
-include dsna/src/func/setup.py
-recursive-include dsna/src/func *.pyx
+include LICENSE README.rst RELEASE.rst bottleneck/LICENSE
+include bottleneck/src/MakeFile
+include bottleneck/src/func/setup.py
+recursive-include bottleneck/src/func *.pyx
recursive-include doc *
recursive-exclude doc/build *
131 README.rst
@@ -1,59 +1,57 @@
-====
-DSNA
-====
+==========
+Bottleneck
+==========
Introduction
============
-DSNA uses the magic of Cython to give you fast, NaN-aware descriptive
-statistics of NumPy arrays.
+Bottleneck is a collection of fast, NumPy array functions written in Cython.
+
+The three categories of Bottleneck functions:
+
+- Faster, drop-in replacement for NaN functions in NumPy and SciPy
+- Moving window functions
+- Group functions that bin calculations by like-labeled elements
-The functions in dsna fall into three categories:
+Function signatures (using mean as an example):
-=============== ===============================
- General sum(arr, axis=None)
- Moving window move_sum(arr, window, axis=0)
- Group by group_sum(arr, label, axis=0)
-=============== ===============================
+=============== ============================================
+ NaN functions mean(arr, axis=None)
+ Moving window move_mean(arr, window, axis=0)
+ Group by group_mean(arr, label, order=None, axis=0)
+=============== ============================================
-For example, create a NumPy array::
+Let's give it a try. Create a NumPy array::
>>> import numpy as np
>>> arr = np.array([1, 2, np.nan, 4, 5])
-Then find the sum::
+Find the sum::
- >>> import dsna as ds
- >>> ds.sum(arr)
+ >>> import bottleneck as bn
+ >>> bn.sum(arr)
12.0
Moving window sum::
- >>> ds.move_sum(arr, window=2)
+ >>> bn.move_sum(arr, window=2)
array([ nan, 3., 2., 4., 9.])
Group mean::
>>> label = ['a', 'a', 'b', 'b', 'a']
- >>> ds.group_mean(arr, label)
+ >>> bn.group_mean(arr, label)
(array([ 2.66666667, 4. ]), ['a', 'b'])
- >>> ds.group_mean(arr, label, order=['b', 'a'])
- (array([ 4. , 2.66666667]), ['b', 'a'])
- >>> ds.group_mean(arr, label, order=['b'])
- (array([ 4.]), ['b'])
Fast
====
-DNSA is fast::
+Bottleneck is fast::
- >>> import dsna as ds
- >>> import numpy as np
- >>> arr = np.random.rand(100, 100)
-
+ >>> arr = np.random.rand(100, 100)
>>> timeit np.nansum(arr)
10000 loops, best of 3: 68.4 us per loop
- >>> timeit ds.sum(arr)
+ >>> timeit bn.sum(arr)
100000 loops, best of 3: 17.7 us per loop
Let's not forget to add some NaNs::
@@ -61,18 +59,19 @@ Let's not forget to add some NaNs::
>>> arr[arr > 0.5] = np.nan
>>> timeit np.nansum(arr)
1000 loops, best of 3: 417 us per loop
- >>> timeit ds.sum(arr)
+ >>> timeit bn.sum(arr)
10000 loops, best of 3: 64.8 us per loop
-DSNA comes with a benchmark suite that compares the performance of the dsna
-functions that have a NumPy/SciPy equivalent. To run the benchmark::
+Bottleneck comes with a benchmark suite that compares the performance of the
+bottleneck functions that have a NumPy/SciPy equivalent. To run the
+benchmark::
- >>> ds.benchit(verbose=False)
- DSNA performance benchmark
- DSNA 0.0.1dev
- Numpy 1.5.1
- Scipy 0.8.0
- Speed is numpy (or scipy) time divided by dsna time
+ >>> bn.benchit(verbose=False)
+ Bottleneck performance benchmark
+ Bottleneck 0.1.0dev
+ Numpy 1.5.1
+ Scipy 0.8.0
+ Speed is numpy (or scipy) time divided by Bottleneck time
NaN means all NaNs
Speed Test Shape dtype NaN?
4.8103 nansum(a, axis=-1) (500,500) int64
@@ -119,54 +118,49 @@ functions that have a NumPy/SciPy equivalent. To run the benchmark::
Faster
======
-Under the hood dsna uses a separate Cython function for each combination of
-ndim, dtype, and axis. A lot of the overhead in ds.max, for example, is
-in checking that your axis is within range, converting non-array data to an
+Under the hood Bottleneck uses a separate Cython function for each combination
+of ndim, dtype, and axis. A lot of the overhead in bn.max(), for example, is
+in checking that the axis is within range, converting non-array data to an
array, and selecting the function to use to calculate the maximum.
You can get rid of the overhead by doing all this before you, say, enter
an inner loop::
>>> arr = np.random.rand(10,10)
- >>> func, a = ds.func.max_selector(arr, axis=0)
+ >>> func, a = bn.func.max_selector(arr, axis=0)
>>> func
<built-in function max_2d_float64_axis0>
-Let's see how much faster than runs::
+Let's see how much faster than runs::
>> timeit np.nanmax(arr, axis=0)
10000 loops, best of 3: 25.7 us per loop
- >> timeit ds.max(arr, axis=0)
+ >> timeit bn.max(arr, axis=0)
100000 loops, best of 3: 5.25 us per loop
>> timeit func(a)
100000 loops, best of 3: 2.5 us per loop
-Note that ``func`` is faster than the Numpy's non-nan version of max::
+Note that ``func`` is faster than Numpy's non-NaN version of max::
>> timeit arr.max(axis=0)
100000 loops, best of 3: 3.28 us per loop
-So adding NaN protection to your inner loops has a negative cost!
+So adding NaN protection to your inner loops comes at a negative cost!
Functions
=========
-DSNA is in the prototype stage.
+Bottleneck is in the prototype stage.
-DSNA contains the following functions (an asterisk means not yet complete):
+Bottleneck contains the following functions:
========= ============== ===============
-sum* move_sum* group_sum*
-mean move_mean* group_mean*
-var move_var* group_var*
-std move_std* group_std*
-min move_min* group_min*
-max move_max* group_max*
-median* move_median* group_median*
-zscore* move_zscore* group_zscore*
-ranking* move_ranking* group_ranking*
-quantile* move_quantile* group_quantile*
-count* move_count* group_count*
+sum move_sum
+mean group_mean
+var
+std
+min
+max
========= ============== ===============
Currently only 1d, 2d, and 3d NumPy arrays with dtype int32, int64, and
@@ -175,23 +169,24 @@ float64 are supported.
License
=======
-DSNA is distributed under a Simplified BSD license. Parts of NumPy and Scipy,
-which both have BSD licenses, are included in dsna. See the LICENSE file,
-which is distributed with dsna, for details.
+Bottleneck is distributed under a Simplified BSD license. Parts of NumPy,
+Scipy and numpydoc, all of which have BSD licenses, are included in
+Bottleneck. See the LICENSE file, which is distributed with Bottleneck, for
+details.
-Install
-=======
+Download and install
+====================
-You can grab dsna from http://github.com/kwgoodman/dsna
+You can grab Bottleneck from http://github.com/kwgoodman/bottleneck
**GNU/Linux, Mac OS X, et al.**
-To install dsna::
+To install Bottleneck::
$ python setup.py build
$ sudo python setup.py install
-Or, if you wish to specify where dsna is installed, for example inside
+Or, if you wish to specify where Bottleneck is installed, for example inside
``/usr/local``::
$ python setup.py build
@@ -210,10 +205,10 @@ commands::
**Post install**
-After you have installed dsna, run the suite of unit tests::
+After you have installed Bottleneck, run the suite of unit tests::
- >>> import dsna
- >>> dsna.test()
+ >>> import bottleneck as bn
+ >>> bn.test()
<snip>
Ran 10 tests in 13.756s
OK
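Note on the README examples in this file: the three function categories (NaN-aware reduction, moving window, group by label) can be approximated in plain NumPy to check the documented outputs. What follows is a minimal sketch, not Bottleneck's Cython implementation. The helper names `nan_sum`, `nan_move_sum`, and `nan_group_mean` are hypothetical, and the sketch assumes 1d float input and Python 3 (the package itself targets Python 2 at this commit). How Bottleneck treats a window that contains only NaNs is not stated in the README, so the sketch makes no claim about that case.

```python
import numpy as np

def nan_sum(arr):
    """Sum that ignores NaNs (reference for the README's bn.sum example)."""
    return np.nansum(np.asarray(arr, dtype=float))

def nan_move_sum(arr, window):
    """Moving-window sum over a 1d array; NaNs inside a window are skipped.

    The first window - 1 outputs are NaN because the window is incomplete.
    """
    a = np.asarray(arr, dtype=float)
    out = np.full(a.shape, np.nan)
    for i in range(window - 1, a.size):
        out[i] = np.nansum(a[i - window + 1:i + 1])
    return out

def nan_group_mean(arr, label):
    """NaN-ignoring mean of the elements sharing a label; labels are sorted."""
    a = np.asarray(arr, dtype=float)
    labels = np.asarray(label)
    order = sorted(set(label))
    means = np.array([np.nanmean(a[labels == key]) for key in order])
    return means, order

# Same inputs as the README session.
arr = np.array([1, 2, np.nan, 4, 5])
total = nan_sum(arr)
moving = nan_move_sum(arr, window=2)
means, order = nan_group_mean(arr, ['a', 'a', 'b', 'b', 'a'])
```

With the README's input, `total`, `moving`, and `means` should agree with the values shown for `bn.sum`, `bn.move_sum`, and `bn.group_mean`.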
8 RELEASE.rst
@@ -4,11 +4,11 @@ Release Notes
=============
These are the major changes made in each release. For details of the changes
-see the commit log at http://github.com/kwgoodman/dsna
+see the commit log at http://github.com/kwgoodman/bottleneck
-dsna 0.1.0
-==========
+Bottleneck 0.1.0
+================
*Release date: Not yet released, in development*
-The first release of dsna (descriptive statistics of NumPy arrays).
+The first release of Bottleneck.
15 dsna/LICENSE → bottleneck/LICENSE
@@ -2,11 +2,12 @@
License
=======
-DSNA is distributed under a Simplified BSD license. Parts of NumPy, SciPy and
-and numpydoc, which all have BSD licenses, are included in dsna.
+Bottleneck is distributed under a Simplified BSD license. Parts of NumPy,
+SciPy and and numpydoc, which all have BSD licenses, are included in
+Bottleneck.
-DSNA license
-============
+Bottleneck license
+==================
Copyright (c) 2010, Archipel Asset Management AB.
All rights reserved.
@@ -37,8 +38,8 @@ POSSIBILITY OF SUCH DAMAGE.
Other licenses
==============
-DSNA contains doc strings from NumPy and SciPy and Sphinx extensions from
-numpydoc.
+Bottleneck contains doc strings from NumPy and SciPy and Sphinx extensions
+from numpydoc.
NumPy license
@@ -115,4 +116,4 @@ DAMAGE.
numpydoc license
----------------
-The numpydoc license is in dsna/doc/sphinxext/LICENSE.txt
+The numpydoc license is in bottleneck/doc/sphinxext/LICENSE.txt
6 dsna/__init__.py → bottleneck/__init__.py
@@ -3,12 +3,12 @@
from move import move_sum
from group import group_mean
-from dsna.version import __version__
-from dsna.bench.bench import *
+from bottleneck.version import __version__
+from bottleneck.bench.bench import *
try:
from numpy.testing import Tester
test = Tester().test
del Tester
except (ImportError, ValueError):
- print "No dsna unit testing available."
+ print "No Bottleneck unit testing available."
0  dsna/testing/__init__.py → bottleneck/bench/__init__.py
File renamed without changes
0  dsna/bench/autotimeit.py → bottleneck/bench/autotimeit.py
File renamed without changes
40 dsna/bench/bench.py → bottleneck/bench/bench.py
@@ -1,7 +1,7 @@
import numpy as np
import scipy
-import dsna as ds
+import bottleneck as bn
from autotimeit import autotimeit
__all__ = ['benchit']
@@ -35,20 +35,20 @@ def suite():
statements = {}
setups = {}
- setups['(10000,) float64'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=10000; a = geta((N,), 'float64')"
- setups['(500,500) float64'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=500; a = geta((N, N), 'float64')"
- setups['(10000,) float64 NaN'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=10000; a = geta((N,), 'float64', True)"
- setups['(500,500) float64 NaN'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=500; a = geta((N, N), 'float64', True)"
- setups['(10000,) int32'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=10000; a = geta((N,), 'int32')"
- setups['(500,500) int32'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=500; a = geta((N, N), 'int32')"
- setups['(10000,) int64'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=10000; a = geta((N,), 'int64')"
- setups['(500,500) int64'] = "import numpy as np; import scipy.stats as sp; import dsna as ds; from dsna.bench.bench import geta; N=500; a = geta((N, N), 'int64')"
+ setups['(10000,) float64'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=10000; a = geta((N,), 'float64')"
+ setups['(500,500) float64'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=500; a = geta((N, N), 'float64')"
+ setups['(10000,) float64 NaN'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=10000; a = geta((N,), 'float64', True)"
+ setups['(500,500) float64 NaN'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=500; a = geta((N, N), 'float64', True)"
+ setups['(10000,) int32'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=10000; a = geta((N,), 'int32')"
+ setups['(500,500) int32'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=500; a = geta((N, N), 'int32')"
+ setups['(10000,) int64'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=10000; a = geta((N,), 'int64')"
+ setups['(500,500) int64'] = "import numpy as np; import scipy.stats as sp; import bottleneck as bn; from bottleneck.bench.bench import geta; N=500; a = geta((N, N), 'int64')"
- # DSNA
- s = ['ds.sum(a, axis=-1)', 'ds.max(a, axis=-1)',
- 'ds.min(a, axis=-1)', 'ds.mean(a, axis=-1)',
- 'ds.std(a, axis=-1)']
- statements['dsna'] = s
+ # Bottleneck
+ s = ['bn.sum(a, axis=-1)', 'bn.max(a, axis=-1)',
+ 'bn.min(a, axis=-1)', 'bn.mean(a, axis=-1)',
+ 'bn.std(a, axis=-1)']
+ statements['bottleneck'] = s
# Numpy
s = ['np.nansum(a, axis=-1)', 'np.nanmax(a, axis=-1)',
@@ -60,14 +60,14 @@ def suite():
def display(results):
results = list(results)
- na = [i for i in results if i[0].startswith('ds.')]
+ na = [i for i in results if i[0].startswith('bn.')]
nu = [i for i in results if i[0].startswith('np.') or
i[0].startswith('sp.')]
- print 'DSNA performance benchmark'
- print "\tDSNA %s" % ds.__version__
- print "\tNumpy %s" % np.__version__
- print "\tScipy %s" % scipy.__version__
- print "\tSpeed is numpy (or scipy) time divided by dsna time"
+ print 'Bottleneck performance benchmark'
+ print "\tBottleneck %s" % bn.__version__
+ print "\tNumpy %s" % np.__version__
+ print "\tScipy %s" % scipy.__version__
+ print "\tSpeed is numpy (or scipy) time divided by Bottleneck time"
print "\tNaN means all NaNs"
print " Speed Test Shape dtype NaN?"
for nai in na:
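Note on the benchmark output: the "Speed" column is described as NumPy (or SciPy) time divided by Bottleneck time. A minimal sketch of that ratio using the standard-library `timeit` module is shown here. The function name `speed_ratio` is hypothetical, and `np.sum` stands in for `bn.sum` so the sketch runs without Bottleneck installed; it is not the code in `bench.py` or `autotimeit.py`.

```python
import timeit

def speed_ratio(baseline_stmt, candidate_stmt, setup, number=100, repeat=3):
    """Return best baseline time divided by best candidate time.

    A value above 1 means the candidate statement ran faster.
    """
    base = min(timeit.repeat(baseline_stmt, setup=setup,
                             number=number, repeat=repeat))
    cand = min(timeit.repeat(candidate_stmt, setup=setup,
                             number=number, repeat=repeat))
    return base / cand

# The setup string mirrors the style of the `setups` dict in bench.py.
setup = "import numpy as np; a = np.random.rand(500, 500)"
ratio = speed_ratio("np.nansum(a, axis=-1)", "np.sum(a, axis=-1)", setup)
```

Taking the minimum over repeats follows the "best of 3" convention visible in the README's `timeit` output.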
4 dsna/src/Makefile → bottleneck/src/Makefile
@@ -31,10 +31,10 @@ groups:
mv group.so ../group.so
test:
- python -c "import dsna; dsna.test()"
+ python -c "import bottleneck; bottleneck.test()"
bench:
- python -c "import dsna; dsna.benchit()"
+ python -c "import bottleneck; bottleneck.benchit()"
# Phony targets for cleanup and similar uses
5,260 dsna/src/func/func.c → bottleneck/src/func/func.c
2,630 additions, 2,630 deletions not shown
0  dsna/src/func/func.pyx → bottleneck/src/func/func.pyx
File renamed without changes
0  dsna/src/func/header.pyx → bottleneck/src/func/header.pyx
File renamed without changes
24 dsna/src/func/max.pyx → bottleneck/src/func/max.pyx
@@ -56,16 +56,16 @@ def max(arr, axis=None):
Examples
--------
- >>> ds.max(1)
+ >>> bn.max(1)
1
- >>> ds.max([1])
+ >>> bn.max([1])
1
- >>> ds.max([1, np.nan])
+ >>> bn.max([1, np.nan])
1.0
>>> a = np.array([[1, 4], [1, np.nan]])
- >>> ds.max(a)
+ >>> bn.max(a)
4.0
- >>> ds.max(a, axis=0)
+ >>> bn.max(a, axis=0)
array([ 1., 4.])
"""
@@ -76,11 +76,11 @@ def max_selector(arr, axis):
"""
Return maximum function and array that matches `arr` and `axis`.
- Under the hood dsna uses a separate Cython function for each combination
- of ndim, dtype, and axis. A lot of the overhead in ds.max() is in
- checking that `axis` is within range, converting `arr` into an array (if
- it is not already an array), and selecting the function to use to
- calculate the maximum.
+ Under the hood Bottleneck uses a separate Cython function for each
+ combination of ndim, dtype, and axis. A lot of the overhead in bn.max()
+ is in checking that `axis` is within range, converting `arr` into an
+ array (if it is not already an array), and selecting the function to use
+ to calculate the maximum.
You can get rid of the overhead by doing all this before you, for example,
enter an inner loop, by using the this function.
@@ -112,7 +112,7 @@ def max_selector(arr, axis):
Obtain the function needed to determine the maximum of `arr` along
axis=0:
- >>> func, a = ds.func.max_selector(arr, axis=0)
+ >>> func, a = bn.func.max_selector(arr, axis=0)
>>> func
<built-in function max_1d_float64_axis0>
@@ -127,7 +127,7 @@ def max_selector(arr, axis):
cdef np.dtype dtype = a.dtype
cdef int size = a.size
if size == 0:
- msg = "numpy.nanmax() raises on size=0 input; so dsna does too."
+ msg = "numpy.nanmax() raises on size=0 input; so Bottleneck does too."
raise ValueError, msg
if axis != None:
if axis < 0:
28 dsna/src/func/mean.pyx → bottleneck/src/func/mean.pyx
@@ -67,25 +67,25 @@ def mean(arr, axis=None):
Examples
--------
- >>> ds.mean(1)
+ >>> bn.mean(1)
1.0
- >>> ds.mean([1])
+ >>> bn.mean([1])
1.0
- >>> ds.mean([1, np.nan])
+ >>> bn.mean([1, np.nan])
1.0
>>> a = np.array([[1, 4], [1, np.nan]])
- >>> ds.mean(a)
+ >>> bn.mean(a)
2.0
- >>> ds.mean(a, axis=0)
+ >>> bn.mean(a, axis=0)
array([ 1., 4.])
When positive infinity and negative infinity are present:
- >>> ds.mean([1, np.nan, np.inf])
+ >>> bn.mean([1, np.nan, np.inf])
inf
- >>> ds.mean([1, np.nan, np.NINF])
+ >>> bn.mean([1, np.nan, np.NINF])
-inf
- >>> ds.mean([1, np.nan, np.inf, np.NINF])
+ >>> bn.mean([1, np.nan, np.inf, np.NINF])
nan
"""
@@ -96,11 +96,11 @@ def mean_selector(arr, axis):
"""
Return mean function and array that matches `arr` and `axis`.
- Under the hood dsna uses a separate Cython function for each combination
- of ndim, dtype, and axis. A lot of the overhead in ds.mean() is in
- checking that `axis` is within range, converting `arr` into an array (if
- it is not already an array), and selecting the function to use to
- calculate the mean.
+ Under the hood Bottleneck uses a separate Cython function for each
+ combination of ndim, dtype, and axis. A lot of the overhead in bn.mean()
+ is in checking that `axis` is within range, converting `arr` into an
+ array (if it is not already an array), and selecting the function to use
+ to calculate the mean.
You can get rid of the overhead by doing all this before you, for example,
enter an inner loop, by using the this function.
@@ -130,7 +130,7 @@ def mean_selector(arr, axis):
Obtain the function needed to determine the mean of `arr` along axis=0:
- >>> func, a = ds.func.mean_selector(arr, axis=0)
+ >>> func, a = bn.func.mean_selector(arr, axis=0)
>>> func
<built-in function mean_1d_float64_axis0>
24 dsna/src/func/min.pyx → bottleneck/src/func/min.pyx
@@ -56,16 +56,16 @@ def min(arr, axis=None):
Examples
--------
- >>> ds.min(1)
+ >>> bn.min(1)
1
- >>> ds.min([1])
+ >>> bn.min([1])
1
- >>> ds.min([1, np.nan])
+ >>> bn.min([1, np.nan])
1.0
>>> a = np.array([[1, 4], [1, np.nan]])
- >>> ds.min(a)
+ >>> bn.min(a)
1.0
- >>> ds.min(a, axis=0)
+ >>> bn.min(a, axis=0)
array([ 1., 4.])
"""
@@ -76,11 +76,11 @@ def min_selector(arr, axis):
"""
Return minimum function and array that matches `arr` and `axis`.
- Under the hood dsna uses a separate Cython function for each combination
- of ndim, dtype, and axis. A lot of the overhead in ds.min() is in
- checking that `axis` is within range, converting `arr` into an array (if
- it is not already an array), and selecting the function to use to
- calculate the minimum.
+ Under the hood Bottleneck uses a separate Cython function for each
+ combination of ndim, dtype, and axis. A lot of the overhead in bn.min()
+ is in checking that `axis` is within range, converting `arr` into an
+ array (if it is not already an array), and selecting the function to use
+ to calculate the minimum.
You can get rid of the overhead by doing all this before you, for example,
enter an inner loop, by using the this function.
@@ -112,7 +112,7 @@ def min_selector(arr, axis):
Obtain the function needed to determine the minimum of `arr` along
axis=0:
- >>> func, a = ds.func.min_selector(arr, axis=0)
+ >>> func, a = bn.func.min_selector(arr, axis=0)
>>> func
<built-in function min_1d_float64_axis0>
@@ -127,7 +127,7 @@ def min_selector(arr, axis):
cdef np.dtype dtype = a.dtype
cdef int size = a.size
if size == 0:
- msg = "numpy.nanmin() raises on size=0 input; sdsnaa does too."
+ msg = "numpy.nanmin() raises on size=0 input; so Bottleneck does too."
raise ValueError, msg
if axis != None:
if axis < 0:
10 dsna/src/func/setup.py → bottleneck/src/func/setup.py
@@ -1,15 +1,15 @@
"""
Use to convert func.pyx to a C file.
-This setup.py is NOT used to install the DSNA package. The DSNA setup.py
-file is dsna/setup.py
+This setup.py is NOT used to install the Bottleneck package. The Bottleneck
+setup.py file is bottleneck/setup.py
-The C files are distributed with dsna, so this file is only useful if you
-modify sum.pyx or std.pyx or ...
+The C files are distributed with Bottleneck, so this file is only useful if
+you modify sum.pyx or std.pyx or ...
To convert from cython to C:
-$ cd dsna/dsna/src
+$ cd bottleneck/bottleneck/src
$ python func/setup.py build_ext --inplace
"""
24 dsna/src/func/std.pyx → bottleneck/src/func/std.pyx
@@ -64,21 +64,21 @@ def std(arr, axis=None, int ddof=0):
Examples
--------
- >>> ds.std(1)
+ >>> bn.std(1)
0.0
- >>> ds.std([1])
+ >>> bn.std([1])
0.0
- >>> ds.std([1, np.nan])
+ >>> bn.std([1, np.nan])
0.0
>>> a = np.array([[1, 4], [1, np.nan]])
- >>> ds.std(a)
+ >>> bn.std(a)
1.4142135623730951
- >>> ds.std(a, axis=0)
+ >>> bn.std(a, axis=0)
array([ 0., 0.])
When positive infinity or negative infinity are present NaN is returned:
- >>> ds.std([1, np.nan, np.inf])
+ >>> bn.std([1, np.nan, np.inf])
nan
"""
@@ -89,11 +89,11 @@ def std_selector(arr, axis):
"""
Return std function and array that matches `arr` and `axis`.
- Under the hood dsna uses a separate Cython function for each combination
- of ndim, dtype, and axis. A lot of the overhead in ds.std() is in
- checking that `axis` is within range, converting `arr` into an array (if
- it is not already an array), and selecting the function to use to
- calculate the standard deviation.
+ Under the hood Bottleneck uses a separate Cython function for each
+ combination of ndim, dtype, and axis. A lot of the overhead in bn.std()
+ is in checking that `axis` is within range, converting `arr` into an
+ array (if it is not already an array), and selecting the function to use
+ to calculate the standard deviation.
You can get rid of the overhead by doing all this before you, for example,
enter an inner loop, by using the this function.
@@ -126,7 +126,7 @@ def std_selector(arr, axis):
Obtain the function needed to determine the standard deviation of `arr`
along axis=0:
- >>> func, a = ds.func.std_selector(arr, axis=0)
+ >>> func, a = bn.func.std_selector(arr, axis=0)
>>> func
<built-in function std_1d_float64_axis0>
27 dsna/src/func/sum.pyx → bottleneck/src/func/sum.pyx
@@ -66,25 +66,25 @@ def sum(arr, axis=None):
Examples
--------
- >>> ds.sum(1)
+ >>> bn.sum(1)
1
- >>> ds.sum([1])
+ >>> bn.sum([1])
1
- >>> ds.sum([1, np.nan])
+ >>> bn.sum([1, np.nan])
1.0
>>> a = np.array([[1, 1], [1, np.nan]])
- >>> ds.sum(a)
+ >>> bn.sum(a)
3.0
- >>> ds.sum(a, axis=0)
+ >>> bn.sum(a, axis=0)
array([ 2., 1.])
When positive infinity and negative infinity are present:
- >>> ds.sum([1, np.nan, np.inf])
+ >>> bn.sum([1, np.nan, np.inf])
inf
- >>> ds.sum([1, np.nan, np.NINF])
+ >>> bn.sum([1, np.nan, np.NINF])
-inf
- >>> ds.sum([1, np.nan, np.inf, np.NINF])
+ >>> bn.sum([1, np.nan, np.inf, np.NINF])
nan
"""
@@ -95,10 +95,11 @@ def sum_selector(arr, axis=None):
"""
Return sum function and array that matches `arr` and `axis`.
- Under the hood dsna uses a separate Cython function for each combination
- of ndim, dtype, and axis. A lot of the overhead in ds.sum() is in checking
- that `axis` is within range, converting `arr` into an array (if it is not
- already an array), and selecting the function to use to calculate the sum.
+ Under the hood Bottleneck uses a separate Cython function for each
+ combination of ndim, dtype, and axis. A lot of the overhead in bn.sum()
+ is in checking that `axis` is within range, converting `arr` into an
+ array (if it is not already an array), and selecting the function to use
+ to calculate the sum.
You can get rid of the overhead by doing all this before you, for example,
enter an inner loop, by using the this function.
@@ -128,7 +129,7 @@ def sum_selector(arr, axis=None):
Obtain the function needed to sum `arr` along axis=0:
- >>> func, a = ds.func.sum_selector(arr, axis=0)
+ >>> func, a = bn.func.sum_selector(arr, axis=0)
>>> func
<built-in function sum_1d_float64_axis0>
20 dsna/src/func/var.pyx → bottleneck/src/func/var.pyx
@@ -64,21 +64,21 @@ def var(arr, axis=None, int ddof=0):
Examples
--------
- >>> ds.var(1)
+ >>> bn.var(1)
0.0
- >>> ds.var([1])
+ >>> bn.var([1])
0.0
- >>> ds.var([1, np.nan])
+ >>> bn.var([1, np.nan])
0.0
>>> a = np.array([[1, 4], [1, np.nan]])
- >>> ds.var(a)
+ >>> bn.var(a)
2.0
- >>> ds.var(a, axis=0)
+ >>> bn.var(a, axis=0)
array([ 0., 0.])
When positive infinity or negative infinity are present NaN is returned:
- >>> ds.var([1, np.nan, np.inf])
+ >>> bn.var([1, np.nan, np.inf])
nan
"""
@@ -89,10 +89,10 @@ def var_selector(arr, axis):
"""
Return variance function and array that matches `arr` and `axis`.
- Under the hood dsna uses a separate Cython function for each combination
- of ndim, dtype, and axis. A lot of the overhead in ds.var() is in
- checking that `axis` is within range, converting `arr` into an array (if
- it is not already an array), and selecting the function to use to
+ Under the hood Bottleneck uses a separate Cython function for each
+ combination of ndim, dtype, and axis. A lot of the overhead in bn.var()
+ is in checking that `axis` is within range, converting `arr` into an array
+ (if it is not already an array), and selecting the function to use to
calculate the variance.
You can get rid of the overhead by doing all this before you, for example,
2,150 dsna/src/group/group.c → bottleneck/src/group/group.c
1,083 additions, 1,067 deletions not shown
0  dsna/src/group/group.pyx → bottleneck/src/group/group.pyx
File renamed without changes
0  dsna/src/group/group_header.pyx → bottleneck/src/group/group_header.pyx
File renamed without changes
4 dsna/src/group/group_mapper.pyx → bottleneck/src/group/group_mapper.pyx
@@ -40,7 +40,7 @@ def group_mapper(label, order=None):
Examples
--------
- >>> from dsna.group import group_mapper
+ >>> from bottleneck.group import group_mapper
>>> group_mapper([1, 2, 1, 2])
{1: [0, 2], 2: [1, 3]}
>>> group_mapper(['1', '2', '1', '2'])
@@ -88,7 +88,7 @@ def group_mapper_selector(label):
Examples
--------
- >>> from dsna.group import group_mapper_selector
+ >>> from bottleneck.group import group_mapper_selector
>>> group_mapper_selector([1, 2, 1, 2])
(<built-in function group_mapper_list>, [1, 2, 1, 2])
>>> group_mapper_selector((1, 2, 1, 2))
17 dsna/src/group/group_mean.pyx → bottleneck/src/group/group_mean.pyx
@@ -75,7 +75,7 @@ def group_mean(arr, label, order=None, int axis=0):
--------
Set up the problem:
- >>> from dsna import group_mean
+ >>> from bottleneck import group_mean
>>> arr = np.array([1, 2, 3, 9])
>>> label = ['a', 'b', 'b', 'a']
@@ -96,18 +96,19 @@ def group_mean(arr, label, order=None, int axis=0):
(array([ 5. , 2.5]), ['a', 'b'])
"""
- func, arr, label_dict, order = group_mean_selector(arr, label, order, axis)
+ func, arr, label_dict, order = group_mean_selector(arr, label, order,
+ axis)
return func(arr, label_dict, order)
def group_mean_selector(arr, label, order=None, int axis=0):
"""
Group mean function, array, and label mapper to use for specified problem.
- Under the hood dsna uses a separate Cython function for each combination
- of ndim, dtype, and axis. A lot of the overhead in ds.group_mean() is in
- checking that `axis` is within range, converting `arr` into an array (if
- it is not already an array), and selecting the function to use to
- calculate the group mean.
+ Under the hood Bottleneck uses a separate Cython function for each
+ combination of ndim, dtype, and axis. A lot of the overhead in
+ bn.group_mean() is in checking that `axis` is within range, converting
+ `arr` into an array (if it is not already an array), and selecting the
+ function to use to calculate the group mean.
You can get rid of the overhead by doing all this before you, for example,
enter an inner loop, by using the this function.
@@ -154,7 +155,7 @@ def group_mean_selector(arr, label, order=None, int axis=0):
Create a numpy array:
>>> arr = np.array([1.0, 2.0, 3.0])
- >>> from dsna.group import group_mean_selector
+ >>> from bottleneck.group import group_mean_selector
Obtain the function, etc. needed to determine the group mean of `arr`
along axis=0:
10 dsna/src/group/setup.py → bottleneck/src/group/setup.py
@@ -1,15 +1,15 @@
"""
Use to convert move.pyx to a C file.
-This setup.py is NOT used to install the DSNA package. The DSNA setup.py
-file is dsna/setup.py
+This setup.py is NOT used to install the Bottleneck package. The Bottleneck
+setup.py file is bottleneck/setup.py
-The C files are distributed with dsna, so this file is only useful if you
-modify group_sum.pyx or group_std.pyx or ...
+The C files are distributed with bottleneck, so this file is only useful if
+you modify group_sum.pyx or group_std.pyx or ...
To convert from cython to C:
-$ cd dsna/dsna/src
+$ cd bottleneck/bottleneck/src
$ python group/setup.py build_ext --inplace
"""
506 dsna/src/move/move.c → bottleneck/src/move/move.c
253 additions, 253 deletions not shown
0  dsna/src/move/move.pyx → bottleneck/src/move/move.pyx
File renamed without changes
0  dsna/src/move/move_header.pyx → bottleneck/src/move/move_header.pyx
File renamed without changes
14 dsna/src/move/move_sum.pyx → bottleneck/src/move/move_sum.pyx
@@ -34,7 +34,7 @@ def move_sum(arr, int window, int axis=0):
Examples
--------
>>> arr = np.array([1.0, 2.0, 3.0, 4.0])
- >>> ds.mov_sum(arr, window=2, axis=0)
+ >>> bn.mov_sum(arr, window=2, axis=0)
array([ NaN, 3., 5., 7.])
"""
@@ -45,11 +45,11 @@ def move_sum_selector(arr, int window, int axis):
"""
Return move_sum function and array that matches `arr` and `axis`.
- Under the hood dsna uses a separate Cython function for each combination
- of ndim, dtype, and axis. A lot of the overhead in ds.move_sum() is in
- checking that `axis` is within range, converting `arr` into an array (if
- it is not already an array), and selecting the function to use to
- calculate the moving sum.
+ Under the hood Bottleneck uses a separate Cython function for each
+ combination of ndim, dtype, and axis. A lot of the overhead in
+ bn.move_sum() is in checking that `axis` is within range, converting
+ `arr` into an array (if it is not already an array), and selecting the
+ function to use to calculate the moving sum.
You can get rid of the overhead by doing all this before you, for example,
enter an inner loop, by using this function.
@@ -81,7 +81,7 @@ def move_sum_selector(arr, int window, int axis):
Obtain the function needed to determine the moving sum of `arr` along axis=0:
>>> window, axis = 2, 0
- >>> func, a = ds.move.move_sum_selector(arr, window=2, axis=0)
+ >>> func, a = bn.move.move_sum_selector(arr, window=2, axis=0)
<built-in function move_sum_1d_float64_axis0>
Use the returned function and array to determine the moving sum:
10 dsna/src/move/setup.py → bottleneck/src/move/setup.py
@@ -1,15 +1,15 @@
"""
Use to convert move.pyx to a C file.
-This setup.py is NOT used to install the DSNA package. The DSNA setup.py
-file is dsna/setup.py
+This setup.py is NOT used to install the Bottleneck package. The Bottleneck
+setup.py file is bottleneck/setup.py
-The C files are distributed with dsna, so this file is only useful if you
-modify move_sum.pyx or move_std.pyx or ...
+The C files are distributed with bottleneck, so this file is only useful if
+you modify move_sum.pyx or move_std.pyx or ...
To convert from cython to C:
-$ cd dsna/dsna/src
+$ cd bottleneck/bottleneck/src
$ python move/setup.py build_ext --inplace
"""
0  dsna/bench/__init__.py → bottleneck/testing/__init__.py
File renamed without changes
0  dsna/testing/group_validator.py → bottleneck/testing/group_validator.py
File renamed without changes
0  dsna/testing/move_validators.py → bottleneck/testing/move_validators.py
File renamed without changes
20 dsna/tests/func_test.py → bottleneck/tests/func_test.py
@@ -5,7 +5,7 @@
from numpy.testing import (assert_equal, assert_array_equal, assert_raises,
assert_array_almost_equal)
nan = np.nan
-import dsna as ds
+import bottleneck as bn
def arrays(dtypes=['int32', 'int64', 'float64']):
@@ -34,7 +34,7 @@ def arrays(dtypes=['int32', 'int64', 'float64']):
yield -a
def unit_maker(func, func0, decimal=np.inf):
- "Test that ds.xxx gives the same output as np.."
+ "Test that bn.xxx gives the same output as np.."
msg = '\nfunc %s | input %s (%s) | shape %s | axis %s\n'
msg += '\nInput array:\n%s\n'
for i, arr in enumerate(arrays()):
@@ -56,27 +56,27 @@ def unit_maker(func, func0, decimal=np.inf):
def test_sum():
"Test sum."
- yield unit_maker, ds.sum, np.nansum
+ yield unit_maker, bn.sum, np.nansum
def test_max():
"Test max."
- yield unit_maker, ds.max, np.nanmax
+ yield unit_maker, bn.max, np.nanmax
def test_min():
"Test min."
- yield unit_maker, ds.min, np.nanmin
+ yield unit_maker, bn.min, np.nanmin
def test_mean():
"Test mean."
- yield unit_maker, ds.mean, sp.nanmean, 13
+ yield unit_maker, bn.mean, sp.nanmean, 13
def test_std():
"Test std."
- yield unit_maker, ds.std, scipy_nanstd
+ yield unit_maker, bn.std, scipy_nanstd
def test_var():
"Test var."
- yield unit_maker, ds.var, scipy_nanstd_squared, 13
+ yield unit_maker, bn.var, scipy_nanstd_squared, 13
# ---------------------------------------------------------------------------
# Check that exceptions are raised
@@ -88,7 +88,7 @@ def test_max_size_zero():
for shape in shapes:
for dtype in dtypes:
a = np.zeros(shape, dtype=dtype)
- assert_raises(ValueError, ds.max, a)
+ assert_raises(ValueError, bn.max, a)
assert_raises(ValueError, np.nanmax, a)
def test_min_size_zero():
@@ -98,7 +98,7 @@ def test_min_size_zero():
for shape in shapes:
for dtype in dtypes:
a = np.zeros(shape, dtype=dtype)
- assert_raises(ValueError, ds.min, a)
+ assert_raises(ValueError, bn.min, a)
assert_raises(ValueError, np.nanmin, a)
# ---------------------------------------------------------------------------
10 dsna/tests/group_test.py → bottleneck/tests/group_test.py
@@ -5,8 +5,8 @@
assert_array_almost_equal)
nan = np.nan
from scipy.stats import nanmean
-import dsna as ds
-from dsna.testing.group_validator import group_func
+import bottleneck as bn
+from bottleneck.testing.group_validator import group_func
def array_iter(dtypes=['float64']):
@@ -31,7 +31,7 @@ def array_iter(dtypes=['float64']):
yield -a
def label_iter(n):
- "Iterator that yields a variety of labels of given length"
+ "Iterator that yields a variety of labels of given length"
dtypes = ['int32', 'int64', 'float64', 'str']
for dtype in dtypes:
label0 = np.ones(n, dtype=dtype)
@@ -54,7 +54,7 @@ def label_iter(n):
yield label[::-1].tolist()
def unit_maker(func, func0, decimal=np.inf):
- "Test that ds.xxx gives the same output as a reference function."
+ "Test that bn.xxx gives the same output as a reference function."
msg = "\nfunc %s | input %s (%s) | shape %s | axis %s\n"
msg += "\nInput array:\n%s\n"
msg += "\nLabel (%s):\n%s\n"
@@ -85,4 +85,4 @@ def unit_maker(func, func0, decimal=np.inf):
def test_group_mean():
"Test group_mean."
- yield unit_maker, ds.group_mean, nanmean, 13
+ yield unit_maker, bn.group_mean, nanmean, 13
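Note on the reference used in `test_group_mean`: the test compares `bn.group_mean` against a slower reference built from `nanmean`. The body of `bottleneck/testing/group_validator.py` is not shown in this diff, so the following is only a rough sketch of what such a reference might look like. The name `group_mean_reference`, the sorted ordering of the returned labels, and the omission of the `order` argument are assumptions made for illustration, not the package's actual code.

```python
import numpy as np

def group_mean_reference(arr, label, axis=0):
    """Slow, NaN-aware group mean (illustrative sketch, not Bottleneck's code)."""
    arr = np.asarray(arr, dtype=np.float64)
    label = list(label)
    if len(label) != arr.shape[axis]:
        raise ValueError("label must have one entry per element along axis")
    # Assumption: groups are returned in sorted label order.
    unique = sorted(set(label))
    # Bring the grouping axis to the front so rows can be picked by position.
    a = np.moveaxis(arr, axis, 0)
    out = np.empty((len(unique),) + a.shape[1:])
    for k, lab in enumerate(unique):
        rows = [i for i, x in enumerate(label) if x == lab]
        block = a[rows]
        count = (~np.isnan(block)).sum(axis=0)   # non-NaN elements per group
        total = np.nansum(block, axis=0)         # NaNs contribute zero
        with np.errstate(invalid="ignore", divide="ignore"):
            out[k] = total / count               # 0/0 gives NaN for all-NaN groups
    return np.moveaxis(out, 0, axis), unique

means, labels = group_mean_reference([1, 2, np.nan, 4, 5],
                                     ['a', 'a', 'b', 'b', 'a'])
```

With the array and labels from the README example, group 'a' averages 1, 2 and 5 (about 2.667) and group 'b' averages only the 4, since the NaN is skipped.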
8 dsna/tests/move_test.py → bottleneck/tests/move_test.py
@@ -4,8 +4,8 @@
from numpy.testing import (assert_equal, assert_array_equal, assert_raises,
assert_array_almost_equal)
nan = np.nan
-import dsna as ds
-from dsna.testing.move_validators import move_sum as alt_move_sum
+import bottleneck as bn
+from bottleneck.testing.move_validators import move_sum as alt_move_sum
def arrays(dtypes=['float64']):
@@ -30,7 +30,7 @@ def arrays(dtypes=['float64']):
yield -a
def unit_maker(func, func0, decimal=np.inf):
- "Test that ds.xxx gives the same output as a reference function."
+ "Test that bn.xxx gives the same output as a reference function."
msg = '\nfunc %s | window %d | input %s (%s) | shape %s | axis %s\n'
msg += '\nInput array:\n%s\n'
for i, arr in enumerate(arrays()):
@@ -57,4 +57,4 @@ def unit_maker(func, func0, decimal=np.inf):
def test_move_sum():
"Test move_sum."
- yield unit_maker, ds.move_sum, alt_move_sum
+ yield unit_maker, bn.move_sum, alt_move_sum
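Note on `alt_move_sum`: the validator in `bottleneck/testing/move_validators.py` is not part of this diff. As a hedged sketch of the behaviour the docstring and README examples describe (the first `window - 1` outputs are NaN, and NaNs inside a window are skipped), a plain NumPy reference could look like the code below. The name `move_sum_reference` is hypothetical, and returning NaN for a window that contains only NaNs is an assumption; the diff does not say what the real functions do in that case.

```python
import numpy as np

def move_sum_reference(arr, window, axis=0):
    """Slow, NaN-aware moving-window sum (illustrative sketch only)."""
    arr = np.asarray(arr, dtype=np.float64)
    if window < 1 or window > arr.shape[axis]:
        raise ValueError("window must be between 1 and arr.shape[axis]")
    # Work along the last axis, then move it back when returning.
    a = np.moveaxis(arr, axis, -1)
    out = np.full(a.shape, np.nan)   # leading window-1 entries stay NaN
    for i in range(window - 1, a.shape[-1]):
        win = a[..., i - window + 1:i + 1]
        total = np.nansum(win, axis=-1)
        # Assumption: a window holding only NaNs yields NaN, not 0.
        all_nan = np.isnan(win).all(axis=-1)
        out[..., i] = np.where(all_nan, np.nan, total)
    return np.moveaxis(out, -1, axis)

result = move_sum_reference([1.0, 2.0, np.nan, 4.0, 5.0], window=2)
```

For the README input `[1, 2, nan, 4, 5]` with `window=2` this gives NaN followed by 3, 2, 4 and 9, matching the output shown there.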
2  dsna/version.py → bottleneck/version.py
@@ -1,4 +1,4 @@
-"dsna version"
+"Bottleneck version"
# Format expected by setup.py and doc/source/conf.py: string of form "X.Y.Z"
__version__ = "0.1.0dev"
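Note on reading this file: both `setup.py` and `doc/source/conf.py` open `version.py` with the Python 2 builtin `file()` and then process the text by hand. A small sketch of an alternative that pulls out the "X.Y.Z" string with a regular expression is shown below. The helper name `read_version` is hypothetical and this is not the parsing code used in the commit.

```python
import re

def read_version(text):
    """Return the string assigned to __version__ in the given source text."""
    match = re.search(r'^__version__\s*=\s*["\']([^"\']+)["\']',
                      text, re.MULTILINE)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)

source = '"Bottleneck version"\n__version__ = "0.1.0dev"\n'
version = read_version(source)
major, minor, micro = version.split('.')
```

Splitting on the dots mirrors the `VER.split('.')` step visible in the `setup.py` hunk; note that the last component keeps its "dev" suffix.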
6 doc/doc_howto
@@ -3,13 +3,13 @@ Upload Sphinx docs to web server
================================
Godaddy doesn't allow rsync. So to upload Sphinx docs to
-http://berkeleyanalytics.com/dsna:
+http://berkeleyanalytics.com/bottleneck:
-In dsna/doc:
+In bottleneck/doc:
$ make clean
$ make html
Then to upload (it will ask for the password):
-$ scp -r doc/build/html/* username@berkeleyanalytics.com:html/dsna/
+$ scp -r doc/build/html/* username@berkeleyanalytics.com:html/bottleneck/
10 doc/source/conf.py
@@ -38,11 +38,11 @@
master_doc = 'index'
# General information about the project.
-project = u'dsna'
+project = u'Bottleneck'
copyright = u'2010, Archipel Asset Management AB'
-# Grab version from dsna/version.py
-ver_file = os.path.join('..', '..', 'dsna', 'version.py')
+# Grab version from bottleneck/version.py
+ver_file = os.path.join('..', '..', 'bottleneck', 'version.py')
fid = file(ver_file, 'r')
VER = fid.read()
fid.close()
@@ -185,7 +185,7 @@
#html_file_suffix = ''
# Output file base name for HTML help builder.
-htmlhelp_basename = 'dsnadoc'
+htmlhelp_basename = 'bottleneckdoc'
# -- Options for LaTeX output --------------------------------------------------
@@ -199,7 +199,7 @@
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass [howto/manual]).
latex_documents = [
- ('index', 'dsna.tex', u'DSNA Documentation',
+ ('index', 'bottleneck.tex', u'bottleneck Documentation',
u'Keith Goodman', 'manual'),
]
8 doc/source/index.rst
@@ -1,8 +1,8 @@
-====
-DSNA
-====
+==========
+Bottleneck
+==========
-Fast, NaN-aware descriptive statistics of NumPy arrays:
+Bottleneck is a collection of fast, NumPy array functions written in Cython.
.. toctree::
:maxdepth: 2
2  doc/source/license.rst
@@ -1 +1 @@
-.. include:: ../../dsna/LICENSE
+.. include:: ../../bottleneck/LICENSE
36 doc/source/reference.rst
@@ -2,19 +2,19 @@
Reference
=========
-Most of the functionality of the ``dsna`` package falls into three broad
+Most of the functionality of the ``bottleneck`` package falls into three broad
categories: functions, moving window functions, and group-by functions.
-DSNA provides the following functions:
+Bottleneck provides the following functions:
-========================== ================================= =====================================
-:meth:`sum <dsna.sum>` :meth:`move_sum <dsna.move_sum>`
-:meth:`mean <dsna.mean>` :meth:`group_mean <dsna.group_mean>`
-:meth:`var <dsna.var>`
-:meth:`std <dsna.std>`
-:meth:`min <dsna.min>`
-:meth:`max <dsna.max>`
-========================== ================================= =====================================
+============================== ====================================== ==========================================
+:meth:`sum <bottleneck.sum>`   :meth:`move_sum <bottleneck.move_sum>`
+:meth:`mean <bottleneck.mean>`                                        :meth:`group_mean <bottleneck.group_mean>`
+:meth:`var <bottleneck.var>`
+:meth:`std <bottleneck.std>`
+:meth:`min <bottleneck.min>`
+:meth:`max <bottleneck.max>`
+============================== ====================================== ==========================================
Functions
@@ -22,27 +22,27 @@ Functions
------------
-.. autofunction:: dsna.sum
+.. autofunction:: bottleneck.sum
------------
-.. autofunction:: dsna.mean
+.. autofunction:: bottleneck.mean
------------
-.. autofunction:: dsna.var
+.. autofunction:: bottleneck.var
------------
-.. autofunction:: dsna.std
+.. autofunction:: bottleneck.std
------------
-.. autofunction:: dsna.min
+.. autofunction:: bottleneck.min
------------
-.. autofunction:: dsna.max
+.. autofunction:: bottleneck.max
@@ -51,7 +51,7 @@ Moving window functions
------------
-.. autofunction:: dsna.move_sum
+.. autofunction:: bottleneck.move_sum
Group functions
@@ -59,4 +59,4 @@ Group functions
------------
-.. autofunction:: dsna.group_mean
+.. autofunction:: bottleneck.group_mean
205 setup.py
@@ -15,56 +15,55 @@
"Programming Language :: Python",
"Topic :: Scientific/Engineering"]
-description = "Fast, NaN-aware descriptive statistics of NumPy arrays."
+description = "Fast, NumPy array functions written in Cython"
long_description = """
-DSNA uses the magic of Cython to give you fast, NaN-aware descriptive
-statistics of NumPy arrays.
+Bottleneck is a collection of fast, NumPy array functions written in Cython.
-The functions in dsna fall into three categories:
+The three categories of Bottleneck functions:
-=============== ===============================
- General sum(arr, axis=None)
- Moving window move_sum(arr, window, axis=0)
- Group by group_sum(arr, label, axis=0)
-=============== ===============================
+- Faster, drop-in replacement for NaN functions in NumPy and SciPy
+- Moving window functions
+- Group functions that bin calculations by like-labeled elements
-For example, create a NumPy array::
+Function signatures (using mean as an example):
+
+=============== ============================================
+ NaN functions mean(arr, axis=None)
+ Moving window move_mean(arr, window, axis=0)
+ Group by group_mean(arr, label, order=None, axis=0)
+=============== ============================================
+
+Let's give it a try. Create a NumPy array::
>>> import numpy as np
>>> arr = np.array([1, 2, np.nan, 4, 5])
-Then find the sum::
+Find the sum::
- >>> import dsna as ds
- >>> ds.sum(arr)
+ >>> import bottleneck as bn
+ >>> bn.sum(arr)
12.0
Moving window sum::
- >>> ds.move_sum(arr, window=2)
+ >>> bn.move_sum(arr, window=2)
array([ nan, 3., 2., 4., 9.])
-Group sum::
+Group mean::
>>> label = ['a', 'a', 'b', 'b', 'a']
- >>> a, lab = ds.group_sum(arr, label)
- >>> a
- array([ 8., 4.])
- >>> lab
- ['a', 'b']
+ >>> bn.group_mean(arr, label)
+ (array([ 2.66666667, 4. ]), ['a', 'b'])
Fast
====
-DNSA is fast::
+Bottleneck is fast::
- >>> import dsna as ds
- >>> import numpy as np
- >>> arr = np.random.rand(100, 100)
-
+ >>> arr = np.random.rand(100, 100)
>>> timeit np.nansum(arr)
10000 loops, best of 3: 68.4 us per loop
- >>> timeit ds.sum(arr)
+ >>> timeit bn.sum(arr)
100000 loops, best of 3: 17.7 us per loop
Let's not forget to add some NaNs::
@@ -72,70 +71,164 @@
>>> arr[arr > 0.5] = np.nan
>>> timeit np.nansum(arr)
1000 loops, best of 3: 417 us per loop
- >>> timeit ds.sum(arr)
+ >>> timeit bn.sum(arr)
10000 loops, best of 3: 64.8 us per loop
+Bottleneck comes with a benchmark suite that compares the performance of
+Bottleneck functions with their NumPy/SciPy equivalents. To run the
+benchmark::
+
+ >>> bn.benchit(verbose=False)
+ Bottleneck performance benchmark
+ Bottleneck 0.1.0dev
+ Numpy 1.5.1
+ Scipy 0.8.0
+ Speed is numpy (or scipy) time divided by Bottleneck time
+ NaN means all NaNs
+ Speed Test Shape dtype NaN?
+ 4.8103 nansum(a, axis=-1) (500,500) int64
+ 5.1392 nansum(a, axis=-1) (10000,) float64
+ 7.1373 nansum(a, axis=-1) (500,500) int32
+ 6.0882 nansum(a, axis=-1) (500,500) float64
+ 7.7081 nansum(a, axis=-1) (10000,) int32
+ 2.1392 nansum(a, axis=-1) (10000,) int64
+ 9.8542 nansum(a, axis=-1) (500,500) float64 NaN
+ 7.9069 nansum(a, axis=-1) (10000,) float64 NaN
+ 5.1859 nanmax(a, axis=-1) (500,500) int64
+ 9.5304 nanmax(a, axis=-1) (10000,) float64
+ 0.1392 nanmax(a, axis=-1) (500,500) int32
+ 10.8645 nanmax(a, axis=-1) (500,500) float64
+ 2.4558 nanmax(a, axis=-1) (10000,) int32
+ 3.2855 nanmax(a, axis=-1) (10000,) int64
+ 9.6748 nanmax(a, axis=-1) (500,500) float64 NaN
+ 8.3101 nanmax(a, axis=-1) (10000,) float64 NaN
+ 5.1828 nanmin(a, axis=-1) (500,500) int64
+ 6.8145 nanmin(a, axis=-1) (10000,) float64
+ 0.1349 nanmin(a, axis=-1) (500,500) int32
+ 7.6657 nanmin(a, axis=-1) (500,500) float64
+ 2.4619 nanmin(a, axis=-1) (10000,) int32
+ 3.2942 nanmin(a, axis=-1) (10000,) int64
+ 9.7377 nanmin(a, axis=-1) (500,500) float64 NaN
+ 8.3564 nanmin(a, axis=-1) (10000,) float64 NaN
+ 20.7414 nanmean(a, axis=-1) (500,500) int64
+ 13.0027 nanmean(a, axis=-1) (10000,) float64
+ 19.1651 nanmean(a, axis=-1) (500,500) int32
+ 13.3462 nanmean(a, axis=-1) (500,500) float64
+ 18.1296 nanmean(a, axis=-1) (10000,) int32
+ 18.9846 nanmean(a, axis=-1) (10000,) int64
+ 53.6566 nanmean(a, axis=-1) (500,500) float64 NaN
+ 23.0624 nanmean(a, axis=-1) (10000,) float64 NaN
+ 6.8075 nanstd(a, axis=-1) (500,500) int64
+ 9.0953 nanstd(a, axis=-1) (10000,) float64
+ 7.2786 nanstd(a, axis=-1) (500,500) int32
+ 11.1632 nanstd(a, axis=-1) (500,500) float64
+ 5.9248 nanstd(a, axis=-1) (10000,) int32
+ 5.2482 nanstd(a, axis=-1) (10000,) int64
+ 89.4077 nanstd(a, axis=-1) (500,500) float64 NaN
+ 27.0319 nanstd(a, axis=-1) (10000,) float64 NaN
+
Faster
======
-Under the hood dsna uses a separate Cython function for each combination of
-ndim, dtype, and axis. A lot of the overhead in ds.max, for example, is
-in checking that your axis is within range, converting non-array data to an
+Under the hood Bottleneck uses a separate Cython function for each combination
+of ndim, dtype, and axis. A lot of the overhead in bn.max(), for example, is
+in checking that the axis is within range, converting non-array data to an
array, and selecting the function to use to calculate the maximum.
You can get rid of the overhead by doing all this before you, say, enter
an inner loop::
>>> arr = np.random.rand(10,10)
- >>> func, a = ds.func.max_selector(arr, axis=0)
+ >>> func, a = bn.func.max_selector(arr, axis=0)
>>> func
<built-in function max_2d_float64_axis0>
-Let's see how much faster than runs::
+Let's see how much faster that runs::
>> timeit np.nanmax(arr, axis=0)
10000 loops, best of 3: 25.7 us per loop
- >> timeit ds.max(arr, axis=0)
+ >> timeit bn.max(arr, axis=0)
100000 loops, best of 3: 5.25 us per loop
>> timeit func(a)
100000 loops, best of 3: 2.5 us per loop
-Note that ``func`` is faster than the Numpy's non-nan version of max::
+Note that ``func`` is faster than Numpy's non-NaN version of max::
>> timeit arr.max(axis=0)
100000 loops, best of 3: 3.28 us per loop
-So adding NaN protection to your inner loops has a negative cost!
+So adding NaN protection to your inner loops comes at a negative cost!
Functions
=========
-DSNA is in the prototype stage.
+Bottleneck is in the prototype stage.
-DSNA contains the following functions (an asterisk means not yet complete):
+Bottleneck contains the following functions:
========= ============== ===============
-sum* move_sum* group_sum*
-mean move_mean* group_mean*
-var move_var* group_var*
-std move_std* group_std*
-min move_min* group_min*
-max move_max* group_max*
-median* move_median* group_median*
-zscore* move_zscore* group_zscore*
-ranking* move_ranking* group_ranking*
-quantile* move_quantile* group_quantile*
-count* move_count* group_count*
+sum       move_sum
+mean                     group_mean
+var
+std
+min
+max
========= ============== ===============
Currently only 1d, 2d, and 3d NumPy arrays with dtype int32, int64, and
float64 are supported.
-No long description yet.
+License
+=======
+
+Bottleneck is distributed under a Simplified BSD license. Parts of NumPy,
+Scipy and numpydoc, all of which have BSD licenses, are included in
+Bottleneck. See the LICENSE file, which is distributed with Bottleneck, for
+details.
+
+Install
+=======
+
+You can grab Bottleneck from http://github.com/kwgoodman/bottleneck
+
+**GNU/Linux, Mac OS X, et al.**
+
+To install Bottleneck::
+
+ $ python setup.py build
+ $ sudo python setup.py install
+
+Or, if you wish to specify where Bottleneck is installed, for example inside
+``/usr/local``::
+
+ $ python setup.py build
+ $ sudo python setup.py install --prefix=/usr/local
+
+**Windows**
+
+In order to compile the C code in Bottleneck you need a Windows version of the
+gcc compiler. MinGW (Minimalist GNU for Windows) contains gcc and has been used
+to successfully compile Bottleneck on Windows.
+
+Install MinGW and add it to your system path. Then install Bottleneck with the
+commands::
+
+ python setup.py build --compiler=mingw32
+ python setup.py install
+
+**Post install**
+
+After you have installed Bottleneck, run the suite of unit tests::
+
+ >>> import bottleneck as bn
+ >>> bn.test()
+ <snip>
+ Ran 10 tests in 13.756s
+ OK
+ <nose.result.TextTestResult run=10 errors=0 failures=0>
"""
# Get la version
-ver_file = os.path.join('dsna', 'version.py')
+ver_file = os.path.join('bottleneck', 'version.py')
fid = file(ver_file, 'r')
VER = fid.read()
fid.close()
@@ -144,7 +237,7 @@
VER = VER.strip("\"")
VER = VER.split('.')
-NAME = 'dsna'
+NAME = 'Bottleneck'
MAINTAINER = "Keith Goodman"
MAINTAINER_EMAIL = ""
DESCRIPTION = description
@@ -161,9 +254,9 @@
MICRO = VER[2]
ISRELEASED = False
VERSION = '%s.%s.%s' % (MAJOR, MINOR, MICRO)
-PACKAGES = ["dsna", "dsna/src", "dsna/src/func", "dsna/tests",
- "dsna/bench"]
-PACKAGE_DATA = {'dsna': ['LICENSE']}
+PACKAGES = ["bottleneck", "bottleneck/src", "bottleneck/src/func",
+ "bottleneck/tests", "bottleneck/bench"]
+PACKAGE_DATA = {'bottleneck': ['LICENSE']}
REQUIRES = ["numpy"]
@@ -183,7 +276,7 @@
packages=PACKAGES,
package_data=PACKAGE_DATA,
requires=REQUIRES,
- ext_modules = [Extension("dsna.func",
- sources=["dsna/src/func/func.c"],
+ ext_modules = [Extension("bottleneck.func",
+ sources=["bottleneck/src/func/func.c"],
include_dirs=[numpy.get_include()])]
)
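Note on the selector pattern described in the "Faster" section of the README text above: the idea is to pay for input conversion, axis checking and kernel lookup once, then call the selected kernel directly inside a hot loop. The sketch below imitates that with a plain dictionary keyed on (ndim, dtype, axis). Everything here is hypothetical: the kernels are ordinary NumPy calls standing in for the compiled Cython functions, and `max_selector_sketch` is not the implementation of `bn.func.max_selector`.

```python
import numpy as np

# Hypothetical stand-ins for the compiled per-case kernels.
def max_1d_float64_axis0(a):
    return np.nanmax(a)

def max_2d_float64_axis0(a):
    return np.nanmax(a, axis=0)

def max_2d_float64_axis1(a):
    return np.nanmax(a, axis=1)

# One entry per supported (ndim, dtype name, axis) combination.
KERNELS = {
    (1, "float64", 0): max_1d_float64_axis0,
    (2, "float64", 0): max_2d_float64_axis0,
    (2, "float64", 1): max_2d_float64_axis1,
}

def max_selector_sketch(arr, axis=0):
    """Return (kernel, array) so the checks below run only once."""
    a = np.asarray(arr)              # convert non-array input up front
    if axis < 0:
        axis += a.ndim
    if not 0 <= axis < a.ndim:
        raise ValueError("axis out of range")
    key = (a.ndim, a.dtype.name, axis)
    if key not in KERNELS:
        raise TypeError("no kernel for ndim=%d, dtype=%s, axis=%d" % key)
    return KERNELS[key], a

arr = np.random.rand(10, 10)
func, a = max_selector_sketch(arr, axis=0)   # setup cost paid here
for _ in range(1000):
    result = func(a)                         # inner loop calls the kernel directly
```

In this pure-Python form the saving is small, because the kernels themselves still go through NumPy's generic machinery. The timings quoted in the README come from the compiled kernels, which this sketch does not reproduce.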