
fix 1.9 alignment issues #5316

Merged
merged 2 commits into from Apr 22, 2015

Conversation

juliantaylor
Contributor

reduce maximum required alignment to 8 bytes on 32 bit (but not on sparc) and fix the alignment flag of string arrays; see commit messages for details

@juliantaylor
Contributor Author

@cournape can you test this on win32? It works on linux32 with -malign-double, so it should work on windows too.
I'll try to test sparc, though the situation there should be unchanged (longdouble f2py doesn't work).

@juliantaylor
Contributor Author

should fix scipy/scipy#4168

@@ -19,7 +20,11 @@
* amd64 is not harmed much by the bloat as the system provides 16 byte
* alignment by default.
*/
#if (defined NPY_CPU_X86 || defined _WIN32) && !defined NPY_CPU_SPARC
Member

Is the check on NPY_CPU_SPARC needed? Not that it hurts...

Contributor Author

I think sparc will bus error without it, but I'll test that again

@njsmith
Member

njsmith commented Nov 26, 2014

@juliantaylor: Would it be possible to take this opportunity to write a few paragraphs on what the ALIGNED flag actually means, how it's used, and the constraints it has to satisfy? Partly to make reviewing this easier but mostly so we can avoid so much confusion if it breaks later?

@juliantaylor
Contributor Author

Note that this makes most flexible types aligned again, regardless of whether they actually are.
It is not clear what the alignment flag means for flexible arrays. E.g. you could have (S3, i4, S1); what should an aligned flag of true mean here?
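
To make the question concrete, such a dtype can be built as follows (a hypothetical illustration; the reported itemsize and alignment depend on NumPy version and platform):

import numpy as np

# a packed struct mixing string and integer fields; no single alignment
# value makes every field aligned
dt = np.dtype([('a', 'S3'), ('b', '<i4'), ('c', 'S1')])
print(dt.itemsize, dt.alignment)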

If itemsize is a power of two, use that as the required alignment, up to
the maximum provided by the platform. Power-of-two sizes may be accessed
via moves larger than a byte.
Non-power-of-two sizes are accessed bytewise and can thus always be
considered aligned.
Closes numpygh-5293

malloc only provides 8 byte alignment, which is sufficient to load complex
on x86 platforms.
This fixes the f2py alignment failures with complex types on win32 or on
linux32 with -malign-double.
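
A minimal Python sketch of the rule in the first commit message, assuming an 8 byte platform cap (the real logic lives in NumPy's C sources):

NPY_MAX_COPY_ALIGNMENT = 8  # assumed platform maximum, for illustration

def required_copy_alignment(itemsize):
    # power-of-two itemsizes may be loaded with moves wider than a byte,
    # so they need the corresponding alignment (capped by the platform)
    if itemsize > 0 and (itemsize & (itemsize - 1)) == 0:
        return min(itemsize, NPY_MAX_COPY_ALIGNMENT)
    # non-power-of-two itemsizes are copied bytewise: any address works
    return 1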
@charris
Member

charris commented Nov 26, 2014

There is an alignment flag for the flexible dtype creation that aligns the structure. Not sure how that interacts with the rest of it except that if columns are extracted the addresses will check as aligned.

@njsmith
Member

njsmith commented Nov 26, 2014

What I would expect to happen:

When creating a structured dtype, if you pass align=True then you get the equivalent of a C struct with the same members, i.e., each member has padding added before it to ensure that it falls on its own required alignment (assuming that the struct as a whole is at least that aligned), and then the struct gets extra padding tacked on the end so that its size is a multiple of the most restrictive alignment of any of its members. That way, a contiguous array of such structs is aligned iff the first item in the array is as aligned as required by the most restrictive member of the array. Then the dtype should export some info saying how aligned it needs to be for everything in it to be aligned. (This should of course not be dependent upon whether align=True was passed.)

Eg:

  • dtype("S3,i4", align=True) -> itemsize = 8, required alignment = 4
  • dtype("i4,i2", align=True) -> itemsize = 8, required alignment = 4
  • dtype("i2,i1", align=True) -> itemsize = 4, required alignment = 2
  • dtype("i2,i2", align=False) -> itemsize = 4, required alignment = 2 (we got lucky even though align=False
  • dtype("i2,i1", align=False) -> itemsize = 3, required alignment = 2 (both entries are in fact aligned if the struct is aligned)
  • dtype("i1,i2", align=False) -> itemsize = 3, required alignment = haha you're joking right

And then I'd expect that the overall array's ALIGNED flag should be set if: arr.size == 0 or all((arr.strides % arr.dtype.alignment == 0) | (arr.shape <= 1)) (pretending strides and shapes support broadcasting, and that % "haha you're joking right" is never zero). I think that handles all the special cases.
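
A sketch of that rule as a Python helper (hypothetical; like the expression above it checks only the strides, and it assumes dtype.alignment reports the true required alignment):

def expected_aligned_flag(arr):
    # empty arrays are trivially aligned; otherwise each stride must be a
    # multiple of the required alignment, except along dimensions of
    # length <= 1 where the stride is never dereferenced
    align = arr.dtype.alignment
    return arr.size == 0 or all(
        stride % align == 0 or dim <= 1
        for stride, dim in zip(arr.strides, arr.shape)
    )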

However:

  • I don't know what ALIGNED is supposed to mean so I don't know if this is correct.
  • Right now, structured dtypes with align=False always claim their required alignment = 1 which is just wrong -- it means they claim to be aligned no matter what (which is what you meant in the comment above I guess?), even if they actually need some alignment (as in the "i2,i2" case) or if no alignment is possible (as in the "i1,i2" case).
  • Right now, structured dtypes with align=True don't add any padding at the end of the struct. (E.g., dtype("i4,i2", align=True) has itemsize=6, which is weird.) In practice this means that even so-called aligned structured dtypes will generally become unaligned as soon as you have two of them in the same array.

What a mess.

@charris
Member

charris commented Nov 30, 2014

@cournape Have you had a chance to test this?

@cournape
Member

@charris I have tested something slightly different (though it has the same end result) in cournape/numpy@9fdcc60

We could make the factor 2 for complex conditional on the platform, though. I have some issues accessing my build setup from here, I will have more results for every platform tomorrow.

@cournape
Member

cournape commented Dec 1, 2014

Ok, so with my patch cournape/numpy@9fdcc60, I get no errors for the numpy test suite, and only "minor" precision test failures for scipy.

Platforms tested (numpy 1.9.1 + my patch, scipy 0.14.x, MKL 10.3 on every platform):

  • Centos 5.9 with gcc 4.4, 32 and 64 bits
  • Windows 7, MSVC 2008 and ifort
  • OS X 10.6 with gcc-4.2, gfortran

I see a non-trivial amount of failures with other packages, but nothing that seems alignment-related.

@charris
Member

charris commented Dec 3, 2014

@juliantaylor Any reason to prefer one patch or the other? @cournape It looks like you don't test SPARC, is that the case?

@cournape
Member

cournape commented Dec 3, 2014

@charris I don't have access to sparc. If necessary, I could try to install debian on qemu sparc I guess, though I would rather not.

From a C standard POV, I would say removing the factor 2 in alignment as I do is the most obvious fix (complex<T> and T have the same alignment), but I don't really care either way.

@njsmith
Member

njsmith commented Dec 4, 2014

@matthew-brett seems to be the go-to guy for testing things on sparc. I hereby summon him to this thread!


@cournape
Member

cournape commented Dec 4, 2014

@njsmith I actually contacted him already :) I tried to install debian on sparc on qemu, but that did not go too well (and it took > 10 hours...)

@juliantaylor
Contributor Author

This patch is tested on sparc; I can probably test David's version later this week.

@matthew-brett
Contributor

The SPARC I have access to belongs to Yarik Halchenko and the NeuroDebian project.

I have emailed him about access for y'all, but the machine seems to be down at the moment.

@pv
Member

pv commented Dec 11, 2014

Part of Numpy assumes descr->alignment means something that is completely internal to Numpy (alignment to be used in internal copy loops), whereas another part assumes it is the alignment decided by the C compiler --- e.g. np.dtype(..., align=True) and the PEP3118 buffer interface code. As descr->alignment is also a public API, it is fairly confusing if it means something that has no meaning outside Numpy...

Btw, the doubling of alignment of complex was done already in Numpy 1.8.x:

>>> np.__version__
'1.7.2'
>>> np.dtype([('a', np.int8), ('b', np.complex64)], align=True)
dtype({'names':['a','b'], 'formats':['i1','<c8'], 'offsets':[0,4], 'itemsize':12}, align=True)
>>> np.__version__
'1.8.2'
>>> np.dtype([('a', np.int8), ('b', np.complex64)], align=True)
dtype({'names':['a','b'], 'formats':['i1','<c8'], 'offsets':[0,8], 'itemsize':16}, align=True)

The former is compatible with the C compiler, the latter is not. IIRC this doesn't properly pad the end of the struct, so it is probably not so widely used. The new buffer interface with aligned structs containing complex members is probably also fairly rarely used...
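
For reference, the offsets can be inspected directly; a hedged illustration (the printed values depend on the NumPy version, per the two transcripts above):

import numpy as np

dt = np.dtype([('a', np.int8), ('b', np.complex64)], align=True)
# dtype.fields maps field name -> (field dtype, byte offset)
print(dt.fields['b'][1])  # 4 under C-compatible rules, 8 with doubled complex alignment
print(dt.itemsize)        # 12 vs. 16 correspondingly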

@stonebig

Hi all,

Is there a release schedule for numpy 1.9.2?
(January 2015 at the latest would be nice)

@charris
Member

charris commented Jan 19, 2015

@juliantaylor @cournape @pv et al. Time to settle this, we now have three alignment PRs, this, #5365, and #5457. I don't understand the bento build failure in #5365. I'm tempted to just apply this, but Pauli's comment #5365 (comment) looks interesting.

@charris
Member

charris commented Jan 19, 2015

@juliantaylor Is this rebased for backport?

@juliantaylor
Contributor Author

The bento failure is a real bug, namely the misalignment crash which is the reason for all this mess.

FWIW, I have yet another branch which changes the copy loops to check the alignment instead of using the dtype flag. It's almost ready; I think only the raw_assign* stuff still needs checking. Unfortunately, that code duplicates the functionality of all the others in a slightly different way.
See https://github.com/juliantaylor/numpy/tree/fix-copy-alignment
I think that branch is probably a bit much to backport; this one would probably be safer.
It needs an explicit backport, as the same lines of code have been touched in the branch without a rebase.
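
Checking alignment at copy time rather than trusting the flag could look roughly like this Python analogue of such a C check (a hypothetical helper, for illustration only):

def data_is_aligned(arr, alignment):
    # inspect the actual data pointer at runtime instead of consulting
    # the cached NPY_ARRAY_ALIGNED flag
    return arr.ctypes.data % alignment == 0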

@charris
Member

charris commented Jan 19, 2015

OK, shall I do a backport and close this?

@juliantaylor
Contributor Author

Sure. Did you make a list of all the other bugs you merged into 1.10 which were not backported?
I am a bit burned out on stable release tracking, and it sometimes appears I am the only one of the devs who cares about the maintenance branch. We need to make some proper core-developer guidelines for bugfix commits.

alignment = NPY_MAX_COPY_ALIGNMENT;
npy_intp itemsize = PyArray_ITEMSIZE(ap);
/* power of two sizes may be loaded in larger moves */
if (((itemsize & (itemsize - 1)) == 0)) {
Contributor Author

This should probably be removed and just restored to the 1.8 state (always aligned). This might still cause issues, and I think I have a better solution in my other fix-attempt branch.

Member

Just replace common.c with the 1.8 version?

@charris
Member

charris commented Jan 19, 2015

I was thinking that the only thing that really matters is fixing the alignment problem, as folks are reluctant to use 1.9.1 until f2py and scipy work reliably. Currently, there are fixes for ATLAS 3.10 and xerbla in 1.9.x, and a couple of other minor bits, but I'd rather get 1.9.2 out and then do the first 1.10 in a couple of months. Current 1.9.x commits, minus merge commits:

commit 9568de1ff8c07a00d87256d8c42ab283adc1cc63
Author: Henning Dickten <hdickten@uni-bonn.de>
Date:   Wed Nov 26 17:19:19 2014 +0100

    BUG: Closes #5313 PyArray_AsCArray caused segfault

commit 1e052f387fccc0625e423e23b92a590d211e4a10
Author: Henning Dickten <hdickten@uni-bonn.de>
Date:   Mon Dec 1 00:25:50 2014 +0100

    TST: added test for PyArray_AsCArray #5313

commit f0b2dd7d5151878f2b4b3ea20ff551b27243f27d
Author: Charles Harris <charlesr.harris@gmail.com>
Date:   Wed Dec 24 11:26:13 2014 -0700

    BUG: Xerbla doesn't get linked in 1.9 on Fedora 21.

    Add our python_xerbla to the blasdot sources. That function is
    needed for all modules that link to the ATLAS 3.10 libraries, which
    in Fedora 21 are located in just two files: libsatlas and libtatlas.

    Also make the test for xerbla linkage work better. If xerbla is not
    linked the test will be skipped with a message.

commit dd0732e4bda8f4379b17ea479bcecc876ab50ce6
Author: Charles Harris <charlesr.harris@gmail.com>
Date:   Wed Dec 10 17:42:02 2014 -0700

    ENH: Add support for ATLAS > 3.9.33.

    Recent ATLAS combines the previous libraries into two

    * libsatlas -- single threaded.
    * libtatlas -- threaded.

    This fix is a bit of hack in that ATLAS > 3.9.33 is treated as a new,
    separate library covered by atlas_3_10_info, but the latter derived
    from atlas_info, which treats the cblas, atlas, and atlas_lapack
    libraries separately, so the new info has a bit of repetition.

    The alternative would be to rewrite atlas_info, but that can wait
    on a larger cleanup of the build system.

    Closes #3774.

commit 3b6217f3799f85686c1ae87bbdc5cb10ad4585ec
Author: Eric O. LEBIGOT (EOL) <eric.lebigot@normalesup.org>
Date:   Wed Dec 17 16:30:02 2014 +0800

    DOC: Fixed incorrect assert_array_almost_equal_nulp documentation

    The `max()` function previously used does not work with two array-like.
    `maximum()` does, and is what is essentially used in the code
    (`ref = nulp * np.spacing(np.where(ax > ay, ax, ay))`).

commit fee4bcbaf0eb1c7d5c5a0eadc2011be00370b335
Author: Sturla Molden <sturla@molden.no>
Date:   Tue Dec 23 17:25:27 2014 +0100

    BUG: make seed, randint and shuffle threadsafe

commit 52958884a2ff36737c018102c68f28d221dac613
Author: Sturla Molden <sturla@molden.no>
Date:   Tue Dec 23 16:19:38 2014 +0100

    BUG: make set_state and get_state threadsafe

commit 14a3dca63e7fac58a1311acf3e01ab548a2e7ea2
Author: Thomas A Caswell <tcaswell@gmail.com>
Date:   Tue Dec 9 14:39:48 2014 -0500

    DOC : minor changes to linspace docstring

     - added optional flag to dtype
     - moved conditional on step to the description from the type

commit e4c861f4c4e9c0e71904858d770661f264a86ba1
Author: Julian Taylor <jtaylor.debian@googlemail.com>
Date:   Wed Dec 10 19:31:56 2014 +0100

    REL: set version number to unreleased 1.9.2

@charris
Member

charris commented Jan 19, 2015

There are always going to be bugs. The Bugcount Also Rises.

@rgommers
Member

@juliantaylor that doesn't sound good, take it easy - a burnout is no fun.

You're certainly not the only one who cares about maintenance branches. You just switched to a much more labor-intensive way of updating those branches than we used before. I was used to just doing a big backport of appropriate fixes in one go, typically in preparation for a maintenance release. That has a couple of advantages:

  • less overhead with PRs, switching branches, etc.
  • letting fixes sit in master for a while makes sure they're right, and prevents double work when they're later changed again or reverted.
  • less chance of regressions in maintenance branches

Furthermore, we (you and Chuck mainly) recently changed to a significantly more aggressive backporting strategy, maybe even without realizing it, with the associated extra work. Previously we didn't backport all fixes, only ones deemed important enough.

Final comment: if we'd set the Milestone of each merged PR to the right release, then it's easy to get an overview of all fixes that got merged for a release. For Scipy this has proven to be useful.

@rgommers
Member

And Re: expanding the developer docs: +10

@charris charris mentioned this pull request Jan 22, 2015
@charris
Member

charris commented Jan 22, 2015

@juliantaylor Closing this as there is now a 1.9 version and the 1.10 fix will be different.

@charris charris closed this Jan 22, 2015
@charris charris reopened this Apr 7, 2015
@charris
Member

charris commented Apr 7, 2015

@juliantaylor Reopening this for 1.10. Do you have a better fix in mind?

@charris charris added this to the 1.10.0 release milestone Apr 7, 2015
jsonn pushed a commit to jsonn/pkgsrc that referenced this pull request Apr 17, 2015
Reviewed by:	wiz@

Upstream changes:
NumPy 1.9.2 Release Notes
*************************

This is a bugfix only release in the 1.9.x series.

Issues fixed
============

* `#5316 <https://github.com/numpy/numpy/issues/5316>`__: fix too large dtype alignment of strings and complex types
* `#5424 <https://github.com/numpy/numpy/issues/5424>`__: fix ma.median when used on ndarrays
* `#5481 <https://github.com/numpy/numpy/issues/5481>`__: Fix astype for structured array fields of different byte order
* `#5354 <https://github.com/numpy/numpy/issues/5354>`__: fix segfault when clipping complex arrays
* `#5524 <https://github.com/numpy/numpy/issues/5524>`__: allow np.argpartition on non ndarrays
* `#5612 <https://github.com/numpy/numpy/issues/5612>`__: Fixes ndarray.fill to accept full range of uint64
* `#5155 <https://github.com/numpy/numpy/issues/5155>`__: Fix loadtxt with comments=None and a string None data
* `#4476 <https://github.com/numpy/numpy/issues/4476>`__: Masked array view fails if structured dtype has datetime component
* `#5388 <https://github.com/numpy/numpy/issues/5388>`__: Make RandomState.set_state and RandomState.get_state threadsafe
* `#5390 <https://github.com/numpy/numpy/issues/5390>`__: make seed, randint and shuffle threadsafe
* `#5374 <https://github.com/numpy/numpy/issues/5374>`__: Fixed incorrect assert_array_almost_equal_nulp documentation
* `#5393 <https://github.com/numpy/numpy/issues/5393>`__: Add support for ATLAS > 3.9.33.
* `#5313 <https://github.com/numpy/numpy/issues/5313>`__: PyArray_AsCArray caused segfault for 3d arrays
* `#5492 <https://github.com/numpy/numpy/issues/5492>`__: handle out of memory in rfftf
* `#4181 <https://github.com/numpy/numpy/issues/4181>`__: fix a few bugs in the random.pareto docstring
* `#5359 <https://github.com/numpy/numpy/issues/5359>`__: minor changes to linspace docstring
* `#4723 <https://github.com/numpy/numpy/issues/4723>`__: fix a compile issue on AIX

NumPy 1.9.1 Release Notes
*************************

This is a bugfix only release in the 1.9.x series.

Issues fixed
============

* gh-5184: restore linear edge behaviour of gradient to as it was in < 1.9.
  The second order behaviour is available via the `edge_order` keyword
* gh-4007: workaround Accelerate sgemv crash on OSX 10.9
* gh-5100: restore object dtype inference from iterable objects without `len()`
* gh-5163: avoid gcc-4.1.2 (red hat 5) miscompilation causing a crash
* gh-5138: fix nanmedian on arrays containing inf
* gh-5240: fix not returning out array from ufuncs with subok=False set
* gh-5203: copy inherited masks in MaskedArray.__array_finalize__
* gh-2317: genfromtxt did not handle filling_values=0 correctly
* gh-5067: restore api of npy_PyFile_DupClose in python2
* gh-5063: cannot convert invalid sequence index to tuple
* gh-5082: Segmentation fault with argmin() on unicode arrays
* gh-5095: don't propagate subtypes from np.where
* gh-5104: np.inner segfaults with SciPy's sparse matrices
* gh-5251: Issue with fromarrays not using correct format for unicode arrays
* gh-5136: Import dummy_threading if importing threading fails
* gh-5148: Make numpy import when run with Python flag '-OO'
* gh-5147: Einsum double contraction in particular order causes ValueError
* gh-479: Make f2py work with intent(in out)
* gh-5170: Make python2 .npy files readable in python3
* gh-5027: Use 'll' as the default length specifier for long long
* gh-4896: fix build error with MSVC 2013 caused by C99 complex support
* gh-4465: Make PyArray_PutTo respect writeable flag
* gh-5225: fix crash when using arange on datetime without dtype set
* gh-5231: fix build in c99 mode

NumPy 1.9.0 Release Notes
*************************

This release supports Python 2.6 - 2.7 and 3.2 - 3.4.


Highlights
==========
* Numerous performance improvements in various areas, most notably indexing and
  operations on small arrays are significantly faster.
  Indexing operations now also release the GIL.
* Addition of `nanmedian` and `nanpercentile` rounds out the nanfunction set.


Dropped Support
===============

* The oldnumeric and numarray modules have been removed.
* The doc/pyrex and doc/cython directories have been removed.
* The doc/numpybook directory has been removed.
* The numpy/testing/numpytest.py file has been removed together with
  the importall function it contained.


Future Changes
==============

* The numpy/polynomial/polytemplate.py file will be removed in NumPy 1.10.0.
* Default casting for inplace operations will change to 'same_kind' in
  Numpy 1.10.0. This will certainly break some code that is currently
  ignoring the warning.
* Relaxed stride checking will be the default in 1.10.0
* String version checks will break because, e.g., '1.9' > '1.10' is True. A
  NumpyVersion class has been added that can be used for such comparisons.
* The diagonal and diag functions will return writeable views in 1.10.0
* The `S` and/or `a` dtypes may be changed to represent Python strings
  instead of bytes; in Python 3 these two types are very different.


Compatibility notes
===================

The diagonal and diag functions return readonly views.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In NumPy 1.8, the diagonal and diag functions returned readonly copies, in
NumPy 1.9 they return readonly views, and in 1.10 they will return writeable
views.

Special scalar float values don't cause upcast to double anymore
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In previous numpy versions operations involving floating point scalars
containing special values ``NaN``, ``Inf`` and ``-Inf`` caused the result
type to be at least ``float64``.  As the special values can be represented
in the smallest available floating point type, the upcast is not performed
anymore.

For example the dtype of:

    ``np.array([1.], dtype=np.float32) * float('nan')``

now remains ``float32`` instead of being cast to ``float64``.
Operations involving non-special values have not been changed.

Percentile output changes
~~~~~~~~~~~~~~~~~~~~~~~~~
If given more than one percentile to compute, numpy.percentile returns an
array instead of a list. A single percentile still returns a scalar.  The
array is equivalent to converting the list returned in older versions
to an array via ``np.array``.

If the ``overwrite_input`` option is used the input is only partially
instead of fully sorted.
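
For example (an illustrative session; repr formatting may vary by version)::

    >>> np.percentile(np.arange(10), 50)
    4.5
    >>> np.percentile(np.arange(10), [25, 75])
    array([ 2.25,  6.75])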

ndarray.tofile exception type
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All ``tofile`` exceptions are now ``IOError``, some were previously
``ValueError``.

Invalid fill value exceptions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Two changes to numpy.ma.core._check_fill_value:

* When the fill value is a string and the array type is not one of
  'OSUV', TypeError is raised instead of the default fill value being used.

* When the fill value overflows the array type, TypeError is raised instead
  of OverflowError.

Polynomial Classes no longer derived from PolyBase
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This may cause problems with folks who depended on the polynomial classes
being derived from PolyBase. They are now all derived from the abstract
base class ABCPolyBase. Strictly speaking, there should be a deprecation
involved, but no external code making use of the old baseclass could be
found.

Using numpy.random.binomial may change the RNG state vs. numpy < 1.9
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A bug in one of the algorithms to generate a binomial random variate has
been fixed. This change will likely alter the number of random draws
performed, and hence the sequence location will be different after a
call to distribution.c::rk_binomial_btpe. Any tests which rely on the RNG
being in a known state should be checked and/or updated as a result.

Random seed enforced to be a 32 bit unsigned integer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.random.seed`` and ``np.random.RandomState`` now throw a ``ValueError``
if the seed cannot safely be converted to 32 bit unsigned integers.
Applications that now fail can be fixed by masking the higher 32 bit values to
zero: ``seed = seed & 0xFFFFFFFF``. This is what is done silently in older
versions so the random stream remains the same.

Argmin and argmax out argument
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``out`` argument to ``np.argmin`` and ``np.argmax`` and their
equivalent C-API functions is now checked to match the desired output shape
exactly.  If the check fails a ``ValueError`` instead of ``TypeError`` is
raised.

Einsum
~~~~~~
Remove unnecessary broadcasting notation restrictions.
``np.einsum('ijk,j->ijk', A, B)`` can also be written as
``np.einsum('ij...,j->ij...', A, B)`` (ellipsis is no longer required on 'j')

Indexing
~~~~~~~~

The NumPy indexing has seen a complete rewrite in this version. This makes
most advanced integer indexing operations much faster and should have no
other implications.  However some subtle changes and deprecations were
introduced in advanced indexing operations:

* Boolean indexing into scalar arrays will always return a new 1-d array.
  This means that ``array(1)[array(True)]`` gives ``array([1])`` and
  not the original array.

* Advanced indexing into one dimensional arrays used to have
  (undocumented) special handling regarding repeating the value array in
  assignments when the shape of the value array was too small or did not
  match.  Code using this will raise an error. For compatibility you can
  use ``arr.flat[index] = values``, which uses the old code branch.  (for
  example ``a = np.ones(10); a[np.arange(10)] = [1, 2, 3]``)

* The iteration order over advanced indexes used to be always C-order.
  In NumPy 1.9, the iteration order adapts to the inputs and is not
  guaranteed (with the exception of a *single* advanced index which is
  never reversed for compatibility reasons). This means that the result
  is undefined if multiple values are assigned to the same element.  An
  example for this is ``arr[[0, 0], [1, 1]] = [1, 2]``, which may set
  ``arr[0, 1]`` to either 1 or 2.

* Equivalent to the iteration order, the memory layout of the advanced
  indexing result is adapted for faster indexing and cannot be predicted.

* All indexing operations return a view or a copy. No indexing operation
  will return the original array object. (For example ``arr[...]``)

* In the future Boolean array-likes (such as lists of python bools) will
  always be treated as Boolean indexes and Boolean scalars (including
  python ``True``) will be a legal *boolean* index. At this time, this is
  already the case for scalar arrays to allow the general
  ``positive = a[a > 0]`` to work when ``a`` is zero dimensional.

* In NumPy 1.8 it was possible to use ``array(True)`` and
  ``array(False)`` equivalent to 1 and 0 if the result of the operation
  was a scalar.  This will raise an error in NumPy 1.9 and, as noted
  above, will be treated as a boolean index in the future.

* All non-integer array-likes are deprecated; object arrays of custom
  integer-like objects may have to be cast explicitly.

* The error reporting for advanced indexing is more informative, however
  the error type has changed in some cases. (Broadcasting errors of
  indexing arrays are reported as ``IndexError``)

* Indexing with more than one ellipsis (``...``) is deprecated.

Non-integer reduction axis indexes are deprecated
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Non-integer axis indexes to reduction ufuncs like `add.reduce` or `sum` are
deprecated.

``promote_types`` and string dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``promote_types`` function now returns a valid string length when given an
integer or float dtype as one argument and a string dtype as another
argument.  Previously it always returned the input string dtype, even if it
wasn't long enough to store the max integer/float value converted to a
string.

``can_cast`` and string dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``can_cast`` function now returns False in "safe" casting mode for
integer/float dtype and string dtype if the string dtype length is not long
enough to store the max integer/float value converted to a string.
Previously ``can_cast`` in "safe" mode returned True for integer/float
dtype and a string dtype of any length.
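
For example (illustrative; the required length follows from the string form
of the type's maximum value)::

    >>> np.can_cast(np.int64, np.dtype('S2'), casting='safe')
    False
    >>> np.promote_types(np.int64, np.dtype('S2'))
    dtype('S21')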

astype and string dtype
~~~~~~~~~~~~~~~~~~~~~~~
The ``astype`` method now returns an error if the string dtype to cast to
is not long enough in "safe" casting mode to hold the max value of
integer/float array that is being casted. Previously the casting was
allowed even if the result was truncated.

`npyio.recfromcsv` keyword arguments change
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
`npyio.recfromcsv` no longer accepts the undocumented `update` keyword,
which used to override the `dtype` keyword.

The ``doc/swig`` directory moved
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``doc/swig`` directory has been moved to ``tools/swig``.

The ``npy_3kcompat.h`` header changed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The unused ``simple_capsule_dtor`` function has been removed from
``npy_3kcompat.h``.  Note that this header is not meant to be used outside
of numpy; other projects should be using their own copy of this file when
needed.

Negative indices in C-Api ``sq_item`` and ``sq_ass_item`` sequence methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When directly accessing the ``sq_item`` or ``sq_ass_item`` PyObject slots
for item getting, negative indices will not be supported anymore.
``PySequence_GetItem`` and ``PySequence_SetItem`` however fix negative
indices so that they can be used there.

NDIter
~~~~~~
When ``NpyIter_RemoveAxis`` is now called, the iterator range will be reset.

When a multi index is being tracked and an iterator is not buffered, it is
possible to use ``NpyIter_RemoveAxis``. In this case an iterator can shrink
in size. Because the total size of an iterator is limited, the iterator
may be too large before these calls. In this case its size will be set to ``-1``
and an error issued not at construction time but when removing the multi
index, setting the iterator range, or getting the next function.

This has no effect on currently working code, but highlights the necessity
of checking for an error return if these conditions can occur. In most
cases the arrays being iterated are as large as the iterator so that such
a problem cannot occur.

This change was already applied to the 1.8.1 release.

``zeros_like`` for string dtypes now returns empty strings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To match the `zeros` function `zeros_like` now returns an array initialized
with empty strings instead of an array filled with `'0'`.


New Features
============

Percentile supports more interpolation options
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.percentile`` now has the interpolation keyword argument to specify in
which way points should be interpolated if the percentiles fall between two
values.  See the documentation for the available options.

Generalized axis support for median and percentile
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.median`` and ``np.percentile`` now support generalized axis arguments like
ufunc reductions do since 1.7. One can now say axis=(index, index) to pick a
list of axes for the reduction. The ``keepdims`` keyword argument was also
added to allow convenient broadcasting to arrays of the original shape.
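
For example (illustrative)::

    >>> a = np.arange(24).reshape(2, 3, 4)
    >>> np.median(a, axis=(0, 2)).shape
    (3,)
    >>> np.median(a, axis=(0, 2), keepdims=True).shape
    (1, 3, 1)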

Dtype parameter added to ``np.linspace`` and ``np.logspace``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The returned data type from the ``linspace`` and ``logspace`` functions can
now be specified using the dtype parameter.

More general ``np.triu`` and ``np.tril`` broadcasting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For arrays with ``ndim`` exceeding 2, these functions will now apply to the
final two axes instead of raising an exception.

``tobytes`` alias for ``tostring`` method
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``ndarray.tobytes`` and ``MaskedArray.tobytes`` have been added as aliases
for ``tostring`` which exports arrays as ``bytes``. This is more consistent
in Python 3 where ``str`` and ``bytes`` are not the same.

Build system
~~~~~~~~~~~~
Added experimental support for the ppc64le and OpenRISC architectures.

Compatibility to python ``numbers`` module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All numerical numpy types are now registered with the type hierarchy in
the python ``numbers`` module.

``increasing`` parameter added to ``np.vander``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ordering of the columns of the Vandermonde matrix can be specified with
this new boolean argument.

``unique_counts`` parameter added to ``np.unique``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The number of times each unique item comes up in the input can now be
obtained as an optional return value.

Support for median and percentile in nanfunctions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``np.nanmedian`` and ``np.nanpercentile`` functions behave like
the median and percentile functions except that NaNs are ignored.

NumpyVersion class added
~~~~~~~~~~~~~~~~~~~~~~~~
The class may be imported from numpy.lib and can be used for version
comparison when the numpy version goes to 1.10.devel. For example::

    >>> from numpy.lib import NumpyVersion
    >>> if NumpyVersion(np.__version__) < '1.10.0':
    ...     print('Wow, that is an old NumPy version!')

Allow saving arrays with large number of named columns
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The numpy storage format 1.0 only allowed the array header to have a total size
of 65535 bytes. This can be exceeded by structured arrays with a large number
of columns. A new format 2.0 has been added which extends the header size to 4
GiB. `np.save` will automatically save in 2.0 format if the data requires it,
else it will always use the more compatible 1.0 format.

Full broadcasting support for ``np.cross``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.cross`` now properly broadcasts its two input arrays, even if they
have different number of dimensions. In earlier versions this would result
in either an error being raised, or wrong results computed.


Improvements
============

Better numerical stability for sum in some cases
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Pairwise summation is now used in the sum method, but only along the fast
axis and for groups of values <= 8192 in length. This should also
improve the accuracy of var and std in some common cases.

Percentile implemented in terms of ``np.partition``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.percentile`` has been implemented in terms of ``np.partition`` which
only partially sorts the data via a selection algorithm. This improves the
time complexity from ``O(nlog(n))`` to ``O(n)``.

Performance improvement for ``np.array``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The performance of converting lists containing arrays to arrays using
``np.array`` has been improved. It is now equivalent in speed to
``np.vstack(list)``.

Performance improvement for ``np.searchsorted``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For the built-in numeric types, ``np.searchsorted`` no longer relies on the
data type's ``compare`` function to perform the search, but is now
implemented by type specific functions. Depending on the size of the
inputs, this can result in performance improvements over 2x.

Optional reduced verbosity for np.distutils
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Set ``numpy.distutils.system_info.system_info.verbosity = 0`` and then
calls to ``numpy.distutils.system_info.get_info('blas_opt')`` will not
print anything on the output. This is mostly for other packages using
numpy.distutils.

Covariance check in ``np.random.multivariate_normal``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A ``RuntimeWarning`` warning is raised when the covariance matrix is not
positive-semidefinite.

Polynomial Classes no longer template based
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The polynomial classes have been refactored to use an abstract base class
rather than a template in order to implement a common interface. This makes
importing the polynomial package faster as the classes do not need to be
compiled on import.

More GIL releases
~~~~~~~~~~~~~~~~~
Several more functions now release the Global Interpreter Lock, allowing more
efficient parallelization using the ``threading`` module. Most notably the GIL
is now released for fancy indexing and ``np.where``, and the ``random`` module
now uses a per-state lock instead of the GIL.

MaskedArray support for more complicated base classes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Built-in assumptions that the baseclass behaves like a plain array are being
removed. In particular, ``repr`` and ``str`` should now work more reliably.


C-API
~~~~~


Deprecations
============

Non-integer scalars for sequence repetition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Using non-integer numpy scalars to repeat python sequences is deprecated.
For example ``np.float_(2) * [1]`` will be an error in the future.

``select`` input deprecations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The integer and empty input to ``select`` is deprecated. In the future only
boolean arrays will be valid conditions and an empty ``condlist`` will be
considered an input error instead of returning the default.

``rank`` function
~~~~~~~~~~~~~~~~~
The ``rank`` function has been deprecated to avoid confusion with
``numpy.linalg.matrix_rank``.

Object array equality comparisons
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In the future, object array comparisons with both `==` and `np.equal` will
not make use of identity checks anymore. For example:

>>> a = np.array([np.array([1, 2, 3]), 1])
>>> b = np.array([np.array([1, 2, 3]), 1])
>>> a == b

will consistently return False (and in the future an error) even if the array
in `a` and `b` was the same object.

The equality operator `==` will in the future raise errors like `np.equal`
if broadcasting or element comparisons, etc. fail.

Comparison with `arr == None` will in the future do an elementwise comparison
instead of just returning False. Code should be using `arr is None`.

All of these changes will give Deprecation- or FutureWarnings at this time.

C-API
~~~~~

The utility functions npy_PyFile_Dup and npy_PyFile_DupClose are broken by the
internal buffering python 3 applies to its file objects.
To fix this two new functions npy_PyFile_Dup2 and npy_PyFile_DupClose2 are
declared in npy_3kcompat.h and the old functions are deprecated.
Due to the fragile nature of these functions it is recommended to instead use
the python API when possible.

This change was already applied to the 1.8.1 release.

NumPy 1.8.2 Release Notes
*************************

This is a bugfix only release in the 1.8.x series.

Issues fixed
============

* gh-4836: partition produces wrong results for multiple selections in equal ranges
* gh-4656: Make fftpack._raw_fft threadsafe
* gh-4628: incorrect argument order to _copyto in np.nanmax, np.nanmin
* gh-4642: Hold GIL for converting dtypes types with fields
* gh-4733: fix np.linalg.svd(b, compute_uv=False)
* gh-4853: avoid unaligned simd load on reductions on i386
* gh-4722: Fix seg fault converting empty string to object
* gh-4613: Fix lack of NULL check in array_richcompare
* gh-4774: avoid unaligned access for strided byteswap
* gh-650: Prevent division by zero when creating arrays from some buffers
* gh-4602: ifort has issues with optimization flag O2, use O1

NumPy 1.8.1 Release Notes
*************************

This is a bugfix only release in the 1.8.x series.


Issues fixed
============

* gh-4276: Fix mean, var, std methods for object arrays
* gh-4262: remove insecure mktemp usage
* gh-2385: absolute(complex(inf)) raises invalid warning in python3
* gh-4024: Sequence assignment doesn't raise exception on shape mismatch
* gh-4027: Fix chunked reading of strings longer than BUFFERSIZE
* gh-4109: Fix object scalar return type of 0-d array indices
* gh-4018: fix missing check for memory allocation failure in ufuncs
* gh-4156: high order linalg.norm discards imaginary elements of complex arrays
* gh-4144: linalg: norm fails on longdouble, signed int
* gh-4094: fix NaT handling in _strided_to_strided_string_to_datetime
* gh-4051: fix uninitialized use in _strided_to_strided_string_to_datetime
* gh-4093: Loading compressed .npz file fails under Python 2.6.6
* gh-4138: segfault with non-native endian memoryview in python 3.4
* gh-4123: Fix missing NULL check in lexsort
* gh-4170: fix native-only long long check in memoryviews
* gh-4187: Fix large file support on 32 bit
* gh-4152: fromfile: ensure file handle positions are in sync in python3
* gh-4176: clang compatibility: Typos in conversion_utils
* gh-4223: Fetching a non-integer item caused array return
* gh-4197: fix minor memory leak in memoryview failure case
* gh-4206: fix build with single-threaded python
* gh-4220: add versionadded:: 1.8.0 to ufunc.at docstring
* gh-4267: improve handling of memory allocation failure
* gh-4267: fix use of capi without gil in ufunc.at
* gh-4261: Detect vendor versions of GNU Compilers
* gh-4253: IRR was returning nan instead of valid negative answer
* gh-4254: fix unnecessary byte order flag change for byte arrays
* gh-3263: numpy.random.shuffle clobbers mask of a MaskedArray
* gh-4270: np.random.shuffle not work with flexible dtypes
* gh-3173: Segmentation fault when 'size' argument to random.multinomial
* gh-2799: allow using unique with lists of complex
* gh-3504: fix linspace truncation for integer array scalar
* gh-4191: get_info('openblas') does not read libraries key
* gh-3348: Access violation in _descriptor_from_pep3118_format
* gh-3175: segmentation fault with numpy.array() from bytearray
* gh-4266: histogramdd - wrong result for entries very close to last boundary
* gh-4408: Fix stride_stricks.as_strided function for object arrays
* gh-4225: fix log1p and expm1 return for np.inf on windows compiler builds
* gh-4359: Fix infinite recursion in str.format of flex arrays
* gh-4145: Incorrect shape of broadcast result with the exponent operator
* gh-4483: Fix commutativity of {dot,multiply,inner}(scalar, matrix_of_objs)
* gh-4466: Delay npyiter size check when size may change
* gh-4485: Buffered stride was erroneously marked fixed
* gh-4354: byte_bounds fails with datetime dtypes
* gh-4486: segfault/error converting from/to high-precision datetime64 objects
* gh-4428: einsum(None, None, None, None) causes segfault
* gh-4134: uninitialized use for size 1 object reductions

Changes
=======

NDIter
~~~~~~
When ``NpyIter_RemoveAxis`` is now called, the iterator range will be reset.

When a multi index is being tracked and an iterator is not buffered, it is
possible to use ``NpyIter_RemoveAxis``. In this case an iterator can shrink
in size. Because the total size of an iterator is limited, the iterator
may be too large before these calls. In this case its size will be set to ``-1``
and an error issued not at construction time but when removing the multi
index, setting the iterator range, or getting the next function.

This has no effect on currently working code, but highlights the necessity
of checking for an error return if these conditions can occur. In most
cases the arrays being iterated are as large as the iterator so that such
a problem cannot occur.

Optional reduced verbosity for np.distutils
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Set ``numpy.distutils.system_info.system_info.verbosity = 0`` and then
calls to ``numpy.distutils.system_info.get_info('blas_opt')`` will not
print anything on the output. This is mostly for other packages using
numpy.distutils.

Deprecations
============

C-API
~~~~~

The utility functions npy_PyFile_Dup and npy_PyFile_DupClose are broken by the
internal buffering python 3 applies to its file objects.
To fix this two new functions npy_PyFile_Dup2 and npy_PyFile_DupClose2 are
declared in npy_3kcompat.h and the old functions are deprecated.
Due to the fragile nature of these functions it is recommended to instead use
the python API when possible.
@charris
Member

charris commented Apr 22, 2015

OK, putting this in for now. @juliantaylor This seemed to work for 1.9.2.

charris added a commit that referenced this pull request Apr 22, 2015
@charris charris merged commit 02b8583 into numpy:master Apr 22, 2015
jsonn pushed a commit to jsonn/pkgsrc that referenced this pull request Apr 22, 2015
Reviewed by:	wiz@

Upstream changes:
NumPy 1.9.2 Release Notes
*************************

This is a bugfix only release in the 1.9.x series.

Issues fixed
============

* `#5316 <https://github.com/numpy/numpy/issues/5316>`__: fix too large dtype alignment of strings and complex types
* `#5424 <https://github.com/numpy/numpy/issues/5424>`__: fix ma.median when used on ndarrays
* `#5481 <https://github.com/numpy/numpy/issues/5481>`__: Fix astype for structured array fields of different byte order
* `#5354 <https://github.com/numpy/numpy/issues/5354>`__: fix segfault when clipping complex arrays
* `#5524 <https://github.com/numpy/numpy/issues/5524>`__: allow np.argpartition on non ndarrays
* `#5612 <https://github.com/numpy/numpy/issues/5612>`__: Fixes ndarray.fill to accept full range of uint64
* `#5155 <https://github.com/numpy/numpy/issues/5155>`__: Fix loadtxt with comments=None and a string None data
* `#4476 <https://github.com/numpy/numpy/issues/4476>`__: Masked array view fails if structured dtype has datetime component
* `#5388 <https://github.com/numpy/numpy/issues/5388>`__: Make RandomState.set_state and RandomState.get_state threadsafe
* `#5390 <https://github.com/numpy/numpy/issues/5390>`__: make seed, randint and shuffle threadsafe
* `#5374 <https://github.com/numpy/numpy/issues/5374>`__: Fixed incorrect assert_array_almost_equal_nulp documentation
* `#5393 <https://github.com/numpy/numpy/issues/5393>`__: Add support for ATLAS > 3.9.33.
* `#5313 <https://github.com/numpy/numpy/issues/5313>`__: PyArray_AsCArray caused segfault for 3d arrays
* `#5492 <https://github.com/numpy/numpy/issues/5492>`__: handle out of memory in rfftf
* `#4181 <https://github.com/numpy/numpy/issues/4181>`__: fix a few bugs in the random.pareto docstring
* `#5359 <https://github.com/numpy/numpy/issues/5359>`__: minor changes to linspace docstring
* `#4723 <https://github.com/numpy/numpy/issues/4723>`__: fix a compile issues on AIX

NumPy 1.9.1 Release Notes
*************************

This is a bugfix only release in the 1.9.x series.

Issues fixed
============

* gh-5184: restore linear edge behaviour of gradient to as it was in < 1.9.
  The second order behaviour is available via the `edge_order` keyword
* gh-4007: workaround Accelerate sgemv crash on OSX 10.9
* gh-5100: restore object dtype inference from iterable objects without `len()`
* gh-5163: avoid gcc-4.1.2 (red hat 5) miscompilation causing a crash
* gh-5138: fix nanmedian on arrays containing inf
* gh-5240: fix not returning out array from ufuncs with subok=False set
* gh-5203: copy inherited masks in MaskedArray.__array_finalize__
* gh-2317: genfromtxt did not handle filling_values=0 correctly
* gh-5067: restore api of npy_PyFile_DupClose in python2
* gh-5063: cannot convert invalid sequence index to tuple
* gh-5082: Segmentation fault with argmin() on unicode arrays
* gh-5095: don't propagate subtypes from np.where
* gh-5104: np.inner segfaults with SciPy's sparse matrices
* gh-5251: Issue with fromarrays not using correct format for unicode arrays
* gh-5136: Import dummy_threading if importing threading fails
* gh-5148: Make numpy import when run with Python flag '-OO'
* gh-5147: Einsum double contraction in particular order causes ValueError
* gh-479: Make f2py work with intent(in out)
* gh-5170: Make python2 .npy files readable in python3
* gh-5027: Use 'll' as the default length specifier for long long
* gh-4896: fix build error with MSVC 2013 caused by C99 complex support
* gh-4465: Make PyArray_PutTo respect writeable flag
* gh-5225: fix crash when using arange on datetime without dtype set
* gh-5231: fix build in c99 mode

NumPy 1.9.0 Release Notes
*************************

This release supports Python 2.6 - 2.7 and 3.2 - 3.4.


Highlights
==========
* Numerous performance improvements in various areas, most notably indexing and
  operations on small arrays are significantly faster.
  Indexing operations now also release the GIL.
* Addition of `nanmedian` and `nanpercentile` rounds out the nanfunction set.


Dropped Support
===============

* The oldnumeric and numarray modules have been removed.
* The doc/pyrex and doc/cython directories have been removed.
* The doc/numpybook directory has been removed.
* The numpy/testing/numpytest.py file has been removed together with
  the importall function it contained.


Future Changes
==============

* The numpy/polynomial/polytemplate.py file will be removed in NumPy 1.10.0.
* Default casting for inplace operations will change to 'same_kind' in
  Numpy 1.10.0. This will certainly break some code that is currently
  ignoring the warning.
* Relaxed stride checking will be the default in 1.10.0
* String version checks will break because, e.g., '1.9' > '1.10' is True. A
  NumpyVersion class has been added that can be used for such comparisons.
* The diagonal and diag functions will return writeable views in 1.10.0
* The `S` and/or `a` dtypes may be changed to represent Python strings
  instead of bytes, in Python 3 these two types are very different.


Compatibility notes
===================

The diagonal and diag functions return readonly views.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In NumPy 1.8, the diagonal and diag functions returned readonly copies, in
NumPy 1.9 they return readonly views, and in 1.10 they will return writeable
views.

Special scalar float values don't cause upcast to double anymore
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In previous numpy versions operations involving floating point scalars
containing special values ``NaN``, ``Inf`` and ``-Inf`` caused the result
type to be at least ``float64``.  As the special values can be represented
in the smallest available floating point type, the upcast is not performed
anymore.

For example the dtype of:

    ``np.array([1.], dtype=np.float32) * float('nan')``

now remains ``float32`` instead of being cast to ``float64``.
Operations involving non-special values have not been changed.

Percentile output changes
~~~~~~~~~~~~~~~~~~~~~~~~~
If given more than one percentile to compute numpy.percentile returns an
array instead of a list. A single percentile still returns a scalar.  The
array is equivalent to converting the list returned in older versions
to an array via ``np.array``.

If the ``overwrite_input`` option is used the input is only partially
instead of fully sorted.

ndarray.tofile exception type
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All ``tofile`` exceptions are now ``IOError``, some were previously
``ValueError``.

Invalid fill value exceptions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Two changes to numpy.ma.core._check_fill_value:

* When the fill value is a string and the array type is not one of
  'OSUV', TypeError is raised instead of the default fill value being used.

* When the fill value overflows the array type, TypeError is raised instead
  of OverflowError.

Polynomial Classes no longer derived from PolyBase
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This may cause problems with folks who depended on the polynomial classes
being derived from PolyBase. They are now all derived from the abstract
base class ABCPolyBase. Strictly speaking, there should be a deprecation
involved, but no external code making use of the old baseclass could be
found.

Using numpy.random.binomial may change the RNG state vs. numpy < 1.9
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A bug in one of the algorithms to generate a binomial random variate has
been fixed. This change will likely alter the number of random draws
performed, and hence the sequence location will be different after a
call to distribution.c::rk_binomial_btpe. Any tests which rely on the RNG
being in a known state should be checked and/or updated as a result.

Random seed enforced to be a 32 bit unsigned integer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.random.seed`` and ``np.random.RandomState`` now throw a ``ValueError``
if the seed cannot safely be converted to 32 bit unsigned integers.
Applications that now fail can be fixed by masking the higher 32 bit values to
zero: ``seed = seed & 0xFFFFFFFF``. This is what is done silently in older
versions so the random stream remains the same.

Argmin and argmax out argument
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``out`` argument to ``np.argmin`` and ``np.argmax`` and their
equivalent C-API functions is now checked to match the desired output shape
exactly.  If the check fails a ``ValueError`` instead of ``TypeError`` is
raised.

Einsum
~~~~~~
Remove unnecessary broadcasting notation restrictions.
``np.einsum('ijk,j->ijk', A, B)`` can also be written as
``np.einsum('ij...,j->ij...', A, B)`` (ellipsis is no longer required on 'j')

Indexing
~~~~~~~~

The NumPy indexing has seen a complete rewrite in this version. This makes
most advanced integer indexing operations much faster and should have no
other implications.  However some subtle changes and deprecations were
introduced in advanced indexing operations:

* Boolean indexing into scalar arrays will always return a new 1-d array.
  This means that ``array(1)[array(True)]`` gives ``array([1])`` and
  not the original array.

* Advanced indexing into one dimensional arrays used to have
  (undocumented) special handling regarding repeating the value array in
  assignments when the shape of the value array was too small or did not
  match.  Code using this will raise an error. For compatibility you can
  use ``arr.flat[index] = values``, which uses the old code branch.  (for
  example ``a = np.ones(10); a[np.arange(10)] = [1, 2, 3]``)

* The iteration order over advanced indexes used to be always C-order.
  In NumPy 1.9. the iteration order adapts to the inputs and is not
  guaranteed (with the exception of a *single* advanced index which is
  never reversed for compatibility reasons). This means that the result
  is undefined if multiple values are assigned to the same element.  An
  example for this is ``arr[[0, 0], [1, 1]] = [1, 2]``, which may set
  ``arr[0, 1]`` to either 1 or 2.

* Equivalent to the iteration order, the memory layout of the advanced
  indexing result is adapted for faster indexing and cannot be predicted.

* All indexing operations return a view or a copy. No indexing operation
  will return the original array object. (For example ``arr[...]``)

* In the future Boolean array-likes (such as lists of python bools) will
  always be treated as Boolean indexes and Boolean scalars (including
  python ``True``) will be a legal *boolean* index. At this time, this is
  already the case for scalar arrays to allow the general
  ``positive = a[a > 0]`` to work when ``a`` is zero dimensional.

* In NumPy 1.8 it was possible to use ``array(True)`` and
  ``array(False)`` equivalent to 1 and 0 if the result of the operation
  was a scalar.  This will raise an error in NumPy 1.9 and, as noted
  above, treated as a boolean index in the future.

* All non-integer array-likes are deprecated, object arrays of custom
  integer like objects may have to be cast explicitly.

* The error reporting for advanced indexing is more informative, however
  the error type has changed in some cases. (Broadcasting errors of
  indexing arrays are reported as ``IndexError``)

* Indexing with more than one ellipsis (``...``) is deprecated.
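
To illustrate the first point above, boolean indexing into a
zero-dimensional array now yields a 1-d result::

    >>> a = np.array(1)
    >>> a[np.array(True)]
    array([1])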

Non-integer reduction axis indexes are deprecated
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Non-integer axis indexes to reduction ufuncs like ``add.reduce`` or ``sum``
are deprecated.

``promote_types`` and string dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``promote_types`` function now returns a valid string length when given
an integer or float dtype as one argument and a string dtype as another
argument.  Previously it always returned the input string dtype, even if it
was not long enough to store the largest integer/float value converted to
a string.
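
A hedged illustration (the promoted length is indicative; an ``int32``
needs up to 11 characters, including the sign)::

    >>> np.promote_types('i4', 'S8')
    dtype('S11')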

``can_cast`` and string dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``can_cast`` function now returns False in "safe" casting mode for an
integer/float dtype and a string dtype whose length is not long enough to
store the largest integer/float value converted to a string.  Previously
``can_cast`` in "safe" mode returned True for an integer/float dtype and
a string dtype of any length.
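
For instance, with the indicative lengths used above::

    >>> np.can_cast('i4', 'S8', casting='safe')
    False
    >>> np.can_cast('i4', 'S11', casting='safe')
    True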

astype and string dtype
~~~~~~~~~~~~~~~~~~~~~~~
The ``astype`` method now raises an error in "safe" casting mode if the
string dtype to cast to is not long enough to hold the largest value of
the integer/float array that is being cast.  Previously the casting was
allowed even if the result was truncated.
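
A hedged sketch (the exact message may vary by version)::

    >>> np.arange(3, dtype=np.int32).astype('S4', casting='safe')
    Traceback (most recent call last):
        ...
    TypeError: Cannot cast array from dtype('int32') to dtype('S4') ...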

``npyio.recfromcsv`` keyword arguments change
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``npyio.recfromcsv`` no longer accepts the undocumented ``update`` keyword,
which used to override the ``dtype`` keyword.

The ``doc/swig`` directory moved
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``doc/swig`` directory has been moved to ``tools/swig``.

The ``npy_3kcompat.h`` header changed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The unused ``simple_capsule_dtor`` function has been removed from
``npy_3kcompat.h``.  Note that this header is not meant to be used outside
of numpy; other projects should be using their own copy of this file when
needed.

Negative indices in C-Api ``sq_item`` and ``sq_ass_item`` sequence methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When directly accessing the ``sq_item`` or ``sq_ass_item`` PyObject slots
for item access, negative indices will not be supported anymore.
``PySequence_GetItem`` and ``PySequence_SetItem``, however, fix up negative
indices so that they can still be used there.

NDIter
~~~~~~
When ``NpyIter_RemoveAxis`` is called, the iterator range will now be reset.

When a multi index is being tracked and an iterator is not buffered, it is
possible to use ``NpyIter_RemoveAxis``. In this case an iterator can shrink
in size. Because the total size of an iterator is limited, the iterator
may be too large before these calls. In this case its size will be set to ``-1``
and an error issued not at construction time but when removing the multi
index, setting the iterator range, or getting the next function.

This has no effect on currently working code, but highlights the necessity
of checking for an error return if these conditions can occur. In most
cases the arrays being iterated are as large as the iterator so that such
a problem cannot occur.

This change was already applied to the 1.8.1 release.

``zeros_like`` for string dtypes now returns empty strings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To match the ``zeros`` function, ``zeros_like`` now returns an array
initialized with empty strings instead of an array filled with ``'0'``.
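
For example (Python 3 byte-string repr shown)::

    >>> np.zeros_like(np.array([b'abc', b'de']))
    array([b'', b''], dtype='|S3')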


New Features
============

Percentile supports more interpolation options
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.percentile`` now has the ``interpolation`` keyword argument to specify
how points are interpolated when a percentile falls between two values.
See the documentation for the available options.
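
For example ('linear' is the default)::

    >>> a = np.array([1, 2, 3, 4])
    >>> np.percentile(a, 50)
    2.5
    >>> np.percentile(a, 50, interpolation='lower')
    2
    >>> np.percentile(a, 50, interpolation='higher')
    3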

Generalized axis support for median and percentile
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.median`` and ``np.percentile`` now support generalized axis arguments
like ufunc reductions have since 1.7. One can now say ``axis=(index, index)``
to pick a list of axes for the reduction. The ``keepdims`` keyword argument
was also added to allow convenient broadcasting to arrays of the original
shape.
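
For example::

    >>> a = np.arange(24).reshape(2, 3, 4)
    >>> np.median(a, axis=(0, 1)).shape
    (4,)
    >>> np.median(a, axis=(0, 1), keepdims=True).shape
    (1, 1, 4)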

Dtype parameter added to ``np.linspace`` and ``np.logspace``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The returned data type from the ``linspace`` and ``logspace`` functions can
now be specified using the ``dtype`` parameter.
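
For example::

    >>> np.linspace(0, 1, 5, dtype=np.float32).dtype
    dtype('float32')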

More general ``np.triu`` and ``np.tril`` broadcasting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For arrays with ``ndim`` exceeding 2, these functions will now apply to the
final two axes instead of raising an exception.
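
For example, with a stack of matrices the triangle is taken per matrix::

    >>> a = np.ones((2, 3, 3))
    >>> np.triu(a).shape       # applied to the last two axes
    (2, 3, 3)
    >>> np.triu(a)[0, 2, 0]    # below the diagonal, zeroed
    0.0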

``tobytes`` alias for ``tostring`` method
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``ndarray.tobytes`` and ``MaskedArray.tobytes`` have been added as aliases
for ``tostring`` which exports arrays as ``bytes``. This is more consistent
in Python 3 where ``str`` and ``bytes`` are not the same.

Build system
~~~~~~~~~~~~
Added experimental support for the ppc64le and OpenRISC architectures.

Compatibility to python ``numbers`` module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All numerical numpy types are now registered with the type hierarchy in
the python ``numbers`` module.
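
For example::

    >>> import numbers
    >>> isinstance(np.int64(1), numbers.Integral)
    True
    >>> isinstance(np.float64(1.0), numbers.Real)
    True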

``increasing`` parameter added to ``np.vander``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ordering of the columns of the Vandermonde matrix can be specified with
this new boolean argument.
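
For example::

    >>> np.vander([1, 2, 3], increasing=True)
    array([[1, 1, 1],
           [1, 2, 4],
           [1, 3, 9]])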

``return_counts`` parameter added to ``np.unique``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The number of times each unique item comes up in the input can now be
obtained as an optional return value.
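
For example::

    >>> np.unique([1, 1, 2, 2, 2, 3], return_counts=True)
    (array([1, 2, 3]), array([2, 3, 1]))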

Support for median and percentile in nanfunctions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``np.nanmedian`` and ``np.nanpercentile`` functions behave like
the median and percentile functions except that NaNs are ignored.
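
For example::

    >>> a = np.array([[10., 7., 4.], [3., 2., np.nan]])
    >>> np.median(a)
    nan
    >>> np.nanmedian(a)
    4.0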

NumpyVersion class added
~~~~~~~~~~~~~~~~~~~~~~~~
The class may be imported from ``numpy.lib`` and can be used for version
comparison once the numpy version advances to 1.10.devel. For example::

    >>> from numpy.lib import NumpyVersion
    >>> if NumpyVersion(np.__version__) < '1.10.0':
    ...     print('Wow, that is an old NumPy version!')

Allow saving arrays with large number of named columns
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The numpy storage format 1.0 only allowed the array header to have a total size
of 65535 bytes. This can be exceeded by structured arrays with a large number
of columns. A new format 2.0 has been added which extends the header size to 4
GiB. ``np.save`` will automatically save in 2.0 format if the data requires
it, otherwise it will always use the more compatible 1.0 format.

Full broadcasting support for ``np.cross``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.cross`` now properly broadcasts its two input arrays, even if they
have different numbers of dimensions. In earlier versions this would result
in either an error being raised or wrong results being computed.
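
For example, a single vector now broadcasts against a stack of vectors::

    >>> u = np.array([1., 0., 0.])
    >>> v = np.ones((4, 3))
    >>> np.cross(u, v).shape
    (4, 3)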


Improvements
============

Better numerical stability for sum in some cases
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Pairwise summation is now used in the sum method, but only along the fast
axis and for blocks of values at most 8192 elements long. This should also
improve the accuracy of var and std in some common cases.
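
The classic float32 demonstration (memory hungry; output indicative)::

    >>> np.ones(10 ** 8, dtype=np.float32).sum()
    100000000.0

A strictly sequential float32 accumulation would instead stall near
``2**24 = 16777216``, because adding 1.0 to that value no longer changes it.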

Percentile implemented in terms of ``np.partition``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``np.percentile`` has been implemented in terms of ``np.partition`` which
only partially sorts the data via a selection algorithm. This improves the
time complexity from ``O(n log(n))`` to ``O(n)``.

Performance improvement for ``np.array``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The performance of converting lists containing arrays to arrays using
``np.array`` has been improved. It is now equivalent in speed to
``np.vstack(list)``.

Performance improvement for ``np.searchsorted``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For the built-in numeric types, ``np.searchsorted`` no longer relies on the
data type's ``compare`` function to perform the search, but is now
implemented by type specific functions. Depending on the size of the
inputs, this can result in performance improvements of over 2x.

Optional reduced verbosity for np.distutils
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Set ``numpy.distutils.system_info.system_info.verbosity = 0`` and
subsequent calls to ``numpy.distutils.system_info.get_info('blas_opt')``
will not print anything to the output. This is mostly for other packages
using numpy.distutils.
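
A minimal sketch::

    >>> import numpy.distutils.system_info as system_info
    >>> system_info.system_info.verbosity = 0
    >>> info = system_info.get_info('blas_opt')  # prints nothing now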

Covariance check in ``np.random.multivariate_normal``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A ``RuntimeWarning`` is raised when the covariance matrix is not
positive-semidefinite.
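
A hedged sketch using an indefinite matrix (eigenvalues 3 and -1)::

    >>> import warnings
    >>> cov = [[1.0, 2.0], [2.0, 1.0]]   # not positive-semidefinite
    >>> with warnings.catch_warnings(record=True) as w:
    ...     warnings.simplefilter('always')
    ...     sample = np.random.multivariate_normal([0.0, 0.0], cov)
    >>> any(issubclass(x.category, RuntimeWarning) for x in w)
    True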

Polynomial Classes no longer template based
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The polynomial classes have been refactored to use an abstract base class
rather than a template in order to implement a common interface. This makes
importing the polynomial package faster as the classes do not need to be
compiled on import.

More GIL releases
~~~~~~~~~~~~~~~~~
Several more functions now release the Global Interpreter Lock, allowing
more efficient parallelization using the ``threading`` module. Most notably,
the GIL is now released for fancy indexing and ``np.where``, and the
``random`` module now uses a per-state lock instead of the GIL.

MaskedArray support for more complicated base classes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Built-in assumptions that the base class behaves like a plain array are
being removed. In particular, ``repr`` and ``str`` should now work more
reliably.


C-API
~~~~~


Deprecations
============

Non-integer scalars for sequence repetition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Using non-integer numpy scalars to repeat python sequences is deprecated.
For example ``np.float_(2) * [1]`` will be an error in the future.

``select`` input deprecations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Integer and empty inputs to ``select`` are deprecated. In the future only
boolean arrays will be valid conditions, and an empty ``condlist`` will be
considered an input error instead of returning the default.
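
Boolean condition arrays remain the supported form, e.g.::

    >>> x = np.arange(6)
    >>> np.select([x < 2, x > 3], [x, x ** 2], default=0)
    array([ 0,  1,  0,  0, 16, 25])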

``rank`` function
~~~~~~~~~~~~~~~~~
The ``rank`` function has been deprecated to avoid confusion with
``numpy.linalg.matrix_rank``.

Object array equality comparisons
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In the future, object array comparisons with both ``==`` and ``np.equal``
will not make use of identity checks anymore. For example:

>>> a = np.array([np.array([1, 2, 3]), 1])
>>> b = np.array([np.array([1, 2, 3]), 1])
>>> a == b

will consistently return False (and in the future an error) even if the
arrays in ``a`` and ``b`` are the same object.

The equality operator ``==`` will in the future raise errors like
``np.equal`` when broadcasting, element comparisons, and the like fail.

Comparison with ``arr == None`` will in the future do an elementwise
comparison instead of just returning False. Code should use ``arr is None``
instead.

All of these changes will give Deprecation- or FutureWarnings at this time.

C-API
~~~~~

The utility functions npy_PyFile_Dup and npy_PyFile_DupClose are broken by
the internal buffering that python 3 applies to its file objects.  To fix
this, two new functions, npy_PyFile_Dup2 and npy_PyFile_DupClose2, are
declared in npy_3kcompat.h and the old functions are deprecated.
Due to the fragile nature of these functions it is recommended to instead use
the python API when possible.

This change was already applied to the 1.8.1 release.

NumPy 1.8.2 Release Notes
*************************

This is a bugfix only release in the 1.8.x series.

Issues fixed
============

* gh-4836: partition produces wrong results for multiple selections in equal ranges
* gh-4656: Make fftpack._raw_fft threadsafe
* gh-4628: incorrect argument order to _copyto in np.nanmax, np.nanmin
* gh-4642: Hold GIL for converting dtypes types with fields
* gh-4733: fix np.linalg.svd(b, compute_uv=False)
* gh-4853: avoid unaligned simd load on reductions on i386
* gh-4722: Fix seg fault converting empty string to object
* gh-4613: Fix lack of NULL check in array_richcompare
* gh-4774: avoid unaligned access for strided byteswap
* gh-650: Prevent division by zero when creating arrays from some buffers
* gh-4602: ifort has issues with optimization flag O2, use O1

NumPy 1.8.1 Release Notes
*************************

This is a bugfix only release in the 1.8.x series.


Issues fixed
============

* gh-4276: Fix mean, var, std methods for object arrays
* gh-4262: remove insecure mktemp usage
* gh-2385: absolute(complex(inf)) raises invalid warning in python3
* gh-4024: Sequence assignment doesn't raise exception on shape mismatch
* gh-4027: Fix chunked reading of strings longer than BUFFERSIZE
* gh-4109: Fix object scalar return type of 0-d array indices
* gh-4018: fix missing check for memory allocation failure in ufuncs
* gh-4156: high order linalg.norm discards imaginary elements of complex arrays
* gh-4144: linalg: norm fails on longdouble, signed int
* gh-4094: fix NaT handling in _strided_to_strided_string_to_datetime
* gh-4051: fix uninitialized use in _strided_to_strided_string_to_datetime
* gh-4093: Loading compressed .npz file fails under Python 2.6.6
* gh-4138: segfault with non-native endian memoryview in python 3.4
* gh-4123: Fix missing NULL check in lexsort
* gh-4170: fix native-only long long check in memoryviews
* gh-4187: Fix large file support on 32 bit
* gh-4152: fromfile: ensure file handle positions are in sync in python3
* gh-4176: clang compatibility: Typos in conversion_utils
* gh-4223: Fetching a non-integer item caused array return
* gh-4197: fix minor memory leak in memoryview failure case
* gh-4206: fix build with single-threaded python
* gh-4220: add versionadded:: 1.8.0 to ufunc.at docstring
* gh-4267: improve handling of memory allocation failure
* gh-4267: fix use of capi without gil in ufunc.at
* gh-4261: Detect vendor versions of GNU Compilers
* gh-4253: IRR was returning nan instead of valid negative answer
* gh-4254: fix unnecessary byte order flag change for byte arrays
* gh-3263: numpy.random.shuffle clobbers mask of a MaskedArray
* gh-4270: np.random.shuffle does not work with flexible dtypes
* gh-3173: Segmentation fault when passing the 'size' argument to random.multinomial
* gh-2799: allow using unique with lists of complex
* gh-3504: fix linspace truncation for integer array scalar
* gh-4191: get_info('openblas') does not read libraries key
* gh-3348: Access violation in _descriptor_from_pep3118_format
* gh-3175: segmentation fault with numpy.array() from bytearray
* gh-4266: histogramdd - wrong result for entries very close to last boundary
* gh-4408: Fix stride_stricks.as_strided function for object arrays
* gh-4225: fix log1p and expm1 return for np.inf on windows compiler builds
* gh-4359: Fix infinite recursion in str.format of flex arrays
* gh-4145: Incorrect shape of broadcast result with the exponent operator
* gh-4483: Fix commutativity of {dot,multiply,inner}(scalar, matrix_of_objs)
* gh-4466: Delay npyiter size check when size may change
* gh-4485: Buffered stride was erroneously marked fixed
* gh-4354: byte_bounds fails with datetime dtypes
* gh-4486: segfault/error converting from/to high-precision datetime64 objects
* gh-4428: einsum(None, None, None, None) causes segfault
* gh-4134: uninitialized use for size 1 object reductions

Changes
=======

NDIter
~~~~~~
When ``NpyIter_RemoveAxis`` is called, the iterator range will now be reset.

When a multi index is being tracked and an iterator is not buffered, it is
possible to use ``NpyIter_RemoveAxis``. In this case an iterator can shrink
in size. Because the total size of an iterator is limited, the iterator
may be too large before these calls. In this case its size will be set to ``-1``
and an error issued not at construction time but when removing the multi
index, setting the iterator range, or getting the next function.

This has no effect on currently working code, but highlights the necessity
of checking for an error return if these conditions can occur. In most
cases the arrays being iterated are as large as the iterator so that such
a problem cannot occur.

Optional reduced verbosity for np.distutils
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Set ``numpy.distutils.system_info.system_info.verbosity = 0`` and
subsequent calls to ``numpy.distutils.system_info.get_info('blas_opt')``
will not print anything to the output. This is mostly for other packages
using numpy.distutils.

Deprecations
============

C-API
~~~~~

The utility functions npy_PyFile_Dup and npy_PyFile_DupClose are broken by
the internal buffering that python 3 applies to its file objects.  To fix
this, two new functions, npy_PyFile_Dup2 and npy_PyFile_DupClose2, are
declared in npy_3kcompat.h and the old functions are deprecated.
Due to the fragile nature of these functions it is recommended to instead use
the python API when possible.