Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

DOC: document the change in .base #2737

Merged
merged 1 commit into from

3 participants

@certik
Owner

Fixes gh-438.

/cc @njsmith, @iguananaut.

@njsmith
Owner
@certik
Owner

You probably mean #2708. Yes, I used that one.

@njsmith
Owner
@certik
Owner

Here is the exact text from the PR #2708:

+     * Don't allow infinite chains of views, always set the base
+     * to the first owner of the data.  
+     * That is, either the first object which isn't an array, 
+     * or the first object which owns its own data.

and here is the exact text from this PR:

+The ``.base`` attribute now points to the first owner of the data; that is,
+either the first object which isn't an array, or the first object which owns
+its own data.

It seems to me that the two texts are exactly equivalent. Can you point me to a difference? Maybe the comment in the PR #2708 is not correct.

@njsmith
Owner

The texts are equivalent. Now that I look at it, that PR is kind of not very well written though...

The key check appears to be:

    if (!(PyArray_CheckExact(arr)) & (Py_TYPE(tmp) != Py_TYPE(arr))) {
        /* stop collapsing */
    }

So obviously that should be && not &, but I think in this case & does produce the same behaviour as && since the two inputs are the output from C boolean operators, which always return either 0 or 1 exactly. So, we can ignore that. What I think the code says is that if your view is a subclass of ndarray, then you only collapse out the base pointers that also point to the exact same subclass. As soon as you hit a .base which points to a regular ndarray, or a subclass-of-your-subclass, or anything else, you stop. For regular base class ndarrays, you ignore this rule, and just collapse right through subclasses etc. until you run out of .base pointers (like the comment says).

Sort of a strange rule, and the comment is misleading.

@certik
Owner

@teoliphant, do you agree with the better explanation by Nathan? If so, I can send a PR to both the master and the release branch with this better explanation.

@embray

Maybe this isn't the best place to report this, but I was just playing around with Numpy dev with some of my code and hit a use case where these new .base semantics don't hold up as documented:

print np.__version__
arr_base = np.arange(0, 10000, dtype=np.ubyte)
dtype = np.rec.format_parser('2i4', ['foo'], None).dtype
arr_overlay = arr_base[:2000].view(dtype=dtype, type=np.rec.recarray)
print arr_base.shape
print arr_base.flags

print arr_overlay.base.shape
print arr_overlay.base.flags
print arr_overlay.base is arr_base
1.8.0.dev-7b75899
(10000,)
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
(2000,)
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
False

arr_overlay.base should be the original arr_base. Instead the slice of arr_base[:2000] is the base. In my actual code the arr_base is actually a mmap and then things get even worse.

@certik certik referenced this pull request from a commit
Commit has since been removed from the repository and is no longer available.
@certik
Owner

@njsmith, I've updated the docs based on your comment. Can you please review it?

@njsmith
Owner

Can we write

The `.base` attribute on ndarrays, which is used on views to ensure that the underlying array owning the memory is not deallocated prematurely, now collapses out references when you have a view-of-a-view. For example::

    a = np.arange(10)
    b = a[1:]
    c = b[1:]

In numpy 1.6, `c.base` is `b`, and `c.base.base` is `a`. In numpy 1.7, `c.base` is `a`.

To increase backwards compatibility for software which relies on the old behaviour of `.base`, we only 'skip over' objects which have exactly the same type as the newly created view. This makes a difference if you use `ndarray` subclasses. For example, if we have a mix of `ndarray` and `matrix` objects which are all views on the same original `ndarray`::

    a = np.arange(10)
    b = np.asmatrix(a)
    c = b[0, 1:]
    d = c[0, 1:]

then `d.base` will be `b`. This is because `d` is a `matrix` object, and so the collapsing process only continues so long as it encounters other `matrix` objects. It considers `c`, `b`, and `a` in that order, and `b` is the last entry in that list which is a `matrix` object.

? I think that's accurate and hopefully clearer for non-experts...?

@embray

@njsmith I think that makes sense and explains my earlier confusion on this.

@certik certik DOC: document the change in .base
Fixes gh-438. Based on Nathan's comments in gh-2737.
a757cb5
@certik
Owner

Sounds good. I have rebased the commits and just used the latest Nathan's description. I think it is finally clear (even to me:). I just fixed the rst formatting. Now it renders correctly. @njsmith, if you agree, then just merge this PR please.

@njsmith
Owner

okie-dokie

@njsmith njsmith merged commit c2b2422 into numpy:maintenance/1.7.x
@certik certik referenced this pull request from a commit
@certik certik DOC: document the change in .base
Forwardport gh-2737.
a8c641f
@certik certik deleted the certik:fix438 branch
@certik
Owner

Thanks. Forwardported in a8c641f.

@certik certik referenced this pull request from a commit
Commit has since been removed from the repository and is no longer available.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Dec 26, 2012
  1. @certik

    DOC: document the change in .base

    certik authored
    Fixes gh-438. Based on Nathan's comments in gh-2737.
This page is out of date. Refresh to see the latest.
Showing with 27 additions and 0 deletions.
  1. +27 −0 doc/release/1.7.0-notes.rst
View
27 doc/release/1.7.0-notes.rst
@@ -181,6 +181,33 @@ objects. For example the expression (3 and 'test') produces the string
'test', and now np.logical_and(np.array(3, 'O'), np.array('test', 'O'))
produces 'test' as well.
+The ``.base`` attribute on ndarrays, which is used on views to ensure that the
+underlying array owning the memory is not deallocated prematurely, now
+collapses out references when you have a view-of-a-view. For example::
+
+ a = np.arange(10)
+ b = a[1:]
+ c = b[1:]
+
+In numpy 1.6, ``c.base`` is ``b``, and ``c.base.base`` is ``a``. In numpy 1.7,
+``c.base`` is ``a``.
+
+To increase backwards compatibility for software which relies on the old
+behaviour of ``.base``, we only 'skip over' objects which have exactly the same
+type as the newly created view. This makes a difference if you use ``ndarray``
+subclasses. For example, if we have a mix of ``ndarray`` and ``matrix`` objects
+which are all views on the same original ``ndarray``::
+
+ a = np.arange(10)
+ b = np.asmatrix(a)
+ c = b[0, 1:]
+ d = c[0, 1:]
+
+then ``d.base`` will be ``b``. This is because ``d`` is a ``matrix`` object,
+and so the collapsing process only continues so long as it encounters other
+``matrix`` objects. It considers ``c``, ``b``, and ``a`` in that order, and
+``b`` is the last entry in that list which is a ``matrix`` object.
+
Deprecations
============
Something went wrong with that request. Please try again.