Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Fix for issue #291 #444

Merged
merged 9 commits into from

7 participants

@87
87 commented

Hi, I made an attempt at fixing #291, by redoing part of the array interface.

It can now accept objects that do not specify their 'data' attribute and have a shape of () or a shape of (1,). To accept no shape information at all is a bit dangerous, because it would be prone to errors.

This now works:

class Foo(object):
    def __init__(self, value):
        self.value = value
    def __float__(self):
        return float(self.value)
    @property
    def __array_interface__(self):
        return {'typestr' : '=f8', 'shape' : ()}

>>> f = Foo(0.5)
>>> np.array(f)
array([ 0.5])

But without the shape information, it still gives the error:
ValueError: Missing __array_interface__ shape

I think enforcing the shape would be the best way to go, because otherwise the array interface can be interpreted wrong when someone does supply the 'data' attribute.

@87
87 commented

Hmm, seems the coercion does not work for Python 3.x..

@87
87 commented

Now it does; the typestring is unicode by default on Python 3.

@teoliphant
Owner

Why can't we assume that shape is () if not given? We could raise an error if data is given without shape. I'm not sure we can fix this with just changes to the array interface. It seems that the context of using the array interface is important as well (i.e. using it to get the data-type in the array factory).

@87
87 commented

Hmm, yes, that could be possible, but is it safe to assume that every object that does not specify a shape is a scalar?

@87
87 commented

The array interface would be the most logical place, because it is queried for interface information by PyArray_GetArrayParamsFromObject.

@87
87 commented

The PyArray_FromInterface already constructs the array object, so it does not return shape/dtype information.

@njsmith
Owner

I don't think it matters so much what requirements we decide to put on shape, just so long as we document them clearly and enforce them in the code, so that later on we'll have some confidence that people are using this interface the way we expect. The current __array_interface__ docs are (IMHO) quite confusing already, with many small inconsistencies on exactly what is allowed/required, even without mentioning this auto-conversion usage.

What are the intended semantics, btw? (The patch doesn't seem to update the docs to match.) Is the idea that if you leave off the data attribute, then we just try to cast the object to the specified sort of array using normal python conversion rules? Basically calling np.array(obj, dtype=obj.__array_interface__["descr"]).reshape(obj.__array_interface__["shape"])?

@87
87 commented

It creates a new array from the dtype that follows out of "typestr" and the shape from the shape tuple and sets the object as item, basically doing a[0] = obj. That is also what happens in setArrayFromSequence, which also coerces data when the array interface is specified.

@strogdon

I've been following this from the Sage side. The old numpy behavior was such that if a single object didn't have a shape attribute then numpy.array(object) would return a numpy array with a 'shape': () attribute and numpy.array([object]) would return a numpy array with a 'shape': (1,). With the present commit both numpy.array(object) and numpy.array([object)] return numpy arrays with 'shape': (1,). Is it inconsistent with the array_interface for numpy.array(object) of this commit to return a numpy array with 'shape': (), i.e. the old numpy behavior? If I'm not mistaken the present Sage relies, to some extent, on this.

@teoliphant
Owner

This really doesn't need any discussion does it? We need to revert to the old behavior and then if it is desired to change that behavior figure out a migration path. It is completely acceptable to allow the array interface to not specify a shape and assume that () is intended.

@teoliphant
Owner

@87 Yes, it is safe to assume that an object exporting the array_interface that does not specify a shape is a scalar.

@teoliphant
Owner

We can and should require the shape information if there is also a data-parameter. If there is no data attribute, then we can assume shape is () if not given.

@strogdon

Just so I wasn't misunderstood, by the "old numpy behavior" I wasn't referring to the missing shape information that's passed to numpy.array. I was referring to what is returned by numpy.array. The shape attribute may well be needed for objects passed to numpy.array with numpy-1.7.x. The old numpy.array(object) returned a numpy array with shape () while with this commit numpy.array(object) returns a numpy array with shape (1,). The old behavior

>>> f=Foo(0.5)
>>> f.__array_interface__
{'typestr': '=f8'}
>>> numpy.array(f).__array_interface__
{'descr': [('', '<f8')], 'strides': None, 'shape': (), 'version': 3, 'typestr': '<f8', 'data': (10034896, False)}
>>> numpy.array(f)
array(0.5)
>>> numpy.array([f])
array([ 0.5])

The new behavior

>>> f=Foo(0.5)
>>> f.__array_interface__
{'typestr': '=f8', 'shape': ()}
>>> numpy.array(f).__array_interface__
{'descr': [('', '<f8')], 'strides': None, 'shape': (1,), 'version': 3, 'typestr': '<f8', 'data': (138360552, False)}
>>> numpy.array(f)
array([ 0.5])
>>> numpy.array([f])
array([ 0.5])
@njsmith
Owner

@strogdon: I think that's an unambiguous bug, yes. We shouldn't be converting 0-d arrays into 1-d arrays gratuitously.

@teoliphant: It looks like the only people who care about this are Sage, and they already use a frozen snapshot of numpy, since we already broke this in 1.6. So if there's a better way for this interface to work, I think a reasonable migration scheme is to just release that better way in 1.7, and Sage will make the (trivial) adaptations necessary at the same time they migrate their snapshot to 1.7?

But, like I said, I don't think it really matters what rules we use for interpreting the shape field (or its absence) etc., so long as they're complete, documented, and tested.

@teoliphant
Owner
@87
87 commented

Hi, I made some changes to the PR. The implicit conversion of scalar to one-dimensional array has been removed, so that this works again:

>>> f=Foo(0.5)
>>> numpy.array(f)
array(0.5)

Also, it is now possible to omit the shape information if data is also omitted. A few more test cases have been added to reflect this behaviour.

The docs still have to be fixed. It also might be a good idea to test other aspects of the array interface as well...

@certik
Owner

@teoliphant, should this PR go into the release as well? Or into the next release.

@teoliphant
Owner
@certik
Owner

Excellent, I put it into the milestone for this release.

The changes in this PR are quite extensive, so we need to test this + review it thoroughly.

@teoliphant teoliphant merged commit ca27396 into from
@kiwifb

Sorry to commenting after this is closed but I see that the pull has been committed on master but there is no trace of it on the 1.7.x maintenance branch. I thought the plan was to have it in numpy 1.7.0. Am I mistaken in thinking that the release will be cut from the current 1.7.x maintenance branch?

@charris
Owner

@certik It looks like you had this marked for the 1.7 release.

@certik certik referenced this pull request
Merged

Backport444 #2815

@certik
Owner

@charris, @kiwifb, thanks for the reminder, I just backported it in the PR #2815.

@teoliphant teoliphant commented on the diff
numpy/core/src/multiarray/ctors.c
((152 lines not shown))
PyObject *dataptr;
+ if (n == 0) {
+ PyErr_SetString(PyExc_ValueError,
@teoliphant Owner

This check is causing np.ndindex() to fail in master which I didn't catch earlier. Why is this check here? We needed to remove this check to get code that used np.ndindex() working against master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@teoliphant
Owner

It looks like the check for (n==0) is causing a failure on master of np.nindex(). Shall I open a new issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
View
19 numpy/core/src/multiarray/common.c
@@ -309,9 +309,24 @@ PyArray_DTypeFromObjectHelper(PyObject *obj, int maxdims,
if (ip != NULL) {
if (PyDict_Check(ip)) {
PyObject *typestr;
+#if defined(NPY_PY3K)
+ PyObject *tmp = NULL;
+#endif
typestr = PyDict_GetItemString(ip, "typestr");
- if (typestr && PyString_Check(typestr)) {
- dtype =_array_typedescr_fromstr(PyString_AS_STRING(typestr));
+#if defined(NPY_PY3K)
+ /* Allow unicode type strings */
+ if (PyUnicode_Check(typestr)) {
+ tmp = PyUnicode_AsASCIIString(typestr);
+ typestr = tmp;
+ }
+#endif
+ if (typestr && PyBytes_Check(typestr)) {
+ dtype =_array_typedescr_fromstr(PyBytes_AS_STRING(typestr));
+#if defined(NPY_PY3K)
+ if (tmp == typestr) {
+ Py_DECREF(tmp);
+ }
+#endif
Py_DECREF(ip);
if (dtype == NULL) {
goto fail;
View
241 numpy/core/src/multiarray/ctors.c
@@ -1977,84 +1977,113 @@ PyArray_FromStructInterface(PyObject *input)
/*NUMPY_API*/
NPY_NO_EXPORT PyObject *
-PyArray_FromInterface(PyObject *input)
+PyArray_FromInterface(PyObject *origin)
{
- PyObject *attr = NULL, *item = NULL;
- PyObject *tstr = NULL, *shape = NULL;
- PyObject *inter = NULL;
+ PyObject *tmp = NULL;
+ PyObject *iface = NULL;
+ PyObject *attr = NULL;
PyObject *base = NULL;
PyArrayObject *ret;
- PyArray_Descr *type=NULL;
- char *data;
+ PyArray_Descr *dtype = NULL;
+ char *data = NULL;
Py_ssize_t buffer_len;
int res, i, n;
npy_intp dims[NPY_MAXDIMS], strides[NPY_MAXDIMS];
int dataflags = NPY_ARRAY_BEHAVED;
- /* Get the memory from __array_data__ and __array_offset__ */
- /* Get the shape */
/* Get the typestring -- ignore array_descr */
+ /* Get the shape */
+ /* Get the memory from __array_data__ and __array_offset__ */
/* Get the strides */
- inter = PyObject_GetAttrString(input, "__array_interface__");
- if (inter == NULL) {
+ iface = PyObject_GetAttrString(origin, "__array_interface__");
+ if (iface == NULL) {
PyErr_Clear();
return Py_NotImplemented;
}
- if (!PyDict_Check(inter)) {
- Py_DECREF(inter);
+ if (!PyDict_Check(iface)) {
+ Py_DECREF(iface);
PyErr_SetString(PyExc_ValueError,
"Invalid __array_interface__ value, must be a dict");
return NULL;
}
- shape = PyDict_GetItemString(inter, "shape");
- if (shape == NULL) {
- Py_DECREF(inter);
- PyErr_SetString(PyExc_ValueError,
- "Missing __array_interface__ shape");
- return NULL;
- }
- tstr = PyDict_GetItemString(inter, "typestr");
- if (tstr == NULL) {
- Py_DECREF(inter);
+
+ /* Get type string from interface specification */
+ attr = PyDict_GetItemString(iface, "typestr");
+ if (attr == NULL) {
+ Py_DECREF(iface);
PyErr_SetString(PyExc_ValueError,
"Missing __array_interface__ typestr");
return NULL;
}
+#if defined(NPY_PY3K)
+ /* Allow unicode type strings */
+ if (PyUnicode_Check(attr)) {
+ tmp = PyUnicode_AsASCIIString(attr);
+ attr = tmp;
+ }
+#endif
+ if (!PyBytes_Check(attr)) {
+ PyErr_SetString(PyExc_TypeError,
+ "__array_interface__ typestr must be a string");
+ goto fail;
+ }
+ /* Get dtype from type string */
+ dtype = _array_typedescr_fromstr(PyString_AS_STRING(attr));
+#if defined(NPY_PY3K)
+ if (tmp == attr) {
+ Py_DECREF(tmp);
+ }
+#endif
+ if (dtype == NULL) {
+ goto fail;
+ }
- attr = PyDict_GetItemString(inter, "data");
- base = input;
- if ((attr == NULL) || (attr==Py_None) || (!PyTuple_Check(attr))) {
- if (attr && (attr != Py_None)) {
- item = attr;
+ /* Get shape tuple from interface specification */
+ attr = PyDict_GetItemString(iface, "shape");
+ if (attr == NULL) {
+ /* Shape must be specified when 'data' is specified */
+ if (PyDict_GetItemString(iface, "data") != NULL) {
+ Py_DECREF(iface);
+ PyErr_SetString(PyExc_ValueError,
+ "Missing __array_interface__ shape");
+ return NULL;
}
+ /* Assume shape as scalar otherwise */
else {
- item = input;
- }
- res = PyObject_AsWriteBuffer(item, (void **)&data, &buffer_len);
- if (res < 0) {
- PyErr_Clear();
- res = PyObject_AsReadBuffer(
- item, (const void **)&data, &buffer_len);
- if (res < 0) {
- goto fail;
- }
- dataflags &= ~NPY_ARRAY_WRITEABLE;
+ /* NOTE: pointers to data and base should be NULL */
+ n = dims[0] = 0;
}
- attr = PyDict_GetItemString(inter, "offset");
- if (attr) {
- npy_longlong num = PyLong_AsLongLong(attr);
- if (error_converting(num)) {
- PyErr_SetString(PyExc_TypeError,
- "__array_interface__ offset must be an integer");
+ }
+ /* Make sure 'shape' is a tuple */
+ else if (!PyTuple_Check(attr)) {
+ PyErr_SetString(PyExc_TypeError,
+ "shape must be a tuple");
+ goto fail;
+ }
+ /* Get dimensions from shape tuple */
+ else {
+ n = PyTuple_GET_SIZE(attr);
+ for (i = 0; i < n; i++) {
+ tmp = PyTuple_GET_ITEM(attr, i);
+ dims[i] = PyArray_PyIntAsIntp(tmp);
+ if (error_converting(dims[i])) {
goto fail;
}
- data += num;
}
- base = item;
}
- else {
+
+ /* Get data buffer from interface specification */
+ attr = PyDict_GetItemString(iface, "data");
+
+ /* Case for data access through pointer */
+ if (attr && PyTuple_Check(attr)) {
PyObject *dataptr;
+ if (n == 0) {
+ PyErr_SetString(PyExc_ValueError,
@teoliphant Owner

This check is causing np.ndindex() to fail in master which I didn't catch earlier. Why is this check here? We needed to remove this check to get code that used np.ndindex() working against master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+ "__array_interface__ shape must be at least size 1");
+ goto fail;
+ }
if (PyTuple_GET_SIZE(attr) != 2) {
PyErr_SetString(PyExc_TypeError,
"__array_interface__ data must be a 2-tuple with "
@@ -2083,90 +2112,102 @@ PyArray_FromInterface(PyObject *input)
if (PyObject_IsTrue(PyTuple_GET_ITEM(attr,1))) {
dataflags &= ~NPY_ARRAY_WRITEABLE;
}
+ base = origin;
}
- attr = tstr;
-#if defined(NPY_PY3K)
- if (PyUnicode_Check(tstr)) {
- /* Allow unicode type strings */
- attr = PyUnicode_AsASCIIString(tstr);
- }
-#endif
- if (!PyBytes_Check(attr)) {
- PyErr_SetString(PyExc_TypeError,
- "__array_interface__ typestr must be a string");
- goto fail;
- }
- type = _array_typedescr_fromstr(PyString_AS_STRING(attr));
-#if defined(NPY_PY3K)
- if (attr != tstr) {
- Py_DECREF(attr);
- }
-#endif
- if (type == NULL) {
- goto fail;
- }
- attr = shape;
- if (!PyTuple_Check(attr)) {
- PyErr_SetString(PyExc_TypeError,
- "shape must be a tuple");
- Py_DECREF(type);
- goto fail;
- }
- n = PyTuple_GET_SIZE(attr);
- for (i = 0; i < n; i++) {
- item = PyTuple_GET_ITEM(attr, i);
- dims[i] = PyArray_PyIntAsIntp(item);
- if (error_converting(dims[i])) {
- break;
+
+ /* Case for data access through buffer */
+ else if (attr) {
+ if (n == 0) {
+ PyErr_SetString(PyExc_ValueError,
+ "__array_interface__ shape must be at least size 1");
+ goto fail;
+ }
+ if (attr && (attr != Py_None)) {
+ base = attr;
+ }
+ else {
+ base = origin;
+ }
+ res = PyObject_AsWriteBuffer(base, (void **)&data, &buffer_len);
+ if (res < 0) {
+ PyErr_Clear();
+ res = PyObject_AsReadBuffer(
+ base, (const void **)&data, &buffer_len);
+ if (res < 0) {
+ goto fail;
+ }
+ dataflags &= ~NPY_ARRAY_WRITEABLE;
+ }
+ /* Get offset number from interface specification */
+ attr = PyDict_GetItemString(origin, "offset");
+ if (attr) {
+ npy_longlong num = PyLong_AsLongLong(attr);
+ if (error_converting(num)) {
+ PyErr_SetString(PyExc_TypeError,
+ "__array_interface__ offset must be an integer");
+ goto fail;
+ }
+ data += num;
}
}
- ret = (PyArrayObject *)PyArray_NewFromDescr(&PyArray_Type, type,
+ ret = (PyArrayObject *)PyArray_NewFromDescr(&PyArray_Type, dtype,
n, dims,
NULL, data,
dataflags, NULL);
if (ret == NULL) {
- return NULL;
+ goto fail;
}
- Py_INCREF(base);
- if (PyArray_SetBaseObject(ret, base) < 0) {
- Py_DECREF(ret);
- return NULL;
+ if (data == NULL) {
+ if (PyArray_SIZE(ret) > 1) {
+ PyErr_SetString(PyExc_ValueError,
+ "cannot coerce scalar to array with size > 1");
+ Py_DECREF(ret);
+ goto fail;
+ }
+ if (PyArray_SETITEM(ret, PyArray_DATA(ret), origin) < 0) {
+ Py_DECREF(ret);
+ goto fail;
+ }
}
-
- attr = PyDict_GetItemString(inter, "strides");
+ if (base) {
+ Py_INCREF(base);
+ if (PyArray_SetBaseObject(ret, base) < 0) {
+ Py_DECREF(ret);
+ goto fail;
+ }
+ }
+ attr = PyDict_GetItemString(iface, "strides");
if (attr != NULL && attr != Py_None) {
if (!PyTuple_Check(attr)) {
PyErr_SetString(PyExc_TypeError,
"strides must be a tuple");
Py_DECREF(ret);
- return NULL;
+ goto fail;
}
if (n != PyTuple_GET_SIZE(attr)) {
PyErr_SetString(PyExc_ValueError,
"mismatch in length of strides and shape");
Py_DECREF(ret);
- return NULL;
+ goto fail;
}
for (i = 0; i < n; i++) {
- item = PyTuple_GET_ITEM(attr, i);
- strides[i] = PyArray_PyIntAsIntp(item);
+ tmp = PyTuple_GET_ITEM(attr, i);
+ strides[i] = PyArray_PyIntAsIntp(tmp);
if (error_converting(strides[i])) {
- break;
+ Py_DECREF(ret);
+ goto fail;
}
}
- if (PyErr_Occurred()) {
- PyErr_Clear();
- }
memcpy(PyArray_STRIDES(ret), strides, n*sizeof(npy_intp));
}
- else PyErr_Clear();
PyArray_UpdateFlags(ret, NPY_ARRAY_UPDATE_ALL);
- Py_DECREF(inter);
+ Py_DECREF(iface);
return (PyObject *)ret;
fail:
- Py_XDECREF(inter);
+ Py_XDECREF(dtype);
+ Py_XDECREF(iface);
return NULL;
}
View
25 numpy/core/tests/test_multiarray.py
@@ -2809,6 +2809,31 @@ def test_multiarray_flags_not_writable_attribute_deletion(self):
for s in attr:
assert_raises(AttributeError, delattr, a, s)
+def test_array_interface():
+ # Test scalar coercion within the array interface
+ class Foo(object):
+ def __init__(self, value):
+ self.value = value
+ self.iface = {'typestr' : '=f8'}
+ def __float__(self):
+ return float(self.value)
+ @property
+ def __array_interface__(self):
+ return self.iface
+ f = Foo(0.5)
+ assert_equal(np.array(f), 0.5)
+ assert_equal(np.array([f]), [0.5])
+ assert_equal(np.array([f, f]), [0.5, 0.5])
+ assert_equal(np.array(f).dtype, np.dtype('=f8'))
+ # Test various shape definitions
+ f.iface['shape'] = ()
+ assert_equal(np.array(f), 0.5)
+ f.iface['shape'] = None
+ assert_raises(TypeError, np.array, f)
+ f.iface['shape'] = (1,1)
+ assert_equal(np.array(f), [[0.5]])
+ f.iface['shape'] = (2,)
+ assert_raises(ValueError, np.array, f)
def test_flat_element_deletion():
it = np.ones(3).flat
Something went wrong with that request. Please try again.