Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: core: in dot(), make copies if out has memory overlap with input #8539

Merged
merged 4 commits into from
Jan 31, 2017

Conversation

pv
Copy link
Member

@pv pv commented Jan 28, 2017

BLAS does not allow aliased inputs. If user-provided out= argument may
overlap in memory with one of the inputs to dot(), put the output to a
temporary work array and copy back after the operation.

Fixes gh-8440

BLAS does not allow aliased inputs. If user-provided out= argument may
overlap in memory with one of the inputs to dot(), put the output to a
temporary work array and copy back after the operation.
else {
Py_INCREF(out);
ret = out;
}
Py_INCREF(out);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? Looks like it has already been done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, a note would be useful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the variable naming, it should be more clear what the difference between ret and result is

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charris: yes it's needed, since both variables are owned references.
@juliantaylor: agreed, would be best to change.

@@ -742,11 +767,13 @@ cblas_matrixproduct(int typenum, PyArrayObject *ap1, PyArrayObject *ap2,

Py_DECREF(ap1);
Py_DECREF(ap2);
return PyArray_Return(ret);
Py_DECREF(ret);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might note that this triggers copyback if needed.

def test_dot_out_mem_overlap(self):
np.random.seed(1)

for dtype in [np.object_, np.float32, np.complex128, np.int64]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also test float64 and complex64?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need multiple dtype tests at all? none of the new code has any relations to type. Doesn't matter much here as its a fast test, but I like to avoid unnecessary test bloat.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two code paths --- blas and non-blas, so you need at least two dtypes.

x = np.dot(a, b, out=b)
assert_equal(x, y, err_msg=repr(dtype))
# Test BLAS and non-BLAS code paths
for dtype in [np.float32, np.int64]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I prefer the maximal tests because the code may change in the future. The tests not only check current behavior, they also define correct behavior for the future.

@charris charris merged commit d4f5bea into numpy:master Jan 31, 2017
@charris
Copy link
Member

charris commented Jan 31, 2017

Thanks Pauli.

@se4u
Copy link

se4u commented Jan 31, 2017

Thank you @pv

@pv pv mentioned this pull request Feb 8, 2017
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants