New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of host to device memory copy #1367
Conversation
Jenkins CI test failed with status FAILURE. |
222b16d
to
9ab0d5d
Compare
Jenkins CI test (for commit |
Jenkins CI test (for commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you check Jenkins test failures?
cupy/core/core.pyx
Outdated
@@ -2124,20 +2125,20 @@ cpdef ndarray array(obj, dtype=None, bint copy=True, str order='K', | |||
order = 'A' | |||
a_cpu = numpy.array(obj, dtype=dtype, copy=False, order=order, | |||
ndmin=ndmin) | |||
order = 'C' if a_cpu.flags.c_contiguous else 'F' | |||
order = None if a_cpu.flags.c_contiguous else 'F' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change needed? It seems the behavior does not change between 'C'
and None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None
means 'C'
. And, checking None
is faster than checking 'C'
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks!
@okuta Could you check Jenkins test failures? |
Jenkins CI test (for commit 1faceab) failed with status FAILURE. |
Jenkins CI test (for commit 50e9150) failed with status FAILURE. |
Cloud you merge #1385 first? |
Jenkins, test this please. |
Jenkins CI test (for commit 50e9150) failed with status FAILURE. |
Jenkins, test this please. |
Jenkins CI test (for commit 50e9150) failed with status FAILURE. |
Jenkins CI test (for commit 0b9dec6) failed with status FAILURE. |
Jenkins, test this please. |
Jenkins CI test (for commit 0b9dec6) succeeded without errors! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except commented.
cupy/core/core.pyx
Outdated
@@ -2139,24 +2140,26 @@ cpdef ndarray array(obj, dtype=None, bint copy=True, str order='K', | |||
a = a.view() | |||
a.shape = (1,) * (ndmin - ndim) + a.shape | |||
else: | |||
if order == 'K': | |||
order = 'A' | |||
if order is not None and len(order) == 1 and order in 'KAka': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Order needs to be normalized to accept longer names (ANY
, KEEP
).
I benchmarked this PR with this code and confirmed performance improvement! 🎉 Current master:
This PR:
|
Thank you for your performance tests. |
Thanks! Jenkins, test this please. |
Jenkins CI test (for commit f65a4c5) succeeded without errors! |
LGTM! |
Improve host to device memory copy
Jenkins CI test (for commit f65a4c5) failed with status FAILURE. |
This PR removes CPU overhead.