Type-safe copying #616
When InputIterator (host) is a non-contiguous iterator we don't need a separate algorithm for cases when value_types of InputIterator and OutputIterator (device) do not match and cases when they do match.
Type-safe copying from device to host. Separate copying algorithm (device -> host) for a non-contiguous OutputIterator (host).
At the end of the test we should read from the input vector (not the output) in order to check that transform() with as<int>() was performed correctly.
This commit modifies svm_ptr<T> to keep its context. It is convenient for the users and enables creating svm_ptr_index_expr<T, IndexExpr> class.
Tests for copying SVM memory to/from/on device when value_types of InputIterator and OutputIterator mismatch.
example/opencv_histogram.cpp
#define BOOST_COMPUTE_DEBUG_KERNEL_COMPILATION
#ifndef BOOST_COMPUTE_DEBUG_KERNEL_COMPILATION
#define BOOST_COMPUTE_DEBUG_KERNEL_COMPILATION
#endif
We should probably just remove this define from the file. It should only really be defined by the user when they are debugging.
Will fix that later today.
make_buffer_iterator<output_type>(mapped_host),
queue
);
// update host memory asynchronously by maping and unmaping memory
Shouldn't asynchronously here be synchronously? Also, I think an alternative would be, instead of mapping and unmapping the buffer, to replace the call above with copy_on_device_async() and wait on the returned event/future.
Yes, indeed, it should be synchronously, since wait() is later called on the event returned by the unmap operation.
And yes, those lines can be replaced by just calling copy_on_device_async(), as it does the same thing (creates a buffer for mapping host memory, performs the copy on the device, and maps/unmaps the buffer).
Fix: dispatch_copy_async(), not copy_on_device_async(). We have to map and unmap at the end to force the OpenCL driver to update host memory (mapping is a synchronization point).
Looks good! Left a few comments. I'll finish up reviewing/testing this out today.
Thanks! Just fyi, I ran all the tests with SVM on Windows with the latest AMD drivers with fixed
I've just found that there is an inconsistency in what async. copy functions return when
Awesome! Merged!
Resolves #363.
Sync. copying to/from device
For copying data both to and from the device there are 3 dispatch_copy() functions, and each function handles a different scenario:
- value types (of InputIterator and OutputIterator) match and the host iterator is contiguous
1st option
It is straightforward copying using the OpenCL functions for reading and writing device memory.
2nd option has three sub-options
- std::copy (casting/converting and copying on host)
- std::vector<DeviceIterator::value_type> as an intermediate memory (casting/converting on host, copying using an OpenCL function)
3rd option has two sub-options
- std::copy (casting/converting and copying on host)
- std::vector<DeviceIterator::value_type> as an intermediate memory (casting/converting on host, copying using an OpenCL function)
Additional info
When copying to host memory we treat host iterators whose value type is bool as if they were non-contiguous.
Async. copying to/from device
Asynchronous copying with mismatching types works too. In both cases (to/from device) I mapped host memory to the device and used a copy kernel to both convert and copy the data.
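The choice between the synchronous dispatch_copy() scenarios can be sketched on the host side with plain standard C++. This is only an illustration, not the code from this PR: the names is_contiguous_host_iterator, copy_path, select_copy_path and stage_for_device are hypothetical, the second and third scenarios (contiguous host iterator with mismatching value types, and non-contiguous host iterator) are inferred from the commit messages, and the bool rule is applied in both directions for simplicity. The staging helper only shows the "convert on host into a std::vector of the device value type" step; the actual OpenCL transfer is left out.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical trait (not Boost.Compute's own): raw pointers and std::vector
// iterators count as contiguous, except when the value type is bool, which
// is treated as non-contiguous.
template <class It>
struct is_contiguous_host_iterator {
    using value_type = typename std::iterator_traits<It>::value_type;
    static constexpr bool value =
        !std::is_same<value_type, bool>::value &&
        (std::is_pointer<It>::value ||
         std::is_same<It, typename std::vector<value_type>::iterator>::value ||
         std::is_same<It, typename std::vector<value_type>::const_iterator>::value);
};

// The three scenarios, named for illustration only.
enum class copy_path { direct, convert_contiguous, non_contiguous };

// Picks a scenario from the host iterator type and the device value type.
template <class HostIt, class DeviceValue>
constexpr copy_path select_copy_path() {
    return !is_contiguous_host_iterator<HostIt>::value
               ? copy_path::non_contiguous
               : std::is_same<typename std::iterator_traits<HostIt>::value_type,
                              DeviceValue>::value
                     ? copy_path::direct
                     : copy_path::convert_contiguous;
}

// Converts host data into a contiguous staging vector of the device value
// type; a real implementation would then hand staged.data() to an OpenCL
// write call.
template <class DeviceValue, class HostIt>
std::vector<DeviceValue> stage_for_device(HostIt first, HostIt last) {
    return std::vector<DeviceValue>(first, last);
}

static_assert(select_copy_path<std::vector<float>::iterator, float>() ==
                  copy_path::direct,
              "matching types, contiguous host iterator");
static_assert(select_copy_path<std::vector<int>::iterator, float>() ==
                  copy_path::convert_contiguous,
              "mismatching types, contiguous host iterator");
static_assert(select_copy_path<std::list<float>::iterator, float>() ==
                  copy_path::non_contiguous,
              "non-contiguous host iterator");
static_assert(select_copy_path<std::vector<bool>::iterator, bool>() ==
                  copy_path::non_contiguous,
              "bool is treated as non-contiguous");

// Usage sketch: int host data staged as float before a bulk transfer.
inline void usage_example() {
    std::vector<int> host = {1, 2, 3};
    std::vector<float> staged = stage_for_device<float>(host.begin(), host.end());
    assert(staged.size() == 3 && staged[2] == 3.0f);
}
```

Resolving the path at compile time keeps the matching-types case free of any conversion overhead, which is presumably why the PR keeps it as a separate function.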
Various fixes
- perf_sort_by_key.cpp
- svm_ptr<> from/to device