Skip to content

Conversation

jszuppe
Copy link
Contributor

@jszuppe jszuppe commented May 26, 2016

Resolves #363.

Sync. copying to/from device

For both coping data to and from device there are 3 dispatch_copy() functions and each function works in different scenario:

  1. Copying to device when value types (of InputIterator and OutputIterator) match and host iterator is contiguous
  2. Copying to device when value types mismatch and host iterator is contiguous
  3. Copying to device when host iterator is not contiguous (values types does not have to match)
1st option

It's straightforward copying using OpenCL functions for reading and writing a device memory.

2nd option has three sub-options
  1. Maping device memory to host and performing std::copy (casting/converting and copying on host)
  2. Using std::vector<DeviceIterator::value_type> as a intermediate memory (casting/converting on host, copying using OpenCL function)
  3. Maping host memory to the device and using copy kernel to both convert and copy the data.
3rd option has two sub-options
  1. Maping device memory to host and performing std::copy (casting/converting and copying on host)
  2. Using std::vector<DeviceIterator::value_type> as a intermediate memory (casting/converting on host, copying using OpenCL function)
Additional info

When copying to host memory we treat host iterators whose value type is bool as if they were non-contiguous.

Async. copying to/from device

There is asynchronous copying with mismatching types works too. In both cases (to/from device) I used maping host memory to the device and using copy kernel to both convert and copy the data.

Various fixes

  • Fix types in perf_sort_by_key.cpp
  • Fix return values in async coping svm_ptr<> from/to device
  • Fix test in test_functional_as.cpp

jszuppe added 10 commits May 21, 2016 16:44
When InputIterator (host) is a non-contiguous iterator we don't
need a separate algorithm for cases when value_types of InputIterator
and OutputIterator (device) do not match and cases when they do
match.
Type-safe copying from device to host. Seperate copying
algorithm device -> host for non-contiguous OutputIterator (host).
At the end of test we should read from input vector
(not output) in order to check if transform() with
as<int>() was performed correctly.
@coveralls
Copy link

Coverage Status

Coverage increased (+0.3%) to 80.828% when pulling c3f32be on haahh:pr_typesafe_copy into d41ff00 on boostorg:develop.

jszuppe added 8 commits May 28, 2016 13:15
This commit modifies svm_ptr<T> to keep its context. It is
convenient for the users and enables creating
svm_ptr_index_expr<T, IndexExpr> class.
Tests for copying SVM memory to/from/on device when
value_types of InputIterator and OutputIterator mismatch.
@jszuppe jszuppe force-pushed the pr_typesafe_copy branch from c3f32be to 69e09f2 Compare May 28, 2016 14:47
#define BOOST_COMPUTE_DEBUG_KERNEL_COMPILATION
#ifndef BOOST_COMPUTE_DEBUG_KERNEL_COMPILATION
#define BOOST_COMPUTE_DEBUG_KERNEL_COMPILATION
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably just remove this define from the file. It should only really be defined by the user when they are debugging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix that later today.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.3%) to 80.857% when pulling 69e09f2 on haahh:pr_typesafe_copy into d41ff00 on boostorg:develop.

make_buffer_iterator<output_type>(mapped_host),
queue
);
// update host memory asynchronously by maping and unmaping memory
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't asynchronously here be synchronously? Also, I think an alternative would be instead of mapping and unmapping the buffer, replace the call above with copy_on_device_async() and wait on the returned event/future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed, it should be synchronously since later wait() is called for event returned by unmap operation.

And yes, those lines can be replace by just calling copy_on_device_async() as it does the same (create buffer for mapping host memory, perform copying on device and map/unmap buffer).

Copy link
Contributor Author

@jszuppe jszuppe May 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix: dispatch_copy_async(), not copy_on_device_async(). We have to map and unmap at the end to force OpenCL driver to update host memory (mapping is a synchronization point).

@kylelutz
Copy link
Collaborator

kylelutz commented May 28, 2016

Looks good! Left a few comments. I'll finish up reviewing/testing this out today.

@jszuppe
Copy link
Contributor Author

jszuppe commented May 28, 2016

Thanks! Just fyi, I ran all the tests with SVM on Windows with the latest AMD drivers with fixed clEnqueueSVMMemcpy() (you can see that there are tests with svm_ptr<> that are skipped for AMD, this is the reason).

@jszuppe
Copy link
Contributor Author

jszuppe commented May 28, 2016

I've just found that that there is an inconsistency in what async. copy functions return when first == last. In one function it's future object with result iterator and an empty event (event()), in other functions it's empty future object... However, from user perspective it does not matter because it's impossible to get the value in both of those cases (user gets Invalid event error or sth like this when try to get() the value).

@coveralls
Copy link

Coverage Status

Coverage increased (+0.3%) to 80.861% when pulling 8f4db3d on haahh:pr_typesafe_copy into d41ff00 on boostorg:develop.

@kylelutz kylelutz merged commit 4c31d07 into boostorg:develop Jun 1, 2016
@kylelutz
Copy link
Collaborator

kylelutz commented Jun 1, 2016

Awesome! Merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants