@xiaoming-qxm @baoleai we can use `thrust::cuda::par.on(stream)`, e.g.
```cpp
#include <cuda_runtime.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/sort.h>
#include <thrust/transform.h>

// Functor that clamps each value to an upper bound.
template <typename T>
class cap_by {
    const T cap;

  public:
    cap_by(const T cap) : cap(cap) {}

    __host__ __device__ T operator()(T x) const {
        if (x > cap) { return cap; }
        return x;
    }
};

void f(cudaStream_t stream) {
    int n = 1 << 10;
    using T = int;
    thrust::device_vector<T> xs(n);
    thrust::device_vector<T> ys(n);
    // Run the algorithms on the caller-provided stream.
    thrust::sort(thrust::cuda::par.on(stream), xs.begin(), xs.end());
    int k = 5;
    thrust::transform(thrust::cuda::par.on(stream), xs.begin(), xs.end(),
                      ys.begin(), cap_by<T>(k));
}

int main() {
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    f(stream);
    cudaStreamDestroy(stream);
    return 0;
}
```
Awesome! So we can port pytorch_quiver to the actor framework easily.
But it seems that the Thrust APIs don't support asynchronous execution; the unavoidable synchronization after every algorithm call is a performance killer.
Reference: NVIDIA/thrust#961
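For what it's worth, newer Thrust releases ship a `thrust::async` namespace whose algorithms return an event instead of blocking the host; a minimal sketch, assuming a Thrust version that provides `<thrust/async/sort.h>` (the function `g` is just an illustrative name):

```cpp
#include <thrust/async/sort.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>

void g(cudaStream_t stream) {
    thrust::device_vector<int> xs(1 << 10);
    // thrust::async::sort enqueues the work on the given stream and
    // returns immediately, handing back an event rather than synchronizing.
    auto e = thrust::async::sort(thrust::cuda::par.on(stream),
                                 xs.begin(), xs.end());
    // ...other host work can overlap with the sort here...
    e.wait();  // block only when the sorted result is actually needed
}
```

Whether this covers enough of the algorithm surface for the use case here is discussed in the linked issue.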