
Tensor factories (and functions) should accept Tensors/Scalars as arguments to reduce sync points (?) #217

Open
c-hofer opened this issue Jun 26, 2018 · 0 comments


c-hofer commented Jun 26, 2018

Currently, if you want to create a new Tensor from size specifications that reside on the GPU, a sync is needed (imho). It would be great if something like the following could work:

... my_func(...) {
    ...
    auto my_gpu_tensor = ...;  // size N x 1, lives on the GPU

    // now we want to create a new M x 1 tensor where M = my_gpu_tensor[N-1][0]
    auto new_size = Scalar(my_gpu_tensor[N - 1][0]);
    auto new_tensor = my_gpu_tensor.type().tensor({new_size});

    // or a tensor whose sizes are given by my_gpu_tensor.slice(0, N-2)
    auto new_tensor_2 = my_gpu_tensor.type().tensor(my_gpu_tensor.slice(0, N - 2).squeeze());
    ...
}

Currently I first have to do something like new_size = new_size.to<int>() to make it work.
But from my understanding this introduces a device->host transfer.
Hence, it interrupts the asynchronous nature of the GPU calls and prevents me from launching
my_func asynchronously on several streams and then waiting on them together.
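
For illustration, the workaround I mean looks roughly like this (a minimal sketch, assuming the old Type::tensor factory and Scalar::to; exact names may differ between ATen versions):

// current workaround: pull the size back to the host before calling the factory
auto size_on_gpu = my_gpu_tensor[N - 1][0];           // 0-dim tensor, still on the GPU
auto m = Scalar(size_on_gpu).to<int64_t>();           // device -> host copy, forces a sync
auto new_tensor = my_gpu_tensor.type().tensor({m});   // the factory only accepts host integers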

As I am not familiar enough with the ATen sources: is it technically possible, with reasonable effort, to make this work? Or is there already a way to do this that I missed?

regards c.hofer
