torch_split much slower than equivalent function in pytorch #992

Closed
egillax opened this issue Mar 17, 2023 · 2 comments · Fixed by #993

Comments

@egillax
Contributor

egillax commented Mar 17, 2023

Hi @dfalbel,

I'm seeing a huge performance difference between torch_split in R and torch.split in PyTorch.

Here's an example using R where torch_split takes about 25-30 seconds:

library(torch)
nSubjects <- 75e3
nFeatures <- 1e3
torch::torch_manual_seed(seed=42)

rowIds <- torch::torch_randint(1,nSubjects, c(1e6,1))
columnIds <- torch::torch_randint(1,nFeatures, c(1e6, 1))

tensor <- torch::torch_cat(c(rowIds, columnIds), dim=2)

sortedTensor <- tensor$sort(dim=1)[[1]]

counts <- as.integer(torch::torch_unique_consecutive(sortedTensor[,1], return_counts=TRUE)[[3]])

microbenchmark::microbenchmark(splitted <- torch::torch_split(sortedTensor[,2], counts), times=1)

In Python, torch.split for the same example takes about 0.5 seconds.

import torch
torch.manual_seed(seed=42)
nSubjects = int(75e3)
nFeatures = int(1e3)

rowIds = torch.randint(0, nSubjects, (int(1e6), 1))
columnIds = torch.randint(1, nFeatures, (int(1e6), 1))

tensor = torch.cat((rowIds, columnIds), dim=1)
# Sort along rows (dim=0) and keep the values, matching tensor$sort(dim=1)[[1]] in R
sortedTensor = tensor.sort(dim=0)[0]

counts = torch.unique_consecutive(sortedTensor[:, 0], return_counts=True)[1].tolist()
splitted = torch.split(sortedTensor[:, 1], counts)
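
For readers unfamiliar with the pattern, here is a minimal stdlib-only sketch of what both snippets compute: the run lengths of a sorted key column, then a split of a second column into chunks of those lengths. The helper names `run_lengths` and `split_by_sizes` are hypothetical and only illustrate the logic; this is not how libtorch or either binding implements it.

```python
from itertools import groupby


def run_lengths(sorted_keys):
    """Length of each run of equal consecutive values (the 'counts' in the snippets above)."""
    return [sum(1 for _ in group) for _, group in groupby(sorted_keys)]


def split_by_sizes(values, sizes):
    """Cut `values` into consecutive chunks whose lengths are given by `sizes`."""
    if sum(sizes) != len(values):
        raise ValueError("sizes must sum to len(values)")
    chunks, start = [], 0
    for size in sizes:
        chunks.append(values[start:start + size])
        start += size
    return chunks


# Tiny stand-in for the two sorted columns (hypothetical data)
row_ids = [1, 1, 2, 5, 5, 5]
col_ids = [10, 11, 12, 13, 14, 15]

counts = run_lengths(row_ids)             # one entry per distinct run of row ids
chunks = split_by_sizes(col_ids, counts)  # one chunk per run
```

With roughly 75,000 distinct row ids, the real example produces about that many chunks, so any per-chunk overhead in a binding is multiplied accordingly.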

I assume both call the same libtorch functions under the hood? Any idea what causes the difference in performance?

I'm using torch 0.9 (R) and PyTorch 1.13.

@dfalbel
Member

dfalbel commented Mar 17, 2023

Hi @egillax,

Thanks very much for reporting. #993 will fix it. Once it's merged, please install the dev version from GitHub.

@egillax
Contributor Author

egillax commented Mar 17, 2023

Thanks for the fast response! This is indeed much faster now.
