torch_split much slower than equivalent function in pytorch #992

egillax · 2023-03-17T15:14:41Z

I'm encountering a huge difference in performance between torch_split in R and torch.split in PyTorch.

Here's an example using R where torch_split takes about 25-30 seconds:

library(torch)
nSubjects <- 75e3
nFeatures <- 1e3
torch::torch_manual_seed(seed=42)

# One million random row and column ids, each as a 1e6 x 1 tensor
rowIds <- torch::torch_randint(1,nSubjects, c(1e6,1))
columnIds <- torch::torch_randint(1,nFeatures, c(1e6, 1))

tensor <- torch::torch_cat(list(rowIds, columnIds), dim=2)

# Sort along the first dimension and keep the sorted values
sortedTensor <- tensor$sort(dim=1)[[1]]

# Run lengths of consecutive equal row ids
counts <- as.integer(torch::torch_unique_consecutive(sortedTensor[,1], return_counts=TRUE)[[3]])

# Time a single call to torch_split
microbenchmark::microbenchmark(splitted <- torch::torch_split(sortedTensor[,2], counts), times=1)

In Python, torch.split on the same example takes about 0.5 seconds.

import torch

torch.manual_seed(seed=42)
nSubjects = int(75e3)
nFeatures = int(1e3)

# Match the R example: ids drawn from [1, n), one column each
rowIds = torch.randint(1, nSubjects, (int(1e6), 1))
columnIds = torch.randint(1, nFeatures, (int(1e6), 1))

tensor = torch.cat((rowIds, columnIds), dim=1)
# R's dim=1 and [[1]] are 1-based: sort along dim=0 and keep the values
sortedTensor = tensor.sort(dim=0)[0]

# Run lengths of consecutive equal row ids, as a plain list of ints
counts = torch.unique_consecutive(sortedTensor[:, 0], return_counts=True)[1].tolist()
splitted = torch.split(sortedTensor[:, 1], counts)
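
For readers unfamiliar with splitting by a list of sizes, here is a minimal pure-Python sketch of what both snippets compute: the run lengths of consecutive equal row ids, then chunks of a second column with those lengths. It does not use torch at all, and the helper names `consecutive_counts` and `split_by_sizes` are hypothetical, chosen only for this illustration.

```python
from itertools import groupby


def consecutive_counts(values):
    """Run lengths of equal adjacent values (the role played by the
    counts returned from unique_consecutive in the snippets above)."""
    return [sum(1 for _ in run) for _, run in groupby(values)]


def split_by_sizes(values, sizes):
    """Cut `values` into consecutive chunks whose lengths are `sizes`."""
    if sum(sizes) != len(values):
        raise ValueError("sizes must sum to len(values)")
    chunks, start = [], 0
    for size in sizes:
        chunks.append(values[start:start + size])
        start += size
    return chunks


# Toy data: already-sorted row ids and a second column of equal length.
row_ids = [1, 1, 2, 2, 2, 5]
column_ids = [10, 11, 12, 13, 14, 15]

counts = consecutive_counts(row_ids)
chunks = split_by_sizes(column_ids, counts)
print(counts)  # [2, 3, 1]
print(chunks)  # [[10, 11], [12, 13, 14], [15]]
```

In the real example the result holds roughly one chunk per distinct row id, so the split returns tens of thousands of small tensors in a single call.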

I assume these are calling the same libtorch functions under the hood? Any idea what causes the difference in performance?

I'm using torch 0.9 for R and PyTorch 1.13.


dfalbel · 2023-03-17T15:59:46Z

Hi @egillax ,

Thanks very much for reporting. #993 will fix it. Once it's merged, please install the dev version from GitHub.

egillax · 2023-03-17T17:46:25Z

Thanks for the fast response! This is indeed much faster now.

dfalbel mentioned this issue Mar 17, 2023

Returning tensors lists much faster. #993

Merged

dfalbel closed this as completed in #993 Mar 17, 2023
