Closed
Labels
module: performance (issues related to performance, either of kernel code or framework glue)
Description
I've tested a few other norms as well:
x = torch.randn(1024, 256)
y = torch.randn(1024, 256)
In [12]: %timeit torch.norm(x-y, 1, 1)
2.55 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [13]: %timeit (x-y).sum(1)
339 µs ± 699 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [14]: %timeit torch.norm(x-y, 2, 1)
2.42 ms ± 33.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [15]: %timeit torch.sqrt((x-y).pow(2).sum(1))
736 µs ± 2.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [16]: %timeit torch.pow((x - y).pow(3).sum(1), 1/3)
700 µs ± 571 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [17]: %timeit torch.norm(x-y, 3, 1)
16.1 ms ± 31.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
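One caveat on the comparison above: the hand-written expressions for p=1 and p=3 omit the absolute value, so they do not compute the same quantity as `torch.norm` (the 1-norm is `sum(|d|)`, the 3-norm is `sum(|d|^3)^(1/3)`). The following is a minimal sketch of a like-for-like comparison, not the benchmark that produced the numbers above. `manual_norm` is a hypothetical helper introduced here for illustration, and the timings it prints will depend on the machine and PyTorch version.

```python
import timeit

import torch


def manual_norm(diff: torch.Tensor, p: float, dim: int = 1) -> torch.Tensor:
    """Hypothetical helper: p-norm along `dim` built from elementwise ops."""
    if p == 1:
        return diff.abs().sum(dim)
    if p == 2:
        # Squaring already removes the sign, so abs() is not needed here.
        return diff.pow(2).sum(dim).sqrt()
    # General case: abs() is required, otherwise odd p gives wrong results.
    return diff.abs().pow(p).sum(dim).pow(1.0 / p)


torch.manual_seed(0)
x = torch.randn(1024, 256)
y = torch.randn(1024, 256)
diff = x - y

for p in (1, 2, 3):
    builtin = torch.norm(diff, p, 1)
    manual = manual_norm(diff, p)
    # Both forms should agree up to floating-point rounding.
    assert torch.allclose(builtin, manual, rtol=1e-4, atol=1e-4)

    # timeit returns total seconds for `number` calls; x10 gives ms per call.
    t_builtin = timeit.timeit(lambda: torch.norm(diff, p, 1), number=100)
    t_manual = timeit.timeit(lambda: manual_norm(diff, p), number=100)
    print(f"p={p}: torch.norm {t_builtin * 10:.3f} ms/call, "
          f"manual {t_manual * 10:.3f} ms/call")
```

The extra `abs()` adds one elementwise pass to the p=1 and p=3 manual versions, so their timings may come out somewhat higher than the ones quoted above.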
Reported here on the forums