torch.norm(x-y, 2, 1) is significantly slower than torch.sqrt((x - y).pow(2).sum(1)) #5671

Description

@zou3519

I've tested a few other norms as well:

x = torch.randn(1024, 256)
y = torch.randn(1024, 256)


In [12]: %timeit torch.norm(x-y, 1, 1)
2.55 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [13]: %timeit (x-y).sum(1)
339 µs ± 699 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [14]: %timeit torch.norm(x-y, 2, 1)
2.42 ms ± 33.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [15]: %timeit torch.sqrt((x-y).pow(2).sum(1))
736 µs ± 2.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [16]: %timeit torch.pow((x - y).pow(3).sum(1), 1/3)
700 µs ± 571 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [17]: %timeit torch.norm(x-y, 3, 1)
16.1 ms ± 31.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
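
The gap also reproduces outside IPython; below is a minimal standalone timing sketch using Python's timeit module (absolute numbers will of course vary by machine):

import timeit

import torch

x = torch.randn(1024, 256)
y = torch.randn(1024, 256)

cases = {
    "torch.norm(x - y, 2, 1)": lambda: torch.norm(x - y, 2, 1),
    "torch.sqrt((x - y).pow(2).sum(1))": lambda: torch.sqrt((x - y).pow(2).sum(1)),
}

for label, fn in cases.items():
    # Average over 100 calls and report milliseconds per call
    per_call = timeit.timeit(fn, number=100) / 100
    print(f"{label}: {per_call * 1e3:.3f} ms")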

Reported here on the forums
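
As a stopgap, the faster elementwise decomposition can be wrapped in a small helper. This is only a sketch (fast_pnorm is a made-up name, not a torch API); note that a true p-norm takes abs() of the elements first, which the In [13] and In [16] comparisons above omit, so those timings slightly understate the cost of the manual versions:

import torch

def fast_pnorm(diff, p, dim):
    # Hypothetical helper, not part of torch: p-norm along `dim` via the
    # elementwise decomposition that benchmarks faster than torch.norm above.
    if p == 1:
        return diff.abs().sum(dim)
    if p == 2:
        return torch.sqrt(diff.pow(2).sum(dim))
    return diff.abs().pow(p).sum(dim).pow(1.0 / p)

x = torch.randn(1024, 256)
y = torch.randn(1024, 256)
# Should match torch.norm to within floating-point rounding
print(torch.max(torch.abs(fast_pnorm(x - y, 2, 1) - torch.norm(x - y, 2, 1))))

For p = 2 the helper agrees with torch.norm(x - y, 2, 1) up to floating-point rounding; the difference is purely in runtime.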

Labels: module: performance (Issues related to performance, either of kernel code or framework glue)
