Overdrive cpp extension #1299

bhargavkathivarapu · 2021-02-23T10:34:22Z

New PR for the overdrive C++ extension, similar to lfilter. Relates to #580 (the old PR).

mthrok

Thanks for the contribution, and sorry for taking so long to get back to you.
Overall it looks good to me.
@cpuhrsch Can you take a second look?

mthrok · 2021-02-23T15:05:19Z

torchaudio/functional/filtering.py

+    # for i in range(waveform.shape[-1]):
+    #     last_out = temp[:, i] - last_in + 0.995 * last_out
+    #     last_in = temp[:, i]
+    #     output_waveform[:, i] = waveform[:, i] * 0.5 + last_out * 0.75


I think you can get rid of these comments now.
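
For reference, the recurrence in the commented-out loop can be sketched in plain Python for a single channel. This is only a minimal sketch, not the torchaudio implementation: the function name `overdrive_tail` is hypothetical, `temp` stands for whatever signal feeds the filter, and the zero initial state for `last_in` and `last_out` is an assumption.

```python
def overdrive_tail(temp, waveform):
    """Apply the per-sample recurrence from the commented loop to one channel.

    temp and waveform are equal-length sequences of floats.
    The zero initial state is an assumption made for this sketch.
    """
    last_in = 0.0
    last_out = 0.0
    output = []
    for t, w in zip(temp, waveform):
        # Each step depends on the previous step's last_in and last_out.
        last_out = t - last_in + 0.995 * last_out
        last_in = t
        output.append(w * 0.5 + last_out * 0.75)
    return output
```

Because every output sample depends on the state left by the previous one, the loop over time cannot be replaced by a single vectorized operation, which is presumably why a per-sample Python loop is slow on long clips.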

@mthrok Sure, I will remove them.
I will also try to implement parallel_for, as mentioned by @cpuhrsch in the old PR comments (#580).

@bhargavkathivarapu Sounds good. Could you try measuring the performance improvement?
Something like the following, run with a variety of tensor shapes.
If numactl is missing, you can install it with sudo apt-get install -y numactl on Ubuntu.

OMP_NUM_THREADS=1 numactl --membind 0 --cpubind 0 python -m timeit -n 100 -r 5 -s """
import torch
import torchaudio
x = torch.zeros(32, 100, dtype=torch.float)
""" """
torchaudio.functional.overdrive(x)
"""
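
The same kind of sweep over shapes can be scripted with Python's standard `timeit` module. The sketch below is an illustration under assumptions: the helper names (`best_time_per_call`, `sweep`, `list_setup`) are hypothetical, and a stdlib-only stand-in workload is used so the block runs without torch. For the case discussed here, the setup string would create `x` with `torch.zeros(...)` and the statement would be `torchaudio.functional.overdrive(x)`.

```python
import timeit


def best_time_per_call(stmt, setup="pass", number=100, repeat=5):
    """Roughly mirror `python -m timeit -n number -r repeat`.

    Returns the best-of-`repeat` time in seconds for one execution of stmt.
    """
    totals = timeit.repeat(stmt=stmt, setup=setup, number=number, repeat=repeat)
    return min(totals) / number


def sweep(shapes, make_setup, stmt, number=100, repeat=5):
    """Time the same statement for every shape; make_setup builds the setup code."""
    return {
        shape: best_time_per_call(stmt, make_setup(shape), number, repeat)
        for shape in shapes
    }


def list_setup(shape):
    """Stand-in setup: a nested list of zeros instead of a torch tensor."""
    rows, cols = shape
    return f"x = [[0.0] * {cols} for _ in range({rows})]"


# The timed statement is passed as stmt; setup only prepares the input.
results = sweep([(2, 100), (32, 100)], list_setup, "sum(map(sum, x))")
for shape, seconds in results.items():
    print(shape, f"{seconds * 1e6:.2f} usec per call")
```

Note that only the statement is timed. Code placed in the setup (the `-s` option on the command line) runs once per repetition and is excluded from the measurement.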

cpuhrsch

LGTM, pending some basic benchmarks that show a clear improvement.

bhargavkathivarapu · 2021-02-24T12:12:51Z

@mthrok

Changes made

Removed commented code
Added parallel_for in overdrive cpp.
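
As an illustration of the idea behind parallel_for, the sketch below splits an index range into chunks and hands each chunk to a worker as a `(begin, end)` pair. It is a hypothetical Python analogue, not PyTorch's API: the chunking is a simplification (fixed-size chunks of `grain_size`), and the per-row running sum is only a stand-in for a sequential per-channel filter.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for(begin, end, grain_size, fn, max_workers=2):
    """Call fn(chunk_begin, chunk_end) on consecutive sub-ranges of [begin, end).

    grain_size must be at least 1. Simplified, hypothetical analogue only.
    """
    chunks = [(s, min(s + grain_size, end)) for s in range(begin, end, grain_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for all chunks and re-raises any worker exception.
        list(pool.map(lambda chunk: fn(*chunk), chunks))


def running_sum_rows(rows):
    """Sequential along time, independent across rows (channels)."""
    out = [[0.0] * len(row) for row in rows]

    def work(chunk_begin, chunk_end):
        # Each worker writes only to its own rows, so no locking is needed.
        for ch in range(chunk_begin, chunk_end):
            acc = 0.0
            for i, value in enumerate(rows[ch]):
                acc += value
                out[ch][i] = acc

    parallel_for(0, len(rows), 1, work)
    return out
```

The point of the structure is that the time dimension stays sequential inside each row while separate rows can be processed concurrently. In pure Python the threads will not give a real speedup because of the GIL; the sketch only shows how the work is partitioned.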

Performance details
(60 sec audio clip, VM with 2 vCPUs)

numactl details (VM with 2 vCPUs)
Command used:

OMP_NUM_THREADS=1 numactl --membind 0 --cpubind 0 python -m timeit -n 100 -r 5 -s """
import torch
import torchaudio
x = torch.zeros(2, 100, dtype=torch.float)
torchaudio.functional.overdrive(x)
"""

(2, 100) = 100 loops, best of 5: 14.1 nsec per loop
(32, 100) = 100 loops, best of 5: 16.8 nsec per loop
(1000, 100) = 100 loops, best of 5: 28.2 nsec per loop
(2, 100000) = 100 loops, best of 5: 40.4 nsec per loop
(100, 100000) = 100 loops, best of 5: 44.3 nsec per loop
(1000, 100000) = 100 loops, best of 5: 55.7 nsec per loop
(10000, 10000) = 100 loops, best of 5: 49.7 nsec per loop

mthrok · 2021-02-24T13:55:49Z

@bhargavkathivarapu

Thanks for the update. So my understanding is that the speed is comparable to the sox implementation.

Can you compare the same torchaudio.functional.overdrive function and Tensor shape before adding the C++ code? That will tell more directly how the C++ implementation improves the performance.

mthrok · 2021-02-24T14:03:37Z

Can you merge the latest master commit to include #1297? I believe I killed the flaky bug. 🐛

Merge master branch into overdrive-new

bhargavkathivarapu · 2021-02-24T15:53:00Z

@mthrok All checks now pass after merging master.

Can you compare the same torchaudio.functional.overdrive function and Tensor shape before adding the C++ code? That will tell more directly how the C++ implementation improves the performance.

(Reference: 60-second clip.) Tensor shape of w = [1, 1323000]
Existing torchaudio overdrive timing

Comparison for that tensor w

Implementation	Relative run time	Run time
sox overdrive	1x	38.7 ms
overdrive new (cpp)	2x	78.2 ms
overdrive old (python)	~2000x	77000 ms
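
The relative run times in the table follow directly from the reported run times. A short sketch of the arithmetic, using only the three numbers above (the dictionary keys are just labels for this example):

```python
# Run times in milliseconds, copied from the comparison table.
run_time_ms = {
    "sox overdrive": 38.7,
    "overdrive new (cpp)": 78.2,
    "overdrive old (python)": 77000.0,
}

# Relative run time, with sox as the 1x baseline.
baseline = run_time_ms["sox overdrive"]
relative = {name: ms / baseline for name, ms in run_time_ms.items()}

# Speedup of the C++ path over the old Python path.
speedup = run_time_ms["overdrive old (python)"] / run_time_ms["overdrive new (cpp)"]

for name, ratio in relative.items():
    print(f"{name}: {ratio:.1f}x")
print(f"cpp vs python speedup: about {speedup:.0f}x")
```

So the C++ version is roughly 2x slower than sox on this clip, and close to three orders of magnitude faster than the old Python loop.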

mthrok · 2021-02-24T17:33:04Z

Wonderful! Thanks!

overdrive cpp ext

005ced9

facebook-github-bot added the CLA Signed label Feb 23, 2021

bhargavkathivarapu added 2 commits February 23, 2021 10:44

flake8 fix

2287893

JIT issue fix

fb89341

mthrok approved these changes Feb 23, 2021

bhargavkathivarapu mentioned this pull request Feb 23, 2021

CPP extension for overdrive effect in functional #580

Closed

cpuhrsch approved these changes Feb 24, 2021

bhargavkathivarapu added 3 commits February 24, 2021 10:18

Use parallel_for on CPU

70ebbe5

Minor fix

d05514c

include torch.h for parallel_for

136b266

bhargavkathivarapu marked this pull request as ready for review February 24, 2021 12:13

Merge remote-tracking branch 'upstream/master' into overdrive-new

1e8d3f0

Merge master branch into overdrive-new

mthrok merged commit 23e9ed3 into pytorch:master Feb 24, 2021

mthrok pushed a commit to mthrok/audio that referenced this pull request Feb 26, 2021

Update profiler tutorial to prevent build (pytorch#1299)

5ea0ff6
