Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression and memory leak in SampleBuilder@v4 #2778

Open
at-wat opened this issue May 30, 2024 · 4 comments
Open

Performance regression and memory leak in SampleBuilder@v4 #2778

at-wat opened this issue May 30, 2024 · 4 comments
Milestone

Comments

@at-wat
Copy link
Member

at-wat commented May 30, 2024

Your environment.

  • Version: v4.0.0-beta.14 and later

What did you do?

Updated pion/webrtc from v4.0.0-beta.13 to later version.

What did you expect?

SampleBuilder CPU/memory usage doesn't significantly increased.

What happened?

CPU usage is doubled in the existing benchmark, and input RTP packets aren't unreferenced after the corresponding samples are popped.

CPU

Bench name v4.0.0-beta.13 CPU v4.0.0-beta.14 CPU CPU usage increase
BenchmarkSampleBuilderSequential-32 311.4 ns/op 609.6 ns/op x1.96
BenchmarkSampleBuilderLoss-32 288.7 ns/op 720.5 ns/op x2.50
BenchmarkSampleBuilderReordered-32 320.9 ns/op 621.9 ns/op x1.93
BenchmarkSampleBuilderFragmented-32 272.8 ns/op 540.5 ns/op x1.98
BenchmarkSampleBuilderFragmentedLoss-32 246.8 ns/op 668.2 ns/op x2.71
Full benchmark outputs

v4.0.0-beta.13

$ go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/pion/webrtc/v4/pkg/media/samplebuilder
cpu: AMD Ryzen 9 3950X 16-Core Processor            
BenchmarkSampleBuilderSequential-32        	 3831741	       311.4 ns/op	     388 B/op	       5 allocs/op
BenchmarkSampleBuilderLoss-32              	 3969078	       288.7 ns/op	     344 B/op	       4 allocs/op
BenchmarkSampleBuilderReordered-32         	 3697982	       320.9 ns/op	     388 B/op	       4 allocs/op
BenchmarkSampleBuilderFragmented-32        	 4363172	       272.8 ns/op	     362 B/op	       4 allocs/op
BenchmarkSampleBuilderFragmentedLoss-32    	 4807190	       246.8 ns/op	     317 B/op	       3 allocs/op
PASS
ok  	github.com/pion/webrtc/v4/pkg/media/samplebuilder	7.556s

v4.0.0-beta.14

$ go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/pion/webrtc/v4/pkg/media/samplebuilder
cpu: AMD Ryzen 9 3950X 16-Core Processor            
BenchmarkSampleBuilderSequential-32        	 1979661	       609.6 ns/op	     420 B/op	       6 allocs/op
BenchmarkSampleBuilderLoss-32              	 1647154	       720.5 ns/op	     373 B/op	       5 allocs/op
BenchmarkSampleBuilderReordered-32         	 1881794	       621.9 ns/op	     420 B/op	       6 allocs/op
BenchmarkSampleBuilderFragmented-32        	 2221912	       540.5 ns/op	     394 B/op	       5 allocs/op
BenchmarkSampleBuilderFragmentedLoss-32    	 1816394	       668.2 ns/op	     346 B/op	       4 allocs/op
PASS
ok  	github.com/pion/webrtc/v4/pkg/media/samplebuilder	9.402s

v4.0.0-beta.19

$ go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: github.com/pion/webrtc/v4/pkg/media/samplebuilder
cpu: AMD Ryzen 9 3950X 16-Core Processor            
BenchmarkSampleBuilderSequential-32        	 1978306	       602.4 ns/op	     420 B/op	       6 allocs/op
BenchmarkSampleBuilderLoss-32              	 1609996	       740.5 ns/op	     373 B/op	       5 allocs/op
BenchmarkSampleBuilderReordered-32         	 1897911	       628.1 ns/op	     420 B/op	       6 allocs/op
BenchmarkSampleBuilderFragmented-32        	 2179366	       538.9 ns/op	     394 B/op	       5 allocs/op
BenchmarkSampleBuilderFragmentedLoss-32    	 1744249	       684.3 ns/op	     346 B/op	       4 allocs/op
PASS
ok  	github.com/pion/webrtc/v4/pkg/media/samplebuilder	9.426s

Memory

I created a test to check that the input RTP packets are unreferenced after all samples are popped.

On v4.0.0-beta.13,

=== RUN   TestSampleBuilderPacketUnreference
--- PASS: TestSampleBuilderPacketUnreference (0.05s)

all popped packets are unreferenced.

On v4.0.0-beta.14,

=== RUN   TestSampleBuilderPacketUnreference
    samplebuilder_test.go:570: 
        	Error Trace:	/home/at-wat/go/src/github.com/pion/webrtc/pkg/media/samplebuilder/samplebuilder_test.go:570
        	Error:      	Not equal: 
        	            	expected: 1
        	            	actual  : 65536
        	Test:       	TestSampleBuilderPacketUnreference
--- FAIL: TestSampleBuilderPacketUnreference (0.08s)

65536 packets aren't unreferenced.

Note

CPU and memory usage of my product software using SampleBuilder significantly increased after updating v4.0.0-beta.13 to v4.0.0-beta.19.
I'll dig into the memory usage increase which doesn't appeared in the benchmark. Found the problem on input data unreference and updated above sections.

Refs

@at-wat at-wat added this to the v4 milestone May 30, 2024
@at-wat
Copy link
Member Author

at-wat commented May 30, 2024

@thatsnotright @Sean-Der FYI

@at-wat
Copy link
Member Author

at-wat commented May 31, 2024

Also, the new SampleBuilder doesn't unreference the input RTP packets and causes kind of memory leak.
Created a PR to add a test to check that the input RTP packets are unreferenced after all samples are popped.
#2781

@at-wat at-wat changed the title Performance regression in SampleBuilder@v4 Performance regression and memory leak in SampleBuilder@v4 May 31, 2024
@at-wat
Copy link
Member Author

at-wat commented May 31, 2024

@thatsnotright could you take a deeper look?

@at-wat
Copy link
Member Author

at-wat commented Jul 1, 2024

kindly ping @thatsnotright @Sean-Der

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant