Optimize ToI420 conversion by using sync.Pool #473

Merged
5 commits merged on Mar 6, 2023

neversi
Contributor

@neversi neversi commented Feb 18, 2023

Description

Added a sync.Pool to the I420 conversion to minimize the overhead of allocating new byte slices.

Reference issue

Workaround for #424

  • package-scope sync.Pool added
  • signatures of the internal functions i444ToI420() and i422ToI420() changed to i444ToI420(img image.YCbCr, dst []uint8) and i422ToI420(img image.YCbCr, dst []uint8), respectively
  • release() function added to return the slices to the pool
  • release() function of video.reader is deferred in the codecs' Read() methods
  • tests and benchmarks adjusted to account for the release() function
TODO
  • Benchmark the implementation with and without the pool
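
The changes listed above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: bufPool, getBuf, avg4 and demo are hypothetical names, the converter takes the image by pointer as an assumption, and the 2x2 averaging is a simplified stand-in for the real chroma downsampling. It assumes even frame dimensions and a Rect that starts at (0, 0).

```go
package main

import (
	"fmt"
	"image"
	"sync"
)

// bufPool is a hypothetical package-scope pool of scratch buffers.
// It stores *[]uint8 so that Put does not allocate when boxing the value.
var bufPool = sync.Pool{
	New: func() any { return new([]uint8) },
}

// getBuf fetches a buffer from the pool and sizes it to n bytes,
// allocating only when the pooled buffer is too small.
func getBuf(n int) *[]uint8 {
	p := bufPool.Get().(*[]uint8)
	if cap(*p) < n {
		*p = make([]uint8, n)
	}
	*p = (*p)[:n]
	return p
}

// release hands the buffer back; the caller must not touch it afterwards.
func release(p *[]uint8) { bufPool.Put(p) }

// avg4 averages the 2x2 block whose top-left sample is plane[i].
func avg4(plane []uint8, i, stride int) uint8 {
	sum := uint16(plane[i]) + uint16(plane[i+1]) +
		uint16(plane[i+stride]) + uint16(plane[i+stride+1])
	return uint8((sum + 2) / 4)
}

// i444ToI420 writes the subsampled chroma planes into dst instead of
// allocating them. Assumes even dimensions and a Rect starting at (0, 0).
func i444ToI420(img *image.YCbCr, dst []uint8) {
	cw, ch := img.Rect.Dx()/2, img.Rect.Dy()/2
	cb, cr := dst[:cw*ch], dst[cw*ch:2*cw*ch]
	for y := 0; y < ch; y++ {
		for x := 0; x < cw; x++ {
			i := 2*y*img.CStride + 2*x
			cb[y*cw+x] = avg4(img.Cb, i, img.CStride)
			cr[y*cw+x] = avg4(img.Cr, i, img.CStride)
		}
	}
	img.Cb, img.Cr, img.CStride = cb, cr, cw
	img.SubsampleRatio = image.YCbCrSubsampleRatio420
}

// demo converts a tiny 4x4 I444 frame and returns copies of its chroma planes.
func demo() (cb, cr []uint8) {
	img := image.NewYCbCr(image.Rect(0, 0, 4, 4), image.YCbCrSubsampleRatio444)
	for i := range img.Cb {
		img.Cb[i], img.Cr[i] = uint8(i), 200
	}
	buf := getBuf(2 * 2 * 2) // room for two 2x2 chroma planes
	defer release(buf)       // mirrors the deferred release() described above
	i444ToI420(img, *buf)
	// Copy the planes out, because they alias the pooled buffer.
	cb = append([]uint8(nil), img.Cb...)
	cr = append([]uint8(nil), img.Cr...)
	return cb, cr
}

func main() {
	cb, cr := demo()
	fmt.Println(cb, cr)
}
```

The point to watch with this pattern is lifetime: after release() the converted planes alias a buffer that another caller may reuse, which is why the sketch copies them out before returning.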

@neversi neversi marked this pull request as ready for review February 19, 2023 10:29
@edaniels
Member

edaniels commented Mar 3, 2023

Hi @neversi, would you mind providing some benchmark results comparing before and after this change? I'm also curious whether there's room to expand this to other frame decoders.

@neversi
Contributor Author

neversi commented Mar 4, 2023

Hi @edaniels, yes of course, I will provide them as soon as possible!

@codecov

codecov bot commented Mar 6, 2023

Codecov Report

Patch coverage: 95.34% and project coverage change: +0.37 🎉

Comparison is base (d561715) 58.08% compared to head (51e1510) 58.45%.

❗ Current head 51e1510 differs from pull request most recent head 07e2695. Consider uploading reports for the commit 07e2695 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #473      +/-   ##
==========================================
+ Coverage   58.08%   58.45%   +0.37%     
==========================================
  Files          62       62              
  Lines        3691     3724      +33     
==========================================
+ Hits         2144     2177      +33     
  Misses       1420     1420              
  Partials      127      127              
Impacted Files Coverage Δ
pkg/io/video/convert.go 72.80% <92.59%> (+6.13%) ⬆️
pkg/codec/openh264/openh264.go 81.69% <100.00%> (+0.53%) ⬆️
pkg/codec/vpx/vpx.go 83.59% <100.00%> (+0.08%) ⬆️
pkg/codec/x264/x264.go 65.51% <100.00%> (+0.40%) ⬆️
pkg/io/video/convert_cgo.go 71.23% <100.00%> (+2.57%) ⬆️


@neversi
Contributor Author

neversi commented Mar 6, 2023

@edaniels
Here are the benchmark results; I used benchstat to present them.
The comparison shows a large reduction in B/op, which leads to less pressure on the GC.

GOMAXPROCS=8
goos: darwin
goarch: arm64
pkg: github.com/pion/mediadevices/pkg/io/video
                    │   old.txt   │              new.txt               │
                    │   sec/op    │   sec/op     vs base               │
ToI420/480p/I444-8    101.2µ ± 2%   100.7µ ± 3%       ~ (p=0.280 n=10)
ToI420/480p/I422-8    54.22µ ± 2%   51.07µ ± 2%  -5.81% (p=0.000 n=10)
ToI420/480p/I420-8    11.15n ± 2%   11.86n ± 4%  +6.37% (p=0.000 n=10)
ToI420/480p/RGBA-8    267.1µ ± 2%   273.6µ ± 2%  +2.46% (p=0.002 n=10)
ToI420/1080p/I444-8   614.2µ ± 8%   584.6µ ± 1%  -4.82% (p=0.000 n=10)
ToI420/1080p/I422-8   310.3µ ± 1%   296.1µ ± 0%  -4.58% (p=0.000 n=10)
ToI420/1080p/I420-8   11.15n ± 1%   11.81n ± 4%  +5.87% (p=0.000 n=10)
ToI420/1080p/RGBA-8   1.634m ± 7%   1.638m ± 2%       ~ (p=0.631 n=10)
geomean               22.09µ        22.05µ       -0.19%

                    │      old.txt      │                 new.txt                  │
                    │       B/op        │      B/op       vs base                  │
ToI420/480p/I444-8     180224.00 ± 0%       63.50 ±  17%  -99.96% (p=0.000 n=10)
ToI420/480p/I422-8     180224.00 ± 0%       63.00 ±  11%  -99.97% (p=0.000 n=10)
ToI420/480p/I420-8         0.000 ± 0%       0.000 ±   0%        ~ (p=1.000 n=10) ¹
ToI420/480p/RGBA-8      180456.0 ± 0%       327.0 ±  11%  -99.82% (p=0.000 n=10)
ToI420/1080p/I444-8   1048577.00 ± 0%       56.00 ± 914%  -99.99% (p=0.000 n=10)
ToI420/1080p/I422-8    1048577.0 ± 0%       182.0 ±  70%  -99.98% (p=0.000 n=10)
ToI420/1080p/I420-8        0.000 ± 0%       0.000 ±   0%        ~ (p=1.000 n=10) ¹
ToI420/1080p/RGBA-8   1032.272Ki ± 0%     9.606Ki ±  12%  -99.07% (p=0.000 n=10)
geomean                               ²                   -99.66%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                    │   old.txt    │               new.txt               │
                    │  allocs/op   │ allocs/op   vs base                 │
ToI420/480p/I444-8    2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/480p/I422-8    2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/480p/I420-8    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/480p/RGBA-8    2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/1080p/I444-8   2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/1080p/I422-8   2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/1080p/I420-8   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/1080p/RGBA-8   2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                          ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean
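
The B/op effect in the tables above can be reproduced in miniature without the project's benchmark suite. The sketch below is an assumption-laden illustration, not the benchmark from this PR: bufSize, sink, pool, bytesPerOp and measure are hypothetical names, and it reads runtime.MemStats.TotalAlloc directly instead of going through `go test -bench . -benchmem` and benchstat.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// bufSize is an assumed scratch size: two quarter-resolution chroma planes
// of a 1920x1080 frame.
const bufSize = 2 * (1920 / 2) * (1080 / 2)

// sink keeps the freshly allocated slice reachable so it lands on the heap.
var sink []uint8

// pool hands out reusable buffers of bufSize bytes.
var pool = sync.Pool{
	New: func() any { b := make([]uint8, bufSize); return &b },
}

// bytesPerOp reports the average number of heap bytes allocated per call to f.
func bytesPerOp(n int, f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		f()
	}
	runtime.ReadMemStats(&after)
	return (after.TotalAlloc - before.TotalAlloc) / uint64(n)
}

// measure compares allocating a new buffer per frame with reusing a pooled one.
func measure() (fresh, pooled uint64) {
	fresh = bytesPerOp(100, func() {
		sink = make([]uint8, bufSize)
	})
	pool.Put(pool.Get()) // warm the pool so its first allocation is not counted
	pooled = bytesPerOp(100, func() {
		p := pool.Get().(*[]uint8)
		(*p)[0]++ // stand-in for the conversion work
		pool.Put(p)
	})
	return fresh, pooled
}

func main() {
	fresh, pooled := measure()
	fmt.Printf("fresh: %d B/op, pooled: %d B/op\n", fresh, pooled)
}
```

Exact figures depend on the Go version and allocator rounding, so the numbers printed here are not comparable to the benchstat output; only the direction of the difference is.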

Member

@edaniels edaniels left a comment

LGTM! If you want to commit the benchmarks in as well, that would be cool but up to you.

pkg/io/video/convert.go (outdated, resolved)
@neversi
Contributor Author

neversi commented Mar 6, 2023

What do you mean by "commit the benchmarks in"? In the PR description?

Co-authored-by: Eric Daniels <eric@erdaniels.com>
@edaniels
Member

edaniels commented Mar 6, 2023

Oh never mind hah. I thought you added new benchmarks. I see you reused the existing ones.

@edaniels edaniels merged commit dbd3768 into pion:master Mar 6, 2023
@neversi
Contributor Author

neversi commented Mar 7, 2023

@edaniels, regarding expanding the pool to other frame decoders, I think it is possible. I will look into it on the weekend.

@edaniels
Member

edaniels commented Mar 7, 2023

Sounds great! Thank you for the contribution
