Excessive CPU usage during playback of VP8/9 videos since commit c64c03478fac7f6b8733e2703035a71ecf0244ff #90
Comments
Of course I forgot that the test VP9 video is YUV444... You still get a nice reduction in CPU usage with VP9 and YUV420 movies.
I made a sample 4K project using Tears of Steel, since that's what Google uses in their examples. I downloaded it and took a short clip near the start:

ffmpeg -ss 00:00:09 -to 00:00:25 -i tearsofsteel_4k.mov -c copy tearsofsteel_4k_09s-25s.mov

I then butchered it to 60 FPS and YUV420 and transcoded it to VP8, VP9 and later AV1, each at both 4K and 1080p (the encodes are not equivalent to each other and may contain unused options).

ffmpeg transcodes:

ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=3840x2160 -tile-columns 3 -g 240 -c:v libvpx -r 60 -row-mt 1 -crf 12 -b:v 23552K -pass 1 -pix_fmt yuv420p -an -f null /dev/null &&
ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=3840x2160 -tile-columns 3 -c:v libvpx -row-mt 1 -crf 12 -b:v 23552K -pass 2 -pix_fmt yuv420p -g 240 -r 60 -c:a libopus tearsofsteel_4k_09s-25s-60.webm

ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=3840x2160 -tile-columns 3 -g 240 -c:v libvpx-vp9 -r 60 -row-mt 1 -crf 12 -b:v 23552K -pass 1 -pix_fmt yuv420p -an -f null /dev/null &&
ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=3840x2160 -tile-columns 3 -c:v libvpx-vp9 -r 60 -row-mt 1 -crf 12 -b:v 23552K -pass 2 -pix_fmt yuv420p -g 240 -c:a libopus tearsofsteel_4k_09s-25s-vp9-60.webm

ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=3840x2160 -g 300 -c:v libaom-av1 -r 60 -row-mt 1 -tiles 4x1 -crf 16 -b:v 23552K -pass 1 -pix_fmt yuv420p -an -f null /dev/null &&
time ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=3840x2160 -c:v libaom-av1 -row-mt 1 -tiles 4x1 -crf 16 -b:v 23552K -pass 2 -pix_fmt yuv420p -g 300 -r 60 -c:a libopus tearsofsteel_4k_09s-25s-av1-60.webm

ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=1920x1080 -tile-columns 3 -g 240 -c:v libvpx -r 60 -row-mt 1 -crf 12 -b:v 23552K -pass 1 -pix_fmt yuv420p -an -f null /dev/null &&
ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=1920x1080 -tile-columns 3 -c:v libvpx -row-mt 1 -crf 12 -b:v 23552K -pass 2 -pix_fmt yuv420p -g 240 -r 60 -c:a libopus tearsofsteel_1080_09s-25s-60.webm

ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=1920x1080 -tile-columns 3 -g 240 -c:v libvpx-vp9 -r 60 -row-mt 1 -crf 12 -b:v 23552K -pass 1 -pix_fmt yuv420p -an -f null /dev/null &&
ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=1920x1080 -tile-columns 3 -c:v libvpx-vp9 -r 60 -row-mt 1 -crf 12 -b:v 23552K -pass 2 -pix_fmt yuv420p -g 240 -c:a libopus tearsofsteel_1080_09s-25s-vp9-60.webm

ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=1920x1080 -g 300 -c:v libsvtav1 -preset 1 -r 60 -row-mt 1 -tiles 4x1 -crf 16 -b:v 23552K -pass 1 -pix_fmt yuv420p -an -f null /dev/null &&
time ffmpeg -i tearsofsteel_4k_09s-25s.mov -vf scale=1920x1080 -c:v libsvtav1 -preset 1 -row-mt 1 -tiles 4x1 -crf 16 -b:v 23552K -pass 2 -pix_fmt yuv420p -g 300 -r 60 -c:a libopus tearsofsteel_1080_09s-25s-av1-stv-60.webm

Played back with renpy.movie_cutscene():

$ renpy.movie_cutscene("images/tearsofsteel_4k_09s-25s-60.webm", loops=1)
$ renpy.movie_cutscene("images/tearsofsteel_4k_09s-25s-vp9-60.webm", loops=1)
$ renpy.movie_cutscene("images/tearsofsteel_4k_09s-25s-av1-60.webm", loops=1)
$ renpy.movie_cutscene("images/tearsofsteel_1080_09s-25s-60.webm", loops=1)
$ renpy.movie_cutscene("images/tearsofsteel_1080_09s-25s-vp9-60.webm", loops=1)
$ renpy.movie_cutscene("images/tearsofsteel_1080_09s-25s-av1-stv-60.webm", loops=1)

New Ryzen system (Debian sid)
Default renpy-8.1.1-sdk lib:

As you can see, both VP8 and VP9 playback see a significant reduction in CPU usage with an MMX-enabled build. AV1 is largely unaffected on this CPU.

Then I actually read the Tears of Steel about page. I did not include the credit scroll, so I'm not entirely certain how kosher it is to share the clip like this. But it's trivial to reproduce, as it applies to all VP8/VP9 videos.
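For anyone repeating the transcodes, the two-pass VP9 commands quoted in this comment could be generated by a small script instead of being typed by hand. This is only a sketch: the function names `vp9_two_pass` and `as_shell` are made up for illustration, and the default values simply copy the options from the VP9 commands in this comment. It builds the argument lists and does not run ffmpeg itself.

```python
import shlex


def vp9_two_pass(src, dst, width, height, fps=60, crf=12,
                 bitrate="23552K", gop=240):
    """Build the argv lists for a two-pass libvpx-vp9 encode.

    Hypothetical helper: the defaults mirror the VP9 commands quoted
    in this thread and are not tuned recommendations.
    """
    common = [
        "ffmpeg", "-i", src,
        "-vf", f"scale={width}x{height}",
        "-c:v", "libvpx-vp9",
        "-r", str(fps), "-g", str(gop),
        "-tile-columns", "3", "-row-mt", "1",
        "-crf", str(crf), "-b:v", bitrate,
        "-pix_fmt", "yuv420p",
    ]
    # Pass 1 only gathers statistics: no audio, output discarded.
    pass1 = common + ["-pass", "1", "-an", "-f", "null", "/dev/null"]
    # Pass 2 writes the real file and encodes audio with Opus.
    pass2 = common + ["-pass", "2", "-c:a", "libopus", dst]
    return pass1, pass2


def as_shell(pass1, pass2):
    """Join both passes into one 'pass1 && pass2' shell line."""
    return shlex.join(pass1) + " && " + shlex.join(pass2)
```

Calling `as_shell(*vp9_two_pass("tearsofsteel_4k_09s-25s.mov", "tearsofsteel_1080_09s-25s-vp9-60.webm", 1920, 1080))` yields a command line equivalent in options (though not in option order) to the 1080p VP9 one above. The lists can also be passed to `subprocess.run` one after the other.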
That's convincing. I've re-enabled MMX for now.
Despite c64c034's comment, it seems MMX is still important to the ffmpeg build. With it disabled, decoding requires significantly more CPU. Effectively reverting that commit on Win/Linux with something like:
allows for noticeably reduced CPU usage.
Testing was done with @PastryIRL's vp9_test repo and captured via pidstat.
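To compare two captures numerically rather than by eye, the per-interval %CPU samples from a saved pidstat log can be averaged. The sketch below is not the script used for this issue. It assumes the usual pidstat text layout, where a header row names the columns (including `%CPU`) and each data row follows the same column order. The function name `mean_cpu` is hypothetical.

```python
def mean_cpu(pidstat_text):
    """Return the mean of the %CPU column over all per-interval samples.

    Assumes a header row containing '%CPU' precedes the data rows, as in
    typical pidstat output. 'Average:' summary rows are skipped so they
    do not get counted as extra samples.
    """
    cpu_col = None
    samples = []
    for line in pidstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "%CPU" in fields:
            # Header row: remember where the %CPU column sits.
            cpu_col = fields.index("%CPU")
            continue
        if cpu_col is None or fields[0] == "Average:" or len(fields) <= cpu_col:
            continue
        try:
            samples.append(float(fields[cpu_col]))
        except ValueError:
            continue  # banner or other non-data line
    if not samples:
        raise ValueError("no %CPU samples found")
    return sum(samples) / len(samples)
```

Running it on one log from the default build and one from the MMX build gives two numbers that can be compared directly. Note that pidstat's %CPU can exceed 100 for a multi-threaded process, which is why figures like 270% appear in this issue.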
Here's a comparison between the default renpy-8.0.3-sdk (no MMX) and one built with MMX.
Old A6-3500 Llano system (Debian sid)
In Ren'Py, anything above 270% CPU is too much for this system and it starts dropping frames, e.g. 60 FPS VP9 movies turn into a slide show. The CPU drop halfway through was the switch over to the VP8 video.
CPU flags
cat /proc/cpuinfo
MMX
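Whether a given machine advertises MMX at all can be checked from the same `/proc/cpuinfo` data. A minimal sketch, assuming an x86 Linux system where the feature list appears on a line starting with `flags`. The helper names `cpu_flags` and `has_mmx` are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line.

    Assumes the x86 Linux /proc/cpuinfo layout ('flags : fpu vme ...').
    Returns an empty set if no such line is present.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_mmx(cpuinfo_text):
    """True if the exact flag 'mmx' is listed (not just 'mmxext')."""
    return "mmx" in cpu_flags(cpuinfo_text)
```

On a live system this would be used as `has_mmx(open("/proc/cpuinfo").read())`. It only reports what the CPU supports and says nothing about whether the bundled ffmpeg was built with MMX enabled, which is the actual subject of this issue.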
The Llano with the MMX-enabled Ren'Py build can handle 60 FPS VP9 1080p videos just fine as long as the bitrate isn't out of control, though sometimes devs encode way higher than needed for 1080p...
Now the same comparison on a current CPU: the default renpy-8.1.1-sdk (no MMX) and again one built with MMX.
New Ryzen system (Debian sid)
CPU flags
cat /proc/cpuinfo
MMX
This system has no playback issues. It's much faster per core and has plenty of CPU to spare. As you can see, the MMX build still offers a significant reduction in CPU usage even with the newer flags available. This naturally holds true for 4K video as well.
It would be nice to get MMX enabled again for at least the Win/Linux builds. This allows projects that use VP9 videos in particular to perform well on "marginal" systems and, in the case of 4K, expands the pool of systems able to play it at all.