-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(server): fully accelerated nvenc #9452
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this just for testing?
98edc66
to
d5ace68
Compare
Deploying immich with Cloudflare Pages
|
eb495c8
to
60aadf3
Compare
60aadf3
to
ea33cd9
Compare
I decided to add the hardware decoding toggle after all. The new command may not work with older kernels or drivers, and it's useful for debugging purposes. Overall, this takes it from being a change I'm slightly nervous about to one I can confidently say is fine. |
Just a FYI, BTW for issue #9252, we also have |
I was going to give this a test but given the latest comment I will wait for @mertalev to give it another pass if he wants to make changes related to that comment. |
Thanks for the tips, @nyanmisaka! I didn't know about |
@mertalev let me know if (and when) I should test this on my RK3588 device |
@nyanmisaka I did some testing and Test video (downloaded with
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -noautorotate -threads 1 -i HDR.mkv \
-tune hq -qmin 0 -rc-lookahead 20 -i_qfactor 0.75 -c:v av1_nvenc -c:a aac -movflags faststart \
-fps_mode passthrough -map 0:0 -map 0:1 -g 256 -temporal-aq 1 -v verbose -preset p1 -cq:v 40 \
-vf scale_cuda=-2:720,tonemap_cuda=matrix=bt709:primaries=bt709:range=pc:tonemap=hable:transfer=bt709:format=nv12 \
SDR.mp4
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -noautorotate -threads 1 -i HDR.mkv \
-tune hq -qmin 0 -rc-lookahead 20 -i_qfactor 0.75 -c:v av1_nvenc -c:a aac -movflags faststart \
-fps_mode passthrough -map 0:0 -map 0:1 -g 256 -temporal-aq 1 -v verbose -preset p1 -cq:v 40 \
-vf scale_cuda=-2:720,hwupload=derive_device=vulkan,libplacebo=color_primaries=bt709:color_trc=bt709:colorspace=bt709:downscaler=none:format=yuv420p:tonemapping=hable:upscaler=none,hwupload=derive_device=cuda \
SDR.mp4 Performance results:
Also wow, I was not expecting that big of a gap between software and hardware decoding / tone-mapping. Edit: I noticed there are some artifacts in the libplacebo image (lower right, middle left, top left). Looks like there's a bug when used in tandem with Interestingly, both filters also have distorted colors before a scene change when |
@fyfrey There shouldn't be any other changes to the RKMPP code so feel free to test it whenever you're able! It basically makes the software decoding variant the default, but the commands are otherwise the same. |
update dockerfile
54dbbf9
to
a6732cf
Compare
@mertalev The image produced by The complete filter options can be queried through the following command.
As for the artifacts caused by the |
@nyanmisaka Thanks, disabling de-saturation did the trick! The colors in the OpenCL image actually look closer to the HDR video than with libplacebo now. With that change, I think this PR is ready for final review. |
In fact, tone-mapping is a lossy process. Therefore there is no completely accurate result. Personal preference also plays an important role here. As for the performance difference, it is not much different on beefy GPUs such as RTX. It is more obvious on weak GPUs such as GTX1050. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, if you've tested nvidia then I am happy for this to be merged, would just be good to get a test of RKMP although it seems that is effectively the same as before with just some code moved around to account for the new hw decode toggle
Description
Edit: An earlier version of this PR elected to make a breaking change here, but after some consideration I decided against it. The PR now makes hardware decoding opt-in and existing setups will continue to work. There is a new toggle for whether to use hardware decoding, applicable to NVENC and RKMPP. Since it defaults to false, the behavior is the same as current for NVENC. RKMPP will be downgraded to software decoding until the admin enables hardware decoding.
This is a smaller version of #9402 that only changes the behavior for NVENC. That PR aimed to streamline the decoding and filtering process to one pipeline by leveraging libplacebo and Vulkan's cross-device capabilities. However, this is premature due to the following reasons:
Vulkan works very well with Nvidia from my testing and has reasonable backwards compatibility (9xx series onward), so it's fine to use it here. Maybe someday it can be used more extensively, but in the meantime it's similar to this XKCD.
Testing
Tested transcoding a video on NVENC and CPU with tone-mapping enabled and disabled, confirming success logs and confirming the video plays (with browser caching disabled to ensure the video is up-to-date).