Add static FFmpeg with rockchip hardware acceleration to rk- image #8599

Merged
merged 5 commits into from Nov 15, 2023

Conversation

MarcA711
Contributor

Limitations:

  • I haven't tested it thoroughly yet.
  • This FFmpeg build does not enable libvpx software de-/encoding. However, you can still decode VP8/VP9 and encode VP8 in hardware.
  • I haven't added the FFmpeg presets yet.
  • I haven't updated the docs yet.

I hope to add the hw accel presets this evening, so others can test this PR.


netlify bot commented Nov 13, 2023

Deploy Preview for frigate-docs ready!

Name Link
🔨 Latest commit 375c652
🔍 Latest deploy log https://app.netlify.com/sites/frigate-docs/deploys/65540f513b995f000856015c
😎 Deploy Preview https://deploy-preview-8599--frigate-docs.netlify.app

@MarcA711
Contributor Author

I added ffmpeg presets for hw accel.
Now hw accel can be used this way:

  • build frigate-rk image:
      git clone https://github.com/MarcA711/frigate-rockchip
      cd frigate-rockchip
      sudo make local-rk
  • add hw accel to your config.yaml:
      ffmpeg:
        hwaccel_args: preset-rk-h265 # or preset-rk-h264

@NickM-27
Sponsor Collaborator

looks good to me

@MarcA711
Contributor Author

I added the presets and updated the docs. On my system, everything runs very well. I guess this PR is ready to merge.

One note and one question:

  • The FFmpeg preset for hevc/h265 is called h265 in frigate but hevc in FFmpeg. I think this could confuse some users.
  • Rockchip supports more de- and encoders than h264 and h265/hevc. However, all other platforms just implemented h264/h265 presets, perhaps because these are the most common. Should I add the other presets as well or not?

@MarcA711 MarcA711 marked this pull request as ready for review November 14, 2023 21:05
@NickM-27
Sponsor Collaborator

on your concerns:

  1. yeah, frigate made this decision because h265 is commonly used in the camera menu itself and what you have done is consistent with other frigate presets
  2. for now probably not

@MarcA711
Contributor Author

Yes, h265 is probably easier for less technically experienced users. Thank you very much for your explanation and your review.

@MarcA711
Contributor Author

Added one small change, since I forgot to update the usage info after copy pasting...

@blakeblackshear blakeblackshear merged commit 8c7f6d4 into blakeblackshear:dev Nov 15, 2023
10 checks passed
@great9

great9 commented Nov 15, 2023

@MarcA711 maybe I'm missing something... Am I supposed to add

ffmpeg:
  hwaccel_args: preset-rk-h265

as global and then use hwaccel_args: preset-rk-h264 on the camera that's 264?
What about detect hardware acceleration?

@NickM-27
Sponsor Collaborator

you can do it any which way, camera config overrides the global config. It can be cleaner in the config to set it globally and then only override for the cameras that don't match like you described.
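The override behaviour described here can be sketched in a few lines of Python. This is only an illustration and not Frigate's actual config loader: the function name `effective_hwaccel_args` and the camera names are hypothetical, and the config is reduced to plain dictionaries.

```python
from typing import Optional


def effective_hwaccel_args(config: dict, camera: str) -> Optional[str]:
    """Return the hwaccel_args that apply to one camera.

    A camera-level ffmpeg.hwaccel_args wins; otherwise the global value is used.
    """
    global_args = config.get("ffmpeg", {}).get("hwaccel_args")
    camera_cfg = config.get("cameras", {}).get(camera, {})
    return camera_cfg.get("ffmpeg", {}).get("hwaccel_args", global_args)


# Hypothetical config: h265 set globally, one h264 camera overriding it.
config = {
    "ffmpeg": {"hwaccel_args": "preset-rk-h265"},
    "cameras": {
        "front_door": {"ffmpeg": {"inputs": []}},
        "garage": {"ffmpeg": {"hwaccel_args": "preset-rk-h264"}},
    },
}

print(effective_hwaccel_args(config, "front_door"))  # preset-rk-h265
print(effective_hwaccel_args(config, "garage"))      # preset-rk-h264
```

Setting the most common codec globally keeps the per-camera sections short, since only the exceptions need their own entry.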

the rknn detector is already outlined in the docs

@great9

great9 commented Nov 15, 2023

the rknn detector is already outlined in the docs

I was referring to the hardware scaling of the output image that the detector uses, not the rknn detector itself. How do I define the rockchip preset there? From the code I see that it uses hevc_rkmpp_encoder, which is basically '-c:v hevc_rkmpp_encoder -preset {1}'. What is {1}? How do I specify the h264 hwaccel for detect?

@NickM-27
Sponsor Collaborator

I think you are misunderstanding something, detection is run on decoded stream which is exactly what the hwaccel_args are used for

@NickM-27
Sponsor Collaborator

also, the presets used in the config are a frigate specific construct and not something built in to ffmpeg
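To illustrate what "a frigate specific construct" means in practice, here is a rough sketch of a preset being expanded into plain ffmpeg arguments before ffmpeg is started. The table below is hypothetical: the decoder names are assumptions modelled on the rkmpp naming used in this thread, not a copy of Frigate's real preset table, and `expand_hwaccel_args` is an invented helper name.

```python
from typing import List

# Hypothetical preset table. The decoder names are assumptions based on the
# rkmpp naming that appears in this thread, not Frigate's actual values.
PRESETS = {
    "preset-rk-h264": ["-c:v", "h264_rkmpp_decoder"],
    "preset-rk-h265": ["-c:v", "hevc_rkmpp_decoder"],
}


def expand_hwaccel_args(value: str) -> List[str]:
    """Turn a preset name into ffmpeg arguments; pass anything else through."""
    if value in PRESETS:
        return list(PRESETS[value])
    # Not a known preset: treat the value as raw ffmpeg arguments.
    return value.split()


print(expand_hwaccel_args("preset-rk-h265"))
print(expand_hwaccel_args("-hwaccel auto"))
```

ffmpeg itself never sees the preset name, only the arguments it expands to.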

@great9

great9 commented Nov 15, 2023

also, the presets used in the config are a frigate specific construct and not something built in to ffmpeg

aah ok testing the presets now and will compare cpu usage.

any chance to get annke c800 / hikvision clips in the event viewer? snapshots work fine. are there settings for clips that I can fiddle with?

@NickM-27
Sponsor Collaborator

I'm not sure what you mean, my hikvision works fine for recordings and snapshots

@great9

great9 commented Nov 15, 2023

First benchmarks...
If I use the 4k stream and scale it down to 1080p for detect, it hogs the CPU even at 10 fps. A very specific version of ffmpeg (with a different decoder and scaler) from nyanmisaka and hbiyik worked much better, with tremendously lower CPU usage.

Bugs... h264 decoder has an issue:

2023-11-15 22:24:51.623840716  [h264 @ 0x7f72b438b0] error while decoding MB 42 17, bytestream -16
2023-11-15 22:25:26.641173548  [h264 @ 0x7f713525d0] error while decoding MB 64 35, bytestream -11
2023-11-15 22:25:31.533741061  [h264 @ 0x7f71e51440] error while decoding MB 72 35, bytestream -8
2023-11-15 22:25:46.603678863  [h264 @ 0x7f71a22440] error while decoding MB 89 17, bytestream -22
2023-11-15 22:26:06.587354184  [h264 @ 0x7f71335d40] error while decoding MB 69 17, bytestream -10
2023-11-15 22:26:36.632310777  [h264 @ 0x7f7081df80] error while decoding MB 77 17, bytestream -22
2023-11-15 22:26:46.580402010  [h264 @ 0x7f718242e0] error while decoding MB 106 17, bytestream -12
2023-11-15 22:26:56.593857821  [h264 @ 0x7f72fbe670] error while decoding MB 66 17, bytestream -20

Which results in a picture like this:
image

Regarding clips, I can get the event clips for one specific camera played inside the chrome browser (ubiquiti g4 that's due to be replaced) but for the rest of the cameras (hikvision 4k and dahua 4k), they are not available:
image

@NickM-27
Sponsor Collaborator

The rockchip preset does not implement hardware scaling, the scaling is done in software, so high CPU usage in that scenario is to be expected. I would definitely suggest running detect on a sub stream

as for the recording issue, that may not be related to this at all, more info would be needed like frigate logs, nginx logs, and browser logs

@MarcA711
Contributor Author

One note since hardware scaling was mentioned here:
At least some Rockchip SoCs have raster graphics acceleration (rga). As far as I understand, this can be used when transcoding a video. As an example, say I have a 4k h265 video and want to transcode it to 2k h264. Then the VPU takes care of the transcoding part and the rga does the image scaling. This is not useful here, since frigate does not transcode our recorded videos. However, if you use go2rtc and transcode your videos, you can use hardware scaling. I plan to update the docs on using go2rtc with rockchip ffmpeg within the next couple of days and I will implement this part.

For some reason the rga is not used when decoding and scaling an image. So, if I decode my 4k h265 video and want to resize the raw frames to 2k resolution, the VPU does the decoding part, but the scaling part has to be done on the cpu not rga. After I finish the basic Rockchip implementation here, I want to find out why the rga is not used to scale decoded frames. Maybe the maintainer of the rockchip ffmpeg fork will implement it.

However, you specifically asked for hardware acceleration of frames passed to the rknpu detector. I guess this is not worth it, since passing the frame to ffmpeg and then to the rga is probably slower than just doing it on the cpu. But even if it is a bit faster, detection on the npu takes at least 20 ms even with the smallest models. The time it takes to scale the frame on the cpu is probably negligible compared to these 20 ms.

But all the information here is vague, as I have not yet familiarized myself with the subject in depth.

@MarcA711
Contributor Author

@great9 As I said in my last comment, I will update the docs about transcoding using go2rtc. With that, you can use the following workaround: transcode your 4k stream using go2rtc and use hardware scaling to scale down the stream to 1080p.

But only do this if you have to use the 4k stream. As Nick said, the much better solution is to provide a substream with lower resolution.

@great9

great9 commented Nov 15, 2023

One note since hardware scaling was mentioned here: At least some Rockchip SoCs have raster graphics acceleration (rga). As far as I understand, this can be used when transcoding a video. As an example, say I have a 4k h265 video and want to transcode it to 2k h264. Then the VPU takes care of the transcoding part and the rga does the image scaling. This is not useful here, since frigate does not transcode our recorded videos. However, if you use go2rtc and transcode your videos, you can use hardware scaling. I plan to update the docs on using go2rtc with rockchip ffmpeg within the next couple of days and I will implement this part.

great, will try that too.

For some reason the rga is not used when decoding and scaling an image. So, if I decode my 4k h265 video and want to resize the raw frames to 2k resolution, the VPU does the decoding part, but the scaling part has to be done on the cpu not rga. After I finish the basic Rockchip implementation here, I want to find out why the rga is not used to scale decoded frames. Maybe the maintainer of the rockchip ffmpeg fork will implement it.

I had some success with https://github.com/nyanmisaka/jellyfin-ffmpeg/commits/next-rockchip-async-afbc (ffmpeg compiled with "--enable-rkrga" and "--enable-rkmpp").

He has two branches he's working on.

Then there's @hbiyik, who implemented the rkmpp encoder/decoder for ffmpeg 4.4, 5.0, 5.1 and now the latest ffmpeg. He's been sharing some thoughts here.

However, you specifically asked for hardware acceleration of frames passed to the rknpu detector. I guess this is not worth it, since passing the frame to ffmpeg and then to the rga is probably slower than just doing it on the cpu. But even if it is a bit faster, detection on the npu takes at least 20 ms even with the smallest models. The time it takes to scale the frame on the cpu is probably negligible compared to these 20 ms.

I was wrong about "detect". I was under the impression that the "detect" config does scaling (based on the width, height parameters) and that it's being used with the CPU / VPU so that I can tune it.
I have coral and rknn. Detectors aren't a problem. My substreams are 360p and unusable. I really need to downscale from 4k to 1080p.

@NickM-27
Sponsor Collaborator

NickM-27 commented Nov 15, 2023

frigate will use ffmpeg to scale the detect input stream to the detect -> width / height if the stream is not already at that size. This is done on the GPU in most ffmpeg presets but as it was said, the preset here uses software
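As a rough sketch of what "decode in hardware, scale in software" looks like on the command line, the helper below assembles an ffmpeg invocation with a hardware decoder and a plain `scale` filter. It is not the command Frigate actually generates: `detect_ffmpeg_cmd` is an invented name, the decoder name is an assumption taken from the rkmpp naming in this thread, and the stream URL is a placeholder.

```python
from typing import List


def detect_ffmpeg_cmd(url: str, decoder: str, width: int, height: int, fps: int) -> List[str]:
    """Build an ffmpeg command: hardware decode, then software scale."""
    return [
        "ffmpeg",
        "-c:v", decoder,                   # decoder selected by the hwaccel preset (assumed name)
        "-i", url,
        "-r", str(fps),                    # output frame rate used for detection
        "-vf", f"scale={width}:{height}",  # plain scale filter, which runs on the CPU
        "-pix_fmt", "yuv420p",
        "-f", "rawvideo",
        "pipe:",                           # raw frames are written to stdout
    ]


cmd = detect_ffmpeg_cmd("rtsp://camera/stream", "h264_rkmpp_decoder", 1920, 1080, 10)
print(" ".join(cmd))
```

With a 4k input, every decoded frame passes through the `scale` filter on the CPU, which is why a lower-resolution substream is the cheaper option.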

@NickM-27
Sponsor Collaborator

@MarcA711 is there something you are implementing or just updating the docs to mention go2rtc config?

@MarcA711
Contributor Author

MarcA711 commented Nov 15, 2023

@great9 thank you for the links you provided, especially the jellyfin ffmpeg fork. I didn't know about it until now. It could be helpful for implementing hw scaling without encoding in ffmpeg. As I said, with hbiyik's ffmpeg fork it seems that hw scaling is only supported in combination with encoding.

@NickM-27 I would like to implement presets that can be used in the go2rtc way like this: ffmpeg:rtsp://rtsp:12345678@192.168.1.123/av_stream/ch0#input=rkh264dec#video=rkh264enc
Is there some way to add go2rtc presets?
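For readers unfamiliar with the go2rtc source syntax used above, here is a small sketch that assembles such a string from its parts. The helper name `go2rtc_ffmpeg_source` is made up for illustration, and `rkh264dec` / `rkh264enc` are the proposed preset names from this comment, which did not exist in go2rtc at the time of writing.

```python
from typing import Optional


def go2rtc_ffmpeg_source(url: str, input_preset: Optional[str] = None,
                         video: Optional[str] = None) -> str:
    """Compose a go2rtc "ffmpeg:" source with optional #input and #video parts."""
    source = f"ffmpeg:{url}"
    if input_preset:
        source += f"#input={input_preset}"
    if video:
        source += f"#video={video}"
    return source


# Placeholder credentials and address; preset names as proposed in this thread.
print(go2rtc_ffmpeg_source(
    "rtsp://user:pass@192.168.1.123/av_stream/ch0",
    input_preset="rkh264dec",
    video="rkh264enc",
))
```

Each `#key=value` fragment refers to a named template in the `ffmpeg` section of the go2rtc config, so new presets amount to new entries in that section.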

@NickM-27
Sponsor Collaborator

yes, that could be done in the create_config.py

if int(os.environ["LIBAVFORMAT_VERSION_MAJOR"]) < 59:
    if go2rtc_config.get("ffmpeg") is None:
        go2rtc_config["ffmpeg"] = {
            "rtsp": "-fflags nobuffer -flags low_delay -stimeout 5000000 -user_agent go2rtc/ffmpeg -rtsp_transport tcp -i {input}"
        }
    elif go2rtc_config["ffmpeg"].get("rtsp") is None:
        go2rtc_config["ffmpeg"][
            "rtsp"
        ] = "-fflags nobuffer -flags low_delay -stimeout 5000000 -user_agent go2rtc/ffmpeg -rtsp_transport tcp -i {input}"
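The same "only set it if the user has not" pattern could be generalised to add several templates at once. The sketch below is an assumption about how Rockchip presets might be injected, not code from this PR: `add_ffmpeg_defaults` is a hypothetical helper, and the template strings are placeholders rather than tested ffmpeg arguments.

```python
from typing import Dict


def add_ffmpeg_defaults(go2rtc_config: dict, defaults: Dict[str, str]) -> dict:
    """Add go2rtc ffmpeg templates without overwriting user-supplied ones."""
    ffmpeg_section = go2rtc_config.get("ffmpeg")
    if ffmpeg_section is None:
        ffmpeg_section = go2rtc_config["ffmpeg"] = {}
    for name, template in defaults.items():
        # Mirror the "is None" check above: keep whatever the user configured.
        if ffmpeg_section.get(name) is None:
            ffmpeg_section[name] = template
    return go2rtc_config


# Hypothetical templates: the names come from this thread, the argument
# strings are placeholders and have not been verified against any build.
ROCKCHIP_DEFAULTS = {
    "rkh264dec": "-c:v h264_rkmpp_decoder -i {input}",
    "rkh264enc": "-c:v h264_rkmpp_encoder",
}

user_config = {"ffmpeg": {"rkh264enc": "-c:v libx264"}}
print(add_ffmpeg_defaults(user_config, ROCKCHIP_DEFAULTS))
```

In this example the user's own `rkh264enc` entry is kept and only the missing `rkh264dec` template is filled in.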

@MarcA711
Contributor Author

Awesome, this is exactly what I need.

@faugconti

you forked MPP but did not include it in the rockchip makefile? seems to be missing

@MarcA711
Contributor Author

I am not sure what you mean.

I forked mpp and did the following changes:

  • remove all unit tests
  • remove the legacy rockchip_vpu lib
  • build static mpp

I guess you mean that the static rockchip_mpp.a lib contains just a part of the symbols. This is true. I used a workaround to build the lib: I used ar to combine all the static libs that Rockchip's original CMakeLists.txt builds.

I plan to update my fork in the future to directly build a librockchip_mpp with all symbols. However, the mpp repo contains 50+ CMake files and I have to work through all of them. I haven't had time for this so far. This is low priority for me, since the librockchip_mpp.a that I build using my workaround is basically the same as if I had written CMake files to build it directly.

So if you want to build your own:

git clone https://github.com/MarcA711/mpp
cd mpp 
mkdir bld && cd bld
cmake -DCMAKE_BUILD_TYPE=Release -DHAVE_DRM=ON -DCMAKE_INSTALL_PREFIX=/path/to/your/prefix ..
make -j$(nproc)
# now apply workaround to build mpp.a
../merge_libs.sh
# now you have a working librockchip_mpp.a under mpp/bld/mpp/librockchip_mpp.a
make install
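The contents of merge_libs.sh are not shown in this thread. One common way to combine several static libraries with GNU ar is an MRI script fed to `ar -M`, and the sketch below only generates such a script. The helper name `ar_mri_script` and the input library names are hypothetical, and this may well differ from what the fork's script actually does.

```python
from typing import Iterable


def ar_mri_script(output: str, libs: Iterable[str]) -> str:
    """Return a GNU ar MRI script that merges several static libs into one."""
    lines = [f"create {output}"]
    lines += [f"addlib {lib}" for lib in libs]
    lines += ["save", "end"]
    return "\n".join(lines) + "\n"


# Hypothetical input libraries; the real list depends on the mpp build tree.
script = ar_mri_script("librockchip_mpp.a", ["libmpp_base.a", "libmpp_codec.a"])
print(script)
# The script could then be passed to ar on stdin, e.g.:  ar -M < merge.mri
```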

@great9

great9 commented Nov 16, 2023

@MarcA711 can you add a preset for birdseye as well please?

@NickM-27
Sponsor Collaborator

Birdseye already uses the preset

@faugconti

oh i see. But no matter what i do i keep getting these errors

[h264_rkmpp_decoder @ 0xaaab1524bee0] Failed to initialize MPP context (code = -1).
[h264_rkmpp_decoder @ 0xaaaaf20af440] Failed to initialize RKMPP Codec.

@NickM-27
Sponsor Collaborator

What is the docker compose?

@MarcA711
Contributor Author

@faugconti
You don't need to build the librockchip_mpp.a library to use this image. Everything needed is already included.
What Rockchip SoC and what SBC are you using?
What is your docker_compose.yml and config.yml?

@faugconti

Solved this issue by switching from the mainline kernel back to 5.10.160-rockchip (which already comes with MPP) after checking the Dockerfile.
Wouldn't it be possible to include it inside the container alongside the other deps?

@MarcA711
Contributor Author

MarcA711 commented Nov 16, 2023

The problem is not a missing library. Everything necessary is implemented in the rk image.

I guess you were using the mainline kernel from collabora. As you can see from their status matrix, only the av1 codec is implemented so far, so this kernel supports no h264 or h265/hevc de- or encoding.
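Since the failure comes from the host kernel rather than the image, a quick check on the host can save some debugging. The sketch below simply tests whether a few device nodes exist. The node list is an assumption about what the vendor (BSP) kernel typically exposes and may differ per board, and `check_device_nodes` is an invented helper name.

```python
import os
from typing import Dict, Iterable

# Assumed device nodes of a Rockchip vendor (BSP) kernel; adjust for your board.
DEFAULT_NODES = ("/dev/mpp_service", "/dev/rga", "/dev/dri")


def check_device_nodes(paths: Iterable[str] = DEFAULT_NODES) -> Dict[str, bool]:
    """Map each path to True if it exists on this host."""
    return {path: os.path.exists(path) for path in paths}


for path, present in check_device_nodes().items():
    print(f"{path}: {'found' if present else 'missing'}")
```

If the MPP node is missing on the host, the "Failed to initialize MPP context" error above is the expected result regardless of what the container ships.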


5 participants