Add static FFmpeg with rockchip hardware acceleration to rk- image #8599

MarcA711 · 2023-11-13T11:56:49Z

Limitations:

I haven't tested it thoroughly yet.
This FFmpeg build does not enable libvpx de-/encoding in software. However, you can decode vp8/9 and encode vp8 in hardware.
I haven't added the FFmpeg presets yet.
I haven't updated the docs yet.

I hope to add the hw accel presets this evening, so others can test this PR.

netlify · 2023-11-13T11:57:34Z

✅ Deploy Preview for frigate-docs ready!

Name	Link
🔨 Latest commit	`375c652`
🔍 Latest deploy log	https://app.netlify.com/sites/frigate-docs/deploys/65540f513b995f000856015c
😎 Deploy Preview	https://deploy-preview-8599--frigate-docs.netlify.app

MarcA711 · 2023-11-13T20:59:27Z

I added ffmpeg presets for hw accel.
Now hw accel can be used this way:

build frigate-rk image:

git clone https://github.com/MarcA711/frigate-rockchip
cd frigate-rockchip
sudo make local-rk

add hw accel to your config.yaml

ffmpeg:
  hwaccel_args: preset-rk-h265 # or preset-rk-h264

NickM-27 · 2023-11-14T21:03:14Z

looks good to me

MarcA711 · 2023-11-14T21:05:34Z

I added the presets and updated the docs. On my system, everything runs very well. I guess this PR is ready to merge.

One note and one question:

The FFmpeg preset for hevc/h265 is called h265 in frigate but hevc in FFmpeg. I think this could confuse some users.
Rockchip supports more de- and encoders than h264 and h265/hevc. However, all other platforms just implemented h264/h265 presets, perhaps because these are the most common. Should I add the other presets as well or not?

NickM-27 · 2023-11-14T21:07:01Z

on your concerns:

yeah, frigate made this decision because h265 is commonly used in the camera menu itself and what you have done is consistent with other frigate presets
for now probably not

MarcA711 · 2023-11-14T21:55:41Z

Yes, h265 is perhaps easier for less technically experienced users. Thank you very much for your explanation and your review.

MarcA711 · 2023-11-14T23:14:39Z

Added one small change, since I forgot to update the usage info after copy-pasting...

great9 · 2023-11-15T20:52:18Z

@MarcA711 maybe I'm missing something... Am I supposed to add

ffmpeg:
  hwaccel_args: preset-rk-h265

as global and then use hwaccel_args: preset-rk-h264 on the camera that's 264?
What about detect hardware acceleration?

NickM-27 · 2023-11-15T20:54:38Z

you can do it any which way, camera config overrides the global config. It can be cleaner in the config to set it globally and then only override for the cameras that don't match like you described.
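A minimal sketch of the override rule described above, assuming a camera-level value simply wins over the global one when it is set. The function name and the camera names are hypothetical and are not taken from Frigate's code.

```python
from typing import Optional


def resolve_hwaccel_args(global_args: Optional[str], camera_args: Optional[str]) -> Optional[str]:
    """Return the camera-level value if it is set, otherwise the global one."""
    return camera_args if camera_args is not None else global_args


# Global default for the h265 cameras; one h264 camera overrides it.
global_args = "preset-rk-h265"
cameras = {
    "front_4k": None,            # hypothetical camera, inherits the global preset
    "garage": "preset-rk-h264",  # hypothetical camera, overrides the global preset
}

effective = {name: resolve_hwaccel_args(global_args, args) for name, args in cameras.items()}
print(effective)
```

Setting the most common codec globally keeps the per-camera sections short, which is the layout suggested in the comment above.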

the rknn detector is already outlined in the docs

great9 · 2023-11-15T21:02:01Z

> the rknn detector is already outlined in the docs

I was referring to the hardware scaling of the output image that the detector uses, not the rknn detector itself. How do I define the rockchip preset there? From the code I see that it uses hevc_rkmpp_encoder, which is basically '-c:v hevc_rkmpp_encoder -preset {1}'. What is {1}? How do I specify the h264 hwaccel for detect?

NickM-27 · 2023-11-15T21:04:18Z

I think you are misunderstanding something: detection is run on the decoded stream, which is exactly what the hwaccel_args are used for

NickM-27 · 2023-11-15T21:05:49Z

also, the presets used in the config are a frigate specific construct and not something built in to ffmpeg
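To illustrate what "a frigate specific construct" means in practice, here is a rough sketch of a preset lookup that expands a preset name into raw ffmpeg arguments and passes anything else through unchanged. The preset names come from this thread; the argument strings and the function name are assumptions for illustration only and may not match what Frigate actually emits.

```python
import shlex

# Hypothetical preset table. The decoder names are assumed by analogy with the
# "h264_rkmpp_decoder" tag that shows up in ffmpeg logs later in this thread.
PRESETS = {
    "preset-rk-h264": "-c:v h264_rkmpp_decoder",
    "preset-rk-h265": "-c:v hevc_rkmpp_decoder",
}


def expand_hwaccel_args(value: str) -> list[str]:
    """Expand a known preset name; treat any other value as raw ffmpeg args."""
    raw = PRESETS.get(value, value)
    return shlex.split(raw)


print(expand_hwaccel_args("preset-rk-h264"))
```

Because the lookup happens before ffmpeg is started, ffmpeg itself never sees the preset name, only the expanded arguments.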

great9 · 2023-11-15T21:12:51Z

> also, the presets used in the config are a frigate specific construct and not something built in to ffmpeg

aah ok testing the presets now and will compare cpu usage.

any chance to get annke c800 / hikvision clips in the event viewer? snapshots work fine. are there settings for clips that I can fiddle with?

NickM-27 · 2023-11-15T21:19:14Z

I'm not sure what you mean, my hikvision works fine for recordings and snapshots

great9 · 2023-11-15T21:32:36Z

First benchmarks...
If I use the 4k stream and scale it down to 1080p for detect, it hogs the CPU even at 10 fps. A very specific version of ffmpeg (with a different decoder and scaler) from nyanmisaka and hbiyik worked much better and lowered CPU usage tremendously.

Bugs... h264 decoder has an issue:

2023-11-15 22:24:51.623840716  [h264 @ 0x7f72b438b0] error while decoding MB 42 17, bytestream -16
2023-11-15 22:25:26.641173548  [h264 @ 0x7f713525d0] error while decoding MB 64 35, bytestream -11
2023-11-15 22:25:31.533741061  [h264 @ 0x7f71e51440] error while decoding MB 72 35, bytestream -8
2023-11-15 22:25:46.603678863  [h264 @ 0x7f71a22440] error while decoding MB 89 17, bytestream -22
2023-11-15 22:26:06.587354184  [h264 @ 0x7f71335d40] error while decoding MB 69 17, bytestream -10
2023-11-15 22:26:36.632310777  [h264 @ 0x7f7081df80] error while decoding MB 77 17, bytestream -22
2023-11-15 22:26:46.580402010  [h264 @ 0x7f718242e0] error while decoding MB 106 17, bytestream -12
2023-11-15 22:26:56.593857821  [h264 @ 0x7f72fbe670] error while decoding MB 66 17, bytestream -20
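When comparing builds or presets, it can help to count these errors instead of eyeballing the log. The following is a small sketch that assumes log lines shaped exactly like the excerpt above; the function name is made up for this example.

```python
import re
from collections import Counter

# Matches lines shaped like the log excerpt above.
LINE_RE = re.compile(
    r"\[(?P<codec>\w+) @ 0x[0-9a-f]+\] error while decoding MB "
    r"(?P<mb_x>\d+) (?P<mb_y>\d+), bytestream (?P<pos>-?\d+)"
)


def count_decode_errors(lines):
    """Count decode errors per (codec tag, macroblock row)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match["codec"], int(match["mb_y"]))] += 1
    return counts


sample = [
    "2023-11-15 22:24:51.623840716  [h264 @ 0x7f72b438b0] error while decoding MB 42 17, bytestream -16",
    "2023-11-15 22:25:26.641173548  [h264 @ 0x7f713525d0] error while decoding MB 64 35, bytestream -11",
    "2023-11-15 22:25:31.533741061  [h264 @ 0x7f71e51440] error while decoding MB 72 35, bytestream -8",
]
errors = count_decode_errors(sample)
print(errors)
```

The codec tag in square brackets is kept in the key so that errors from differently named decoders can be told apart.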

Which results in a picture like this:

Regarding clips, I can get the event clips for one specific camera played inside the chrome browser (ubiquiti g4 that's due to be replaced) but for the rest of the cameras (hikvision 4k and dahua 4k), they are not available:

NickM-27 · 2023-11-15T21:34:58Z

The rockchip preset does not implement hardware scaling, the scaling is done in software, so high CPU usage in that scenario is to be expected. I would definitely suggest running detect on a sub stream

as for the recording issue, that may not be related to this at all, more info would be needed like frigate logs, nginx logs, and browser logs

MarcA711 · 2023-11-15T21:42:08Z

One note since hardware scaling was mentioned here:
At least some Rockchip SoCs have raster graphics acceleration (rga). As far as I understand, this can be used when transcoding a video. As an example, suppose I have a 4k h265 video and want to transcode it to 2k h264. Then the VPU takes care of the transcoding part and the rga does the image scaling. This is not useful here, since frigate does not transcode our recorded videos. However, if you use go2rtc and transcode your videos, you can use hardware scaling. I plan to update the docs on using go2rtc with rockchip ffmpeg within the next couple of days and I will implement this part.

For some reason the rga is not used when decoding and scaling an image. So, if I decode my 4k h265 video and want to resize the raw frames to 2k resolution, the VPU does the decoding part, but the scaling part has to be done on the cpu not rga. After I finish the basic Rockchip implementation here, I want to find out why the rga is not used to scale decoded frames. Maybe the maintainer of the rockchip ffmpeg fork will implement it.

However, you specifically asked for hardware acceleration of frames passed to the rknpu detector. I guess this is not worth it, since passing the frame to ffmpeg and then to the rga is probably slower than just doing it on the cpu. But even if it is a bit faster, the detection using the npu takes at least 20 ms even using the smallest models. The time it takes to scale the frame on the cpu is probably negligible compared to these 20 ms.

But all the information here is vague, as I have not yet familiarized myself with the subject in depth.

MarcA711 · 2023-11-15T21:46:54Z

@great9 As I said in my last comment, I will update the docs about transcoding using go2rtc. With that, you can use the following workaround: transcode your 4k stream using go2rtc and use hardware scaling to scale down the stream to 1080p.

But only do this if you have to use this 4k stream. As Nick said, the much better solution is to provide a substream with lower resolution.

great9 · 2023-11-15T21:47:00Z

> One note since hardware scaling was mentioned here: At least some Rockchip SoCs have raster graphics acceleration (rga). As far as I understand, this can be used when transcoding a video. As an example, suppose I have a 4k h265 video and want to transcode it to 2k h264. Then the VPU takes care of the transcoding part and the rga does the image scaling. This is not useful here, since frigate does not transcode our recorded videos. However, if you use go2rtc and transcode your videos, you can use hardware scaling. I plan to update the docs on using go2rtc with rockchip ffmpeg within the next couple of days and I will implement this part.

great, will try that too.

> For some reason the rga is not used when decoding and scaling an image. So, if I decode my 4k h265 video and want to resize the raw frames to 2k resolution, the VPU does the decoding part, but the scaling part has to be done on the cpu not rga. After I finish the basic Rockchip implementation here, I want to find out why the rga is not used to scale decoded frames. Maybe the maintainer of the rockchip ffmpeg fork will implement it.

I had some success with https://github.com/nyanmisaka/jellyfin-ffmpeg/commits/next-rockchip-async-afbc (ffmpeg compiled with "--enable-rkrga" and "--enable-rkmpp").

He has two branches he's working on.

Then there's @hbiyik, who implemented the rkmpp encoder / decoder for ffmpeg 4.4, 5.0, 5.1 and now the latest ffmpeg. He's been sharing some thoughts here.

> However, you specifically asked for hardware acceleration of frames passed to the rknpu detector. I guess this is not worth it, since passing the frame to ffmpeg and then to the rga is probably slower than just doing it on the cpu. But even if it is a bit faster, the detection using the npu takes at least 20 ms even using the smallest models. The time it takes to scale the frame on the cpu is probably negligible compared to these 20 ms.

I was wrong about "detect". I was under the impression that the "detect" config does scaling (based on the width, height parameters) and that it's being used with the CPU / VPU so that I can tune it.
I have coral and rknn. Detectors aren't a problem. My substreams are 360p and unusable. I really need to downscale from 4k to 1080p.

NickM-27 · 2023-11-15T21:51:40Z

frigate will use ffmpeg to scale the detect input stream to the detect -> width / height if the stream is not already at that size. This is done on the GPU in most ffmpeg presets but as it was said, the preset here uses software
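As a rough illustration of the rule above, this sketch only emits a scale filter when the stream does not already match the detect size, and also shows how much work a 4k to 1080p downscale implies. The function names are hypothetical; `scale=W:H` is the standard ffmpeg software scale filter, and the exact arguments Frigate builds may differ.

```python
from typing import Optional


def detect_scale_filter(stream_w: int, stream_h: int, detect_w: int, detect_h: int) -> Optional[str]:
    """Return an ffmpeg scale filter, or None if the stream is already at detect size."""
    if (stream_w, stream_h) == (detect_w, detect_h):
        return None
    return f"scale={detect_w}:{detect_h}"


def downscale_factor(stream_w: int, stream_h: int, detect_w: int, detect_h: int) -> float:
    """Ratio of input pixels to output pixels per frame."""
    return (stream_w * stream_h) / (detect_w * detect_h)


print(detect_scale_filter(3840, 2160, 1920, 1080))
print(downscale_factor(3840, 2160, 1920, 1080))
```

With software scaling, every input pixel of every decoded frame passes through the CPU, so a 4k source costs four times the pixel reads of a 1080p source at the same frame rate. That is consistent with the CPU load reported earlier in the thread.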

NickM-27 · 2023-11-15T21:52:27Z

@MarcA711 is there something you are implementing or just updating the docs to mention go2rtc config?

MarcA711 · 2023-11-15T22:04:47Z

@great9 thank you for the links you provided, especially the jellyfin ffmpeg fork. I didn't know about it until now. This can be helpful for implementing hw scaling without encoding in ffmpeg. As I said, with hbiyik's ffmpeg fork it seems that hw scaling is only supported in combination with encoding.

@NickM-27 I would like to implement presets that can be used in the go2rtc way like this: ffmpeg:rtsp://rtsp:12345678@192.168.1.123/av_stream/ch0#input=rkh264dec#video=rkh264enc
Is there some way to add go2rtc presets?

NickM-27 · 2023-11-15T22:11:04Z

yes, that could be done in the create_config.py

frigate/docker/main/rootfs/usr/local/go2rtc/create_config.py

Lines 106 to 114 in 8c7f6d4

    
if int(os.environ["LIBAVFORMAT_VERSION_MAJOR"]) < 59:
    if go2rtc_config.get("ffmpeg") is None:
        go2rtc_config["ffmpeg"] = {
            "rtsp": "-fflags nobuffer -flags low_delay -stimeout 5000000 -user_agent go2rtc/ffmpeg -rtsp_transport tcp -i {input}"
        }
    elif go2rtc_config["ffmpeg"].get("rtsp") is None:
        go2rtc_config["ffmpeg"][
            "rtsp"
        ] = "-fflags nobuffer -flags low_delay -stimeout 5000000 -user_agent go2rtc/ffmpeg -rtsp_transport tcp -i {input}"
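Following the same pattern, go2rtc presets for Rockchip could be injected as extra ffmpeg templates that are only set when the user has not defined them. This is a sketch under assumptions: the template names rkh264dec and rkh264enc come from the example URL earlier in the thread, while the argument strings, the encoder name and the function name are placeholders, not the values that were eventually merged.

```python
# Hypothetical template values; only the template names appear in this thread.
RK_PRESETS = {
    "rkh264dec": "-c:v h264_rkmpp_decoder -rtsp_transport tcp -i {input}",
    "rkh264enc": "-c:v h264_rkmpp_encoder",
}


def add_rk_presets(go2rtc_config: dict) -> dict:
    """Add Rockchip templates to the ffmpeg section without overwriting user values."""
    if go2rtc_config.get("ffmpeg") is None:
        go2rtc_config["ffmpeg"] = {}
    for name, args in RK_PRESETS.items():
        # setdefault keeps any template the user already defined under this name.
        go2rtc_config["ffmpeg"].setdefault(name, args)
    return go2rtc_config


config = add_rk_presets({"ffmpeg": {"rkh264enc": "-c:v my_custom_encoder"}})
print(config["ffmpeg"])
```

Filling in only the missing keys mirrors the `is None` checks in the snippet above, so a user-supplied template always takes precedence.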

MarcA711 · 2023-11-15T22:13:05Z

Awesome, this is exactly what I need.

faugconti · 2023-11-16T03:50:32Z

you forked MPP but did not include it in the rockchip makefile? seems to be missing

MarcA711 · 2023-11-16T08:15:27Z

I am not sure what you mean.

I forked mpp and made the following changes:

* remove all unit tests
* remove the legacy rockchip_vpu lib
* build static mpp

I guess you mean that the static rockchip_mpp.a lib contains just a part of the symbols. This is true, I used a workaround to build the lib: I used ar to combine all static libs that the original CMakeLists.txt from rockchip builds.

I plan to update my fork in the future to directly build a librockchip_mpp with all symbols. However, the mpp repo contains 50+ CMake files and I have to work through all of them. I haven't had time for this so far. This is low priority for me, since the librockchip_mpp.a that I build using my workaround is basically the same as if I wrote CMake files to directly build it.

So if you want to build your own:

git clone https://github.com/MarcA711/mpp
cd mpp 
mkdir bld && cd bld
cmake -DCMAKE_BUILD_TYPE=Release -DHAVE_DRM=ON -DCMAKE_INSTALL_PREFIX=/path/to/your/prefix ..
make -j$(nproc)
# now apply workaround to build mpp.a
../merge_libs.sh
# now you have a working librockchip_mpp.a under mpp/bld/mpp/librockchip_mpp.a
make install
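The contents of merge_libs.sh are not shown in this thread. As one possible way to combine several static archives with ar, the sketch below generates a GNU ar MRI script and feeds it to `ar -M`. The archive names in the example are placeholders and the function names are hypothetical.

```python
import subprocess


def build_mri_script(output: str, archives: list[str]) -> str:
    """Build a GNU ar MRI script that merges `archives` into `output`."""
    lines = [f"create {output}"]
    lines += [f"addlib {archive}" for archive in archives]
    lines += ["save", "end"]
    return "\n".join(lines) + "\n"


def merge_archives(output: str, archives: list[str]) -> None:
    """Run GNU ar in MRI mode; requires binutils' ar on PATH."""
    subprocess.run(["ar", "-M"], input=build_mri_script(output, archives), text=True, check=True)


# Placeholder archive names, not the real mpp build outputs.
script = build_mri_script("librockchip_mpp.a", ["libpart_a.a", "libpart_b.a"])
print(script)
```

Calling `merge_archives` with the real list of partial archives would produce one archive containing all of their object files.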

great9 · 2023-11-16T13:01:18Z

@MarcA711 can you add a preset for birdseye as well please?

NickM-27 · 2023-11-16T13:04:32Z

Birdseye already uses the preset

faugconti · 2023-11-16T17:19:22Z

oh i see. But no matter what i do i keep getting these errors

[h264_rkmpp_decoder @ 0xaaab1524bee0] Failed to initialize MPP context (code = -1).
[h264_rkmpp_decoder @ 0xaaaaf20af440] Failed to initialize RKMPP Codec.

NickM-27 · 2023-11-16T18:13:36Z

What is the docker compose?

MarcA711 · 2023-11-16T19:20:22Z

@faugconti
You don't need to build the librockchip_mpp.a library to use this image. Everything needed is already included.
What Rockchip SoC and what SBC are you using?
What is your docker_compose.yml and config.yml?

faugconti · 2023-11-16T21:03:05Z

Solved this issue by switching from the mainline kernel back to 5.10.160-rockchip (which already comes with MPP) after checking the Dockerfile.
Wouldn't it be possible to include it inside the container alongside the other deps?

MarcA711 · 2023-11-16T21:42:44Z

The problem is not a missing library. Everything necessary is implemented in the rk image.

I guess you were using the mainline kernel from collabora. As you can see from their status matrix, only the av1 codec is implemented so far, so this kernel supports no h264 or h265/hevc de- or encoding.
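Since the failure above came from the host kernel rather than the image, a quick pre-flight check of the device nodes can save some debugging. This is only a sketch: the node path /dev/mpp_service is an assumption about what the Rockchip vendor kernel exposes and is not stated in this thread, and the function name is hypothetical.

```python
import os
from typing import Callable, Iterable

# Assumed device node of the vendor kernel's MPP driver; adjust for your board.
REQUIRED_NODES = ["/dev/mpp_service"]


def missing_nodes(required: Iterable[str], exists: Callable[[str], bool] = os.path.exists) -> list[str]:
    """Return the required device nodes that are not present on this host."""
    return [path for path in required if not exists(path)]


missing = missing_nodes(REQUIRED_NODES)
if missing:
    print("Kernel may lack Rockchip VPU support, missing:", ", ".join(missing))
else:
    print("All assumed device nodes are present")
```

The `exists` parameter is injectable so the check can be exercised on a machine without Rockchip hardware.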

add static ffmpeg with rockchip hw accel

7a17ed7

add ffmpeg presets

8a3e36d

fix scaling preset and update docs for rk hwaccel

50e3ae7

MarcA711 marked this pull request as ready for review November 14, 2023 21:05

NickM-27 approved these changes Nov 14, 2023

update usage info in ffmpeg_presets docs

30568b1

Add note about hardware acceleration support

375c652

blakeblackshear merged commit 8c7f6d4 into blakeblackshear:dev Nov 15, 2023
10 checks passed

great9 mentioned this pull request Nov 15, 2023

feat(WORK IN PROGRESS): Add ArmNN and rknn2 detectors. Add docs about orange pi 5 #5733

Closed
