[HW Accel Support]: Random out of memory errors #7934

Closed
jubjubrsx opened this issue Sep 24, 2023 · 12 comments

Comments

@jubjubrsx

jubjubrsx commented Sep 24, 2023

Describe the problem you are having

I'm getting strange, random out-of-memory errors and hard lockups of the machine. I think it might be related to the VAAPI driver memory leak, but the threads I read about it were a few years old. Is this still a problem?

Version

0.12.1-367D724

Frigate config file

# ~/docker/frigate/config.yml
# yaml-language-server: $schema=http://192.168.150.3:5000/api/config/schema.json

ui:
  use_experimental: false
  live_mode: mse

mqtt:
  host: 192.168.10.10
  port: 1883
  user: mqtt
  password: passwordsareawesome
  topic_prefix: frigate
  client_id: frigate
  stats_interval: 60
  enabled: true

detectors:
  coral1:
    type: edgetpu
    device: pci

ffmpeg:
  hwaccel_args: preset-vaapi

record:
  expire_interval: 10

logger:
  default: info

rtmp:
  enabled: false

live:
  height: 720
  quality: 1

birdseye:
  enabled: true
  restream: False
  width: 640
  height: 360
  quality: 1
  mode: continuous

#############################################################
go2rtc:
  streams:
    
    Front_Porch:
      - rtsp://admin:passwordsareawesome@192.168.150.10:554/Streaming/Channels/101?transportmode=unicast&profile=Profile_1
    Front_Porch_sub:
      - rtsp://admin:passwordsareawesome@192.168.150.10:554/Streaming/Channels/102?transportmode=unicast&profile=Profile_1


    Back_Porch:
      - rtsp://admin:passwordsareawesome@192.168.150.13:554/Streaming/Channels/101?transportmode=unicast&profile=Profile_1
    Back_Porch_sub:
      - rtsp://admin:passwordsareawesome@192.168.150.13:554/Streaming/Channels/102?transportmode=unicast&profile=Profile_1


    Air_Conditioner:
      - rtsp://admin:passwordsareawesome@192.168.150.14:554/Streaming/Channels/101?transportmode=unicast&profile=Profile_1
    Air_Conditioner_sub:
      - rtsp://admin:passwordsareawesome@192.168.150.14:554/Streaming/Channels/102?transportmode=unicast&profile=Profile_1


    Carport:
      - rtsp://admin:passwordsareawesome@192.168.150.12:554/Streaming/Channels/101?transportmode=unicast&profile=Profile_1
    Carport_sub:
      - rtsp://admin:passwordsareawesome@192.168.150.12:554/Streaming/Channels/102?transportmode=unicast&profile=Profile_1


    Driveway:
      - rtsp://admin:passwordsareawesome@192.168.150.11:554/Streaming/Channels/101?transportmode=unicast&profile=Profile_1
    Driveway_sub:
      - rtsp://admin:passwordsareawesome@192.168.150.11:554/Streaming/Channels/102?transportmode=unicast&profile=Profile_1


    Backyard:
      - rtsp://admin:passwordsareawesome@192.168.150.15:554/Streaming/Channels/101
    Backyard_sub:
      - rtsp://admin:passwordsareawesome@192.168.150.15:554/Streaming/Channels/102


    webrtc:
      candidates:
        - 192.168.150.3:8555
        #- stun:8555
##############################################################

cameras:

 Driveway: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Driveway
          input_args: preset-rtsp-restream
          roles:
            - record
            - rtmp
       -  path: rtsp://127.0.0.1:8554/Driveway_sub
          input_args: preset-rtsp-restream
          roles:
             - detect
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    detect:
      width: 640 #<---- update for your camera's resolution
      height: 360 #<---- update for your camera's resolution
    objects:
      track:
        - person
        - car
        - cat
        - bicycle
        - dog
      filters:
        person:
          min_score: 0.6
          threshold: 0.7
          min_area: 700
    snapshots:
      enabled: true
      timestamp: true
      bounding_box: true
      crop: True
      height: 500
      retain:
        default: 30
    record:
      enabled: True
      retain:
        days: 45
        mode: motion
      events:
        retain:
          default: 14
          mode: motion
        pre_capture: 15
        post_capture: 30
    motion:
       mask:
        - 0,0,843,0,960,0,1049,0,1093,152,718,74,421,64,333,61,203,73,114,94,64,108,0,257

 Back_Porch: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Back_Porch
          input_args: preset-rtsp-restream
          roles:
            - record
            - rtmp
       -  path: rtsp://127.0.0.1:8554/Back_Porch_sub
          input_args: preset-rtsp-restream
          roles:
             - detect
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    detect:
      width: 640 #<---- update for your camera's resolution
      height: 360 #<---- update for your camera's resolution
    objects:
      track:
        - person
        - car
        - cat
        - dog
      filters:
        person:
          min_score: 0.6
          threshold: 0.7
          min_area: 700
    snapshots:
      enabled: true
      timestamp: true
      bounding_box: true
      crop: True
      height: 500
      retain:
        default: 30
    record:
      enabled: True
      retain:
        days: 45
        mode: motion
      events:
        retain:
          default: 5
          mode: motion
        pre_capture: 15
        post_capture: 30
    motion:
      mask:
       - 1012,0,1047,158,890,130,753,106,643,0


 Carport: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Carport
          input_args: preset-rtsp-restream
          roles:
            - record
            - rtmp
       -  path: rtsp://127.0.0.1:8554/Carport_sub
          input_args: preset-rtsp-restream
          roles:
             - detect
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    detect:
      width: 640 #<---- update for your camera's resolution
      height: 360 #<---- update for your camera's resolution
    objects:
      track:
        - person
        - car
        - cat
        - bicycle
        - dog
      filters:
        person:
          min_score: 0.6
          threshold: 0.7
          min_area: 700
    snapshots:
      enabled: true
      timestamp: true
      bounding_box: true
      crop: True
      height: 500
      retain:
        default: 30
    record:
      enabled: True
      retain:
        days: 45
        mode: motion
      events:
        required_zones:
         - Carzone_1
        retain:
          default: 5
          mode: motion
        pre_capture: 15
        post_capture: 30
    zones:
        Carzone_0:
          coordinates: 265,322,271,174,305,90,328,47,450,22,684,24,753,133,719,527,421,586,265,561
        Carzone_1:
          coordinates: 116,576,0,587,0,96,0,96,21,45,147,33


 Air_Conditioner: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Air_Conditioner
          input_args: preset-rtsp-restream
          roles:
            - record
            - rtmp
       -  path: rtsp://127.0.0.1:8554/Air_Conditioner_sub
          input_args: preset-rtsp-restream
          roles:
             - detect
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    detect:
      width: 640 #<---- update for your camera's resolution
      height: 360 #<---- update for your camera's resolution
    objects:
      track:
        - person
        - car
        - cat
        - bicycle
        - dog
      filters:
        person:
          min_score: 0.6
          threshold: 0.7
          min_area: 700
          mask:
           - 927,54,933,104,881,87,887,49
    snapshots:
      enabled: true
      timestamp: true
      bounding_box: true
      crop: True
      height: 500
      retain:
        default: 30
    record:
      enabled: True
      retain:
        days: 45
        mode: motion
      events:
        retain:
          default: 5
          mode: motion
        pre_capture: 15
        post_capture: 30


 Front_Porch: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Front_Porch
          input_args: preset-rtsp-restream
          roles:
            - record
            - rtmp
       -  path: rtsp://127.0.0.1:8554/Front_Porch_sub
          input_args: preset-rtsp-restream
          roles:
             - detect
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    detect:
      width: 640 #<---- update for your camera's resolution
      height: 360 #<---- update for your camera's resolution
    objects:
      track:
        - person
        - car
        - bicycle
        - cat
        - dog
      filters:
        person:
          min_score: 0.6
          threshold: 0.7
          min_area: 700
    snapshots:
      enabled: true
      timestamp: true
      bounding_box: true
      crop: True
      height: 500
      retain:
        default: 30
    record:
      enabled: True
      retain:
        days: 45
        mode: motion
      events:
        required_zones:
        - FPzone_0
        retain:
          default: 5
          mode: motion
        pre_capture: 15
        post_capture: 30
    motion:
      mask:
        - 872,0,1146,0,1189,58,1219,0,1032,0,1054,59,912,46,809,30,681,32,550,44,500,60,495,0
    zones:
     FPzone_0:
      coordinates: 1280,720,0,720,0,0,612,0,1018,0,1085,175,1201,192,1280,210,1280,433
     FPzone_1:
      coordinates: 1046,0,1280,0,1280,189,1055,163    


 Backyard: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Backyard
          input_args: preset-rtsp-restream
          roles:
            - record
            - rtmp
       -  path: rtsp://127.0.0.1:8554/Backyard_sub
          input_args: preset-rtsp-restream
          roles:
             - detect
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    detect:
      width: 640 #<---- update for your camera's resolution
      height: 360 #<---- update for your camera's resolution
    objects:
      track:
        - person
        - car
        - cat
        - bicycle
        - dog
      filters:
        person:
          min_score: 0.6
          threshold: 0.7
          min_area: 700
          mask:
           - 927,54,933,104,881,87,887,49
    snapshots:
      enabled: true
      timestamp: true
      bounding_box: true
      crop: True
      height: 500
      retain:
        default: 30
    record:
      enabled: True
      retain:
        days: 45
        mode: motion
      events:
        retain:
          default: 5
          mode: motion
        pre_capture: 15
        post_capture: 30
    motion:
      mask:
         - 798,394,664,473,625,488,412,425,426,298,757,300



 247Front_Porch: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Front_Porch_sub
          input_args: preset-rtsp-restream
          roles:
             - record
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    record:
      enabled: True
      retain:
        days: 30
        mode: all
    detect:
      enabled: false
    birdseye:
      enabled: false


 247Driveway: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Driveway_sub
          input_args: preset-rtsp-restream
          roles:
             - record
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    record:
      enabled: True
      retain:
        days: 30
        mode: all
    detect:
        enabled: false
    birdseye:
      enabled: false

 247Carport: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Carport_sub
          input_args: preset-rtsp-restream
          roles:
             - record
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    record:
      enabled: True
      retain:
        days: 30
        mode: all
    detect:
      enabled: false
    birdseye:
      enabled: false

 247Back_Porch: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Back_Porch_sub
          input_args: preset-rtsp-restream
          roles:
             - record
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    record:
      enabled: True
      retain:
        days: 30
        mode: all
    detect:
      enabled: false
    birdseye:
      enabled: false

 247Air_Conditioner: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Air_Conditioner_sub
          input_args: preset-rtsp-restream
          roles:
             - record
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    record:
      enabled: True
      retain:
        days: 30
        mode: all
    detect:
      enabled: false
    birdseye:
      enabled: false

 247Backyard: # <------ Name the camera
    ffmpeg:
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -tag:v hvc1 -bsf:v hevc_mp4toannexb -c:a aac
        rtmp: -c:v copy -c:a aac -f flv

      inputs:
       -  path: rtsp://127.0.0.1:8554/Backyard_sub
          input_args: preset-rtsp-restream
          roles:
             - record
    rtmp:
      enabled: False # <-- RTMP should be disabled if your stream is not H264
    record:
      enabled: True
      retain:
        days: 30
        mode: all
    detect:
      enabled: false
    birdseye:
      enabled: false

docker-compose file or Docker CLI command

# ~/docker/frigate/frigate-compose.yml

version: "3.9"
services:
  frigate:
    container_name: frigate
    privileged: true
    restart: unless-stopped
    image: ghcr.io/blakeblackshear/frigate:stable
    shm_size: "2048mb" # update for your cameras based on calculation
    devices:
      - /dev/apex_0:/dev/apex_0 # passes a PCIe Coral
      - /dev/dri/renderD128
      - /dev/dri/card0
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /home/docker/frigate:/config
      - /media/frigate:/media/frigate
      - type: tmpfs # Optional: 1GB of memory, reduces SSD/SD Card wear
        target: /tmp/cache
        tmpfs:
          size: 1500000000
    ports:
      - "5000:5000"
      - "8554:8554" # RTSP feeds
      - "8555:8555/tcp" # WebRTC over tcp
      - "8555:8555/udp" # WebRTC over udp
    environment:
      FRIGATE_RTSP_PASSWORD: "_"

Relevant log output

none

FFprobe output from your camera

[{"return_code":0,"stderr":"","stdout":{"programs":[],"streams":[{"avg_frame_rate":"0/0","codec_long_name":"H.265/HEVC(HighEfficiencyVideoCoding)","display_aspect_ratio":"16:9","height":2160,"width":3840}]}},{"return_code":0,"stderr":"","stdout":{"programs":[],"streams":[{"avg_frame_rate":"0/0","codec_long_name":"H.265/HEVC(HighEfficiencyVideoCoding)","display_aspect_ratio":"16:9","height":360,"width":640}]}}]

Operating system

Debian

Install method

Docker Compose

Network connection

Wired

Camera make and model

Annke C800

Any other information that may be helpful

proxmox 8
ubuntu server 23.04
passing raw gpu device to VM
passing raw coral m.2 bkey(pci-e adapter)
6gb memory
6 cpu (host=intel 11th gen i5 - rocket lake)

Tried the i965 driver as stated in the help, and also i915 for kicks.
Also tried QSV (hwaccel_args: preset-intel-qsv-h264), but it fails.
VAAPI usage shows roughly 3-5% on average.
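
Note that the FFprobe output in this report lists H.265/HEVC for both streams, so an H.264 QSV preset would not match the input. A hedged sketch of the H.265 variant, assuming the `preset-intel-qsv-h265` preset is available in the installed Frigate version and that QSV works inside the VM at all (neither is confirmed in this thread):

```yaml
# Assumption: this preset name exists in the installed Frigate release.
# Replaces the global hwaccel_args shown in the config above.
ffmpeg:
  hwaccel_args: preset-intel-qsv-h265
```

`preset-vaapi`, as used in the posted config, is not tied to a single codec, which may be why it worked where the H.264 QSV preset did not.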

@NickM-27
Sponsor Collaborator

You have not included the errors, so it is difficult to know what is going on. You should run vainfo on your host to make sure your host drivers are up to date.

@NickM-27
Sponsor Collaborator

I'm assuming you mean the VAAPI memory leak. That depends on the GPU you have and the host driver version. It looks like your host driver is a bit out of date, but in any case I'd suggest watching your memory usage with something like top or htop to confirm it is actually ffmpeg and not something else using the memory.

@jubjubrsx
Author

jubjubrsx commented Sep 24, 2023

Thanks Nick, I deleted that post, but it looks like you had already replied to it :D Anyway, the host had an older driver than the Docker container. I'm working on getting the driver updated on the host, and hopefully that fixes it.

I have been watching it with htop to keep an eye on it, and Uptime Kuma emails me if it goes down.

@jubjubrsx
Author

Hopefully this fixes it.
Host machine vainfo:
vainfo: VA-API version: 1.19 (libva 2.19.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 23.3.1 ()
vainfo: Supported profile and entrypoints

Frigate Docker container vainfo:
vainfo: VA-API version: 1.17 (libva 2.10.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 23.1.1 ()

@louispires

I will probably open my own issue when it happens again, but my entire server crashed. Looking at the memory usage just before the crash, the Frigate container had jumped to 115GB of RAM:

In a span of 4 minutes, the RAM usage went from 3.6GB to 115GB:
image

I am using the stable-tensorrt image with ffmpeg hardware acceleration, which I only enabled a couple of days ago...

@NickM-27
Sponsor Collaborator

For one I would generally suggest setting memory limits on the container, but we'd need info on what specifically was using the memory.

@louispires

> For one I would generally suggest setting memory limits on the container, but we'd need info on what specifically was using the memory.

Absolutely, added a memory limit now.

It is just tough to know exactly what happened within the Frigate container, as the entire server was frozen and had to be cold booted to recover...

@NickM-27
Sponsor Collaborator

Right, with the memory limit it should be easier to catch it in that state (if it happens again) and use top or htop to see what is using memory in the container.
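
For reference, a memory limit of the kind discussed above might look like the following in a compose file such as the one posted in this issue. This is a minimal sketch: the `4g` values are illustrative assumptions and not a recommendation from this thread, and whether `mem_limit` is accepted depends on the Compose version in use (older tooling for the v3 file format expected `deploy.resources.limits.memory` instead).

```yaml
# Hypothetical fragment for ~/docker/frigate/frigate-compose.yml.
# The 4g figures are placeholders; size them for your own cameras and host.
services:
  frigate:
    mem_limit: 4g       # hard cap on the container's RAM
    memswap_limit: 4g   # same value as mem_limit, so no additional swap is granted
```

With a cap in place, a runaway process should hit the container's limit instead of exhausting the host, which leaves the host responsive enough to inspect the container with top or htop.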

@audiophonicz

I have been trying to chase down a machine lockup issue for a while now myself, and my setup is somewhat similar to OP's. Besides CPU generation, Proxmox, and Docker, my versions are the same.

I have 6 identical Skylake NUCs with 16GB RAM running K3s on Debian. I started with Frigate v11 on Debian 11 and have since upgraded both to v12, and the lockups persist.

I'm using the iHD driver with the preset-intel-qsv-h264 preset, but have also tried i915 and i965, as well as preset-vaapi, with no difference.

I also started without any memory limits on Frigate, but noticed some OOM Killed conditions and set the limit down to 4GB. I would expect the container to die and restart if it were just memory, but the node still locks up hard.
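
In a K3s (Kubernetes) setup like the one described here, a 4GB limit of this kind is typically expressed on the container's resources. A minimal sketch, with the container name and surrounding structure as placeholders rather than the poster's actual manifest:

```yaml
# Hypothetical excerpt from a pod spec; in a Deployment this sits under spec.template.
spec:
  containers:
    - name: frigate   # placeholder name
      image: ghcr.io/blakeblackshear/frigate:stable
      resources:
        limits:
          memory: 4Gi   # the container is OOM-killed if it exceeds this
```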

As far as I can tell, hardware acceleration is working just fine, as I also have Plex transcoding with the iHD driver.

The problem seems to be around object detection (I'm using OpenVINO). With object detection disabled, the container will run for months without issue. When I enable object detection, with just the default person object, on a single camera, the node will lock up within a week, requiring a hard reset. If it is enabled for all 3 cameras with multiple objects, it can lock up in as little as 48 hours. It doesn't matter which node it's running on; that node will lock up. I have not found any discernible logs on the node beyond some GPU messages on the console after lockup (that led me nowhere), nor any Frigate logs beyond the OOM Killed container status.

@NickM-27
Sponsor Collaborator

NickM-27 commented Oct 8, 2023

The main difference is that OP is using a Coral, which highlights the fact that these issues are usually not related and each have their own unique cause. Other users have reported the opposite of what you describe: OpenVINO works fine but hwaccel causes high memory usage.

A lot of these types of issues come from Proxmox users, though not all. Some users have reported that updating the host kernel, driver, etc. fixed it for them.

@jubjubrsx
Author

Thought I'd chime in: after updating the drivers I've had ZERO issues. Well, other than the fact that Intel are poopoo heads and aren't letting Rocket Lake CPUs split the GPU.

@NickM-27
Sponsor Collaborator

NickM-27 commented Oct 8, 2023

Thanks for confirming. Will close this as the original issue is not occurring. Anyone else can feel free to create their own issue.
