Bug: norfair distance() "Received nan values from distance function" kills the camera process during PTZ autotracking (healthy detector); process never restarts #23471
Summary
On an autotracking PTZ camera with a healthy Coral EdgeTPU detector, the per-camera tracking process dies intermittently (~2-3×/month) with:
This is the same error string as #9742, but a different root cause. In #9742 the NaN came from a broken TensorRT model on Maxwell/compute-5.0 Quadro cards returning bad coordinates, and was treated as a GPU/model problem (users moved to newer GPUs; closed as stale). Here the detector is healthy and detections are normal; the degenerate box originates in the tracker/autotracking path (a collapsed Kalman estimate or a box clamped to zero width/height during camera motion).
The common, generalizable defect is that
Root cause
estimate_dim = np.diff(estimate, axis=0).flatten()
detection_dim = np.diff(detection, axis=0).flatten()
...
distance[0] /= estimate_dim[0] # divide by zero -> inf (nan for 0/0) when estimate width == 0
distance[1] /= estimate_dim[1] # divide by zero -> inf (nan for 0/0) when estimate height == 0
widths = np.sort([estimate_dim[0], detection_dim[0]])
heights = np.sort([estimate_dim[1], detection_dim[1]])
width_ratio = widths[1] / widths[0] - 1.0 # divide by zero -> inf (nan for 0/0) when smaller width == 0
height_ratio = heights[1] / heights[0] - 1.0 # divide by zero -> inf (nan for 0/0) when smaller height == 0
When a dimension is 0, the division yields a non-finite value (inf, or nan for 0/0).
Two problems
Version
Config (redacted)
detectors:
  coral:
    type: edgetpu
model:
  model_type: yolo-generic
  width: 320
  height: 320
cameras:
  lake_cam_ptz:
    detect: { width: 1280, height: 720, fps: 5 }
    objects: { track: [bird, boat, person] }
    onvif:
      autotracking:
        enabled: true
        track: [bird, boat, person]
        return_preset: BoatsNorthEastZoom
Log output (the crash)
After this, there is no camera-process restart in the logs;
Notes
Reproduction is intermittent and tied to autotracking motion, so it's hard to force on demand, but the divide-by-zero is deterministic given a zero-dim estimate/detection. We run a live autotracking PTZ and are happy to test a patched
Replies: 2 comments 2 replies
This is a well-documented bug report identifying a divide-by-zero vulnerability in the Norfair tracker's distance function during PTZ autotracking. The issue occurs when box dimensions collapse to zero (from Kalman estimate drift or clamping during camera motion), causing NaN values that crash the camera process. Your proposed fix is sound. Clamping dimensions to ≥1px would protect all four division sites:
estimate_dim = np.maximum(np.abs(np.diff(estimate, axis=0).flatten()), 1.0)
detection_dim = np.maximum(np.abs(np.diff(detection, axis=0).flatten()), 1.0)
Regarding the autotracking configuration, the documentation confirms that calibration can help with tracking accuracy during PTZ movements. (1) If you haven't already, you might try running calibration by setting
The documentation also notes that the motion estimator needs sufficient scene details to track camera motion; if
For the process recovery issue (problem #2), this appears to be outside what's documented in the available sources. Opening a PR with the guard fix and filing the process restart issue separately would be the recommended path forward.
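For anyone who wants to see the effect of the clamp in isolation, here is a minimal sketch. It assumes boxes are 2×2 arrays laid out as [[x1, y1], [x2, y2]], and `box_dims` is a hypothetical helper name used only for illustration; it is not Frigate's actual distance function.

```python
import numpy as np


def box_dims(box, min_size=1.0):
    """Return (width, height) of a [[x1, y1], [x2, y2]] box, clamped to min_size.

    Hypothetical helper for illustration only.
    """
    return np.maximum(np.abs(np.diff(box, axis=0).flatten()), min_size)


# A box whose width has collapsed to zero (x1 == x2); values are made up.
collapsed = np.array([[100.0, 50.0], [100.0, 80.0]])
other = np.array([[90.0, 40.0], [130.0, 90.0]])

raw_dims = np.diff(collapsed, axis=0).flatten()  # width 0, height 30
safe_dims = box_dims(collapsed)                  # width clamped to 1, height 30

# Unclamped: dividing by the zero width gives a non-finite result.
with np.errstate(divide="ignore", invalid="ignore"):
    raw_ratio = np.diff(other, axis=0).flatten() / raw_dims

# Clamped: every divisor is at least 1 px, so the result stays finite.
safe_ratio = box_dims(other) / safe_dims

print(np.isfinite(raw_ratio).all(), np.isfinite(safe_ratio).all())
```

Note that this only covers zero-size boxes with finite coordinates; it says nothing about inputs that are already NaN.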
This rare crash is likely due to autotracking motion estimation in difficult circumstances. The PR your AI put up only clamps the divisors, but the crash likely comes from a non-finite estimate, not a zero-area box. I've pushed some changes to improve this in #23475. Let me know if you still have issues after this fix.
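The distinction between a zero-area box and a non-finite estimate matters because clamping cannot repair a NaN: `np.maximum` propagates it. The sketch below shows that, along with a simple finiteness check. It assumes the same [[x1, y1], [x2, y2]] box layout, `is_valid_box` is a hypothetical name, and this is not the change made in #23475, whose contents are not shown in this thread.

```python
import numpy as np


def is_valid_box(box):
    """True only if every coordinate is finite (no NaN or inf). Hypothetical helper."""
    return bool(np.all(np.isfinite(box)))


# An estimate containing NaN, e.g. from a diverged filter state (illustrative values).
bad_estimate = np.array([[np.nan, 50.0], [100.0, 80.0]])

# Clamping the dimensions does not help: np.maximum propagates NaN.
clamped = np.maximum(np.abs(np.diff(bad_estimate, axis=0).flatten()), 1.0)

print(clamped)                     # first entry is still nan
print(is_valid_box(bad_estimate))  # the finiteness check rejects the box
```

A check like this would have to run before the distance computation, so a bad estimate is skipped rather than divided.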