Skip to content

Conversation

@adamshiervani
Copy link
Contributor

@adamshiervani adamshiervani commented Nov 20, 2025

Fix video thread infinite restart loop on signal loss

Problem: When display sleeps, video thread can loop indefinitely trying to restart capture, causing:

  • Resource exhaustion (20+ device open/close cycles)
  • Log spam
  • Thread becomes unresponsive, preventing UI restart

Root cause: Race between fast polling video thread and slow event-driven format detection thread. Video thread sees stale detected_signal=true and keeps retrying before format thread updates it.

Solution:

  1. Failsafe counter: Exit after 20 failed attempts to trigger supervisor restart
  2. Stale flag detection: After 3 consecutive zero-frame timeouts (6s), clear detected_signal to stop the loop
  3. Thread cleanup: Clean up stopped threads in video_start_streaming() to allow UI restart
  4. Add more logs: Add detailed logging for timeout analysis

@adamshiervani adamshiervani changed the title fix: limit video device open attempts to prevent infinite loops in st… fix: limit video device open attempts to prevent infinite loops Nov 20, 2025
Copy link
Contributor

@IDisposable IDisposable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand the idea of this change is to give up completely if we can't safely open things up, a review of this code shows a lot of issues with resources that might be partially setup not being torn down correctly.

attempts++;
}
if (!streaming_stopped) {
log_error("video streaming thread did not exit after 30s");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it worthwhile doing an exit(1) here too?

if (open_attempts > MAX_OPEN_ATTEMPTS) {
log_error("Tried to open video device %d times without success. "
"Exiting to trigger supervisor restart.", MAX_OPEN_ATTEMPTS);
exit(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better as return (void *)errno (as that contains the error number from the last failed operation)

struct timeval tv;
int r;
uint32_t num = 0;
VIDEO_FRAME_INFO_S stFrame;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By this point, all the setup work succeeded, but we still haven't gotten a valid frame, but we HAVE allocated memory that must be released.


if (ioctl(video_dev_fd, VIDIOC_QBUF, &buf) < 0)
log_error("failure VIDIOC_QBUF: %s", strerror(errno));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else open_attempts = 0; because we've safely handled a frame? Otherwise we could slowly accumulate failing counts, while actually still running mostly right.

stFrame.stVFrame.enCompressMode = COMPRESS_MODE_NONE;
bool retried = false;
retry_send_frame:
if (RK_MPI_VENC_SendFrame(VENC_CHANNEL, &stFrame, 2000) != RK_SUCCESS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want a cap on the attempts to send the frame too? Seems like we could get hung up here otherwise.

{
log_error("Set format fail: %s", strerror(errno));
usleep(100000); // Sleep for 100 milliseconds
close(video_dev_fd);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

goto cleanup;

close(video_dev_fd);
return (void *)errno;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stream_on = true;

log_info("running video stream");

streaming_stopped = false;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool stream_on == false;

{
if (buffers[i].mb_blk != NULL)
{
RK_MPI_MB_ReleaseMB((buffers + i)->mb_blk);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this never happens if we return out as some paths currently do, we would leak memory if the memory was allocated and we return out...

req_free.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
req_free.memory = V4L2_MEMORY_DMABUF;

if (ioctl(video_dev_fd, VIDIOC_REQBUFS, &req_free) < 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self This actually allocates a zero-count of buffers, dropping what we had above

if (video_dev_fd > 0 && ioctl(video_dev_fd, VIDIOC_REQBUFS, &req_free) < 0)

@adamshiervani
Copy link
Contributor Author

When we enter sleep mode, we get a signal but no frames. When we request a frame, it times out after a second and then retries forever.

The proper fix is to stop the video before we enter sleep mode, so we don't need to handle this special case. See #999

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants