Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vo: Fix race condition causing redraw requests to be ignored #9735

Closed
wants to merge 1 commit into from

Conversation

chengsun
Copy link
Contributor

@chengsun chengsun commented Jan 21, 2022

This manifested most frequently as a bug where the OSD fails to update
after pausing or single-stepping forwards (issues #8172, #8350).

In render_frame, the vo thread takes the lock once before rendering
the frame, and once after. The lock is relinquished whilst the frame is
actually rendering.

The redraw_request flag was being cleared whilst holding the lock
after the video frame has been rendered; however this frame was
rendered according to the request fetched whilst holding the lock the
first time. If the OSD was updated during the rendering of this frame,
the redraw request is cleared once the frame finishes rendering even
though the updated OSD is not included in the rendered frame.

This patch fixes the race condition by clearing the redraw_request
flag before the video frame rendering starts. That way, a redraw
request that comes in during frame rendering will not be cleared without
another call to render_frame or do_redraw.

@chengsun
Copy link
Contributor Author

NB: I have only tested this on Linux with vo-gpu.

@avih
Copy link
Member

avih commented Jan 21, 2022

A-priori it seems to fix the issue on windows.

Test case: add this to input.conf: SPACE osd-msg cycle pause, play a video, press space few times.

Without the patch: Only pause: no is displayed.
With the patch: both pause: no and pause: yes are displayed.

I can't pretend to follow the logic of the fix exactly, and while the commit message describes what was wrong before, it doesn't describe how the patch addresses the issue - and it should (i.e. do edit the commit message with explanation).

@chengsun
Copy link
Contributor Author

chengsun commented Jan 21, 2022

Appended a paragraph to the commit message. Thank you for testing!

@@ -944,6 +944,7 @@ static bool render_frame(struct vo *vo)
} else {
in->rendering = true;
in->hasframe_rendered = true;
in->request_redraw = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This clears in->request_redraw unconditionally, but originally it was only cleared if the frame was not dropped. I think it's a (new) bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not true, see line 941.

Copy link
Member

@avih avih Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think yes true. In (new) line 988 (after in->request_redraw is cleared with the patch, and before it originally checked in->dropped_frame to decide whether to clear) it can change:

    in->dropped_frame = prev_drop_count < vo->in->drop_count;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're totally right, I didn't spot that. Thanks a lot. I force-pushed a change that I believe addresses this. The logic feels very subtle so I tried to explain what's going on by adding a comment to the code.

@chengsun chengsun force-pushed the fix_request_redraw_race branch 2 times, most recently from 1aef0cd to 533ae7c Compare January 21, 2022 02:33
This manifested most frequently as a bug where the OSD fails to update
after pausing or single-stepping forwards (issues mpv-player#8172, mpv-player#8350).

In `render_frame`, the vo thread takes the lock once before rendering
the frame, and once after. The lock is relinquished whilst the frame is
actually rendering.

The `redraw_request` flag was being cleared whilst holding the lock
*after* the video frame has been rendered; however this frame was
rendered according to the request fetched whilst holding the lock the
first time. If the OSD was updated during the rendering of this frame,
the redraw request is cleared once the frame finishes rendering even
though the updated OSD is not included in the rendered frame.

This patch fixes the race condition by clearing the `redraw_request`
flag *before* the video frame rendering starts. That way, a redraw
request that comes in during frame rendering will not be cleared without
another call to `render_frame` or `do_redraw`. However, we need to
ensure that if the frame is dropped, we set `redraw_request` again if it
was previously set. See the newly added comment in the code.
@Obegg
Copy link

Obegg commented Jan 21, 2022

Does it also fix this issue #9718 ?

@avih
Copy link
Member

avih commented Jan 21, 2022

Does it also fix this issue #9718 ?

I don't see how. Why would you consider that?

I force-pushed a change that I believe addresses this ...

Thanks, I think that part is now covered. Few nits though:

  • prev_request_redraw is only needed at the big else scope, but you defined it more globally. Please move it to the inner scope - right before you clear in->request_redraw.
  • While in->request_redraw |= prev_request_redraw; is logically correct, it does more work than it should (always assigns a value, checks two variables). I think it could be (in two lines): if (prev_request_redraw) in->request_redraw = true;. Maybe also add a small comment here to complement the bigger comment you added.
  • I think prev should be changed to initial.

Now, as for the patch itself, while it indeed addresses the race where a request is added while rendering, I'm not 100% convinced it fixes all race conditions related to the request (that's not euphomism for "it doesn't fix it 100%"). For instance, what happens if a frame is requested before the first lock is taken? Is the request applied in some way inside the first lock (I don't think I can see that)? if not, then there could still be a race.

I.e. I'm not familiar with the request logic, and what constitutes "the request is applied" (e.g. if an OSD bitmap needs to be re-rendered), so I don't know if the patch is complete in this regard.

Granted, it's definitely better than before as far as I can tell, but if it's not 100% race-free, then I think it should be mentioned too.

@sfan5
Copy link
Member

sfan5 commented Jan 21, 2022

  • While in->request_redraw |= prev_request_redraw; is logically correct, it does more work than it should (always assigns a value, checks two variables). I think it could be (in two lines): if (prev_request_redraw) in->request_redraw = true;

This is the wrong nit to pick. The compiler will transform whatever you write into the more efficient form and that will surely not include a branch. Your abstract idea of "work" is inaccurate here.

@avih
Copy link
Member

avih commented Jan 22, 2022

This is the wrong nit to pick.

I'm not going to argue this. Both forms perform the sme thing. It's a matter of personal taste, and it doesn't really matter which form it ends up as. Feel free to ignore this nit if you wish.

@avih
Copy link
Member

avih commented Feb 2, 2022

@chengsun ping?

Dudemanguy added a commit to Dudemanguy/mpv that referenced this pull request Mar 2, 2023
There is a very subtle race in vo that can manifest itself on pause
events. In the renderloop, render_frame, unsurprisingly, does the heavy
lifting of actually queuing up and flipping the frames. This is called
during normal playback. Sometimes various parts of the player can make a
redraw request which will latter trigger another render of the frame
later down in the loop (do_redraw). Because these requests can happen at
essentially anytime, sometimes the redraw request will happen *before*
do_redraw and it'll be caught in render_frame. When this happens,
the existing render_frame run works perfectly fine as a redraw so it
clears out the request which is sensible. Normally this is all locked of
course, but there's one catch. render_frame has to unlock itself when
propagating down into specific VOs/backends. That's what causes this
bug.

While render_frame is unlocked, other parts of the player can send
redraw requests which will cause in->request_redraw to become true. The
logic in the code always clears out this parameter after a successful
render, but this isn't correct. When in->request_become becomes true in
the middle of render_frame, there needs to be one more draw afterwards
to reflect whatever actually changed (usually the OSD). Instead, this
gets simply discarded. If you rapidly spam pause while rendering things
to the OSD at the same time, it's possible to for the last render to
behind a frame and appear as if your osd event was ignored.

Once you realize what is happening, the fix is quite simple. Just store
the initial value of in->request_redraw before the unlock step. After we
do the render step and unlock again, only set in->request_redraw to
false if there was an initial redraw request. We just finished doing a
redraw, so it is safe to clear. Otherwise, leave in->request_redraw
alone. If it is initially false, then it will still be false and nothing
changes. However if it updated to true in the middle of the rendering,
this value is now preserved so we can go and call do_redraw later and
show what that last frame was meant to be when you pause. One
unfortunate thing about this design is that it is technically possible
for other internal things in vo to update during that unlocked period.
Hopefully, that doesn't actually happen and only redraw requests work
like this.

Fixes mpv-player#8172 and mpv-player#9735.
@Dudemanguy
Copy link
Member

Superceded by #11395.

@Dudemanguy Dudemanguy closed this Mar 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants