Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for the cursor corruption with KMS (again) #4895

Merged

Conversation

mripard
Copy link
Contributor

@mripard mripard commented Feb 17, 2022

Hi,

Here's a fix for the remaining cursor corruption bug when we were moving the cursor along the edge shared by two monitors with overscan enabled, mentioned in the PR #4776 .

The initial solution implemented by that PR doesn't work, since the root cause of the corruption seems to be that we copy the shadow dlist outside of the vblank window, probably due to scheduling delay.

This PR first reverts the patches implementing the shadow dlist, and implement another solution.

That corruption is due to the driver overwriting a dlist entry being used by the HVS to generate the current frame. Since we can't reliably copy something during vblank, the alternative is to defer the deallocation of the dlist entries to after the next vblank, when we know for sure they won't be used by the hardware anymore. Commit 5f5bb18 has more details on the care that needs to be taken in the implementation.

I can't see any corruption now, even after fairly long testing, so it seems that it's working as expected and it should be fixed for good (famous last words).

Maxime

While the HVS has the same context memory size in the BCM2711 than in
the previous SoCs, the range allocated to the registers doubled and it
now takes 16k + 16k, compared to 8k + 16k before.

The KMS driver will use the whole context RAM though, eventually
resulting in a pointer dereference error when we access the higher half
of the context memory since it hasn't been mapped.

Fixes: 4564363 ("ARM: dts: bcm2711: Enable the display pipeline")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
In order to get the field currently being output, the driver has been
using the display FIFO frame count in the HVS, reading a 6-bit field at
the offset 12 in the DISPSTATx register.

While that field is indeed at that location for the FIFO 1 and 2, the
one for the FIFO0 is actually in the DISPSTAT1 register, at the offset
18.

Fixes: e538092 ("drm/vc4: Enable precise vblank timestamping for interlaced modes.")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Those macros are really about the HVS itself, and thus its associated
structure vc4_hvs, rather than the entire (virtual) vc4 device.

Let's change those macros to use the hvs pointer directly, and change
the calling sites accordingly.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
During normal operations, the cursor position update is done through an
asynchronous plane update, which on the vc4 driver basically just
modifies the right dlist word to move the plane to the new coordinates.

However, when we have the overscan margins setup, we fall back to a
regular commit when we are next to the edges. And since that commit
happens to be on a cursor plane, it's considered a legacy cursor update
by KMS.

The main difference it makes is that it won't wait for its completion
(ie, next vblank) before returning. This means if we have multiple
commits happening in rapid succession, we can have several of them
happening before the next vblank.

In parallel, our dlist allocation is tied to a CRTC state, and each time
we do a commit we end up with a new CRTC state, with the previous one
being freed. This means that we free our previous dlist entry (but don't
clear it though) every time a new one is being committed.

Now, if we were to have two commits happening before the next vblank, we
could end up freeing reusing the same dlist entries before the next
vblank.

Indeed, we would start from an initial state taking, for example, the
dlist entries 10 to 20, then start a commit taking the entries 20 to 30
and setting the dlist pointer to 20, and freeing the dlist entries 10 to
20. However, since we haven't reach vblank yet, the HVS is still using
the entries 10 to 20.

If we were to make a new commit now, chances are the allocator are going
to give the 10 to 20 entries back, and we would change their content to
match the new state. If vblank hasn't happened yet, we just corrupted
the active dlist entries.

A first attempt to solve this was made by creating an intermediate dlist
buffer to store the current (ie, as of the last commit) dlist content,
that we would update each time the HVS is done with a frame. However, if
the interrupt handler missed the vblank window, we would end up copying
our intermediate dlist to the hardware one during the composition,
essentially creating the same issue.

Since making sure that our interrupt handler runs within a fixed,
constrained, time window would require to make Linux a real-time kernel,
this seems a bit out of scope.

Instead, we can work around our original issue by keeping the dlist
slots allocation longer. That way, we won't reuse a dlist slot while
it's still in flight. In order to achieve this, instead of freeing the
dlist slot when its associated CRTC state is destroyed, we'll queue it
in a list.

A naive implementation would free the buffers in that queue when we get
our end of frame interrupt. However, there's still a race since, just
like in the shadow dlist case, we don't control when the handler for
that interrupt is going to run. Thus, we can end up with a commit adding
an old dlist allocation to our queue during the window between our
actual interrupt and when our handler will run. And since that buffer is
still being used for the composition of the current frame, we can't free
it right away, exposing us to the original bug.

Fortunately for us, the hardware provides a frame counter that is
increased each time the first line of a frame is being generated.
Associating the frame counter the image is supposed to go away to the
allocation, and then only deallocate buffers that have a counter below
or equal to the one we see when the deallocation code should prevent the
above race from occuring.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
@popcornmix
Copy link
Collaborator

I've tested this and it seems improved. I've not triggered the blatant glitches I'd seen without this patch (one screen blanking or showing noise pixels near bottom of screen).

I am seeing something funny. Two monitors at 1080p60.

When cursor is close to right hand edge of left screen and moving, I see the cursor flicker on right screen near left hand edge.
I believe mouse pointer is actually 64x64 pixels, but only the top left 16x16 are not fully transparent.
I believe the issue occurs when this transparent part of plane straddles the two screens.

It seems easier to provoke this towards bottom of screen.
Put mouse pointer at bottom right of left screen and waggle it as close to right hand edge as you can without crossing into next screen. Mouse pointer is occasionally visible on left edge of right screen.

@popcornmix
Copy link
Collaborator

Just in case this is a higher level issue (e.g. a bug in mutter) I've tested with fkms driver, and I don't see the issue, so it seems to be in kms driver.

@popcornmix
Copy link
Collaborator

A bit more testing. Mouse glitch affects buster and bullseye.
I can still see the cursor glitch without this PR (and without PR I get the additional glitches).

I tested this PR with kodi watching a number of videos without regressions.
Pi3+ still boots with working desktop using kms.

Can't see any issue in commits. The HVS register range change and frame number change are correct.
The logic for deferring deallocation is sane.

So I'm happy with this PR, it fixes an issue. There still exists an unrelated issue related to mouse pointer spanning two screens, but that can be dealt with in a later PR.

@pelwell pelwell merged commit 7bfcc30 into raspberrypi:rpi-5.15.y Feb 18, 2022
popcornmix added a commit to raspberrypi/rpi-firmware that referenced this pull request Feb 18, 2022
kernel: usb: xhci: add a quirk for Superspeed bulk OUT transfers on VL805
See: raspberrypi/linux#4893

kernel: Change vc4 DSI to being a bridge
See: raspberrypi/linux#4878

kernel: Fix for the cursor corruption with KMS (again)
See: raspberrypi/linux#4895

kernel: OV7251 updates for libcamera compatibility
See: raspberrypi/linux#4897
popcornmix added a commit to raspberrypi/firmware that referenced this pull request Feb 18, 2022
kernel: usb: xhci: add a quirk for Superspeed bulk OUT transfers on VL805
See: raspberrypi/linux#4893

kernel: Change vc4 DSI to being a bridge
See: raspberrypi/linux#4878

kernel: Fix for the cursor corruption with KMS (again)
See: raspberrypi/linux#4895

kernel: OV7251 updates for libcamera compatibility
See: raspberrypi/linux#4897
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants