Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drm/v3d: clean caches at the end of render jobs on request from userspace #3203

Merged
merged 1 commit into from
Sep 20, 2019

Conversation

itoral
Copy link
Contributor

@itoral itoral commented Sep 3, 2019

The TMU has a write combiner before the L2T cache that needs to be
explicitly flushed at the end of a render job when it includes non-atomic
writes.

This patch changes the kernel interface for CL submissions, so it goes together with a user-space patch. I'll update this with a link to the Mesa merge request for that when I submit it.

@anholt: this doesn't seem to fix all the issues that we have been discussing concerning write violations, but it certainly helps things significantly. Specifically, it fixes all the same CTS test failures as the user-space workaround I had, however, it doesn't remove all the write violation messages from piglit runs. I'll see if I can track down the remaining issues.

@itoral
Copy link
Contributor Author

itoral commented Sep 3, 2019

@anholt: this doesn't seem to fix all the issues that we have been discussing concerning write violations, but it certainly helps things significantly. Specifically, it fixes all the same CTS test failures as the user-space workaround I had, however, it doesn't remove all the write violation messages from piglit runs. I'll see if I can track down the remaining issues.

Scratch that, I merged the user-space side of the fix for this with the latest version from Alejandro and now the Piglit run doesn't show any write violations any more so the errors I was seeing in piglit were probably caused because the version of the userspace driver I was using was not up ot date and was not flagging dirty tmu state in all the right places.

@itoral
Copy link
Contributor Author

itoral commented Sep 4, 2019

For reference, link to the merge request in Mesa for the user-space side of the change:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1663

@popcornmix
Copy link
Collaborator

In general for changes to upstream files we'd prefer the patch is first submitted upstream.
We can backport merged commits that are useful to the stable kernel tree.

That way if there are some modifications required when getting it merged upstream we don't end up with conflicts form a stale commit.

If the fix is urgent (affecting many users) then we might accept it here first to avoid any delay.

Is the plan for this to be submitted upstream?

@naushir
Copy link
Contributor

naushir commented Sep 4, 2019

As discussed with @itoral, this will be submitted upstream and we will merge back once it is available in the kernel tree.

@itoral
Copy link
Contributor Author

itoral commented Sep 5, 2019

@naushir, @popcornmix: it seems that the v3d drm source from upstream doesn't exactly match what is in the rapberry kernel source. Looks like I only require a trivial change, but I'd like to double-check that it works on the actual RPI4 hardware before I submit the patch to upstream.

Is there a recipe for building and installing a functional kernel for the Rpi4 from the upstream kernel source? I tried copying bcm2711_defconfig from the raspberry pi kernel and then follow the instructions from https://www.raspberrypi.org/documentation/linux/kernel/building.md but with that I don't get any files under arch/arm/boot/dts/overlays/ and the Rpi4 won't boot with the built kernel.

@popcornmix
Copy link
Collaborator

I don't typically run with an upstream kernel so others may be better placed to comment.
I believe you want upstream_kernel=1 in config.txt

@pelwell
Copy link
Contributor

pelwell commented Sep 5, 2019

Overlays are a downstream thing, so you'll need to keep the overlays from a downstream build (or grab them from our firmware repo).

@itoral
Copy link
Contributor Author

itoral commented Sep 6, 2019

@popcornmix: tried with upstream_kernel=1 but still won't boot.
@pelwell: I had the overlays from a functional downstream build of the rpi-4.19.y branch.

I see that the rpi-5.3.y branch does have and up-to-date drm/v3d driver matching upstream where the upstream version of this patch applies clean. I was planning to use that kernel to verify the upstream patch, however that kernel doesn't seem to boot properly either, it gets to the point where I can see the desktop's background image, but it completely freezes there... is that supposed to happen?

@pelwell
Copy link
Contributor

pelwell commented Sep 6, 2019

The upstream_kernel flag tells the firmware which .dtb to load. On a Pi4 it will normally look for bcm2711-rpi-4-b.dtb; with upstream_kernel=1 it will try to load bcm2838-rpi-4-b.dtb, but if it doesn't find it it will load bcm2711-rpi-4-b.dtb and automatically apply dtoverlay=upstream which is equivalent to dtoverlay=vc4-kms-v3d-overlay.dts,cma-96 and dtoverlay=dwc2-overlay.dts,dr_mode=otg. Which highlights the problem - upstream probably don't have enough code to boot a Pi4 except in simple fb mode, because neigher the fkms nor full kms drivers are available.

@itoral
Copy link
Contributor Author

itoral commented Sep 6, 2019

@pelwell: thanks for analysis, I guess we can discard testing with upstream kernels then.

In that case the only other option is to hope that we can get RPI kernels that are up-to-date enough with regards to the upstream kernel that we can use to test patches for drm/v3d on the RPI before submitting them upstream. Right now, only the rpi-5.3.y branch seems to meet that criteria, but as I said, it doesn't look like a kernel built from that branch is functional enough yet.

The changes that I need to implement to the patch so to it applies on the upstream kernel are small enough that I am tempted to just go ahead and submit it without testing, but this is definitely not a good workflow and certainly not one that we can use in general, so I am not sure how to move forward here. Do you have any suggestions? Is there any chance that we can get a functional 5.3 kernel on the Rpi any time soon?

@lategoodbye
Copy link
Contributor

@itoral We are at the very beginning of upstreaming BCM2711 / Raspberry Pi 4. Some preparations (pinctrl, sdhci, i2c) has been submitted for Linux 5.4, but a reduced bootable DTS is still WIP and will arrive Linux 5.5. The problem with Linux 5.3 is the introduction of DMA mapping changes, so i needed to revert commits to get a bootable system.

In case you are interested in current status of upstreaming:
lategoodbye/rpi-zero#43

Sorry, V3D has currently low prio for us. Maybe this working upstream 5.3 branch (only debug UART) is still helpful:
https://github.com/lategoodbye/rpi-zero/commits/bcm2711-initial-v3

Be aware, the upstream version will be very different from rpi-4.19.

@itoral
Copy link
Contributor Author

itoral commented Sep 9, 2019

@lategoodbye: thanks for the input. I think the UART-only kernel won't be useful to us since we need to run visuals.

I'll discuss with @anholt and @naushir about the way to go. For this particular patch we might just go ahead with starting upstream review without testing the upstream patch specifically, however, it looks to me that we might need to backport the patch to the downstream kernel anyway since we have user-space functionality that depends on this and it looks like we won't have a new kernel in time for it that can work with the upstream patch.

@itoral
Copy link
Contributor Author

itoral commented Sep 10, 2019

Upodated the PR with a couple of trivial changes to the comments and commit log.
@anholt: this is ready for review.

@lategoodbye
Copy link
Contributor

lategoodbye commented Sep 10, 2019

@pelwell Just a note, during mainline discussion for initial Raspberry Pi 4 we decided to use bcm2711-rpi-4-b.dtb for upstream, too.

@pelwell
Copy link
Contributor

pelwell commented Sep 10, 2019

That's not going to be at all confusing.

@anholt
Copy link
Contributor

anholt commented Sep 11, 2019

DRM changes have to be submitted upstream to the kernel first. Please do it there so we can do this discussion correctly.

However, ABI-wise, you should turn the pad field into flags and define a flag for the new submit behavior (and make sure no other flags are set). The pad is there for 64-bit struct alignment, which adding a new field would break. Also, for future DRM ABI changes, new fields must always go at the end of the struct. Check out Documentation/ioctl/botching-up-ioctls.rst for more ABI rules

@itoral
Copy link
Contributor Author

itoral commented Sep 12, 2019

Thanks for the feedback @anholt, I've fixed that and sent the upstream patch for review here:
https://lists.freedesktop.org/archives/dri-devel/2019-September/235160.html

I'll update this PR with and up-to-date version of the downstream patch that includes upstream feedback when the review process is finished.

@lategoodbye
Copy link
Contributor

In order to improve the current situation, it would be nice to know which dependencies the V3D driver has on the Raspberry Pi 4. So i can adjust the prio accordingly.

@itoral Does Igalia only care about V3D driver or are there any plans to help upstream changes on these dependencies?

@itoral
Copy link
Contributor Author

itoral commented Sep 12, 2019

@lategoodbye: the big dependency is the DRM subsystem naturally (and its depedencies I guess), I am not sure if there are other relevant dependencies for V3D though, I hope @anholt can give a better answer to your question.

Regarding the upstreaming of dependencies, I think the plan was for us to focus on V3D only, but I guess that's something we can discuss with @naushir if needed.

@lategoodbye
Copy link
Contributor

Okay, let me try to explain the situation. The Raspberry Pi 4 requires several patches for a lot of drivers to get working with a mainline kernel. The actual dependencies for V3D i'm currently aware of are changes to bcm2835-clk, bcm2835-pm and the necessary devicetree nodes. But i'm not sure that there are more. Focus on V3D only works after the initial V3D support for Raspberry Pi 4 has been established.

@anholt
Copy link
Contributor

anholt commented Sep 12, 2019

For V3D upstream, I think PM and clk should be all we need to start doing testing with surfaceless.

…space

Extends the user space ioctl for CL submissions so it can include a request
to flush the cache once the CL execution has completed. Fixes memory
write violation messages reported by the kernel in workloads involving
shader memory writes (SSBOs, shader images, scratch, etc) which sometimes
also lead to GPU resets during Piglit and CTS workloads.

v2: if v3d_job_init() fails we need to kfree() the job instead of
    v3d_job_put() it (Eric Anholt).

v3 (Eric Anholt):
  - Drop _FLAG suffix from the new flag name.
  - Add a new param so userspace can tell whether cache flushing is
    implemented in the kernel.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
@itoral
Copy link
Contributor Author

itoral commented Sep 20, 2019

The upstream version of the patch has now landed here:
https://cgit.freedesktop.org/drm/drm-misc/commit/?id=455d56ce809fcc540dc029a05db074855269dc33

I have updated this pull request with a version of that patch that can be applied to the rpi-4.19.y of the Rpi kernel.

For anyone interested in the user-space side of the change, they can find it in this branch with the GLES 3.1 enablement for v3d (which requires this kernel change):
https://gitlab.freedesktop.org/chema/mesa/tree/v3d-gles31

There should not be any change of behavior for an old user-space when interfacing with the new kernel. For a new user-space interfacing with the old kernel, they should just don't get GLES 3.1 and related functionality (shader storage and shader images in particular). For those with new kernel and userspace, they will get GLES 3.1.

@popcornmix popcornmix merged commit db69008 into raspberrypi:rpi-4.19.y Sep 20, 2019
@popcornmix
Copy link
Collaborator

Thanks!

popcornmix added a commit to raspberrypi/firmware that referenced this pull request Sep 20, 2019
…om user space

See: raspberrypi/linux#3203

kernel: dts: Add DTS for Pi 2B rev 1.2 with BCM2837
See: raspberrypi/linux#3235

firmware: ldconfig: Support [edid=*] to mean any HDMI monitor
See: #1136

firmware: image_fx: Remove restriction that output stride must match input stride
firmware: image_fx: Remove spamming log line

userland: bcm_host: Add support for querying processor type and fkms status
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this pull request Sep 20, 2019
…om user space

See: raspberrypi/linux#3203

kernel: dts: Add DTS for Pi 2B rev 1.2 with BCM2837
See: raspberrypi/linux#3235

firmware: ldconfig: Support [edid=*] to mean any HDMI monitor
See: raspberrypi/firmware#1136

firmware: image_fx: Remove restriction that output stride must match input stride
firmware: image_fx: Remove spamming log line

userland: bcm_host: Add support for querying processor type and fkms status
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants