Skip to content

Conversation

@vianpl
Copy link
Contributor

@vianpl vianpl commented Apr 2, 2020

This is still a WIP, as I haven't properly tested it. Yet, the overall idea is pretty clear at this point and I'd appreciate your input.

Nicolas Saenz Julienne added 2 commits April 2, 2020 17:55
With a fixed device tree we can now go back to the proper behavior.

This reverts commit fad1545.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Now that the kernel command line has precedence over the device tree,
we can use the upstream CMA setup without breaking backward
compatibility.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
@pelwell
Copy link
Contributor

pelwell commented Apr 3, 2020

I like it - it looks like the kind of thing I would have ended up with (sorting out CMA in DT was next-but-one on the list of things to do).

There are a few things I would tweak:

  1. cma has to come after balena-fin in the README.
  2. If cma is going to exist as a standalone overlay then it needs an entry in the Makefile; if not then it should be a dtsi.
  3. When I regenerate the upstream overlay with ovmerge it includes the cma fragment, even though it is hidden. Since the checker script regenerates it every time to test for changes that fragment is going to get put back, so its better to include it now (at least until I change ovmerge to drop hidden segments).

There is a related item that will affect the end result. The Pi 4 KMS driver we merged into rpi-5.4.y this week needs a different version of the vc4-kms-v3d overlay - vc4-kms-v3d-pi4. This means that there will have to be two versions of the upstream overlay as well. #3520 describes a way of loading a platform-specific implementation of an overlay, and #3527 is a candidate for what it would look like in the kernel repo (but lacking the Pi 4 version of the upstream overlay).

@vianpl
Copy link
Contributor Author

vianpl commented Apr 3, 2020

Hi @pelwell, thanks for reviewing.

I don't fully get point 3:

I changed the command to: ovmerge -c vc4-kms-v3d-overlay.dts,cma-default dwc2-overlay.dts,dr_mode=otg

Then, by pulling in vc4-kms-v3d-overlay.dts, I'm getting cma-overlay.dts as a side effect. And by adding the cma-default option the CMA fragment is left dormant. That leaves the &cma node unaltered, which I presume it's what we want.

@pelwell
Copy link
Contributor

pelwell commented Apr 3, 2020

Hi @vianpl,

What I was trying to say in point 3 is that the version of upstream-overlay.dts as contained in the PR does not match the one that I get when I run the same ovmerge command on the other files in this PR. A diff looks like this:

diff --git a/arch/arm/boot/dts/overlays/upstream-overlay.dts b/arch/arm/boot/dts/overlays/upstream-overlay.dts
index e634387d64fb..4d46790d81c9 100644
--- a/arch/arm/boot/dts/overlays/upstream-overlay.dts
+++ b/arch/arm/boot/dts/overlays/upstream-overlay.dts
@@ -8,84 +8,90 @@
 / {
        compatible = "brcm,bcm2835";
        fragment@0 {
+               target = <&cma>;
+               __dormant__ {
+                       size = <0x10000000>;
+               };
+       };
+       fragment@1 {
                target = <&i2c2>;
                __overlay__ {
                        status = "okay";
                };
        };
-       fragment@1 {
+       fragment@2 {
                target = <&fb>;
                __overlay__ {
                        status = "disabled";
                };
        };
-       fragment@2 {
+       fragment@3 {
                target = <&pixelvalve0>;
                __overlay__ {
                        status = "okay";
                };
        };
-       fragment@3 {
+       fragment@4 {
                target = <&pixelvalve1>;
                __overlay__ {
                        status = "okay";
                };
        };
...

The version that I generated include the first fragment, albeit a dormant/hidden one, whereas the one uploaded doesn't. The two overlays have the same end result, but it means that next time I run the command to see if it has changed it will find a difference, even though nothing else as changed.

@vianpl
Copy link
Contributor Author

vianpl commented Apr 3, 2020

understood

@vianpl
Copy link
Contributor Author

vianpl commented Apr 3, 2020

I've performed my share of testing on both 64bit and 32bit builds and multiple devices (RPi3b+ and RPi4b). I double-checked the kernel command line still behaves as expected and the new CMA overlay property modifiers.

Note that, in order to configure CMA trough the kernel command line on RPi4 you also have to provide correct placement constraints. For example cma=256M@0x0-0x40000000. Otherwise it'll be placed in the higher parts of memory, which we can't really use.

I can't do much for vc4-kms-v3d-pi4 as it's only available on v5.4.y and back-porting this series is not trivial. On the other hand forward-porting it will be fairly easy and can take care of it if needed. As for #3520, I welcome it, and think it'll play alright with this series.

Also, I addressed all @pelwell's comments. So overall I'd say it's ready for proper review.

@vianpl vianpl changed the title RFC: CMA overlay centalization CMA overlay centalization Apr 3, 2020
@pelwell
Copy link
Contributor

pelwell commented Apr 3, 2020

There's just one hard TAB on line 553 of the README, otherwise I'm happy to merge.

I'll sort out the "upstream" handling on 5.4, then apply the necessary changes to 5.6.

Now that we don't have to abuse the kernel command line to change CMA's
size we can clean-up and centralize CMA usage in overlays.

A new file, cma-overlay.dts is created to be used as a standalone
overlay or included on other overlays. All CMA users are converted to
this scheme. Ultimately upstream-overlay.dts is also updated to use the
default CMA size provided by upstream.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
@vianpl
Copy link
Contributor Author

vianpl commented Apr 4, 2020

Fixed!

@pelwell pelwell merged commit b08ce72 into raspberrypi:rpi-5.6.y Apr 4, 2020
@pelwell
Copy link
Contributor

pelwell commented Apr 4, 2020

Thanks!

@HiassofT
Copy link
Contributor

HiassofT commented Apr 8, 2020

@pelwell do you plan to backport this change to 5.4?

It's a nice cleanup and would help us in LibreELEC, too - I'm currently testing with cma=512M on cmdline.txt (some 4k HEVC files seem to need more memory to decode properly) and being able to select this via dt instead of cmdline would make things a lot easier for us

@pelwell
Copy link
Contributor

pelwell commented Apr 8, 2020

I was planning to, but I can add it to the list - or take a PR.

@HiassofT
Copy link
Contributor

HiassofT commented Apr 8, 2020

Thanks!

And no need to hurry, take your time, we don't have a pressing need to have this immediately in 5.4.

@pelwell
Copy link
Contributor

pelwell commented Apr 8, 2020

If we're going to switch then it would be nice if it was there from launch.

@pelwell
Copy link
Contributor

pelwell commented Apr 8, 2020

@vianpl - the DT work is easy enough, but how many infrastructure changes need to be back-ported to 5.4 for this to work? Do you think it's sensible?

@vianpl
Copy link
Contributor Author

vianpl commented Apr 8, 2020

@pelwell It's totally doable, I've actually done it for our (SUSE) v5.3 LTS kernel. I can have a go at it if you're patient enough.

@pelwell
Copy link
Contributor

pelwell commented Apr 8, 2020

That would be great - or just give me any helpful tips and I'll do the donkey work.

@pelwell pelwell mentioned this pull request Apr 8, 2020
@pelwell
Copy link
Contributor

pelwell commented Apr 8, 2020

This is as far as I've got: #3536
It's working on 32-bit, but 64-bit is broken.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants