rpi-4.19 doesn't boot on RPI 4 with multi_v7_defconfig #3057

Closed
lategoodbye opened this issue Jul 7, 2019 · 22 comments

@lategoodbye
Contributor

lategoodbye commented Jul 7, 2019

I compiled the rpi-4.19 branch of this repo with multi_v7_defconfig. After replacing the original kernel image on my Raspbian Buster install, the kernel gets stuck on the RPI 4 (4 GB RAM) while mounting the rootfs. But there are more critical issues earlier in the boot log:

[    0.054949] bcm2835-dma: probe of fe007000.dma failed with error -5
...
[    0.061832] raspberrypi-firmware soc:firmware: Coherent DMA mask 0xffffffff (pfn 0xfff40000-0x40000) covers a smaller range of system memory than the DMA zone pfn 0x0-0xfc001
...
[    0.267555] raspberrypi-exp-gpio soc:firmware:expgpio: Failed to get GPIO 4 config (-12 84)
[    0.267586] raspberrypi-firmware soc:firmware: Coherent DMA mask 0xffffffff (pfn 0xfff40000-0x40000) covers a smaller range of system memory than the DMA zone pfn 0x0-0xfc001
[    0.267623] raspberrypi-exp-gpio soc:firmware:expgpio: Failed to get GPIO 4 config (-12 84)
[    0.267651] gpio-regulator sd_io_1v8_reg: Could not obtain regulator setting GPIOs: -12
[    0.267686] gpio-regulator: probe of sd_io_1v8_reg failed with error -12

I'll try to investigate further.

dmesg.log

@pelwell
Contributor

pelwell commented Jul 7, 2019

Good luck - I'm very curious to see what you find.

@lategoodbye
Contributor Author

Didn't have much time for investigation. The current state is that the initial call of dma_set_mask_and_coherent() fails, apparently because dma_supported() returns 0.

@vianpl
Contributor

vianpl commented Jul 9, 2019

I followed the error into __dma_supported() and got this message:

[ 0.056757] bcm2835-dma fe007000.dma: Coherent DMA mask 0xffffffff (pfn 0xfff40000-0x40000) covers a smaller range of system memory than the DMA zone pfn 0x0-0xfc001

I'm looking into why, any ideas?
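
The pfn values in that warning can be reproduced outside the kernel. What follows is a minimal userspace sketch in C, not the kernel's own code. It assumes a 32-bit pfn type (as on a 32-bit ARM build), 4 KiB pages, and a SoC bus that maps bus address 0xc0000000 to physical address 0x0 (the dma-ranges value quoted later in this thread). The helper names are chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Stand-in for a 32-bit "unsigned long" pfn: arithmetic wraps at 2^32. */
typedef uint32_t pfn_t;

/*
 * Offset between CPU and bus page frame numbers. The subtraction is done
 * in 64 bits and then truncated to 32 bits (assumption: the offset is
 * stored in a 32-bit variable on a non-LPAE build).
 */
static pfn_t dma_pfn_offset(uint64_t cpu_addr, uint64_t bus_addr)
{
	/* Both addresses are expected to be page aligned. */
	assert((cpu_addr & 0xfff) == 0 && (bus_addr & 0xfff) == 0);
	return (pfn_t)((cpu_addr - bus_addr) >> PAGE_SHIFT);
}

/* Translate a bus (DMA) address to a CPU page frame number. */
static pfn_t dma_to_pfn(uint64_t bus_addr, pfn_t offset)
{
	return (pfn_t)(bus_addr >> PAGE_SHIFT) + offset;
}

/*
 * The mask is accepted only if the highest page it can reach is not
 * below the top of the DMA zone (max_dma_pfn).
 */
static int mask_supported(uint64_t mask, pfn_t offset, pfn_t max_dma_pfn)
{
	return dma_to_pfn(mask, offset) >= max_dma_pfn;
}
```

With these assumptions the offset is 0xfff40000, a mask of 0xffffffff reaches pfn 0x3ffff (printed as 0x40000 after adding one), and that is below a DMA zone top of 0xfc000, so the check fails. The same mask would pass if the DMA zone ended at pfn 0x3ffff, i.e. at the 1 GB boundary.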

@vianpl
Contributor

vianpl commented Jul 9, 2019

Enabling ARM_LPAE fixes the issue for multi_v7_defconfig. The option is enabled in bcm2711_defconfig.

With ARM_LPAE enabled, phys_addr_t is defined as a 64bit value, as opposed to the default, which is 32bit.

I guess someone is making a wrong type/size assumption somewhere.

@lategoodbye
Contributor Author

Okay, I was assuming that the Raspbian kernel7.img is the non-LPAE variant and kernel7l.img is the LPAE variant of the same kernel. But it seems that both are LPAE builds.

At least I've found a workaround for the DMA issue by removing

#if defined(CONFIG_ZONE_DMA) && defined(CONFIG_ARM_LPAE)

in this commit 278f37a

@vianpl
Contributor

vianpl commented Jul 10, 2019

@pelwell Could you provide some info on the dma addressing limitations? I've seen this:

		/* Emulate a contiguous 30-bit address range for DMA */
		dma-ranges = <0xc0000000  0x0 0x00000000  0x3c000000>;

In other words, dma addresses start at 0xc0000000 with a maximum ~1GB size aliased to 0x00000000 which is the beginning of RAM.

This seems to work fine as long as the kernel doesn't map its buffers outside the first GB of RAM. Sadly, depending on the config on 32bit, and in general on 64bit, DMA coherent allocations are located at the top of memory, which generates nasty 64bit DMA addresses that break the mailbox and other devices.

For example, the last dma_alloc_coherent() I logged (on a 64bit kernel/4GB device) has the physical address 0xf8000000, which is translated to the DMA address 0x1b8000000 (0xf8000000 + 0xc0000000). The mailbox isn't prepared to handle that (AFAIK).

I think this also affects dma_supported() internals, which failed without @lategoodbye's fix.
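
The address translation described above can be checked with a small userspace sketch in C. It is an illustration only: the struct and function names are made up for this example, and the translation is modelled as a plain constant offset taken from the dma-ranges entry quoted in this comment.

```c
#include <assert.h>
#include <stdint.h>

/* One dma-ranges entry: bus (child) base, CPU (parent) base, size. */
struct dma_range {
	uint64_t bus_base;
	uint64_t cpu_base;
	uint64_t size;
};

/* Values from the dma-ranges property quoted above. */
static const struct dma_range soc_range = {
	.bus_base = 0xc0000000ULL,
	.cpu_base = 0x00000000ULL,
	.size     = 0x3c000000ULL,
};

/* Does the physical address lie inside the window the range describes? */
static int phys_in_range(const struct dma_range *r, uint64_t phys)
{
	return phys >= r->cpu_base && phys - r->cpu_base < r->size;
}

/* Apply the constant offset, as a plain linear translation would. */
static uint64_t phys_to_dma(const struct dma_range *r, uint64_t phys)
{
	assert(phys >= r->cpu_base);
	return phys - r->cpu_base + r->bus_base;
}

/* A 32-bit bus master can only emit addresses that fit in 32 bits. */
static int fits_in_32_bits(uint64_t dma_addr)
{
	return dma_addr <= UINT32_MAX;
}
```

For the logged buffer at 0xf8000000, phys_in_range() is false and phys_to_dma() yields 0x1b8000000, which no longer fits in 32 bits. A buffer at 0x3bffffff, the last byte of the window, translates to 0xfbffffff and does fit.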

@pelwell
Contributor

pelwell commented Jul 10, 2019

It sounds like you understand the limitations, just not how to overcome them. Do you remember that dma_zone_size patch that Stefan reverted? That will limit the coherent pool to the first 1GB.

@lategoodbye
Contributor Author

Just for the record, my intention wasn't to revert the dma_zone_size patch. The idea was to make it unconditional.

@agherzan
Contributor

@vianpl @lategoodbye Even setting dma_zone_size to 1G unconditionally doesn't seem to have any impact on a 64bit build. Was this working for you?

@lategoodbye
Contributor Author

lategoodbye commented Jul 15, 2019

@pelwell
Contributor

pelwell commented Jul 15, 2019

Please follow this discussion:

If only it was that easy - score one for GitHub. The discussion seems to have dried up.

@vianpl
Contributor

vianpl commented Jul 15, 2019

Please follow this discussion:

If only it was that easy - score one for GitHub. The discussion seems to have dried up.

I'm still looking into the issue, but getting confident with the DMA code takes time... I was planning on doing a small write-up on the issues to solve. Maybe even a small RFC where I generate max_zone_dma_phys out of the SoC's dma-ranges. But I'm not sure it's good enough, especially taking into account that this limitation doesn't apply to all the peripherals.
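
As a rough idea of what deriving the limit from dma-ranges could look like, here is a hypothetical userspace sketch in C. The function name and its parameters are invented for illustration and are not part of any kernel API; it simply takes the end of the CPU-side window of one dma-ranges entry and clamps it to the end of DRAM.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the DMA zone ends where the CPU-side window of
 * the dma-ranges entry ends, but never beyond the end of DRAM.
 */
static uint64_t zone_dma_limit_from_range(uint64_t cpu_base, uint64_t size,
					  uint64_t dram_end)
{
	uint64_t window_end = cpu_base + size;

	/* Guard against a window that wraps around the address space. */
	assert(window_end >= cpu_base);
	return window_end < dram_end ? window_end : dram_end;
}
```

With the values from this thread (CPU base 0x0, size 0x3c000000) and DRAM assumed to end at 4 GiB, the limit comes out as 0x3c000000. A single global limit like this would also constrain peripherals that do not share the restriction, which is the concern raised above.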

If anyone's curious, there are more semi-related DMA addressing issues with PCI: https://patchwork.kernel.org/patch/10605957/. BTW, do we know if Broadcom is working on a follow-up to this?

@agherzan In the meantime if you want to play around I managed to get a working setup with this:

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f3c795278def..cad580bda548 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -170,8 +170,9 @@ static void __init reserve_elfcorehdr(void)
  */
 static phys_addr_t __init max_zone_dma_phys(void)
 {
-       phys_addr_t offset = memblock_start_of_DRAM() & GENMASK_ULL(63, 32);
-       return min(offset + (1ULL << 32), memblock_end_of_DRAM());
+       /*phys_addr_t offset = memblock_start_of_DRAM() & GENMASK_ULL(63, 32);*/
+       /*return min(offset + (1ULL << 32), memblock_end_of_DRAM());*/
+       return 0x40000000;
 }
 
 #ifdef CONFIG_NUMA
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index b90e1aede743..31aa085597fe 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -392,6 +392,8 @@ int dma_direct_supported(struct device *dev, u64 mask)
 
        if (IS_ENABLED(CONFIG_ZONE_DMA))
                min_mask = DMA_BIT_MASK(ARCH_ZONE_DMA_BITS);
+       else if (IS_ENABLED(CONFIG_ZONE_DMA32))
+               min_mask = DMA_BIT_MASK(30);
        else
                min_mask = DMA_BIT_MASK(32);
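
To see why the unpatched max_zone_dma_phys() is too generous for this SoC, the arithmetic can be replayed in user space. This is a sketch under stated assumptions: dram_start and dram_end stand in for memblock_start_of_DRAM() and memblock_end_of_DRAM(), and DRAM is assumed to span 0 to 4 GiB on the 4GB board. DMA_BIT_MASK() is written out with the kernel's definition.

```c
#include <assert.h>
#include <stdint.h>

/* Same definition as the kernel's DMA_BIT_MASK(): the low n bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Userspace stand-in for the unpatched max_zone_dma_phys(): the first
 * 4 GiB above the 4 GiB-aligned start of DRAM, clamped to the end of DRAM.
 */
static uint64_t max_zone_dma_phys(uint64_t dram_start, uint64_t dram_end)
{
	/* Equivalent of dram_start & GENMASK_ULL(63, 32). */
	uint64_t offset = dram_start & 0xffffffff00000000ULL;
	uint64_t limit = offset + (1ULL << 32);

	assert(dram_start < dram_end);
	return limit < dram_end ? limit : dram_end;
}
```

Under these assumptions the unpatched function returns 0x100000000, so the whole 4 GiB counts as DMA-capable. The hard-coded 0x40000000 in the patch equals DMA_BIT_MASK(30) + 1, matching the 30-bit range mentioned in the device tree comment.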

@agherzan
Contributor


diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f3c795278def..ec3cb7b76a76 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -407,7 +407,8 @@ void __init arm64_memblock_init(void)
 
        /* 4GB maximum for 32-bit only capable devices */
        if (IS_ENABLED(CONFIG_ZONE_DMA32))
-               arm64_dma_phys_limit = max_zone_dma_phys();
+               arm64_dma_phys_limit = 0x40000000;
        else
                arm64_dma_phys_limit = PHYS_MASK + 1;

This works great. Does anyone see any downsides to it? Shall we include it in the rpi fork?

@agherzan
Contributor

#3080

@agherzan
Contributor

@pelwell do you have any idea if the new firmware switches to 64bit mode automatically based on kernel filename? Do we still need to force it in the config?

@pelwell
Contributor

pelwell commented Jul 16, 2019

The current firmware will load a kernel8.img if found and place the Pi in 64-bit mode. The old arm_control flag should still work, but the preferred mechanism to force 64-bit is with "arm_64bit=1". The very latest rpi-update firmware lets you disable 64-bit mode with "arm_64bit=0".

With an explicitly named kernel, it will also enter 64-bit mode, provided the name includes "8.img" and there is an external stub file named "armstub8-gic.bin" or "armstub8.bin" (or an explicitly named stub including "8.bin").
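
The naming rule for explicitly named kernels can be restated as a small C sketch. This only encodes the rule as worded in this comment; it is not the firmware's code, the function names are invented for illustration, and it ignores the arm_64bit setting entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Kernel rule as stated above: the file name includes "8.img". */
static int kernel_name_is_64bit(const char *kernel)
{
	return kernel != NULL && strstr(kernel, "8.img") != NULL;
}

/*
 * Stub rule as stated above: "armstub8-gic.bin", "armstub8.bin", or any
 * explicitly named stub whose name includes "8.bin". A NULL stub means
 * no external stub file is present.
 */
static int stub_name_is_64bit(const char *stub)
{
	if (stub == NULL)
		return 0;
	return strcmp(stub, "armstub8-gic.bin") == 0 ||
	       strstr(stub, "8.bin") != NULL;
}

/* Both conditions must hold for the described switch to 64-bit mode. */
static int explicit_names_select_64bit(const char *kernel, const char *stub)
{
	assert(kernel != NULL);
	return kernel_name_is_64bit(kernel) && stub_name_is_64bit(stub);
}
```

For example, "kernel8.img" with "armstub8-gic.bin" satisfies the rule, while "kernel7l.img" with "armstub8.bin" does not.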

@lategoodbye
Contributor Author

@vianpl Regarding pcie-brcmstb, I tried to contact the original author but didn't get any feedback. For my initial RPi 4 series, I will try to concentrate on the known parts to get at least a booting system which is accessible via serial console.

@agherzan
Contributor

agherzan commented Jul 16, 2019

The current firmware will load a kernel8.img if found and place the Pi in 64-bit mode

@pelwell This doesn't seem to happen with the latest release: https://github.com/raspberrypi/firmware/releases/tag/1.20190709 . I had to force it using arm_64bit. This is on a RPi 4.

@pelwell
Contributor

pelwell commented Jul 16, 2019

I've had the opposite experience, when I wanted it to ignore the V8 kernel and it wouldn't, so this bit of the loader clearly needs some work.

@agherzan
Contributor

@pelwell should I create an issue for this? I've just retested this to double check.

@pelwell
Contributor

pelwell commented Jul 17, 2019

It's going to be a low priority as long as forcing the 64bit-ness works, but an issue is a good place for discussion and tracking.

@lategoodbye
Contributor Author

I think we can close this.
