ipq40xx: meraki-mr33, meraki-mr74: disable image generation #12953

Merged
openwrt-bot merged 1 commit into openwrt:main on Jun 25, 2023

Leo-PL
Contributor

@Leo-PL Leo-PL commented Jun 21, 2023

After migrating to kernel 5.15, upgrading causes the units to become soft-bricked, hanging forever at the kernel startup. Kernel size limitation of 4000000 bytes is suspected here, but this is not fully confirmed.

Disable the images to protect users from inadvertent bricking of units, because recovery of those is painful with Cisco's U-boot, until the root cause is found and fixed.

@Leo-PL
Contributor Author

Leo-PL commented Jun 21, 2023

cc @chunkeey
This will likely affect 23.05 too - It caught me by surprise when I flashed one image built before 23.05 was branched off. Not sure about 22.03 - I need all my devices operational until at least next week.

@github-actions github-actions bot added the target/ipq40xx pull request/issue for ipq40xx target label Jun 21, 2023
@robimarko
Contributor

Sadly more and more devices are being hit.

@Leo-PL
Contributor Author

Leo-PL commented Jun 21, 2023

I wonder if using lzma-loader, or any other intermediate loader, is feasible on ipq40xx at all.

@robimarko
Contributor

Well, somebody would have to port the LZMA loader, it's not plug and play.

@Leo-PL
Contributor Author

Leo-PL commented Jun 21, 2023

And while at it, the MR33 isn't the most forgiving target to test it on. Thankfully I have a spare MF286D, so I may take the plunge and do so in the future.

@john-tho
Contributor

I saw it mentioned (in the LEDE-MR33 mega issue) that you are not seeing any console output after the bootloader hands off to kernel. Also saw from OpenWrt device wiki page an OEM kernel booting from an uncompressed FIT image. It may be worth trying the self-decompressing kernel, to rule out bootloader decompression issues. This may be available through FitzImage (I have not used FIT images, sorry):

define Device/FitzImage

There are also some additional debug strings available in the kernel self-extraction phase, before the kernel starts, but these need configuring in the kernel: CONFIG_DEBUG_LL. I have these extras I used some time ago, but cannot remember if they worked:

--- a/target/linux/ipq40xx/config-5.4
+++ b/target/linux/ipq40xx/config-5.4
@@ -161,8 +161,12 @@ CONFIG_CRYPTO_SIMD=y
 CONFIG_CRYPTO_XTS=y
 CONFIG_CRYPTO_ZSTD=y
 CONFIG_DCACHE_WORD_ACCESS=y
-CONFIG_DEBUG_LL_INCLUDE="mach/debug-macro.S"
+CONFIG_DEBUG_LL=y
+CONFIG_DEBUG_LL_INCLUDE="debug/msm.S"
 CONFIG_DEBUG_MISC=y
+CONFIG_DEBUG_QCOM_UARTDM=y
+CONFIG_DEBUG_UART_PHYS=0x078af000
+CONFIG_DEBUG_UART_VIRT=0xf78af000
 CONFIG_DMADEVICES=y
 CONFIG_DMA_ENGINE=y
 CONFIG_DMA_OF=y

Cheers

@Djfe
Contributor

Djfe commented Jun 23, 2023

Also affected: Zyxel WRE6606. I soft-bricked mine while trying to port DSA, but I have recovered it by now.
I'll add an image-size header for that device when re-adding support. Right now it's disabled because DSA wasn't ported.

Here's the only commit for ipq40xx so far that bumped compat-version to make people adjust their bootcmd variable:
45eb57f

After migrating to kernel 5.15, upgrading causes the units to become
soft-bricked, hanging forever at the kernel startup.
Kernel size limitation of 4000000 bytes is suspected here, but this is
not fully confirmed.

Disable the images to protect users from inadvertent bricking of units,
because recovery of those is painful with Cisco's U-boot, until the root
cause is found and fixed.

Signed-off-by: Lech Perczak <lech.perczak@gmail.com>
@openwrt-bot openwrt-bot merged commit 9d64cc0 into openwrt:main Jun 25, 2023
3 checks passed
@Djfe
Contributor

Djfe commented Jul 9, 2023

@Leo-PL have you tested
define Device/FitzImage, yet?
It works fine on my Zyxel WRE6606 and seems to be the best solution for now.
The reduction in kernel size:

FitImage: 4310288
FitzImage: 3064600
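
For reference, the saving can be worked out from the two sizes above. This is only a sketch: the 4000000-byte figure is the limit suspected in the PR description (not confirmed), and the sizes are the WRE6606 numbers, not MR33 ones.

```python
# Kernel sizes reported above for the Zyxel WRE6606, in bytes.
FIT_IMAGE_BYTES = 4310288
FITZ_IMAGE_BYTES = 3064600

# Limit suspected in the PR description; it was never confirmed.
SUSPECTED_LIMIT_BYTES = 4000000

saved_bytes = FIT_IMAGE_BYTES - FITZ_IMAGE_BYTES
saved_percent = 100 * saved_bytes / FIT_IMAGE_BYTES

print(f"saved: {saved_bytes} bytes ({saved_percent:.1f}%)")
for name, size in (("FitImage", FIT_IMAGE_BYTES), ("FitzImage", FITZ_IMAGE_BYTES)):
    verdict = "over" if size > SUSPECTED_LIMIT_BYTES else "under"
    print(f"{name}: {size} bytes, {verdict} the suspected limit")
```

With these numbers only the FitzImage build stays under the suspected limit, which is why it looked like a plausible fix at this point in the thread.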

@Leo-PL
Contributor Author

Leo-PL commented Jul 9, 2023

Will try soon-ish, I'm quite busy right now. Seems simple enough, thanks!

@Leo-PL
Contributor Author

Leo-PL commented Jul 15, 2023

@hauke I just checked, and 23.05 is affected too. This most likely needs a backport. 22.03 works. Checking the FitzImage option now.
Edit: Never mind, I see it's disabled too. Tested on my own build.

@Leo-PL
Contributor Author

Leo-PL commented Jul 15, 2023

@Leo-PL have you tested define Device/FitzImage, yet? It works fine on my Zyxel WRE6606 and seems to be the best solution for now. The reduction in kernel size:

FitImage: 4310288
FitzImage: 3064600

@Djfe Just tried: the MR33 is able to boot a FitzImage, but this does not help. It might still be useful for making images more compact, as the reduction is substantial, more than 1 MB for 22.03.

I'm having issues with 5.15.79 kernel - have to rebase to the newest. No problem booting a 6MB initramfs image from flash either - this is my current recovery mechanism.

@Leo-PL
Contributor Author

Leo-PL commented Jul 15, 2023

I saw it mentioned (in the LEDE-MR33 mega issue) that you are not seeing any console output after the bootloader hands off to kernel. Also saw from OpenWrt device wiki page an OEM kernel booting from an uncompressed FIT image. It may be worth trying the self-decompressing kernel, to rule out bootloader decompression issues. This may be available through FitzImage (I have not used FIT images, sorry):

define Device/FitzImage

There are also some additional debug strings available in the kernel self-extraction phase, before the kernel starts, but these need configuring in the kernel: CONFIG_DEBUG_LL. I have these extras I used some time ago, but cannot remember if they worked:

--- a/target/linux/ipq40xx/config-5.4
+++ b/target/linux/ipq40xx/config-5.4
@@ -161,8 +161,12 @@ CONFIG_CRYPTO_SIMD=y
 CONFIG_CRYPTO_XTS=y
 CONFIG_CRYPTO_ZSTD=y
 CONFIG_DCACHE_WORD_ACCESS=y
-CONFIG_DEBUG_LL_INCLUDE="mach/debug-macro.S"
+CONFIG_DEBUG_LL=y
+CONFIG_DEBUG_LL_INCLUDE="debug/msm.S"
 CONFIG_DEBUG_MISC=y
+CONFIG_DEBUG_QCOM_UARTDM=y
+CONFIG_DEBUG_UART_PHYS=0x078af000
+CONFIG_DEBUG_UART_VIRT=0xf78af000
 CONFIG_DMADEVICES=y
 CONFIG_DMA_ENGINE=y
 CONFIG_DMA_OF=y

Cheers

This also needs CONFIG_EARLY_PRINTK in OpenWrt configuration and defining CONFIG_DEBUG_UNCOMPRESS in kernel defconfig. Trying now.

@Leo-PL
Contributor Author

Leo-PL commented Jul 15, 2023

Okay, I finally got some results:

U-Boot 2012.07-g97ab7f1 [local,local] (Oct 06 2016 - 13:07:25)

DRAM:  242 MiB
machid : 0x8010001
Product: meraki_Stinkbug
NAND:  ONFI device found
128 MiB
Using default environment

In:    serial
Out:   serial
Err:   serial
machid: 8010001
Creating 1 MTD partitions on "nand0":
0x000000c00000-0x000007f00000 : "mtd=0"
UBI: attaching mtd1 to ubi0
UBI: physical eraseblock size:   131072 bytes (128 KiB)
UBI: logical eraseblock size:    126976 bytes
UBI: smallest flash I/O unit:    2048
UBI: VID header offset:          2048 (aligned 2048)
UBI: data offset:                4096
UBI: attached mtd1 to ubi0
UBI: MTD device name:            "mtd=0"
UBI: MTD device size:            115 MiB
UBI: number of good PEBs:        919
UBI: number of bad PEBs:         1
UBI: max. allowed volumes:       128
UBI: wear-leveling threshold:    4096
UBI: number of internal volumes: 1
UBI: number of user volumes:     5
UBI: available PEBs:             34
UBI: total number of reserved PEBs: 885
UBI: number of PEBs reserved for bad PEB handling: 9
UBI: max/mean erase counter: 828/426
Read 0 bytes from volume part.safe to 84000000
No size specified -> Using max size (2920448)
## Booting kernel from FIT Image at 84000000 ...
   Using 'config@1' configuration
   Trying 'kernel-1' kernel subimage
     Description:  ARM OpenWrt Linux-5.15.79
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x840000e4
     Data Size:    2859856 Bytes = 2.7 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x80208000
     Entry Point:  0x80208000
     Hash algo:    crc32
     Hash value:   482c0350
     Hash algo:    sha1
     Hash value:   28df2c5e5ab7b3486ccbfa8cf08b7c2702c6d670
   Verifying Hash Integrity ... crc32+ sha1+ OK
## Flattened Device Tree from FIT Image at 84000000
   Using 'config@1' configuration
   Trying 'fdt-1' FDT blob subimage
     Description:  ARM OpenWrt meraki_mr33 device tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x842ba56c
     Data Size:    18938 Bytes = 18.5 KiB
     Architecture: ARM
     Hash algo:    crc32
     Hash value:   20322fc1
     Hash algo:    sha1
     Hash value:   a473ecf55cf48e730d4115e5a4ebfaa8d2cef060
   Verifying Hash Integrity ... crc32+ sha1+ OK
   Booting using the fdt blob at 0x842ba56c
   Loading Kernel Image ... OK
OK
   Using Device Tree in place at 842ba56c, end 842c1f65
Using machid 0x8010001 from environment

Starting kernel ...

C:0x802080E0-0x804C2360->0x80C92400-0x80F4C680
Uncompressing Linux... done, booting the kernel.
no ATAGS support: can't continue

Why the kernel complains about ATAGs on a DT-based system, I have no idea :/
I just compared .config for both the 5.10 (22.03) and 5.15 kernels, and neither enables it, so I'm puzzled even more. It seems that setup_machine_fdt returns NULL, causing the kernel to fall back to ATAGs.

@Djfe
Contributor

Djfe commented Jul 16, 2023

So if I understand you correctly, then something else got broken in Kernel 5.15 and that's why the FitzImage didn't help, yet. It still doesn't boot :/

@Leo-PL
Contributor Author

Leo-PL commented Jul 16, 2023

Or in FIT image generation - the kernel either doesn't see the device tree passed to it, or deems it invalid.
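
One way to narrow this down is to check whether the device tree blob itself looks sane once extracted from the FIT image. The sketch below parses the 40-byte flattened device tree header (ten big-endian 32-bit fields, magic 0xd00dfeed). The function name `check_dtb_header` and the synthetic header are made up for illustration; this is not the kernel's own validation code, and it only rules out a corrupt blob, not problems with how the blob is handed over.

```python
import struct

FDT_MAGIC = 0xD00DFEED
# magic, totalsize, off_dt_struct, off_dt_strings, off_mem_rsvmap,
# version, last_comp_version, boot_cpuid_phys, size_dt_strings, size_dt_struct
FDT_HEADER = struct.Struct(">10I")


def check_dtb_header(blob: bytes) -> dict:
    """Return basic header facts, or raise ValueError if the blob looks wrong."""
    if len(blob) < FDT_HEADER.size:
        raise ValueError("blob is shorter than an FDT header")
    fields = FDT_HEADER.unpack_from(blob)
    magic, totalsize, version = fields[0], fields[1], fields[5]
    if magic != FDT_MAGIC:
        raise ValueError(f"bad magic {magic:#010x}")
    if totalsize > len(blob):
        raise ValueError("blob is truncated: totalsize exceeds data length")
    return {"totalsize": totalsize, "version": version}


# Synthetic header-only blob, just to exercise the function.
header = FDT_HEADER.pack(FDT_MAGIC, 40, 40, 40, 40, 17, 16, 0, 0, 0)
print(check_dtb_header(header))
```

In practice `blob` would be the contents of the fdt-1 subimage, read from a file with `open(path, "rb").read()`.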

herbetom added a commit to herbetom/gluon that referenced this pull request Jul 16, 2023
The device was disabled in OpenWrt due to unresolved issues with branches
above openwrt-22.03. Even though we only support it as broken it's better
to wait and see what happens upstream.

ref: openwrt/openwrt#12953
@Leo-PL
Contributor Author

Leo-PL commented Jul 16, 2023

Today I noticed that a fresh build with kernel 5.15.120 works.
I did a bisect to find the commit that potentially repairs the issue, which showed this:

git bisect start
# status: waiting for both good and bad commits
# bad: [1c56801dd2e696cac7da45912b5e3ea1165549d5] ath11k-firmware: update to stable WLAN.HK.2.9.0.1-01837
git bisect bad 1c56801dd2e696cac7da45912b5e3ea1165549d5
# status: waiting for good commit(s), bad commit known
# good: [b0a94fc60de18e55b44fdb656d7e9812c1668af7] kernel: bump 5.15 to 5.15.74
git bisect good b0a94fc60de18e55b44fdb656d7e9812c1668af7
# bad: [7b7edd25a571568438c886529d3443054e02f55f] imagebuilder: allow to specific ROOTFS_PARTSIZE
git bisect bad 7b7edd25a571568438c886529d3443054e02f55f
# bad: [ab8a5f2ea0e15e69a1f5e14642034eb75b2317fd] realtek: fix default image generation
git bisect bad ab8a5f2ea0e15e69a1f5e14642034eb75b2317fd
# good: [efaad5e901508621a47044dd729c4b776633b9b7] cypress-nvram: use symlink to provide NVRAM for some RPis
git bisect good efaad5e901508621a47044dd729c4b776633b9b7
# bad: [05dbdcbd32644b57d317ec16688c91b4b413c975] generic: move pending 870 ca8210 fix crash patch to backport
git bisect bad 05dbdcbd32644b57d317ec16688c91b4b413c975
# bad: [ed9bd9824a477b2cca0887867155a73b38775d80] realtek: refactor keep vlan tag setup, fix tagged forwarding
git bisect bad ed9bd9824a477b2cca0887867155a73b38775d80
# bad: [eaba63cddee44e0761cee1b0f82e3dd359e2dac1] kernel: fix regression on mt7986
git bisect bad eaba63cddee44e0761cee1b0f82e3dd359e2dac1
# good: [8db2db9890e240097caabef56860df3835320f7d] libtracefs: update to 1.6.1
git bisect good 8db2db9890e240097caabef56860df3835320f7d
# good: [3d343ca713aaa1e59962cdb4f09df99b3c7ad9a2] ath79: calibrate nand netgear wndrxxxx with nvmem
git bisect good 3d343ca713aaa1e59962cdb4f09df99b3c7ad9a2
# bad: [21762e46535d154a2e349ae2d646e2ea6926f365] ramips: add support for Keenetic KN-3010
git bisect bad 21762e46535d154a2e349ae2d646e2ea6926f365
# good: [288b0004bfa981e3dbb8678ee3289509c3930217] ath79: fix MAC address assigment for TP-Link TL-WR740N/TL-WR741ND v4
git bisect good 288b0004bfa981e3dbb8678ee3289509c3930217
# bad: [a28297d395d4253c03f362bc1ac1974b0b4f825e] x86: enable PINCTRL for all Intel platform
git bisect bad a28297d395d4253c03f362bc1ac1974b0b4f825e
# bad: [34615250a9c38b131a853166184ae094238333e2] x86/64: enable Intel PINCTRL in 64bit target
git bisect bad 34615250a9c38b131a853166184ae094238333e2
# first bad commit: [34615250a9c38b131a853166184ae094238333e2] x86/64: enable Intel PINCTRL in 64bit target

This looks like pure nonsense, but it seems the culprit is somewhere near.
I managed to get two builds differing by 1 commit (288b000 vs 3461525), and indeed the 2nd one boots, even though the kernel config and version are theoretically the same. The device tree certainly is, as shown by dumpimage:

dumpimage -T flat_dt -l good/sysupgrade-meraki_mr33/kernel
Image contains unit addresses @, this will break signing
FIT description: ARM OpenWrt FIT (Flattened Image Tree)
Created:         Sun Nov 20 16:31:10 2022
 Image 0 (kernel-1)
  Description:  ARM OpenWrt Linux-5.15.79
  Created:      Sun Nov 20 16:31:10 2022
  Type:         Kernel Image
  Compression:  gzip compressed
  Data Size:    4293826 Bytes = 4193.19 KiB = 4.09 MiB
  Architecture: ARM
  OS:           Linux
  Load Address: 0x80208000
  Entry Point:  0x80208000
  Hash algo:    crc32
  Hash value:   fb8ddbf0
  Hash algo:    sha1
  Hash value:   c02142715afd397dd7bd27836796d8d4488db253
 Image 1 (fdt-1)
  Description:  ARM OpenWrt meraki_mr33 device tree blob
  Created:      Sun Nov 20 16:31:10 2022
  Type:         Flat Device Tree
  Compression:  uncompressed
  Data Size:    18942 Bytes = 18.50 KiB = 0.02 MiB
  Architecture: ARM
  Hash algo:    crc32
  Hash value:   0af90942
  Hash algo:    sha1
  Hash value:   e4e44e60376e3b2888f6797af25c592488461cce
 Default Configuration: 'config@1'
 Configuration 0 (config@1)
  Description:  OpenWrt meraki_mr33
  Kernel:       kernel-1
  FDT:          fdt-1
dumpimage -T flat_dt -l bad/sysupgrade-meraki_mr33/kernel
Image contains unit addresses @, this will break signing
FIT description: ARM OpenWrt FIT (Flattened Image Tree)
Created:         Sun Nov 20 16:30:27 2022
 Image 0 (kernel-1)
  Description:  ARM OpenWrt Linux-5.15.79
  Created:      Sun Nov 20 16:30:27 2022
  Type:         Kernel Image
  Compression:  gzip compressed
  Data Size:    4293823 Bytes = 4193.19 KiB = 4.09 MiB
  Architecture: ARM
  OS:           Linux
  Load Address: 0x80208000
  Entry Point:  0x80208000
  Hash algo:    crc32
  Hash value:   e52544bc
  Hash algo:    sha1
  Hash value:   f10c6c88126a210f6ed74da5de753e78c6eb83ab
 Image 1 (fdt-1)
  Description:  ARM OpenWrt meraki_mr33 device tree blob
  Created:      Sun Nov 20 16:30:27 2022
  Type:         Flat Device Tree
  Compression:  uncompressed
  Data Size:    18942 Bytes = 18.50 KiB = 0.02 MiB
  Architecture: ARM
  Hash algo:    crc32
  Hash value:   0af90942
  Hash algo:    sha1
  Hash value:   e4e44e60376e3b2888f6797af25c592488461cce
 Default Configuration: 'config@1'
 Configuration 0 (config@1)
  Description:  OpenWrt meraki_mr33
  Kernel:       kernel-1
  FDT:          fdt-1

Maybe the image generation is at fault. The further I dig into this, the more puzzled I am.
Edit: tried rebuilding from scratch, at 288b000, and this yielded bootable image again.
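
One detail the two dumps do show is that the compressed kernels differ in size by 3 bytes (4293826 vs 4293823). A sketch of why that could matter, assuming the FIT image stores each data property padded to a 4-byte boundary as the flattened device tree format does: the padded lengths then differ by 4, so everything stored after the kernel, including the DTB, shifts by 4 bytes between the two builds. The helper name `padded_len` is made up for illustration.

```python
def padded_len(size: int, align: int = 4) -> int:
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


# Kernel data sizes from the two dumpimage listings above, in bytes.
GOOD_KERNEL_SIZE = 4293826
BAD_KERNEL_SIZE = 4293823

for label, size in (("good", GOOD_KERNEL_SIZE), ("bad", BAD_KERNEL_SIZE)):
    padded = padded_len(size)
    print(f"{label}: size={size} padded={padded} padded % 8 = {padded % 8}")

shift = padded_len(GOOD_KERNEL_SIZE) - padded_len(BAD_KERNEL_SIZE)
print(f"data after the kernel moves by {shift} bytes between the builds")
```

A 4-byte shift keeps the DTB 4-byte aligned in both images but flips whether it sits on an 8-byte boundary, which would fit the seemingly random behaviour between otherwise identical builds.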

@Flole998
Contributor

I suspected some random issue here as well, as I tried various kernel options which sometimes produced bootable images and sometimes didn't. Disabling something completely unused caused issues, and disabling something else fixed them again.

@Flole998
Contributor

Edit: tried rebuilding from scratch, at 288b000, and this yielded bootable image again.

Does that mean you have 2 images with the same versions, same build config and so on and one is bootable and one isn't? Did you run something like hexdiff to see where they differ?

@Flole998
Contributor

So the image I built from latest master with the recipe-change from above (and I also modified the qca8k driver and included that FDB learning disable workaround, that's entirely unrelated though) did lead to a bootable image. I am not sure if I was "just lucky" or if images are now reliable and working properly again though. Only time will tell I guess.

@Leo-PL
Contributor Author

Leo-PL commented Aug 13, 2023

By any chance, do you have a boot log from your attempt? Just until the kernel starts is enough.
Could you please locate the source .its file used for building the image? I'm wondering whether patching the known-broken FIT image to contain an FDT load address would indeed fix the issue. This would confirm the solution.

@Flole998
Contributor

No bootlog, I didn't disassemble the unit. Where is that .its file located?

@Leo-PL
Contributor Author

Leo-PL commented Aug 13, 2023

Look for build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/linux-ipq40xx_generic/tmp/openwrt-*-ipq40xx-generic-meraki_mr33-initramfs-fit-zImage.itb.its

When you find it, I think you can force the load address to be unaligned too, then regenerate the image and see if such image boots. Or do that from the image makefile level, this should be easier.

@Flole998
Contributor

/dts-v1/;

/ {
        description = "ARM OpenWrt FIT (Flattened Image Tree)";
        #address-cells = <1>;

        images {
                kernel-1 {
                        description = "ARM OpenWrt Linux-5.15.126";
                        data = /incbin/("/openwrt/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/linux-ipq40xx_generic/tmp/openwrt-ipq40xx-generic-meraki_mr33-initramfs-uImage.itb");
                        type = "kernel";
                        arch = "arm";
                        os = "linux";
                        compression = "gzip";
                        load = <0x80208000>;
                        entry = <0x80208000>;
                        hash-1 {
                                algo = "crc32";
                        };
                        hash-2 {
                                algo = "sha1";
                        };
                };


                fdt-1 {
                        description = "ARM OpenWrt meraki_mr33 device tree blob";

                        data = /incbin/("/openwrt/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/linux-ipq40xx_generic/image-qcom-ipq4029-mr33.dtb");
                        type = "flat_dt";
                        load = <0x89000000>;
                        arch = "arm";
                        compression = "none";
                        hash-1 {
                                algo = "crc32";
                        };
                        hash-2 {
                                algo = "sha1";
                        };
                };



        };

        configurations {
                default = "config@1";
                config@1 {
                        description = "OpenWrt meraki_mr33";
                        kernel = "kernel-1";
                        fdt = "fdt-1";



                };

        };
};

I haven't opened the unit so I prefer to not try to brick it. Maybe you can try to assemble the broken image and turn it into a good one?

@Leo-PL
Contributor Author

Leo-PL commented Aug 14, 2023

If everything goes well, maybe I can find some time in the evening. For this exact reason, since last experiments I kept one of my units without replacing the rubber feet and their sticky tape.

@Leo-PL
Contributor Author

Leo-PL commented Aug 14, 2023

Okay, I managed to squeeze out some time for the test - we have it. Setting the DEVICE_DTS_LOADADDR to 0x89000000 causes the board to boot, setting it to 0x89000004 causes it to fail booting.
@Flole998 or @robimarko would you like to submit a PR with the fix?
I can do that as well if you don't.

While at it, do we have any influence through mkits or mkimage to create images in a way that makes the load addresses aligned on the devices automatically, i.e. to apply padding there if necessary?
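
The two test addresses, and the in-place FDT address from the failing boot log earlier in this thread, can be compared against an 8-byte boundary. The 8-byte figure is an assumption taken from the Devicetree Specification's alignment rule for the blob; the thread itself only establishes that 0x89000000 boots and 0x89000004 does not.

```python
FDT_ALIGNMENT = 8  # bytes; assumed requirement, not confirmed in this thread


def is_aligned(addr: int, alignment: int = FDT_ALIGNMENT) -> bool:
    """True when addr is a multiple of alignment."""
    return addr % alignment == 0


addresses = {
    "DEVICE_DTS_LOADADDR that boots": 0x89000000,
    "DEVICE_DTS_LOADADDR that fails": 0x89000004,
    "in-place FDT in the failing boot log": 0x842BA56C,
}

for label, addr in addresses.items():
    state = "aligned" if is_aligned(addr) else "NOT aligned"
    print(f"{label}: {addr:#010x} is {state} (addr % 8 = {addr % 8})")
```

Both failing cases are 4-byte aligned but sit 4 bytes past an 8-byte boundary, while the booting address is on one.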

@robimarko
Contributor

I can make the PR, it's great that we found the issue.

@Flole998
Contributor

That's for both, MR33 and MR74, right?

@robimarko Thank you for your help! If you want, go ahead (and don't forget to enable image generation and backport to 23 RC).

@robimarko
Contributor

robimarko commented Aug 14, 2023

They should both be pretty much the same board?

@Leo-PL Are you sure the address was 0x79000000 and not 0x89000000? DRAM is mapped at 0x80000000, so I am sure it should just crash if you try using 0x79000000.

There is no way to force alignment during image generation, the same was discussed when UniFi 6 devices had the same issue, there is probably a PR somewhere with the discussion.
AFAIK if no load address is set then it's up to the bootloader to relocate the DTB, so this is a bootloader bug.

@Leo-PL
Contributor Author

Leo-PL commented Aug 14, 2023

@robimarko, Correct, it is 0x89000000 indeed.

While at that, we might enable FitzImage, this would save on kernel size substantially, and extract some common defines within image makefile. At current main I get reduction from 4.1MiB down to 2.9MiB, and boot is faster.

@Leo-PL
Contributor Author

Leo-PL commented Aug 14, 2023

AFAIK if no load address is set then it's up to the bootloader to relocate the DTB, so this is a bootloader bug.

Yes, but if the bootloader uses device tree in place (as MR33's U-boot does) then we have some influence, to at least make the problem rarer.

@robimarko
Contributor

We can just completely avoid the issue by hardcoding the address.

@Leo-PL
Contributor Author

Leo-PL commented Aug 14, 2023

And this is the proper solution too, because of reliability. At the same time it is device-specific, sadly.

@robimarko
Contributor

robimarko commented Aug 14, 2023

Well, the thing is that it's not supposed to be happening at all, and I have no idea why it's happening on Meraki bootloaders, as they are just QCA reference ones.

Made a PR, I did not move to Fitz, that should be a separate PR.
#13290

@Flole998
Contributor

I am not too sure, the code at https://github.com/riptidewave93/meraki-uboot/blob/e89f7c54758e513b8fda3f63af171ebe23f0adb2/common/image.c#L1526 doesn't seem to enforce the alignment; it seems to just find wherever the FIT image is currently located and then tries to use it. And somehow https://github.com/riptidewave93/meraki-uboot/blob/e89f7c54758e513b8fda3f63af171ebe23f0adb2/common/image.c#L1278 won't help here either, as disable_relocation seems to get set because fdt_high is all ones (someone would have to confirm that though).

@robimarko
Contributor

fdt_high is most likely the issue, UniFi 6 had the same thing

@Flole998
Contributor

Very likely, having a look at u-boot:

# cat /dev/mtd8 | grep fdt_high
fdt_high=0xffffffff
fdt_high
Failed using fdt_high value for Device Tree
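
To make the reasoning concrete, here is a simplified model of the behaviour discussed in this thread, not the bootloader's actual code. It assumes that fdt_high set to all ones means the device tree is used in place, and that a load address in the FIT image, when present, decides where the blob ends up (which matches the DEVICE_DTS_LOADADDR test results). The function names are hypothetical.

```python
from typing import Optional

USE_IN_PLACE = 0xFFFFFFFF  # fdt_high value found in the environment dump


def fdt_used_in_place(fdt_high: Optional[str]) -> bool:
    """True when fdt_high is set to all ones (32-bit)."""
    if fdt_high is None:
        return False
    return int(fdt_high, 16) == USE_IN_PLACE


def final_fdt_address(data_addr: int, load_addr: Optional[int],
                      fdt_high: Optional[str]) -> Optional[int]:
    """Address the kernel would be handed in this model.

    Returns None when the bootloader would relocate the blob itself,
    which this sketch does not attempt to model.
    """
    if not fdt_used_in_place(fdt_high):
        return None
    return load_addr if load_addr is not None else data_addr


# FDT data address from the failing boot log, with and without a load address.
no_load = final_fdt_address(0x842BA56C, None, "0xffffffff")
with_load = final_fdt_address(0x842BA56C, 0x89000000, "0xffffffff")
print(f"without load address: {no_load:#010x}")
print(f"with load address:    {with_load:#010x}")
```

Under these assumptions the alignment of the blob is left to the image layout unless the image recipe pins a load address.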

@Flole998
Contributor

While at that, we might enable FitzImage, this would save on kernel size substantially

I thought about this but I don't see the advantage: this unit has more than enough flash and kernel size doesn't seem to be an issue, so what this would do is increase boot time due to the extraction process. I think that should be a factor as well when making such a change: something like 1 second would be okay in my opinion, something like 5-10 would make me question the change.

Isn't it possible to make it configurable somehow in menuconfig? Whenever a unit with a FIT-image is selected add an option somewhere to enable a fitz-image?

@robimarko
Contributor

It cannot be selectable, it's just a recipe thing.

@Flole998
Contributor

Just to put this into perspective a little: on my unit I have left the original system installed on separate partitions and I still have about 32 MiB of free space on pretty much stock OpenWrt with LuCI. The smaller kernel could maybe bump this to 34 MiB. Does it really make sense for this device? I don't think so...

@Leo-PL
Contributor Author

Leo-PL commented Aug 15, 2023

On the other hand: are there any real downsides to using FitzImage? The board boots faster and the kernel is smaller, if only by a little.

@Djfe
Contributor

Djfe commented Aug 15, 2023

flole probably thought it would boot slower, but better compression increases overall read speed on most of these flashes.
[Image: Screenshot_20230815-234411~2]
https://events.static.linuxfound.org/sites/events/files/lcjpcojp13_klee.pdf

ok there is no comparison for lzma in those slides but unless your device is single core, lzma isn't as slow as you think.

@Flole998
Contributor

Exactly, thanks for providing the comparison. In that case we should absolutely switch.
