ESP32-C3 boot-loops with CP 8.0.0-beta.2 #7060

Closed
dhalbert opened this issue Oct 15, 2022 · 24 comments · Fixed by #7094

Comments

@dhalbert
Collaborator

dhalbert commented Oct 15, 2022

8.0.0-beta.1 works on QT Py ESP32-C3; beta.2 does not. Bisected to #7023, which updates the ESP-IDF.

@dhalbert dhalbert added this to the 8.0.0 milestone Oct 15, 2022
@dhalbert
Collaborator Author

dhalbert commented Oct 15, 2022

I did a bisect on esp-idf to track this down, and the breaking commit is
espressif/esp-idf@0007754
which is a toolchain update from *gcc8_4_0-esp-2021r2-patch3 to *gcc8_4_0-esp-2021r2-patch4 ☹️
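
For readers unfamiliar with bisecting, here is a minimal sketch of the idea in Python. The commit labels and the `boots` predicate are hypothetical stand-ins, not real esp-idf history; in practice `git bisect` does this bookkeeping, and each test is a build-and-flash cycle.

```python
def first_bad(commits, is_good):
    """Return the first commit for which is_good() is False.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and once a commit is bad all later ones are too.
    """
    lo, hi = 0, len(commits) - 1   # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Hypothetical history: labels only, not real esp-idf commits.
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
broken_from = "c5"


def boots(commit):
    # Stand-in for "build, flash, and watch the serial console".
    return history.index(commit) < history.index(broken_from)


print(first_bad(history, boots))
```

With eight commits this needs three tests rather than up to six, which matters when every test means a full rebuild.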

@mwalimu

mwalimu commented Oct 16, 2022

I'm seeing the same issue with ESP32 boards. Not with ESP32-S3.

@microdev1
Collaborator

This issue is also present in idf@v5.0. I suspect something is wrong with our build configuration (probably flash configuration).

@Neradoc

Neradoc commented Oct 16, 2022

> I'm seeing the same issue with ESP32 boards. Not with ESP32-S3.

What board is it? I'm not seeing it with the Feather ESP32 V2.

Adafruit CircuitPython 8.0.0-beta.2-9-g5192082e6 on 2022-10-16; Adafruit Feather ESP32 V2 with ESP32
>>> 

@mwalimu

mwalimu commented Oct 16, 2022

First off, let me say that I'm a complete amateur here.

I was porting CircuitPython to non-Adafruit boards. I was able to get it to work on M5-ATOM by rewriting Huzzah32. I also got it to work on M5-STAMP-c3u by rewriting QTYPY-esp32-c3. They worked rather well with the REPL and Workflow. However, since the last update none of them work. I get a boot-up message that ends with "Serial console setup" and nothing more. The ESP32 chips that go with those are very basic, while the chip that goes with the Feather ESP32 V2 has more memory and doesn't port to those more basic chips.

@microdev1
Collaborator

I believe we can narrow this issue down to only being present on esp32 and esp32c3 boards with 4MB flash.

@microdev1 microdev1 changed the title from "ESP32-C3 boot-loops after ESP-IDF update" to "ESP32 & ESP32-C3 boards having 4MB flash boot-loop with CP 8.0.0-beta.2" on Oct 16, 2022
@dhalbert
Collaborator Author

dhalbert commented Oct 16, 2022

After some more testing, I think these are two different problems. The ESP32 problem is a simple hang; the ESP32-C3 problem is a boot-loop. More on ESP32-C3 problem in the next comment. I will re-title this and open another issue for ESP32.

@dhalbert dhalbert changed the title from "ESP32 & ESP32-C3 boards having 4MB flash boot-loop with CP 8.0.0-beta.2" to "ESP32-C3 boards having 4MB flash boot-loop with CP 8.0.0-beta.2" on Oct 16, 2022
@dhalbert
Collaborator Author

dhalbert commented Oct 16, 2022

From the current tip of main, I rolled back only the toolchain from gcc8_4_0-esp-2021r2-patch5 to gcc8_4_0-esp-2021r2-patch3, and the QT Py ESP32-C3 came alive again. (Rolling back to patch4, which is what is actually in the bisect commit above, was not enough.)

Now we need to do some more C3/RISC-V-specific searching in the ESP-IDF issues, etc. to see if anyone else is seeing this.

@dhalbert
Collaborator Author

dhalbert commented Oct 17, 2022

Recording some notes:

I did a debug build and added some logging to catch where it's hanging:

V (639) esp_image: loading segment header 4 at offset 0x1838cc_cache_and_start_app
D (778) boot: configure drom and irom and start
V (783) boot: d mmu set paddr=00010000 vaddr=3c130000 size=301864 n=5
V (789) boot: rc=0
V (791) boot: i mmu set paddr=00060000 vaddr=42000000 size=1194156 n=19
V (798) boot: rc=0
D (800) boot: start: 0x40382394
I (814) cpu_start: Pro cpu up.
D (815) efuse: In EFUSE_BLK2__DATA4_REG is used 3 bits starting with 0 bit
D (815) efuse: In EFUSE_BLK2__DATA4_REG is used 8 bits starting with 12 bit
D (821) efuse: In EFUSE_BLK1__DATA3_REG is used 3 bits starting with 18 bit
D (828) efuse: In EFUSE_BLK1__DATA5_REG is used 5 bits starting with 5 bit
D (835) efuse: In EFUSE_BLK1__DATA4_REG is used 7 bits starting with 7 bit
D (842) efuse: In EFUSE_BLK1__DATA4_REG is used 7 bits starting with 14 bit
D (849) efuse: In EFUSE_BLK1__DATA4_REG is used 8 bits starting with 21 bit
D (856) efuse: In EFUSE_BLK1__DATA4_REG is used 3 bits starting with 29 bit
D (863) efuse: In EFUSE_BLK1__DATA5_REG is used 5 bits starting with 0 bit
D (878) clk: RTC_SLOW_CLK calibration value: 3474099
I (886) cpu_start: Pro cpu start user code
I (886) cpu_start: cpu freq: 160000000
I (887) cpu_start: Application information:
I (889) cpu_start: Project name:     circuitpython
I (895) cpu_start: App version:      8.0.0-beta.2-32-g029e57dd5-dirt
I (902) cpu_start: Compile time:     Oct 17 2022 19:03:04
I (908) cpu_start: ELF file SHA256:  4536afb5b0794c9c...
I (914) cpu_start: ESP-IDF:          v4.4.2-381-g716d8531d7-dirty
V (921) memory_layout: reserved range is 0x3c1787f8 - 0x3c178810
D (927) memory_layout: Checking 4 reserved memory ranges:
D (932) memory_layout: Reserved memory range 0x3fc80000 - 0x3fc94000
D (938) memory_layout: Reserved memory range 0x3fc94000 - 0x3fca0ca0
D (945) memory_layout: Reserved memory range 0x3fcdf060 - 0x3fce0000
D (951) memory_layout: Reserved memory range 0x50000000 - 0x50000020
D (958) memory_layout: Building list of available memory regions:
V (964) memory_layout: Examining memory region 0x3fc80000 - 0x3fca0000
V (971) memory_layout: Start of region 0x3fc80000 - 0x3fca0000 overlaps reserved 0x3fc80000 - 0x3fc94000
V (980) memory_layout: Region 0x3fc94000 - 0x3fca0000 inside of reserved 0x3fc94000 - 0x3fca0ca0
V (989) memory_layout: Examining memory region 0x3fca0000 - 0x3fcc0000
V (996) memory_layout: Start of region 0x3fca0000 - 0x3fcc0000 overlaps reserved 0x3fc94000 - 0x3fca0ca0
D (1005) memory_layout: Available memory region 0x3fca0ca0 - 0x3fcc0000
V (1012) memory_layout: Examining memory region 0x3fcc0000 - 0x3fcdc710
D (1018) memory_layout: Available memory region 0x3fcc0000 - 0x3fcdc710
V (1025) memory_layout: Examining memory region 0x3fcdc710 - 0x3fce0000
V (1032) memory_layout: End of region 0x3fcdc710 - 0x3fce0000 overlaps reserved 0x3fcdf060 - 0x3fce0000
D (1041) memory_layout: Available memory region 0x3fcdc710 - 0x3fcdf060
V (1048) memory_layout: Examining memory region 0x50000000 - 0x50002000
V (1055) memory_layout: Start of region 0x50000000 - 0x50002000 overlaps reserved 0x50000000 - 0x50000020
D (1064) memory_layout: Available memory region 0x50000020 - 0x50002000
I (1071) heap_init: Initializing. RAM available for dynamic allocation:
D (1078) heap_init: New heap initialised at 0x3fca0ca0
I (1083) heap_init: At 3FCA0CA0 len 0003BA70 (238 KiB): DRAM
I (1090) heap_init: At 3FCDC710 len 00002950 (10 KiB): STACK/DRAM
D (1097) heap_init: New heap initialised at 0x50000020
I (1102) heap_init: At 50000020 len 00001FE0 (7 KiB): RTCRAM
V (1112) memspi: raw_chip_id: 164020

V (1115) memspi: chip_id: 204016

V (1119) memspi: raw_chip_id: 164020

V (1123) memspi: chip_id: 204016

D (1161) cpu_start: calling init function: 0x4211f192
D (1166) cpu_start: calling init function: 0x4211e982
D (1171) cpu_start: calling init function: 0x4211355e
D (1177) efuse: In EFUSE_BLK2__DATA4_REG is used 3 bits starting with 0 bit
D (1184) efuse: In EFUSE_BLK2__DATA5_REG is used 10 bits starting with 18 bit
D (1195) cpu_start: calling init function: 0x42113530
D (1200) cpu_start: calling init function: 0x42106026
D (1205) cpu_start: calling init function: 0x420800a4
D (1210) cpu_start: calling init function: 0x4207d6be
V (1216) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): checking args
V (1222) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): Args okay. Resulting flags 0xC02
D (1230) intr_alloc: Connected src 39 to int 2 (cpu 0)
I (1235) sleep: Configure to isolate all GPIO pins in sleep state
I (1242) sleep: Enable automatic switching of GPIO sleep configuration
I (1249) ESP_SYSTEM_INIT_FN: after esp_sleep_enable_gpio_switch
I (1256) ESP_SYSTEM_INIT_FN: after esp_apb_backup_dma_lock_init
I (1262) ESP_SYSTEM_INIT_FN: before esp_coex_adapter_register
I (1269) ESP_SYSTEM_INIT_FN: before coex_pre_init

The last four lines are logging I added, in esp-idf/components/esp_system/startup.c, in the ESP_SYSTEM_INIT_FN (that's a macro that expands to a specific name). As indicated, it gets to the coex_pre_init() call, which appears to never return.

#if CONFIG_SW_COEXIST_ENABLE || CONFIG_EXTERNAL_COEX_ENABLE
    ESP_EARLY_LOGI("ESP_SYSTEM_INIT_FN", "before esp_coex_adapter_register");   /// ADDED
    esp_coex_adapter_register(&g_coex_adapter_funcs);
    ESP_EARLY_LOGI("ESP_SYSTEM_INIT_FN", "before coex_pre_init");   /// ADDED
    coex_pre_init();   /// <-------------------------
    ESP_EARLY_LOGI("ESP_SYSTEM_INIT_FN", "after coex_pre_init");   /// ADDED
#endif

(EDIT: compile-time testing shows that both CONFIG_SW_COEXIST_ENABLE and CONFIG_EXTERNAL_COEX_ENABLE are set.)
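
When a boot log like the one above gets long, it can help to pull out the last marker line mechanically. The following is a minimal sketch, assuming the `LEVEL (millis) tag: message` line shape seen in the log above; the function names are made up for illustration and are not part of ESP-IDF or CircuitPython.

```python
import re

# Matches log lines shaped like: I (1269) ESP_SYSTEM_INIT_FN: before coex_pre_init
LOG_LINE = re.compile(r"^([EWIDV]) \((\d+)\) ([^:]+): (.*)$")


def parse_log(text):
    """Yield (level, millis, tag, message) for every line that matches."""
    for line in text.splitlines():
        m = LOG_LINE.match(line.strip())
        if m:
            level, millis, tag, message = m.groups()
            yield level, int(millis), tag, message


def last_message(text, tag):
    """Return the last message logged under `tag`, or None if there is none."""
    last = None
    for _level, _millis, entry_tag, message in parse_log(text):
        if entry_tag == tag:
            last = message
    return last


# The four marker lines from the log above.
sample = """\
I (1249) ESP_SYSTEM_INIT_FN: after esp_sleep_enable_gpio_switch
I (1256) ESP_SYSTEM_INIT_FN: after esp_apb_backup_dma_lock_init
I (1262) ESP_SYSTEM_INIT_FN: before esp_coex_adapter_register
I (1269) ESP_SYSTEM_INIT_FN: before coex_pre_init
"""

print(last_message(sample, "ESP_SYSTEM_INIT_FN"))
```

A "before X" marker with no matching "after X" is the signature of a call that never returned, which is how the log above points at `coex_pre_init()`.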

Interestingly, there is a substantial size difference between the .bins compiled by patch5 and ones compiled by patch3: the patch5 ones are noticeably smaller. Also the build files show that the arch chosen by patch4 is rv32gc, and patch3 is rv32imc. But I don't know what that means.
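
One way to confirm which architecture a build used is to scan the build output (or a compile-commands dump) for `-march=` flags. This is a rough sketch only: the sample compile lines are invented for illustration, and where the real flags appear in a CircuitPython build is an assumption you would need to check locally.

```python
import re
from collections import Counter

# Capture the value of every -march=<value> flag.
MARCH = re.compile(r"-march=([A-Za-z0-9_]+)")


def march_values(build_text):
    """Count each -march value found in the given build output text."""
    return Counter(MARCH.findall(build_text))


# Hypothetical compile lines, for illustration only.
sample = (
    "riscv32-esp-elf-gcc -march=rv32imc -Os -c main.c -o main.o\n"
    "riscv32-esp-elf-gcc -march=rv32imc -Os -c port.c -o port.o\n"
)

print(march_values(sample))
```

If more than one value shows up in a single build, that alone is worth investigating, since objects compiled for different ISA variants end up linked into the same image.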

@jepler
Member

jepler commented Oct 17, 2022

RV32IMC: the base 32-bit instruction set, with "M" (integer multiplication & division) and "C" (compressed instruction) support. Notably, without "F", it would be software floating point.

RV32GC: "G" is "Shorthand for the IMAFDZicsr_Zifencei base and extensions" and "C" is compressed again. That alphabet soup says there's support for things like atomic instructions, single and double precision floating point arithmetic, and a few others.

So RV32GC is strictly a superset of IMC. https://en.wikipedia.org/wiki/RISC-V#ISA_base_and_extensions
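
To make the superset relationship concrete, here is a small Python sketch that expands the two ISA strings into sets of extensions. It only handles single-letter extensions plus the `g` shorthand as quoted above; the helper name `extensions` is made up for this example.

```python
# Expansion of the "G" shorthand, as quoted above.
G_EXPANSION = ["I", "M", "A", "F", "D", "Zicsr", "Zifencei"]


def extensions(isa):
    """Expand a simple ISA string such as 'rv32imc' into a set of extensions.

    Underscore-separated multi-letter extensions are out of scope here.
    """
    isa = isa.lower()
    if not isa.startswith(("rv32", "rv64")):
        raise ValueError("expected an ISA string starting with rv32 or rv64")
    result = set()
    for letter in isa[4:]:
        if letter == "g":
            result.update(G_EXPANSION)
        else:
            result.add(letter.upper())
    return result


imc = extensions("rv32imc")
gc = extensions("rv32gc")

print(sorted(gc - imc))   # extensions present in rv32gc but not in rv32imc
print(imc < gc)           # True when rv32imc is a strict subset of rv32gc
```

The difference set is the part to watch: code compiled for rv32gc is allowed to use those extra extensions, and the ESP32-C3 core is RV32IMC. Whether that is what actually broke the boot is not established in this thread.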

@dhalbert
Collaborator Author

@microdev1 Since the failure is happening very early, in startup.c, it's not CircuitPython code that's failing. So I would suspect something about our sdkconfig or build process. I'm not sure what I will do next, but I may try to build a simple ESP-IDF example for the C3 and look at its sdkconfig and build log. If you have some ideas or have something to try yourself, let me know. Thanks.

@a-a-crabtree

Not sure if it helps any, but I figured out the specific build where this started happening, at least for the Seeed XIAO ESP32C3. The error does not occur on 8d34445 and it does occur on de95463. That would point toward the issue being from #7023, which had a big update to esp-idf included. I'm way too much of a newb on this to track down the error from there, but hopefully this can help narrow it down some.

@dhalbert
Collaborator Author

> Not sure if it helps any, but I figured out the specific build where this started happening, at least for the Seeed XIAO ESP32C3. The error does not occur on 8d34445 and it does occur on de95463. That would point toward the issue being from #7023, which had a big update to esp-idf included. I'm way too much of a newb on this to track down the error from there, but hopefully this can help narrow it down some.

I did a bisect inside esp-idf to narrow it down further inside the esp-idf update. See #7060 (comment). The change that causes the problem is not a code change in esp-idf, but an update to the compiler toolchain.

@WULFFJ

WULFFJ commented Oct 19, 2022

> > Not sure if it helps any, but I figured out the specific build where this started happening, at least for the Seeed XIAO ESP32C3. The error does not occur on 8d34445 and it does occur on de95463. That would point toward the issue being from #7023, which had a big update to esp-idf included. I'm way too much of a newb on this to track down the error from there, but hopefully this can help narrow it down some.
>
> I did a bisect inside esp-idf to narrow it down further inside the esp-idf update. See #7060 (comment). The change that causes the problem is not a code change in esp-idf, but an update to the compiler toolchain.

I am having a problem too with the same model XIAO. The S1 version you provided actually got it to where I can see the REPL in Thonny. However, it will not fully upload to the library. The flash problems were happening before this new issue. I will be interested to see how this is resolved.

@a-a-crabtree

> I am having a problem too with the same model XIAO. The S1 version you provided actually got it to where I can see the REPL in Thonny. However, it will not fully upload to the library. The flash problems were happening before this new issue. I will be interested to see how this is resolved.

@WULFFJ I was running into the same error & was able to get it working through the CircuitPython Web Workflow after following the steps here. You may need to try a few times before it works as I kept having Thonny's backend crash. Web Workflow occasionally crashes as well, but generally speaking it's worked much better for me than Thonny over USB.

@WULFFJ

WULFFJ commented Oct 19, 2022

> > I am having a problem too with the same model XIAO. The S1 version you provided actually got it to where I can see the REPL in Thonny. However, it will not fully upload to the library. The flash problems were happening before this new issue. I will be interested to see how this is resolved.
>
> @WULFFJ I was running into the same error & was able to get it working through the CircuitPython Web Workflow after following the steps here. You may need to try a few times before it works as I kept having Thonny's backend crash. Web Workflow occasionally crashes as well, but generally speaking it's worked much better for me than Thonny over USB.

I will let you know shortly if that works. If I can get it to upload what I need, I don't mind a little disconnect occasionally. I have been able to get this to show up on my home Wi-Fi before, so maybe this will work for me.

@microdev1 microdev1 added crash and removed esp32 labels Oct 20, 2022
@WULFFJ

WULFFJ commented Oct 20, 2022

> > I am having a problem too with the same model XIAO. The S1 version you provided actually got it to where I can see the REPL in Thonny. However, it will not fully upload to the library. The flash problems were happening before this new issue. I will be interested to see how this is resolved.
>
> @WULFFJ I was running into the same error & was able to get it working through the CircuitPython Web Workflow after following the steps here. You may need to try a few times before it works as I kept having Thonny's backend crash. Web Workflow occasionally crashes as well, but generally speaking it's worked much better for me than Thonny over USB.

So, I made it a bit further with your advice.
1.) You cannot use Brave, it seems, even though it has the same web development feature in it. I had to use Chrome.
2.) Close any software that uses USB in the background. (Thonny, Mu editor, any other open browser, and even Cura seemed to stop me from connecting.)
3.) Use the old ESP web serial link https://nabucasa.github.io/esp-web-flasher/

Thonny did not work very well, but it was better. I was able to upload an env file. However, my Wi-Fi shows as off now, despite the board being listed as connected in my Wi-Fi software.

Thonny won't upload more than one really small file at a time. After that, the restart error message starts again.

Through PuTTY, it provides no Wi-Fi address for the board.

@dhalbert
Collaborator Author

I intended to roll back the toolchain for ESP32-C3 today for 8.0.0-beta.3. However, when I tested the latest main against this rollback, it did not work on QT Py ESP32-C3 (my main C3 testing board). So I did not include it and released beta.3 with C3 boards still not working.

I was confused because I thought I had demonstrated to myself that the toolchain rollback worked, and yet I could not reproduce. I then did a bisect again, with the toolchain rollback in all my builds, and found that the very recent commit f86377e also causes a boot loop. That commit is part of #7073, and is a change in the partition table for the QT Py ESP32-C3.

This is interesting. So one reason for a boot-loop is the toolchain advance, and another reason is this partition change, without the toolchain advance. These may be unrelated issues, or one may be a clue for the other. Certainly it is worth tracking down what it is about the partition change that is causing a problem.

@microdev1 ping for interest.

@dhalbert
Collaborator Author

dhalbert commented Oct 21, 2022

This is ridiculously confusing. I went back to the tip of main with the patch3 toolchain and now C3 is working again.

@microdev1 microdev1 changed the title from "ESP32-C3 boards having 4MB flash boot-loop with CP 8.0.0-beta.2" to "ESP32-C3 boot-loops with CP 8.0.0-beta.2" on Oct 21, 2022
@PaulskPt

I think I have a similar issue with a boot-loop on an esp32-s3. See issue #7093

@dhalbert
Collaborator Author

Fixed by #7094. Thanks @microdev1!

@a-a-crabtree

Can confirm this is working for me as well on both Seeed XIAO ESP32C3 and Adafruit QT Py ESP32-C3. Thanks @dhalbert and @microdev1!

@WULFFJ

WULFFJ commented Oct 27, 2022

> Fixed by #7094. Thanks @microdev1!

How do you get this fix to work? I'm trying using Ubuntu on WSL. I can get the firmware.bin file to make. However, it's the same issue as before on the XIAO ESP32C3. I must be doing something wrong. I am making with the 8.1 beta branch. Is that correct? I am flashing with the online web tool and it uploads. However, it will not work with Thonny, Mu, or PuTTY at all.

@microdev1
Collaborator

> I am making with the 8.1 beta branch. Is that correct?

The fix for this should be present in the main branch. Check if your local commit history has 2285dd1.
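
One way to check that from a script is `git merge-base --is-ancestor`, wrapped here in a small Python sketch. The helper names are made up for this example, and it assumes `git` is on your PATH and that you point `repo` at your CircuitPython checkout.

```python
import subprocess


def ancestor_check_command(commit, ref="HEAD"):
    """Build the git command that tests whether `commit` is an ancestor of `ref`."""
    return ["git", "merge-base", "--is-ancestor", commit, ref]


def has_commit(commit, ref="HEAD", repo="."):
    """Return True if `commit` is reachable from `ref` in the repository at `repo`."""
    result = subprocess.run(
        ancestor_check_command(commit, ref),
        cwd=repo,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # git exits 0 when the commit is an ancestor and 1 when it is not.
    # Any other exit code (for example, an unknown commit) is treated as False here.
    return result.returncode == 0


# Show the command that has_commit("2285dd1") would run.
print(" ".join(ancestor_check_command("2285dd1")))
```

Calling `has_commit("2285dd1", repo="path/to/circuitpython")` on the branch you are building from should return `True` if the fix is included; the path is a placeholder.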
