[Bug]: 220 byte message causes Heltec to reboot #3573

Closed
clwgh opened this issue Apr 8, 2024 · 8 comments
Labels
bug (Something isn't working), cant-reproduce (Can't reproduce issues)

Comments

@clwgh

clwgh commented Apr 8, 2024

Category

Other

Hardware

Heltec V3

Firmware Version

2.3.2.63df972 Beta (Stable)

Description

Heltec v3.1
Meshtastic firmware 2.3.2.63df972 Beta (Stable)
iOS app 2.3.3 (893)

While experimenting with a friend, I was sending a tiny base64-encoded jpg file over direct message. The file was base64 encoded and the text split into 220 byte sections. Each section was copied and pasted into a message, one at a time, and sent from the iOS app. Every part was received and relayed without any problems, except for one.

Pressing Send on this one part causes the Heltec to reboot. The contents of this part are as follows:

eKvEfi2x8DeGtp8dWu27RsXhh+Ad73TZfrFj3zdES//aAAgBAxEBPwHu/9oACAECEQE/Ae7/2gAIAQEABj8CurOygjvrEIt4BaGZKOVMR5egPT8NfLz3LdfEEgQRf8uK2ixxhASNAeJ+fn+pp3WPfV2t/aLxkit+UvJNB7aZEHjw+zyLX4f8M7Ym0jhIVLGjROSh/co7CTa7sxKuCmGYgDqQeI1+
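For anyone wanting to reproduce the splitting step, here is a minimal sketch in Python. The helper name `split_base64` and the sample input are illustrative only; the original test may have used a different tool, and the actual jpg is not part of this report.

```python
import base64


def split_base64(data: bytes, chunk_size: int = 220) -> list[str]:
    """Base64-encode data and split the text into chunk_size-character parts."""
    encoded = base64.b64encode(data).decode("ascii")
    return [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]


# Stand-in payload; replace with the bytes of the real file.
sample = bytes(range(256)) * 3
parts = split_base64(sample)
for index, part in enumerate(parts, start=1):
    print(f"part {index}: {len(part)} characters")
```

Base64 text is plain ASCII, so each character is one byte and a 220 character part is a 220 byte message.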

Edit: I tried other 220 byte strings to see whether special characters or ending in a + was relevant, and this randomly generated mixed-case alphanumeric string also causes the same reboot:

xBH0sYoGXX464UQF2BS8bFWn1OlnLitp3AgmHqogWpEMjfKzGMAq0HsXLvoimNmUEr5ubXFo4EFeuIk48qaSkQldN6orGhzgt81uF3pVCqkLPPuzT3YBIDtASMT7EQ9CRUe9J0VE2iWjjPbgfK2aElwe4sSoySzzd1AAMyHEV0YAHaFWAdLyBYnXYWYmS4NncRJJYRAbPea8axJpgJQcyAwST5cc
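A test string of this kind can be generated with a few lines of Python. This is only a sketch of one way to do it; the function name `random_alnum` is hypothetical and the string above was not necessarily produced this way.

```python
import secrets
import string


def random_alnum(length: int = 220) -> str:
    """Return a random mixed-case alphanumeric string of the given length."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


candidate = random_alnum()
# ASCII only, so the UTF-8 byte count equals the character count.
print(len(candidate.encode("utf-8")), candidate)
```

Because the alphabet has no `+`, `/` or other punctuation, a reboot on such a string points away from special characters as the trigger.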

On the surface it looks like a buffer problem; however, both messages are 220 bytes, which is within the 228 byte limit presented by the app.
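As a rough cross-check, the sizes that show up in the serial traces later in this thread (`numBytes=225` and `payloadSize=241`) are consistent with a 220 byte text plus framing. The sketch below works through that arithmetic in Python. The field layout (a `portnum` varint in field 1 and the text as length-delimited bytes in field 2) and the 16 byte packet header are assumptions about the Meshtastic packet format, not something stated in this issue.

```python
def varint_len(value: int) -> int:
    """Number of bytes a protobuf varint needs to encode value."""
    length = 1
    while value >= 0x80:
        value >>= 7
        length += 1
    return length


def encoded_data_len(text_len: int, portnum: int = 1) -> int:
    """Assumed size of the encoded message: portnum field plus payload field."""
    portnum_field = 1 + varint_len(portnum)              # tag byte + value
    payload_field = 1 + varint_len(text_len) + text_len  # tag + length + bytes
    return portnum_field + payload_field


HEADER_LEN = 16  # assumed fixed packet header size

body = encoded_data_len(220)
print(body)               # 225, matching numBytes in the traces
print(body + HEADER_LEN)  # 241, matching payloadSize in the traces
```

If these assumptions hold, the message itself was encoded at the expected size, which suggests the overflow happened in some intermediate buffer rather than because the text exceeded the app's limit.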

@clwgh added the bug label Apr 8, 2024
@netscylla

Tried it, works on Heltec and RAK. Here's a serial dump from the Heltec:

DEBUG | 20:43:04 201 localSend to channel 0
DEBUG | 20:43:04 201 (bw=250, sf=11, cr=4/5) packet symLen=8 ms, payloadSize=241, time 2033 ms
DEBUG | 20:43:04 201 Setting next retransmission in 10918 msecs: (id=0xdc2b0f71 fr=0x00 to=0x27, WantAck=1, HopLim=4 Ch=0x0 Portnum=1 rxtime=1712695384)
DEBUG | 20:43:04 201 Add packet record (id=0xdc2b0f71 fr=0x00 to=0x27, WantAck=1, HopLim=4 Ch=0x0 Portnum=1 rxtime=1712695384)
DEBUG | 20:43:04 201 Original length - 220
DEBUG | 20:43:04 201 Compressed length - 245
DEBUG | 20:43:04 201 Original message - xBH0sYoGXX464UQF2BS8bFWn1OlnLitp3AgmHqogWpEMjfKzGMAq0HsXLvoimNmUEr5ubXFo4EFeuIk48qaSkQldN6orGhzgt81uF3pVCqkLPPuzT3YBIDtASMT7EQ9CRUe9J0VE2iW
DEBUG | 20:43:04 201 Not using compressing message.

Stack smashing protect failure!

abort() was called at PC 0x420726c7 on core 0
E (44742) esp_core_dump_flash: Core dump flash config is corrupted! CRC=0x7bd5c66f instead of 0x0

Rebooting...

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0xc (RTC_SW_CPU_RST),boot:0x29 (SPI_FAST_FLASH_BOOT)
Saved PC:0x4037806c
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3808,len:0x44c
load:0x403c9700,len:0xbe4
load:0x403cc700,len:0x2a38
entry 0x403c98d4
E (354) esp_core_dump_flash: No core dump partition found!
E (354) esp_core_dump_flash: No core dump partition found!
INFO | ??:??:?? 0

//\ E S H T /\ S T / C
INFO | ??:??:?? 0 Booted, wake cause 0 (boot count 1), reset_reason=reset

@thebentern added the cant-reproduce label Apr 20, 2024
@thebentern
Contributor

I believe this has been resolved. I followed the same steps with the iOS app and my Heltec V3 and could not reproduce it. I wonder if removing some of the unishox2 logging statements resolved this one inadvertently.

@netscylla

netscylla commented Apr 20, 2024

Can confirm resolved within the 2.3.6 alpha

@clwgh
Author

clwgh commented Apr 20, 2024

@thebentern are you referring to firmware 2.3.6 alpha? I reported this at 2.3.2 stable and tested now on 2.3.4 stable following both a non-destructive upgrade and a full erase upgrade, and it continued to happen (traces below if of interest).

I then tested again after a full erase upgrade to 2.3.6 alpha and it no longer happened and the messages were being sent.

Sending the base64 fragment following non-destructive upgrade from 2.3.2 stable to 2.3.4 stable

DEBUG | 06:27:05 47 Original message - eKvEfi2x8DeGtp8dWu27RsXhh+Ad73TZfrFj3zdES//aAAgBAxEBPwHu/9oACAECEQE/Ae7/2gAIAQEABj8CurOygjvrEIt4BaGZKOVMR5egPT8NfLz3LdfEEgQRf8uK2ixxhASNAeJ
DEBUG | 06:27:05 47 Not using compressing message.
DEBUG | 06:27:05 47 Expanding short PSK #1
DEBUG | 06:27:05 47 Using AES128 key!
DEBUG | 06:27:05 47 ESP32 crypt fr=30327da8, num=c158f9e7, numBytes=225!

Stack smashing protect failure!

abort() was called at PC 0x420728fb on core 0

Backtrace: 0x4037845e:0x3fcc1e90 0x40380ce5:0x3fcc1eb0 0x40387881:0x3fcc1ed0 0x420728fb:0x3fcc1f50 0x420195ba:0x3fcc1f70 0x420196fb:0x3fcc2180 0x4201309f:0x3fcc21b0 0x42018d8a:0x3fcc21d0 0x420198a5:0x3fcc21f0 0x42014c60:0x3fcc2210 0x42014cfb:0x3fcc2240 0x42017072:0x3fcc2260 0x4201716c:0x3fcc2280 0x42032013:0x3fcc24a0 0x4204970f:0x3fcc24d0 0x42050995:0x3fcc2750 0x420509cd:0x3fcc2780 0x4204c8ed:0x3fcc27c0 0x4204cacd:0x3fcc27f0 0x4204d65d:0x3fcc2810 0x4204bfdd:0x3fcc2840 0x42052c37:0x3fcc2870 0x42051720:0x3fcc28a0 0x4205172b:0x3fcc28c0 0x40375706:0x3fcc28e0 0x42049d8e:0x3fcc2900

ELF file SHA256: c96d9d6e075bb841

E (51184) esp_core_dump_flash: Core dump flash config is corrupted! CRC=0x7bd5c66f instead of 0x0

Rebooting...

��ES

Sending the base64 fragment following full erase reflash of 2.3.4 stable

DEBUG | 07:55:30 18 Original message - eKvEfi2x8DeGtp8dWu27RsXhh+Ad73TZfrFj3zdES//aAAgBAxEBPwHu/9oACAECEQE/Ae7/2gAIAQEABj8CurOygjvrEIt4BaGZKOVMR5egPT8NfLz3LdfEEgQRf8uK2ixxhASNAeJ
DEBUG | 07:55:30 18 Not using compressing message.
DEBUG | 07:55:30 18 Expanding short PSK #1
DEBUG | 07:55:30 18 Using AES128 key!
DEBUG | 07:55:30 18 ESP32 crypt fr=30327da8, num=633506b9, numBytes=225!

Stack smashing protect failure!

abort() was called at PC 0x420728fb on core 0

Backtrace: 0x4037845e:0x3fcc0f60 0x40380ce5:0x3fcc0f80 0x40387881:0x3fcc0fa0 0x420728fb:0x3fcc1020 0x420195ba:0x3fcc1040 0x420196fb:0x3fcc1250 0x4201309f:0x3fcc1280 0x42018d8a:0x3fcc12a0 0x420198a5:0x3fcc12c0 0x42014c60:0x3fcc12e0 0x42014cfb:0x3fcc1310 0x42017072:0x3fcc1330 0x4201716c:0x3fcc1350 0x42032013:0x3fcc1570 0x4204970f:0x3fcc15a0 0x42050995:0x3fcc1820 0x420509cd:0x3fcc1850 0x4204c8ed:0x3fcc1890 0x4204cacd:0x3fcc18c0 0x4204d65d:0x3fcc18e0 0x4204bfdd:0x3fcc1910 0x42052c37:0x3fcc1940 0x42051720:0x3fcc1970 0x4205172b:0x3fcc1990 0x40375706:0x3fcc19b0 0x42049d8e:0x3fcc19d0

ELF file SHA256: c96d9d6e075bb841

E (21713) esp_core_dump_flash: Core dump flash config is corrupted! CRC=0x7bd5c66f instead of 0x0

Rebooting...

��E

Sending the base64 fragment following full erase upgrade from 2.3.4 stable to 2.3.6 alpha

Message was sent without issue.

Notes

The first two consoles have the entry Original message followed by part of the base64 fragment, up to that second + sign. Perhaps this reveals how those firmwares were parsing the string. In the test which worked, this entry does not appear in the console (presumably it's just present during a crash).
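One way to test the `+` theory is to compare the lengths of the `Original message` fragments logged in this thread. The Python sketch below uses the logged text copied from the traces above (the base64 fragment from the 2.3.4 traces and the alphanumeric fragment from the earlier serial dump). Reading the result as a fixed-size log buffer is an inference from these logs, not something confirmed against the firmware source.

```python
# Logged fragments copied from the traces in this thread.
logged_base64 = (
    "eKvEfi2x8DeGtp8dWu27RsXhh+Ad73TZfrFj3zdES//aAAgBAx"
    "EBPwHu/9oACAECEQE/Ae7/2gAIAQEABj8CurOygjvrEIt4BaGZ"
    "KOVMR5egPT8NfLz3LdfEEgQRf8uK2ixxhASNAeJ"
)
logged_alnum = (
    "xBH0sYoGXX464UQF2BS8bFWn1OlnLitp3AgmHqogWpEMjfKzGM"
    "Aq0HsXLvoimNmUEr5ubXFo4EFeuIk48qaSkQldN6orGhzgt81u"
    "F3pVCqkLPPuzT3YBIDtASMT7EQ9CRUe9J0VE2iW"
)

print(len(logged_base64), len(logged_alnum))  # 139 139
print("+" in logged_alnum)                    # False
```

Both fragments stop after 139 characters, and the alphanumeric one contains no `+` at all, so the cut-off in the log looks more like a length limit on the debug line than a sign of how the string was parsed. In the base64 case the 140th character simply happens to be the second `+`.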

@thebentern
Contributor

Yes, the change I was referring to actually would have been present in 2.3.5 as well. #3606

@clwgh
Author

clwgh commented Apr 20, 2024

Nice one, thanks for confirming. I'm finding 2.3.6 seems to be much better at acquiring new nodes (100 in 40 mins vs many hours or days previously). Not sure if it's relevant to that change you mentioned or even partly imagined! Regarding the earlier crashes, hopefully those traces are of interest as a future reference. Looks like the strings were causing corruption, something overflowing perhaps.

@thebentern
Contributor

> Regarding the earlier crashes, hopefully those traces are of interest as a future reference. Looks like the strings were causing corruption, something overflowing perhaps.

Indeed. When we re-introduce unishox2 for text compression in 3.0, I would like to re-run these as test cases.

@clwgh
Author

clwgh commented Apr 20, 2024

By all means give me a shout if you'd like me to help test anything, happy to assist. Thanks.


3 participants