[TW#13567] mbedtls sample fails when allocating mbedts_ssl_context on heap #711

Closed
SteveOfTheStow opened this issue Jun 19, 2017 · 20 comments

Comments

@SteveOfTheStow

(Also posted on esp32.com but not approved yet so cross-posted here)

I'm trying to use mbedtls in the ESP-IDF and have grabbed the mbedtls server sample program from here. It largely maps fine into an ESP32 model. The one major change I tried to make was to allocate mbedtls_ssl_context on the heap (I need this to use mbedtls in a larger program I'm building), and this seems to blow up the program when it tries to parse certs.
I've tried the same change on the sample running on macOS, and it works fine.

Change:

mbedtls_ssl_context ssl;
becomes
mbedtls_ssl_context *ssl = malloc(sizeof(mbedtls_ssl_context));

(and all the uses of &ssl become ssl)

Logs from ESP32:

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0x00
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0008,len:8
load:0x3fff0010,len:3444
load:0x40078000,len:10356
load:0x40080000,len:252
entry 0x40080034
I (1135) wifi: wifi firmware version: 41eede3
I (1135) wifi: config NVS flash: enabled
I (1135) wifi: config nano formating: disabled
I (1140) wifi: Init dynamic tx buffer num: 32
I (1141) wifi: Init dynamic rx buffer num: 32
I (1145) wifi: wifi driver task: 3ffcde94, prio:23, stack:4096
I (1150) wifi: Init static rx buffer num: 10
I (1154) wifi: Init dynamic rx buffer num: 32
I (1158) wifi: Init rx ampdu len mblock:7
I (1162) wifi: Init lldesc rx ampdu entry mblock:4
I (1166) wifi: wifi power manager task: 0x3ffd326c prio: 21 stack: 2560
I (1173) wifi: wifi timer task: 3ffd42fc, prio:22, stack:3584
I (1199) wifi: mode : sta (24:0a:c4:04:52:64)
I (2527) wifi: n:11 0, o:1 0, ap:255 255, sta:11 0, prof:1
I (3184) wifi: state: init -> auth (b0)
I (3186) wifi: state: auth -> assoc (0)
I (3191) wifi: state: assoc -> run (10)
I (3213) wifi: connected with LXIII-D, channel 11

. Loading the server cert. and key…
I (4791) PEM: Alpha
I (4791) PEM: Free RAM: 174432
Guru Meditation Error of type LoadProhibited occurred on core 0. Exception was unhandled.
Register dump:
PC : 0x40083283 PS : 0x00060633 A0 : 0x800833e0 A1 : 0x3ffcb7c0
A2 : 0x3ffd52b8 A3 : 0x00000000 A4 : 0xff000000 A5 : 0x80ffffff
A6 : 0x00000005 A7 : 0x00000000 A8 : 0x00000000 A9 : 0x00000000
A10 : 0x3ffc0b68 A11 : 0x00000000 A12 : 0x3ffcc6c8 A13 : 0x00000000
A14 : 0x00000000 A15 : 0x3ffcb8b0 SAR : 0x00000004 EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xfffffffe

Backtrace: 0x40083283:0x3ffcb7c0 0x400833e0:0x3ffcb7e0 0x400862b3:0x3ffcb800 0x400862f4:0x3ffcb820 0x40081a9c:0x3ffcb840 0x4000bef8:0x3ffcb860 0x40123710:0x3ffcb880 0x40115e8e:0x3ffcb8d0 0x400def59:0x3ffcb920 0x400df2ea:0x3ffcc310 0x400e0000:0x3ffcc330 0x400e003c:0x3ffcc350

The call that causes things to go bang is:

ret = mbedtls_x509_crt_parse( &srvcert, (const unsigned char *) cacert_pem_start, cacert_pem_bytes );

which doesn't even use the allocated struct.

After a stop in x509_crt.c: 1015, we get to:

0x4017096c is in mbedtls_pem_read_buffer (~/bin/espressif/esp32/esp-idf/components/mbedtls/library/pem.c:332).
327 if( ret == MBEDTLS_ERR_BASE64_INVALID_CHARACTER )
328 return( MBEDTLS_ERR_PEM_INVALID_DATA + ret );
329
330 ESP_LOGI("PEM", "Alpha");
331 ESP_LOGI("PEM", "Free RAM: %d", esp_get_free_heap_size());
332 if( ( buf = mbedtls_calloc( 1, len ) ) == NULL ) {
333 ESP_LOGI("PEM", "Bravo");
334 return( MBEDTLS_ERR_PEM_ALLOC_FAILED );
335 }
336

then:

0x40081b18 is in _calloc_r (~/bin/espressif/esp32/esp-idf/components/newlib/./syscalls.c:56).
51 return new_chunk;
52 }
53
54 void* IRAM_ATTR _calloc_r(struct _reent *r, size_t count, size_t size)
55 {
56 void* result = pvPortMalloc(count * size);
57 if (result)
58 {
59 memset(result, 0, count * size);
60 }

and further into the system.

@HubbyGitter

HubbyGitter commented Jun 19, 2017

You're certainly checking the return value of malloc() to make sure it's not NULL, right? malloc() is much more likely to fail on an ESP32 than on a Mac. The structure can be relatively large, depending on many #ifdefs (300 bytes or so?).

And since you didn't indicate that it ever worked with your config and without your modification on the ESP32, we cannot be sure that you're not simply facing a stack overflow, either. (Even if it works without your modification, a potential stack overflow can go unnoticed in one case and mess up everything in the other.)
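
A minimal sketch of the check being asked about: wrap the allocation so a failed malloc() is caught before the context is ever used. Here demo_ctx is a hypothetical stand-in for mbedtls_ssl_context so that the snippet builds without mbedTLS, and the memset() marks where mbedtls_ssl_init() would go in the real program.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for mbedtls_ssl_context so this builds on its own. */
typedef struct {
    int state;
    unsigned char buf[300];
} demo_ctx;

/* Allocate a context on the heap; returns NULL if malloc() fails. */
demo_ctx *demo_ctx_new(void)
{
    demo_ctx *ctx = malloc(sizeof *ctx);
    if (ctx == NULL) {
        return NULL;              /* caller must handle out-of-memory */
    }
    memset(ctx, 0, sizeof *ctx);  /* stands in for mbedtls_ssl_init(ctx) */
    return ctx;
}

/* Release a context; safe to call with NULL, like free(). */
void demo_ctx_free(demo_ctx *ctx)
{
    free(ctx);
}
```

Using sizeof *ctx rather than naming the type keeps the allocation size tied to the pointer's declared type, though it cannot help if two translation units disagree about what that type looks like.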

@SteveOfTheStow
Author

Yep, definitely getting a valid pointer back. And it does work if I put the mbedtls_ssl_context struct on the stack.

Here's a gist: https://gist.github.com/SteveOfTheStow/d8109991384f032edcb2de406a78e7a4

@HubbyGitter

Again, the stack could be corrupted, and it could have a different effect in each case due to the different stack layout. So is your build configured for stack overflow checking, or not?

@SteveOfTheStow
Author

SteveOfTheStow commented Jun 20, 2017

I've got "Check by stack pointer value".

Note that the same occurs if I turn off Checking.

@FHFS

FHFS commented Jun 20, 2017

Did you include the .pem file?
Also can you do the backtrace with xtensa-esp32-elf-addr2line?
There are mbedtls examples.
Before you made ssl a pointer, the program worked?
OpenSSL component in esp-idf uses malloc'ed structures of mbedtls.

@SteveOfTheStow
Author

SteveOfTheStow commented Jun 21, 2017

I was using the built-in mbedtls test cert, but I've just tried a separate pem using COMPONENT_EMBED_FILES and it's the same backtrace.

Here's the output of IDF Monitor:

Guru Meditation Error of type LoadProhibited occurred on core 0. Exception was unhandled.
Register dump:
PC : 0x4008327b PS : 0x00060333 A0 : 0x800833d8 A1 : 0x3ffd68b0
0x4008327b: prvInsertBlockIntoFreeList at ~/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410

A2 : 0x3ffc868c A3 : 0x00000000 A4 : 0xff000000 A5 : 0x80ffffff
A6 : 0x00000005 A7 : 0x00000000 A8 : 0x00000000 A9 : 0x00000000
A10 : 0x3ffc0b68 A11 : 0x00000000 A12 : 0x3ffca398 A13 : 0x00000000
A14 : 0x00000000 A15 : 0x3ffd69a0 SAR : 0x00000004 EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xfffffffe

Backtrace: 0x4008327b:0x3ffd68b0 0x400833d8:0x3ffd68d0 0x4008624b:0x3ffd68f0 0x4008628c:0x3ffd6910 0x40081a94:0x3ffd6930 0x4000bef8:0x3ffd6950 0x401234ec:0x3ffd6970 0x40115c6a:0x3ffd69c0 0x400def86:0x3ffd6a10
0x4008327b: prvInsertBlockIntoFreeList at ~/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410

0x400833d8: pvPortMallocTagged at ~/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410

0x4008624b: pvPortMallocCaps at ~/bin/espressif/esp32/esp-idf/components/esp32/./heap_alloc_caps.c:414

0x4008628c: pvPortMalloc at ~/bin/espressif/esp32/esp-idf/components/esp32/./heap_alloc_caps.c:414

0x40081a94: _calloc_r at ~/bin/espressif/esp32/esp-idf/components/newlib/./syscalls.c:56

0x401234ec: mbedtls_pem_read_buffer at ~/bin/espressif/esp32/esp-idf/components/mbedtls/library/pem.c:332 (discriminator 1)

0x40115c6a: mbedtls_x509_crt_parse at ~/bin/espressif/esp32/esp-idf/components/mbedtls/library/x509_crt.c:1253

0x400def86: ssl_task at ~/Dev/src/Practice/Platform_Specific/ESP32/mbedtls_test/main/./main.c:112

I'm sure there's some way to get this to work using mbedtls_ssl_context as a pointer. And yep, it did work before I made ssl a pointer.

@SteveOfTheStow
Author

Searching for prvInsertBlockIntoFreeList, I found this, which provides a good theory about what the malloc might be up to, though I don't understand the 'why' or how to work around it yet.

@FHFS

FHFS commented Jun 21, 2017

@SteveOfTheStow Again: before you made ssl a pointer, did the program work?

As that post you reference suggests, it might be a heap corruption issue. You should check the rest of your code for malloc'ed structs you manipulate that might go out of bounds.
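
One generic way to hunt for this kind of out-of-bounds write is to place guard bytes after a suspect allocation and verify them later. The sketch below only illustrates the idea; guarded_alloc, guard_intact, GUARD_LEN and GUARD_BYTE are hypothetical names and are not part of ESP-IDF or mbedTLS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8      /* number of guard bytes placed after the buffer */
#define GUARD_BYTE 0xA5   /* recognisable fill pattern */

/* Allocate `size` usable bytes followed by GUARD_LEN guard bytes. */
void *guarded_alloc(size_t size)
{
    if (size > SIZE_MAX - GUARD_LEN) {
        return NULL;                      /* size + GUARD_LEN would overflow */
    }
    unsigned char *p = malloc(size + GUARD_LEN);
    if (p == NULL) {
        return NULL;
    }
    memset(p + size, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* True if the guard bytes after the `size` usable bytes are untouched. */
bool guard_intact(const void *ptr, size_t size)
{
    assert(ptr != NULL);
    const unsigned char *guard = (const unsigned char *)ptr + size;
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (guard[i] != GUARD_BYTE) {
            return false;
        }
    }
    return true;
}
```

Calling guard_intact() right after each function that touches the buffer narrows down which call wrote past the end, as long as the overrun is shorter than GUARD_LEN bytes.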

@SteveOfTheStow
Author

Yes, it worked before I made ssl a pointer.

@SteveOfTheStow
Author

SteveOfTheStow commented Jun 22, 2017

I compared the sdkconfig for both the referenced mbedtls client sample in ESP-IDF and the project I created using mbedtls's own c file sample.
As a result, I've been able to get mbedtls's c file running with mbedtls_ssl_context on the heap by changing the following settings from the IDF defaults:

CONFIG_FREERTOS_THREAD_LOCAL_STORAGE_POINTERS=1 (was 3)

// Enable FreeRTOS to use multiple cores
CONFIG_INT_WDT_CHECK_CPU1=y
CONFIG_TASK_WDT_CHECK_IDLE_TASK_CPU1=y
# CONFIG_FREERTOS_UNICORE is not set

I don't know why this is.

Unfortunately, even after these modifications, the program I'm writing that uses mbedtls still fails when parsing certs, as described above (a couple of lines of backtrace are copied below).

0x401234ec: mbedtls_pem_read_buffer at ~/bin/espressif/esp32/esp-idf/components/mbedtls/library/pem.c:332 (discriminator 1)

0x40115c6a: mbedtls_x509_crt_parse at ~/bin/espressif/esp32/esp-idf/components/mbedtls/library/x509_crt.c:1253

@FHFS

FHFS commented Jun 23, 2017

Could you share the source code? I'll try to reproduce.

@SteveOfTheStow
Author

99% of the code is just using ESP-IDF right now so I'd just need to change a bunch of entity names and I could probably share it sometime this week.

@HubbyGitter

HubbyGitter commented Jun 27, 2017

I'm still on the memory corruption track.

When exactly does your server start as compared to WiFi?

WiFi has a mysterious issue regarding connect calls failing for no (justified) reason (giving reason "201").

I'm not using any malloc()ed stuff in my own code, so that could be a reason why I do not encounter any runtime problems except the failing connect. Actually, since all my memory gets allocated before WiFi starts, there's no chance for the WiFi lib to access any memory which was freed and now belongs to me (rather, it will mess up its own former memory, which I never used).

In your case, depending on the startup sequence, it's possible that the root cause is actually WiFi going wild and accessing, via a stale pointer, memory that it formerly owned and that is now yours.

If you allocate that memory before everything else and pass it to your server's main(), does that change anything?

@FayeY changed the title from "mbedtls sample fails when allocating mbedts_ssl_context on heap" to "[TW#13567] mbedtls sample fails when allocating mbedts_ssl_context on heap" on Jun 28, 2017
@SteveOfTheStow
Author

SteveOfTheStow commented Jun 29, 2017

Here's a sample version of the app I'm building: https://github.com/SteveOfTheStow/esp_mbedtls_test

  • Needs wifi settings in alpha_app_wifi.c
  • It now breaks for me on a malloc() call inside uart_connection. Just start the program and it falls over; no input required.

Guru Meditation Error of type LoadProhibited occurred on core 0. Exception was unhandled.
Register dump:
PC : 0x400832e3 PS : 0x00060033 A0 : 0x80083440 A1 : 0x3ffda3f0
0x400832e3: prvInsertBlockIntoFreeList at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410

A2 : 0x3ffdaa6c A3 : 0x00000000 A4 : 0xff000000 A5 : 0x80ffffff
A6 : 0x00000022 A7 : 0x00000000 A8 : 0x00000000 A9 : 0x00000000
A10 : 0x3ffc0b08 A11 : 0x00000001 A12 : 0x3ffca2b8 A13 : 0x00000000
A14 : 0x00000000 A15 : 0x00000000 SAR : 0x00000004 EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xfffffffe

Backtrace: 0x400832e3:0x3ffda3f0 0x4008343d:0x3ffda410 0x40086af0:0x3ffda430 0x40086b31:0x3ffda450 0x40081a99:0x3ffda470 0x4000beaf:0x3ffda490 0x400df883:0x3ffda4b0
0x400832e3: prvInsertBlockIntoFreeList at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410

0x4008343d: pvPortMallocTagged at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/freertos/./heap_regions.c:410

0x40086af0: pvPortMallocCaps at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/esp32/./heap_alloc_caps.c:372

0x40086b31: pvPortMalloc at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/esp32/./heap_alloc_caps.c:306

0x40081a99: _malloc_r at /Users/steveofthestow/bin/espressif/esp32/esp-idf/components/newlib/./syscalls.c:27

0x400df883: uart_event_task at /Users/steveofthestow/Dev/src/External/steve-of-the-stow-esp-mbedtls-sample/main/./alpha_app_uart_connection.c:39

@SteveOfTheStow
Author

@HubbyGitter Thanks for the tip. Not sure I could allocate everything before starting WiFi, especially bits that depend on it.

@HubbyGitter

@SteveOfTheStow To clarify, my suggestion was meant for helping to find the root cause, not as a solution for a working system.

@SteveOfTheStow
Author

Fair. It was actually fine to swap the order of starting the SSL/UART connectivity and the WiFi, but it still broke in the same place, alas.

@projectgus
Contributor

projectgus commented Aug 21, 2017

To work around this bug, remove the #include "mbedtls/config.h" line from your project's source files. You will need to move #include "mbedtls/ssl.h" to the top of the list of mbedTLS headers to avoid errors in some of the other headers (like mbedtls/certs.h).

Why is this a bug? IDF ships its own mbedTLS config header, esp_config.h. This header is correctly included recursively from inside headers like mbedtls/ssl.h, but we still ship the default config header in "mbedtls/config.h". This means any source file which includes mbedtls/config.h directly gets the default configuration, but mbedTLS libraries are compiled with the esp_config.h configuration.

The size of various mbedTLS structures depends on the config items enabled. sizeof(mbedtls_ssl_context) is smaller under mbedtls/config.h than under esp_config.h, so the allocated buffer was overrun when mbedtls_ssl_init(ssl) was called.

This manifests as a crash due to heap corruption. The same memory corruption still happens when mbedtls_ssl_context is allocated on the stack; it just happens not to corrupt any stack memory in a way that causes a crash.
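
The mechanism can be reproduced in miniature with two views of the same structure. This is only a sketch: ctx_app_view and ctx_lib_view are hypothetical stand-ins for mbedtls_ssl_context as seen under mbedtls/config.h and under esp_config.h respectively, and the field names are invented for illustration.

```c
#include <stddef.h>

/* Layout seen by application code built with the default config
 * (hypothetical: an optional feature is disabled). */
typedef struct {
    int state;
    unsigned char in_buf[16];
} ctx_app_view;

/* Layout seen by the library built with the platform config
 * (hypothetical: the optional feature adds a field). */
typedef struct {
    int state;
    unsigned char in_buf[16];
    unsigned char extra_feature[32];
} ctx_lib_view;

/* The library's init zeroes the structure using ITS idea of the size. */
size_t lib_init_bytes_written(void)
{
    return sizeof(ctx_lib_view);
}

/* Bytes the init would write past an allocation sized with the
 * application's sizeof. Zero would mean the two layouts agree. */
size_t overrun_bytes(void)
{
    size_t allocated = sizeof(ctx_app_view);
    size_t written = lib_init_bytes_written();
    return written > allocated ? written - allocated : 0;
}
```

With these definitions overrun_bytes() is nonzero, which corresponds to the bytes that land on neighbouring heap metadata or on adjacent stack variables.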

I'm going to leave this Issue open for now because we should either remove the default config header, or provide a way to detect if it's accidentally included directly into a source file.
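
One generic way a library can detect this kind of mismatch at run time is a size handshake, where the caller passes its own sizeof and the library refuses to initialise if it differs. The sketch below shows the idea only; it is not what IDF or mbedTLS does, and lib_ctx, lib_ctx_size and lib_ctx_init_checked are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The context layout as compiled into the (hypothetical) library. */
typedef struct {
    int state;
    unsigned char in_buf[16];
    unsigned char extra_feature[32];
} lib_ctx;

/* Size of the context as the library sees it. */
size_t lib_ctx_size(void)
{
    return sizeof(lib_ctx);
}

/* Initialise only if the caller's idea of the size matches the library's.
 * Returns false (and writes nothing) on NULL or on a size mismatch. */
bool lib_ctx_init_checked(void *ctx, size_t caller_size)
{
    if (ctx == NULL || caller_size != lib_ctx_size()) {
        return false;
    }
    memset(ctx, 0, caller_size);
    return true;
}
```

A caller would write lib_ctx_init_checked(ctx, sizeof *ctx), so a source file compiled against the wrong config header fails loudly at init instead of corrupting memory.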

PS While looking for this I noticed another memory corruption bug here:
https://github.com/SteveOfTheStow/esp_mbedtls_test/blob/master/main/alpha_app_ssl_server.c#L340
taskName needs to be one byte longer to account for the terminating NUL byte. As written, strcat() overflows the buffer. This pattern seems to be repeated in at least one other place.
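
A sketch of the safer pattern, assuming the name is built by joining two strings: size the buffer as the combined length plus one, and let snprintf() enforce the bound. The linked code is not reproduced here, so make_task_name and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<prefix><suffix>" in a freshly allocated buffer.
 * Returns NULL if allocation fails; the caller frees the result. */
char *make_task_name(const char *prefix, const char *suffix)
{
    assert(prefix != NULL && suffix != NULL);

    size_t len = strlen(prefix) + strlen(suffix);
    char *name = malloc(len + 1);            /* +1 for the terminating '\0' */
    if (name == NULL) {
        return NULL;
    }
    /* snprintf never writes more than len + 1 bytes, terminator included. */
    snprintf(name, len + 1, "%s%s", prefix, suffix);
    return name;
}
```

Passing the buffer size explicitly means an off-by-one in the length calculation truncates the string rather than writing past the allocation.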

@projectgus
Contributor

Fix for the underlying issue is coming (including "mbedtls/config.h" directly will include the correct configuration.)

@SteveOfTheStow
Author

Many thanks!

@igrr igrr closed this as completed in 2c0ff0c Aug 28, 2017
turmary pushed a commit to Seeed-Studio/Seeed_Arduino_mbedtls that referenced this issue Jan 22, 2020
…" directly in program

Previously this resulted in a config mismatch between default config and esp_config.h

Closes espressif/esp-idf#711