significantly higher memory consumption and binary size with mbed 5.1 #2635

Closed
mfiore02 opened this Issue Sep 6, 2016 · 20 comments

10 participants
@mfiore02
Contributor

mfiore02 commented Sep 6, 2016

We are currently developing a platform using the STM32L151CC processor - 32kB of RAM and 256kB of flash. We added platform support to mbed 2.0 during the transition to 5.x and are now adding support to 5.x so we can make a pull request and have our platform enabled.

We've noticed significantly higher memory consumption in mbed 5.x vs mbed 2.0. This is a problem for us because our application (an AT parser and a LoRa stack) currently can't run due to the system running out of memory during initialization.

We've done some testing with a similar platform to ours, NUCLEO-L152RE. We took the same main.cpp and built it with mbed-os revision 2244, mbed 125 & rtos 121, and mbed 121 & rtos 117. All of these builds were using the mbed online compiler. The main from the app is below:

  #include "mbed.h"
  #include "rtos.h" // only in the mbed 2.0 version

  void func(void const*) {
      while (true) {
          osDelay(100);
      }
  }

  int main() {
      uint8_t* mem_buf = NULL;
      uint32_t mem_size = 0; 
      Thread t(func);

      while (true) {
          // malloc until we die
          mem_buf = new uint8_t[256];
          if (! mem_buf) {
              printf("malloc failed: %lu\r\n", mem_size);
          } else {
              mem_size += 256;
              printf("%lu\r\n", mem_size);
          }
      }
  }

Our results are as follows:

with mbed revision 121 mbed-rtos revision 117
binary size: 20kB
free memory: 74752

with mbed revision 125 mbed-rtos revision 121
binary size: 29kB
free memory: 66816

with mbed-os revision 2244
binary size: 37kB
free memory: 65792

When I compile our LoRa stack (offline) for our new platform with mbed 2.0, I'm able to allocate over 13kB before the system runs out of memory. When I compile (offline) with mbed 5.1, I'm only able to allocate 2kB.

I'm hoping there are some switches in mbed 5.x that I can flip to reduce memory consumption. Reducing the consumption of our application and stack will be time-consuming and is considered a last resort.

Any ideas or insight appreciated.
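
One caveat about the test program above: in standard C++ a plain `new` expression reports failure by throwing `std::bad_alloc` rather than returning a null pointer, so the `if (! mem_buf)` branch is not normally reached. Below is a minimal sketch of a probe that can observe the failure, assuming the toolchain's runtime supports the standard `new (std::nothrow)` form. `probe_heap` and its `max_blocks` cap are hypothetical names, added so the sketch also terminates on a desktop host:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical helper: allocate 256-byte blocks with the non-throwing form of
// new until an allocation fails or max_blocks is reached, then free them all.
// Returns the number of bytes that could be allocated.
std::size_t probe_heap(std::size_t max_blocks) {
    const std::size_t kBlock = 256;
    std::uint8_t* head = nullptr;  // most recently allocated block
    std::size_t total = 0;

    for (std::size_t i = 0; i < max_blocks; ++i) {
        std::uint8_t* p = new (std::nothrow) std::uint8_t[kBlock];
        if (p == nullptr) {
            break;  // allocation failed: stop probing
        }
        // Store the previous block's address inside the new block so that
        // no extra bookkeeping memory is needed.
        std::memcpy(p, &head, sizeof head);
        head = p;
        total += kBlock;
    }

    // Walk the chain backwards and release every block.
    while (head != nullptr) {
        std::uint8_t* prev = nullptr;
        std::memcpy(&prev, head, sizeof prev);
        delete[] head;
        head = prev;
    }
    return total;
}
```

On a target, `max_blocks` would be set larger than the RAM could ever hold, so the loop ends on the first failed allocation.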

@sg-

sg- Sep 7, 2016

Member

Some information about the different ROM sizes:

  • NUCLEO-L152RE with mbed revision 121 and mbed-rtos revision 117 used uARM as the default toolchain. This is the baseline.
  • NUCLEO-L152RE with mbed revision 125 and mbed-rtos revision 121 uses ARM as the default toolchain: ~9k increase.
  • mbed-os revision 2244 has some lingering test harness code that is getting linked in: ~8k increase. This will be fixed, and the increase will be reverted along with a few more optimizations.

"I'm able to allocate over 13kB before the system runs out of memory. When I compile (offline) with mbed 5.1, I'm only able to allocate 2kB."

The default stack size has increased from 128x4 to 512x4, so this will take a chunk of that, as will the fixed-size main stack. Also, GCC has some allocation for buffering in the stdlib. Lots of work is ongoing right now around memory optimizations. Thanks for the detailed report. If you have any more findings, please report them here!
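
To put the stack-size change in numbers: the sizes are given as a count of 4-byte words, so 128x4 is 512 bytes and 512x4 is 2048 bytes. A minimal sketch of that arithmetic follows; the helper `stack_bytes` and the constant names are hypothetical and not part of mbed:

```cpp
#include <cstdint>

// Hypothetical helper (not an mbed API): convert a stack size expressed in
// 4-byte words into bytes.
constexpr std::uint32_t stack_bytes(std::uint32_t words) {
    return words * 4u;
}

// Default thread stack before and after the change described above.
constexpr std::uint32_t kOldStackBytes = stack_bytes(128);  // 512 bytes
constexpr std::uint32_t kNewStackBytes = stack_bytes(512);  // 2048 bytes

// Extra RAM taken by every thread that relies on the default stack size.
constexpr std::uint32_t kExtraPerThread = kNewStackBytes - kOldStackBytes;

static_assert(kExtraPerThread == 1536,
              "each default-stack thread costs 1536 more bytes");
```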

@ciarmcom

ciarmcom Sep 7, 2016

Member

ARM Internal Ref: IOTMORF-452

@janjongboom

janjongboom Sep 8, 2016

Contributor

@mfiore02 There are three patches in the works which will bring memory consumption back to mbed 2.0-like levels (4.8K decrease in RAM for blinky), plus we're making the stack size configurable in #2646 so you can reclaim some memory there as well.

We'll be publishing a blog post with some background and hard numbers when merged, but I'll send you a draft.

@mfiore02

mfiore02 Sep 8, 2016

Contributor

@janjongboom that's great news! We've manually tweaked stack sizes for our platform for now, but it's good to know there's a global resolution coming!

@nuket

nuket Sep 8, 2016

Contributor

I've got similar out of memory issues using mbed 5.1 with an nRF51-DK (256KB flash / 32KB RAM).

Checking one app with a similar "malloc(64) until it dies" strategy, I get the following mbed_stats_heap_t output:

  • Start: Heap stats - alloc_cnt: 6, alloc_fail_cnt: 0, current_size: 5284, max_size: 5284, total_size: 5284
  • Stop: Heap stats - alloc_cnt: 26, alloc_fail_cnt: 0, current_size: 6564, max_size: 6564, total_size: 6564

Meaning that I only have about 1K of RAM left to use in my program.
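
As a cross-check of the two snapshots above, the growth in current_size matches 20 successful 64-byte allocations, which is 1280 bytes. A small sketch of that arithmetic follows; the `HeapSnapshot` struct is a local stand-in for illustration and is not the mbed_stats_heap_t definition itself:

```cpp
#include <cstdint>

// Local stand-in holding two of the counters printed above.
struct HeapSnapshot {
    std::uint32_t alloc_cnt;
    std::uint32_t current_size;
};

constexpr HeapSnapshot kStart = {6, 5284};
constexpr HeapSnapshot kStop  = {26, 6564};

// Allocations made, and bytes obtained, between the two snapshots.
constexpr std::uint32_t kAllocs = kStop.alloc_cnt - kStart.alloc_cnt;
constexpr std::uint32_t kBytes  = kStop.current_size - kStart.current_size;

static_assert(kAllocs == 20, "20 allocations succeeded");
static_assert(kBytes == kAllocs * 64u, "20 x 64 bytes = 1280 bytes");
```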

On another app I'm working on, the initial allocations succeed at boot time and tell me:

  • Heap stats - alloc_cnt: 11, alloc_fail_cnt: 0, current_size: 6872, max_size: 6872, total_size: 6872

But the minute I try one more new of any size, the system jumps to HardFault and prints "Operator new out of memory" on the serial debug port.

Hope these memory reduction patches make it to the mainline soon!

@janjongboom

janjongboom Sep 8, 2016

Contributor

@nuket Yeah, I've seen the same issues on the nRF51, but with these patches we managed to fit BLE + mbed TLS + RTOS + application into 32K again, similar profile as on mbed 2.

@nuket

nuket Sep 9, 2016

Contributor

@janjongboom This sounds pretty excellent.

@nuket

nuket Sep 20, 2016

Contributor

@janjongboom Are the patches associated with this stable / getting closer to being merged?

@janjongboom

janjongboom Sep 20, 2016

Contributor

ping @pan-

@pan-

pan- Sep 20, 2016

Member

I've made three different PRs to solve some of the issues we currently have with memory consumption: #2715, #2741 and #2745.
Hopefully they will be merged soon, but they will not fix everything.

Here are some good practices to optimize the application for size:

  • Avoid the standard IO subsystem (apply #2715 and #2741, and compile with NDEBUG). This saves 2K of RAM and 14K of flash with ARMCC (online compiler).
  • If you have a lot of static global objects, apply #2745 to reduce the number of memory allocations needed for destruction.
  • Allocate statically as much as you can.
  • Allocate small objects using a pool allocator to avoid fragmentation of the main heap.
  • Adjust the parameters of the OS to your application. All the parameters of the OS are defined in this file. You can override them by putting a file named mbed_app.json at the root of your application. This file contains the application configuration (more details here). Every define can be overridden in the macros list:
{ 
  "macros": [ 
    "NDEBUG=1",
    "OS_TASKCNT=XXX",
    "..."
  ]
}

The main macros to override are:

  • OS_TASKCNT: The number of tasks running in parallel in the application.
  • OS_IDLESTKSIZE: The size of the stack for the idle task, in words of 4 bytes. It can be shrunk to 32.
  • OS_STKSIZE: Can be shrunk to 1 if you manually provide the buffer used for each thread's stack.
  • OS_MAINSTKSIZE: The size of the main thread's stack, in words of 4 bytes.
  • OS_TIMERS: 1 or 0, to enable or disable the timer thread.
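
For the pool-allocator suggestion in the list of good practices above, a minimal sketch of a fixed-size block pool might look like the following. The `FixedPool` class is hypothetical (it is not an mbed API), it is not thread-safe, and it assumes a C++11 compiler for `alignas` and `nullptr`:

```cpp
#include <cstddef>

// Hypothetical fixed-size block pool: N slots, each big enough for one T.
// All storage lives inside the pool object, so a statically allocated pool
// never touches the main heap and cannot fragment it.
template <typename T, std::size_t N>
class FixedPool {
public:
    FixedPool() : free_list_(nullptr) {
        // Thread every slot onto the free list.
        for (std::size_t i = 0; i < N; ++i) {
            slots_[i].next = free_list_;
            free_list_ = &slots_[i];
        }
    }

    // Returns raw storage for one T, or nullptr when the pool is exhausted.
    void* alloc() {
        if (free_list_ == nullptr) {
            return nullptr;
        }
        Slot* s = free_list_;
        free_list_ = s->next;
        return s;
    }

    // Returns a block previously obtained from alloc() to the pool.
    void free(void* p) {
        Slot* s = static_cast<Slot*>(p);
        s->next = free_list_;
        free_list_ = s;
    }

private:
    // A free slot stores the link to the next free slot; a slot in use
    // holds the caller's object in the same bytes.
    union Slot {
        Slot* next;
        alignas(T) unsigned char storage[sizeof(T)];
    };

    Slot slots_[N];
    Slot* free_list_;
};
```

Objects would typically be constructed in the returned storage with placement new and destroyed explicitly before the block is handed back with `free()`.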
@mfiore02

mfiore02 Oct 3, 2016

Contributor

@pan- I see these fixes have been merged into the master branch. Looks like we should expect to see them in mbed-os-5.2?

Is there a release date for mbed-os-5.2?

@sg-

sg- Oct 3, 2016

Member

"Is there a release date for mbed-os-5.2?"

Estimated mid-October.

@bogdanm

bogdanm Oct 27, 2016

Contributor

5.2 was released. @mfiore02, can you please check if this issue still exists with 5.2? For reference, the release note for 5.2 is here

@EduardPon

EduardPon Nov 7, 2016

Hello,

mbed 5.2.1 uses even more RAM than 5.1. I have to reduce the heap size to make room for the static part.

See the build output below:

mbed5.1
        "summary": {
            "total_flash": 750339,
            "static_ram": 75720,
            "stack": 512,
            "heap": 120308,
            "total_ram": 196540
        }
mbed5.2.1
        "summary": {
            "total_flash": 643074,
            "static_ram": 77944,
            "stack": 0,
            "heap": 118340,
            "total_ram": 196284
        }
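
To make the comparison concrete, the sketch below checks that each summary adds up (static_ram + stack + heap equals total_ram) and computes the change from 5.1 to 5.2.1. The struct and helper names are hypothetical; only the numbers come from the build output above:

```cpp
#include <cstdint>

// Values copied from the two build summaries above, in bytes.
struct MemSummary {
    std::uint32_t total_flash;
    std::uint32_t static_ram;
    std::uint32_t stack;
    std::uint32_t heap;
    std::uint32_t total_ram;
};

constexpr MemSummary k51  = {750339, 75720, 512, 120308, 196540};
constexpr MemSummary k521 = {643074, 77944, 0, 118340, 196284};

// True when the RAM parts of a summary add up to its reported total.
constexpr bool consistent(const MemSummary& m) {
    return m.static_ram + m.stack + m.heap == m.total_ram;
}

// Signed change of one value between two builds.
constexpr std::int32_t delta(std::uint32_t from, std::uint32_t to) {
    return static_cast<std::int32_t>(to) - static_cast<std::int32_t>(from);
}

static_assert(consistent(k51), "5.1 summary adds up");
static_assert(consistent(k521), "5.2.1 summary adds up");

constexpr std::int32_t kStaticRamDelta = delta(k51.static_ram, k521.static_ram);
constexpr std::int32_t kHeapDelta      = delta(k51.heap, k521.heap);
constexpr std::int32_t kFlashDelta     = delta(k51.total_flash, k521.total_flash);

static_assert(kStaticRamDelta == 2224, "static RAM grew by 2224 bytes");
static_assert(kHeapDelta == -1968, "heap shrank by 1968 bytes");
static_assert(kFlashDelta == -107265, "flash shrank by 107265 bytes");
```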

@janjongboom

Contributor

janjongboom commented Nov 7, 2016

@pan-

pan- Nov 7, 2016

Member

@EduardPon

Could you list the mbed modules used by your application and post your configuration file?

@EduardPon

EduardPon Nov 7, 2016

I assume you mean the features in the target.json file:

{
    "Target": {
        "core": null,
        "default_toolchain": "ARM",
        "supported_toolchains": null,
        "extra_labels": [],
        "is_disk_virtual": false,
        "macros": [],
        "device_has": [],
        "features": [],
        "detect_code": [],
        "public": false,
        "default_lib": "std"
    },
    "K64F": {
        "supported_form_factors": ["ARDUINO"],
        "core": "Cortex-M4F",
        "supported_toolchains": ["ARM", "GCC_ARM", "IAR"],
        "extra_labels": ["Freescale", "KSDK2_MCUS", "FRDM", "KPSDK_MCUS", "KPSDK_CODE", "MCU_K64F"],
        "is_disk_virtual": true,
        "macros": ["CPU_MK64FN1M0VMD12", "FSL_RTOS_MBED", "MBEDTLS_ENTROPY_HARDWARE_ALT"],
        "inherits": ["Target"],
        "progen": {"target": "frdm-k64f"},
        "detect_code": ["0240"],
        "device_has": ["ANALOGIN", "ANALOGOUT", "ERROR_RED", "I2C", "I2CSLAVE", "INTERRUPTIN", "LOWPOWERTIMER", "PORTIN", "PORTINOUT", "PORTOUT", "PWMOUT", "RTC", "SERIAL", "SERIAL_FC", "SLEEP", "SPI", "SPISLAVE", "STDIO_MESSAGES", "STORAGE", "TRNG"],
        "features": ["IPV4", "STORAGE"],
        "release_versions": ["2", "5"]
    },
    "OUR_TARGET_K64F_IPV6": {
        "supported_form_factors": ["ARDUINO"],
        "core": "Cortex-M4F",
        "default_toolchain": "GCC_ARM",
        "extra_labels": ["Freescale", "KSDK2_MCUS", "FRDM", "KPSDK_MCUS", "KPSDK_CODE", "MCU_K64F", "K64F", "OUR_TARGET", "WIRED_IPV6"],
        "is_disk_virtual": true,
        "inherits": ["K64F"],
        "progen": {"target": "frdm-k64f"},
        "detect_code": ["0240"],
        "features": ["NANOSTACK", "NANOSTACK_FULL"],
        "release_versions": ["2", "5"],
        "device_name": "MK64FN1M0xxx12"
    }
}

@JojoS62

JojoS62 Nov 7, 2016

Contributor

Running tools/memap.py with the -d option gives nicely readable output showing how much memory each module consumes. See also https://github.com/ARMmbed/mbed-os/blob/master/docs/memap.md
I also have trouble with memory space: DEFAULT_STACK_SIZE is causing a problem for me, resulting in a 4 kB block being allocated for thread_stack_main. But I will open a new issue for this.

@mfiore02

mfiore02 Feb 9, 2017

Contributor

@bogdanm This has been on my list and I finally got to it. I ran the same test as in the original post using the latest mbed-os: 5.3.4/rev 2740. Compiled online for NUCLEO-L152RE.
binary size: 30.3kB
free memory: 66816

So flash consumption has gone down significantly, but RAM consumption has only gone down by about 1kB.

@ChiefBureaucraticOfficer

ChiefBureaucraticOfficer Oct 27, 2017

GitHub issue review: Closed due to inactivity. Please re-file if critical issues are found.
