
Lua freezing the main thread #1643

Closed
kilrah opened this issue Aug 24, 2014 · 42 comments


@kilrah
Member

kilrah commented Aug 24, 2014

I received a message from a user on the French forum who has an issue with this script:
http://andrebernet.ch/divers/predimrc.lua

He says the script runs fine, until you exit it whether it is with the EXIT short press as he implemented it, or with the system's EXIT long. No problem on the simu, but on the radio it locks up the main thread and requires removing the battery to turn the radio off.

@kilrah kilrah added this to the OpenTX 2.0.9 milestone Aug 24, 2014
projectkk2glider added a commit that referenced this issue Aug 24, 2014
* killed a bunch of simu memory leaks
* added a bunch of TRACE() in lua
* fixed some potential bugs in lua.cpp:
  * missing free of sid.background
  * call luaL_unref() only if argument > 0
  * free of init was before its call
  * luaGetMemUsed() returned wrong value
* added Lua shutdown (main reason is to enable checking of simu traces)
* added python script trace-parse.py that checks (de)allocations using simu output traces
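The checker's core idea - pair each traced allocation with its matching free and flag anything left over - can be sketched in C as follows (a hypothetical illustration of the approach, not the actual trace-parse.py; all names are invented):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 1024

/* Minimal allocation tracker: records live pointers and flags
   frees that have no matching allocation. */
static void *live[MAX_EVENTS];
static int live_count = 0;
static int problems = 0;

void trace_alloc(void *p)
{
  if (p && live_count < MAX_EVENTS)
    live[live_count++] = p;
}

void trace_free(void *p)
{
  for (int i = 0; i < live_count; i++) {
    if (live[i] == p) {
      live[i] = live[--live_count];  /* remove by swapping with the last entry */
      return;
    }
  }
  problems++;  /* free without a matching allocation */
}

int trace_remaining(void) { return live_count; }
int trace_problems(void)  { return problems; }
```

At exit, `trace_remaining()` corresponds to the "Remaining allocations" line and `trace_problems()` to "Detected problems" in the reports below.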
@projectkk2glider
Member

This sounds very similar to the problem I had in #1597. It was related to linking the firmware with newlib-nano.

I tested this script yesterday with my own build of OpenTX (with normal newlib), and the worst case is that I get a warning that the script can't be loaded for lack of RAM. The system never locks up. The strange thing is that the problem with this script does not always happen.

So the problem looks twofold:

  • the OpenTX firmware has some bug with RAM freeing
  • newlib-nano causes even more havoc

I certainly see no problem within the attached script. But in any case a bad script should not crash the whole system! You should really open an issue for this.

@projectkk2glider
Member

I have made some changes and wrote an allocator/deallocator parser that checks the validity of realloc/free calls.

Using commit b15ce6f and standalone simu I get the following results:

Only simulator run, without any scripts, just start simu, go to main screen and then close it:

/opentx/radio/src $ ./simu | tee simu_minimum.log | ../util/trace-parse.py
Mem usage report:
    parsed 722 lines
    number of allocations: 350
    number of free: 350
    allocated RAM peak size: 16355
Remaining allocations: 0
Detected problems: 0

In addition to above, a simple stand-alone script was run:

/opentx/radio/src $ ./simu | tee simu_one_standalone_script.log | ../util/trace-parse.py
Mem usage report:
    parsed 2625 lines
    number of allocations: 1304
    number of free: 1304
    allocated RAM peak size: 16640
Remaining allocations: 0
Detected problems: 0

Like first case, but offending script from this issue was run, all pages were displayed, no values changed:

/opentx/radio/src $ ./simu | tee simu_issue_1643.log | ../util/trace-parse.py
Mem usage report:
    parsed 17408 lines
    number of allocations: 9213
    number of free: 9213
    allocated RAM peak size: 71152
Remaining allocations: 0
Detected problems: 0

My findings:

  • Lua is a malloc/free hog: just initializing our Lua environment and registering our functions and constants requires 350 memory allocations!
  • things get worse when the actual script is run; every run cycle adds some more (de)allocations
  • Lua itself is clean - its realloc/free usage is correct, no problems there
  • I wonder what the newlib(-nano) allocator does with all those allocations. Maybe we are seeing some bug in it, or we just get a lot of heap fragmentation.
  • the peak RAM usage for the script from this issue on the PC is 71 kB! It is probably less on the Taranis, but it is still high. In any case, if a script uses too much RAM it should be disabled, not crash OpenTX.

@projectkk2glider
Member

One idea: this post (last reply) http://www.freertos.org/FreeRTOS_Support_Forum_Archive/November_2009/freertos_sbrk_and_newlib_-_is_any_malloc_or_free_used_3447091.html says that heap growth must be aligned to an 8-byte boundary. We had this problem before, where the menusStack stack was not 8-byte aligned, and that caused printf() problems for the double type (#1379).
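If alignment is indeed the culprit, the usual remedy is to round every heap increment up to the next 8-byte boundary inside _sbrk(). A hedged sketch of the idea, using a static array in place of the real heap region (the function name sbrk_aligned and the sizes are invented for illustration, not OpenTX's actual code):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical _sbrk()-style grower that rounds every increment up
   to an 8-byte boundary (the AAPCS requires 8-byte alignment for
   doubles and long longs on Cortex-M). */
#define HEAP_SIZE 1024
static _Alignas(8) unsigned char heap_area[HEAP_SIZE];
static size_t heap_used = 0;

void *sbrk_aligned(int nbytes)
{
  size_t inc = ((size_t)nbytes + 7) & ~(size_t)7;  /* round up to 8 */
  if (heap_used + inc > HEAP_SIZE) {
    errno = ENOMEM;
    return (void *)-1;
  }
  void *prev = heap_area + heap_used;
  heap_used += inc;
  return prev;
}
```

Because every increment is rounded, each region handed out starts on an 8-byte boundary regardless of the sizes requested.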

@bsongis
Member

bsongis commented Aug 24, 2014

The RAM usage on simu will always be bigger, as it uses 64-bit pointers instead of 32-bit

@kilrah
Member Author

kilrah commented Aug 25, 2014

The OP found that when he removed all the concatenation operators, the script no longer froze on exit:
http://andrebernet.ch/divers/predimrc_nocat.lua

But he now has occasional "not enough memory" messages when exiting/re-executing it; trying again works.

Now that Bertrand is back, the thread on the French forum is here:
http://frskytaranis.forumactif.org/t1141-presque-resolu-probleme-avec-un-script-la-radio-freeze

@projectkk2glider
Member

I will run the new version of the script through simu and the parser to compare with the old version. But my hunch is that the new script uses a little less memory. I guess that OpenTX somehow allows too much memory to be allocated and that then causes the crash. We should double-check the validity of _sbrk().

But he now has occasional "not enough memory" messages when exiting/re-executing it; trying again works.

This looks to me like either the Lua garbage collector did not finish its job OR the newlib allocator did not really release the RAM.

@projectkk2glider
Member

Did some additional runs of the original script with a self-built commit 69aef74.

The strange thing is that sometimes I can run this script many times without any problem. When I then check the script's memory usage (with MENU long press while it is running) I get around 39 kB. Free memory after the script is closed (on the statistics page) is reported at about 12 kB.

Then after a power cycle I get the script to run once, and the next execution returns "not enough memory". I then tried at least 10 times and got the same error in all cases. I even ran another short script in between; that one ran without any problem. Free mem in this case was reported at only 8 kB.

So the newlib allocator is definitely not (always) releasing RAM (via the _sbrk() function) when Lua releases it. Sometimes it does, but with this script it mostly does not. That is why we see large RAM usage (only 12 kB free) while this large script is still able to run. This is likely not a problem in itself, since the allocator hopefully reuses this RAM for new allocations.

It is looking more and more like some particular accumulated allocation size (one still allowed by _sbrk()) causes problems. On the other hand, if the allocation is bigger, it gets rejected by _sbrk() and we just get "not enough memory".

_sbrk() reserves 4 kB at the top of the RAM and does not allow the heap to grow into it. That looks OK, but that region is the stack for the main thread and for interrupts. Do we have enough? I am not very familiar with the startup files for this CPU. Where is the initial stack pointer set? To which value? Where is the stack for interrupt handlers? Do we still have a main thread (using this stack) after all the other threads are running?

@kilrah
Member Author

kilrah commented Aug 26, 2014

Next report from the script author, he moved variable declarations from within individual functions to script level to avoid allocation / deallocation at every function call, and the script no longer crashes.

Seems to support that the problem comes with improper deallocation.

@bsongis
Member

bsongis commented Aug 26, 2014

This function needs to be checked; I am not sure that RAM_END-4096 is right:

extern caddr_t _sbrk(int nbytes)
{
  if (heap + nbytes < RAM_END - 4096) {
    unsigned char *prev_heap = heap;
    heap += nbytes;
    return (caddr_t) prev_heap;
  }
  else {
    errno = ENOMEM;
    return ((caddr_t) -1);
  }
}

@bsongis
Member

bsongis commented Aug 27, 2014

This is an awesome script, I didn't think it would be possible !!!

@projectkk2glider
Member

Bertrand, RAM end is where the main stack (MSP register) lives. This stack is used by the main function and all interrupt (trap) handlers. I sifted through the source and couldn't find where such an amount of stack could be used.

Could we paint this stack also and then measure its usage?

@bsongis
Member

bsongis commented Aug 28, 2014

Right, but the main function shouldn't use too much stack; as soon as the RTOS is started, each task has its own stack.

That said, I also tried increasing the menus stack to 3000 words and the script also worked. I don't get how this is possible. I thought it would be worse.

@projectkk2glider
Member

That said, I also tried increasing the menus stack to 3000 words and the script also worked. I don't get how this is possible. I thought it would be worse.

I think I can explain that:

  • let's say we have 100 kB of RAM for the heap
  • the allocator takes heap in chunks
  • if 90 kB of the heap is already allocated and another chunk is needed, two things can happen:
    • if the requested chunk is less than 10 kB the allocation succeeds, but this last allocation could spill into the main stack, for example
    • if the requested chunk is more than 10 kB it fails - we get a "no memory" error

Now, when you changed the heap size by adding stack to the menus thread, you changed the size of the last free heap chunk. It just happens that this new heap size causes different behavior: the heap fails to grow before the last part of it - the problematic part - gets allocated.

So there is always a chance to either allocate the last part of the heap and get a stack overwrite, or not allocate it and fail with a "not enough RAM" error.

The question is what is in the last part of the RAM and how big it is. Do you have an open Taranis with an SWD connection? You could inspect the contents of the RAM up there. Mine is closed and used for flying, so I don't want to do that.

@bsongis
Member

bsongis commented Aug 28, 2014

I would prefer having a "not enough memory" message after my change; it would be obvious we were writing on the main stack. But this is not the case. Perhaps the malloc tries to get a big chunk and, when that fails, mallocs a smaller chunk, but I doubt it does this.

I don't have the SWD connector on my Plus radio; I would have to solder it.

@projectkk2glider
Member

We were contemplating this on wrong assumptions!

I have looked over the source code of newlib and newlib-nano allocators from https://launchpad.net/gcc-arm-embedded/4.7/4.7-2013-q3-update/+download/gcc-arm-none-eabi-4_7-2013q3-20130916-src.tar.bz2

The facts for newlib-nano, which (I strongly believe) is used on the official build server:

  • once heap memory is taken with _sbrk() with a positive parameter value, it is never released back to the system via _sbrk() with a negative parameter value
  • freed memory is stored inside the allocator in a linked list of free chunks
  • subsequent malloc() calls reuse these free chunks if possible; if none is big enough, more memory is requested from the system with _sbrk()

The facts for newlib (which I am using most of the time):

  • this one will release heap back to the system with _sbrk() with a negative parameter value, but only if either is true:
    • the topmost internal free chunk is bigger than 128 kB - i.e. never in our case, since we don't have such a big heap (this could be tuned with mallopt())
    • malloc_trim() is called - which again never happens in OpenTX

Therefore our statistic that shows free memory is misleading for both allocators! Currently it just shows the minimum free memory ever seen.

@projectkk2glider
Member

In view of my previous comment, the incident in issue #1597 (comment) now looks very similar to this one.

@projectkk2glider
Member

Also, the measurement of the bit32 library RAM usage in #1626 (comment) is questionable.

@bsongis
Member

bsongis commented Aug 28, 2014

This is right; inside OpenTX _sbrk(..) is never called with a negative argument. Anyhow, that would be completely wrong - only the last block could be freed this way. But doesn't that change our assumptions? We have to understand why the stack grows bigger than 4 kB, unless we have something really completely wrong with the allocator - I hope not!

@projectkk2glider
Member

Bad news. I have implemented a main stack free check in commit a78c2b8 and got:

  • stack size 0x2000 = 8192 bytes
  • minimum free seen: 7908 bytes

This means that the main stack (interrupts stack) is only using 284 bytes. On one hand this is good - we have normal stack usage - but on the other hand we still have the original problem.

@bsongis Reducing the max heap size (growing the main stack reserve from 4 kB to 8 kB) just hid your problem because the heap size changed. We still have the original problem, and it is not main stack corruption!
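Stack painting, as used for this check, fills the stack region with a known pattern at startup and later scans for the first overwritten word. A minimal model of the technique (names, sizes, and the simulated buffer are hypothetical, not OpenTX's actual implementation):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 64
#define PAINT 0x55555555u

/* Simulated stack region; real firmware would paint the actual
   main-stack area between its linker-defined bounds. */
static uint32_t stack_area[STACK_WORDS];

void stack_paint(void)
{
  for (size_t i = 0; i < STACK_WORDS; i++)
    stack_area[i] = PAINT;
}

/* Stacks grow downward, so index 0 is the deepest address: the free
   space is the run of untouched paint words from the bottom up. */
size_t stack_free_words(void)
{
  size_t free_words = 0;
  while (free_words < STACK_WORDS && stack_area[free_words] == PAINT)
    free_words++;
  return free_words;
}
```

The minimum free value reported above would be `stack_free_words() * 4` sampled over the lifetime of the firmware.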

@projectkk2glider
Member

Since this issue never causes a watchdog reboot, it is clear that the main stack does not get corrupted. That would cause an immediate reset.

I am also 99% sure that the libc allocator is not the main culprit.

That leaves us with Lua code and our code in particular the interface to Lua.

The Lua code is presumed clean, but to prove it I will run a bunch more tests in the standalone simu and check the malloc/free output with the checker script.

So in my view our Lua interface code is the main suspect in this case...

@bsongis
Member

bsongis commented Sep 1, 2014

I checked; valgrind doesn't see anything wrong.

@projectkk2glider
Member

I also found another issue in the way the newlib-nano allocator works. It is bad in our case, but can be easily fixed. More tomorrow...

@bsongis
Member

bsongis commented Sep 1, 2014

Nice catch, thanks!

projectkk2glider added a commit that referenced this issue Sep 2, 2014
* re-enabled usage of newlib-nano with cmdline option NANO=YES
* added dump of newlib-nano allocator free chunks list
* added newlib-nano allocator fix if cmdline option FIX_NEWLIB_NANO_ALLOCATOR=YES
@projectkk2glider
Member

Commit 0f9a02a adds debug traces and can be used to see the free-memory fragmentation.

Unfortunately I can't reproduce the newlib-nano allocator failure. I was getting it on the next branch when I had this last commit applied there. It looks like my changes to the Lua code on this branch have significantly reduced free-memory fragmentation so far. I think the Lua stack reduction is the most significant one.

Newlib-nano allocator fragmentation report

The data presented here comes from the next branch with commit 0f9a02a applied. The first part has FIX_NEWLIB_NANO_ALLOCATOR=NO. The data listed here is from the Taranis serial port in Debug mode.

When the Taranis shows the main screen and no model scripts are defined, we get this (Lua environment initialized for the first time):

GC Use: 11020bytes
eeprom write general
mallinfo:
           864 0x2000cc18[12]
          1128 0x2000d08c[20]
           524 0x2000d2ac[20]
           648 0x2000d548[16]
           400 0x2000d6e8[20]
           416 0x2000d89c[16]
           248 0x2000d9a4[12]
           432 0x2000db60[12]
            80 0x2000dbbc[28]
          1720 0x2000e290[28]
           256 0x2000e3ac[28]
          3040 0x2000efa8[20]
           624 0x2000f22c[20]
           556 0x2000f46c[680]
           640 0x2000f994[1264]
            36 0x2000fea8[1160]
          1988 0x20010af4[3912]
        Total size: 20868
        Free size:  7268
        Used size:  13600

Now we start some stand-alone script from the SD manager. First the Lua state gets closed, and we can see that all memory is freed but remains in the possession of the allocator:

after lua_close()
mallinfo:
             0 0x2000c8b8[20868]
        Total size: 20868
        Free size:  20868
        Used size:  0

Now a new Lua state is created and the script is loaded; we get much more memory fragmentation:

mallinfo:
           352 0x2000ca18[16]
           784 0x2000cd38[16]
           388 0x2000cecc[16]
           360 0x2000d044[28]
           656 0x2000d2f0[20]
           284 0x2000d420[12]
           280 0x2000d544[28]
           240 0x2000d650[28]
           520 0x2000d874[84]
            36 0x2000d8ec[140]
            92 0x2000d9d4[64]
            64 0x2000da54[72]
            72 0x2000dae4[40]
           184 0x2000dbc4[36]
           112 0x2000dc58[12]
           176 0x2000dd14[208]
           128 0x2000de64[96]
           844 0x2000e210[64]
          1412 0x2000e7d4[20]
           152 0x2000e880[124]
           196 0x2000e9c0[40]
           144 0x2000ea78[228]
          1108 0x2000efb0[64]
            72 0x2000f038[416]
           156 0x2000f274[520]
            32 0x2000f49c[516]
            32 0x2000f6c0[492]
            32 0x2000f8cc[472]
            36 0x2000fac8[448]
            36 0x2000fcac[428]
            40 0x2000fe80[404]
            32 0x20010034[384]
            32 0x200101d4[360]
            32 0x2001035c[340]
            32 0x200104d0[316]
            36 0x20010630[296]
            32 0x20010778[272]
            32 0x200108a8[252]
            32 0x200109c4[228]
            32 0x20010ac8[208]
            32 0x20010bb8[184]
            36 0x20010c94[164]
            32 0x20010d58[140]
            40 0x20010e0c[120]
            32 0x20010ea4[96]
            32 0x20010f24[76]
            40 0x20010f98[52]
            32 0x20010fec[80]
           292 0x20011160[264]
          1004 0x20011654[136]
            40 0x20011704[28]
            40 0x20011748[368]
          1420 0x20011e44[2000]
          2608 0x20013044[1968]
        Total size: 28476
        Free size:  13484
        Used size:  14992

Still, nothing particularly alarming. Now we close and open the same script several times. Each time all allocated memory is freed by Lua, but each time the newlib-nano allocator finishes with a bigger heap size: each script run uses a little more heap memory. For example, outputs after script exit, when the Lua state is closed:

mallinfo:
             0 0x2000c8b8[28476]
        Total size: 28476
        Free size:  28476
        Used size:  0
GC Use: 11020bytes
after lua_close()
mallinfo:
             0 0x2000c8b8[35204]
        Total size: 35204
        Free size:  35204
        Used size:  0

mallinfo:
             0 0x2000c8b8[42900]
        Total size: 42900
        Free size:  42900
        Used size:  0

Eventually all available memory gets used by the allocator (around 70 kB in my case), and then an error happens when initializing the Lua state. Here we try to push the constant MIXSRC_SF, but at that point malloc fails and Lua calls panic(), since its function was not called in a protected environment. MainTask ends up in _exit(), which is an endless loop:

realloc FAILURE 1430
mallinfo:
             0 0x2000c8c0[980]
          1488 0x2000d264[1396]
            36 0x2000d7fc[1372]
            36 0x2000dd7c[1352]
            36 0x2000e2e8[1328]

 <about 100 lines deleted!!!>

             40 0x2001d0e8[52]
            32 0x2001d13c[80]
           292 0x2001d2b0[264]
          1004 0x2001d7a4[136]
        Total size: 70348
        Free size:  58292
        Used size:  12056
PANIC: unprotected error in call to Lua API (MIXSRC_SF)
                                                       Exiting with status 1.

The list of free chunks above is truncated, but there really is no chunk large enough (1430 bytes) to satisfy the malloc request. The memory is just too badly fragmented.

Why this happens is explained in the next post.

@projectkk2glider
Member

Newlib-nano allocator problem

Why is our Lua state initialization causing free-memory fragmentation with the newlib-nano allocator? It is caused by the newlib-nano allocator algorithm in combination with Lua's memory usage pattern.

Case 1: first run

The heap size is zero, so when the allocator gets a malloc request it requests more heap using _sbrk() and returns this new region. Memory is thus allocated from the start of the heap towards the end.

Case 2: second run

First, all memory allocated by Lua is freed, and the allocator is left with one big chunk of free memory corresponding to the peak memory usage of the previous run.

When a new malloc request comes in and it is smaller than the size of the free chunk, then:

  • the free chunk is split: malloc uses the end part, the rest is put back on the free chunk list.

So this means that memory is allocated from the end towards the beginning whenever the request can be satisfied from the free list. This is the opposite of case 1! This behavior, in combination with Lua's malloc/free pattern, causes massive fragmentation of the allocator's free space and in the end causes the allocator to request more heap, because it cannot satisfy all malloc requests from the free list (even if the net malloc usage is the same).

This pattern then repeats each time the Lua state is initialized: each time more heap is needed, but the allocated memory size stays the same! When the max heap size is reached, we get a Lua panic().
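The split-from-end behavior described above can be shown with a toy model of a single free chunk (a simplified illustration of the direction of reuse, not newlib-nano's actual code):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the chunk split: when a malloc request fits an
   existing free chunk, the END of the chunk is handed out and the
   remainder stays on the free list, so reuse fills memory top-down
   while fresh _sbrk() growth fills it bottom-up. */
typedef struct {
  size_t start;  /* offset of the free chunk in the heap */
  size_t size;   /* bytes remaining in the free chunk */
} chunk_t;

/* Carve `n` bytes from the end of free chunk `c`; returns the
   offset of the allocated block. */
size_t split_from_end(chunk_t *c, size_t n)
{
  assert(n <= c->size);
  c->size -= n;
  return c->start + c->size;
}
```

Each successive allocation lands below the previous one, which is why holes accumulate when long-lived and short-lived Lua objects interleave.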

Possible solution

Just to confirm, I introduced an ugly hack that, when enabled with FIX_NEWLIB_NANO_ALLOCATOR=YES, resets the newlib-nano allocator and the heap size to zero each time all memory is freed. This completely fixes the above problem, since each run now begins from the same state of unused heap.

But this fix is not for production, since it is highly dependent on the allocator implementation, and that could change in future versions of newlib.

My current idea is to further analyze Lua's malloc/free pattern and then see what can be improved. Ideas:

  • I already optimized the Lua stack reduction, which looks like it reduced fragmentation massively
  • small malloc/free requests could be satisfied from our own allocator with bins of fixed-size slots; the rest of the allocations would still go to the newlib allocator
  • dump the newlib allocator entirely and use our own for Lua; we could just take the current newlib-nano allocator and change it with my FIX_NEWLIB_NANO_ALLOCATOR=YES fix
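The fixed-size-slot idea could look roughly like this (a hypothetical sketch, not the BinAllocator that was later committed; all names and sizes are invented): small requests come from a pool of equal slots tracked by a used map, everything else falls through to the system allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bin allocator: NUM_SLOTS fixed-size slots with a
   used[] map; requests that don't fit fall back to malloc(). */
#define SLOT_SIZE 32
#define NUM_SLOTS 8

static unsigned char pool[NUM_SLOTS][SLOT_SIZE];
static unsigned char used[NUM_SLOTS];

void *bin_alloc(size_t n)
{
  if (n <= SLOT_SIZE) {
    for (int i = 0; i < NUM_SLOTS; i++) {
      if (!used[i]) {
        used[i] = 1;
        return pool[i];
      }
    }
  }
  return malloc(n);  /* too big, or all slots taken */
}

/* Returns 1 if the pointer was a slot, 0 if it was malloc'd. */
int bin_free(void *p)
{
  for (int i = 0; i < NUM_SLOTS; i++) {
    if (p == pool[i]) {
      used[i] = 0;
      return 1;
    }
  }
  free(p);
  return 0;
}
```

Since the slots live in a static array, churning small Lua objects through them never touches the newlib free list, which is where the fragmentation builds up.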

I need a little guidance here. @bsongis, what are your thoughts?

@bsongis
Member

bsongis commented Sep 2, 2014

Thanks for your analysis. I was aware of those fragmentation problems, and that's why the RAM is completely freed each time we are allowed to do so (model load, change in model scripts, etc.) in luaInit() - at least I hope the RAM is completely freed; it seems so, there shouldn't remain any malloc in use.

I also checked that a script produces a 'not enough memory available' error when a malloc fails, but obviously I missed a case, and according to what you write above, that case produces an abort(), which is really wrong.

So we have many problems to solve and I think they have to be solved in this order:

  1. the highest-priority one: avoid this abort() freezing the 'menus' thread
    => we will release 2.0.10 once this one is fixed
  2. optimize the malloc fragmentation
  3. port the eLua patch to move as many Lua-allocated objects as possible from RAM to flash

Now I will focus on 1); perhaps we could open a new issue for 2) - and 3) already has one opened, targeted at 2.1.0 if I am right.

Soooo for 1) did you already find a potential fix? Did you have a look at eLua?

@bsongis
Member

bsongis commented Sep 2, 2014

I am trying to enable exceptions and throw an exception from a custom panic handler; it would be an easy fix.

@projectkk2glider
Member

I was aware of those fragmentation problems, and that's why the RAM is completely freed each time we are allowed to do so (model load, change in model scripts, etc.) in luaInit() - at least I hope the RAM is completely freed; it seems so, there shouldn't remain any malloc in use.

Completely freeing the RAM still leaves a problem with the newlib-nano allocator, as explained above.

So we have many problems to solve and I think they have to be solved in this order:

  1. the highest-priority one: avoid this abort() freezing the 'menus' thread
    => we will release 2.0.10 once this one is fixed
  2. optimize the malloc fragmentation
  3. port the eLua patch to move as many Lua-allocated objects as possible from RAM to flash

I agree with your plan. Let's create three separate issues and a corresponding branch for each one. Since this branch is quite messy, I suggest we start from master. I can create the branch and add the helpful changes from this one, and then the protection can be added later (by you).

I have been toying with 2), so 1) is not done yet on my side.

I am trying to enable exceptions and throw an exception in a custom panic handler, it would be an easy fix.

C++ exceptions might not work properly because we are mixing C code (Lua) with C++. Be careful. A proven solution is setjmp/longjmp, as explained in http://noahdesu.github.io/2013/01/24/custom-lua-panic-handler.html
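The setjmp/longjmp pattern from that article can be modelled without a Lua build as follows (a sketch: custom_panic() is invoked directly here to simulate an unprotected API error, whereas in OpenTX it would be installed with lua_atpanic() and receive a lua_State*; run_guarded and the test operations are invented names):

```c
#include <assert.h>
#include <setjmp.h>

/* Recovery point set before entering Lua; the panic handler jumps
   back here instead of letting Lua fall through to abort(). */
static jmp_buf panic_jump;

static int custom_panic(void)
{
  longjmp(panic_jump, 1);  /* never returns, so abort() is avoided */
  return 0;
}

/* Returns 0 if `risky` ran to completion, 1 if it panicked. */
int run_guarded(void (*risky)(void))
{
  if (setjmp(panic_jump) == 0) {
    risky();
    return 0;
  }
  return 1;  /* recovered from the panic */
}

static void ok_op(void)    { /* completes normally */ }
static void panic_op(void) { custom_panic(); }
```

Note the caveat raised below: after a longjmp out of the panic handler the Lua state is in an undefined condition, so the only safe follow-up is to lua_close() it and start over.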

@bsongis
Member

bsongis commented Sep 2, 2014

Completely freeing the RAM still leaves a problem with the newlib-nano allocator, as explained above.
True, we would also need to reset the heap and all the malloc/free internal tables, but I prefer not going there...

@bsongis
Member

bsongis commented Sep 2, 2014

I will take care of 1) then. Thanks for your help on the other points! I am really hoping for miracles from 3), in fact, so I will give it a try just after 1) is fixed.

@bsongis
Member

bsongis commented Sep 2, 2014

Enabling the exceptions costs a lot on the flash side :(

@projectkk2glider
Member

I told you: do not use C++ exceptions, use setjmp/longjmp!

@bsongis
Member

bsongis commented Sep 2, 2014

I like exploring all solutions :)
The setjmp/longjmp solution works, but as soon as we reach the panic, there is no Lua interpreter available anymore :(

bsongis added a commit that referenced this issue Sep 2, 2014
@projectkk2glider
Member

I have the custom allocator for 2) written and somewhat tested. Its effect on memory fragmentation is significant. I will post it in the coming days.

@projectkk2glider
Member

I pushed my changes in commit 3375641 to let you see what I am doing. The code is not cleaned up yet, but the new allocator and the panic protection seem to be working. They still need to be heavily tested, though!

The idea behind the panic protection is that protected functions have "Protected" at the end of their name, and only these functions can be called unprotected from other code.

@rdeanchurch

Has it been determined whether string concatenation is OK to use or not?
I'm using it in one script and it works fine in the simulator, but I haven't tried it on the Tx yet.

Thanks, Dean.

@projectkk2glider
Member

String concatenation itself is not a problem; it was just the trigger for this bug in this script. Use it.

@rdeanchurch

Thank you. Dean


projectkk2glider added a commit that referenced this issue Sep 4, 2014
* luaState is handled like a bitfield now
* usage of BinAllocator now optional with make option USE_BIN_ALLOCATOR
* removed clearing of newlib-nano free memory, it was dangerous
* commented out various TRACEs
* code cleanup
@projectkk2glider
Member

Using the latest commit d4eca36 and make:

$  make  PCB=TARANIS  PCBREV=REV4 HELI=NO GVARS=YES AUTOSWITCH=YES PPM_LIMITS_SYMETRICAL=YES PPM_CENTER_ADJUSTABLE=YES TRANSLATIONS=EN TEMPLATES=NO DBLKEYS=YES AUTOSOURCE=YES  DSM2=PPM LUA=YES   DBLKEYS=YES NANO=YES DEBUG=YES USE_BIN_ALLOCATOR=YES

I get the following results on the debug port:

Lua initialized, no scripts:

luaL_newstate
luaL_openlibs
registering MODEL
registering LCD
registering functions
registering constants
luaInit done
GC Use: 10459bytes
stack in use: 1 of 40
eeprom write general
mallinfo:
        Total size: 12204
        Free size:  6960
        Used size:  5244
        slots1: 168/200
        slots2: 40/50
        heap: 0x200125d4

In this state Lua is using 12 kB of newlib-nano-allocated RAM and almost all of our BinAllocator slots.

Then, before running a standalone script, Lua is closed:

lua_close
lua_close end
mallinfo:
        Total size: 12204
        Free size:  12204
        Used size:  0
        slots1: 0/200
        slots2: 0/50
        heap: 0x200125d4

So the remaining allocated memory is zero 👍 Success

Then I started a script that calls a non-existent function:

luaL_newstate
luaL_openlibs
registering MODEL
registering LCD
registering functions
registering constants
luaInit done
loading script: 0:/ABC/playtone.lua
Error in script 0:/ABC/playtone.lua init: 0:/ABC/playtone.lua:2: attempt to call global 'playTone' (a nil value)
lua_close
lua_close end
mallinfo:
        Total size: 12204
        Free size:  11968
        Used size:  236
        slots1: 0/200
        slots2: 0/50
        heap: 0x200125d4

Now not all memory is freed! But I think this is the result of the above error printing (i.e. printf()). These 236 bytes will remain allocated from now on, regardless of how many other scripts are run.

Now I run the original predimrc.lua for the first time and close it:

luaL_newstate
luaL_openlibs
registering MODEL
registering LCD
registering functions
registering constants
luaInit done
loading script: 0:/SCRIPTS/predimrc.lua
realloc FAILURE 45
mallinfo:
        Total size: 59832
        Free size:  392
        Used size:  59440
        slots1: 199/200
        slots2: 50/50
        heap: 0x2001dfe0
GC Use: 39916bytes
stack in use: 2 of 70
Script finished with status 2
lua_close
lua_close end
mallinfo:
        Total size: 59832
        Free size:  59596
        Used size:  236
        slots1: 0/200
        slots2: 0/50
        heap: 0x2001dfe0

It can be seen that the memory limit was reached (realloc FAILURE 45) and the heap is now at 0x2001dfe0 (0x2001e000 is the theoretical limit, with the 4 kB reserve). Lua handled the lack of RAM gracefully: it called the GC and got enough RAM that way.

When the script was closed, the cleanup was good, but 236 bytes remain (from printf(), allegedly).

I have since run predimrc.lua a number of times, and the result is basically the same. Here is the memory report while predimrc.lua is running (on page 1):

GC Use: 39916bytes
stack in use: 2 of 70
mallinfo:
        Total size: 59832
        Free size:  15932
        Used size:  43900
        slots1: 198/200
        slots2: 50/50
        heap: 0x2001dfe0
GC Use: 39764bytes
stack in use: 2 of 70

projectkk2glider added a commit that referenced this issue Sep 4, 2014
* corrected comments in BinAllocator
* malloc failure simulation only in DEBUG mode
projectkk2glider added a commit that referenced this issue Sep 4, 2014
@bsongis
Member

bsongis commented Sep 4, 2014

I will take as little as possible from your patch for 'master'. This will be done tomorrow. The release should happen during the weekend, unless something else comes up meanwhile.
