RFC - Emergency GC setting for Lua 5.3 #2783

Closed
TerryE opened this issue Jun 4, 2019 · 5 comments

@TerryE
Collaborator

TerryE commented Jun 4, 2019

I am currently doing my Lua 5.3 port and there are a number of issues on which I would like feedback from the contributors. The first relates to garbage collection. Our current Lua 5.1 VM includes the eLua Emergency GC, plus the various EGC tuning parameters that seem to be rarely used and don't seem to be stable.

The default EGC setting (which most users use) is node.egc.ALWAYS, and this triggers a full GC before every memory allocation. I find that the VM spends maybe 90% of its time doing full GC sweeps with this default setting!

My 'production' apps were initially developed pre-LFS, so I rarely bump against memory exhaustion. The setting that I now use in node.egc.setmode() is ON_MEM_LIMIT=-8096 (note the negative value) which defaults to the standard Lua incremental GC unless there is less than 8Kb heap left; under this threshold it kicks into a full GC before every memory allocation. This setting drops the time in the GC by maybe 10× so overall the Lua code runs maybe 5× faster.
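The negative-limit behaviour described above can be sketched as a small decision function. This is only a model of the policy as stated in this comment, not the firmware's code: the names `gc_action` and `egc_policy` are hypothetical, and positive limits are deliberately left unmodelled.

```c
#include <stddef.h>

/* Hypothetical model of the policy described above; not the firmware's code. */
typedef enum { GC_INCREMENTAL, GC_FULL_BEFORE_ALLOC } gc_action;

/* A negative limit is read as "keep at least -limit bytes of heap free". */
gc_action egc_policy(long limit, size_t free_heap)
{
    if (limit >= 0)
        return GC_INCREMENTAL;      /* positive limits are not modelled here */

    size_t headroom = (size_t)(-limit);

    /* Below the headroom: full GC before every allocation.
     * Otherwise: leave it to the standard incremental collector. */
    return free_heap < headroom ? GC_FULL_BEFORE_ALLOC : GC_INCREMENTAL;
}
```

With the -8096 setting, `egc_policy(-8096, 20000)` stays incremental, while `egc_policy(-8096, 4000)` switches to a full GC before each allocation.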

The issue is that many of the modules, including net and LwIP, don't use the Lua allocator, so you need headroom in the heap for them to operate: their mallocs will not trigger a GC when memory runs short. As a consequence, using ON_ALLOC_FAILURE will invariably cause your application to crater. Likewise, there are bugs in how the GC counts memory, so relying on ON_MEM_LIMIT with a positive limit will also tend to crash your application.
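To illustrate why allocations that bypass the Lua allocator need headroom, here is a toy heap model in C. Everything in it is hypothetical (`toy_heap`, `lua_side_alloc`, `native_alloc`); clearing the `garbage` counter merely stands in for an emergency full GC, and no real allocator is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy heap model: all names are hypothetical, for illustration only. */
typedef struct {
    size_t capacity;  /* total heap bytes */
    size_t live;      /* bytes held by reachable Lua objects */
    size_t garbage;   /* bytes held by unreachable Lua objects, not yet collected */
    size_t native;    /* bytes held by non-Lua allocations (e.g. network buffers) */
} toy_heap;

size_t toy_free(const toy_heap *h)
{
    assert(h->live + h->garbage + h->native <= h->capacity);
    return h->capacity - h->live - h->garbage - h->native;
}

/* Allocation through the Lua allocator: on shortage, collect and retry. */
bool lua_side_alloc(toy_heap *h, size_t n)
{
    if (toy_free(h) < n)
        h->garbage = 0;           /* stands in for an emergency full GC */
    if (toy_free(h) < n)
        return false;
    h->live += n;
    return true;
}

/* Allocation that bypasses the Lua allocator: no GC is triggered, so it just fails. */
bool native_alloc(toy_heap *h, size_t n)
{
    if (toy_free(h) < n)
        return false;
    h->native += n;
    return true;
}
```

In this model a heap that is nearly full of collectable garbage still fails a `native_alloc`, even though the same request through `lua_side_alloc` would succeed after collecting.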

(As an aside, the EGC uses multiples of 1Kb internally for its parameters, but the node methods work in exact bytes which are then rounded to multiple Kb internally.)

So now what do we do for Lua 5.3? This now already includes an EGC mode, but this defaults to ON_ALLOC_FAILURE without the ALWAYS and ON_MEM_LIMIT options.

  • I see little point in re-implementing the options which crater the app, but a headroom-based trigger does seem sensible.

  • I propose we have a backwards compatibility break on the node.egc options (I don't see this as a material issue, as most developers don't go here and the ones that do are power users) and have node.egc.setheap(n), where n is the Kb of headroom required for other mallocs; the EGC kicks in once free heap drops below this. This would be equivalent to the current node.egc.setmode( node.egc.ON_MEM_LIMIT, -1024*n ) with a default of n=12, and setting the limit to nil would cause it to behave like the current default ALWAYS option.
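The mapping proposed in the list above can be written down as a short C sketch. The names `egc_config` and `egc_setheap` are hypothetical, a NULL pointer stands in for Lua's nil, and the struct is just one assumed way of recording the result; only the -1024*n conversion and the default of 12 come from the proposal.

```c
#include <stdbool.h>
#include <stddef.h>

#define EGC_DEFAULT_HEADROOM_KB 12   /* default n proposed in the thread */

/* Hypothetical record of the resulting EGC configuration. */
typedef struct {
    bool always;   /* true: full GC before every allocation (current ALWAYS) */
    long limit;    /* signed byte limit, as passed to ON_MEM_LIMIT today */
} egc_config;

/* n_kb == NULL models setheap(nil); otherwise *n_kb is the headroom in Kb. */
egc_config egc_setheap(const long *n_kb)
{
    egc_config cfg;
    if (n_kb == NULL) {
        cfg.always = true;
        cfg.limit = 0;
    } else {
        cfg.always = false;
        cfg.limit = -1024L * *n_kb;   /* same value as -1024*n in the proposal */
    }
    return cfg;
}
```

Under these assumptions the default headroom of 12 Kb corresponds to a limit of -12288 bytes.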

It would be nice to have amber and red thresholds to soften the slowdown as the VM starts to kick in the EGC, for example triggering a GC sweep step above the amber and a full GC above the red. However, we can add this sort of refinement later if there is demand for it.
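A two-threshold scheme of this kind might look like the following C sketch. It is an assumption-laden illustration rather than a design: `gc_request` and `threshold_action` are hypothetical names, and the thresholds are expressed as free-heap levels, so "above the amber" in terms of heap use becomes "below `amber_free`" in terms of free bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical actions for a two-threshold trigger. */
typedef enum { GC_NONE, GC_STEP, GC_FULL } gc_request;

/* amber_free and red_free are free-heap levels in bytes, amber_free >= red_free. */
gc_request threshold_action(size_t free_heap, size_t amber_free, size_t red_free)
{
    assert(amber_free >= red_free);

    if (free_heap < red_free)
        return GC_FULL;    /* past red: full collection */
    if (free_heap < amber_free)
        return GC_STEP;    /* past amber: one incremental sweep step */
    return GC_NONE;        /* plenty of headroom: do nothing extra */
}
```

For example, with amber at 12288 and red at 8192 free bytes, a heap with 10000 bytes free would request a single sweep step.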

Note that the Lua interactive mode after completing a command always does a full GC before returning the > prompt to the user. I think that there is a good case for always doing a full GC or at least a GC sweep step on any callback before handing control back to the SDK. However this depends on first resolving another parallel issue.

@jmattsson
Member

This sounds reasonable to me. Like you, I've not had any success with ON_ALLOC_FAILURE, for precisely the reasons you lay out. We do use ON_MEM_LIMIT at $work in a performance critical loop, so it'd be sad to see something like that go. Having a node.egc.setminheapfree(k) as you suggest would seem like a good bridge between the old and the new.

Of course, I'm also curious as to whether it'd be possible to use some linker magic to swap out the alloc routines that the non-Lua-aware bits (i.e. SDK, some C modules) use and hook it all in through the Lua allocator. Definitely a topic for another day though! :D

@TerryE
Collaborator Author

TerryE commented Jun 5, 2019

In some of my tests, the standard Lua incremental GC seems to do a pretty good job without hammering the CPU the way that the EGC full GC seems to, but it needs maybe 5+ Kb headroom.

The issue with triggering a full GC ON_ALLOC_FAILURE is that you really don't want the full GC to kick off in the middle of some time-critical operation such as working the network stack. More thought and play needed, I think, but this is really a separate issue to the Lua53 port.

@nwf
Member

nwf commented Jun 5, 2019

@TerryE IMHO we shouldn't ever be heap allocating in critical paths... but yes, another issue for another day. If, in your rampage through the C, you identify particularly pernicious allocations, feel free to check in a comment.

@TerryE
Collaborator Author

TerryE commented Jun 7, 2019

> we shouldn't ever be heap allocating in critical paths...

Yup, but any string assignment or creation of a new array element does this, so it is hard to avoid totally in Lua applications.

I will go with Johny's node.egc.setminheapfree(k) and if we don't get any other comments in the next few days, then I will close this as "agreed".

@TerryE
Collaborator Author

TerryE commented Sep 17, 2019

As per this discussion, I've added a SETMEMLIMIT option to lua_gc() to do as discussed. We can track any residue through the many other Lua5.3 issues.
