Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lua Plugin memory leak on remap configuration reloads. #8728

Closed
pbchou opened this issue Mar 15, 2022 · 12 comments · Fixed by #8764
Closed

Lua Plugin memory leak on remap configuration reloads. #8728

pbchou opened this issue Mar 15, 2022 · 12 comments · Fixed by #8764
Assignees

Comments

@pbchou
Copy link
Contributor

pbchou commented Mar 15, 2022

Hi. We are experiencing a memory leak on remap configuration reloads while using the Lua plugin. This seems to be a known issue based on comments in the source code. We will need a fix as our Lua scripts leak close 1GB of memory on every reload in production.

Mention @shukitchan from the original mailing list discussion.

@ezelkow1
Copy link
Member

From the mailing list, I had a few questions since we've also been tracking a leak on 9.1. So for the instances you are seeing this is it only from global/file-global-local var usages? Or just lua reloads in general?

@pbchou
Copy link
Contributor Author

pbchou commented Mar 15, 2022

I can't really tell except with the global and file-local variables since I am testing with 100MB chunks. There might be leakages of other things, but they are too small to register with only the small test script. You would probably have to run with tcmalloc or jemalloc with leak checking turned on to find any other small leaks.

Couple of notes --

  1. For 9.1.1 we also cherry-picked the following SNIConfig: tunnel_route - Change the way we extract matched subgroups from the servename #8589 (memory leak) and Fix length bug in validate_unmapped_url_path #8080 (core dump) at the recommendations of the community.

  2. It seems that I had to reverse the following commits 36a8cd (which removed the calling the LuaJIT garbage collector when deleting remap instances) and 1f645a (which removed the freeing of the remap instance handle and the clearing of the global registry in the Lua states when deleting remap instances) to get the global and file-local variables to be cleared on reloads. I don't think reversing either one on its own will do it. I think the clearing of the registry and calling the garbage collector are probably more important than the leaking of the instance handle structure.

@shukitchan
Copy link
Contributor

So you are saying that you want to reverse those two changes : i.e. calling garbage collector and clearing of the global registry in the lua states when deleting remap instances.

@pbchou
Copy link
Contributor Author

pbchou commented Mar 16, 2022

Yes, both steps seem to be required based on my testing. If I am correct in that remap create new instances all run in one thread (and the same is true for deletes), then it should be straight-forward to reference count the instance handle to control clearing the global registry and freeing the instance handle. However, what was the reason for removing the garbage collector calls? The commit reference mentioned fixing core dumps?

@shukitchan
Copy link
Contributor

explicitly calling garbage collector is expensive and takes a long time. In a situation where there are many lines in remap.config using the lua script, it can take forever and the ATS can eventually crash.

Why do you need to call the garbage collector? Would it be better to leave it for the lua to decide even to collect the garbage?

@pbchou
Copy link
Contributor Author

pbchou commented Mar 16, 2022

It sounds like your use case is different than ours. This might be something that could be included as a plugin parameter to decide whether to run the garbage collector on remap instance deletes.

Let me explain why we added the garbage collector calls. Our current production config generates close to 1GB of Lua memory usage. With the current LuaJIT v2.1.0-beta3 that is standard, the memory usage is limited to 2GB on Linux. This means that we could not reload remap config more than once before running the garbage collector. If one is willing to take the latest non-tagged LuaJIT snapshot, the 2GB limit is gone. However, then you would eventually run into system memory limitations. We are very concerned about memory usage as we would prefer to use the RAM for caching. Is there a way to set and guarantee a maximum amount of time or memory usage threshold between garbage collector cycles? Otherwise, how would we engineer our network controller in terms of pushing config files and executing config reloads? Out of hundreds of caches maybe some have run GC and some have not.

FYI, in looking through the LuaJIT source, there is a garbage collector option to stop garbage collection. It sounds like this might be useful to you since Lua deciding on its own to run the GC in your use case might be a time bomb.

lua_gc(L, LUA_GCSTOP, 0);

@pbchou
Copy link
Contributor Author

pbchou commented Mar 16, 2022

Some interesting testing results --

I am running with a remap configuration consisting of eight lines. Each line uses the same single state for the Lua plugin. Each allocates 100MB of memory when it is initialized. So the total allocation is 800MB per remap configuration loading.

I am varying three variables --

  1. free(ih) on or off
  2. clear global registry on or off
  3. running garbage collector on demand on or off

Tests --

  • ATS 9.1.1 -free -global -gc --> after x8 reloads allocation is 7.2GB used and never decreases
  • ATS 9.1.1 +free -global -gc --> allocation cycles between 2.4/3.2GB down to 1.6GB [ So it turns out the GC is run on its own, but it appears to be based on some kind of threshold. some times goes up 2.4GB before cycles and other times 3.2GB before cycles. Note that it only frees down to 1.6GB not 800MB. ]
  • ATS 9.1.1 +free +global -gc --> allocation cycles between 1.8-2.5GB down to 0.8-1.3GB [ So it will free more memory than the above case. ]
  • ATS 9.1.1 +free +global +gc --> allocation cycles between 1.6GB down to 0.8GB [ This is the same behavior as with 7.1.4. ]

pbchou added a commit to pbchou/trafficserver that referenced this issue Mar 17, 2022
…on reloads

This fix adds reference counting for the Lua plugin remap instance handles. The reference counting
allows us to eliminate an existing memory leak of the instance handles. In addition, this means
that the old Lua memory allocated by LuaJIT may also be freed via LuaJIT garbage collection.

This fix also adds the '--ljgc' remap instance plugin parameter to the Lua plugin. This paramter
enables on-demand LuaJIT garbage collection while the remap instances are created and deleted.
This is useful when operating close to the LuaJIT memory limit, which is currently 2GB on Linux
using LuaJIT v2.1.0-beta3 from 2017.
@pbchou
Copy link
Contributor Author

pbchou commented Mar 17, 2022

@shukitchan -- Please see the commit that I added off the master branch in my personal fork. It should be referenced in this issue above this comment. Not sure what direction you wanted to take with this. If you think this is alright to go forward, then I can open a PR.

pbchou added a commit to pbchou/trafficserver that referenced this issue Mar 17, 2022
…on reloads

Fixes TSDebug() calls that were converted from TSError().
@shukitchan
Copy link
Contributor

  1. I am thinking whether we can do the gc in clean function instead in your own lua script ? Will that make a difference ?

  2. I am checking Removes refcounting from compress and s3_auth plugins #6737 and am thinking whether we still need ref count or not in our case.

@pbchou
Copy link
Contributor Author

pbchou commented Mar 21, 2022

  1. We don't use any clean() function currently. I suppose we can modify our scripts to add one. What happens is the last remap instance will not free since the file-global is not out of scope at the time. So in my testing of 8 x 100MB instances you will free 700MB and not the last instance.
  2. It seems like you would still need to reference count because the remap system assumes that each plugin instance ih is unique. If you have multiple plugin instances using the same ih then that would result in a double free as soon as you hit the same ih twice right?

@bryancall bryancall added the Leak label Mar 21, 2022
@shukitchan
Copy link
Contributor

Ok. I think I understand the change you want. It looks ok.
I can try to test out these changes before next week.

pbchou added a commit to pbchou/trafficserver that referenced this issue Mar 28, 2022
…on reloads

This fix adds reference counting for the Lua plugin remap instance handles. The reference counting
allows us to eliminate an existing memory leak of the instance handles. In addition, this means
that the old Lua memory allocated by LuaJIT may also be freed via LuaJIT garbage collection.

This fix also adds the '--ljgc' remap instance plugin parameter to the Lua plugin. This paramter
enables on-demand LuaJIT garbage collection while the remap instances are created and deleted.
This is useful when operating close to the LuaJIT memory limit, which is currently 2GB on Linux
using LuaJIT v2.1.0-beta3 from 2017.
@pbchou
Copy link
Contributor Author

pbchou commented Mar 28, 2022

@shukitchan -- I rebased and squashed my branch into the new branch above. There are no other changes. If I am understanding you correctly, then I will go ahead and create a PR based on this new branch. Thanks.

@randall randall linked a pull request Mar 29, 2022 that will close this issue
randall pushed a commit that referenced this issue Mar 30, 2022
This fix adds reference counting for the Lua plugin remap instance handles. The reference counting
allows us to eliminate an existing memory leak of the instance handles. In addition, this means
that the old Lua memory allocated by LuaJIT may also be freed via LuaJIT garbage collection.

This fix also adds the '--ljgc' remap instance plugin parameter to the Lua plugin. This paramter
enables on-demand LuaJIT garbage collection while the remap instances are created and deleted.
This is useful when operating close to the LuaJIT memory limit, which is currently 2GB on Linux
using LuaJIT v2.1.0-beta3 from 2017.

Fixes #8728
zwoop pushed a commit that referenced this issue Apr 5, 2022
This fix adds reference counting for the Lua plugin remap instance handles. The reference counting
allows us to eliminate an existing memory leak of the instance handles. In addition, this means
that the old Lua memory allocated by LuaJIT may also be freed via LuaJIT garbage collection.

This fix also adds the '--ljgc' remap instance plugin parameter to the Lua plugin. This paramter
enables on-demand LuaJIT garbage collection while the remap instances are created and deleted.
This is useful when operating close to the LuaJIT memory limit, which is currently 2GB on Linux
using LuaJIT v2.1.0-beta3 from 2017.

Fixes #8728

(cherry picked from commit b6f83f1)
SolidWallOfCode added a commit to SolidWallOfCode/trafficserver that referenced this issue Apr 11, 2022
* Rebase checkpoint for HostDB restructure.

* Minor fixes, clean up some printf formatting.

* Tweak TLS test to not mark server down for TLS errors.

* FreeBSD integer format fix.

* Fix autopep8 issue.

* Fix strategies test, lack of update for down server configuration.

* Remove zombie tracking - not really needed.

* autopep8 fix.

* Documentation updates.

* Revert API change - maintain 9.x compatibility.

* Doc: update HostDB

* Fix: avoid reusing null hostent if the DNS query failed.

* Doc update.

* Fix issue with plugin supplied address - suppress DNS lookup in that case.

* Fix brokenness in Lua do to API change and reversion.

* Revert CARP fix - not a good idea.

* Minor BWF tweak for better argument handling.

* Doc udpate.

* HostDB: Fix plugin API port error.

* Fix missing mutex on TSHostLookup bounce continuation.

* Fix.

* Fix nullptr dereference for SRV records.

* Fix HostDB sync time variable.

* Add a new --enable-event-tracker configure option (apache#8179)

* Send diags output to stderr when running regression tests. (apache#8662)

* Fixed issues when compiling with -Og (apache#8665)

* Fix plugin parent_select missing hostname len (apache#8649)

* Add parent_select plugin strategy caching (apache#8651)

Without this, large production strategies.yaml files with thousands
of remaps take hours to load, because each remap independently
loads and parses the strategies.yaml file.

This makes the plugin reuse the parsed object if multiple remaps
use the same file. On reload, files are loaded again.

* TLS Session Resumption: fix timed out session (apache#8667)

Reverts the behavior of getOriginSession() for a timed out session to the previous behavior, where it returns a null session pointer.

* Fix a libswoc build error on macOS (apache#8669)

* Making 9.2.x backwards compatible with 9.1.x (apache#8661)

In order to make 9.2.x backwards compatible against plugins compiled
with 9.1.x, this PR does the following:

1. Adds back exposed enum values removed in 9.2.x
2. Any added enum values are put at the end of their enumeration lists.
3. Adds back the TSHttpTxnCntl function

* Cleanup strategy debug logs (apache#8656)

- fixes parent_select plugin to only have 1 debug tag
- clarifies a misleading debug log about parent selection
- changes parent_select plugin to not unnecessarily call nextHopExists

* Make SSL writes more efficient when using dynamic record sizing (apache#8670)

Avoid calling clock time during subsequent SSL writes since the clock
does get updated in the event loop already.

* Fix to allow running  from outside top_srcdir (apache#8673)

* Stop ATS when a global lua script fails to load (apache#8671)

* Stop ATS when a global lua script fails to load

* fix compile problem

* Make sure each remap rule gets its own continuation (apache#8675)

* Fix test_QUIC unit test builds. (apache#8678)

The test_QUIC unit tests were failing to build because they didn't link
against a file with the TLSKeyLogger definition. This fixes the
undefined references by breaking out TLSKeyLogger into a separate object
that the unit tests can link with.

* Update asf.yaml to require Rocky over CentOS build (apache#8679)

We have changed the CI to use rocky now that centos is defunct, need to update the asf to match

* Fix missing unique_lock definition. (apache#8680)

* Fix clang build issue for libswoc (apache#8658)

* Upgrade Catch to v2.13.8 (apache#8683)

This upgrades our Catch unit test framework to version v2.13.8 from
v2.13.7.

* Fix Clang 13.0.1 compiler warnings (apache#8685)

This fixes a couple compiler warnings raised by Clang 13.0.1.

* Fix warnings from GCC 12.0.1 (apache#8684)

This patch fixes warnings generated by GCC 12.0.1. The warnings involved the
use of the deprecated std::binary_function, std::unary_function, and
std::iterator interfaces. The use of these was considered more confusing than
explicitly declaring the associated types. Therefore this patch replaces the
use of these deprecated interfaces with the declaration of the associated
types.

* Fix plugin parent_select failover (apache#8676)

* slice and cache_range_requests: allow header override (apache#8666)

* Add back "DNS: Fix lack of nameserver failover in low use circumstances." (apache#8705)

This adds back the fix for nameserver failover in low use circumstances
and addresses the regression test instability caused when the test box
has connectivity issues with the DNS server.

* Update iocore/cache/test to fix the build (apache#8677)

* Provide libquic.a definition to VIO::VIO() (apache#8706)

Building with Clang 13 resulted in an undefined sybmol for VIO::VIO().
That is provided by libinkevent.a. Providing that after libquic.a so it
can find the symbol.

This closes apache#8686

* Add docs for parent.config defaults (apache#8692)

* cache-request-method autest: Extending IO delay (apache#8707)

This tweaks a delay that waits upon IO to finish in the
cache-request-method autest which has frequently failed because the item
is not yet cached.

* change api TSSslTicketKeyUpdate to return TSReturnCode (apache#8700)

* libswoc - Update to 1.3.7 (apache#8709)

* Move hwloc-based functions into it's own header/compilation unit (apache#8711)

Reduces everything needing to include hwloc.h

* update FREELIST macros for AArch64 (apache#8688)

remove redundant right shift operations when extract pointer
on AArch64, use 52~62 bits to save version info

* Remove unused functions/definitions from ink_defs (apache#8714)

* Fixes build when hwloc is not installed (apache#8715)

Follow up to 08d3506

Co-authored-by: Chris McFarlen <cmcfarlen@apple.com>

* Fixed memory leaks with CryptoContext (apache#8719)

* Change DNS retries to be a static (requires restart) config value (apache#8724)

* Check bounds before accessing Vol::evacuate array (apache#8716)

* Adds user-agent to OCSP requests (apache#8722)

Uses the proxy.config.http.request_via_str override as the User-Agent

Closes apache#8721

* libswoc: Make libswoc implementation overridable, header export optional. (apache#8733)

* Address issues with python 3.10 (apache#8729)

* Require use of 'override' keyword when valid. (apache#8737)

Co-authored-by: Walt Karas <wkaras@yahooinc.com>

* ts_lua: change type in stats struct from 'int' to 'TSMgmtInt' to avoid overflow (apache#8738)

* Update THREAD_FREE macro to only evaluate the _t parameter once. (apache#8725)

Co-authored-by: Chris McFarlen <cmcfarlen@apple.com>

* Support transforming range requests when origin returns full resource. (apache#8657)

* Support transforming range requests when origin returns full resource.

Resolves apache#8161:

This PR adds support for handling range requests after a response is received and the response source is the origin server.

 - Allow caching responses when a range is requested (requires proxy.config.http.cache.range.write is set, but slightly changes behavior)
 - Handle range setup even if the response source is the origin server.
 - Call do_range_setup_if_necessary on forward server response.
 - Enable API flag to cache original response when a partial response is tranformed (api_info.cache_untransformed = true)

* Add autests

* autopep

* update documentation for updated setting

* Do not setup range transform if the server response is encoded

* use f string instead of .format to format string

* all perform range transform and cache write when replacing cache due to revalidation.

* add autest to cover case of replacing cache with a requested range

* verify cache replace using via string

Co-authored-by: Chris McFarlen <cmcfarlen@apple.com>

* Enable all iocore/cache tests (apache#8718)

Co-authored-by: Chris McFarlen <cmcfarlen@apple.com>

* Remove null check before ats_free calls (apache#8744)

ats_free already does null checks

* money_trace: allow custom header, change span-id gen, opt to create if none (apache#8655)

* money_trace: add global support, custom header name, create if none.

This enhances the money_trace to allow it to run globally, with the
ability to override the global with a remap plugin.
Adds in better logging support for the current transaction's trace header.
Allows passthru mode which passes along the client trace header
without modification making ATS transparent.

* money_trace: change logic for how an invalid incoming header is handled

* money_trace autest: remove commented code

* Fixes Issue apache#7824 - The strategies.yaml parser can incorrectly interpret (apache#8742)

the YAML elements in the host protocol in both core and the parent_select
plugin.

* add log format for whether origin TLS connection resumed an existing TLS session (apache#8745)

* Move Sterror class from TsSharedMutex.h to its own include file. (apache#8710)

Co-authored-by: Walt Karas <wkaras@yahooinc.com>

* Use Sphinx 4.x (apache#8750)

Sphinx has fixed the following issue that was blocking us from moving to
Sphinx 4.x:
sphinx-doc/sphinx#9322

With that fixed, we can unpin from 3.x, which this patch does.

Fixes: apache#7941

* docs: add some obvious units to some http2 overridables (apache#8752)

also remove link to TS timeouts in ocsp settings; unrelated to TS timeouts

* Revert "Use Sphinx 4.x (apache#8750)" (apache#8754)

This reverts commit 38bf663.

* Update to proxy-verifier-v2.3.1 (apache#8753)

Proxy Verifier version 2.3.1 actively closes the connection when the
proxy (ATS) returns a response with a "Connection: close" header. This
improves the stability of the cache-request-method AuTest, wherein ATS,
dependent upon packet timing, would sometimes return a "Connection:
close" and follow it with a TCP FIN. The FIN would then be processed by
the Verifier client after request headers were sent, which would
interrupt the traffic flow and cause the test to fail. With Verifier
version 2.3.1, this is avoided by the client actively closing the
connection when ATS returns the intermittent "Connection: close".

* Use Sphinx 4.x (second attempt) (apache#8757)

This is a second attempt to upgrade to Sphinx 4.x. The first was apache#8750.
It was noticed, however, that the new Sphinx didn't automatically
deliver the theme.css file. This updates the conf.py file to make the
addition of theme.css explicit.

* [ink_base64] Fix buffer size computation (apache#8736)

These buffer size computation macros are meant to be used for callers of
ats_base64_encode/decode.  They were not taking into account a null
terminator, which is always written by those functions.  This causes a
buffer overflow when encode or decode are called on buffers of certain
sizes, e.g. decode on an input buffer of size 4.

This change makes these buffer size calculations match the functions.
It also updates unit tests for the access_control plugin, which uses
these functions.

Since this is a macro change, plugins which use encode/decode will need to
be recompiled.

* Fixing the httpbin AuTests by pinning Werkzeug (apache#8765)

The AuTests using httpbin all started failing today when new AuTest
pipenv environments started using the 2.1.0 version of Werkzeug. This
new Werkzeug release removes its BaseResponse API because it has been
deprecated for a few releases. httpbin has a dependency upon
BaseResponse, however, and thus it currently fails with this latest
Werkzeug release. This patch temporarily pins Werkzeug until httpbin is
updated so that our httpbin AuTests can run.

* Fix error handling in SSL cert/key load failures (apache#8763)

Fix a use-after-free error when loading SSL cert or key fails.

* Avoid allocation when matching hosts for vol lookup (apache#8762)

Co-authored-by: Chris McFarlen <cmcfarlen@apple.com>

* Pin Jinja2 for doc builds (apache#8773)

The latest jinja2 breaks Sphinx 3.x builds. For now, pinning jinja2 to
an earlier release for doc builds.

* Revert "Use Sphinx 4.x (second attempt) (apache#8757)" (apache#8771)

This reverts commit 4dbf914.

In the previous attempt to use Sphinx 4.x, I tried to fix our CSS 4.x
rendering by explicitly adding theme.css to the css_files html_context.
That did indeed add the needed theme.css rendering to the various docs,
but there are still some artifacts that don't render correctly (such as
ts:cv parameters). Reverting 4.x for now since 3.x works fine and moving
to 4.x is turning out to be a bigger project than I expected.

* Lua plugin memory leak on remap configuration reloads (apache#8764)

This fix adds reference counting for the Lua plugin remap instance handles. The reference counting
allows us to eliminate an existing memory leak of the instance handles. In addition, this means
that the old Lua memory allocated by LuaJIT may also be freed via LuaJIT garbage collection.

This fix also adds the '--ljgc' remap instance plugin parameter to the Lua plugin. This paramter
enables on-demand LuaJIT garbage collection while the remap instances are created and deleted.
This is useful when operating close to the LuaJIT memory limit, which is currently 2GB on Linux
using LuaJIT v2.1.0-beta3 from 2017.

Fixes apache#8728

* Add metrics for loop detection. (apache#8772)

Co-authored-by: Chris McFarlen <cmcfarlen@apple.com>

* check return values of openssl api calls (apache#8758)

* STEK share plugin using Raft (apache#8751)

Co-authored-by: Fei Deng <feid@yahooinc.com>

* Adding prefetch feature to slice plugin (apache#8723)

Co-authored-by: Serris Lew <lserris@apple.com>

* Adds an IP reputation system to the SNI rate limiter (apache#8459)

* Reuse TSMutex for ts_lua_http_intercept_handler (apache#8687)

* Propagate proxy.config.net.sock_option_flag_in to newly accepted connections (apache#8784)

* Propagate proxy.config.net.sock_option_flag_in to newly accepted connections

* For client connections, disable setting SO_LINGER by default

This was originally approved under apache#8463 and was reverted in apache#8622

Co-authored-by: Alan M. Carroll <amc@apache.org>
Co-authored-by: Masaori Koshiba <masaori@apache.org>
Co-authored-by: Walt Karas <wkaras@verizonmedia.com>
Co-authored-by: Bryan Call <bcall@apache.org>
Co-authored-by: Robert O Butts <rob05c@users.noreply.github.com>
Co-authored-by: Mo Chen <uncorrupt@gmail.com>
Co-authored-by: Sudheer Vinukonda <sudheerv@apache.org>
Co-authored-by: Chris McFarlen <chris@mcfarlen.us>
Co-authored-by: Kit Chan <kichan@apache.org>
Co-authored-by: Leif Hedstrom <zwoop@apache.org>
Co-authored-by: Evan Zelkowitz <eze@apache.org>
Co-authored-by: Randy DuCharme <5157999+AD5GB@users.noreply.github.com>
Co-authored-by: Brian Olsen <brian_olsen2@comcast.com>
Co-authored-by: Fei Deng <duke8253@gmail.com>
Co-authored-by: Randall Meyer <rrm@apache.org>
Co-authored-by: wangrong <93629441+wangrong1069@users.noreply.github.com>
Co-authored-by: Chris McFarlen <cmcfarlen@apple.com>
Co-authored-by: dragon512 <dragon512@live.com>
Co-authored-by: Walt Karas <wkaras@yahooinc.com>
Co-authored-by: John J. Rushford <jrushford@apache.org>
Co-authored-by: pbchou <pbchou@labs.att.com>
Co-authored-by: Fei Deng <feid@yahooinc.com>
Co-authored-by: Serris Lew <serrisnlew@gmail.com>
Co-authored-by: Serris Lew <lserris@apple.com>
Co-authored-by: Oknet <xuchao@skyguard.com.cn>
Co-authored-by: bneradt <bneradt@yahooinc.com>
bryancall pushed a commit that referenced this issue Jun 14, 2022
This fix adds reference counting for the Lua plugin remap instance handles. The reference counting
allows us to eliminate an existing memory leak of the instance handles. In addition, this means
that the old Lua memory allocated by LuaJIT may also be freed via LuaJIT garbage collection.

This fix also adds the '--ljgc' remap instance plugin parameter to the Lua plugin. This paramter
enables on-demand LuaJIT garbage collection while the remap instances are created and deleted.
This is useful when operating close to the LuaJIT memory limit, which is currently 2GB on Linux
using LuaJIT v2.1.0-beta3 from 2017.

Fixes #8728

(cherry picked from commit b6f83f1)
moonchen pushed a commit to moonchen/trafficserver that referenced this issue Jul 26, 2022
* change MemArena::make test to remove memory leak (apache#8352)

(cherry picked from commit 2a6156f)

* Fix leaks in ConfigManager::configName (apache#8269)

This fixes an ASan reported leak of ConfigManager::configName. It used
to be strdup'd but not freed in the destructor. This simply changes it
to a std::string.

ASan also reported a leak in AddConfigFilesHere which is fixed with an
ats_free as well.

(cherry picked from commit ee820c7)

* Lua plugin memory leak on remap configuration reloads (apache#8764)

This fix adds reference counting for the Lua plugin remap instance handles. The reference counting
allows us to eliminate an existing memory leak of the instance handles. In addition, this means
that the old Lua memory allocated by LuaJIT may also be freed via LuaJIT garbage collection.

This fix also adds the '--ljgc' remap instance plugin parameter to the Lua plugin. This paramter
enables on-demand LuaJIT garbage collection while the remap instances are created and deleted.
This is useful when operating close to the LuaJIT memory limit, which is currently 2GB on Linux
using LuaJIT v2.1.0-beta3 from 2017.

Fixes apache#8728

(cherry picked from commit b6f83f1)

* Fixes leak of SNI config filename on load

(cherry picked from commit e99f33c)

* Fixes leak of ssl_ocsp_response_path_only on reload

(cherry picked from commit 18c5404)

* SNIConfig (tunnel_route): Change the way we extract matched subgroups from the server name. (apache#8589)

This now uses the provided offsets from pcre_exec to read each matched group, this avoids
allocating memory for the subgroups. There was a memory leak here as well which is now eliminated.
This also changes the ActionItem::Context vector of strings to a vector of views to keep each matched group.

(cherry picked from commit 4f0c4f2)

Conflicts:
    iocore/net/P_SSLSNI.h
    iocore/net/TLSSNISupport.cc

* Fixes leak in SNIAction name globbing (apache#8827)

pcre_compile allocated object is never pcre_free-ed

(cherry picked from commit efaf441)

Conflicts:
    iocore/net/P_SSLSNI.h

Co-authored-by: Brian Olsen <bnolsen@gmail.com>
Co-authored-by: Brian Neradt <brian.neradt@gmail.com>
Co-authored-by: pbchou <pbchou@labs.att.com>
Co-authored-by: Damian Meden <damian.meden@gmail.com>
JosiahWI pushed a commit to JosiahWI/trafficserver that referenced this issue Jul 19, 2023
This fix adds reference counting for the Lua plugin remap instance handles. The reference counting
allows us to eliminate an existing memory leak of the instance handles. In addition, this means
that the old Lua memory allocated by LuaJIT may also be freed via LuaJIT garbage collection.

This fix also adds the '--ljgc' remap instance plugin parameter to the Lua plugin. This paramter
enables on-demand LuaJIT garbage collection while the remap instances are created and deleted.
This is useful when operating close to the LuaJIT memory limit, which is currently 2GB on Linux
using LuaJIT v2.1.0-beta3 from 2017.

Fixes apache#8728

(cherry picked from commit b6f83f1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants