-
-
Notifications
You must be signed in to change notification settings - Fork 655
Description
While investigating high memory usage in our application, I noticed in the heap snapshot that all PoolConnection objects, which represent open database connections, keep references to Query objects, including their query results (_rows). In one case, the result of a single query took up about 500 MiB of memory.
I managed to reproduce this problem with a minimal example. You can find the repository here, but I’ll briefly describe the steps and part of the code here. Basically, if you create a connection pool to a database, run a SELECT query on a large table, don’t store the result anywhere, then manually trigger garbage collection, you’ll see that the data still remains in memory.
async function main()
{
global.handle = await mysql.createPool(...);
console.log("> Heap usage before query");
printHeapUsage();
await handle.query("SELECT * FROM big_data")
}
main().then(() => {
global.gc();
console.log("\n> Heap usage after query, GC cycle is forced");
printHeapUsage();
global.handle.end();
});In my example, I fill a table with 128 MiB of data beforehand, so it’s easy to see from memory statistics that even after losing all references to the query result in my code and calling global.gc(), the data still stays in the heap:
> Heap usage before query
Heap Used: 9.88 MiB
Heap Total: 16.99 MiB
> Heap usage after query, GC cycle is forced
Heap Used: 147.80 MiB
Heap Total: 280.03 MiB
It seems the issue comes from a reference to cmdQuery being kept inside a callback closure used when creating PoolConnection. I’m not entirely sure how this happens, but if I add this.removeAllListeners("error") in callbackOnce() to remove the "error" listener after the connection is successfully established, the reference (and the memory leak) disappears.
if (!connectCalled) {
if (isErrorHandler) {
cb(param);
+ this.removeAllListeners('connect');
} else {
cb(null, param);
+ this.removeAllListeners('error');
}
}> Heap usage before query
Heap Used: 9.95 MiB
Heap Total: 16.74 MiB
> Heap usage after query, GC cycle is forced
Heap Used: 8.50 MiB
Heap Total: 143.21 MiB
If I understand correctly, the function created by calling callbackOnce(true) keeps a reference to the cb passed to connect(). That cb, in turn, holds a reference in its closure to another cb passed to getConnection(), and that one keeps a reference in its closure to cmdQuery. And because the function created by calling callbackOnce(true) is registered as a listener for "error" event that would never happen after successful connection, the reference to cmdQuery inside this "eternal" listener will prevent GC from cleaning Query object until the connection is dead.
You can find a screenshot from VS Code below, it should explain better what I mean more clearly.
