High memory usage #3
Thanks for the report. First of all, I suggest decreasing the compression settings. As for the memory leak: I've tried a change which helps, but I believe it doesn't fix the problem entirely. Could you try it? It's on a separate branch. I'm a C++ newbie, so any help in that department is welcome. |
I think any takeaway would greatly benefit the community, a war story that would help other developers. I'll take a look tomorrow and post back with the results. |
The mem branch cuts peak memory (RSS) from 9.8 GB to 3.4 GB. After wrk finishes and GC happens, memory is down to 671 MB. Before running wrk, the server uses only 15 MB, so it looks like there still is a leak. Will do more digging. |
So I found this with asan's leak checker:
That points to https://github.com/MayhemYDG/iltorb/blob/master/src/enc/stream_encode_worker.cc#L14. I had to change the following lines in my binding.gyp:
then build with the appropriate flags. I also had to make the server exit after a fixed number of requests:

if (encodings.has('br')) {
  res.setHeader('Content-Encoding', 'br');
  fs.createReadStream(filename).pipe(brotli()).pipe(res).on('finish', function () {
    if (++counter === 10) {
      process.exit(1);
    }
  });
}

otherwise asan's leak checker does not run, since the process does not exit. |
I don't believe so; I do call that. According to the NAN docs:
|
oops, yep, just tried this, got a double free... |
@nickdesaulniers Btw, by passing |
ok, added the patch from the mem branch, added
|
That's less than 1 MB total, this doesn't look like it. |
Try reading from a larger file perhaps? |
From my own testing, the culprit does not reside in JS, as the heapTotal/heapUsed remained low throughout the test. |
@MayhemYDG but you did see a high RSS? |
Yes, definitely. Here's the test I wrote quickly:

'use strict';
var fs = require('fs');
var brotli = require('.');
var i = 100;
function compress() {
fs.createReadStream('./test/fixtures/large.txt')
.pipe(brotli.compressStream({lgwin: 1}));
if (i-- > 0) {
compress();
}
}
compress();
function p(n) {
return Math.round( n / (1024 * 1024) * 100 ) / 100;
}
function report() {
var mem = process.memoryUsage();
console.log(p(mem.rss), p(mem.heapTotal), p(mem.heapUsed));
setTimeout(report, 500);
}
report();

You'll have to quit the process manually, but it lets you observe the memory over time. |
in the meantime, can you merge the mem branch? |
Done. |
yes please 🎉 |
And done. |
Here's a heap profile from running |
I've pushed two new commits to master, but they don't seem to make much of a difference: 202f156, 1dd8e7e. @nickdesaulniers @kkoopa |
That really has nothing to do with NAN. Persistent handles are a V8 thing; NAN only wraps them to provide a unified API and semantics. Read the V8 embedder's guide, especially the section on handles. Essentially, what you should do is define destructors for the classes in question. |
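That advice can be illustrated without V8 at all. Below is a minimal, hypothetical C++ sketch: a worker class that owns a raw resource (standing in for a `Persistent` handle) and releases it in its destructor. In an actual binding the destructor body would call something like `persistent.Reset()` rather than `free()`; the class and names here are invented for illustration.

```cpp
#include <cstddef>
#include <cstdlib>

// Tracks how many buffers are still alive, so a leak is observable.
static int live_buffers = 0;

// Hypothetical stand-in for a binding worker that owns a native resource.
class StreamWorker {
 public:
  explicit StreamWorker(std::size_t n)
      : buf_(static_cast<char*>(std::malloc(n))) {
    ++live_buffers;
  }
  // The fix being suggested: release owned resources in the destructor
  // (for V8 this would be persistent.Reset(), not free()).
  ~StreamWorker() {
    std::free(buf_);
    --live_buffers;
  }
  // Forbid copies so the buffer is freed exactly once (no double free).
  StreamWorker(const StreamWorker&) = delete;
  StreamWorker& operator=(const StreamWorker&) = delete;

 private:
  char* buf_;
};

// Simulates handling a burst of requests: each worker's buffer is
// released by the destructor as the worker goes out of scope.
int RunWorkers(int n) {
  for (int i = 0; i < n; ++i) {
    StreamWorker w(1 << 20);  // 1 MiB each
  }
  return live_buffers;  // 0 if nothing leaked
}
```

Without the destructor (or with copies allowed), `live_buffers` would climb with every request, which is exactly the RSS growth pattern reported above.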
@kkoopa |
Yes, that is what you should do. On Wednesday 04 November 2015 08:07:59 Mayhem wrote:
|
I think I figured out where the memory leak comes from. |
Actually the destructor is being called, I'm confused. |
If I don't explicitly use |
@nickdesaulniers |
Will do. Another idea: we can use things like https://github.com/andreasgal/finalize.js or https://www.npmjs.com/package/weak to verify whether things aren't getting GC'd. Tracking down why is more fun, but we could at least verify that that's the case. |
The JS side of iltorb was never the problem, the v8 heap always remained low, like less than 10MB, while the RSS can reach infinity and beyond. The memory remains stable because the GC is lazy. I still think there's a leak in the C++ somewhere, maybe even in brotli itself. |
I'd guess that v8 only counts memory it allocates by itself, and therefore doesn't know that it ought to GC. Have you looked into |
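Assuming the elided reference is to external-memory accounting (V8 exposes this to embedders as `AdjustAmountOfExternalAllocatedMemory`), the idea can be sketched in plain C++: keep a running total of bytes the GC cannot see, and nudge a collection when it grows too large. The function name, threshold, and "reset on collect" behavior below are all simplifications invented for illustration; V8's real heuristics are more involved.

```cpp
#include <cstdint>

// Running total of memory allocated outside the managed heap, plus a
// counter of how often the (pretend) collector was nudged.
static std::int64_t external_bytes = 0;
static int gc_hints = 0;

// Sketch of the embedder-side contract: call with +size on allocation
// and -size on release, so the GC knows the process's true footprint.
std::int64_t AdjustExternalMemory(std::int64_t delta) {
  external_bytes += delta;
  // Pretend heuristic: once external memory passes 64 MiB, hint a GC.
  if (external_bytes > 64 * 1024 * 1024) {
    ++gc_hints;           // a real embedder would let V8 schedule a GC
    external_bytes = 0;   // assume the collection reclaimed the memory
  }
  return external_bytes;
}

// Simulates a binding that allocates `count` native buffers of
// `size_each` bytes and reports each one.
int GcHintsAfterAllocations(int count, std::int64_t size_each) {
  for (int i = 0; i < count; ++i) {
    AdjustExternalMemory(size_each);
  }
  return gc_hints;
}
```

This matches the guess above: if the binding never reports its native allocations, the collector only sees a tiny JS heap and has no reason to run, even while RSS balloons.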
interesting looking at how libxmljs does it. |
Yes, the possibility to provide custom memory allocation functions to a library is great and often very helpful, esp. for bindings and/or memory-intensive applications (like compression 😃). zlib-based interfaces (zlib, libbzip2, liblzma, probably many more) provide these, but unfortunately, the brotli codebase was not written with this feature in mind. I contemplated writing a PR for their code that adds this in order to help with this issue, but it looks like a rather large job: They freely mix C and C++ allocation primitives and many changes to function signatures would be necessary, and I don’t have the time on my hands to code this anytime soon. Maybe it’s a good idea to create an issue there to make them aware of this? |
I think @addaleax is uniquely qualified to eloquently state to the brotli developers what would need to be added. |
@simonlindholm @addaleax |
Mh, having an overview of the memory used by brotli is not really an I/O thing. The above links to other bindings (libxmljs and my lzma-native) lead to code that works in the following way:
Note that |
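A zlib-style allocator hook of the kind described above can be sketched as follows. The struct layout and function names are hypothetical, not brotli's or zlib's actual API; the point is the pattern: the library takes alloc/free callbacks plus an opaque pointer, and the binding's callbacks prefix each block with its size so the byte count stays accurate.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator interface a library could accept, modeled on
// zlib's zalloc/zfree/opaque trio.
struct Allocator {
  void* (*alloc)(void* opaque, std::size_t size);
  void (*free)(void* opaque, void* ptr);
  void* opaque;
};

// Bookkeeping the binding keeps per stream; this is the number it would
// then report to the garbage collector.
struct ByteCounter {
  std::size_t outstanding = 0;
};

// Allocates sizeof(size_t) extra bytes to stash the block's size in a
// header, then hands the caller the memory just past it.
void* counting_alloc(void* opaque, std::size_t size) {
  ByteCounter* c = static_cast<ByteCounter*>(opaque);
  std::size_t* block =
      static_cast<std::size_t*>(std::malloc(sizeof(std::size_t) + size));
  if (!block) return nullptr;
  *block = size;
  c->outstanding += size;
  return block + 1;
}

// Walks back to the header to learn the size, subtracts it, and frees.
void counting_free(void* opaque, void* ptr) {
  if (!ptr) return;
  ByteCounter* c = static_cast<ByteCounter*>(opaque);
  std::size_t* block = static_cast<std::size_t*>(ptr) - 1;
  c->outstanding -= *block;
  std::free(block);
}
```

A binding would build an `Allocator{counting_alloc, counting_free, &counter}` per stream and hand it to the library, which is exactly the feature the brotli codebase was missing at the time.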
Btw, the paragraph from the Python docs which I copied to google/brotli#263 explains pretty well how this works from the garbage collector’s point of view, maybe that’s a bit clearer than what I wrote :) |
So if I understand this correctly, I cannot right now use |
I’m afraid so, yes; you can use it for everything allocated directly by this library, but I’m afraid that’s not much in comparison to brotli’s internal structures. I don’t exactly know what the brotli maintainers mean by “soon”, but it’s probably easiest to just wait until they have implemented custom memory allocation over there… |
Well, I mean, you could also approximate the memory usage by the size of the compressed data plus/times some empirically gathered constants. There's no need for an exact value since it's only used for GC heuristics. But the added reliability gained by hooking allocators would certainly be nice, of course. |
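As a hedged sketch of that estimation approach: derive a per-stream figure from the window size (`2^lgwin`) plus some assumed constants, and report that instead of exact allocations. The function name, the fixed overhead, and the formula below are placeholders, not measured values; the only property that matters for GC heuristics is that the estimate scales roughly with the real usage.

```cpp
#include <cstddef>

// Rough per-stream memory estimate for an encoder: sliding window
// (2^lgwin bytes) + current input chunk + an assumed fixed overhead.
// All constants are illustrative, not measured.
std::size_t EstimateEncoderMemory(int lgwin, std::size_t input_chunk) {
  const std::size_t window = static_cast<std::size_t>(1) << lgwin;
  const std::size_t overhead = 256 * 1024;  // assumed fixed per-stream cost
  return window + input_chunk + overhead;
}
```

Even a crude number like this, reported to the collector per stream, would let GC pressure track the dominant cost (the window), which is far better than reporting zero.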
@simonlindholm You’re absolutely right, that’s a possibility, and one can probably find some way to estimate the memory usage. |
I'm trying to update iltorb to use non-deprecated functions in brotli version 0.3.0, and to make use of custom allocators somehow. I started looking into it; I then tried looking into custom allocators, but this is beyond my knowledge of C. Any pointers are welcome. Pull requests are welcome as well. |
This parallels the changes previously done in the decoder. Fixes: nstepien#3
🍻 🍻 🍻 🎆 |
@nickdesaulniers Have you tried benchmarking again? I wonder how much of an improvement the changes made. |
I no longer possess the machine I originally reported from. I can follow along from my post, but I didn't list the version of Node I used, so the old numbers are worthless. My current machine is relatively underpowered. iltorb v1.0.12: If I downgrade iltorb to v1.0.7 as in this comment: So: way less memory usage, 10x the req/s, half the latency, 1/10th the stddev for latency, and 1/3rd the worst-case latency. Nice! |
Absolutely terrific! 🎉 |
Huh, didn’t expect the impact to be this big. But yep, awesome to hear it helps! |
I assume part of it is due to improvements in brotli itself. |
Hi there,
I'm currently benchmarking brotli to write a blog post about support for brotli in Firefox 44+. While running
wrk -c 100 -t 6 -d 30s -H 'Accept-Encoding: br' https://localhost:3000
against a custom Node.js server, and measuring the peak resident set size in memory, I saw about 9.8 GB of usage. Compare that to 330 MB for LZMA, 204 MB for GZIP, and 125 MB for uncompressed.
I assumed that there were objects being created that were using a lot of memory. I would expect them to get GC'ed once finished.
Using
top -pid <node process id>
after wrk finished, I observed the memory usage of the process stabilize around 8.2 GB. It looks to me like this library has a memory leak.