Build leaking strings with push/join instead of string concatenation. #48
Conversation
Thanks for the code. As usual with performance improvements, I've set up a small jsperf: http://jsperf.com/lzstring-1-4-strings-vs-arrays I'll wait a couple of days so that I can test it on all the platforms at hand, both at home and from work. Can you do the same on your end? Also, please have a look at the jsperf just to make sure I didn't screw something up. |
Ah, don't worry about the minified version, I update it before any release. |
The array-based implementation could be a bit slower in this kind of micro-benchmark (5% slower in Chromium 41), but that doesn't really matter. |
Try compressing a block of length 10 000 (10k) 10 000 times, saving each result. Or a block of length 100 000 (100k) 1000 times. Or a block of length 1000 (1k) 100 000 times. |
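A sketch of that kind of benchmark (not from the thread; it assumes the lz-string npm package loaded as LZString, and uses the 10k-block, 10 000-iteration variant):
var LZString = require('lz-string'); // assumption: the npm package name
var block = new Array(10000 + 1).join('x'); // a 10k-character block
var results = [];
var start = Date.now();
for (var i = 0; i < 10000; i++) {
  // save every result so the produced strings stay reachable
  results.push(LZString.compress(block));
}
console.log('kept ' + results.length + ' results in ' + (Date.now() - start) + ' ms');
console.log((process.memoryUsage().rss / 1024 / 1024) + ' MiB');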
Just to let you know, your implementation is consistently slower on FF and Chrome on MacOS and Windows, consistently faster on FF & Chrome on Linux, and about the same on Safari (MacOS & iOS). A mixed bag, really. And while I'm working almost exclusively on Linux, I know that Windows + Mac is more than 85% of my desktop traffic. On Android, things look about equal. As a side note, I've opened an issue on jsperf because the graph is essentially meaningless since they don't mention the OS: mathiasbynens/jsperf.com#222 As usual, there is not really a clear-cut decision to take here... Can you tell me how you measured the consumed memory so we can try to benchmark the change from that point of view? |
The problem with string concatenation is that the resulting string, while being equal to the expected result, consumes much more memory. If you discard the result immediately, you most probably won't notice anything, but if you store the result somewhere, your memory will be exhausted pretty soon, bringing hell to your setup: the GC will start consuming 100% CPU, and if your physical memory is exhausted (not just the soft limit), side apps and native code will start failing, etc. |
push/join:
var len = 1000,
iter = 50000,
m = [];
function foo0() {
var s = [];
for (var j = 0; j < len; j++) {
s.push('x');
}
return s.join('');
}
for (var i = 0; i < iter; i++) {
m.push(foo0());
}
console.log((process.memoryUsage().rss / 1024 / 1024) + ' MiB');

String concatenation:
var len = 1000,
iter = 50000,
m = [];
function foo1() {
var s = "";
for (var j = 0; j < len; j++) {
s += 'x';
}
return s;
}
for (var i = 0; i < iter; i++) {
m.push(foo1());
}
console.log((process.memoryUsage().rss / 1024 / 1024) + ' MiB');

Run with iojs/node. If you wish to test browsers, make two separate pages, run them, and measure memory usage, for example, with the developer tools. Make sure that you perform the measurement before the GC cleans things up. |
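For a browser-side variant of the same measurement (a sketch, not from the thread: performance.memory is a non-standard, Chrome-only API, so in other browsers the developer-tools memory panels mentioned above are the way to go), the concatenation page could contain something like:
var len = 1000,
    iter = 50000,
    m = [];
function foo1() {
  var s = "";
  for (var j = 0; j < len; j++) {
    s += 'x';
  }
  return s;
}
for (var i = 0; i < iter; i++) {
  m.push(foo1());
}
// Chrome-only, non-standard API; keep `m` reachable while you measure
if (window.performance && performance.memory) {
  console.log((performance.memory.usedJSHeapSize / 1024 / 1024) + ' MiB');
}
and the push/join page the same thing with foo0 from above.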
The difference between these tests and your jsperf test is that your test immediately throws the resulting strings out. That doesn't happen in real applications, those strings always go somewhere, for example, into the message queue that is about to be sent. |
True, but bigger strings should consume more memory and then trigger more GCs, leading to slower compression. That's a lot of theory right there, so it's almost certainly false. But I need to get things straight and understand this - at least from a theoretical standpoint - before going further. LZ-String was made for browsers. I don't even understand what people do with it on the server side, save decompressing stuff from the client. Can you tell me your use case for compressing huge strings on the server side? |
The strings were not huge, generally below 1000 chars each. Compressing WebSocket messages before transferring them (not everything was fine with native compression). I disabled that already. |
http://jsperf.com/lzstring-1-4-strings-vs-arrays/3 — here, I modified your testcase by storing the results. Now the Array-based implementation is faster in Chromium. |
First, about your use case. Of course, I generally advise people to turn on gzip compression when sending stuff from the server to the client. But even then, you were in the exact same use case as my initial jsperf: compress a string, send it over the network and throw it away. I don't see the problem of having strings bigger than what they represent here. About your jsperf: well, strings are consistently faster there. In particular, this jsperf is making me doubt the pertinence of this patch for in-browser lz-string usage. But now that I understand your issue, I'll try to think of something to mitigate it. |
There is a message queue in my case that contains messages that are about to be sent. I already disabled lz-string compression because the native
Looks like I didn't use jsperf correctly. It runs the setup/teardown inside the loop, not outside of it. Wait a moment, I will fix that. |
http://jsperf.com/lzstring-1-4-strings-vs-arrays/4 — take a look at this one. strings-normalize wins in Chromium and Firefox, but that is a hack. I also changed the string that is compressed a bit: it's now the same for all the tests during one run, a bit shorter, and it contains letters. |
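The "normalize" variant is not shown in the thread; one guess at what such a hack can look like (an assumption about the technique, and engine-dependent behavior) is to build the string by concatenation but force the engine to flatten the internal rope it builds before the string is stored, for example by touching one of its characters:
function buildNormalized(len) {
  var s = "";
  for (var j = 0; j < len; j++) {
    s += 'x';
  }
  // Touching a character may force the engine to flatten the rope/cons-string
  // built up by the concatenations (engine-dependent; this is the hacky part).
  s.charCodeAt(0);
  return s;
}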
About the memory consumption: In Chromium developer tools, under «Timeline», choose «Memory» and press the record button. Same in Firefox. http://oserv.org/bugs/lz-string/memory — use this to measure memory with in-browser developer tools.
My results in Chromium: array — 53M, normalize — 49M, strings — 409M.
My results in Firefox e10s: array — 68M, normalize — 110M, strings — 270M.
That's just 1000 strings with 1000 random numbers each. |
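A sketch of that kind of test page (an approximation, not the actual source of the linked page; it assumes LZString is loaded on the page):
var kept = [];
for (var i = 0; i < 1000; i++) {
  var s = '';
  for (var j = 0; j < 1000; j++) {
    s += Math.floor(Math.random() * 10); // 1000 random digits per string
  }
  kept.push(LZString.compress(s));
}
// record in the devtools Timeline/Memory panel while `kept` is still reachable
console.log('kept ' + kept.length + ' compressed strings');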
Even if string concatenation is/were 5% faster, you shouldn't go for that 5% performance win when it increases memory usage 10-100 times in a major browser. In JS, strings are immutable, and engines are not optimized for «build a string with char-by-char concat» use cases; no one expects that. «a += b» does not modify the value of «a»: it constructs a new string object and overwrites the reference. About your «server-side» and «large messages» arguments: performance should be optimized, for example, with the native Map/Set classes, not by misusing strings in a harmful way. |
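As an illustration of the Map suggestion (a sketch of the kind of lookup being described, not lz-string's actual dictionary code): an LZ-style dictionary that assigns a numeric code to every distinct word can use a native Map instead of a plain object with string keys.
var dictionary = new Map();
function codeFor(word) {
  // assign each distinct word a numeric code the first time it is seen
  if (!dictionary.has(word)) {
    dictionary.set(word, dictionary.size);
  }
  return dictionary.get(word);
}
console.log(codeFor('ab'), codeFor('abc'), codeFor('ab')); // 0 1 0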
Sorry, my link was broken, I meant to link to this jsperf. Of course, it doesn't really apply to me since I'm concatenating characters one by one. In any case, thanks for all the information. I'm going to look into it closely this week and see what the best solution is. |
http://jsperf.com/javascript-concat-vs-join/4 — try this one. And I still don't understand why you are referring to jsperf micro-benchmarks. |
On your jsperf, the concat version is consistently faster (way faster, actually) for me, except on Chrome/Linux and IE9... Again a mixed bag. Unfortunately I have extremely little free time these days. I'll try to look into this over the weekend to find a way that consumes less memory and doesn't hit performance in the process. If I don't, I'll merge this pull request. Thanks for your time and patience. |
Build leaking strings with push/join instead of string concatenation. Slightly slower on some browsers but consumes far less memory.
Thanks. |
This solves #46.
I haven't updated lz-string.min.js; there is no build script.