String.substring - very low performance #27810

Closed
DisDis opened this Issue Nov 12, 2016 · 13 comments

Comments

Projects
None yet
5 participants
@DisDis

DisDis commented Nov 12, 2016

I created a mini performance test for String.substring
https://github.com/DisDis/dart_vs_nodejs_substring

Dart VM version: 1.20.1 (Wed Oct 12 22:00:54 2016) on "linux_x64"

NodeJs slice string test

Str length:99900
Time: 4ms
Str length:1

real 0m0.059s
user 0m0.056s
sys 0m0.000s

Dart slice (substring) string test

Str length: 99900
Time: 1865ms
Str length: 1

real 0m1.930s
user 0m1.920s
sys 0m0.032s

dart ~450 times slower nodejs.
Observatory says

@lrhn

This comment has been minimized.

Show comment
Hide comment
@lrhn

lrhn Nov 12, 2016

Member

V8, used by node.js, has a cheap substring operation for short-lived substrings. It creates a small object that points into the original string. I believe the substring "slice" is converted to a real string when it's garbage collected, so keeping a lot of substrings alive for a long time will eventually cost anyway, but if it's thrown away immediately, then it's cheap.

The cost of this optimization is that V8 needs handle one more type of string in all places where it handles strings. This complicates both the runtime system and garbage collection.

The Dart VM doesn't have an optimization like this. If you make a 99900 character substring of another string, you copy 99900 characters.

Member

lrhn commented Nov 12, 2016

V8, used by node.js, has a cheap substring operation for short-lived substrings. It creates a small object that points into the original string. I believe the substring "slice" is converted to a real string when it's garbage collected, so keeping a lot of substrings alive for a long time will eventually cost anyway, but if it's thrown away immediately, then it's cheap.

The cost of this optimization is that V8 needs handle one more type of string in all places where it handles strings. This complicates both the runtime system and garbage collection.

The Dart VM doesn't have an optimization like this. If you make a 99900 character substring of another string, you copy 99900 characters.

@DisDis

This comment has been minimized.

Show comment
Hide comment
@DisDis

DisDis Nov 12, 2016

@lrhn, I understand it. People porting libraries from nodejs negative talk about performance dart vm. We need a class that supports this behavior or some kind of solution to the problem.

DisDis commented Nov 12, 2016

@lrhn, I understand it. People porting libraries from nodejs negative talk about performance dart vm. We need a class that supports this behavior or some kind of solution to the problem.

@mraleph

This comment has been minimized.

Show comment
Hide comment
@mraleph

mraleph Nov 12, 2016

Contributor

We need to see something more complete than microbenchmark - then we can give some solution.

Which library are people trying to port and where is the bottleneck?

@lrhn I don't think slice is ever flattened - so you just have essentially a memory leak.

Contributor

mraleph commented Nov 12, 2016

We need to see something more complete than microbenchmark - then we can give some solution.

Which library are people trying to port and where is the bottleneck?

@lrhn I don't think slice is ever flattened - so you just have essentially a memory leak.

@Aetet

This comment has been minimized.

Show comment
Hide comment
@Aetet

Aetet Nov 14, 2016

+1. I have the same type of issue as @DisDis
When porting node.js less to dart instead of using node.js slice - substring function was used.

This function executes too often and has huge impact to performance.

So less files compiles at X10 times slower than at node.js.

Aetet commented Nov 14, 2016

+1. I have the same type of issue as @DisDis
When porting node.js less to dart instead of using node.js slice - substring function was used.

This function executes too often and has huge impact to performance.

So less files compiles at X10 times slower than at node.js.

@mraleph

This comment has been minimized.

Show comment
Hide comment
@mraleph

mraleph Nov 14, 2016

Contributor

@Aetet Is there some canonical benchmark I can run against your code to see huge impact on performance? e.g. if you give us some less file that compiles 10x slower then we can nail it down.

Contributor

mraleph commented Nov 14, 2016

@Aetet Is there some canonical benchmark I can run against your code to see huge impact on performance? e.g. if you give us some less file that compiles 10x slower then we can nail it down.

@DisDis

This comment has been minimized.

Show comment
Hide comment
@DisDis

DisDis Nov 14, 2016

@mraleph.
https://github.com/DisDis/less_dart/blob/opt1/test/benchmark/run.sh
https://github.com/DisDis/less_dart/blob/opt1/test/benchmark/benchmark.less
https://github.com/DisDis/less_dart/blob/opt1/test/benchmark/big1.less

Output:
Benchmark.less

**NodeJs**

real   0m1.232s
user   0m1.564s
sys    0m0.076s
 ----------------- 
**Dart**

real    0m2.923s
user    0m4.044s
sys     0m0.272s

big1.less

**NodeJs**

real    0m1.787s
user    0m1.928s
sys     0m0.128s

 ----------------- 
**Dart**

real    0m10.160s
user    0m11.308s
sys     0m0.472s

DisDis commented Nov 14, 2016

@mraleph.
https://github.com/DisDis/less_dart/blob/opt1/test/benchmark/run.sh
https://github.com/DisDis/less_dart/blob/opt1/test/benchmark/benchmark.less
https://github.com/DisDis/less_dart/blob/opt1/test/benchmark/big1.less

Output:
Benchmark.less

**NodeJs**

real   0m1.232s
user   0m1.564s
sys    0m0.076s
 ----------------- 
**Dart**

real    0m2.923s
user    0m4.044s
sys     0m0.272s

big1.less

**NodeJs**

real    0m1.787s
user    0m1.928s
sys     0m0.128s

 ----------------- 
**Dart**

real    0m10.160s
user    0m11.308s
sys     0m0.472s
@mraleph

This comment has been minimized.

Show comment
Hide comment
@mraleph

mraleph Nov 15, 2016

Contributor

I looked into this. It seems that less_dart is using String.substring to workaround RegExp API deficiency: there is no API that allows you to efficiently match regexp anchored at the given position in the input string - you can only anchor at the start of the string. A common (though arguably not the most efficient way) to write lexer in JavaScript is to match tokens using anchored regexps like ^[_A-Za-z-][_A-Za-z0-9-]* and slice parsed tokens away using .slice/.substring. In V8 string slices and raw performance of irregexp hide away overheads of such approach. In Dart VM overheads become acutely visible.

To be honest I think that the best way to fix those overheads is to rewrite the less_dart parser. However there are things we could do on our side as well:

A: Implement string slices in the VM I don't think this is likely to happen. It's quite a bit of complexity that helps only few cases, while introducing subtle memory leaks. Developers should write their code under expectation that String.substring has O(n) complexity. Some runtimes that used to have string slices removed them for this very reason (e.g. see JVM experience).

B: Add APIs that would make calling substring unnecessary For example ES2015 adds sticky flag that allows you to create regexps that stick to a particular index in the string - i.e. they match if and only if there is a match starting at that index. In reality we already have similar API on the RegExp class RegExp.matchAsPrefix. So dart_less code can be rewritten to use that instead of substringing. One small note is that RegExp.matchAsPrefix is implemented somewhat in-efficiently - it uses normal regexp match routine and checks if match occured at the start - but we can fix that.

I made some prototype changes to the VM and less_dart - and I can see factor of 3-4 improvement on performance. See VM and less_dart patches in https://gist.github.com/mraleph/2e57e391d6d32136e5cb71e402c3aa7a. Substringing completely disappears from the profile - but VM is still slower than V8 on parsing by a factor of ~3x. This is probably regexp performance, though I did not dig into that.

So to summarize my suggestions: @Aetet if you don't want to rewrite the less_dart parser completely you can at least rewrite it to use RegExp.matchAsPrefix, which we can fix to make it faster. (My recommendation though would be to rewrite the thing entirely).

@DisDis: are you also using substring together with RegExp, like the code in less_dart or you have another use case we should look at?

/cc @ErikCorryGoogle

Contributor

mraleph commented Nov 15, 2016

I looked into this. It seems that less_dart is using String.substring to workaround RegExp API deficiency: there is no API that allows you to efficiently match regexp anchored at the given position in the input string - you can only anchor at the start of the string. A common (though arguably not the most efficient way) to write lexer in JavaScript is to match tokens using anchored regexps like ^[_A-Za-z-][_A-Za-z0-9-]* and slice parsed tokens away using .slice/.substring. In V8 string slices and raw performance of irregexp hide away overheads of such approach. In Dart VM overheads become acutely visible.

To be honest I think that the best way to fix those overheads is to rewrite the less_dart parser. However there are things we could do on our side as well:

A: Implement string slices in the VM I don't think this is likely to happen. It's quite a bit of complexity that helps only few cases, while introducing subtle memory leaks. Developers should write their code under expectation that String.substring has O(n) complexity. Some runtimes that used to have string slices removed them for this very reason (e.g. see JVM experience).

B: Add APIs that would make calling substring unnecessary For example ES2015 adds sticky flag that allows you to create regexps that stick to a particular index in the string - i.e. they match if and only if there is a match starting at that index. In reality we already have similar API on the RegExp class RegExp.matchAsPrefix. So dart_less code can be rewritten to use that instead of substringing. One small note is that RegExp.matchAsPrefix is implemented somewhat in-efficiently - it uses normal regexp match routine and checks if match occured at the start - but we can fix that.

I made some prototype changes to the VM and less_dart - and I can see factor of 3-4 improvement on performance. See VM and less_dart patches in https://gist.github.com/mraleph/2e57e391d6d32136e5cb71e402c3aa7a. Substringing completely disappears from the profile - but VM is still slower than V8 on parsing by a factor of ~3x. This is probably regexp performance, though I did not dig into that.

So to summarize my suggestions: @Aetet if you don't want to rewrite the less_dart parser completely you can at least rewrite it to use RegExp.matchAsPrefix, which we can fix to make it faster. (My recommendation though would be to rewrite the thing entirely).

@DisDis: are you also using substring together with RegExp, like the code in less_dart or you have another use case we should look at?

/cc @ErikCorryGoogle

@lrhn lrhn closed this Nov 15, 2016

@lrhn lrhn reopened this Nov 15, 2016

@mraleph

This comment has been minimized.

Show comment
Hide comment
@mraleph

mraleph Nov 16, 2016

Contributor

Review for faster RegExp.matchAsPrefix is in flight. After that I will gradually upstream my patches for less_dart.

@DisDis / @Aetet I suggest we rename this issue to "less_dart is 10x slower than less.js" if this is the case you actually care about.

Contributor

mraleph commented Nov 16, 2016

Review for faster RegExp.matchAsPrefix is in flight. After that I will gradually upstream my patches for less_dart.

@DisDis / @Aetet I suggest we rename this issue to "less_dart is 10x slower than less.js" if this is the case you actually care about.

@mraleph

This comment has been minimized.

Show comment
Hide comment
@mraleph

mraleph Nov 17, 2016

Contributor

RegExp.matchAsPrefix optimization has landed.

Contributor

mraleph commented Nov 17, 2016

RegExp.matchAsPrefix optimization has landed.

@mraleph

This comment has been minimized.

Show comment
Hide comment
@mraleph

mraleph Nov 18, 2016

Contributor

@DisDis I started pushing out optimization commits into my fork of your fork of less_dart

# before (with a bleeding edge Dart VM)
$ time dart bin/lessc.dart test/benchmark/benchmark.less /tmp/benchmark.css
dart bin/lessc.dart test/benchmark/benchmark.less /tmp/benchmark.css  1.97s user 0.15s system 139% cpu 1.519 total
$ time dart bin/lessc.dart test/benchmark/big1.less /tmp/benchmark.css
dart bin/lessc.dart test/benchmark/big1.less /tmp/benchmark.css  5.46s user 0.30s system 112% cpu 5.123 total

# after (with a bleeding edge Dart VM)
$ time dart bin/lessc.dart test/benchmark/big1.less /tmp/benchmark.css
dart bin/lessc.dart test/benchmark/big1.less /tmp/benchmark.css  1.28s user 0.15s system 144% cpu 0.991 total
$ time dart bin/lessc.dart test/benchmark/benchmark.less /tmp/benchmark.css
dart bin/lessc.dart test/benchmark/benchmark.less /tmp/benchmark.css  1.25s user 0.10s system 157% cpu 0.862 total

I think this issue can be closed. Though I do have few other cleanups to push out into less_dart.

Contributor

mraleph commented Nov 18, 2016

@DisDis I started pushing out optimization commits into my fork of your fork of less_dart

# before (with a bleeding edge Dart VM)
$ time dart bin/lessc.dart test/benchmark/benchmark.less /tmp/benchmark.css
dart bin/lessc.dart test/benchmark/benchmark.less /tmp/benchmark.css  1.97s user 0.15s system 139% cpu 1.519 total
$ time dart bin/lessc.dart test/benchmark/big1.less /tmp/benchmark.css
dart bin/lessc.dart test/benchmark/big1.less /tmp/benchmark.css  5.46s user 0.30s system 112% cpu 5.123 total

# after (with a bleeding edge Dart VM)
$ time dart bin/lessc.dart test/benchmark/big1.less /tmp/benchmark.css
dart bin/lessc.dart test/benchmark/big1.less /tmp/benchmark.css  1.28s user 0.15s system 144% cpu 0.991 total
$ time dart bin/lessc.dart test/benchmark/benchmark.less /tmp/benchmark.css
dart bin/lessc.dart test/benchmark/benchmark.less /tmp/benchmark.css  1.25s user 0.10s system 157% cpu 0.862 total

I think this issue can be closed. Though I do have few other cleanups to push out into less_dart.

@DisDis

This comment has been minimized.

Show comment
Hide comment
@DisDis

DisDis Nov 18, 2016

@mraleph. It is very cool.
Many thanks

DisDis commented Nov 18, 2016

@mraleph. It is very cool.
Many thanks

@DisDis DisDis closed this Nov 18, 2016

@DisDis

This comment has been minimized.

Show comment
Hide comment
@DisDis

DisDis Nov 18, 2016

@mraleph 'RegExp.matchAsPrefix' will be in Dart 1.21?

DisDis commented Nov 18, 2016

@mraleph 'RegExp.matchAsPrefix' will be in Dart 1.21?

@mraleph

This comment has been minimized.

Show comment
Hide comment
@mraleph

mraleph Nov 18, 2016

Contributor

@DisDis RegExp.matchAsPrefix is already in Dart, but optimization which makes it usable I think will only be in 1.21 indeed.

Running opt1 branch with older Dart actually shows that RegExp.matchAsPrefix had very poor performance before (benchmark is 17x slower), which is unsurprising - it was searching for a match through the whole string and then discarding that match if it did not occur at the expected position.

Contributor

mraleph commented Nov 18, 2016

@DisDis RegExp.matchAsPrefix is already in Dart, but optimization which makes it usable I think will only be in 1.21 indeed.

Running opt1 branch with older Dart actually shows that RegExp.matchAsPrefix had very poor performance before (benchmark is 17x slower), which is unsurprising - it was searching for a match through the whole string and then discarding that match if it did not occur at the expected position.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment