Optimize string concatenation #1685

Closed
vitaut opened this issue May 16, 2020 · 5 comments

vitaut commented May 16, 2020

See whether we can beat string append, with or without reserve, on this benchmark: https://godbolt.org/z/EZE7Dn

% ./naive-benchmark
2020-05-15 17:44:56
Running ./naive-benchmark
Run on (8 X 2800 MHz CPU s)
CPU Caches:
  L1 Data 32K (x4)
  L1 Instruction 32K (x4)
  L2 Unified 262K (x4)
  L3 Unified 8388K (x1)
Load Average: 1.50, 1.52, 1.63
------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
naive                    137 ns          137 ns      5184417
append                   109 ns          109 ns      6200012
appendWithReserve        117 ns          116 ns      5923469
fmtlib                   185 ns          185 ns      3792887
nullop                 0.220 ns        0.220 ns   1000000000

vitaut commented May 16, 2020

format_to+memory_buffer results are similar to appendWithReserve:

% ./naive-benchmark
2020-05-15 17:53:33
Running ./naive-benchmark
Run on (8 X 2800 MHz CPU s)
CPU Caches:
  L1 Data 32K (x4)
  L1 Instruction 32K (x4)
  L2 Unified 262K (x4)
  L3 Unified 8388K (x1)
Load Average: 2.01, 4.79, 3.79
------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
naive                    136 ns          136 ns      5073493
append                   110 ns          110 ns      6052327
appendWithReserve        118 ns          118 ns      5823676
fmtlib                   118 ns          118 ns      5977542
nullop                 0.221 ns        0.221 ns   1000000000

vitaut commented May 19, 2020

With a small parsing optimization:

------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
naive                    133 ns          133 ns      5160377
append                   118 ns          118 ns      6391760
appendWithReserve        121 ns          121 ns      5549698
format                   165 ns          165 ns      4162108
format_to                104 ns          103 ns      6547073
nullop                 0.221 ns        0.221 ns   1000000000

vitaut commented May 19, 2020

Second iteration:

------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
naive                    135 ns          135 ns      5096209
append                   107 ns          107 ns      6414956
appendWithReserve        111 ns          111 ns      6131745
format                   153 ns          153 ns      4523980
format_to               94.6 ns         94.6 ns      7208840
nullop                 0.218 ns        0.218 ns   1000000000

vitaut commented Jun 12, 2020

fmt::format with format string compilation

auto output = fmt::format(FMT_COMPILE("Result: {}: ({},{},{},{})"), str1, str2, str3, str4, str5);

beats concat even with preallocation, so I guess not much to do here:

------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
naive                    137 ns          137 ns      4801855
append                   129 ns          128 ns      6219734
appendWithReserve        117 ns          117 ns      5995461
format_compile           101 ns          101 ns      6953205
format_runtime           161 ns          161 ns      4368884
format_to                103 ns          103 ns      6706715
nullop                 0.230 ns        0.230 ns   1000000000

Might reintroduce small format string optimization to improve format_runtime perf.

vitaut commented Jun 13, 2020

With small optimization:

------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
naive                    139 ns          139 ns      4840941
append                   113 ns          113 ns      6207325
appendWithReserve        116 ns          116 ns      6029597
format_compile           108 ns          107 ns      6392344
format_runtime           129 ns          129 ns      5341350
format_to               67.7 ns         67.6 ns     10185967
nullop                 0.233 ns        0.232 ns   1000000000

vitaut closed this as completed Jun 13, 2020