Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] Optimize String::replace methods #92546

Merged
merged 1 commit into from
Aug 19, 2024

Conversation

AThousandShips
Copy link
Member

@AThousandShips AThousandShips commented May 30, 2024

Performs a single allocation, only when any instances are found, and avoids concatenations and other unnecessary conversions.

Haven't done any benchmarking yet but as far as I know there's no optimization that can handle the concatenations in a way that avoids repeated allocations.

Will run the code through QuickBench soon with some of the data in the PR below but have to translate some of the code first.

Related:

@@ -3986,54 +3986,161 @@ String String::format(const Variant &values, const String &placeholder) const {
return new_string;
}

String String::replace(const String &p_key, const String &p_with) const {
String new_string;
static String _replace_common(const String &p_this, const String &p_key, const String &p_with, bool p_case_insensitive) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the code only differs on which find method to use I created a helper method to avoid code duplication

@AThousandShips
Copy link
Member Author

Gonna write up a benchmark test in the next few days when I find the time to convert the code to something benchmarkable (won't necessarily catch some of the bottlenecks but should at least catch the benefits of pre-processing and memcpy)

@AThousandShips
Copy link
Member Author

There managed to write some benchmarks now in fact, using this:
https://quick-bench.com/q/HgsGkbFKTnXifEkW1BJzDe4lXP8 (adapted from #92433 (comment))

Will do some other case tests as well to see exactly how it might improve, but this shows this specific case

HgsGkbFKTnXifEkW1BJzDe4lXP8

The improvements will likely be a bit better thanks to other behind the scenes differences this can't test for (COW stuff primarily)

@AThousandShips
Copy link
Member Author

The results in some cases are hard to evaluate, might be allocation based so hard to benchmark against them and consider the differences with the engine code, will try further benchmarks and see

Copy link
Member

@Calinou Calinou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, it works as expected. This provides a slight speedup in a real world scenario (using print_rich() lots of times, which currently relies on string replacement a lot).

Testing project: test_print_rich.zip

Benchmark

Using optimize=speed lto=full Linux editor binaries.

Command:

$ hyperfine 'bin/godot.linuxbsd.editor.x86_64.master --path /tmp/4 &> /dev/null' 'bin/godot.linuxbsd.editor.x86_64 --path /tmp/4 &> /dev/null'
Benchmark 1: bin/godot.linuxbsd.editor.x86_64.master --path /tmp/4 &> /dev/null
  Time (mean ± σ):     15.030 s ±  0.211 s    [User: 10.756 s, System: 3.243 s]
  Range (min … max):   14.651 s … 15.292 s    10 runs
 
Benchmark 2: bin/godot.linuxbsd.editor.x86_64 --path /tmp/4 &> /dev/null  <--- This PR
  Time (mean ± σ):     14.046 s ±  0.504 s    [User: 9.959 s, System: 3.257 s]
  Range (min … max):   13.138 s … 14.656 s    10 runs
 
Summary
  bin/godot.linuxbsd.editor.x86_64 --path /tmp/4 &> /dev/null ran
    1.07 ± 0.04 times faster than bin/godot.linuxbsd.editor.x86_64.master --path /tmp/4 &> /dev/null
PC specifications
  • CPU: Intel Core i9-13900K
  • GPU: NVIDIA GeForce RTX 4090
  • RAM: 64 GB (2×32 GB DDR5-5800 C30)
  • SSD: Solidigm P44 Pro 2 TB
  • OS: Linux (Fedora 39)

Performs a single allocation, only when any instances are found, and
avoids concatenations and other unnecessary conversions.
@akien-mga akien-mga modified the milestones: 4.x, 4.4 Aug 17, 2024
@akien-mga
Copy link
Member

For the record, the unit tests for String::replace seem a bit lightweight, might be worth expanding in a follow-up (maybe for more of the String API given the number of PRs that aim to optimize it, we need to make sure it's well tested to prevent regressions).

@akien-mga akien-mga merged commit 4afcbb1 into godotengine:master Aug 19, 2024
18 checks passed
@akien-mga
Copy link
Member

Thanks!

@AThousandShips AThousandShips deleted the faster_replace branch August 19, 2024 10:24
@AThousandShips
Copy link
Member Author

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants