Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce allocations in String::sprintf #94558

Merged
merged 1 commit into from
Aug 16, 2024

Conversation

stuartcarnie
Copy link
Contributor

@stuartcarnie stuartcarnie commented Jul 20, 2024

This prevents an allocation of a new string for every += char32_t. The example string was rendered as a label every frame in GDScript:

text = "%d FPS (%.2f mspf)" % [Engine.get_frames_per_second(), 1000.0 / Engine.get_frames_per_second()]

To validate the improvement, allocations were captured for a 5-second window. Note the number of allocations in the sprintf function of the following images.

Before: 43,830 allocations

CleanShot 2024-07-20 at 14 49 03@2x

After operator += (char32_t): 17,474 allocations

Note

60% reduction in allocations.

CleanShot 2024-07-20 at 14 49 28@2x

After static const String: 12,934

Note

A further improvement was to use some static const String for common padding and sign characters, which reduced the allocations a further 26%

CleanShot 2024-07-20 at 15 23 59@2x

Note

Total reduction in allocations for the 5-second window was 70%

@stuartcarnie stuartcarnie requested a review from a team as a code owner July 20, 2024 04:58
@Calinou Calinou added this to the 4.x milestone Jul 20, 2024
@stuartcarnie stuartcarnie force-pushed the sgc/sprintf_allocations branch from dd3698b to 82dec21 Compare July 20, 2024 05:25
@stuartcarnie stuartcarnie changed the title use operator += (char32_t) Reduce allocations in String::sprintf Jul 20, 2024
@stuartcarnie stuartcarnie changed the title Reduce allocations in String::sprintf Reduce allocations in String::sprintf Jul 20, 2024
@kus04e4ek
Copy link
Contributor

kus04e4ek commented Jul 20, 2024

Related: #93697, this PR supersedes that one

@kus04e4ek
Copy link
Contributor

I feel like intead of static const String, methods like insert_char should be introduced, similar to #92475

@stuartcarnie
Copy link
Contributor Author

stuartcarnie commented Jul 20, 2024

I feel like intead of static const String, methods like insert_char should be introduced, similar to #92475

Totally agree. Adding appending APIs would be helpful. Also, being able to keep the memory around for a String that could be used as a temporary would be beneficial too, as there are a number of temporaries created in sprintf

@kus04e4ek
Copy link
Contributor

kus04e4ek commented Jul 20, 2024

Also, being able to keep the memory around for a String that could be used as a temporary would be beneficial too, as there are a number of temporaries created in sprintf

There's quite a lot of examples of thread_local LocalVector used for similar purposes, so I wonder about introducing a LocalString class, which would use LocalVector inside of it. Not sure if it would be worth it if there's not many uses for it though

@stuartcarnie stuartcarnie force-pushed the sgc/sprintf_allocations branch from 82dec21 to 0d604d8 Compare July 21, 2024 08:18
@RandomShaper
Copy link
Member

Just passing by. A few thoughts:

  • T const & is inconsistent with the rest of codebase. const T & is preferred.
  • There is string appening functionality, StringBuilder. I'm not sure how to use it for the late sign insertion, but there's likely a clever way to solve that.
  • String str = "(" deserves the optimization, too.

There's quite a lot of examples of thread_local LocalVector used for similar purposes, so I wonder about introducing a LocalString class, which would use LocalVector inside of it. Not sure if it would be worth it if there's not many uses for it though

The thread_local thing is being done with LocalVector because it features a capacity feature, which Vector doesn't. In addition, LocalVector doesn't have a thread-safe refcount, which makes it more performant and the right option when you know you don't need such a thing. My point is that thread_local Vector and thread_local String would be possible if Vector featured the aforementioned capacity-reserve feature. Adding it would avoid having to emulate it in certain parts of the engine where the Vector is sized to what it would be the capacity and the size is managed externally.

In other words, at some point my view is that it would be ideal that Vector and LocalVector were just two specializations of the same template, with better names. So, Vector would be LocalVector plus copy-on-write
I opened a proposal time ago about this: godotengine/godot-proposals#5144.

We could even have a copy-on-write specialization which is not thread-safe, which may be a better fit for strings. I'd say that all of this would need to be discussed carefully because there are things yelling to be improved, such as these vector types getting more telling names and finding out the most precise set of features each needs across the codebase.

@clayjohn clayjohn modified the milestones: 4.x, 4.4 Jul 24, 2024
@clayjohn
Copy link
Member

Discussion of future improvements aside, I think we can go ahead and merge this once it is updated to use const T & formatting as requested by RandomShaper.

@stuartcarnie
Copy link
Contributor Author

  • T const & is inconsistent with the rest of codebase. const T & is preferred.

I'll fix that.

As an aside, I wrote a blog about const 13 years ago, indicating a more intuitive placement, with a simple rule for reading it:

http://blog.stuartcarnie.com/post/constant-confusion/

This prevents an allocation of a new string for every character.

Additionally, use some static constants for padding and sign characters
@stuartcarnie stuartcarnie force-pushed the sgc/sprintf_allocations branch from 0d604d8 to 67eb6be Compare July 24, 2024 23:18
@stuartcarnie
Copy link
Contributor Author

@RandomShaper / @clayjohn changed to const T &

Copy link
Member

@Calinou Calinou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, it works as expected.

Testing project: test_pr_94558.zip

Benchmark

PC specifications
  • CPU: Intel Core i9-13900K
  • GPU: NVIDIA GeForce RTX 4090
  • RAM: 64 GB (2×32 GB DDR5-5800 C30)
  • SSD: Solidigm P44 Pro 2 TB
  • OS: Linux (Fedora 40)

Using an optimize=speed lto=full editor build.

Time taken to format the string 1 million times from GDScript (as a standalone expression):

"%d FPS (%.2f mspf)" % [Engine.get_frames_per_second(), 1000.0 / Engine.get_frames_per_second()]
Before After
1.009s 0.679s

@clayjohn
Copy link
Member

As an aside, I wrote a blog about const 13 years ago, indicating a more intuitive placement, with a simple rule for reading it:

Lol, if I had to debug an issue arising from const int const * const * i = &j; // ok I would cry

@akien-mga akien-mga merged commit 70096c0 into godotengine:master Aug 16, 2024
18 checks passed
@akien-mga
Copy link
Member

Thanks!

@stuartcarnie stuartcarnie deleted the sgc/sprintf_allocations branch August 22, 2024 11:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants