Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

THRIFT-5709: Drastically improve to_string() performance. #2806

Merged
merged 2 commits into from Jun 10, 2023

Conversation

tinloaf
Copy link
Contributor

@tinloaf tinloaf commented May 12, 2023

Currently, the to_string() overloads in TToString.h call the std::locale constructor for every single call. Creating locales is surprisingly expensive. We have an application where we - especially during tests - write large amounts of Thrift dumps to disk, and is this application we currently spend around 17% of total CPU time in std::locale's constructor (building with MSVC, in a Release build - other compilers might optimize this away?). With this change, it's basically down to zero.

Since we always create the exact same locale anyways, using a const static one does not hurt.

  • Did you create an Apache Jira ticket? (Request account here, not required for trivial changes)
  • If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
  • Did you squash your changes to a single commit? (not required, but preferred)
  • Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
  • If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

Creating locales is surprisingly expensive. We have an application where we - especially during tests - write large amounts of Thrift dumps to disk, and is this application we currently spend around 17% of total CPU time in std::locale's constructor. With this change, it's basically down to zero.
@Jens-G
Copy link
Member

Jens-G commented May 15, 2023

Why do we need 4 instances?

@tinloaf
Copy link
Contributor Author

tinloaf commented May 16, 2023

Why do we need 4 instances?

I don't think we do, I just wanted to make the change as small as possible. Would you prefer a global constant (in an anonymous namspace probably)?

@Jens-G
Copy link
Member

Jens-G commented May 16, 2023

Well, I just thought if we are going to optimize, why don't we do it right ... so yes, would make sense to me, if that's not causing other overhead, too much confusion or work, then I'm for it.

@tinloaf
Copy link
Contributor Author

tinloaf commented May 17, 2023

@Jens-G I have updated the PR to use a single, TU-local locale.

@@ -32,10 +32,15 @@
namespace apache {
namespace thrift {

// unnamed namespace to enforce internal linkage - could be done with 'inline' when once have C++17
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: when/once is duplicate here.

@Jens-G Jens-G merged commit 1d6a326 into apache:master Jun 10, 2023
17 of 27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants