Replace UUID validation RegEx with manual validation #651
My |
Nice catch! But the fix could be better and simpler 😉
Creating static objects is often dangerous, especially for complex types. Since the initialization order of static objects is not well defined, very weird behavior could be introduced if there are dependencies between static objects. In addition, these objects are initialized very early (before Thus I'd avoid static objects whenever possible. Usually that's easy by wrapping such objects with a static method like this:
class Foo {
public:
    // The function-local static is constructed on first access only.
    static const Bar& createBar() {
        static Bar bar;
        return bar;
    }
};
In that case, the object will be created once when it is accessed the first time, i.e. much later than static objects. But in the Btw, there are several other classes which use
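To make the lazy-initialization behavior described above concrete, here is a minimal, self-contained sketch. `Bar`, its `constructions` counter, and `demo()` are hypothetical names used only for illustration; they are not LibrePCB code. The counter is a plain `int` with constant initialization, so it does not suffer from the initialization-order problem itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical type for illustration: counts how often it is constructed.
struct Bar {
    static int constructions;
    Bar() { ++constructions; }
    std::string name = "bar";
};
int Bar::constructions = 0;

class Foo {
public:
    // The function-local static is constructed on the first call only.
    // Since C++11, this initialization is guaranteed to be thread safe.
    static const Bar& bar() {
        static Bar instance;
        return instance;
    }
};

// Small usage check: the object does not exist until it is first accessed,
// and every later access returns the same instance.
void demo() {
    assert(Bar::constructions == 0);
    const Bar& a = Foo::bar();
    const Bar& b = Foo::bar();
    assert(Bar::constructions == 1);
    assert(&a == &b);
}
```

Note that only the initialization is covered by this guarantee; whether later `const` access to the object is thread safe depends on the type.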
And one more issue since |
Why not just write the matching code manually instead of using a regex library? It should be quite simple in this case. Something like:
I believe this would be faster than using a regular expression compiled at runtime.
Thanks for the explanation! I'm used to lazy_static and already expected static initialization of complex objects to be potentially fishy 🙂
I thought about that too, but a regex basically does the same thing in a very optimized way that might make good use of CPU caches and things like SIMD instructions. If the RegEx is not unnecessarily created multiple times, that cost should be amortized over time. In any case, we would need to benchmark it. Any performance optimization without a benchmark is just guessing.
Btw, a compile-time regular expression library like ctre would be very nice, but it requires C++17 (Clang 5+, GCC 7+).
I made a simple benchmark (a standalone CLI application that validates 30'000 UUIDs):
#include <iostream>
#include <librepcb/common/uuid.h>

using namespace librepcb;

const static QString uuids[3000] = {
    "61a0caad-812b-4291-908d-b63aa9f7213e",
    // ...
    "baa57b4e-3e68-4e3b-8a07-041575b42fce"
};

void validate_uuids() {
    // volatile keeps the compiler from optimizing the counters away
    unsigned volatile int valid = 0;
    unsigned volatile int invalid = 0;
    // 10 passes over 3000 UUIDs = 30'000 validations
    for (int i = 0; i < 10; i++) {
        for (const QString& uuid : uuids) {  // iterate by reference to avoid copies
            if (Uuid::isValid(uuid)) {
                valid += 1;
            } else {
                invalid += 1;
            }
        }
    }
    std::cout << valid << " valid, " << invalid << " invalid." << std::endl;
}

int main() {
    std::cout << "Starting benchmark..." << std::endl;
    validate_uuids();
    return 0;
}
I compared the current
inline bool isLowerHex(const QChar chr) noexcept {
    return (chr >= QChar('0') && chr <= QChar('9')) || (chr >= QChar('a') && chr <= QChar('f'));
}
// Loops
bool Uuid::isValid(const QString& str) noexcept {
    // check format of string
    if (str.length() != 36) return false;
    for (uint i = 0; i < 8; i++) {
        if (!isLowerHex(str[i])) return false;
    }
    if (str[8] != QChar('-')) return false;
    for (uint i = 9; i < 13; i++) {
        if (!isLowerHex(str[i])) return false;
    }
    if (str[13] != QChar('-')) return false;
    for (uint i = 14; i < 18; i++) {
        if (!isLowerHex(str[i])) return false;
    }
    if (str[18] != QChar('-')) return false;
    for (uint i = 19; i < 23; i++) {
        if (!isLowerHex(str[i])) return false;
    }
    if (str[23] != QChar('-')) return false;
    for (uint i = 24; i < 36; i++) {
        if (!isLowerHex(str[i])) return false;
    }
    // ...
}
// Manual loop unrolling
bool Uuid::isValid(const QString& str) noexcept {
    // check format of string
    if (str.length() != 36) return false;
    if (!isLowerHex(str[0])) return false;
    if (!isLowerHex(str[1])) return false;
    if (!isLowerHex(str[2])) return false;
    if (!isLowerHex(str[3])) return false;
    if (!isLowerHex(str[4])) return false;
    if (!isLowerHex(str[5])) return false;
    if (!isLowerHex(str[6])) return false;
    if (!isLowerHex(str[7])) return false;
    if (str[8] != QChar('-')) return false;
    if (!isLowerHex(str[9])) return false;
    if (!isLowerHex(str[10])) return false;
    if (!isLowerHex(str[11])) return false;
    if (!isLowerHex(str[12])) return false;
    if (str[13] != QChar('-')) return false;
    if (!isLowerHex(str[14])) return false;
    if (!isLowerHex(str[15])) return false;
    if (!isLowerHex(str[16])) return false;
    if (!isLowerHex(str[17])) return false;
    if (str[18] != QChar('-')) return false;
    if (!isLowerHex(str[19])) return false;
    if (!isLowerHex(str[20])) return false;
    if (!isLowerHex(str[21])) return false;
    if (!isLowerHex(str[22])) return false;
    if (str[23] != QChar('-')) return false;
    if (!isLowerHex(str[24])) return false;
    if (!isLowerHex(str[25])) return false;
    if (!isLowerHex(str[26])) return false;
    if (!isLowerHex(str[27])) return false;
    if (!isLowerHex(str[28])) return false;
    if (!isLowerHex(str[29])) return false;
    if (!isLowerHex(str[30])) return false;
    if (!isLowerHex(str[31])) return false;
    if (!isLowerHex(str[32])) return false;
    if (!isLowerHex(str[33])) return false;
    if (!isLowerHex(str[34])) return false;
    if (!isLowerHex(str[35])) return false;
    // ...
}
I then timed the different variants 5 times with
@ubruhin It seems that char comparison beats regular expressions and it's thread safe, so I'd go with the unrolled version unless you prefer to spend a few cycles for better readability. (It seems quite strange to me that these loops aren't unrolled by the compiler 😕) In any case, a 3083% speedup is quite nice 😁
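For anyone who wants to play with the character-comparison approach outside of Qt, here is a minimal sketch of the same format check using `std::string` instead of `QString`. The names `isValidUuid` and `demo` are hypothetical and not part of LibrePCB, and the sketch covers only the format check shown above (36 characters, lowercase hex digits, dashes at positions 8, 13, 18 and 23), not whatever follows the elided `// ...` in the real function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True for '0'-'9' and 'a'-'f' only; uppercase hex digits are rejected,
// matching the QChar-based helper from the discussion.
inline bool isLowerHex(char c) noexcept {
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
}

// Hypothetical std::string variant of the format check.
bool isValidUuid(const std::string& s) noexcept {
    if (s.size() != 36) return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const bool dashPosition = (i == 8 || i == 13 || i == 18 || i == 23);
        if (dashPosition ? (s[i] != '-') : !isLowerHex(s[i])) return false;
    }
    return true;
}

// Small usage check with one valid and two invalid inputs.
void demo() {
    assert(isValidUuid("61a0caad-812b-4291-908d-b63aa9f7213e"));
    assert(!isValidUuid("61A0CAAD-812B-4291-908D-B63AA9F7213E"));  // uppercase
    assert(!isValidUuid("61a0caad812b4291908db63aa9f7213e"));      // no dashes
}
```

This version folds the five loops into one with a position check, which trades a little speed for brevity; it says nothing about how the unrolled Qt version performs.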
Why shouldn't it be thread safe? You don't modify the regex, right? The static initialization is guaranteed to be thread safe by C++11: https://stackoverflow.com/questions/1661529/is-meyers-implementation-of-the-singleton-pattern-thread-safe. Edit: Oh wow. After reading https://stackoverflow.com/questions/25581696/qregularexpression-matching-thread-safety I'm again horrified about C++.
How can one not guarantee that const access is thread safe?
The RegEx engine probably has an internal state machine 😉
Thanks for the detailed investigation!
OK, I'm fine with that, but it would be nice to add a comment to the code explaining why (the hell) UUID checking is done that way (link to this PR or so). Otherwise one might want to replace it with a RegEx some time 😁
When analyzing a callgrind profile of a full library rescan, I noticed that there are a lot of CPU instructions coming from libpcre2, with the call originating in the UUID validation code. Previously, the regular expression used for UUID validation was parsed from a string for every invocation of the validation function. This is unnecessary. A library rescan will validate thousands of UUIDs (roughly 250k calls to `Uuid::isValid` with 200'000 instructions per call on my laptop for a library rescan). By replacing the RegEx with a manually loop-unrolled validation function, I achieved a 20-40% speedup for a full library scan on my computer (which has a fast SSD). The CPU instructions per call to `isValid` were reduced from 187'341 to 767. For more details, please refer to the pull request discussion: #651
@ubruhin updated and rebased! I added an inline comment, as well as a detailed commit message.
Make isLowerHex a lambda function
When analyzing a callgrind profile of a full library rescan, I noticed that there are a lot of CPU instructions coming from libpcre2, with the call originating in the UUID validation code. Previously, the regular expression used for UUID validation was parsed from a string for every invocation of the validation function. This is unnecessary. A library rescan will validate thousands of UUIDs (roughly 250k calls to `Uuid::isValid` with 200'000 instructions per call on my laptop for a library rescan). By replacing the RegEx with a manually loop-unrolled validation function, I achieved a 20-40% speedup for a full library scan on my computer (which has a fast SSD). The CPU instructions per call to `isValid` were reduced from 187'341 to 767. For more details, please refer to the pull request discussion: #651 (cherry picked from commit ca83f88)
When analyzing a callgrind profile of a full library rescan, I noticed that there are a lot of CPU instructions coming from libpcre2, with the call originating in the UUID validation code.
Previously, the regular expression used for UUID validation was parsed from a string for every invocation of the validation function. This is unnecessary. A library rescan will validate thousands of UUIDs (roughly 250k calls to `Uuid::isValid` with 200'000 instructions per call on my laptop for a library rescan). By moving the regular expression to a static class member, I achieved a 15% speedup for a full library scan.
Library scan duration for five consecutive LibrePCB starts using the release build:
Before:
Avg: 12'159 ms
After:
Avg: 10'325 ms (-15%)