-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CQL UUID parsing marshal error #1635
Comments
It's in fact illegal according to https://en.wikipedia.org/wiki/Universally_unique_identifier. |
@avikivity It doesn't matter, it's what Cassandra supports. |
Of course. |
@avikivity What's invalid about it? It passes some random UUID validator (http://www.uuid-validator.com/), which claims it's "version 4" UUID. |
Right. |
@avikivity So the UUID doesn't pass the regexp validation phase but I don't really understand why. The regular expression itself looks OK to me. |
Maybe need to enable some enhanced mode for {}. Or maybe the - is some unicode char. |
That doesn't explain the error though because the backtrace suggests that the string_view is constructed from |
@tgrabiec, nice catch about This patch makes the problem go away: diff --git a/types.cc b/types.cc
index 12104d8..3c29fec 100644
--- a/types.cc
+++ b/types.cc
@@ -799,7 +799,7 @@ struct uuid_type_impl : concrete_type<utils::UUID> {
return bytes();
}
static const std::regex re("^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$");
- if (!std::regex_match(s.data(), re)) {
+ if (!std::regex_match(std::string{s.data(), s.length()}, re)) {
throw marshal_exception();
}
utils::UUID v(s); What is it exactly that you want me to check? When we get to Thread 1 "scylla" hit Breakpoint 2, uuid_type_impl::from_string (this=0x60000012b860, s="80761BE4-C2E8-4329-A3BA-B6E0688CC775") at types.cc:797
797 virtual bytes from_string(sstring_view s) const override {
(gdb) print s
$1 = "80761BE4-C2E8-4329-A3BA-B6E0688CC775"
(gdb) print s._M_len
$2 = 36
(gdb) print s._M_str
$3 = 0x6000012fd410 "80761BE4-C2E8-4329-A3BA-B6E0688CC775" |
@penberg It would be more efficient to avoid the copy, like this: The sstring_view contents itself will not include the null terminator, but it should follow it immediately if it is created from an sstring instance. So |
|
@tgrabiec Interestingly enough, |
_text is constructed by moving/copying sstrings that were initially constructed from an std::string. The entire path looks correct. |
...and the failure disappears with debug build (no ASan warnings either). |
@penberg then maybe we have use-after-free or concurrent modification of the storage underlying s. Copying could help by narrowing down the access time window. |
@tgrabiec Like I said, no ASan errors, so unlikely to be use-after-free. |
Also, if you look at the lifetime of |
@penberg in release build, does it reproduce for you every time, using cqlsh and the statements you gave in the description? I can't reproduce the failure here. Maybe it's related to gcc/libc version. |
@tgrabiec It's always reproducible with the release build. |
I'm on Fedora 24:
|
Maybe we're violating -fstrict-aliasing again. |
OK, so the hypothesis about the string not being zero-terminated is false:
Also, just to make sure [penberg@nero tmp]$ cat regex.cc
#include <iostream>
#include <string>
#include <regex>
int main()
{
std::string s = "80761BE4-C2E8-4329-A3BA-B6E0688CC775";
static const std::regex re("^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$");
if (!std::regex_match(s.data(), re)) {
std::cout << "FAIL: no match" << std::endl;
} else {
std::cout << "PASS" << std::endl;
}
}
[penberg@nero tmp]$ g++ -O3 -Wall -std=c++17 regex.cc && ./a.out
PASS |
What if you use sstring and sstring_view instead of std::string? |
Can you check zero termination in Scylla? The test program has different code paths. |
@avikivity That |
No warnings from diff --git a/configure.py b/configure.py
index b54f72d..eb17b76 100755
--- a/configure.py
+++ b/configure.py
@@ -614,6 +614,7 @@ deps['tests/anchorless_list_test'] = ['tests/anchorless_list_test.cc']
warnings = [
'-Wno-mismatched-tags', # clang-only
'-Wno-maybe-uninitialized', # false positives on gcc 5
+ '-fstrict-aliasing',
]
warnings = [w |
Oh, |
-fstrict-laliasing is the default, and it makes problems appear, not disappear. |
Uhm, yes, I do understand that's the way it should be. But it's not. Problem disappears with:
I suppose I need to rebuild yet again to see if I am able to reproduce with a clean build... |
Blah. The issue is no longer reproducible. I suspect what happened was that I had not cleaned up |
Well, different gcc versions ought to generate interoperable code. |
Yeah? I would have assumed all of them have to follow some ABI... Anyway, I really wanted a clean build but simply forgot to nuke Seastar build artifacts. |
Installation details
Scylla version (or git commit hash): 1.3
Scylla throws a marshal exception for the following UUID although it works fine in Cassandra:
Full backtrace:
Spotted by https://github.com/IBM-Swift/Kassandra tests.
The text was updated successfully, but these errors were encountered: