Gemini failure due to float point expected to include more digits #11103
Comments
Reproduced in 5.0.0:
Installation details
Kernel Version: 5.15.0-1015-aws
Scylla Nodes used in this run:
OS / Image:
Test:
Issue description
>>>>>>>
Logs:
@yarongilor in its current state, this issue doesn't tell anyone anything.
Rerunning with the Gemini BYO job using the same seed (56) in:
Not sure whether this is a Scylla issue or a Gemini issue. According to the above log, both lines are identical although they are reported as different:
The issue is reproduced with a live cluster. As in the SCT Gemini error report, live queries of node-1 and the oracle return identical results as well.
@slivne please assign. @yarongilor what's identical? I see only one result.
Please extend gemini to report the columns that don't match, so we don't have to guess. One possibility is that this is related to floating-point conversions.
@avikivity
Node-4 output (same for node-1):
Linux
The live cluster was terminated. The issue can be reproduced by rerunning the job: https://jenkins.scylladb.com/job/scylla-master/job/longevity/job/byo-longevity-test/181/
Does gemini itself report both data sets and report them as different even though visually they are identical? cqlsh output could be different due to accessing a different node or a later read repair.
Please adjust gemini to print which column it thinks is different (and if it's a collection, to say which elements it thinks are different).
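A minimal sketch of the kind of per-column report requested above. It is written in Python for illustration only and is not Gemini code; `diff_rows`, `diff_values`, and the sample column names other than `ck0` are hypothetical. It assumes each row is available as a mapping from column name to value, with collections represented as lists or dicts.

```python
from typing import Any

_MISSING = object()  # sentinel for "value absent on this side"


def diff_values(path: str, expected: Any, actual: Any) -> list[str]:
    """Recursively compare two values and describe every difference found."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        lines = []
        for key in sorted(set(expected) | set(actual), key=repr):
            lines += diff_values(f"{path}[{key!r}]",
                                 expected.get(key, _MISSING),
                                 actual.get(key, _MISSING))
        return lines
    if isinstance(expected, (list, tuple)) and isinstance(actual, (list, tuple)):
        lines = []
        for i in range(max(len(expected), len(actual))):
            lines += diff_values(f"{path}[{i}]",
                                 expected[i] if i < len(expected) else _MISSING,
                                 actual[i] if i < len(actual) else _MISSING)
        return lines
    if expected is _MISSING:
        return [f"{path}: only in cluster under test: {actual!r}"]
    if actual is _MISSING:
        return [f"{path}: only in oracle: {expected!r}"]
    if expected != actual:
        return [f"{path}: oracle={expected!r} test={actual!r}"]
    return []


def diff_rows(oracle_row: dict[str, Any], test_row: dict[str, Any]) -> list[str]:
    """Return one line per differing column (or collection element)."""
    lines = []
    for column in sorted(set(oracle_row) | set(test_row)):
        lines += diff_values(column,
                             oracle_row.get(column, _MISSING),
                             test_row.get(column, _MISSING))
    return lines


# Hypothetical rows: only one element of one collection column differs.
oracle = {"ck0": 1, "col_float": 0.5, "col_list": [1, 2, 3]}
test = {"ck0": 1, "col_float": 0.5, "col_list": [1, 9, 3]}
for line in diff_rows(oracle, test):
    print(line)  # col_list[1]: oracle=2 test=9
```

With a report like this, a mismatch in a single float column would show up as one line naming that column, instead of two whole-row dumps that look identical.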
Yes, as can be seen in the original (human-unreadable) SCT message above.
After Gemini failed and stopped, the cluster was basically left in an idle state.
OK, please make the adjustments I asked for above and we'll have more hints.
At the moment there is no maintainer for Gemini, and it may take us a while to get to this one.
Reproduced while running 5.0.5:
Installation details
Kernel Version: 5.15.0-1021-aws
Cluster size: 3 nodes (i3.large)
Scylla Nodes used in this run:
OS / Image:
Test:
Issue description
>>>>>>>
<<<<<<<
Logs:
@KnifeyMoloko please extract the failure into a more readable form, and please check whether it's exactly the same issue as the original bug reported here. You can also ask @aleksbykov whether it's a known issue.
@roydahan I extracted it to be a bit more readable now. I do believe it's the same as the one reported here: the same error is encountered at the same point (end of warmup in Gemini), and the row values look the same, i.e. the "+map" and "-map" output above is identical.
@avikivity from looking at the output that @KnifeyMoloko added above, I think that Gemini expects the "float" value to have 16 digits after the decimal point, but in fact Scylla stores only 8 digits. Gemini expects: Scylla has (in both the cluster under test and the test oracle):
So it seems the bug could be in Gemini, in the way it interprets the values or in the way it calculates the expected value, and not in Scylla, right?
So, in both our documentation and Cassandra's, float is a 32-bit IEEE-754 value. So, in fact, we should expect 7 digits (IIUC), so I guess 8 digits is OK.
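To illustrate the precision point, here is a small Python sketch. It uses an illustrative value (0.1), not one taken from the failing run, and the helper names `to_float32` and `equal_as_float32` are hypothetical. It shows how a 32-bit float widened to 64 bits prints with many more digits than it really carries, and how comparing at float32 precision avoids a spurious mismatch. Whether this is exactly what happens inside Gemini is an assumption based on the discussion above.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit IEEE-754 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def equal_as_float32(a: float, b: float) -> bool:
    """Compare two values at the precision a 32-bit float column can store."""
    return to_float32(a) == to_float32(b)


generated = 0.1                 # illustrative value, not from the failing run
stored = to_float32(generated)  # what a 32-bit column can actually hold

print(repr(generated))                      # 0.1
print(repr(stored))                         # 0.10000000149011612
print(generated == stored)                  # False: naive 64-bit comparison
print(equal_as_float32(generated, stored))  # True: compared at float32 precision
print(f"{stored:.7g}")                      # 0.1
```

The extra digits in the second line are an artifact of widening to 64 bits. Only about 7 significant decimal digits of a 32-bit float are meaningful, so an expected value carrying 16 digits cannot match what the database returns unless both sides are rounded to float32 first.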
Sorry, guys, for the false positive. Which is exactly the root cause of this issue.
Installation details
Kernel Version: 5.15.0-1015-aws
Scylla version (or git commit hash): 5.0.1-20220719.b177dacd3 with build-id 217f31634f8c8722cadcfe57ade8da58af05d415
Cluster size: 3 nodes (i3.large)
Scylla Nodes used in this run:
OS / Image: ami-0dd355c9f58ff2b64 (aws: us-east-1)
Test: gemini-3h-with-nemesis-test
Test id: b11be702-5343-41e2-8ab7-f34a1cb3328c
Test name: scylla-5.0/gemini-/gemini-3h-with-nemesis-test
Test config file(s):
Issue description
>>>>>>>
Scenario:
'Validation failed: rows differ (-map[ck0 ...
Gemini command:
Gemini error details:
The failed query is:
<<<<<<<
$ hydra investigate show-monitor b11be702-5343-41e2-8ab7-f34a1cb3328c
$ hydra investigate show-logs b11be702-5343-41e2-8ab7-f34a1cb3328c
Logs:
Jenkins job URL