End to end tests are flaky in CI #1558
Comments
After investigation of the second flakiness, it appears that an epoch gap is responsible for the error, with 2 different origins:
After investigation, it appears that the Cardano nodes do not all have the same version of the immutable files (which should not happen in real conditions, as the immutable files must be strictly identical for all Cardano nodes on a network).
$ ls -al node-pool1/db/immutable | grep chunk | head
-rw-r--r-- 1 jp jp 85519 Feb 27 14:03 00000.chunk
-rw-r--r-- 1 jp jp 86272 Feb 27 14:03 00001.chunk
-rw-r--r-- 1 jp jp 86396 Feb 27 14:03 00002.chunk
-rw-r--r-- 1 jp jp 87216 Feb 27 14:03 00003.chunk
-rw-r--r-- 1 jp jp 86266 Feb 27 14:03 00004.chunk
-rw-r--r-- 1 jp jp 85800 Feb 27 14:03 00005.chunk
-rw-r--r-- 1 jp jp 46332 Feb 27 14:03 00006.chunk
$ ls -al node-pool2/db/immutable | grep chunk | head
-rw-r--r-- 1 jp jp 85519 Feb 27 14:03 00000.chunk
-rw-r--r-- 1 jp jp 86272 Feb 27 14:03 00001.chunk
-rw-r--r-- 1 jp jp 85688 Feb 27 14:03 00002.chunk
-rw-r--r-- 1 jp jp 85800 Feb 27 14:03 00003.chunk
-rw-r--r-- 1 jp jp 86732 Feb 27 14:03 00004.chunk
-rw-r--r-- 1 jp jp 85800 Feb 27 14:03 00005.chunk
-rw-r--r-- 1 jp jp 46332 Feb 27 14:03 00006.chunk
$ ls -al node-pool3/db/immutable | grep chunk | head
-rw-r--r-- 1 jp jp 85519 Feb 27 14:03 00000.chunk
-rw-r--r-- 1 jp jp 86272 Feb 27 14:03 00001.chunk
-rw-r--r-- 1 jp jp 85688 Feb 27 14:03 00002.chunk
-rw-r--r-- 1 jp jp 85800 Feb 27 14:03 00003.chunk
-rw-r--r-- 1 jp jp 86732 Feb 27 14:03 00004.chunk
-rw-r--r-- 1 jp jp 85800 Feb 27 14:03 00005.chunk
-rw-r--r-- 1 jp jp 46332 Feb 27 14:03 00006.chunk
We can clearly see that from
This situation is most likely due to the Cardano parameters that we use to make the test run fast. In that case, there is no chance that the [...] Nonetheless, the final outcome of the test would be a failure, as the end to end test would detect that some immutable files have not been signed. Given the low probability of occurrence of this flakiness, the fact that it is very unlikely to correspond to a real-life situation, and the overall high difficulty of properly handling that situation, there is no practical fix to implement.
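The manual comparison of the `node-pool*/db/immutable` listings above could be automated while debugging. The following is only a minimal sketch, not part of the end to end test suite: the helper names `chunk_digests` and `diverging_chunks` are hypothetical, and it compares SHA-256 digests rather than file sizes so that same-size chunks with different content are also caught.

```python
import hashlib
from pathlib import Path


def chunk_digests(immutable_dir):
    """Map each chunk file name to the SHA-256 hex digest of its content."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(immutable_dir).glob("*.chunk"))
    }


def diverging_chunks(digests_by_node):
    """Return the sorted chunk names whose digest differs between nodes.

    A chunk that is missing on at least one node also counts as diverging.
    """
    all_names = set()
    for digests in digests_by_node.values():
        all_names.update(digests)
    return sorted(
        name
        for name in all_names
        if len({digests.get(name) for digests in digests_by_node.values()}) > 1
    )


if __name__ == "__main__":
    # Directory layout taken from the listings above; adjust to the devnet path.
    nodes = ["node-pool1", "node-pool2", "node-pool3"]
    digests = {node: chunk_digests(f"{node}/db/immutable") for node in nodes}
    print(diverging_chunks(digests))
```

Run from the directory that contains the `node-pool*` folders, this prints the names of the chunks that are not identical on all three nodes.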
After investigation, it appears that a signature that was sent by one of the signers was not properly received by the aggregator relay (whereas it was successfully received by other peers on the pubsub P2P topic). The aggregator was unable to create a multi-signature with the individual signatures of only one signer, as the quorum was not reached. As there is no timeout on the Mithril stake distribution signed entity, this led to no certificate being created for the corresponding epoch (which translates to an epoch gap at the following epochs). As this feature is still experimental, occurs with low probability, and is currently being modified in issue #1587, we will not implement a fix directly, but we will keep monitoring whether it keeps occurring while developing #1587.
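To illustrate why a single lost signature can end in an epoch gap, here is a deliberately simplified sketch. It is not the aggregator's code: the `OpenMessage` class, its fields, and the "count of distinct signers" quorum are all assumptions made for illustration, and the real quorum rule may be more involved. It only shows that, without a timeout, a message that never reaches quorum stays pending indefinitely.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class OpenMessage:
    """Hypothetical model of a message waiting for individual signatures."""

    epoch: int
    quorum: int                      # assumed: number of distinct signers needed
    created_at: float                # seconds, arbitrary clock
    timeout: Optional[float] = None  # None models "no timeout configured"
    signers: Set[str] = field(default_factory=set)

    def register(self, signer_id: str) -> None:
        """Record that a signature from this signer reached the aggregator."""
        self.signers.add(signer_id)

    def status(self, now: float) -> str:
        """Return 'certified', 'expired' or 'pending' at the given time."""
        if len(self.signers) >= self.quorum:
            return "certified"
        if self.timeout is not None and now - self.created_at >= self.timeout:
            return "expired"
        return "pending"


if __name__ == "__main__":
    # Two signers expected, but only one signature reaches the aggregator.
    message = OpenMessage(epoch=5, quorum=2, created_at=0.0)
    message.register("signer-1")
    print(message.status(now=3600.0))
```

In this model the message never leaves the pending state when `timeout` is `None`, which mirrors the situation described above where no certificate is produced for the epoch.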
Why
Following the fix of the end to end flakiness in #1147, we have noticed that there are still some rare occurrences of flakiness.
Flakiness 1
Flakiness 2
What
Investigate the origins of the flakiness and provide fixes.
How
Fix flakiness 2