Fix end to end test flakiness in CI #1486

Merged: jpraynaud merged 28 commits into main from jpraynaud/1147-fix-e2e-flakiness on Feb 23, 2024

@jpraynaud jpraynaud commented Feb 2, 2024

Content

This PR includes fixes for the flakiness recently observed in the CI:

  • Error: 'mithril-aggregator genesis bootstrap' exited with code: 1
  • Timeout exhausted for target epoch to be reached
  • Minimum expected mithril stake distribution epoch not reached : XX < YY
  • Run devnet exited with status code: 1
  • Write era marker on chain exited with status code: 1
  • Timeout exhausted assert_node_producing_mithril_stake_distribution, no response from http://localhost:8080/aggregator/artifact/mithril-stake-distributions

Pre-submit checklist

  • Branch
    • Tests are provided (if possible)
    • Crates versions are updated (if relevant)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
  • PR
    • No clippy warnings in the CI
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested

Issue(s)

Closes #1147

@jpraynaud jpraynaud self-assigned this Feb 2, 2024

github-actions bot commented Feb 2, 2024

Test Results

  3 files  ±0   42 suites  ±0   8m 31s ⏱️ +13s
864 tests  - 1  864 ✅  - 1  0 💤 ±0  0 ❌ ±0 
953 runs   - 1  953 ✅  - 1  0 💤 ±0  0 ❌ ±0 

Results for commit fa5e910. ± Comparison against base commit 52fa7d3.

♻️ This comment has been updated with latest results.

@jpraynaud jpraynaud force-pushed the jpraynaud/1147-fix-e2e-flakiness branch 12 times, most recently from 22a54fd to 474c8fa Compare February 5, 2024 14:52
@jpraynaud jpraynaud force-pushed the jpraynaud/1147-fix-e2e-flakiness branch 10 times, most recently from 1f329d5 to b9f63a1 Compare February 8, 2024 17:45
@jpraynaud jpraynaud force-pushed the jpraynaud/1147-fix-e2e-flakiness branch 5 times, most recently from e5ba96a to 5a75559 Compare February 15, 2024 15:20
jpraynaud and others added 25 commits February 23, 2024 17:02
Bootstrap protocol parameters on first launch only.

Attempt to fix temporal effects at startup (fast epochs) resulting in
error: '`mithril-aggregator genesis bootstrap` exited with code: 1'
The discrepancy can occur not only at startup and not only on the 'serve' command.
This was a source of flakiness in the e2e test.
This led to Cardano nodes running on divergent forks,
and caused a panic in the era reader, which did not retrieve any marker.
This was responsible for flakiness in the network with low values
of slot length and epoch length (such as those used in the E2E test).
It was responsible for era marker readers retrieving no values and
panicking.
This avoids flakiness due to 'InsufficientPeers' errors from the P2P network.
Use  instead of the default value, to avoid 'InsufficientPeers' errors in the P2P network.
Attempt to avoid not reaching the quorum in some rare cases.
Use the security parameter value ('10') to make sure that the transaction is synchronized on all Cardano nodes.
This avoids some nodes panicking at startup because of empty era markers retrieval.
This avoids a panic on Mithril nodes in case of a rollback, which results in empty markers on chain.
We currently sign the penultimate immutable chunk, which means that transactions
could be either in the ultimate immutable chunk or in the volatile db, and thus not signed
by Mithril. This has led to only a subset of the submitted transactions being
verified by the client.
This avoids having an unnecessary BFT node running in the devnet when it is started from the e2e test.
- 'mithril-aggregator' from '0.4.39' to '0.4.40'
- 'mithril-common' from '0.3.7' to '0.3.8'
- 'mithril-relay' from '0.1.11' to '0.1.12'
- 'mithril-end-to-end' from '0.3.3' to '0.4.0'
- 'mithril-devnet' from '0.3.0' to '0.3.1'
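
Several of the commit messages above mention nodes panicking when no era marker is retrieved from the chain, for example after a rollback. The following is a minimal sketch of the general idea of returning an absent value instead of panicking. It is not the code from this PR: the `EraMarker` struct, its fields, and the `current_era` function are hypothetical names chosen for illustration, and the era names used in any example are placeholders.

```rust
/// Hypothetical era marker read from the chain (not the actual Mithril type).
#[derive(Debug, Clone, PartialEq)]
struct EraMarker {
    name: String,
    /// Epoch at which the era becomes active; `None` if not yet scheduled.
    epoch: Option<u64>,
}

/// Returns the name of the era active at `current_epoch`.
/// Yields `None` instead of panicking when no marker applies,
/// for example when a rollback left no markers on chain.
fn current_era(markers: &[EraMarker], current_epoch: u64) -> Option<&str> {
    markers
        .iter()
        // Keep only markers that are scheduled and already active.
        .filter(|m| m.epoch.map_or(false, |e| e <= current_epoch))
        // Among those, pick the one with the most recent activation epoch.
        .max_by_key(|m| m.epoch)
        .map(|m| m.name.as_str())
}
```

With this shape, a caller that receives `None` can decide to retry on the next tick or fall back to a default era, rather than aborting the node.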
@jpraynaud jpraynaud force-pushed the jpraynaud/1147-fix-e2e-flakiness branch from 45629d1 to fa5e910 Compare February 23, 2024 16:03
@jpraynaud jpraynaud merged commit 6e1b2b4 into main Feb 23, 2024
37 of 39 checks passed
@jpraynaud jpraynaud deleted the jpraynaud/1147-fix-e2e-flakiness branch February 23, 2024 16:14
Development

Successfully merging this pull request may close these issues.

Some end to end tests are flaky in the CI
4 participants