Intermittent CI failure tracking issue II #2825

Open
elsirion opened this issue Jul 23, 2023 · 29 comments
Labels
CI Continuous integration

Comments

@elsirion
Contributor

Successor of #222.

https://github.com/fedimint/fedimint/actions/runs/5637590680/job/15270864510?pr=2824

@elsirion elsirion added the CI Continuous integration label Jul 23, 2023
@elsirion
Contributor Author

cannot_replay_transactions gets stuck

@justinmoon
Contributor

cli_test_reconnect

@okjodom
Contributor

okjodom commented Aug 1, 2023

@justinmoon
Contributor

justinmoon commented Aug 3, 2023

@justinmoon
Contributor

Some debugging ideas from call:

  • Compile another project in parallel to introduce CPU contention
  • Put a loop around where it fails
  • Write bash scripts to re-run one test many times to try to trigger failure
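
The re-run idea above could be sketched as follows, in Rust rather than bash to match the codebase. This is only an illustration under stated assumptions: the helper name `first_failing_run` and the `rerun` command-line shape are hypothetical and not existing project tooling.

```rust
use std::env;
use std::io;
use std::process::Command;

/// Runs `program` with `args` up to `max_runs` times, inheriting stdout/stderr
/// so the output of the failing run stays visible.
/// Returns the 1-based index of the first unsuccessful run, or `None` if every
/// run succeeded. Hypothetical helper; not part of the fedimint repository.
fn first_failing_run(program: &str, args: &[String], max_runs: u32) -> io::Result<Option<u32>> {
    for run in 1..=max_runs {
        let status = Command::new(program).args(args).status()?;
        if !status.success() {
            return Ok(Some(run));
        }
    }
    Ok(None)
}

fn main() -> io::Result<()> {
    // Assumed usage: rerun <max_runs> <program> [args...]
    let mut argv = env::args().skip(1);
    let (Some(max_runs), Some(program)) = (argv.next(), argv.next()) else {
        eprintln!("usage: rerun <max_runs> <program> [args...]");
        return Ok(());
    };
    let max_runs: u32 = max_runs
        .parse()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
    let args: Vec<String> = argv.collect();

    match first_failing_run(&program, &args, max_runs)? {
        Some(run) => println!("first failure on run {run} of {max_runs}"),
        None => println!("no failure in {max_runs} runs"),
    }
    Ok(())
}
```

Assuming the binary is built as `rerun`, something like `rerun 100 cargo test <test name>` would stop at the first failing run. Running a second build in another terminal at the same time would add the CPU contention from the first bullet.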

@justinmoon
Contributor

justinmoon commented Aug 7, 2023

I notice that all of the lightning-related test failures seem to happen in cli_test_rust_tests_bitcoind and never cli_test_rust_tests_electrs or cli_test_rust_tests_esplora ...

@justinmoon
Contributor

test_gateway_cannot_pay_expired_invoice & test_gateway_client_pay_unpayable_invoice https://github.com/fedimint/fedimint/actions/runs/5789072035/job/15689333467?pr=2911#step:9:7236

@elsirion
Contributor Author

elsirion commented Aug 17, 2023

cli_test_latency, cli_test_cli and cli_test_reconnect: https://github.com/fedimint/fedimint/actions/runs/5888250155/job/15969079807

@elsirion
Contributor Author

elsirion commented Aug 18, 2023

@dpc
Contributor

dpc commented Aug 18, 2023

Note: #2980 should make it much easier to debug failed runs.

@elsirion
Contributor Author

elsirion commented Aug 26, 2023

Something got a lot of the code coverage tests stuck here (search the log for "has been").

@elsirion
Contributor Author

elsirion commented Aug 29, 2023

All the backed_tests failed, so this may not be a flake.

@elsirion
Contributor Author

elsirion commented Sep 5, 2023

Flakiness has intensified: a lot of the code coverage tests apparently just get stuck.

@bradleystachurski
Member

bradleystachurski commented Nov 28, 2023

FAIL [  36.905s] fedimint-ln-gateway::gatewayd-integration-tests test_gateway_executes_swaps_between_connected_federations

https://github.com/fedimint/fedimint/actions/runs/7013415232/job/19079522681?pr=3746

Same failure in separate PR

https://github.com/fedimint/fedimint/actions/runs/7015530652/job/19085010433?pr=3756

@okjodom
Contributor

okjodom commented Jan 23, 2024

test_config_api possibly flaky

fedimint-workspace-lcov-ci> SLOW [>420.000s] fedimint-server config::api::tests::test_config_api
fedimint-workspace-lcov-ci> TERMINATING [>480.000s] fedimint-server config::api::tests::test_config_api
fedimint-workspace-lcov-ci> TIMEOUT [ 480.026s] fedimint-server config::api::tests::test_config_api
fedimint-workspace-lcov-ci> --- STDOUT: fedimint-server config::api::tests::test_config_api ---
fedimint-workspace-lcov-ci> running 1 test
fedimint-workspace-lcov-ci> test config::api::tests::test_config_api has been running for over 60 seconds
fedimint-workspace-lcov-ci>
fedimint-workspace-lcov-ci> --- STDERR: fedimint-server config::api::tests::test_config_api ---
fedimint-workspace-lcov-ci> 2024-01-22T19:41:15.341974Z INFO consensus: Starting config gen
fedimint-workspace-lcov-ci> 2024-01-22T19:41:15.342039Z INFO net::peer::dkg: Created new config gen Api
fedimint-workspace-lcov-ci> 2024-01-22T19:41:15.343480Z INFO net::api: Starting api on ws://127.0.0.1:25031
fedimint-workspace-lcov-ci> 2024-01-22T19:41:15.343510Z INFO consensus: Starting config gen
fedimint-workspace-lcov-ci> 2024-01-22T19:41:15.343601Z INFO net::peer::dkg: Created new config gen Api
fedimint-workspace-lcov-ci> 2024-01-22T19:41:15.343872Z INFO net::api: Starting api on ws://127.0.0.1:25033
fedimint-workspace-lcov-ci> 2024-01-22T19:41:15.343892Z INFO consensus: Starting config gen
fedimint-workspace-lcov-ci> 2024-01-22T19:41:15.343925Z INFO net::peer::dkg: Created new config gen Api
fedimint-workspace-lcov-ci> 2024-01-22T19:41:16.342903Z INFO test: Test retrying server status
fedimint-workspace-lcov-ci> 2024-01-22T19:41:16.345235Z INFO net::peer::dkg: Set password for config gen
fedimint-workspace-lcov-ci> 2024-01-22T19:41:16.345542Z WARN net::api: api request error path="set_password" error=ApiError { code: 400, message: "Expected to be in AwaitingPassword state" }
fedimint-workspace-lcov-ci> thread 'tokio-runtime-worker' panicked at fedimint-server/src/lib.rs:206:14:
fedimint-workspace-lcov-ci> Could not build API server: API name: config-gen
fedimint-workspace-lcov-ci> Caused by:
fedimint-workspace-lcov-ci> 0: Bind address: 127.0.0.1:25035
fedimint-workspace-lcov-ci> 1: Networking or low-level protocol error: Address already in use (os error 98)
fedimint-workspace-lcov-ci> 2: Address already in use (os error 98)

https://github.com/fedimint/fedimint/actions/runs/7616433363/job/20743174870#step:6:2315
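
The log above ends with a bind to the fixed address 127.0.0.1:25035 failing with "Address already in use". One common way to sidestep that class of error in tests is to bind to port 0 and let the OS pick a free port. The sketch below only illustrates that general technique; whether it fits how fedimint allocates test ports is an assumption, and `bind_ephemeral` is a hypothetical helper name.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};

/// Binds to 127.0.0.1 on an OS-assigned port and returns the listener together
/// with the concrete address it received.
/// Hypothetical helper for illustration; not taken from the fedimint codebase.
fn bind_ephemeral() -> io::Result<(TcpListener, SocketAddr)> {
    // Port 0 asks the OS for any currently free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}

fn main() -> io::Result<()> {
    let (first, first_addr) = bind_ephemeral()?;
    println!("first listener on {first_addr}");

    // Binding the same concrete address while `first` is still alive is the
    // kind of collision a hard-coded port can run into.
    match TcpListener::bind(first_addr) {
        Ok(_) => println!("second bind on {first_addr} unexpectedly succeeded"),
        Err(e) => println!("second bind on {first_addr} failed: {:?}", e.kind()),
    }

    // A second ephemeral bind gets its own port instead of colliding.
    let (_second, second_addr) = bind_ephemeral()?;
    println!("second listener on {second_addr}");

    drop(first);
    Ok(())
}
```

The listener should be kept alive and handed to the server. Dropping it and passing only the port number on reopens a window in which another process can grab the port.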

@okjodom
Contributor

okjodom commented Jan 24, 2024

CI reconnect_test fails on the merge queue

@bradleystachurski
Member

FAIL [   5.062s] fedimint-ln-gateway::gatewayd-integration-tests test_gateway_register_with_federation

https://github.com/fedimint/fedimint/actions/runs/7703671558/job/20994518252?pr=4166#step:6:2124

Timeouts in separate PR (merge queue)

Summary [ 505.141s] 200 tests run: 198 passed (2 slow), 2 timed out, 0 skipped
TIMEOUT [ 480.039s] fedimint-mint-tests::fedimint_mint_tests sends_ecash_out_of_band_cancel
TIMEOUT [ 480.027s] fedimint-server config::api::tests::test_restart_setup

https://github.com/fedimint/fedimint/actions/runs/7702842445/job/20991977876#step:6:38066

@bradleystachurski
Member

bradleystachurski commented Jan 30, 2024

FAIL [   9.208s] fedimint-ln-gateway::gatewayd-integration-tests test_gateway_configuration

https://github.com/fedimint/fedimint/actions/runs/7704054812/job/20995614440?pr=4169#step:6:1835

FAIL [  10.728s] fedimint-ln-gateway::gatewayd-integration-tests test_gateway_register_with_federation

https://github.com/fedimint/fedimint/actions/runs/7835068918/job/21382888265?pr=4267#step:6:2189


5 participants