Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: poll in reconnect test #4220

Merged
merged 1 commit into from
Feb 4, 2024

Conversation

bradleystachurski
Copy link
Member

reconnect_test is flakey, which can likely be resolved by polling.

https://github.com/fedimint/fedimint/actions/runs/7747755393/job/21128923550#step:5:1433

2024-02-01T21:34:35.3311143Z 00:00:30 Error: command: /nix/store/fdbrlr3n4n94gs5a1lgnzskhgmv8pimq-fedimint-cli/bin/fedimint-cli --data-dir=/tmp/nix-shell.qkvd7G/devimint-73588/cl
ients/default dev api module_2_block_count
2024-02-01T21:34:35.3312020Z 00:00:30
2024-02-01T21:34:35.3312197Z 00:00:30 Caused by:
2024-02-01T21:34:35.3312415Z 00:00:30     exit status: 1
2024-02-01T21:34:35.3312648Z 00:00:30     stdout:
2024-02-01T21:34:35.3312845Z 00:00:30
2024-02-01T21:34:35.3313047Z 00:00:30     stderr:
2024-02-01T21:34:35.3313661Z 00:00:30     2024-02-01T21:34:34.877575Z  INFO fedimint_cli::db_locked: Acquiring database lock
2024-02-01T21:34:35.3314777Z 00:00:30     2024-02-01T21:34:35.074923Z  INFO fedimint_client: Last client reference dropped, shutting down client task group
2024-02-01T21:34:35.3315862Z 00:00:30     2024-02-01T21:34:35.077403Z  INFO fedimint_client::sm::executor: Starting state machine executor task
2024-02-01T21:34:35.3317015Z 00:00:30     2024-02-01T21:34:35.081059Z  INFO fedimint_client::sm::executor: Shutting down state machine executor runner due to shutdown signal
2024-02-01T21:34:35.3317705Z 00:00:30     {
2024-02-01T21:34:35.3317921Z 00:00:30       "error": "CliError",
2024-02-01T21:34:35.3318208Z 00:00:30       "kind": "general_failure",
2024-02-01T21:34:35.3319757Z 00:00:30       "message": "Federation rpc error {general => Received errors from 1 peers: peer-0: Rpc error: Networking or low-level protocol error:
Error when opening the TCP socket: Connection refused (os error 111)), 0 => Rpc error: Networking or low-level protocol error: Error when opening the TCP socket: Connection refus
ed (os error 111)), }",
2024-02-01T21:34:35.3322451Z 00:00:30       "raw_error": "Federation rpc error {general => Received errors from 1 peers: peer-0: Rpc error: Networking or low-level protocol error
: Error when opening the TCP socket: Connection refused (os error 111)), 0 => Rpc error: Networking or low-level protocol error: Error when opening the TCP socket: Connection ref
used (os error 111)), }"
2024-02-01T21:34:35.3323774Z 00:00:30     }
2024-02-01T21:34:35.3323966Z 00:00:30
2024-02-01T21:34:35.3324168Z ## FAILED: reconnect_test

@bradleystachurski bradleystachurski requested a review from a team as a code owner February 2, 2024 03:07
Copy link

codecov bot commented Feb 2, 2024

Codecov Report

Attention: 14 lines in your changes are missing coverage. Please review.

Comparison is base (df9b41a) 57.94% compared to head (a3d82e9) 57.99%.

Files Patch % Lines
devimint/src/federation.rs 0.00% 13 Missing ⚠️
devimint/src/tests.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4220      +/-   ##
==========================================
+ Coverage   57.94%   57.99%   +0.05%     
==========================================
  Files         197      197              
  Lines       43413    43414       +1     
==========================================
+ Hits        25154    25177      +23     
+ Misses      18259    18237      -22     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -1558,7 +1558,11 @@ pub async fn reconnect_test(dev_fed: DevFed, process_mgr: &ProcessManager) -> Re

fed.start_server(process_mgr, 0).await?;
fed.mine_then_wait_blocks_sync(100).await?;
fed.await_all_peers().await?;
poll("federation back online", None, || async {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking aloud: should this be just part of await_all_peers? Otherwise won't we hit it soon someplace else using it?

I guess if someone is using await_all_peers ... they can kind of expect some peers to be down, and don't want this function to fail because of it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an excellent observation 🙂

auto-merge was automatically disabled February 3, 2024 23:06

Head branch was pushed to by a user without write access

@dpc dpc added this pull request to the merge queue Feb 4, 2024
Merged via the queue into fedimint:master with commit 5a941b7 Feb 4, 2024
22 checks passed
@bradleystachurski bradleystachurski deleted the fix-reconnect-test branch February 4, 2024 03:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants