@jamillambert
Collaborator

The intermittent integration test failures are caused by race conditions in the creation of the node, the wallet, or the cookie file. Change Node::with_conf to detect a failure and retry the section where it occurred or, if that still fails, retry the whole process of starting the node.

First patch:

  • Split out some sections into helper functions.

Second patch:

  • Remove the loop in with_conf() that creates client, instead loop over the individual parts.
  • Create helper functions for creating the client, and creating or loading the client wallet, both with their own retry logic.
  • Reduce the sleep time between retries to shorten the delay, since the errors appear to happen over shorter periods than the existing 1-second wait.
  • Add functions to wait until the client and the cookie file are available.
  • Add a loop to retry the whole process if it fails. The integration tests pass when repeated, so repeat only the section where the failures occur instead.
  • Remove all the debugging output; now that everything works, it is no longer needed.
  • Update the rustdocs for the function.

Third patch:

  • Revert #206; since the issue is now solved at its source, the pause is no longer needed.

Closes #205
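To illustrate the "wait until the cookie file is available" step described above, here is a minimal sketch of polling for a file with a timeout. The name `wait_for_file`, the `String` error type, and the 50 ms poll interval are assumptions made for this example; the actual `wait_for_cookie_file` in the patch may be written differently.

```rust
use std::path::Path;
use std::thread;
use std::time::{Duration, Instant};

/// Polls until `path` exists or `timeout` elapses.
/// Hypothetical helper for illustration only, not the code from this PR.
fn wait_for_file(path: &Path, timeout: Duration) -> Result<(), String> {
    let start = Instant::now();
    // Assumed poll interval; kept short so a late file is noticed quickly.
    let poll = Duration::from_millis(50);
    loop {
        if path.exists() {
            return Ok(());
        }
        if start.elapsed() >= timeout {
            return Err(format!("timed out waiting for {}", path.display()));
        }
        thread::sleep(poll);
    }
}
```

A caller would treat the `Err` case as a signal to clean up and start again, which is what the whole-process retry in the second patch is for.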

with_conf() is large, split out some of the variable creation sections
into helper functions.
@jamillambert jamillambert force-pushed the 0529-node-refactor branch 2 times, most recently from 19069bf to 1571aec Compare May 29, 2025 20:05
@jamillambert jamillambert changed the title Fix integration test failures by fixing Node::with_conf Fix integration test failures by making Node::with_conf more robust May 29, 2025
Member

@tcharding tcharding left a comment

Mad, good work man.

        }

        if Self::wait_for_cookie_file(cookie_file.as_path(), Duration::from_secs(5)).is_err() {
            let _ = process.kill();
Member

Maybe a code comment here saying why we kill the process if waiting for the cookie file fails?

Collaborator Author

I have added comments to all the checks where it kills the process and retries.

node/src/lib.rs Outdated
            }
            thread::sleep(Duration::from_millis(200));
        }
        unreachable!()
Member

Perhaps this is cleaner?

        for _ in 0..9 {
            if let Ok(client) = Client::new_with_auth(rpc_url, auth.clone()) {
                return Ok(client);
            }
            thread::sleep(Duration::from_millis(200));
        }
        Client::new_with_auth(rpc_url, auth.clone()).map_err(|e| Error::NoBitcoindInstance(e.to_string()).into())

Collaborator Author

Thanks, it is. I changed it and also rewrote create_client_wallet to use the same format.
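For readers following the thread, the shape suggested above (loop over all but the last attempt, then make one final call whose error is returned) can be written as a generic helper. This is only a sketch: `retry`, its parameters, and the closure-based design are assumptions for illustration and are not part of the patch.

```rust
use std::thread;
use std::time::Duration;

/// Calls `op` up to `attempts` times, sleeping `delay` after each failure.
/// The final attempt's result is returned unchanged, so its error reaches
/// the caller and no `unreachable!()` is needed after the loop.
/// Hypothetical helper; an `attempts` of 0 still results in one call.
fn retry<T, E>(
    attempts: usize,
    delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    for _ in 1..attempts {
        if let Ok(value) = op() {
            return Ok(value);
        }
        thread::sleep(delay);
    }
    op()
}
```

With `attempts` set to 10 and a 200 ms delay, this makes the same number of calls as the suggestion: nine inside the loop and one final call.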

@jamillambert
Collaborator Author

NB. I think there is a CI issue: this passed CI, but I got an error locally when I accidentally ran cargo test in the root instead of in integration_test. The failure was a doctest at node/src/lib.rs:205 because I had changed the default values but not updated the docs.

@jamillambert
Collaborator Author

Made the suggested changes and added a few more comments.

Split out client and wallet sections into helper functions with their
own retry loops.

Add wait functions for client and cookie file.

If it fails, retry all of with_conf, up to conf.attempts times.

Improve the rustdocs.
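As a rough sketch of the whole-process retry described in this commit message: start the process, check that it is ready, and on failure kill it before trying again. The `Process` trait, `start_with_attempts`, and the closure parameters are hypothetical stand-ins chosen to keep the example self-contained; the real code works with the spawned bitcoind process and `conf.attempts`.

```rust
/// Hypothetical stand-in for a spawned node process.
trait Process {
    fn kill(&mut self);
}

/// Tries to bring a process up, making at most `attempts` tries.
/// A process that fails its readiness check is killed before the next try,
/// so each attempt starts from a clean state.
fn start_with_attempts<P: Process>(
    attempts: u8,
    mut spawn: impl FnMut() -> Result<P, String>,
    mut ready: impl FnMut(&mut P) -> Result<(), String>,
) -> Result<P, String> {
    let mut last_err = String::from("no attempts made");
    for _ in 0..attempts {
        let mut process = match spawn() {
            Ok(p) => p,
            Err(e) => {
                last_err = e;
                continue;
            }
        };
        match ready(&mut process) {
            Ok(()) => return Ok(process),
            Err(e) => {
                // Do not leave a half-started process behind.
                process.kill();
                last_err = e;
            }
        }
    }
    Err(last_err)
}
```

The error from the last failed attempt is what the caller sees, which keeps the failure message relevant when every attempt fails.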
Member

@tcharding tcharding left a comment

ACK 8c9a4a4

@tcharding tcharding merged commit 4e35595 into rust-bitcoin:master May 30, 2025
29 checks passed
blaze-smith470pm added a commit to blaze-smith470pm/corepc that referenced this pull request Sep 26, 2025
…g Node::with_conf more robust

8c9a4a49e971782da87fbabdd0e7be318dded8b8 Revert "integration_test: Pause after node comes up" (Jamil Lambert, PhD)
66e7e7c3c188f09d27a7edede88c393cdf9318a5 Refactor Node::with_conf (Jamil Lambert, PhD)
2b8fad91fb885642f3d69268198374d92cd85ca0 Split out sections into helper functions (Jamil Lambert, PhD)

Pull request description:

  The intermittent integration test failures are caused by race conditions in the creation of the node, wallet or the cookie file. Change `Node::with_conf` to check if there is a problem and retry the section where it occurred, or if that still fails retry the whole process of starting the node.

  First patch:
  - Split out some sections into helper functions.

  Second patch:
  - Remove the loop in `with_conf()` that creates `client`, instead loop over the individual parts.
  - Create helper functions for creating the `client`, and creating or loading the client `wallet`, both with their own retry logic.
  - Reduce the sleep time between retries to shorten the delay, since the errors appear to happen over shorter periods than the existing 1-second wait.
  - Add functions to wait until the client and the cookie file are available.
  - Add a loop to retry the whole process if it fails. The integration tests pass when repeated, so repeat only the section where the failures occur instead.
  - Remove all the debugging output; now that everything works, it is no longer needed.
  - Update the rustdocs for the function.

  Third patch:
  - Revert #206; since the issue is now solved at its source, the pause is no longer needed.

  Closes #205

ACKs for top commit:
  tcharding:
    ACK 8c9a4a49e971782da87fbabdd0e7be318dded8b8

Tree-SHA512: d80ef5e4def47025e830dd84aaa99e8ed0b71302f95bfc3d0ec210226444ce45a0679c569c5cbcf152cccc3222f812e4a8e5ab4ed5383a89ae2539f1f49baed4
Development

Successfully merging this pull request may close these issues.

Random integration test failures because: it appears that bitcoind is not reachable
