Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test for errors during large message send/receipt + retry on send fail #289

Merged
merged 9 commits into from
Jul 15, 2021

Conversation

joshuef
Copy link
Contributor

@joshuef joshuef commented Jul 12, 2021

Close issue maidsafe/safe_network#69

  • adds retry attempts on new connections
  • adds retry atetmpts on sending messages
  • adds test for sending many 1mb messages

@joshuef joshuef requested a review from a team as a code owner July 12, 2021 07:47
@joshuef joshuef force-pushed the Tests-TimedOuts branch 3 times, most recently from 6f0d8bb to b322289 Compare July 13, 2021 13:39
src/connection_deduplicator.rs Outdated Show resolved Hide resolved
src/connections.rs Outdated Show resolved Hide resolved
@joshuef joshuef changed the title Test for errors during large message send/receipt Test for errors during large message send/receipt + retry on send fail Jul 15, 2021
src/connection_deduplicator.rs Outdated Show resolved Hide resolved
src/connections.rs Outdated Show resolved Hide resolved
src/endpoint.rs Outdated Show resolved Hide resolved
src/endpoint.rs Outdated Show resolved Hide resolved
Return quinn error from deduplicator
remove an unneeded clone
reduce max attempts
@@ -3,3 +3,4 @@
Cargo.lock
*.*~
.idea
*.log
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing a line break?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you mean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean it needs an extra line break at the end of file.
otherwise it will show this red icon.

src/connections.rs Outdated Show resolved Hide resolved
Yoga07
Yoga07 previously approved these changes Jul 15, 2021
@joshuef joshuef enabled auto-merge (rebase) July 15, 2021 12:39
@@ -193,15 +201,22 @@ pub(super) fn listen_for_incoming_messages(
) {
let src = *remover.remote_addr();
let _ = tokio::spawn(async move {
debug!("qp2p another incoming listerner spawned");
let _ = future::join(
match future::try_join(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to confirm.
so this try_join will return on the first error encounter.
which I am not sure what's happening to the other.
Is the read on the other stream will be terminated at the same time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh the other future will be cancelled as I understand.

error!(
"Failed reading from a bi-stream for peer {:?} with error: {:?}",
warn!(
"Failed reading from a bi-stream for peer {:?} with: {:?}",
peer_addr, err
);
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... no sure about breaking here (and that error of FinishedEarly), which actually now returns Ok from the function.
So we don't need (or don't know what to do) under these error situations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I can figure, it's a question of "can this connection still be used". If so, we do not error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we were breaking before here, so that's not a change.

As I understand, here we've received the bytes and it's the processing that fails (some message fails), but the connection is still valid, so we don't want to err out and cause a disconnect.

let mut attempts: usize = 0;
let mut res = self.try_send_message(msg.clone(), dest).await;

while attempts < MAX_ATTEMPTS && res.is_err() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will try resends for 10 times, then on disconnection, try re-connect for 10 times as well ?
and each takes an interval of 0.5s, which makes a total of 50s to detect a really failed connection ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be. (worth nothing, this has already been updated to MAX_ATTEMPTS=5 ).

What's an acceptable time here? How many tries do we want?

50s doesn't seem too bad. to be sure, vs just a 30s timeout eg

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mainly just confirming this setup is ok.

@joshuef joshuef merged commit 56a7033 into maidsafe:master Jul 15, 2021
@@ -28,6 +28,17 @@ pub(crate) struct Connection {
remover: ConnectionRemover,
}

/// Disconnection events, and the result that led to disconnection.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated comment?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants