-
Notifications
You must be signed in to change notification settings - Fork 243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tests for connected-peers protocol (DSN). #1874
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really appreciate some comments about what invariants you're testing and why the outcomes are what they are because they don't seem logical to me. No need to write useless water, just short explanation. Look at pallet-subspace
or subspace-archiver
for examples for such comments.
} | ||
|
||
#[tokio::test()] | ||
async fn test_connection_decision() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and other tests will all run for 2 seconds, are you sure we can't make it run MUCH faster? Like 100ms decision timeout and large delay of 200ms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 seconds are insignificant compared to the whole testing duration of the project. However, I can try setting it to 1s. Generally, I want to set delays several orders of magnitude higher than the expected overhead of all the scaffolding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is significant. Tests on my machine take ~5 minutes to run. 5 tests here will increase time to run tests by at least 2 seconds when running all in parallel, while not really using any meaningful amount of CPU. So having faster tests is certainly a desirable property. And in CI we only have 2 cores right now, so these 5 tests will take 6 seconds to run (depending on how they are scheduled). 3 seconds for 5 simple test cases is a lot.
assert!(!peer1.is_connected(peer2.local_peer_id())); | ||
assert!(!peer2.is_connected(peer1.local_peer_id())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would they not remain connected if one of the peers still wants to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A peer can't force remote peers to maintain connections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... then we have a problem I think. Connected peers doesn't really maintain peers in connected state! It only prevents local end of the connection from being closed, which as you just mentioned is not sufficient to keep connections established.
So what this seems to mean is that we will still have a lot of reconnections unless both peers are interested in each other. More specifically this means nodes will continuously disconnect from farmers over and over again because farmers are never interested in nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... then we have a problem I think. Connected peers doesn't really maintain peers in connected state! It only prevents local end of the connection from being closed, which as you just mentioned is not sufficient to keep connections established.
I'm not sure I understand the problem. Peers can drop connections at any time with or without protocols for many reasons unless two peers decide to maintain them for some time.
So what this seems to mean is that we will still have a lot of reconnections unless both peers are interested in each other. More specifically this means nodes will continuously disconnect from farmers over and over again because farmers are never interested in nodes.
We could have reconnections with the same peers if you meant that but it would be the case after failing connections with all the other known peers.
Nodes will maintain connections with farmers because we have two instances of this protocol on the farmer and one of them maintains connections with nodes.
.authenticate(PlainText2Config { | ||
local_public_key: identity.public(), | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why plain text though, how do we benefit from it? Crypto cost should be minimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially removed that but swarm-test
crate has "test-only" quality and it doesn't work without it.
crates/subspace-networking/src/protocols/connected_peers/tests.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really appreciate some comments about what invariants you're testing and why the outcomes are what they are because they don't seem logical to me. No need to write useless water, just short explanation. Look at pallet-subspace or subspace-archiver for examples for such comments.
Hmm, I write long test names but apparently, it's not enough. I will add some extra comments.
assert!(!peer1.is_connected(peer2.local_peer_id())); | ||
assert!(!peer2.is_connected(peer1.local_peer_id())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A peer can't force remote peers to maintain connections.
.authenticate(PlainText2Config { | ||
local_public_key: identity.public(), | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially removed that but swarm-test
crate has "test-only" quality and it doesn't work without it.
crates/subspace-networking/src/protocols/connected_peers/tests.rs
Outdated
Show resolved
Hide resolved
} | ||
|
||
#[tokio::test()] | ||
async fn test_connection_decision() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 seconds are insignificant compared to the whole testing duration of the project. However, I can try setting it to 1s. Generally, I want to set delays several orders of magnitude higher than the expected overhead of all the scaffolding.
} | ||
|
||
#[tokio::test()] | ||
async fn test_connection_decision() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is significant. Tests on my machine take ~5 minutes to run. 5 tests here will increase time to run tests by at least 2 seconds when running all in parallel, while not really using any meaningful amount of CPU. So having faster tests is certainly a desirable property. And in CI we only have 2 cores right now, so these 5 tests will take 6 seconds to run (depending on how they are scheduled). 3 seconds for 5 simple test cases is a lot.
assert!(!peer1.is_connected(peer2.local_peer_id())); | ||
assert!(!peer2.is_connected(peer1.local_peer_id())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... then we have a problem I think. Connected peers doesn't really maintain peers in connected state! It only prevents local end of the connection from being closed, which as you just mentioned is not sufficient to keep connections established.
So what this seems to mean is that we will still have a lot of reconnections unless both peers are interested in each other. More specifically this means nodes will continuously disconnect from farmers over and over again because farmers are never interested in nodes.
crates/subspace-networking/src/protocols/connected_peers/tests.rs
Outdated
Show resolved
Hide resolved
break; | ||
} | ||
} | ||
} | ||
|
||
// We don't maintain with new peers when we have enough connected peers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is helpful, but there are 3 blocks. The comment explains the high idea behind the whole test, it doesn't really comment what is happening here. And what is happening in these 3 blocks of assertions is checks where some peers are within limits and thus maintain connections and others are not and thus don't maintain connections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the comment.
This PR adds several tests to
connected-peers
protocol.Code contributor checklist: