Add a boolean query flag to request supported versions #4256

Merged
merged 27 commits into from
May 8, 2023

Conversation

jprider63
Contributor

Description

This is a WIP PR implementing #3907. Specifically, it adds a boolean query flag to NodeTo{Node,Client}VersionData that causes the other party to return their list of supported versions instead of the negotiated version. I still need to do more testing.
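A minimal sketch of the idea, not the PR's actual definitions: `VersionData`, `Response`, and `respond` are hypothetical names introduced here for illustration, and the real `NodeTo{Node,Client}VersionData` types carry more fields. It only shows how a boolean flag in the proposed version data could switch the responder between "pick one version" and "list everything I support".

```haskell
import Data.Word (Word32)
import qualified Data.Map.Strict as Map

type VersionNumber = Int

-- Hypothetical stand-in for NodeTo{Node,Client}VersionData.
data VersionData = VersionData
  { networkMagic :: Word32
  , query        :: Bool   -- True: ask for the supported versions instead of negotiating
  } deriving (Eq, Show)

-- What the responder sends back (assumed shape, for illustration only).
data Response
  = Negotiated VersionNumber VersionData     -- the single version that was chosen
  | Supported [(VersionNumber, VersionData)] -- every version the responder supports
  deriving (Eq, Show)

-- First argument: the responder's own versions; second: the initiator's proposal.
respond
  :: Map.Map VersionNumber VersionData
  -> Map.Map VersionNumber VersionData
  -> Maybe Response
respond supported proposed
  -- Any proposed version with the query flag set turns the exchange into a query.
  | any query proposed = Just (Supported (Map.toDescList supported))
  -- Otherwise pick the highest version both sides know (Nothing if there is none).
  | otherwise =
      uncurry Negotiated
        <$> Map.lookupMax (Map.intersectionWith const supported proposed)
```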

Checklist

  • Branch
    • Commit sequence broadly makes sense
    • Commits have useful messages
    • New tests are added if needed and existing tests are updated
    • If this branch changes Consensus and has any consequences for downstream repositories or end users, said changes must be documented in interface-CHANGELOG.md
    • If this branch changes Network and has any consequences for downstream repositories or end users, said changes must be documented in interface-CHANGELOG.md
    • If serialization changes, user-facing consequences (e.g. replay from genesis) are confirmed to be intentional.
  • Pull Request
    • Self-reviewed the diff
    • Useful pull request description at least containing the following information:
      • What does this PR change?
      • Why these changes were needed?
      • How does this affect downstream repositories and/or end-users?
      • Which ticket does this PR close (if any)? If it does, is it linked?
    • Reviewer requested

@coot coot linked an issue Jan 9, 2023 that may be closed by this pull request
Contributor

@coot coot left a comment

Thanks for the PR. Below are a few requests for changes. The two most important ones are:

  • modify the protocol to add a MsgQueryResp rather than reuse MsgReplyVersions

  • I think we should terminate connections which had the query flag on.

Ad the last point:
This will require changes in ConnectionHandler for P2P and Ouroboros.Network.Socket for NonP2P.
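The two requests can be pictured with a simplified sketch. `MsgQueryResp` is the name suggested in this review; `ServerReply`, `NextStep`, and `nextStep` are hypothetical names, and the payloads are reduced to bare version numbers. The real handshake is a typed protocol and looks different; this only illustrates why a dedicated reply message makes "query, then close" a separate, easy-to-recognise path.

```haskell
-- Hypothetical, simplified view of what the responder can send back.
data ServerReply
  = MsgAcceptVersion Int   -- normal negotiation: the chosen version
  | MsgRefuse String       -- normal negotiation: refusal with a reason
  | MsgQueryResp [Int]     -- query only: every supported version
  deriving (Eq, Show)

-- What the connection handler does once the handshake has finished.
data NextStep = StartMiniProtocols | CloseConnection
  deriving (Eq, Show)

-- Only an accepted negotiation leads to a usable connection; a query reply
-- always ends with the connection being terminated.
nextStep :: ServerReply -> NextStep
nextStep (MsgAcceptVersion _) = StartMiniProtocols
nextStep (MsgRefuse _)        = CloseConnection
nextStep (MsgQueryResp _)     = CloseConnection
```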

@coot
Contributor

coot commented Jan 19, 2023

Here's why I think we should terminate connections that use the query flag.

The query flag changes how the negotiation is done. Without this flag, the client sends a map of supported versions and parameters; the server chooses one and sends back its choice. With querying, the client does the same thing, but the server sends back all its versions. Both sides then compute the negotiated version and data on their own end. If neither of them has changed its map in the meantime, the result should be the same on both ends. With peer sharing, the monoid we use is no longer commutative, so we need to be doubly careful that we compute the accepted versions in the right order in both places.

If we say that a query should terminate the connection, there is only one code path that negotiates a connection, and one that queries the supported versions.
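The ordering concern can be sketched as follows. Everything here is an assumption made for illustration: `VersionData` has only two fields, `peerSharing` is a plain `Bool`, and `combine` is a deliberately server-biased rule, since the actual combination rule is not spelled out in this thread. The point is only that a non-commutative rule gives different answers when the arguments are swapped.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, reduced version data.
data VersionData = VersionData
  { networkMagic :: Int
  , peerSharing  :: Bool
  } deriving (Eq, Show)

-- Assumed, deliberately non-commutative rule: on matching network magic the
-- server's parameters win.
combine :: VersionData -> VersionData -> Maybe VersionData
combine server client
  | networkMagic server == networkMagic client = Just server
  | otherwise                                  = Nothing

-- Highest common version whose parameters can be combined.
-- The server's map must always be the first argument.
negotiate
  :: Map.Map Int VersionData   -- server's versions
  -> Map.Map Int VersionData   -- client's versions
  -> Maybe (Int, VersionData)
negotiate serverVersions clientVersions =
  Map.lookupMax
    (Map.mapMaybe id (Map.intersectionWith combine serverVersions clientVersions))
```

If a querying client recomputed the result itself, it would have to call `negotiate` with the server's map first, exactly as the server does; swapping the arguments can yield different version data, which is the risk a single negotiation code path avoids.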

@jprider63 jprider63 force-pushed the 3907-handshake-parameters branch from a5d0d2d to 1589e50 Compare January 23, 2023 22:08
@jprider63
Contributor Author

@coot I've addressed your comments in the latest commits. I wasn't completely sure how to update the connection manager state and add new protocol versions, so let me know if we should be doing something different there.

Contributor

@coot coot left a comment

LGTM

@coot coot changed the title WIP: Add a boolean query flag to request supported versions Add a boolean query flag to request supported versions Feb 17, 2023
@coot coot marked this pull request as ready for review February 17, 2023 09:41
@coot
Contributor

coot commented Feb 17, 2023

Since it's approved, it's not a draft anymore :)

@coot
Contributor

coot commented Feb 17, 2023

To fix some of the CI failures you'll need to run these two scripts:

  • ./scripts/ci/check-stylish-network.sh
  • ./scripts/ci/check-stylish.sh

@coot
Contributor

coot commented Feb 20, 2023

The CDDL tests are failing; see the bottom of the raw log:

2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>     NodeToNode.Handshake:   FAIL (2.25s)
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       *** Failed! Falsified (after 5 tests):
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       (ServerAgency TokConfirm,MsgAcceptVersion NodeToNodeV_11 (TList [TInt 2,TBool False,TBool False]))
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       (ServerAgency TokConfirm,MsgAcceptVersion NodeToNodeV_11 (TList [TInt 2,TBool False,TBool False]))
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       Right ("",TList [TInt 1,TInt 11,TList [TInt 2,TBool False,TBool False]])
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       CDDL validation failure (nil for [1, 11, [2, false, false]]):
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       [1,
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>        [:int, 2],
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>        "[1, [:int, 2], \"occur not reached in array [1, 11, [2, false, false]] for [:array, [:member, 1, 1, nil, [:int, 2]], [:member, 1, 1, nil, [:type1, [:array, [:member, 1, 1, nil, [:int, 0]], [:member, 1, 1, nil, [:array, [:member, 0, Infinity, nil, [:type1, [:int, 7], [:int, 8], [:int, 9], [:int, 10]]]]]], [:array, [:member, 1, 1, nil, [:int, 1]], [:member, 1, 1, nil, [:type1, [:int, 7], [:int, 8], [:int, 9], [:int, 10]]], [:member, 1, 1, nil, [:prim, 3]]], [:array, [:member, 1, 1, nil, [:int, 2]], [:member, 1, 1, nil, [:type1, [:int, 7], [:int, 8], [:int, 9], [:int, 10]]], [:member, 1, 1, nil, [:prim, 3]]]]]]\"] -- cannot complete (false, 3) array [1, 11, [2, false, false]] for [:array, [:member, 1, 1, nil, [:int, 2]], [:member, 1, 1, nil, [:type1, [:array, [:member, 1, 1, nil, [:int, 0]], [:member, 1, 1, nil, [:array, [:member, 0, Infinity, nil, [:type1, [:int, 7], [:int, 8], [:int, 9], [:int, 10]]]]]], [:array, [:member, 1, 1, nil, [:int, 1]], [:member, 1, 1, nil, [:type1, [:int, 7], [:int, 8], [:int, 9], [:int, 10]]], [:member, 1, 1, nil, [:prim, 3]]], [:array, [:member, 1, 1, nil, [:int, 2]], [:member, 1, 1, nil, [:type1, [:int, 7], [:int, 8], [:int, 9], [:int, 10]]], [:member, 1, 1, nil, [:prim, 3]]]]]]"]
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       Use --quickcheck-replay=691494 --quickcheck-max-size=10 to reproduce.
2023-02-17T16:14:58.958519567+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       Use -p '/encoding.NodeToNode.Handshake/' to rerun this test only.
2023-02-17T16:14:59.638203438+00:00 stderr F x86_64-darwin ouroboros-network-lib-ouroboros-network> [33 of 39] Compiling Ouroboros.Network.PeerSelection.Governor ( src/Ouroboros/Network/PeerSelection/Governor.hs, dist/build/Ouroboros/Network/PeerSelection/Governor.o, dist/build/Ouroboros/Network/PeerSelection/Governor.dyn_o )
2023-02-17T16:15:00.561221596+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-test> [2 of 3] Compiling Test.Chain       ( test/Test/Chain.hs, dist/build/test/test-tmp/Test/Chain.o )
2023-02-17T16:15:00.561221596+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-test> [3 of 3] Compiling Main             ( test/Main.hs, dist/build/test/test-tmp/Main.o )
2023-02-17T16:15:00.995227719+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>     NodeToClient.Handshake: FAIL (1.96s)
2023-02-17T16:15:00.995227719+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       *** Failed! Falsified (after 5 tests):
2023-02-17T16:15:00.995227719+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       (ServerAgency TokConfirm,MsgAcceptVersion NodeToClientV_15 (TList [TInt 2,TBool True]))
2023-02-17T16:15:00.995227719+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       (ServerAgency TokConfirm,MsgAcceptVersion NodeToClientV_15 (TList [TInt 2,TBool True]))
2023-02-17T16:15:00.995227719+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       Right ("",TList [TInt 1,TInt 32783,TList [TInt 2,TBool True]])
2023-02-17T16:15:00.995227719+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       CDDL validation failure (nil for [1, 32783, [2, true]]):
2023-02-17T16:15:00.995227719+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>       [1,
2023-02-17T16:15:00.995227719+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>        [:int, 2],
2023-02-17T16:15:00.995227719+00:00 stderr F x86_64-darwin ouroboros-network-protocols-test-cddl>        "[1, [:int, 2], \"occur not reached in array [1, 32783, [2, true]] for [:array, [:member, 1, 1, nil, [:int, 2]], [:member, 1, 1, nil, [:type1, [:array, [:member, 1, 1, nil, [:int, 0]], [:member, 1, 1, nil, [:array, [:member, 0, Infinity, nil, [:type1, [:int, 32777], [:int, 32778], [:int, 32779], [:int, 32780], [:int, 32781], [:int, 32782]]]]]], [:array, [:member, 1, 1, nil, [:int, 1]], [:member, 1, 1, nil, [:type1, [:int, 32777], [:int, 32778], [:int, 32779], [:int, 32780], [:int, 32781], [:int, 32782]]], [:member, 1, 1, nil, [:prim, 3]]], [:array, [:member, 1, 1, nil, [:int, 2]], [:member, 1, 1, nil, [:type1, [:int, 32777], [:int, 32778], [:int, 32779], [:int, 32780], [:int, 32781], [:int, 32782]]], [:member, 1, 1, nil, [:prim, 3]]]]]]\"] -- cannot complete (false, 3) array [1, 32783, [2, true]] for [:array, [:member, 1, 1, nil, [:int, 2]], [:member, 1, 1, nil, [:type1, [:array, [:member, 1, 1, nil, [:int, 0]], [:member, 1, 1, nil, [:array, [:member, 0, Infinity, nil, [:type1, [:int, 32777], [:int, 32778], [:int, 32779], [:int, 32780], [:int, 32781], [:int, 32782]]]]]], [:array, [:member, 1, 1, nil, [:int, 1]], [:member, 1, 1, nil, [:type1, [:int, 32777], [:int, 32778], [:int, 32779], [:int, 32780], [:int, 32781], [:int, 32782]]], [:member, 1, 1, nil, [:prim, 3]]], [:array, [:member, 1, 1, nil, [:int, 2]], [:member, 1, 1, nil, [:type1, [:int, 32777], [:int, 32778], [:int, 32779], [:int, 32780], [:int, 32781], [:int, 32782]]], [:member, 1, 1, nil, [:prim, 3]]]]]]"]

The current state of CI is a bit messy. The ci job links to: https://cicero.ci.iog.io/run/5bda4a49-4ca6-4c97-ad35-736badd7f375
From there, there are two links: one to the raw logs (quoted above) and another to loki. Both are very verbose and contain information about all the runs (this will change in the future, but that is how it is right now). A hint: search for the phrase Reporting GitHub commit status failure.
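One possible reading of the failure, offered as an interpretation since the thread does not state the root cause: the version numbers the validator lists (7 to 10, and 32777 to 32782) do not include the numbers in the generated messages (11 and 32783). The log pairs `NodeToClientV_15` with `TInt 32783`, which is consistent with node-to-client numbers being sent with bit 15 set. The sketch below only encodes that observation; `Wire`, `decodeWire`, and `acceptedBySpec` are hypothetical names, and the two lists are copied from the log rather than from the CDDL files.

```haskell
import Data.Bits (clearBit, testBit)

-- A version number as it appears on the wire, split by protocol family.
data Wire = NodeToNode Int | NodeToClient Int
  deriving (Eq, Show)

-- Assumption inferred from the log: bit 15 marks a node-to-client version.
decodeWire :: Int -> Wire
decodeWire n
  | testBit n 15 = NodeToClient (clearBit n 15)
  | otherwise    = NodeToNode n

-- Version numbers that appear as alternatives in the validator output above.
nodeToNodeSpec, nodeToClientSpec :: [Int]
nodeToNodeSpec   = [7 .. 10]
nodeToClientSpec = [32777 .. 32782]

-- True when the wire number is one of the alternatives listed in the log.
acceptedBySpec :: Int -> Bool
acceptedBySpec n = n `elem` (nodeToNodeSpec ++ nodeToClientSpec)
```

Under this reading, `acceptedBySpec 11` and `acceptedBySpec 32783` are both `False`, which matches the two failing test cases.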

@jprider63 jprider63 force-pushed the 3907-handshake-parameters branch from 7fc8927 to 2049df2 Compare March 2, 2023 01:49
@jprider63
Contributor Author

I've rebased onto master. master also added new node to node and node to client versions, so I reused the new versions for both features. cardano-ping was deleted from master, so I need to figure out where it lives now and apply those patches manually.

@jprider63
Contributor Author

Everything is now building on top of @bolt12's branch, so this should be good to merge! The query successfully runs:

# Start a node
$ cabal run cardano-node -- run --port 55715 --host-addr 127.0.0.1 --config ../configuration/cardano/mainnet-config.json --topology ../configuration/cardano/mainnet-topology.json

# Query its supported versions
$ cabal new-run -- cardano-cli:cardano-cli ping --host 127.0.0.1 --port 55715 -c 1 -j --query-versions
Up to date
127.0.0.1:55715 network rtt: 0.000
127.0.0.1:55715 handshake rtt: 0.000455833s
127.0.0.1:55715 Queried versions [NodeToNodeVersionV11 764824073 True,NodeToNodeVersionV10 764824073 True,NodeToNodeVersionV9 764824073 True,NodeToNodeVersionV8 764824073 True,NodeToNodeVersionV7 764824073 True]
127.0.0.1:55715 Negotiated version NodeToNodeVersionV11 764824073 True
] }
(AddrInfo {addrFlags = [], addrFamily = AF_INET, addrSocketType = Stream, addrProtocol = 6, addrAddress = 127.0.0.1:55715, addrCanonName = Nothing},MuxError (MuxIOException writev: resource vanished (Connection reset by peer)) "(sendAll errored)")

# Run cardano-ping against a deployed node
$ cabal new-run -- cardano-cli:cardano-cli ping --host relays-new.cardano-mainnet.iohk.io -c 1 -j --query-versions
Up to date
13.52.189.184:3001 network rtt: 0.022
13.52.189.184:3001 handshake rtt: 0.022043953s
13.52.189.184:3001 Queried versions [NodeToNodeVersionV10 764824073 True]
13.52.189.184:3001 Negotiated version NodeToNodeVersionV10 764824073 True
{ "pongs": [ {"cookie":0,"host":"13.52.189.184:3001","max":2.2000886e-2,"mean":2.2000886e-2,"median":2.2000886e-2,"min":2.2000886e-2,"p90":2.2000886e-2,"sample":2.2000886e-2,"timestamp":"2023-04-20T03:09:26.05761207Z"}18.133.87.50:3001 network rtt: 0.129
54.93.82.235:3001 network rtt: 0.139
18.159.27.23:3001 network rtt: 0.140
52.57.122.128:3001 network rtt: 0.140
52.28.21.120:3001 network rtt: 0.140
18.158.118.230:3001 network rtt: 0.141
54.255.116.223:3001 network rtt: 0.216
18.133.87.50:3001 handshake rtt: 0.128594753s
18.133.87.50:3001 Queried versions [NodeToNodeVersionV10 764824073 True]
18.133.87.50:3001 Negotiated version NodeToNodeVersionV10 764824073 True
54.93.82.235:3001 handshake rtt: 0.138898897s
54.93.82.235:3001 Queried versions [NodeToNodeVersionV10 764824073 True]
54.93.82.235:3001 Negotiated version NodeToNodeVersionV10 764824073 True
18.159.27.23:3001 handshake rtt: 0.139786086s
18.159.27.23:3001 Queried versions [NodeToNodeVersionV10 764824073 True]
18.159.27.23:3001 Negotiated version NodeToNodeVersionV10 764824073 True
52.57.122.128:3001 handshake rtt: 0.139917343s
52.57.122.128:3001 Queried versions [NodeToNodeVersionV10 764824073 True]
52.57.122.128:3001 Negotiated version NodeToNodeVersionV10 764824073 True
52.28.21.120:3001 handshake rtt: 0.14001423s
52.28.21.120:3001 Queried versions [NodeToNodeVersionV10 764824073 True]
52.28.21.120:3001 Negotiated version NodeToNodeVersionV10 764824073 True
18.158.118.230:3001 handshake rtt: 0.140582903s
18.158.118.230:3001 Queried versions [NodeToNodeVersionV10 764824073 True]
18.158.118.230:3001 Negotiated version NodeToNodeVersionV10 764824073 True
,
{"cookie":0,"host":"18.133.87.50:3001","max":0.128883673,"mean":0.128883673,"median":0.128883673,"min":0.128883673,"p90":0.128883673,"sample":0.128883673,"timestamp":"2023-04-20T03:09:26.377584884Z"},
{"cookie":0,"host":"54.93.82.235:3001","max":0.138972346,"mean":0.138972346,"median":0.138972346,"min":0.138972346,"p90":0.138972346,"sample":0.138972346,"timestamp":"2023-04-20T03:09:26.407907313Z"},
{"cookie":0,"host":"18.159.27.23:3001","max":0.139895696,"mean":0.139895696,"median":0.139895696,"min":0.139895696,"p90":0.139895696,"sample":0.139895696,"timestamp":"2023-04-20T03:09:26.410171055Z"},
{"cookie":0,"host":"52.57.122.128:3001","max":0.139686976,"mean":0.139686976,"median":0.139686976,"min":0.139686976,"p90":0.139686976,"sample":0.139686976,"timestamp":"2023-04-20T03:09:26.410193404Z"},
{"cookie":0,"host":"52.28.21.120:3001","max":0.140090952,"mean":0.140090952,"median":0.140090952,"min":0.140090952,"p90":0.140090952,"sample":0.140090952,"timestamp":"2023-04-20T03:09:26.411175617Z"},
{"cookie":0,"host":"18.158.118.230:3001","max":0.140536694,"mean":0.140536694,"median":0.140536694,"min":0.140536694,"p90":0.140536694,"sample":0.140536694,"timestamp":"2023-04-20T03:09:26.412592123Z"}54.255.116.223:3001 handshake rtt: 0.21632722s
54.255.116.223:3001 Queried versions [NodeToNodeVersionV10 764824073 True]
54.255.116.223:3001 Negotiated version NodeToNodeVersionV10 764824073 True
,
{"cookie":0,"host":"54.255.116.223:3001","max":0.216595582,"mean":0.216595582,"median":0.216595582,"min":0.216595582,"p90":0.216595582,"sample":0.216595582,"timestamp":"2023-04-20T03:09:26.640644524Z"}] }

You should be able to reproduce this on the cardano-node PR (IntersectMBO/cardano-node/pull/5100 at commit 12cc3d947c59d8a14fbc49010bdc1fd35862f41f). Note that this requires the ekg-forward PR (input-output-hk/ekg-forward/pull/20) as well.

@coot
Contributor

coot commented Apr 24, 2023

Well done @jprider63, there seem to be some conflicts. Let me know if you need some help with them.

@jprider63
Contributor Author

Thanks @coot! The conflicts should be resolved now.

@coot
Contributor

coot commented Apr 24, 2023

@jprider63 once cardano-cli ping --query-versions terminates it prints an error:

(AddrInfo {addrFlags = [], addrFamily = AF_INET, addrSocketType = Stream, addrProtocol = 6, addrAddress = 127.0.0.1:55715, addrCanonName = Nothing},MuxError (MuxIOException writev: resource vanished (Connection reset by peer)) "(sendAll errored)")

I think the client side tries to send a ping message after the remote side has already closed the connection. If that's the case, we need to change the logic in the cardano-ping command so that it does not send ping messages when querying versions.
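The suggested change can be sketched as a plan of client actions. This is not the code in `Ping.hs`; `Action` and `plan` are hypothetical names, and the only point is that the query path never schedules a ping, so nothing is written to a connection the remote side has already closed.

```haskell
-- Steps the client performs on one connection (illustrative only).
data Action
  = SendHandshake
  | PrintVersions
  | SendPing Int      -- carries the ping cookie
  | CloseConnection
  deriving (Eq, Show)

-- First argument: whether --query-versions was given; second: number of pings.
plan :: Bool -> Int -> [Action]
plan queryVersions count
  -- Query mode: the remote side terminates after replying, so stop here.
  | queryVersions = [SendHandshake, PrintVersions, CloseConnection]
  -- Normal mode: handshake, then the requested number of pings.
  | otherwise     =
      SendHandshake : map SendPing [0 .. count - 1] ++ [CloseConnection]
```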

Review thread on cardano-ping/src/Cardano/Network/Ping.hs (outdated, resolved)
@jprider63 jprider63 force-pushed the 3907-handshake-parameters branch from 51a2056 to 3831d0d Compare April 28, 2023 19:27
@coot
Contributor

coot commented May 8, 2023

bors merge

@iohk-bors
Contributor

iohk-bors bot commented May 8, 2023

@iohk-bors iohk-bors bot merged commit b0d975f into IntersectMBO:master May 8, 2023
Development

Successfully merging this pull request may close these issues.

Extend cardano-ping to output handshake negotiation parameters
2 participants