Merge #4019 #4384
4019: Peer Sharing r=bolt12 a=bolt12

# Description

This PR is a WIP of the Peer Sharing feature. For more details about the design and planning, see: https://github.com/input-output-hk/ouroboros-network/wiki/Peer-Sharing-Planning



4384: Fix incorrect transition order r=bolt12 a=bolt12

For the same reason mentioned in d6e67f1 there can be a race condition and the connection manager can wrongly assume it needs to trace a TerminatingSt -> TerminatedSt transition. This was fixed in d6e67f1 for the requestOutboundConnectionImpl function but the includeInboundConnectionImpl got left out.

Fixes #4370

Co-authored-by: Armando Santos <armando@well-typed.com>
iohk-bors[bot] and bolt12 committed Mar 15, 2023
3 parents 0659e7b + b045f1f + f1761b5 commit db78950
Showing 80 changed files with 3,486 additions and 1,336 deletions.
39 changes: 22 additions & 17 deletions docs/network-design/network-design.tex
@@ -59,6 +59,7 @@ \section{Revision History}
1.8 & 2020-08-05 & Revised & Resolving remaining points and correcting
headings.\\
1.9 & 2020-08-09 & Format & Converted from Google doc to LaTeX (via pandoc).\\
1.9.1 & 2022-08-29 & Revised & Revised Gossip definition - renamed to ``Peer Sharing''.\\
\bottomrule
\end{longtable}

@@ -1726,7 +1727,7 @@ \subsection{Decentralisation design}
planned work.}.

In our use case we must build a large graph in a decentralised way,
initially from nodes contributed by IOHK and later by state pool
initially from nodes contributed by IOHK and later by stake pool
operators, and grow it to support all stake pools and users. We would
like small hop counts and we would also like to avoid the danger of
there being too many long hops.
@@ -1770,14 +1771,18 @@ \subsection{Decentralisation design}
can do with standard \emph{random gossip} techniques. This deals with
the decentralisation requirement and the low hop count constraint. The
remaining significant issues are avoiding eclipse attacks, as previously
discussed, and avoiding too many long hops. Our design is to augment
random gossip with two layers of control loop to address these two
issues. This general design is relatively simple, and has the
significant virtue that the policy for the control loops can be adjusted
after initial deployment with relatively few compatibility impacts. This
should enable the policy to be optimised based on real-world feedback,
and feedback from simulations of scale or scenarios that are hard (or
undesirable) to test in a real deployment.
discussed, and avoiding too many long hops. Our design addresses
these two concerns with separate mechanisms: eclipse evasion and peer sharing.
Because of this, we rename the ``random gossip''-inspired process to
\emph{Peer Sharing}. The goal of the Peer Sharing protocol is to make it easier
to find possible peers within the overall Cardano network, leaving
eclipse attacks to the separate Eclipse Evasion mechanism. This general
design is relatively simple, and has the significant virtue that the
policy for the control loops can be adjusted after initial deployment
with relatively few compatibility impacts. This should enable the
policy to be optimised based on real-world feedback, and feedback
from simulations of scale or scenarios that are hard (or undesirable)
to test in a real deployment.

Each node maintains three sets of known peer nodes:

@@ -1807,12 +1812,12 @@ \subsection{Decentralisation design}
For an individual node to join the network, its bootstrapping phase
starts by contacting root nodes and requesting sets of other peers,
which are added to the cold peer set. It proceeds iteratively by
randomly selecting other peers to contact to request more known peers.
This gossip process is governed by a control loop that has a target to
randomly sampling suitable peers to contact to request more known peers.
This Peer Sharing process is governed by a control loop that has a target to
find and maintain a certain number of cold peers. Bootstrapping is not a
special mode, rather it is just a phase for the control loop following
starting with a cold peers set consisting only of the root nodes. This
gossiping aspect is closely analogous to the first stage of Kademlia,
peer sharing aspect is closely analogous to the first stage of Kademlia,
but with random selection rather than selection directed towards finding
peers in an artificial metric space.

@@ -1828,7 +1833,7 @@ \subsection{Decentralisation design}

\begin{itemize}
\item
the random gossip used to discover more cold peers;
the peer sharing used to discover more cold peers;
\item
promotion of cold peers to be warm peers;
\item
@@ -1879,8 +1884,8 @@ \subsection{Decentralisation design}

While the purpose of cold and hot peers is clear, the purpose of warm
peers requires further explanation. The primary purpose is to address
the challenge of avoiding too many long hops in the graph. The random
gossip is oblivious to hop distance. By actually connecting to a
the challenge of avoiding too many long hops in the graph. The Peer
Sharing protocol is oblivious to hop distance. By actually connecting to a
selection of peers and measuring the round trip delays\footnote{Or more
precisely the two unidirectional $\Delta{}Q$ measures} we can start to
establish which peers are near or far. The policy for selecting which
@@ -3793,7 +3798,7 @@ \subsection{Bootstrap}
node does not need to trust the addresses that it is given, as it can
try them out for itself and make dynamic decisions about which peers it
uses as described in \cref{decentralisation-design}. This exchange of
information about (the addresses of) other nodes is called `gossiping',
information about (the addresses of) other nodes is called \emph{Peer Sharing},
and is required for the fully decentralised version of Shelley. The
implementation that will be taken here is inspired by Kademlia, but can
be far less complex.
@@ -5024,7 +5029,7 @@ \subsubsection{Ouroboros-Network}
\paragraph{Other components of Ouroboros-Network package}

We implemented a peer-to-peer governor whose role is to drive decisions
about promotions, demotions and gossip. It is there to provide the rest
about promotions, demotions and peer sharing. It is there to provide the rest
of the system with possible peers to communicate with and govern their
state (hot peer / warm peer / cold peer). We are implementing a
connection manager which will manage active connections and threads and
123 changes: 119 additions & 4 deletions docs/network-spec/miniprotocols.tex
@@ -556,7 +556,21 @@ \subsection{Client and Server Implementation}
be split into multiple segments. These MUX segments are using a reserved
protocol id $0$ (\texttt{Muxcontrol}).

\subsection{CDDL encoding specification}\label{handshake-cddl}
\subsection{Handshake version 11 and greater}

In the most recent versions of Handshake, the negotiated node-to-node version data has
one more parameter: Peer Sharing willingness information. This is a flag that
can be globally configured by the node, to let others know whether a particular node
wants to participate in Peer Sharing. This addition breaks the symmetry
of \texttt{acceptable} since, upon negotiating the handshake, each node keeps the
remote side's value of the Peer Sharing flag. This can be solved by making
\texttt{acceptable} symmetric modulo the Peer Sharing flag.

This new flag addition also means that, for testing purposes, we are going to
need two different CDDL specifications: one for versions $< 11$ and one for
versions $\geq 11$.
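
The idea of \texttt{acceptable} being symmetric modulo the Peer Sharing flag can be illustrated with a minimal Haskell sketch. All types here are hypothetical, simplified stand-ins rather than the actual handshake definitions: \texttt{VersionData} carries only an assumed \texttt{networkMagic} field plus the flag, and \texttt{forgetPeerSharing} and \texttt{symmetricModuloPeerSharing} are names introduced only for illustration.

```haskell
import Data.Word (Word32)

-- Hypothetical, simplified stand-ins for the real handshake types.
data PeerSharing = NoPeerSharing | PeerSharingPrivate | PeerSharingPublic
  deriving (Eq, Show)

data VersionData = VersionData
  { networkMagic :: Word32       -- assumed field, for illustration only
  , peerSharing  :: PeerSharing  -- the willingness flag added in version 11
  } deriving (Eq, Show)

data Accept a = Accept a | Refuse String
  deriving (Eq, Show)

-- | Accept when the network magics agree. The accepted value keeps the
-- remote side's Peer Sharing flag, which is what breaks plain symmetry.
acceptable :: VersionData -> VersionData -> Accept VersionData
acceptable local remote
  | networkMagic local /= networkMagic remote = Refuse "network magic mismatch"
  | otherwise = Accept local { peerSharing = peerSharing remote }

-- | Erase the flag so that results can be compared modulo Peer Sharing.
forgetPeerSharing :: Accept VersionData -> Accept VersionData
forgetPeerSharing (Accept vd) = Accept vd { peerSharing = NoPeerSharing }
forgetPeerSharing refused     = refused

-- | The property one would test: both sides agree once the flag is erased.
symmetricModuloPeerSharing :: VersionData -> VersionData -> Bool
symmetricModuloPeerSharing a b =
  forgetPeerSharing (acceptable a b) == forgetPeerSharing (acceptable b a)
```

In this sketch \texttt{acceptable a b} and \texttt{acceptable b a} can differ only in the flag, so a property test would compare them after erasing it.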

\subsection{CDDL encoding specification ($< 11$)}\label{handshake-cddl}
There are two flavours of the mini-protocol which only differ with type
instantiations, e.g. different protocol versions and version data carried in
messages. First one is used by the node to node protocol the other by node to
@@ -567,6 +581,11 @@ \subsubsection{Node to node handshake mini-protocol}
\subsubsection{Node to client handshake mini-protocol}
\lstinputlisting[style=cddl]{../../ouroboros-network-protocols/test-cddl/specs/handshake-node-to-client.cddl}

\subsection{CDDL encoding specification ($\geq 11$)}\label{handshake-cddl-v11}

\subsubsection{Node to node handshake mini-protocol}
\lstinputlisting[style=cddl]{../../ouroboros-network/test-cddl/specs/handshake-node-to-node-v11.cddl}

\section{Chain-Sync mini-protocol}
\label{chain-sync-protocol}
\haddockref{Ouroboros.Network.Protocol.ChainSync.Type}{ouroboros-network/Ouroboros-Network-Protocol-ChainSync-Type\#t:ChainSync}
@@ -652,7 +671,7 @@ \subsection{State Machine}
first) and it is up to the producer to find the first intersection point
on its chain and send it back to the consumer. If an empty list of
points is sent with \MsgFindIntersect{} the server will reply with
\MsgIntersectNotFound{}.
\MsgIntersectNotFound{}.
\item [\MsgIntersectFound{} {\boldmath $(point_{intersect} ,tip)$}]
The producer replies with the first point of the request that is on his current chain.
The consumer can decide whether to send more points.
@@ -1131,7 +1150,7 @@ \subsection{State machine}
back with \MsgKeepAliveResponse{} does not match the value sent with
\MsgKeepAlive{}.
\item [\MsgKeepAliveResponse{} $cookie$]
Keep alive response message.
Keep alive response message.
\item [\MsgDone]
Terminating message.
\end{description}
@@ -1147,7 +1166,7 @@ \subsection{Description}
for example wallets or CLI tools, to submit transactions to a local node.
The protocol is {\bf not} used to forward transactions from one core node to another.
The protocol for the transfer of transactions between full nodes
is described in Section \ref{tx-submission-protocol}.
is described in Section \ref{tx-submission-protocol2}.

The protocol follows a simple request-response pattern:
\begin{enumerate}
@@ -1279,6 +1298,102 @@ \subsection{CDDL encoding specification}
\lstinputlisting[style=cddl]{../../ouroboros-network-protocols/test-cddl/specs/local-state-query.cddl}
See appendix \ref{cddl-common} for common definitions.

\section{Peer Sharing mini-protocol}
\haddockref{Ouroboros.Network.Protocol.PeerSharing.Type}{ouroboros-network/Ouroboros-Network-Protocol-PeerSharing-Type\#t:PeerSharing}
\label{peer-sharing-protocol}
\subsection{Description}
The Peer Sharing mini-protocol is a simple request-reply protocol. Nodes use it
to send share requests to upstream peers. A peer that receives a request
replies with a subset of its Known Peers.

\newcommand{\PsClient}{\state{StIdle}}
\newcommand{\PsServer}{\state{StBusy}}
\newcommand{\MsgShareRequest}{\trans{MsgShareRequest}}
\newcommand{\MsgSharePeers}{\trans{MsgSharePeers}}
\subsection{State machine}

\begin{figure}[h]
\begin{tabular}{|l|l|}
\hline
\multicolumn{2}{|c|}{Agency} \\ \hline
Client has Agency & \PsClient \\ \hline
Server has Agency & \PsServer \\ \hline
\end{tabular}
\end{figure}

\begin{figure}[h]
\begin{tikzpicture}[->,shorten >=1pt,auto,node distance=4.5cm, semithick]
\tikzstyle{every state}=[fill=red,draw=none,text=white]
\node[state, mygreen, initial] (Client) {\PsClient};
\node[state, myblue, right of=Client] (Server) {\PsServer};
\node[state, below of=Client] (Done) {\StDone};

\draw (Client) edge[above, bend left=45] node{\MsgShareRequest} (Server);
\draw (Server) edge[below, bend left=45] node{\MsgSharePeers} (Client);
\draw (Client) edge[left] node{\MsgDone} (Done);
\end{tikzpicture}
\caption{State machine of the peer sharing protocol.}
\end{figure}

\paragraph{Protocol messages}
\begin{description}
\item [\MsgShareRequest{} $amount$]
The client requests a maximum number of peers to be shared ($amount$). Ideally this
amount should be limited by a protocol level constant to disallow a bad actor from
requesting too many peers.
\item [\MsgSharePeers{} ${[}peerAddress{]}$]
The server replies with a set of peers. Ideally the amount of information (e.g. reply
byte size) should be limited by a protocol level constant to disallow a bad actor from
sending too much information.
\item [\MsgDone]
Terminating message.
\end{description}
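
The state machine and messages above can be sketched as a plain Haskell transition function. This is a simplified illustration, not the typed-protocols encoding used in the code base: \texttt{step}, \texttt{agency} and \texttt{runTrace} are hypothetical names, the \texttt{Word8} type of the requested amount is an assumption, and giving nobody agency in \texttt{StDone} follows the usual convention rather than the agency table above.

```haskell
import Control.Monad (foldM)
import Data.Word (Word8)

-- Protocol states, named as in the state machine figure.
data State = StIdle | StBusy | StDone
  deriving (Eq, Show)

data Agency = ClientAgency | ServerAgency | NobodyAgency
  deriving (Eq, Show)

-- Messages, parameterised over the peer address type.
-- The Word8 amount is an assumption made for this sketch.
data Message addr
  = MsgShareRequest Word8
  | MsgSharePeers [addr]
  | MsgDone
  deriving (Eq, Show)

-- | Who is allowed to send the next message in each state.
agency :: State -> Agency
agency StIdle = ClientAgency
agency StBusy = ServerAgency
agency StDone = NobodyAgency

-- | One transition; Nothing means the message is not allowed in that state.
step :: State -> Message addr -> Maybe State
step StIdle (MsgShareRequest _) = Just StBusy
step StBusy (MsgSharePeers _)   = Just StIdle
step StIdle MsgDone             = Just StDone
step _      _                   = Nothing

-- | Replay a whole message trace starting from StIdle.
runTrace :: [Message addr] -> Maybe State
runTrace = foldM step StIdle
```

For example, a request followed by a reply and \texttt{MsgDone} ends in \texttt{StDone}, while a reply sent without a preceding request is rejected.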

\subsection{Client Implementation Details}

The initiator side has to run indefinitely, since protocol termination means
either an error or peer demotion. Because of this, the protocol cannot be run as
a simple request-response protocol. To overcome this, the client side implementation
uses a registry: each connected peer gets registered and assigned a controller with
a request mailbox. The controller is used to issue requests to the client
implementation, which waits for a request to arrive in the mailbox before sending a
\MsgShareRequest. Once the reply arrives, the result is put into a local result mailbox.

If a peer gets disconnected, it should get unregistered.

\subsection{Server Implementation Details}

As soon as the server receives a share request it needs to pick a subset no bigger than the
value specified in the request's parameter. The reply set needs to be sampled randomly
from the Known Peer set according to the following constraints:

\begin{itemize}
\item Only pick peers that we managed to connect to at some point.
\item Do not pick peers that are known to be ledger peers.
\item Only pick peers that have public willingness information (e.g. \texttt{PeerSharingPublic}).
\end{itemize}

If a peer has the \texttt{NoPeerSharing} flag value, do not ask it for peers. This peer
won't even have the Peer Sharing miniprotocol server running.

If a given peer has \texttt{PeerSharingPublic} and \texttt{DoNotAdvertisePeer} flags enabled
at the same time, \texttt{DoNotAdvertisePeer} should have priority, so the peer shouldn't
be shared. Likewise, if a peer has \texttt{PeerSharingPrivate} and \texttt{DoAdvertisePeer}
enabled at the same time, \texttt{PeerSharingPrivate} should be respected. In short, if
a local/remote peer has expressed that its address should be private, the
response set must respect that privacy even if some other public flag conflicts with
it.
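
The selection constraints and the flag priority rules above can be sketched in Haskell as a filter over the known peers. This is only an illustration under stated assumptions: \texttt{KnownPeer}, its fields, \texttt{canShare} and \texttt{replySet} are hypothetical names, and the random sampling step is left out by assuming that the input list has already been shuffled.

```haskell
-- Flag names follow the text; the record around them is hypothetical.
data PeerSharing = NoPeerSharing | PeerSharingPrivate | PeerSharingPublic
  deriving (Eq, Show)

data PeerAdvertise = DoNotAdvertisePeer | DoAdvertisePeer
  deriving (Eq, Show)

-- | Hypothetical view of what is known about one peer.
data KnownPeer addr = KnownPeer
  { peerAddr      :: addr
  , everConnected :: Bool          -- we managed to connect to it at some point
  , isLedgerPeer  :: Bool          -- known to be a ledger peer
  , sharingFlag   :: PeerSharing
  , advertiseFlag :: PeerAdvertise
  }

-- | A peer may be shared only if no flag asks for privacy: any private
-- flag wins over any public one.
canShare :: KnownPeer addr -> Bool
canShare p =
     everConnected p
  && not (isLedgerPeer p)
  && sharingFlag p == PeerSharingPublic
  && advertiseFlag p == DoAdvertisePeer

-- | Build the reply from an already shuffled list of known peers,
-- returning at most the requested amount of addresses.
replySet :: Int -> [KnownPeer addr] -> [addr]
replySet amount shuffled = map peerAddr (take amount (filter canShare shuffled))
```

Filtering before truncating means that peers which must stay private never take up one of the requested slots.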

Computing the result (i.e. random sampling of available peers) needs access to the
\texttt{PeerSelectionState} which is specific to the \texttt{peerSelectionGovernorLoop}. However when
initializing the server side of the protocol we have to provide the result computing
function early in the consensus side. This means we will have to find a way to delay the
function application all the way to diffusion and share the relevant parts of
\texttt{PeerSelectionState} with this function via a \texttt{TVar}.

\subsection{CDDL encoding specification}
\lstinputlisting[style=cddl]{../../ouroboros-network/test-cddl/specs/peer-sharing.cddl}

\section{Pipelining of Mini Protocols}
\label{pipelining}
Protocol pipelining is a technique that improves the performance of some protocols.
@@ -109,16 +109,16 @@ data Handlers m peer blk = Handlers {
}

mkHandlers
:: forall m blk remotePeer localPeer.
:: forall m blk addrNTN addrNTC.
( IOLike m
, LedgerSupportsMempool blk
, LedgerSupportsProtocol blk
, QueryLedger blk
, ConfigSupportsNode blk
)
=> NodeKernelArgs m remotePeer localPeer blk
-> NodeKernel m remotePeer localPeer blk
-> Handlers m localPeer blk
=> NodeKernelArgs m addrNTN addrNTC blk
-> NodeKernel m addrNTN addrNTC blk
-> Handlers m addrNTC blk
mkHandlers NodeKernelArgs {cfg, tracers} NodeKernel {getChainDB, getMempool} =
Handlers {
hChainSyncServer =
@@ -383,7 +383,7 @@ data Apps m peer bCS bTX bSQ bTM a = Apps {

-- | Construct the 'NetworkApplication' for the node-to-client protocols
mkApps
:: forall m remotePeer localPeer blk e bCS bTX bSQ bTM.
:: forall m addrNTN addrNTC blk e bCS bTX bSQ bTM.
( IOLike m
, Exception e
, ShowProxy blk
@@ -393,16 +393,16 @@ mkApps
, ShowProxy (GenTxId blk)
, ShowQuery (BlockQuery blk)
)
=> NodeKernel m remotePeer localPeer blk
-> Tracers m localPeer blk e
=> NodeKernel m addrNTN addrNTC blk
-> Tracers m addrNTC blk e
-> Codecs blk e m bCS bTX bSQ bTM
-> Handlers m localPeer blk
-> Apps m localPeer bCS bTX bSQ bTM ()
-> Handlers m addrNTC blk
-> Apps m addrNTC bCS bTX bSQ bTM ()
mkApps kernel Tracers {..} Codecs {..} Handlers {..} =
Apps {..}
where
aChainSyncServer
:: localPeer
:: addrNTC
-> Channel m bCS
-> m ((), Maybe bCS)
aChainSyncServer them channel = do
@@ -419,7 +419,7 @@ mkApps kernel Tracers {..} Codecs {..} Handlers {..} =
$ hChainSyncServer flr

aTxSubmissionServer
:: localPeer
:: addrNTC
-> Channel m bTX
-> m ((), Maybe bTX)
aTxSubmissionServer them channel = do
@@ -431,7 +431,7 @@ mkApps kernel Tracers {..} Codecs {..} Handlers {..} =
(localTxSubmissionServerPeer (pure hTxSubmissionServer))

aStateQueryServer
:: localPeer
:: addrNTC
-> Channel m bSQ
-> m ((), Maybe bSQ)
aStateQueryServer them channel = do
@@ -443,7 +443,7 @@ mkApps kernel Tracers {..} Codecs {..} Handlers {..} =
(localStateQueryServerPeer hStateQueryServer)

aTxMonitorServer
:: localPeer
:: addrNTC
-> Channel m bTM
-> m ((), Maybe bTM)
aTxMonitorServer them channel = do