Aries RFC 0003: Protocols
- Authors: Daniel Hardman
- Status: ACCEPTED
- Since: 2019-04-01
- Status Note: standards track and beginning to influence many mental models, but not yet ADOPTED.
- Supersedes: Indy PR #69
- Start Date: 2018-12-28
- Tags: concept
Defines application-level protocols (and the closely related concept of message families) in the context of agent interactions, and shows how they should be designed and documented.
When we began exploring agent interactions, we imagined that interoperability would be achieved by formally defining message families. We have since learned that message family definitions must define more than simply the attributes that are a part of each message. We also need to formally define the roles in an interaction, the possible states those roles can have, the way state changes in response to messages, and the errors that may arise.
In addition, we realized that we need clear examples of how to define all these things, so designs are consistent and robust.
What is a protocol?
A protocol is a recipe for a stateful interaction. Protocols are all around us, and are so ordinary that we take them for granted. Each of the following interactions is stateful, and has conventions that constitute a sort of "recipe":
- Ordering food at a restaurant
- Buying a house
- Playing a game of chess, checkers, tic-tac-toe, etc.
- Bidding on an item in an online auction
- Going through security at the airport when we fly
- Applying for a loan
In the context of decentralized identity, protocols manifest at many different levels of the stack: at the lowest levels of networking, in cryptographic algorithms like Diffie-Hellman, in the management of DIDs, in the conventions of DIDComm, and in higher-level interactions that solve problems for people with only minimal interest in the technology they're using. However, this RFC focuses on the last of these layers, where use cases and personas are transformed into features with obvious social value like:
- Connecting with one another
- Buying and selling
- Enacting and enforcing contracts
- Requesting and issuing credentials
- Proving things using credentials
- Discovering things
- Putting things in escrow (and taking them out again)
- Reporting errors
- Cooperative debugging
When "protocol" is used in an Aries context without any qualifying adjective, it is referencing a recipe for a high-level interaction like these. Lower-level protocols are usually described more specifically and possibly with other verbiage: "cryptographic algorithms", "DID management procedures", "DIDComm conventions", "wire-level message exchange", "transports", and so forth. This helps us focus "protocol" on the place where application developers that consume Aries do most of the work that creates value.
As used in the agent/DIDComm world, protocols are decentralized. This means there is not an overseer for the protocol, guaranteeing information flow, enforcing behaviors, and ensuring a coherent view. It is a subtle but important divergence from API-centric approaches, where a server holds state against which all other parties (clients) operate. Instead, all parties are peers, and they interact by mutual consent and with a (hopefully) shared understanding of the rules and goals. Protocols are like a dance--not one that's choreographed or directed, but one where the parties make dynamic decisions and react to them.
Types of Protocols
The most common protocol style in DID Communication is request-response. This style involves two parties, with the requester making the first move and the responder completing the interaction. The Discover Features Protocol uses this style.
A second common pattern is notification. This style also involves two parties, but it is one-way: the notifier emits a message, and the protocol ends when the notified receives it. The ACK Protocol and the Report Problem Protocol use this style.
However, more complex protocols exist. The Introduce Protocol involves three parties, not two. When the DID Exchange Protocol includes organizations, it may involve dozens of participants, and it has cycles and other complex state evolution.
See this note for definitions of the terms "role", "participant", and "party".
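The two simple styles described above can be sketched as per-role transition tables. This is a minimal illustration, not a definition taken from any Aries protocol: the state names, the `send:`/`receive:` event labels, and the helper functions `advance` and `run` are all assumptions made for the example.

```python
# Transition tables: (role, current state, event) -> next state.
# All state and event names are illustrative assumptions.
REQUEST_RESPONSE = {
    ("requester", "start", "send:request"): "awaiting-response",
    ("requester", "awaiting-response", "receive:response"): "done",
    ("responder", "start", "receive:request"): "request-received",
    ("responder", "request-received", "send:response"): "done",
}

NOTIFICATION = {
    ("notifier", "start", "send:notification"): "done",
    ("notified", "start", "receive:notification"): "done",
}


def advance(table, role, state, event):
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return table[(role, state, event)]
    except KeyError:
        raise ValueError(
            f"{role} cannot handle {event!r} in state {state!r}"
        ) from None


def run(table, role, events, state="start"):
    """Apply a sequence of events for one role and return the final state."""
    for event in events:
        state = advance(table, role, state, event)
    return state
```

Each role has its own view of the interaction, so each role walks its own rows of the table. A responder that tries to send a response before receiving a request hits the `ValueError` branch, which is one way a handler might detect a sequencing error.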
Protocols are the key unit of interoperable extensibility in agents. To add a new interoperable feature to an agent, give it the ability to handle a new protocol.
When agents receive messages, they map the messages to a protocol handler and possibly to an interaction state that was previously persisted. The protocol handler is code that knows the rules of a particular protocol; the interaction state tracks progress through an interaction. For more information, see the agents explainer -- RFC 0004 and the DIDComm explainer -- RFC 0005.
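The mapping from message to handler and persisted state might look roughly like the sketch below. Everything in it is hypothetical: the `ping` protocol, the simplified `type` strings, the `thread_id` field used to find saved state, and the in-memory dict standing in for real persistence. It is not how any particular Aries agent is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class InteractionState:
    """Progress through one interaction; persisted between messages."""
    protocol: str
    state: str = "start"
    history: list = field(default_factory=list)


class PingHandler:
    """Hypothetical handler that knows the rules of a made-up 'ping' protocol."""
    protocol = "ping"

    def handle(self, message, interaction):
        interaction.history.append(message["type"])
        if message["type"] == "ping/1.0/request" and interaction.state == "start":
            interaction.state = "done"
            return {"type": "ping/1.0/response", "thread_id": message["thread_id"]}
        raise ValueError(
            f"unexpected {message['type']} in state {interaction.state}"
        )


class Agent:
    def __init__(self):
        self.handlers = {}  # protocol name -> handler
        self.store = {}     # thread id -> InteractionState (stand-in for a database)

    def register(self, handler):
        self.handlers[handler.protocol] = handler

    def receive(self, message):
        # Assumed convention: the protocol name is the first path segment.
        protocol = message["type"].split("/")[0]
        handler = self.handlers.get(protocol)
        if handler is None:
            raise LookupError(f"no handler for protocol {protocol!r}")
        # Reuse previously persisted state for this thread, or start fresh.
        interaction = self.store.setdefault(
            message["thread_id"], InteractionState(protocol)
        )
        return handler.handle(message, interaction)
```

Under this design, adding a new interoperable feature amounts to calling `register` with one more handler; the dispatch code in `receive` does not change.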
Protocols are composable--meaning that you can nest one inside another. The protocol for asking someone to repeat their last sentence can occur inside the protocol for ordering food at a restaurant. The protocols for reporting an error or arranging payment can occur inside a protocol for issuing credentials.
When we invoke one protocol inside another, we call the inner protocol a subprotocol, and the outer protocol a superprotocol. A given protocol may be a subprotocol in some contexts, and a standalone protocol in others. In some contexts, a protocol may be a subprotocol from one perspective, and a superprotocol from another (as when protocols are nested at least 3 deep).
Commonly, protocols wait for subprotocols to complete, and then they continue. A good example of this is ACKs, which are often used as a discrete step in a larger flow.
In other cases, a subprotocol is not "contained" inside its superprotocol. Rather, the superprotocol triggers the subprotocol, then continues in parallel, without waiting for the subprotocol to complete. In the Introduce Protocol, the final step is to begin a connection protocol between the two introducees, but the introduction superprotocol completes when the connect subprotocol starts, not when it completes.
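The difference between a subprotocol that its superprotocol waits on and one that is merely triggered can be sketched as follows. The `Interaction` class, its `wait` flag, and the state names are assumptions made for illustration; real agents track this in whatever way suits their design.

```python
class Interaction:
    """One running instance of a protocol (all names here are illustrative)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent    # the superprotocol instance, if any
        self.state = "running"
        self.waiting_on = None  # subprotocol this interaction is blocked on

    def start_sub(self, name, wait):
        """Start a subprotocol; if wait is true, block until it completes."""
        child = Interaction(name, parent=self)
        if wait:
            self.state = "waiting"
            self.waiting_on = child
        return child

    def complete(self):
        """Finish this interaction and resume the parent if it was waiting."""
        self.state = "done"
        parent = self.parent
        if parent is not None and parent.waiting_on is self:
            parent.waiting_on = None
            parent.state = "running"

    def depth(self):
        """Nesting level: 0 for a standalone protocol, 1 for its subprotocol, ..."""
        return 0 if self.parent is None else self.parent.depth() + 1
```

With `wait=True`, an ACK-style subprotocol holds its superprotocol in `waiting` until `complete()` is called on the child. With `wait=False`, as in the introduction example, the superprotocol can call `complete()` on itself while the triggered subprotocol is still `running`.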
A message family is a collection of messages that share a common theme, goal, or usage pattern. The messages used by a protocol may be a subset of a particular message family; for example, the DID Exchange Protocol uses one subset of the messages in the connections message family, and the sync connection protocol uses a different subset.
Collectively, the message types of a protocol serve as its interface. Each protocol has a primary message family, and the name of the protocol is often the name of the primary message family.
A protocol has the following ingredients:
- Name and semver-compatible version
- URI that uniquely identifies it
- Messages (primary message family)
- Adopted messages
- State and sequencing rules
- Events that can change state -- notably, messages, but also errors, timeouts, and other things
- Constraints that provide trust and incentives
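The ingredients listed above can be gathered into a single record. The sketch below is one possible shape, not a format this RFC prescribes: the field names, the `send:`/`receive:` event labels, the `example.org` URI, and the tic-tac-toe message and state names are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ProtocolDefinition:
    name: str
    version: str                  # semver-compatible, e.g. "1.0"
    uri: str                      # uniquely identifies the protocol
    messages: tuple               # primary message family
    adopted_messages: tuple = ()  # messages borrowed from other protocols
    # State and sequencing rules: (role, state, event) -> next state.
    # Events are messages ("send:x", "receive:x") or other things ("timeout").
    transitions: dict = field(default_factory=dict)
    constraints: tuple = ()       # prose statements about trust and incentives

    def undeclared_messages(self):
        """Messages used in transitions but not declared or adopted."""
        known = set(self.messages) | set(self.adopted_messages)
        used = {
            event.split(":", 1)[1]
            for (_role, _state, event) in self.transitions
            if event.startswith(("send:", "receive:"))
        }
        return used - known


# Illustrative values only; see the attached tictactoe RFC for the real ones.
TICTACTOE = ProtocolDefinition(
    name="tictactoe",
    version="1.0",
    uri="https://example.org/tictactoe/1.0",
    messages=("move", "outcome"),
    adopted_messages=("problem-report",),
    transitions={
        ("player", "my-move", "send:move"): "their-move",
        ("player", "their-move", "receive:move"): "my-move",
        ("player", "my-move", "send:outcome"): "done",
        ("player", "their-move", "receive:outcome"): "done",
        ("player", "their-move", "timeout"): "done",
    },
    constraints=("players alternate moves",),
)
```

A check like `undeclared_messages` is one cheap way to catch a definition whose state rules mention a message that the protocol never declares.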
How to define a protocol or message family
To define a protocol, write an RFC. Specific instructions for protocol RFCs, and a discussion about the theory behind detailed protocol concepts, are given in the Template for Protocol RFCs. The tictactoe protocol is also attached to this RFC as an example.
- Message Type and Protocol Identifier URIs
- Semver Rules for Protocols
- State Details and State Machines
- Roles, Participants, Parties, and Controllers
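To give a feel for what the first two documents above cover, here is a small sketch of parsing a message type URI and comparing versions. It rests on two assumptions that are not spelled out in this section: that a message type URI has the layout `<prefix>/<protocol-name>/<major>.<minor>/<message-name>`, and that two versions are compatible when the protocol name and major version match. The linked documents are the authority; the `example.org` prefix and the function names are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed layout: <prefix>/<protocol-name>/<major>.<minor>/<message-name>
_MTURI = re.compile(
    r"^(?P<prefix>.+)/(?P<protocol>[A-Za-z0-9_.-]+)/"
    r"(?P<major>\d+)\.(?P<minor>\d+)/(?P<message>[A-Za-z0-9_.-]+)$"
)


class MessageType(NamedTuple):
    prefix: str
    protocol: str
    major: int
    minor: int
    message: str


def parse_message_type(uri):
    """Split a message type URI into its parts, or raise ValueError."""
    match = _MTURI.match(uri)
    if match is None:
        raise ValueError(f"not a recognized message type URI: {uri!r}")
    return MessageType(
        match["prefix"],
        match["protocol"],
        int(match["major"]),
        int(match["minor"]),
        match["message"],
    )


def can_handle(supported, received):
    """Assumed rule: same protocol and major version; minor may differ."""
    return (supported.protocol, supported.major) == (
        received.protocol,
        received.major,
    )
```

Under these assumptions, a handler written for `tictactoe` 1.0 would accept a 1.3 message and reject a 2.0 message.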
This RFC creates some formalism around defining protocols. It doesn't go nearly as far as SOAP or CORBA/COM did, but it is slightly more demanding of a protocol author than the familiar world of RESTful Swagger/OpenAPI.
The extra complexity is justified by the greater demands that agent-to-agent communications place on the protocol definition. (See notes in Prior Art section for details.)
Rationale and alternatives
Some of the simplest DIDComm protocols could be specified in a Swagger/OpenAPI style. This would give some nice tooling. However, not all fit into that mold. It may be desirable to create conversion tools that allow Swagger interop.
BPMN (Business Process Model and Notation) is a graphical language for modeling flows of all types (plus things less like our protocols as well). BPMN is a mature standard sponsored by the OMG (Object Management Group). It has a nice tool ecosystem. It also has an XML file format, so the visual diagrams have a two-way transformation to and from formal written language. And it has a code generation mode, where BPMN can be used to drive executable behavior if diagrams are sufficiently detailed and sufficiently standard. (Since BPMN supports various extensions and is often used at various levels of formality, execution is not its most common application.)
BPMN began with a focus on centralized processes (those driven by a business entity), with diagrams organized around the goal of the point-of-view entity and what they experience in the interaction. This is somewhat different from a DIDComm protocol where any given entity may experience the goal and the scope of interaction differently; the state machine for a home inspector in the "buy a home" protocol is quite different, and somewhat separable, from the state machine of the buyer, and that of the title insurance company.
BPMN 2.0 introduced the notion of a choreography, which is much closer to the concept of an A2A protocol, and which has quite an elegant and intuitive visual representation. However, even a BPMN choreography doesn't have a way to discuss interactions with decorators, adoption of generic messages, and other A2A-specific concerns. Thus, we may lean on BPMN for some diagramming tasks, but it is not a substitute for the RFC definition procedure described here.
WSDL (Web Services Description Language) is a web-centric evolution of earlier, RPC-style interface definition languages like IDL in all its varieties and CORBA. These technologies describe a called interface, but they don't describe the caller, and they lack a formalism for capturing state changes, especially by the caller. They are also out of favor in the programmer community at present, as being too heavy, too fragile, or poorly supported by current tools.
Swagger / OpenAPI
Swagger / OpenAPI overlaps about 60% with the concerns of protocol definition in agent-to-agent interactions. We like the tools and the convenience of the paradigm offered by OpenAPI, but where these two do not overlap, we have impedance.
Agent-to-agent protocols must support more than 2 roles, or two roles that are peers, whereas RESTful web services assume just client and server--and only the server has a documented API.
Agent-to-agent protocols are fundamentally asynchronous, whereas RESTful web services mostly assume synchronous request-response.
Agent-to-agent protocols have complex considerations for diffuse trust, whereas RESTful web services centralize trust in the web server.
Agent-to-agent protocols need to support transports beyond HTTP, whereas RESTful web services do not.
Agent-to-agent protocols are nestable, while RESTful web services don't provide any special support for that construct.
- Pdef (Protocol Definition Language): An alternative to Swagger.
- JSON RPC: Defines how invocation of remote methods can be accomplished by passing JSON messages. However, the RPC paradigm assumes request/response pairs, and does not provide a way to describe state and roles carefully.
- IPC Protocol Definition Language (IPDL): This is much closer to agent protocols in terms of its scope of concerns than OpenAPI. However, it is C++ only, and intended for use within browser plugins.
- Should we write a Swagger translator?
- If not Swagger, what formal definition format should we use in the future?
The following lists the implementations (if any) of this RFC. Please do a pull request to add your implementation. If the implementation is open source, include a link to the repo or to the implementation within the repo. Please be consistent in the "Name" field so that a mechanical processing of the RFCs can generate a list of all RFCs supported by an Aries implementation.
| Name / Link | Implementation Notes |
| --- | --- |
| Indy Cloud Agent - Python | several protocols, circa Feb 2019 |
| Aries Framework - .NET | several protocols, circa Feb 2019 |
| Streetcred.id | several protocols, circa Feb 2019 |
| Aries Cloud Agent - Python | numerous protocols plus extension mechanism for pluggable protocols |
| Aries Static Agent - Python | 2 or 3 protocols |
| Aries Framework - Go | DID Exchange |
| Connect.Me | mature but proprietary protocols; community protocols in process |
| Verity | mature but proprietary protocols; community protocols in process |
| Aries Protocol Test Suite | 2 or 3 core protocols; active work to implement all that are ACCEPTED, since this tests conformance of other agents |