Define client/agent protocol #1652
Comments
I don't think we need it, at least for now. |
Maybe.. but it could be a simple data type depending on the message or option. For example, setting the depth filter needs an integer for the depth, while setting the function name filter needs a string for the name. |
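To make the "data type depends on the option" idea concrete, here is a minimal sketch of a fixed header followed by a variable-length payload. Everything in it (the `demo_` names, the header layout, the option values) is hypothetical and only for illustration; it is not the existing uftrace_msg format nor the final agent enum.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical option identifiers; not uftrace's actual enum. */
enum demo_agent_opt {
	DEMO_OPT_DEPTH = 1,  /* payload: int32_t depth */
	DEMO_OPT_FILTER = 2, /* payload: NUL-terminated function name */
};

/* Hypothetical fixed-size header, followed by 'len' bytes of payload. */
struct demo_opt_hdr {
	uint16_t opt;      /* enum demo_agent_opt */
	uint16_t reserved; /* keeps the header 8 bytes wide */
	uint32_t len;      /* payload length in bytes */
};

/* Pack header + payload into buf; return total size, or 0 if buf is too small. */
static size_t demo_pack(unsigned char *buf, size_t cap, uint16_t opt,
			const void *data, uint32_t len)
{
	struct demo_opt_hdr hdr = { .opt = opt, .reserved = 0, .len = len };

	if (cap < sizeof(hdr) || cap - sizeof(hdr) < len)
		return 0;
	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + sizeof(hdr), data, len);
	return sizeof(hdr) + len;
}

/* The depth option carries a fixed-size integer. */
static size_t demo_pack_depth(unsigned char *buf, size_t cap, int32_t depth)
{
	return demo_pack(buf, cap, DEMO_OPT_DEPTH, &depth, sizeof(depth));
}

/* The filter option carries a string, so its length varies per message. */
static size_t demo_pack_filter(unsigned char *buf, size_t cap, const char *name)
{
	return demo_pack(buf, cap, DEMO_OPT_FILTER, name, (uint32_t)strlen(name) + 1);
}
```

With this shape the receiver reads the header first, then reads exactly `len` more bytes and interprets them according to `opt`, so integers and strings travel the same way.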
It doesn't need to be a list. It could be bit-shifted flags.. then it'd be limited by the number of bits. It could be an array.. then you also need to pass the length. |
I'm not sure the handshake is needed for every message exchange. I don't think we need to maintain a connection state. I think we can have a simple exchange flow like below:
|
Agree, this keeps things simple. That's the solution I'm currently using and it works well. |
I like the bit-shifted flags because they are easy to understand and to manipulate. We already use this for |
Sure, my explanation wasn't clear. The handshake only happens once, when initializing the connection. I do agree with your proposed diagram. |
Sounds good. I guess you're gonna update the existing PRs according to this, right? |
Of course. I'm updating #1643 first, and when it's done I'll rebase the others. |
Hi @namhyung, I have a few questions. Is it ok to use
Then I would directly add agent messages to
Second thing, I need a simple way to tell the agent which option to apply. For example in
Or maybe I can use |
Using and extending the existing uftrace_msg format looks good. I'm ok with having a new enum for agent options. I also thought about the short options but I think it'd be better to split since we don't support all options for agent and the value might be changed later. |
Let's define the communication protocol between the client and the agent.
From previous discussions (see #1643), and upcoming pull requests (#1644, #1645, #1646, #1647 ATM), we can outline the following points.
Context
When the agent is started (--agent/-g) in libmcount, it can receive messages from a client at runtime. We're introducing new features in the agent, so it can modify tracing parameters, e.g. tracing state (on/off), tracing depth and triggers, but also patch and unpatch the running process.
Connection handling
Currently, the agent can sequentially handle multiple client connections. The next one is processed when the current one is terminated.
Some operations like patching are heavy and take some time. Any incoming connection will hang until the heavy operation in progress has completed.
The agent will support many independent features e.g. changing tracing depth and patching, which have no reason to be done in a particular order.
Protocol phases
The aim of the message exchange is to ultimately forward options from a client to the agent. To make this robust, we introduce control phases as follows.
Handshake phase (once)
The handshake phase initializes the connection.
If the handshake fails, the client aborts.
Data exchange phase (repeated)
When the handshake succeeds, the client and the agent can exchange data.
The client is aware of the capabilities of the agent.
Client to agent
The client can instruct the agent to apply user options, as mentioned earlier.
The client skips actions that the agent is known not to support.
Agent to client
The client can fetch info from the agent. The agent can send requested info about its state (e.g. parameter values) to the client.
Validation phase (repeated)
After exchanging data, the client checks the status of the agent. The agent indicates whether it succeeded or failed, giving a reason in case of failure (error code).
Internal definition
Message type
We use an enumeration to define the agent message types.
Agent capabilities
The agent supports given features. Currently they are defined as bit-shifted flags.
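A minimal sketch of what bit-shifted capability flags could look like, and how the client could use them to skip unsupported actions. The feature names and helpers are hypothetical, chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; the names are illustrative only. */
enum demo_agent_feature {
	DEMO_FEAT_DEPTH = 1U << 0,
	DEMO_FEAT_TRIGGER = 1U << 1,
	DEMO_FEAT_PATCH = 1U << 2,
};

/* True only if every bit in 'wanted' is set in 'caps'. */
static bool demo_agent_supports(uint32_t caps, uint32_t wanted)
{
	return (caps & wanted) == wanted;
}

/* Keep only the requested features that the agent advertises. */
static uint32_t demo_filter_requests(uint32_t caps, uint32_t requested)
{
	return caps & requested;
}
```

The capability reply then fits in a single integer, at the cost of being limited by the number of bits in that integer.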
You suggested returning a list of supported features. I am not sure what you mean regarding the list.
For example, a capability query would return:
FEATURE1 | FEATURE2 | ... if we use bit-shifted flags
{FEATURE1} -> {FEATURE2} -> ... if we use a list
Naming
We use the UFTRACE_AGENT_MSG_ prefix for messages. This is similar to the following definition:
Using abbreviations to make names shorter is confusing IMO.
"Each message should handle variable-length data then."
Message types
Protocol control
We define two messages to handle the state of the protocol:
State query
We define a GET_OPT message for the client to fetch info about the agent. It holds an option name as data. The response holds the value as data.
Option forwarding
Similarly, we define a SET_OPT message to apply user options in the agent. The agent supports multiple options and related data types. Partial definition of supported options:
The options are sent as the data payload of the SET_OPT message.
Error checking
After every operation, the agent returns a status message. It sets an error code depending on the situation:
If the agent doesn't answer, it is up to the client to determine when to abort its operation. For example we can use a timeout threshold.
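As a rough illustration of the timeout idea, the client could wait for the reply with poll() before reading. This is only a sketch: `demo_read_reply` and its return convention are hypothetical, and the timeout value would be up to the client.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for a reply on fd, then read it.
 * Returns the number of bytes read, 0 if the peer closed the
 * connection, -1 on error, or -2 if the timeout expired.
 */
static ssize_t demo_read_reply(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	/* Note: retrying on EINTR restarts the full timeout. */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return -2; /* no answer in time: the caller decides whether to abort */
	return read(fd, buf, len);
}
```

Keeping the timeout on the client side means the agent does not need to track any per-connection deadline.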
Failure error codes
We can have multiple failure error codes, for example:
Definition
We use the following definition for message types:
Sequence diagram
Debug info
Agent output
Level 1
Level 2
Level 3
Client output
Level 1
Level 2
Level 3