Data Plane Framework Idempotence #622

Closed
jimmarino opened this issue Feb 3, 2022 · 3 comments

jimmarino commented Feb 3, 2022

The DPF (#463) must support idempotent transfer request handling. This issue lays out what this idempotent behavior entails and a strategy for implementing it.

Preliminary: Idempotent vs. Fire-and-Forget (FAF) Requests

The DPF will support two request types: idempotent requests, which, by definition, must be tracked via durable storage; and Fire-and-Forget requests that are not tracked using durable storage. FAF requests are a performance optimization when idempotency is not needed, such as in a proxy scenario when the client blocks on the request.
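
As a rough illustration of the distinction, the sketch below routes a request either through a store or straight past it. `TransferRequest`, its `idempotent` flag, and the dict standing in for durable storage are hypothetical names chosen for this example, not part of the DPF API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferRequest:
    """Hypothetical request shape; the field names are illustrative only."""
    process_id: str
    idempotent: bool = True


def submit(request: TransferRequest, store: dict) -> str:
    """Track idempotent requests in the store; let FAF requests bypass it."""
    if request.idempotent:
        # A dict stands in for durable storage in this sketch.
        store.setdefault(request.process_id, "IN_PROCESS")
        return "tracked"
    # Fire-and-Forget: nothing is persisted, so a retry cannot be detected.
    return "untracked"
```

The FAF branch never touches the store, which is where the performance gain would come from, at the cost of being unable to recognize a resubmission.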

Goal

Idempotence does not mean that the DPF can always guarantee no side effects outside its purview. For example, a data transfer may be initiated twice (once after a failure), resulting in copying the same data multiple times. Therefore, end-to-end idempotency may require supporting behavior from external systems, such as the infrastructure backing a data sink.

The goal of DPF idempotency is therefore to enable end-to-end idempotent behavior but not to guarantee it.

Implementation

When an idempotent request is submitted, the DPF will:

  1. Check for an entry in the DataPlaneStore.
  2. If no entry is found, the current DPF node will create one and start a transfer.
  3. If an entry is found and the state is COMPLETED or FATAL ERROR, an appropriate response will be returned to the requestor.
  4. If an entry is found and the state is IN_PROCESS, the protocol described below will be initiated.
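
The four steps above could be sketched roughly as follows. This is only an illustration under assumptions: `InMemoryDataPlaneStore`, `handle_idempotent_request`, the `start_transfer` and `resolve_in_process` callbacks, and the returned strings are all hypothetical, and an in-memory dict replaces the durable store.

```python
from enum import Enum


class State(Enum):
    IN_PROCESS = "IN_PROCESS"
    COMPLETED = "COMPLETED"
    FATAL_ERROR = "FATAL ERROR"


class InMemoryDataPlaneStore:
    """Stand-in for the DataPlaneStore; a real one would be durable."""

    def __init__(self):
        self._entries = {}

    def find(self, process_id):
        return self._entries.get(process_id)

    def save(self, process_id, state):
        self._entries[process_id] = state


def handle_idempotent_request(process_id, store, start_transfer, resolve_in_process):
    state = store.find(process_id)                      # step 1: look up the entry
    if state is None:                                   # step 2: first submission
        store.save(process_id, State.IN_PROCESS)
        start_transfer(process_id)
        return "STARTED"
    if state in (State.COMPLETED, State.FATAL_ERROR):   # step 3: terminal state
        return state.value
    return resolve_in_process(process_id)               # step 4: run the protocol
```

Note that the lookup and the create in this sketch are two separate calls. The issue does not say how concurrent submissions on different nodes are ordered, so a real store would probably need an atomic create-if-absent operation.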

Idempotent Protocol

If an entry is already found in the DataPlaneStore when a request is made, the current node must determine whether the request is still in process or whether the node handling it crashed, in which case the request needs to be restarted. In a clustered environment, the request may be in process on another node.

To disambiguate these two conditions (crashed vs. handled on a different node), the current node will use the Backplane topic (the Backplane and its implementation will be detailed in another issue):

  1. The current node will broadcast an in-process query message to other nodes over the Backplane.
  2. If a node is not currently processing the request, it will ignore the message.
  3. If a node is processing the request, it will send an acknowledgment message on a private channel to the requesting (current) node.
  4. The current node will wait a configurable amount of time for an acknowledgment message. If no message is received, it will initiate processing the request itself.
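
A minimal single-process sketch of this exchange is shown below. Since the Backplane is to be specified in another issue, `InMemoryBackplane`, `Node`, `on_query`, and `resolve_in_process` are invented for illustration, and a `queue.Queue` plays the role of the private acknowledgment channel.

```python
import queue


class InMemoryBackplane:
    """Toy stand-in for the Backplane topic; not the real design."""

    def __init__(self):
        self._nodes = []

    def join(self, node):
        self._nodes.append(node)

    def broadcast_query(self, sender, process_id, reply_channel):
        # Deliver the in-process query to every node except the sender.
        for node in self._nodes:
            if node is not sender:
                node.on_query(process_id, reply_channel)


class Node:
    def __init__(self, name, backplane):
        self.name = name
        self.active = set()   # process ids this node is currently handling
        self.backplane = backplane
        backplane.join(self)

    def on_query(self, process_id, reply_channel):
        # Steps 2 and 3: stay silent unless this node owns the request.
        if process_id in self.active:
            reply_channel.put(self.name)

    def resolve_in_process(self, process_id, ack_timeout_seconds):
        reply_channel = queue.Queue()   # private channel for acknowledgments
        self.backplane.broadcast_query(self, process_id, reply_channel)  # step 1
        try:
            # Step 4: wait a configurable time for an acknowledgment.
            owner = reply_channel.get(timeout=ack_timeout_seconds)
            return f"in process on {owner}"
        except queue.Empty:
            # No acknowledgment: presume a crash and take over the request.
            self.active.add(process_id)
            return "restarted locally"
```

For example, with two nodes `a` and `b` joined to the same backplane and `b.active` containing `"p1"`, `a.resolve_in_process("p1", 0.05)` returns `"in process on b"`, while a query for an id that no node owns times out and is restarted on `a`. The timeout trades latency against the risk of a duplicate transfer when the owning node is merely slow to answer.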

Note that idempotency support will only be partially implemented for Milestone 2 (the Backplane will not be implemented).

Note: after a discussion with @paullatzelsperger and @bscholtes1A, it may be possible to implement the Backplane as part of the DataPlaneStore if each entry contains a resolvable address (URL) of the node that originally handled the request. In that case, the current node can query the resolvable address directly. If the address does not respond or is unavailable, the process can be assumed to have crashed.
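
This store-based alternative might look like the sketch below. `StoreEntry`, its `node_url` field, and `owner_is_alive` are hypothetical. The `probe` callable is injected so that the example stays self-contained. In practice it would presumably issue an HTTP request against whatever status endpoint the node exposes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StoreEntry:
    process_id: str
    state: str
    node_url: str   # resolvable address of the node that first handled the request


def owner_is_alive(entry: StoreEntry, probe: Callable[[str], bool]) -> bool:
    """Return True if the recorded node answers; treat any I/O failure as a crash."""
    try:
        return probe(entry.node_url)
    except OSError:
        # Connection refused, timeout, or unresolvable host: assume the node crashed.
        return False
```

An unreachable node and a crashed node are indistinguishable here, so a network partition could still lead to the same transfer being started twice.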

@jimmarino jimmarino self-assigned this Feb 3, 2022
@jimmarino jimmarino added the dpf Feature related to the Data Plane Framework label Feb 3, 2022

github-actions bot commented Jun 7, 2022

This issue is stale because it has been open for 28 days with no activity.

@github-actions github-actions bot added the stale Open for x days with no activity label Jun 7, 2022

This issue was closed because it has been inactive for 7 days since being marked as stale.

@bscholtes1A bscholtes1A reopened this Jun 15, 2022

This issue was closed because it has been inactive for 7 days since being marked as stale.
