Long Running Actions

Introduction

The proposal introduces APIs for services to coordinate activities.

The main thrust of the proposal is an API that lets loosely coupled services coordinate long running activities in such a way as to guarantee a globally consistent outcome without the need to take locks on data.

Motivation

In a loosely coupled service based environment there is sometimes a need for different services to provide consistency guarantees. Typical examples include:

  • order processing involving three services (take order, bill customer, ship product). If the shipping service finds that it is out of stock then the customer will have been billed with no prospect of receiving the item;

  • an airline overbooks a flight which means that the booking count and the flight capacity are inconsistent.

There are various ways that systems overcome such inconsistency but it would be advantageous to provide a generic solution which handles failure conditions, maintains state for those flows that span long periods of time and ensures that remedial activities are called correctly.

Traditional techniques for guaranteeing consistency in distributed environments have focused on XA transactions, where locks may be held for long periods, thereby introducing strong coupling between services and decreasing concurrency to unacceptable levels. Additionally, if such a transaction aborts then valuable work which may be valid will be rolled back. In view of these issues an alternative approach is desirable.

Goals

  • support long running actions

  • no strong coupling between services

  • allow actions to finish early

  • allow compensating actions if a business activity is cancelled

Proposed solution

We propose a compensation based approach in which participants make changes visible but register a compensatory action which is performed if something goes wrong. We call the model LRA (short for Long Running Action). It is based on work done within the OASIS Web Services Composite Application Framework Technical Committee, namely the Long Running Action transaction model, but updated to be more suited for use in microservice based architectures.

In the LRA model, an activity reflects business interactions: all work performed within the scope of an activity is required to be compensatable. Therefore, an activity’s work is either performed successfully or undone. How services perform their work and ensure it can be undone if compensation is required are implementation choices and are not exposed to the LRA model, which simply defines the triggers for compensation actions and the conditions under which those triggers are executed. In other words, an LRA coordinator is concerned only with ensuring participants obey the protocol necessary to make an activity compensatable (and the semantics of the business interactions are not part of the model). Issues such as isolation of services between potentially conflicting activities and durability of service work are assumed to be implementation decisions. The coordination protocol used to ensure an activity is completed successfully or compensated is not two-phase and is intended to better model interactions between microservices. Although this may result in non-atomic behaviour for the overall business activity, other activities may be started by the service to attempt to compensate in some other manner.

In the model, an LRA is tied to the scope of an activity so that when the activity terminates the LRA coordination protocol will be automatically performed either to accept or to compensate the work. For example, when a user reserves a seat on a flight, the airline reservation centre may take an optimistic approach and actually book the seat and debit the user’s account, relying on the fact that most of their customers who reserve seats later book them; the compensation action for this activity would be to un-book the seat and credit the user’s account.

As in any business interaction, service activities may or may not be compensatable. Even the ability to compensate may be a transient capability of a service. A Compensator (or simply LRA participant) is the LRA participant that operates on behalf of a service to undo the work it performs within the scope of an LRA or to compensate for the fact that the original work could not be completed.

The Model

The model concerns participants (aka Compensators) and a coordinator. A client starts a new LRA via a call to an LRA coordination service. This call creates a new LRA coordinator. When a business service does work that may have to be later compensated for within the scope of the LRA, it enlists a participant with the LRA coordinator. Subsequently the client closes or cancels the LRA via the coordinator which in turn tells all enlisted participants to either complete or compensate:

LRA Protocol Sequence
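The interaction described above can be sketched with a toy in-memory coordinator. This is only an illustration of the message flow: `SketchCoordinator` and `SketchParticipant` are hypothetical names that do not appear in the proposal, and a real coordinator would be a remote service with durable state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical callback pair standing in for a participant's complete/compensate endpoints. */
interface SketchParticipant {
    void complete(String lraId);
    void compensate(String lraId);
}

/** Toy coordinator (an assumption for illustration, not the proposal's API). */
class SketchCoordinator {
    private final Map<String, List<SketchParticipant>> enlisted = new HashMap<>();

    /** The client starts a new LRA and receives its id (the LRA context). */
    String start() {
        String lraId = UUID.randomUUID().toString();
        enlisted.put(lraId, new ArrayList<>());
        return lraId;
    }

    /** A service that did compensatable work enlists a participant. */
    void enlist(String lraId, SketchParticipant participant) {
        participantsOf(lraId).add(participant);
    }

    /** Closing the LRA tells every enlisted participant to complete. */
    void close(String lraId) {
        for (SketchParticipant p : end(lraId)) {
            p.complete(lraId);
        }
    }

    /** Cancelling the LRA tells every enlisted participant to compensate. */
    void cancel(String lraId) {
        List<SketchParticipant> participants = end(lraId);
        for (int i = participants.size() - 1; i >= 0; i--) {
            participants.get(i).compensate(lraId); // reverse enlistment order
        }
    }

    private List<SketchParticipant> participantsOf(String lraId) {
        List<SketchParticipant> list = enlisted.get(lraId);
        if (list == null) {
            throw new IllegalStateException("unknown or finished LRA: " + lraId);
        }
        return list;
    }

    /** Removes the LRA so that it cannot be ended twice, and returns its participants. */
    private List<SketchParticipant> end(String lraId) {
        List<SketchParticipant> list = participantsOf(lraId);
        enlisted.remove(lraId);
        return list;
    }
}
```

A client would call `start()`, pass the returned id to the services it invokes so that they can `enlist(...)`, and finally call either `close(...)` or `cancel(...)`.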

The lifecycle of an LRA

The LRA participant will be invoked in the following way by the LRA coordinator when the activity terminates:

  • Success: the activity has completed successfully. If the activity is nested then participants may propagate themselves to the enclosing LRA. Otherwise the participants are informed that the activity has terminated and they can perform any necessary cleanup.

  • Fail: the activity has completed unsuccessfully. All participants that are registered with the LRA will be invoked to perform compensation in the reverse order. The coordinator forgets about all participants that indicated they operated correctly. Otherwise, compensation may be attempted again (possibly after a period of time) or alternatively a compensation violation has occurred and must be logged. Each service is required to log sufficient information in order to ensure (with best effort) that compensation is possible. Each participant or subordinate coordinator (in the case of nested LRAs) is responsible for ensuring that sufficient data is made durable in order to undo the LRA in the event of failures.

Interposition and checkpointing of state allow the system to drive a consistent view of the outcome and of the recovery actions taken, while always allowing for the possibility that recovery isn’t possible and must be logged or flagged for the administrator.

In a large scale environment or in the presence of long term failures, recovery may not be automatic and manual intervention may be necessary to restore an application’s consistency.

Note that calling participants in reverse order does not guarantee that the compensation actions will be performed in strict sequential order since participants are allowed to indicate that the compensation is in progress and will complete at some future time. Furthermore a participant can indicate that it failed to compensate, or could be unavailable in which case it will be periodically retried (out of order).
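A minimal sketch of this behaviour, assuming each compensator is modelled as a `BooleanSupplier` that returns `true` once it has compensated and `false` when it needs to be called again. The class name, the `maxRounds` parameter and the retry policy are illustrative assumptions. The proposal does not prescribe how or when retries happen.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

class ReverseCompensationSketch {
    /**
     * Calls the compensators in reverse enlistment order and retries the ones that
     * report false. Returns the compensators still pending after maxRounds so the
     * caller can log them for an administrator.
     */
    static List<BooleanSupplier> compensateAll(List<BooleanSupplier> enlisted, int maxRounds) {
        Deque<BooleanSupplier> pending = new ArrayDeque<>();
        for (BooleanSupplier c : enlisted) {
            pending.addFirst(c); // the last one enlisted ends up first in the queue
        }
        for (int round = 0; round < maxRounds && !pending.isEmpty(); round++) {
            Deque<BooleanSupplier> retry = new ArrayDeque<>();
            for (BooleanSupplier c : pending) {
                if (!c.getAsBoolean()) {
                    retry.addLast(c); // retried in a later round, hence out of order
                }
            }
            pending = retry;
        }
        return new ArrayList<>(pending);
    }
}
```

With three participants A, B and C enlisted in that order, where B fails on its first attempt, the calls happen as C, B, A and then B again. This illustrates why reverse order does not imply strict sequential completion.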

Participants follow a state model with the following states:

  • Compensating: a participant is currently compensating for the work it did during the LRA;

  • Compensated: a participant has successfully compensated for the LRA.

  • FailedToCompensate: the participant was not able to compensate for the LRA. It MUST maintain information about the work it was to compensate until the coordinator sends it a forget message.

  • Completing: the participant is tidying up after being told to complete.

  • Completed: the participant has confirmed that it has finished tidying up.

  • FailedToComplete: the participant was unable to tidy-up. It MUST maintain information about the work it was to complete until the coordinator sends it a forget message.

Participant state model
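The participant states listed above could be encoded as follows. The state names come from the list. The transitions in `next()` are an assumption inferred from the state descriptions, and the enum and method names are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

enum ParticipantStateSketch {
    Compensating, Compensated, FailedToCompensate, Completing, Completed, FailedToComplete;

    /** States reachable from this one (inferred from the descriptions, not normative). */
    Set<ParticipantStateSketch> next() {
        switch (this) {
            case Compensating:
                return EnumSet.of(Compensated, FailedToCompensate);
            case Completing:
                return EnumSet.of(Completed, FailedToComplete);
            default:
                return EnumSet.noneOf(ParticipantStateSketch.class); // end states
        }
    }

    /** A failed participant MUST keep its record until the coordinator sends forget. */
    boolean mustRememberUntilForget() {
        return this == FailedToCompensate || this == FailedToComplete;
    }
}
```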

The LRA follows a similar state model:

  • Compensating: the LRA is currently being cancelled

  • Compensated: the LRA has successfully cancelled

  • FailedToCompensate: one or more participants were not able to compensate

  • Completing: the LRA is currently being closed

  • Completed: the LRA has closed

  • FailedToComplete: one or more participants were not able to complete
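One way to relate the two state models is to fold the states reported by the participants into a single LRA state. The sketch below does this for a cancelled LRA. The class and method names are hypothetical, and the precedence (a failure outranks work still in progress) is an assumption. The proposal only says that the LRA is FailedToCompensate when one or more participants could not compensate.

```java
import java.util.List;

class LraOutcomeSketch {
    /** Folds the states reported by the participants of a cancelled LRA into one LRA state. */
    static String afterCancel(List<String> participantStates) {
        if (participantStates.contains("FailedToCompensate")) {
            return "FailedToCompensate"; // one or more participants could not compensate
        }
        if (participantStates.contains("Compensating")) {
            return "Compensating"; // at least one participant is still working
        }
        return "Compensated";
    }
}
```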

Different usage patterns for LRAs are possible, for example LRAs may be used sequentially and/or concurrently, where the termination of one LRA signals the start of some other unit of work within an application. However, LRAs are units of compensatable work and an application may have as many such units of work operating simultaneously as it needs to accomplish its tasks. Furthermore, the outcome of work within LRAs may determine how other LRAs are terminated. An application can be structured so that LRAs are used to assemble units of compensatable work and then held in the active state while the application performs other work in the scope of different (concurrent or sequential) LRAs. Only when the right subset of work (LRAs) is arrived at by the application will that subset be confirmed; all other LRAs will be told to cancel (complete in a failure state).

In the rest of this proposal we specify two different APIs for controlling the lifecycle of and participation in LRAs and a third API for writing participants:

  1. Java Annotations for LRAs

  2. Client API

    • the client API is for use with containers that do not use Java annotations;

  3. Java based LRA participant registration API

    • this API supports services that do not use JAX-RS

Java Annotations for LRAs

Support for the proposal in MicroProfile is primarily based upon the use of Java annotations for controlling the lifecycle of LRAs and participants.

Java Annotations

A JAX-RS implementation of the specification should be achievable via a set of Java annotations which are available in the linked java package. The service developer annotates resources to specify how LRAs should be controlled and when to enlist a class as a participant:

Controlling the lifecycle of an LRA

/**
 * An annotation for controlling the lifecycle of Long Running Actions (LRAs).
 *
 * Newly created LRAs are uniquely identified and the id is referred to as the LRA context. The context is passed around
 * using a JAX-RS request/response header called LRAClient#LRA_HTTP_HEADER ("Long-Running-Action"). The implementation (of the LRA
 * specification) is expected to manage this context and the application developer is expected to declaratively control
 * the creation, propagation and destruction of LRAs using the @LRA annotation. When a JAX-RS bean method is invoked in the
 * context of an LRA any JAX-RS client requests that it performs will carry the same header so that the receiving
 * resource knows that it is inside an LRA context (typically achieved using JAX-RS client filters).
 *
 * Resource methods can access the context id, if required, by injecting it via the JAX-RS @HeaderParam annotation.
 * This may be useful, for example, for associating business work with an LRA.
 */
@Inherited
@Retention(value = RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
public @interface LRA {

    /**
     * The Type element of the LRA annotation indicates whether a bean method
     * is to be executed within a compensatable LRA context.
     */
    Type value() default Type.REQUIRED;

    /**
     * The Type element of the annotation indicates whether a bean method is to be executed within a
     * compensatable transaction (aka LRA) context where the values provide the following behavior:
     */
    enum Type {
        /**
         *  If called outside an LRA context a JAX-RS filter will begin a new LRA for the duration of the
         *  method call and when the call completes another JAX-RS filter will complete the LRA.
         */
        REQUIRED,

        /**
         *  If called outside an LRA context a JAX-RS filter will begin a new LRA for the duration of the
         *  method call and when the call completes another JAX-RS filter will complete the LRA.
         *
         *  If called inside an LRA context a JAX-RS filter will suspend it and begin a new LRA for the
         *  duration of the method call and when the call completes another JAX-RS filter will complete the
         *  LRA and resume the one that was active on entry to the method.
         */
        REQUIRES_NEW,

        /**
         *  If called outside a transaction context, the method call will return
         *  with a 412 Precondition Failed HTTP status code
         *
         *  If called inside a transaction context the bean method execution will then continue within
         *  that context.
         */
        MANDATORY,

        /**
         *  If called outside an LRA context the bean method execution
         *  must then continue outside an LRA context.
         *
         *  If called inside an LRA context the managed bean method execution
         *  must then continue inside this LRA context.
         */
        SUPPORTS,

        /**
         *  The bean method is executed without an LRA context. If a context is present on
         *  entry then it is suspended and then resumed after the execution has completed.
         */
        NOT_SUPPORTED,

        /**
         *  If called outside an LRA context the managed bean method execution
         *  must then continue outside an LRA context.
         *
         *  If called inside an LRA context the method is not executed and a
         *  412 Precondition Failed HTTP status code is returned to the caller.
         */
        NEVER
    }

    /**
     * Some annotations (such as REQUIRES_NEW) will start an LRA on entry to a method and
     * end it on exit. For some business activities it is desirable for the action to survive
     * method execution and be completed elsewhere.
     *
     * @return whether or not newly created LRAs will survive after the method has finished executing.
     */
    boolean delayClose() default false;

    /**
     * Normally if an LRA is present when a bean method is executed it will not be ended when
     * the method returns. To override this behaviour and force LRA termination on exit use the
     * terminal element
     *
     * @return true if an LRA that was present before method execution will be terminated when the bean method finishes.
     */
    boolean terminal() default false;

    /**
     * If true then the annotated class will be checked for participant annotations and when present the class
     * will be enlisted with any LRA that is associated with the invocation
     *
     * @return whether or not to automatically enlist a participant
     */
    boolean join() default true;

    /**
     * The cancelOnFamily element can be set to indicate which families of HTTP response codes will cause
     * the LRA to cancel. By default client errors (4xx codes) and server errors (5xx codes) will result in
     * cancellation of the LRA.
     *
     * @return the {@link Response.Status.Family} families that will cause cancellation of the LRA
     */
    @Nonbinding
    Response.Status.Family[] cancelOnFamily() default {};

    /**
     * The cancelOn element can be set to indicate which  HTTP response codes will cause the LRA to cancel
     *
     * @return the {@link Response.Status} HTTP status codes that will cause cancellation of the LRA
     */
    @Nonbinding
    Response.Status[] cancelOn() default {};
}
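The Type semantics documented in the annotation can be summarised as a decision on method entry. The following plain Java sketch re-declares the enum locally so that it compiles on its own. `LraTypeSketch`, `Action` and `onEntry` are hypothetical names, and the REQUIRED case with an incoming context is an assumption because the Javadoc above only describes the case where no context is present.

```java
class LraTypeSketch {
    enum Type { REQUIRED, REQUIRES_NEW, MANDATORY, SUPPORTS, NOT_SUPPORTED, NEVER }

    /** Hypothetical names for what a filter does before the bean method runs. */
    enum Action {
        BEGIN_NEW, SUSPEND_AND_BEGIN_NEW, CONTINUE_IN_CONTEXT,
        RUN_WITHOUT_CONTEXT, SUSPEND_AND_RUN_WITHOUT_CONTEXT, REJECT_412
    }

    static Action onEntry(Type type, boolean contextPresent) {
        switch (type) {
            case REQUIRED:
                // Assumption: an incoming context is reused.
                return contextPresent ? Action.CONTINUE_IN_CONTEXT : Action.BEGIN_NEW;
            case REQUIRES_NEW:
                return contextPresent ? Action.SUSPEND_AND_BEGIN_NEW : Action.BEGIN_NEW;
            case MANDATORY:
                return contextPresent ? Action.CONTINUE_IN_CONTEXT : Action.REJECT_412;
            case SUPPORTS:
                return contextPresent ? Action.CONTINUE_IN_CONTEXT : Action.RUN_WITHOUT_CONTEXT;
            case NOT_SUPPORTED:
                return contextPresent ? Action.SUSPEND_AND_RUN_WITHOUT_CONTEXT : Action.RUN_WITHOUT_CONTEXT;
            case NEVER:
                return contextPresent ? Action.REJECT_412 : Action.RUN_WITHOUT_CONTEXT;
            default:
                throw new IllegalArgumentException("unknown type: " + type);
        }
    }
}
```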

Example:

  @POST
  @Path("/book")
  @Produces(MediaType.APPLICATION_JSON)
  @LRA(value = LRA.Type.REQUIRED,
       cancelOn = {Response.Status.INTERNAL_SERVER_ERROR}, // cancel on a 500 code
       cancelOnFamily = {Response.Status.Family.CLIENT_ERROR}, // cancel on any 4xx code
       delayClose = true) // the LRA will continue to run when the method finishes
  public Response bookTrip(...) { ... }

  @PUT
  @Path("/confirm")
  @Produces(MediaType.APPLICATION_JSON)
  @Consumes(MediaType.APPLICATION_JSON)
  @LRA(value = LRA.Type.SUPPORTS,
       terminal = true) // the confirmation should trigger the closing of the LRA started in the bookTrip bean method
  public Booking confirmTrip(Booking booking) throws BookingException { ... }
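How an implementation might evaluate cancelOn and cancelOnFamily against a response status can be sketched in plain Java. Status codes are modelled as integers and families as the hundreds digit (4 for client errors, 5 for server errors). The class name is hypothetical, and it is an assumption of this sketch that explicitly configured values replace the documented 4xx/5xx default rather than extend it.

```java
import java.util.Set;

class CancelDecisionSketch {
    /**
     * @param status         the HTTP status code returned by the bean method
     * @param cancelOn       individual status codes that cancel the LRA
     * @param cancelOnFamily status families (status / 100) that cancel the LRA
     */
    static boolean shouldCancel(int status, Set<Integer> cancelOn, Set<Integer> cancelOnFamily) {
        int family = status / 100;
        if (cancelOn.isEmpty() && cancelOnFamily.isEmpty()) {
            // Documented default: client errors (4xx) and server errors (5xx) cancel the LRA.
            return family == 4 || family == 5;
        }
        // Assumption: explicit settings replace the default.
        return cancelOn.contains(status) || cancelOnFamily.contains(family);
    }
}
```

Under that assumption, the bookTrip configuration above (cancelOn 500, cancelOnFamily 4xx) would cancel on 404 and 500 but not on 503.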

When an LRA is present it SHOULD be made available to the business logic via request and response headers (with the name "Long-Running-Action").

Example:

  @PUT
  @Path("/confirm")
  @Produces(MediaType.APPLICATION_JSON)
  @LRA(value = LRA.Type.SUPPORTS, terminal = true)
  public Booking confirmTrip(
      @HeaderParam(LRAClient.LRA_HTTP_HEADER) String lraId) { ... }
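On the calling side, propagating the context amounts to setting the same header on outgoing requests. The proposal expects JAX-RS client filters to do this automatically. The sketch below uses the JDK's `java.net.http` client purely as a stand-in to show the header on the wire, and `LraHeaderSketch` and `withLraContext` are hypothetical names.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class LraHeaderSketch {
    /** Header name taken from the text above. */
    static final String LRA_HTTP_HEADER = "Long-Running-Action";

    /** Builds a PUT request that carries the LRA context to the target resource. */
    static HttpRequest withLraContext(URI target, String lraId) {
        return HttpRequest.newBuilder(target)
                .header(LRA_HTTP_HEADER, lraId)
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```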

Compensating Activities

Participants join LRAs using the @Compensate and @Complete annotations. These annotations must be combined with JAX-RS annotations so that they can be invoked as JAX-RS endpoints. Both annotations are expected to be used with the JAX-RS @PUT annotation. Only the @Compensate method is mandatory.

If a JAX-RS resource method is invoked in the context of an LRA and the resource class contains a method annotated with @Compensate then the class will be enlisted as a participant of the LRA. When the LRA is cancelled this @Compensate method will be invoked with a header parameter that contains the id of the LRA, for example:

  @PUT
  @Path("/compensate")
  @Produces(MediaType.APPLICATION_JSON)
  @Compensate
  public Response compensateWork(
      @HeaderParam(LRAClient.LRA_HTTP_HEADER) String lraId) {
    // compensate for whatever activity the business logic has associated with lraId
  }

Similarly, if the developer has provided a @Complete method it will be invoked if the LRA is closed.

If the participant bean knows that it will never be able to compensate the activity it SHOULD return a 200 OK status code and content body with the literal string FailedToCompensate. If it returns any other content the coordinator will call the JAX-RS endpoint declared by the @Status method to obtain the status. If the @Status method is not present the condition will be logged and this participant will be dropped by the coordinator (i.e. the participant should avoid this circumstance). Similar remarks apply if the bean method knows that it will never be able to complete.

If the bean cannot perform a compensation or completion activity immediately the termination method MUST indicate the condition. In this case the LRA coordinator will need to monitor the progress of the participant and the developer should either provide a @GET method annotated with @Status, which must return a string representation of the status, or expect the compensator to be called again (i.e. the method must be idempotent). The bean indicates that it cannot finish immediately by either

  • returning a 202 Accepted HTTP status code or

  • marking the method as a JAX-RS asynchronous method (using the javax.ws.rs.container.Suspended annotation). If an implementation does not support asynchronous JAX-RS then it MUST return the 202 Accepted code.
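Read literally, the rules above give the coordinator three possible follow-ups after it has called a @Compensate endpoint. The sketch below encodes that reading. The class, enum and method names are hypothetical, and the asynchronous JAX-RS case is not modelled.

```java
class TerminationReplySketch {
    /** Hypothetical names for the coordinator's next step. */
    enum Next { RECORD_FAILURE, MONITOR_PROGRESS, ASK_STATUS_ENDPOINT }

    static Next afterCompensate(int httpStatus, String body) {
        if (httpStatus == 202) {
            // The participant cannot finish immediately: poll @Status or call it again later.
            return Next.MONITOR_PROGRESS;
        }
        if (httpStatus == 200 && "FailedToCompensate".equals(body)) {
            // The participant knows it will never be able to compensate.
            return Next.RECORD_FAILURE;
        }
        // Any other content: obtain the status from the @Status endpoint.
        return Next.ASK_STATUS_ENDPOINT;
    }
}
```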

When the coordinator knows it has the final status it will inform the participant that it can clean up. The developer indicates which method to use for this purpose by annotating one of the methods with the @DELETE and @Forget annotations. If the developer has not provided both of these methods then a warning is logged when the asynchronous termination method finishes. But note that the interoperability portion of this specification allows the status URL to be reported in the response Location header and this will be used in place of the @Status and @Forget methods if present. However, there is no checking that the URLs are valid so mixing the two approaches is not recommended.

If an annotation is present on multiple methods an arbitrary one is chosen.

Nesting LRAs

An activity can be scoped within an existing LRA using the @NestedLRA annotation. Invoking a method marked with this annotation will start a new LRA whose outcome depends upon whether the enclosing LRA is closed or cancelled.

  • If the nested LRA is closed but the outer LRA is cancelled then the participants registered with the nested LRA will be told to compensate.

  • If the nested LRA is cancelled the outer LRA can be still closed.

Note that there is no annotation to directly cancel a closed nested LRA and the Java LRAClient API must be used for this purpose if required.
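The two nesting rules can be condensed into a small truth function. This is a sketch with hypothetical names. It assumes that a cancelled nested LRA compensates its own participants regardless of what later happens to the outer LRA, which follows from the general meaning of cancel rather than from an explicit statement in this section.

```java
class NestedOutcomeSketch {
    enum End { CLOSED, CANCELLED }

    /** True when the participants registered with the nested LRA are told to compensate. */
    static boolean nestedParticipantsCompensate(End nested, End outer) {
        // A closed nested LRA is still undone when the enclosing LRA is cancelled.
        return nested == End.CANCELLED || outer == End.CANCELLED;
    }
}
```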

Timing out LRAs and Compensators

The ability to compensate may be a transient capability of a service so participants (and LRAs) can be timed out after which the compensator is called (the LRA is cancelled).

To set such a time limit use the @TimeLimit annotation, for example:

  @GET
  @Path("/doitASAP")
  @Produces(MediaType.APPLICATION_JSON)
  @TimeLimit(limit = 100, unit = TimeUnit.MILLISECONDS)
  @LRA(value = LRA.Type.REQUIRED)
  public Response theClockIsTicking(
      @HeaderParam(LRAClient.LRA_HTTP_HEADER) String lraId) {...}
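Conceptually the time limit defines a deadline after which the LRA is cancelled. The following sketch shows only that deadline check, with an explicit clock value so that it stays deterministic. `TimeLimitSketch` is a hypothetical name, and how an implementation actually schedules the cancellation is not specified here.

```java
import java.time.Duration;
import java.time.Instant;

class TimeLimitSketch {
    private final Instant deadline;

    TimeLimitSketch(Instant started, Duration limit) {
        this.deadline = started.plus(limit);
    }

    /** True once the limit has elapsed, at which point the LRA would be cancelled. */
    boolean shouldCancel(Instant now) {
        return !now.isBefore(deadline);
    }
}
```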

Leaving an LRA

If a bean method annotated with @Leave is invoked in the context of an LRA, and the bean class has registered a participant with that LRA, then the participant is removed from the LRA just before the bean method is called (and will not be asked to complete or compensate when the LRA is subsequently ended).

Reporting the status of a participant

As alluded to above, participants can provide a method for reporting the status of the participant by annotating one of the methods with the @Status annotation. The method is required when at least one of the participant methods that is annotated with @Compensate or @Complete is not able to complete the task immediately. If the participant has not finished, i.e. it has not yet been asked to @Compensate or @Complete, it should report the error using a JAX-RS exception mapper that maps to a 412 Precondition Failed HTTP status code (such as IllegalLRAStateException or InvalidStateException). Otherwise the response entity must correspond to one of the Strings defined by the following enum values (as reported by the enum name() method):

/**
 * The status of a participant. The status is only valid after the coordinator has told the participant to
 * complete or compensate. The name value of the enum should be returned by any method marked with
 * the {@link Status} annotation.
 */
public enum CompensatorStatus {
    Compensating, // the Compensator is currently compensating for the LRA.
    Compensated, //  the Compensator has successfully compensated for the LRA.
    FailedToCompensate, //  the Compensator was not able to compensate for the LRA
                // (and must remember it could not compensate until it receives a forget message).
    Completing, //  the Compensator is tidying up after being told to complete.
    Completed, //  the Compensator has confirmed.
    FailedToComplete, //  the Compensator was unable to tidy-up.
}

Notice that the enum constants correspond to the participant state model.
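Because the status travels as the enum's name(), a caller can map the response entity back to the enum with valueOf. The sketch below re-declares the enum locally so that it is self-contained. `StatusParsingSketch`, `parse` and `isFinal` are hypothetical helpers, and treating an unrecognised entity as "no status" is an assumption.

```java
import java.util.Optional;

class StatusParsingSketch {
    enum CompensatorStatus {
        Compensating, Compensated, FailedToCompensate, Completing, Completed, FailedToComplete
    }

    /** Maps a response entity to a status, or to empty if it is not a known name. */
    static Optional<CompensatorStatus> parse(String entity) {
        if (entity == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(CompensatorStatus.valueOf(entity.trim()));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    /** A status is final once the participant is no longer compensating or completing. */
    static boolean isFinal(CompensatorStatus status) {
        return status != CompensatorStatus.Compensating && status != CompensatorStatus.Completing;
    }
}
```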

Forgetting an LRA

If a participant is unable to complete or compensate immediately then it must remember the fact until explicitly told that it can clean up using the @Forget annotation. The method annotated with the @Forget annotation is a standard REST endpoint expected to be used with the JAX-RS @DELETE annotation.
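A participant could keep the outcomes it must remember in a small store that is only cleared on forget. The sketch below is an in-memory illustration with hypothetical names. A real participant would need durable storage so that the record survives restarts.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ForgetSketch {
    /** LRA id mapped to the last status this participant reported for it. */
    private final Map<String, String> outcomes = new ConcurrentHashMap<>();

    /** Called by the participant when it cannot finish immediately or has failed. */
    void remember(String lraId, String status) {
        outcomes.put(lraId, status);
    }

    /** Backs a @Status style endpoint; returns null when nothing is recorded. */
    String status(String lraId) {
        return outcomes.get(lraId);
    }

    /** Backs a @Forget style endpoint; returns true if a record was removed. */
    boolean forget(String lraId) {
        return outcomes.remove(lraId) != null;
    }
}
```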

LRA Client API

For completeness the proposal supports clients that wish to directly control LRAs and participants. To support this class of user an instance of the LRA client API can be instantiated directly or injected if the client is using CDI:

public interface LRAClient {

    /**
     * Start a new LRA
     *
     * @param parentLRA The parent of the LRA that is about to start. If null then the new LRA will
     *                  be top level
     * @param clientID The client may provide a (preferably) unique identity which will be reported
     *                back when the LRA is queried.
     * @param timeout Specifies the maximum time that the LRA will exist for. If the LRA is
     *                terminated because of a timeout it will be cancelled.
     * @param unit Specifies the unit that the timeout is measured in
     *
     * @throws GenericLRAException a new LRA could not be started. The specific reason
     *                is available in {@link GenericLRAException#getStatusCode()}
     */
    URL startLRA(URL parentLRA, String clientID, Long timeout, TimeUnit unit) throws GenericLRAException;

    /**
     * Attempt to cancel an LRA
     *
     * Trigger compensation of all participants enlisted with the LRA (ie the compensate message will be
     * sent to each participant).
     *
     * @param lraId The unique identifier of the LRA (required)
     * @return the response MAY contain the final status of the LRA as reported by
     * {@link CompensatorStatus#name()}. If the final status is not returned the client can still discover
     * the final state using the {@link LRAClient#getStatus(URL)} method
     * @throws GenericLRAException Communication error (the reason is available via the
     * {@link GenericLRAException#getStatusCode()} method)
     */
    String cancelLRA(URL lraId) throws GenericLRAException;

    /**
     * Attempt to close an LRA
     *
     * Tells the LRA to close normally. All participants will be triggered by the coordinator
     * (ie the complete message will be sent to each participant).
     *
     * @param lraId The unique identifier of the LRA (required)
     *
     * @return the response MAY contain the final status of the LRA as reported by
     * {@link CompensatorStatus#name()}. If the final status is not returned the client can still discover
     * the final state using the {@link LRAClient#getStatus(URL)} method
     * @throws GenericLRAException Communication error (the reason is available via the
     * {@link GenericLRAException#getStatusCode()} method)
     */
    String closeLRA(URL lraId) throws GenericLRAException;

    /**
     * Lookup active LRAs
     *
     * @throws GenericLRAException on error
     */
    List<LRAInfo> getActiveLRAs() throws GenericLRAException;

    /**
     * Returns all LRAs
     *
     * Gets both active and recovering LRAs
     *
     * @return the active and recovering LRAs
     * @throws GenericLRAException on error
     */
    List<LRAInfo> getAllLRAs() throws GenericLRAException;

    /**
     * List recovering Long Running Actions
     *
     * Returns LRAs that are recovering (ie the participant is still
     * attempting to complete or compensate)
     *
     *
     * @throws GenericLRAException on error
     */
    List<LRAInfo> getRecoveringLRAs() throws GenericLRAException;

    /**
     * Lookup the status of an LRA
     *
     * @param lraId the LRA whose status is being requested
     * @return the status, or an empty Optional if the LRA is still active (ie has not yet been closed or cancelled)
     * @throws GenericLRAException if the request to the coordinator failed.
     * {@link GenericLRAException#getCause()} and/or {@link GenericLRAException#getStatusCode()}
     * may provide a more specific reason.
     */
    Optional<CompensatorStatus> getStatus(URL lraId) throws GenericLRAException;

    /**
     * Indicates whether an LRA is active. The same information can be obtained via a call to
     * {@link LRAClient#getStatus(URL)}.
     *
     * @param lraId The unique identifier of the LRA (required)
     * @throws GenericLRAException if the request to the coordinator failed.
     * {@link GenericLRAException#getCause()} and/or {@link GenericLRAException#getStatusCode()}
     * may provide a more specific reason.
     */
    Boolean isActiveLRA(URL lraId) throws GenericLRAException;

    /**
     * Indicates whether an LRA was compensated. The same information can be obtained via a call to
     * {@link LRAClient#getStatus(URL)}.
     *
     * @param lraId The unique identifier of the LRA (required)
     * @throws GenericLRAException if the request to the coordinator failed.
     * {@link GenericLRAException#getCause()} and/or {@link GenericLRAException#getStatusCode()}
     * may provide a more specific reason.
     */
    Boolean isCompensatedLRA(URL lraId) throws GenericLRAException;

    /**
     * Indicates whether an LRA is complete. The same information can be obtained via a call to
     * {@link LRAClient#getStatus(URL)}.
     *
     * @param lraId The unique identifier of the LRA (required)
     * @throws GenericLRAException if the request to the coordinator failed.
     * {@link GenericLRAException#getCause()} and/or {@link GenericLRAException#getStatusCode()}
     * may provide a more specific reason.
     */
    Boolean isCompletedLRA(URL lraId) throws GenericLRAException;

    /**
     * A participant can join with the LRA at any time prior to the completion of an activity.
     * The participant provides end points on which it will listen for LRA related events.
     *
     * @param lraId   The unique identifier of the LRA (required) to enlist with
     * @param timelimit The time limit (in seconds) that the participant can guarantee that it
     *                can compensate the work performed while the LRA is active.
     * @param body   The resource path or participant URL that the LRA coordinator will use
     *               to drive the participant. The coordinator uses the URL as follows:
     *
     *               - `{participant URL}/complete` is the `completion URL`,
     *               - `{participant URL}/compensate` is the `compensation URL` and
     *               - `{participant URL}` serves as both the `status` and `forget` URLs.
     *
     * @param compensatorData data that will be stored with the coordinator and passed back to
     *                        the participant when the LRA is closed or cancelled
     * @return a recovery URL for this enlistment
     *
     * @throws GenericLRAException  if the request to the coordinator failed.
     * {@link GenericLRAException#getCause()} and/or {@link GenericLRAException#getStatusCode()}
     * may provide a more specific reason.
     */
    String joinLRA(URL lraId, Long timelimit, String body, String compensatorData) throws GenericLRAException;

    /**
     * Similar to {@link LRAClient#joinLRA(URL, Long, String, String)} except that the various
     * participant URLs are passed in explicitly.
     */
    String joinLRA(URL lraId, Long timelimit,
                   URL compensateUrl, URL completeUrl, URL forgetUrl, URL leaveUrl, URL statusUrl,
                   String compensatorData) throws GenericLRAException;

    /**
     * Join an LRA passing in a class that will act as the participant.
     * Similar to {@link LRAClient#joinLRA(URL, Long, String, String)} but the various participant URLs
     * are expressed as Java annotations on the passed in resource class.
     *
     * @param lraId The unique identifier of the LRA (required)
     * @param resourceClass An annotated class for the participant methods: {@link io.narayana.lra.annotation.Compensate},
     *                      etc.
     * @param baseUri Base uri for the participant endpoints
     * @param compensatorData Compensator specific data that the coordinator will pass to the participant when the LRA
     *                        is closed or cancelled
     * @return a recovery URL for this enlistment
     * @throws GenericLRAException if the request to the coordinator failed.
     * {@link GenericLRAException#getCause()} and/or {@link GenericLRAException#getStatusCode()}
     * may provide a more specific reason.
     */
    String joinLRA(URL lraId, Class<?> resourceClass, URI baseUri, String compensatorData) throws GenericLRAException;

    /**
     * Change the endpoints that a participant can be contacted on.
     *
     * @param recoveryUrl the recovery URL returned from a participant join request
     * @param compensateUrl the URL to invoke when the LRA is cancelled
     * @param completeUrl the URL to invoke when the LRA is closed
     * @param statusUrl if a participant cannot finish immediately then it provides
     *                  this URL that the coordinator uses to monitor the progress
     * @param forgetUrl used to inform the participant that it can forget about this LRA
     * @param compensatorData opaque data that is returned to the participant when the LRA
     *                        is closed or cancelled
     * @return an updated recovery URL for this participant
     * @throws GenericLRAException if the request to the coordinator failed.
     * {@link GenericLRAException#getCause()} and/or {@link GenericLRAException#getStatusCode()}
     * may provide a more specific reason.
     */
    URL updateCompensator(URL recoveryUrl, URL compensateUrl, URL completeUrl, URL forgetUrl, URL statusUrl,
                          String compensatorData) throws GenericLRAException;

    /**
     * A Compensator can resign from the LRA at any time prior to the completion of an activity
     *
     * @param lraId The unique identifier of the LRA (required)
     * @param body  (optional)
     * @throws GenericLRAException if the request to the coordinator failed.
     * {@link GenericLRAException#getCause()} and/or {@link GenericLRAException#getStatusCode()}
     * may provide a more specific reason.
     */
    void leaveLRA(URL lraId, String body) throws GenericLRAException;

    /**
     * LRAs can be created with timeouts after which they are cancelled. Use this method to update the timeout.
     *
     * @param lraId the id of the lra to update
     * @param limit the new timeout period
     * @param unit the time unit for limit
     */
    void renewTimeLimit(URL lraId, long limit, TimeUnit unit);

    /**
     * Checks whether there is an LRA associated with the calling thread
     *
     * @return the current LRA (can be null)
     */
    URL getCurrent();

    /**
     * Update the client's notion of the current coordinator.
     *
     * @param lraId the id of the LRA (can be null)
     */
    void setCurrentLRA(URL lraId);
}

Java based LRA participant registration API

For applications that cannot directly expose JAX-RS endpoints for compensation activities, this specification optionally supports an API for registering participants directly. A participant is a serializable Java class that is interested in LRA lifecycle notifications; it receives them by registering an instance of LRAParticipant with an instance of LRAManagement:

/**
 * The API for notifying participants that an LRA is completing or cancelling.
 * A participant joins with an LRA via a call to
 * {@link LRAManagement#joinLRA(LRAParticipant, LRAParticipantDeserializer, URL, Long, TimeUnit)}
 */
public interface LRAParticipant extends Serializable {
    /**
     * Notifies the participant that the LRA is closing
     * @param lraId the LRA that is closing
     * @return null if the participant completed successfully. If the participant cannot
     *         complete immediately it should return a future that the caller can use
     *         to monitor progress. If the JVM crashes before the participant can finish
     *         it should expect this method to be called again. If the participant fails
     *         to complete it must cancel the future or throw a TerminationException.
     * @throws NotFoundException the participant does not know about this LRA
     * @throws TerminationException the participant was unable to complete and will never
     *         be able to do so
     */
    Future<Void> completeWork(URL lraId) throws NotFoundException, TerminationException;

    /**
     * Notifies the participant that the LRA is cancelling
     * @param lraId the LRA that is cancelling
     * @return null if the participant completed successfully. If the participant cannot
     *         complete immediately it should return a future that the caller can use
     *         to monitor progress. If the JVM crashes before the participant can finish
     *         it should expect this method to be called again. If the participant fails
     *         to complete it must cancel the future or throw a TerminationException.
     * @throws NotFoundException the participant does not know about this LRA
     * @throws TerminationException the participant was unable to complete and will never
     *         be able to do so
     */
    Future<Void> compensateWork(URL lraId) throws NotFoundException, TerminationException;
}

where the registration interface is defined as:

public interface LRAManagement {
    /**
     * Join an existing LRA
     *
     * @param participant an instance of a {@link LRAParticipant} that will be notified when the target LRA ends
     * @param deserializer a mechanism for recreating participants during recovery.
     *                     If the parameter is null then standard Java object deserialization will be used
     * @param lraId the LRA that the join request pertains to
     * @param timeLimit the time for which the participant should remain valid. When this time limit is exceeded
     *                  the participant may no longer be able to fulfil the protocol guarantees.
     * @param unit the unit that the timeLimit parameter is expressed in
     */
    String joinLRA(LRAParticipant participant, LRAParticipantDeserializer deserializer,
                   URL lraId, Long timeLimit, TimeUnit unit) throws JoinLRAException;

    /**
     * Join an existing LRA. In contrast to the other form of registration this method does not indicate a time limit
     * for the participant meaning that the participant registration will remain valid until it terminates successfully
     * or unsuccessfully (i.e. it will never be timed out externally).
     *
     * @param participant an instance of a {@link LRAParticipant} that will be notified when the target LRA ends
     * @param deserializer a mechanism for recreating participants during recovery.
     *                     If the parameter is null then standard Java object deserialization will be used
     * @param lraId the LRA that the join request pertains to
     */
    String joinLRA(LRAParticipant participant, LRAParticipantDeserializer deserializer, URL lraId) throws JoinLRAException;
}
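
As an illustration of the LRAParticipant contract described above, the following is a minimal sketch rather than a definitive implementation. The specification types are redeclared as stand-ins so that the example compiles on its own; OrderParticipant, ParticipantSketch, the state names and the LRA id are all hypothetical. It shows the two permitted return styles: null when the work finishes immediately, and a future when the caller needs to monitor progress.

```java
import java.io.Serializable;
import java.net.URI;
import java.net.URL;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

// Stand-ins for the specification types, declared here only so the sketch compiles on its own.
class NotFoundException extends Exception {
    NotFoundException(String message) { super(message); }
}

class TerminationException extends Exception {
    TerminationException(String message) { super(message); }
}

interface LRAParticipant extends Serializable {
    Future<Void> completeWork(URL lraId) throws NotFoundException, TerminationException;
    Future<Void> compensateWork(URL lraId) throws NotFoundException, TerminationException;
}

// Hypothetical participant that tracks one order enlisted in one LRA.
class OrderParticipant implements LRAParticipant {
    private static final long serialVersionUID = 1L;

    // Kept as a String because URL.equals may perform host name resolution.
    private final String lraId;
    private volatile String state = "PENDING";

    OrderParticipant(URL lraId) { this.lraId = lraId.toExternalForm(); }

    @Override
    public Future<Void> completeWork(URL id) throws NotFoundException {
        requireKnown(id);
        state = "CONFIRMED";
        return null; // finished immediately, so there is nothing to monitor
    }

    @Override
    public Future<Void> compensateWork(URL id) throws NotFoundException {
        requireKnown(id);
        // Pretend the refund is slow: hand back a future the caller can monitor.
        return CompletableFuture.runAsync(() -> state = "REFUNDED");
    }

    String state() { return state; }

    private void requireKnown(URL id) throws NotFoundException {
        if (!lraId.equals(id.toExternalForm())) {
            throw new NotFoundException("unknown LRA: " + id);
        }
    }
}

final class ParticipantSketch {
    // Ends a made-up LRA (close or cancel) and reports the participant's final state.
    static String finish(boolean close) {
        try {
            URL lra = URI.create("http://localhost:8080/lra-coordinator/demo").toURL();
            OrderParticipant participant = new OrderParticipant(lra);
            Future<Void> pending = close ? participant.completeWork(lra) : participant.compensateWork(lra);
            if (pending != null) {
                pending.get(); // wait for the asynchronous work to finish
            }
            return participant.state();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(finish(true));  // CONFIRMED
        System.out.println(finish(false)); // REFUNDED
    }
}
```

In a real deployment the framework, not the application, would invoke completeWork and compensateWork; the finish helper only plays that role for demonstration.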

How the application obtains an LRAManagement instance is unspecified (for example, the reference implementation does it by CDI injection). The deserializer, if provided, must match the interface:

/**
 * An object that knows how to recreate a participant from its persistent form
 */
public interface LRAParticipantDeserializer {
    LRAParticipant deserialize(byte[] recoveryState);
}

Compensators must be serializable for this approach to work.
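
To make the serialization requirement concrete, here is a small sketch of a deserializer built on standard Java object serialization. It assumes that the recovery state is the participant's standard serialized form, which this document does not mandate. The specification types are redeclared as trimmed stand-ins, and BookingParticipant, JavaSerializationDeserializer and DeserializerSketch are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Stand-ins for the specification types so that the sketch compiles on its own.
// The notification methods of LRAParticipant are omitted because they play no part here.
interface LRAParticipant extends Serializable { }

interface LRAParticipantDeserializer {
    LRAParticipant deserialize(byte[] recoveryState);
}

// Hypothetical participant whose only recovery state is a booking reference.
final class BookingParticipant implements LRAParticipant {
    private static final long serialVersionUID = 1L;
    final String bookingRef;

    BookingParticipant(String bookingRef) { this.bookingRef = bookingRef; }
}

// Deserializer that relies on standard Java object serialization (an assumption of this sketch).
final class JavaSerializationDeserializer implements LRAParticipantDeserializer {
    @Override
    public LRAParticipant deserialize(byte[] recoveryState) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(recoveryState))) {
            return (LRAParticipant) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            // ClassNotFoundException means the participant class is missing from the classpath.
            throw new IllegalStateException("cannot recreate participant", e);
        }
    }
}

final class DeserializerSketch {
    // Produces the persistent form that a recovery component would later hand to the deserializer.
    static byte[] serialize(LRAParticipant participant) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(participant);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Serializes a participant, recreates it, and returns the recovered booking reference.
    static String roundTrip(String bookingRef) {
        byte[] recoveryState = serialize(new BookingParticipant(bookingRef));
        LRAParticipant recreated = new JavaSerializationDeserializer().deserialize(recoveryState);
        return ((BookingParticipant) recreated).bookingRef;
    }
}
```

The ClassNotFoundException branch is the failure an installation would see if the participant classes were not available to the recovery system.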

The deserializer exists to support recovery scenarios, where one or more components fail: the framework must guarantee that participants will still be triggered (the LRA protocol still provides the "all or nothing" guarantees that traditional transactions give). The deserializer provides a mechanism for the recovery component to recreate participants from their persistent form.

Note that, in contrast to the JAX-RS based support, an installation must ensure that the Java class definitions of Compensators are available to the recovery system. Serializable participants need to know how to contact the original business application in order to trigger compensation activities, whereas the JAX-RS based solution need only persist resource paths, which are likely to correspond to existing microservice endpoints. In other words, from an administrative and manageability point of view, it is preferable to use one of the other APIs, such as the Java Annotations for LRAs.

In the reference implementation, recovery is achieved by depending on a Maven artifact that automatically starts a proxy participant which listens for replay requests. For this to work, the proxy must either start up on the same endpoint or be told where the coordinator resides so that it can inform the coordinator of its new location. The way in which participants report their location is not defined in this version of the specification, but the reference implementation does so via an HTTP PUT operation on the recovery URL. If a service is restarted, the classes for any previously registered compensators must be available on the classpath.

Appendix 1

Typical Recovery Scenarios

Setup:

  • Start 2 services and an LRA coordinator

  • Start an LRA and enlist both services

Scenario 1
Scenario 2
  • Kill one of the services before closing the LRA

  • The LRA close will fail because one of the services is down

  • Periodic recovery should keep retrying to close the LRA (even if you restart the coordinator it should still replay the close)

  • Restart the service

  • Periodic recovery should now successfully close the LRA

Scenario 3
  • Crash the second service after the first one has completed (this generates a heuristic)

  • Restart the second service

  • Periodic recovery should replay the complete on the failed participant

  • NB if you restart the coordinator before the last step then the recovery should replay all participants (since it will re-read the whole list).

Scenarios 4, 5 and 6
  • Repeat the same three scenarios, but cancel the LRA instead of closing it.
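
The retry behaviour in Scenario 2 can be pictured with a toy in-memory simulation. This is only a sketch of the idea, not the reference implementation's recovery logic: ToyService, ToyCoordinator and RecoveryScenario2 are hypothetical, no network calls are made, and a coordinator restart (which would replay all participants) is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a remote service that can be killed and restarted.
final class ToyService {
    final String name;
    boolean up = true;
    int completeCalls = 0;

    ToyService(String name) { this.name = name; }

    // Returns true if the completion request was delivered and handled.
    boolean complete() {
        if (!up) {
            return false; // simulates a connection failure
        }
        completeCalls++;
        return true;
    }
}

// Hypothetical coordinator that remembers which participants still have to complete.
final class ToyCoordinator {
    private final List<ToyService> pending = new ArrayList<>();

    void enlist(ToyService service) { pending.add(service); }

    // One close attempt or recovery pass: contact every pending participant and
    // drop those that completed. Returns true once nothing is left pending.
    boolean closeOrRecover() {
        pending.removeIf(ToyService::complete);
        return pending.isEmpty();
    }
}

final class RecoveryScenario2 {
    // Runs Scenario 2 and records whether the LRA was fully closed after each attempt.
    static List<Boolean> run() {
        ToyService first = new ToyService("first");
        ToyService second = new ToyService("second");
        ToyCoordinator coordinator = new ToyCoordinator();
        coordinator.enlist(first);
        coordinator.enlist(second);

        List<Boolean> closedAfterAttempt = new ArrayList<>();
        second.up = false;                                    // kill one of the services
        closedAfterAttempt.add(coordinator.closeOrRecover()); // initial close fails
        closedAfterAttempt.add(coordinator.closeOrRecover()); // recovery pass, service still down
        second.up = true;                                     // restart the service
        closedAfterAttempt.add(coordinator.closeOrRecover()); // recovery pass now succeeds
        return closedAfterAttempt;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [false, false, true]
    }
}
```

In this sketch a participant that has already completed is removed from the pending list, so only the failed participant is contacted again on later passes.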