Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core-lro] Deprecate cancelOperation and introduce createPoller #22753

Merged
merged 18 commits into from
Sep 1, 2022

Conversation

deyaaeldeen
Copy link
Member

@deyaaeldeen deyaaeldeen commented Jul 29, 2022

Fixing LRO cancellation by deprecating cancelOperation

Canceling a long-running operation is currently done by calling cancelOperation:

/**
* Attempts to cancel the underlying operation.
*/
cancelOperation(options?: { abortSignal?: AbortSignalLike }): Promise<void>;
However the cancellation behavior could vary between APIs in a way that makes cancelOperation a confusing abstraction. In particular, when sending a cancellation request, the service may respond with either a 200 or a 202 response, depending on the API and/or the operation, where the former indicates the operation has been canceled immediately and the latter indicates it could be some time before the operation is deemed as canceled. Handling 202 responses needs care to set the customer's expectation about wait times and this is typically done by providing a poller object that customer could use to optionally block until the operation finishes. This discrepancy in behavior makes cancelOperation not a suitable abstraction for the case when cancellation returns a 202 response.

In this PR, I am deprecating cancelOperation and leaving it up to the library author to implement operation cancellation as they see fit. This proposal has been discussed with @bterlson, @joheredi, and @witemple-msft and they're all on-board with it.

Pollers without cancellation abstraction

The PR also introduces a second poller interface that doesn't have cancelOperation at all, named SimplePollerLike. This interface is meant to be used for pollers for new operations that don't support cancellation.
The other difference between PollerLike and SimplePollerLike is in how the operation state is defined. PollerLike defines its state shape as a type that extends PollOperationState<TResult>:

export interface PollOperationState<TResult> {
/**
* True if the operation has started.
*/
isStarted?: boolean;
/**
* True if the operation has been completed.
*/
isCompleted?: boolean;
/**
* True if the operation has been cancelled.
*/
isCancelled?: boolean;
/**
* Will exist if the operation encountered any error.
*/
error?: Error;
/**
* Will exist if the operation concluded in a result of an expected type.
*/
result?: TResult;
}
Mainly, it has isStarted, isCanceled, and isCompleted booleans. However, the semantics of isCanceled can be confusing; should it be set at the time when the cancellation request is sent or when the service responds that the operation is canceled?
To avoid this confusion, OperationState replaces those booleans with a status field typed as a union of states:

export interface OperationState<TResult> {
  /**
   * The current status of the operation.
   */
  status: "notStarted" | "running" | "succeeded" | "canceling" | "canceled" | "failed";
  /**
   * Will exist if the operation encountered any error.
   */
  error?: Error;
  /**
   * Will exist if the operation produced a result of the expected type.
   */
  result?: TResult;
}

Which makes it clear that it reflects operation status after each poll.

Use case: Implement LRO cancellation in @azure/ai-language-text (#22852)

The proposal is implemented in a different PR: #22852

This LRO cancellation returns a 202 response so cancellation itself is an LRO. I am proposing to implement cancellation by returning a function named sendCancellationRequest alongside the operation poller, so the return type is:

interface AnalyzeBatchOperation {
    poller: AnalyzeBatchPoller;
    sendCancellationRequest(): Promise<void>;
}

And client code can simply destruct the output object as follows to get access to the poller and the cancellation function:

const { poller, sendCancellationRequest } = await client.beginAnalyzeBatch(actions, documents, "en");
...
// Let's cancel the operation
await sendCancellationRequest(); // poller.cancelOperation() no longer exists

Design considerations:

  • cancellation as a concept is not related to polling so a poller is not the right place to send cancellation requests
  • canceling an operation requires knowing the operation ID/location, but we shouldn't ask customers for that information because it is an implementation detail
  • cancellation is an LRO but it doesn't have its own operation location, i.e. the operation location is the same for the operation being cancelled and the cancellation operation itself. This means there is no need to create a dedicated poller for the cancellation operation and customers could still use the existing poller to await their operation or its cancellation
  • cancellation is a POST request with no interesting response, so it should be modeled programmatically as (): Promise<void> function

@azure-sdk
Copy link
Collaborator

azure-sdk commented Aug 5, 2022

API change check

APIView has identified API level changes in this PR and created following API reviews.

azure-core-lro
azure-storage-blob
azure-keyvault-certificates
azure-mixed-reality-remote-rendering
azure-ai-text-analytics

@deyaaeldeen deyaaeldeen force-pushed the core-lro/v3 branch 3 times, most recently from a3fff69 to 3d3a64d Compare August 8, 2022 13:21
@deyaaeldeen deyaaeldeen changed the title [core-lro] Introduce v3 [core-lro] Deprecate cancelOperation and introduce createPoller Aug 8, 2022
azure-sdk pushed a commit to azure-sdk/azure-sdk-for-js that referenced this pull request Aug 8, 2022
cancelOperation is being deprecated in Azure#22753 so `LroEngine` no longer needs to have support for it. Note that this support was introduced in 2.3.0-beta.1 so removing it is not a breaking change.

Furthermore, this PR prepares for release today.
@deyaaeldeen deyaaeldeen force-pushed the core-lro/v3 branch 3 times, most recently from 3385839 to a22fee0 Compare August 8, 2022 21:04
@deyaaeldeen deyaaeldeen marked this pull request as ready for review August 8, 2022 22:27
Comment on lines 13 to 23
export function createPoller<TResult, TState extends OperationState<TResult>>(lro: LongRunningOperation, options?: CreatePollerOptions<TResult, TState>): Promise<SimplePollerLike<TState, TResult>>;

// @public
export interface CreatePollerOptions<TResult, TState> {
intervalInMs?: number;
processResult?: (result: unknown, state: TState) => TResult;
resourceLocationConfig?: LroResourceLocationConfig;
restoreFrom?: string;
updateState?: (state: TState, lastResponse: RawResponse) => void;
withPollingUrl?: (pollingUrl: string) => void;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This abstraction feels a bit too leaky. It exposes a lot of HTTP-specific and Azure-specific logic in the API.

I think there's room for a two-layered abstraction. Something like createAzurePoller which does what you have written here and is specific to the Azure HTTP LRO pattern, and an even smaller createPoller that encapsulates only the logic around managing updates to the poller state.

It might even make sense to do a core-poller package where the simplest form of the poller can live and then import/use that in core-lro, where the Azure convention bits can live.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on what scenario you're trying to accommodate here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mpodwysocki could you share some of what you showed me you are trying to accomplish in NH with the polling API?

I'm also not 100% sure how I'd take the existing logic in FR and encode it in createPoller. What we have there is so simple, but I'm not sure how it will map into LongRunningOperation/CreatePollerOptions. In FR there's just a function to init the state, a function to poll, and some helpers for serialization/deserialization.

Copy link
Member Author

@deyaaeldeen deyaaeldeen Aug 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not 100% sure how I'd take the existing logic in FR and encode it in createPoller. What we have there is so simple, but I'm not sure how it will map into LongRunningOperation/CreatePollerOptions. In FR there's just a function to init the state, a function to poll, and some helpers for serialization/deserialization.

It is pretty much simpler, mainly, you will need to specify:

  • sendInitRequest: to send the operation initiation request
  • sendPollRequest: to send the polling GET request given the operation location URL as input
  • processResult: to transform the result in the final response to a convenience-layer result
  • updateState: to update the poller's state

You no longer need to worry about how the poller's state is being instantiated or updated to perform the polling logic. See #22852 for an example in using it in @azure/ai-language-text.

EDIT: the FR abstraction forces client code to reason about the state because the abstraction is defined as a state transition machine over, well, the state. For example, there is some logic needed to handle resumeFrom in createAnalysisPoller. Such logic is not needed if createPoller is used because it manages the state under the hood.

To rewrite this in createPoller terms, you will need to provide just the following:

{
sendInitRequest: () => this._tracing.withSpan(
  "DocumentAnalysisClient.createAnalysisPoller-start",
  definition.options,
   async () => {
     const [contentType, analyzeRequest] = toAnalyzeRequest(input);
     const { operationLocation } = await this._restClient.analyzeDocument(
       definition.initialModelId,
       contentType as any,
       {
         ...definition.options,
         analyzeRequest,
       }
     );
  }

sendPollRequest: (operationLocation: string) => this._tracing.withSpan(
  "DocumentAnalysisClient.createAnalysisPoller-getAnalyzeResult",
  definition.options,
  (finalOptions) =>
    this._restClient.sendOperationRequest<AnalyzeResultOperation>(
    {
      options: finalOptions,
    },
    {
      path: operationLocation,
      httpMethod: "GET",
      responses: {
        200: {
          bodyMapper: Mappers.AnalyzeResultOperation,
        },
        default: {
          bodyMapper: Mappers.ErrorResponse,
        },
      },
      // URL is fully-formed, so we don't need any query parameters
      headerParameters: [accept1],
      serializer: SERIALIZER,
    }
  )
}

and then you will need to tell createPoller how to update convenience-layer stuff in the state and to transform the final response by providing updateState and processResult options. FR already have functions to do so (e.g. toDocumentAnalysisPollOperationState), so you can pass a variation of them there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a protocol-agnostic poller and used it to implement the http one in 5baa826. How does that look?

sdk/storage/storage-blob/src/models.ts Outdated Show resolved Hide resolved
sdk/storage/storage-blob/src/models.ts Outdated Show resolved Hide resolved
sdk/core/core-lro/review/core-lro.api.md Show resolved Hide resolved
sdk/core/core-lro/review/core-lro.api.md Outdated Show resolved Hide resolved
sdk/core/core-lro/review/core-lro.api.md Show resolved Hide resolved
sdk/core/core-lro/src/createPoller.ts Outdated Show resolved Hide resolved
stopPolling: () => cancelJob?.(),
toString: () =>
JSON.stringify({
state,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure it'll always be safe to serialize operation state as JSON. The operation state might include convenience-layer transforms that make it cyclic, or it could be extremely deep. We should only have to serialize --- in general --- the information that's needed to poll the operation.

The reason serialize was a user-provided method in the FR implementation was to account for this possibility and to allow simply serializing the model & operation IDs required to restore the poller, rather than the whole operation state which might include a really deep object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operation state might include convenience-layer transforms that make it cyclic, or it could be extremely deep.

Is there an existing example of this scenario?

We should only have to serialize --- in general --- the information that's needed to poll the operation.

I disagree, a client will attach something to the state for a reason, e.g. in TA, the input documents list is attached to the state because it is needed to sort the results.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this some more thinking, I guess there is a question about whether all poller states are serializable or not.

Unserializable states are against our SDK design guidelines:

DO provide a way to instantiate a poller with the serialized state of another poller to begin where it left off, for example by passing the state as a parameter to the same method which started the operation, or by directly instantiating a poller with that state.

So I think we should require our states to always be JSON serializable. That being said, I would love to take a look at any existing poller with an unserializable state and see if we could make it serializable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an existing example

In FR, I only serialize the bare minimum of information that is required to restore the poller. I suppose whether or not this is important depends on whether or not you think a completed poller should be restorable, because the full state of a completed poller includes the result, but the serialized version in FR doesn't.

I think it's certainly possible that in general a status monitor could return way more information than we want to serialize into the restoration token, and in general a convenience layer state transform could make the full state cyclic or otherwise unserializable.

the input documents list is attached to the state because it is needed to sort the results.

The input documents themselves aren't needed, only the input order of their IDs, IIRC. If I call the TA analyze LRO with a couple of megabytes of documents in the input and then serialize the state, do I get a restoration token that is a few megabytes huge?

Copy link
Member Author

@deyaaeldeen deyaaeldeen Aug 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's certainly possible that in general a status monitor could return way more information than we want to serialize into the restoration token, and in general a convenience layer state transform could make the full state cyclic or otherwise unserializable.

Yeah I was hoping for a concrete example for this situation to discuss further. As I mentioned, generally, an unserializable state is against our design guidelines.

The input documents themselves aren't needed, only the input order of their IDs, IIRC. If I call the TA analyze LRO with a couple of megabytes of documents in the input and then serialize the state, do I get a restoration token that is a few megabytes huge?

Good point, thanks for the suggestion. Currently the service limits the number of input documents to 25 only so the restoration token should contain at worst an array of 25 numbers IDs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I addressed the TA comment in the other PR (#22852), so state would look like the following:

{"state":{"status":"running","config":{"mode":"OperationLocation","pollingUrl":"https://javatextanalyticstestresources.cognitiveservices.azure.com/language/analyze-text/jobs/<OPERATION ID>?api-version=2022-04-01-preview"},"createdOn":"2022-08-12T12:25:37.000Z","modifiedOn":"2022-08-12T12:25:38.000Z","expiresOn":"2022-08-13T12:25:37.000Z","id":"<OPERATION ID>","actionSucceededCount":0,"actionFailedCount":0,"actionInProgressCount":1,"docIds":["0","1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24"]}}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any outstanding asks in this thread?

sdk/core/core-lro/src/util/delayMs.ts Outdated Show resolved Hide resolved
sdk/core/core-lro/src/createPoller.ts Outdated Show resolved Hide resolved
@deyaaeldeen
Copy link
Member Author

Copy link
Member

@joheredi joheredi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for taking this work. I went through the PR offline with @deyaaeldeen.

One thing that came up is the state serialization. We think we can incrementally add an option to provide custom serialization if there is a need which will help solve the problem of non-serializable states

@deyaaeldeen deyaaeldeen merged commit 3de4ed1 into Azure:main Sep 1, 2022
@deyaaeldeen deyaaeldeen deleted the core-lro/v3 branch September 1, 2022 04:52
azure-sdk pushed a commit to azure-sdk/azure-sdk-for-js that referenced this pull request Feb 27, 2023
updated parameter names for more clarity (Azure#22753)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants