189 changes: 189 additions & 0 deletions aip/general/0151/aip.md.j2
@@ -0,0 +1,189 @@
# Long-running requests

Occasionally, a service may need to expose an operation that takes a
significant amount of time to complete. In these situations, it is often a poor
user experience to simply block while the task runs; rather, it is better to
return some kind of promise to the user, and allow the user to check back in
later.

The long-running request pattern is roughly analogous to a [Future][] in Python
or Java, or a [Node.js Promise][]. Essentially, the user is given a token that
can be used to track progress and retrieve the result.

## Guidance

Operations that might take a significant amount of time to complete **should**
return a `202 Accepted` response along with an identifier that can be used to
track the status of the request and ultimately retrieve the result.

Any single operation defined in an API surface **must** either _always_ return
`202 Accepted` along with a request identifier, or _never_ do so. A service
**must not** return a `200 OK` response with the result if it is "fast enough",
and `202 Accepted` if it is not fast enough, because such behavior adds
significant burdens for clients.

**Note:** User expectations can vary on what is considered "a significant
amount of time" depending on what work is being done. A good rule of thumb is
10 seconds.

### Status monitor representation

The response to a long-running request **should** be a "status monitor" having
the following common format:

```typescript
interface StatusMonitor {
  // The identifier for this status monitor.
  id: string;

  // Whether the request is done.
  done: boolean;

  // The result of the request.
  // Only populated if the request is done and was successful.
  response: any;

  // The error that arose from the request.
  // Only populated if the request is done and was unsuccessful.
  error: Error;

  // Metadata associated with the request.
  // Populated throughout the life of the request, including after
  // it completes.
  metadata: any;
}
```

- If the `done` field is `true`, then exactly one of the `response` and

(forgot to submit) Should we care about partial responses? Meaning the operation partially succeeded and returned some of the data with an error.
In that case, should we specify that StatusMonitor.error should only encompass errors that aborted the entire operation?

`error` fields **must** be populated.
- If the `done` field is `false`, then the `response` and `error` fields
**must not** be populated.
- The `response` and `metadata` fields **may** be any type that the service
determines to be appropriate, but **must** always be the same type for any
particular operation.
- The `response` and `metadata` types **should** be defined in the same API
surface as the operation itself.
- The `response` and `metadata` types that need no data **should** use a
custom-defined empty struct rather than a common void or empty type, to
permit future extensibility.
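
The `done`, `response`, and `error` rules above can be checked mechanically. The following is a minimal sketch, not part of the guidance itself: `CanonicalError` and `checkStatusMonitor` are hypothetical names, the error fields are an assumed shape, and `response`, `error`, and `metadata` are modeled as optional so that "not populated" can be expressed as `undefined`.

```typescript
// Hypothetical canonical error shape; its fields are an assumption.
interface CanonicalError {
  code: number;
  message: string;
}

// Same fields as the status monitor format, with the conditionally
// populated fields made optional.
interface StatusMonitor<R = unknown, M = unknown> {
  id: string;
  done: boolean;
  response?: R;
  error?: CanonicalError;
  metadata?: M;
}

// Returns human-readable violations; an empty array means the monitor
// satisfies the done/response/error rules.
function checkStatusMonitor(m: StatusMonitor): string[] {
  const problems: string[] = [];
  const hasResponse = m.response !== undefined;
  const hasError = m.error !== undefined;
  if (m.done && hasResponse === hasError) {
    problems.push("done is true, so exactly one of response and error must be populated");
  }
  if (!m.done && (hasResponse || hasError)) {
    problems.push("done is false, so response and error must not be populated");
  }
  return problems;
}

// A running request with no result yet: no violations.
console.log(checkStatusMonitor({ id: "op-1", done: false }));
// A finished request with neither response nor error: one violation.
console.log(checkStatusMonitor({ id: "op-2", done: true }));
```

A check like this could run in service tests so that a handler never emits a monitor that is finished but carries both or neither of the two result fields.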

### Querying a status monitor

The service **must** provide an endpoint to query the status of the operation,
which **must** accept the operation identifier and **should not** include other
parameters:

```http
GET /v1/statusMonitors/{status_monitor} HTTP/2
Host: library.googleapis.com
Accept: application/json
```

The endpoint **must** return a `StatusMonitor` as described above.
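
From the client's side, querying the endpoint usually means polling until `done` is `true`. The sketch below illustrates one way to do that under stated assumptions: `getMonitor` is a hypothetical stand-in for the `GET` call shown above (injected so the example runs without a network), the error shape is assumed, and the fixed polling interval and attempt limit are illustrative choices rather than recommendations from this document.

```typescript
interface StatusMonitor<R = unknown> {
  id: string;
  done: boolean;
  response?: R;
  error?: { code: number; message: string }; // assumed error shape
}

// Hypothetical stand-in for GET /v1/statusMonitors/{status_monitor}.
type GetMonitor<R> = (id: string) => Promise<StatusMonitor<R>>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until the request is done, then returns the response or throws
// if the monitor carries an error.
async function waitForResult<R>(
  getMonitor: GetMonitor<R>,
  id: string,
  intervalMs = 1000,
  maxAttempts = 30,
): Promise<R> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const monitor = await getMonitor(id);
    if (monitor.done) {
      if (monitor.error !== undefined) {
        throw new Error(`request ${id} failed: ${monitor.error.message}`);
      }
      return monitor.response as R;
    }
    await sleep(intervalMs);
  }
  throw new Error(`request ${id} did not finish after ${maxAttempts} polls`);
}

// Usage with a fake endpoint that reports completion on the third poll.
let polls = 0;
const fakeGet: GetMonitor<string> = async (id) =>
  ++polls < 3 ? { id, done: false } : { id, done: true, response: "book text" };

waitForResult(fakeGet, "op-1", 10).then((text) => console.log(text));
```

A production client would likely add backoff and cancellation; those are left out to keep the sketch short.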

### Standard methods

APIs **may** return a `StatusMonitor` from the [`Create`][aip-133],
[`Update`][aip-134], or [`Delete`][aip-135] standard methods if appropriate. In
this case, the `response` field **must** be the standard and expected response
type for that standard method.

When creating or deleting a resource with a long-running request, the resource
**should** be included in [`List`][aip-132] and [`Get`][aip-131] calls;
however, the resource **should** indicate that it is not usable, generally with
a [state enum][aip-216].
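
As an illustration of the paragraph above, a resource that is still being created or deleted can appear in list results while signalling that it is not yet usable. This is only a sketch: the `BookState` values and the `isUsable` helper are hypothetical, and the actual state names are up to the service.

```typescript
// Hypothetical state values for a Book resource.
enum BookState {
  STATE_UNSPECIFIED = "STATE_UNSPECIFIED",
  CREATING = "CREATING",
  ACTIVE = "ACTIVE",
  DELETING = "DELETING",
}

interface Book {
  name: string;
  state: BookState;
}

// In this sketch only ACTIVE resources are considered usable.
function isUsable(book: Book): boolean {
  return book.state === BookState.ACTIVE;
}

// A List call still returns every book, including those with a
// long-running create or delete in progress.
const books: Book[] = [
  { name: "publishers/1/books/a", state: BookState.ACTIVE },
  { name: "publishers/1/books/b", state: BookState.CREATING },
  { name: "publishers/1/books/c", state: BookState.DELETING },
];

for (const book of books) {
  console.log(`${book.name}: ${isUsable(book) ? "usable" : "not usable"}`);
}
```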

### Parallel requests

A resource **may** accept multiple requests that will work on it in parallel,
but is not obligated to do so:

- Resources that accept multiple parallel requests **may** place them in a
queue rather than work on the requests simultaneously.
- Resources that do not permit multiple requests in parallel (denying any new
request until the one that is in progress finishes) **must** return
`409 Conflict` if a user attempts a parallel request, and include an error
message explaining the situation.
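
For the second case, a service needs some way to notice that a request is already in progress on a resource. The following sketch shows the idea with an in-memory map; `ExclusiveRequestGate` is a hypothetical helper, the error body is an assumed shape, and a real service would need shared, durable state instead of a local `Map`.

```typescript
// Hypothetical tracker of which resources have a request in progress.
class ExclusiveRequestGate {
  // Maps a resource name to the id of its in-progress status monitor.
  private readonly busy = new Map<string, string>();

  // Returns the HTTP status and body the service would send.
  start(resource: string, monitorId: string): { status: number; body: object } {
    const inProgress = this.busy.get(resource);
    if (inProgress !== undefined) {
      return {
        status: 409,
        body: {
          error: {
            code: 409,
            message:
              `Request ${inProgress} is still in progress on ${resource}; ` +
              "retry after it completes.",
          },
        },
      };
    }
    this.busy.set(resource, monitorId);
    return { status: 202, body: { id: monitorId, done: false } };
  }

  // Called when the in-progress request completes.
  finish(resource: string): void {
    this.busy.delete(resource);
  }
}

const gate = new ExclusiveRequestGate();
console.log(gate.start("publishers/1/books/a", "op-1").status); // 202
console.log(gate.start("publishers/1/books/a", "op-2").status); // 409
gate.finish("publishers/1/books/a");
console.log(gate.start("publishers/1/books/a", "op-3").status); // 202
```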

### Expiration

APIs **may** allow their status monitor resources to expire after sufficient
time has elapsed after the request completed.

**Note:** A good rule of thumb for status monitor expiry is 30 days.
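
A minimal sketch of an expiry check built on that rule of thumb follows. The `completedAt` timestamp is a hypothetical bookkeeping value that this document does not define, and the sketch assumes that monitors for requests still in progress never expire.

```typescript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// `completedAt` is undefined while the request is still running.
function isExpired(
  completedAt: Date | undefined,
  now: Date,
  ttlMs: number = THIRTY_DAYS_MS,
): boolean {
  if (completedAt === undefined) {
    return false; // assumption: running requests are never expired
  }
  return now.getTime() - completedAt.getTime() > ttlMs;
}

const completed = new Date("2021-01-01T00:00:00Z");
console.log(isExpired(completed, new Date("2021-01-15T00:00:00Z"))); // false
console.log(isExpired(completed, new Date("2021-02-15T00:00:00Z"))); // true
```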

### Errors

Errors that prevent a long-running request from _starting_ **must** return an
error response (AIP-193), similar to any other method.

Errors that occur over the course of a request **may** be placed in the
metadata message. The errors themselves **must** still be represented with a

In what cases should the error be placed in StatusMonitor.metadata but not on StatusMonitor.error?

Contributor Author

Partial failures, for sure.

canonical error object.
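
One way to read this section, following the review discussion about partial failures, is that an error aborting the whole request goes in `error`, while errors that affect only part of the work are carried in the metadata alongside a populated `response`. The sketch below illustrates that reading only; `CanonicalError`, `ImportBooksMetadata`, `partialFailures`, and `classify` are hypothetical names, and the interpretation itself is an assumption rather than settled guidance.

```typescript
// Hypothetical canonical error shape.
interface CanonicalError {
  code: number;
  message: string;
}

// Hypothetical metadata type for a bulk operation.
interface ImportBooksMetadata {
  processed: number;
  // Errors that affected individual items without aborting the request.
  partialFailures: CanonicalError[];
}

interface StatusMonitor<R, M> {
  id: string;
  done: boolean;
  response?: R;
  error?: CanonicalError;
  metadata?: M;
}

type Outcome = "running" | "failed" | "partial_success" | "success";

function classify(m: StatusMonitor<unknown, ImportBooksMetadata>): Outcome {
  if (!m.done) return "running";
  if (m.error !== undefined) return "failed";
  const failures = m.metadata?.partialFailures.length ?? 0;
  return failures > 0 ? "partial_success" : "success";
}

console.log(
  classify({
    id: "op-1",
    done: true,
    response: { imported: 2 },
    metadata: {
      processed: 3,
      partialFailures: [{ code: 404, message: "book 3 not found" }],
    },
  }),
); // partial_success
```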

## Interface Definitions

{% tab proto %}

When using protocol buffers, the well-known type `google.longrunning.Operation`
is used.

**Note:** For historical reasons, Google uses the term `Operation` to represent
what this document describes as a `StatusMonitor`.

{% sample 'lro.proto', 'rpc WriteBook' %}

- The response type **must** be `google.longrunning.Operation`. The `Operation`
proto definition **must not** be copied into individual APIs.
- The response **must not** be a streaming response.
- The method **must** include a `google.longrunning.operation_info` annotation,
which **must** define both response and metadata types.
- The response and metadata types **must** be defined in the file where the
RPC appears, or a file imported by that file.
- If the response and metadata types are defined in another package, the
fully-qualified message name **must** be used.
- The response type **should not** be `google.protobuf.Empty` (except for
[`Delete`][aip-135] methods), unless it is certain that response data will
_never_ be needed. If response data might be added in the future, define an
empty message for the RPC response and use that.
- The metadata type is used to provide information such as progress, partial
failures, and similar information on each `GetOperation` call. The metadata

@tinnou tinnou Jan 12, 2021

oh I see I should have read until the end, so the guidance if I understand correctly:

A. Errors preventing the operation from starting -> (AIP-193)
B. An error making the entire request unsuccessful (e.g., a 500 on a downstream service) -> StatusMonitor.error populated, with optionally error metadata in StatusMonitor.metadata. StatusMonitor.response must not be populated.
C. Any partial failures -> StatusMonitor.response populated, and error metadata in StatusMonitor.metadata. StatusMonitor.error must not be populated.
D. Successful request no errors -> StatusMonitor.response populated, and optionally metadata in StatusMonitor.metadata. StatusMonitor.error must not be populated.

type **should not** be `google.protobuf.Empty`, unless it is certain that
metadata will _never_ be needed. If metadata might be added in the future,
define an empty message for the RPC metadata and use that.
- APIs with messages that return `Operation` **must** implement the
[`Operations`][lro] service. Individual APIs **must not** define their own
interfaces for long-running operations to avoid inconsistency.

{% tab oas %}

{% sample 'lro.oas.yaml', 'paths' %}

- `202` **must** be the only success status code defined.
- The `202` response **must** define an `application/json` response body and no other
response content types.
- The response body schema **must** be an object with `name`, `done`, and `result`
properties as described above for a `StatusMonitor`.
- The response body schema **may** contain an object property named `metadata` to
hold service-specific metadata associated with the operation, for example progress
information and common metadata such as create time. The service **should** define
the contents of the `metadata` object in a separate schema, which **should** specify
`additionalProperties: true` to allow for future extensibility.
- The `response` property **must** be a schema that defines the success
response for the operation. For an operation that typically gives a `204 No Content`
response, such as a `Delete`, `response` should be defined as an empty object schema.
For a standard `Get/Create/Update` operation, `response` should be a representation
of the resource.
- If a service has any long-running operations, the service **must** define a
`StatusMonitor` resource with a `list` operation to retrieve a potentially filtered
list of status monitors and a `get` operation to retrieve a specific status monitor
by its `name`.

{% endtabs %}

<!-- prettier-ignore-start -->
[google.rpc.Status]: https://github.com/googleapis/api-common-protos/blob/master/google/rpc/S.proto
[lro]: https://github.com/googleapis/api-common-protos/blob/master/google/longrunning/operations.proto
[node.js promise]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Using_promises
[future]: https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future
<!-- prettier-ignore-end -->
7 changes: 7 additions & 0 deletions aip/general/0151/aip.yaml
@@ -0,0 +1,7 @@
---
id: 151
state: reviewing
created: 2019-07-25
placement:
  category: operations
  order: 70
65 changes: 65 additions & 0 deletions aip/general/0151/lro.oas.yaml
@@ -0,0 +1,65 @@
openapi: 3.0.3
info:
  title: Library
  version: 1.0.0
paths:
  /v1/resources:
    post:
      operationId: write_book
      description: Write a book.
      responses:
        202:
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/WriteBookStatus'

components:
  schemas:
    StatusMonitor:
      description: The status of a long running operation.
      properties:
        name:
          type: string
          description: The server-assigned name, which is only unique within the same service that originally returns it.
        done:
          type: boolean
          description: >-
            If the value is false, it means the operation is still in progress. If true, the operation is completed,
            and either response or error is available.
        error:
          $ref: '#/components/schemas/Error'
      required:
        - name
        - done

    WriteBookStatus:
      description: The status of a write_book operation.
      allOf:
        - $ref: '#/components/schemas/StatusMonitor'
        - type: object
          properties:
            response:
              type: string
              description: The text that was written.
            metadata:
              type: object
              properties:
                start_time:
                  type: string
                  format: date-time
                  description: The time the operation started.
                progress_percent:
                  type: integer
                  format: int32
                  description: The current progress, expressed as an integer.
                state:
                  type: string
                  description: The current state of the operation.
                  enum:
                    - STATE_UNSPECIFIED
                    - RUNNING
                    - CANCELLING
                    - CANCELLED
                    - FAILED
85 changes: 85 additions & 0 deletions aip/general/0151/lro.proto
@@ -0,0 +1,85 @@
// Copyright 2020 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

syntax = "proto3";

import "google/api/annotations.proto";
import "google/api/resource.proto";
import "google/longrunning/operations.proto";
import "google/protobuf/timestamp.proto";

service Library {
  // Write a book.
  rpc WriteBook(WriteBookRequest) returns (google.longrunning.Operation) {
    option (google.api.http) = {
      post: "/v1/{parent=publishers/*}/books:write"
      body: "*"
    };
    option (google.longrunning.operation_info) = {
      response_type: "WriteBookResponse"
      metadata_type: "WriteBookMetadata"
    };
  }
}

// The request message for the WriteBook endpoint.
// Unlike the response, this message is accepted directly.
message WriteBookRequest {
  // The publisher for which the book is to be written.
  string parent = 1 [(google.api.resource_reference) = {
    child_type: "library.googleapis.com/Book"
  }];

  // The title of the new book.
  string title = 2;
}

// The response message for the WriteBook endpoint.
// When WriteBook is called, it will not send this response directly; instead,
// it sends a long-running operation; the operation will contain this in
// the `response` field once it completes.
message WriteBookResponse {
  // The text that was written.
  string text = 1;
}

// The metadata message for the WriteBook endpoint.
// This is populated in the `metadata` field for all WriteBook LROs.
message WriteBookMetadata {
  // The time the operation started.
  google.protobuf.Timestamp start_time = 1;

  // The current progress, expressed as an integer: [0, 100].
  int32 progress_percent = 2;

  enum State {
    STATE_UNSPECIFIED = 0;

    // The operation is running.
    RUNNING = 1;

    // The operation is still running, but cancellation has been requested
    // and accepted, and is in progress.
    CANCELLING = 2;

    // The operation was cancelled.
    CANCELLED = 3;

    // The operation failed.
    FAILED = 4;
  }

  // The current state of the operation.
  State state = 3;
}