Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch operations spec points #1421

Merged
merged 15 commits into from
May 13, 2022
Merged

Conversation

Peter-Maguire
Copy link
Contributor

@Peter-Maguire Peter-Maguire commented May 9, 2022

NOTE TO REVIEWERS - please do not review PRs in the DRAFT state, as the PR may change substantially before it is ready to review. Thanks.

Description

Adds spec points for the Batch Operations functions.

Review

@Peter-Maguire Peter-Maguire self-assigned this May 9, 2022
Copy link
Member

@owenpearson owenpearson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few comments. Also, I notice that the REST API supports multiple BatchSpec objects in one HTTP request (as in specifying different messages for different channels in the same batch). Did you consider supporting this in SDKs? Even if we don't support it initially it might be good to design this such that we can support it in the future without a breaking change.

content/client-lib-development-guide/features.textile Outdated Show resolved Hide resolved
content/client-lib-development-guide/features.textile Outdated Show resolved Hide resolved
content/client-lib-development-guide/features.textile Outdated Show resolved Hide resolved
Peter Maguire and others added 3 commits May 11, 2022 10:07
Co-authored-by: Owen Pearson <48608556+owenpearson@users.noreply.github.com>
@Peter-Maguire
Copy link
Contributor Author

Looks good, just a few comments. Also, I notice that the REST API supports multiple BatchSpec objects in one HTTP request (as in specifying different messages for different channels in the same batch). Did you consider supporting this in SDKs? Even if we don't support it initially it might be good to design this such that we can support it in the future without a breaking change.

@owenpearson this is a good question, I definitely agree that it should be implemented here, although I can't quite figure out how the rest way to do it would be. I've pushed a commit for how I think it should work, could you take a look and see if it's how you were thinking too?

@owenpearson
Copy link
Member

@Peter-Maguire Nice, that is pretty much the API I was imagining, although now that I see it side-by-side with the pre-existing API I almost wonder if it's worth keeping the (string[], Message[]) API, since it would be almost identical to using the (BatchSpec[]) API, for example:

// Using the (string[], Message[]) API
client.batch.publish(['my_cool_channel', 'my_other_cool_channel'], { foo: 'bar' });

// Using the (BatchSpec[]) API
client.batch.publish([{
  channels: ['my_cool_channel', 'my_other_cool_channel'],
  messages: [{ foo: 'bar' }]
}]);

I haven't thought about it too much but my current thinking is maybe we just support two batch.publish signatures: one which takes a single BatchSpec, and another which takes a BatchSpec[]. WDYT?

@Peter-Maguire
Copy link
Contributor Author

@owenpearson my initial thinking with including the various inputs for single channels/messages and arrays of them was for convenience. Perhaps having a combination of each plus the BatchSpec methods would be too many different variations, though. Especially for languages that don't support function overloading. I will remove the shorthand functions and replace them with BatchSpec and []BatchSpec, I think that will be easier to implement more consistently.

Co-authored-by: Rosie Hamilton <Rosalita@users.noreply.github.com>
Copy link
Contributor

@Rosalita Rosalita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing all my comments, I have no more comments to make 👍

Copy link
Member

@owenpearson owenpearson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@Peter-Maguire Peter-Maguire merged commit 44d9e76 into main May 13, 2022
@Peter-Maguire Peter-Maguire deleted the feature/sdk-1792-batch-operations branch May 13, 2022 15:00
@mattheworiordan
Copy link
Member

This PR is great, apologies for feedback after the fact, but I was not aware of this PR until today. It's so great to see the spec evolving 🥳

I do have on question though. Why does the batch method not exist on the Realtime SDK? I believe this creates inconsistency in what we've done historically where all Rest functionality is available in the Realtime SDK by default. As you can see from the screenshot below, by not doing this, we're deviating from what we've done historically. Was this intentional or just an omission?

MO screenshot 2022-05-13 at 15 23 40

@Peter-Maguire
Copy link
Contributor Author

@mattheworiordan good point, I assumed that realtime extends rest and that that was defined somewhere in the spec, but I should have checked. As I've already merged this, I can create a new PR to add batch into realtime.

@@ -339,6 +340,14 @@ h3(#rest-encryption). Encryption
h3(#rest-compatibility). Forwards compatibility
* @(RSF1)@ The library must apply the "robustness principle":https://en.wikipedia.org/wiki/Robustness_principle in its processing of requests and responses with the Ably system. In particular, deserialization of Messages and related types, and associated enums, must be tolerant to unrecognised attributes or enum values. Such unrecognised values must be ignored.

h3(#batch-operations). Batch Operations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense for there to be explicit sections for each of the batch operations?

For example an h3 of "Batch publish", and another one for "Batch presence" ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I agree with this unless we change the API structure; the h3s in this section of the spec seem to correlate with the API. I've restructured this section in 015aa44 to make it more inline with similar parts of the spec.

@@ -339,6 +340,14 @@ h3(#rest-encryption). Encryption
h3(#rest-compatibility). Forwards compatibility
* @(RSF1)@ The library must apply the "robustness principle":https://en.wikipedia.org/wiki/Robustness_principle in its processing of requests and responses with the Ably system. In particular, deserialization of Messages and related types, and associated enums, must be tolerant to unrecognised attributes or enum values. Such unrecognised values must be ignored.

h3(#batch-operations). Batch Operations
* @(BO1)@ The batch operations functions must use the REST endpoints in Batch Mode, sending a single request containing all specified data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"must use the REST endpoints in Batch Mode" - what does this mean? I think this spec should be explicit about the client API, and what HTTP request is made as a result (ie specific path, specific params)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this sentence and added some clarity about the request in 015aa44

** @(BO2a)@ Publishing messages against one or more channels with one or more messages
*** @(B02a1)@ Functions should be provided to pass either an array or a single object for both Message and Channel
** @(BO2b)@ Retrieving the presence data for one or more channels
* @(BO3)@ When all passed arrays contain a single object, the underlying request is functionally identical to it's non-batch equivalent, but the returned result should be a @BatchResponse@ object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should have a generic BatchResponse type, or a BatchPublishResponse ? Or BatchResponse<PublishResult> ?
Edit: I see you have BatchPublishResponse below, but here I'm talking about the outer type representing the top-level response

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The outer response is already genericised as BatchResult<BatchPublishResponse>. I'll update this line to reflect that.

h4. BatchResult
* @(BPA1)@ Contains the results from the batch operation
* @(BPA2)@ @BatchResult@ has the following attributes:
** @(BPA2a)@ @responses@ is an array of batch response objects. undefined if the request failed completely (e.g. for an invalid key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undefined is a JS concept. I think we need to try to word this in a way that's as language-agnostic as possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a discussion around this here where it was changed to null.

** @(BPA2a)@ @responses@ is an array of batch response objects. undefined if the request failed completely (e.g. for an invalid key)
** @(BPA2b)@ @error@ is an @ErrorInfo@ object which is populated if one or more batch publish requests failed

h4. BatchPublishResponse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that these individual publish responses are the same thing as you get back from a regular single-channel REST publish request. I know that we hadn't explicitly named that type before, but it might make sense to do that now, and have this as a PublishResponse or PublishResult or something.

This already exists in the java library: https://github.com/ably/ably-java/blob/main/lib/src/main/java/io/ably/lib/types/PublishResponse.java

@@ -1525,6 +1561,7 @@ class Rest:
constructor(ClientOptions) // RSC1
auth: Auth // RSC5
push: Push
batch: BatchOperations // BO1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a client option?
Edit: I misread.
I'm not sure I like the idea that in the API we're grouping together all of the operations that happen to be in batch mode into a top-level entry like this. Publish, presence, and future batch operations (history perhaps) are functionally quite separate - it just happens that they operate in a similar way. This doesn't seem intuitive to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this didn't feel like a 'perfect' solution to me either. I was stuck for a better way to do it though, especially for batch publishing. I could add batch presence to RestPresence easily, but batch publishing to multiple channels didn't feel like it would work in the same way as publishing happens on a Channel object

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only sensible alternative I can think of is to have RestChannels#batchPublish and RestPresence#batchGet, @paddybyers WDYT?

@@ -1697,6 +1735,30 @@ class RealtimeChannel:
unsubscribe(String, (Message) ->) // RTL8a
setOptions(options: ChannelOptions) => io // RTL16

class BatchOperations:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an existing Java API for batch publishing: https://github.com/ably/ably-java/blob/main/lib/src/main/java/io/ably/lib/rest/AblyBase.java#L232. Did you look at that?

This existing Java API isn't great, in that you pass in an array of publish specs, and you get back an array of publish responses, and you have to correlate them yourself. (Ie you have to find the publish response that corresponds to the publish spec you passed in). It would be worth thinking about whether or not this API does something to address that problem.

Copy link
Member

@lmars lmars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Peter-Maguire @owenpearson apologies for being late with my review too! Happy to move the conversation elsewhere, but I thought for now this was the most appropriate place.

class BatchPresence:
clientId: string
action: string?
error: ErrorInfo?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this error belongs on BatchPresenceResponse

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot, I've moved it over on the new PR in a33bfbe and 384b4fe

class PushChannel:
subscribeDevice() => io // RSH7a
subscribeClient() => io // RSH7b
unsubscribeDevice() => io // RSH7c
unsubscribeClient() => io // RSH7d
listSubscriptions() => io PaginatedResult<PushChannelSubscription> // RSH7e

class BatchSpec:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the documentation calls this BatchSpec too, but it seems specific to publishing a batch of messages, so perhaps it should be called something like BatchPublishRequest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The batch publish documentation reads as if the BatchSpec object is the one passed into the body of the request either as a single object or within an array as it's being used here. In any case I would argue that the spec should be naming things as consistently as possible with the REST documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case I would argue that the spec should be naming things as consistently as possible with the REST documentation.

I agree with that, so I guess what I'm saying is, how about we rename it to BatchPublishRequest in both places? This term isn't something that appears in the API response itself, so we should be free to change it, and now would be the time to change it if we think it could be named better before it starts to appear in SDK code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed it to BatchPublishRequest both in the spec and in the docs in a2eee7d


class BatchPresenceResponse:
channel: String // BPD2a
presence: []BatchPresence // PBD2b
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be []PresenceMessage, not sure an extra type is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PresenceMessage has far more fields on it than what appears in the response for a batch presence request. If those fields are all optional or can be retrieved from context then I would be happy to use that type instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PresenceMessage has far more fields on it than what appears in the response for a batch presence request

Can you give an example? I'm not sure why retrieving presence for multiple channels in one request should return less fields than retrieving for a single channel, if that is the case it sounds more like a problem with the API that we should fix rather than something we should be encoding into the type system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the spec, PresenceMessage has connectionId, id and timestamp along with action and clientId which are the only two fields that the REST API returns for batch presence.
Docs link for PresenceMessage | Docs link for BatchPresence

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PresenceMessage has connectionId, id and timestamp along with action and clientId which are the only two fields that the REST API returns for batch presence.

The docs may seem to indicate that, but it's not accurate, you get the same fields from both endpoints:

$ curl -u "${ABLY_API_KEY}" "https://rest.ably.io/channels/test1/presence"
[
        {
                "id": "LnUWZJf1G-:0:0",
                "clientId": "lmars",
                "connectionId": "LnUWZJf1G-",
                "timestamp": 1653686295156,
                "data": "",
                "action": 1
        }
]

$ curl -u "${ABLY_API_KEY}" "https://rest.ably.io/presence?channel=test1"
[
        {
                "channel": "test1",
                "presence": [
                        {
                                "id": "LnUWZJf1G-:0:0",
                                "clientId": "lmars",
                                "connectionId": "LnUWZJf1G-",
                                "timestamp": 1653686295156,
                                "data": "",
                                "action": 1
                        }
                ]
        }
]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated to use PresenceMessage in 384b4fe. I've also created #1451 to address this on the docs side.

@@ -1697,13 +1735,40 @@ class RealtimeChannel:
unsubscribe(String, (Message) ->) // RTL8a
setOptions(options: ChannelOptions) => io // RTL16

class BatchOperations:
publish([BatchSpec]) => BatchResult<BatchPublishResponse> // BO2a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about partial success, I think I'd like to get back something which helps me understand which of the things I tried to publish failed.

Whilst I understand why the REST API is designed as it is, I think there's an opportunity to wrap it in a client-side abstraction which might be more useful.

For example, what if it looked like this:

batch := client.newBatchPublish()
req1 := batch.publish("channel1", "msg1")
req2 := batch.publish("channel1", "msg2")
req3 := batch.publish("channel2", "msg3")
results, err := batch.send()

and results somehow helps me correlate any errors with one or more of req1, req2, or req3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants