Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement from_sort_value Parameter in Get Snapshots API #77618

Merged
merged 15 commits into from Sep 15, 2021

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Sep 13, 2021

Add from_sort_value parameter to allow for filtering snapshots by comparing to concrete sort column
values similar to the existing after parameter`.

Add `after_value` parameter to allow for filtering snapshots by comparing to concrete sort column
values similar to the existing `after` parameter`.
@original-brownbear original-brownbear added >enhancement :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.16.0 labels Sep 13, 2021
@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label Sep 13, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@@ -451,6 +447,189 @@ public void testFilterBySLMPolicy() throws Exception {
assertThat(getAllSnapshotsForPolicies(GetSnapshotsRequest.NO_POLICY_PATTERN, "*"), is(allSnapshots));
}

public void testSortAfterStartTime() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not explicitly add tests for all columns as they use the same functionality and limited myself to those that I needed for full coverage only here.

SortOrder order
) {
if (snapshotName == null) {
assert repoName == null : "no snapshot name given but saw repo name [" + repoName + "]";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's definitely ways of making this nicer, but since it's a temporary solution I went for the shortest+easiest-to-review version again to get this into the API (and thus complete the API side of this work) asap.

@original-brownbear
Copy link
Member Author

Jenkins run elasticsearch-ci/part-1

@original-brownbear
Copy link
Member Author

@henningandersen this one should be good for review now :) as discussed it's using >= all the way now.

blockAllDataNodes(repoName);
final ActionFuture<CreateSnapshotResponse> snapshot2Future = startFullSnapshot(repoName, "snapshot-2");
awaitNumberOfSnapshotsInProgress(1);
TimeUnit.MILLISECONDS.sleep(snapshot1.endTime() - snapshot1.startTime() + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this complexity really worth it? Maybe we could just snapshot indices with random docs, retrieve the duration and test after_value after that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this seems overly complex. I think we could even just do one test with all 3 validations in one go?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pretty easy to get collisions no matter what things you do in between snapshots if you don't make it deterministic like this. That said, this wasn't even good enough for Windows I think and I went with a different approach now, running the snapshot create in a loop until we get unique timestamps. Should be safe on all systems now :)

@@ -628,3 +633,84 @@ The API returns the following response:
// TESTRESPONSE[s/"end_time_in_millis": 1593094752019/"end_time_in_millis": $body.snapshots.1.end_time_in_millis/]
// TESTRESPONSE[s/"duration_in_millis": 0/"duration_in_millis": $body.snapshots.0.duration_in_millis/]
// TESTRESPONSE[s/"duration_in_millis": 1/"duration_in_millis": $body.snapshots.1.duration_in_millis/]


The following request returns information for all snapshots that come after `snapshot_2` when sorted by snapshot name in the default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that sorting after a given start/end time will be the most common use case. Maybe it's worth showing as an example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to but I don't see a way of doing this with a test because the timestamp on the request would have to be dynamic right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the timestamp on the request would have to be dynamic right?

Or low enough to always return some snapshots in the response. But that's just a suggestion, let's ignore if that's too complicated or ugly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point :) Added a docs run for this with a low timestamp now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few smaller comments, otherwise this looks good.

@@ -140,14 +140,19 @@ Allows setting a sort order for the result. Defaults to `start_time`, i.e. sorti
(Optional, string)
Sort order. Valid values are `asc` for ascending and `desc` for descending order. Defaults to `asc`, meaning ascending order.

`after_value`::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should call the parameter after_sort_value to make the coupling to sort explicit in the name?

I am still pondering on after_. Reading your text, perhaps start_sort_value is better, though I am not really contend with that either...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this or the alternatives. How about just from maybe? It's short and it indicates "start" + "inclusive"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like from, but perhaps still qualify it as from_sort_value? I think that makes it easier to see from the request what it means.


assertThat(allAfterStartTimeAscending(startTime1 - 1), is(allSnapshotInfo));
assertThat(allAfterStartTimeAscending(startTime1), is(allSnapshotInfo));
assertThat(allAfterStartTimeAscending(startTime2), is(List.of(snapshot2, snapshot3)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that snapshot1 and snapshot2 get different timestamps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not on Windows where the clock is less accurate actually ... let me make sure they will.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted the code to ensure the timestamps never collide now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I see that here in the rest test case, only in the internal cluster test? Maybe I missed it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right missed this one, adding a fix here as well :)

.getSnapshots();
}

public void testSortAfterName() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test and the one sorting by timestamp are very similar, perhaps we can share most of the code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merged all tests into one :)

blockAllDataNodes(repoName);
final ActionFuture<CreateSnapshotResponse> snapshot2Future = startFullSnapshot(repoName, "snapshot-2");
awaitNumberOfSnapshotsInProgress(1);
TimeUnit.MILLISECONDS.sleep(snapshot1.endTime() - snapshot1.startTime() + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this seems overly complex. I think we could even just do one test with all 3 validations in one go?

if (after != null) {
validationException = addValidationError("can't use after and offset simultaneously", validationException);
}
if (afterValue != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we are not allowing this? offset could just be evaluated after after_value?

Obviously we can do that in a follow-up rather than here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh yea actually we have to that for Kibana (otherwise any filtered results can't be paginated through which seems weird). Let's push it to a follow-up though as it will require some more code-heavy tests I think, though the production code change is easy/small :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, this turned out to be entirely trivial with the way offset works at the moment. I just enabled it and added 2 simple tests for offset + after.

deleted will be seen during the iteration. Snapshots concurrently created may be seen during an iteration.

NOTE: The parameters `size`, `order`, `after`, `offset`, `slm_policy_filter` and `sort` are not supported when using `verbose=false` and
the sort order for requests with `verbose=false` is undefined.
NOTE: The parameters `size`, `order`, `after`, `after_value`, `offset`, `slm_policy_filter` and `sort` are not supported when using
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also note the incompatibility of after_value with offset and after?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well now that you pointed it out below, it would only be with "after" in the final thing which I'd code up very shortly after this one. Not sure it's worth the added text though, it obviously makes no sense setting both, the request validation will tell you so and we already point this out in the after section?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you added the incompatibility note to offset, I missed that on my initial read. I think it still makes sense to add to either after or after_value that they are incompatible. I could see this being slightly surprising, since after_value is like a filter so it could be surprising that it does not work with pagination using after (and this could slip through some UI testing).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added a note to after :)

final Predicate<SnapshotInfo> isAfter;
switch (sortBy) {
case START_TIME:
isAfter = filterByLongOffset(SnapshotInfo::startTime, Long.parseLong(after), snapshotName, repoName, order);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add explicit error handling when parsing the arguments, perhaps to the request validation or here. That would allow the message to indicate that it is the after_value causing the problem.

Also, I think < 0 is illegal so we could guard against that.

Similarly for duration, indices and failed shards.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized we have the safe problem for the after value as well (though due to the way we encode things there it's less likely). May I push this into a follow-up? :) Then I can just deal with both in one go and make it a little nicer without blowing this one up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I refactored this a little to build the predicate early on now. This makes sure that we at least don't run any actual requests unless we have a valid predicate. Refactoring this nicer and working in a fix for the after param + easy-to-read validation I'd still like to push to a follow-up though if possible :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that for after we expect it to be handle opaquely by clients so if they mess with it, I am OK to have a worse error (though I will not object to improving it).

I am OK with pushing the error handling to a separate PR.

assertThat(allBeforeStartTimeDescending(startTime2), is(List.of(snapshot2, snapshot1)));
assertThat(allBeforeStartTimeDescending(startTime1), is(List.of(snapshot1)));
assertThat(allBeforeStartTimeDescending(startTime1 - 1), empty());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a combination test too, validating that after_value and for instance snapshot name filtering works together? Just one of these is enough, no need to do several combinations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test for that and also for offset + after_value which turned out to be trivial.

@original-brownbear
Copy link
Member Author

Thanks so much for reading through this monster @henningandersen @tlrx . I think I was able to work in all your points now :) The BwC failure is unrelated and know and this should be good for another review I hope.

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Armin, this looks good, just a comment on the name and some minor comments.

@@ -140,14 +140,19 @@ Allows setting a sort order for the result. Defaults to `start_time`, i.e. sorti
(Optional, string)
Sort order. Valid values are `asc` for ascending and `desc` for descending order. Defaults to `asc`, meaning ascending order.

`after_value`::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like from, but perhaps still qualify it as from_sort_value? I think that makes it easier to see from the request what it means.

`after`::
(Optional, string)
Offset identifier to start pagination from as returned by the `next` field in the response body.
Offset identifier to start pagination from as returned by the `next` field in the response body. Using this parameter is mutually exclusive
with using the `after_value` parameter.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I saw a comment where you mentioned you did this anyway since it was so easy?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't do it for after, just for numeric offsets. We don't want after and from_sort_value working in parallel do we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I got confused by the "offset" part of the sentence, this is fine as is. About after and from_sort_value, let us see what kibana thinks, I think they can handle it.


assertThat(allAfterStartTimeAscending(startTime1 - 1), is(allSnapshotInfo));
assertThat(allAfterStartTimeAscending(startTime1), is(allSnapshotInfo));
assertThat(allAfterStartTimeAscending(startTime2), is(List.of(snapshot2, snapshot3)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I see that here in the rest test case, only in the internal cluster test? Maybe I missed it?

) throws Exception {
while (true) {
final SnapshotInfo snapshotInfo = createFullSnapshot(repoName, snapshotName);
final long duration = snapshotInfo.endTime() - snapshotInfo.startTime();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am mildly concerned about the duration being either 0 or 100ms on some platform meaning we could loop infinitely here. I think we will be ok so let us leave it, though a comment to not go beyond the 3 snapshots we generate now seems in order.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, comment added

@original-brownbear
Copy link
Member Author

Thanks @henningandersen renamed + other comments addressed as well now :)

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -628,3 +633,84 @@ The API returns the following response:
// TESTRESPONSE[s/"end_time_in_millis": 1593094752019/"end_time_in_millis": $body.snapshots.1.end_time_in_millis/]
// TESTRESPONSE[s/"duration_in_millis": 0/"duration_in_millis": $body.snapshots.0.duration_in_millis/]
// TESTRESPONSE[s/"duration_in_millis": 1/"duration_in_millis": $body.snapshots.1.duration_in_millis/]


The following request returns information for all snapshots that come after `snapshot_2` when sorted by snapshot name in the default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@original-brownbear
Copy link
Member Author

Thanks Henning & Tanguy!!

@original-brownbear original-brownbear changed the title Implement after_value Parameter in Get Snapshots API Implement from_sort_value Parameter in Get Snapshots API Sep 15, 2021
@original-brownbear original-brownbear merged commit 2544d91 into elastic:master Sep 15, 2021
@original-brownbear original-brownbear deleted the after-value branch September 15, 2021 11:37
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Oct 17, 2021
Add `from_sort_value` parameter to allow for filtering snapshots by comparing to concrete sort column
values similar to the existing after parameter`.
original-brownbear added a commit that referenced this pull request Oct 17, 2021
…9318)

Add `from_sort_value` parameter to allow for filtering snapshots by comparing to concrete sort column
values similar to the existing after parameter`.
@original-brownbear original-brownbear restored the after-value branch April 18, 2023 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Team:Distributed Meta label for distributed team v7.16.0 v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants