New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement from_sort_value Parameter in Get Snapshots API #77618
Implement from_sort_value Parameter in Get Snapshots API #77618
Conversation
Add `after_value` parameter to allow for filtering snapshots by comparing to concrete sort column values similar to the existing `after` parameter`.
Pinging @elastic/es-distributed (Team:Distributed) |
@@ -451,6 +447,189 @@ public void testFilterBySLMPolicy() throws Exception { | |||
assertThat(getAllSnapshotsForPolicies(GetSnapshotsRequest.NO_POLICY_PATTERN, "*"), is(allSnapshots)); | |||
} | |||
|
|||
public void testSortAfterStartTime() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not explicitly add tests for all columns as they use the same functionality and limited myself to those that I needed for full coverage only here.
SortOrder order | ||
) { | ||
if (snapshotName == null) { | ||
assert repoName == null : "no snapshot name given but saw repo name [" + repoName + "]"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's definitely ways of making this nicer, but since it's a temporary solution I went for the shortest+easiest-to-review version again to get this into the API (and thus complete the API side of this work) asap.
Jenkins run elasticsearch-ci/part-1 |
@henningandersen this one should be good for review now :) as discussed it's using |
blockAllDataNodes(repoName); | ||
final ActionFuture<CreateSnapshotResponse> snapshot2Future = startFullSnapshot(repoName, "snapshot-2"); | ||
awaitNumberOfSnapshotsInProgress(1); | ||
TimeUnit.MILLISECONDS.sleep(snapshot1.endTime() - snapshot1.startTime() + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this complexity really worth it? Maybe we could just snapshot indices with random docs, retrieve the duration and test after_value
after that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this seems overly complex. I think we could even just do one test with all 3 validations in one go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's pretty easy to get collisions no matter what things you do in between snapshots if you don't make it deterministic like this. That said, this wasn't even good enough for Windows I think and I went with a different approach now, running the snapshot create in a loop until we get unique timestamps. Should be safe on all systems now :)
@@ -628,3 +633,84 @@ The API returns the following response: | |||
// TESTRESPONSE[s/"end_time_in_millis": 1593094752019/"end_time_in_millis": $body.snapshots.1.end_time_in_millis/] | |||
// TESTRESPONSE[s/"duration_in_millis": 0/"duration_in_millis": $body.snapshots.0.duration_in_millis/] | |||
// TESTRESPONSE[s/"duration_in_millis": 1/"duration_in_millis": $body.snapshots.1.duration_in_millis/] | |||
|
|||
|
|||
The following request returns information for all snapshots that come after `snapshot_2` when sorted by snapshot name in the default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that sorting after a given start/end time will be the most common use case. Maybe it's worth showing as an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love to but I don't see a way of doing this with a test because the timestamp on the request would have to be dynamic right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the timestamp on the request would have to be dynamic right?
Or low enough to always return some snapshots in the response. But that's just a suggestion, let's ignore if that's too complicated or ugly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point :) Added a docs run for this with a low timestamp now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few smaller comments, otherwise this looks good.
@@ -140,14 +140,19 @@ Allows setting a sort order for the result. Defaults to `start_time`, i.e. sorti | |||
(Optional, string) | |||
Sort order. Valid values are `asc` for ascending and `desc` for descending order. Defaults to `asc`, meaning ascending order. | |||
|
|||
`after_value`:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should call the parameter after_sort_value
to make the coupling to sort explicit in the name?
I am still pondering on after_
. Reading your text, perhaps start_sort_value
is better, though I am not really contend with that either...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about this or the alternatives. How about just from
maybe? It's short and it indicates "start" + "inclusive"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like from, but perhaps still qualify it as from_sort_value
? I think that makes it easier to see from the request what it means.
|
||
assertThat(allAfterStartTimeAscending(startTime1 - 1), is(allSnapshotInfo)); | ||
assertThat(allAfterStartTimeAscending(startTime1), is(allSnapshotInfo)); | ||
assertThat(allAfterStartTimeAscending(startTime2), is(List.of(snapshot2, snapshot3))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure that snapshot1
and snapshot2
get different timestamps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not on Windows where the clock is less accurate actually ... let me make sure they will.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjusted the code to ensure the timestamps never collide now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I see that here in the rest test case, only in the internal cluster test? Maybe I missed it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right missed this one, adding a fix here as well :)
.getSnapshots(); | ||
} | ||
|
||
public void testSortAfterName() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test and the one sorting by timestamp are very similar, perhaps we can share most of the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merged all tests into one :)
blockAllDataNodes(repoName); | ||
final ActionFuture<CreateSnapshotResponse> snapshot2Future = startFullSnapshot(repoName, "snapshot-2"); | ||
awaitNumberOfSnapshotsInProgress(1); | ||
TimeUnit.MILLISECONDS.sleep(snapshot1.endTime() - snapshot1.startTime() + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this seems overly complex. I think we could even just do one test with all 3 validations in one go?
if (after != null) { | ||
validationException = addValidationError("can't use after and offset simultaneously", validationException); | ||
} | ||
if (afterValue != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we are not allowing this? offset
could just be evaluated after after_value
?
Obviously we can do that in a follow-up rather than here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh yea actually we have to that for Kibana (otherwise any filtered results can't be paginated through which seems weird). Let's push it to a follow-up though as it will require some more code-heavy tests I think, though the production code change is easy/small :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, this turned out to be entirely trivial with the way offset works at the moment. I just enabled it and added 2 simple tests for offset + after.
deleted will be seen during the iteration. Snapshots concurrently created may be seen during an iteration. | ||
|
||
NOTE: The parameters `size`, `order`, `after`, `offset`, `slm_policy_filter` and `sort` are not supported when using `verbose=false` and | ||
the sort order for requests with `verbose=false` is undefined. | ||
NOTE: The parameters `size`, `order`, `after`, `after_value`, `offset`, `slm_policy_filter` and `sort` are not supported when using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also note the incompatibility of after_value
with offset
and after
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well now that you pointed it out below, it would only be with "after" in the final thing which I'd code up very shortly after this one. Not sure it's worth the added text though, it obviously makes no sense setting both, the request validation will tell you so and we already point this out in the after
section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you added the incompatibility note to offset
, I missed that on my initial read. I think it still makes sense to add to either after
or after_value
that they are incompatible. I could see this being slightly surprising, since after_value
is like a filter so it could be surprising that it does not work with pagination using after
(and this could slip through some UI testing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, added a note to after
:)
final Predicate<SnapshotInfo> isAfter; | ||
switch (sortBy) { | ||
case START_TIME: | ||
isAfter = filterByLongOffset(SnapshotInfo::startTime, Long.parseLong(after), snapshotName, repoName, order); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add explicit error handling when parsing the arguments, perhaps to the request validation or here. That would allow the message to indicate that it is the after_value
causing the problem.
Also, I think < 0
is illegal so we could guard against that.
Similarly for duration, indices and failed shards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized we have the safe problem for the after
value as well (though due to the way we encode things there it's less likely). May I push this into a follow-up? :) Then I can just deal with both in one go and make it a little nicer without blowing this one up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, I refactored this a little to build the predicate early on now. This makes sure that we at least don't run any actual requests unless we have a valid predicate. Refactoring this nicer and working in a fix for the after
param + easy-to-read validation I'd still like to push to a follow-up though if possible :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that for after
we expect it to be handle opaquely by clients so if they mess with it, I am OK to have a worse error (though I will not object to improving it).
I am OK with pushing the error handling to a separate PR.
assertThat(allBeforeStartTimeDescending(startTime2), is(List.of(snapshot2, snapshot1))); | ||
assertThat(allBeforeStartTimeDescending(startTime1), is(List.of(snapshot1))); | ||
assertThat(allBeforeStartTimeDescending(startTime1 - 1), empty()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a combination test too, validating that after_value
and for instance snapshot name filtering works together? Just one of these is enough, no need to do several combinations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a test for that and also for offset + after_value
which turned out to be trivial.
Thanks so much for reading through this monster @henningandersen @tlrx . I think I was able to work in all your points now :) The BwC failure is unrelated and know and this should be good for another review I hope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Armin, this looks good, just a comment on the name and some minor comments.
@@ -140,14 +140,19 @@ Allows setting a sort order for the result. Defaults to `start_time`, i.e. sorti | |||
(Optional, string) | |||
Sort order. Valid values are `asc` for ascending and `desc` for descending order. Defaults to `asc`, meaning ascending order. | |||
|
|||
`after_value`:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like from, but perhaps still qualify it as from_sort_value
? I think that makes it easier to see from the request what it means.
`after`:: | ||
(Optional, string) | ||
Offset identifier to start pagination from as returned by the `next` field in the response body. | ||
Offset identifier to start pagination from as returned by the `next` field in the response body. Using this parameter is mutually exclusive | ||
with using the `after_value` parameter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I saw a comment where you mentioned you did this anyway since it was so easy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't do it for after
, just for numeric offsets. We don't want after
and from_sort_value
working in parallel do we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I got confused by the "offset" part of the sentence, this is fine as is. About after
and from_sort_value
, let us see what kibana thinks, I think they can handle it.
|
||
assertThat(allAfterStartTimeAscending(startTime1 - 1), is(allSnapshotInfo)); | ||
assertThat(allAfterStartTimeAscending(startTime1), is(allSnapshotInfo)); | ||
assertThat(allAfterStartTimeAscending(startTime2), is(List.of(snapshot2, snapshot3))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I see that here in the rest test case, only in the internal cluster test? Maybe I missed it?
) throws Exception { | ||
while (true) { | ||
final SnapshotInfo snapshotInfo = createFullSnapshot(repoName, snapshotName); | ||
final long duration = snapshotInfo.endTime() - snapshotInfo.startTime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am mildly concerned about the duration being either 0 or 100ms on some platform meaning we could loop infinitely here. I think we will be ok so let us leave it, though a comment to not go beyond the 3 snapshots we generate now seems in order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, comment added
Thanks @henningandersen renamed + other comments addressed as well now :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -628,3 +633,84 @@ The API returns the following response: | |||
// TESTRESPONSE[s/"end_time_in_millis": 1593094752019/"end_time_in_millis": $body.snapshots.1.end_time_in_millis/] | |||
// TESTRESPONSE[s/"duration_in_millis": 0/"duration_in_millis": $body.snapshots.0.duration_in_millis/] | |||
// TESTRESPONSE[s/"duration_in_millis": 1/"duration_in_millis": $body.snapshots.1.duration_in_millis/] | |||
|
|||
|
|||
The following request returns information for all snapshots that come after `snapshot_2` when sorted by snapshot name in the default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Thanks Henning & Tanguy!! |
Add `from_sort_value` parameter to allow for filtering snapshots by comparing to concrete sort column values similar to the existing after parameter`.
Add
from_sort_value
parameter to allow for filtering snapshots by comparing to concrete sort columnvalues similar to the existing
after
parameter`.