Reduce Number of List Calls During Snapshot Create and Delete #44088

original-brownbear · 2019-07-08T20:48:14Z

Some obvious cleanups I found when investigating the API call count
metering:

* No need to get the latest generation id after loading the latest
  repository data
  * Loading RepositoryData already requires fetching the latest
    generation so we can reuse it
* Also, reuse the list of all root blobs when fetching the latest repo
  generation during snapshot delete like we do for shard folders
* Lastly, don't try and load `index--1` (N = -1) repository data, it
  doesn't exist -> just return the empty repo data initially
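
To illustrate the idea of reusing an already-fetched root listing, here is a minimal sketch. It is not the code in this PR: `RootBlobGenerations`, `latestGeneration` and `isEmptyRepository` are hypothetical names, and the only things taken from the description are the `index-N` blob naming and the fact that N = -1 means there is nothing to load.

```java
import java.util.Set;

// Hypothetical sketch, not the actual BlobStoreRepository code.
final class RootBlobGenerations {
    // Generation of a repository that has no index-N blob yet.
    static final long EMPTY_REPO_GEN = -1L;
    private static final String INDEX_PREFIX = "index-";

    /**
     * Derives the latest generation from a root listing the caller already has,
     * so no additional list call is needed.
     */
    static long latestGeneration(Set<String> rootBlobNames) {
        long latest = EMPTY_REPO_GEN;
        for (String name : rootBlobNames) {
            if (!name.startsWith(INDEX_PREFIX)) {
                continue; // not an index-N blob
            }
            try {
                long gen = Long.parseLong(name.substring(INDEX_PREFIX.length()));
                latest = Math.max(latest, gen);
            } catch (NumberFormatException e) {
                // suffix is not a number, so this is not a generation blob; skip it
            }
        }
        return latest;
    }

    /** True when there is no index-N blob to read and empty repo data should be returned. */
    static boolean isEmptyRepository(long generation) {
        return generation == EMPTY_REPO_GEN;
    }
}
```

A caller that has already listed the root for other reasons would pass that listing in and only read `index-N` when `isEmptyRepository` returns false.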

elasticmachine · 2019-07-08T20:48:16Z

Pinging @elastic/es-distributed

original-brownbear · 2019-07-08T20:49:20Z

server/src/main/java/org/elasticsearch/repositories/RepositoryData.java

+     * @param newGeneration New Generation
+     * @return New instance
+     */
+    public RepositoryData withGenId(long newGeneration) {


Had to add this for the test, but I think it can be used in a follow-up to clean up the handling of the generation on repository data instances a little as well (currently the generation on these instances doesn't always correspond to the actual generation it will be written as).
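
For readers without the full diff, the shape of such a copy method can be sketched roughly as follows. `RepoDataSketch` is a hypothetical, heavily simplified stand-in; the real `RepositoryData` holds far more state, and only the `withGenId(long)` signature and the `getGenId()` accessor come from the diffs in this PR.

```java
import java.util.List;

// Hypothetical, simplified stand-in for RepositoryData.
final class RepoDataSketch {
    private final long genId;
    private final List<String> snapshotNames;

    RepoDataSketch(long genId, List<String> snapshotNames) {
        this.genId = genId;
        this.snapshotNames = List.copyOf(snapshotNames); // defensive, immutable copy
    }

    long getGenId() {
        return genId;
    }

    List<String> getSnapshotNames() {
        return snapshotNames;
    }

    /** Returns an instance with the same contents but the given generation. */
    RepoDataSketch withGenId(long newGeneration) {
        if (newGeneration == genId) {
            return this; // nothing to change
        }
        return new RepoDataSketch(newGeneration, snapshotNames);
    }
}
```

Because the instance is immutable, a test can take existing data and produce a copy at whatever generation it needs without touching the original.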

tlrx · 2019-07-09T07:48:47Z

server/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java

-        final long currentGen = latestIndexBlobId();
-        if (currentGen != repositoryStateId) {
+        final long currentGen = repositoryData.getGenId();
+        if (currentGen != expectedGen) {


OK, so the behavior of re-reading the last index-N file before writing a new one is kept, but it is now the responsibility of the caller of writeIndexGen() to pass an up-to-date RepositoryData. While I'm OK with the change, I find it more difficult now to identify the "generation" of the RepositoryData to be written.

Maybe we could change RepositoryData so that it increments its own generation when a snapshot is removed or added and here we check that repositoryData.getGenId() == (expectedGen + 1)?
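
The two checks under discussion can be put side by side in a small sketch. This is an illustration only: `GenerationCheck` and both method names are hypothetical, and throwing `IllegalStateException` on a mismatch is an assumption, since the diff above does not show what happens when the check fails.

```java
// Hypothetical sketch of the two generation checks discussed in this thread.
final class GenerationCheck {

    /** Check as in the diff: the passed-in data must still be at the expected generation. */
    static void ensureExpected(long currentGen, long expectedGen) {
        if (currentGen != expectedGen) {
            // Assumption: failure is signalled with an exception.
            throw new IllegalStateException(
                "expected generation [" + expectedGen + "] but found [" + currentGen + "]");
        }
    }

    /** Suggested variant: the data has already incremented its own generation by one. */
    static void ensureIncremented(long dataGen, long expectedGen) {
        if (dataGen != expectedGen + 1) {
            throw new IllegalStateException(
                "expected generation [" + (expectedGen + 1) + "] but found [" + dataGen + "]");
        }
    }
}
```

With the second form, the generation carried by the data would match the N of the `index-N` blob about to be written, which is the readability concern raised above.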

@tlrx yeah, I think that would be a nice refactoring. Unfortunately, when I last tried doing that (the auto-increment) it ran into a huge number of failing tests (there's literally more than 10 that somehow indirectly rely on the fact that the repository gen id stays the same on these operations). I think it's worthwhile, especially with this change, but probably best left to another PR?

I think it's worthwhile, especially with this change, but probably best left to another PR?

I'm OK if it's done in a follow up PR

original-brownbear · 2019-07-09T08:03:36Z

Thanks @tlrx all points addressed :)

tlrx

LGTM

original-brownbear · 2019-07-09T11:05:59Z

thanks @tlrx

original-brownbear added >non-issue :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.4.0 labels Jul 8, 2019

original-brownbear changed the title ~~Reduce Numebr of List Calls During Snapshot Create and Delete~~ Reduce Number of List Calls During Snapshot Create and Delete Jul 8, 2019

nicer

263b39a

original-brownbear requested review from tlrx and ywelsch July 9, 2019 07:09

tlrx reviewed Jul 9, 2019

original-brownbear added 2 commits July 9, 2019 09:59

Merge remote-tracking branch 'elastic/master' into save-some-rpc-calls

a8b8a92

CR: simplifications

85c0c2c

original-brownbear requested a review from tlrx July 9, 2019 08:03

tlrx approved these changes Jul 9, 2019

original-brownbear merged commit 4849c3e into elastic:master Jul 9, 2019

original-brownbear deleted the save-some-rpc-calls branch July 9, 2019 11:06

original-brownbear added backport pending and removed backport pending labels Jul 9, 2019

original-brownbear mentioned this pull request Jul 11, 2019

Reduce Number of List Calls During Snapshot Create and Delete (#44088) #44209

Merged

original-brownbear restored the save-some-rpc-calls branch August 6, 2020 18:35

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
