Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix inefficient (worst case exponential) loading of snapshot repository #24510

Merged
merged 4 commits into from May 8, 2017

Conversation

@joachimdraeger
Copy link
Contributor

commented May 5, 2017

Ensure that getRepositoryData() is only called once during a list snapshots operation. Fixes #24509.

…ry data

when checking for incompatible snapshots.
@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented May 5, 2017

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

* @param snapshotId snapshot id
* @return information about snapshot
*/
SnapshotInfo getSnapshotInfo(RepositoryData repositoryData, SnapshotId snapshotId);

This comment has been minimized.

Copy link
@joachimdraeger

joachimdraeger May 5, 2017

Author Contributor

I appreciate that requiring to pass RepositoryData on getSnapshotInfo might not be the cleanest solution regarding Repository's API.

@jasontedor jasontedor requested a review from abeyad May 5, 2017
@abeyad

This comment has been minimized.

Copy link
Contributor

commented May 5, 2017

@joachimdraeger thank you for this contribution, and great catch! Indeed this would help solve the performance issues introduced by incompatible-snapshots. However, as you stated, I don't think we want to alter the Repository interface to take the RepositoryData. Rather, I would propose a different solution: in TransportGetSnapshotsAction, we have already retrieved the RepositoryData, so we can pass the incompatible snapshots list from that RepositoryData instance into SnapshotsService#snapshots. The SnapshotsServce#snapshots method can return a SnapshotInfo.incompatible instance, instead of relying on BlobStoreRepository#getSnapshotInfo to do so.

Let me know if you have the time to do this, or if you prefer I take it on. We would want to get this in for 5.4.1 and 5.5.0.

Thanks again!

Regarding your question of running gradle check, what exactly was failing?

Copy link
Contributor

left a comment

I left feedback on a different approach

@joachimdraeger

This comment has been minimized.

Copy link
Contributor Author

commented May 5, 2017

@abeyad thanks for your feedback. I see that the incompatible SnapshotInfo is only used for the list. In get and restore incompatible is determined twice at the moment: directly and in getSnapshotInfo.

I think it would be nicer in general if the source of truth would be getSnapshotInfo with the INCOMPATIBLE state in all cases. However, that would require a much bigger refactoring.

For now I agree that removing the check from 'getSnapshotInfo' and letting 'SnapshotsService#snapshots' check for incompatible snapshots itself seems to be the most viable solution.

I'm happy to give it a go on Monday.

@abeyad

This comment has been minimized.

Copy link
Contributor

commented May 5, 2017

Thank you @joachimdraeger, don't hesitate to ping me on Monday if I can help in any way.

…repository data"

This reverts commit 6eb9837.
@joachimdraeger

This comment has been minimized.

Copy link
Contributor Author

commented May 8, 2017

@abeyad I re-implemented the fix as discussed. I tested the incompatible snapshots listing manually and the performance improvement.

Copy link
Contributor

left a comment

I left a couple small comments. Once fixed, we can get this merged.

// an incompatible snapshot - cannot read its snapshot metadata file, just return
// a SnapshotInfo indicating its incompatible
return SnapshotInfo.incompatible(snapshotId);
}

This comment has been minimized.

Copy link
@abeyad

abeyad May 8, 2017

Contributor

Since getSnapshotInfoInternal (just below) is only used by getSnapshotInfo, we can move the code in getSnapshotInfoInternal directly into getSnapshotInfo and get rid of getSnaphotInfoInternal

@@ -196,6 +206,7 @@ public SnapshotInfo snapshot(final String repositoryName, final SnapshotId snaps
return Collections.unmodifiableList(snapshotList);
}


This comment has been minimized.

Copy link
@abeyad

abeyad May 8, 2017

Contributor

remove extra new line

@abeyad
abeyad approved these changes May 8, 2017
Copy link
Contributor

left a comment

LGTM, I will run the tests locally as well and merge. Thank you for working on this @joachimdraeger!

@abeyad

This comment has been minimized.

Copy link
Contributor

commented May 8, 2017

@elasticmachine test this please

@abeyad abeyad merged commit fec1802 into elastic:master May 8, 2017
2 checks passed
2 checks passed
CLA Commit author has signed the CLA
Details
elasticsearch-ci Build finished.
Details
abeyad pushed a commit that referenced this pull request May 8, 2017
This commit fixes inefficient (worst case exponential) loading of 
snapshot repository data when checking for incompatible snapshots,
that was introduced in #22267.  When getting snapshot information,
getRepositoryData() was called on every snapshot, so if there are
a large number of snapshots in the repository and _all snapshots
were requested, the performance degraded exponentially.  This
commit fixes the issue by only calling getRepositoryData once and
using the data from it in all subsequent calls to get snapshot 
information.

Closes #24509
abeyad pushed a commit that referenced this pull request May 8, 2017
This commit fixes inefficient (worst case exponential) loading of 
snapshot repository data when checking for incompatible snapshots,
that was introduced in #22267.  When getting snapshot information,
getRepositoryData() was called on every snapshot, so if there are
a large number of snapshots in the repository and _all snapshots
were requested, the performance degraded exponentially.  This
commit fixes the issue by only calling getRepositoryData once and
using the data from it in all subsequent calls to get snapshot 
information.

Closes #24509
@abeyad

This comment has been minimized.

Copy link
Contributor

commented May 8, 2017

5.x commit: 690250e
5.4 commit: 21da532

@abeyad

This comment has been minimized.

Copy link
Contributor

commented May 8, 2017

@joachimdraeger the PR has been merged - thanks again!

@talevy

This comment has been minimized.

Copy link
Contributor

commented May 8, 2017

@abeyad are you sure this change was made properly into 5.x?

I am seeing that there is a rogue reference to the getSnapshotInfoInternal method:

https://github.com/elastic/elasticsearch/blob/5.x/core/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java#L369

seems to be the only call to it... but the method definition is deleted.

abeyad pushed a commit that referenced this pull request May 8, 2017
abeyad pushed a commit that referenced this pull request May 8, 2017
@abeyad

This comment has been minimized.

Copy link
Contributor

commented May 8, 2017

@talevy I pushed ccee1b0 and 6a0e070 to fix the issue on 5.x

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 9, 2017
* master:
  Increase compilation limit in ingest tests
  Mark 6.0.0-alpha1 as prerelease
  Updated release notes for 6.0.0-alpha1
  Fix single shard scroll within a cluster with nodes in version >= 5.3 and <= 5.3 (elastic#24512)
  add option for _ingest.timestamp to use new ZonedDateTime (elastic#24030)
  Fixes inefficient loading of snapshot repository data (elastic#24510)
  Scripting: Deprecate file scripts (elastic#24552)
  Remove commented code from ESILRTC
  Ensure test replicas have valid recovery state
  Add global checkpoint assertion in index shard
  Improve bootstrap checks error messages
  Refactor UpdateHelper into unit-testable pieces
  Fix cache expire after access
  Document work-around for jar hell in idea_rt.jar file (elastic#24523)
  Move MockLogAppender to elasticsearch test (elastic#24542)
  Remove gap skipping when opening engine
  documentation of preserve existing settings
  remove duplicated import in AppendProcessor
@clintongormley clintongormley changed the title Fix inefficient (worst case exponential) loading of snapshot reposito… Fix inefficient (worst case exponential) loading of snapshot repository May 15, 2017
@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented May 15, 2017

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@clintongormley clintongormley added v6.0.0-alpha2 and removed v6.0.0 labels Jun 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.