Snapshot UUIDs in blob names #19421

@abeyad abeyad commented Jul 13, 2016

This PR builds on #18815 to make snapshot handling safer. #18815 makes blob deletions strict: if the blob doesn't exist, an exception is thrown instead of silently failing. However, this presents an issue in some of our test scenarios (and possibly in real-life occurrences), where the inability to delete a blob because it was already deleted by someone else would cause the snapshot deletion itself to fail, potentially leaving other blobs around. This could even happen if the snap-.dat blob was successfully deleted but the machine crashed before the other blobs in the snapshot could be deleted. Our current mechanism uses a listing of the snap-.dat blobs to determine what the current snapshots are. If deletions can't be relied upon, then we can't be sure that the existence of snap-A.dat in the repository implies that A is a current snapshot.

This PR uses the index generational files from #19002 to retrieve the snapshot UUID for snapshots and name all blobs by the snapshot UUID instead of the snapshot name. If a snapshot A was deleted, then recreated, but not all of A's files were deleted, then we would have to worry about overwriting existing blobs, which is problematic. By naming blobs with the snapshot UUID, we avoid this issue.
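
To illustrate the naming argument above, here is a minimal sketch, not the PR's actual code. The class `SnapshotBlobNames` and both methods are hypothetical, and `java.util.UUID` stands in for whatever UUID generator the repository really uses. It only shows why two snapshots that share the name A cannot collide once the blob name is derived from the UUID.

```java
import java.util.UUID;

/** Hypothetical helper for illustration only; these names do not come from the PR. */
final class SnapshotBlobNames {
    private SnapshotBlobNames() {}

    /** Old scheme: the blob name depends only on the user-visible snapshot name. */
    static String byName(String snapshotName) {
        return "snap-" + snapshotName + ".dat";
    }

    /** New scheme (assumed layout): the blob name depends on the snapshot's UUID. */
    static String byUuid(UUID snapshotUuid) {
        return "snap-" + snapshotUuid + ".dat";
    }

    public static void main(String[] args) {
        // Snapshot "A" is deleted and later recreated: same name, new UUID.
        UUID original = UUID.randomUUID();
        UUID recreated = UUID.randomUUID();
        System.out.println(byName("A") + " vs " + byName("A"));
        System.out.println(byUuid(original) + " vs " + byUuid(recreated));
    }
}
```

With name-based blobs, leftovers from the first A would share a path with the recreated A. With UUID-based blobs the two sets of files are disjoint, so stale blobs cannot be overwritten or mistaken for current ones.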

This PR also introduces a unique index ID (a UUID) for indices in the snapshots, so index folders can be named by the UUID and avoid problems such as #7540.

Relates #18156

@abeyad abeyad added >enhancement WIP :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs :Internal v5.0.0-alpha5 labels Jul 13, 2016
@abeyad abeyad self-assigned this Jul 13, 2016
@abeyad abeyad force-pushed the snapshot-uuids-in-blob-names branch from 045a386 to 4a0ad6e Compare July 14, 2016 00:41
@abeyad abeyad removed the WIP label Jul 14, 2016
@abeyad abeyad changed the title [WIP] Snapshot UUIDs in blob names Snapshot UUIDs in blob names Jul 14, 2016
abeyad commented Jul 14, 2016

@imotov your review would be most appreciated

@gfyoung the elastic:enhancement/snapshot-blob-handling contains your commits and this PR is against that branch

@@ -72,14 +72,14 @@ public InputStream readBlob(String blobName) throws IOException {
logger.trace("readBlob({})", blobName);

if (!blobExists(blobName)) {

I know that this is a part of another PR, but since we are modifying it here - why did we need it in the first place? It looks like we are handling file not found condition when we try to read from the stream anyway. Why do it twice?

@gfyoung gfyoung Jul 20, 2016

@imotov: Azure is a little too lenient about this unfortunately (at least in testing), hence necessitating this check. FWIW, @abeyad could remove it and run the unit tests again that I wrote to see whether we still get those failures.

@gfyoung what do you mean by lenient? What does it do if the file doesn't exist and you are trying to read the file?

@imotov: What I meant was that it didn't throw exceptions during testing (e.g. when I tried deleting a blob that didn't exist). Of course, this was using a mock, so I don't know if that would be the case in real life, but for testing purposes, the existence check was necessary for it to pass.

The better strategy is to change the mocks to conform to how the real APIs behave (i.e. throw a 404 if the blob doesn't exist), in which case the blobExists check is extraneous. I will take care of this in a separate commit as part of this PR.
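
As a rough sketch of what a mock that conforms to the real APIs could look like: the class `StrictInMemoryBlobContainer` below is hypothetical and is not the mock used in the test suite. It assumes that a missing blob is surfaced as `java.nio.file.NoSuchFileException`, the standard Java exception for a missing file, on both read and delete.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.NoSuchFileException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory mock; it fails loudly on missing blobs, as a real store returning 404 would. */
final class StrictInMemoryBlobContainer {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    void writeBlob(String blobName, byte[] data) {
        // Store a copy so later changes to the caller's array do not leak into the mock.
        blobs.put(blobName, data.clone());
    }

    InputStream readBlob(String blobName) throws IOException {
        byte[] data = blobs.get(blobName);
        if (data == null) {
            // Mirrors a 404 from the remote API; callers need no separate existence check.
            throw new NoSuchFileException(blobName);
        }
        return new ByteArrayInputStream(data);
    }

    void deleteBlob(String blobName) throws IOException {
        // Strict delete: removing a blob that is not there is an error, not a no-op.
        if (blobs.remove(blobName) == null) {
            throw new NoSuchFileException(blobName);
        }
    }
}
```

With a mock like this, a caller that already handles the exception from `readBlob` gets the same behavior in tests as against the real service, which is why a preceding existence check adds nothing.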

imotov commented Jul 19, 2016

@abeyad Looks good. I left a few minor comments.

@abeyad abeyad force-pushed the enhancement/snapshot-blob-handling branch from c8f1258 to 6a9f488 Compare July 22, 2016 17:49
Ali Beyad added 4 commits July 22, 2016 13:49
Makes deleting snapshots more robust by first deleting the
snapshot from the index generational file, then handling
individual deletion file errors with log messages instead of
failing the entire operation.
for reads and deletes if the blob does not exist.
to snapshot/restore) and the index to UUID mapping is stored in the
repository index file.
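
The deletion order described in these commit messages (remove the snapshot from the index first, then treat individual blob deletion failures as log messages) could be sketched roughly as follows. Everything here is hypothetical: `BestEffortSnapshotDelete`, the `BlobDeleter` interface, and the use of a plain `Set` as a stand-in for the index generational file are assumptions for illustration, not the repository's real types.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a two-step, best-effort snapshot delete. */
final class BestEffortSnapshotDelete {
    /** Minimal stand-in for a blob container with strict deletes. */
    interface BlobDeleter {
        void deleteBlob(String blobName) throws IOException;
    }

    private BestEffortSnapshotDelete() {}

    /** Returns the names of blobs that could not be deleted. */
    static List<String> deleteSnapshot(Set<String> snapshotIndex, String snapshotUuid,
                                       List<String> blobNames, BlobDeleter deleter) {
        // Step 1: drop the snapshot from the index, so it stops being a current
        // snapshot even if the cleanup below is interrupted.
        if (!snapshotIndex.remove(snapshotUuid)) {
            throw new IllegalArgumentException("unknown snapshot: " + snapshotUuid);
        }
        // Step 2: delete blobs one by one; a failure is logged and does not abort the rest.
        List<String> failed = new ArrayList<>();
        for (String blobName : blobNames) {
            try {
                deleter.deleteBlob(blobName);
            } catch (IOException e) {
                System.err.println("failed to delete blob [" + blobName + "]: " + e);
                failed.add(blobName);
            }
        }
        return failed;
    }
}
```

The point of the ordering is that the index, not the blob listing, decides which snapshots exist, so leftover blobs are only garbage and never a half-deleted snapshot.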
@abeyad abeyad force-pushed the snapshot-uuids-in-blob-names branch from a219c70 to f0c9f6e Compare July 23, 2016 20:20
Ali Beyad added 2 commits July 23, 2016 23:24
Azure and Google cloud blob containers, as the APIs for both return
a 404 in the case of a missing object, which we already handle through
a NoSuchFileFoundException.
@abeyad abeyad force-pushed the snapshot-uuids-in-blob-names branch from f0c9f6e to 299b8a7 Compare July 24, 2016 19:08
abeyad commented Jul 24, 2016

@imotov I've pushed a6f5e0b to address code review comments and 299b8a7 to remove the blobExists check from readBlob methods

@@ -223,6 +228,9 @@

private final ChecksumBlobStoreFormat<BlobStoreIndexShardSnapshots> indexShardSnapshotsFormat;

// flag to indicate if the index gen file has been checked for updating from pre 5.0 versions
private volatile boolean indexGenChecked;

I have a feeling that this might cause some issues. We don't control the underlying storage, which might change on us without our knowledge. I would like to run a couple of scenarios by you when you have a chance to see if we should/can make it more robust.

imotov commented Jul 27, 2016

@abeyad left a couple of comments

upgrades if it determines the read data is in the legacy
format. It writes the upgraded version if it is not a
read-only repository and caches the repository data if
it is a read-only repository.
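
A rough sketch of the behavior this commit message describes: upgrade on read when the data is in the legacy format, write the upgraded form back unless the repository is read-only, and cache it when it is. All names are hypothetical, and the `v1:`/`v2:` prefixes are a made-up way to tell the formats apart. The real code works on `RepositoryData`, not strings.

```java
/** Hypothetical sketch; the format markers and type names are illustrative assumptions. */
final class LegacyAwareReader {
    /** Minimal stand-in for the repository's index file. */
    interface IndexFile {
        String read();
        void write(String contents);
    }

    private final IndexFile indexFile;
    private final boolean readOnly;
    private String cachedUpgrade; // only populated for read-only repositories

    LegacyAwareReader(IndexFile indexFile, boolean readOnly) {
        this.indexFile = indexFile;
        this.readOnly = readOnly;
    }

    // Assumption: legacy data is recognizable from its contents; "v1:" is invented for this sketch.
    static boolean isLegacy(String contents) {
        return contents.startsWith("v1:");
    }

    static String upgrade(String legacy) {
        return "v2:" + legacy.substring("v1:".length());
    }

    synchronized String getRepositoryData() {
        if (readOnly && cachedUpgrade != null) {
            return cachedUpgrade; // cannot persist the upgrade, so reuse the in-memory copy
        }
        String contents = indexFile.read();
        if (!isLegacy(contents)) {
            return contents;
        }
        String upgraded = upgrade(contents);
        if (readOnly) {
            cachedUpgrade = upgraded;  // read-only: keep the upgraded form in memory
        } else {
            indexFile.write(upgraded); // writable: persist it so later reads see the new format
        }
        return upgraded;
    }
}
```

Note that the cached copy in the read-only case never sees later changes to the underlying storage, which is the kind of trade-off raised in the review comment about storage changing without our knowledge.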
@abeyad abeyad force-pushed the snapshot-uuids-in-blob-names branch from 0148175 to 58d6b9d Compare July 29, 2016 02:09
abeyad commented Jul 29, 2016

@imotov I pushed 58d6b9d

- RepositoryData repositoryData = updateIndexGenIfNecessary();
- if (repositoryData == null) {
-     repositoryData = readIndexGen();
+ RepositoryData repositoryData = readIndexGen();

I think if you collapse readIndexGen into getRepositoryData, you will be able to remove the otherwise unnecessary isLegacyFormat flag from RepositoryData.

Good catch @imotov, I'll push a commit with those changes

imotov commented Jul 30, 2016

Left a minor comment. Otherwise, LGTM.

abeyad commented Jul 31, 2016

@imotov I pushed 0f335ac

abeyad commented Jul 31, 2016

@imotov thanks for the review!

@abeyad abeyad merged commit ce88815 into elastic:enhancement/snapshot-blob-handling Jul 31, 2016