Automatic verification of all files that are being snapshotted with Snapshot/Restore #7159

Merged
merged 1 commit into from Aug 20, 2014

Projects

None yet

6 participants

@imotov
Member
imotov commented Aug 5, 2014

Adds automatic verification of all files that are being snapshotted. Closes #5593

@dadoonet dadoonet added the review label Aug 5, 2014
@dadoonet dadoonet self-assigned this Aug 5, 2014
@kimchy kimchy commented on the diff Aug 5, 2014
...napshots/blobstore/BlobStoreIndexShardRepository.java
@@ -396,8 +392,10 @@ protected BlobStoreIndexShardSnapshots buildBlobStoreIndexShardSnapshots(Immutab
*/
public SnapshotContext(SnapshotId snapshotId, ShardId shardId, IndexShardSnapshotStatus snapshotStatus) {
super(snapshotId, shardId);
- store = indicesService.indexServiceSafe(shardId.getIndex()).shardInjectorSafe(shardId.id()).getInstance(Store.class);
+ IndexService indexService = indicesService.indexServiceSafe(shardId.getIndex());
+ store = indexService.shardInjectorSafe(shardId.id()).getInstance(Store.class);
@kimchy
kimchy Aug 5, 2014 Member

can we cast to InternalIndexShard and call #store() on it? I think its cleaner....

@kimchy
Member
kimchy commented Aug 5, 2014

it looks good to me, maybe @rmuir can have a look at that verifying index input?

@rmuir rmuir commented on the diff Aug 5, 2014
src/main/java/org/elasticsearch/index/store/Store.java
@@ -704,6 +722,134 @@ public void writeBytes(byte[] b, int offset, int length) throws IOException {
}
+ /**
+ * Index input that calculates checksum as data is read from the input.
+ *
+ * This class supports random access (it is possible to seek backward and forward) in order to accommodate retry
+ * mechanism that is used in some repository plugins (S3 for example). However, the checksum is only calculated on
+ * the first read. All consecutive reads of the same data are not used to calculate the checksum.
+ */
+ static class VerifyingIndexInput extends ChecksumIndexInput {
@rmuir
rmuir Aug 5, 2014 Contributor

This guy is fairly tricky: I know there are explicit tests for this indexinput, but do they test that seeking/re-reading does not impact the checksum computation?

@imotov
imotov Aug 5, 2014 Member

@rmuir I added a test to verify that, but I should probably rename readIndexInputFully into something like readIndexInputFullyWithRandomeSeeks to reflect that.

@rmuir
rmuir Aug 19, 2014 Contributor

Yes, that test takes care of my concerns. Maybe we can just add a comment to readBytes that our casts to 'int' are safe because seek catches us up? I stared hard at this thing, I dont see any problems, but it might help someone else. Thanks for doing this, I think it may be useful in the future for other purposes too (e.g. fold logic into ChecksumIndexInput).

@dadoonet dadoonet assigned imotov and unassigned dadoonet Aug 5, 2014
@rmuir
Contributor
rmuir commented Aug 19, 2014

looks good here. sorry for the delay in review!

@imotov imotov Snapshot checksum verification
Adds automatic verification of all files that are being snapshotted. Closes #5593
45ca777
@imotov imotov merged commit 45ca777 into elastic:master Aug 20, 2014
@jpountz jpountz removed the review label Aug 26, 2014
@clintongormley clintongormley changed the title from Snapshot checksum verification to Resiliency: Automatic verification of all files that are being snapshotted with Snapshot/Restore Sep 10, 2014
@clintongormley clintongormley changed the title from Resiliency: Automatic verification of all files that are being snapshotted with Snapshot/Restore to Automatic verification of all files that are being snapshotted with Snapshot/Restore Jun 6, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment