Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resiliency: Backport Recovery / Snapshot file identity improvements to 1.3 #7857

Merged
merged 9 commits into from Sep 24, 2014

Conversation

s1monw
Copy link
Contributor

@s1monw s1monw commented Sep 24, 2014

We saw lots of issues with hash collisions which have been fixed in 1.4 but back then it seemed like an improvement rather than a bugfix. I think in the meanwhile we should really declare it as a bugfix and port it into 1.3 This backport PR includes fixes for #7434 & #7351 as well as related fixes.

I ran BWC tests against 1.3.2 and 1.2.4 as well as the entire test suite multiple times but I think this need a deep review.

@s1monw s1monw added the review label Sep 24, 2014
@@ -445,7 +447,15 @@ public void snapshot(SnapshotIndexCommit snapshotIndexCommit) {
// }

BlobStoreIndexShardSnapshot.FileInfo fileInfo = snapshots.findPhysicalIndexFile(fileName);

try {
// in 1.4.0 we added additional hashes for .si / segments_N files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update this comment to say 1.3.3? I suppose we should update master/1.x too, so we have accurate "history" in case we are debugging something.

s1monw added a commit to s1monw/elasticsearch that referenced this pull request Sep 24, 2014
@rmuir
Copy link
Contributor

rmuir commented Sep 24, 2014

looks good, thank you simon!

@imotov
Copy link
Contributor

imotov commented Sep 24, 2014

LGTM

imotov and others added 9 commits September 24, 2014 15:24
This commit changes the way how files are selected for retransmission
on recovery / restore. Today this happens on a per-file basis where the
rather weak checksum and the file length in bytes is compared to check if
a file is identical. This is prone to fail in the case of a checksum collision
which can happen under certain circumstances.
The changes in this commit move the identity comparsion to a per-commit / per-segment
level where files are only treated as identical iff all the other files in the
commit / segment are the same. This "all or nothing" strategy is reducing the chance for
a collision dramatically since we also use a strong hash to identify commits / segments
based on the content of the ".si" / "segments.N" file.

Closes elastic#7351
Due to additional safety added in elastic#7351 we compute now a strong hash for
.si and segments_N files which are compared during snapshot / restore.
Old snapshots don't have this hash which can cause unnecessary copying
of large amount of data. This commit adds the ability to fetch this
hash from the blob store if needed.

Closes elastic#7434
The comparison and read code in the BlobStoreIndexShardRepository
used the physicalName and Name in reverse order. This caused
SnapshotBackwardsCompatibilityTest to fail.
We backported this feature to 1.3.3 as well so comments should be
updated.
s1monw added a commit that referenced this pull request Sep 24, 2014
s1monw added a commit that referenced this pull request Sep 24, 2014
@s1monw s1monw merged commit 004afa5 into elastic:1.3 Sep 24, 2014
@s1monw s1monw removed the review label Sep 24, 2014
@clintongormley clintongormley changed the title Backport Recovery / Snapshot file identity improvements to 1.3 Resiliency: Backport Recovery / Snapshot file identity improvements to 1.3 Sep 26, 2014
mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants