Snapshots are taking more place even if no changes happened #8119

Closed
bobrik opened this issue Oct 16, 2014 · 13 comments · Fixed by #8358

Comments

@bobrik
Contributor

commented Oct 16, 2014

I took 2 snapshots for read-only indices with curator and some indices were snapshotted again even though they didn't have any changes.

Look at the first backup (50 oldest indices):

51G      s3://backups-es-statistics/indices/statistics-20131004/
27G      s3://backups-es-statistics/indices/statistics-20131005/
24G      s3://backups-es-statistics/indices/statistics-20131006/
39G      s3://backups-es-statistics/indices/statistics-20131007/
25G      s3://backups-es-statistics/indices/statistics-20131008/
30G      s3://backups-es-statistics/indices/statistics-20131009/
37G      s3://backups-es-statistics/indices/statistics-20131010/
28G      s3://backups-es-statistics/indices/statistics-20131011/
27G      s3://backups-es-statistics/indices/statistics-20131012/
28G      s3://backups-es-statistics/indices/statistics-20131013/
32G      s3://backups-es-statistics/indices/statistics-20131014/
41G      s3://backups-es-statistics/indices/statistics-20131015/
42G      s3://backups-es-statistics/indices/statistics-20131016/
33G      s3://backups-es-statistics/indices/statistics-20131017/
29G      s3://backups-es-statistics/indices/statistics-20131018/
29G      s3://backups-es-statistics/indices/statistics-20131019/
30G      s3://backups-es-statistics/indices/statistics-20131020/
32G      s3://backups-es-statistics/indices/statistics-20131021/
33G      s3://backups-es-statistics/indices/statistics-20131022/
29G      s3://backups-es-statistics/indices/statistics-20131023/
36G      s3://backups-es-statistics/indices/statistics-20131024/
32G      s3://backups-es-statistics/indices/statistics-20131025/
32G      s3://backups-es-statistics/indices/statistics-20131026/
34G      s3://backups-es-statistics/indices/statistics-20131027/
31G      s3://backups-es-statistics/indices/statistics-20131028/
40G      s3://backups-es-statistics/indices/statistics-20131029/
29G      s3://backups-es-statistics/indices/statistics-20131030/
35G      s3://backups-es-statistics/indices/statistics-20131031/
7G       s3://backups-es-statistics/indices/statistics-20131101/
6G       s3://backups-es-statistics/indices/statistics-20131102/
7G       s3://backups-es-statistics/indices/statistics-20131103/
7G       s3://backups-es-statistics/indices/statistics-20131104/
7G       s3://backups-es-statistics/indices/statistics-20131105/
7G       s3://backups-es-statistics/indices/statistics-20131106/
7G       s3://backups-es-statistics/indices/statistics-20131107/
7G       s3://backups-es-statistics/indices/statistics-20131108/
7G       s3://backups-es-statistics/indices/statistics-20131109/
7G       s3://backups-es-statistics/indices/statistics-20131110/
7G       s3://backups-es-statistics/indices/statistics-20131111/
7G       s3://backups-es-statistics/indices/statistics-20131112/

And the subsequent backup, same indices:

57G      s3://backups-es-statistics/indices/statistics-20131004/
30G      s3://backups-es-statistics/indices/statistics-20131005/
27G      s3://backups-es-statistics/indices/statistics-20131006/
44G      s3://backups-es-statistics/indices/statistics-20131007/
28G      s3://backups-es-statistics/indices/statistics-20131008/
33G      s3://backups-es-statistics/indices/statistics-20131009/
41G      s3://backups-es-statistics/indices/statistics-20131010/
31G      s3://backups-es-statistics/indices/statistics-20131011/
30G      s3://backups-es-statistics/indices/statistics-20131012/
31G      s3://backups-es-statistics/indices/statistics-20131013/
35G      s3://backups-es-statistics/indices/statistics-20131014/
46G      s3://backups-es-statistics/indices/statistics-20131015/
47G      s3://backups-es-statistics/indices/statistics-20131016/
37G      s3://backups-es-statistics/indices/statistics-20131017/
33G      s3://backups-es-statistics/indices/statistics-20131018/
33G      s3://backups-es-statistics/indices/statistics-20131019/
34G      s3://backups-es-statistics/indices/statistics-20131020/
36G      s3://backups-es-statistics/indices/statistics-20131021/
37G      s3://backups-es-statistics/indices/statistics-20131022/
32G      s3://backups-es-statistics/indices/statistics-20131023/
40G      s3://backups-es-statistics/indices/statistics-20131024/
36G      s3://backups-es-statistics/indices/statistics-20131025/
36G      s3://backups-es-statistics/indices/statistics-20131026/
38G      s3://backups-es-statistics/indices/statistics-20131027/
34G      s3://backups-es-statistics/indices/statistics-20131028/
45G      s3://backups-es-statistics/indices/statistics-20131029/
33G      s3://backups-es-statistics/indices/statistics-20131030/
39G      s3://backups-es-statistics/indices/statistics-20131031/
7G       s3://backups-es-statistics/indices/statistics-20131101/
6G       s3://backups-es-statistics/indices/statistics-20131102/
7G       s3://backups-es-statistics/indices/statistics-20131103/
7G       s3://backups-es-statistics/indices/statistics-20131104/
7G       s3://backups-es-statistics/indices/statistics-20131105/
7G       s3://backups-es-statistics/indices/statistics-20131106/
7G       s3://backups-es-statistics/indices/statistics-20131107/
7G       s3://backups-es-statistics/indices/statistics-20131108/
7G       s3://backups-es-statistics/indices/statistics-20131109/
7G       s3://backups-es-statistics/indices/statistics-20131110/
7G       s3://backups-es-statistics/indices/statistics-20131111/
7G       s3://backups-es-statistics/indices/statistics-20131112/
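
To quantify the difference between the two listings, here is a minimal sketch in Python. It assumes the `size  path` format shown above, with every size a whole number of gigabytes and a `G` suffix. The function names are hypothetical, and only three lines from each listing are included as sample input.

```python
def parse_listing(text):
    """Map index name -> size in GB from lines like '51G   s3://bucket/indices/name/'."""
    sizes = {}
    for line in text.strip().splitlines():
        size, path = line.split()
        # Assumption: every size is a whole number of gigabytes, e.g. '51G'.
        sizes[path.rstrip("/").rsplit("/", 1)[-1]] = int(size.rstrip("G"))
    return sizes


def growth(before, after):
    """Return {index: added GB} for indices whose size changed between listings."""
    return {name: after[name] - before[name]
            for name in before
            if name in after and after[name] != before[name]}


first = parse_listing("""
51G      s3://backups-es-statistics/indices/statistics-20131004/
27G      s3://backups-es-statistics/indices/statistics-20131005/
7G       s3://backups-es-statistics/indices/statistics-20131101/
""")
second = parse_listing("""
57G      s3://backups-es-statistics/indices/statistics-20131004/
30G      s3://backups-es-statistics/indices/statistics-20131005/
7G       s3://backups-es-statistics/indices/statistics-20131101/
""")

print(growth(first, second))
```

Indices that did not grow between the two backups (such as statistics-20131101 in this sample) are left out of the result.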

Segments are here:

statistics-20131004 0 p 4.6
statistics-20131004 0 r 4.6
statistics-20131005 0 p 4.4
statistics-20131005 0 p 4.9
statistics-20131005 0 r 4.4
statistics-20131005 0 r 4.9
statistics-20131006 0 p 4.4
statistics-20131006 0 p 4.9
statistics-20131006 0 r 4.4
statistics-20131006 0 r 4.9
statistics-20131007 0 p 4.4
statistics-20131007 0 p 4.9
statistics-20131007 0 r 4.4
statistics-20131007 0 r 4.9
statistics-20131008 0 p 4.4
statistics-20131008 0 p 4.9
statistics-20131008 0 r 4.4
statistics-20131008 0 r 4.9
statistics-20131009 0 p 4.4
statistics-20131009 0 p 4.9
statistics-20131009 0 r 4.4
statistics-20131009 0 r 4.9
statistics-20131010 0 p 4.4
statistics-20131010 0 p 4.9
statistics-20131010 0 r 4.4
statistics-20131010 0 r 4.9
statistics-20131011 0 p 4.4
statistics-20131011 0 p 4.9
statistics-20131011 0 r 4.4
statistics-20131011 0 r 4.9
statistics-20131012 0 p 4.4
statistics-20131012 0 p 4.9
statistics-20131012 0 r 4.4
statistics-20131012 0 r 4.9
statistics-20131013 0 p 4.4
statistics-20131013 0 p 4.9
statistics-20131013 0 r 4.4
statistics-20131013 0 r 4.9
statistics-20131014 0 p 4.4
statistics-20131014 0 p 4.9
statistics-20131014 0 r 4.4
statistics-20131014 0 r 4.9
statistics-20131015 0 p 4.4
statistics-20131015 0 p 4.9
statistics-20131015 0 r 4.4
statistics-20131015 0 r 4.9
statistics-20131016 0 p 4.4
statistics-20131016 0 p 4.9
statistics-20131016 0 r 4.4
statistics-20131016 0 r 4.9
statistics-20131017 0 p 4.4
statistics-20131017 0 p 4.9
statistics-20131017 0 r 4.4
statistics-20131017 0 r 4.9
statistics-20131018 0 p 4.4
statistics-20131018 0 p 4.9
statistics-20131018 0 r 4.4
statistics-20131018 0 r 4.9
statistics-20131019 0 p 4.4
statistics-20131019 0 p 4.9
statistics-20131019 0 r 4.4
statistics-20131019 0 r 4.9
statistics-20131020 0 p 4.4
statistics-20131020 0 p 4.9
statistics-20131020 0 r 4.4
statistics-20131020 0 r 4.9
statistics-20131021 0 p 4.4
statistics-20131021 0 p 4.9
statistics-20131021 0 r 4.4
statistics-20131021 0 r 4.9
statistics-20131022 0 p 4.4
statistics-20131022 0 p 4.9
statistics-20131022 0 r 4.4
statistics-20131022 0 r 4.9
statistics-20131023 0 p 4.4
statistics-20131023 0 p 4.9
statistics-20131023 0 r 4.4
statistics-20131023 0 r 4.9
statistics-20131024 0 p 4.4
statistics-20131024 0 p 4.9
statistics-20131024 0 r 4.4
statistics-20131024 0 r 4.9
statistics-20131025 0 p 4.4
statistics-20131025 0 p 4.9
statistics-20131025 0 r 4.4
statistics-20131025 0 r 4.9
statistics-20131026 0 p 4.4
statistics-20131026 0 p 4.9
statistics-20131026 0 r 4.4
statistics-20131026 0 r 4.9
statistics-20131027 0 p 4.4
statistics-20131027 0 p 4.9
statistics-20131027 0 r 4.4
statistics-20131027 0 r 4.9
statistics-20131028 0 p 4.4
statistics-20131028 0 p 4.9
statistics-20131028 0 r 4.4
statistics-20131028 0 r 4.9
statistics-20131029 0 p 4.4
statistics-20131029 0 p 4.9
statistics-20131029 0 r 4.4
statistics-20131029 0 r 4.9
statistics-20131030 0 p 4.4
statistics-20131030 0 p 4.9
statistics-20131030 0 r 4.4
statistics-20131030 0 r 4.9
statistics-20131031 0 p 4.4
statistics-20131031 0 p 4.9
statistics-20131031 0 r 4.4
statistics-20131031 0 r 4.9
statistics-20131101 0 p 4.4
statistics-20131101 0 p 4.9
statistics-20131101 0 r 4.4
statistics-20131101 0 r 4.9
statistics-20131101 1 p 4.4
statistics-20131101 1 p 4.9
statistics-20131101 1 r 4.4
statistics-20131101 1 r 4.9
statistics-20131101 2 p 4.4
statistics-20131101 2 p 4.9
statistics-20131101 2 r 4.4
statistics-20131101 2 r 4.9
statistics-20131101 3 p 4.4
statistics-20131101 3 p 4.9
statistics-20131101 3 r 4.4
statistics-20131101 3 r 4.9
statistics-20131101 4 p 4.4
statistics-20131101 4 p 4.9
statistics-20131101 4 r 4.4
statistics-20131101 4 r 4.9
statistics-20131102 0 p 4.4
statistics-20131102 0 p 4.9
statistics-20131102 0 r 4.4
statistics-20131102 0 r 4.9
statistics-20131102 1 p 4.4
statistics-20131102 1 p 4.9
statistics-20131102 1 r 4.4
statistics-20131102 1 r 4.9
statistics-20131102 2 p 4.4
statistics-20131102 2 p 4.9
statistics-20131102 2 r 4.4
statistics-20131102 2 r 4.9
statistics-20131102 3 p 4.4
statistics-20131102 3 p 4.9
statistics-20131102 3 r 4.4
statistics-20131102 3 r 4.9
statistics-20131102 4 p 4.4
statistics-20131102 4 p 4.9
statistics-20131102 4 r 4.4
statistics-20131102 4 r 4.9
statistics-20131103 0 p 4.4
statistics-20131103 0 p 4.9
statistics-20131103 0 r 4.4
statistics-20131103 0 r 4.9
statistics-20131103 1 p 4.4
statistics-20131103 1 p 4.9
statistics-20131103 1 r 4.4
statistics-20131103 1 r 4.9
statistics-20131103 2 p 4.4
statistics-20131103 2 p 4.9
statistics-20131103 2 r 4.4
statistics-20131103 2 r 4.9
statistics-20131103 3 p 4.4
statistics-20131103 3 p 4.9
statistics-20131103 3 r 4.4
statistics-20131103 3 r 4.9
statistics-20131103 4 p 4.4
statistics-20131103 4 p 4.9
statistics-20131103 4 r 4.4
statistics-20131103 4 r 4.9
statistics-20131104 0 p 4.4
statistics-20131104 0 p 4.9
statistics-20131104 0 r 4.4
statistics-20131104 0 r 4.9
statistics-20131104 1 p 4.4
statistics-20131104 1 p 4.9
statistics-20131104 1 r 4.4
statistics-20131104 1 r 4.9
statistics-20131104 2 p 4.4
statistics-20131104 2 p 4.9
statistics-20131104 2 r 4.4
statistics-20131104 2 r 4.9
statistics-20131104 3 p 4.4
statistics-20131104 3 p 4.9
statistics-20131104 3 r 4.4
statistics-20131104 3 r 4.9
statistics-20131104 4 p 4.4
statistics-20131104 4 p 4.9
statistics-20131104 4 r 4.4
statistics-20131104 4 r 4.9
statistics-20131105 0 p 4.4
statistics-20131105 0 p 4.9
statistics-20131105 0 r 4.4
statistics-20131105 0 r 4.9
statistics-20131105 1 p 4.4
statistics-20131105 1 p 4.9
statistics-20131105 1 r 4.4
statistics-20131105 1 r 4.9
statistics-20131105 2 p 4.4
statistics-20131105 2 p 4.9
statistics-20131105 2 r 4.4
statistics-20131105 2 r 4.9
statistics-20131105 3 p 4.4
statistics-20131105 3 p 4.9
statistics-20131105 3 r 4.4
statistics-20131105 3 r 4.9
statistics-20131105 4 p 4.4
statistics-20131105 4 p 4.9
statistics-20131105 4 r 4.4
statistics-20131105 4 r 4.9
statistics-20131106 0 p 4.4
statistics-20131106 0 p 4.9
statistics-20131106 0 r 4.4
statistics-20131106 0 r 4.9
statistics-20131106 1 p 4.4
statistics-20131106 1 p 4.9
statistics-20131106 1 r 4.4
statistics-20131106 1 r 4.9
statistics-20131106 2 p 4.4
statistics-20131106 2 p 4.9
statistics-20131106 2 r 4.4
statistics-20131106 2 r 4.9
statistics-20131106 3 p 4.4
statistics-20131106 3 p 4.9
statistics-20131106 3 r 4.4
statistics-20131106 3 r 4.9
statistics-20131106 4 p 4.4
statistics-20131106 4 p 4.9
statistics-20131106 4 r 4.4
statistics-20131106 4 r 4.9
statistics-20131107 0 p 4.4
statistics-20131107 0 p 4.9
statistics-20131107 0 r 4.4
statistics-20131107 0 r 4.9
statistics-20131107 1 p 4.4
statistics-20131107 1 p 4.9
statistics-20131107 1 r 4.4
statistics-20131107 1 r 4.9
statistics-20131107 2 p 4.4
statistics-20131107 2 p 4.9
statistics-20131107 2 r 4.4
statistics-20131107 2 r 4.9
statistics-20131107 3 p 4.4
statistics-20131107 3 p 4.9
statistics-20131107 3 r 4.4
statistics-20131107 3 r 4.9
statistics-20131107 4 p 4.4
statistics-20131107 4 p 4.9
statistics-20131107 4 r 4.4
statistics-20131107 4 r 4.9
statistics-20131108 0 p 4.4
statistics-20131108 0 p 4.9
statistics-20131108 0 r 4.4
statistics-20131108 0 r 4.9
statistics-20131108 1 p 4.4
statistics-20131108 1 p 4.9
statistics-20131108 1 r 4.4
statistics-20131108 1 r 4.9
statistics-20131108 2 p 4.4
statistics-20131108 2 p 4.9
statistics-20131108 2 r 4.4
statistics-20131108 2 r 4.9
statistics-20131108 3 p 4.4
statistics-20131108 3 p 4.9
statistics-20131108 3 r 4.4
statistics-20131108 3 r 4.9
statistics-20131108 4 p 4.4
statistics-20131108 4 p 4.9
statistics-20131108 4 r 4.4
statistics-20131108 4 r 4.9
statistics-20131109 0 p 4.4
statistics-20131109 0 p 4.9
statistics-20131109 0 r 4.4
statistics-20131109 0 r 4.9
statistics-20131109 1 p 4.4
statistics-20131109 1 p 4.9
statistics-20131109 1 r 4.4
statistics-20131109 1 r 4.9
statistics-20131109 2 p 4.4
statistics-20131109 2 p 4.9
statistics-20131109 2 r 4.4
statistics-20131109 2 r 4.9
statistics-20131109 3 p 4.4
statistics-20131109 3 p 4.9
statistics-20131109 3 r 4.4
statistics-20131109 3 r 4.9
statistics-20131109 4 p 4.4
statistics-20131109 4 p 4.9
statistics-20131109 4 r 4.4
statistics-20131109 4 r 4.9
statistics-20131110 0 p 4.4
statistics-20131110 0 p 4.9
statistics-20131110 0 r 4.4
statistics-20131110 0 r 4.9
statistics-20131110 1 p 4.4
statistics-20131110 1 p 4.9
statistics-20131110 1 r 4.4
statistics-20131110 1 r 4.9
statistics-20131110 2 p 4.4
statistics-20131110 2 p 4.9
statistics-20131110 2 r 4.4
statistics-20131110 2 r 4.9
statistics-20131110 3 p 4.4
statistics-20131110 3 p 4.9
statistics-20131110 3 r 4.4
statistics-20131110 3 r 4.9
statistics-20131110 4 p 4.4
statistics-20131110 4 p 4.9
statistics-20131110 4 r 4.4
statistics-20131110 4 r 4.9
statistics-20131111 0 p 4.4
statistics-20131111 0 p 4.9
statistics-20131111 0 r 4.4
statistics-20131111 0 r 4.9
statistics-20131111 1 p 4.4
statistics-20131111 1 p 4.9
statistics-20131111 1 r 4.4
statistics-20131111 1 r 4.9
statistics-20131111 2 p 4.4
statistics-20131111 2 p 4.9
statistics-20131111 2 r 4.4
statistics-20131111 2 r 4.9
statistics-20131111 3 p 4.4
statistics-20131111 3 p 4.9
statistics-20131111 3 r 4.4
statistics-20131111 3 r 4.9
statistics-20131111 4 p 4.4
statistics-20131111 4 p 4.9
statistics-20131111 4 r 4.4
statistics-20131111 4 r 4.9
statistics-20131112 0 p 4.4
statistics-20131112 0 p 4.9
statistics-20131112 0 r 4.4
statistics-20131112 0 r 4.9

Those indices should be roughly the same size in snapshot.

I also took a directory diff between the two consecutive backups:

--- before.3.txt    2014-10-16 21:39:05.559338129 +0400
+++ after.3.txt 2014-10-16 21:56:46.272597922 +0400
@@ -152,6 +152,26 @@
 2014-10-16 17:25       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3m.part0
 2014-10-16 17:25       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3m.part1
 2014-10-16 17:25        21M  s3://backups-es-statistics/indices/statistics-20140108/3/__3m.part2
+2014-10-16 17:48       436   s3://backups-es-statistics/indices/statistics-20140108/3/__3n
+2014-10-16 17:48       179   s3://backups-es-statistics/indices/statistics-20140108/3/__3o
+2014-10-16 17:48       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3p.part0
+2014-10-16 17:49       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3p.part1
+2014-10-16 17:48       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3p.part2
+2014-10-16 17:49       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3p.part3
+2014-10-16 17:48        18M  s3://backups-es-statistics/indices/statistics-20140108/3/__3p.part4
+2014-10-16 17:48       459k  s3://backups-es-statistics/indices/statistics-20140108/3/__3q
+2014-10-16 17:48         7M  s3://backups-es-statistics/indices/statistics-20140108/3/__3r
+2014-10-16 17:48        34   s3://backups-es-statistics/indices/statistics-20140108/3/__3s
+2014-10-16 17:48         2k  s3://backups-es-statistics/indices/statistics-20140108/3/__3t
+2014-10-16 17:48        16M  s3://backups-es-statistics/indices/statistics-20140108/3/__3u
+2014-10-16 17:49        81M  s3://backups-es-statistics/indices/statistics-20140108/3/__3v
+2014-10-16 17:48        57   s3://backups-es-statistics/indices/statistics-20140108/3/__3w
+2014-10-16 17:48         2M  s3://backups-es-statistics/indices/statistics-20140108/3/__3x
+2014-10-16 17:49       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3y.part0
+2014-10-16 17:49        87M  s3://backups-es-statistics/indices/statistics-20140108/3/__3y.part1
+2014-10-16 17:49       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3z.part0
+2014-10-16 17:49       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3z.part1
+2014-10-16 17:49        21M  s3://backups-es-statistics/indices/statistics-20140108/3/__3z.part2
 2014-09-30 19:39       959k  s3://backups-es-statistics/indices/statistics-20140108/3/__4
 2014-09-30 19:39        53M  s3://backups-es-statistics/indices/statistics-20140108/3/__5
 2014-09-30 19:39       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__6.part0
@@ -205,3 +225,4 @@
 2014-10-15 08:46         5k  s3://backups-es-statistics/indices/statistics-20140108/3/snapshot-statistics-2014-10-14
 2014-10-16 08:41         5k  s3://backups-es-statistics/indices/statistics-20140108/3/snapshot-statistics-2014-10-15
 2014-10-16 17:25         5k  s3://backups-es-statistics/indices/statistics-20140108/3/snapshot-statistics-2014-10-16
+2014-10-16 17:49         5k  s3://backups-es-statistics/indices/statistics-20140108/3/snapshot-statistics-2014-10-16-again
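
The amount of data re-uploaded for a shard can be estimated by summing the sizes on the added (`+`) lines of such a diff. The sketch below is one way to do that in Python. It assumes the four-column `date time size path` layout shown above and that the `k`/`M`/`G` suffixes are binary multiples; the helper names are hypothetical and the sample is a small excerpt of the diff.

```python
UNITS = {"k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(size):
    """Convert '436', '459k' or '100M' to bytes (assumes binary multiples)."""
    if size[-1] in UNITS:
        return int(size[:-1]) * UNITS[size[-1]]
    return int(size)


def added_bytes(diff_text):
    """Sum the sizes of files on '+' lines, skipping the '+++' header."""
    total = 0
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            # Fields after the leading '+': date, time, size, path
            _date, _time, size, _path = line[1:].split()
            total += to_bytes(size)
    return total


sample = """\
+++ after.3.txt 2014-10-16 21:56:46.272597922 +0400
 2014-10-16 17:25        21M  s3://backups-es-statistics/indices/statistics-20140108/3/__3m.part2
+2014-10-16 17:48       436   s3://backups-es-statistics/indices/statistics-20140108/3/__3n
+2014-10-16 17:48       179   s3://backups-es-statistics/indices/statistics-20140108/3/__3o
+2014-10-16 17:48       100M  s3://backups-es-statistics/indices/statistics-20140108/3/__3p.part0
"""

print(added_bytes(sample))
```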

Cluster consists of 5 nodes on 1.3.2.

cc @imotov

@imotov

Member

commented Oct 16, 2014

Is it possible that some primary shards for this index switched between snapshots?

@bobrik

Contributor Author

commented Oct 16, 2014

The cluster is healthy, so I don't think so. Look at statistics-20131004: it is 57 GB instead of 7 GB. This is far from normal.

Do I understand this correctly? If a shard is snapshotted, then the replica becomes primary, then a snapshot is taken again, and this procedure is repeated 10 times, is the snapshot going to be 10 times bigger? What if the replica is rebuilt from the primary (the previous replica is totally lost)?

Let me know if I could add more info to help you with this issue.

@imotov

Member

commented Oct 16, 2014

We copy only files that changed since the last snapshot. If you have a replica that was never synced with the primary and the primary went down, it's possible to end up with another copy, but it will stop there. So it can explain a 2x difference, but not a 10x difference in size. Could you send us these two files:

s3://backups-es-statistics/indices/statistics-20140108/3/snapshot-statistics-2014-10-16
s3://backups-es-statistics/indices/statistics-20140108/3/snapshot-statistics-2014-10-16-again

Do you know which version of elasticsearch you had when the index statistics-20140108 was created?

@bobrik

Contributor Author

commented Oct 16, 2014

2x is better than 10x, but can we do better? Is there an API to sync replicas with primaries at the byte level, so that the snapshot is a no-op even if a replica becomes primary?

https://gist.github.com/bobrik/c7ab1e0df88f0585f274 here are the files you requested. Elasticsearch version was from 0.90.x line, but those specific indices were probably restored from snapshot on 1.3.2.

Actually it looks like all problematic indices were restored from snapshot with renaming and good indices weren't restored at all (they were here since 0.90.x). That should help.

@imotov

Member

commented Oct 16, 2014

Which version of elasticsearch did you use to restore these indices from snapshot?

@bobrik

Contributor Author

commented Oct 16, 2014

1.3.2 was used for snapshot and restore. The cluster has been on 1.3.4 since the day of its release, so in all comments above you should assume 1.3.4 instead of 1.3.2.

Current snapshots are made on 1.3.4.

@imotov

Member

commented Oct 16, 2014

So, the index was created with 0.90, upgraded to 1.3.2, then you created a snapshot with 1.3.2, restored this index while renaming it in 1.3.2, and now you are creating snapshots with 1.3.4 and they are duplicated. Is this a correct description? Could you also send us the files snapshot-statistics-2014-10-16 and snapshot-statistics-2014-10-16-again from the root directory of your repository?

@bobrik

Contributor Author

commented Oct 16, 2014

Yes, this looks correct, but the index was upgraded from 0.90 to 1.3.2 through many intermediate versions (1.0, 1.1 and 1.2 held those indices too).

Here are the files from the root of repository: https://gist.github.com/bobrik/d1deb9239c59db998f24

@imotov

Member

commented Oct 16, 2014

I see. One last piece of information (hopefully). Could you also post these two files:

s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-16
s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-16-again
@bobrik

Contributor Author

commented Oct 16, 2014

They are the same:

{"statistics-20140108":{"version":8,"state":"open","settings":{"index.number_of_replicas":"1","index.version.created":"900999","index.number_of_shards":"5","index.uuid":"7OLwrzjOSFemAoXS1XB2qg","index.codec.bloom.load":"false"},"mappings":[{"markers":{"_all":{"enabled":false},"properties":{"@message":{"type":"string"},"@timestamp":{"type":"date","format":"dateOptionalTime"}}}},{"precise":{"_all":{"enabled":false},"_routing":{"required":true,"path":"@key"},"properties":{"@key":{"type":"string","index":"not_analyzed"},"@precise":{"type":"double"},"@timestamp":{"type":"date","format":"dateOptionalTime"}}}},{"events":{"_all":{"enabled":false},"_routing":{"required":true,"path":"@key"},"properties":{"---":{"type":"long"},"@key":{"type":"string","index":"not_analyzed"},"@timestamp":{"type":"date","format":"dateOptionalTime"},"@value":{"type":"long"},"ad":{"type":"string"},"age":{"type":"long"},"app":{"type":"string","index":"not_analyzed"},"cit":{"type":"string","index":"not_analyzed"},"cnt":{"type":"string","index":"not_analyzed"},"con":{"type":"string","index":"not_analyzed"},"cor":{"type":"long"},"cvn":{"type":"string","index":"not_analyzed"},"lng":{"type":"string","index":"not_analyzed"},"mob":{"type":"long"},"mtd":{"type":"string","index":"not_analyzed"},"nic":{"type":"long"},"nov":{"type":"long"},"plc":{"type":"string","index":"not_analyzed"},"plt":{"type":"string","index":"not_analyzed"},"pwr":{"type":"string","index":"not_analyzed"},"ref":{"type":"string","index":"not_analyzed"},"sbs":{"type":"long"},"sex":{"type":"long"},"spc":{"type":"long"},"spl":{"type":"string","index":"not_analyzed"},"tag":{"type":"string","index":"not_analyzed"},"tgt":{"type":"string","index":"not_analyzed"},"trs":{"type":"string","index":"not_analyzed"},"val":{"type":"string","index":"not_analyzed"},"wsh":{"type":"string"}}}}],"aliases":{}}}

All of them:

$ s3cmd ls --list-md5 s3://backups-es-statistics/indices/statistics-20140108/
                       DIR                                     s3://backups-es-statistics/indices/statistics-20140108/0/
                       DIR                                     s3://backups-es-statistics/indices/statistics-20140108/1/
                       DIR                                     s3://backups-es-statistics/indices/statistics-20140108/2/
                       DIR                                     s3://backups-es-statistics/indices/statistics-20140108/3/
                       DIR                                     s3://backups-es-statistics/indices/statistics-20140108/4/
2014-09-30 16:59      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-09-29
2014-10-02 08:30      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-01
2014-10-09 08:29      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-08
2014-10-10 08:34      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-09
2014-10-11 08:34      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-10
2014-10-14 13:43      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-13
2014-10-15 08:36      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-14
2014-10-16 08:31      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-15
2014-10-16 17:15      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-16
2014-10-16 17:39      1849   f6e41305195be8a005a489e36c022dd4  s3://backups-es-statistics/indices/statistics-20140108/snapshot-statistics-2014-10-16-again
@imotov

Member

commented Nov 6, 2014

@bobrik I was able to reproduce the issue. It turned out that the cleanup process in v1.3.0+ at the end of a restore mistakenly deletes information about legacy checksums (checksums for segments created with an old version of elasticsearch). As a result, consecutive snapshots don't store the checksum in the snapshot metadata and have to fall back to creating copies of these old segments again and again.

To reproduce this issue:

  • create an index using elasticsearch v0.90.10
  • upgrade cluster to 1.0.1 and create a snapshot
  • restore the index from the created snapshot into cluster v1.3.4
  • create several snapshots of the restored index in the v1.3.4 cluster
  • observe that each snapshot creates a new copy of all old segment files of the index
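
The mechanism described above can be illustrated with a toy model in Python. This is a sketch only, not Elasticsearch's actual implementation: the `snapshot_shard` function, the list-based repository and the file name are hypothetical, and the model assumes a file is reused only when its name, length and a known checksum all match a previously stored copy.

```python
def snapshot_shard(shard_files, repo):
    """Copy files into repo unless an identical copy is already stored.

    shard_files: {name: (length, checksum or None)}
    repo: list of (name, length, checksum) tuples for stored copies
    Returns the names that had to be copied for this snapshot.
    """
    copied = []
    for name, (length, checksum) in shard_files.items():
        # Without a checksum the file cannot be proven unchanged,
        # so the toy model copies it again (the fallback described above).
        reusable = checksum is not None and (name, length, checksum) in repo
        if not reusable:
            repo.append((name, length, checksum))
            copied.append(name)
    return copied


# A legacy segment whose checksum survived vs. one whose checksum was lost.
healthy = {"_a.fdt": (100, "abc")}
broken = {"_a.fdt": (100, None)}

repo_ok, repo_bad = [], []
print([snapshot_shard(healthy, repo_ok) for _ in range(3)])
print([snapshot_shard(broken, repo_bad) for _ in range(3)])
print(len(repo_ok), len(repo_bad))
```

In this model the repository holding the file with a known checksum stores one copy no matter how many snapshots are taken, while the repository for the file with a lost checksum gains a new copy on every snapshot.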
@bobrik

Contributor Author

commented Nov 6, 2014

Great! Any thoughts about the release where the fix will land?

What would happen with the fix? Just a final snapshot with checksums, or something else?

@imotov

Member

commented Nov 6, 2014

It's going to land in 1.3.6 and 1.4.1. The fix is not going to restore checksums for old segments that were restored with elasticsearch v1.3.0-1.3.5, though. You will need to restore indices with such segments again in v1.3.6+ or upgrade them to the new version using the upgrade API.
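
As a rough sketch of the two remedies mentioned above, the following Python builds the corresponding REST requests without sending them. The repository name `backups`, the snapshot name, the renamed index and the helper names are hypothetical. The `_restore` and `_upgrade` paths follow the Elasticsearch 1.x REST conventions as best understood here, so they should be checked against the documentation for the version in use.

```python
import json


def restore_request(repo, snapshot, index, renamed_index=None):
    """Build (method, path, body) for restoring one index from a snapshot."""
    body = {"indices": index}
    if renamed_index is not None:
        # rename_pattern / rename_replacement restore the index under a new name.
        body["rename_pattern"] = index
        body["rename_replacement"] = renamed_index
    path = f"/_snapshot/{repo}/{snapshot}/_restore"
    return "POST", path, json.dumps(body, sort_keys=True)


def upgrade_request(index):
    """Build (method, path, body) for rewriting old segments via the upgrade API."""
    return "POST", f"/{index}/_upgrade", None


# Hypothetical names, for illustration only.
print(restore_request("backups", "snap-1", "statistics-20140108"))
print(restore_request("backups", "snap-1", "statistics-20140108", "restored-20140108"))
print(upgrade_request("statistics-20140108"))
```

Either request would be sent to a node running a version that contains the fix; sending it is left out so that the sketch has no side effects.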

imotov added a commit to imotov/elasticsearch that referenced this issue Nov 18, 2014
Snapshot/Restore: keep the last legacy checksums file at the end of restore

 This commit fixes the issue caused by restore process deleting all legacy checksum files at the end of restore process. Instead it keeps the latest version of the checksum intact. The issue manifests itself in losing checksum for all legacy files restored into post 1.3.0 cluster, which in turn causes unnecessary snapshotting of files that didn't change.

Fixes elastic#8119

@imotov imotov closed this in #8358 Nov 18, 2014
