/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
/**
* <p>This package exposes the Elasticsearch Snapshot functionality.</p>
*
* <h2>Preliminaries</h2>
*
* <p>There are two communication channels between all nodes and master in the snapshot functionality:</p>
* <ul>
* <li>The master updates the cluster state by adding, removing or altering the contents of its custom entry
* {@link org.elasticsearch.cluster.SnapshotsInProgress}. All nodes consume the state of the {@code SnapshotsInProgress} and will start or
* abort relevant shard snapshot tasks accordingly.</li>
* <li>Nodes that are executing shard snapshot tasks report either success or failure of their snapshot task by submitting a
* {@link org.elasticsearch.snapshots.UpdateIndexShardSnapshotStatusRequest} to the master node that will update the
* snapshot's entry in the cluster state accordingly.</li>
* </ul>
*
* <h2>Snapshot Creation</h2>
* <p>Snapshots are created by the following sequence of events:</p>
* <ol>
* <li>First the {@link org.elasticsearch.snapshots.SnapshotsService} determines the primary shards' assignments for all indices that are
* being snapshotted and creates a {@code SnapshotsInProgress.Entry} with state {@code STARTED}. The entry holds a map of
* {@link org.elasticsearch.index.shard.ShardId} to {@link org.elasticsearch.cluster.SnapshotsInProgress.ShardSnapshotStatus} that tracks
* which node is to snapshot which shard. All shard snapshots are executed on the shard's primary node. Thus all shards
* for which the primary node was found to have a healthy copy of the shard are marked as being in state {@code INIT} in this map. If the
* primary for a shard is unassigned, it is marked as {@code MISSING} in this map. In case the primary is initializing at this point, it is
* marked as in state {@code WAITING}. In case a shard's primary is relocated at any point after its {@code SnapshotsInProgress.Entry} was
* created and the shard has thus been assigned to a specific cluster node, that shard's snapshot will fail and move to state
* {@code FAILED}.</li>
*
* <li>The new {@code SnapshotsInProgress.Entry} is then observed by
* {@link org.elasticsearch.snapshots.SnapshotShardsService#clusterChanged} on all nodes. Since the entry is in state {@code STARTED},
* the {@code SnapshotShardsService} checks whether any local primary shards are to be snapshotted (signaled by the shard's snapshot state
* being {@code INIT}). For those local primary shards found in state {@code INIT}, the snapshot process of writing the shard's data files
* to the snapshot's {@link org.elasticsearch.repositories.Repository} is executed. Once the snapshot execution finishes for a shard, an
* {@code UpdateIndexShardSnapshotStatusRequest} is sent to the master node signaling either status {@code SUCCESS} or {@code FAILED}.
* The master node then updates the shard's state in the snapshot's {@code SnapshotsInProgress.Entry} whenever it receives such an
* {@code UpdateIndexShardSnapshotStatusRequest}.</li>
*
* <li>If, as a result of the received status update requests, all shards in the cluster state are in a completed state, i.e. are marked
* as either {@code SUCCESS}, {@code FAILED} or {@code MISSING}, the {@code SnapshotsService} will update the state of the {@code Entry}
* itself and mark it as {@code SUCCESS}. At the same time {@link org.elasticsearch.snapshots.SnapshotsService#endSnapshot} is executed,
* writing to the repository the metadata necessary to finalize the snapshot in the repository.</li>
*
* <li>After writing the final metadata to the repository, a cluster state update to remove the snapshot from the cluster state is
* submitted and the removal of the snapshot's {@code SnapshotsInProgress.Entry} from the cluster state completes the snapshot process.
* </li>
* </ol>
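*
* <p>The shard state bookkeeping from steps 1 and 3 above can be sketched roughly as follows. This is a simplified illustration and not
* the actual implementation: {@code ShardStateSketch}, {@code PrimaryStatus} and all method names are hypothetical, and treating a
* started primary as a "healthy copy" is an assumption made for the example.</p>
*
```java
import java.util.List;

// Hypothetical, simplified model; these are not the real Elasticsearch types.
final class ShardStateSketch {

    // Shard-level states named in the snapshot creation steps.
    enum ShardState { INIT, WAITING, MISSING, SUCCESS, FAILED }

    // Assumed summary of where a shard's primary currently stands.
    enum PrimaryStatus { STARTED, INITIALIZING, UNASSIGNED }

    // Step 1: derive the initial shard snapshot state from the primary's status.
    static ShardState initialState(PrimaryStatus primary) {
        switch (primary) {
            case STARTED:
                return ShardState.INIT;
            case INITIALIZING:
                return ShardState.WAITING;
            default:
                return ShardState.MISSING;
        }
    }

    // Step 3: a shard counts as completed once it is SUCCESS, FAILED or MISSING.
    static boolean isCompleted(ShardState state) {
        return state == ShardState.SUCCESS || state == ShardState.FAILED || state == ShardState.MISSING;
    }

    // The snapshot can be finalized once every shard is completed.
    static boolean allShardsCompleted(List<ShardState> shards) {
        return shards.stream().allMatch(ShardStateSketch::isCompleted);
    }

    public static void main(String[] args) {
        System.out.println(initialState(PrimaryStatus.INITIALIZING));
        System.out.println(allShardsCompleted(List.of(ShardState.SUCCESS, ShardState.INIT)));
    }
}
```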
*
* <h2>Deleting a Snapshot</h2>
*
* <p>Deleting a snapshot can take the form of either simply deleting it from the repository or (if it has not completed yet) aborting it
* and subsequently deleting it from the repository.</p>
*
* <h2>Aborting a Snapshot</h2>
*
* <ol>
* <li>Aborting a snapshot starts by updating the state of the snapshot's {@code SnapshotsInProgress.Entry} to {@code ABORTED}.</li>
*
* <li>The snapshot's state change to {@code ABORTED} in the cluster state is then picked up by the {@code SnapshotShardsService} on all
* nodes. Those nodes that have shard snapshot actions for the snapshot assigned to them will abort them and notify the master about the
* shards' snapshot status accordingly. If the shard snapshot action completed or was in state {@code FINALIZE} when the abort was
* registered by the {@code SnapshotShardsService}, then the shard's state will be reported to the master as {@code SUCCESS}.
* Otherwise, it will be reported as {@code FAILED}.</li>
*
* <li>Once all the shards are reported to the master as either {@code SUCCESS} or {@code FAILED}, the {@code SnapshotsService} on the
* master will finish the snapshot process: all shards' states are now completed, so the snapshot can be completed as explained in
* point 4 of the snapshot creation section above.</li>
* </ol>
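*
* <p>The reporting rule from step 2 of the abort sequence can be illustrated with a small sketch. The class {@code AbortReportSketch}
* and the {@code LocalStage} enum are hypothetical. Only the {@code FINALIZE} stage and the reported {@code SUCCESS} and {@code FAILED}
* states are taken from the description above.</p>
*
```java
// Hypothetical, simplified model; these are not the real Elasticsearch types.
final class AbortReportSketch {

    // Assumed stages of a local shard snapshot task. Only FINALIZE is named in the text.
    enum LocalStage { RUNNING, FINALIZE, DONE }

    // States a data node reports back to the master.
    enum ReportedState { SUCCESS, FAILED }

    // A shard snapshot that already completed or reached FINALIZE when the abort
    // was registered is reported as SUCCESS, anything else as FAILED.
    static ReportedState reportOnAbort(LocalStage stageWhenAbortRegistered) {
        boolean finishedOrFinalizing = stageWhenAbortRegistered == LocalStage.DONE
            || stageWhenAbortRegistered == LocalStage.FINALIZE;
        return finishedOrFinalizing ? ReportedState.SUCCESS : ReportedState.FAILED;
    }

    public static void main(String[] args) {
        for (LocalStage stage : LocalStage.values()) {
            System.out.println(stage + " is reported as " + reportOnAbort(stage));
        }
    }
}
```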
*
* <h2>Deleting a Snapshot from a Repository</h2>
*
* <ol>
* <li>Assuming there are no entries in the cluster state's {@code SnapshotsInProgress}, deleting a snapshot starts by the
* {@code SnapshotsService} creating an entry for deleting the snapshot in the cluster state's
* {@link org.elasticsearch.cluster.SnapshotDeletionsInProgress}.</li>
*
* <li>Once the cluster state contains the deletion entry in {@code SnapshotDeletionsInProgress} the {@code SnapshotsService} will invoke
* {@link org.elasticsearch.repositories.Repository#deleteSnapshots} for the given snapshot, which will remove files associated with the
* snapshot from the repository as well as update its metadata to reflect the deletion of the snapshot.</li>
*
* <li>After the deletion of the snapshot's data from the repository finishes, the {@code SnapshotsService} will submit a cluster state
* update to remove the deletion's entry in {@code SnapshotDeletionsInProgress} which concludes the process of deleting a snapshot.</li>
* </ol>
*
* <h2>Cloning a Snapshot</h2>
*
* <p>Cloning part of a snapshot is a process executed entirely on the master node. On a high level, the process of cloning a snapshot is
* analogous to that of creating a snapshot from data in the cluster except that the source of data files is the snapshot repository
* instead of the data nodes. It begins with cloning all shards and then finalizes the cloned snapshot the same way a normal snapshot would
* be finalized. Concretely, it is executed as follows:</p>
*
* <ol>
* <li>First, {@link org.elasticsearch.snapshots.SnapshotsService#cloneSnapshot} is invoked which will place a placeholder entry into
* {@code SnapshotsInProgress} that does not yet contain any shard clone assignments. Note that unlike in the case of snapshot
* creation, the shard level clone tasks in
* {@link org.elasticsearch.cluster.SnapshotsInProgress.Entry#shardSnapshotStatusByRepoShardId()} are not created in the initial cluster
* state update as is done for shard snapshot assignments in {@link org.elasticsearch.cluster.SnapshotsInProgress.Entry#shards}. This is
* because shard snapshot assignments are computed purely from information in the current cluster state while shard clone
* assignments require information to be read from the repository, which is too slow to be done inside a cluster state
* update. Loading this information before creating a task in the cluster state runs the risk of race conditions in which the source
* snapshot is deleted before the clone task is enqueued in the cluster state.</li>
* <li>Once a placeholder task for the clone operation is put into the cluster state, we must determine the number of shards in each
* index that is to be cloned as well as ensure the health of the index snapshots in the source snapshot. In order to determine the
* shard count for each index that is to be cloned, we load the index metadata for each such index using the repository's
* {@link org.elasticsearch.repositories.Repository#getSnapshotIndexMetaData} method. In order to ensure the health of the source index
* snapshots, we load the {@link org.elasticsearch.snapshots.SnapshotInfo} for the source snapshot and check for shard snapshot
* failures of the relevant indices.</li>
* <li>Once all shard counts are known and the health of all source indices' data has been verified, we populate the
* {@code SnapshotsInProgress.Entry#clones} map for the clone operation with the relevant shard clone tasks.</li>
* <li>After the clone tasks have been added to the {@code SnapshotsInProgress.Entry}, master executes them on its snapshot thread-pool
* by invoking {@link org.elasticsearch.repositories.Repository#cloneShardSnapshot} for each shard that is to be cloned. Each completed
* shard snapshot triggers a call to the {@link org.elasticsearch.snapshots.SnapshotsService#masterServiceTaskQueue} which updates the
* clone's {@code SnapshotsInProgress.Entry} to mark the shard clone operation completed.</li>
* <li>Once all the entries in {@code SnapshotsInProgress.Entry#clones} have completed, the clone is finalized just like any other
* snapshot through {@link org.elasticsearch.snapshots.SnapshotsService#endSnapshot}. The only difference is that the metadata that
* is written out for indices and the global metadata are read from the source snapshot in the repository instead of the cluster state.
* </li>
* </ol>
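*
* <p>Steps 2 and 3 of the clone sequence (check the source indices, then expand shard counts into shard clone tasks) might look roughly
* like the sketch below. Everything in it is hypothetical: {@code CloneTaskSketch}, the string shard ids and the two input collections
* stand in for the index metadata and {@code SnapshotInfo} that the real code loads from the repository.</p>
*
```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; these are not the real Elasticsearch types.
final class CloneTaskSketch {

    // shardCounts: index name to shard count, as read from the source index metadata.
    // indicesWithFailures: indices whose shard snapshots failed in the source snapshot.
    static List<String> shardCloneTasks(Map<String, Integer> shardCounts, Set<String> indicesWithFailures) {
        List<String> tasks = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : shardCounts.entrySet()) {
            String index = entry.getKey();
            // Health check: refuse to clone an index whose source shard snapshots failed.
            if (indicesWithFailures.contains(index)) {
                throw new IllegalStateException("source snapshot has shard failures for index [" + index + "]");
            }
            // One clone task per shard of the index.
            for (int shard = 0; shard < entry.getValue(); shard++) {
                tasks.add(index + "[" + shard + "]");
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        System.out.println(shardCloneTasks(Map.of("logs", 2), Set.of()));
    }
}
```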
*
* <h2>Concurrent Snapshot Operations</h2>
*
* <p>Snapshot create and delete operations may be started concurrently. Operations targeting different repositories run independently of
* each other. Multiple operations targeting the same repository are executed according to the following rules:</p>
*
* <h3>Concurrent Snapshot Creation</h3>
*
* <p>If multiple snapshot creation jobs are started at the same time, the data-node operations of multiple snapshots may run in parallel
* across different shards. If multiple snapshots want to snapshot a certain shard, then the shard snapshots for that shard will be
* executed one by one. This is enforced by the master node setting the shard's snapshot state to
* {@link org.elasticsearch.cluster.SnapshotsInProgress.ShardSnapshotStatus#UNASSIGNED_QUEUED} for all but one snapshot. The order of
* operations on a single shard is given by the order in which the snapshots were started.</p>
*
* <p>As soon as all shards for a given snapshot have finished, it will be finalized as explained above. Finalization will happen one
* snapshot at a time, working in the order in which snapshots had their shards completed.</p>
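*
* <p>The per-shard ordering described above can be pictured as one FIFO queue per shard. The sketch below is illustrative only:
* {@code ShardQueueSketch} and its methods are hypothetical, and shards and snapshots are represented by plain strings. Only the
* {@code UNASSIGNED_QUEUED} and {@code INIT} state names come from this document.</p>
*
```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model; these are not the real Elasticsearch types.
final class ShardQueueSketch {

    enum State { INIT, UNASSIGNED_QUEUED }

    // Per shard: the snapshots that want it, in the order they were started.
    // The head of each queue is the snapshot currently allowed to run on that shard.
    private final Map<String, Deque<String>> queues = new HashMap<>();

    // Registers a snapshot for a shard and returns the state its shard entry starts in.
    State enqueue(String shard, String snapshot) {
        Deque<String> queue = queues.computeIfAbsent(shard, key -> new ArrayDeque<>());
        queue.addLast(snapshot);
        return queue.size() == 1 ? State.INIT : State.UNASSIGNED_QUEUED;
    }

    // Marks the running shard snapshot as finished and returns the snapshot
    // that may run on this shard next, or null if none is queued.
    String complete(String shard) {
        Deque<String> queue = queues.get(shard);
        if (queue == null || queue.isEmpty()) {
            return null;
        }
        queue.removeFirst();
        return queue.peekFirst();
    }

    public static void main(String[] args) {
        ShardQueueSketch sketch = new ShardQueueSketch();
        System.out.println(sketch.enqueue("index[0]", "snap-1"));
        System.out.println(sketch.enqueue("index[0]", "snap-2"));
        System.out.println(sketch.complete("index[0]"));
    }
}
```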
*
* <h3>Concurrent Snapshot Deletes</h3>
*
* <p>A snapshot delete will be executed as soon as there are no more shard snapshots or snapshot finalizations running for a given
* repository. Before a delete is executed on the repository it will be set to state
* {@link org.elasticsearch.cluster.SnapshotDeletionsInProgress.State#STARTED}. If it cannot be executed when it is received it will be
* set to state {@link org.elasticsearch.cluster.SnapshotDeletionsInProgress.State#WAITING} initially.
* If a delete is received for a given repository while there is already an ongoing delete for the same repository, there are two possible
* scenarios:</p>
* <ol>
* <li>If the existing delete is in state {@code STARTED} (i.e. already running on the repository) then the new delete will be added in
* state {@code WAITING} and will be executed after the current delete. The only exception is the case where the new delete covers
* the exact same snapshots as the already running delete. In this case no new delete operation is added and the second delete request
* will simply wait for the existing delete to return.</li>
* <li>If the existing delete is in state {@code WAITING} then the existing
* {@link org.elasticsearch.cluster.SnapshotDeletionsInProgress.Entry} in the cluster state will be updated to cover both the snapshots
* in the existing delete as well as additional snapshots that may be found in the second delete request.</li>
* </ol>
*
* <p>In either of the above scenarios, in-progress snapshots will be aborted in the same cluster state update that adds a delete to the
* cluster state, if a delete applies to them.</p>
*
* <p>If a snapshot request is received while there already is a delete in the cluster state for the same repository, that snapshot will
* not start doing any shard snapshots until the delete has been executed.</p>
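*
* <p>The two scenarios for a second delete on the same repository can be summarized in a short sketch. It is not the actual
* implementation: {@code DeleteMergeSketch}, {@code Deletion} and {@code Outcome} are hypothetical, snapshots are plain strings, and the
* state of a delete that is already running on the repository is assumed to be {@code STARTED} as described for
* {@code SnapshotDeletionsInProgress.State} above.</p>
*
```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, simplified model; these are not the real Elasticsearch types.
final class DeleteMergeSketch {

    enum State { WAITING, STARTED }

    // What happens to a newly received delete request.
    enum Outcome { ADDED_WAITING, JOINED_RUNNING, MERGED_INTO_WAITING }

    // Stand-in for a deletion entry in the cluster state.
    static final class Deletion {
        final State state;
        final Set<String> snapshots;

        Deletion(State state, Set<String> snapshots) {
            this.state = state;
            this.snapshots = new LinkedHashSet<>(snapshots);
        }
    }

    static Outcome onNewDelete(Deletion existing, Set<String> requested) {
        if (existing.state == State.STARTED) {
            // Scenario 1: the existing delete already runs on the repository.
            // An identical request just waits for it, anything else is queued behind it.
            return existing.snapshots.equals(requested) ? Outcome.JOINED_RUNNING : Outcome.ADDED_WAITING;
        }
        // Scenario 2: the existing delete is still waiting, so it absorbs the new snapshots.
        existing.snapshots.addAll(requested);
        return Outcome.MERGED_INTO_WAITING;
    }

    public static void main(String[] args) {
        Deletion waiting = new Deletion(State.WAITING, Set.of("snap-1"));
        System.out.println(onNewDelete(waiting, Set.of("snap-2")));
        System.out.println(waiting.snapshots.size());
    }
}
```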
*/
package org.elasticsearch.snapshots;