.. _replica-set-oplog:

=================
Replica Set Oplog
=================

.. default-domain:: mongodb

.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 1
   :class: singlecol
The :term:`oplog` (operations log) is a special :term:`capped
collection` that keeps a rolling record of all operations that modify
the data stored in your databases.

.. note::

   Starting in MongoDB 4.0, unlike other capped collections, the oplog
   can grow past its configured size limit to avoid deleting the
   :data:`majority commit point <replSetGetStatus.optimes.lastCommittedOpTime>`.

MongoDB applies database operations on the :term:`primary` and then
records the operations on the primary's oplog. The :term:`secondary`
members then copy and apply these operations in an asynchronous
process. All replica set members contain a copy of the oplog, in the
:data:`local.oplog.rs` collection, which allows them to maintain the
current state of the database.

To facilitate replication, all replica set members send heartbeats
(pings) to all other members. Any :term:`secondary` member can import
oplog entries from any other member.

Each operation in the oplog is :term:`idempotent`. That is, oplog
operations produce the same results whether applied once or multiple
times to the target dataset.
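The value of idempotency can be sketched in a few lines of Python. This is an illustration only, not MongoDB's actual oplog entry format: an increment is not idempotent, so a replicated change is recorded as "set the field to its resulting value", which yields the same state whether replayed once or twice.

```python
# Sketch (not MongoDB's actual oplog format): why idempotent entries matter.
# An increment applied twice would double-count, so the change is recorded
# as a "set to resulting value" operation instead.

def apply_set(doc, field, value):
    """Apply a 'set field to value' oplog-style entry to a document."""
    doc = dict(doc)
    doc[field] = value
    return doc

doc = {"_id": 1, "count": 5}

# The equivalent of incrementing count by 1, recorded idempotently
# as "set count to 6":
once = apply_set(doc, "count", 6)
twice = apply_set(once, "count", 6)

# Applying the entry once or twice produces the same state.
assert once == twice == {"_id": 1, "count": 6}
```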
.. _replica-set-oplog-sizing:

Oplog Size
----------

When you start a replica set member for the first time, MongoDB creates
an oplog of a default size if you do not specify the oplog size. [#oplog]_

For Unix and Windows systems
   The default oplog size depends on the storage engine:

   .. list-table::
      :widths: 30 30 20 20
      :header-rows: 1

      * - Storage Engine
        - Default Oplog Size
        - Lower Bound
        - Upper Bound

      * - :doc:`/core/inmemory`
        - 5% of physical memory
        - 50 MB
        - 50 GB

      * - :doc:`/core/wiredtiger`
        - 5% of free disk space
        - 990 MB
        - 50 GB

      * - :doc:`/core/mmapv1`
        - 5% of free disk space
        - 990 MB
        - 50 GB

For 64-bit macOS systems
   The default oplog size is 192 MB of either physical memory or free
   disk space depending on the storage engine:

   .. list-table::
      :widths: 50 50
      :header-rows: 1

      * - Storage Engine
        - Default Oplog Size

      * - :doc:`/core/inmemory`
        - 192 MB of physical memory

      * - :doc:`/core/wiredtiger`
        - 192 MB of free disk space

      * - :doc:`/core/mmapv1`
        - 192 MB of free disk space

In most cases, the default oplog size is sufficient. For example, if an
oplog is 5% of free disk space and fills up in 24 hours of operations,
then secondaries can stop copying entries from the oplog for up to 24
hours without becoming too stale to continue replicating. However, most
replica sets have much lower operation volumes, and their oplogs can
hold much higher numbers of operations.
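The sizing rule in the table above can be expressed as a simple clamp. The following Python sketch shows the Unix/Windows WiredTiger default (5% of free disk space, bounded by 990 MB and 50 GB); it is illustrative only, since the server computes this internally:

```python
# Illustrative only: the default oplog sizing rule for WiredTiger on
# Unix/Windows, as described above -- 5% of free disk space, clamped
# to the range [990 MB, 50 GB].

MB = 1024 * 1024
GB = 1024 * MB

def default_oplog_size_bytes(free_disk_bytes):
    """Compute the default oplog size for a given amount of free disk."""
    five_percent = free_disk_bytes * 0.05
    return min(max(five_percent, 990 * MB), 50 * GB)

# 100 GB of free disk yields a 5 GB oplog.
assert default_oplog_size_bytes(100 * GB) == 5 * GB
# A small disk is clamped up to the 990 MB lower bound.
assert default_oplog_size_bytes(1 * GB) == 990 * MB
# A very large disk is clamped down to the 50 GB upper bound.
assert default_oplog_size_bytes(10000 * GB) == 50 * GB
```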
Before :binary:`~bin.mongod` creates an oplog, you can specify its size
with the :setting:`~replication.oplogSizeMB` option. Once you have
started a replica set member for the first time, use the
:dbcommand:`replSetResizeOplog` administrative command to change the
oplog size. :dbcommand:`replSetResizeOplog` enables you to resize the
oplog dynamically without restarting the :binary:`~bin.mongod` process.

.. [#oplog]

   .. include:: /includes/fact-oplog-size.rst
Workloads that Might Require a Larger Oplog Size
------------------------------------------------

If you can predict that your replica set's workload will resemble one
of the following patterns, then you might want to create an oplog that
is larger than the default. Conversely, if your application
predominantly performs reads with a minimal amount of write operations,
a smaller oplog may be sufficient.

The following workloads might require a larger oplog size.

Updates to Multiple Documents at Once
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The oplog must translate multi-updates into individual operations in
order to maintain :term:`idempotency <idempotent>`. This can use a
great deal of oplog space without a corresponding increase in data size
or disk use.
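The fan-out described above can be sketched as follows. This Python sketch uses a simplified entry shape, not the real oplog format, to show how one logical multi-update produces one oplog entry per matched document:

```python
# Sketch (simplified, not the real oplog entry format): a single
# multi-document update fans out into one oplog entry per matched
# document, so oplog usage grows with the match count even though
# the on-disk data size barely changes.

def multi_update_to_oplog(docs, predicate, field, value):
    """Translate one multi-update into per-document oplog entries."""
    return [
        {"op": "u", "_id": d["_id"], "set": {field: value}}
        for d in docs
        if predicate(d)
    ]

docs = [{"_id": i, "status": "pending"} for i in range(1000)]
entries = multi_update_to_oplog(
    docs, lambda d: d["status"] == "pending", "status", "done"
)

# One logical update produced 1000 individual oplog entries.
assert len(entries) == 1000
assert entries[0] == {"op": "u", "_id": 0, "set": {"status": "done"}}
```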
Deletions Equal the Same Amount of Data as Inserts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you delete roughly the same amount of data as you insert, the
database does not grow significantly in disk use, but the size of the
operation log can be quite large.

Significant Number of In-Place Updates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If a significant portion of the workload consists of updates that do
not increase the size of the documents, the database records a large
number of operations but does not change the quantity of data on disk.
Oplog Status
------------

To view oplog status, including the size and the time range of
operations, issue the :method:`rs.printReplicationInfo()` method. For
more information on oplog status, see
:ref:`replica-set-troubleshooting-check-oplog-size`.

Under various exceptional situations, updates to a :term:`secondary's
<secondary>` oplog might lag behind the desired performance time. Use
:method:`db.getReplicationInfo()` from a secondary member and the
:doc:`replication status </reference/method/db.getReplicationInfo>`
output to assess the current state of replication and determine if
there is any unintended replication delay.

See :ref:`Replication Lag <replica-set-replication-lag>` for more
information.
.. _slow-oplog:

Slow Oplog Application
----------------------

Starting in version 4.0.6, secondary members of a replica set log
oplog entries that take longer than the slow operation threshold to
apply. These messages are :option:`logged <mongod --logpath>` for the
secondaries under the :data:`REPL` component with the text ``applied
op: <oplog entry> took <num>ms``.

.. code-block:: none

   2018-11-16T12:31:35.886-0500 I REPL [repl writer worker 13] applied op: command { ... }, took 112ms

Slow oplog application logging on secondaries is:

- Not affected by the :setting:`~operationProfiling.slowOpSampleRate`;
  i.e. all slow oplog entries are logged by the secondary.

- Not affected by the
  :parameter:`logLevel`/:setting:`systemLog.verbosity` level (or the
  :setting:`systemLog.component.replication.verbosity` level); i.e. for
  oplog entries, the secondary logs only the slow oplog entries.
  Increasing the verbosity level does not log all oplog entries.

- Not captured by the :doc:`profiler
  </tutorial/manage-the-database-profiler>` and not affected by the
  profiling level.

For more information on setting the slow operation threshold, see:

- :option:`mongod --slowms`

- :setting:`~operationProfiling.slowOpThresholdMs`

- The :dbcommand:`profile` command or :method:`db.setProfilingLevel()`
  shell helper method.
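When scanning logs for these messages, the reported duration can be pulled out with a simple pattern. The regex below is an illustration based on the message format shown above, not an official parsing API:

```python
import re

# Sketch: extract the duration from a slow-apply log line of the form
# "applied op: <oplog entry> took <num>ms". The regex is illustrative,
# based on the message format described above.

SLOW_APPLY = re.compile(r"applied op: (?P<op>.+), took (?P<ms>\d+)ms")

line = ("2018-11-16T12:31:35.886-0500 I REPL [repl writer worker 13] "
        "applied op: command { ... }, took 112ms")

m = SLOW_APPLY.search(line)
assert m is not None
assert int(m.group("ms")) == 112  # duration of the slow apply, in ms
```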
.. _oplog-coll-behavior:

Oplog Collection Behavior
-------------------------

If your MongoDB deployment uses the
:ref:`WiredTiger Storage Engine <storage-wiredtiger>`, you cannot
:dbcommand:`drop` the ``local.oplog.rs`` collection from any replica
set member. This restriction applies to both single-member and
multi-member replica sets. Dropping the oplog can lead to data
inconsistencies in the replica set if a node temporarily goes down and
attempts to replay the oplog during the restart process.

Starting in MongoDB 4.0.22, it is no longer possible to perform manual
write operations to the :doc:`oplog </core/replica-set-oplog>` on a
cluster running as a :ref:`replica set <replication>`. Performing write
operations to the oplog when running as a
:term:`standalone instance <standalone>` should only be done with
guidance from MongoDB Support.