Commit 69f7be9

MahatiC and clayg committed
Move documented reclaim_age option to correct location
The reclaim_age is a DiskFile option; it doesn't make sense for two different object services or nodes to use different values.

I also did a drive-by cleanup of the reclaim_age plumbing from get_hashes to cleanup_ondisk_files, since it's a method on the Manager and has access to the configured reclaim_age. This fixes a bug where finalize_put wouldn't use the [DEFAULT]/object-server configured reclaim_age - which is normally benign but leads to weird behavior on DELETE requests with a really small reclaim_age.

There are a couple of places in the replicator and reconstructor that reach into their manager to borrow the reclaim_age when emptying out the aborted PUTs that failed to clean up their files in tmp - but that timeout doesn't really need to be coupled with reclaim_age, and that method could have just as reasonably been implemented on the Manager.

UpgradeImpact:

Previously the reclaim_age was documented to be configurable in various object-* services config sections, but that did not work correctly unless you also configured the option for the object-server because of REPLICATE request rehash cleanup. All object services must use the same reclaim_age. If you require a non-default reclaim age it should be set in the [DEFAULT] section. If there are different non-default values, the greater should be used for all object services and configured only in the [DEFAULT] section.

If you specify a reclaim_age value in any object related config you should move it to *only* the [DEFAULT] section before you upgrade. If you configure a reclaim_age less than your consistency window you are likely to be eaten by a Grue.

Closes-Bug: #1626296
Change-Id: I2b9189941ac29f6e3be69f76ff1c416315270916
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
1 parent 1a8085f commit 69f7be9
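
The UpgradeImpact note depends on every service section inheriting values from [DEFAULT]. The following is a minimal sketch of that inheritance using Python's standard `configparser`; it is not Swift's own config loader, and the file contents, the value 1209600, and the helper name `reclaim_age_per_section` are assumptions made for illustration.

```python
import configparser

# Hypothetical object-server.conf contents, for illustration only.
SAMPLE_CONF = """
[DEFAULT]
reclaim_age = 1209600

[app:object-server]
use = egg:swift#object

[object-replicator]

[object-reconstructor]
"""


def reclaim_age_per_section(conf_text):
    """Return {section: reclaim_age} as each section would see it."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # sections() omits [DEFAULT]; getint() falls back to its values.
    return {
        section: parser.getint(section, 'reclaim_age')
        for section in parser.sections()
    }


ages = reclaim_age_per_section(SAMPLE_CONF)
for section, age in ages.items():
    print(section, age)
```

With the option set once in [DEFAULT], every section in this sketch resolves to the same value, which is the property the commit message asks operators to preserve.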

12 files changed, with 81 additions and 88 deletions


doc/manpages/object-server.conf.5

Lines changed: 3 additions & 6 deletions
@@ -142,6 +142,9 @@ backend node. The default is 60.
 The default is 65536.
 .IP \fBdisk_chunk_size\fR
 The default is 65536.
+.IP \fBreclaim_age\fR
+Time elapsed in seconds before an object can be reclaimed. The default is
+604800 seconds.
 .IP \fBnice_priority\fR
 Modify scheduling priority of server processes. Niceness values range from -20
 (most favorable to the process) to 19 (least favorable to the process).
@@ -394,9 +397,6 @@ default is 1800 seconds.
 The default is 15.
 .IP \fBrsync_error_log_line_length\fR
 Limits how long rsync error log lines are. 0 (default) means to log the entire line.
-.IP \fBreclaim_age\fR
-Time elapsed in seconds before an object can be reclaimed. The default is
-604800 seconds.
 .IP "\fBrecon_cache_path\fR"
 The recon_cache_path simply sets the directory where stats for a few items will be stored.
 Depending on the method of deployment you may need to create this directory manually
@@ -468,9 +468,6 @@ Attempts to kill all workers if nothing replicates for lockup_timeout seconds. T
 default is 1800 seconds.
 .IP \fBring_check_interval\fR
 The default is 15.
-.IP \fBreclaim_age\fR
-Time elapsed in seconds before an object can be reclaimed. The default is
-604800 seconds.
 .IP "\fBrecon_cache_path\fR"
 The recon_cache_path simply sets the directory where stats for a few items will be stored.
 Depending on the method of deployment you may need to create this directory manually

doc/source/deployment_guide.rst

Lines changed: 16 additions & 8 deletions
@@ -228,10 +228,11 @@ service trying to start is missing there will be an error. The sections not
 used by the service are ignored.

 Consider the example of an object storage node. By convention, configuration
-for the object-server, object-updater, object-replicator, and object-auditor
-exist in a single file ``/etc/swift/object-server.conf``::
+for the object-server, object-updater, object-replicator, object-auditor, and
+object-reconstructor exist in a single file ``/etc/swift/object-server.conf``::

     [DEFAULT]
+    reclaim_age = 604800

     [pipeline:main]
     pipeline = object-server
@@ -240,7 +241,6 @@ exist in a single file ``/etc/swift/object-server.conf``::
     use = egg:swift#object

     [object-replicator]
-    reclaim_age = 259200

     [object-updater]

@@ -417,9 +417,9 @@ The following configuration options are available:

 [DEFAULT]

-================================ ========== ==========================================
+================================ ========== ============================================
 Option                           Default    Description
--------------------------------- ---------- ------------------------------------------
+-------------------------------- ---------- --------------------------------------------
 swift_dir                        /etc/swift Swift configuration directory
 devices                          /srv/node  Parent directory of where devices are
                                             mounted
@@ -515,6 +515,16 @@ network_chunk_size 65536 Size of chunks to read/write over t
 disk_chunk_size                  65536      Size of chunks to read/write to disk
 container_update_timeout         1          Time to wait while sending a container
                                             update on object update.
+reclaim_age                      604800     Time elapsed in seconds before the tombstone
+                                            file representing a deleted object can be
+                                            reclaimed. This is the maximum window for
+                                            your consistency engine. If a node that was
+                                            disconnected from the cluster because of a
+                                            fault is reintroduced into the cluster after
+                                            this window without having its data purged
+                                            it will result in dark data. This setting
+                                            should be consistent across all object
+                                            services.
 nice_priority                    None       Scheduling priority of server processes.
                                             Niceness values range from -20 (most
                                             favorable to the process) to 19 (least
@@ -536,7 +546,7 @@ ionice_priority None I/O scheduling priority of server
                                             priority of the process. Work only with
                                             ionice_class.
                                             Ignored if IOPRIO_CLASS_IDLE is set.
-================================ ========== ==========================================
+================================ ========== ============================================

 .. _object-server-options:

@@ -685,8 +695,6 @@ rsync_compress no Allow rsync to compress d
                                                      process.
 stats_interval              300                      Interval in seconds between
                                                      logging replication statistics
-reclaim_age                 604800                   Time elapsed in seconds before an
-                                                     object can be reclaimed
 handoffs_first              false                    If set to True, partitions that
                                                      are not supposed to be on the
                                                      node will be replicated first.

etc/object-server.conf-sample

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -69,6 +69,12 @@ bind_port = 6200
6969
# network_chunk_size = 65536
7070
# disk_chunk_size = 65536
7171
#
72+
# Reclamation of tombstone files is performed primarily by the replicator and
73+
# the reconstructor but the object-server and object-auditor also reference
74+
# this value - it should be the same for all object services in the cluster,
75+
# and not greater than the container services reclaim_age
76+
# reclaim_age = 604800
77+
#
7278
# You can set scheduling priority of processes. Niceness values range from -20
7379
# (most favorable to the process) to 19 (least favorable to the process).
7480
# nice_priority =
@@ -229,9 +235,6 @@ use = egg:swift#recon
229235
# attempts to kill all workers if nothing replicates for lockup_timeout seconds
230236
# lockup_timeout = 1800
231237
#
232-
# The replicator also performs reclamation
233-
# reclaim_age = 604800
234-
#
235238
# ring_check_interval = 15
236239
# recon_cache_path = /var/cache/swift
237240
#
@@ -293,7 +296,6 @@ use = egg:swift#recon
293296
# node_timeout = 10
294297
# http_timeout = 60
295298
# lockup_timeout = 1800
296-
# reclaim_age = 604800
297299
# ring_check_interval = 15
298300
# recon_cache_path = /var/cache/swift
299301
# handoffs_first = False
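
Before upgrading, the commit message advises moving any per-service reclaim_age into [DEFAULT] and using the greater value if they differ. The sketch below shows one way to locate such settings with Python's standard `configparser`. It is not a Swift tool; the sample config and the names `explicit_reclaim_ages`, `misplaced`, and `suggested` are hypothetical.

```python
import configparser

# Hypothetical pre-upgrade config with per-service overrides.
OLD_CONF = """
[DEFAULT]

[app:object-server]
use = egg:swift#object

[object-replicator]
reclaim_age = 259200

[object-reconstructor]
reclaim_age = 604800
"""


def explicit_reclaim_ages(conf_text):
    """Return {section: reclaim_age} for sections that set it explicitly."""
    # Use a dummy default section so [DEFAULT] is parsed as an ordinary
    # section and inherited values are not mistaken for explicit ones.
    parser = configparser.ConfigParser(default_section='__unused__')
    parser.read_string(conf_text)
    return {
        section: parser.getint(section, 'reclaim_age')
        for section in parser.sections()
        if parser.has_option(section, 'reclaim_age')
    }


explicit = explicit_reclaim_ages(OLD_CONF)
# Settings that should be moved out of the service sections.
misplaced = {s: v for s, v in explicit.items() if s != 'DEFAULT'}
# Per the commit message, the greater value should be used everywhere.
suggested = max(explicit.values()) if explicit else None
print(misplaced, suggested)
```

For this sample the two service sections are reported as misplaced and the suggested [DEFAULT] value is the larger of the two.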

swift/container/reconciler.py

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -337,6 +337,9 @@ class ContainerReconciler(Daemon):
337337

338338
def __init__(self, conf):
339339
self.conf = conf
340+
# This option defines how long an un-processable misplaced object
341+
# marker will be retried before it is abandoned. It is not coupled
342+
# with the tombstone reclaim age in the consistency engine.
340343
self.reclaim_age = int(conf.get('reclaim_age', 86400 * 7))
341344
self.interval = int(conf.get('interval', 30))
342345
conf_path = conf.get('__file__') or \
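
The one-week default is spelled three ways across this commit: `86400 * 7`, `604800`, and `timedelta(weeks=1).total_seconds()`. A small sketch confirming they agree, and showing why wrapping the lookup in `int()` matters when the default is a float. The helper `read_reclaim_age` is hypothetical and stands in for the `int(conf.get(...))` pattern seen in the diff.

```python
from datetime import timedelta

# total_seconds() returns a float (604800.0), not an int.
DEFAULT_RECLAIM_AGE = timedelta(weeks=1).total_seconds()


def read_reclaim_age(conf):
    """Mimic the int(conf.get('reclaim_age', default)) pattern."""
    # int() normalizes both the float default and string config values.
    return int(conf.get('reclaim_age', DEFAULT_RECLAIM_AGE))


print(DEFAULT_RECLAIM_AGE == 86400 * 7 == 604800)
print(read_reclaim_age({}))
print(read_reclaim_age({'reclaim_age': '259200'}))
```

Note that the float default is passed as a number rather than a string, so `int()` accepts it; a config value written as `604800.0` would be a string and would not parse with `int()`.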

swift/obj/diskfile.py

Lines changed: 18 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,7 @@
4747
from tempfile import mkstemp
4848
from contextlib import contextmanager
4949
from collections import defaultdict
50+
from datetime import timedelta
5051

5152
from eventlet import Timeout
5253
from eventlet.hubs import trampoline
@@ -77,7 +78,7 @@
7778

7879

7980
PICKLE_PROTOCOL = 2
80-
ONE_WEEK = 604800
81+
DEFAULT_RECLAIM_AGE = timedelta(weeks=1).total_seconds()
8182
HASH_FILE = 'hashes.pkl'
8283
HASH_INVALIDATIONS_FILE = 'hashes.invalid'
8384
METADATA_KEY = 'user.swift.metadata'
@@ -557,7 +558,7 @@ def __init__(self, conf, logger):
557558
self.keep_cache_size = int(conf.get('keep_cache_size', 5242880))
558559
self.bytes_per_sync = int(conf.get('mb_per_sync', 512)) * 1024 * 1024
559560
self.mount_check = config_true_value(conf.get('mount_check', 'true'))
560-
self.reclaim_age = int(conf.get('reclaim_age', ONE_WEEK))
561+
self.reclaim_age = int(conf.get('reclaim_age', DEFAULT_RECLAIM_AGE))
561562
self.replication_one_per_device = config_true_value(
562563
conf.get('replication_one_per_device', 'true'))
563564
self.replication_lock_timeout = int(conf.get(
@@ -886,13 +887,12 @@ def get_ondisk_files(self, files, datadir, verify=True, **kwargs):
886887

887888
return results
888889

889-
def cleanup_ondisk_files(self, hsh_path, reclaim_age=ONE_WEEK, **kwargs):
890+
def cleanup_ondisk_files(self, hsh_path, **kwargs):
890891
"""
891892
Clean up on-disk files that are obsolete and gather the set of valid
892893
on-disk files for an object.
893894
894895
:param hsh_path: object hash path
895-
:param reclaim_age: age in seconds at which to remove tombstones
896896
:param frag_index: if set, search for a specific fragment index .data
897897
file, otherwise accept the first valid .data file
898898
:returns: a dict that may contain: valid on disk files keyed by their
@@ -901,7 +901,7 @@ def cleanup_ondisk_files(self, hsh_path, reclaim_age=ONE_WEEK, **kwargs):
901901
reverse sorted, stored under the key 'files'.
902902
"""
903903
def is_reclaimable(timestamp):
904-
return (time.time() - float(timestamp)) > reclaim_age
904+
return (time.time() - float(timestamp)) > self.reclaim_age
905905

906906
files = listdir(hsh_path)
907907
files.sort(reverse=True)
@@ -932,11 +932,10 @@ def _update_suffix_hashes(self, hashes, ondisk_info):
932932
"""
933933
raise NotImplementedError
934934

935-
def _hash_suffix_dir(self, path, reclaim_age):
935+
def _hash_suffix_dir(self, path):
936936
"""
937937
938938
:param path: full path to directory
939-
:param reclaim_age: age in seconds at which to remove tombstones
940939
"""
941940
hashes = defaultdict(hashlib.md5)
942941
try:
@@ -948,7 +947,7 @@ def _hash_suffix_dir(self, path, reclaim_age):
948947
for hsh in path_contents:
949948
hsh_path = join(path, hsh)
950949
try:
951-
ondisk_info = self.cleanup_ondisk_files(hsh_path, reclaim_age)
950+
ondisk_info = self.cleanup_ondisk_files(hsh_path)
952951
except OSError as err:
953952
if err.errno == errno.ENOTDIR:
954953
partition_path = dirname(path)
@@ -1006,34 +1005,30 @@ def _hash_suffix_dir(self, path, reclaim_age):
10061005
raise PathNotDir()
10071006
return hashes
10081007

1009-
def _hash_suffix(self, path, reclaim_age):
1008+
def _hash_suffix(self, path):
10101009
"""
10111010
Performs reclamation and returns an md5 of all (remaining) files.
10121011
10131012
:param path: full path to directory
1014-
:param reclaim_age: age in seconds at which to remove tombstones
10151013
:raises PathNotDir: if given path is not a valid directory
10161014
:raises OSError: for non-ENOTDIR errors
10171015
"""
10181016
raise NotImplementedError
10191017

1020-
def _get_hashes(self, partition_path, recalculate=None, do_listdir=False,
1021-
reclaim_age=None):
1018+
def _get_hashes(self, partition_path, recalculate=None, do_listdir=False):
10221019
"""
10231020
Get hashes for each suffix dir in a partition. do_listdir causes it to
10241021
mistrust the hash cache for suffix existence at the (unexpectedly high)
1025-
cost of a listdir. reclaim_age is just passed on to hash_suffix.
1022+
cost of a listdir.
10261023
10271024
:param partition_path: absolute path of partition to get hashes for
10281025
:param recalculate: list of suffixes which should be recalculated when
10291026
got
10301027
:param do_listdir: force existence check for all hashes in the
10311028
partition
1032-
:param reclaim_age: age at which to remove tombstones
10331029
10341030
:returns: tuple of (number of suffix dirs hashed, dictionary of hashes)
10351031
"""
1036-
reclaim_age = reclaim_age or self.reclaim_age
10371032
hashed = 0
10381033
hashes_file = join(partition_path, HASH_FILE)
10391034
modified = False
@@ -1072,7 +1067,7 @@ def _get_hashes(self, partition_path, recalculate=None, do_listdir=False,
10721067
if not hash_:
10731068
suffix_dir = join(partition_path, suffix)
10741069
try:
1075-
hashes[suffix] = self._hash_suffix(suffix_dir, reclaim_age)
1070+
hashes[suffix] = self._hash_suffix(suffix_dir)
10761071
hashed += 1
10771072
except PathNotDir:
10781073
del hashes[suffix]
@@ -1086,8 +1081,7 @@ def _get_hashes(self, partition_path, recalculate=None, do_listdir=False,
10861081
write_pickle(
10871082
hashes, hashes_file, partition_path, PICKLE_PROTOCOL)
10881083
return hashed, hashes
1089-
return self._get_hashes(partition_path, recalculate, do_listdir,
1090-
reclaim_age)
1084+
return self._get_hashes(partition_path, recalculate, do_listdir)
10911085
else:
10921086
return hashed, hashes
10931087

@@ -1237,8 +1231,7 @@ def get_diskfile_from_hash(self, device, partition, object_hash,
12371231
dev_path, get_data_dir(policy), str(partition), object_hash[-3:],
12381232
object_hash)
12391233
try:
1240-
filenames = self.cleanup_ondisk_files(object_path,
1241-
self.reclaim_age)['files']
1234+
filenames = self.cleanup_ondisk_files(object_path)['files']
12421235
except OSError as err:
12431236
if err.errno == errno.ENOTDIR:
12441237
quar_path = self.quarantine_renamer(dev_path, object_path)
@@ -1369,7 +1362,7 @@ def yield_hashes(self, device, partition, policy,
13691362
object_path = os.path.join(suffix_path, object_hash)
13701363
try:
13711364
results = self.cleanup_ondisk_files(
1372-
object_path, self.reclaim_age, **kwargs)
1365+
object_path, **kwargs)
13731366
timestamps = {}
13741367
for ts_key, info_key, info_ts_key in key_preference:
13751368
if info_key not in results:
@@ -2581,17 +2574,16 @@ def _update_suffix_hashes(self, hashes, ondisk_info):
25812574
hashes[None].update(
25822575
file_info['timestamp'].internal + file_info['ext'])
25832576

2584-
def _hash_suffix(self, path, reclaim_age):
2577+
def _hash_suffix(self, path):
25852578
"""
25862579
Performs reclamation and returns an md5 of all (remaining) files.
25872580
25882581
:param path: full path to directory
2589-
:param reclaim_age: age in seconds at which to remove tombstones
25902582
:raises PathNotDir: if given path is not a valid directory
25912583
:raises OSError: for non-ENOTDIR errors
25922584
:returns: md5 of files in suffix
25932585
"""
2594-
hashes = self._hash_suffix_dir(path, reclaim_age)
2586+
hashes = self._hash_suffix_dir(path)
25952587
return hashes[None].hexdigest()
25962588

25972589

@@ -3197,12 +3189,11 @@ def _update_suffix_hashes(self, hashes, ondisk_info):
31973189
file_info = ondisk_info['durable_frag_set'][0]
31983190
hashes[None].update(file_info['timestamp'].internal + '.durable')
31993191

3200-
def _hash_suffix(self, path, reclaim_age):
3192+
def _hash_suffix(self, path):
32013193
"""
32023194
Performs reclamation and returns an md5 of all (remaining) files.
32033195
32043196
:param path: full path to directory
3205-
:param reclaim_age: age in seconds at which to remove tombstones
32063197
:raises PathNotDir: if given path is not a valid directory
32073198
:raises OSError: for non-ENOTDIR errors
32083199
:returns: dict of md5 hex digests
@@ -3211,5 +3202,5 @@ def _hash_suffix(self, path, reclaim_age):
32113202
# here we flatten out the hashers hexdigest into a dictionary instead
32123203
# of just returning the one hexdigest for the whole suffix
32133204

3214-
hash_per_fi = self._hash_suffix_dir(path, reclaim_age)
3205+
hash_per_fi = self._hash_suffix_dir(path)
32153206
return dict((fi, md5.hexdigest()) for fi, md5 in hash_per_fi.items())
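
The central change in this file is that the reclaim check reads `self.reclaim_age` from the manager instead of taking it as an argument. A minimal sketch of that pattern follows, assuming a simplified stand-in class. `SketchManager`, `reclaimable_tombstones`, and the injectable `now` parameter are hypothetical and not part of Swift; the filename handling also assumes plain `<timestamp>.ts` names without timestamp offsets.

```python
import time

DEFAULT_RECLAIM_AGE = 604800  # one week, in seconds


class SketchManager:
    """Hypothetical stand-in for a DiskFile manager; not Swift's class."""

    def __init__(self, conf):
        # Configured once; every caller then shares the same value.
        self.reclaim_age = int(conf.get('reclaim_age', DEFAULT_RECLAIM_AGE))

    def is_reclaimable(self, timestamp, now=None):
        # Same comparison as the diff: strictly older than reclaim_age.
        now = time.time() if now is None else now
        return (now - float(timestamp)) > self.reclaim_age

    def reclaimable_tombstones(self, filenames, now=None):
        """Return the .ts files old enough to be reclaimed."""
        return [
            name for name in filenames
            if name.endswith('.ts')
            and self.is_reclaimable(name[:-len('.ts')], now)
        ]


mgr = SketchManager({'reclaim_age': '100'})
files = ['850.00000.ts', '950.00000.ts', '800.00000.data']
print(mgr.reclaimable_tombstones(files, now=1000.0))
```

Because callers no longer pass their own value, two services built on the same manager configuration cannot disagree about when a tombstone is reclaimable.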

swift/obj/expirer.py

Lines changed: 4 additions & 1 deletion
@@ -65,7 +65,10 @@ def __init__(self, conf, logger=None, swift=None):
             raise ValueError("concurrency must be set to at least 1")
         self.processes = int(self.conf.get('processes', 0))
         self.process = int(self.conf.get('process', 0))
-        self.reclaim_age = int(conf.get('reclaim_age', 86400 * 7))
+        # This option defines how long an un-processable expired object
+        # marker will be retried before it is abandoned. It is not coupled
+        # with the tombstone reclaim age in the consistency engine.
+        self.reclaim_age = int(conf.get('reclaim_age', 604800))

     def report(self, final=False):
         """
"""

swift/obj/reconstructor.py

Lines changed: 2 additions & 3 deletions
@@ -132,7 +132,6 @@ def __init__(self, conf, logger=None):
         self.stats_interval = int(conf.get('stats_interval', '300'))
         self.ring_check_interval = int(conf.get('ring_check_interval', 15))
         self.next_check = time.time() + self.ring_check_interval
-        self.reclaim_age = int(conf.get('reclaim_age', 86400 * 7))
         self.partition_times = []
         self.interval = int(conf.get('interval') or
                             conf.get('run_pause') or 30)
@@ -431,7 +430,7 @@ def _get_hashes(self, policy, path, recalculate=None, do_listdir=False):
         df_mgr = self._df_router[policy]
         hashed, suffix_hashes = tpool_reraise(
             df_mgr._get_hashes, path, recalculate=recalculate,
-            do_listdir=do_listdir, reclaim_age=self.reclaim_age)
+            do_listdir=do_listdir)
         self.logger.update_stats('suffix.hashes', hashed)
         return suffix_hashes

@@ -834,7 +833,7 @@ def collect_parts(self, override_devices=None,
                 obj_path = join(dev_path, data_dir)
                 tmp_path = join(dev_path, get_tmp_dir(int(policy)))
                 unlink_older_than(tmp_path, time.time() -
-                                  self.reclaim_age)
+                                  df_mgr.reclaim_age)
                 if not os.path.exists(obj_path):
                     try:
                         mkdirs(obj_path)
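
The last hunk borrows the manager's reclaim_age as the cutoff for clearing aborted PUTs out of tmp. As a rough illustration of that cutoff arithmetic, here is a standard-library sketch. `remove_files_older_than` is a hypothetical helper written for this example and is not Swift's `unlink_older_than`; the file names are made up.

```python
import os
import tempfile
import time


def remove_files_older_than(path, cutoff):
    """Hypothetical helper: unlink regular files whose mtime is before cutoff."""
    removed = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        try:
            if os.path.isfile(full) and os.path.getmtime(full) < cutoff:
                os.unlink(full)
                removed.append(name)
        except OSError:
            # The file may vanish between listing and unlinking; skip it.
            pass
    return sorted(removed)


def demo(reclaim_age=604800):
    """Create one stale and one fresh file, then apply the cutoff."""
    with tempfile.TemporaryDirectory() as tmp_path:
        old = os.path.join(tmp_path, 'aborted-put')
        new = os.path.join(tmp_path, 'in-flight-put')
        for file_path in (old, new):
            with open(file_path, 'w'):
                pass
        now = time.time()
        # Backdate one file to twice the reclaim age.
        os.utime(old, (now - 2 * reclaim_age, now - 2 * reclaim_age))
        removed = remove_files_older_than(tmp_path, now - reclaim_age)
        return removed, sorted(os.listdir(tmp_path))


print(demo())
```

As the commit message points out, this tmp cleanup timeout does not strictly need to be tied to reclaim_age; the sketch only shows how the cutoff is derived when it is.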
