dcache-qos: support migration using a new pool mode "DRAINING"
Motivation:

The procedure for migrating files off of a pool when
Resilience/QoS is running is currently not user-friendly.
It requires one either to set the pool to disabled and
allow the replica management to proceed as usual, or to
exclude all pools potentially involved as either source
or target of the replica copy, in order to avoid
thrashing between the migration move and the Resilience/QoS
engine.  The latter is clumsy and prone to error; the
former will alarm on persistent files which are uniquely
resident on that disabled pool, thus leaving the migration
incomplete.

It would be much nicer if we could provide a way of doing
this within QoS (Resilience) itself.

Modification:

Add a new pool state, "DRAINING", which is a READ_ONLY
state with an extra bit.  This state is handled by
QoS as if it were a DOWN pool, but QoS will not be
blocked from using the replicas on that pool as the
source of new copies.  This way, all persistent files
can be replicated.

The admin thus has only to set this state on the
pool to be drained, and then wait for the QoS
pool task to finish.  At that point, the pool
can be taken offline.

Result:

A pool can be "drained" (i.e., all its persistent
replicas copied elsewhere) without turning off QoS
and without using the migration module.  Note
that this is a migration to a pool group, not
an individual pool, and that cached replicas
are ignored.

Target: master
Patch: https://rb.dcache.org/r/13773/
Requires-book: yes (provided in this patch)
Requires-notes: yes
Acked-by: Tigran
alrossi committed Nov 25, 2022
1 parent f3bfddb commit 85a4920
Showing 9 changed files with 129 additions and 34 deletions.
51 changes: 39 additions & 12 deletions docs/TheBook/src/main/markdown/config-qos-engine.md
@@ -1077,6 +1077,11 @@ Once pools are added to this group, the behavior will be as indicated above.
### Exclude a pool from qos handling

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTE: With version 9.0, the use of 'exclude' is deprecated for migration.
See below for the new procedure.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

During normal operation, QoS should be expected to gracefully handle situations
where a pool with many files, for one reason or another, goes offline. Such an incident,
even if the "grace period" value were set to 0, in initiating a large scan,
@@ -1107,8 +1112,8 @@ use in the wrong circumstances may easily lead to inconsistent state.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
WARNING: Only use 'pool exclude' for temporary situations where the intention
is eventually to restore the excluded location(s) to qos management;
or when the locations on those pools are actually being migrated or
deleted from the namespace.
or when the locations on those pools are actually to be deleted
from the namespace.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If, for instance, one set a pool to `EXCLUDED`, then removed the pool,
@@ -1124,15 +1129,14 @@ but caution should be taken when applying it.
_Note that once a pool is excluded, it can no longer be scanned, even manually,
until it is explicitly included again._

### Rebalance or migrate a pool (group)
### Rebalance a pool (group)

Rebalancing should be required less often on pools belonging to a primary pool
group; but if you should decide to rebalance this kind of pool group, or need
to migrate files from one pool group to another, be sure to disable qos on all
those pools. One could do this by stopping qos altogether, but this of course
would stop the processing of other groups not involved in the operation.
The alternative is to use the `exclude` command one or more times with expressions
matching the pools you are interested in:
group; but if you should decide to rebalance this kind of pool group, be sure
to disable qos on all those pools. One could do this by stopping qos altogether,
but this of course would stop the processing of other groups not involved
in the operation. The alternative is to use the `exclude` command one or more times
with expressions matching the pools you are interested in:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
\s qos-scanner pool exclude <exp1>
@@ -1144,7 +1148,7 @@ Note that the exclusion of a pool will survive a restart of the service because
excluded pools are written out to a file (`excluded-pools`; see above) which is
read back in on initialization.

When rebalancing or migration is completed, pools can be set back to active
When rebalancing is completed, pools can be set back to active
qos control:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1158,10 +1162,33 @@ window elapses, a manual scan is required.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
BEST PRACTICE: Disable qos on the potential source and target pools
by setting them to EXCLUDED before doing a rebalance
or migration.
by setting them to EXCLUDED before doing a rebalance.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

### Migrating files off of a pool (dCache 9.0+)

Instead of the clumsy and less reliable procedure involved in rebalancing above,
QoS can now handle the copying of all persistent replicas to other pools
(whether of a primary/resilient pool group or globally).

To achieve this, do the following:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
\s <pool-to-decommission> pool disable -draining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You will then see that pool scheduled for a scan as if it were in the `DOWN`
state. There will, however, be no issue with using the replicas on that
pool as the source for new copies, because the new pool state `DRAINING` is
the same as `READ_ONLY` (`-rdonly`), but with an extra bit to alert QoS to
treat it as if it were offline.
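Schematically, the relationship between the two modes is a single extra flag bit. The sketch below is illustrative only: `DRAINING` really is defined in this patch as `DISABLED_RDONLY | 0x200`, but the concrete values of the `DISABLED_*` bits and the mask-equality test are assumptions modeled on `PoolV2Mode`:

```java
// Illustrative mode bits (values assumed for this sketch).
final class ModeBits {
    static final int DISABLED_STORE      = 0x2;  // assumed value
    static final int DISABLED_STAGE      = 0x4;  // assumed value
    static final int DISABLED_P2P_CLIENT = 0x8;  // assumed value
    static final int DISABLED_RDONLY =
            DISABLED_STORE | DISABLED_STAGE | DISABLED_P2P_CLIENT;
    // read-only plus one extra bit, as in the patch
    static final int DRAINING = DISABLED_RDONLY | 0x200;

    // true when every bit of the mask is set in the mode
    // (mask-equality semantics assumed)
    static boolean isDisabled(int mode, int mask) {
        return (mode & mask) == mask;
    }
}
```

Because the `DRAINING` mask is a strict superset of the read-only bits, a draining pool also answers true to the plain read-only test; the classification code therefore checks for `DRAINING` before falling back to the generic read-only case.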

When the scan has completed, all persistent (but not cached) replicas will
have been replicated on other pools, thus leaving the source pool free to
be taken offline or to be manually purged of its replicas. One could even
conceivably `rep rm -force` all replicas on it, and set it back to enabled,
with no issues arising for QoS replica management.
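The rule that drives the copying can be sketched as a one-line adjustment to the required number of persistent replicas (a simplified model; the class and method names here are hypothetical, while the actual logic lives in the qos verifier changed by this patch):

```java
// Simplified version of the requirement adjustment: when the pool holding
// the replica under verification is draining, demand one extra persistent
// copy so that a new replica is created on some other pool. Cached-only
// files (required disk copies == 0) are left untouched.
final class DrainRule {
    static int adjustRequiredDisk(int requiredDisk, boolean poolDraining) {
        return (poolDraining && requiredDisk > 0) ? requiredDisk + 1 : requiredDisk;
    }
}
```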

### Manually schedule or cancel a pool scan

A scan can be manually scheduled for any pool, including those in the `DOWN` or
@@ -79,7 +79,15 @@ public enum PoolQoSStatus {
/**
* normal read-write operations are possible
*/
READ_ONLY; /** equivalent to disabled for writing by clients */
READ_ONLY,
/**
* equivalent to disabled for writing by clients
*/
DRAINING;
/**
* equivalent to disabled for writing by clients and qos,
* but action needs to be taken to copy all its files elsewhere.
*/

/**
* This status tells qos whether action (scanning) needs to be taken with respect to the pool.
@@ -103,6 +111,16 @@ public static PoolQoSStatus valueOf(PoolV2Mode poolMode) {
return DOWN;
}

/*
* This is a special READ_ONLY state which must be treated by qos
* like a DOWN pool, but from which the source of the new replica
* need not be taken from another pool (hence, singleton
* replicas on this pool will not raise an alarm).
*/
if (poolMode.isDisabled(PoolV2Mode.DRAINING)) {
return DRAINING;
}

/*
* For qos, 'READ_ONLY' should designate only the fact
* that clients cannot write to the pool, or that staging
@@ -128,6 +146,7 @@ public static PoolQoSStatus valueOf(PoolV2Mode poolMode) {
public QoSMessageType toMessageType() {
switch (this) {
case DOWN:
case DRAINING:
case UNINITIALIZED:
return QoSMessageType.POOL_STATUS_DOWN;
default:
@@ -160,15 +160,26 @@ synchronized NextAction getNextAction(PoolQoSStatus incoming) {
case DOWN:
switch (currStatus) {
case READ_ONLY:
case DRAINING:
case ENABLED:
return NextAction.DOWN_TO_UP;
case DOWN:
return NextAction.NOP;
}
case DRAINING:
switch (currStatus) {
case READ_ONLY:
case ENABLED:
case DOWN:
return NextAction.DOWN_TO_UP;
case DRAINING:
return NextAction.NOP;
}
case READ_ONLY:
case ENABLED:
switch (currStatus) {
case DOWN:
case DRAINING:
return NextAction.UP_TO_DOWN;
case READ_ONLY:
case ENABLED:
@@ -192,7 +203,7 @@ synchronized NextAction getNextAction(PoolQoSStatus incoming) {
* viable readable status, and handling this transition will
* unnecessarily provoke an immediate system-wide scan.
*/
if (currStatus == PoolQoSStatus.DOWN) {
if (currStatus == PoolQoSStatus.DOWN || currStatus == PoolQoSStatus.DRAINING) {
if (exceedsGracePeriod()) {
return NextAction.UP_TO_DOWN;
}
@@ -593,6 +593,16 @@ public boolean isReadPref0(String pool) {
}
}

public boolean isPoolDraining(String pool) {
read.lock();
try {
PoolInformation info = poolInfo.get(pool);
return info != null && info.getMode().isDisabled(PoolV2Mode.DRAINING);
} finally {
read.unlock();
}
}

public boolean isPoolViable(String pool, boolean writable) {
read.lock();
try {
@@ -126,6 +126,22 @@ public void setPoolInfoMap(PoolInfoMap poolInfoMap) {

public QoSAction verify(FileQoSRequirements requirements, VerifyOperation operation)
throws InterruptedException {
/*
* If the parent source is DRAINING, we increase the required number of disk copies
* by one if the disk requirement is non-zero.
*
* NOTE: we do not need to force the source to be the parent, because the parent
* pool is still readable. In the case where a file has a single copy
* resident on the parent, the parent will be automatically chosen.
*/
String parent = operation.getParent();
if (parent != null && poolInfoMap.isPoolDraining(parent)) {
int disk = requirements.getRequiredDisk();
if (disk > 0) { /* do not create a persistent copy in the case of cached-only replicas */
requirements.setRequiredDisk(disk + 1);
}
}

VerifiedLocations locations = classifyLocations(requirements, operation.getPoolGroup());
Optional<QoSAction> optional;

@@ -349,7 +349,7 @@ public void handleVerification(PnfsId pnfsId) {
}

/*
* Refresh to pool group on the basis of the current replica locations.
* Refresh pool group on the basis of the current replica locations.
* If the verification results in an adjustment, the updated group will
* be included in the underlying store update.
*/
@@ -99,6 +99,7 @@ public String selectCopySource(VerifyOperation operation,
}
return locations.iterator().next();
}

return selectSource(locations, operation.getTried());
}

@@ -28,38 +28,23 @@ public class PoolV2Mode implements Serializable {
DISABLED_STAGE |
DISABLED_P2P_CLIENT;

public static final int DRAINING = DISABLED_RDONLY | 0x200;

private static final int RESILIENCE_ENABLED = 0x7F;
private static final int RESILIENCE_DISABLED = 0x80;

public static final int REPOSITORY_LOADING = 0x100;
public static final int DISABLED_RDONLY_REPOSITORY_LOADING =
DISABLED_RDONLY |
REPOSITORY_LOADING;
DISABLED_RDONLY | REPOSITORY_LOADING;
public static final int DISABLED_STRICT_REPOSITORY_LOADING =
DISABLED_STRICT |
REPOSITORY_LOADING;
DISABLED_STRICT | REPOSITORY_LOADING;

private static final String[] __modeString = {
"fetch", "store", "stage", "p2p-client", "p2p-server", "dead"
};

private int _mode = ENABLED;

/*
* For convenience.
*/
public synchronized boolean isResilienceEnabled() {
return !((_mode & RESILIENCE_DISABLED) == RESILIENCE_DISABLED);
}

public synchronized void setResilienceEnabled(boolean isResilienceEnabled) {
if (isResilienceEnabled) {
_mode &= RESILIENCE_ENABLED;
} else {
_mode |= RESILIENCE_DISABLED;
}
}

@Override
public String toString() {
int mode = getMode();
@@ -85,6 +70,9 @@ public String toString() {
sb.append(modeString);
}
}
if (isDisabled(DRAINING)) {
sb.append(",draining");
}
if (!isResilienceEnabled()) {
sb.append(",noresilience");
}
@@ -127,6 +115,21 @@ public synchronized boolean isEnabled() {
return isEnabled(_mode);
}

/*
* For convenience.
*/
public synchronized boolean isResilienceEnabled() {
return !((_mode & RESILIENCE_DISABLED) == RESILIENCE_DISABLED);
}

public synchronized void setResilienceEnabled(boolean isResilienceEnabled) {
if (isResilienceEnabled) {
_mode &= RESILIENCE_ENABLED;
} else {
_mode |= RESILIENCE_DISABLED;
}
}

@Override
public synchronized boolean equals(Object obj) {
if (this == obj) {
@@ -1653,6 +1653,11 @@ class PoolDisableCommand implements Callable<String> {
@Option(name = "rdonly", usage = "equivalent to -store -stage -p2p-client")
boolean rdonly;

@Option(name = "draining", usage = "equivalent to -store -stage -p2p-client, "
+ "with an additional flag to indicate to QoS to treat this pool as "
+ "offline (and thus to make an additional replica of all files on it).")
boolean draining;

@Option(name = "strict", usage = "disallows everything")
boolean strict;

@@ -1672,6 +1677,9 @@ public String call() {
if (strict) {
modeBits |= PoolV2Mode.DISABLED_STRICT;
}
if (draining) {
modeBits |= PoolV2Mode.DRAINING;
}
if (stage) {
modeBits |= PoolV2Mode.DISABLED_STAGE;
}
