dcache-qos: remove trigger to rescan pools on tag change
    Motivation:

    Resilience had its own internal representation of
    the Pool Selection Unit (PSU), which it maintained in a
    centralized location accessed by both the update
    operations and the pool scan operations.

    In migrating this functionality to QoS, it was decided,
    for a number of reasons, that this was no longer desirable
    or necessary, and that accessing the PSU shipped along
    with the Remote Pool Monitor was sufficient instead.

    An unanticipated consequence of this decision, however,
    is that every restart of a pool, as well as every pool on
    initial startup, now triggers a forced scan.

    The reason is that the sequence of diffs between the
    current PSU and the new one will not always provide the
    pool tags (in the cost module) immediately, so to QoS it
    always looks as if they have changed.  This can even
    happen periodically, if the pool monitor does not carry
    the tags (= null) on one iteration and then does on the
    next (periodic rescans can now be observed).

    One solution might be to rectify the diff logic, but that
    would require once again maintaining an ad hoc data
    structure derived from the PSU alongside the PSU itself.
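
    To make the failure mode concrete, the check removed below
    (comparePoolTags) behaved essentially as in the following
    simplified, self-contained sketch (the tag name and value
    are illustrative only):

        import java.util.HashMap;
        import java.util.Map;

        public class TagDiffSketch {
            public static void main(String[] args) {
                // Snapshot taken before the cost module has
                // reported the pool's tags:
                Map<String, String> oldTags = null;
                // Next snapshot, now carrying the (unchanged) tags:
                Map<String, String> newTags = new HashMap<>();
                newTags.put("hostname", "pool1.example.org");

                // A null previous value is treated as a change, so
                // the pool is flagged and force-scanned even though
                // its tags never actually changed.
                if (oldTags == null
                        || (newTags != null && !oldTags.equals(newTags))) {
                    System.out.println("tags 'changed' -> forced scan");
                }
            }
        }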

    However, the problem being handled here itself needs
    re-evaluation.  The original idea was that if the pool
    tags change, files which are distributed across pools
    based on these tags may need redistribution.  But, unlike
    changing the requirements for the storage unit, changing
    the actual pool tags requires a pool restart.

    I do not think it unreasonable in that case to require
    that the admin responsible for changing those tags also
    decide whether the files on those pools require immediate
    redistribution, and if so, simply force a scan of the
    pool manually via the admin command.

    Hence the simplest solution is to eliminate the attempt
    to discover this and automatically force the scan.  In
    the event the admin does not schedule the pool for a scan,
    the consequences of the changed tags will nevertheless be
    felt immediately for new files, and the redistribution of
    existing files will eventually take place during periodic
    scans.

    Modification:

    Remove the scan trigger code for changed tags.

    Result:

    No more scanning of all pools at startup, and no more
    periodic rescanning of all pools outside the periodic
    window.

    Target: master
    Request: 9.1
    Request: 9.0
    Request: 8.2
    Patch: https://rb.dcache.org/r/14009/
    Requires-book: yes (included)
    Requires-notes: yes
    Acked-by: Tigran
alrossi committed Jul 3, 2023
1 parent 70b9a43 commit 468a7a9
Showing 4 changed files with 13 additions and 73 deletions.
10 changes: 10 additions & 0 deletions docs/TheBook/src/main/markdown/config-qos-engine.md
@@ -1280,6 +1280,16 @@ be scheduled for scans.
>
> **Use the QoS transition to change existing disk+tape files to tape.**

### Changing pool tags

If a pool's tags are used to determine replica distribution (based on the storage
unit definition of `onlyOneCopyPer`) and these are changed, QoS will not automatically
force the pool to be scanned immediately upon restart (it will just be
scheduled for a restart scan based on the defined grace period).

> If it is desirable to rescan a pool's replicas immediately after its
> tags have been changed, this must be done manually (see above).
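
For example, to trigger an immediate scan from the admin shell (a
sketch only: it assumes the `pool scan` command exposed by the
`qos-scanner` service, with `<pool-name>` a placeholder for the pool
in question):

```
\c qos-scanner
pool scan <pool-name>
```
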
### Troubleshooting operations

Intervention to rectify qos handling should hopefully be needed infrequently;
@@ -61,7 +61,6 @@ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING

import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Sets;
import diskCacheV111.poolManager.CostModule;
import diskCacheV111.poolManager.PoolSelectionUnit;
import diskCacheV111.poolManager.PoolSelectionUnit.SelectionPool;
import diskCacheV111.poolManager.PoolSelectionUnit.SelectionPoolGroup;
@@ -70,11 +69,9 @@ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
import diskCacheV111.poolManager.StorageUnitInfoExtractor;
import diskCacheV111.pools.PoolV2Mode;
import java.util.Collection;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.stream.Collectors;
import org.dcache.poolmanager.PoolInfo;
import org.dcache.poolmanager.PoolMonitor;
import org.dcache.qos.data.PoolQoSStatus;
import org.dcache.qos.services.scanner.data.PoolFilter;
@@ -117,7 +114,6 @@ public synchronized PoolOpDiff reloadAndScan(PoolMonitor newPoolMonitor) {
diff.getPoolsAddedToPoolGroup().removeAll(p);
diff.getPoolsRemovedFromPoolGroup().removeAll(p);
diff.getModeChanged().remove(p);
diff.getTagsChanged().remove(p);
});

PoolSelectionUnit currentPsu = newPoolMonitor.getPoolSelectionUnit();
@@ -157,16 +153,6 @@ public synchronized PoolOpDiff reloadAndScan(PoolMonitor newPoolMonitor) {
LOGGER.trace("Rescanning the pool groups whose marker changed.");
diff.getMarkerChanged().forEach(g -> scanPoolsOfModifiedPoolGroup(g, currentPsu));

LOGGER.trace("Rescanning the pools with changed tags.");
diff.getTagsChanged().keySet().stream()
.map(currentPsu::getPool)
.forEach(p -> poolOperationMap.scan(p.getName(),
null,
null,
null,
p.getPoolMode(),
true));

LOGGER.trace("Checking to see if previously uninitialized pools are now ready.");
poolOperationMap.saveExcluded();
lastRefresh = System.currentTimeMillis();
@@ -235,9 +221,6 @@ private PoolOpDiff compare(PoolMonitor currentPoolMonitor, PoolMonitor nextPoolM
LOGGER.trace("comparing pool mode");
comparePoolMode(diff, commonPools, nextPsu);

LOGGER.trace("comparing pool tags");
comparePoolTags(diff, commonPools, currentPoolMonitor, nextPoolMonitor);

return diff;
}

@@ -324,24 +307,6 @@ private void comparePoolsInPoolGroups(PoolOpDiff diff,
}
}

private void comparePoolTags(PoolOpDiff diff, Set<String> common, PoolMonitor current,
PoolMonitor next) {
CostModule currentCostModule = current.getCostModule();
CostModule nextCostModule = next.getCostModule();

diff.getNewPools()
.forEach(p -> diff.getTagsChanged().put(p, getPoolTags(p, nextCostModule)));

common.forEach(p -> {
Map<String, String> newTags = getPoolTags(p, nextCostModule);
Map<String, String> oldTags = getPoolTags(p, currentCostModule);
if (oldTags == null || (newTags != null
&& !oldTags.equals(newTags))) {
diff.getTagsChanged().put(p, newTags);
}
});
}

private Set<String> compareStorageUnits(PoolOpDiff diff,
PoolSelectionUnit currentPsu,
PoolSelectionUnit nextPsu) {
@@ -428,14 +393,6 @@ private Set<String> comparePoolGroups(PoolOpDiff diff,
return Sets.intersection(next, curr);
}

private Map<String, String> getPoolTags(String pool, CostModule costModule) {
PoolInfo poolInfo = costModule.getPoolInfo(pool);
if (poolInfo == null) {
return null;
}
return poolInfo.getTags();
}

/**
* Scans the "new" pool, also making sure all files have the sticky bit.
*/
@@ -110,12 +110,6 @@ public final class PoolOpDiff {
*/
private final Map<String, PoolV2Mode> modeChanged = new HashMap<>();

/*
* (pool, tags)
*/
private final Map<String, Map<String, String>> tagsChanged
= new HashMap<>();

public Collection<String> getConstraintsChanged() {
return constraintsChanged;
}
@@ -160,10 +154,6 @@ public Multimap<String, String> getPoolsRemovedFromPoolGroup() {
return poolsRmved;
}

public Map<String, Map<String, String>> getTagsChanged() {
return tagsChanged;
}

public Collection<String> getUninitializedPools() {
return uninitPools;
}
@@ -182,7 +172,7 @@ public boolean isEmpty() {
&& newUnits.isEmpty() && oldUnits.isEmpty()
&& poolsAdded.isEmpty() && poolsRmved.isEmpty()
&& unitsAdded.isEmpty() && unitsRmved.isEmpty()
&& constraintsChanged.isEmpty() && tagsChanged.isEmpty()
&& constraintsChanged.isEmpty()
&& modeChanged.isEmpty();
}

@@ -199,14 +189,12 @@ public String toString() {
"Units Added: %s\n" +
"Units Removed: %s\n" +
"Constraints changed: %s\n" +
"Mode changed: %s\n" +
"Tags changed: %s\n",
"Mode changed: %s\n",
newPools, oldPools, uninitPools,
newGroups, oldGroups,
newUnits, oldUnits,
poolsAdded, poolsRmved,
unitsAdded, unitsRmved,
constraintsChanged, modeChanged,
tagsChanged);
constraintsChanged, modeChanged);
}
}
@@ -71,9 +71,6 @@ public void shouldTriggerScanWhenPoolsAddedToOperationMap() throws Exception {
whenReloadAndScanIsCalled();
using(aPoolOperationMapVerifier()).verifyThat("add").isCalled(1).on(poolOperationMap)
.with("testpool03-0");
using(aPoolOperationMapVerifier()).verifyThat("scan").isCalled(1).on(poolOperationMap)
.with("testpool03-0", noAddedTo(), noRemovedFrom(), noStorageUnit(), anyMode(),
ignoringStateCheck());
}

@Test
@@ -283,18 +280,6 @@ public void shouldTriggerScanWhenPoolGroupGoesNonPrimary() throws Exception {
withStateCheck());
}

/*
* Pools with changed tags.
*/
@Test
public void shouldTriggerScanWhenPoolTagsChanged() throws Exception {
givenTestPool(POOL_ON_WHICH_TO_CHANGE_TAGS).withTagsChanged();
whenReloadAndScanIsCalled();
using(aPoolOperationMapVerifier()).verifyThat("scan").isCalled(1).on(poolOperationMap)
.with(POOL_ON_WHICH_TO_CHANGE_TAGS, noAddedTo(), noRemovedFrom(), noStorageUnit(),
anyMode(), ignoringStateCheck());
}

private void assertNoUpdatesToPoolOperationMap() {
verify(poolOperationMap, new AtMost(0)).add(any());
verify(poolOperationMap, new AtMost(0)).remove(any());
