Skip to content

Commit

Permalink
dcache-resilience: skip invalid cancel filters
Browse files Browse the repository at this point in the history
Motivation:

We are encountering:

12 Jun 2018 09:33:36 (Resilience) [] Uncaught exception in thread FileOperationMap
java.util.NoSuchElementException: dcache-cms166-01
        at org.dcache.util.NonReindexableList.indexOf(NonReindexableList.java:192) ~[dcache-core-3.1.16.jar:3.1.16]
        at org.dcache.resilience.data.PoolInfoMap.getPoolIndex(PoolInfoMap.java:424) ~[dcache-resilience-3.1.16.jar:3.1.16]
        at org.dcache.resilience.data.FileFilter.matchesPool(FileFilter.java:94) ~[dcache-resilience-3.1.16.jar:3.1.16]
        at org.dcache.resilience.data.FileFilter.matches(FileFilter.java:163) ~[dcache-resilience-3.1.16.jar:3.1.16]
        at org.dcache.resilience.data.FileOperationMap$TerminalOperationProcessor.cancel(FileOperationMap.java:270) ~[dcache-resilience-3.1.16.jar:3.1.16]
        at org.dcache.resilience.data.FileOperationMap$TerminalOperationProcessor.gatherCanceled(FileOperationMap.java:298) ~[dcache-resilience-3.1.16.jar:3.1.16]
        at org.dcache.resilience.data.FileOperationMap$TerminalOperationProcessor.processTerminated(FileOperationMap.java:229) ~[dcache-resilience-3.1.16.jar:3.1.16]
        at org.dcache.resilience.data.FileOperationMap.scan(FileOperationMap.java:871) ~[dcache-resilience-3.1.16.jar:3.1.16]
        at org.dcache.resilience.data.FileOperationMap.run(FileOperationMap.java:819) ~[dcache-resilience-3.1.16.jar:3.1.16]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_144]

when a pool name no longer maps to a known location in resilience.

This is a bug, reproducible following steps given under Testing.

Modification:

Check the filter to see that its pool(s) are still valid;
if not, skip the filter.  This involves checking both
the validity of pool names and pool indices.

Result:

This condition should not crash resilience.

Target: master
Request: 4.2
Request: 4.1
Request: 4.0
Request: 3.2
Bug: #4024
Acked-by: Tigran
  • Loading branch information
alrossi committed Jun 27, 2018
1 parent d36c17d commit f060954
Show file tree
Hide file tree
Showing 8 changed files with 118 additions and 13 deletions.
Expand Up @@ -85,10 +85,13 @@ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING

import diskCacheV111.util.CacheException;
import diskCacheV111.util.PnfsId;

import dmg.cells.nucleus.CellCommandListener;
import dmg.util.command.Argument;
import dmg.util.command.Command;
import dmg.util.command.Option;

import org.dcache.resilience.data.FileCancelFilter;
import org.dcache.resilience.data.FileFilter;
import org.dcache.resilience.data.FileOperation;
import org.dcache.resilience.data.FileOperationMap;
Expand Down Expand Up @@ -607,7 +610,7 @@ protected String doCall() throws Exception {
return "Please provide a non-empty string value for state.";
}

FileFilter filter = new FileFilter();
FileFilter filter = new FileCancelFilter();

if (!"*".equals(pnfsids)) {
filter.setLastUpdateBefore(getTimestamp(lastUpdateBefore));
Expand Down
@@ -0,0 +1,81 @@
/*
COPYRIGHT STATUS:
Dec 1st 2001, Fermi National Accelerator Laboratory (FNAL) documents and
software are sponsored by the U.S. Department of Energy under Contract No.
DE-AC02-76CH03000. Therefore, the U.S. Government retains a world-wide
non-exclusive, royalty-free license to publish or reproduce these documents
and software for U.S. Government purposes. All documents and software
available from this server are protected under the U.S. and Foreign
Copyright Laws, and FNAL reserves all rights.
Distribution of the software available from this server is free of
charge subject to the user following the terms of the Fermitools
Software Legal Information.
Redistribution and/or modification of the software shall be accompanied
by the Fermitools Software Legal Information (including the copyright
notice).
The user is asked to feed back problems, benefits, and/or suggestions
about the software to the Fermilab Software Providers.
Neither the name of Fermilab, the URA, nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
DISCLAIMER OF LIABILITY (BSD):
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL FERMILAB,
OR THE URA, OR THE U.S. DEPARTMENT of ENERGY, OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Liabilities of the Government:
This software is provided by URA, independent from its Prime Contract
with the U.S. Department of Energy. URA is acting independently from
the Government and in its own private capacity and is not acting on
behalf of the U.S. Government, nor as its contractor nor its agent.
Correspondingly, it is understood and agreed that the U.S. Government
has no connection to this software and in no manner whatsoever shall
be liable for nor assume any responsibility or obligation for any claim,
cost, or damages arising out of or resulting from the use of the software
available from this server.
Export Control:
All documents and software available from this server are subject to U.S.
export control laws. Anyone downloading information from this server is
obligated to secure any necessary Government licenses before exporting
documents or software obtained from this server.
*/
package org.dcache.resilience.data;

/**
* <p>Filter used specifically with cancel operations. The
* matching logic is slightly different.</p>
*/
public class FileCancelFilter extends FileFilter {
protected boolean matchesPool(String toMatch,
Integer operationValue,
PoolInfoMap map) {
if (operationValue != null &&
!map.isValidPoolIndex(operationValue)) {
/*
* since this is a cancel operation,
* if the pool does not exist, we should
* cancel it not matter what the toMatch value is.
*/
return true;
}
return super.matchesPool(toMatch, operationValue, map);
}
}
Expand Up @@ -67,7 +67,7 @@ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
/**
* <p>Simple implementation of matcher.</p>
*/
public final class FileFilter implements FileMatcher {
public class FileFilter implements FileMatcher {
private Set<String> state;
private Set<String> pnfsids;
private String retentionPolicy;
Expand All @@ -80,9 +80,9 @@ public final class FileFilter implements FileMatcher {
private Integer opCount;
private boolean forceRemoval = false;

private static boolean matchesPool(String toMatch,
Integer operationValue,
PoolInfoMap map) {
protected boolean matchesPool(String toMatch,
Integer operationValue,
PoolInfoMap map) {
if (toMatch == null) {
return true;
}
Expand All @@ -91,12 +91,11 @@ private static boolean matchesPool(String toMatch,
return operationValue == null;
}

Integer filterValue = map.getPoolIndex(toMatch);
if (filterValue == null) {
if (!map.hasPool(toMatch)) {
return false;
}

return filterValue.equals(operationValue);
return map.getPoolIndex(toMatch).equals(operationValue);
}

@Override
Expand All @@ -109,6 +108,7 @@ public boolean isSimplePnfsMatch() {
return pnfsids == null ? false : pnfsids.size() == 1;
}

@Override
public boolean isUndefined() {
return (null == pnfsids || pnfsids.isEmpty()) &&
null == state &&
Expand Down
Expand Up @@ -653,7 +653,7 @@ private void submit(FileOperation operation) {
* only to the current (running) operation.
*/
public void cancel(PnfsId pnfsId, boolean remove) {
FileFilter filter = new FileFilter();
FileFilter filter = new FileCancelFilter();
filter.setPnfsIds(pnfsId.toString());
filter.setForceRemoval(remove);
cancel(filter);
Expand Down
Expand Up @@ -80,6 +80,7 @@ public final class PoolFilter implements FileMatcher, PoolMatcher {
private Long lastScanBefore;
private Long lastScanAfter;

@Override
public boolean isUndefined() {
return null == state &&
null == pools &&
Expand All @@ -90,6 +91,7 @@ public boolean isUndefined() {
null == lastScanAfter;
}

@Override
public boolean isForceRemoval() {
return parent;
}
Expand Down
Expand Up @@ -631,6 +631,24 @@ public Set<Integer> getValidLocations(Collection<Integer> locations,
}
}

public boolean hasPool(String pool) {
read.lock();
try {
return pools.contains(pool);
} finally {
read.unlock();
}
}

public boolean isValidPoolIndex(Integer index) {
String pool = null;
try {
pool = getPool(index);
} catch (NoSuchElementException e) {
}
return pool != null;
}

public boolean isPoolViable(Integer pool, boolean writable) {
read.lock();
try {
Expand Down
Expand Up @@ -607,7 +607,7 @@ public void update(PoolStateUpdate update) {
* the second operation will complete successfully.
*/
operation.task.cancel(null);
FileFilter fileFilter = new FileFilter();
FileFilter fileFilter = new FileCancelFilter();
fileFilter.setForceRemoval(true);
fileFilter.setParent(update.pool);
fileOperationMap.cancel(fileFilter);
Expand Down
Expand Up @@ -74,6 +74,7 @@ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
import org.dcache.alarms.PredefinedAlarm;
import org.dcache.poolmanager.PoolMonitor;
import org.dcache.poolmanager.SerializablePoolMonitor;
import org.dcache.resilience.data.FileCancelFilter;
import org.dcache.resilience.data.FileFilter;
import org.dcache.resilience.data.FileOperationMap;
import org.dcache.resilience.data.PoolFilter;
Expand Down Expand Up @@ -290,14 +291,14 @@ private void cancelCurrentPoolOperation(String pool) {
PoolFilter poolFilter = new PoolFilter();
poolFilter.setPools(pool);
poolOperationMap.cancel(poolFilter);
FileFilter fileFilter = new FileFilter();
FileFilter fileFilter = new FileCancelFilter();
fileFilter.setParent(pool);
fileFilter.setForceRemoval(true);
fileOperationMap.cancel(fileFilter);
fileFilter = new FileFilter();
fileFilter = new FileCancelFilter();
fileFilter.setSource(pool);
fileOperationMap.cancel(fileFilter);
fileFilter = new FileFilter();
fileFilter = new FileCancelFilter();
fileFilter.setTarget(pool);
fileOperationMap.cancel(fileFilter);
}
Expand Down

0 comments on commit f060954

Please sign in to comment.