core: don't contact pools for regularly expired pins
Motivation:

Pin-manager keeps a list of pins in different states in its pins database table. This is necessary
because unpinning can be manually requested ahead of the originally specified pin lifetime, in which case
the pool needs to be contacted for removal -- otherwise pools will remove expired sticky bits on their own.
Extending a pin's lifetime is also forwarded to and set in the pool.

Currently, pin-manager regularly expires pin entries in the database from state PINNED to READY_TO_UNPIN
when their lifetime expires. These pins in state READY_TO_UNPIN are later removed by another
process that contacts the pool where the replica is pinned and then removes the DB entry.

Contacting the pool is unnecessary for regular pin expirations, however. These pin entries can directly
be removed from the pins table and thus don't need to be limited to the chunked unpinning that aims
to limit the messaging burden.

The same is true for pins that expire in state PINNING: if the requests are still ongoing,
they will be cancelled by PoolManager, which is also aware of the timeout.

Modification:

The regular pin expiration task sets the 'pool' field to null, indicating that a pool does not
need to be contacted for removal of the corresponding sticky bit.
These pin database entries in state READY_TO_UNPIN without a pool are removed before the
chunked pin expiry that necessitates contacting the pool.
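
The two steps above can be illustrated with a small in-memory model. This is only a sketch under
stated assumptions, not the actual pin-manager code: the class `PoollessUnpinSketch`, its `Pin`
type and the methods `expire`, `deletePoolless` and `selectChunk` are hypothetical names, and a
plain list stands in for the pins table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the pins table; not the dCache implementation.
class PoollessUnpinSketch {

    enum State { PINNING, PINNED, READY_TO_UNPIN }

    // Hypothetical stand-in for one row of the pins table.
    static final class Pin {
        State state;
        String pool;          // null means: no pool needs to be contacted
        final long expiresAt; // expiry time in milliseconds

        Pin(State state, String pool, long expiresAt) {
            this.state = state;
            this.pool = pool;
            this.expiresAt = expiresAt;
        }
    }

    // Expiration task: expired PINNING/PINNED pins become READY_TO_UNPIN
    // and lose their pool, marking them as not needing pool contact.
    static void expire(List<Pin> pins, long now) {
        for (Pin pin : pins) {
            boolean active = pin.state == State.PINNING || pin.state == State.PINNED;
            if (active && pin.expiresAt <= now) {
                pin.state = State.READY_TO_UNPIN;
                pin.pool = null;
            }
        }
    }

    // Unpin task, step 1: delete all poolless READY_TO_UNPIN pins at once.
    static int deletePoolless(List<Pin> pins) {
        int before = pins.size();
        pins.removeIf(p -> p.state == State.READY_TO_UNPIN && p.pool == null);
        return before - pins.size();
    }

    // Unpin task, step 2: pick at most maxUnpinsPerRun pins that still need
    // pool contact; a negative limit means "no limit".
    static List<Pin> selectChunk(List<Pin> pins, int maxUnpinsPerRun) {
        List<Pin> chunk = new ArrayList<>();
        for (Pin p : pins) {
            if (p.state == State.READY_TO_UNPIN && p.pool != null) {
                if (maxUnpinsPerRun >= 0 && chunk.size() >= maxUnpinsPerRun) {
                    break;
                }
                chunk.add(p);
            }
        }
        return chunk;
    }

    public static void main(String[] args) {
        List<Pin> pins = new ArrayList<>();
        pins.add(new Pin(State.PINNED, "pool-a", 100L));            // expires regularly
        pins.add(new Pin(State.READY_TO_UNPIN, "pool-b", 10_000L)); // manually unpinned early
        expire(pins, 1_000L);
        System.out.println("deleted without pool contact: " + deletePoolless(pins));
        System.out.println("need pool contact: " + selectChunk(pins, 200).size());
    }
}
```

In this model a manually requested unpin keeps its pool and therefore still goes through the
chunked step, while a regular expiry never reaches it.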

Result:

Pin-manager only contacts a pool when necessary.
When it does not need to do so, pin removal should be much faster as these removals
are no longer controlled by the property `pinmanager.max-unpins-per-run`.
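
As a rough illustration of the expected speed-up, the following sketch estimates how many unpin
task runs a backlog needs when it is subject to the chunk limit. The class `UnpinDrainEstimate`,
the method `runsNeeded` and the backlog of 10,000 pins are hypothetical; the limit of 200 and the
60 second period are the documented defaults. The result is a lower bound that assumes one run
per period and that every unpin attempt succeeds.

```java
// Hypothetical back-of-the-envelope helper; not part of dCache.
class UnpinDrainEstimate {

    // Number of unpin task runs needed to work through `pending` pins when at
    // most `maxPerRun` are handled per run. A negative limit means "no limit".
    static long runsNeeded(long pending, int maxPerRun) {
        if (pending <= 0) {
            return 0;
        }
        if (maxPerRun < 0) {
            return 1;
        }
        if (maxPerRun == 0) {
            throw new IllegalArgumentException("a limit of 0 never drains the backlog");
        }
        return (pending + maxPerRun - 1) / maxPerRun; // ceiling division
    }

    public static void main(String[] args) {
        long expired = 10_000;   // hypothetical backlog of regularly expired pins
        int maxPerRun = 200;     // default of pinmanager.max-unpins-per-run
        long periodSeconds = 60; // default of pinmanager.expiration-period

        long runs = runsNeeded(expired, maxPerRun);
        System.out.println("chunked: " + runs + " runs, at least "
                + (runs * periodSeconds / 60) + " minutes");
        System.out.println("poolless: removed in a single run");
    }
}
```

With these numbers the chunked path needs 50 runs, so at least 50 minutes, whereas poolless
entries are deleted in the first run after they expire.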

Target: master
Requires-notes: yes
Requires-book: yes
Patch: https://rb.dcache.org/r/14002/
Acked-by: Dmitry Litvintsev, Tigran Mkrtchyan
lemora committed Jun 14, 2023
1 parent 0277178 commit a8e0c4f
Showing 7 changed files with 69 additions and 25 deletions.
51 changes: 38 additions & 13 deletions docs/TheBook/src/main/markdown/config-pinmanager.md
@@ -1,33 +1,52 @@
THE PINMANAGER SERVICE
==================================

The purpose of the `pinmanager` service in dCache is ensuring the presence of a file replica on disk.
The purpose of the `pinmanager` service in dCache is ensuring the presence of a file replica on
disk.

It can be used by explicitly _(un)pinning_ files via the `admin` interface or the `bulk service`, but is also used by the `resilience` component to ensure having a certain number of replicas available. PinManager is also used for keeping a replica _online_ after fetching (staging) it from a connected tertiary storage system (sometimes called HSM - hierarchical storage manager), if staging is allowed.
It can be used by explicitly _(un)pinning_ files via the `admin` interface or the `bulk service`,
but is also used by the `resilience` component to ensure having a certain number of replicas
available. PinManager is also used for keeping a replica _online_ after fetching (staging) it from a
connected tertiary storage system (sometimes called HSM - hierarchical storage manager), if staging
is allowed.

-----
[TOC bullet hierarchy]
-----

## (Un-)Pinning Concept

A `pin`, also called `sticky`-ness, is a concept describing a file replica on a pool that cannot be deleted for a certain duration. The pin effectively suppresses automatic garbage collection for the lifetime of the pin.
A `pin`, also called `sticky`-ness, is a concept describing a file replica on a pool that cannot be
deleted for a certain duration. The pin effectively suppresses automatic garbage collection for the
lifetime of the pin.

Pins may have a finite or infinite lifetime. Pins also have an owner, which may be a dCache service (such as `resilience`) or client through a protocol such as `srm`. Only the owner is allowed to remove unexpired pins.
Several pins (for different users) can exist for the same `pnfsid`, and a file is considered pinned as long as at least one unexpired pin exists.
Pins may have a finite or infinite lifetime. Pins also have an owner, which may be a dCache
service (such as `resilience`) or client through a protocol such as `srm`. Only the owner is allowed
to remove unexpired pins. Several pins (for different users) can exist for the same `pnfsid`, and a
file is considered pinned as long as at least one unexpired pin exists.

## The Pin Life Cycle

When a pin is created, it will initially appear in state `PINNING`, then transition to state `PINNED` once the attempt is successful.
When a pin is created, it will initially appear in state `PINNING`, then transition to
state `PINNED` once the attempt is successful.

When a pin either has a finite lifetime that has expired or is directly requested to be removed, it is put into state `READY_TO_UNPIN`. An 'unpinning' background task runs regularly (default every minute), which selects a certain number of pins (default 200) in state `READY_TO_UNPIN` and attempts to remove them, during which the pins are in state `UNPINNING`.

On success, the pin is deleted from the pool in question as well as the database, on failure the pin is put into state `FAILED_TO_UNPIN`. Another background process regularly (default every 2h) resets all pins in state `FAILED_TO_UNPIN` back to state `READY_TO_UNPIN` in order to make them eligible to be attempted again.
When a pin either has a finite lifetime that has expired or is directly requested to be removed, it
is put into state `READY_TO_UNPIN`. An 'unpinning' background task runs regularly (default every
minute), which directly removes all pins in state `READY_TO_UNPIN` that don't require pool contact,
then selects a certain number of pins (default 200) in state `READY_TO_UNPIN` for which the
corresponding pool needs to be contacted and attempts to remove them as well. During unpinning the
pins are in state `UNPINNING`.

On success, the pin is deleted from the pool in question as well as the database, on failure the pin
is put into state `FAILED_TO_UNPIN`. Another background process regularly (default every 2h) resets
all pins in state `FAILED_TO_UNPIN` back to state `READY_TO_UNPIN` in order to make them eligible to
be attempted again.

## Configuring

The PinManager service can be run in a shared domain. It may also be deployed in high availability (HA) mode (coordinated via [ZooKeeper](config-zookeeper.md)) by having several PinManager cells in a dCache instance, which then need to share the same database and configuration.
The PinManager service can be run in a shared domain. It may also be deployed in high availability (
HA) mode (coordinated via [ZooKeeper](config-zookeeper.md)) by having several PinManager cells in a
dCache instance, which then need to share the same database and configuration.

```
pinmanager.db.host=pinman-db-hostname
@@ -38,8 +57,14 @@ pinmanager.db.user=dcache

Pins are managed in this central database as well as on the pools containing the replicas.

Pin expiration and pin unpinning are background tasks which are executed regularly. The property `pinmanager.expiration-period` controls how often to execute these tasks. The default value is 60 seconds.
Pin expiration and pin unpinning are background tasks which are executed regularly. The
property `pinmanager.expiration-period` controls how often to execute these tasks. The default value
is 60 seconds.

The number of pins that should at most be attempted to be removed per unpinning task run can be configured with the property `pinmanager.max-unpins-per-run` and default to 200. A value of -1 indicates that there is no limit on the number of pins the `PinManager` will attempt to unpin per run, which might lead to large CPU and memory loads if there are many pending unpin operations.
The number of pins that should at most be attempted to be removed and necessitate pool contact per
unpinning task run can be configured with the property `pinmanager.max-unpins-per-run` and default
to 200. A value of -1 indicates that there is no limit, which might lead to large CPU and memory
loads if there are many pending unpin operations.

Another background task takes care of resetting pins that previously failed to be removed. It can be configured via `pinmanager.reset-failed-unpins-period` and defaults to 2h.
Another background task takes care of resetting pins that previously failed to be removed. It can be
configured via `pinmanager.reset-failed-unpins-period` and defaults to 2h.
11 changes: 8 additions & 3 deletions modules/dcache/src/main/java/org/dcache/pinmanager/JdbcDao.java
@@ -1,7 +1,7 @@
/*
* dCache - http://www.dcache.org/
*
* Copyright (C) 2016 - 2020 Deutsches Elektronen-Synchrotron
* Copyright (C) 2016 - 2023 Deutsches Elektronen-Synchrotron
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
@@ -179,6 +179,7 @@ protected void addClause(String clause, Object... arguments) {
predicate.append(" AND ");
}
predicate.append(clause);

this.arguments.addAll(asList(arguments));
}

@@ -249,8 +250,12 @@ public JdbcPinCriterion stateIsNot(Pin.State state) {
}

@Override
public JdbcPinCriterion pool(String pool) {
addClause("pool = ?", pool);
public JdbcPinCriterion pool(@Nullable String pool) {
if (pool == null) {
addClause("pool IS NULL");
} else {
addClause("pool = ?", pool);
}
return this;
}

@@ -235,8 +235,7 @@ private boolean containsPin(Collection<Pin> pins, String sticky) {
return message;
}

public PinManagerExtendPinMessage
messageArrived(PinManagerExtendPinMessage message)
public PinManagerExtendPinMessage messageArrived(PinManagerExtendPinMessage message)
throws CacheException, InterruptedException {
try {
Pin pin = _dao.get(_dao.where().pnfsId(message.getFileAttributes().getPnfsId())
@@ -125,7 +125,7 @@ interface PinCriterion {

PinCriterion stateIsNot(Pin.State state);

PinCriterion pool(String pool);
PinCriterion pool(@Nullable String pool);

PinCriterion sticky(String sticky);

@@ -108,7 +108,9 @@ private void markAllExpiredPinsReadyToUnpin() {

/**
* This task transitions all pins that have exceeded their lifetime and are in state PINNING or
* PINNED to state READY_TO_UNPIN.
* PINNED to state READY_TO_UNPIN. It removes the pool, which expires pins on its own and does
* not need to be contacted for regular expiries. As PoolManager is aware of the timeout for
* pins in state PINNING, it should also delete the request on its own if it is still ongoing.
*/
private class ExpirationTask implements Runnable {

@@ -123,8 +125,9 @@ public void run() {
.stateIsNot(READY_TO_UNPIN)
.stateIsNot(UNPINNING)
.stateIsNot(FAILED_TO_UNPIN),
dao.set().
state(READY_TO_UNPIN));
dao.set()
.state(READY_TO_UNPIN)
.pool(null));
} catch (JDOException | DataAccessException e) {
LOGGER.error("Database failure while expiring pins: {}",
e.getMessage());
@@ -63,6 +63,9 @@ public void run() {
Executors.newSingleThreadExecutor());
NDC.push("BackgroundUnpinner-" + _count.incrementAndGet());
try {
// First try to unpin all poolless pins from the DB, for which chunking is unnecessary.
unpin_poolless();

Semaphore idle = new Semaphore(MAX_RUNNING);
unpin(idle, executor);
idle.acquire(MAX_RUNNING);
@@ -81,6 +84,15 @@ public void run() {
}
}

@Transactional
protected void unpin_poolless() {
PinDao.PinCriterion criterion = _dao.where().state(READY_TO_UNPIN).pool(null);
int deleted = _dao.delete(criterion);
if (deleted > 0) {
LOGGER.debug("Deleted {} poolless pin(s) from the database.", deleted);
}
}

@Transactional
protected void unpin(final Semaphore idle, final Executor executor)
throws InterruptedException {
6 changes: 3 additions & 3 deletions skel/share/defaults/pinmanager.properties
@@ -172,9 +172,9 @@ pinmanager.reset-failed-unpins-period=2

# ---- Unpinning operations per task execution
#
# Pin unpinning is a background tasks. This property
# controls how many unpin operations should at most be processed
# per task execution. Use -1 for no limit.
# Pin unpinning is a background task. This property controls how many unpin
# operations that require contacting pools should at most be processed
# per unpin task execution. Use -1 for no limit.
#
pinmanager.max-unpins-per-run=200
