Introduction to Xrootd N2N for Disk Caching Proxy (Xcache) utilizing RUCIO metalink

Wei Yang edited this page Oct 25, 2019 · 34 revisions

Update 2019-08-02: add "pss.ccmlib ..." line to the config file for release 1.2 and above

What can Xcache and rucioN2N-for-Xcache do for me?

Xcache with the rucioN2N-for-Xcache plugin allows users to access RUCIO-managed ATLAS data via either of the following two URL forms. Xcache will cache the accessed data.

root://xcache-host//atlas/rucio/scope:file
root://xcache-host//root://somehost.cern.ch//path_to_atlassomedisk/rucio/scope/fa/6b/file

(In the second form, the data URL is already known to the user: root://somehost.cern.ch//..., but we prefix it with root://xcache-host// in order to cache the data. This second form also works for data files in non-RUCIO-managed paths and storage.)

For ATLAS, the RUCIO scope can be "user.me", "group.susy", "mc16_13TeV", "data15_13TeV", etc. The rucioN2N-for-Xcache plugin understands that both URL forms refer to the same file and shares a single cache entry for it.
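The second URL form is plain prefixing of a known data URL with the cache endpoint. A minimal sketch, with hypothetical host and path names:

```shell
# Prefix a known data URL with the cache endpoint (names are hypothetical)
origin="root://somehost.cern.ch//somedisk/rucio/scope/fa/6b/file"
cache_url="root://xcache-host//${origin}"
echo "$cache_url"
```

A client would then read cache_url (with xrdcp, for example), and Xcache fetches the data from the origin on a cache miss.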

Deploy Xcache and this plugin via Singularity

Xcache and the rucioN2N-for-Xcache plugin can be deployed via Singularity. If you choose to deploy this way, you need some basic knowledge of using Singularity. But you probably don't need to read the rest of this Wiki.

What is this plugin

rucioN2N-for-Xcache is an xrootd plugin module that identifies multiple copies of a distributed file based on the ATLAS RUCIO data management system. It allows xrootd to use one cache entry for all of them. For example, the following files are considered to be the same:

/disney/rucio/scope/fa/6b/file
root://bronco.stanford.edu//legoland/rucio/scope/fa/6b/file
root://donkey.ucsd.edu//seaworld/rucio/scope/fa/6b/file
/atlas/rucio/scope:file

The plugin will return the following file name to xrootd for caching:

/atlas/rucio/scope/fa/6b/file

fa and 6b are the first four characters of the MD5 hash of "scope:file", i.e. the output of the command echo -n scope:file | md5sum | cut -b1-4

In the last case, /atlas/rucio/scope:file is a gLFN (global logical file name); the plugin will query the RUCIO server for a metalink file listing all available replicas of the file (those that support the root protocol).
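The two hash-directory levels can be reproduced with standard tools. A minimal sketch, assuming a hypothetical scope and file name:

```shell
# Compute the two hash-directory levels the plugin uses:
# the first 4 hex characters of md5("scope:file"), split into 2 + 2.
scope="mc16_13TeV"       # hypothetical scope
name="somefile.root"     # hypothetical file name
hash4=$(echo -n "${scope}:${name}" | md5sum | cut -b1-4)
echo "/atlas/rucio/${scope}/${hash4:0:2}/${hash4:2:2}/${name}"
```

Because the hash depends only on scope and file name, every replica of the same file maps to the same cache path.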

The plugin also accepts and caches non-RUCIO files.

Build an rpm from source

Refer to the requirements above for building an rpm. Download the tar ball from release 1.0 and run the command

rpmbuild -tb rucioN2N-for-Xcache-1.0.tar.gz

The built rpm can be found at $HOME/rpmbuild/RPMS/x86_64/xrootd-rucioN2N-for-Xcache-*.rpm. The "debuginfo" rpm is for debugging only.

Compile from the source code

(if you don't want to use the above rpm method ...)

The source code has been tested to compile with Xrootd 4.10.0 and above. It is supported on RHEL6 and CentOS7 x86_64 platforms or equivalent.

  • Make sure the xrootd-devel, xrootd-server-devel, and xrootd-client-devel rpms (4.10.0 and above) are installed (refer to the xrootd.org web page).
  • Install the libcurl-devel rpm.
  • Download the source tar ball from github and expand it.
  • Type make. This will create the plugin XrdName2NameDCP4RUCIO.so.

A simple example of the Xrootd proxy cache

Pre-requisites

This plugin depends on features available in Xrootd release 4.10 and above.

To run an Xrootd proxy cache, choose a decent machine with adequate CPU, memory, network, and local disks for your needs (network and local disks, especially disk IOPS, are the main things to consider). Make the above plugin .so available on the machine. Then install the following rpms:

xrootd xrootd-libs xrootd-server xrootd-client-libs libssl libcurl

The proxy cache will likely work much better with TCMalloc (RHEL6 rpms: gperftools-devel and gperftools-libs; CentOS7 meta rpm: gperftools).

On the RHEL6 x86_64 platform, add the following to /etc/sysconfig/xrootd

export LD_PRELOAD=/path/to/libtcmalloc.so
export TCMALLOC_RELEASE_RATE=10
export XRD_METALINKPROCESSING=1
export XRD_LOCALMETALINKFILE=1
export XRD_STREAMERRORWINDOW=0

On the CentOS7 x86_64 platform, add the following to /etc/systemd/system/multi-user.target.requires/xrootd@default.service (assuming the @default instance is used), and then run systemctl daemon-reload

[Service]
Environment=LD_PRELOAD=/usr/lib64/libtcmalloc.so
Environment=TCMALLOC_RELEASE_RATE=10
Environment=XRD_METALINKPROCESSING=1
Environment=XRD_LOCALMETALINKFILE=1
Environment=XRD_STREAMERRORWINDOW=0

If the X509-related files are not in standard locations, one also needs to define the Unix environment variables X509_CERT_DIR, X509_VOMS_DIR, and X509_USER_PROXY in /etc/sysconfig/xrootd or xrootd@default.service.
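For example, on RHEL6 the additions to /etc/sysconfig/xrootd might look like the following (the paths here are hypothetical; use your site's actual locations):

```shell
# Hypothetical paths; only needed when they differ from the defaults
export X509_CERT_DIR=/etc/grid-security/certificates
export X509_VOMS_DIR=/etc/grid-security/vomsdir
export X509_USER_PROXY=/tmp/x509up_xcache
```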

There is a Github wiki page with an example of clustering Xcache with this N2N plugin.

Configure a Xrootd proxy cache with the plugin

Please refer to the Xrootd Proxy Cache Configuration page for up-to-date info. The following is only a simple example.

Assume there are two local disks mounted at /xrdcache1 and /xrdcache2, and that the directories /xrd/namespace and /xrd/xrdcinfos exist on the machine to host the xrootd namespace, caching info, etc. All directories should be writable by user "xrootd" (created by the rpms).

The following example /etc/xrootd/xrootd-clustered.cfg sets up a simple xrootd proxy cache and uses this plugin:

all.export   /atlas/rucio r/o
all.export   /root:/
all.export   /xroot:/

xrootd.async maxtot 16384 limit 32
all.adminpath /var/spool/xrootd
all.pidpath /var/run/xrootd

# both /xrd and /xrdcache* should be owned by the user that runs xrootd.
oss.localroot  /xrd/namespace
oss.space meta /xrd/xrdcinfos
# /xrdcache1 is the mount point of disk space used as cache. So is /xrdcache2, etc., if you have more than one disk.
oss.space data /xrdcache1
oss.space data /xrdcache2
oss.path /atlas/rucio r/w

# pfc.ram is the total amount of RAM xrootd can use
pfc.ram 64g
# pfc.diskusage low-water-mark high-water-mark should reflect the total space of /xrdcache1 and /xrdcache2
pfc.diskusage 8000g 9000g

pfc.spaces data meta
pfc.blocksize 1M
pfc.prefetch 0
pfc.trace info

ofs.osslib  /usr/lib64/libXrdPss.so

pss.origin localfile:1094
pss.cachelib /usr/lib64/libXrdFileCache.so
pss.config streams 512
pss.trace debug

pss.namelib -lfncache -lfn2pfn /usr/lib64/XrdName2NameDCP4RUCIO.so
pss.ccmlib /usr/lib64/XrdName2NameDCP4RUCIO.so

Before you can run anything, the uid that runs the xrootd process needs a valid X509 proxy with ATLAS VOMS attributes. By default this file is /tmp/x509up_u$(id -u). Please supply this file and keep it updated.

After creating the config file and the X509 proxy, user root can start the service with the command:

service xrootd start

(On CentOS7, the equivalent is systemctl start xrootd@default.)

Miscellaneous

Add the following to /etc/sysconfig/xrootd (or the equivalent on RHEL7) to print additional info in the Xrootd log file.

export XRDPOSIX_DEBUG=1

Xrootd proxy cache can use many threads and file descriptors. Make sure you allow for these in /etc/security/limits.d/somefile.conf, with something like the following:

* soft nofile 32768
* hard nofile 32768
* soft nproc   8192