
CPIO testing

Eric Vaandering edited this page May 26, 2022 · 9 revisions

Motivation

Fermilab has hundreds of petabytes written with Enstore. Most of this data is in CPIO format: each file of interest is stored on tape as a separate tape file, with the file contents wrapped inside a CPIO header. Enstore uses the "Old Binary Format" of CPIO described here: https://www.systutorials.com/docs/linux/man/5-cpio/
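To make the wrapper concrete, here is a minimal sketch of packing and unpacking one entry in the old binary layout as the linked man page describes it: thirteen 16-bit fields with magic number 070707 (octal), 32-bit values split into two halves with the most significant half first, and the name and data each padded to an even length. The little-endian byte order and the helper names `pack_entry` and `unpack_entry` are assumptions made for illustration. How Enstore actually fills the individual fields is not shown here.

```python
import struct

MAGIC = 0o070707                 # old binary cpio magic number
HEADER = struct.Struct("<13H")   # 13 unsigned shorts = 26 bytes (little-endian assumed)


def _pad_even(raw: bytes) -> bytes:
    """Pad with one NUL byte if the length is odd."""
    return raw + b"\0" * (len(raw) % 2)


def pack_entry(name: str, data: bytes, mode: int = 0o100644, mtime: int = 0) -> bytes:
    """Wrap data in an old-binary cpio header (hypothetical helper)."""
    raw_name = name.encode() + b"\0"        # namesize counts the trailing NUL
    header = HEADER.pack(
        MAGIC, 0, 0, mode, 0, 0, 1, 0,      # magic, dev, ino, mode, uid, gid, nlink, rdev
        mtime >> 16, mtime & 0xFFFF,        # mtime, most significant half first
        len(raw_name),                      # namesize
        len(data) >> 16, len(data) & 0xFFFF,  # filesize, most significant half first
    )
    return header + _pad_even(raw_name) + _pad_even(data)


def unpack_entry(buf: bytes):
    """Return (name, data) for the entry at the start of buf."""
    fields = HEADER.unpack_from(buf, 0)
    if fields[0] != MAGIC:
        raise ValueError("not an old binary cpio header")
    namesize = fields[10]
    filesize = (fields[11] << 16) | fields[12]
    name = buf[HEADER.size:HEADER.size + namesize - 1].decode()
    data_start = HEADER.size + namesize + (namesize % 2)
    return name, buf[data_start:data_start + filesize]


# An archive ends with an entry named "TRAILER!!!".
archive = pack_entry("a.txt", b"hello") + pack_entry("TRAILER!!!", b"", mode=0)
name, data = unpack_entry(archive)
```

Because file sizes are stored in 32 bits, this sketch only applies to files that fit that limit.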

We are carrying out increasingly functional tests and modifications of CTA so that it can read this file format. As we currently understand it, there is no need for CTA to write in this format.

In another difference from CTA, Enstore does not write separate file headers and footers to the tape. So the tape file position of the Nth file is roughly 3N in CTA (header, file, footer) and N in Enstore (CPIO file).
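The arithmetic can be sketched as follows. This assumes that tape files are counted from 0, that in the header/file/footer layout the volume label shares the first header file, and that the CPIO layout has exactly one label file before the first data file. These details, and both function names, are illustrative assumptions, not taken from the CTA or Enstore code.

```python
def wrapped_layout_index(n: int) -> int:
    """0-based tape-file index of the data for the nth file (1-based)
    when every file occupies three tape files: header, data, footer."""
    return 3 * (n - 1) + 1


def bare_layout_index(n: int, label_files: int = 1) -> int:
    """0-based tape-file index of the nth file (1-based) when each file
    is a single tape file preceded only by the label file(s)."""
    return label_files + (n - 1)


for n in (1, 10, 1000):
    print(n, wrapped_layout_index(n), bare_layout_index(n))
```

The index is also the number of file marks to skip from the beginning of the tape, which is why the two layouts need different positioning logic.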

Past (and passed) tests

  1. Our first test was to modify the File class in the underlying CASTOR code to both read and write CPIO formatted files, but only for very short files which fit inside the 256 kB buffer size of CASTOR. This simply took the file information provided by CASTOR, filled the appropriate fields in the CPIO format, added a CPIO header and footer to the CASTOR buffer, and wrote that out to a virtual (MHVTL) tape.

    On reading, the CPIO file is read back, the CPIO header information is accessed and that information is used to return a (short) file back to CASTOR. This only worked for small buffers. However, EOS/CTA checks the file size and checksum of the returned data against what is expected; only successfully returning the same data allows the test to pass.

  2. The code for extracting the original file was modified to extract a file which spans multiple 256 kB buffers. Class variables were added to the file reading class to allow the reader to know what kind of buffer it has: with header, with header and trailer, with trailer, only trailer, or only file content. Each of these is possible depending on how the content and buffer boundaries overlap. This test currently passes for small files (everything fits in a 256 kB buffer). Tests with longer files require a different mechanism to create them.
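The five buffer kinds in test 2 can be illustrated with a small classifier. This is a sketch only, not the C++ reader: the enum, the function names, and the assumption that the header always fits inside the first buffer are ours.

```python
from enum import Enum


class BufferKind(Enum):
    HEADER = "header + content"
    HEADER_AND_TRAILER = "header + content + trailer"
    TRAILER = "content + trailer"
    ONLY_TRAILER = "trailer only"
    ONLY_CONTENT = "content only"


def classify_buffers(header_len: int, payload_len: int, trailer_len: int,
                     buffer_size: int) -> list:
    """Classify each fixed-size buffer of a header|payload|trailer stream."""
    header_end = header_len
    payload_end = header_len + payload_len
    total = payload_end + trailer_len
    kinds = []
    for start in range(0, total, buffer_size):
        end = min(start + buffer_size, total)
        has_header = start < header_end
        has_payload = start < payload_end and end > header_end
        has_trailer = end > payload_end
        if has_header and has_trailer:
            kinds.append(BufferKind.HEADER_AND_TRAILER)
        elif has_header:
            kinds.append(BufferKind.HEADER)
        elif has_trailer and has_payload:
            kinds.append(BufferKind.TRAILER)
        elif has_trailer:
            kinds.append(BufferKind.ONLY_TRAILER)
        else:
            kinds.append(BufferKind.ONLY_CONTENT)
    return kinds


# Toy sizes (buffer of 100 bytes) instead of 256 kB, to keep the cases visible.
small = classify_buffers(10, 50, 20, 100)
spanning = classify_buffers(10, 250, 20, 100)
aligned = classify_buffers(10, 190, 20, 100)
```

The `aligned` case shows how a payload that ends exactly on a buffer boundary leaves a final buffer holding only the trailer.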

  3. As an intermediate test we did some direct writing and reading from MHVTL with dd and with copying the MHVTL data/indx/meta files. Writing files directly with dd works, but in the first attempt we must not have had the "tape" positioned to the correct place because we ended up with too few files. Copying in the data/indx/meta files from tape 7 to tape 6 and reading tape 6 back with dd gives the same results as reading tape 7 initially.

  4. Our next test was to recreate a CTA tape with one file where CTA has only been used to label and register the tape. (We actually overwrite the label.) The single file has been written to "tape" using MHVTL and some Python classes to mimic the CASTOR tape labels, etc. CTA metadata was placed into the CTA database with Python SQLAlchemy methods. Files are placed into the EOS namespace with a modified version of the CASTOR to CTA migration code (migration/gRPC/EosImportFiles.cpp, C++ and Oracle based) which determines which files to place from a CSV file rather than a database. We've built a migration container to hold all the code necessary to do this and check it.

    In more detail:

    • Making a file placeholder in EOS. The CASTOR migration code in CTA was repurposed to read a CSV file and make file entries in EOS which indicate that the file is on tape. The CASTOR migration works by reading SQL from CASTOR tables and writing it to temporary tables in the database, so we chose the CSV route instead.
    • Reading/writing actual files with MHVTL was more or less understood from test 3 above. We just need to get the ordering and file marks exactly right.
    • By running a simple one-file archive and recall test, the CTA database format is well enough understood to fake out the CTA database. One field in the database was particularly troublesome to recreate: the checksum_blob, which turns out to be a Google protobuf representation of an array of different kinds of checksums. Python code has been written to reproduce this.

    This code will likely form the basis of our eventual Enstore to CTA migration code.

  5. We switched to a modified version of the DESY code and built on the mechanism from test #4. In the first test, we wrote an "Enstore tape" into MHVTL with Python code, with no more CASTOR headers, a slightly different VOL1 label, and still just one small file. We were able to read this back successfully.

  6. We moved on to multiple files, including files larger than the block size, and quickly discovered a problem. CTA uses block positioning, but Enstore does not store the block position on the tape. The CTA code was further hacked to force file-by-file (file mark skipping) positioning for Enstore-labeled tapes. At that point, reading back files placed on "Enstore tapes" with MHVTL was successful.
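File-mark positioning, as used in test 6, can be illustrated with a toy in-memory tape. This is a simulation only: the list-based tape model and the function names are hypothetical and do not correspond to CTA or MHVTL code. It assumes one label file at the start of the tape and one tape file per stored file.

```python
FILEMARK = None  # sentinel standing in for a tape file mark


def build_tape(label: bytes, files: list, block_size: int) -> list:
    """Lay out a label file and then one tape file per entry, split into blocks."""
    tape = [label, FILEMARK]
    for data in files:
        tape.extend(data[i:i + block_size] for i in range(0, len(data), block_size))
        tape.append(FILEMARK)
    return tape


def read_by_filemark(tape: list, fseq: int) -> bytes:
    """Read file number fseq (1-based) by skipping file marks from the start."""
    pos = 0
    marks_to_skip = fseq          # the label file plus fseq - 1 earlier files
    while marks_to_skip:
        if tape[pos] is FILEMARK:
            marks_to_skip -= 1
        pos += 1
    blocks = []
    while tape[pos] is not FILEMARK:   # collect blocks up to the next file mark
        blocks.append(tape[pos])
        pos += 1
    return b"".join(blocks)


tape = build_tape(b"VOL1", [b"a" * 5, b"b" * 12, b"c"], block_size=4)
second = read_by_filemark(tape, 2)
```

No block address is needed to find a file this way, at the cost of passing over every earlier file mark.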

Plan

  1. For the code, we will wait for the DESY patches for reading OSM/CPIO files to be accepted into the CTA code. At that point the final modifications for Enstore tape reading should be fairly easy to make. (Right now our code which reads Enstore tapes cannot read anything else.)

  2. We are still working out the details of using the logical library which is part of the IBM TS4500.
