clueFS — a tool for tracing I/O activity at the file system level
cluefs is a lightweight utility to collect data on the I/O events induced by an application when interacting with a file system. It emits detailed, machine-parseable data on every file system-level operation.
The trace information emitted by this utility is meant to be analysed using tools not included in this package. You can find a collection of such tools in a separate project.
The main goal of developing this utility is to observe and quantify the file I/O load induced by the software system being developed by the LSST data management team to process the data to be collected by the Large Synoptic Survey Telescope (LSST).
cluefs does not depend on the LSST software system and can be used in unrelated contexts. It may also be useful for other purposes, such as getting an overall understanding of how file systems work or observing the (usually hidden and unexpected) operations performed when you mount a file system on your computer.
How to use
Let's suppose you want to observe what file operations the command cat $HOME/data/hello.txt induces on the file system where the file hello.txt is actually located. You can use cluefs to expose the contents under the directory $HOME/data (the shadow directory) through a synthesized file system mounted on /tmp/trace. To mount the file system use the command:
$ cluefs --shadow=$HOME/data --mount=/tmp/trace &
Once the file system is successfully mounted, when an application accesses a file or directory under /tmp/trace, cluefs emits an event for every call to the file system (e.g. close, etc.). For instance, the command:
$ cat /tmp/trace/hello.txt
makes cluefs emit the events below (one event per line):
...
2015-07-10T13:14:13.066799456Z,2015-07-10T13:14:13.066854171Z,54715,fabio,1000,fabio,1000,/bin/cat,28997,/home/fabio/data/hello.txt,file,open,O_RDONLY,0000,14,4096,58
2015-07-10T13:14:13.067274118Z,2015-07-10T13:14:13.067287085Z,12967,fabio,1000,fabio,1000,/bin/cat,28997,/home/fabio/data/hello.txt,file,read,14,0,4096,14,58
2015-07-10T13:14:13.067602625Z,2015-07-10T13:14:13.069215159Z,1612534,fabio,1000,fabio,1000,/bin/cat,28997,/home/fabio/data/hello.txt,file,flush,O_RDONLY,14,58
2015-07-10T13:14:13.069899802Z,2015-07-10T13:14:13.0699212Z,21398,root,0,root,0,,0,/home/fabio/data/hello.txt,file,release,58
...
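As a rough illustration of how such records could be post-processed, here is a minimal Python sketch that parses the sample events above and sums the time spent per operation. The column names in COMMON_FIELDS are guesses inferred from the sample, not the documented record format, and the third column is assumed to be the duration in nanoseconds because it matches the difference between the two timestamps. The helper names parse_events and time_per_operation are hypothetical.

```python
import csv
import io
from collections import defaultdict

# ASSUMPTION: names for the leading columns, guessed from the sample records.
# The authoritative layout is the record format documented by the project.
COMMON_FIELDS = [
    "start", "end", "duration_ns", "user", "uid", "group", "gid",
    "process", "pid", "path", "file_type", "operation",
]

SAMPLE = """\
2015-07-10T13:14:13.066799456Z,2015-07-10T13:14:13.066854171Z,54715,fabio,1000,fabio,1000,/bin/cat,28997,/home/fabio/data/hello.txt,file,open,O_RDONLY,0000,14,4096,58
2015-07-10T13:14:13.067274118Z,2015-07-10T13:14:13.067287085Z,12967,fabio,1000,fabio,1000,/bin/cat,28997,/home/fabio/data/hello.txt,file,read,14,0,4096,14,58
2015-07-10T13:14:13.067602625Z,2015-07-10T13:14:13.069215159Z,1612534,fabio,1000,fabio,1000,/bin/cat,28997,/home/fabio/data/hello.txt,file,flush,O_RDONLY,14,58
2015-07-10T13:14:13.069899802Z,2015-07-10T13:14:13.0699212Z,21398,root,0,root,0,,0,/home/fabio/data/hello.txt,file,release,58
"""


def parse_events(text):
    """Parse CSV trace records into dictionaries.

    Columns after the common ones vary per operation, so they are kept
    unparsed in the "extra" list.
    """
    events = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < len(COMMON_FIELDS):
            continue  # skip blank or truncated lines
        event = dict(zip(COMMON_FIELDS, row))
        event["duration_ns"] = int(event["duration_ns"])
        event["extra"] = row[len(COMMON_FIELDS):]
        events.append(event)
    return events


def time_per_operation(events):
    """Sum the reported duration of every event, grouped by operation."""
    totals = defaultdict(int)
    for event in events:
        totals[event["operation"]] += event["duration_ns"]
    return dict(totals)


events = parse_events(SAMPLE)
for operation, total_ns in time_per_operation(events).items():
    print(f"{operation:8s} {total_ns / 1000:10.1f} us")
```

In a real analysis the text would come from the file cluefs writes its trace to, rather than from an embedded sample.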
To get detailed help on how to use this utility, including examples of usage, do:
$ cluefs
USAGE:
   cluefs --mount=<directory> --shadow=<directory> [--out=<file>] [(--csv | --json)] [--ro]
   cluefs --help
   cluefs --version

Use 'cluefs --help' to get detailed information about options and examples of usage.
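If you drive cluefs from a script, the synopsis above maps naturally onto a small helper that assembles the argument list. The sketch below is only an illustration: the function name cluefs_argv and its parameter names are hypothetical, it uses only the flags listed in the synopsis, and it assumes (without confirmation from this document) that --ro requests a read-only mount.

```python
import shlex


def cluefs_argv(mount, shadow, out=None, fmt=None, read_only=False):
    """Build a cluefs command line from the flags shown in the synopsis.

    fmt is either "csv" or "json"; the two are mutually exclusive in the
    synopsis, so a single parameter selects at most one of them.
    """
    argv = ["cluefs", f"--mount={mount}", f"--shadow={shadow}"]
    if out is not None:
        argv.append(f"--out={out}")
    if fmt is not None:
        if fmt not in ("csv", "json"):
            raise ValueError("fmt must be 'csv' or 'json'")
        argv.append(f"--{fmt}")
    if read_only:
        # ASSUMPTION: --ro is taken to mean a read-only mount.
        argv.append("--ro")
    return argv


# Print the command line in a form that can be pasted into a shell.
print(shlex.join(cluefs_argv("/tmp/trace", "/home/fabio/data",
                             out="/tmp/trace.csv", fmt="csv")))
```

The returned list can be passed as is to subprocess.Popen to start cluefs in the background, much like the trailing & in the shell example.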
When you are done collecting the trace information you want, you can unmount the file system created by cluefs with the command:
$ sudo umount /tmp/trace
cluefs emits event records formatted in CSV or JSON. The format of each record is documented here.
How to install
This utility is tested on Scientific Linux v6 and v7, Ubuntu v14.04, CentOS v7 and MacOS X v10.9. It is possible that cluefs also works on other systems, or on other versions of those operating systems, where its dependencies are satisfied (see below).
To use cluefs you need Filesystem in Userspace (FUSE) installed on your system. To do that, please follow the installation instructions for your operating system in the table below:
|To install FUSE on ...||... follow the instructions below|
|Scientific Linux, CentOS||
|MacOS X||install the latest stable version of FUSE for OS X|
In addition, if you intend to build this software from sources you need both:
- the Go programming language tool chain, and
- a C compiler.
To install the Go tool chain please follow these detailed instructions. To install a C compiler please refer to the table below:
|To install C compiler on ...||... follow the instructions below|
|Scientific Linux, CentOS||
|MacOS X||download and install Xcode, including its command line tools|
The recommended way to install this tool is to download one of the ready-to-use binary files available for your target execution platform. Those are self-contained executable files, so you only need to download and unpack one to start using the tool.
Alternatively, to build from sources do:
go get -u github.com/airnandez/cluefs
How this utility works
cluefs implements a synthesized file system which exposes all the files and directories existing on the underlying shadow file system. It intercepts each system call (e.g. read, etc.), emits a trace event about the call and forwards the operation to the appropriate file system for execution. cluefs collects the result of the operation and returns it to the calling application.
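The intercept, emit, forward and return cycle described above can be sketched in a few lines of Python. This is a toy stand-in and not the cluefs implementation, which is a FUSE file system: the TracingProxy class, its method names and the fields of the emitted dictionary are all hypothetical, and plain os calls play the role of the underlying file system.

```python
import os
import tempfile
import time


class TracingProxy:
    """Toy model of a tracing layer placed in front of a shadow directory."""

    def __init__(self, shadow_dir, emit):
        self.shadow_dir = shadow_dir
        self.emit = emit  # callable receiving one event dictionary per call

    def _call(self, operation, target, func, *args):
        start = time.time_ns()
        try:
            # Forward the operation to the real file system and hand the
            # result back to the caller unchanged.
            return func(*args)
        finally:
            # Emit the trace event whether the call succeeded or failed.
            self.emit({
                "operation": operation,
                "target": target,
                "duration_ns": time.time_ns() - start,
            })

    def open(self, relative_path, flags=os.O_RDONLY):
        real_path = os.path.join(self.shadow_dir, relative_path)
        return self._call("open", real_path, os.open, real_path, flags)

    def read(self, fd, size):
        return self._call("read", fd, os.read, fd, size)

    def release(self, fd):
        return self._call("release", fd, os.close, fd)


def demo():
    """Read a small file through the proxy and collect the emitted events."""
    events = []
    with tempfile.TemporaryDirectory() as shadow:
        with open(os.path.join(shadow, "hello.txt"), "wb") as f:
            f.write(b"hello, world\n")
        fs = TracingProxy(shadow, events.append)
        fd = fs.open("hello.txt")
        data = fs.read(fd, 4096)
        fs.release(fd)
    return data, events


data, events = demo()
print(data, [event["operation"] for event in events])
```

Measuring the duration around the forwarded call is also what makes the non-zero overhead visible: every operation pays for the bookkeeping in addition to the real work.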
Although special attention has been given to make this utility as lightweight as possible, it is not intended to be permanently run in heavy-load I/O environments as there is an intrinsic non-zero performance penalty.
Currently, lock-related file system operations are not supported by
cluefs. That is, it does not emit traces for those operations and makes them appear as unsupported by the file system. These are the operations induced by calling the
fcntl(3) file system call using as second argument any of the values
You can contribute
Your contribution is more than welcome. There are several ways you can help:
- Test this software on your particular environment and let us know how it works. If it does not work for you and you think it should, please provide all the relevant details when opening a new issue.
- If you find a bug, please report it by opening an issue.
- If you spot a defect in either this documentation or the source code documentation, we consider it a bug, so please let us know.
- Provide feedback on how to improve this software by opening an issue.
The items in our to-do list are documented separately.
Although we have paid a lot of attention to making this utility as reliable as possible, it is still experimental and surely contains undiscovered bugs that may adversely affect your data.
In particular, please note that cluefs does not protect you against any destructive operation you can normally perform on your data. Use it at your own risk.
This software was developed and is maintained by Fabio Hernandez at IN2P3 / CNRS computing center (Lyon, France).
This work is based on other people's work, including:
Copyright 2015 Fabio Hernandez
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.