BucketFS is a transactional FUSE filesystem for object storage. It mounts one or more buckets from a config file. Each bucket exposes a read-only namespace backed by a local cache, while a configured workdir is prefetched and bind-mounted as the only writable area. Paths outside the workdir are represented by zero-size placeholders created on directory listing; the real object is downloaded on first open/read.
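The placeholder-then-fetch flow can be sketched in Python. This is illustrative only: `list_dir`, `open_for_read`, and the `fetch` callback are hypothetical names standing in for BucketFS's actual internals, and keys are assumed to be flat (no `/`).

```python
import os

def list_dir(cache_dir, remote_keys):
    """On directory listing, create zero-size placeholders for remote
    objects not yet cached (sketch of the lazy-listing step)."""
    os.makedirs(cache_dir, exist_ok=True)
    for key in remote_keys:
        path = os.path.join(cache_dir, key)
        if not os.path.exists(path):
            # Zero-size placeholder: stat() works, data is not yet local.
            open(path, "a").close()
    return sorted(os.listdir(cache_dir))

def open_for_read(cache_dir, key, fetch):
    """On first open/read, replace an empty placeholder with real data."""
    path = os.path.join(cache_dir, key)
    if os.path.getsize(path) == 0:
        data = fetch(key)              # download from object storage
        with open(path, "wb") as f:
            f.write(data)
    return open(path, "rb")
```

Subsequent opens find a non-empty cache file and skip the fetch entirely, which is why only first access pays the download cost.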
All modifications stay local until the running process receives a stop signal
(for example, systemctl stop), which unmounts and uploads regular files from
the workdir back to S3. This keeps local access fast, downloads only the data
you need, and uploads only when you want it.
- Local POSIX filesystem backed by a cache directory
- Lazy object download on first open/read (outside workdir)
- Prefetched workdir with upload on shutdown
- Deterministic overwrite semantics
- Designed for workflows (Nextflow, HPC, batch jobs)
BucketFS is not:

- A distributed POSIX filesystem (all POSIX ops are local to the cache)
- A shared multi-writer filesystem
- A live write-through S3 mount
- A replacement for NFS
Each bucket uses its own cache root (derived from `cachepath` plus the bucket name):

```
cache_root/
├── .<bucket>/        # Local object cache
├── .<bucket>.index   # Cache index
└── .<bucket>.pid     # Daemon PID
```
All POSIX operations operate on the local cache. S3 is only touched for lazy fetch (outside workdir) and explicit shutdown (upload).
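The per-bucket paths in the layout above can be derived mechanically. A minimal sketch (the `cache_paths` helper is a hypothetical name, not the daemon's API):

```python
import os

def cache_paths(cachepath, bucket):
    """Derive the per-bucket cache locations from the documented layout."""
    return {
        "objects": os.path.join(cachepath, f".{bucket}"),        # local object cache
        "index":   os.path.join(cachepath, f".{bucket}.index"),  # cache index
        "pid":     os.path.join(cachepath, f".{bucket}.pid"),    # daemon PID
    }
```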
```
sudo systemctl start bucketfs.service
```

The systemd unit reads /etc/bucketfs/bucketfs.conf and logs to /var/log/bucketfs.log.
What happens:
- Local cache and mount path are prepared per bucket
- Workdir (if configured) is prefetched per bucket
- FUSE filesystem is mounted per bucket (one thread per bucket)
- After mounts are ready, the workdir is bind-mounted from cache into the bucket namespace
- Process runs until it receives a stop signal
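The run-until-stopped lifecycle above can be sketched as a main loop: one thread per bucket, blocking until SIGTERM, then running the shutdown path. Illustrative only; `mount_bucket` and `on_stop` are hypothetical callbacks, not BucketFS's actual entry points.

```python
import signal
import threading

def serve(bucket_sections, mount_bucket, on_stop):
    """One thread per bucket runs its FUSE loop; the process blocks
    until SIGTERM, then runs the shutdown path (unmount + upload)."""
    stop = threading.Event()
    signal.signal(signal.SIGTERM, lambda *_: stop.set())
    threads = [threading.Thread(target=mount_bucket, args=(name,), daemon=True)
               for name in bucket_sections]
    for t in threads:
        t.start()
    stop.wait()   # run until a stop signal arrives
    on_stop()     # shutdown happens in the running process
```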
```
sudo systemctl stop bucketfs.service
```

What happens:
- SIGTERM triggers shutdown in the main process
- Unmounts workdir bind mounts
- Stops FUSE loops and unmounts FUSE mounts
- Uploads regular files under the workdir cache back to S3 (skipped if no workdir)
No deletes are performed on S3. Non-regular files are skipped.
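The upload selection on shutdown can be sketched as a walk over the workdir cache. Illustrative only; `files_to_upload` is a hypothetical helper, not the daemon's code.

```python
import os

def files_to_upload(workdir_cache):
    """Collect regular files under the workdir cache for upload.
    Symlinks and other non-regular entries are skipped, and nothing
    is ever deleted remotely."""
    uploads = []
    for root, _dirs, names in os.walk(workdir_cache):
        for name in names:
            path = os.path.join(root, name)
            # os.path.isfile() follows symlinks, so rule out links first.
            if not os.path.islink(path) and os.path.isfile(path):
                uploads.append(os.path.relpath(path, workdir_cache))
    return sorted(uploads)
```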
bucketfs.conf is an INI file with one section per bucket:
```ini
[nf-data]
bucket = nf-data
aws_access_key_id = AKIA...
aws_secret_access_key = ...
region = us-east-1
endpoint = https://s3.amazonaws.com
workdir = /nf-data/workdir
#lastuploadfile = .exitcode

[ngi-igenomes]
bucket = ngi-igenomes
aws_access_key_id = AKIA...
aws_secret_access_key = ...
region = us-east-1
endpoint = https://s3.amazonaws.com
#workdir = /ngi-igenomes/workdir
```

Notes:

- If `workdir` is unset, the mount is read-only.
- `workdir` must be inside `/<bucket>` or it will be ignored.
- `cachepath` defaults to `/tmp/bucketfs` if unset.
- `lastuploadfile` defaults to `.exitcode` if unset.
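Parsing this file with the documented defaults can be sketched with `configparser`. Illustrative only; `load_buckets` is a hypothetical helper, and the real daemon's field handling may differ.

```python
import configparser

DEFAULT_CACHEPATH = "/tmp/bucketfs"    # documented default
DEFAULT_LASTUPLOADFILE = ".exitcode"   # documented default

def load_buckets(text):
    """Parse a bucketfs.conf-style INI string, applying defaults and
    the workdir rule (workdir must live inside /<bucket>)."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    buckets = {}
    for section in cfg.sections():
        bucket = cfg.get(section, "bucket", fallback=section)
        workdir = cfg.get(section, "workdir", fallback=None)
        if workdir is not None and not workdir.startswith(f"/{bucket}/"):
            workdir = None   # outside /<bucket>: ignored, mount is read-only
        buckets[section] = {
            "bucket": bucket,
            "workdir": workdir,   # None => read-only mount
            "cachepath": cfg.get(section, "cachepath",
                                 fallback=DEFAULT_CACHEPATH),
            "lastuploadfile": cfg.get(section, "lastuploadfile",
                                      fallback=DEFAULT_LASTUPLOADFILE),
        }
    return buckets
```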
Reads:

- Outside workdir: `ls` creates zero-size placeholders; `open`/`read` downloads the object to cache before serving
- Inside workdir: served from the local cache (prefetched at mount)
Writes (local overlay):
- Allowed only inside workdir
- Written only to local cache (S3 untouched until shutdown)
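The write gate can be sketched as a path check. Illustrative only; `write_allowed` is a hypothetical name for a rule the daemon enforces inside its FUSE write handlers.

```python
import os

def write_allowed(path, workdir):
    """Return True only for paths inside the configured workdir."""
    if workdir is None:
        return False   # no workdir => read-only mount
    path = os.path.normpath(path)
    workdir = os.path.normpath(workdir)
    return path == workdir or path.startswith(workdir + os.sep)
```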
Other POSIX operations:

- `mkdir`, `unlink`, `rmdir`, `rename`, `chmod`, `chown`, `utimens`, `statfs`, `readlink`, `symlink`, `link`, `truncate` all operate on the local cache
- `open`, `write`, `link`, and `symlink` will lazily download missing source objects before acting (outside workdir)
These rules are intentional and strict:
- On read miss: S3 overwrites local placeholder
- On shutdown: local overwrites S3 for regular files under workdir
- No merge
- No conflict resolution
- No remote deletes
- `workdir` must include `/<bucket>` (ignored otherwise)
- `workdir` cannot be `/`
- Prefetch failures abort the mount (no FUSE start)
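The workdir rules can be sketched as a validator. Illustrative only; `validate_workdir` is a hypothetical helper, and whether `workdir` may equal `/<bucket>` itself is an assumption here.

```python
def validate_workdir(bucket, workdir):
    """Return workdir if it satisfies the rules, else None (ignored:
    the mount becomes read-only)."""
    if workdir is None or workdir == "/":
        return None   # workdir cannot be /
    if workdir != f"/{bucket}" and not workdir.startswith(f"/{bucket}/"):
        return None   # workdir must include /<bucket>
    return workdir
```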
This folder includes a prebuilt Debian package for amd64:

```
sudo dpkg -i bucketfs_amd64.deb
sudo apt-get -f install
```

Then create your config (start from conf.sample) and place it at `/etc/bucketfs/bucketfs.conf`.

Enable and start the service:

```
sudo systemctl enable --now bucketfs.service
```
```
# Run pipeline
nextflow run pipeline.nf --input /nf-data/data

# Sync and unmount
sudo systemctl stop bucketfs.service
```

- Credentials are stored in `bucketfs.conf` (AWS-style fields)
- The FUSE daemon does not need credentials after start
- Shutdown happens in the running process when it receives SIGTERM
- Not safe for concurrent writers across machines
- Writes are limited to the configured workdir
- No remote deletes (uploads only)
BucketFS gives you:
- POSIX where it matters
- Explicit uploads
- Deterministic behavior
- Zero hidden magic
Apache-2.0
- Diagrams
- Nextflow-specific section
- Eviction policy docs
- Security hardening notes