Unused Block Deletion
A common question that comes up with s3backer is: suppose you have a normal "upper" filesystem on top of s3backer via the usual loopback mount, and you have created some huge file in that upper layer filesystem. Now suppose sometime later you delete that huge file. Why is Amazon still charging for storing all that data?
When filesystems "delete" a file, they don't actually zero out the block data; instead, they simply mark those blocks as unused in their on-disk metadata structures. So although the filesystem knows that these blocks are unused and available, s3backer, not knowing anything at all about what the upper layer filesystem is doing, will think they are still in use. So you will continue paying Amazon for them whether they are holding actual file content or not.
Because it is "upper layer agnostic", the only time s3backer knows a block is not used is when you write all zeroes to it. So one way to clean up your filesystem is to use a utility like `zerofree`. However, this requires unmounting the upper filesystem and manually running the utility. It can also take a long time.
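The manual cleanup described above might look roughly like the following sketch. It assumes an ext2/3/4 upper filesystem (the only kind `zerofree` handles), and the paths `BACKING_FILE` and `UPPER_MNT` are hypothetical placeholders, not names taken from s3backer. By default the script only prints the commands it would run; set `DRY_RUN=0` to execute them as root.

```shell
#!/bin/sh
# Sketch: zero the free space of an ext2/3/4 upper filesystem with zerofree.
# BACKING_FILE and UPPER_MNT are hypothetical paths; adjust for your setup.
BACKING_FILE=${BACKING_FILE:-/mnt/s3backer/file}
UPPER_MNT=${UPPER_MNT:-/mnt/upper}
DRY_RUN=${DRY_RUN:-1}   # default: only print the commands

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Unmount the upper filesystem, zero its unused blocks, then remount it.
zero_unused_blocks() {
    run umount "$UPPER_MNT" &&
    run zerofree -v "$BACKING_FILE" &&
    run mount -o loop "$BACKING_FILE" "$UPPER_MNT"
}

zero_unused_blocks
```

The upper filesystem is unavailable for the whole run, which is the main drawback noted above.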
The ideal solution would be for the upper layer filesystem to somehow communicate to s3backer which blocks it no longer cares about. However, for this to work, the communication has to go through several layers:
- From the upper layer filesystem to its block device (which in this case is really a loopback mount)
- From the loopback "block device" to the underlying filesystem file on which it is mounted (in this case s3backer's
- From the underlying filesystem file to the filesystem containing it (in this case FUSE)
- From the FUSE filesystem to the s3backer process
Fortunately, this is now possible in Linux as all the links in the chain exist. However, you have to have new enough versions of everything.
Here is what is required to get this working. The requirements below correspond to the links in the chain listed above:
- The "upper" filesystem must support the `TRIM` block device operation. Examples include ext4, Btrfs (Linux >= 3.7), and XFS. For ext4 and Btrfs, you must also mount with `-o discard`.
- Loopback device support for `TRIM` requires Linux >= 3.2.
- FUSE filesystem support for `FALLOC_FL_PUNCH_HOLE` requires Linux >= 3.5.
- FUSE API support for `fallocate()` requires FUSE >= 2.9.2 and s3backer >= 1.3.4.
So the bottom line is:
- s3backer >= 1.3.4
- FUSE >= 2.9.2
- Linux kernel >= 3.5
- A supporting "upper layer" filesystem (mounted with `-o discard` if ext4 or Btrfs).
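As a rough illustration of checking a minimum version from a script, the sketch below compares the running kernel against 3.5. The helper name `version_ge` is made up for this example, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does. The same helper could be pointed at the FUSE and s3backer version strings, however you obtain them on your system.

```shell
#!/bin/sh
# version_ge A B: succeed when version string A >= B.
# Relies on `sort -V` (version sort), so that 3.10 sorts after 3.5.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip any distribution suffix, e.g. "5.15.0-91-generic" -> "5.15.0".
kernel=$(uname -r | cut -d- -f1)

if version_ge "$kernel" 3.5; then
    echo "kernel $kernel: OK (>= 3.5)"
else
    echo "kernel $kernel: too old, need >= 3.5"
fi
```

A plain string or numeric comparison would get cases like 3.10 versus 3.5 wrong, which is why the sketch leans on version sort.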
Check the following things:
- `uname -r` should show kernel version 3.5.0 or higher
- When you `./configure` s3backer you should see this message: `checking for fallocate() support in fuse... yes`
- `fallocate -np -l 1024 testfile` inside a directory in your "upper layer" filesystem should not generate an error
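The last check can be wrapped in a small script that cleans up after itself. This is only a sketch: `check_punch_hole` and `UPPER_DIR` are names invented here, and the script runs the same `fallocate -np -l 1024` command as above against a throwaway file.

```shell
#!/bin/sh
# check_punch_hole DIR: run the punch-hole test from the checklist in DIR.
# Returns 0 if fallocate succeeded, 1 otherwise; always removes the test file.
check_punch_hole() {
    testfile="$1/punch-hole-test.$$"
    if fallocate -np -l 1024 "$testfile" 2>/dev/null; then
        status=0
    else
        status=1
    fi
    rm -f "$testfile"
    return "$status"
}

# UPPER_DIR is a hypothetical variable; it defaults to the current directory.
UPPER_DIR=${UPPER_DIR:-.}
if check_punch_hole "$UPPER_DIR"; then
    echo "punch-hole supported in $UPPER_DIR"
else
    echo "punch-hole NOT supported in $UPPER_DIR (or fallocate is missing)"
fi
```

Run it with `UPPER_DIR` pointing at a directory inside the upper filesystem. A failure here means at least one link in the chain does not pass hole punching through.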
Linux provides a tool, `fstrim`, which scans mounted filesystems and discards their unused blocks. This tool can be used to "scrub" unused blocks from your s3backer bucket.
Note, however, that some Linux distributions (e.g., openSUSE) configure `fstrim` by default to run automatically on a schedule, such as once a week. If your upper filesystem is mounted with `-o discard` then you probably want to disable the automatic invocation; otherwise `fstrim` will cause a zillion useless DELETE requests to be sent to Amazon.
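To decide whether the periodic job is redundant, you could check whether the upper filesystem is actually mounted with a discard option. The sketch below parses `/proc/mounts`; the function name `has_discard` and the mount point `/mnt/upper` are assumptions made for illustration.

```shell
#!/bin/sh
# has_discard MOUNTPOINT [MOUNTS_FILE]
# Succeeds when MOUNTPOINT is listed with a "discard" mount option
# (also matches forms such as "discard=async"). MOUNTS_FILE defaults
# to /proc/mounts, whose fields are: device mountpoint fstype options ...
has_discard() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^discard(=|$)/) found = 1
        }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

UPPER_MNT=${UPPER_MNT:-/mnt/upper}   # hypothetical mount point
if has_discard "$UPPER_MNT"; then
    echo "$UPPER_MNT uses -o discard; periodic fstrim is redundant"
else
    echo "$UPPER_MNT has no discard option; run fstrim manually or on a schedule"
fi
```

On systemd-based distributions the scheduled job is typically the `fstrim.timer` unit, which can be inspected with `systemctl status fstrim.timer`; other systems may use a cron job instead.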