Filereaper is a tool that removes files according to flexible policies.
It can work as a command line executor and also as a system cron manager.
Code documentation: http://filereaper.readthedocs.org/
Compatible with Python 2.6 and Python 2.7
- use python logger
- implement more policies
- Two installation modes
- Debian package: the regular installation. You can get the latest package here: https://github.com/victorgp/filereaper/releases (disclaimer: this package installs Python modules directly, with no virtual environment, until I package it following the Debian guidelines for Python packages)
- Python egg: for development
- Filereaper can work in two different ways
As a command line executor:
Run filereaper from the command line, passing the parameters that specify the file regexp, policies, etc.
Recursively remove the files in /var/log/apache that match .*\.log, always keeping a minimum of 2 files and removing those older than 20 days, with the files ordered by access time. main.log and main2.log are excluded, and test mode is disabled so the files are actually removed.
$ filereaper --keepminimum 2 --file_match ".*\.log" --recurse true --time_mode atime --older_than_d 20 --exclude_list main.log,main2.log --test_mode False /var/log/apache
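To make the selection step of the command above concrete, here is a minimal Python sketch of gathering files recursively, filtering them by a regexp and an exclude list, and ordering them by access time. It is not filereaper's actual code: the function name gather_sorted_files is hypothetical, and matching the regexp against the base name (rather than the full path) is an assumption.

```python
import os
import re


def gather_sorted_files(root, file_match=".*", time_mode="atime", exclude=()):
    """Return paths under root whose base name matches file_match, oldest first.

    Hypothetical helper for illustration, not part of filereaper's API.
    """
    pattern = re.compile(file_match)
    stat_field = "st_" + time_mode  # st_atime, st_ctime or st_mtime
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: the regexp and the exclude list apply to base names.
            if name in exclude or not pattern.match(name):
                continue
            path = os.path.join(dirpath, name)
            found.append((getattr(os.stat(path), stat_field), path))
    found.sort()  # oldest timestamp first
    return [path for _timestamp, path in found]
```

Calling gather_sorted_files("/var/log/apache", r".*\.log", "atime", exclude=("main.log", "main2.log")) would return the candidate list that the policies are then applied to.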
As cron manager: (Under development)
This mode configures the Linux crontabs from a set of configuration files. The idea is to have one configuration file per directory to clean.
By default, you only need to add configuration files to /etc/filereaper/conf.d/ similar to the sample provided in the conf directory: https://github.com/victorgp/filereaper/blob/master/conf/conf.d/module_config.sample
Filereaper configures the system crontabs based on these configuration files. It also has a storage layer that remembers what is configured, so the crontabs always stay in sync with the configuration files.
Whenever you add, modify or remove a configuration file, you just need to tell filereaper to reload:
- This is the list of parameters filereaper supports (apart from policies):
test_mode Safety parameter, activated by default, that prints the files to be removed instead of actually removing them. This is useful while testing your configuration/command line. Once you are sure it's correct, disable this parameter and filereaper will perform the removal.
Default: True (activated)
exclude_list Comma-separated list of the files to exclude from the removal
Default: "" (empty)
file_match Only files that match this regexp are considered for removal
Default: ".*" (match everything)
file_groups_regexp You can group the files using this regexp so the removal will be applied per group.
For example, this is useful when you want to remove files while keeping the last 2 of each type. Suppose these files exist:
supervisor_1.0_all.deb, supervisor_2.0_all.deb, supervisor_3.0_all.deb, python-meld3_0.4.5-3_amd64.deb, python-meld3_0.5.5-3_amd64.deb, python-meld3_0.6.5-3_amd64.deb
You can set this parameter to "(.*?)_" and filereaper will apply the removal to two groups, "supervisor" and "python-meld3", therefore removing only supervisor_1.0_all.deb and python-meld3_0.4.5-3_amd64.deb
keepminimum Setting this parameter ensures that filereaper always keeps at least N files. This parameter has higher priority than any policy.
recurse Perform the removal recursively. The policies are applied to the full list of files gathered recursively.
remove_links If activated, filereaper removes symlinks. Only the symlink itself is removed, not the file or directory it points to.
time_mode Parameter used to order the files. The possible values are atime (access time), ctime (inode change time on Linux, often mistaken for creation time) and mtime (modification time)
quiet Do not print anything to stdout or stderr
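As an illustration of the file_groups_regexp parameter described in this list, the following Python sketch reproduces the supervisor/python-meld3 example. It is a sketch under assumptions, not filereaper's implementation: the helper names group_files and remove_all_but_last are hypothetical, the regexp is assumed to be matched from the start of the file name with the first capture group used as the group key, and the input list is assumed to be ordered oldest first.

```python
import re
from collections import OrderedDict


def group_files(sorted_files, file_groups_regexp):
    """Group file names by the first capture group of file_groups_regexp."""
    pattern = re.compile(file_groups_regexp)
    groups = OrderedDict()
    for name in sorted_files:
        match = pattern.match(name)
        # Assumption: files that do not match share one fallback group (None).
        key = match.group(1) if match else None
        groups.setdefault(key, []).append(name)
    return groups


def remove_all_but_last(sorted_files, keep):
    """Return the files to remove when keeping the `keep` newest ones."""
    return sorted_files[:-keep] if keep > 0 else list(sorted_files)


files = [  # assumed to be ordered oldest first within each group
    "supervisor_1.0_all.deb",
    "supervisor_2.0_all.deb",
    "supervisor_3.0_all.deb",
    "python-meld3_0.4.5-3_amd64.deb",
    "python-meld3_0.5.5-3_amd64.deb",
    "python-meld3_0.6.5-3_amd64.deb",
]

to_remove = []
for members in group_files(files, "(.*?)_").values():
    to_remove.extend(remove_all_but_last(members, 2))

print(to_remove)
# ['supervisor_1.0_all.deb', 'python-meld3_0.4.5-3_amd64.deb']
```

The non-greedy "(.*?)_" stops at the first underscore, which is why the version numbers do not end up in the group key.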
The policies are applied in the most restrictive way. This means removing as little as possible, for safety reasons. Therefore, if one policy says to remove files A and B and another policy says to remove A and C, only A will be removed.
The method that performs the removal first gathers all the files and then starts applying the policies. Each policy can only reduce the list of files to be removed or leave it unchanged; the list never grows.
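The "most restrictive" rule amounts to intersecting the removal candidates of every policy. The sketch below shows that idea in Python; the function name combine_policies is hypothetical, and the way keepminimum is honoured here (sparing the newest candidates) is an assumption, since the text only states that keepminimum has higher priority than any policy.

```python
def combine_policies(sorted_files, policy_results, keepminimum=0):
    """Intersect per-policy removal candidates.

    sorted_files: every gathered file, assumed ordered oldest first.
    policy_results: one list of removal candidates per policy.
    """
    to_remove = set(sorted_files)
    for candidates in policy_results:
        # Each policy can only shrink the set, never grow it.
        to_remove &= set(candidates)
    ordered = [f for f in sorted_files if f in to_remove]
    # Assumption: keepminimum caps how many files may be removed,
    # and the oldest candidates are removed first.
    max_removable = max(len(sorted_files) - keepminimum, 0)
    return ordered[:max_removable]


print(combine_policies(["A", "B", "C"], [["A", "B"], ["A", "C"]]))
# ['A']
```

With keepminimum=3 the same call returns an empty list, because removing anything would leave fewer than three files.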
- Policies list:
- keeplast: with the files ordered by time (atime, ctime or mtime), filereaper removes everything but the last N.
- older_than_d: with the files ordered by time (atime, ctime or mtime), filereaper removes the files older than N days
- older_than_m: with the files ordered by time (atime, ctime or mtime), filereaper removes the files older than N minutes
- older_than_s: with the files ordered by time (atime, ctime or mtime), filereaper removes the files older than N seconds
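The policies above can be sketched in a few lines of Python. This is only an illustration under assumptions, not filereaper's code: files are represented as (name, timestamp) pairs ordered oldest first, the function names mirror the policy names but their signatures are hypothetical, and "older than" is taken to mean strictly older than the cutoff.

```python
# Seconds per unit for older_than_d, older_than_m and older_than_s.
UNIT_SECONDS = {"d": 24 * 60 * 60, "m": 60, "s": 1}


def keeplast(sorted_files, n):
    """Return everything but the n newest files (the removal candidates)."""
    return sorted_files[:-n] if n > 0 else list(sorted_files)


def older_than(sorted_files, value, unit, now):
    """Return the files whose timestamp is older than `value` units before now."""
    cutoff = now - value * UNIT_SECONDS[unit]
    return [(name, ts) for name, ts in sorted_files if ts < cutoff]


DAY = UNIT_SECONDS["d"]
now = 100 * DAY  # fixed "current time" so the example is reproducible
files = [("a.log", 70 * DAY), ("b.log", 85 * DAY), ("c.log", 99 * DAY)]

print([name for name, _ in keeplast(files, 1)])
# ['a.log', 'b.log']
print([name for name, _ in older_than(files, 20, "d", now)])
# ['a.log']
```

Combined in the most restrictive way, keeplast=1 and older_than_d=20 would remove only a.log in this example.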
If you want to contribute by fixing bugs, improving the code, implementing more policies... whatever! Just send a pull request. The code must be covered by tests and have documentation.
How to create a new policy?
You just need to create a class in src/filereaper/executor/policies/ that inherits from BasePolicy and implements the "execute" method. Three attributes are already available for you to use:
- self.sorted_files -> the list of files sorted by "time_mode", so you can apply the policy over it
- self.params -> all params that you might need to implement your policy
- self.value -> the value of the policy parameter; for example, if the policy param is keeplast=2, self.value is 2
You will also need to add the corresponding new policy parameter to sbin/filereaper and update this documentation
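As a rough illustration of the steps above, here is what a new policy class might look like. Everything beyond the names BasePolicy, execute, sorted_files, params and value is an assumption: the BasePolicy below is a stand-in so the example runs on its own (the real class and its constructor may differ), KeepFirst is a hypothetical policy, and returning the list of files to remove from execute is a guess at the contract.

```python
class BasePolicy(object):
    """Stand-in for filereaper's BasePolicy so this sketch is self-contained.

    Assumption: the real class provides these three attributes; its
    constructor signature may be different.
    """

    def __init__(self, sorted_files, params, value):
        self.sorted_files = sorted_files
        self.params = params
        self.value = value

    def execute(self):
        raise NotImplementedError


class KeepFirst(BasePolicy):
    """Hypothetical policy: remove everything but the N oldest files."""

    def execute(self):
        keep = int(self.value)
        # Assumptions: sorted_files is ordered oldest first, and execute()
        # returns the files this policy would remove.
        return list(self.sorted_files[keep:])


policy = KeepFirst(["old.log", "mid.log", "new.log"], params={}, value=2)
print(policy.execute())
# ['new.log']
```

In the real code base the class would import BasePolicy from the policies package instead of defining it, and the matching command line parameter would be added to sbin/filereaper.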
You can use "python setup.py develop" to install the egg. You may also need to install the requirements with "pip install -r requierements.txt".