Curator is an automation tool designed to manage log files, ensuring efficient disk space usage and compliance with data retention policies. It scans specified directories, identifies old log files, and performs cleanup or archival actions based on user-defined configurations.
- Scan Directories: Recursively scans specified directories for log files.
- Delete Policy: Deletes log files older than a configurable age.
- Archive Policy: Compresses and archives log files older than a configurable age.
- Policy Definition: Policies are defined through a simple and flexible YAML configuration file.
- Command-Line Interface: Provides a CLI for manual execution or integration into scheduling systems (e.g., cron).
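The scanning step described above can be sketched roughly as follows. This is a minimal illustration, not Curator's actual implementation: the function name `find_old_files` is hypothetical, and it assumes that a file's age is measured by its last-modification time, which the description does not specify.

```python
import fnmatch
import os
import time
from pathlib import Path
from typing import Iterator, Optional

SECONDS_PER_DAY = 86400


def find_old_files(
    root: str,
    pattern: str,
    older_than_days: float,
    now: Optional[float] = None,
) -> Iterator[Path]:
    """Yield files under `root` (recursively) that match `pattern`
    and were last modified more than `older_than_days` days ago.

    Hypothetical helper: age is assumed to mean mtime.
    """
    now = time.time() if now is None else now
    cutoff = now - older_than_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        # fnmatch.filter applies the glob to file names only, not full paths.
        for filename in fnmatch.filter(filenames, pattern):
            candidate = Path(dirpath) / filename
            if candidate.stat().st_mtime < cutoff:
                yield candidate
```

Passing `now` explicitly makes the cutoff reproducible, which is convenient when checking a policy against a fixed point in time.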
- Python 3.8+
1. Clone the repository. This is only needed when installing from source; a PyPI package is planned, and once it is available you can skip to step 3.

       git clone https://github.com/your-org/curator.git
       cd curator

2. Install dependencies:

       pip install -r requirements.txt

3. Install Curator:

       pip install .
       # Or, once Curator is available on PyPI:
       # pip install curator
Curator uses a config.yml (or similar) file to define its policies. A basic configuration typically includes log directories, retention ages, and archive settings.
Create a config.yml file in any location, for example the directory you run Curator from, and pass its path to Curator with --config.

    # Example config.yml
    log_policies:
      - name: web_server_logs
        path: /var/log/apache2
        file_pattern: "*.log"
        delete_after_days: 30
        archive_after_days: 7
        archive_path: /var/log/apache2/archive
        compress_format: gz  # Options: 'gz', 'bz2', 'xz', 'zip'
        # exclude_pattern: "access.log"
      - name: application_logs
        path: /var/log/myapp
        file_pattern: "*.log"
        delete_after_days: 60
        # No archival for this policy, only deletion.
      - name: system_journal
        path: /var/log/journal
        file_pattern: "*.journal"
        delete_after_days: 90
        archive_after_days: 30
        archive_path: /mnt/backup/journal_archives
        compress_format: bz2
    log_level: INFO  # Options: DEBUG, INFO, WARNING, ERROR, CRITICAL

- log_policies: A list of policy definitions.
  - name: A unique name for the policy.
  - path: The absolute path to the directory to scan.
  - file_pattern: A glob pattern (e.g., *.log, app-*.txt) to match log files.
  - delete_after_days: (Optional) Log files older than this age will be deleted. Set to null or omit to disable deletion.
  - archive_after_days: (Optional) Log files older than this age will be archived. Set to null or omit to disable archival. Must be less than or equal to delete_after_days if both are present.
  - archive_path: (Required if archive_after_days is set) The directory where archived files will be stored.
  - compress_format: (Required if archive_after_days is set) The compression format. Supported: gz, bz2, xz, zip.
  - exclude_pattern: (Optional) A glob pattern for files to explicitly exclude from the policy.
- log_level: (Optional) The logging level for Curator. Defaults to INFO.
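The constraints between these fields can be checked before any file is touched. The sketch below is an illustration only: `validate_policy` is a hypothetical helper, not part of Curator's documented API, and it assumes each policy has already been loaded from YAML into a plain dictionary. It checks only the rules stated in the parameter descriptions.

```python
from typing import Any, Dict, List

# Formats listed as supported in the configuration reference.
SUPPORTED_FORMATS = {"gz", "bz2", "xz", "zip"}


def validate_policy(policy: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid.

    Hypothetical helper covering only the documented rules.
    """
    errors: List[str] = []
    delete_after = policy.get("delete_after_days")
    archive_after = policy.get("archive_after_days")

    # Archival settings only matter when archive_after_days is set.
    if archive_after is not None:
        if not policy.get("archive_path"):
            errors.append("archive_path is required when archive_after_days is set")
        if policy.get("compress_format") not in SUPPORTED_FORMATS:
            errors.append("compress_format must be one of gz, bz2, xz, zip")
        # A file must be archived no later than it is deleted.
        if delete_after is not None and archive_after > delete_after:
            errors.append("archive_after_days must be <= delete_after_days")
    return errors
```

For example, a policy with `archive_after_days: 40` and `delete_after_days: 30` would be reported as invalid, because files would be deleted before they became eligible for archival.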
Once installed and configured, you can run Curator from your command line.

    curator --config /path/to/your/config.yml

- --config <path>: Specifies the path to the configuration YAML file. (Required)
- --dry-run: Performs a simulated run, showing what actions would be taken without actually modifying any files. (Optional)
- --log-level <level>: Overrides the log_level specified in the configuration file. (Optional)
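To make the relationship between these options concrete, here is one way such a command line could be parsed with Python's standard `argparse` module. This is a sketch under assumptions, not Curator's actual source: `build_parser` and `effective_log_level` are hypothetical names, and the precedence shown (command-line flag, then config file, then the INFO default) is inferred from the option descriptions.

```python
import argparse
from typing import Any, Dict

LOG_LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the three documented options."""
    parser = argparse.ArgumentParser(prog="curator")
    parser.add_argument("--config", required=True,
                        help="path to the configuration YAML file")
    parser.add_argument("--dry-run", action="store_true",
                        help="show planned actions without modifying files")
    parser.add_argument("--log-level", choices=LOG_LEVELS, default=None,
                        help="override log_level from the configuration file")
    return parser


def effective_log_level(args: argparse.Namespace, config: Dict[str, Any]) -> str:
    """Assumed precedence: CLI flag, then config file, then INFO."""
    return args.log_level or config.get("log_level") or "INFO"
```

With this sketch, `build_parser().parse_args(["--config", "config.yml", "--dry-run"])` yields a namespace where `dry_run` is true and `log_level` is unset, so the level falls back to whatever the configuration file specifies.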
To see what Curator would do without making any changes:

    curator --config /etc/curator/config.yml --dry-run

To run Curator daily at 2 AM using cron, add the following line to your crontab (crontab -e):

    0 2 * * * /usr/local/bin/curator --config /etc/curator/config.yml >> /var/log/curator.log 2>&1

(Ensure /usr/local/bin/curator is the correct path to your installed curator executable. Running which curator in your shell prints the path. Use the full path in the crontab entry, since cron typically runs with a minimal PATH.)