What is Janitor Monkey?
Janitor Monkey is a service that runs in the Amazon Web Services (AWS) cloud and looks for unused resources to clean up. Its design is flexible enough that it can be extended to work with other cloud providers and cloud resources. By default, the service runs on non-holiday weekdays at 11 AM. The schedule can easily be reconfigured to fit your business needs.
Janitor Monkey determines whether a resource is a cleanup candidate by applying a set of rules to it. If any rule determines that the resource is a cleanup candidate, Janitor Monkey marks the resource and schedules a time to clean it up. The design also makes it simple to customize the set of rules or to add new ones.
Because there are always cases where you want to keep an unused resource longer, the owner of a resource receives a notification a configurable number of days before the cleanup time. This prevents a resource that is still needed from being deleted. The resource owner can then flag the resources they want to keep as exceptions, and Janitor Monkey will leave them alone.
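The default schedule described above can be pictured with a small check. This is a minimal sketch in Python, not Janitor Monkey's own code or configuration: the function names, the `holidays` set, and the `run_hour` parameter are all hypothetical and exist only for illustration.

```python
from datetime import date, datetime


def is_run_day(day: date, holidays: frozenset = frozenset()) -> bool:
    """Return True for Monday-Friday dates that are not listed as holidays."""
    return day.weekday() < 5 and day not in holidays


def should_run(now: datetime, holidays: frozenset = frozenset(), run_hour: int = 11) -> bool:
    """Return True when `now` falls in the run hour of a non-holiday weekday.

    `run_hour` defaults to 11 to mirror the 11 AM default; it is an
    illustrative parameter, not a real Janitor Monkey setting.
    """
    return is_run_day(now.date(), holidays) and now.hour == run_hour
```

Changing the schedule then amounts to passing a different hour or holiday set, which is the kind of reconfiguration the text refers to.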
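The rule mechanism can be sketched as follows. This is an illustrative Python model under stated assumptions, not the project's actual rule API: the `Resource` fields, the `detached_volume_rule` function, and its 30-day and 7-day thresholds are hypothetical. It shows only the idea that any single matching rule marks the resource and supplies the cleanup time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class Resource:
    resource_id: str
    attached: bool
    last_used: datetime
    marked: bool = False
    termination_time: Optional[datetime] = None


# A rule returns a termination time if the resource is a cleanup
# candidate, or None if the rule does not apply.
Rule = Callable[[Resource, datetime], Optional[datetime]]


def detached_volume_rule(resource: Resource, now: datetime) -> Optional[datetime]:
    """Hypothetical rule: a volume detached for over 30 days is a candidate."""
    if not resource.attached and now - resource.last_used > timedelta(days=30):
        return now + timedelta(days=7)  # hypothetical retention window
    return None


def apply_rules(resource: Resource, rules: List[Rule], now: datetime) -> bool:
    """Mark the resource if any rule flags it; otherwise clear any mark."""
    for rule in rules:
        termination = rule(resource, now)
        if termination is not None:
            resource.marked = True
            resource.termination_time = termination
            return True
    resource.marked = False
    resource.termination_time = None
    return False
```

Because a rule is just a callable in this sketch, adding a new rule means appending another function to the list passed to `apply_rules`.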
Why Run Janitor Monkey?
One of the great advantages of moving from a private datacenter into the cloud is that you have quick and easy access to nearly limitless new resources. To push out a new application release, you can quickly build up a new cluster. When you need more disk, just attach a new volume. To back up your data, just make a snapshot. To test out a new idea, just create new instances and get to work. The downside of this flexibility is that it is easy to lose track of cloud resources that are no longer needed. Perhaps you forgot to delete the cluster with the previous version of your application, or forgot to destroy the volume when you no longer needed the extra disk. Taking snapshots is great for backups, but do you really need those from 12 months ago? Forgetfulness is not the only cause of problems: network errors, for example, can cause your request to delete an unused volume to get lost.
Unused resources often cost cloud users money, and we needed a solution to this problem. Diligent engineers can delete unused resources manually, but we needed a way to detect and clean them up automatically. The solution is Janitor Monkey.
How Janitor Monkey Cleans
Janitor Monkey works in a process of "mark, notify, and delete". When Janitor Monkey marks a resource as a cleanup candidate, it schedules a time to delete the resource. The delete time is specified in the rule that marks the resource.
Every resource is associated with an owner email, which can be specified as a tag on the resource. You can also extend Janitor Monkey to obtain this information from your internal system. The simplest option is to use a default email address, e.g. your team's email list, for all resources.
You can configure how many days before the scheduled termination Janitor Monkey sends a notification to the resource owner. The default is 3, which means that the owner receives a notification 3 business days ahead of the termination date. During this period, the resource owner can decide whether the resource is OK to delete. If a resource needs to be retained longer, the owner can use a simple REST interface to flag the resource so that Janitor Monkey does not clean it up. The owner can later use another REST interface to remove the flag, and Janitor Monkey will then manage the resource again.
When Janitor Monkey sees a resource that is marked as a cleanup candidate and whose scheduled termination time has already passed, it deletes the resource. The resource owner can also delete the resource manually to release it earlier and save cost. When the status of a resource changes so that it is no longer a cleanup candidate, e.g. a detached EBS volume is attached to an instance, Janitor Monkey unmarks the resource and no termination happens.
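The "mark, notify, and delete" cycle can be summarized as a small decision function. This is a hedged sketch in Python, not Janitor Monkey's implementation: the `TrackedResource` type, the `owner` tag key, the action names, and the business-day helper (which ignores holidays) are assumptions made for illustration, and the REST opt-out is modeled only as a boolean flag.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict


def subtract_business_days(day: date, count: int) -> date:
    """Step back `count` weekdays (Mon-Fri); holidays are ignored in this sketch."""
    while count > 0:
        day -= timedelta(days=1)
        if day.weekday() < 5:
            count -= 1
    return day


def resolve_owner(tags: Dict[str, str], default_email: str) -> str:
    """Use the owner tag if present (the key "owner" is an assumption), else the default."""
    return tags.get("owner", default_email)


@dataclass
class TrackedResource:
    resource_id: str
    owner_email: str
    termination_date: date   # set by the rule that marked the resource
    opted_out: bool = False  # owner flagged the resource as an exception
    notified: bool = False


def next_action(resource: TrackedResource, still_candidate: bool,
                today: date, days_before: int = 3) -> str:
    """Decide what to do with an already marked resource on `today`."""
    if resource.opted_out:
        return "skip"      # flagged resources are left alone
    if not still_candidate:
        return "unmark"    # e.g. a detached volume was re-attached
    if today >= resource.termination_date:
        return "delete"    # scheduled termination date reached or passed
    notify_on = subtract_business_days(resource.termination_date, days_before)
    if today >= notify_on and not resource.notified:
        return "notify"    # warn the owner ahead of the termination
    return "wait"
```

For a resource scheduled for deletion on Monday, 15 July 2024, the default of 3 business days puts the notification on Wednesday, 10 July 2024, since the weekend is skipped.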
Refer to the Quick start guide to get started setting up and using Janitor Monkey.