Skip to content
This repository has been archived by the owner on Jun 10, 2023. It is now read-only.

a-detiste/cruft

Repository files navigation

cruft
=====

Preliminary notes:
------------------

 * This is a pre release version! Be careful, and take its results
   with a grain of salt.

 * cruft does not make any assumptions about the things you may have in places
   like /usr/local. You can teach cruft of such locations in several ways
   - read below.

 * If you have any suggestions on how to improve cruft, or if you make a
   /usr/lib/cruft/filters/* file for another package, please file them as a bug
   against cruft. A rough roadmap for cruft development can be found in TODO.

How it works, and what it does:
-------------------------------

cruft is a program to look over your system for anything that shouldn't be
there, but is; or for anything that should be there, but isn't.

Here is a brief description of how it achieves this. See also the image
"design.png" in this directory.

First, cruft produces the following four lists of files:

 * list of files ACTUALLY PRESENT on the system. This is produced by running
   'find' on each mounted filesystem in turn, except for:
    - filesystems of type such as nfs, proc, etc -- the full expression is in
      cruft_default_scan_fs() in common.sh
    - directories (and their contents) specified by --ignore
   This list is put into /var/spool/cruft/file_*

 * list of files which MUST be on the system.
   This for example includes the list of files dpkg knows about, diversions,
   lost+found directories and so on.
   This list (as are the following two) is produced by running so called
   'explain' scripts, located in /usr/lib/cruft/explain and /etc/cruft/explain.
   See below on how the explain scripts to run are selected.
   This list is taken from the explain scripts' standard output and put into
   /var/spool/cruft/expl_*

 * list of files which MAY be on the system.
   If a file is on this list, it means that it is OK for it to be present, but
   it is not a problem if the file is missing, either.
   This list is taken from the explain scripts' output to file descriptor 3,
   and put into /var/spool/cruft/mayx_*

 * list of files which MUST NOT be on the system.
   This list is taken from the explain scripts' output to file descriptor 4,
   and put into /var/spool/cruft/msnt_*

After producing the lists, the first one is compared to each of the three other
ones using a modified mergesort algorithm. This produces three new lists of
files:

 * files which are on the system, but are not mentioned in either of
   the "MAY" or "MUST" lists. It is placed into /var/spool/cruft/unex_*
 
 * files which are in the "MUST" list but are not in the "actually present"
   list. It is placed into /var/spool/cruft/miss_*
 
 * files which are in the "MUST NOT" list, and are in the "actually present"
   list. It is placed into /var/spool/cruft/frbn_*

Before reporting, each of the three lists is first filtered through the
patterns specified in the filter files for that type of list. The syntax and
semantics of the cruft patterns used in the filter files are documented in the
shellexp(3) manual page. See below for how the filter files are selected.


Selecting explain scripts to run
--------------------------------

Explain scripts are located in two directories:
 * /usr/lib/cruft/explain - where cruft installs its explain scripts
 * /etc/cruft/explain - meant for local explain scripts created by the
   administrator

Any file in /etc/cruft/explain overrides a file in /usr/lib/cruft/explain with
the same name. So if you want to override /usr/lib/cruft/explain/foo, just
create an executable file called /etc/cruft/explain/foo.

Also, any explain script will be run only if any of the following is true:
 - a package with the same name as the file is not completly purged - that is,
   at least one of the following exist for the package:
     * a non-empty file list,
     * a postrm file,
     * a prerm file.
 - the name of the file contains no lower-case characters. Thus, the way to
   have a script run regardless of any package being installed, just name it in
   UPPER_CASE


Selecting filter files
----------------------

For filtering the list of files, the
following filter files will be selected:

 - /etc/cruft/filters/*
 - installed "extrafiles" control files (in dpkg database) (*)
 - /usr/lib/cruft/filters-unex/*
 - /usr/lib/cruft/filters-broken_symlinks/*

However, if a filter file called "foo" (or "foo.extrafiles") exists in any of
the above location, only the first one in this list will be considered.
Moreover, the same CASE-type naming convention applies as with explain scripts
(that is, a filter file which has any lower case letters will only be run if a
package with the same name is present in the system).

(*): not yet implemented


How to make cruft not report some file
--------------------------------------

There are several ways to do this:

 * use "--ignore" which makes cruft ignore whole directory trees by not
   entering them at all, which speeds it up considerably. This is useful for
   large directory trees which local administrator is not interested in, like
   /home. An example:

                computer:~# cruft -m root --ignore /usr/local

 * create a filter file, which contains patterns of files which are not to be
   reported. This is a little more flexible, but requires cruft to traverse the
   directory tree, which takes some time. An example could be:

                computer:~# echo '/usr/local/**' > /etc/cruft/filters/USR_LOCAL
 
 * create an 'explanation' script which prints to FD3 the names of all files
   which are not to be reported. This is the most flexible way, since you can
   decide at runtime which files should be there, and which not, however
   usually requires cruft to traverse the directory tree twice. An example
   could be:

                computer:~# cat >/etc/cruft/explain/USR_LOCAL
                #!/bin/sh
                find /usr/local >&3
                ^D
                computer:~# chmod 755 /etc/cruft/explain/USR_LOCAL

Note that when traversing the filesystems when producing the above lists, cruft
does not follow symlinks. Also, for symlinks there is an additional mechanism,
which identifies and reports broken symlinks separately.

And that's about it.

--
Anthony Towns <ajt@debian.org>
Marcin Owsiany <porridge@debian.org>