Imported from Trac wiki; be wary of outdated information or markup mishaps.
On first look the Cabal code seems large and intimidating. This page is intended to give you a head start in understanding it.
All the Cabal modules live under the Distribution.* hierarchy.
The modules can be roughly divided into two groups:
The declarative modules: They are mostly concerned with data structures like package descriptions. These modules live under
Distribution.*. Much of the code in these modules consists of utility functions for handling the data types, plus functions for parsing and showing them.
The active modules: They are concerned with actually doing things like configuring, building and installing packages. These modules live under Distribution.Simple.*.
According to SLOCCount Cabal is currently about 8,500 lines of code. This breaks down as about 3,000 lines for the declarative part and about 5,500 for the active part. Most modules are less than a few hundred lines, though there are a couple of monsters nearer 1,000 lines.
Cabal is 100% Haskell. It uses hierarchical modules, a little bit of FFI in places and some CPP. It is otherwise Haskell 98. This is important since it has to work with Hugs, nhc98, jhc as well as ghc.
A further constraint is that because Cabal is used by both GHC and
Hugs to bootstrap the libraries, it can itself only depend on other
boot libraries, and only those shipped with all compilers and
available on all OSs. This means we cannot depend on various other
common packages like parsec or mtl, or GHC-specific packages like
template-haskell. We also avoid the haskell98 package and instead use
the equivalent hierarchical modules in the
base package. That
currently leaves array, base, bytestring, containers, directory,
filepath, old-locale, old-time, pretty, process and random, but the
fewer dependencies the better.
Distribution/Version.hs (source) (docs): exports the
Version type along with a parser and pretty printer. A version is something like "1.3.3". It also defines the VersionRange and
Dependency data types. Version ranges are like ">= 1.2 && < 2". A dependency is a package name and a version range, like "foo >= 1.2 && < 2".
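To make the relationship between versions, version ranges and dependencies concrete, here is a minimal sketch. The types and the functions showVersion, orLater, withinRange and satisfies are simplified stand-ins invented for this illustration; they are not the actual definitions in Distribution.Version.

```haskell
module Main where

import Data.List (intercalate)

-- Simplified stand-ins for illustration; not the real Distribution.Version types.
newtype Version = Version [Int]
  deriving (Eq, Ord, Show)

data VersionRange
  = AnyVersion
  | ThisVersion Version                  -- "== v"
  | LaterVersion Version                 -- "> v"
  | EarlierVersion Version               -- "< v"
  | Union VersionRange VersionRange      -- "||"
  | Intersect VersionRange VersionRange  -- "&&"
  deriving Show

-- A dependency is a package name plus a version range.
data Dependency = Dependency String VersionRange
  deriving Show

-- Render a version in the usual dotted form.
showVersion :: Version -> String
showVersion (Version ns) = intercalate "." (map show ns)

-- ">= v" expressed with the primitive constructors.
orLater :: Version -> VersionRange
orLater v = Union (ThisVersion v) (LaterVersion v)

-- Does a version fall inside a range? Versions compare lexicographically.
withinRange :: Version -> VersionRange -> Bool
withinRange _ AnyVersion         = True
withinRange v (ThisVersion w)    = v == w
withinRange v (LaterVersion w)   = v > w
withinRange v (EarlierVersion w) = v < w
withinRange v (Union a b)        = withinRange v a || withinRange v b
withinRange v (Intersect a b)    = withinRange v a && withinRange v b

-- The dependency "foo >= 1.2 && < 2" from the text.
exampleDep :: Dependency
exampleDep =
  Dependency "foo" (Intersect (orLater (Version [1, 2])) (EarlierVersion (Version [2])))

satisfies :: Version -> Dependency -> Bool
satisfies v (Dependency _ range) = withinRange v range

main :: IO ()
main = do
  putStrLn (showVersion (Version [1, 3, 3]))
  print (satisfies (Version [1, 3, 3]) exampleDep)
  print (satisfies (Version [2, 0]) exampleDep)
```

In this sketch version 1.3.3 satisfies the dependency while 2.0 does not, because the range excludes everything from 2 upwards.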
Distribution/Package.hs (source) (docs): defines a package identifier along with a parser and pretty printer for it.
PackageIdentifiers consist of a name and an exact version (exact version as opposed to a dependency like above that uses a version range).
Distribution/Verbosity.hs (source) (docs): a simple
Verbosity type with associated utilities. There are 4 standard verbosity levels, from
Silent (through Normal and Verbose) up to
Deafening. This is used for deciding what logging messages to print in the active parts.
Distribution/Compiler.hs (source) (docs): This has an enumeration of the various compilers that Cabal knows about. It also specifies the default compiler. Sadly you'll often see code that does case analysis on this compiler flavour enumeration like:
    case compilerFlavor comp of
      GHC -> GHC.getInstalledPackages verbosity packageDb progconf
      JHC -> JHC.getInstalledPackages verbosity packageDb progconf
Obviously it would be better to use a proper compiler abstraction
because that would keep all the compiler-specific code together.
Unfortunately we cannot make this change yet without breaking the
UserHooks api, which would break all custom
Setup.hs files, so for
the moment we just have to live with this deficiency. If you're
interested, see issue #57.
Distribution/System.hs (source) (docs):
Cabal often needs to do slightly different things on specific
platforms. You probably know about the
System.Info.os :: String
function; however, using that is very inconvenient because it is a string and
different Haskell implementations do not agree on using the same
strings for the same platforms! (In particular see the controversy
over "windows" vs "mingw32"). So to make it more consistent and easy
to use we have an OS enumeration.
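As a rough illustration of why an enumeration is easier to work with than the raw string, the following sketch maps the value of System.Info.os onto a small data type. The OS type and classifyOS function here are hypothetical and much smaller than whatever Distribution.System actually defines; the set of recognised strings is an assumption.

```haskell
module Main where

import Data.Char (toLower)
import qualified System.Info

-- Hypothetical, cut-down enumeration for illustration only.
data OS = Linux | Windows | OSX | OtherOS String
  deriving (Eq, Show)

-- Normalise the various strings that implementations report
-- into a single constructor per platform.
classifyOS :: String -> OS
classifyOS name = case map toLower name of
  "linux"   -> Linux
  "darwin"  -> OSX
  "mingw32" -> Windows
  "windows" -> Windows
  _         -> OtherOS name

main :: IO ()
main = print (classifyOS System.Info.os)
```

Code elsewhere can then pattern match on the constructor instead of comparing strings, and both spellings of the Windows platform end up in the same case branch.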
Distribution/License.hs (source) (docs):
The .cabal file allows you to specify a license file. Of course
you can use any license you like but people often pick common open
source licenses and it's useful if we can automatically recognise
that (eg so we can display it on the hackage web pages). So you can
also specify the license itself in the
.cabal file from a short
enumeration defined in this module. It includes
Distribution/ParseUtils.hs (source) (no docs - hidden module):
The .cabal file format is not trivial, especially with the
introduction of configurations and the section syntax that goes
with that. This module has a bunch of parsing functions that are
used by the
.cabal parser and a couple of others. It has the parsing
framework code and also little parsers for many of the formats we
get in various
.cabal file fields, like module names, comma
separated lists etc.
Distribution/PackageDescription.hs (source) (docs):
This defines the data structure for the
.cabal file format.
There are several parts to this structure. It has top level info and then library and
Executable sections, each of which has
BuildInfo data that's used to build the library or exe.
To further complicate things there is both a PackageDescription and a
GenericPackageDescription. This distinction relates to conditional sections (configurations).
When we initially read a
.cabal file we get a
GenericPackageDescription which has all the conditional sections.
Before actually building a package we have to decide on each
conditional. Once we've done that we get a PackageDescription. It
was done this way initially to avoid breaking too much stuff when
the feature was introduced. It could probably do with being
rationalised at some point to make it simpler.
Distribution/PackageDescription/Configuration.hs (source) (docs):
This is about the configurations
feature. It exports functions such as
flattenPackageDescription for converting
GenericPackageDescriptions down to
PackageDescriptions. It has code
for working with the tree of conditions and resolving or flattening them.
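The difference between resolving and flattening a tree of conditions can be sketched as follows. This is a toy model under stated assumptions: the Condition and CondTree types, and the functions eval, resolve and flatten, are invented for the example and are far simpler than the real code (no else branches, no dependency checking, unknown flags treated as off).

```haskell
module Main where

import Data.Maybe (fromMaybe)

type FlagName = String

-- A toy condition language over named boolean flags.
data Condition
  = Lit Bool
  | Flag FlagName
  | Not Condition
  | And Condition Condition
  | Or Condition Condition

-- Unconditional data plus sub-trees guarded by conditions.
data CondTree a = CondNode a [(Condition, CondTree a)]

type FlagAssignment = [(FlagName, Bool)]

-- Evaluate a condition; flags missing from the assignment count as False.
eval :: FlagAssignment -> Condition -> Bool
eval _   (Lit b)   = b
eval env (Flag f)  = fromMaybe False (lookup f env)
eval env (Not c)   = not (eval env c)
eval env (And a b) = eval env a && eval env b
eval env (Or a b)  = eval env a || eval env b

-- Resolving: keep only the branches whose condition holds.
resolve :: Monoid a => FlagAssignment -> CondTree a -> a
resolve env (CondNode x branches) =
  mconcat (x : [resolve env t | (c, t) <- branches, eval env c])

-- Flattening: ignore the conditions and combine every branch.
flatten :: Monoid a => CondTree a -> a
flatten (CondNode x branches) =
  mconcat (x : [flatten t | (_, t) <- branches])

-- Dependencies that vary with a made-up "small-base" flag.
exampleDeps :: CondTree [String]
exampleDeps =
  CondNode ["base"]
    [ (Flag "small-base", CondNode ["containers", "directory"] [])
    , (Not (Flag "small-base"), CondNode ["base < 3"] [])
    ]

main :: IO ()
main = do
  print (resolve [("small-base", True)] exampleDeps)
  print (flatten exampleDeps)
```

Resolving gives the data for one particular choice of flags, which is what a build needs. Flattening gives an over-approximation that is useful when a tool only wants an overview of everything a package might mention.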
Distribution/PackageDescription/Parse.hs (source) (docs):
This defines parsers and partial pretty printers for the .cabal
format. Some of the complexity in this module is due to the fact
that we have to be backwards compatible with old
.cabal files, so
there's code to translate into the newer structure.
Distribution/PackageDescription/Check.hs (source) (docs):
This has code for checking for various problems in packages.
There is one set of checks that just looks at a PackageDescription
in isolation and another set of checks that also looks at files in
the package. Some of the checks are basic sanity checks, others are
portability standards that we'd like to encourage. There is a
PackageCheck type that distinguishes the different kinds of check
so we can see which ones are appropriate to report in different
situations. This code gets used when configuring a package, when we
consider only basic problems. The higher standard is used when
preparing a source tarball and by hackage when uploading new
packages. The reason for this is that we want to hold packages that
are expected to be distributed to a higher standard than packages
that are only ever expected to be used on the author's own machine.
Distribution/InstalledPackageInfo.hs (source) (docs):
The .cabal file format is for describing a package that is not
yet installed. It has a lot of flexibility like conditionals and
dependency ranges. As such that format is not at all suitable for
describing a package that has already been built and installed. By
the time we get to that stage we have resolved all conditionals and
resolved dependency version constraints to exact versions of
dependent packages. So this module defines the InstalledPackageInfo
data structure that contains all the info we keep about an
installed package. There is a parser and pretty printer. The
textual format is rather simpler than the
.cabal format; there are
no sections for example. This is the format that ghc-pkg understands.
Distribution/Simple/Program.hs (source) (docs):
This provides an abstraction which deals with configuring and
running programs. A
Program is a static notion of a known program. A
ConfiguredProgram is a
Program that has been found on the current
machine and is ready to be run (possibly with some user-supplied
default args). Configuring a program involves finding its location
and if necessary finding its version. There is also a
ProgramConfiguration type which holds configured and not-yet
configured programs. It is the parameter to lots of actions
elsewhere in Cabal that need to look up and run programs. If we had
a Cabal monad, the
ProgramConfiguration would probably be a reader
or state component of it.
The module also defines all the known built-in
Programs and the
defaultProgramConfiguration which contains them all.
Distribution/Simple/Command.hs (source) (docs):
This is to do with command line handling. The Cabal command
line is organised into a number of named sub-commands. The
Command abstraction represents one of these
sub-commands, with a name, a description and a set of flags. Commands
can be associated with actions and run. It handles some common
stuff automatically, like the
--help and command line completion
flags. It is designed to allow other tools to make derived commands.
This feature is used heavily in cabal-install.
Distribution/Simple/InstallDirs.hs (source) (docs):
This manages everything to do with where files get installed
(though does not get involved with actually doing any
installation). It provides an
InstallDirs type which is a set of
directories for where to install things. It also handles the fact
that we use templates in these install dirs. For example most
install dirs are relative to some
$prefix and by changing the
prefix all other dirs still end up changed appropriately. So it provides a
PathTemplate type and functions for substituting values for the variables in these templates.
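The idea of install dirs as templates can be illustrated with a small substitution function. This is only a sketch: it works directly on strings, the substitute function and the variable names other than $prefix are assumptions made for the example, and the real PathTemplate type is structured differently.

```haskell
module Main where

import Data.Char (isAlpha)

-- Replace each $name in a template with its value from the environment.
-- Variables that are not in the environment are left untouched.
substitute :: [(String, String)] -> String -> String
substitute _ [] = []
substitute env ('$' : rest) =
  let (name, after) = span isAlpha rest
  in case lookup name env of
       Just value -> value ++ substitute env after
       Nothing    -> '$' : name ++ substitute env after
substitute env (c : cs) = c : substitute env cs

-- A hypothetical library directory template, relative to $prefix.
libdirTemplate :: String
libdirTemplate = "$prefix/lib/$pkgid"

main :: IO ()
main = do
  putStrLn (substitute [("prefix", "/usr/local"), ("pkgid", "foo-1.0")] libdirTemplate)
  -- Changing only the prefix moves the derived directory with it.
  putStrLn (substitute [("prefix", "/opt"), ("pkgid", "foo-1.0")] libdirTemplate)
```

Because the directory is stored as a template and not as a fixed path, the prefix can be chosen late (for example at install time) without having to recompute every other directory by hand.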
Distribution/Simple/Compiler.hs (source) (docs):
This should be a much more sophisticated abstraction than it
is. Currently it's just a bit of data about the compiler, like its
flavour and name and version. The reason it's just data is because
currently it has to be in
Show so it can be saved along with the
LocalBuildInfo. The only interesting bit of info it
contains is a mapping between language extensions and compiler
command line flags. This module also defines a
PackageDB type which
is used to refer to package databases. Most compilers only know
about a single global package collection but GHC has a global and
per-user one and it lets you create arbitrary other package
databases. We do not yet support this latter feature very much.
Distribution/Simple/PreProcess.hs (source) (docs):
This defines a
PreProcessor abstraction which represents a
pre-processor that can transform one kind of file into another.
There is also a
PPSuffixHandler which is a combination of a file
extension and a function for configuring a
PreProcessor. It defines
a bunch of known built-in preprocessors like cpp, cpphs, c2hs,
hsc2hs, happy, alex etc and lists them. On
top of this it provides a function for actually preprocessing some
sources given a bunch of known suffix handlers. This module is not
as good as it could be; it could really do with a rewrite to
address some of the problems we have with pre-processors.
Distribution/Simple/Utils.hs (source) (docs): A large and somewhat miscellaneous collection of utility functions used throughout the rest of the Cabal lib and in other tools that use the Cabal lib like cabal-install. It has a very simple set of logging actions. It has low level functions for running programs, a bunch of wrappers for various directory and file functions that do extra logging.
Distribution/Simple/LocalBuildInfo.hs (source) (docs):
Once a package has been configured we have resolved
conditionals and dependencies, configured the compiler and other
needed external programs. The
LocalBuildInfo is used to hold all
this information. It holds the install dirs, the compiler, the
exact package dependencies, the configured programs, the package
database to use and a bunch of miscellaneous configure flags. It
gets saved and reloaded from a file (
dist/setup-config). It gets
passed in to very many subsequent build actions.
BuildInfo with the results)
Then based on all this it saves the info in the LocalBuildInfo and
writes it out to a file. It also displays various details to the
user, the amount of information displayed depending on the verbosity level.
Distribution/Simple/Build.hs (source) (docs):
This is the entry point to actually building the modules in a
package. It doesn't actually do much itself; most of the work is
delegated to compiler-specific actions. It does do some
non-compiler specific bits like running pre-processors. There's
some stuff to do with generating makefiles which is a well hidden
feature that's used to build libraries inside the GHC build system
but which we'd like to kill off and replace with something better
(doing our own dependency analysis properly). Half the module is
dedicated to generating the
Paths_pkgname module. This is a
module that Cabal generates for the benefit of packages. It enables
them to find their version number and find any installed data files
at runtime. This code should probably be split off into another module.
Distribution/Simple/Haddock.hs (source) (docs):
This module deals with the haddock and hscolour commands. Sadly
this is a rather complicated module. It deals with two versions of
haddock (0.x and 2.x). It has to do pre-processing for haddock 0.x, which involves
unliting and using
-D__HADDOCK__ for any source code
that uses cpp. It has to call ghc-pkg to find the locations of
documentation for dependent packages, so it can create links. The
hscolour support allows generating html versions of the original
source, with coloured syntax highlighting.
Distribution/Simple/Register.hs (source) (docs): This module deals with registering and unregistering packages. There are a couple of ways it can do this: one is to do it directly, another is to generate a script that can be run later to do it. The idea is that the user is shielded from the details of what command to use for package registration for a particular compiler. In practice this aspect was not especially popular so we also provide a way to simply generate the package registration file which then must be manually passed to ghc-pkg. It is possible to generate registration information for where the package is to be installed, or alternatively to register the package inplace in the build tree. The latter is occasionally handy, and will become more important when we try to build multi-package systems. This module does not delegate anything to the per-compiler modules but just mixes everything together in this one module, which is rather unsatisfactory. The script generation and the unregister feature are not well used or tested.
Distribution/Simple/SrcDist.hs (source) (docs):
This handles the
sdist command. The module exports an sdist
action but also some of the phases that make it up so that other
tools can use just the bits they need. In particular the
preparation of the tree of files to go into the source tarball is
separated from actually building the source tarball.
The sdist action also does some distribution QA checks.
Distribution/Simple/GHC.hs (source) (docs):
This is a fairly large module. It contains most of the
GHC-specific code for configuring, building and installing
packages. It also exports a function for finding out what packages
are already installed. Configuring involves finding the ghc and
ghc-pkg programs, finding what language extensions this version of
ghc supports and returning a Compiler value. Getting the installed packages
involves calling the ghc-pkg program to find out what packages are
installed. Building is somewhat complex as there is quite a bit of
information to take into account. We have to build libs and
programs, possibly for profiling and shared libs. We have to
support building libraries that will be usable by GHCi and also the
-split-objs feature. We have to compile any C files using
ghc. Linking, especially for
split-objs, is remarkably complex,
partly because there tend to be thousands of .o files and this can
often be more than we can pass to the ld or ar programs in one go.
There is also some code for generating
Makefiles but the less said
about that the better. Installing for libs and exes involves
finding the right files and copying them to the right places. One
of the more tricky things about this module is remembering the
layout of files in the build directory (which is not explicitly
documented) and thus what search dirs are used for various kinds of files.
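The problem of having more .o files than fit on one command line is usually handled by splitting the argument list into batches and invoking the tool once per batch. The following is a generic sketch of that batching step, not the code this module uses; the chunkArgs function and the size limit are assumptions made for the example.

```haskell
module Main where

-- Split a list of arguments into batches whose combined length
-- (counting one separator per argument) stays within the limit.
-- An argument that is longer than the limit gets a batch to itself.
chunkArgs :: Int -> [String] -> [[String]]
chunkArgs limit = go [] 0
  where
    go acc _ [] = [reverse acc | not (null acc)]
    go acc len (x : xs)
      | not (null acc) && len + cost > limit = reverse acc : go [x] cost xs
      | otherwise                            = go (x : acc) (len + cost) xs
      where
        cost = length x + 1

main :: IO ()
main = do
  let objects = ["A.o", "B.o", "C.o", "D.o", "E.o"]
  -- A real caller would run ar or ld once per batch; here we just print them.
  mapM_ (putStrLn . unwords) (chunkArgs 10 objects)
```

Each batch would then be passed to a separate invocation of the archiver or linker, with the order of the object files preserved across batches.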
Distribution/Simple/UserHooks.hs (source) (docs):
This defines the API that
Setup.hs scripts can use to customise
the way the build works. This module just defines the UserHooks
type. The predefined sets of hooks that implement the Simple and
Configure build systems are defined in Distribution.Simple.
UserHooks is a big record of functions. There are 3 for each
action, a pre, post and the action itself. There are a few other
miscellaneous hooks, ones to extend the set of programs and
preprocessors and one to override the function used to read the
.cabal file. This hooks type is widely agreed to not be the right
solution. Partly this is because changes to it usually break custom
Setup.hs files and yet many internal code changes do require
changes to the hooks. For example we cannot pass any extra
parameters to most of the functions that implement the various
phases because it would involve changing the types of the
corresponding hook. At some point it will have to be replaced.
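The "record of functions with a pre, post and main hook per action" design can be sketched in miniature. This is a deliberately simplified pure model: the Hooks record below covers only a build action, takes just a list of arguments and returns log lines, whereas the real hooks are IO actions with much richer parameters. The names simpleHooks, customHooks and runBuild are made up for the example.

```haskell
module Main where

type Args = [String]
type Log  = [String]

-- A cut-down hooks record: three functions for a single action.
data Hooks = Hooks
  { preBuild  :: Args -> Log
  , buildHook :: Args -> Log
  , postBuild :: Args -> Log
  }

-- Default behaviour: only the action itself does anything.
simpleHooks :: Hooks
simpleHooks = Hooks
  { preBuild  = const []
  , buildHook = \args -> ["building with " ++ unwords args]
  , postBuild = const []
  }

-- A custom setup script overrides individual fields with record update syntax.
customHooks :: Hooks
customHooks = simpleHooks { postBuild = const ["build finished"] }

-- The driver always runs the three phases in order.
runBuild :: Hooks -> Args -> Log
runBuild hooks args = preBuild hooks args ++ buildHook hooks args ++ postBuild hooks args

main :: IO ()
main = mapM_ putStrLn (runBuild customHooks ["-v"])
```

The sketch also shows why the design is brittle: adding a parameter to buildHook changes the field's type, so every script that overrides that field stops compiling.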
Distribution/Simple/Setup.hs (source) (docs):
This is a big module, but not very complicated. The code is
very regular and repetitive. It defines the command line interface
for all the Cabal commands. For each command (like configure, build
etc) it defines a type that holds all the flags, the default set of
flags and a
Command that maps command line flags to and from the
corresponding flags type. All the flags types are instances of Monoid.
The types defined here get used in the front
end and especially in
cabal-install which has to do quite a bit of
manipulating sets of command line flags. This is actually
relatively nice, it works quite well. The main change it needs is
to unify it with the code for managing sets of fields that can be
read and written from files. This would allow us to save configure
flags in config files.
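One way to make sets of flags easy to manipulate is to give each flag a "not set" state and combine sets so that later settings override earlier ones. The sketch below assumes that design; the Flag and ConfigFlags types and their field names are illustrative and are not copied from this module.

```haskell
module Main where

-- A flag that may or may not have been given.
data Flag a = NoFlag | Flag a
  deriving (Eq, Show)

-- Combining two flags: the right-hand one wins if it is set.
instance Semigroup (Flag a) where
  a <> NoFlag = a
  _ <> b      = b

instance Monoid (Flag a) where
  mempty = NoFlag

-- A hypothetical flags record for a configure-like command.
data ConfigFlags = ConfigFlags
  { configPrefix    :: Flag FilePath
  , configVerbosity :: Flag Int
  } deriving (Eq, Show)

-- Combine records field by field.
instance Semigroup ConfigFlags where
  a <> b = ConfigFlags
    { configPrefix    = configPrefix a <> configPrefix b
    , configVerbosity = configVerbosity a <> configVerbosity b
    }

instance Monoid ConfigFlags where
  mempty = ConfigFlags mempty mempty

defaults :: ConfigFlags
defaults = ConfigFlags (Flag "/usr/local") (Flag 1)

fromCommandLine :: ConfigFlags
fromCommandLine = mempty { configPrefix = Flag "/opt" }

main :: IO ()
main = print (defaults <> fromCommandLine)
```

With this arrangement, defaults, saved settings and command line flags can each be parsed into the same record type and merged with a single mconcat, which is the kind of manipulation a front end has to do a lot of.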
Distribution/Simple/SetupWrapper.hs (source) (docs):
This is a wrapper around calling
Setup.hs scripts. It is
slightly more cunning than just calling
runghc Setup.hs args....
First of all, it checks the
.cabal file and sees if it specifies
any particular version of Cabal. It also checks the build-type. If the
build-type is anything other than
Custom and the version of
Cabal required is compatible then it does not run
Setup.hs at all,
instead it directly calls
defaultMainArgs. This is a good deal
quicker than compiling the
Setup.hs script. On the other hand, if
build-type is Custom or the version of Cabal specified is not
compatible with the version being used, then it tries to compile the
Setup.hs script with an appropriate version of the Cabal
library. This aspect is currently only implemented for ghc. Nothing
in the Cabal lib uses this module, it is provided for
Distribution/Simple.hs (source) (docs):
This is the command line front end to the
Simple build system.
The original idea was that there could be different build systems
that all presented the same compatible command line interfaces.
There is still a
Make system (see below) but in practice no
packages use it. This module exports the main functions that
Setup.hs scripts use. It re-exports the
UserHooks type, the
standard entry points like defaultMain and
the predefined sets of
UserHooks that custom
Setup.hs scripts can
extend to add their own behaviour.
Distribution/Make.hs (source) (docs):
This is an alternative build system that delegates everything
to the make program. All the commands just end up calling make with
appropriate arguments. The intention was to allow preexisting
packages that used makefiles to be wrapped into Cabal packages. In
practice essentially all such packages were converted over to the
Simple build system instead. Consequently this module is probably
not used much and it certainly only sees cursory maintenance and no
testing. Perhaps at some point we should stop pretending that it
Last edited by benmachine,