Fetching contributors…
Cannot retrieve contributors at this time
315 lines (243 sloc) 13.6 KB
.. role:: code(strong)
.. role:: file(literal)
.. role:: var(emphasis)
pppp — Poor's Python Pre-Processor
(As within Pymacs @VERSION@)
:Author: François Pinard
:Copyright: © Progiciels Bourbeau-Pinard inc., Montréal 2010
.. contents::
.. sectnum::
Why pppp?
The Python community has long resisted the idea of a pre-processor for
Python, and quite understandbly. The usual features of a pre-processor
for other languages are well served at run-time in Python, alleviating
the need.
The advent of Python 3 changes the picture somehow, as Python 3 does not
accept some Python 2 constructs, and vice-versa. In many situations,
one cannot (at least without stretched stunts) write a single source
file which can be compiled by both Python 2 and Python 3. The languages
are so similar that it is irritating to keep sources separate: this is
too much a burden for maintenance, in my opinion.
This :code:`pppp` tool was written to help porting Pymacs to Python 3.
I guess it could be useful for other Python programs or packages.
Report problems, documentation flaws, or suggestions to François Pinard:
There is no installation machinery for :code:`pppp`, and Pymacs does not
install it either. To install it, merely pick the :file:`pppp` script
from the top-level of the Pymacs distribution, either going through the
main Pymacs site:
or more directly from GitHub:
Copy that file somewhere on your search path, and make it executable.
That's all to it.
Pre-processor syntax
There are two mechanisms in :code:`pppp`. One does in-line substitutions,
the other takes care of conditional compilation. In-line substitution
occurs first, one line at a time, then conditional compilation occurs
on the result of the substitutions.
The two mechanisms both rely on a preset *context*, which is a set
of definitions. Each definition relates a name to a Python value.
The context is built under the control of options given to the
:code:`pppp` program. The same context is used both for substitution and
The behaviour of :code:`pppp` is currently unspecified when substitutions
of a single line produces multiple lines, for which the first or any
other is meant for conditional compilation. So don't do that!
Substitution is triggered on each occurrence of ``@``\ :var:`NAME`\
``@`` in the sources. In each case, if :var:`NAME` is not the name of
a context element, substitution just does not happen and the occurrence
is left undisturbed — silently, without diagnostic. If substitution
happens, :var:`NAME` and the surrounding ``@`` delimiters get replaced
by the string of the value associated with that name within the context.
Unless ``@``\ :var:`NAME`\ ``@`` sits within a Python string or a Python
comment, it is invalid Python syntax. So (contrarily to conditionals
described below), if the substitution notation is used, pre-processing
is likely mandatory.
Conditional compilation is merely driven by usual Python ``if``
statements. However, to be considered for conditional compilation, the
``ìf``, ``elif`` or ``else`` lines should have the following colon
(``:``) on the same physical line. Moreover, such lines should not use
Python comments.
The test expression associated with the ``if`` or any ``elif`` is
evaluated using the pre-processor context. If all the variables or
functions referred to by the expression are known in the context (and
presuming there is no syntax error or other run-time error while
evaluating the expression), the expression gets a dependable value.
The ``if`` or ``elif`` line is itself removed (well, in some cases, an
``elif`` might become an ``else``), and the following block of lines is
adjusted according to the expression value, likely shifted back or fully
removed. Similarily, ``else`` clauses may sometimes get simplified.
While it is possible to use very invalid Python syntax which, through
:code:`pppp` conditional compilation, is turned into a valid Python
program; users are much invited to use conditional compilation in such a
way that sources meant for :code:`pppp` are directly legal Python syntax.
This idea of writing conditionals as correct Python could be pushed
even further. If the user manages to compute and assign the context
variables at run-time in the Python program, conditional compilation
for some name could be replaced by run-time checks on that name merely
by *not* defining the name in the :code:`pppp` context. By doing so,
the test expressions involving that name may not be resolved by the
pre-processor, and the simplifications just does not occur.
Invoking :code:`pppp`
The :code:`pppp` command is called using the usual syntax for Unix / Linux
pppp [OPTION]... [ARGUMENT]...
The operating mode of the program, and the meaning of arguments, depend
on some options being used or not. Option ``-c`` forces clean out
mode, option ``-m`` forces merge mode. Otherwise, the program uses the
pre-processing mode.
The ``-h`` option is special. When given, a short help summary is
written on standard output, and then, the program exits immediately.
The ``-v`` option raises the verbosity level of the program, which then
produces output about created directories, written files or deleted
Setting the context
The context used for the pre-processing is initially empty. It does not
even have Python builtins. It is then filled through the use of ``-C``
or ``-D`` options, which may be repeated when there are many definitions
to introduce, or when there is a need to override previous settings.
Option ``-D`` :var:`name` adds :var:`name` into the context, associating
it with the Python value ``True``. Option ``-D`` :var:`name`\ ``=``\
:var:`expr` adds :var:`name` into the context, associating with the
value of the Python expression :var:`expr`. Beware of Python characters
which also have a meaning for the shell, proper quoting may be needed.
Here is, for example, how to define a string while calling :code:`pppp`::
pppp -D "version='@VERSION@'" ...
While evaluating :var:`expr`, there is no restriction to the context,
and builtins are indeed available. For exemple, to add the builtin
:code:`ord` into the context, merely use ``-D ord=ord``.
Option ``-C`` :var:`FILE` reads and evaluates :var:`FILE` as a Python
source. All variables computed at the outer level then become names
in the context, and the values of these variables become the values
associated with the names within the context. Any function defined
at the outer level of :var:`FILE` also gets available to :code:`pppp`
Beware of uncleaned variables in :var:`FILE`. For example, an ``import
sys`` creates a ``sys`` variable, which you normally clean with ``del
sys`` near the end of :var:`FILE`. If you do not do so, that variable
is available to the pre-processor. So if you have a line like::
if sys.version_info[:2] == (2, 7):
somewhere in your :code:`pppp` source, this might be evaluated as ``True``
or ``False`` at pre-processing time rather than at run-time, and this
might not be what you wanted.
Pre-processing files
Without options ``-c`` nor ``-m``, the arguments to the program indicate
which files are going to be pre-processed. If there is no argument
at all, this is a special case by which standard input is read,
pre-processed and then written to standard output.
Otherwise, only eligible files are retained for pre-processing. To be
eligible, the name of a file should end with ``.in``. If an argument
names a directory, that directory is recursively searched to find all
files with such an ``.in`` suffix. When a directory has a ``.in``
suffix (either given as an argument, or a subdirectory of a directory
argument), *all* the files it contains become eligible, including all
files of its subdirectories, recursively.
Now, that ``.in`` suffix may be changed to something else, using the
``-s`` :var:`NAME` suffix option. The period is part of the option
value. For example, ``-s '.in'`` is equivalent to not specifying it.
Each eligible file is pre-processed and written on another file, the
name of which is related to the name of the file being read. That
name is produced by removing the ``.in`` suffix, and more precisely,
by removing all ``.in`` suffixes, would they appear in directory names
or file names. Moreover, the optional ``-o`` :var:`OUTPUT_DIRECTORY`
option introduces a directory into which all resulting files are
collected: it effectively prepends :var:`OUTPUT_DIRECTORY/` to all
output names. If the suffix gets declared empty through ``-s ''``, then
*all* files are eligible, and because output names would be identical to
the input names, the ``-o`` option becomes mandatory.
You do not have to prepare intermediate directories to receive output
files. These are created on the fly, as needed.
Pre-processing uses substitutions and conditionals. Substitutions
automatically occur on all eligible files. Conditionals, however, only
apply for files which are known to be Python sources. If option ``-p``
is given, all files are considered to be Python sources. Otherwise, a
Python source has a file name which ends with ``.py`` or ````, or
appears to use a Python shebang line (the precise heuristic checks that
the first line of the file starts with ``!#`` and has ``ython`` written
somewhere in it).
The :code:`pppp` tool assumes, by default, that the Python sources
consistently use an indentation step, and that the indentation step is 4
columns. This can be changed with the ``-i`` :var:`INDENT` option. For
example, ``-i 8`` means that the indentation step is 8 columns.
By default, :code:`pppp` generates white lines in the pre-processed
results to replace any removed lines. The idea is to guarantee usable
line numbers in any later traceback, that is, numbers that refer to the
correct position within the original file, before it was pre-processed.
The file name would still differ by the ``.in`` suffix, of course, which
is a lesser worse. Whenever, as side-effect of substitutions, a single
input line yields many output lines, line synchronisation may be lost.
:code:`pppp` then inhibits the production of replacement white lines until
the line synchronisation is recovered. Option ``-n`` wholly inhibits
the production of any white line only meant for synchronisation.
Because tracebacks mention the file name after pre-processing, and not
the original source before pre-processing, users are likely to inspect
the resulting file, and after a while, start modifying it without
realizing their mistake: a resulting file might be overwritten by a later
invocation of :code:`pppp`, so loosing user's modifications. To play safe,
:code:`pppp` attempts to detect this: it copies the modification time from
the original into any resulting file it produces. Then, whenever a
resulting file is newer than the original source, :code:`pppp` raises an
error instead of deleting or rewriting it. Finally, as a way to force
Python recompilation in case the resulting file becomes different, it
removes an already compiled Python file, if any. If you want to force
deletions or rewritings regardless, use option ``-f``.
Cleaning out files
As a convenience for :file:`Makefile` writers, there is an option to
help at cleaning out derived files. With ``-c`` specified, any file that
would have been produced in pre-processing mode is removed instead.
Of course, to be useful, the command arguments naming files or
directories should be the same as those used for pre-processing.
Merging versions
As a way to help prepare a Python file for :code:`pppp` pre-processing,
the program offers a mode able to produce a pre-processable file out of
two versions of a given Python source. For example::
pppp -mD VERSION2 >
compares :file:`` with :file:`` and produces a merged
version on :file:``. Then, the command::
pppp -D VERSION2=False
would produce a file :file:`` which is equivalent to
:file:``, while the command::
pppp -D VERSION2
would produce a file :file:`` which is equivalent to
Whenever option ``-m`` is used, exactly one ``-D`` option provides the
segregating name used in added conditionals, and two arguments tell the
versions to be compared.
Beware that this mode was quickly written, and stays rather crude and
approximative. This is merely a way to get started. The real and
patient work comes afterwards, with a text editor, to clean and fixup
things, and bring the merged result closer to real Python syntax.
While editing the result, you might find some ``#endif (pppp)`` lines
generated here and there. These are protective measures, so the later
pre-processing does not clearly produce wrong results. These lines
usually indicate problematic areas, for which revision and careful
refactoring is especially needed.
+ The need of a very consistent indentation, as far as the indentation
step is considered, may be too stringent a condition. It would surely
be nicer if :code:`pppp` was able to adapt to the indentation in use.
+ This tool is easily fooled by unindented comments or multi-line
strings, as it is driven only by textual line indentation. It does not
follow whether a line is part of multi-line string or not.