poporg — Editing program comments or strings with Org mode
poporg is a small Emacs Lisp project, to help editing program string or comments using Org mode.
Literate programming with Org is often presented as mixing programs snippets within an Org document, with tools to extract pure programs out of the Org files. I would prefer it the other way around: mixing documentation snippets within program source code, with tools to extract pure Org documentation from the source files.
Emacs does not nicely handle multiple major modes in a single buffer. So far, many solutions have been implemented, all yielding some level of happiness, but none are perfect. The poporg approach avoids the problem by extracting the block comment or the string, from a buffer using a major programming mode, into a separate buffer to be edited in Org mode, but containing only that block comment or that string. Once the edit is completed, the modified comment or string gets re-integrated in the buffer containing the program, replacing the original contents.
To install poporg, move files
rebox.el at a place
where Emacs will find them. You might byte-compile the files if you
To use poporg, you need to pick some unused keybinding and add a few
lines to your
~/.emacs file. For one, I picked
C-c e o and added
(autoload 'poporg-dwim "poporg" nil t) (global-set-key "\C-ceo" 'poporg-dwim)
While editing a buffer containing a program, you may edit a comment
block or a string (often a doc-string) in Org mode by placing the
cursor within or nearby that comment or string, and calling
poporg-dwim using your selected keybinding. This pops another buffer
in Org Mode (hence the project name), containing the comment or
string. Once your edition is done, right in the popped up editing
poporg-dwim again to complete the edition, or merely kill
that buffer to abandon the edition.
More precisely, if the cursor is within a comment block or a string, this is what gets edited. If the cursor is not within a comment block or a string, a comment or string following the cursor gets selected instead. Otherwise, this is the comment or string which precedes the cursor which is selected for edition. Python mode receives a special treatment: if the cursor is within a string, it is assumed to be a sextuple-quoted string (that is, a triple double-quoted one), and this is what the tool selects.
While the comment or string is being copied in the editing buffer and
until the edition is completed, the original comment or string in the
original buffer is greyed out and protected against accidental
poporg-dwim again from within a greyed out
region recovers the editing buffer, it does not create a new
one. poporg asks for confirmation when the user attempts to kill an
editing buffer which has modifications. poporg also prevents the
original buffer from being killed while there are pending poporg
edits, the user should either complete or abandon all those edits
before killing the original buffer.
Functions added to
poporg-edit-hook are run once the poporg editing
buffer has been set up with its contents, with the common prefix
already removed, these functions may further modify the buffer
contents. Functions added to
poporg-edit-exit-hook are run
when poporg is about to reinstate the common prefix and move back the
editing buffer contents into the original programming buffer, these
functions may alter the contents as needed. (I did not need these
hooks, so let’s talk if you need them to be defined differenty!)
The following list is organized in decreasing order of approximative or subjective priority. You may also check if there are any issues on GitHub.
- If the cursor is located immediately before the opening delimiter of
a string before
poporg-dwim, some extraneous text to edit may be collected from before the cursor.
- The protective measures against losing a pending edition do not work when the user plainly exits Emacs.
- If characters are added immediately before or immediately after the region being edited, while the edition is pending, the characters after the region are preserved when the user completes its poporg edition, but the characters before the region are lost, while they should have been preserved.
- Even if a region being edited is intangible (meaning that the cursor cannot be pushed into it), it is not read-only and could have its contents deleted by editing from either end of the region. I suspect (without being sure) that this bug, and the preceding one, come from the fact overlays and text-properties do not behave the same.
- Ideally, the region being edited should be read-only but not
intangible, in that the cursor could be moved into it, from where a
poporg-dwimcommand would popup the associated edit buffer. This would be particularly useful when a user has many pending poporg edits.
- It has been suggested, and rightly so, that
C-c C-cwould be a nice keybinding for completing a poporg edit. The problem with this is that the edit buffer uses Org mode, where
C-c C-cis overcrowded with many functionnalities already; some care would be needed to make sure this command, used with another intent, does not unexpectedly close the edition.
- I do not much like that poporg depends on Rebox, which is a complex piece of code compared to the reminder of poporg. For comments, Rebox studies the file contents to guess comment delimiters and box styles, while for strings, poporg rather relies the syntax analysis previously made by the programming major mode, and expressed through faces. These approaches are too different, maybe both are wrong anyway. Moreover, the faces approach easily fools poporg when a comment or string does not use a uniform face. One advantage of using Rebox might be that it brings poporg closer to the capability of editing Org mode comments for a wider variety of boxing patterns.
- Once the string and comment is back into the programming buffer, we loose Org mode highlighting and presentation details, which is unfortunate. Multiple editing modes in Emacs are not able to highlight sections of a file according to the intended mode for each section: there is a single mode for the whole buffer in fact. Org mode, on the other hand, has the virtue of correctly highlighting the code snippets it contains, so surely, there is a way to do things as they should, that might be understood and recycled, I’m not sure.
- poporg should ideally be accompanied by a set of conventions and some tools for proper extraction of an Org file out of program sources. One is already provided for Python, it would be nice to also have some support for other languages.
poporg recycles a few ideas from two previous Emacs projects:
- my PO mode (source and documentation), for the idea of using
separate buffers for edition. For PO files, the need is quite
msgstrstrings use escaping which is easy to get wrong, so the idea of a separate buffer is a way to remove that concern from the user, PO mode unquotes before presenting the string to the user, and requotes it once the editing is completed. This was also solving the problem that
msgstrfields, and the reminder of the PO file, could be using different character sets.
- my Rebox tool (source and documentation), for finding the boundaries of block comments. Originally in Emacs Lisp, this tool has later rewritten in Python at the time I was developing Pymacs, with a few minor improvements while doing so. Le Wang, starting from my old Emacs Lisp, wrote a much enhanced version (source and video). For poporg, however, the needs are modest, so it includes the old Emacs Lisp version. See the very last section of the Rebox documentation for more historial context.
Major programming modes show comments and strings in full, and when these comments or strings are written using Org, with all parts of a link visible, it may be disruptive to those sensible to line width limits. The nice org-link-minor-mode tool takes good care of this, by hiding the usually invisible parts of an Org link in other modes.
Org comes with many tools for spreading Org over other major modes, among which the following minor modes which may be added to other major modes:
Org also has the following globally available commands:
The width of Org links often triggers the line length limit check of the pep8 program, which gets annoying when one uses flymake to get real-time feedback while writing. To silence these, I took advantage of this nice workaround, installing a pep8 replacement program, and managed so flymake uses that replacement program instead of pep8.
Extractor for Python
extradoc.py tool in this poporg project has the purpose of
extracting and processing the Org contents of a set of Python sources.
I used the
.py suffix just in case there could be other
tools for similarly handling sources in other languages. This
extradoc.py tool presumes that all Org text is made up by
concatenating the content of all sextuple-quoted strings (I mean
triple double-quoted strings). Moreover, prefixed strings are not
recognized. Here is its own documentation:
Extract documentation from one or more Python sources. Documentation lies in all unprefixed, sextuple-quoted strings. Usage: extradoc.py [OPTION]... [SOURCE]... Options: -c PREFIX Common prefix for all output files. -s Split output in directory PREFIX, obey #+FILE directives. -h Produce an HTML file, either PREFIX.html or PREFIX/NAME.html. -o Produce an Org file, either PREFIX.org or PREFIX/NAME.org. -p Produce a PDF file, either PREFIX.pdf or PREFIX/NAME.pdf. -t Produce a translation file, name will be PREFIX.pot. -v Be verbose and repeat all of Emacs output. -D SYM Define SYMbol as being True -D SYM=EXPR Define SYMbol with the value of EXPR. -I TAGS Only include sections having one of TAGS in their header. -X TAGS Exclude sections having one of TAGS in their header. If no SOURCE are given, the program reads and process standard input. Option -c is mandatory. If -h or -p are used and -o is not, file PREFIX.org should not pre-exist, as the program internally writes it and then deletes it. Some non-standard Org directives are recognized: #+FILE: NAME.org Switch output to NAME.org, also requires -s. #+IF EXPR Produce following lines only if EXPR is true, else skip. #+ELIF EXPR Expected meaning within an #+IF block. #+ELSE Expected meaning within an #+IF block. #+ENDIF Expected meaning to end an #+IF block. EXPRs above are Python expressions, eval context comes from -D options. TAGS represents a comma-separated list of Org tags. To get through, a line should go through the #+IF system, not be within an excluded section, and if any included sections is specified, then either be part of one of them or within the introduction (that is, before the first header).