DEUCE - Deuce is (not yet) Emacs under Clojure
Because it's there -- George Mallory
Note: Absolutely NOTHING works yet. Don't expect anything even remotely interesting before Q3 2012. (I plan to work full-time on this from August 2012.)
Also - there's a risk I'll give up long before reaching the current benchmark of JVM Emacsen: JEmacs.
The target version of Emacs is 24.1. It's assumed to live under
configure-emacs will download it if not.
For a minimal Emacs build:
    ./configure-emacs # downloads emacs-24.1.tar.bz if needed
    make -C emacs # takes a few minutes.
    ./emacs/src/temacs -Q --batch --eval "(print (emacs-version))" # ./smoke
temacs is "bare impure Emacs", the raw C version of Emacs, without any Emacs Lisp preloaded.
The Emacs Lisp lives under
-Q is equivalent to -q --no-site-file --no-splash; it basically suppresses all customizations.
--batch won't open the display editor.
The above should output:
    Loading loadup.el (source)...
    Using load-path (<path-to>/emacs/lisp)
    Loading emacs-lisp/byte-run...
    [... loads of Emacs Lisp loaded ...]
    "GNU Emacs 24.1 (x86_64-unknown-linux-gnu) of 2012-08-08 on hraberg-VPCZ21C5E"
The task at hand is to get rid of the bare impure Emacs, replace it with Clojure and the JVM, while keeping Emacs Lisp running.
Clojure will be a first class citizen alongside Emacs Lisp in this new world. There may be ways to get this build even smaller; I haven't looked into it yet.
./collect-tags and add something like this to your
    ;; To navigate between C and Emacs Lisp
    (require 'etags-select)
    (require 'etags-table)
    (global-set-key "\M-." 'etags-select-find-tag)
    (setq etags-table-search-up-depth 10)
There are probably better and cleaner ways of doing this, as TAGS includes TAGS-LISP (there's a hint at here).
Emacs Lisp to Clojure
There are several issues (like dynamic scoping), but nothing too hard or exciting. This layer will work similarly to shen.clj, that is, basically as a simple source to source transformer between Emacs Lisp and Clojure. Emacs Lisp bytecode, and anything related to evaluation of Emacs Lisp in bare Emacs, will simply be replaced with Clojure and not ported. Emacs Lisp is a more complex language than K Lambda (which underpins Shen) though, and K Lambda was also designed specifically for porting.
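To make the dynamic scoping issue concrete, here is a minimal sketch in plain Java of what a transformer's target runtime has to provide: a binding made by let is visible to every function called while the let body runs, and is undone afterwards. The class and method names (DynamicScope, defvar, let, lookup) are hypothetical and not part of Deuce; Clojure's own dynamic vars and binding offer a comparable mechanism.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration of Emacs Lisp style dynamic binding.
// Each symbol owns a stack of values; the top of the stack is the current value.
final class DynamicScope {
    // Note: ArrayDeque rejects null, so values in this sketch must be non-null.
    private final Map<String, Deque<Object>> bindings = new HashMap<>();

    // Like (defvar name value): sets a global value unless one already exists.
    void defvar(String name, Object value) {
        Deque<Object> stack = bindings.computeIfAbsent(name, k -> new ArrayDeque<>());
        if (stack.isEmpty()) {
            stack.push(value);
        }
    }

    // Variable reference: always sees the innermost active binding.
    Object lookup(String name) {
        Deque<Object> stack = bindings.get(name);
        if (stack == null || stack.isEmpty()) {
            throw new IllegalStateException("void-variable: " + name);
        }
        return stack.peek();
    }

    // Like (let ((name value)) body): the binding lives for the dynamic
    // extent of body and is removed even if body throws.
    <T> T let(String name, Object value, Supplier<T> body) {
        Deque<Object> stack = bindings.computeIfAbsent(name, k -> new ArrayDeque<>());
        stack.push(value);
        try {
            return body.get();
        } finally {
            stack.pop();
        }
    }

    public static void main(String[] args) {
        DynamicScope scope = new DynamicScope();
        scope.defvar("x", 1);
        // getX stands in for (defun get-x () x): it does not capture x lexically.
        Supplier<Object> getX = () -> scope.lookup("x");
        System.out.println(getX.get());                 // 1
        System.out.println(scope.let("x", 2, getX));    // 2
        System.out.println(getX.get());                 // 1
    }
}
```

Under lexical scoping the second call would still print 1, which is why a naive one-to-one mapping of Emacs Lisp let onto Clojure let would not be enough.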
The special forms of Emacs Lisp live in
C to Clojure
A large part of bare Emacs is pretty redundant in 2012, so this will be mapped to JVM languages, and exposed to Emacs Lisp as the same primitives it has come to know and love. A subset of the Emacs C code deals with buffers, regex and other editing specifics, which will be harder to just replace.
Bare impure Emacs is 203692 lines of C spread over 65 files and another 19912 lines of header files. There are around 1064 primitives (defsubr) in the minimal build.
The actual porting of the C will follow a three step tactic: first, avoid a function until it is actually needed; second, auto generate its signature; last, hand craft the actual implementation.
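The "avoid until needed" step can be sketched as a registry of primitives where calling anything that has not been ported fails loudly and is recorded, so the list of missing functions is driven by what the Emacs Lisp actually calls. This is only an illustration in plain Java; the Primitives class and its methods are hypothetical names, not Deuce's API, and signature generation is not covered.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical registry of Emacs primitives, ported on demand.
final class Primitives {
    private final Map<String, Function<Object[], Object>> implemented = new TreeMap<>();
    private final Set<String> missing = new TreeSet<>();

    // Register a hand crafted implementation under its Emacs Lisp name.
    void defsubr(String name, Function<Object[], Object> impl) {
        implemented.put(name, impl);
    }

    // Call a primitive; an unported one is recorded and then fails loudly.
    Object call(String name, Object... args) {
        Function<Object[], Object> impl = implemented.get(name);
        if (impl == null) {
            missing.add(name);
            throw new UnsupportedOperationException("primitive not ported yet: " + name);
        }
        return impl.apply(args);
    }

    // Names that were requested but never implemented, in sorted order.
    Set<String> missing() {
        return missing;
    }

    public static void main(String[] args) {
        Primitives primitives = new Primitives();
        primitives.defsubr("1+", a -> (Long) a[0] + 1);
        System.out.println(primitives.call("1+", 41L)); // 42
        try {
            primitives.call("buffer-string");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(primitives.missing()); // [buffer-string]
    }
}
```

The sorted set of missing names then doubles as a to-do list for the next porting round.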
etrace can be linked into Emacs when compiling with -finstrument-functions, to get a crazy amount of tracing "insight" into what Emacs is doing.
strace is another alternative for seeing what Emacs is doing system call-wise, for example to see just which files it opens.
I don't expect the visual editor to exist for quite a while. Initially, the editor itself will be implemented using Charva (or a similar Java curses/console library) to keep things as simple as possible, compatibility wise. Eventually Swing, SWT and browser based front ends can be added to the mix.
Larger than the technical challenges - which are mainly about scale - is the fact that there doesn't seem to be any large regression suite for Emacs one can use to ensure one is on the right track. There are some tests, and other editors, like Zile, have Emacs compatibility test suites, at least for editing, that could be reused:
- Emacs is using ert.el for regression testing. Stallman's comments.
- Zile's tests run against both Zile and Emacs.
- Org-mode testing/README
- Regression Testing XEmacs may or may not work with GNU Emacs.
lein uberjar will bundle together Deuce, Clojure and the Emacs Lisp from GNU Emacs into an executable jar (which currently cannot do anything).
The Road Map
My guess is that it will take roughly a month to get anything useful at all out of batch mode with basic Emacs Lisp cross compilation. An editor that can do anything but crash: another 2 months. An actually useful, somewhat compatible subset of Emacs: 6 months.
A potential first milestone is to get ert.el testing itself in batch mode.
Matching the performance and exact characteristics of the C code for buffers etc. isn't a goal.
100% compatibility is never expected, as the port needs to be driven by the need to support a useful, growing subset of Emacs Lisp packages and Emacs features.
Once Emacs works again, we can move it forward into the future, where it originally came from. I eventually envision something quite different from Emacs, but that may very well end up being another project altogether. But having a useful subset of Emacs running on the JVM may come in handy when one least expects it.
The real goal is to bring back some of the fun of extending one's programming environment, by removing some of the old constraints and opening up new possibilities - while respecting the Emacs tradition.
Appendix: Which Approach?
This is the fun part. A mix of all these ideas and more may play a part.
The common theme here is that we have something that works: Emacs, and prefer to move as quickly as possible to something else that should be a small, but working, subset: Deuce. There are many ways to get lost on this road.
The first foray out of the base camp will be a combination of porting the "ideal" Emacs Lisp runtime while clearing the way for the "real" Emacs boot. I expect this to take 1-2 weeks and fail, but to learn a bit about how Emacs does things, and slowly adjust to the altitude.
Start from the beginning. Get Emacs starting and take it from there.
- + The most obvious approach. Easy to see where one is at.
- + YAGNI can be used.
- - Every step forward may derail into multiple sub problems, each one requiring its own mindset and toolbox.
- - False sense of security when you load all initial Emacs Lisp without evaluating any of it.
Treat Emacs Lisp as its own problem and solve it first.
- + To some extent, this must be done: getting the basic semantics of Emacs Lisp ported on top of Clojure early on, as everything else hinges on it.
- - Lacks clear delineation - what is a minimal Emacs Lisp runtime?
- - Emacs Lisp is boring on its own.
Roll up the sleeves and just port the damn thing, function by function until it starts working.
- + It's simple to understand.
- - It's hard to do. Risk of missing the woods for the trees.
- - Impossible to know what to avoid, or to verify that the ported functions work together as intended.
One approach is to embed a JVM inside Emacs, and let it eat its way out.
- + Emacs stays working, A/B testing of individual functions can be made.
- + One could maybe implement the Emacs Lisp runtime on top of Clojure this way, and slide it into a C bare impure Emacs, to divide the problem into two distinct parts.
- - Requires writing messy and potentially buggy glue code in C, and may get stuck in the implementation details of bare impure Emacs.
- - Hard to know how far one has to go.
- - Two parallel Emacs Lisp runtimes to manage.
Compile Emacs as a library, actually call it from Java, and move more and more pieces over.
- + A/B mode possible, you can run Emacs in this mode for testing, if nothing else.
- - Still requires bootstrapping bare Emacs in Clojure, with the additional confusion of having to manage and share state with C.
Event recording from working Emacs, playback in Deuce, alternatively multiplexing a user session, comparing the two Emacs Lisp runtimes live.
- + Captures broad, real world, test cases.
- - Only works later in the game, once Emacs Lisp is somewhat working.
- - Requires infrastructure on the Emacs side, anything from C and Emacs Lisp meta programming to keyboard macros.
- + "Easy", assuming the converter exists, Emacs depends on very few libraries.
- + Great if the code is readable.
- - But it most likely won't be, and while turning C into readable Clojure is a fun problem, it's likely out of scope.
- - Basing a port on generated source feels wrong and leads to a lack of hackability of the new core.
Avoid porting functions at all costs.
- + Self evident, less code is always better code. "This is simple!"
- + Certain parts of Emacs are better backed by Java's encoding, regex and IO handling than by its own.
- + Some functions will never be missed.
- - Some attempts to sidestep old Emacs functions with impostors may backfire and lead into compatibility hell, and cutting corners may end up costly.
- - Sometimes it's easier to just do it.
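As an illustration of both the regex upside and the impostor risk in the list above, here is a small sketch that rewrites an Emacs regexp so java.util.regex can run it. In Emacs syntax \( \) \| \{ \} are operators while the bare characters are literals, which is the reverse of Java. The EmacsRegex class is a hypothetical name, and the sketch deliberately ignores bracket expressions, syntax classes such as \s- and word boundaries such as \<, which is exactly where an impostor like this would start to diverge.

```java
import java.util.regex.Pattern;

// Hypothetical, deliberately incomplete Emacs regexp to Java regex rewriter.
final class EmacsRegex {
    private static final String SWAPPED = "(){}|";

    static String toJava(String emacs) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < emacs.length(); i++) {
            char c = emacs.charAt(i);
            if (c == '\\' && i + 1 < emacs.length()) {
                char next = emacs.charAt(++i);
                if (SWAPPED.indexOf(next) >= 0) {
                    out.append(next);               // \( in Emacs is ( in Java
                } else {
                    out.append('\\').append(next);  // other escapes pass through unchanged
                }
            } else if (SWAPPED.indexOf(c) >= 0) {
                out.append('\\').append(c);         // bare ( in Emacs is a literal
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String emacs = "\\(foo\\|bar\\)+";          // the Emacs regexp \(foo\|bar\)+
        String java = toJava(emacs);
        System.out.println(java);                               // (foo|bar)+
        System.out.println(Pattern.matches(java, "foobar"));    // true
    }
}
```

Even this toy shows the trade-off: the happy path is a dozen lines, but every construct left out is a potential compatibility bug in some Emacs Lisp package.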
Get to this editor, rumored to be bundled with Emacs, as quickly as possible.
- + This is actually what we want, isn't it?
- + If the editor is visibly broken, one notices.
- - Directly side tracks into screen buffer management and other potential time sinks.
- - Risk of building the house without sound foundation.
- - Seeing how broken it will be early on could be bad for morale.
Scott McKay's "Dylan Environment Universal Code Editor"
I recently found out about this other Emacs clone, also named Deuce (2001):
Actually, I called it Deuce as a conscious homage to Zwei, then force-fit an acronym: Dylan Environment Universal Code Editor.
Scott then further talks about this:
A buffer is then dynamically composed of "section nodes" [..] it costs a little in performance, but in return it's much easier to build some cool features [like fonts, graphics].
Which nicely leads into richer, HTML based versions - the general approach in "deuce.clj" is expected to aim for support of text based Emacs buffers, but skip the more modern (but obsolete) graphics features of GNU Emacs and head straight for the browser.
I'll revisit the name if the Clojure port actually becomes usable and the name clash with Dylan Deuce leads to confusion.
EMACS: The Extensible, Customizable Display Editor Richard Stallman, 1981
Emacs and Common Lisp Tom Tromey, 2012
Emacs Lisp in Edwin Scheme Matthew Birkholz, 1993
Thoughts On Common And Scheme Lisp Based Emacs Xah Lee, 2008
The Craft of Text Editing: Emacs for the Modern World Craig A. Finseth, 1991, 1999
Down with Emacs Lisp: Dynamic Scope Analysis Matthias Neubauer and Michael Sperber, 2001
Portable Hemlock Another Common Lisp Emacs.
JEmacs - The Java/Scheme-based Emacs Per Bothner, 2000
Zile "Zile is lossy Emacs"
uemacs Linus' micro Emacs.
YMACS Ymacs is an Emacs-like editor that works in your browser.
Pymacs Emacs to Python interface
el4r EmacsRuby engine - EmacsLisp for Ruby