This repository has been archived by the owner on Sep 23, 2021. It is now read-only.

kana/vim-xire

Xire — Turn Vim script into a programmable programming language

Introduction

Xire is a tool that compiles a domain-specific language (DSL) into Vim script; Xire is also the name of the DSL itself. The DSL is based on Scheme, a dialect of Lisp. Xire turns Vim script into a programmable programming language.

Why use Xire? Vim script has the following problems:

  • Vim script is weak, because it was not originally designed as a programming language. It is just a series of Ex commands, and its syntax is too irregular for a programming language. It is nearly impossible to extend the language itself, because the ways to abstract away details are limited.

  • There are several interfaces to other languages. For example, Vim version 7.3 provides interfaces to Lua, Perl, Python, Ruby, Scheme and Tcl. But these foreign language interfaces are not portable, and they are not seamlessly integrated with Vim.

All of the above problems can be resolved with Xire.

Requirements

  • Vim version 7.3 or later

  • Gauche version 0.9 or later

Installation

To install Xire, run commands like the following:

git clone git://github.com/kana/vim-xire.git
cd ./vim-xire
./DIST gen
./configure
make install

The above commands install the following:

  • Command-line tools

  • Gauche package vim.xire which is required to run the command-line tools

External commands

xirec

A compiler that compiles a Xire script into Vim script. It reads a Xire script from standard input and writes the resulting Vim script to standard output.

Typical usage
$ xirec <foo.xire >foo.vim

Xire script

Abstract

Xire script is a Scheme-based DSL to write Vim script. Xire script consists of a series of Xire expressions. Xire expressions are evaluated sequentially from first to last.

Each Xire expression is a list of S expressions. The first value of the list is used to determine the type of the expression. Each Xire expression can be classified into one of the following categories:

Directives

Directives control the current compiling process. For example, a directive defines a new macro, while another directive alters Scheme environment for the current compiling process.

Macros

Macros correspond to pieces of Vim script. There are two types of macros: a statement macro is expanded into a series of Ex commands, and an expression macro is expanded into a Vim script expression. Macros can be expanded into other macros. Such macros are expanded recursively until the result contains no more macros.

Macros

Macros are defined with defstmt and defexpr. For example, a macro which is expanded into :if statement in Vim script can be defined as follows:

(defstmt if
  [(_ $cond:qexpr $then:qstmt)
   `(if ,$cond ,$then (begin))]
  [(_ $cond:expr $then:stmt $else:stmt)
   ($if $cond $then $else)])

A macro consists of one or more clauses. Macro definitions are mostly the same as match, but there are several differences that make transformation rules easier to define.

Interpretation

A use of a macro (the form) is matched against the pattern of each clause, from the first clause to the last.

If a matching clause is found, the form is expanded by the transformation rule of that clause, and the expanded form is returned.

Otherwise, it is an error.

For example:

  • (if 0 1) matches the first clause of the example if.

  • (if 0 1 2) matches the second clause of the example if.

  • (if 0 1 2 3) does not match any clause of the example if, so it is an error.
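The first-match-wins rule can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: forms and patterns are Python lists, a pattern "matches" when it has the same length as the form, and the names `expand` and `if_clauses` are hypothetical and not part of Xire.

```python
# Toy model of clause selection. Real Xire patterns are richer (slots, `_`,
# `...`); here a pattern matches when its length equals the form's length,
# which is enough to show the dispatch order.

def expand(clauses, form):
    """Return the result of the first clause whose pattern matches `form`."""
    for pattern, body in clauses:
        if len(pattern) == len(form):
            return body(form)
    raise ValueError(f"no clause matches {form!r}")


# Hypothetical stand-in for the two clauses of the example `if` macro.
if_clauses = [
    (["_", "$cond", "$then"],
     lambda f: ("first clause", f[1], f[2])),
    (["_", "$cond", "$then", "$else"],
     lambda f: ("second clause", f[1], f[2], f[3])),
]

print(expand(if_clauses, ["if", 0, 1]))
print(expand(if_clauses, ["if", 0, 1, 2]))
```

Calling `expand(if_clauses, ["if", 0, 1, 2, 3])` raises `ValueError` in this model, mirroring the error case above.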

Clauses

The details of a clause are as follows:

  • A clause is a list with two elements.

  • The first element of a clause is a pattern.

  • The second element of a clause is a body.

For example:

  • The example if consists of two clauses.

  • The first clause is [(_ $cond:qexpr $then:qstmt) …].

  • The second clause is [(_ $cond:expr $then:stmt $else:stmt) …].

  • (_ $cond:qexpr $then:qstmt) and (_ $cond:expr $then:stmt $else:stmt) are patterns.

Patterns

The details of a pattern are as follows:

  • A pattern is a list of S expressions.

    • Examples: (break), (return $value:expr), (if $cond:expr $then:stmt)

  • In a pattern, a symbol whose name starts with $ is called a slot.

  • Slots are symbols. The format of slot symbols is $<name>:<type>, where <name> is the name of a slot and <type> is the type of a resulting value.

    • Examples: $cond:expr, $then:stmt

The details of pattern-matching process are as follows:

  • Non-slot values in a pattern match the same objects in the sense of equal?.

    • Example: Pattern (break) matches only (break).

  • Slot values in a pattern are treated as pattern variables. They match arbitrary objects.

    • Example: Pattern (return $value:expr) matches (return 1), (return (list)), etc.

  • The symbol _ in a pattern is also treated as a pattern variable. It matches an arbitrary object, but the matched object cannot be referred to in the corresponding body. It can be used as a "don’t care" placeholder.

    • Example: Pattern (rem _) matches (rem 1), (rem (2 3)), etc.

  • As a special case, the last value in a pattern may be ... (the symbol spelled with three periods). The pattern just before ... is applied repeatedly until it consumes all elements in the given object.

    • Example: Pattern (echo $value:expr ...) matches (echo 1), (echo 1 2), etc.
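The matching rules above can be sketched in Python as follows. This is an illustrative model, not Xire's implementation: symbols are modeled as Python strings, lists as Python lists, and the function names `is_slot` and `match` are hypothetical. It also assumes that `...` may consume zero elements, which the text above does not state.

```python
def is_slot(x):
    """A slot is a symbol (modeled as a str) whose name starts with `$`."""
    return isinstance(x, str) and x.startswith("$")


def match(pattern, form, bindings=None):
    """Return a dict of slot bindings if `form` matches `pattern`, else None."""
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, str) and pattern == "_":
        return bindings                      # matches anything, binds nothing
    if is_slot(pattern):
        bindings[pattern] = form             # pattern variable
        return bindings
    if not isinstance(pattern, list):
        return bindings if pattern == form else None   # literal comparison
    if not isinstance(form, list):
        return None
    if len(pattern) >= 2 and pattern[-1] == "...":
        fixed, repeated = pattern[:-2], pattern[-2]
        if len(form) < len(fixed):
            return None
        rest = form[len(fixed):]
        for p, f in zip(fixed, form):
            if match(p, f, bindings) is None:
                return None
        if is_slot(repeated):
            bindings[repeated] = rest        # one list of all repeated items
            return bindings
        ok = all(match(repeated, f, {}) is not None for f in rest)
        return bindings if ok else None
    if len(pattern) != len(form):
        return None
    for p, f in zip(pattern, form):
        if match(p, f, bindings) is None:
            return None
    return bindings


print(match(["return", "$value:expr"], ["return", 1]))
print(match(["echo", "$value:expr", "..."], ["echo", 1, 2]))
```

Note that a successful match with no slots returns an empty dict, so callers should test the result with `is None` rather than by truthiness.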

Bodies and transformation

If a use of a macro (the form) matches the pattern of a clause, the form is transformed into a new form. This transformation is based on the body of the matching clause.

Bodies are arbitrary Scheme expressions, but they are evaluated in the following context:

  • Symbol form is bound to the form.

  • Symbol ctx is bound to the context in which the form is compiled. FIXME: Write about API to use context.

  • Symbols such as $<name>:<type> are bound to the values in the form which are matched to the corresponding slots in the pattern of the clause.

  • Symbols such as $<name> are bound to results of transform-value with corresponding values bound to $<name>:<type>.

transform-value is called with the following arguments:

  • A part of the form bound to a slot $<name>:<type>.

  • #t if the slot is followed by ..., or #f otherwise.

  • Slot type as a symbol.

  • Internal information to process compilation.

For example, suppose that the form (if c t e) is expanded with the example if:

  • $cond:expr is bound to c.

  • $then:stmt is bound to t.

  • $else:stmt is bound to e.

  • $cond is bound to result of (transform-value $cond:expr #f 'expr …​).

  • $then is bound to result of (transform-value $then:stmt #f 'stmt …​).

  • $else is bound to result of (transform-value $else:stmt #f 'stmt …​).

  • Then the body of the second clause is evaluated.
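The binding step can be modeled in Python as below. This is a sketch only: `transform_value` here is a stub that tags its input instead of compiling it, and `body_environment` is a hypothetical helper name, not part of Xire's API.

```python
def transform_value(value, manyp, slot_type, ctx):
    """Stub: the real function compiles `value`; this one only tags it."""
    if manyp:
        return [(slot_type, v) for v in value]
    return (slot_type, value)


def body_environment(form, matched, ellipsis_slots=(), ctx=None):
    """Build the names visible to a clause body from the matched slot values."""
    env = {"form": form, "ctx": ctx}
    for slot, value in matched.items():
        name, slot_type = slot.split(":")        # "$cond:expr" -> "$cond", "expr"
        env[slot] = value                        # raw part of the form
        env[name] = transform_value(             # transformed value
            value, slot in ellipsis_slots, slot_type, ctx)
    return env


env = body_environment(
    ["if", "c", "t", "e"],
    {"$cond:expr": "c", "$then:stmt": "t", "$else:stmt": "e"},
)
print(env["$cond:expr"], env["$cond"])
```

In this model each slot contributes two names: the full `$<name>:<type>` symbol holding the untouched part of the form, and the short `$<name>` symbol holding the transformed value.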

Intermediate format of Vim script

Xire script is ultimately compiled into the corresponding Vim script. However, there is a huge gap between Xire script and Vim script. Therefore Xire script is first compiled into an intermediate format, and the resulting intermediate code is then compiled into Vim script.

The intermediate format is called IForm. An IForm object represents a statement or an expression. The following functions create IForm objects:

FIXME: Add details about naming conventions. See vim.xire.compiler-pass1 at the moment.

Statements

($def gvar expr)

Represents a statement to define a global variable.

($gset gvar expr)

Represents a statement to modify a global variable.

($let lvars stmt)

Represents a statement to define local variables.

($lset lvar expr)

Represents a statement to modify a local variable.

($begin stmts)

Groups zero or more statements as a single statement.

($if expr then-stmt else-stmt)

Equivalent to :if.

($while expr stmt)

Equivalent to :while.

($for lvar expr stmt)

Roughly equivalent to :for.

($break)

Equivalent to :break.

($next)

Equivalent to :continue.

($ret expr)

Equivalent to :return.

($func func-name args stmt)

Roughly equivalent to :function.

($ex obj-or-iforms)

Represents an arbitrary statement.

Expressions

($const obj)

Represents a constant expression. obj can be a boolean, a number, a regular expression or a string.

($gref gvar)

Represents a global variable reference.

($lref lvar)

Represents a local variable reference.

($call subr-name arg-exprs)

Represents a compound expression using an operator such as +. See also vim.xire.compiler.pass-final.

($call func-expr arg-exprs)

Represents a function call.
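To make the idea of an intermediate format concrete, here is a toy Python model that represents a few IForm nodes as tuples and renders them as Vim script text. Everything in it is an assumption made for illustration: the tuple layout, the `emit_expr` and `emit_stmt` names, the handling of operators, and the exact text produced are not taken from Xire's compiler.

```python
INFIX = {"+", "-", "*", "<", "==", "&&", "||"}   # operators rendered infix


def emit_expr(iform):
    """Render an expression node as Vim script text (toy subset)."""
    tag = iform[0]
    if tag == "$const":
        return str(iform[1])                     # integers only in this toy
    if tag in ("$gref", "$lref"):
        return iform[1]                          # variable name as written
    if tag == "$call":
        _, callee, args = iform
        rendered = [emit_expr(a) for a in args]
        if isinstance(callee, str) and callee in INFIX:
            return "(" + f" {callee} ".join(rendered) + ")"
        return emit_expr(callee) + "(" + ", ".join(rendered) + ")"
    raise ValueError(f"not an expression: {iform!r}")


def emit_stmt(iform, indent=0):
    """Render a statement node as a list of Vim script lines (toy subset)."""
    pad = "  " * indent
    tag = iform[0]
    if tag == "$begin":
        return [line for s in iform[1] for line in emit_stmt(s, indent)]
    if tag == "$if":
        _, cond, then, else_ = iform
        return ([f"{pad}if {emit_expr(cond)}"] + emit_stmt(then, indent + 1)
                + [f"{pad}else"] + emit_stmt(else_, indent + 1)
                + [f"{pad}endif"])
    if tag == "$while":
        _, cond, body = iform
        return ([f"{pad}while {emit_expr(cond)}"] + emit_stmt(body, indent + 1)
                + [f"{pad}endwhile"])
    if tag == "$gset":
        return [f"{pad}let {iform[1]} = {emit_expr(iform[2])}"]
    if tag == "$ret":
        return [f"{pad}return {emit_expr(iform[1])}"]
    if tag == "$break":
        return [pad + "break"]
    if tag == "$next":
        return [pad + "continue"]
    raise ValueError(f"not a statement: {iform!r}")


loop = ("$while",
        ("$call", "<", [("$gref", "g:i"), ("$const", 3)]),
        ("$gset", "g:i", ("$call", "+", [("$gref", "g:i"), ("$const", 1)])))
print("\n".join(emit_stmt(loop)))
```

The point of the sketch is the two-step shape: macros produce tree-shaped nodes, and a separate pass turns the tree into Ex commands.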

Conventions of string values

The syntax of string literals differs between Scheme and Vim script. Therefore the following limitations apply to Scheme strings which are compiled into Vim script:

Available backslash-escape notations in Scheme strings

In Scheme strings, only the following backslash-escape notations may be used:

  • \\

  • \"

  • \f

  • \n

  • \r

  • \t

  • \uNNNN

  • \xNN

  • \<whitespace>*<newline><whitespace>*

All but the last notation are also available in Vim script. The last notation is not available in Vim script, but it is processed and simply discarded by the Scheme reader.

Therefore the external representation of a Scheme string and that of a Vim script string are the same if the above condition is met, and it is possible to write Scheme strings as if they were Vim script strings.

Other notations (\0 and \UNNNNNNNN) must not be used, because:

  • There is no equivalent for \UNNNNNNNN in Vim script.

  • Vim script cannot handle the NUL character as is. Though "\0" can be written in Vim script, such strings are essentially wrong, so it must not be used.
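As a rough illustration of the rule, the following Python sketch scans the raw text between the quotes of a string literal and reports any backslash escape outside the allowed list. The function name `disallowed_escapes` is hypothetical, and the sketch assumes that each N in \uNNNN and \xNN stands for one hexadecimal digit.

```python
import re

# One allowed escape, following the list above. The last alternative is the
# backslash-newline continuation with optional surrounding whitespace.
ALLOWED = re.compile(
    r'\\(?:[\\"fnrt]|u[0-9A-Fa-f]{4}|x[0-9A-Fa-f]{2}|[ \t]*\n[ \t]*)')


def disallowed_escapes(literal_body):
    """Return the escapes in the raw literal text that are not allowed."""
    bad = []
    i = 0
    while i < len(literal_body):
        if literal_body[i] != "\\":
            i += 1
            continue
        m = ALLOWED.match(literal_body, i)
        if m:
            i = m.end()                       # skip the whole allowed escape
        else:
            bad.append(literal_body[i:i + 2]) # backslash plus next character
            i += 2
    return bad


print(disallowed_escapes(r'a\nb\t\x41'))
print(disallowed_escapes(r'\0 and \U0001F600'))
```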

Unavailable backslash-escape notations in Vim script strings

The following backslash-escape notations in Vim script are not available in Xire script:

Label   Notations        Meaning
(o)     \., \.., \...    Arbitrary byte, in octal digits
(x)     \x.              Arbitrary byte, in single hex digit
(X)     \X., \X..        Equivalent to \x. and \x..
(U)     \U....           Equivalent to \u....
(b)     \b               Equivalent to \<BS>
(e)     \e               Equivalent to \<Esc>
(k)     \<Key>           Special key sequence

  • (o), (x) and (X): Use "\xNN" instead.

  • (b) and (e): Use "\xNN" instead.

  • (U): Incompatible with "\UNNNNNNNN" notation in Gauche strings and it is rarely used.

  • (k): Use (kbd "<Key> …​") form instead.

Normalization of variable names

While various characters such as $, ! and % can be used in variable names in Scheme, variable names in Vim script must match #/^[A-Za-z_][A-Za-z_0-9]*$/. Therefore it is generally an error to use such characters in symbols in IForm.

For convenience, however, several characters (more precisely, patterns) may be used in symbols in IForm.

Pattern    Replacement    Example Symbol    Replacement Result
#/\?$/     _p             eq?               eq_p
#/!$/      _x             set!              set_x
#/->/      _to_           vector->list      vector_to_list
#/[-%]/    _              read-char         read_char
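The replacement rules can be reproduced with a few regular expressions. The Python sketch below is an illustration, not Xire's code; the name `normalize` is hypothetical, and it assumes the rules are applied in the order listed, which matters because -> must be handled before the single - is.

```python
import re

# (pattern, replacement) pairs in the order they are listed above.
RULES = [
    (re.compile(r"\?$"), "_p"),
    (re.compile(r"!$"), "_x"),
    (re.compile(r"->"), "_to_"),
    (re.compile(r"[-%]"), "_"),
]


def normalize(symbol):
    """Apply every rule in order and return the resulting name."""
    for pattern, replacement in RULES:
        symbol = pattern.sub(replacement, symbol)
    return symbol


for name in ["eq?", "set!", "vector->list", "read-char"]:
    print(name, "=>", normalize(name))
```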

Directives

(defexpr name clause ...)

The defexpr directive defines a new expression macro.

name (arbitrary symbol)

Specifies the name of the new macro.

clause ([pat body ...])

Specifies a transformation process for the new macro. See also Bodies and Transformation.

(defstmt name clause ...)

The defstmt directive defines a new statement macro.

name (arbitrary symbol)

Specifies the name of the new macro.

clause ([pat body ...])

Specifies a transformation process for the new macro. See also Bodies and Transformation.

There are also the following shorthands for defstmt:

(defstmt <name>)

Roughly equivalent to the following:

(defstmt <name> <name>)

(defstmt <name> :!)

Roughly equivalent to the following:

(defstmt <name> <name>)
(defstmt <name> <name>!)

(defstmt <name> <ex-command-name>)

Roughly equivalent to the following:

(defstmt <name>
  [(_)
   ($ex '(<ex-command-name>))])  ; <ex-command-name> must be a symbol.

(scheme scheme-expr ...)

The scheme directive evaluates arbitrary scheme-exprs as if they were written as (begin scheme-expr ...).

Helper API

The following Scheme API is available to define advanced Xire macros:

(scheme-object->vim-script-notation x)

A function which converts a given Scheme object into the corresponding Vim script notation. See also IForm.

(transform-value form-or-forms manyp type upper-ctx)

A function which compiles given form-or-forms in Xire script into Vim script, according to other arguments:

form-or-forms

A form or a list of forms written in Xire script.

manyp

A boolean value which specifies the format of form-or-forms. If this value is #f, form-or-forms is treated as a single form, and this function returns the resulting Vim script as an IForm. Otherwise, form-or-forms is treated as a list of forms, and this function returns a list of the resulting IForms.

type

A symbol which specifies the type of form. If this value is:

expr

Given form is compiled as an expression.

form

Given form is not compiled; it is returned as is.

qexpr

Same as type form. This type expresses that the given form is expected to be an expression.

qstmt

Same as type form. This type expresses that the given form is expected to be a statement.

qsym

Like type form, but form must be a symbol.

stmt

Given form is compiled as a statement.

sym

The given form is compiled as an expression, but the form must be a symbol, and variable renaming is not applied to it. This type is used to express the name of a variable, the name of an entry in a dictionary, etc.

Otherwise

It is an error.

upper-ctx

An object which specifies the context of the original caller of form.
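The way manyp and type steer transform-value can be summarized with a small Python model. It is a sketch under stated assumptions: the `Symbol` class and the tagged tuples standing in for compiled results are hypothetical, and the real function returns IForm objects rather than tuples.

```python
class Symbol(str):
    """Stand-in for a Scheme symbol (a plain str would model a Scheme string)."""


def transform_value(form_or_forms, manyp, type_, upper_ctx):
    """Toy dispatch on the slot type; compilation is replaced by tagging."""
    def one(form):
        if type_ in ("form", "qexpr", "qstmt"):
            return form                           # returned as is
        if type_ in ("qsym", "sym"):
            if not isinstance(form, Symbol):
                raise TypeError(f"{type_} requires a symbol: {form!r}")
            return form if type_ == "qsym" else ("compiled-sym", form)
        if type_ == "expr":
            return ("compiled-expr", form)
        if type_ == "stmt":
            return ("compiled-stmt", form)
        raise ValueError(f"unknown slot type: {type_}")

    if manyp:
        return [one(f) for f in form_or_forms]    # a list of forms
    return one(form_or_forms)                     # a single form


print(transform_value([1, 2], True, "expr", None))
print(transform_value(["if", 0, 1], False, "qstmt", None))
```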

Built-in macros

See vim.xire.builtin at the moment. FIXME: Write about details of built-in macros.

Testing

In the source tree of Xire, all files matching the pattern t/*.t are test scripts. The simplest way to run tests is to run each test script directly. For example:

$ ./t/context.t
ok 1 - should be a class for a context
ok 2 - should copy a given context
ok 3 - should raise error if non-expression context is given
ok 4 - should raise error if non-statement context is given
...

Test scripts output their results in TAP version 12, so it is recommended to use prove or another TAP harness to run all test scripts easily.
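To show what a harness does with such output, here is a minimal Python sketch that counts passing and failing test lines. It is not a full TAP parser: the function name `summarize` is hypothetical, and plans, directives such as TODO or SKIP, and diagnostics are ignored.

```python
import re

# A test line starts with "ok" or "not ok" followed by a word boundary.
TEST_LINE = re.compile(r"^(not ok|ok)\b")


def summarize(tap_output):
    """Return (passed, failed) counts for the given TAP text."""
    passed = failed = 0
    for line in tap_output.splitlines():
        m = TEST_LINE.match(line)
        if not m:
            continue                      # plan, comment, or other output
        if m.group(1) == "ok":
            passed += 1
        else:
            failed += 1
    return passed, failed


sample = """\
ok 1 - should be a class for a context
ok 2 - should copy a given context
ok 3 - should raise error if non-expression context is given
ok 4 - should raise error if non-statement context is given
"""
print(summarize(sample))
```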

Roadmap

0.0.0

Finish infrastructure of the compiler.

0.1.0

Cover major features of Vim script to write applications without drawbacks. Minor features such as :| (which works the same as :p) will not be supported.

0.2.0

Add more useful syntax.

0.3.0

Add more documentation.

0.4.0

Add features for ease of debugging.

0.5.0

Tune up performance.

License

So-called MIT/X license.

Copyright © 2009-2011 Kana Natsuno <kana@whileimautomaton.n3t>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.