Getting Started

Richard Hundt edited this page Jul 1, 2013 · 8 revisions
This section aims to help you get up and running with Lupa.

Prerequisites

Lupa is currently only tested on Mac OS X and Linux, since that's what I have to hack on. Feel free to send feedback or patches if it doesn't build on your platform.

LuaJIT currently depends on Lua being available during compilation. This is due to be changed soon, but for now it means that you'll need to install Lua and its development package (if your distro separates these).

Installing

First grab a copy of the repo here:

git clone https://github.com/richardhundt/lupa.git

Then just run:

make

This will create an executable in:

./bin/lupa

NOTE: Lupa is still experimental, so it doesn't install to some global system path yet. It also means everything is likely to change.

Running

We'll assume that the Lupa executable is in your path to save typing, so I'll just say lupa instead of ./bin/lupa from now on.

To compile and run a Lupa script:

$ lupa <source>

To list the generated Lua code (useful for debugging):

$ lupa -l <source>

First Steps

Okay, so now that we've got that out of the way, it's time to dive in, so here are a few of the key things to be aware of:

Lupa vs Lua

Lupa translates to Lua source code.

This means that Lua semantics leak through. You can generally access any Lua builtin function and get at Lupa's guts. This is intentional. Lupa is a language for getting stuff done. Sometimes that means that having syntactic and semantic flexibility in the language is useful. It also means that you can call into Lua code, which is nice, since there are plenty of useful libraries out there for Lua.

Unlike Lua, though, Lupa still makes it easy to program in a reasonably safe way. It catches typos at compile time, raises exceptions when you try to access properties of objects which don't implement them, makes constraint checking declarative, and allows you to write clean and clear method contracts.

Basic Types

In Lupa, a type is generally thought of in terms of its structure and interface. In Lua terms, the type of something is its metatable.

Lupa provides metatables for all native types, including nil, boolean, number, function and thread (coroutine) types. The string type has its standard Lua metatable extended to make it consistent with other types.

All these native Lua types inherit functionality from the type Type, which is the root of the type hierarchy.

Since there is no static type checking in Lupa, all types are expected to define coerce and check methods. In fact, anything which implements these two methods can be used as a guard expression.

The metatable itself holds static members, and keeps a separate table for virtual members. This means we don't use the common Lua idiom of setting class.__index = class. Instead, something like the following describes virtual property lookup more accurately (in Lua, and simplified for brevity):

local super = ... or Any
local class = setmetatable({ }, Class)
class.__slots = setmetatable({ }, {
   __index = super.__slots
})
class.__index = function(obj, key)
   local val = class.__slots[key]
   if val ~= nil then
      return val
   end
   error("no such member "..tostring(key), 2)
end

We also see here that the default base of all classes is Any. In fact, Any is the root of the inheritance hierarchy for all (non-native) objects.
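
To make the lookup above easier to try out, here is a minimal, self-contained Lua sketch. Class and Any are empty stand-ins here, and new_class, Animal, Dog and fido are hypothetical names used only for illustration; the real runtime objects carry much more.

```lua
-- Stand-ins for Lupa's runtime objects (assumptions, not the real ones).
local Class = { }
local Any = { __slots = { } }

-- Hypothetical helper: builds a class table as described above.
local function new_class(super)
   super = super or Any
   local class = setmetatable({ }, Class)
   -- virtual members live in __slots, which inherits from the parent's slots
   class.__slots = setmetatable({ }, { __index = super.__slots })
   class.__index = function(obj, key)
      local val = class.__slots[key]
      if val ~= nil then
         return val
      end
      error("no such member "..tostring(key), 2)
   end
   return class
end

local Animal = new_class()
Animal.__slots.speak = function(self) return "..." end

local Dog = new_class(Animal)
Dog.__slots.fetch = function(self) return "fetching" end

local fido = setmetatable({ }, Dog)
print(fido:speak())  -- found through the __slots chain
print(fido:fetch())  -- defined on Dog itself

-- A misspelled member raises an error instead of silently yielding nil.
local ok, err = pcall(function() return fido.fecth end)
print(ok, err)
```

Static members would go directly on the class table (Dog.something), which is why they never show up on instances in this layout.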

The Any class essentially defines our meta-object protocol, and provides the following standard methods: is, can, does, check, coerce and toString.

All builtin types and user defined classes and instances support these methods. This means you can say:

var a = 42
if a is Number { // same as a.is(Number)
  print("a number")
}

Nil

The nil literal is the Lua nil with the metatable Nil.

Booleans

The literals true and false are Lua boolean primitives with the metatable: Boolean.

Numbers

Lupa uses Lua's (64 bit floating point) number type for literal numbers, but adds support for octal literals as well as LuaJIT's extensions for long and unsigned long integer and hex literals. The following are all valid:

var a = 42     // plain Number type
var b = 4.2    // ditto
var c = 0x2a   // ditto but hex
var d = 42L    // 64 bit long cdata
var e = 42UL   // 64 bit unsigned long cdata
var f = 0x2aUL // ditto but hex
var g = 0644   // octal literal

Strings

String literals come single or double quoted, and the two have different meanings. Double quoted strings are interpolated, whereas single quoted strings are not:

var answer = 42
print("${answer} petunias") // => 42 petunias
print('${answer} petunias') // => ${answer} petunias

In double quoted strings, any expression may be used between the ${ and } markers. If the value is not already a string, then Lupa will try to coerce the value to a string.
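
As a rough mental model, and only that, interpolation can be thought of as concatenation with tostring applied to each expression. The Lua below is a sketch under that assumption; the code Lupa actually generates may look different (use lupa -l to see it), and point is a made-up value for illustration.

```lua
local answer = 42

-- A value that is not a string but knows how to present itself as one.
local point = setmetatable({ x = 1, y = 2 }, {
   __tostring = function(p) return "(" .. p.x .. ", " .. p.y .. ")" end,
})

-- Assumed lowering of "${answer} petunias at ${point}"
local interpolated = tostring(answer) .. " petunias at " .. tostring(point)

-- '${answer} petunias' is left untouched
local literal = '${answer} petunias'

print(interpolated)
print(literal)
```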

For both single and double quoted string literals, there is also a long form delimited by ''' and """ respectively:

var answer = 42

var long_string = """
This is a very long
string which still
holds the ${answer}
"""

Lupa strings also support all the standard Lua string methods, but unlike Lua, you can call these methods on string literals directly without putting them in parentheses:

var mesg = "the answer is %d".format(42)

Strings support the match, gmatch and gsub methods with Lua's patterns, just like ordinary Lua strings.
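
For comparison, this is what the same calls look like in plain Lua, where a literal has to be wrapped in parentheses before a method can be called on it. The sample strings are just for illustration.

```lua
-- Lua needs parentheses around a literal before a method call.
local mesg = ("the answer is %d"):format(42)
print(mesg)

-- match and gsub take Lua patterns; gsub also returns a replacement count,
-- which is dropped here by assigning to a single variable.
local word = ("hello world"):match("%a+")
local swapped = ("hello world"):gsub("(%a+) (%a+)", "%2 %1")
print(word, swapped)
```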

Patterns

In addition to Lua's string matching patterns, Lupa also links against LPeg, and adds PEG pattern literals and rule declarations (for structural types).

Pattern literals are delimited by slashes:

var word_pattern = / [a-zA-Z]+ /
if word_pattern ~~ "friend" {
  print("enter")
}

Pattern body expressions are PEG (parsing expression grammar) sequences, with support for all the LPeg capture types (substitution captures, group captures, match-time captures, fold-captures, etc.).

Unlike other languages which provide regular expressions, Lupa patterns are integrated into the language. This means that they are not strings passed to an embedded regular expression compiler and engine (such as PCRE), but are composable with other elements of the language.

To illustrate this, here's an example:

var sp   = / " " | "\n" | "\t" | "\r" /
var word = / [a-zA-Z]+ /
var sent = / { word (sp word)+ "." } -> function(s) { '"'+s+'"' } /

The last line in the example above composes a pattern from two lexical variables which are themselves patterns, and uses a function literal to quote the captured sentence.

Grammars are built by declaring patterns as rules in the body of a nominal type. Here's the example macro expander from the LPeg website translated to Lupa:
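
If you know LPeg, the composition above corresponds roughly to the following Lua. This is a sketch and not the code Lupa generates: it assumes that { ... } is a simple capture and that -> with a function is LPeg's function capture, and it needs the lpeg module to be installed (it prints a note and does nothing otherwise).

```lua
-- LPeg is an external library, so load it defensively.
local ok, lpeg = pcall(require, "lpeg")
local result

if ok then
   local P, R, S, C = lpeg.P, lpeg.R, lpeg.S, lpeg.C

   local sp   = S(" \n\t\r")        -- one whitespace character
   local word = R("az", "AZ")^1     -- one or more letters

   -- capture "word (sp word)+ ." and pass the capture to a function
   local sent = C(word * (sp * word)^1 * P(".")) / function(s)
      return '"' .. s .. '"'
   end

   result = sent:match("hello brave world.")
   print(result)
else
   print("lpeg is not installed")
end
```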

object Macro {

    rule text {
        {~ <item>* ~}
    }
    rule item {
        <macro> | [^()] | '(' <item>* ')'
    }
    rule arg {
        ' '* {~ (!',' <item>)* ~}
    }
    rule args {
        '(' <arg> (',' <arg>)* ')'
    }
    rule macro {
        | ('apply' <args>) -> '%1(%2)'
        | ('add'   <args>) -> '%1 + %2'
        | ('mul'   <args>) -> '%1 * %2'
    }
}

var s = "add(mul(a,b),apply(f,x))"
print(Macro.text(s))

The Lupa grammar is self bootstrapped, so hopefully that can serve as a reference until I finish this document. ;)

Tables

Lupa also has a Table type, which is the metatable associated with table literals. Table literals share their syntax with Lua, except that semicolons aren't valid separators (so commas only):

var tab = {
  'foo',
  'bar',
  hop = 'jump',
  ['this'] = that,
}

We also have array literals:

var ary = [ "a", "b", "c" ]

which have Array as the metatable and which respond to push, pop, shift, unshift, len, sort, reverse and so on.
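
To give a feel for what "a metatable with methods" means for arrays, here is a small Lua sketch of a few of these methods. It is a hypothetical stand-in for Lupa's Array, written with the plain class.__index = class idiom for brevity rather than the __slots layout described earlier.

```lua
-- Hypothetical stand-in for Lupa's Array type.
local Array = { }
Array.__index = Array

function Array.new(items)
   return setmetatable(items or { }, Array)
end
function Array:push(val)      -- append at the end
   self[#self + 1] = val
   return self
end
function Array:pop()          -- remove and return the last element
   return table.remove(self)
end
function Array:shift()        -- remove and return the first element
   return table.remove(self, 1)
end
function Array:unshift(val)   -- insert at the front
   table.insert(self, 1, val)
   return self
end
function Array:len()
   return #self
end

local ary = Array.new({ "a", "b", "c" })
ary:push("d")
local first = ary:shift()
local last = ary:pop()
print(first, last, ary:len(), table.concat(ary, ","))
```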

In addition to types provided by Lua, Lupa also has the following:

Ranges

A range expression takes the form: "["<expr>";"<expr>(";"<expr>)?"]". For example:

[1;10;2] each => { print(_) }
// count down
for i in [10;0;-1] {
  print(i)
}

Ranges are useful not only for loops, but also checking that values fall within them when used as guards:

var a : [1;10] = 1
a = 42              // error: 42 is out of range
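
A range can serve as a guard because it implements check and coerce like any other type. The Lua sketch below shows one way that could work; the Range table, its constructor and field names are assumptions, and it treats both bounds as inclusive, which I have not verified against the runtime.

```lua
-- Hypothetical Range type; names and fields are assumptions.
local Range = { }
Range.__index = Range

function Range.new(min, max, step)
   return setmetatable({ min = min, max = max, step = step or 1 }, Range)
end

-- Iterate like [min;max;step]; a negative step counts down.
function Range:each(fn)
   for i = self.min, self.max, self.step do
      fn(i)
   end
end

-- Guard protocol: check tests membership, coerce returns the value or raises.
function Range:check(val)
   local lo, hi = math.min(self.min, self.max), math.max(self.min, self.max)
   return type(val) == "number" and val >= lo and val <= hi
end
function Range:coerce(val)
   if not self:check(val) then
      error(tostring(val) .. " is out of range", 2)
   end
   return val
end

local seen = { }
Range.new(1, 10, 2):each(function(i) seen[#seen + 1] = i end)
print(table.concat(seen, " "))

local guard = Range.new(1, 10)
local a = guard:coerce(1)                       -- var a : [1;10] = 1
local ok, err = pcall(guard.coerce, guard, 42)  -- a = 42
print(a, ok, err)
```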

Enum

Enum literals are declared as:

enum SockType {
  STREAM    = 1,
  DGRAM     = 2,
  RAW,       // default 3
  RDM,       // default 4
  SEQPACKET, // default 5
  DCCP,      // ...
  PACKET    = 10,
}
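
The default values follow the usual rule of "previous value plus one". Here is a Lua sketch of that numbering; make_enum is a hypothetical helper, and starting at 1 when the first member has no explicit value is my assumption.

```lua
-- Hypothetical helper: entries are { name } or { name, value },
-- listed in declaration order.
local function make_enum(entries)
   local enum, next_value = { }, 1
   for _, entry in ipairs(entries) do
      local name, value = entry[1], entry[2]
      if value == nil then
         value = next_value   -- no explicit value: continue counting
      end
      enum[name] = value
      next_value = value + 1
   end
   return enum
end

local SockType = make_enum {
   { "STREAM", 1 },
   { "DGRAM", 2 },
   { "RAW" },
   { "RDM" },
   { "SEQPACKET" },
   { "DCCP" },
   { "PACKET", 10 },
}
print(SockType.RAW, SockType.DCCP, SockType.PACKET)
```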

Scoping

Lupa has two kinds of scopes. The first is simple lexical scoping, which is seen in function and class bodies, and control structures.

The second kind of scope is the environment scope, which is modeled after Lua 5.2's _ENV idea: symbols which are not declared in a compilation unit are looked up in a special __env table, which delegates to Lua's _G global table.

At the top level of a script, class, object, trait and function declarations are bound to __env, while variable declarations remain lexical.

Inside class, object and trait bodies, only function declarations bind to __env. Method and property declarations bind to self (the class or object).

Inside function bodies, function declarations are lexical and are not hoisted to the top of the scope, meaning they are only visible after they are declared.

Variables declared with var are always lexical. To declare a variable bound to the environment, use our:

var answer = 42  // ordinary lexical
our DEBUG = true // bound to environment
// bound to the environment (__env.envfunc)
function envfunc() {
    // ...
}
// a lexical function
var localfunc = function() {
    // ...
}
class MyClass {
    // this function is only visible in this block
    function hidden() {
        // ...
    }
    method munge() {
        hidden()
    }
}
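
In Lua terms, the two kinds of scope can be pictured like this. It is a simplified sketch: __env is modeled as an explicit table so that it runs on Lua 5.1 and LuaJIT as well, and the assumption that writes stay in __env rather than landing in _G is mine.

```lua
-- __env falls back to Lua's globals for anything it does not define itself.
local __env = setmetatable({ }, { __index = _G })

local answer = 42      -- var answer = 42   (lexical)
__env.DEBUG = true     -- our DEBUG = true  (environment)

-- function envfunc() { ... } at the top level binds to the environment
__env.envfunc = function()
   return __env.DEBUG
end

-- var localfunc = function() { ... } stays lexical
local localfunc = function()
   return answer
end

print(__env.envfunc(), localfunc())
print(__env.print == print)   -- undeclared names resolve through _G
print(rawget(_G, "DEBUG"))    -- in this sketch nothing is written into _G
```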

Nested function declarations are also lexical; however, they differ from function literals in that inside a function declaration the function itself is always visible, so it can be called recursively:

function outer() {

    // inner function is lexical
    function inner() {
        // inner itself is visible here
    }

    // not quite the same thing
    var inner = function() {
        // inner itself is not visible here
    }
}
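
Lua draws the same distinction between local function and a local assigned a function value, so the difference can be demonstrated there. This is plain Lua behaviour, shown as an analogy; in Lupa the unresolved name would go through the environment lookup (or be reported at compile time) rather than silently being nil.

```lua
local function outer()
   -- like "function inner() { ... }": the name is in scope in its own body
   local function inner(n)
      if n <= 1 then return 1 end
      return n * inner(n - 1)
   end

   -- like "var other = function() { ... }": the local is not yet in scope
   -- inside the body, so `other` there is a global lookup (nil here)
   local other = function()
      return other
   end

   return inner(5), other()
end

print(outer())
```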

Variables

Lexical variables are introduced with the var keyword, followed by a comma separated list of identifiers, and an optional = followed by a list of expressions.

var a, b         // declare only
var c, d = 1, 2  // declare and assign

Variables can also be introduced using the our keyword, which, as mentioned earlier, binds to the environment table:

function life_etc() {
    print("the answer is ${answer}")
}
our answer = 42
life_etc()

Guards

Various declarations may also include guard expressions:

var s : String = "first"

Future updates to guarded variables within a given scope cause the guard's coerce method to be called with the value as argument to allow the guard to coerce the value or raise an exception.

The above statement (loosely) translates to the following Lua snippet:

local s = String:coerce("first")

Classes and traits, as well as built-in types Number, String, Boolean, Array, Table and Function can be used as guards.

Custom guards can also be created using a guard declaration:

guard Size(sample : Number) {
    if !(sample > 0) {
        throw TypeError.new("${sample} does not pass Size constraint")
    }
    return sample
}
var size : Size = 4.2
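
To make the guard protocol concrete, here is a Lua sketch of two guards and of how guarded assignments go through coerce. The String and Size tables are simplified stand-ins, and the rule that String converts numbers is purely an assumption for the example.

```lua
-- Hypothetical guard objects; Lupa's real guards may behave differently.
local String = { }
function String:check(val)
   return type(val) == "string"
end
function String:coerce(val)
   if self:check(val) then return val end
   -- assumption: numbers are converted, everything else is rejected
   if type(val) == "number" then return tostring(val) end
   error("cannot coerce " .. type(val) .. " to String", 2)
end

local Size = { }
function Size:check(val)
   return type(val) == "number" and val > 0
end
function Size:coerce(val)
   if not self:check(val) then
      error(tostring(val) .. " does not pass Size constraint", 2)
   end
   return val
end

-- var s : String = "first"
local s = String:coerce("first")
-- s = 7   (every later assignment goes through coerce again)
s = String:coerce(7)

-- var size : Size = 4.2
local size = Size:coerce(4.2)
-- size = -1 would raise
local ok, err = pcall(Size.coerce, Size, -1)
print(s, size, ok, err)
```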

Assignment

Assignments can be simple binding expressions:

everything.answer = 42

... or compound:

a += 1

Operators

Unary

Lupa defines the following unary operators:

  • # - len
  • - - unm
  • ! - logical not
  • ~ - bnot
  • ... - unpack
  • ? - maybe (for guard expressions)

The # and - operators have the same meaning as in Lua. The ! simply translates to Lua's not. The ... as prefix operator uses unpack to destructure a table based value. Prefix ~ is bitwise not. Prefixing a type or guard name with ? returns a new nillable guard which composes Nil and the type guard (so the assertion allows nil or the type) as a union.
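
Two of these are worth a quick illustration in Lua. The sketch below shows unpack, and one possible way a ? guard could wrap another guard so that nil passes. Number here is a simplified stand-in and maybe is a hypothetical helper name, not part of Lupa's runtime as far as this page states.

```lua
local unpack = table.unpack or unpack  -- Lua 5.2+ or Lua 5.1/LuaJIT

-- ...tab roughly corresponds to unpack(tab)
print(unpack({ 1, 2, 3 }))

-- Simplified stand-in for the Number guard.
local Number = {
   check  = function(self, val) return type(val) == "number" end,
   coerce = function(self, val)
      if type(val) ~= "number" then
         error("expected a number, got " .. type(val), 2)
      end
      return val
   end,
}

-- Hypothetical equivalent of the ? prefix: nil passes, anything else
-- is delegated to the wrapped guard.
local function maybe(guard)
   return {
      check = function(self, val)
         return val == nil or guard:check(val)
      end,
      coerce = function(self, val)
         if val == nil then return nil end
         return guard:coerce(val)
      end,
   }
end

local MaybeNumber = maybe(Number)   -- ?Number
print(MaybeNumber:coerce(nil), MaybeNumber:coerce(42))
print(pcall(MaybeNumber.coerce, MaybeNumber, "x"))
```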

Infix

Lupa is an object oriented language, so the . operator always means method call. This holds even if parentheses are omitted. The dot operator therefore always translates to a : call in Lua, with parentheses automatically inserted. However sometimes it is useful to access a member directly without calling it (especially when interfacing with Lua libraries). This can be achieved using the :: operator which translates to a . in Lua and never does automatic parentheses insertion.

Here are some examples with their Lua translations:

var x = pt.x               // Lua: local x = pt:x()
var x = pt::x              // Lua: local x = pt.x
var x = math::floor(pt.x)  // Lua: local x = math.floor(pt:x())
pt::x = 42                 // Lua: pt.x = 42
pt.y = 69                  // Lua: pt:__set_y(69)

Lua's built-in arithmetic and logical operators are supported, but using C-like syntax. So instead of or we write || and so on.

Additionally, Lupa supports bitwise operators (also C-like), which can be overloaded, as well as a match operator, written ~~, and its negation, written !~.

The list of binary operators, from highest to lowest precedence:

  • ** - pow
  • *,/,% - mul, div, mod
  • +,-,~ - add, sub, concat
  • <<,>>,>>>,^,|,& - lshift, rshift, arshift, bxor, bor, band
  • ==,!=,<=,>=,<,>,~~,!~ - eq, ne, le, ge, lt, gt, match, not(match)
  • && - logical and
  • || - logical or
  • can,is,as,does - type related

Overloading works the same as in Lua, in that static methods on the meta object are used. For Lua's built-in operators these have the same names and semantics as in Lua. Extended operators, such as the bitwise operators, can be overloaded in the same way, and are named accordingly. For example, to overload <<, define a static method __lshift.

class Point {
   static method __add(that : Point) {
      return Point.new(.x + that.x, .y + that.y)
   }
}
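
For reference, this is the underlying Lua mechanism. The first part is standard Lua; the lshift function at the end is only a hypothetical illustration of how an extended operator could be dispatched through the metatable, since Lua 5.1 and LuaJIT have no << syntax of their own.

```lua
local Point = { }
Point.__index = Point

function Point.new(x, y)
   return setmetatable({ x = x, y = y }, Point)
end

-- Lua consults __add on the metatable when an operand of + is a Point.
Point.__add = function(a, b)
   return Point.new(a.x + b.x, a.y + b.y)
end

local p = Point.new(1, 2) + Point.new(3, 4)
print(p.x, p.y)

-- Hypothetical dispatch for an extended operator such as <<.
local function lshift(a, b)
   local mt = getmetatable(a)
   local handler = mt and mt.__lshift
   if handler then
      return handler(a, b)
   end
   error("no __lshift defined for this value", 2)
end

Point.__lshift = function(a, n)
   return Point.new(a.x * 2 ^ n, a.y * 2 ^ n)
end

local q = lshift(Point.new(1, 2), 3)
print(q.x, q.y)
```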

Additionally, any method may be called using infix notation if called with one parameter:

var mesg = "Hello %s" format "World!"

Which is just another way of saying:

var mesg = "Hello %s".format("World!")

This syntax applies to all method calls, so the following are equivalent:

var d = Dog.new("Fido")
var d = Dog new "Fido"
10.times((i) => { print(i) })
10 times (i) => { print(i) }
10 times => print(_) // same as above

Post-Circumfix

Additionally, post-circumfix operators are allowed in certain contexts. Array and Table subscripts are actually defined as _[] and _[]= methods. These can be used to implement your own collection:

class NumberArray {
    has data = [ ]
    method _[](index) {
        .data[index]
    }
    method _[]=(index, value : Number) {
        .data[index] = value
    }
}
var nums = NumberArray.new
nums[1] = 42

NOTE

Property assignment is a bit special, in that foo.bar = 42 translates to foo.__set_bar(42). There is no __get_bar because you can call methods without parentheses, so foo.bar is the same as foo.bar(). Members declared with has set up such accessors for you.

Identifiers

Identifiers in Lupa come in two flavours. The first kind is the familiar one seen in most languages (currently ? and ! are also supported, in the first and last positions respectively). The following pattern describes these:

name = / (%alpha | "_" | "$" | "?") (%alnum | "_" | "$")* "!"? /

The other kind of identifier consists only of punctuation, as described earlier under Operators. These are used in method declarations:

class Point {
    has x : Number = 0
    has y : Number = 0
    method +(b : Point) : Point {
        Point.new(.x + b.x, .y + b.y)
    }
}

Modules

Modules are simply Lupa source files. There are no additional namespace constructs within the language to declare modules or packages.

Symbols are not exported by default. To export symbols, the export statement can be used. It has the form export <name> [, <name>]*. Symbols can be imported using the import statement, which takes the form import <name> [, <name>]* from <dotted_path>.

For example:

/*--- file: ./my/shapes.lu ---*/
export Point, Point3D

class Point {
    has x = 0
    has y = 0
    method move(x, y) {
        self.x = x
        self.y = y
    }
}
class Point3D from Point {
    has z = 0
    method move(x, y, z) {
        super.move(x, y)
        self.z = z
    }
}

/*--- file: test.lu ---*/
import Point, Point3D from my.shapes

var p = Point3D.new
p.move(1, 2, 3)

It is an error to attempt to export a symbol which is never declared, or is declared but evaluates to nil.
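
The export rule can be pictured with a small Lua sketch. Everything here (the loaded registry and the export and import functions) is hypothetical and only models the behaviour described above; it is not how Lupa's loader is implemented.

```lua
-- Hypothetical module registry standing in for Lupa's loader.
local loaded = { }

-- Collects the named symbols from a module's scope, refusing nil values.
local function export(path, scope, names)
   local exports = { }
   for _, name in ipairs(names) do
      if scope[name] == nil then
         error("cannot export '" .. name .. "' from " .. path, 2)
      end
      exports[name] = scope[name]
   end
   loaded[path] = exports
end

-- Returns the requested symbols in order, so they can be bound to locals.
local function import(path, names)
   local exports = assert(loaded[path], "no such module: " .. path)
   local values = { }
   for i, name in ipairs(names) do
      values[i] = exports[name]
   end
   return (table.unpack or unpack)(values, 1, #names)
end

-- export Point, Point3D
export("my.shapes", { Point = { }, Point3D = { } }, { "Point", "Point3D" })

-- import Point, Point3D from my.shapes
local Point, Point3D = import("my.shapes", { "Point", "Point3D" })
print(type(Point), type(Point3D))

-- exporting something that was never declared is an error
local ok, err = pcall(export, "my.broken", { }, { "Missing" })
print(ok, err)
```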