Skip to content

LanguageIntroduction

JT Olds edited this page Mar 15, 2015 · 1 revision

How to use Pants

This introduction is under development. Beware incomplete thoughts and poor writing.

About Pants

At its lowest level, Pants is simply Lambda Calculus in continuation-passing-style with some syntax sugar translated to C.

A common feeling within the software engineering community is that language expressiveness can come at the cost of performance, and Pants is setting out to disprove that hypothesis. Pants is an attempt to create a powerful scripting-like language with full support for fancy custom control structures and DSLs without any performance compromises or any messy scope bloat.

Of course, in its current very alpha state, there is an incredibly unfortunate amount of performance compromise. However, a primary design goal is to create a language that is easy to analyze statically, and therefore provide future deep optimizations.

Pants syntax

The basic grammar (which will be explained in more detail) is as follows:

<program>      := <expression>*

<expression>   := <assignment>
                | <application>

<assignment>   := <term> (`=` | `:=`) <expression>

<appliciation> := <term>+
<term>         := <header>* <value> <trailer>*
<header>       := <left-open-call-indicator>
<trailer>      := <right-open-call-indicator>
                | <index>
                | <field>
                | <closed-call>

<index>        := `[` <expression>+ `]`
<field>        := `.` <identifier>

<closed-call>  := `(` [ <out-argument>+ `;` ] <out-argument>* `)`


<out-argument> := <application>
                | <identifier> `:` <application>
                | `:(` <expression>+ `)`
                | `::(` <expression>+ `)`

<value>        := `(` <expression>+ `)`
                | <function>
                | <bytestring>
                | <charstring>
                | <identifier>
                | <integer>
                | <float>
                | <dictionary>
                | <array>

<function>     := `{` [ `|` [ <in-argument>+ `;` ] <in-argument>+ `|` ]
                      <expression>+ `}`

<in-argument>  := <identifier>
                | <identifier> `:` <application>
                | `:(` <identifier> `)`
                | `::(` <identifier> `)`

<array>        := `[` <application>* `]`

<dictionary>   := `{` (<application> `:` <application>)* `}`

<left-open-call-indicator>  := `@` | `.`
<right-open-call-indicator> := `.`

Notably, an identifier is any set of unicode characters, excluding the following set:

' ', '\n', '\r', '\t', ';', ',', '(', ')', '[', ']', '{', '}', '|', "'", '"', '.', ':', '@', and '#'

Further, an identifier can't start with a digit, and cannot be simply "=". This means that the only keyword in Pants is "=".

If I haven't gotten sick of trying to figure out Boost::Spirit::Qi (or C++ parser combinators in general) by the time you read this, you can find the full grammer in a strange C++ syntax in parser.cpp.

Update: I did get sick of it, here's the new parser.

Okay, Pants syntax in English

Expressions

Pants is simply a series of expressions. An expression can be either an assignment or an application. Expressions are separated by newlines or semicolons. An assignment is simply a variable name, a specific assignment operator (there's two, more on that in a minute), and a subexpression.

An application is where things get interesting.

Terms and values

An application, in the context of Pants, is a space-separated list of one or more terms. A term is a piece of syntax you're probably used to. All of the following examples are terms:

  1
  variable
  object.field
  array[index]
  "hello world"
  myfunction(1, 2)
  {"key1": "value1"}
  {|arg1, arg2| print(arg1, arg2)}
  (some sub expression)

Notice that a term can be a value or something more complicated. Pants has a simple list of value types, but those value types include user-created objects, anonymous functions, strings, etc. Note that a dictionary and an anonymous function are similar in syntax. The following are dictionaries:

  {}
  {"key1": "value1"}
  {1: 2, 3: "4"}

while the following are anonymous functions:

  {"value"}
  {print("value")}
  {|x| print(x)}
  {|x,y| print(y)}

Applications and function calls

If an application is made up of simply one term, the value of the application is just that term.

However, if an application is made up of two or more terms, then the application turns into an "open" function call.

There's two types of function calls in Pants - open calls, and closed calls. Closed calls are in the traditional syntax you're used to. Closed calls have a function value followed by a parenthetical list of arguments. Closed function calls are parsed as a single term, and examples of closed function calls are:

  factorial(10)
  ackermann(4,3)
  print("hello")
  socket.write(data, len)

An open call, in contrast, has no parenthetical argument list. In an application with two or more terms, in the absence of a call indicator (more on call indicators in a second), the first term is assumed to be a function, and the remaining terms are assumed to be arguments. As a result, the following is a list of examples of open calls:

  factorial 10
  ackermann 4 3
  print "hello"
  socket.write data len

This list of examples is semantically identical to the previous list of closed call examples.

An interesting caveat is that

  do_stuff()

and

  do_stuff

are not semantically identical. The first evaluates to the return value of the do_stuff function call, whereas the second evaluates to the do_stuff function itself. Since the second do_stuff example is not a two-or-more term application, the value of the application is the single term it consists of.

This open-call behavior gives you the ability to write powerful and expressive domain specific languages and language extensions. Recall how easy it is to define an anonymous function. Pants implements the traditional if keyword as a regular library function that takes two (optionally 3) arguments. The following two if function calls are identical:

  if (some_predicate(val1, val2)) {
    print "true!"
  }

  if((some_predicate(val1, val2)), {print "true!"})

Call indicators and left arguments

So far, we've only discussed functions that take arguments on the right side of the function call. In fact, most languages only support arguments on the right-hand-side of the function identifier. Pants, however, supports left arguments as well.

Recall that in the absence of a call indicator, a two-or-more term application assumes that the first term is the function to call. However, an application allows for one of the terms to have a call indicator specified. If a call indicator is specified, then the term with the call indicator is the function to call. A call indicator is either a trailing period, a preceding period, or a preceding at-symbol.

Suppose we wanted to support something like Ruby's unless trailing keyword. We want to use it in the following example:

  { print "we did it!" } @unless we_didnt

In this example, we are going to run the anonymous function that outputs "we did it!" as long as the variable we_didnt is false. Here, the at-sign is a call-indicator, saying that in this 3-term application, unless is the function we want to run. We're going to provide the anonymous function as the first left-argument and the value of we_didnt as the first right-argument.

So, we need unless to be defined as a function that takes one left and one right argument. Turns out, we can do that as follows:

  unless = {|code_thunk; predicate|
   if (not predicate) {
     code_thunk()
   }
  }

Here, the semicolon in the argument definition list separates left-arguments and right-arguments.

Observant readers will note that we can rewrite unless a little more simply as:

  unless = {|code_thunk; predicate| if (not predicate) code_thunk}

Advanced argument passing

Pants supports a whole slew of argument passing styles, such as arbitrary argument lists (both passing and receiving), default values in function definitions, keyword arguments, and a number of other things.

TODO

Semantics

To explain the semantics of Pants, we'll start off with a simple program. Here, we'll implement the standard while loop, then add features as we need them, such as break and continue.

Call-by-value

Let's take a first stab at it.

  while = {|test, body|
    if test() {
      body()
      while(test, body)
    }
  }

  myval = true
  while {myval} {
    print "hi"
    myval := false
  }

The definition of while here is recursive, and works because Pants supports tail-recursion. In other words, while will loop because the while is calling itself, and due to tail-recursion, the stack doesn't grow and waste memory.

You'll notice that in this example, test is an anonymous function that returns myval. What happens if we wrote while like this?:

  # Broken implementation of while
  while = {|test, body|
    if test {
      body()
      while(test, body)
    }
  }

  myval = true
  while (myval) {
    print "hi"
    myval := false
  }

Here, we've simply changed the definition of while to not call test like a function, and we've changed the usage of while to take myval surrounded by parenthesis instead of brackets.

So what's wrong? We set myval to true, and then call while with it, so while gets true for test. We never actually re-evaluate myval at any point, so while will run forever.

In the correct definition, we're passing an anonymous function as the first argument to while that returns whatever the current value of myval is, so at each loop iteration of while, myval gets re-evaluated when we call test inside of while.

Definition vs. mutation and variable hiding

So, let's look at our usage of while again:

  myval = true
  while {myval} {
    print "hi"
    myval := false
  }

You may be wondering what's up with the two different kinds of assignment operators being used with myval. You also may recall that I briefly mentioned before that there are two forms of assignment in Pants.

The first style of assignment is definition. When a variable gets defined, it is introduced into the scope as a new variable, regardless of any previous variables that have been defined. To specify a definition, the programmer uses a single equals "=". If we only used a single equals to assign values to myval, we would get not one, but two distinct, different variables named myval. But we don't want to make a new myval every loop iteration; we want to mutate the previously defined myval.

The second style of assignment is mutation and it is specified by a colon-equals, like ":=". When a variable is mutated, it must already be defined and in scope, and the value of that variable changes to whatever new value it is getting mutated to.

Operators

You may have noticed that there are a great deal of possible identifiers, with only one reserved identifier or keyword ("="). This means that the following are all completely valid identifiers, and thus, variables:

  +, -, /, *, ^, !, ==, <, <=, >, >=, !=

Because Pants supports left-arguments, all of the operators you are used to are defined simply as functions and require call-indicators for infix notation. E.g.:

  1 .+ 2
  4 .- 2
  1 .== 3
  "hey" .!= "there"

All of the provided operators are defined to either accept one left and one right argument, two left arguments, or two right arguments, such that infix, prefix, and postfix notation are all supported.

With this in mind, the following codeblock is equivalent:

  +(1, 2)
  4 2 @-
  == 1 3
  "hey" !=. "there"

Full example: each

We've now covered enough semantics to implement something slightly more useful, such as something like Ruby's each method. Let's assume an iterable object has a size method and supports index operations.

  each = {|iterable; func|
    i = 0
    while {i .< iterables.size()} {
      func(iterables[i])
      i := i .+ 1
    }
  }

We can now use each as follows:

  [1, 2, 3] .each {|item|
    print item
  }

Returns

Okay, let's say we want to write an include? function that checks an iterable for some item, and returns true if it exists, false otherwise. We want to use it like:

  [1, 2, 3] .include? 2 # should be true
  [1, 2, 3] .include? 4 # should be false

Let's give it a shot.

  include? = {|iterable; item|
    iterable .each {|possible_item|
      if (possible_item .== item) {
        return true
      }
    }
    return false
  }

Hold up cowboy, return out of what function? There's three anonymous functions there.

Because of this problem, Pants does not define a return anywhere. However, Pants defines a new variable called cont (which stands for continuation, more on those later) at the beginning of every function scope. You can save this value in the function scope it makes sense to have and use it later.

So, let's try again:

  include? = {|iterable; item|
    return = cont
    iterable .each {|possible_item|
      if (possible_item .== item) {
        return true
      }
    }
    return false
  }

Now this include? function works, because return returns out of the right function.

You may be thinking, this cont thing is magical! You're right.

Continuations ===

Continuations are, as we just saw, essentially save-able wrappers of return statements.

Notably, if you have saved a continuation, you can call it at any point in time afterwards, and as many times as you want. Check out this mind-bending example:

  saved_cont = null
  myfunction = {
    return = cont
    saved_cont := return
    return 1
  }

  val = myfunction()
  print val
  if (val .== 1) {
    saved_cont 2
  }

The output of this code example is:

  1
  2

What is going on here? When myfunction is called, it saves its continuation, and then returns 1. We store the 1 in val, print val, and then check the value of val. If it's a 1, we call the continuation. So far so good.

What happens when we call the saved continuation is that we fall out of the original call to myfunction with a return value of 2.

So, in this alternate universe we just found ourselves in, where myfunction returned 2, we store the 2 in val, print val, and check if val is 1. It's not, so we're done.

You can read more about continuations at Wikipedia's continuation entry.

Dynamic Variables

You may be aware of the distinction between dynamic and lexical scoping. If not, suffice it to say that every mainstream programming language uses lexical scope. Let's write a motivating example:

  x = 1
  f1 = { print x }
  f2 = {
    x = 2
    f1()
  }

In lexical scoping, when a variable is encountered, its value is determined based on where it was last defined, lexically, in the source file. In the above example, with lexical scoping, f2 will output 1.

In dynamic scoping on the other hand, when a variable is encountered, its value is determined based on the call stack. In the above example, with dynamic scoping, f2 will output 2.

There are pros and cons to both types of scoping, but it's clear that over decades of programming language research, lexical scoping is far better, clearer to understand, and easier to reason about. If you're interested in learning more about various types of language scoping rules, check out Wikipedia's scope entry.

Pants is lexically scoped.

However, Pants also implements dynamic scoping; it just doesn't give you direct access to it. The only way to access dynamically scoped variables is through lexically-scoped proxy objects that you construct with the DynamicVar object constructor. And in case you're wondering, we'll talk more about objects later.

Whenever you call the DynamicVar object constructor, a completely brand new dynamic variable is created for you that has not been used in your program before. Here's an example usage of DynamicVar which we'll walk through in a second:

  my_dv = DynamicVar()
  my_dv.call "hello, dynamic scope" {
    print my_dv.get()
  }

First, DynamicVar is in fact a zero-argument function that returns an object. We save this dynamic variable proxy object in a lexical variable, my_dv.

Second, dynamic variables are set using the call member function. The call member function takes two arguments. First, it takes the value to set the dynamic variable to, and second, it takes a block of code to run in which the dynamic variable is set to that value.

Last, in any scope, you can retrieve the value of a dynamic variable by calling the get member function on the proxy object. If the dynamic variable isn't defined in your current call stack, an exception will be raised (more on exceptions in a minute).

This allows you, along with continuations, to do some really interesting things.

Full example: break and continue

Let's rewrite while so that we can have break and continue control structures.

First, the code, and then an explanation.

  while_dv = DynamicVar()
  while = {|test, body|
    if test() {
      should_continue = {
        fall_out_of_should_continue = cont
        while_dv.call fall_out_of_should_continue {
          body()
          true
        }
      }
      if should_continue() {
        while(test, body)
      }
    }
  }
  break = { while_dv.get()(false) }
  continue = { while_dv.get()(true) }

The overall strategy here is that within a while loop, we will define a dynamic variable that will allow for popping out of the loop using a continuation, which will take a value as to whether or not the loop should continue with the next iteration. We'll then define break and continue to use that dynamic variable appropriately.

So, first, we instantiate a new dynamic variable proxy object. It's going to be used by while, break, and continue, so we define it first.

Next, we define while. As before, while still works recursively, and, as before, the base case is to do nothing and just return if the test evaluates to false.

In the case that test is true, though, we're going to run one loop iteration. We want some function that will return true if we should continue and loop again, and will return false if we need to stop. That function is should_continue.

should_continue is very first going to save its own continuation. The continuation of should_continue is then stored in the dynamic variable, and the body of the loop is called. If body returns without incident, we want to loop again, so the last statement of the block is true. The last statement is then propagated up as should_continue's return value.

However, if for some reason the continuation of should_continue is called from within body, the rest of the body will not run, and that value will fall out of should_continue as the return value.

So, we have all of the pieces in place. We'll see if test is good. If it is, we'll run body. If body completes successfully, we'll loop again. If body does not complete, it is due to a continuation being called, which will be due to break or continue. break will return false, meaning we should not loop again, whereas continue will return true.

We've just implemented a fancy control structure completely within the language. Let's see how it works:

  while {true} {
    print "still running"
    if (all_done()) { break() }   # could also be: if all_done() break
  }

Exceptions

At this point, you may be wondering why, if dynamic scope can be so useful, why no currently popular language uses or supports it. The truth is nearly all modern languages support a very specialized use-case of dynamic scope: exceptions.

If you think about it, exceptions are simply an exception handler data structure stored in dynamic scope. Whenever you start a try-block, a new exception handler is stored in dynamic scope for the exception handler dynamic variable, and all throw or raise does is call it.

With the addition of continuations, we can completely implement exceptions within Pants as well.

  throw_dv = DynamicVar()
  try = {|body, handler|
    fall_out_of_try = cont
    throw_dv.call {|error| fall_out_of_try handler(error)} {
      body()
    }
  }
  throw = {|error| throw_dv.get()(error)}

If you understood while, break, and continue, try and throw should actually look pretty straightforward to you.

Pants does a little bit of extra work to make exceptions nice and useful - specifically, Pants keeps track of throw_dv and uses it at runtime for a number of runtime exceptions, such as array-out-of-bounds, invalid function calls, missing object fields, etc, etc.

Let's see how exceptions work:

  try {
    { throw "can't divide by zero!" } @unless (a .> 0)
    print (b ./ a)
  } {|e|
    print "exception!" e
  }

Objects

Arrays

Dictionaries