Find file
Fetching contributors…
Cannot retrieve contributors at this time
443 lines (319 sloc) 13.5 KB
These notes are from 2006-09-13 around 7 p.m.
(f x y)
o m: x n: y
(o.m x y) === (o .m x y)
(o:m x y) === (o :m x y)
(x := a)
(x y := a b)
(f x y) => ( x y)
(a + b) => ( + b) =[]=> (+ a b)
expr : atom
| list
list : '(' expr+ ')'
| '(' atom+ ':=' expr+ ')'
atom : NAME
special methods:
New notes, revised on 2006-09-21 around 8:30 p.m.
(f x y)
o m: x n: y
(o .m x y)
(o :m x y)
(x :: a)
# not yet # (x y :: a b)
(f x y) => (f :run x y)
(a + b) => (a :run + b) =[a :run]=> (+ a b)
expr : atom
| list
list : '(' expr+ ')'
| '(' atom+ '::' expr+ ')'
atom : NAME
special methods:
New notes, revised on 2006-10-02 around 6:30 p.m.
perhaps soon restore
(o.m x y) === (o .m x y)
(o:m x y) === (o :m x y)
((x y) :: (list a b))
(a + b) => (a :+ b)
New notes, added on 2006-11-19 around 10:10 p.m.
LX shorthand notation:
(x :m :n a) === ((x :m) :n a) ???
(x :m a :n b) === ((x :m a) :n a) ???
Support overloading by number of args as in E:
def queue ():
(run x):
# push item x onto the back
# pull an item from the front
These methods would be known as run:1 and run:0, similarly to E's convention.
Also allow variable arg lists as in Scheme:
def o ():
(run x . args):
# do something cool
But maybe don't allow the programmer to mix the two (that seems hard)
Use overloading of the run:n methods to give the following api:
def q (queue :make)
q 42
print (q)
def a (array :make 3)
a 0 "a"
a 1 "b"
a 2 "c"
print (a 0) (a 1) (a 2)
def h (hash-table :make)
h "foo" 42
print (h "foo")
New notes, added on 2006-12-28 around 6:15 p.m.
Here is how to mix the two without sacrificing much speed:
- Each multi-arg method generates several entries meth:arity, meth:(arity+1),
meth:(arity+2), etc. up to meth:m where m is the arity of the largest
method plus one or the method expansion constant (MEC), whichever is
greater. MEC will initially be 10, but can be tuned for a space-time
- Each fixed-arg method generates a single entry meth:arity
- More specific entries override less specific ones
- The largest variable-arity method also generates an entry meth:+
For example:
def o ():
(run x) x
=> run:1
(run x y) (x + y)
=> run:2
(run x y z . w) x
=> run:3
=> run:4
=> run:6
=> run:7
=> run:8
=> run:9
=> run:10
=> run:11
(run a b c d e) 42
=> run:5
(run a b c d e f g h i j k l . m): 12
=> run:12
=> run:+
To lookup a method with MEC or fewer arguments, just lookup name:args. If
the lookup is for a call with more than MEC arguments, first try name:args.
If it doesn't exist then try name:+.
New notes, added on 2007-03-08 around 3:50 p.m.
(Updated 2007-04-12 around 11:07 a.m. to fix typo.)
Source transformations for a more convenient import statement:
New keyword
load x is like previous import x
import x ==> def x (import x)
import (x y) ==> def y (import x)
import x a b ==> begin (def x (import x)) (def a (x :a)) (def b (x :b))
import x (a i) (b j) ==>
begin (def x (import x)) (def i (x :a)) (def j (x :b))
Source transformations for more convenient exporting of symbols:
export a b c ==>
obj () ((a) a) ((b) b) ((c) c)
New notes, added on 2007-03-16 around 3:11 p.m.
Literal syntax for: dates, times, email addrs, paths, uris
other formats?
Protocols: http https ftp data file imap mailto pop sip (others?)
New notes, added on 2007-03-20 around 1:52 a.m.
Here is how to offer optional args in addition to var args and overloading by
arity. It is done by (surprise, surprise) another source transformation.
The idea: expand a method definition with optional arguments into several
method definitions, each of which calculates one optional value and passes it
to the next. The usual mechanism for selecting a method by arity will cause
the appropriate one to be called.
For example (using a pseudo-syntax that wouldn't actually work):
obj () ((m x y=1 z=2 w=3) body)
obj ():
(m x) (self:m x 1)
(m x y) (self:m x y 2)
(m x y z) (self:m x y z 3)
(m x y z w) body
Unfortunately, this means that a call to m:1 will execute three extra method
calls. This transformation would play nicely with the PIC optimizer, making
this not very burdensome, but it's still less than ideal.
Alternatively, the example above could be expanded to the following:
obj ():
(m x) (self:m x 1 2 3)
(m x y) (self:m x y 2 3)
(m x y z) (self:m x y z 3)
(m x y z w) body
This would reduce the number of method calls but add some bloat in the
generated code, which might affect locality and cache behavior. It's unclear
which strategy is better; I'll need to do some profiling to get a real
answer. My hunch is that the second version with more code but fewer run time
calls will win overall because the expressions that get duplicated will tend
to be very small -- nearly always just loading compile-time constants.
Now it just needs a syntax. I said above that "x=1" wouldn't work. That's
because "x=1" is a perfectly valid variable name. I'm currently considering
further abusing the colon, as in "x:1", but that is unsatisfying. It also
looks too similar to a method call when the default expr is a variable
lookup, as in "x:y". I could use a parenthesized form, but I'm saving that
for destructuring binds.
New notes, added on 2007-03-20 around 2:21 a.m.
A neat side effect of doing optional args by this source transformation
method is that it becomes pretty easy to give meaningful, well-defined
behavior if the user wants to put an optional argument in the middle of the
list. They don't have to go only at the end. Consider:
obj () ((m x y=3 z) body)
obj ():
(m x z) (self:m x 3 z)
(m x y z) body
Or, a slightly more complicated example:
obj () ((m a b=3 c d=7 e) body)
obj ():
(m a c e) (self:m a 3 c e)
(m a b c e) (self:m a b c 7 e)
(m a b c d e) body
It still needs a syntax, though. I've also skirted around another issue:
doing this as a source transformation causes trouble in reliably referring to
the receiver. I cleverly avoided this problem in my pseudo-syntax by using
the word "self", which has no special meaning in LX. This would be no problem
in bytecode, but it's tricky as a source transformation. I may have to
introduce another "impossible" keyword.
New notes, added on 2007-04-10 around 12:03 a.m.
The name is Sodium!!! That is all.
New notes, added on 2007-04-11 around 1:55 p.m.
Three things today. First, I'm strongly considering swapping the meanings of
dot and colon in method invocation. So x.m would be a blocking call and x:m
would be a nonblocking call. Originally, I chose dot for the asynchronous
operation because I wanted to encourage people to use it by making it easier
to type, compared with the colon. But after writing just a couple thousand
lines of Sodium (nee LX) code, I've observed that blocking calls are going to
be much more common than nonblocking calls no matter what. I should optimize
for the common case.
By the way, I'm also changing my terminology slightly. Until now I had been
pretty careful about referring to synchronous method invocations as
"immediate calls" and asynchronous method invocations as "eventual sends". I
think a better approach is to refer to them both as "calls" or "message
sends" (a la Smalltalk) or "invocations" or whatever, to reinforce the notion
that they are more similar than different. Then, where necessary, one can
distinguish the two by using an adjective such as "blocking" or "nonblocking"
or "synchronous" or "asynchronous" (or even "immediate" or "eventual").
Second, I'm strongly considering adding a bit more low-level syntax (i.e. not
a parse tree transformation) to let the programmer omit more parentheses. I
first mentioned this on 2006-11-19 but didn't consider it very seriously.
Here's how it works:
Currently, any combination can contain at most one "IMESS" or "SMESS" token
(that is, a symbol prefixed by a colon or dot) and that token must occur in
the second position. The occurrence of another of these tokens is an error.
Instead, I will let the occurrence of another such token signal that
everything seen so far in the combination should be treated as if it were
wrapped in parens. That means that this
(o a b c):m x y
can be shortened to
o a b c:m x y
and this
((x:foo):bar):baz a b
can be shortened to
x:foo:bar:baz a b
This feature almost always lets you omit a set of parens that starts at the
beginning of the line. The exception is
which cannot be shortened. That's because the :m would be in the second
position and x itself (rather than the result of calling x:run with zero
args) would be sent the message.
To go along with this, I'm planning on changing the treatment of punctuation
messages. Currently, 3 + 4 is translated to 3 :+ 4 by a parse tree
transformation during eval. Instead, I'm going to move it all the way down to
the lexer. I'll have the lexer emit an IMESS token when it encounters a word
of just one or two characters in the list of special punctuation. So x-y will
still be a single symbol token, but x - y will lex the same way as x:- y.
Third, I'm weakly considering making the equals sign more special to provide
for a readable syntax for optional params. This would involve adding one more
token type to the lexer, a "keyword" token (for lack of a better name), that
looks like a symbol suffixed with an equals sign. Then, in a list of formal
params, a keyword followed by an expr would count as one optional parameter.
This is essentially the pseudo-syntax I used in my earlier example, but made
real. This would mean that x=5 is no longer a valid variable name; you can't
use '=' in identifiers any more. Having used Scheme, I like the ability to
use nearly any character in a variable name, so I would lament the loss of
the equals sign. But, then again, the lack of '=' in identifiers doesn't seem
to bother Python users very much.
This would also provide a syntax for keyword args. Unfortunately, syntax is
not the hard part of keyword args. I'd love to do keyword args some day but
I'll need some inspiration on the mechanism.
New notes, added on 2007-04-11 around 3:11 p.m.
Well, the last piece of the puzzle in providing optional args is a way to
refer to the self object reliably, even when that object is not bound to any
name in scope. Right now, the answer is implicit self. Since an IMESS token
cannot currently occur as the first thing in a combination, I'll use that to
mean that the receiver is self. Although generally I agree that explicit is
better than implicit, I think this is better than adding a reserved word or
something. And arguably, this isn't really implicit, it's just terse. The
notation is unambiguous (unlike, say, Ruby) and it doesn't break any existing
Updated notes on 2007-04-12 around 11:07 a.m.
(Originally added on 2007-03-08 around 3:50 p.m.)
Source transformations for a more convenient import statement:
New keyword
load-module x is like previous import x
import x
=> def x (load-module x)
import (x -> y)
=> def y (load-module x)
import x a b
=> def x (load-module x)
def a (x :a)
def b (x :b)
import x (a -> i) (b -> j) ==>
=> def x (load-module x)
def i (x :a)
def j (x :b)
Source transformations for more convenient exporting of symbols:
export a b c
=> obj () ((a) a) ((b) b) ((c) c)
export a (b -> y) c
=> obj () ((a) a) ((y) b) ((c) c)
New notes, added on 2007-04-25 around 3:35 p.m.
Well, I've swapped the meaning of colon and dot for messages and implemented
the extra paren-omitting syntax feature. You can now put a message anywhere
in a combination and the reader will insert parens for you. There's one small
but significant change from my original description of this feature, though.
I've changed the associativity so that a message only applies to the
subexpression immediately before it rather than all subexpressions up to that
point. You could say it's "right-associative".
So my previous example
o a b c:m x y
actually means
o a b (c:m x y)
rather than
(o a b c):m x y
I made this change after looking at a few examples and noticing that it
allows you to omit far more parens than my original plan.
I haven't yet changed the lexer to turn punctuation into messages.
New notes, added on 2008-09-19 around 10:52 a.m.
Use the exclamation mark "!" as shorthand for logical negation as in Arc.
This means "!x" gets rewritten to "(not x)".