NOTES

These notes are from 2006-09-13 around 7 p.m.

Lisp:
    (f x y)


Smalltalk:
    o m: x n: y


LX:
    (o.m x y) === (o .m x y)
    (o:m x y) === (o :m x y)
    (x := a)
    (x y := a b)
    '(...)

    (f x y) => (f.run x y)
    (a + b) => (a.run + b) =[a.run]=> (+ a b)

    expr : atom
         | list

    list : '(' expr+ ')'
         | '(' atom+ ':=' expr+ ')'

    atom : NAME
         | INT
         | DEC

special methods:
    run

New notes, revised on 2006-09-21 around 8:30 p.m.

Lisp:
    (f x y)


Smalltalk:
    o m: x n: y


LX:
    (o .m x y)
    (o :m x y)
    (x :: a)
    # not yet # (x y :: a b)
    '(...)

    (f x y) => (f :run x y)
    (a + b) => (a :run + b) =[a :run]=> (+ a b)

    expr : atom
         | list

    list : '(' expr+ ')'
         | '(' atom+ '::' expr+ ')'

    atom : NAME
         | INT
         | DEC

special methods:
    run


New notes, revised on 2006-10-02 around 6:30 p.m.

LX:
    perhaps soon restore
    (o.m x y) === (o .m x y)
    (o:m x y) === (o :m x y)

    soon
    ((x y) :: (list a b))

    (a + b) => (a :+ b)

New notes, added on 2006-11-19 around 10:10 p.m.

LX shorthand notation:
    (x :m :n a) === ((x :m) :n a) ???
    (x :m a :n b) === ((x :m a) :n b) ???

Support overloading by number of args as in E:
    def queue ():
        (run x):
            # push item x onto the back
        (run):
            # pull an item from the front

These methods would be known as run:1 and run:0, similarly to E's convention.

Also allow variable arg lists as in Scheme:
    def o ():
        (run x . args):
            # do something cool

But maybe don't allow the programmer to mix the two (that seems hard).

Use overloading of the run:n methods to give the following api:

    def q (queue :make)
    q 42
    print (q)

    def a (array :make 3)
    a 0 "a"
    a 1 "b"
    a 2 "c"
    print (a 0) (a 1) (a 2)

    def h (hash-table :make)
    h "foo" 42
    print (h "foo")
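
A rough Python sketch of how dispatching on name:arity could support the API
above. The Obj class, its define and send helpers, and the make_queue and
make_array constructors are names made up for illustration; this is not how LX
itself is implemented.

```python
class Obj:
    """An object whose methods are keyed by 'name:arity' (run:0, run:1, ...)."""

    def __init__(self):
        self.methods = {}

    def define(self, name, arity, fn):
        self.methods[f"{name}:{arity}"] = fn

    def send(self, name, *args):
        key = f"{name}:{len(args)}"
        if key not in self.methods:
            raise AttributeError(f"no method {key}")
        return self.methods[key](*args)

    def __call__(self, *args):
        # (o x y) => (o :run x y)
        return self.send("run", *args)


def make_queue():
    items = []
    q = Obj()
    q.define("run", 1, lambda x: items.append(x))  # push item x onto the back
    q.define("run", 0, lambda: items.pop(0))       # pull an item from the front
    return q


def make_array(size):
    slots = [None] * size
    a = Obj()

    def store(i, value):
        slots[i] = value

    a.define("run", 2, store)                # (a 0 "a") stores
    a.define("run", 1, lambda i: slots[i])   # (a 0) fetches
    return a


q = make_queue()
q(42)
front = q()   # run:0 pulls the 42 pushed by run:1
```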

New notes, added on 2006-12-28 around 6:15 p.m.

Here is how to mix overloading by arity with variable arg lists without
sacrificing much speed:

  - Each variable-arg method generates several entries meth:arity, meth:(arity+1),
    meth:(arity+2), etc. up to meth:m where m is the arity of the largest
    method plus one or the method expansion constant (MEC), whichever is
    greater. MEC will initially be 10, but can be tuned for a space-time
    tradeoff.
  - Each fixed-arg method generates a single entry meth:arity
  - More specific entries override less specific ones
  - The largest variable-arity method also generates an entry meth:+

  For example:

    def o ():
        (run x) x
            => run:1
        (run x y) (x + y)
            => run:2
        (run x y z . w) x
            => run:3
            => run:4
            => run:6
            => run:7
            => run:8
            => run:9
            => run:10
            => run:11
        (run a b c d e) 42
            => run:5
        (run a b c d e f g h i j k l . m) 12
            => run:12
            => run:+

    To look up a method with MEC or fewer arguments, just look up name:args. If
    the lookup is for a call with more than MEC arguments, first try name:args.
    If it doesn't exist, then try name:+.
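
    A Python sketch of the table construction and lookup described above. The
    function names and the (label, arity, is_variadic) tuples are invented for
    illustration. It reads "whichever is greater" as max(largest arity + 1, MEC),
    which is an assumption; under that reading run:13 also gets an entry, which
    the example list above does not show.

```python
MEC = 10  # method expansion constant; 10 is the initial value from the notes


def build_table(methods, mec=MEC):
    """methods: list of (label, fixed_arity, is_variadic) sharing one name.

    Returns a dict mapping an arity (or "+") to the label of the chosen method.
    """
    largest = max(arity for _, arity, _ in methods)
    limit = max(largest + 1, mec)  # assumed reading of "whichever is greater"

    table = {}
    variadic = sorted((m for m in methods if m[2]), key=lambda m: m[1])
    # Expand variadic methods from least to most specific so later ones override.
    for label, arity, _ in variadic:
        for n in range(arity, limit + 1):
            table[n] = label
    if variadic:
        table["+"] = variadic[-1][0]  # the largest variable-arity method
    # Fixed-arity methods are the most specific entries of all.
    for label, arity, is_variadic in methods:
        if not is_variadic:
            table[arity] = label
    return table


def lookup(table, nargs, mec=MEC):
    if nargs in table:
        return table[nargs]
    if nargs > mec and "+" in table:
        return table["+"]
    raise LookupError(f"no method for {nargs} args")


run_table = build_table([
    ("run x", 1, False),
    ("run x y", 2, False),
    ("run x y z . w", 3, True),
    ("run a b c d e", 5, False),
    ("run a..l . m", 12, True),
])
```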

New notes, added on 2007-03-08 around 3:50 p.m.
(Updated 2007-04-12 around 11:07 a.m. to fix typo.)

Source transformations for a more convenient import statement:

  New keyword
  load x is like previous import x

  import x ==> def x (import x)

  import (x y) ==> def y (import x)

  import x a b ==> begin (def x (import x)) (def a (x :a)) (def b (x :b))

  import x (a i) (b j) ==>
  begin (def x (import x)) (def i (x :a)) (def j (x :b))

Source transformations for more convenient exporting of symbols:

  export a b c ==>
  obj () ((a) a) ((b) b) ((c) c)
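
  A small Python sketch of the export transformation, with forms modelled as
  nested lists of strings. The expand_export and render helpers are
  hypothetical names, and only plain names are handled.

```python
def expand_export(names):
    """Turn ['a', 'b', 'c'] into the list form of obj () ((a) a) ((b) b) ((c) c)."""
    return ["obj", []] + [[[n], n] for n in names]


def render(form):
    """Print a nested-list form back as parenthesized text."""
    if isinstance(form, list):
        return "(" + " ".join(render(f) for f in form) + ")"
    return form


exported = render(expand_export(["a", "b", "c"]))
```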

New notes, added on 2007-03-16 around 3:11 p.m.

  Literal syntax for: dates, times, email addresses, paths, URIs

  pseudo-ISO:
    2007-03-16
    2007-03-16T15:13
    2007-03-16T15:13Z
    2007-03-16T15:13+5
    2007-03-16T3:13pm
    2007-03-16T03:13pm
    2007-03-16T03:13:32pm
    2007-03-16T03:13:32.2345pm
    03:13am
    03:13pm
    03:13:32pm
    03:13:32.2345pm
  other formats?

  kr@xph.us

  /usr/local/foo/bar.txt

  http://xph.us/software/
  http://xph.us/software/unlambda
  http://xph.us/software/unlambda.html

  Protocols: http https ftp data file imap mailto pop sip (others?)

New notes, added on 2007-03-20 around 1:52 a.m.

  Here is how to offer optional args in addition to var args and overloading by
  arity. It is done by (surprise, surprise) another source transformation.

  The idea: expand a method definition with optional arguments into several
  method definitions, each of which calculates one optional value and passes it
  to the next. The usual mechanism for selecting a method by arity will cause
  the appropriate one to be called.

  For example (using a pseudo-syntax that wouldn't actually work):

    obj () ((m x y=1 z=2 w=3) body)

  becomes

    obj ():
        (m x) (self:m x 1)
        (m x y) (self:m x y 2)
        (m x y z) (self:m x y z 3)
        (m x y z w) body

  Unfortunately, this means that a call to m:1 will execute three extra method
  calls. This transformation would play nicely with the PIC optimizer, making
  this not very burdensome, but it's still less than ideal.

  Alternatively, the example above could be expanded to the following:

    obj ():
        (m x) (self:m x 1 2 3)
        (m x y) (self:m x y 2 3)
        (m x y z) (self:m x y z 3)
        (m x y z w) body

  This would reduce the number of method calls but add some bloat in the
  generated code, which might affect locality and cache behavior. It's unclear
  which strategy is better; I'll need to do some profiling to get a real
  answer. My hunch is that the second version with more code but fewer run-time
  calls will win overall because the expressions that get duplicated will tend
  to be very small -- nearly always just loading compile-time constants.
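
  To compare the two strategies, here is a Python sketch that models each
  generated method as "the defaults it appends before calling again" and counts
  the extra calls. The expand and call helpers are invented for this
  illustration and say nothing about real costs; only profiling would.

```python
def expand(n_required, defaults, chained):
    """Map each arity to the defaults it forwards (None marks the real body)."""
    full = n_required + len(defaults)
    table = {}
    for arity in range(n_required, full):
        k = arity - n_required              # optionals the caller supplied
        if chained:
            table[arity] = defaults[k:k + 1]  # fill in just the next default
        else:
            table[arity] = defaults[k:]       # fill in every remaining default
    table[full] = None
    return table


def call(table, args):
    """Follow the forwarding chain; return the final args and the extra calls."""
    args = list(args)
    extra_calls = 0
    while table[len(args)] is not None:
        args += table[len(args)]
        extra_calls += 1
    return args, extra_calls


# obj () ((m x y=1 z=2 w=3) body), called as m:1
chained_result = call(expand(1, [1, 2, 3], chained=True), ["x"])
direct_result = call(expand(1, [1, 2, 3], chained=False), ["x"])
```

  Both strategies end up at the body with the same arguments; the first takes
  three extra calls for m:1 and the second takes one.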

  Now it just needs a syntax. I said above that "x=1" wouldn't work. That's
  because "x=1" is a perfectly valid variable name. I'm currently considering
  further abusing the colon, as in "x:1", but that is unsatisfying. It also
  looks too similar to a method call when the default expr is a variable
  lookup, as in "x:y". I could use a parenthesized form, but I'm saving that
  for destructuring binds.

New notes, added on 2007-03-20 around 2:21 a.m.

  A neat side effect of doing optional args by this source transformation
  method is that it becomes pretty easy to give meaningful, well-defined
  behavior if the user wants to put an optional argument in the middle of the
  list. They don't have to go only at the end. Consider:

  obj () ((m x y=3 z) body)

  obj ():
      (m x z) (self:m x 3 z)
      (m x y z) body

  Or, a slightly more complicated example:

  obj () ((m a b=3 c d=7 e) body)

  obj ():
      (m a c e) (self:m a 3 c e)
      (m a b c e) (self:m a b c 7 e)
      (m a b c d e) body

  It still needs a syntax, though. I've also skirted around another issue:
  doing this as a source transformation causes trouble in reliably referring to
  the receiver. I cleverly avoided this problem in my pseudo-syntax by using
  the word "self", which has no special meaning in LX. This would be no problem
  in bytecode, but it's tricky as a source transformation. I may have to
  introduce another "impossible" keyword.
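
  A Python sketch of the transformation when optional params may sit anywhere in
  the list. It emits text in the same pseudo-syntax as above, including the
  placeholder "self". The expand_optionals name and the (name, default) pairs,
  with None meaning "required", are assumptions made for the example.

```python
def expand_optionals(name, params):
    """params: list of (param_name, default); default is None when required.

    Returns the generated method clauses as strings, shortest arity first.
    """
    optional = [p for p, d in params if d is not None]
    clauses = []
    for k, filling in enumerate(optional):
        supplied = set(optional[:k])   # optionals the caller passed
        sig, fwd = [name], ["self:" + name]
        for p, d in params:
            if d is None or p in supplied:
                sig.append(p)
                fwd.append(p)
            elif p == filling:
                fwd.append(str(d))     # fill in the next missing default
            # optionals further to the right are left for the next clause
        clauses.append("(" + " ".join(sig) + ") (" + " ".join(fwd) + ")")
    full = [name] + [p for p, _ in params]
    clauses.append("(" + " ".join(full) + ") body")
    return clauses


simple = expand_optionals("m", [("x", None), ("y", 3), ("z", None)])
harder = expand_optionals(
    "m", [("a", None), ("b", 3), ("c", None), ("d", 7), ("e", None)]
)
```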

New notes, added on 2007-04-10 around 12:03 a.m.

  The name is Sodium!!! That is all.

New notes, added on 2007-04-11 around 1:55 p.m.

  Three things today. First, I'm strongly considering swapping the meanings of
  dot and colon in method invocation. So x.m would be a blocking call and x:m
  would be a nonblocking call. Originally, I chose dot for the asynchronous
  operation because I wanted to encourage people to use it by making it easier
  to type, compared with the colon. But after writing just a couple thousand
  lines of Sodium (nee LX) code, I've observed that blocking calls are going to
  be much more common than nonblocking calls no matter what. I should optimize
  for the common case.

  By the way, I'm also changing my terminology slightly. Until now I had been
  pretty careful about referring to synchronous method invocations as
  "immediate calls" and asynchronous method invocations as "eventual sends". I
  think a better approach is to refer to them both as "calls" or "message
  sends" (a la Smalltalk) or "invocations" or whatever, to reinforce the notion
  that they are more similar than different. Then, where necessary, one can
  distinguish the two by using an adjective such as "blocking" or "nonblocking"
  or "synchronous" or "asynchronous" (or even "immediate" or "eventual").

  Second, I'm strongly considering adding a bit more low-level syntax (i.e. not
  a parse tree transformation) to let the programmer omit more parentheses. I
  first mentioned this on 2006-11-19 but didn't consider it very seriously.
  Here's how it works:

  Currently, any combination can contain at most one "IMESS" or "SMESS" token
  (that is, a symbol prefixed by a colon or dot) and that token must occur in
  the second position. The occurrence of another of these tokens is an error.
  Instead, I will let the occurrence of another such token signal that
  everything seen so far in the combination should be treated as if it were
  wrapped in parens. That means that this

    (o a b c):m x y

  can be shortened to

    o a b c:m x y

  and this

    ((x:foo):bar):baz a b

  can be shortened to

    x:foo:bar:baz a b

  This feature almost always lets you omit a set of parens that starts at the
  beginning of the line. The exception is

    (x):m

  which cannot be shortened. That's because the :m would be in the second
  position and x itself (rather than the result of calling x:run with zero
  args) would be sent the message.
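
  The grouping rule in this entry can be sketched in Python over a flat list of
  already-split tokens. The is_message test (a string starting with a colon or
  dot) and the group_left name are assumptions for illustration; a real reader
  would work on lexer tokens, not strings.

```python
def is_message(tok):
    # Assumption: a message token is a string such as ":m" or ".m".
    return isinstance(tok, str) and len(tok) > 1 and tok[0] in ":."


def group_left(tokens):
    """Wrap everything seen so far whenever a message appears past position two."""
    acc = []
    for tok in tokens:
        if is_message(tok) and len(acc) > 1:
            acc = [acc]   # treat what came before as if it were in parens
        acc.append(tok)
    return acc


first = group_left(["o", "a", "b", "c", ":m", "x", "y"])
second = group_left(["x", ":foo", ":bar", ":baz", "a", "b"])
third = group_left(["x", ":m"])   # message in second position: nothing to wrap
```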

  To go along with this, I'm planning on changing the treatment of punctuation
  messages. Currently, 3 + 4 is translated to 3 :+ 4 by a parse tree
  transformation during eval. Instead, I'm going to move it all the way down to
  the lexer. I'll have the lexer emit an IMESS token when it encounters a word
  of just one or two characters in the list of special punctuation. So x-y will
  still be a single symbol token, but x - y will lex the same way as x:- y.
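
  A toy Python lexer showing the punctuation rule. The PUNCT set is a guess,
  since the list of special punctuation is not spelled out here, and the lexer
  splits on whitespace only, so it expects "x :- y" rather than "x:- y" and
  ignores parens and number tokens entirely.

```python
PUNCT = set("+-*/<>=")  # assumed; stands in for the list of special punctuation


def lex(src):
    tokens = []
    for word in src.split():
        if len(word) <= 2 and all(ch in PUNCT for ch in word):
            tokens.append(("IMESS", word))       # bare punctuation word
        elif word[0] == ":" and len(word) > 1:
            tokens.append(("IMESS", word[1:]))   # explicit :name message
        else:
            tokens.append(("SYMBOL", word))
    return tokens


spaced = lex("x - y")
explicit = lex("x :- y")
joined = lex("x-y")
```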

  Third, I'm weakly considering making the equals sign more special to provide
  for a readable syntax for optional params. This would involve adding one more
  token type to the lexer, a "keyword" token (for lack of a better name), that
  looks like a symbol suffixed with an equals sign. Then, in a list of formal
  params, a keyword followed by an expr would count as one optional parameter.
  This is essentially the pseudo-syntax I used in my earlier example, but made
  real. This would mean that x=5 is no longer a valid variable name; you can't
  use '=' in identifiers any more. Having used Scheme, I like the ability to
  use nearly any character in a variable name, so I would lament the loss of
  the equals sign. But, then again, the lack of '=' in identifiers doesn't seem
  to bother Python users very much.

  This would also provide a syntax for keyword args. Unfortunately, syntax is
  not the hard part of keyword args. I'd love to do keyword args some day but
  I'll need some inspiration on the mechanism.

New notes, added on 2007-04-11 around 3:11 p.m.

  Well, the last piece of the puzzle in providing optional args is a way to
  refer to the self object reliably, even when that object is not bound to any
  name in scope. Right now, the answer is implicit self. Since an IMESS token
  cannot currently occur as the first thing in a combination, I'll use that to
  mean that the receiver is self. Although generally I agree that explicit is
  better than implicit, I think this is better than adding a reserved word or
  something. And arguably, this isn't really implicit, it's just terse. The
  notation is unambiguous (unlike, say, Ruby) and it doesn't break any existing
  code.

Updated notes on 2007-04-12 around 11:07 a.m.
(Originally added on 2007-03-08 around 3:50 p.m.)

Source transformations for a more convenient import statement:

  New keyword
  load-module x is like previous import x

  import x
    => def x (load-module x)

  import (x -> y)
    => def y (load-module x)

  import x a b
    => def x (load-module x)
       def a (x :a)
       def b (x :b)

  import x (a -> i) (b -> j)
    => def x (load-module x)
       def i (x :a)
       def j (x :b)
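
  A Python sketch of the import transformation, with forms modelled as nested
  lists and a rename written as ['a', '->', 'i']. The expand_import name is
  hypothetical. Combining a module rename with imported names is not covered
  above; the sketch assumes the names are then looked up on the renamed module.

```python
def expand_import(form):
    """form: ['import', module, *names]; returns a list of def forms."""
    _, module, *names = form
    if isinstance(module, list):
        source, _, local = module        # (x -> y)
    else:
        source = local = module
    defs = [["def", local, ["load-module", source]]]
    for n in names:
        if isinstance(n, list):
            attr, _, alias = n           # (a -> i)
        else:
            attr = alias = n
        defs.append(["def", alias, [local, ":" + attr]])
    return defs


plain = expand_import(["import", "x", "a", "b"])
renamed = expand_import(["import", "x", ["a", "->", "i"], ["b", "->", "j"]])
```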

Source transformations for more convenient exporting of symbols:

  export a b c
    => obj () ((a) a) ((b) b) ((c) c)

  export a (b -> y) c
    => obj () ((a) a) ((y) b) ((c) c)

New notes, added on 2007-04-25 around 3:35 p.m.

  Well, I've swapped the meaning of colon and dot for messages and implemented
  the extra paren-omitting syntax feature. You can now put a message anywhere
  in a combination and the reader will insert parens for you. There's one small
  but significant change from my original description of this feature, though.
  I've changed the associativity so that a message only applies to the
  subexpression immediately before it rather than all subexpressions up to that
  point. You could say it's "right-associative".

  So my previous example

    o a b c:m x y

  actually means

    o a b (c:m x y)

  rather than

    (o a b c):m x y

  I made this change after looking at a few examples and noticing that it
  allows you to omit far more parens than my original plan.

  I haven't yet changed the lexer to turn punctuation into messages.


New notes, added on 2008-09-19 around 10:52 a.m.

  Use the exclamation mark "!" as shorthand for logical negation as in Arc.
  This means "!x" gets rewritten to "(not x)".