These notes are from 2006-09-13 around 7 p.m.
Lisp:
(f x y)
Smalltalk:
o m: x n: y
LX:
(o.m x y) === (o .m x y)
(o:m x y) === (o :m x y)
(x := a)
(x y := a b)
'(...)
(f x y) => (f.run x y)
(a + b) => (a.run + b) =[a.run]=> (+ a b)
expr : atom
     | list
list : '(' expr+ ')'
     | '(' atom+ ':=' expr+ ')'
atom : NAME
     | INT
     | DEC
special methods:
run
New notes, revised on 2006-09-21 around 8:30 p.m.
Lisp:
(f x y)
Smalltalk:
o m: x n: y
LX:
(o .m x y)
(o :m x y)
(x :: a)
# not yet # (x y :: a b)
'(...)
(f x y) => (f :run x y)
(a + b) => (a :run + b) =[a :run]=> (+ a b)
expr : atom
     | list
list : '(' expr+ ')'
     | '(' atom+ '::' expr+ ')'
atom : NAME
     | INT
     | DEC
special methods:
run
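A minimal Python sketch of a reader for the grammar above, plus the
(f x y) => (f :run x y) rewrite. It is an illustration only, not the actual
LX reader. The function names, the regular expressions for INT and DEC, and
the choice to leave the left-hand side of '::' untouched are all assumptions.

```python
import re

def tokenize(src):
    """Split source into parens and whitespace-delimited atoms."""
    return re.findall(r"[()]|[^\s()]+", src)

def read_atom(tok):
    """atom : NAME | INT | DEC (DEC is assumed to look like 1.5)."""
    if re.fullmatch(r"-?\d+", tok):
        return int(tok)
    if re.fullmatch(r"-?\d+\.\d+", tok):
        return float(tok)
    return tok  # anything else is a NAME

def read_expr(tokens):
    """expr : atom | list. Consumes tokens from the front of the list."""
    tok = tokens.pop(0)
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    if tok != "(":
        return read_atom(tok)
    items = []
    while tokens and tokens[0] != ")":
        items.append(read_expr(tokens))
    if not tokens:
        raise SyntaxError("missing ')'")
    tokens.pop(0)  # drop the closing paren
    if not items:
        raise SyntaxError("a list needs at least one expr")
    return items

def is_message(item):
    """True for tokens such as :m or .m (a symbol prefixed by colon or dot)."""
    return isinstance(item, str) and len(item) > 1 and item[0] in ".:"

def add_run(form):
    """Rewrite (f x y) to (f :run x y), recursing into nested lists."""
    if not isinstance(form, list):
        return form
    if "::" in form:  # assignment: only rewrite the value expressions
        i = form.index("::")
        return form[:i + 1] + [add_run(e) for e in form[i + 1:]]
    form = [add_run(item) for item in form]
    if len(form) > 1 and is_message(form[1]):
        return form  # already an explicit message send
    return [form[0], ":run"] + form[1:]
```

For example, add_run(read_expr(tokenize("(a + b)"))) gives
["a", ":run", "+", "b"], matching the first step of the (a + b) rewrite.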
New notes, revised on 2006-10-02 around 6:30 p.m.
LX:
perhaps soon restore
(o.m x y) === (o .m x y)
(o:m x y) === (o :m x y)
soon
((x y) :: (list a b))
(a + b) => (a :+ b)
New notes, added on 2006-11-19 around 10:10 p.m.
LX shorthand notation:
(x :m :n a) === ((x :m) :n a) ???
(x :m a :n b) === ((x :m a) :n b) ???
Support overloading by number of args as in E:
def queue ():
  (run x):
    # push item x onto the back
  (run):
    # pull an item from the front
These methods would be known as run:1 and run:0, similarly to E's convention.
Also allow variable arg lists as in Scheme:
def o ():
  (run x . args):
    # do something cool
But maybe don't allow the programmer to mix the two (that seems hard).
Use overloading of the run:n methods to give the following API:
def q (queue :make)
q 42
print (q)
def a (array :make 3)
a 0 "a"
a 1 "b"
a 2 "c"
print (a 0) (a 1) (a 2)
def h (hash-table :make)
h "foo" 42
print (h "foo")
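The run:n idea can be imitated in Python to get a feel for the API. This is
only a sketch: the ArityDispatch base class and the run_0 / run_1 / run_2
method names are hypothetical stand-ins for run:0, run:1 and run:2.

```python
from collections import deque

class ArityDispatch:
    """Calling the object picks the method run_<number of arguments>."""
    def __call__(self, *args):
        handler = getattr(self, f"run_{len(args)}", None)
        if handler is None:
            raise TypeError(f"no run:{len(args)} method")
        return handler(*args)

class Queue(ArityDispatch):
    def __init__(self):
        self._items = deque()

    def run_1(self, x):
        # push item x onto the back
        self._items.append(x)

    def run_0(self):
        # pull an item from the front
        return self._items.popleft()

class Array(ArityDispatch):
    def __init__(self, size):
        self._cells = [None] * size

    def run_2(self, index, value):
        # two arguments: store value at index
        self._cells[index] = value

    def run_1(self, index):
        # one argument: fetch the value at index
        return self._cells[index]
```

With these definitions, q = Queue(); q(42); q() mirrors "q 42" followed by
"(q)", and a(0, "a"); a(0) mirrors "a 0 "a"" followed by "(a 0)".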
New notes, added on 2006-12-28 around 6:15 p.m.
Here is how to mix overloading by arity with variable arg lists without
sacrificing much speed:
- Each variable-arg method generates several entries meth:arity,
  meth:(arity+1), meth:(arity+2), etc. up to meth:m, where m is the arity of
  the largest method plus one or the method expansion constant (MEC),
  whichever is greater. MEC will initially be 10, but can be tuned for a
  space-time tradeoff.
- Each fixed-arg method generates a single entry meth:arity.
- More specific entries override less specific ones.
- The largest variable-arity method also generates an entry meth:+.
For example:
def o ():
  (run x) x
    => run:1
  (run x y) (x + y)
    => run:2
  (run x y z . w) x
    => run:3
    => run:4
    => run:6
    => run:7
    => run:8
    => run:9
    => run:10
    => run:11
  (run a b c d e) 42
    => run:5
  (run a b c d e f g h i j k l . m) 12
    => run:12
    => run:+
To look up a method with MEC or fewer arguments, just look up name:args. If
the lookup is for a call with more than MEC arguments, first try name:args.
If it doesn't exist, then try name:+.
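A Python sketch of the table expansion and lookup. The expansion rule used
here is one interpretation, chosen because it reproduces the entries listed
in the example: each variable-arg method fills arities from its own required
count up to just below the next larger variable-arg method, and the largest
one fills up to MEC (or only its own arity if that is already above MEC).
The names build_entries and lookup and the tuple layout are hypothetical.

```python
MEC = 10  # method expansion constant from the notes

def build_entries(methods, mec=MEC):
    """methods: list of (required_args, is_variadic, label) tuples.

    Returns a dict mapping an arity (or "+") to the label of the method
    that should handle calls with that many arguments.
    """
    table = {}
    variadic = sorted(m for m in methods if m[1])
    for i, (required, _, label) in enumerate(variadic):
        if i + 1 < len(variadic):
            last = variadic[i + 1][0] - 1   # stop below the next variadic
        else:
            last = max(required, mec)       # largest variadic
        for n in range(required, last + 1):
            table[n] = label
    if variadic:
        table["+"] = variadic[-1][2]        # catch-all for long calls
    # Fixed-arg methods are more specific, so they override.
    for required, is_variadic, label in methods:
        if not is_variadic:
            table[required] = label
    return table

def lookup(table, nargs, mec=MEC):
    """Try name:nargs first; fall back to name:+ only above MEC."""
    if nargs in table:
        return table[nargs]
    if nargs > mec:
        return table.get("+")
    return None
```

Building the table for the five methods of the example yields keys 1 through
12 plus "+", with run:5 owned by the fixed five-arg method.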
New notes, added on 2007-03-08 around 3:50 p.m.
(Updated 2007-04-12 around 11:07 a.m. to fix typo.)
Source transformations for a more convenient import statement:
New keyword
load x is like previous import x
import x ==> def x (import x)
import (x y) ==> def y (import x)
import x a b ==> begin (def x (import x)) (def a (x :a)) (def b (x :b))
import x (a i) (b j) ==>
begin (def x (import x)) (def i (x :a)) (def j (x :b))
Source transformations for more convenient exporting of symbols:
export a b c ==>
obj () ((a) a) ((b) b) ((c) c)
New notes, added on 2007-03-16 around 3:11 p.m.
Literal syntax for: dates, times, email addrs, paths, uris
pseudo-ISO:
2007-03-16
2007-03-16T15:13
2007-03-16T15:13Z
2007-03-16T15:13+5
2007-03-16T3:13pm
2007-03-16T03:13pm
2007-03-16T03:13:32pm
2007-03-16T03:13:32.2345pm
03:13am
03:13pm
03:13:32pm
03:13:32.2345pm
other formats?
kr@xph.us
/usr/local/foo/bar.txt
http://xph.us/software/
http://xph.us/software/unlambda
http://xph.us/software/unlambda.html
Protocols: http https ftp data file imap mailto pop sip (others?)
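A rough Python sketch of how a lexer might tell these literals apart. The
regular expressions are assumptions fitted to the samples above, not a
specification, and the classify function and kind names are hypothetical.

```python
import re

DATE = r"\d{4}-\d{2}-\d{2}"
TIME = r"\d{1,2}:\d{2}(?::\d{2}(?:\.\d+)?)?(?:am|pm)?"
ZONE = r"(?:Z|[+-]\d{1,2})?"
SCHEMES = "http|https|ftp|data|file|imap|mailto|pop|sip"

# Order matters: the first pattern that matches the whole token wins.
PATTERNS = [
    ("datetime", re.compile(f"{DATE}T{TIME}{ZONE}")),
    ("date", re.compile(DATE)),
    ("time", re.compile(TIME)),
    ("uri", re.compile(f"(?:{SCHEMES}):\\S+")),
    ("email", re.compile(r"[^@\s/]+@[^@\s/]+\.[^@\s/]+")),
    ("path", re.compile(r"/\S*")),
]

def classify(token):
    """Return the literal kind for a token, or "symbol" if none applies."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return kind
    return "symbol"
```

Only absolute paths are recognized here, since the notes give just one path
sample; relative paths would need a separate decision.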
New notes, added on 2007-03-20 around 1:52 a.m.
Here is how to offer optional args in addition to var args and overloading by
arity. It is done by (surprise, surprise) another source transformation.
The idea: expand a method definition with optional arguments into several
method definitions, each of which calculates one optional value and passes it
to the next. The usual mechanism for selecting a method by arity will cause
the appropriate one to be called.
For example (using a pseudo-syntax that wouldn't actually work):
obj () ((m x y=1 z=2 w=3) body)
becomes
obj ():
  (m x) (self:m x 1)
  (m x y) (self:m x y 2)
  (m x y z) (self:m x y z 3)
  (m x y z w) body
Unfortunately, this means that a call to m:1 will execute three extra method
calls. This transformation would play nicely with the PIC optimizer, making
this not very burdensome, but it's still less than ideal.
Alternatively, the example above could be expanded to the following:
obj ():
  (m x) (self:m x 1 2 3)
  (m x y) (self:m x y 2 3)
  (m x y z) (self:m x y z 3)
  (m x y z w) body
This would reduce the number of method calls but add some bloat in the
generated code, which might affect locality and cache behavior. It's unclear
which strategy is better; I'll need to do some profiling to get a real
answer. My hunch is that the second version with more code but fewer run-time
calls will win overall because the expressions that get duplicated will tend
to be very small -- nearly always just loading compile-time constants.
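Both expansion strategies for trailing optional args can be sketched as a
small Python generator. The function names, the (name, default) pair layout,
and the use of "self:" as the receiver are illustrative assumptions that
follow the pseudo-syntax above.

```python
def expand_optional(name, required, optional, direct=False):
    """Return (params, body) pairs for a method with trailing optional args.

    required: list of parameter names
    optional: list of (name, default) pairs, in order
    direct=False: each overload fills in one default and calls the next one
    direct=True:  each overload fills in all remaining defaults at once
    """
    overloads = []
    for k in range(len(optional)):
        params = required + [n for n, _ in optional[:k]]
        stop = len(optional) if direct else k + 1
        fill = [str(d) for _, d in optional[k:stop]]
        overloads.append((params, ["self:" + name] + params + fill))
    all_params = required + [n for n, _ in optional]
    overloads.append((all_params, "body"))
    return overloads

def render(name, overloads):
    """Format the overloads in the style used in these notes."""
    lines = []
    for params, body in overloads:
        head = "(" + " ".join([name] + params) + ")"
        text = body if isinstance(body, str) else "(" + " ".join(body) + ")"
        lines.append(f"{head} {text}")
    return lines
```

Calling render("m", expand_optional("m", ["x"], [("y", 1), ("z", 2),
("w", 3)])) reproduces the chained expansion, and passing direct=True
reproduces the alternative with fewer calls.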
Now it just needs a syntax. I said above that "x=1" wouldn't work. That's
because "x=1" is a perfectly valid variable name. I'm currently considering
further abusing the colon, as in "x:1", but that is unsatisfying. It also
looks too similar to a method call when the default expr is a variable
lookup, as in "x:y". I could use a parenthesized form, but I'm saving that
for destructuring binds.
New notes, added on 2007-03-20 around 2:21 a.m.
A neat side effect of doing optional args by this source transformation
method is that it becomes pretty easy to give meaningful, well-defined
behavior if the user wants to put an optional argument in the middle of the
list. They don't have to go only at the end. Consider:
obj () ((m x y=3 z) body)
becomes
obj ():
  (m x z) (self:m x 3 z)
  (m x y z) body
Or, a slightly more complicated example:
obj () ((m a b=3 c d=7 e) body)
becomes
obj ():
  (m a c e) (self:m a 3 c e)
  (m a b c e) (self:m a b c 7 e)
  (m a b c d e) body
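The same transformation with optional args in any position can be sketched in
Python as follows. It assumes, as the two examples suggest, that optional
args are supplied left to right, so a call with k extra arguments binds the
first k optional parameters. The function name and the tuple encoding of
optional parameters are hypothetical.

```python
def expand_mixed(name, params):
    """params: required names as str, optional ones as (name, default)."""
    optional = [i for i, p in enumerate(params) if isinstance(p, tuple)]
    lines = []
    for k, filled in enumerate(optional):
        supplied = set(optional[:k])     # optionals the caller passed
        formals, call = [], []
        for i, p in enumerate(params):
            if isinstance(p, str):       # required: always passed through
                formals.append(p)
                call.append(p)
            elif i in supplied:          # optional that the caller supplied
                formals.append(p[0])
                call.append(p[0])
            elif i == filled:            # the one default filled in here
                call.append(str(p[1]))
            # later optionals are left to the next overload in the chain
        head = " ".join([name] + formals)
        lines.append(f"({head}) (self:{name} {' '.join(call)})")
    names = [p if isinstance(p, str) else p[0] for p in params]
    lines.append(f"({' '.join([name] + names)}) body")
    return lines
```

For instance, expand_mixed("m", ["a", ("b", 3), "c", ("d", 7), "e"]) yields
the three overloads of the more complicated example.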
It still needs a syntax, though. I've also skirted around another issue:
doing this as a source transformation causes trouble in reliably referring to
the receiver. I cleverly avoided this problem in my pseudo-syntax by using
the word "self", which has no special meaning in LX. This would be no problem
in bytecode, but it's tricky as a source transformation. I may have to
introduce another "impossible" keyword.
New notes, added on 2007-04-10 around 12:03 a.m.
The name is Sodium!!! That is all.
New notes, added on 2007-04-11 around 1:55 p.m.
Three things today. First, I'm strongly considering swapping the meanings of
dot and colon in method invocation. So x.m would be a blocking call and x:m
would be a nonblocking call. Originally, I chose dot for the asynchronous
operation because I wanted to encourage people to use it by making it easier
to type, compared with the colon. But after writing just a couple thousand
lines of Sodium (nee LX) code, I've observed that blocking calls are going to
be much more common than nonblocking calls no matter what. I should optimize
for the common case.
By the way, I'm also changing my terminology slightly. Until now I had been
pretty careful about referring to synchronous method invocations as
"immediate calls" and asynchronous method invocations as "eventual sends". I
think a better approach is to refer to them both as "calls" or "message
sends" (a la Smalltalk) or "invocations" or whatever, to reinforce the notion
that they are more similar than different. Then, where necessary, one can
distinguish the two by using an adjective such as "blocking" or "nonblocking"
or "synchronous" or "asynchronous" (or even "immediate" or "eventual").
Second, I'm strongly considering adding a bit more low-level syntax (i.e. not
a parse tree transformation) to let the programmer omit more parentheses. I
first mentioned this on 2006-11-19 but didn't consider it very seriously.
Here's how it works:
Currently, any combination can contain at most one "IMESS" or "SMESS" token
(that is, a symbol prefixed by a colon or dot) and that token must occur in
the second position. The occurrence of another of these tokens is an error.
Instead, I will let the occurrence of another such token signal that
everything seen so far in the combination should be treated as if it were
wrapped in parens. That means that this
(o a b c):m x y
can be shortened to
o a b c:m x y
and this
((x:foo):bar):baz a b
can be shortened to
x:foo:bar:baz a b
This feature almost always lets you omit a set of parens that starts at the
beginning of the line. The exception is
(x):m
which cannot be shortened. That's because the :m would be in the second
position and x itself (rather than the result of calling x:run with zero
args) would be sent the message.
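A Python sketch of this grouping rule, working on a combination that has
already been split into tokens. It assumes the lexer delivers each message as
its own token (so "c:m" arrives as "c", ":m"); the function names are
hypothetical.

```python
def is_message(tok):
    """True for tokens such as :m or .m."""
    return isinstance(tok, str) and len(tok) > 1 and tok[0] in ":."

def group_left(tokens):
    """Wrap everything seen so far whenever a message appears after the
    second position."""
    current = []
    for tok in tokens:
        if is_message(tok) and len(current) > 1:
            current = [current]   # treat what came before as if in parens
        current.append(tok)
    return current
```

Here group_left(["o", "a", "b", "c", ":m", "x", "y"]) returns
[["o", "a", "b", "c"], ":m", "x", "y"], and a lone ["x", ":m"] is left
unwrapped, which is why (x):m has no shorter form.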
To go along with this, I'm planning on changing the treatment of punctuation
messages. Currently, 3 + 4 is translated to 3 :+ 4 by a parse tree
transformation during eval. Instead, I'm going to move it all the way down to
the lexer. I'll have the lexer emit an IMESS token when it encounters a word
consisting of just one or two characters from the list of special punctuation.
So x-y will still be a single symbol token, but x - y will lex the same way as
x:- y.
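A small Python sketch of that lexer rule. The PUNCTUATION set is a guess,
since the notes do not list the special characters, and the sketch only
handles whitespace-separated words, so it compares "x - y" with "x :- y"
rather than splitting an attached "x:-".

```python
PUNCTUATION = set("+-*/<>=")  # hypothetical list of special punctuation

def lex(src):
    """Return (kind, text) pairs for whitespace-separated words."""
    tokens = []
    for word in src.split():
        if len(word) <= 2 and all(ch in PUNCTUATION for ch in word):
            tokens.append(("IMESS", word))       # bare punctuation word
        elif word.startswith(":") and len(word) > 1:
            tokens.append(("IMESS", word[1:]))   # explicit :message
        else:
            tokens.append(("SYMBOL", word))
    return tokens
```

With this rule, lex("x - y") and lex("x :- y") produce the same tokens, while
lex("x-y") stays a single SYMBOL.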
Third, I'm weakly considering making the equals sign more special to provide
for a readable syntax for optional params. This would involve adding one more
token type to the lexer, a "keyword" token (for lack of a better name), that
looks like a symbol suffixed with an equals sign. Then, in a list of formal
params, a keyword followed by an expr would count as one optional parameter.
This is essentially the pseudo-syntax I used in my earlier example, but made
real. This would mean that x=5 is no longer a valid variable name; you can't
use '=' in identifiers any more. Having used Scheme, I like the ability to
use nearly any character in a variable name, so I would lament the loss of
the equals sign. But, then again, the lack of '=' in identifiers doesn't seem
to bother Python users very much.
This would also provide a syntax for keyword args. Unfortunately, syntax is
not the hard part of keyword args. I'd love to do keyword args some day but
I'll need some inspiration on the mechanism.
New notes, added on 2007-04-11 around 3:11 p.m.
Well, the last piece of the puzzle in providing optional args is a way to
refer to the self object reliably, even when that object is not bound to any
name in scope. Right now, the answer is implicit self. Since an IMESS token
cannot currently occur as the first thing in a combination, I'll use that to
mean that the receiver is self. Although generally I agree that explicit is
better than implicit, I think this is better than adding a reserved word or
something. And arguably, this isn't really implicit, it's just terse. The
notation is unambiguous (unlike, say, Ruby) and it doesn't break any existing
code.
Updated notes on 2007-04-12 around 11:07 a.m.
(Originally added on 2007-03-08 around 3:50 p.m.)
Source transformations for a more convenient import statement:
New keyword
load-module x is like previous import x
import x
  => def x (load-module x)
import (x -> y)
  => def y (load-module x)
import x a b
  => def x (load-module x)
     def a (x :a)
     def b (x :b)
import x (a -> i) (b -> j)
  => def x (load-module x)
     def i (x :a)
     def j (x :b)
Source transformations for more convenient exporting of symbols:
export a b c
  => obj () ((a) a) ((b) b) ((c) c)
export a (b -> y) c
  => obj () ((a) a) ((y) b) ((c) c)
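These import and export transformations can be sketched in Python over forms
represented as nested lists. The function names and the list encoding are
hypothetical, and the case of a renamed module combined with member names
(which the notes do not show) is assumed to look members up through the new
local name.

```python
def split_rename(entry):
    """Return (source, target) for either name or (source -> target)."""
    if isinstance(entry, list):
        source, arrow, target = entry
        if arrow != "->":
            raise SyntaxError("expected (source -> target)")
        return source, target
    return entry, entry

def expand_import(form):
    """['import', module, name...] => list of def forms."""
    _, module, *names = form
    source, local = split_rename(module)
    defs = [["def", local, ["load-module", source]]]
    for entry in names:
        member, alias = split_rename(entry)
        defs.append(["def", alias, [local, ":" + member]])
    return defs

def expand_export(form):
    """['export', name...] => an obj form with one accessor per name."""
    _, *names = form
    methods = []
    for entry in names:
        local, public = split_rename(entry)
        methods.append([[public], local])
    return ["obj", []] + methods
```

For example, expand_export(["export", "a", ["b", "->", "y"], "c"]) returns
the nested-list form of obj () ((a) a) ((y) b) ((c) c).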
New notes, added on 2007-04-25 around 3:35 p.m.
Well, I've swapped the meaning of colon and dot for messages and implemented
the extra paren-omitting syntax feature. You can now put a message anywhere
in a combination and the reader will insert parens for you. There's one small
but significant change from my original description of this feature, though.
I've changed the associativity so that a message only applies to the
subexpression immediately before it rather than all subexpressions up to that
point. You could say it's "right-associative".
So my previous example
o a b c:m x y
actually means
o a b (c:m x y)
rather than
(o a b c):m x y
I made this change after looking at a few examples and noticing that it
allows you to omit far more parens than my original plan.
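A Python sketch of the revised grouping, again on a pre-tokenized
combination. It assumes a message after the second position takes the single
item before it as receiver and everything after it as arguments, which is
what the example shows. Chains of consecutive messages are not covered by the
description above and are not handled specially here. The function names are
hypothetical.

```python
def is_message(tok):
    """True for tokens such as :m or .m."""
    return isinstance(tok, str) and len(tok) > 1 and tok[0] in ":."

def group_items(tokens):
    """Return the items of a combination, wrapping each message send
    together with its receiver and all following arguments."""
    for i, tok in enumerate(tokens):
        if is_message(tok) and i >= 1:
            send = [tokens[i - 1], tok] + group_items(tokens[i + 1:])
            return tokens[:i - 1] + [send]
    return list(tokens)

def group_right(tokens):
    """Group a flat list of string tokens into nested sends."""
    items = group_items(tokens)
    # A combination that is nothing but one send needs no extra parens.
    if len(items) == 1 and isinstance(items[0], list):
        return items[0]
    return items
```

Here group_right(["o", "a", "b", "c", ":m", "x", "y"]) returns
["o", "a", "b", ["c", ":m", "x", "y"]].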
I haven't yet changed the lexer to turn punctuation into messages.
New notes, added on 2008-09-19 around 10:52 a.m.
Use the exclamation mark "!" as shorthand for logical negation as in Arc.
This means "!x" gets rewritten to "(not x)".
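A tiny Python sketch of the rewrite on forms represented as nested lists. It
assumes the shorthand only applies when "!" is followed by a letter or
underscore, so that a punctuation symbol such as "!=" is left alone; that
restriction and the function name are assumptions.

```python
def expand_bang(form):
    """Rewrite "!x" to ["not", "x"], recursing into nested lists."""
    if isinstance(form, list):
        return [expand_bang(item) for item in form]
    if (isinstance(form, str) and len(form) > 1 and form[0] == "!"
            and (form[1].isalpha() or form[1] == "_")):
        return ["not", form[1:]]
    return form
```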