Skip to content

Commit d89a267

Browse files
committed
Implement DOES>
1 parent 5562831 commit d89a267

File tree

7 files changed

+1005
-818
lines changed

7 files changed

+1005
-818
lines changed

Diff for: DefGen/src/defgen/core.clj

+2-2
Original file line numberDiff line numberDiff line change
@@ -141,7 +141,7 @@
141141
(let [name-size (if hidden? 0 (count name))
142142
header-size (+ 1 ;; PSF+5bit name length
143143
2 ;; LFA
144-
1)] ;; CFA
144+
2)] ;; CFA
145145
(assoc rom-def ::name-size name-size ::header-size header-size)))
146146

147147
(defn token-byte-size
@@ -195,7 +195,7 @@
195195
[(str (when immediate? "0x80|")
196196
(if hidden? 0 (count name)))]
197197
[(xt-to-bytes (rom-def-prev-xts id))]
198-
[cfa])))
198+
[0 cfa])))
199199
rom-defs)))
200200

201201
(defn extract-branch-span

Diff for: TODO.md

+3
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,8 @@
11
# Before Release
22

3+
* Change XTs to point at the CFA (which is what a DOES>-inserted XT already targets...) instead of the NFA. This will make all XTs immediately executable.
4+
* This may make it possible to put real XTs on the return stack, which would avoid that bogus $8xxx thing that only happens to work because Arduino's have less than 32KB of RAM.
5+
* Change ROM Definitions to be $4xxx instead of $Cxxx; the later could show up if we have >16KB dictionary (since User Defintions are $8xxx and $Cxxx just means that the second-highest bit was set).
36
* Implement the remaining `CORE` words (the ones whose tests have been commented out in `test_core.cpp`).
47
* Do we need `I*` tokens or can normal tokens just use inProgramSpace? (and if so, is that actually smaller from a compile perspective?)
58
* Looks like we can make this change and that it will be (slightly) smaller, if not (slightly) slower. We shouldn't make the change until we can run the tests on the Arduino though (just in case we break something).

Diff for: definitions/core.edn

+85-12
Original file line numberDiff line numberDiff line change
@@ -154,9 +154,9 @@
154154
:name ":"
155155
:args [[] []]
156156
:source ": : ( \"<spaces>name\" -- )
157-
CREATE HIDE -1 ALLOT ['] DOCOLON C, ]"
157+
CREATE HIDE -2 ALLOT ['] DOCOLON XT, ]"
158158
:pfa [:create :hide
159-
:zero :oneminus :allot :icharlit "DOCOLON" :ccomma
159+
:zero :oneminus :oneminus :allot :icharlit "DOCOLON" :xtcomma
160160
:rtbracket
161161
:exit]}
162162

@@ -170,14 +170,18 @@
170170
;; around and basically do a similar thing to decide if this is not a
171171
;; DO* word. We need some way to differentiate between these things: a
172172
;; token and a definition.
173+
;; FIXME XT@XT should really be CFA@ (similar to LFA@) and then do the
174+
;; offset on its own (just like LFA@). This will make it easier to
175+
;; eventually change target of an XT (to the CFA instead of the LFA, for
176+
;; example).
173177
{:token :compilecomma
174178
:name "COMPILE,"
175179
:args [[:xt] []]
176180
:source ": COMPILE, ( xt --)
177181
DUP TOKEN? IF C, EXIT THEN
178-
DUP 3 + ( >CFA) C@XT DUP $70 < IF C, DROP ELSE DROP XT, THEN ;"
182+
DUP 3 + ( >CFA) XT@XT DUP $70 < IF C, DROP ELSE DROP XT, THEN ;"
179183
:pfa [:dup :tokenq :izbranch 3 :ccomma :exit
180-
:dup :oneplus :oneplus :oneplus :cfetchxt
184+
:dup :oneplus :oneplus :oneplus :xtfetchxt
181185
:dup :icharlit 0x70 :lessthan :izbranch 5
182186
:ccomma :drop :ibranch 3
183187
:drop :xtcomma
@@ -186,8 +190,8 @@
186190
{:token :constant
187191
:args [[:x] []]
188192
:source ": CONSTANT ( x \"<spaces>name\" -- )
189-
CREATE -1 ALLOT ['] DOCONSTANT C, , ;"
190-
:pfa [:create :zero :oneminus :allot :icharlit "DOCONSTANT" :ccomma
193+
CREATE -2 ALLOT ['] DOCONSTANT XT, , ;"
194+
:pfa [:create :zero :oneminus :oneminus :allot :icharlit "DOCONSTANT" :xtcomma
191195
:comma
192196
:exit]}
193197

@@ -215,7 +219,7 @@
215219
BL PARSE-WORD DUP 0= IF ABORT THEN ( ca u)
216220
TUCK ( u ca u) NAME, ( u)
217221
HERE >XT ( u this-xt) SWAP C, LATEST @ XT, ( this-xt) LATEST !
218-
['] DOCREATE C, ALIGN ;"
222+
['] DOCREATE XT, ALIGN ;"
219223
:pfa [:bl :parseword :dup :zeroequals :izbranch 2 :abort
220224
:tuck
221225
; NAME,
@@ -225,7 +229,10 @@
225229
:oneminus :dup :cfetch :ccomma :ibranch -9
226230
:twodrop
227231
:here :toxt :swap :ccomma :latest :fetch :xtcomma :latest :store
228-
:icharlit "DOCREATE" :ccomma
232+
:icharlit "DOCREATE" :xtcomma
233+
;; FIXME This ALIGN seems wrong; the VM will just point W right
234+
;; after the CFA, for example, and not take ALIGN into account.
235+
;; We should probably no-op ALIGN and then stop calling it.
229236
:align
230237
:exit]}
231238

@@ -244,6 +251,72 @@
244251
0 'PREVLEAVE ! ['] (DO) COMPILE, HERE ; IMMEDIATE"
245252
:pfa [:zero :tickprevleave :store :icharlit :pdo :compilecomma :here :exit]}
246253

254+
;; CREATE puts a two-byte XT containing the DOCREATE token into the Code
255+
;; Field. Various other defining words (CONSTANT, VARIABLE, etc.)
256+
;; replace that token with a different token. DOES> also needs to
257+
;; replace the Code Field, but needs to replace it with an XT and not a
258+
;; token. That reason alone is why the Code Field is two bytes instead
259+
;; of one byte.
260+
;;
261+
;; What goes in this DOES>-modified Code Field? The XT of the runtime
262+
;; behavior of the defining word -- the code after DOES> in other words.
263+
;;
264+
;; Here is how CREATE..DOES>.. works:
265+
;;
266+
;; Compile Time behavior:
267+
;; 1. Defining word is created that looks like: ": NAMEDZERO: CREATE 0 , DOES> @ ;".
268+
;; 2. DOES> is an immediate word that compiles (does>) into the defining
269+
;; word's thread.
270+
;; 3. User calls this word to create a new word: "NAMEDZERO: nothing".
271+
;; 4. Defining word runs CREATE, which creates a new definition. In
272+
;; this case, the definition compiles a zero into the dictionary.
273+
;; 5. Then (does>) executes (remember, we are still in the NAMEDZERO:
274+
;; thread at this point). (does>) exits the current definition (which
275+
;; is "NAMEDZERO:" in this example) by popping the return stack
276+
;; pointer. That pointer contains the address of the word after
277+
;; (does>) from the NAMEDZERO: definition itself. That pointer is
278+
;; the start of the runtime behavior of the defining word.
279+
;; 6. (does>) then gets the address of the LATEST definition (which is
280+
;; the one that we CREATEd in step 4) and advances to the CFA (which
281+
;; so far is DOCREATE).
282+
;; 7. (does>) rewrites that CFA to point to the address after (does>),
283+
;; which was popped in step 5.
284+
;;
285+
;; Runtime behavior:
286+
;; 1. The defined word ("nothing" in this example) runs. Its CFA points
287+
;; to the thread after (does>). The Enforth inner interpreter has
288+
;; special code that detects this situation -- a Code Field that
289+
;; contains an XT and not a token -- and knows that the Code Field
290+
;; was modified by (does>). The inner interpreter jumps to DODOES.
291+
;; 2. DODOES pushes the IP to the return stack -- we're calling another
292+
;; word here, just like in DOCOLON. Then DODOES pushes W (the PFA of
293+
;; the *defined* word, aka "nothing" in this example) to the data
294+
;; stack. Finally, DODOES sets IP to the runtime thread of the
295+
;; defining word (from "NAMEDZERO:").
296+
;; 3. Execution continues in the body of the defining word.
297+
{:token :pdoes
298+
:name "(does>)"
299+
:flags #{:headerless}
300+
:source ": (does>)
301+
R> >XT ( XT of runtime behavior for the defining word)
302+
LATEST @ >NFA kNFAtoCFA + ( addr of original CFA)
303+
XT! ( rewrite CFA to point to runtime part of defining word) ;"
304+
:pfa [:rfrom :toxt
305+
:latest :fetch :tonfa :icharlit "kNFAtoCFA" :plus
306+
:xtstore
307+
:exit]}
308+
309+
{:token :does
310+
:name "DOES>"
311+
:flags #{:immediate}
312+
:source ": DOES> ( -- ) ['] (does>) XT, ; IMMEDIATE"
313+
:pfa [:icharlit "((uint8_t)((ROMDEF_PDOES >> 8) & 0xff))"
314+
:icharlit 8 :lshift
315+
:icharlit "((uint8_t)((ROMDEF_PDOES ) & 0xff))"
316+
:or
317+
:xtcomma
318+
:exit]}
319+
247320
{:token :dot
248321
:name "."
249322
:args [[:n] []]
@@ -703,8 +776,8 @@
703776
{:token :tobody
704777
:name ">BODY"
705778
:args [[:xt] [:a-addr]]
706-
:source ": >BODY ( xt -- a-addr) XTMASK AND 'DICT + 4 + ( NFA + LFA + CFA) ;"
707-
:pfa [:xtmask :and :tickdict :plus :icharlit 4 :plus
779+
:source ": >BODY ( xt -- a-addr) XTMASK AND 'DICT + kNFAtoPFA + ;"
780+
:pfa [:xtmask :and :tickdict :plus :icharlit "kNFAtoPFA" :plus
708781
;; TODO Add :align if we start aligning the PFA.
709782
:exit]}
710783

@@ -774,8 +847,8 @@
774847
{:token :variable
775848
:args [[] []]
776849
:source ": VARIABLE ( \"<spaces>name\" -- )
777-
CREATE -1 ALLOT ['] DOVARIABLE C, 0 , ;"
778-
:pfa [:create :zero :oneminus :allot :icharlit "DOVARIABLE" :ccomma
850+
CREATE -2 ALLOT ['] DOVARIABLE XT, 0 , ;"
851+
:pfa [:create :zero :oneminus :oneminus :allot :icharlit "DOVARIABLE" :xtcomma
779852
:zero :comma
780853
:exit]}
781854

Diff for: definitions/enforth.edn

+19-7
Original file line numberDiff line numberDiff line change
@@ -432,11 +432,8 @@
432432
:name "LFA@"
433433
:args [[:xt] [:xt]]
434434
:flags #{:headerless}
435-
:source ": LFA@ ( xt -- xt')
436-
1+ ( NFA>LFA) DUP C@XT 8 LSHIFT SWAP 1+ C@XT OR ;"
437-
:pfa [:oneplus
438-
:dup :cfetchxt :icharlit 8 :lshift
439-
:swap :oneplus :cfetchxt :or :exit]}
435+
:source ": LFA@ ( xt -- xt') 1+ ( NFA>LFA) XT@XT ;"
436+
:pfa [:oneplus :xtfetchxt :exit]}
440437

441438
;; Length of the definition name, not necessarily the NFA field (because
442439
;; FFI trampolines have a zero-length NFA since the name is stored in
@@ -704,14 +701,14 @@
704701
0= IF TYPE SPACE [CHAR] ? EMIT CR ABORT THEN ( ca u addr)
705702
HERE >XT ( ca u addr this-xt) ROT %00100000 OR C, ( ca a xt)
706703
LATEST @ XT, ( ca addr this-xt) LATEST !
707-
['] DOFFI0 ( ca addr ffitokenbase) OVER FFIDEF-ARITY + C,
704+
['] DOFFI0 ( ca addr ffitokenbase) OVER FFIDEF-ARITY + XT,
708705
( ca addr) ALIGN , DROP ;"
709706
:pfa [:bl :parseword :dup :zeroequals :izbranch 2 :abort
710707
:twodup :findffidef :zeroequals :izbranch 8
711708
:type :space :icharlit "'?'" :emit :cr :abort
712709
:here :toxt :rot :icharlit 0x20 :or :ccomma
713710
:latest :fetch :xtcomma :latest :store
714-
:icharlit "DOFFI0" :over :ffidefarity :plus :ccomma
711+
:icharlit "DOFFI0" :over :ffidefarity :plus :xtcomma
715712
:align :comma :drop
716713
:exit]}
717714

@@ -722,6 +719,14 @@
722719
:source ": XT, ( xt -- ) DUP 8 RSHIFT C, C, ;"
723720
:pfa [:dup :icharlit 8 :rshift :ccomma :ccomma :exit]}
724721

722+
{:token :xtfetchxt
723+
:name "XT@XT"
724+
:args [[:xt] [:xt]]
725+
:flags #{:headerless}
726+
:source ": XT@XT ( xt -- xt') DUP C@XT 8 LSHIFT SWAP 1+ C@XT OR ;"
727+
:pfa [:dup :cfetchxt :icharlit 8 :lshift
728+
:swap :oneplus :cfetchxt :or :exit]}
729+
725730
{:token :xtflag
726731
:args [[] [:u]]
727732
:flags #{:headerless}
@@ -731,3 +736,10 @@
731736
:args [[] [:u]]
732737
:flags #{:headerless}
733738
:pfa [:icharlit 0xff :icharlit 0x3f :icharlit 8 :lshift :or :exit]}
739+
740+
{:token :xtstore
741+
:name "XT!"
742+
:args [[:xt :addr] []]
743+
:flags #{:headerless}
744+
:source ": XT! ( xt addr -- ) OVER 8 RSHIFT OVER C! 1+ C! ;"
745+
:pfa [:over :icharlit 8 :rshift :over :cstore :oneplus :cstore :exit]}

Diff for: enforth.c

+79-8
Original file line numberDiff line numberDiff line change
@@ -75,10 +75,10 @@ typedef enum EnforthToken
7575
DOCOLONROM,
7676
DOCONSTANT,
7777
DOCREATE,
78-
DODOES,
7978
DOVARIABLE,
8079
/* Unused */
8180
/* Unused */
81+
/* Unused */
8282

8383
DOFFI0 = 0x78,
8484
DOFFI1,
@@ -97,7 +97,7 @@ typedef enum EnforthToken
9797
*/
9898

9999
static const int kNFAtoCFA = 1 /* PSF+namelen */ + 2 /* LFA */;
100-
static const int kNFAtoPFA = 1 /* PSF+namelen */ + 2 /* LFA */ + 1; /* CFA */
100+
static const int kNFAtoPFA = 1 /* PSF+namelen */ + 2 /* LFA */ + 2; /* CFA */
101101

102102
static const int kTaskUserVariableSize = 8;
103103
static const int kTaskReturnStackSize = 32;
@@ -311,10 +311,10 @@ void enforth_resume(EnforthVM * const vm)
311311
&&DOCONSTANT,
312312
&&DOCREATE,
313313

314-
0, /* &&DODOES, */
315314
&&DOVARIABLE,
316315
0, /* Unused */
317316
0, /* Unused */
317+
0, /* Unused */
318318

319319
/* $78 - $7F */
320320
&&DOFFI0,
@@ -399,12 +399,26 @@ void enforth_resume(EnforthVM * const vm)
399399
}
400400

401401
DISPATCH_XT:
402-
/* Convert the XT into a Word Pointer depending on the type
403-
* of XT (ROM definition or user definition) and then read
404-
* the CFA. */
402+
/* This is a two-byte XT that points at another word. We
403+
* need to find the word that is being targeted by this XT
404+
* and set W to that word's PFA. We then need to dispatch
405+
* to the word's Code Field. The Code Field will almost
406+
* always contain a DO* token, *unless the word was modified
407+
* by a defining word.* In that case, the Code Field will
408+
* contain a full, two-byte XT and that XT will point to the
409+
* runtime behavior of the defining word (the code after
410+
* DOES> in the defining word).
411+
*
412+
* Note that ROM Definitions never use DOES> and so we can
413+
* assume that the Code Field in a ROM Definition will
414+
* always contain a token. */
405415
if ((xt & 0xC000) == 0xC000) /* ROM Definition: 0xCxxx */
406416
{
407-
w = (uint8_t*)((uint8_t*)definitions + (xt & 0x3FFF) + kNFAtoCFA);
417+
/* Advance past the empty first byte in the Code Field,
418+
* then read the token in the second byte. */
419+
/* TODO We could probably eliminate the first byte in
420+
* ROM Definitions. */
421+
w = (uint8_t*)((uint8_t*)definitions + (xt & 0x3FFF) + kNFAtoCFA + 1);
408422
#ifdef __AVR__
409423
token = pgm_read_byte(w++);
410424
#else
@@ -413,8 +427,24 @@ void enforth_resume(EnforthVM * const vm)
413427
}
414428
else /* User Definition: 0x8xxx */
415429
{
430+
/* Load the Code Field. */
416431
w = (uint8_t*)(vm->dictionary.ram + (xt & 0x3FFF) + kNFAtoCFA);
417-
token = *w++;
432+
xt = *w++ << 8;
433+
xt |= *w++;
434+
435+
/* We're done if this is a token. */
436+
if (xt < 0x80)
437+
{
438+
token = xt;
439+
goto DISPATCH_TOKEN;
440+
}
441+
else
442+
{
443+
/* Not a token, which means that this word was
444+
* defined by DOES>; jump to DODOES to perform the
445+
* runtime behavior of the defining word. */
446+
goto DODOES;
447+
}
418448
}
419449
}
420450

@@ -510,6 +540,8 @@ void enforth_resume(EnforthVM * const vm)
510540
#ifdef __AVR__
511541
if (inProgramSpace)
512542
{
543+
/* FIXME Can this ever happen? How would a ROM word
544+
* know about a RAM word? */
513545
/* Set the high bit on the return address so that we
514546
* know that this address is in ROM. */
515547
ip = (uint8_t*)((unsigned int)ip | 0x8000);
@@ -608,6 +640,45 @@ void enforth_resume(EnforthVM * const vm)
608640
}
609641
continue;
610642

643+
DODOES:
644+
{
645+
/* We're currently sitting in a word that is calling a word
646+
* defined with DOES>. IP has been advanced beyond that
647+
* call and is still in the calling word. XT contains the
648+
* Code Field of the defined word (and thus points to the
649+
* DOES> part of the defining word's thread). We need to
650+
* push IP to the stack, push the PFA of the defined word to
651+
* the stack, and then continue execution at XT (the
652+
* defining word's runtime behavior). */
653+
#ifdef __AVR__
654+
if (inProgramSpace)
655+
{
656+
/* FIXME Can this ever happen? How would a ROM word
657+
* know about a RAM word? */
658+
/* Set the high bit on the return address so that we
659+
* know that this address is in ROM. */
660+
ip = (uint8_t*)((unsigned int)ip | 0x8000);
661+
662+
/* We are no longer in program space since, by design,
663+
* DODOES is only ever used for user-defined words in
664+
* RAM. */
665+
inProgramSpace = 0;
666+
}
667+
#endif
668+
(--returnTop)->ram = ip;
669+
670+
/* W points at the PFA of the defined word; push that to the
671+
* stack per the runtime behavior of DOES>. */
672+
*--restDataStack = tos;
673+
tos.ram = w;
674+
675+
/* Point IP at the DOES> portion of the defining word (which
676+
* is the target of the defined word's Code Field and is
677+
* thus in XT). */
678+
ip = (uint8_t*)(vm->dictionary.ram + (xt & 0x3FFF));
679+
}
680+
continue;
681+
611682
DOCREATE:
612683
DOVARIABLE:
613684
{

0 commit comments

Comments
 (0)