Bushmills edited this page Jan 13, 2022 · 31 revisions
  • concepts:

    • forward referencing of words: either only words compiled into new definitions, or also words invoked while interpreting.
      The feature can be enabled and disabled at any time (disabled by default).
    • not exactly a purely incremental compiler
      • the semicolon triggers compilation. Until then, words are only buffered.
        This allows the compiler to post-process the word, which is useful for
        • replacing code sequences with more efficient functional equivalents.
        • delegating some compilation tasks from words to post-processing.
          For example, create ... does> defining words are split apart by the post-processor. The run-time "does>" part is stored separately, to give the define-time word's semantics applicator easier access to the run-time code. Keep in mind that yoda targets bash, which doesn't allow simply pointing an instruction pointer into the middle of a function.
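The buffering idea above can be illustrated with a minimal bash sketch. It is not yoda's actual post-processor: the function name `optimise` and the single rewrite rule (`1 +` becomes `1+`) are assumptions chosen only to show how a complete, buffered definition can be scanned for replaceable sequences before any code is generated.

```bash
#!/usr/bin/env bash
# Hypothetical peephole pass over the buffered tokens of one definition.
# The only rule here (assumed for illustration): "1 +" is replaced by "1+".
optimise() {                       # usage: optimise TOKEN...
    local out=()
    while (( $# )); do
        if [[ $1 == 1 && ${2-} == + ]]; then
            out+=("1+")            # two tokens collapse into one
            shift 2
        else
            out+=("$1")            # everything else passes through unchanged
            shift
        fi
    done
    echo "${out[*]}"
}

optimise dup 1 + swap
```

Because the whole definition is available when the semicolon is reached, a pass like this can look ahead across tokens, which a strictly incremental compiler could not do.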
  • specific words that deviate from ANS:

    • <#
      expects a single, not a double.
      yoda doesn't know doubles and has no support for them
    • / mod /mod are truncating, not floored
      I am considering changing these to produce floored results
    • query
      doesn't exist. Instead there's query$, which uses the string stack to store input.
    • evaluate
      not exposed. Instead, evaluate$ takes input from the string stack (where query$ puts it)
    • abort, ?abort
      most likely somewhat different
    • quit
      can't do a "real" warm start. It just (optionally) empties the stacks, then nests into a new instance of quit. This is not a problem as long as it stays in its infinite loop, but you may want to avoid tapping ctrl-c thousands of times.
    • words
      without user-defined vocabularies, words can't display only the words in the context vocabulary, as there is none. "Context vocabularies" do exist, but not as named vocabularies that are made context by executing them. Therefore words just displays the words in all populated vocabularies (4 exist, but not all of them are necessarily populated), each vocabulary under a header showing its purpose: stateless, interpret, compile and unresolved.
    • r> >r
      the return stack is not used to hold return addresses. yoda won't complain if the "return stack" is used in an unbalanced way.
    • parse ( c -- ) ( string: -- $1 )
      parses input up to the delimiting character with ASCII code c and places the parsed portion as the top item on the string stack
    • convert, uconvert ( n / u -- ) ( string: -- $1 )
      converts the signed or unsigned number n or u to a string, which is pushed onto the string stack
    • from / from$ include yoda source code
      While from parses a space-delimited string from input, from$ takes the file name from the string stack.
      Files without a slash in their name are searched for in the list of library dirs (a configuration item).
      from works nicely with forward-referenced items for resolving library files that aren't conditionally
      included from postlib: need word from libfile
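The difference between truncating and floored division noted in the list above only shows up when the operands have different signs. The following bash sketch contrasts the two. The function names `trunc_divmod` and `floor_divmod` are made up for this illustration and are not part of yoda; the sketch relies only on the fact that bash integer arithmetic truncates toward zero.

```bash
#!/usr/bin/env bash
# Truncating division: bash arithmetic rounds the quotient toward zero,
# so the remainder takes the sign of the dividend.
trunc_divmod() {                   # prints "quotient remainder"
    local a=$1 b=$2
    echo "$(( a / b )) $(( a % b ))"
}

# Floored division: the quotient rounds toward negative infinity,
# so the remainder takes the sign of the divisor.
floor_divmod() {                   # prints "quotient remainder"
    local a=$1 b=$2
    local q=$(( a / b )) r=$(( a % b ))
    # Adjust only when there is a remainder whose sign differs from the divisor's.
    if (( r != 0 && (r < 0) != (b < 0) )); then
        q=$(( q - 1 ))
        r=$(( r + b ))
    fi
    echo "$q $r"
}

trunc_divmod -7 2   # prints: -3 -1
floor_divmod -7 2   # prints: -4 1
```

For operands of equal sign, both variants give the same result.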
  • Implementation

    • strings. An additional string stack has been added. Entering strings doesn't go through the s" word. Instead, strings are recognised and handled by a pattern-matching mechanism, invoked on words not found in the dictionary. Literal numbers, ASCII values of single characters, and shell commands are subject to the same mechanism.
    • there is no virtual machine. Compiling a word creates a bash function containing what is, from a bash viewpoint, "native code". The yoda interpreter operates within the same shell environment as the functions its compiler generates; yoda doesn't shell out (unless bash does so to invoke an external command).
    • Due to its forward referencing capability, yoda does not need all of the functionality required for compiling a program to be available before compilation begins. Instead, it can resolve missing words after compilation on a need-to-include basis. This allows for more automatic and comfortable library inclusion management. The library "postlib", referenced during resolve passes, exists for this purpose: compile nothing from it unless needed. This mechanism is similar to Tom Almy's cforth.
    • yoda has now eliminated immediate words. While the word "immediate" still exists and is used, its semantics are different: it moves the header of the last word into the compiler context vocabulary, rather than marking the word as state smart. Such a word doesn't have interpret-time semantics (and can't be found when not compiling). To give such a word interpret-time semantics, define it again under the identical name and specify only those. This mechanism is similar to what's used in Chuck Moore's cmforth.
    • "execution tokens" are extracted from the function name associated with a word. Conversely, executing the code associated with an execution token modifies the token to yield a function name. Function names have numerical components for this purpose, so execution tokens are still represented by integers.
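To make the relationship between compiled words, bash functions and integer execution tokens more concrete, here is a minimal sketch. It is not yoda's actual code: the `word_` function-name prefix, the helpers `push`, `pop`, `compile` and `execute`, and the global variables are all assumptions made for illustration. It only shows that a word can become an ordinary bash function whose name carries a number, and that the number alone is enough to find and run it again.

```bash
#!/usr/bin/env bash
# Hypothetical data stack: array plus stack pointer.
s=()
sp=0
push() { s[sp]=$1; sp=$(( sp + 1 )); }
pop()  { sp=$(( sp - 1 )); top=${s[sp]}; }   # result is left in global "top"

next_xt=0

# "Compile" a word: generate a bash function whose name ends in a number.
# That number is handed back in the global "xt" as the execution token.
compile() {                        # usage: compile BODY
    xt=$next_xt
    next_xt=$(( next_xt + 1 ))
    eval "word_${xt}() { $1; }"
}

# Execute by token: rebuild the function name from the integer and call it.
execute() { "word_$1"; }

compile 'pop; push "$top"; push "$top"';          dup_xt=$xt
compile 'pop; b=$top; pop; push $(( top * b ))';  mul_xt=$xt   # "b" is a global scratch variable
compile "word_${dup_xt}; word_${mul_xt}";         square_xt=$xt

push 7
execute "$square_xt"
pop
echo "$top"                        # prints: 49
```

The generated functions call each other directly in the same shell, so no interpreter loop or subshell sits between a word and the words it uses.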