STM8 eForth Programming

Thomas edited this page Oct 18, 2018 · 38 revisions

Practical STM8 eForth Programming

Interactive programming is an asset for developing and tuning cyber-physical systems. The idea isn't new: MCS BASIC-52 was quite common in the 1980s. Recent examples are eLua or MicroPython.

STM8 eForth, like one of the earliest embedded interactive systems, is very compact: on an STM8S003F3P6, a cheap µC with 8K Flash and 1K RAM, it consumes less than half the memory resources while serving both as programming environment and as a library for application code. It even provides operating system features like a shell, scripting, multitasking, interrupt handlers, and character I/O redirection!

Forth is a very simple but highly extensible programming language. A program can be as simple as this:

: hello ." Hello World!" ;

Forth is a language with self-reference: the Forth console (the interpreter) is written in Forth, and the compiler extends the interpreter by placing new words into a dictionary.

This page provides an STM8 eForth walk-through. There is also an example code page and the W1209 data logging thermostat project, a showcase application that implements thermostat and data logging features on the W1209 low-cost thermostat hardware.

Many skilled programmers have worked with Forth for decades, and it has a rich tradition. While there are many useful resources on-line, the following free books are recommended:

  • Starting Forth, which provides a friendly introduction to the programming language (most of the examples work with STM8 eForth), and
  • Thinking Forth by Leo Brodie as an introduction to the Forth programming method.

Using the Forth Console

TG9541/STM8EF board support packages focus on ease of use but there is always room for improvement. You're invited to open a support ticket if you face a problem!

Connecting the Console

For flashing the Forth binary to a board, please refer to Getting Started in the breakout-board section. For communication between the board and your PC, you'll find information in STM8 eForth Programming Tools.

The following items are recommended:

  • a supported board (or a bare µC on a breakout board)
  • a programming adapter for programming a binary, e.g. ST-LINK V2 compatible
  • a "TTL" serial interface adapter (or a PC with a serial interface, and a level shifter)
  • a terminal program (e4thcom works best, but a plain terminal program like Picocom or Hyperterm will also work)

You can run first experiments with the help of the STM8 simulator uCsim instead of using hardware (it's used in the Continuous Integration chain).

Interacting with the µC

STM8 eForth uses a terminal emulation program and a standard serial interface for accessing the console (or telnet in the simulation use case).

The console can be used for interpreting and compiling code interactively. If you're familiar with an HP RPN calculator you already know how to write arithmetic expressions:

The input 12 + 7 = * 5 = on a conventional pocket calculator, or return (12 + 7) * 5; in C, is equivalent to 12 7 + 5 * in Forth. Postfix notation doesn't need brackets.
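As a small illustration of the postfix expression above, the phrase can be typed at the console step by step. The trailing . is added here only to print the result, and the stack comments are annotations rather than console output:

```forth
12 7 +    \ stack: 19
5 *       \ stack: 95
.         \ prints 95 and leaves the stack empty
```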

Forth is a stack based language: most of the data-flow between subroutines ("words") won't require variables or registers. This has the advantage that words can be chained to form a phrase, and phrases can be compiled to form a new word.

Interactive code execution in Forth follows a simple pattern: put data on the stack, write the name of one or more Forth words (e.g. + and . for printing), and press enter.

10 -2 * .<enter> -20 ok

After typing 10 -2 * ., pressing <enter> starts the evaluation. 10 and -2 are pushed to the stack, * pops (consumes) both numbers, multiplies them, and pushes the result. . pops the result and prints it. The stack is now in the same state as before this operation. The interpreter prints ok, and waits for new input. This simple console user interface is known as a "REPL" (Read Evaluate Print Loop).

A simple Forth like STM8EF is typeless, and all information on the stack is represented as 16bit numbers (i.e. two 16bit numbers for 32bit data).

On the lowest level of Forth there are words for manipulating the data stack, e.g. DUP (duplicate the top element), SWAP (swap the top two elements), DROP (remove the top element), ROT (rotate the top 3 elements).

The word .S is a debugging word: it prints (.) the contents of the stack ("S") without changing them:

6 7 -1 2 ok
* DUP . -2 ok
.S
6 7 -2 <sp  ok
SWAP
.S
6 -2 7 <sp  ok
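The remaining stack words mentioned above can be explored in the same way. The following is a minimal sketch; OVER is not in the list above but is used later on this page, and the comments show the expected stack contents rather than literal console output:

```forth
1 2 3     \ stack: 1 2 3
ROT       \ stack: 2 3 1   (third item moves to the top)
DROP      \ stack: 2 3
OVER      \ stack: 2 3 2   (copy of the second item on top)
.S        \ prints the three items without consuming them
```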

On a higher level there are words that manipulate data, text, or code. Please refer to Starting Forth for further reading.

A Forth interpreter cycle takes a number (0 .. n) of items from the stack, and pushes (0 .. m) items back.

When there is an error (e.g. a word is unknown), the interpreter resets the stack. It will also tell you when there is a stack "underflow" after execution, which means that more items were consumed than what was initially on the stack.

Note that running DROP DROP DROP 0 0 0 on an empty stack will cause an underflow, but this won't trigger an error. For performance reasons, the words of a simple Forth like STM8 eForth don't test the stack balance during evaluation (but Forth can be extended to do that).

Defining Words

The Forth console can be used for defining new words. Usually, a Forth programmer breaks the problem down into easily testable words (units of code), and then defines new words that contain "phrases" of already defined words.

This enables a bottom-up approach to programming: through program-test iterations, and refactoring, a "domain specific language" emerges (note that for most conventional programming languages building domain specific languages is an advanced feature). Please refer to Thinking Forth for an in-depth discussion of the method.

Defining new words is very simple: write : followed by the name of the new word, a sequence of already defined words, and ; at the end:

: by-2   ( w -- w )  \ multiply w by -2
  -2 * ; ok
10 by-2 . -20 ok

The sequence :, identifier, code, and ; is called a "colon definition". The word : brings Forth into compilation state.

In a simple Forth, there is next to no syntax checking (in Forth syntax is a convention). There are very few requirements for the choice of the word identifier: it must consist of one or more printable characters (including numbers, punctuation marks, etc). Forth words can be redefined but definitions compiled before the redefinition won't change (they will continue to use the old word).
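The effect of a redefinition on already compiled words can be tried out with a short sketch like the following (greet and twice are hypothetical names chosen for illustration):

```forth
: greet  ." hi " ;
: twice  greet greet ;    \ compiled against the first greet
: greet  ." hello " ;     \ redefinition; the console may report reDef
twice                     \ prints: hi hi  (still the old greet)
greet                     \ prints: hello
```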

There are words that can only be used during compilation (e.g. ; which terminates a colon definition). Other words will be executed immediately in compilation mode (by using the IMMEDIATE modifier). It's also possible to write words that do one thing in interpreter mode, and another during compilation (execution semantics sensitive words, e.g. CONSTANT).

Of course there are structure words in Forth, e.g. IF and THEN. As Forth is a stack oriented language it does things a bit differently:

: test  ( n -- )   \ demonstrate if else then
  IF
    ."  true"
  ELSE
    ."  false"
  THEN ;

0 test false ok
5 test true ok

In Forth, code is data, too. The compiler simply transforms one stream of data (text) into another stream of data (code). Structure words, like IF, ELSE, THEN, rely on storing addresses for unresolved branch targets on the data stack, much like the data in-between the arithmetic words in 2 3 4 + *.
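Because IF consumes a flag from the stack, conditionals can be nested as long as each branch leaves the stack balanced. A minimal sketch, with the hypothetical word sign:

```forth
: sign  ( n -- )   \ print the sign of n
  DUP 0< IF
    DROP ."  negative"
  ELSE
    0= IF ."  zero" ELSE ."  positive" THEN
  THEN ;

-5 sign   \ prints: negative
 0 sign   \ prints: zero
 7 sign   \ prints: positive
```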

In Forth, there is no fundamental difference between compiler and interpreter. In fact, the role of the interpreter is similar to a C preprocessor:

: param ( -- n )   \ use the interpreter for calculating a parameter
  [ 622 37 100 */ ] LITERAL ; \ only store the result of the calculation
param . 230 ok

[ switches from the compilation to the interpreter state. The code until ] is executed, and the result is then pushed to the data stack. The immediate word LITERAL compiles a stack value as a constant into the word param. Note that immediate words like [ and LITERAL can be defined by labeling a new word as IMMEDIATE after compilation (likewise a word can be labeled as "compile only", which means that it won't be executed in interpreter state, e.g. IF or THEN).
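The same pattern works for any value that can be computed once at compile time. The following sketch assumes a 5 msec background tick (the figure used further down this page) and uses the hypothetical name ticks/s:

```forth
: ticks/s  ( -- n )   \ background ticks per second, computed while compiling
  [ 1000 5 / ] LITERAL ;
ticks/s .   \ prints 200
```

Only the constant 200 ends up in the compiled word; the division is not repeated at run time.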

STM8 eForth can compile code to RAM or to NVM (non volatile memory). When the system is in mode RAM, new words get compiled into volatile memory. After switching to NVM mode, the compile target is Flash memory. Note that NVM mode must be terminated by typing RAM if newly defined words shall be retained. Please refer to STM8 eForth Compile to Flash for more information.
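A typical session that compiles a word to Flash might look like the following sketch, which reuses the word by-2 from above:

```forth
NVM                 \ compile the following definitions to Flash
: by-2  ( w -- w )  -2 * ;
RAM                 \ back to RAM mode, by-2 is retained
10 by-2 .           \ prints -20
```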

STM8 eForth Properties and Core Words

STM8 eForth is a 16bit STC Forth, which means that data stack, return stack, a memory cell, and addresses are all 16 bit wide (some words use 8 or 32 bit data). In STC, compiled code is executable machine code.

Code units in Forth are called words. To list all Forth words available in a session (core, and user-defined) type WORDS on the Forth console. Also note that in the globalconf.inc of most boards the option CASEINSENSITIVE is set (i.e. using words or WORDS does the same).

The STM8EF glossary docs/words.md (a list of Forth words in forth.asm) describes what words do using an extended data-flow notation:

;       @       ( a -- n )      ( TOS STM8: -- Y,Z,N )
;       Push memory location to stack

@ (read) gets address a from the stack, reads the 16bit cell at the 16bit address a and puts the value n on the stack (the Forth stack comment conventions are described here). The second part can be safely ignored (for core programming in assembly this can be read as: "After execution of @ the register Y contains the TOS (top of stack) value, and the Z (zero), and N (negative) flags correspond to n").
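To see @ together with its counterpart ! (store) in action, a variable can be used. This is a minimal sketch with the hypothetical variable counter, assuming the system is in RAM mode:

```forth
VARIABLE counter        \ counter pushes the address of a 16bit cell
5 counter !             \ ( n a -- ) store 5 at that address
counter @ .             \ ( a -- n ) read it back; prints 5
counter @ 1+ counter !  \ read, increment, write back
counter @ .             \ prints 6
```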

Depending on configuration in a board's globalconf.inc only a subset of the words listed in words.md is linked (visible). However, the alias feature can be used to access words that are part of the binary, but not linked.

Example STM8 Forth Code

The following section contains some simple idioms, patterns, and example programs. Board-W1209 contains some more examples for startup code with an interactive background task that uses W1209 I/O.

Defining Start-up Code

The following example defines a simple greeting word. It's also possible to initialize background tasks, or to run complex embedded control applications.

NVM
: mystart CR 3 FOR I . NEXT CR ." Hi!" CR ;
' mystart 'BOOT !
RAM

NVM switches to Flash mode. mystart is the word that's to be run as start-up code. ' (tick) retrieves the address of mystart, 'BOOT retrieves the address of the startup word pointer, and ! stores the address of our word to it. RAM changes to RAM mode and stores pointers permanently.

On reset or through cold start STM8EF now shows the following behavior:

COLD
 3 2 1 0
Hi!

The original start-up behavior can be restored by running ' HI 'BOOT !, or using RESET, which not only makes STM8EF forget any vocabulary in Flash, but also resets the start-up code to HI.

Reading an Analog Port

The STM8S003F3 and STM8S103F3 both have 5 usable multiplexed ADC channels (AIN2 to AIN6). The words ADC! and ADC@ provide access to the STM8 ADC.

The following example shows how to read AIN3, which is an alternative function of PD2:

: conv ADC! ADC@ ;
3 conv . 771 ok

ADC! selects a channel for conversion, ADC@ starts the conversion and gets the result. The example declares the word conv to combine both actions. Please note that the conversion time of ADC@ is longer after selecting a different channel with ADC!.
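The raw 10bit reading can be scaled with */, which keeps a wider intermediate product. The following sketch builds on conv from above; the word mV and the 3.3 V reference voltage are assumptions for illustration and depend on the board's supply:

```forth
: mV  ( ch -- mV )   \ assumes a 3.3 V ADC reference
  conv 3300 1023 */ ;
3 mV .   \ prints the AIN3 reading in millivolts
```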

Please note that in STM8Sx003F3P6 chips, AIN5 and AIN6 are an alternative function of the ports PD5 and PD6. These GPIO pins are also used for RS232 TxD and RxD. The phrase 6 ADC! switches PD6 to analog mode (AIN6) while detaching the UART (RxD). The eForth system will appear to be hanging (the phrase 6 ADC! ADC@ 0 ADC! . will show a 10bit analog read-out of the RxD level).

Setting Board Outputs

The STM8EF board support provides the word OUT! for setting the binary state of up to 16 relays, LEDs or other digital output devices specific to that board (new board support packages should define the mapping in board.fs or in boardcore.inc).

The following example blinks Relay 1 of a C0135 STM8S103 Relay Control Board, the relay of a W1209 thermostat or the status LED of a STM8S103F3P6 Breakout Board with a frequency of about 0.78 Hz ( 1/(2 * 128 * 5 msec), since bit 7 of the 5 msec ticker TIM toggles every 128 ticks ):

: task TIM 128 AND IF 1 ELSE 0 THEN OUT! ;
' task BG !

You can set individual port bits with the B! command BitState Address Bit# B!.

    1 $5011 0 B!   \  PD0 is output in PD_DDR
    0 $500F 0 B!   \ led ON (low) in PD_ODR
    1 $500F 0 B!   \ led off

Redefining OUT! to use PD0 ...

    : OUT! 1 $5011 0 B! 0= NEGATE $500F 0 B! ; \ redefine OUT!

You can find all the register addresses in the specific MCU's .efr file.

7S-LED Display Character Output

Data output to 7S-LED displays is supported by vectored I/O. In background tasks the EMIT vector points to E7S by default, and using it is simple.

The LED display is organized in right-aligned digit groups (e.g. boards W1219 2x 3-digits, or W1401 3x 2-digits) that each work similar to the display of a pocket calculator:

  • CR moves the cursor to the first (leftmost, upper) 3-digit group without blanking it
  • SPACE after another printable character moves the cursor to the next group, and blanks it
  • . is rendered as DP without shifting characters to the left
  • , is rendered as a blank

Emitting more than the characters of a group won't spill into the next group (23..,5 is rendered as 3. 5).

Please refer to STM8 eForth Board Character IO for a detailed discussion, and example code.

The following code displays different data (all scaled to a range of 0..99) on the 3x2 digit 7S-LED groups of the board W1401:

: timer TIM 655 / ;
: ain 5 ADC! ADC@ 100 1023 */ ;
: show timer . ain . BKEY . CR ;
' show bg !

The word show displays the values scaled to 0..99 from the BG timer, the sensor analog input, and the board key bitmap BKEY followed by a CR (new line). When the word show runs in the background, it displays the ticker on the left yellow 7S-LED group, ain on the middle red LEDs, and the board key bitmap on the right yellow group.

eForth Counted Loop with FOR .. NEXT

STM8EF is based on eForth, a Forth dialect that builds all higher level words out of a small set of primitives. Most Forth dialects provide the loop structure DO <condition> IF LEAVE THEN +LOOP, and unlike "pure eForth" STM8 eForth provides this structure, too.

: count DO I . LOOP ;
20 10 count 10 11 12 13 14 15 16 17 18 19 ok

: countdown DO I . -1 +LOOP ;
10 20 countdown 20 19 18 17 16 15 14 13 12 11 10 ok

However, the DO..LOOP structure is optional: "pure eForth" implements a FOR .. NEXT loop which runs "from start down-to 0".

: countdown ( n -- ) 
   FOR I . NEXT
;
9 countdown 9 8 7 6 5 4 3 2 1 0 ok
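Because the index runs from the start value down to and including 0, n FOR .. NEXT executes the loop body n+1 times. A short sketch with the hypothetical words stars and sum makes this visible:

```forth
: stars  ( n -- )   \ emits n+1 asterisks
  FOR 42 EMIT NEXT ;
2 stars   \ prints ***

: sum  ( n -- s )   \ n + (n-1) + ... + 0
  0 SWAP FOR I + NEXT ;
4 sum .   \ prints 10
```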

A loop structure similar to DO .. LEAVE .. LOOP can be implemented with an "idiomatic" combination of FOR .. NEXT, WHILE, and ELSE .. THEN:

: myLoop ( n1 n2 -- )
   FOR
      DUP I = NOT WHILE
      I . \ limit not reached
   NEXT
      ."  end"
   ELSE
      ."  limit"
      R> DROP \ remove the FOR loop counter
   THEN
   DROP    \ drop limit
;

In this example, 5 10 myloop prints 10 9 8 7 6 limit, and 5 4 myloop prints 4 3 2 1 0 end.

WHILE puts the address of its conditional branch target on the stack above the start address of the FOR loop (which is used when compiling NEXT). During compilation, ELSE then uses this address to make WHILE exit to the code block delimited by THEN. Note that the ELSE clause is responsible for removing the loop counter from the return stack.

Copying characters into a buffer

Loops are also powerful ways of copying data one character at a time into or out of a buffer. The following example demonstrates one way of doing this. It was taken from the nRF24 library under development and has been heavily commented with stack notations to help showcase Forth.

: nRF>b ( a c r -- s )  
\ copy count c bytes from reg. r to buffer at address a  
\ reg. r is on the nRF24 device and can be up to 32 bytes long.  
\ It holds the message the nRF24 received from another nRF24 transmitter  
\ return nRF24 STATUS s  
   _CSN.LOW ( a n r -- ) \ this definition pulls a pin low as a precursor to SPI communication  
   SPI ( a n s -- )      \ we have sent the register r to the nRF24 and the status byte was returned  
   >R ( a n -- )         \ put the status byte onto the return stack for now  
   0 DO (  a -- )        \ start a DO...LOOP   
      -1 SPI ( a c -- )  \ here we transmit a dummy byte, -1 could be anything,  
                         \ and the SPI command returns the next character in the nRF24 message buffer  
      OVER ( a c a -- )  \ bring a copy of a to the top of the stack   
      C! ( a -- )        \ c was stored to address a as 8 bit character ( C!)   
      1+ ( a -- a+1 )    \ now add 1 to the address ready to store the next character  
   LOOP                  \ go do it all again until this has been done n times  
   _CSN.HIGH ( a+n --- ) \ pull this pin high again, the address pointer is still on the data stack  
   DROP                  \ drop the address pointer from the data stack  
   R> ( -- s)            \ move the status register from the return stack back to the data stack 
                         \ Caution: the return stack must be identical to what it was when this definition was called  
                         \ since the top of the return stack is the address telling Forth where to jump to next  
;  

Of course, one of the features of Forth loops is that they don't have to start at 0. The DO loop could have been written as follows (with fewer comments this time):

: nRF>b ( a c r -- s )  
   _CSN.LOW ( a n r -- ) 
   SPI ( a n s -- ) 
   >R ( a n -- ) 
   over + swap ( a n --- a+n a  )
   DO ( a+n a -- ) \ our do loop index will start at a, and end at a+n-1
      -1 SPI ( -1 -- c )
      I ( c -- c a ) \ I is now the address pointer
      C! ( c a -- ) \ c was stored to address a as 8 bit character ( C!) 
   LOOP 
   _CSN.HIGH 
   R> ( s ) 
; 

One of the great things about Forth is that it "extends" to suit the way you, the programmer, might think about coding.

Recursion in STM8 eForth

If an existing word is re-defined in eForth, the old definition can be used in the new definition, e.g.:

16 32 64 .S
 16 32 64 <sp  ok
: .S BASE @ >R HEX .S R> BASE ! ; reDef .S ok
.S
 10 20 40 <sp  ok

The downside is that it's difficult to define recursive functions: in eForth, linking the new word to the dictionary is delayed until ; is executed.

The following example shows how to do recursion in eForth:

: RECURSE last @ NAME> CALL, ; IMMEDIATE

From the execution of : on, the word last provides the address of the name field of the new word, which NAME> converts to the address of its code.

The fibonacci function demonstrates a recursive reference with f(n-1) and f(n-2).

#require RECURSE

: fibonacci DUP 2 < IF DROP 1 ELSE DUP 2 - RECURSE SWAP 1 - RECURSE + THEN ;
: fibnums FOR I fibonacci U. NEXT ;
15 fibnums 987 610 377 233 144 89 55 34 21 13 8 5 3 2 1 1 ok

On an STM8S clocked at 16 MHz, the 92735 calls of 23 fibonacci (the maximum for 16bit arithmetic) execute in about 2.6s from RAM, or 2.3s from Flash. While that's no big problem for the stack, this is a very long time for most embedded applications (one would obviously use a table lookup).
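A table lookup along those lines could be sketched as follows. The names fibtab and fib are hypothetical, the table is deliberately short, and the sketch assumes that CREATE and , are used in RAM mode:

```forth
CREATE fibtab  1 , 1 , 2 , 3 , 5 , 8 , 13 , 21 ,   \ f(0) .. f(7)
: fib  ( n -- f )   \ valid for 0 <= n <= 7 only, no range check
  2* fibtab + @ ;
6 fib .   \ prints 13
```

Each entry occupies one 16bit cell, hence the 2* when computing the offset.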

Tree Traversal

On a µC with limited RAM, recursion should be used with care. However, for some algorithms, e.g. tree traversal, recursion is the go-to solution. The following code demonstrates 3 types of tree traversal, a tree metric, and building tree data structures:

#require RECURSE

\ binary tree (dictionary)
\ https://rosettacode.org/wiki/Tree_traversal#Forth
\ minor modifications for eForth

: node ( l r data -- node ) here >r , , , r> ;
: leaf ( data -- node ) 0 0 rot node ;
: >data  ( node -- ) @ ;
: >right ( node -- ) 2+ @ ;
: >left  ( node -- ) 2+ 2+ @ ;

: preorder ( xt tree -- )
  dup 0= if 2drop exit then
  2dup >data swap execute
  2dup >left recurse
       >right recurse ;

: inorder ( xt tree -- )
  dup 0= if 2drop exit then
  2dup >left recurse
  2dup >data swap execute
       >right recurse ;

: postorder ( xt tree -- )
  dup 0= if 2drop exit then
  2dup >left recurse
  2dup >right recurse
       >data swap execute ;

: max-depth ( tree -- n )
  dup 0= if exit then
  dup  >left recurse
  swap >right recurse max 1+ ;

\ Define this binary tree
\         1
\        / \
\       /   \
\      /     \
\     2       3
\    / \     /
\   4   5   6
\  /       / \
\ 7       8   9

variable tree
7 leaf 0      4 node
5 leaf 2 node
8 leaf 9 leaf 6 node
0      3 node 1 node tree !

\ run some examples with "." (print) as the node action
cr ' . tree @ preorder    \ 1 2 4 7 5 3 6 8 9
cr ' . tree @ inorder     \ 7 4 2 5 1 8 6 9 3
cr ' . tree @ postorder   \ 7 4 5 2 8 9 6 3 1
cr tree @ max-depth .     \ 4

Note that the example traversal code (e.g. preorder) accepts the address of an action word. Of course, most practical applications (e.g. binary search) require some changes, but Forth makes very lightweight and powerful solutions possible. It also gives you an idea of the code density: the 32 lines of code, including tree definition, and 4 traversal routines, compile to 385 bytes of binary code. That's about 200 bytes per screen. In practical applications, often only one traversal routine is needed, and tree building words sometimes can be left out of the binary.
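Since the traversal words take the address of an action word, . can be swapped for any word with the stack effect ( data -- ). The following sketch sums the node values of the tree defined above; total and add-node are hypothetical names:

```forth
variable total
: add-node  ( data -- )  total @ + total ! ;
0 total !
' add-node tree @ preorder
total @ .   \ prints 45, the sum of the node values 1..9
```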

Defining Defining Words with CREATE .. DOES>

As an extension of eForth TG9541/STM8EF supports defining defining words with CREATE..DOES>. Defining words, like CREATE, VARIABLE, or :, can be compared to classes in an OOP language, like Java, with a single method besides the constructor.

As an example, the definition of the defining word VALUE:

: VALUE CREATE , DOES> @ ;

New VALUE instances can now be defined in the following way:

10000 VALUE ONE
31415 VALUE PI
: circumference ( n -- C )
   2* PI ONE */ ;
\ test
500 circumference . 3141 ok

The clause between CREATE and DOES> is executed during "define time", whereas the clause between DOES> and ; is the runtime part of the new word (note: DOES> replaces the normal runtime code of CREATE at compile time).
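As a further illustration of the define-time and runtime parts, the following sketch defines a simple cell array. ARRAY and buf are hypothetical names, there is no bounds check, and the sketch assumes RAM mode and that ALLOT is linked in the binary:

```forth
: ARRAY  ( n -- )   \ define-time: reserve n cells
  CREATE 2* ALLOT
  DOES>  ( i -- a )  SWAP 2* + ;   \ runtime: address of cell i

4 ARRAY buf      \ four 16bit cells
42 2 buf !       \ store 42 in cell 2
2 buf @ .        \ prints 42
```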

For prototyping and for high-level code, CREATE .. DOES> is useful, and the overhead it causes doesn't matter. For frequently used defining words (e.g. CONSTANT, or VARIABLE) directly coded compile time, and run time words are better.

Here is an example from Forth.com:

DECIMAL
: HASH ( -- )
  35 EMIT  \ ASCII code of #
;

: .row ( c -- )
  CR 1 7 FOR
    2DUP AND IF HASH ELSE SPACE THEN 2*
  NEXT 2DROP
;

: SHAPE ( 8 times n -- )
  CREATE 7 FOR C, NEXT
  DOES> 7 FOR DUP R@ + C@ .row NEXT DROP CR
;

HEX AA AA FE FE 38 38 38 FE SHAPE castle
7F 46 16 1E 16 06 0F 00 SHAPE F

The "define time" behavior uses a FOR .. NEXT loop for storing 8 bytes in the dictionary. The runtime part prints each of these bytes as a bit pattern:

F<enter>
#######
 ##   #
 ## #
 ####
 ## #
 ##
####
    
 ok

Further Reading

The introduction on this page, among other topics, doesn't cover working with temporary words in RAM and using aliases for accessing unlinked words in the binary, or programming Interrupts in Forth. Please refer to the links at the beginning of the page, and in the sidebar!
