An implementation of a Whitespace parser relying only on the State monad

Whitespace parser

Associated blog post:

An example parser for the Whitespace programming language that uses only StateT to carry the program, the stack, and the output.

Whitespace deals with 3 characters only: " ", "\t", and "\n".

This parser relies heavily on StateT and a few combinators, as the root parser shows:

def imp: StateT[F, (String, Stack), String] = for {
   output <- stackCommands.all <+>
     ioCommands.all <+>
     arithmeticCommands.all <+>
     heapCommands.all
   rest <- imp <+> StateT.pure[F, (String, Stack), String]("")
} yield output + rest

The state of our parser contains the remaining program and the execution stack. The result is a String: what is printed to the screen.
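To make the state-passing idea concrete, here is a minimal sketch that does not use cats at all. It hand-rolls a function type that plays roughly the role of StateT[Option, (String, Stack), String]; `orElse` stands in for `<+>` and `pure` for StateT.pure. The commands `a` and `b` are hypothetical placeholders, not Whitespace commands, and none of these names come from the repository.

```scala
object StateSketch {
  type Stack = List[Int]
  type S = (String, Stack)               // (remaining program, execution stack)
  type Parser = S => Option[(S, String)] // new state plus the output produced

  // Try p; if it fails, try q on the same input state (the role of <+>).
  def orElse(p: Parser, q: Parser): Parser = s => p(s).orElse(q(s))

  // Always succeeds, leaves the state untouched (the role of StateT.pure).
  def pure(out: String): Parser = s => Some((s, out))

  // Hypothetical command: consume a fixed prefix and emit a fixed output.
  def literal(prefix: String, out: String): Parser = {
    case (prog, stack) =>
      if (prog.startsWith(prefix)) Some(((prog.drop(prefix.length), stack), out))
      else None
  }

  val a: Parser = literal("a", "A")
  val b: Parser = literal("b", "B")

  // Same shape as the root parser: one command, then either more commands or "".
  def imp: Parser = s =>
    orElse(a, b)(s).flatMap { case (s1, output) =>
      orElse(imp, pure(""))(s1).map { case (s2, rest) => (s2, output + rest) }
    }
}
```

Running `StateSketch.imp(("abba!", Nil))` consumes the four recognised characters and stops at `!`, which stays in the state as the unparsed rest. An input whose first character matches no command fails outright, because the first command is mandatory and only the recursion has an empty fallback.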

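Whitespace groups its commands into 5 categories (stack manipulation, arithmetic, heap access, flow control, and I/O). Each category is selected by a prefix, the Instruction Modification Parameter (IMP), which determines how the following characters are interpreted.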

For instance, one of the ioCommands (triggered by the prefix "\t\n") is "  " (two spaces), which means: print the character on top of the stack.
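As an illustration, the output-character command could be written as a plain state-passing function. This is only a sketch: `printChar` is a hypothetical name, the function is not wrapped in StateT, and it assumes the stack is left untouched after printing, which is what the final stack in the Hello, world! run suggests. In the Whitespace specification the full command is the prefix "\t\n" followed by two spaces.

```scala
object IoSketch {
  type Stack = List[Int]
  type S = (String, Stack) // (remaining program, execution stack)

  // Matches "\t\n" (I/O prefix) followed by two spaces (output character).
  // Fails with None if the program does not start with the command
  // or if the stack is empty.
  def printChar(s: S): Option[(S, String)] = s match {
    case (prog, stack @ (top :: _)) if prog.startsWith("\t\n  ") =>
      // Assumption: the value stays on the stack after being printed.
      Some(((prog.drop(4), stack), top.toChar.toString))
    case _ => None
  }
}
```

For example, `IoSketch.printChar(("\t\n  ", List(72)))` consumes the whole program and produces the output "H".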

Hello, world!

As a typical example, here is the parser run on the Hello world program from Wikipedia:

val helloWorld = "   \t  \t   \n\t\n     \t\t  \t \t\n\t\n     \t\t \t\t  \n\t\n     \t\t \t\t  \n\t\n     \t\t \t\t\t\t\n\t\n     \t \t\t  \n\t\n     \t     \n\t\n     \t\t\t \t\t\t\n\t\n     \t\t \t\t\t\t\n\t\n     \t\t\t  \t \n\t\n     \t\t \t\t  \n\t\n     \t\t  \t  \n\t\n     \t    \t\n\t\n  \n\n\n"
val Some(((rest, stack), output)) = new WhitespaceParser[Option].eval(helloWorld)
println(s"Stack: $stack")
println(s"Output: $output")
Stack: List(33, 100, 108, 114, 111, 119, 32, 44, 111, 108, 108, 101, 72)
Output: Hello, world!
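The numbers pushed by this program are written in binary, as the Whitespace specification defines them: a sign (" " for positive, "\t" for negative), then bits (" " for 0, "\t" for 1), terminated by "\n". The sketch below decodes such a literal and checks that the final stack, read bottom to top, spells the output. `decode` and `render` are hypothetical helper names used for illustration; they are not taken from the repository.

```scala
object NumberSketch {
  // Decode a Whitespace number literal: sign, bits, then a '\n' terminator.
  // Returns None if the terminator is missing, the literal is empty,
  // or it contains a character other than ' ' and '\t'.
  def decode(literal: String): Option[Int] = {
    val body = literal.takeWhile(_ != '\n')
    val wellFormed =
      body.nonEmpty &&
        body.length < literal.length && // a '\n' terminator was found
        body.forall(c => c == ' ' || c == '\t')
    if (!wellFormed) None
    else {
      // First character is the sign; the rest are bits, most significant first.
      val magnitude =
        body.tail.foldLeft(0)((acc, c) => acc * 2 + (if (c == '\t') 1 else 0))
      Some(if (body.head == '\t') -magnitude else magnitude)
    }
  }

  // The head of the list is the top of the stack, so reverse before rendering.
  def render(stack: List[Int]): String = stack.reverse.map(_.toChar).mkString
}
```

The first push in the program carries the literal " \t  \t   \n", which `decode` turns into 72, the character code of 'H'. Passing the printed stack to `render` gives back "Hello, world!".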


Details about the syntax here:


The parser is not complete: some commands are missing, but you get the idea.