Programming in Toki Pona
============

Firstly, an FAQ:

What is toki pona?
- 
toki pona (always written in lowecase) is a conlang (constructed language) created by Sonja Lang, and is known for its small vocabulary and general simplicity. It has around 130 words, no conjugations/declensions or gender and every word can be either a noun, a verb or an adjective depending on context. Its numbering system is also super simple, similar to the roman numerals.

It was chosen to be used in this programming language because i think its a fun challenge to translate 'harder' concepts such as multiplication and if/elses into such a simple language, and also its appearence when using the logography. By the way, this is how you say 'toki pona' in toki pona:

![toki pona logo](./assets/smol.png)

How does it look like?
- 
There are two ways to view this language, the first one is using the romanization, and the second one is using toki pona's logography. Both can be seen below:

![Sample code in both views](./assets/sampleCode_compare.png "Sample code in both views")

The romanization is the one that will appear here in the code snippets and such. If you want to see it in its logographical form, there is a simple way to see it in an html, as we can see:

Its done by pulling an external toki pona style sheet into the html, as so:

``` 
<head>
    <link rel="stylesheet" href="https://davidar.github.io/linja-pona/stylesheet.css"></link>
</head>
<body class="linja-pona" style="font-size: xx-large">
    toki+pona li pona tawa mi 
</body>
```

This exact sample can be [easily seen here](./logography/firstExample.html). Another one, as seen above, looks [like this](./logography/secondExample.html).
I cant seem to make this link open in the browser, but you should be able to open it ([maybe even inside your IDE](vscode:extension/george-alisson.html-preview-vscode), with some extensions). The only differences in the html are the ```<br>``` used for line breaks and ```&emsp;``` used for spaces (not necessary, but more read-able).

The brackets in the variable names are used simply to allow the correct visualization in logography, for the [cartuches](https://en.wikipedia.org/wiki/Cartouche "Variable names like: [_nanpa_wan]").

What commands are in it?
- 
Currently, there are:

- 3 variable types, 'nimi' for strings, 'nanpa' for integers and 'kulupu' for an array of integers
- variable names in cartuches
- number assignment using the toki pona numbering system
- operations such as: sum, subtraction, multiplication, division
- if-else statements with 'la (condition) ni (commands) pini'
- while statements with 'awen la (condition) ni (commands) pini'
- 'toki' for printing out a variable (not compatible with arrays)

There is also a [cheat-sheet](./assets/cheat_sheet.png) included, which is probably easier to understand.

----------------------------------------
----------------------------------------

```
<program> ::= <declarations><commands>

<declarations> ::=  <declaration> |
                   <declaration> <declarations>

<declaration> ::= nimi id |
                 nanpa id |
                 kulupu id 

<commands> ::=  <command> |
                <command> <commands>

<command> ::= id sama <expression>

<expression> ::= id |
                 <number> |
                 <expression> en <expression> |
                 <expression> ala en <expression> |
                 <expression> namako <expression> |
                 <expression> kipisi <expression> |

<number> ::= <compound> |
             <compound> ala

<compound> ::= <simple> |
               <simple> <compound>

<if-else> ::= la (<expression> <comp> <expression>) ni <commands> pini |
              la (<expression> <comp> <expression>) ni <commands> ante la <if-else>

<comp> ::= sama sama | 
           sama ala 
```

------------------------------
------------------------------

Below, is where the code starts! Its all commented so you should be able to follow along:

In [8]:
# make sure to 'pip install rply' if you haven't already!

# simple testing toki pona code below:

# nanpa [_nanpa_wan] 
# nanpa [_nanpa_tu] 
# nanpa [_nanpa_tu_wan] 
#
# [_nanpa_wan] sama luka wan
# [_nanpa_tu] sama tu
#
# [_nanpa_tu_wan] sama [_nanpa_wan] en [_nanpa_tu]

# ------------------------------------------------ #
# [_nanpa_tu_wan] should be equal to 8 (luka tu wan) btw
# code in a string below:
simpleTesting = "nanpa [_nanpa_wan] nanpa [_nanpa_tu] nanpa [_nanpa_tu_wan] [_nanpa_wan] sama luka wan [_nanpa_tu] sama tu [_nanpa_tu_wan] sama [_nanpa_wan] en [_nanpa_tu]"


In [9]:
from rply import LexerGenerator

lg = LexerGenerator()

# This part of the code transforms all of the code we will input into tokens
# The tokens will then later be used to parse the code into a tree

# contemplating making this into just a 'sama', and a if not be in the else
# possibly force an if/else to always have an else ??
# lg.add('SAME', r'( sama sama )') # lg.add('DIFF', r'( sama ala )')

lg.add('WHILE', r'(awen la )')
lg.add('IF', r'(la )')
lg.add('ELSE', r'(ante la )')
lg.add('THEN', r'( ni )')
lg.add('END', r'( pini)')

lg.add('EQUALS', r'( sama )')
lg.add('SUB', r'( ala en )')
lg.add('ADD', r'( en )')
lg.add('MUL', r'( namako )')
lg.add('DIV', r'( kipisi )')

lg.add('INT', r'(nanpa )')
lg.add('STR', r'(nimi )')
lg.add('ARR', r'(kulupu )')

lg.add('SIMPLE', r"\b(?:luka|tu|mute|wan)\b")	# it will separate all of the numbers, this will be dealt with shortly
lg.add('NEG', r"( ala)")	# this is a negative number
lg.add('ID', r'\[(\_([jklmnpstw]?[aeiou])+)+\]') 		# how does this work this perfectly

lg.add('OPEN_PARENS', r'\(')
lg.add('CLOSE_PARENS', r'\)')
# question, do i need a delimiter? 

# lg.ignore('\s+') # probably not gonna need this, but check

lexer = lg.build()

In [14]:
from rply.token import BaseBox

class Program(BaseBox):
    def __init__(self, decls,cmmds):
        self.decls = decls
        self.cmmds = cmmds

    def accept(self, visitor):
        visitor.visit_program(self)

# ------------------------------------------- #
# Declarations

class Declarations(BaseBox):
    def __init__(self, decl,decls):
        self.decl = decl
        self.decls = decls

    def accept(self, visitor):
        visitor.visit_declarations(self)

class Declaration(BaseBox):
    def __init__(self, id,tp):
        self.id = id
        self.tp = tp

    def accept(self, visitor):
        visitor.visit_declaration(self)

# ------------------------------------------- #
# Commands

class Commands(BaseBox):
    def __init__(self, cmmd,cmmds):
        self.cmmd = cmmd
        self.cmmds = cmmds

    def accept(self, visitor):
        visitor.visit_commands(self)

class Command(BaseBox):
    def __init__(self, id,expr):
        self.id = id
        self.expr = expr

    def accept(self, visitor):
        visitor.visit_command(self)

# ------------------------------------------- #

class Atrib(BaseBox):
    def __init__(self, expr):
        self.expr = expr

class IfElse(BaseBox):
    def __init__(self, expr1, comp, expr2, ie1,ie2):
        self.expr1=expr1
        self.comp = comp
        self.expr2=expr2
        self.ie1=ie1
        self.ie2=ie2

    def accept(self, visitor):
        visitor.visit_ifelse(self)

class Expr(BaseBox):
    def accept(self, visitor):
        method_name = 'visit_{}'.format(self.__class__.__name__.lower())
        visit = getattr(visitor, method_name)
        visit(self)

# ------------------------------------------- #
# Number and its classifications

class Number(Expr):
    def __init__(self, compound, ala):
        self.compound = compound
        self.ala = ala 

class Compound(Expr):
    def __init__(self, simple, compound):
        self.simple = simple
        self.compound = compound

class Simple(Expr):
    def __init__(self, value, neg):
        ones = value.count('wan')
        twos = value.count('tu')
        fives = value.count('luka')
        twentys = value.count('mute')
        hundreds = value.count('ale')
        trueNumber = ones + twos*2 + fives*5 + twentys*20 + hundreds*100
        if neg:
            trueNumber = -trueNumber
        self.value = trueNumber

# ------------------------------------------- #
# Other expressions

class Id(Expr):
    def __init__(self, value):
        self.value = value

# -- the basic binary operations, they are all handeled by the BinaryOp class -- #
class BinaryOp(Expr):
    def __init__(self, left, right):
        self.left = left
        self.right = right
class Add(BinaryOp):
  pass
class Sub(BinaryOp):
  pass
class Mul(BinaryOp):
  pass
class Div(BinaryOp):
  pass


In [15]:
from rply import ParserGenerator

pg = ParserGenerator(
    # A list of all token names, accepted by the lexer.
    ['COMPARE', 'NOTCOMPARE', 'EQUALS',
     'ADD', 'SUB', 'MUL', 'DIV', 
     'WHILE', 'IF', 'ELSE',
     'ID', 'SIMPLE', 'NEG',
     'INT', 'STR', 'ARR',
     'OPEN_PARENS', 'CLOSE_PARENS', 
    ],
    # A list of precedence rules with ascending precedence, to
    # disambiguate ambiguous production rules.
    precedence=[
        ('left', ['ADD', 'SUB']),
        ('left', ['MUL', 'DIV'])
    ]
)

# ------------------------------------------- #

@pg.production('program : declarations commands')
def program(p):
    return Program(p[0],p[1])

@pg.production('declarations : declaration')
def declarations(p):
    return Declarations(p[0],None)

@pg.production('declarations : declaration declarations')
def declarations(p):
    return Declarations(p[0],p[1])

@pg.production('declaration : INT ID')
def declaration_integer(p):
    return Declaration(p[1].getstr(), "int")

@pg.production('declaration : STR ID')
def declaration_integer(p):
    return Declaration(p[1].getstr(), "str")

@pg.production('declaration : ARR ID')
def declaration_integer(p):
    return Declaration(p[1].getstr(), "array")

# ------------------------------------------- #

@pg.production('commands : command')
def commands_command(p):
    return Commands(p[0],None)

@pg.production('commands : command commands')
def command_commands(p):
    return Commands(p[0],p[1])

@pg.production('command : ID EQUALS expression')
def commands_command(p):
    return Command(p[1].getstr(),p[2])

# ------------------------------------------- #
# if-else statements

@pg.production('if-else : IF OPEN_PARENS expression COMP expression CLOSE_PARENS THEN commands END')
def expression_ifelse1(p):
    return IfElse (p[2],p[3],p[4],p[6],None)

@pg.production('if-else : IF OPEN_PARENS expression COMP expression CLOSE_PARENS THEN commands ELSE if-else')
def expression_ifelse3(p):
    return IfElse (p[2],p[3],p[4],p[6],p[8])

# ------------------------------------------- #

@pg.production('compounds : compound')
def compounds_compound(p):
    return Compound(p[0],None)

@pg.production('compounds : compound compounds')
def compound_compounds(p):
    return Compound(p[0],p[1])

@pg.production('compound : SIMPLE')
def compound_simple(p):
    return Number(p[0].getstr(), 0)

@pg.production('compound : SIMPLE NEG')
def compound_simpleneg(p):
    return Number(p[0].getstr(), 1) 

# ------------------------------------------- #

@pg.production('expression : ID')
def expression_id(p):
    return Id(p[0].getstr())

@pg.production('expression : expression ADD expression')
@pg.production('expression : expression SUB expression')
@pg.production('expression : expression MUL expression')
@pg.production('expression : expression DIV expression')
def expression_binop(p):
    left = p[0]
    right = p[2]
    if p[1].gettokentype() == 'ADD':
        return Add(left, right)
    elif p[1].gettokentype() == 'SUB':
        return Sub(left, right)
    elif p[1].gettokentype() == 'MUL':
        return Mul(left, right)
    elif p[1].gettokentype() == 'DIV':
        return Div(left, right)
    else:
        raise AssertionError('Oops, this should not be possible!')



In [13]:
@pg.error
def error_handler(token):
    raise ValueError("Ran into a %s where it wasn't expected" % token.gettokentype())

parser = pg.build()
string = 'nanpa [_linja_sike] [_linja_sike] sama luka'
arvore=parser.parse(lexer.lex(string))

  parser = pg.build()
  parser = pg.build()
  parser = pg.build()


KeyError: 'COMP'