# Parser for the Avtandil grammar

## Lark

### calculator

In [1]:
from lark import Lark, Transformer, v_args

In [2]:
calc_grammar = """
    ?start: sum
          | NAME "=" sum    -> assign_var
    ?sum: product
        | sum "+" product   -> add
        | sum "-" product   -> sub
    ?product: atom
        | product "*" atom  -> mul
        | product "/" atom  -> div
    ?atom: NUMBER           -> number
         | "-" atom         -> neg
         | NAME             -> var
         | "(" sum ")"
    %import common.CNAME -> NAME
    %import common.NUMBER
    %import common.WS_INLINE
    %ignore WS_INLINE
"""

In [3]:
@v_args(inline=True)    # Affects the signatures of the methods
class CalculateTree(Transformer):
    from operator import add, sub, mul, truediv as div, neg
    number = float

    def __init__(self):
        self.vars = {}

    def assign_var(self, name, value):
        self.vars[name] = value
        return value

    def var(self, name):
        try:
            return self.vars[name]
        except KeyError:
            raise Exception("Variable not found: %s" % name)

In [4]:
calc_parser = Lark(calc_grammar, parser='lalr', transformer=CalculateTree())

In [5]:
calc = calc_parser.parse

In [6]:
calc("a = 1+2")

3.0

In [7]:
calc("1+a*-3")

-8.0

### calculator +

In [17]:
calc_grammar = """
    ?start: sum
          | NAME "=" sum    -> assign_var
    ?sum: product
        | "ᛝ" sum product   -> add
        | "ᚸ" sum product   -> sub
    ?product: atom
        | "ᛪ" product atom  -> mul
        | "ᛄ" product atom  -> div
    ?atom: NUMBER           -> number
         | "-" atom         -> neg
         | NAME             -> var
         | "(" sum ")"
    %import common.CNAME -> NAME
    %import common.NUMBER
    %import common.WS_INLINE
    %ignore WS_INLINE
"""

In [18]:
calc_parser = Lark(calc_grammar, parser='lalr', transformer=CalculateTree())
calc = calc_parser.parse

In [19]:
calc("a = ᛪ 3 4")

12.0

In [16]:
calc("a = ᚸ 1 2")

-1.0

In [21]:
#calc("a = ᚸ 1 2 ᛝ 3 4") # does not work

## Turtle 

https://github.com/lark-parser/lark/blob/master/examples/turtle_dsl.py

In [22]:
import turtle
from lark import Lark

### turtle_grammar

In [23]:
turtle_grammar = """
    start: instruction+
    instruction: MOVEMENT NUMBER            -> movement
               | "c" COLOR [COLOR]          -> change_color
               | "fill" code_block          -> fill
               | "repeat" NUMBER code_block -> repeat
    code_block: "{" instruction+ "}"
    MOVEMENT: "f"|"b"|"l"|"r"
    COLOR: LETTER+
    %import common.LETTER
    %import common.INT -> NUMBER
    %import common.WS
    %ignore WS
"""

In [24]:
parser = Lark(turtle_grammar)

In [25]:
def run_turtle(program):
    parse_tree = parser.parse(program)
    for inst in parse_tree.children:
        run_instruction(inst)

In [26]:
def run_instruction(t):
    if t.data == 'change_color':
        turtle.color(*t.children)   # We just pass the color names as-is

    elif t.data == 'movement':
        name, number = t.children
        { 'f': turtle.fd,
          'b': turtle.bk,
          'l': turtle.lt,
          'r': turtle.rt, }[name](int(number))

    elif t.data == 'repeat':
        count, block = t.children
        for i in range(int(count)):
            run_instruction(block)

    elif t.data == 'fill':
        turtle.begin_fill()
        run_instruction(t.children[0])
        turtle.end_fill()

    elif t.data == 'code_block':
        for cmd in t.children:
            run_instruction(cmd)
    else:
        raise SyntaxError('Unknown instruction: %s' % t.data)

In [27]:
text = """
        c red yellow
        fill { repeat 36 {
            f200 l170
        }}
    """

In [29]:
parse_tree = parser.parse(text)

In [30]:
parse_tree

Tree(Token('RULE', 'start'), [Tree('change_color', [Token('COLOR', 'red'), Token('COLOR', 'yellow')]), Tree('fill', [Tree(Token('RULE', 'code_block'), [Tree('repeat', [Token('NUMBER', '36'), Tree(Token('RULE', 'code_block'), [Tree('movement', [Token('MOVEMENT', 'f'), Token('NUMBER', '200')]), Tree('movement', [Token('MOVEMENT', 'l'), Token('NUMBER', '170')])])])])])])

In [31]:
for inst in parse_tree.children:
    print(inst)

Tree('change_color', [Token('COLOR', 'red'), Token('COLOR', 'yellow')])
Tree('fill', [Tree(Token('RULE', 'code_block'), [Tree('repeat', [Token('NUMBER', '36'), Tree(Token('RULE', 'code_block'), [Tree('movement', [Token('MOVEMENT', 'f'), Token('NUMBER', '200')]), Tree('movement', [Token('MOVEMENT', 'l'), Token('NUMBER', '170')])])])])])


### grammar+

In [101]:
avt_grammar = """
    start: instruction+
    instruction: OPERATOR NUMBER NUMBER     -> operator
               | OPERATOR "∅" NUMBER     -> operator
               | "Ⰺ" NUMBER NUMBER NUMBER NUMBER -> chi_sq
               | "ѯ" NUMBER NUMBER          -> percent
               | "ᬈ" vector vector -> corr
    vector: "『" NUMBER NUMBER NUMBER NUMBER NUMBER NUMBER "』"
    OPERATOR: "ᛝ"|"ᚸ"|"ᛪ"|"ᛄ"
    %import common.LETTER
    %import common.INT -> NUMBER
    %import common.WS
    %ignore WS
"""

In [102]:
parser = Lark(avt_grammar)

In [103]:
avt_code = """
        ᚸ 100 20
        ᛄ 100 2
        Ⰺ 113 144 125 129
        ᬈ 『10 15 10 13 10 21』『14 18 10 18 6 19』
        ѯ 100 13
    """

In [104]:
parse_tree = parser.parse(avt_code)

In [105]:
parse_tree

Tree(Token('RULE', 'start'), [Tree('operator', [Token('OPERATOR', 'ᚸ'), Token('NUMBER', '100'), Token('NUMBER', '20')]), Tree('operator', [Token('OPERATOR', 'ᛄ'), Token('NUMBER', '100'), Token('NUMBER', '2')]), Tree('chi_sq', [Token('NUMBER', '113'), Token('NUMBER', '144'), Token('NUMBER', '125'), Token('NUMBER', '129')]), Tree('corr', [Tree(Token('RULE', 'vector'), [Token('NUMBER', '10'), Token('NUMBER', '15'), Token('NUMBER', '10'), Token('NUMBER', '13'), Token('NUMBER', '10'), Token('NUMBER', '21')]), Tree(Token('RULE', 'vector'), [Token('NUMBER', '14'), Token('NUMBER', '18'), Token('NUMBER', '10'), Token('NUMBER', '18'), Token('NUMBER', '6'), Token('NUMBER', '19')])]), Tree('percent', [Token('NUMBER', '100'), Token('NUMBER', '13')])])

In [106]:
for inst in parse_tree.children:
    print(inst)

Tree('operator', [Token('OPERATOR', 'ᚸ'), Token('NUMBER', '100'), Token('NUMBER', '20')])
Tree('operator', [Token('OPERATOR', 'ᛄ'), Token('NUMBER', '100'), Token('NUMBER', '2')])
Tree('chi_sq', [Token('NUMBER', '113'), Token('NUMBER', '144'), Token('NUMBER', '125'), Token('NUMBER', '129')])
Tree('corr', [Tree(Token('RULE', 'vector'), [Token('NUMBER', '10'), Token('NUMBER', '15'), Token('NUMBER', '10'), Token('NUMBER', '13'), Token('NUMBER', '10'), Token('NUMBER', '21')]), Tree(Token('RULE', 'vector'), [Token('NUMBER', '14'), Token('NUMBER', '18'), Token('NUMBER', '10'), Token('NUMBER', '18'), Token('NUMBER', '6'), Token('NUMBER', '19')])])
Tree('percent', [Token('NUMBER', '100'), Token('NUMBER', '13')])


In [40]:
from scipy.stats import chisquare

In [42]:
ch = chisquare([16, 18, 16, 14])
ch.pvalue

0.9188914116546758
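
For reference, here is a sketch of what this p-value corresponds to, assuming `chisquare` is called with its defaults (equal expected frequencies, degrees of freedom equal to the number of categories minus one). The helper `chi_square_uniform` is hypothetical and only handles four categories, where the chi-square survival function with 3 degrees of freedom has a closed form.

```python
import math


def chi_square_uniform(observed):
    """Return (statistic, p-value) against equal expected frequencies.

    Assumption: exactly four categories, i.e. 3 degrees of freedom, for which
    the chi-square survival function has the closed form used below.
    """
    if len(observed) != 4:
        raise ValueError("this sketch only handles four categories")
    expected = sum(observed) / len(observed)
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    half = statistic / 2
    # Survival function of chi-square with 3 degrees of freedom.
    p_value = math.erfc(math.sqrt(half)) + 2 * math.sqrt(half / math.pi) * math.exp(-half)
    return statistic, p_value


stat, p = chi_square_uniform([16, 18, 16, 14])
print(stat, p)
```

The statistic here is 0.5, and the p-value should agree with the `ch.pvalue` shown above up to floating-point rounding.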

In [107]:
from scipy.stats import chisquare
import numpy as np

def summation(num1, num2):
    return int(num1) + int(num2)

def subtraction(num1, num2):
    return int(num1) - int(num2)

def multiplication(num1, num2):
    return int(num1) * int(num2)

def division(num1, num2):
    if int(num2) == 0:  # division by zero yields the empty value
        return '∅'
    return int(num1) / int(num2)

def run_instruction(t):
    if t.data == 'operator':
        name, number1, number2 = t.children 
        if number1 == '∅' or number2 == '∅':
            return '∅'
        return {'ᛝ': summation,
                'ᚸ': subtraction,
                'ᛪ': multiplication,
                'ᛄ': division}[name](number1, number2)

    elif t.data == 'chi_sq':
        square_values = [int(val) for val in t.children]
        ch = chisquare(square_values)
        return(ch.pvalue)
    
    elif t.data == 'percent':
        number1, number2 = t.children
        return int(number2) / int(number1) * 100

    elif t.data == 'corr':
        vec1, vec2 = t.children
        array1 = np.array([int(val) for val in vec1.children])
        array2 = np.array([int(val) for val in vec2.children])
        return np.corrcoef(array1, array2)[0][1]

    else:
        raise SyntaxError('Unknown instruction: %s' % t.data)

In [108]:
for inst in parse_tree.children:
    print(run_instruction(inst))

80
50.0
0.27909720379860026
0.7359363818524864
13.0
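
The fourth value above is the Pearson correlation of the two vectors, which `np.corrcoef` returns in the off-diagonal entry. A dependency-free sketch of the same quantity follows; the function name `pearson` is an assumption made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


r = pearson([10, 15, 10, 13, 10, 21], [14, 18, 10, 18, 6, 19])
print(r)
```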


In [109]:
avt_code0 = """
        ᛄ 100 0
        ᛝ ∅ 3
    """

In [110]:
parse_tree0 = parser.parse(avt_code0)

In [112]:
for inst in parse_tree0.children:
    print(inst)

Tree('operator', [Token('OPERATOR', 'ᛄ'), Token('NUMBER', '100'), Token('NUMBER', '0')])
Tree('operator', [Token('OPERATOR', 'ᛝ'), Token('NUMBER', '3')])


In [111]:
for inst in parse_tree0.children:
    print(run_instruction(inst))

∅


ValueError: not enough values to unpack (expected 3, got 2)

### experiment grammar

In [125]:
avt_grammar1 = """
    start: instruction+
    instruction: OPERATOR NUMBER NUMBER     -> operator
               | OPERATOR VALUE VALUE     -> operator
               | "Ⰺ" NUMBER NUMBER NUMBER NUMBER -> chi_sq
               | "ѯ" NUMBER NUMBER          -> percent
               | "ᬈ" vector vector -> corr
    vector: "『" NUMBER NUMBER NUMBER NUMBER NUMBER NUMBER "』"
    OPERATOR: "ᛝ"|"ᚸ"|"ᛪ"|"ᛄ"
    VALUE: NUMBER|"∅"|NUMBER","NUMBER
    %import common.LETTER
    %import common.INT -> NUMBER
    %import common.WS
    %ignore WS
"""

In [126]:
parser = Lark(avt_grammar1)

In [127]:
avt_code1 = """
        ᛄ 100 0
        ᛝ ∅ 3
        ᚸ 13,3 3
    """

In [128]:
parse_tree = parser.parse(avt_code1)
parse_tree

Tree(Token('RULE', 'start'), [Tree('operator', [Token('OPERATOR', 'ᛄ'), Token('NUMBER', '100'), Token('NUMBER', '0')]), Tree('operator', [Token('OPERATOR', 'ᛝ'), Token('VALUE', '∅'), Token('VALUE', '3')]), Tree('operator', [Token('OPERATOR', 'ᚸ'), Token('VALUE', '13,3'), Token('VALUE', '3')])])

In [129]:
for inst in parse_tree.children:
    print(inst)

Tree('operator', [Token('OPERATOR', 'ᛄ'), Token('NUMBER', '100'), Token('NUMBER', '0')])
Tree('operator', [Token('OPERATOR', 'ᛝ'), Token('VALUE', '∅'), Token('VALUE', '3')])
Tree('operator', [Token('OPERATOR', 'ᚸ'), Token('VALUE', '13,3'), Token('VALUE', '3')])


In [130]:
from scipy.stats import chisquare
import numpy as np

def summation(num1, num2):
    return float(num1) + float(num2)

def subtraction(num1, num2):
    return float(num1) - float(num2)

def multiplication(num1, num2):
    return float(num1) * float(num2)

def division(num1, num2):
    if float(num2) == 0:  # division by zero yields the empty value
        return '∅'
    return float(num1) / float(num2)

def float_true(val):
    if ',' in val:
        return val.replace(',', '.')
    else:
        return val

def run_instruction(t):
    if t.data == 'operator':
        name, number1, number2 = t.children 
        if number1 == '∅' or number2 == '∅':
            return '∅'
        else:
            number1 = float_true(number1)
            number2 = float_true(number2)
        return {'ᛝ': summation,
                'ᚸ': subtraction,
                'ᛪ': multiplication,
                'ᛄ': division}[name](number1, number2)

    elif t.data == 'chi_sq':
        square_values = [int(val) for val in t.children]
        ch = chisquare(square_values)
        return(ch.pvalue)
    
    elif t.data == 'percent':
        number1, number2 = t.children
        return int(number2) / int(number1) * 100

    elif t.data == 'corr':
        vec1, vec2 = t.children
        array1 = np.array([int(val) for val in vec1.children])
        array2 = np.array([int(val) for val in vec2.children])
        return np.corrcoef(array1, array2)[0][1]

    else:
        raise SyntaxError('Unknown instruction: %s' % t.data)

In [131]:
for inst in parse_tree.children:
    print(run_instruction(inst))

∅
∅
10.3
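
The three results above follow two conventions: `∅` marks a missing value that propagates through any operation, and a comma serves as the decimal separator. The sketch below restates those conventions compactly, using `None` internally for `∅`. The helper names `parse_value` and `apply_operator` are hypothetical.

```python
from operator import add, sub, mul, truediv

EMPTY = "∅"
OPERATIONS = {"ᛝ": add, "ᚸ": sub, "ᛪ": mul, "ᛄ": truediv}


def parse_value(text):
    """Convert '13,3' to 13.3 and '∅' to None."""
    if text == EMPTY:
        return None
    return float(text.replace(",", "."))


def apply_operator(symbol, left, right):
    """Apply a binary operator to two source strings; return a float or '∅'."""
    a, b = parse_value(left), parse_value(right)
    if a is None or b is None:
        return EMPTY                  # an empty operand propagates
    try:
        return OPERATIONS[symbol](a, b)
    except ZeroDivisionError:
        return EMPTY                  # division by zero also yields '∅'


for symbol, left, right in [("ᛄ", "100", "0"), ("ᛝ", "∅", "3"), ("ᚸ", "13,3", "3")]:
    print(apply_operator(symbol, left, right))
```

Catching `ZeroDivisionError` instead of comparing the divisor with the string `'0'` also covers spellings such as `0,0`.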


### extend grammar: var assign, operators, conditions

In [385]:
avt_grammar3 = """
    start: instruction+
    instruction: OPERATOR NUMBER+     -> operator
               | "Ⰺ" ( NUMBER~4 | vector | WORD ) -> chi_sq
               | "ѯ" ( NUMBER~2 | vector | WORD ) -> percent
               | "ᬈ" ( vector~2 | WORD~2 ) -> corr
               | WORD NUMBER    -> assign_var
               | CONDITION NUMBER~2 -> condition_digit
    vector: "『"  NUMBER* "』"
    OPERATOR: "ᛝ"|"ᚸ"|"ᛪ"|"ᛄ"
    CONDITION: "𐄷"|"𑚐"|"≈"|"≉"|"អ"|"᭕"|"ゅ"
    
    NUMBER: (DIGIT+ "," DIGIT+ | DIGIT+ | "∅")
    WHITESPACE: (" " | "\\n" | "\\t")+
    
    %import common.DIGIT
    %import common.WORD
    %ignore WHITESPACE
"""

In [387]:
parser = Lark(avt_grammar3, parser="lalr" )

In [388]:
avt_code1 = """
        ᬈ 『10 15 10 13 10 21』『14 18 10 18 6 19』
        ᛄ 100 0
        ᛝ ∅ 3
        ᚸ 13,3 3
        orationem 43,74
        𐄷 4 3
        𑚐 24,3 24,4
        
    """

In [389]:
parse_tree = parser.parse(avt_code1)
parse_tree

Tree(Token('RULE', 'start'), [Tree('corr', [Tree(Token('RULE', 'vector'), [Token('NUMBER', '10'), Token('NUMBER', '15'), Token('NUMBER', '10'), Token('NUMBER', '13'), Token('NUMBER', '10'), Token('NUMBER', '21')]), Tree(Token('RULE', 'vector'), [Token('NUMBER', '14'), Token('NUMBER', '18'), Token('NUMBER', '10'), Token('NUMBER', '18'), Token('NUMBER', '6'), Token('NUMBER', '19')])]), Tree('operator', [Token('OPERATOR', 'ᛄ'), Token('NUMBER', '100'), Token('NUMBER', '0')]), Tree('operator', [Token('OPERATOR', 'ᛝ'), Token('NUMBER', '∅'), Token('NUMBER', '3')]), Tree('operator', [Token('OPERATOR', 'ᚸ'), Token('NUMBER', '13,3'), Token('NUMBER', '3')]), Tree('assign_var', [Token('WORD', 'orationem'), Token('NUMBER', '43,74')]), Tree('condition_digit', [Token('CONDITION', '𐄷'), Token('NUMBER', '4'), Token('NUMBER', '3')]), Tree('condition_digit', [Token('CONDITION', '𑚐'), Token('NUMBER', '24,3'), Token('NUMBER', '24,4')])])

In [390]:
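class AvtandilProg:

    def __init__(self):
        # '𐃰' is the special variable that receives the result of an operator.
        self.vars = {'𐃰': ''}

    def assign_var(self, nv):
        # Accepts either an assign_var tree or a plain (name, value) tuple.
        if isinstance(nv, tuple):
            name, value = nv
        else:
            name, value = (str(child) for child in nv.children)
        self.vars[name] = value
        return value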

    def var(self, name):
        return self.vars[name]
    
    def summation(self, num1, num2):
        return float(num1) + float(num2)
    
    def subtraction(self, num1, num2):
        return float(num1) - float(num2)
    
    def multiplication(self, num1, num2):
        return float(num1) * float(num2)
    
    def division(self, num1, num2):
        if float(num2) == 0:  # division by zero yields the empty value
            return '∅'
        return float(num1) / float(num2)
    
    def float_true(self, val):
        if ',' in val:
            return val.replace(',', '.')
        else:
            return val
    
    def run_instruction(self, t):
        if t.data == 'operator':
            oper = str(t.children[0])
            operands = [str(num) for num in t.children[1:]]
            if '∅' in operands:
                # An empty operand makes the whole result empty.
                val = '∅'
            else:
                apply = {'ᛝ': self.summation,
                         'ᚸ': self.subtraction,
                         'ᛪ': self.multiplication,
                         'ᛄ': self.division}[oper]
                # Fold the operator over the operands from left to right.
                val = self.float_true(operands[0])
                for n in operands[1:]:
                    if val == '∅':  # an earlier division by zero
                        break
                    val = apply(val, self.float_true(n))
            # The result is stored in the special variable rather than returned.
            self.assign_var(('𐃰', val))

        elif t.data == 'chi_sq':
            square_values = [float(self.float_true(val)) for val in t.children]
            ch = chisquare(square_values)
            return ch.pvalue

        elif t.data == 'percent':
            number1, number2 = t.children
            return float(self.float_true(number2)) / float(self.float_true(number1)) * 100

        elif t.data == 'corr':
            vec1, vec2 = t.children
            array1 = np.array([float(self.float_true(val)) for val in vec1.children])
            array2 = np.array([float(self.float_true(val)) for val in vec2.children])
            return np.corrcoef(array1, array2)[0][1]

        elif t.data == 'assign_var':
            return self.assign_var(t)

        elif t.data == 'condition_digit':
            return self.condition_digit(t)

        else:
            raise SyntaxError('Unknown instruction: %s' % t.data)
    
    def condition_digit(self, cnd):
        condition, value1, value2 = cnd.children
        value1 = float(self.float_true(value1))
        value2 = float(self.float_true(value2))

        if condition == '𐄷':      # equal
            return value1 == value2

        elif condition == '𑚐':    # not equal
            return value1 != value2

        elif condition == '≈':    # approximately equal: differ by at most 1
            return abs(value1 - value2) <= 1

        # '≉' and the remaining CONDITION symbols are not implemented yet;
        # they fall through and return None.

In [391]:
avt = AvtandilProg()

In [392]:
parse_tree.children[0]

Tree('corr', [Tree(Token('RULE', 'vector'), [Token('NUMBER', '10'), Token('NUMBER', '15'), Token('NUMBER', '10'), Token('NUMBER', '13'), Token('NUMBER', '10'), Token('NUMBER', '21')]), Tree(Token('RULE', 'vector'), [Token('NUMBER', '14'), Token('NUMBER', '18'), Token('NUMBER', '10'), Token('NUMBER', '18'), Token('NUMBER', '6'), Token('NUMBER', '19')])])

In [393]:
parse_tree.children[3]

Tree('operator', [Token('OPERATOR', 'ᚸ'), Token('NUMBER', '13,3'), Token('NUMBER', '3')])

In [394]:
avt.run_instruction(parse_tree.children[3])

In [395]:
avt.vars['𐃰']

10.3
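
Because the rule `OPERATOR NUMBER+` accepts any number of operands, the interpreter applies the operator to them from left to right. The same idea can be sketched with `functools.reduce`. The helper `fold_operator` is hypothetical, and the operands are assumed to be already converted to floats.

```python
from functools import reduce
from operator import add, sub, mul, truediv

OPERATIONS = {"ᛝ": add, "ᚸ": sub, "ᛪ": mul, "ᛄ": truediv}


def fold_operator(symbol, operands):
    """Left fold: ((a op b) op c) op ..."""
    return reduce(OPERATIONS[symbol], operands)


print(fold_operator("ᚸ", [100.0, 20.0, 30.0]))   # (100 - 20) - 30
print(fold_operator("ᛪ", [2.0, 3.0, 4.0]))       # (2 * 3) * 4
```

With a single operand `reduce` returns that operand unchanged, which matches a one-number `operator` instruction.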

In [396]:
parse_tree.children[4]

Tree('assign_var', [Token('WORD', 'orationem'), Token('NUMBER', '43,74')])

In [397]:
avt.assign_var(parse_tree.children[4])

'43,74'

In [398]:
avt.vars

{'𐃰': 10.3, 'orationem': '43,74'}

In [399]:
parse_tree.children[5]

Tree('condition_digit', [Token('CONDITION', '𐄷'), Token('NUMBER', '4'), Token('NUMBER', '3')])

In [400]:
avt.condition_digit(parse_tree.children[5])

False

In [401]:
parse_tree.children[6]

Tree('condition_digit', [Token('CONDITION', '𑚐'), Token('NUMBER', '24,3'), Token('NUMBER', '24,4')])

In [402]:
avt.condition_digit(parse_tree.children[6])

True
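
The implemented conditions can also be expressed as a lookup table, which keeps each symbol next to its comparison. This is a sketch only: the names `CONDITIONS`, `to_float`, and `check_condition` are assumptions, and `≈` is taken to mean "differ by at most 1", as in `condition_digit`.

```python
import operator

# Comparison table for the conditions the notebook implements so far.
CONDITIONS = {
    "𐄷": operator.eq,                       # equal
    "𑚐": operator.ne,                       # not equal
    "≈": lambda a, b: abs(a - b) <= 1,      # approximately equal (assumed tolerance of 1)
}


def to_float(text):
    """Parse a number written with a decimal comma, e.g. '24,3'."""
    return float(text.replace(",", "."))


def check_condition(symbol, left, right):
    """Evaluate one condition; unknown symbols raise KeyError."""
    return CONDITIONS[symbol](to_float(left), to_float(right))


print(check_condition("𐄷", "4", "3"))
print(check_condition("𑚐", "24,3", "24,4"))
print(check_condition("≈", "24,3", "24,4"))
```

Adding a further condition then only requires a new table entry.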

In [403]:
print(parse_tree.pretty()) 

start
  corr
    vector
      10
      15
      10
      13
      10
      21
    vector
      14
      18
      10
      18
      6
      19
  operator
    ᛄ
    100
    0
  operator
    ᛝ
    ∅
    3
  operator
    ᚸ
    13,3
    3
  assign_var
    orationem
    43,74
  condition_digit
    𐄷
    4
    3
  condition_digit
    𑚐
    24,3
    24,4

