In [1]:
%pylab inline
from enum import Enum

Populating the interactive namespace from numpy and matplotlib


# Algebra Solver

## Problem statement

The first stage is to be able to evaluate algebraic expressions that contain only operations and constants.

For example, solve: $\dfrac{1}{2}\cdot3+5\cdot(2\cdot7^3)+(-1)^2$.
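
As a reference value for any later implementation, the example can be evaluated directly with Python's built-in arithmetic, where `**` plays the role of `^`:

```python
# Evaluate the example expression with Python's own operators.
# 1/2 * 3 = 1.5, 5 * (2 * 343) = 3430, (-1)^2 = 1
expected = (1 / 2) * 3 + 5 * (2 * 7 ** 3) + (-1) ** 2
print(expected)  # 3432.5
```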

In later stages it might be desirable to solve equalities and inequalities, and to express one term in terms of others.

### Operations

We should be able to support the following:

* Addition/subtraction
* Multiplication/division (also implicit multiplication)
* Parentheses
* Exponents
* Square root
* Trigonometric functions
* Logarithms

### Order of operations

We should follow PEMDAS:

1. Parentheses
2. Exponents
3. Multiplication/Division
4. Addition/Subtraction
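
A minimal sketch of how this ordering could be encoded, keyed by the symbol names that the tokenizer in this notebook uses. `PRECEDENCE`, `RIGHT_ASSOCIATIVE`, and `binds_tighter` are hypothetical names, not part of the notebook. Two details PEMDAS leaves implicit are assumed here: operators on the same level are applied left to right, and exponentiation is right-associative, so $2^{3^2} = 2^9$.

```python
# Higher number = binds tighter. Parentheses are handled structurally, not here.
PRECEDENCE = {'add': 1, 'subtract': 1, 'multiply': 2, 'divide': 2, 'power': 3}
RIGHT_ASSOCIATIVE = {'power'}

def binds_tighter(left, right):
    """Return True if operator `left` should be applied before `right`
    when they appear in that order, as in: a <left> b <right> c."""
    if PRECEDENCE[left] != PRECEDENCE[right]:
        return PRECEDENCE[left] > PRECEDENCE[right]
    # Same level: left to right, unless the operator is right-associative.
    return left not in RIGHT_ASSOCIATIVE

print(binds_tighter('multiply', 'add'))  # True
print(binds_tighter('subtract', 'add'))  # True
print(binds_tighter('power', 'power'))   # False
```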

## Parsing

### Token class

A token holds its type and its value. There are four types of tokens:

**Types:** `symbol`, `function`, `term`, `constant`

In [2]:
class Token():
    def __init__(self, type, value):
        self.type = type
        self.value = value
    
    def __repr__(self):
        return "['%s': '%s']" % (self.type, self.value)
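
The notebook imports `Enum` but does not use it yet. One possible use, shown here only as a sketch, is to restrict the token type to the four values listed for the `Token` class. `TokenType` and `TypedToken` are hypothetical names.

```python
from enum import Enum

class TokenType(Enum):
    SYMBOL = 'symbol'
    FUNCTION = 'function'
    TERM = 'term'
    CONSTANT = 'constant'

class TypedToken():
    def __init__(self, type, value):
        # Looking up the enum by value raises ValueError for an unknown type.
        self.type = TokenType(type)
        self.value = value

    def __repr__(self):
        # Same textual form as the plain Token class.
        return "['%s': '%s']" % (self.type.value, self.value)

print(TypedToken('constant', '3.14'))
```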

### Tokenizer class

This is the engine of the parser: all characters and digits are converted into a list of tokens, which can be used in lexical analysis later on.

**Whitespace characters:** `<space>`, `\n`, `\r`, `\t`.

**Supported symbols:** `+`, `-`, `*`, `/`, `=`, `^`, `(`, `)`, `,`.

**Supported functions:** `sqrt`, `sin`, `asin`, `cos`, `acos`, `tan`, `atan`, `log`, `ln`.

In [16]:
import string

class Tokenizer():
    whitespace = [' ', '\r', '\n', '\t']
    symbols = {'+':'add', '-':'subtract', '*':'multiply', '/':'divide', '=':'equal',
               '^':'power', '(':'left-parenthesis', ')':'right-parenthesis', ',':'comma'}
    functions = ['sqrt', 'sin', 'asin', 'cos', 'acos', 'tan', 'atan', 'log', 'ln']
    
    def __init__(self, expression):
        self.expression = expression
        self.index = 0
        self.tokens = []
        self.tokenize()
        
    def next(self, skip = 1):
        self.index += skip
        
    def end(self):
        return self.index >= len(self.expression)
    
    def nextCharacter(self):
        return self.expression[self.index:self.index+1]
    
    def eatWhitespace(self):
        while self.nextCharacter() in self.whitespace:
            self.next()
            
    def eatDigit(self):
        digit = ''
        while self.nextCharacter().isdigit() or self.nextCharacter() == '.':
            digit += self.nextCharacter()
            self.next()
        return digit
    
    def eatWord(self):
        word = ''
        # The empty string is "in" every string, so also require a non-empty
        # character; otherwise this loops forever at the end of the input.
        while self.nextCharacter() and self.nextCharacter() in string.ascii_letters:
            word += self.nextCharacter()
            self.next()
        return word
            
    def tokenize(self):
        while not self.end():
            self.eatWhitespace()
            if self.end():
                # Only trailing whitespace was left.
                break
            if self.nextCharacter() in self.symbols:
                token = Token('symbol', self.symbols[self.nextCharacter()])
                self.tokens.append(token)
                self.next()
            elif self.nextCharacter() in string.ascii_letters:
                word = self.eatWord()
                if word in self.functions:
                    token = Token('function', word)
                elif len(word) == 1:
                    token = Token('term', word)
                else:
                    # A multi-letter word that is not a known function.
                    raise ValueError("Unknown word '%s' before index %i" % (word, self.index))
                self.tokens.append(token)
            elif self.nextCharacter().isdigit():
                digit = self.eatDigit()
                token = Token('constant', digit)
                self.tokens.append(token)
            else:
                raise ValueError('Invalid character at index %i' % self.index)
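
Note that `eatDigit` accepts any run of digits and dots, so an input such as `1.2.3` becomes a single `constant` token. A minimal sketch of a stricter check, assuming constants should be plain decimal numbers; `NUMBER` and `is_valid_constant` are hypothetical names, not part of the tokenizer:

```python
import re

# One or more digits, optionally followed by a dot and more digits.
NUMBER = re.compile(r'\d+(\.\d+)?')

def is_valid_constant(text):
    """Return True if `text` is a well-formed decimal constant."""
    return NUMBER.fullmatch(text) is not None

for candidate in ['36', '3.14', '1.2.3', '2.']:
    print(candidate, is_valid_constant(candidate))
```

A check like this could be applied to the string returned by `eatDigit` before the `constant` token is created.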

We can use the tokenizer with any mathematical expression, like:

In [17]:
tokenizer = Tokenizer('(2g + sqrt(36x) * ln(2) * acos(6)A * 2) / 3.14 tan(5) = log(23,5) + 3')

This is an example of the tokenizer's output. The result is a list of `Token` objects.

In [18]:
tokenizer.tokens

[['symbol': 'left-parenthesis'],
 ['constant': '2'],
 ['term': 'g'],
 ['symbol': 'add'],
 ['function': 'sqrt'],
 ['symbol': 'left-parenthesis'],
 ['constant': '36'],
 ['term': 'x'],
 ['symbol': 'right-parenthesis'],
 ['symbol': 'multiply'],
 ['function': 'ln'],
 ['symbol': 'left-parenthesis'],
 ['constant': '2'],
 ['symbol': 'right-parenthesis'],
 ['symbol': 'multiply'],
 ['function': 'acos'],
 ['symbol': 'left-parenthesis'],
 ['constant': '6'],
 ['symbol': 'right-parenthesis'],
 ['term': 'A'],
 ['symbol': 'multiply'],
 ['constant': '2'],
 ['symbol': 'right-parenthesis'],
 ['symbol': 'divide'],
 ['constant': '3.14'],
 ['function': 'tan'],
 ['symbol': 'left-parenthesis'],
 ['constant': '5'],
 ['symbol': 'right-parenthesis'],
 ['symbol': 'equal'],
 ['function': 'log'],
 ['symbol': 'left-parenthesis'],
 ['constant': '23'],
 ['symbol': 'comma'],
 ['constant': '5'],
 ['symbol': 'right-parenthesis'],
 ['symbol': 'add'],
 ['constant': '3']]
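
The output contains adjacent operands with no operator between them, such as `2` followed by `g`, because the tokenizer does not insert implicit multiplication itself. A minimal sketch of a post-processing pass that makes it explicit. To keep the block self-contained it works on `(type, value)` pairs instead of `Token` objects, and all three function names are hypothetical.

```python
def ends_operand(token):
    """True if an operand can end at this token: a constant, a term or ')'."""
    kind, value = token
    return kind in ('constant', 'term') or token == ('symbol', 'right-parenthesis')

def starts_operand(token):
    """True if an operand can start at this token: a constant, term, function or '('."""
    kind, value = token
    return kind in ('constant', 'term', 'function') or token == ('symbol', 'left-parenthesis')

def insert_implicit_multiplication(tokens):
    result = []
    for token in tokens:
        # Two operands side by side imply a multiplication between them.
        if result and ends_operand(result[-1]) and starts_operand(token):
            result.append(('symbol', 'multiply'))
        result.append(token)
    return result

# The pairs for: 2g + sqrt(36x)
pairs = [('constant', '2'), ('term', 'g'), ('symbol', 'add'),
         ('function', 'sqrt'), ('symbol', 'left-parenthesis'),
         ('constant', '36'), ('term', 'x'), ('symbol', 'right-parenthesis')]
for pair in insert_implicit_multiplication(pairs):
    print(pair)
```

With the real tokenizer, the pairs could be produced with `[(t.type, t.value) for t in tokenizer.tokens]`.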

## Expression tree

The next step is to create an expression tree from the tokens. We should be able to solve expressions with only constants and operations.

### Node class

In [15]:
class Node():
    def __init__(self, left, right, operation):
        self.left = left
        self.right = right
        self.operation = operation
        
class Constant():
    def __init__(self, value):
        self.value = value
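
A minimal sketch of how a tree built from these two classes could be evaluated, assuming `operation` holds one of the tokenizer's symbol names and `Constant.value` holds a number. The two classes are repeated so the block runs on its own; `OPERATIONS` and `evaluate` are hypothetical names.

```python
import operator

class Node():
    def __init__(self, left, right, operation):
        self.left = left
        self.right = right
        self.operation = operation

class Constant():
    def __init__(self, value):
        self.value = value

# Map the tokenizer's symbol names to the matching Python operations.
OPERATIONS = {'add': operator.add, 'subtract': operator.sub,
              'multiply': operator.mul, 'divide': operator.truediv,
              'power': operator.pow}

def evaluate(node):
    """Recursively evaluate a tree of Node and Constant objects."""
    if isinstance(node, Constant):
        return node.value
    return OPERATIONS[node.operation](evaluate(node.left), evaluate(node.right))

# Hand-built tree for 2 + (5 + 3) * 3 / (1 + 1)
tree = Node(Constant(2),
            Node(Node(Node(Constant(5), Constant(3), 'add'), Constant(3), 'multiply'),
                 Node(Constant(1), Constant(1), 'add'),
                 'divide'),
            'add')
print(evaluate(tree))  # 14.0
```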

### ExpressionTree class

In [11]:
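class ExpressionTree():
    def __init__(self, tokens):
        self.tokens = tokens
        self.stack = []
        self.index = 0
        self.parse()
        
    def next(self, skip = 1):
        self.index += skip
        
    def peek(self):
        # Python lists have no `.last`; index from the end instead.
        return self.stack[-1]
        
    def end(self):
        return self.index >= len(self.tokens)
    
    def nextToken(self):
        # Return the token itself rather than a one-element slice,
        # so that `.type` can be read from it.
        return self.tokens[self.index]
    
    def parse(self):
        while not self.end():
            token = self.nextToken()
            # Trace each visited token; the list wrapper matches the recorded output.
            print([token])
            if token.type == 'constant':
                self.stack.append(token)  # lists use append, not push
            elif token.type == 'symbol':
                pass  # TODO: operators and parentheses are not handled yet
            elif token.type == 'function':
                pass  # TODO: functions are not handled yet
            elif token.type == 'term':
                pass  # TODO: terms are not handled yet
            else:
                raise ValueError('Unrecognized token type: %s' % token.type)
            self.next()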

We can pass the tokens to `ExpressionTree` like so. The parser does not build the tree yet, so the output is only a trace of the tokens it visits:

In [13]:
tokenizer = Tokenizer('2+(5+3)*3/(1+1)')
expressionTree = ExpressionTree(tokenizer.tokens)

[['constant': '2']]
[['symbol': 'add']]
[['symbol': 'left-parenthesis']]
[['constant': '5']]
[['symbol': 'add']]
[['constant': '3']]
[['symbol': 'right-parenthesis']]
[['symbol': 'multiply']]
[['constant': '3']]
[['symbol': 'divide']]
[['symbol': 'left-parenthesis']]
[['constant': '1']]
[['symbol': 'add']]
[['constant': '1']]
[['symbol': 'right-parenthesis']]
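
One common way to turn such a token stream into something that respects precedence and parentheses is the shunting-yard algorithm. It is swapped in here purely as an illustration; the notebook does not say which approach it intends. The sketch handles only constants, the five binary operators and parentheses, works on `(type, value)` pairs so that it runs on its own, and uses hypothetical names throughout.

```python
import operator

PRECEDENCE = {'add': 1, 'subtract': 1, 'multiply': 2, 'divide': 2, 'power': 3}
RIGHT_ASSOCIATIVE = {'power'}
APPLY = {'add': operator.add, 'subtract': operator.sub, 'multiply': operator.mul,
         'divide': operator.truediv, 'power': operator.pow}

def to_postfix(tokens):
    """Convert infix (type, value) pairs to a postfix list of floats and operator names."""
    output, operators = [], []
    for kind, value in tokens:
        if kind == 'constant':
            output.append(float(value))
        elif value == 'left-parenthesis':
            operators.append(value)
        elif value == 'right-parenthesis':
            # Flush operators back to the matching left parenthesis.
            while operators and operators[-1] != 'left-parenthesis':
                output.append(operators.pop())
            if not operators:
                raise ValueError('Unbalanced parentheses')
            operators.pop()  # discard the left parenthesis
        elif value in PRECEDENCE:
            # Pop operators that must be applied before the incoming one.
            while (operators and operators[-1] != 'left-parenthesis' and
                   (PRECEDENCE[operators[-1]] > PRECEDENCE[value] or
                    (PRECEDENCE[operators[-1]] == PRECEDENCE[value] and
                     value not in RIGHT_ASSOCIATIVE))):
                output.append(operators.pop())
            operators.append(value)
        else:
            raise ValueError('Unsupported token: %s %s' % (kind, value))
    while operators:
        name = operators.pop()
        if name == 'left-parenthesis':
            raise ValueError('Unbalanced parentheses')
        output.append(name)
    return output

def evaluate_postfix(postfix):
    """Evaluate a postfix list with a value stack."""
    stack = []
    for item in postfix:
        if isinstance(item, float):
            stack.append(item)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(APPLY[item](left, right))
    return stack[0]

# The pairs for: 2+(5+3)*3/(1+1)
pairs = [('constant', '2'), ('symbol', 'add'), ('symbol', 'left-parenthesis'),
         ('constant', '5'), ('symbol', 'add'), ('constant', '3'),
         ('symbol', 'right-parenthesis'), ('symbol', 'multiply'), ('constant', '3'),
         ('symbol', 'divide'), ('symbol', 'left-parenthesis'), ('constant', '1'),
         ('symbol', 'add'), ('constant', '1'), ('symbol', 'right-parenthesis')]
print(evaluate_postfix(to_postfix(pairs)))  # 14.0
```

The same postfix order could be used to build `Node` and `Constant` objects instead of evaluating directly: push a `Constant` for every number and replace the top two stack entries with a `Node` for every operator.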


## Lexical analysis

In this phase we verify that the entered syntax is valid. The checks in this phase are:

* Is there more than one `=` sign in the expression?
* Does every function get the right number of arguments?
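
A minimal sketch of both checks, written against `(type, value)` pairs so that it runs on its own. The `ARITY` table is an assumption: the notebook does not state how many arguments each function takes, so `log` is allowed one or two arguments (the tokenizer example uses `log(23,5)`) and every other function exactly one. `count_equals`, `argument_count`, and `check` are hypothetical names.

```python
FUNCTIONS = ['sqrt', 'sin', 'asin', 'cos', 'acos', 'tan', 'atan', 'log', 'ln']
# Assumed arities: log takes 1 or 2 arguments, all other functions take 1.
ARITY = {name: (1,) for name in FUNCTIONS}
ARITY['log'] = (1, 2)

def count_equals(tokens):
    return sum(1 for token in tokens if token == ('symbol', 'equal'))

def argument_count(tokens, index):
    """Count the arguments of the function token at position `index`."""
    if index + 1 >= len(tokens) or tokens[index + 1] != ('symbol', 'left-parenthesis'):
        raise ValueError('Function %s must be followed by "("' % tokens[index][1])
    depth, commas, seen_any = 0, 0, False
    for token in tokens[index + 1:]:
        if token == ('symbol', 'left-parenthesis'):
            if depth > 0:
                seen_any = True  # a nested group counts as argument content
            depth += 1
        elif token == ('symbol', 'right-parenthesis'):
            depth -= 1
            if depth == 0:
                return commas + 1 if seen_any else 0
        elif token == ('symbol', 'comma') and depth == 1:
            commas += 1  # only top-level commas separate arguments
        else:
            seen_any = True
    raise ValueError('Unbalanced parentheses after %s' % tokens[index][1])

def check(tokens):
    """Return a list of error messages; an empty list means both checks passed."""
    errors = []
    if count_equals(tokens) > 1:
        errors.append('More than one "=" sign')
    for index, (kind, value) in enumerate(tokens):
        if kind == 'function':
            found = argument_count(tokens, index)
            if found not in ARITY[value]:
                errors.append('%s got %i argument(s)' % (value, found))
    return errors

# The pairs for: sin(1,2) = 3 = 4
bad = [('function', 'sin'), ('symbol', 'left-parenthesis'), ('constant', '1'),
       ('symbol', 'comma'), ('constant', '2'), ('symbol', 'right-parenthesis'),
       ('symbol', 'equal'), ('constant', '3'), ('symbol', 'equal'), ('constant', '4')]
print(check(bad))
```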

### LexicalAnalysis class

In [None]:
class LexicalAnalysis():
    def __init__(self, expressionTree):
        self.expressionTree = expressionTree