# Advanced Expression Manipulation

SymPy has a lot of functions to simplify and manipulate expressions. But sometimes there isn't a function that does exactly what you want. This section will go over how SymPy expressions work internally so that you can write advanced expression manipulation functions of your own.

## Args and Expression Trees

In [1]:
from sympy import *
x, y, z = symbols('x y z')

Every SymPy expression is built up as a tree, starting from the innermost expressions. Every SymPy operation, such as addition, multiplication, and exponentiation, as well as functions like `sin` or `cos`, is a node in the expression tree.

Consider the expression $$x^2 + xy.$$ As an expression tree, it looks like this:


![expression-tree](images/expression-tree.svg)

We can see this with the `srepr` function in SymPy, which prints an expression in long form:

In [2]:
expr = x**2 + x*y
srepr(expr)

"Add(Pow(Symbol('x'), Integer(2)), Mul(Symbol('x'), Symbol('y')))"

`Add`, `Pow`, and `Mul` are classes representing addition, exponentiation, and multiplication. The arguments of each class are its operands: the terms of the sum, the base and exponent of the power, or the factors of the product. You can see the node type of an expression using `.func` and its subnodes using `.args`.

In [3]:
expr.func

sympy.core.add.Add

In [4]:
expr.args

(x**2, x*y)

In [5]:
expr.args[0].func

sympy.core.power.Pow

In [6]:
expr.args[0].args

(x, 2)

By drilling down into `.args`, we can reach every part of the expression. Every expression in SymPy can be rebuilt from just its `.func` and `.args`:

```
expr.func(*expr.args)
```

In [7]:
expr

x**2 + x*y

In [8]:
expr.func(*expr.args)

x**2 + x*y

Also note that numbers, such as `2`, are wrapped with `Integer()`. This is because every node in a SymPy expression tree must be an instance of a SymPy class. The Python int `2` does not have any SymPy methods on it, and doesn't behave the way we would want it to. For example, if we divide two Python integers, we get a float, whereas in SymPy we want a rational number.

In [9]:
a = 1
b = 2
a/b

0.5

In [10]:
a = Integer(1)
b = Integer(2)
a/b

1/2

You usually don't need to care about this because when you create an expression, Python numbers are wrapped with the appropriate SymPy types automatically. 

In [11]:
expr2 = x + 2

In [12]:
expr2.args

(2, x)

In [13]:
type(expr2.args[0])

sympy.core.numbers.Integer
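The automatic wrapping only happens once a SymPy object takes part in the operation. As a small sketch (the variable names `inexact` and `exact` are illustrative and not part of the original notebook), a Python-level division such as `1/2` is evaluated by Python first and enters the expression as a `Float`, whereas `Rational(1, 2)` keeps the value exact:

```python
from sympy import Float, Integer, Rational, Symbol, sympify

x = Symbol('x')

# Python evaluates 1/2 to the float 0.5 before SymPy ever sees it.
inexact = x + 1/2
# Building the number with Rational keeps it exact.
exact = x + Rational(1, 2)

print(inexact.atoms(Float))     # the set of Float leaves in the tree
print(exact.atoms(Rational))    # the set of Rational leaves in the tree

# sympify converts a Python object into its SymPy counterpart explicitly.
print(type(sympify(2)) is Integer)
```

In both cases, whatever ends up in the tree is a SymPy object; the only difference is which SymPy number type it is.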

For our purposes, the wrapping matters: it means that whenever you look at an object's `.args`, you can always be sure that every element of it is itself a SymPy object. In particular, every element of `.args` will itself have `.args`. Nodes at the bottom of the expression tree (or "atomic" nodes) have empty `.args`:

In [14]:
expr2.args

(2, x)

In [15]:
expr2.args[0].args

()

In [16]:
expr2.args[1].args

()
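Because every node has `.args`, and leaves simply have an empty tuple, a recursive function can walk a tree without any type checks. The following is a minimal sketch; the helper names `count_nodes` and `leaves` are hypothetical and are not SymPy functions:

```python
from sympy import symbols

x, y = symbols('x y')

def count_nodes(expr):
    """Count every node in the expression tree, leaves included."""
    # A leaf has empty .args, so the sum is over nothing and the result is 1.
    return 1 + sum(count_nodes(arg) for arg in expr.args)

def leaves(expr):
    """Return the atomic nodes of expr as a list (duplicates kept)."""
    if not expr.args:
        return [expr]
    found = []
    for arg in expr.args:
        found.extend(leaves(arg))
    return found

expr = x**2 + x*y
print(count_nodes(expr))   # 7: Add, Pow, x, 2, Mul, x, y
print(leaves(expr))
```

The order of the leaves follows the order of `.args`, which is not necessarily the printed order.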

Note that the `args` structure of an expression will not necessarily match the way it is printed. One example of this is that the `args` may be in a different order (SymPy stores the `args` of `Add` and `Mul` in its own sorted order, independent of how you typed them).

In [17]:
x + 1

x + 1

In [18]:
(x + 1).args

(1, x)

A more pertinent example comes from expressions like $x - y$ or $\frac{x}{y}$.

In [19]:
srepr(x - y)

"Add(Symbol('x'), Mul(Integer(-1), Symbol('y')))"

In [20]:
srepr(x/y)

"Mul(Symbol('x'), Pow(Symbol('y'), Integer(-1)))"

In [21]:
srepr(1/x)

"Pow(Symbol('x'), Integer(-1))"

SymPy doesn't have classes for subtraction or division. A subtraction $x - y$ is represented as $x + (-1)\cdot y$, and a division $x/y$ is represented as $x\cdot y^{-1}$. This may seem unusual at first, but it makes working with expressions much more uniform. Whether something is displayed as a "subtraction" or "division" is up to the printers.
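One consequence is that code inspecting a tree never sees a "minus" or "divide" node and has to recognise those shapes itself. As a rough sketch, assuming we only care about products that contain a factor raised to a negative numeric power (the helper `has_denominator` is hypothetical, not a SymPy function):

```python
from sympy import Add, Mul, Pow, symbols

x, y = symbols('x y')

difference = x - y
quotient = x / y

# Both are ordinary Add and Mul nodes underneath.
print(difference.func is Add)
print(quotient.func is Mul)

def has_denominator(expr):
    """True if expr is a Mul with a factor that is a negative numeric power."""
    return expr.func is Mul and any(
        arg.func is Pow and bool(arg.args[1].is_negative)
        for arg in expr.args
    )

print(has_denominator(x / y))
print(has_denominator(x * y))
```

This check is deliberately narrow: `1/x` is a bare `Pow` and `x/2` is `Mul(Rational(1, 2), x)`, so the helper reports `False` for both.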

### Exercises

#### Nested args

Use nested `.args` calls to get the 3 in `expr`. (Remember that you can use `Control-Enter` to execute a cell without moving to the next cell.)

In [22]:
expr = x**2 - y*(2**(x + 3) + z)

#### Creating expressions from classes

Create the following objects without using any mathematical operators like `+`, `-`, `*`, `/`, or `**` by explicitly using the classes `Add`, `Mul`, and `Pow`.  You may use `x` instead of `Symbol('x')` and `4` instead of `Integer(4)`.

$$x^2 + 4xyz$$
$$x^{(x^y)}$$
$$x - \frac{y}{z}$$


## Traversal

Now that we've seen how to unpack an expression, we can build basic traversal algorithms. The simplest way to build an algorithm in SymPy is to write a function recursively. 

Two things are important to remember:

1. **SymPy expressions are immutable.** If we want to change something about an expression, we need to create a new expression. 
2. **SymPy expressions can be compared with `==`.** `==` always tests exact structural equality. `a == b` always evaluates to `True` or `False`, and it is `True` if and only if `a` and `b` have exactly the same structure as expressions (for symbolic equality, use `Eq`).
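Both points can be checked in a few lines. This is a small illustrative sketch; the names `a`, `b`, and `changed` are chosen here for the example:

```python
from sympy import expand, symbols

x = symbols('x')

a = (x + 1)**2
b = x**2 + 2*x + 1

# Equal as polynomials, but the trees differ, so == is False.
print(a == b)
# After expanding, the two structures match.
print(expand(a) == b)

# subs returns a new expression; a itself is left untouched.
changed = a.subs(x, 2)
print(changed)
print(a)
```

The last two lines print `9` and `(x + 1)**2`, showing that `subs` built a new object rather than modifying `a`.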

For example, let's look at how we might write a basic function that replaces `x` with `y` (normally you would do this with `subs`, `xreplace`, or `replace`).

In [23]:
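def replace(expr, x, y):
    """Return expr with every occurrence of x replaced by y."""
    # A node that matches x exactly is swapped out as a whole
    if expr == x:
        return y

    # Base case of the recursion: atomic nodes have no args to rebuild
    if not expr.args:
        return expr

    # Rebuild this node from its (possibly replaced) children
    newargs = [replace(arg, x, y) for arg in expr.args]
    return expr.func(*newargs)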

In [24]:
expr

x**2 - y*(2**(x + 3) + z)

In [25]:
replace(expr, x, 2*x)

4*x**2 - y*(2**(2*x + 3) + z)
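As a sanity check, the hand-written function can be compared against SymPy's built-in `xreplace`, which also performs exact structural replacement. The sketch below restates the recursive function compactly (with the parameters renamed to `old` and `new` for readability) so that it runs on its own:

```python
from sympy import symbols

x, y, z = symbols('x y z')

def replace(expr, old, new):
    # Same logic as the function above, written compactly.
    if expr == old:
        return new
    if not expr.args:
        return expr
    return expr.func(*[replace(arg, old, new) for arg in expr.args])

expr = x**2 - y*(2**(x + 3) + z)

by_hand = replace(expr, x, 2*x)
built_in = expr.xreplace({x: 2*x})

# Structural comparison of the two results.
print(by_hand == built_in)
```

In practice the built-in methods are preferable; the recursive version is mainly useful as a template for transformations that SymPy does not provide.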

### Exercise

Write a [post-order traversal](https://en.wikipedia.org/wiki/Tree_traversal#Post-order,_LRN) function that prints each node.

In [26]:
def post(expr):
    """
    Post-order traversal

    >>> expr = x**2 - y*(2**(x + 3) + z)
    >>> post(expr)
    -1
    y
    2
    3
    x
    x + 3
    2**(x + 3)
    z
    2**(x + 3) + z
    -y*(2**(x + 3) + z)
    x
    2
    x**2
    x**2 - y*(2**(x + 3) + z)
    """


In [27]:
post(expr)