# Symbolic computation in Python

We have seen a few examples of symbolic computation in Python. 

 * Defining symbolic expressions, such as:
     - Polynomials
     - Algebraic expressions involving addition, multiplication, and standard mathematical functions like $\sin(x)$ and $e^x$.
     - [Symbolic expressions](numbers.in.python.ipynb) for algebraic number types, such as $\frac{1}{\sqrt{2}+1}$
 * [Differentiation](NA.PI-Newton.ipynb) of symbolic functional expressions. 
 
In these notes we will explore the sympy library a little further, getting an idea for what it can do, how sympy works, and what its limitations are.  


## Features of Sympy

The [features](http://www.sympy.org/en/features.html) of Sympy are vast and the scope of the library is changing rather quickly. It aims to be able to accomplish all forms of symbolic computation that *can* in principle be done by a computer. 

The qualifier in the above sentence is rather important. There are many basic algebraic tasks that are *non-computable*, in the sense that we have proofs that it is *impossible* to write a computer program that computes the answer to certain algebraic problems.  A closely related fact is that many differential equations do not have closed-form solutions, i.e. their solutions are not expressible in terms of *elementary functions*. 

These facts lead to certain unavoidable problems in symbolic computation.  
 * For certain kinds of requests, sympy will try to answer your query, but there are *no* estimates for how long it might take, *nor* how much system memory it will require to complete the task you have asked of it.  In effect sympy may or may not give you an answer to these kinds of requests. 
 * But sympy *also* has various highly-effective algorithms with solid run-time and memory-usage estimates, meaning many tasks can be accomplished reliably.  
 
We will talk a little about both sorts of algorithms available in sympy.
 
 * * * 

### Calculus: derivatives and anti-derivatives

Sympy can compute derivatives of symbolic functions, using exactly the same tools we use: 
 * The chain rule 
 * The product rule
 * A table of derivatives of elementary functions, such as $x^n$, $e^x$ and $\sin(x)$. 
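
As a quick sanity check of the rules listed above, the following minimal sketch compares sympy's derivatives against the product rule and the chain rule applied by hand. The test functions `u` and `v` are arbitrary choices made for illustration.

```python
import sympy as sp

x = sp.Symbol('x')
u = sp.sin(x)
v = sp.exp(x**2)

# Product rule: (u*v)' = u'*v + u*v'
by_hand_product = sp.diff(u, x)*v + u*sp.diff(v, x)
product_rule_ok = sp.simplify(sp.diff(u*v, x) - by_hand_product) == 0

# Chain rule: d/dx sin(x**2) = cos(x**2) * 2*x
by_hand_chain = sp.cos(x**2) * 2*x
chain_rule_ok = sp.simplify(sp.diff(sp.sin(x**2), x) - by_hand_chain) == 0

print(product_rule_ok, chain_rule_ok)
```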

In [None]:
## 1) calculus, differentiation, integration, limits.
import sympy as sp
x = sp.Symbol('x')
f = x**3 + sp.sin(x)
print(sp.diff(f,x))

In [None]:
## We can ask sympy to give a more pleasant presentation
sp.pprint(sp.diff(f,x))

In [None]:
## Check a function is a solution to a differential equation.

f = 1/(1 + sp.exp(-x))
fp = sp.diff(f,x)
print("f == "+str(f))
print("f' == "+str(fp)+"\n")

In [None]:
## f' = f(1-f) is an example of a logistic de.  Let's check that the above is a solution
print("Solution to logistic de f'=f(1-f) if: ")
print(str(f*(1-f))+" == "+str(fp)+"\n")

## They do not look equal! Let's ask sympy to check

print("Sympy thinks they are equal: "+str(fp == f*(1-f))+"\n")

## As we saw before, Sympy's "==" operator does not investigate our 
## concern very carefully -- it is essentially telling us whether or not
## the expressions look syntactically the same.  We already know they do not!
## Let's try asking sympy to think a little harder about this.

ode = fp - f*(1-f)
print("Before simplification: f'-f*(1-f) == ")
sp.pprint(ode)
print("After simplification: ")
ode_reduced = sp.simplify(ode)
sp.pprint(ode_reduced)
print("\nsp.simplify applied to the difference is zero, so the expressions are equal.")
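
The distinction between structural equality and mathematical equality can be seen on a smaller example. This is a sketch using an arbitrarily chosen pair of expressions; `equals` is sympy's own method for testing mathematical equality, and it may return `None` when it cannot decide.

```python
import sympy as sp

x = sp.Symbol('x')
lhs = (x + 1)**2
rhs = x**2 + 2*x + 1

structurally_equal = (lhs == rhs)                  # compares expression trees only
difference_is_zero = sp.simplify(lhs - rhs) == 0   # simplify the difference first
equals_says = lhs.equals(rhs)                      # tries harder than ==

print(structurally_equal, difference_is_zero, equals_says)
```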

Sympy also has the capacity to compute anti-derivatives. This is perhaps surprising since there are many functions in mathematics that *do not* have anti-derivatives that are expressible in terms of *elementary functions*.   For example, the anti-derivative of

$$f(x) = e^{-x^2}$$

cannot be expressed as a finite combination (sums, products, powers, quotients, compositions) of polynomials, exponentials, logarithms or trig functions.  

In [None]:
## let's start with some examples

f = x**5
sp.pprint(f)

In [None]:
af = sp.integrate(f, x)
sp.pprint(af)

One does not need to give concrete instances of functions to sympy -- one can ask for general rules as well.  For example:

In [None]:
a,b,c = sp.symbols('a b c')
f = a*x**b+c
sp.pprint(f)
print("")
af = sp.integrate(f,x)
sp.pprint(af)

Sympy is perfectly content giving *complicated* answers.  This is a feature of its underlying data type -- an expression tree.  Notice the type of $\int f$.
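
To make the tree structure concrete, here is a small sketch that inspects an expression with the `func` and `args` attributes and with `sp.srepr`. The expression `expr` is an arbitrary example chosen for illustration.

```python
import sympy as sp

x = sp.Symbol('x')
expr = x**2 + sp.sin(x)

# The root of the tree is the last operation applied: an addition.
print(expr.func)         # the class at the root of the tree
print(expr.args)         # the sub-trees (children) of the root

# Each child is itself a tree.
power = x**2
print(power.func, power.args)

# srepr prints the whole tree in a fully explicit form.
print(sp.srepr(expr))
```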

In [None]:
print("The anti-derivative is of type: "+str(type(af)))

In [None]:
## right!  the last operation is addition. Let's ask it about the anti-derivative of x^b
print("The anti-derivative of x**b is of type: "+str(type(sp.integrate(x**b,x))))

Let's be mean and ask sympy about $\int e^{-x^2} dx$

In [None]:
f1 = sp.exp(-x**2)
sp.pprint(f1)
fi = sp.integrate(f1, x)
sp.pprint(fi)

This is better than no information at all. Sympy is telling us that (a rescaling of) this anti-derivative is called $erf$, the <a href="http://docs.sympy.org/0.7.1/modules/mpmath/functions/expintegrals.html#erf">*error function*</a>.  Sympy can work with this function. 
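
One way to build confidence in this answer is to differentiate it and confirm that we recover the integrand. The sketch below does so; the names `integrand`, `F` and `residual` are introduced here for illustration.

```python
import sympy as sp

x = sp.Symbol('x')
integrand = sp.exp(-x**2)
F = sp.integrate(integrand, x)

# Differentiating the anti-derivative should return the integrand.
residual = sp.simplify(sp.diff(F, x) - integrand)
print(F)
print(residual)

# F involves the error function rather than only elementary functions.
print(F.has(sp.erf))
```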

In [None]:
f2 = (x**2)*fi
sp.pprint(f2)
sp.pprint(sp.integrate(f2,x))

We can similarly ask for definite integrals. 

We will request $\int_{-\infty}^\infty e^{-x^2} dx$ below.

In [None]:
defint = sp.integrate( f1, (x,-sp.oo,sp.oo))
sp.pprint(defint)
## "oo" -- two letter o's -- is sympy's notation for infinity

Let's also request $\int_{-1}^1 e^{-x^2} dx$

In [None]:
defint = sp.integrate( f1,(x,-1,1))
sp.pprint(defint)
## and if we are not content with such abstract stuff, we can request a float approximation
print("To 40 significant digits: "+str(defint.evalf(40)))

In [None]:
## Let's plot erf.
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np
%matplotlib inline

## notice we can not use "numpy" here as the erf function
## is not defined in that library.  We could also use
## mpmath here.
FI = sp.lambdify(x, fi, "math")
fig, ax1 = plt.subplots(nrows=1, figsize=(10,10))
res = 1000
dom = [4.0*(float(i)/res-1.0) for i in range(2*res)]
ax1.plot(dom, [FI(p) for p in dom], color="black")
plt.show()
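
The choice of backend in `lambdify` can be checked without plotting. The sketch below writes out the anti-derivative explicitly as `F` (rather than reusing `fi`), converts it to a plain Python function using the standard-library `math` module, and compares one value against a direct computation.

```python
import math
import sympy as sp

x = sp.Symbol('x')
F = sp.sqrt(sp.pi)*sp.erf(x)/2      # anti-derivative of exp(-x**2)

# "math" tells lambdify to translate erf, sqrt and pi using Python's math module.
F_num = sp.lambdify(x, F, "math")

value = F_num(1.0)
direct = math.sqrt(math.pi)*math.erf(1.0)/2
print(value, direct)
```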

In [None]:
# A small variant on the above, let's try to graph the anti-derivative of sin(x^2)
f3 = sp.sin(x**2)
fi = sp.integrate(f3,x)
sp.pprint(fi)

In [None]:
fig, ax1 = plt.subplots(nrows=1, figsize=(10,10))
res = 1000
dom = [8.0*(float(i)/res-1.0) for i in range(2*res)]
## the fresnel function is not implemented in numpy or math, so
## we need to use mpmath
FI = sp.lambdify(x, fi, "mpmath")
ax1.plot(dom, [FI(p) for p in dom], color="black")
plt.show()

From the above graph it would appear the Fresnel function has a pair of horizontal
asymptotes. We can ask sympy to check this. 


In [None]:
lim = sp.limit(fi,x, sp.oo)
print( lim )
## pretty?
sp.pprint( lim )
## and a float?
print("As a float (mpf): "+str(lim.evalf(40)))

### Keep in mind sympy can fail!  Let's look at what that looks like.

We will ask it to evaluate a slightly more complicated integral:

$$ \int e^{e^{-x^2}} dx $$

In [None]:
sp.pprint( sp.integrate(sp.exp(sp.exp(-x**2)),x) )

That is, when sympy fails on these kinds of requests, it **gives up** and returns your original input *unchanged*, as an unevaluated integral. 
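
In a program it is useful to detect this failure rather than read it off the screen. The sketch below assumes that a failed request comes back as an unevaluated `Integral` object, as described above. The helper `found_antiderivative` is a hypothetical name introduced for illustration, and the unevaluated integral is constructed directly so that the example runs quickly.

```python
import sympy as sp

x = sp.Symbol('x')

def found_antiderivative(result):
    """Hypothetical helper: True when no unevaluated Integral remains in result."""
    return not result.has(sp.Integral)

# A request sympy can handle.
easy = sp.integrate(x**2, x)

# An unevaluated integral, built directly instead of waiting for integrate() to give up.
hard = sp.Integral(sp.exp(sp.exp(-x**2)), x)

print(found_antiderivative(easy), found_antiderivative(hard))
```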

What is going on *under the hood* is that sympy is using something called the <a href="https://en.wikipedia.org/wiki/Risch_algorithm">**Risch algorithm**</a>. Technically sympy has developed an extension of the Risch algorithm, and sympy's extension is what is known as a *semi-algorithm*.  As far as I am aware, every software package that can compute symbolic integrals (Mathematica, Matlab, Maple, etc.) uses a variant of the Risch algorithm, which similarly boils down to a careful application of <a href="https://en.wikipedia.org/wiki/Liouville%27s_theorem_(differential_algebra)">Liouville's Theorem in differential algebra</a>. 

**Liouville's Theorem** states that if $f(x)$ has an anti-derivative that is *elementary* (built *recursively* from sums, products, quotients and composites of exponentials, logarithms, polynomials and trig functions) then there is an expression for the function $F$ with $F'=f$ of the form:

$$ F(x) = v(x) + \sum_{i=1}^n c_i \ln(g_i(x)) $$

where the $c_i$ are constants, and $v(x)$ and the functions $g_1(x),\cdots,g_n(x)$ are elementary functions expressible entirely in terms of sums, products, and quotients of $f(x)$ and polynomials.  

eg: $$\frac{e^{x^2+2}+2}{\ln(x)+x^8+\sin(e^x+x^2)}$$
is an elementary function.

The Risch algorithm goes one step further and reduces the number of possibilities one has to consider to a finite number.  In Risch's original paper he confined himself to a fairly simple class of elementary functions on which his work provides a genuine algorithm. Sympy, on the other hand, has looser constraints than Risch's original paper and so it does not know how to solve this problem for every possible input. Sympy can fail, which is why it is called a semi-algorithm.
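
Sympy exposes its implementation of the decidable part of the Risch algorithm as `risch_integrate`. As a hedged sketch: for integrands in the class it handles, this function returns a `NonElementaryIntegral` when it has proved that no elementary anti-derivative exists, which is a stronger statement than simply giving up. The two integrands below are chosen for illustration.

```python
import sympy as sp
from sympy.integrals.risch import risch_integrate, NonElementaryIntegral

x = sp.Symbol('x')

# An integrand with an elementary anti-derivative.
elementary = risch_integrate(x*sp.exp(x), x)

# An integrand whose anti-derivative is provably not elementary.
non_elementary = risch_integrate(sp.exp(-x**2), x)

print(elementary)
print(type(non_elementary).__name__)
```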

When you provide sympy with a *difficult* integral that it does not know how to handle immediately, it spends some time *searching* for an answer, but if it does not find one quickly it gives up. 

Sympy's extensions to the Risch algorithm cover some of the most commonly used non-elementary anti-derivatives, such as $erf$ and the Fresnel functions.

## Asking sympy to solve equations

Sympy has some fairly sophisticated algorithms to solve polynomial equations. It uses this intelligence for solving polynomial equations to build tools to solve (symbolically) a wide array of equations, even ones that are not polynomial. 

Sympy can:

 * Factor polynomials.
 * Find roots of polynomials, symbolically as well as numerically. 
 * Solve (symbolically as well as numerically) simultaneous polynomial equations.
 * Solve simultaneous equations that are not polynomial
     - sympy can *sometimes* do this symbolically
     - can usually do this numerically, using a variety of methods, including the multi-variable Newton's method.  The numeric methods are not always guaranteed to find all solutions. If you *need* all solutions you might have to use more specialized methods. 
 * Sympy's polynomial equation solvers, on the other hand, will always find good approximations to all the solutions.
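
The first two items in the list above can be sketched on a single cubic. The polynomial `p` is an arbitrary example with known roots; `sp.roots` returns exact roots with their multiplicities and `Poly.nroots` returns numerical approximations.

```python
import sympy as sp

x = sp.Symbol('x')
p = x**3 - 2*x**2 - x + 2          # equals (x - 1)*(x + 1)*(x - 2)

factored = sp.factor(p)
exact_roots = sp.roots(p, x)       # dict mapping root -> multiplicity
numeric_roots = sp.Poly(p, x).nroots()

print(factored)
print(exact_roots)
print(numeric_roots)
```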


In [None]:
## Let's give an example of factoring polynomials

p = x**2 - 2*x + 1
sp.factor(p)

In [None]:
## Sympy is okay with multi-variable polynomials
y = sp.Symbol('y')

p = x*y - x - y + 1
sp.factor(p)

In [None]:
## We can also ask to factor in a variety of ways

p = x*x - 2
sp.pprint(sp.factor(p))
## the above is factoring using rational numbers i.e. the output is telling us you can 
## not factor this polynomial into a product of two polynomials with rational coefficients.

sp.pprint(sp.factor(p, extension=sp.sqrt(2)))
## here we have allowed sympy to use the square root of 2 in the coefficients,
## so it gives us what we expect.

In [None]:
## a few more examples of sympy's factoring code
p = x*x + 2
sp.pprint(sp.factor(p))
## here we allow sqrt(2)i in the coefficients
sp.pprint(sp.factor(p, extension=sp.sqrt(-2)))
p = x*x + 1
## and here we allow "Gaussian integers" i.e. complex
## numbers of the form a+bi with a and b integers.
sp.pprint(sp.factor(p, gaussian=True))

The nice thing about the previous algorithms is that they always work.  The weak aspect is that factoring is not particularly flexible if your goal is to know the roots of the polynomial in the real line or complex plane. To find the roots, we can use sympy's solve function.

In [None]:
p = x*x - 2
sp.pprint(sp.solve(p,x))

p = x*x + 2
sp.pprint(sp.solve(p,x))

### sympy solve

The general format is to call $solve(f,x)$ where $f$ is a sympy expression -- a function -- and $x$ is the variable used by that function.  Sympy will attempt to find all the solutions to the equation
$$f(x) = 0$$

The solve algorithm runs out of steam fairly quickly. See the next three computations.


In [None]:
p = a*x**2 + b*x + c
sp.pprint(p)
sp.pprint(sp.solve(p,x))

In [None]:
## a general cubic
d,e = sp.symbols('d e')
p = a*x**3 + b*x**2 + c*x + d
sp.pprint(p, use_unicode=True)
sp.pprint(sp.solve(p,x), use_unicode=True)

In [None]:
## a general quartic. 

## Warning: on my less powerful laptop running Python 2, this request terminates
##  in failure quickly.  On my powerful laptop running Python 3, this request 
##  is taken "seriously" and consumes significant processor time. 
p = a*x**4 + b*x**3 + c*x**2 + d*x + e
sp.pprint(p)
sp.pprint(sp.solve(p,x))

While there is a closed-form solution to the roots of a degree 4 polynomial, sympy's *solve* routine may fail to derive it, or may take a very long time to do so.  

On the other hand, it is known (the Abel-Ruffini theorem) that there is no general *closed form* in terms of radicals for the roots of a polynomial of degree $5$ or higher.  So one has to resort to other methods.
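
Even without a formula in radicals, sympy can represent and approximate such roots. The sketch below uses an arbitrarily chosen quintic `q`; `real_roots` returns exact root objects that can be evaluated to any precision, and `nroots` returns numerical approximations to all five complex roots.

```python
import sympy as sp

x = sp.Symbol('x')
q = x**5 - x + 1

# Exact (but implicit) representation of the real roots.
real = sp.real_roots(q)
print(real)
print(real[0].evalf(30))

# Numerical approximations to all five complex roots.
approx = sp.Poly(q, x).nroots()
residuals = [abs(complex(r)**5 - complex(r) + 1) for r in approx]
print(approx)
print(max(residuals))
```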

In [None]:
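## let's leave sympy and find the roots numerically
p = x**3 + x**2 - x - 1
## all_coeffs() keeps zero coefficients, which numeric root-finders need;
## coeffs() would silently drop them.
P = sp.Poly(p,x).all_coeffs()
print("The extracted coefficients: "+str(P))
## Now switch to a harder example with a triple root:
## x**4 - 12*x**3 + 48*x**2 - 64*x == x*(x-4)**3
P=[1, -12,48,-64, 0]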

In [None]:
## Both numpy and mpmath have such algorithms
print(str(np.roots(P))+"\n")

## mpmath's
## sympy.mpmath was removed in sympy 1.0; mpmath is now a separate
## package, so we import it directly.
import mpmath as mp2
print(str(mp2.polyroots(P))+"\n")

## clean up the output
mp2.mp.pretty = True
print(str(mp2.polyroots(P))+"\n")

## up the precision
## dps counts decimal digits; prec counts bits (prec is roughly 3.33 times dps).
## Setting dps updates prec automatically, so we only set one of them.
mp2.mp.dps += 10
print(str(mp2.polyroots(P))+"\n")

## roots

The mpmath *polyroots* and numpy *roots* functions are designed to return approximations to all the roots of your polynomial equation, although accuracy can degrade near repeated roots.  

* * * 

While we're at it, notice mpmath can be used to solve multi-variable equations.

Here we ask mpmath to solve
$$(x^2+y^2, xy) = (4, 1)$$

It is using a multi-variable Newton method.
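
To show what a multi-variable Newton iteration looks like, here is a minimal numpy sketch for this particular system. It is not mpmath's implementation: the Jacobian is written out by hand, and the function names `F`, `J` and `newton`, as well as the starting point, are assumptions made for illustration.

```python
import numpy as np

def F(v):
    """The system written as F(x, y) = (0, 0)."""
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x*y - 1.0])

def J(v):
    """Jacobian matrix of F, computed by hand."""
    x, y = v
    return np.array([[2*x, 2*y], [y, x]])

def newton(v0, steps=50, tol=1e-12):
    """Newton iteration: solve J(v) * delta = -F(v), then update v."""
    v = np.array(v0, dtype=float)
    for _ in range(steps):
        delta = np.linalg.solve(J(v), -F(v))
        v = v + delta
        if np.linalg.norm(delta) < tol:
            break
    return v

root = newton((2.0, 0.5))
print(root, F(root))
```

Like any Newton method, this finds only one solution, and which one depends on the starting point.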

In [None]:
f = [lambda x,y: x**2 + y**2 - 4, lambda x, y: x*y-1.0 ]
## The mpmath findroot algorithm can take lists of multi-variable functions. 
## i.e. it prefers to think of a vector-valued function as a list of functions
## and solving the equation f(x,y) = (0,0) it thinks as solving the list of
## equations simultaneously. 

#print(f[0](1,1),f[1](1,1))
# as it looks, f is a list of callable 2-variable functions

## and the findroot call
roots = mp2.findroot(f,(0.2,0.1))
print("roots: "+str(roots))
## let's check it is a solution
print("check sol: "+str(f[0](roots[0],roots[1])) + ", " +str(f[1](roots[0],roots[1])) )
print("yes!")

We can similarly ask sympy to solve systems of equations.

In [None]:
sol = sp.solve([x**2 + y**2 - 4, x*y-1 ])
print(type(sol))
print(len(sol))

In [None]:
for i in range(len(sol)): 
    print("Solution " + str(i))
    sp.pprint(sol[i])
    print("\n")
## sol[i] is a dict object
print("Solutions are dictionary type: "+str(type(sol[0])))

In [None]:
## This is how we can access them. 
print("x-coord of solution 0: "+str(sol[0][x].evalf(10)))
print("y-coord of solution 0: "+str(sol[0][y].evalf(10)))

In [None]:
## And we can visualize the solutions with matplotlib
fig, ax = plt.subplots(figsize=(10,10))
circle = plt.Circle( (0,0), 2, color='r', fill=False)

x1 = np.linspace(0.2, 4)
y1 = x1**(-1)
ax.plot(x1, y1,'-')

x2 = np.linspace(-4,-0.2)
y2 = x2**(-1)
ax.plot(x2, y2,'-')

ax.set_title('Visualizing the above solutions', fontsize=18)

## the root we found with Newton's method, a big blue x.
ax.plot([roots[0]], [roots[1]], 'bx', markersize=28)

## the roots we found with sympy, yellow dots (converted to Python floats for matplotlib).
ax.plot([float(s[x]) for s in sol], [float(s[y]) for s in sol],'yo', markersize=10)

fig.gca().add_artist(circle)
ax.set_xlim(-3,3)
ax.set_ylim(-3,3)
plt.show()

The next example gives you a sense for how sympy solves complicated algebraic equations. 

Notice the equation below is basically the previous equation but with $x$ replaced by $\cos(x)$ and $y$ replaced by $\sin(y)$. 

This gives you a key insight into how sympy solves equations. It has strong *core* routines for solving polynomial equations, which it uses as a fundamental building block for solving more complicated algebraic equations.
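
The idea can be imitated by hand on a one-variable example. In this sketch (the equation and the auxiliary symbol `u` are chosen purely for illustration) we solve a polynomial in $u=\cos(x)$ and then apply $\cos^{-1}$ to each solution, which recovers one solution $x$ per value of $u$.

```python
import sympy as sp

x, u = sp.symbols('x u')

# 2*cos(x)**2 - 3*cos(x) + 1 = 0 is a quadratic in u = cos(x).
u_values = sp.solve(2*u**2 - 3*u + 1, u)
x_values = [sp.acos(val) for val in u_values]
print(u_values)
print(x_values)

# Check each x against the original trigonometric equation.
equation = 2*sp.cos(x)**2 - 3*sp.cos(x) + 1
checks = [sp.simplify(equation.subs(x, val)) for val in x_values]
print(checks)
```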

In [None]:
sol = sp.solve([sp.cos(x)**2 + sp.sin(y)**2 - 4, sp.cos(x)*sp.sin(y)-1 ])
print(type(sol))
print(len(sol))
for i in range(len(sol)): 
    print("Solution " + str(i))
    sp.pprint(sol[i])
    print("\n")  

## Summary

Sympy has a systematic way of solving algebraic equations.  

 * At its core is a strong semi-algorithm to solve systems of polynomial equations.
     - An example system: $x^3+3xy+1=0 = y^4-x^2y+x+4 $
     - To solve such equations it uses a **Groebner basis algorithm** to convert such systems of *multi-variable polynomial* equations into systems of *single-variable* polynomial equations. This is a reliable algorithm, although it can be slow (double-exponential run-time estimates).
     - Sympy does not know how to write "closed-form" solutions to all single variable polynomial equations, but it has a significant repository of closed-form solutions, as we have seen. 
     - As a side-note, it is known that the solutions to a general degree $5$ polynomial equation $a_5x^5+a_4x^4+a_3x^3+a_2x^2+a_1x+a_0=0$ cannot be expressed (as they can in the degree 1 through 4 cases) using rational functions and rational powers (applied recursively) of the coefficients $a_5,\cdots,a_0$.   So one can not hope for simple formulas for such roots.

 * For more complicated systems, such as $\sin(x)^3+3\sin(x)\cos(y)+1=0=\sin(y)^4-\cos(x)^2+\cos(x)+4$ sympy thinks of the system as a polynomial equation in the variables $\sin(x), \cos(y)$ and computes $\sin^{-1}$ and $\cos^{-1}$ of the solutions.
 
 * Sympy also has adapted versions of the above algorithms to find closed-form solutions to some differential equations.  We will talk more about this when we discuss simulating solutions to differential equations.  One particularly simple case appears in the next section.
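
The Groebner basis step mentioned in the list above can be sketched with `sp.groebner`. Here we reuse the system $x^2+y^2=4$, $xy=1$ from earlier; with the lexicographic order, the basis for this system contains a polynomial in $y$ alone, which can then be passed to a single-variable solver.

```python
import sympy as sp

x, y = sp.symbols('x y')
system = [x**2 + y**2 - 4, x*y - 1]

# Lexicographic order with x > y: for this system, x is eliminated
# from one of the basis elements.
G = sp.groebner(system, x, y, order='lex')
print(G.exprs)

# Pick out the basis elements involving y only.
univariate = [g for g in G.exprs if g.free_symbols == {y}]
print(univariate)
print(sp.solve(univariate[0], y))
```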

### Sympy and linear algebra

We've seen how one can solve systems of linear equations using numpy's matrix facilities. Sympy has the same abilities but it can also do a few things numpy can not. 

For instance, a standard family of ODEs that arise are called **linear first-order differential equations**. They have the form:

$$\vec v'(t) = A \vec v(t)$$

Here $\vec v : \mathbb R \to \mathbb R^n$ is a time-varying vector and $A$ is an $n \times n$ matrix.  The solutions to these differential equations are:

$$\vec v(t) = \exp(tA) \vec v_0$$
where $\vec v_0 = \vec v(0)$ and the matrix exponential $\exp(A)$ is 
$$\exp(A) = \sum_{k=0}^\infty \frac{1}{k!} A^k$$

Sympy is perfectly happy computing these matrix exponentials in closed form.  In *math 342* we teach students how to do this by using *ideas*. 
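
As a check on the solution formula, the following sketch computes $\exp(tA)$ for one small matrix and verifies that it satisfies the differential equation and the initial condition. The matrix `A` and the initial vector `v0` are arbitrary choices made for illustration.

```python
import sympy as sp

t = sp.Symbol('t')
A = sp.Matrix([[-1, 0], [1, -1]])
E = (t*A).exp()                      # the matrix exponential exp(tA)
sp.pprint(E)

# exp(tA) should satisfy E'(t) = A*E(t) and E(0) = I.
ode_residual = (E.diff(t) - A*E).applyfunc(sp.simplify)
print(ode_residual)
print(E.subs(t, 0))

# The solution with initial condition v0.
v0 = sp.Matrix([1, 2])
v = E*v0
print(v.T)
```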

In [None]:
t=sp.Symbol('t')

A=sp.Matrix(2,2,[1,0,0,1])
print("A == "+str(A)+"\n")
eA = sp.exp(t*A)
print("exp(tA)\n")
sp.pprint(eA)

A=sp.Matrix(2,2,[-1,0,1,-1])
print("A == "+str(A)+"\n")
eA = sp.exp(t*A)
print("exp(tA)\n")
sp.pprint(eA)


In [None]:
## we need to tell sympy t is a real variable otherwise...
t=sp.Symbol('t',real=True)
#t=sp.Symbol('t')
B=sp.Matrix(4,4,[1,0,0,0, 1,1,0,0, 0,0,1,1, 0,0,-1,1])
eB = sp.exp(t*B)
print("B == "+str(B)+"\n")
## when we expand e^(x+iy) = e^x (cos(y) + i sin(y)) in the argument
## on the next line, it thinks t might have both a real and an imaginary part,
## resulting in a true but not-fully-informative expression.
sp.init_printing(use_latex=True)

eB=sp.simplify(eB).expand(complex=True)
sp.pprint(eB)

## Plotting in sympy

Sympy has a variety of 2d and 3d plotting tools.  We demonstrate a few below. 

My first impression is they are not as well-developed as one might hope. 

In [None]:
## Parametric surfaces
from sympy.plotting import plot3d_parametric_surface
plot3d_parametric_surface(x*sp.cos(t), x*sp.sin(t), t, 
                          (x,0.5,1.5), (t, 0, 20),
                          title="spiral")

In [None]:
## implicit curves in the plane. 
## here we plot two curves together, 
## we call the individual plots, set them to the variables
## curve1 and curve2, and we ask sympy not to display them when
## they are created. We then merge them with the "extend" command, 
## and display
from sympy.plotting import plot_implicit
curve1 = plot_implicit(x**12+y**12 - 1,(x,-1,1),(y,-1,1), 
              adaptive=False, points=1600, 
              xlabel="1st axis", ylabel="2nd axis", show=False)
curve2 = plot_implicit(x**2+y**2 - 1,(x,-1,1),(y,-1,1), 
              adaptive=False, points=1600, 
              xlabel="1st axis", ylabel="2nd axis", show=False)
curve1.extend(curve2)
curve1.show()

In [None]:
## Let's plot a tire tube...
bigR = 1.0
smR = 0.2
plot3d_parametric_surface(
    (bigR+smR*sp.cos(x))*sp.cos(t),     (bigR+smR*sp.cos(x))*sp.sin(t),     smR*sp.sin(x), 
    (x,0, 6.3), (t, 0, 6.3), 
    title="Tire Tube", xlabel="1st axis", ylabel="2nd axis")

My impression is that sympy's plotting routines are (at present) a little primitive, and you are better off plotting with matplotlib or other more sophisticated libraries. 

See the [visualization](visualization.ipynb) notebook for more information on data visualization libraries.