
<div>These notes are based on Prof. Norman Wildberger's YouTube video on Polynomial Equations found <a href="https://www.youtube.com/watch?v=XHC1YLh67Z0&list=PLzdiPTrEWyz7hk_Kzj4zDF_kUXBCtiGn6&index=1">here</a>. They are being hosted at <a href="https://www.ladatavita.com/">ladatavita.com</a> and available from my Github repo at: <a href="https://github.com/jgab3103/Jamie-Gabriel">https://github.com/jgab3103/Jamie-Gabriel</a></div>

### Solving Polynomial Equations: 1

In [1]:
import sympy as sp
from IPython.core.display import HTML

<div>This video introduces the idea of finding roots of polynomials, and takes us on a tour, with examples, of the more classical ways that polynomials can be solved (meaning via the work of del Ferro, Tartaglia and Cardano). Things start out simple enough, but we quickly see that finding roots of equations beyond the quadratic case (such as cubic, quartic etc.) can become difficult indeed.</div>

Start by creating some variables:

In [2]:
a, b, x = sp.symbols('a, b, x')

Now create a very simple equation and get a solution:

In [3]:
r1 = sp.Eq(a + b*x, 0)
r1

Eq(a + b*x, 0)

In [4]:
s1 = sp.solve(r1, x)[0]
s1

-a/b

No issues so far! Time to examine something slightly more complicated:

In [5]:
r2 = sp.Eq(2 - 3 * x + 5*x**2, 0)
r2

Eq(5*x**2 - 3*x + 2, 0)

Now let's solve it: 

In [6]:
s2 = sp.solve(r2, x)
sp.pretty_print(s2)
# SymPy Note: SymPy often provides nice LaTeX output, but this doesn't happen with lists of solutions, so we will use 
# the SymPy pretty_print function

⎡3    √31⋅ⅈ  3    √31⋅ⅈ⎤
⎢── - ─────, ── + ─────⎥
⎣10     10   10     10 ⎦


So finding the roots for this is starting to get difficult - and the roots have complex numbers! 
<br/>

Note that the complex numbers arise because of the discriminant of the quadratic equation. If it is negative, 
the quadratic formula takes the square root of a negative number. Let's verify this on the quadratic equation by using SymPy's discriminant function:

In [7]:
sp.discriminant(r2) < 0

True
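
As a small cross-check, the discriminant of $ax^2 + bx + c$ is $b^2 - 4ac$, so the sign test can be reproduced by hand. The sketch below is self-contained; the names `quad`, `qa`, `qb`, `qc` and `disc_by_hand` are illustrative and not part of the original notebook.

```python
import sympy as sp

x = sp.symbols('x')
quad = 5*x**2 - 3*x + 2  # left-hand side of the quadratic r2

# Read off the coefficients of x**2, x**1 and x**0 (highest power first)
qa, qb, qc = sp.Poly(quad, x).all_coeffs()

disc_by_hand = qb**2 - 4*qa*qc          # b^2 - 4ac
disc_sympy = sp.discriminant(quad, x)   # SymPy's built-in function

print(disc_by_hand, disc_sympy)  # both are -31 for this quadratic
```

Passing the left-hand side expression (rather than the `Eq` object) keeps the call independent of how SymPy treats equations.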

We could of course avoid this complex roots issue by just ensuring we don't get a negative discriminant (although this will not get us very far!): 

In [8]:
r3 = sp.Eq(2 - 9 * x + 5*x**2, 0)
r3

Eq(5*x**2 - 9*x + 2, 0)

We get the following solution: 

In [9]:
s3 = sp.solve(r3, x, dict=True)
sp.pretty_print(s3)

⎡⎧   9    √41⎫  ⎧   √41   9 ⎫⎤
⎢⎨x: ── - ───⎬, ⎨x: ─── + ──⎬⎥
⎣⎩   10    10⎭  ⎩    10   10⎭⎦


Even without the complex numbers, we still see solutions in standard form involving surds, coming from the application of the quadratic formula. The problem with surds is that we often cannot get from them to a complete solution. Think of this as the $ \sqrt{2} $ problem, where we have not really got an answer, but a label for an algorithm that can be run forever and will only ever provide an approximation. 
<br/><br/>
Let's see this approximation in action, and generate a numerical solution that is an approximation to a limited number of places: 


In [10]:
[print(i[x].n()) for i in s3]

0.259687576256715
1.54031242374328


[None, None]
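
The $ \sqrt{2} $ point can be made concrete with a short experiment. The sketch below uses the Babylonian (Newton) iteration with exact fractions; that choice of algorithm is an assumption made here for illustration and is not taken from the video. Every step gives a better rational estimate, but squaring it never gives exactly 2.

```python
from fractions import Fraction

# Babylonian / Newton iteration for sqrt(2), using exact rational arithmetic
estimate = Fraction(1)
for step in range(5):
    estimate = (estimate + 2 / estimate) / 2
    # The gap estimate**2 - 2 shrinks quickly but stays non-zero
    print(step + 1, estimate, float(estimate**2 - 2))
```

However many steps are run, the result is a fraction, so it can only ever approximate the surd.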

So it seems that for many quadratics, we cannot solve them beyond getting a numerical approximation. <br/><br/>
And things only get messier when we look at cubics:

In [11]:
r4 = sp.Eq(3 + x - 2 * x**2 + 4 * x**3, 0)
r4

Eq(4*x**3 - 2*x**2 + x + 3, 0)

We can solve it using a formula, and get a numerical approximation, but again, it doesn't give a precise answer:

In [12]:
s4 = sp.solve(r4, x, dict=True)
[print(i[x].n()) for i in s4]

0.597298399134921 + 0.850292734117196*I
0.597298399134921 - 0.850292734117196*I
-0.694596798269842


[None, None, None]

Looking at the answer in terms of surds and complex numbers really starts to make things look complicated. This equation has 3 solutions, 2 complex and 1 real, and we can only know them approximately, so we find ourselves looking at this kind of representation:

In [13]:
sp.pretty_print(s4)

⎡⎧                                                            ______________⎫ 
⎢⎪                                           ⎛  1   √3⋅ⅈ⎞    ╱ 169   9⋅√353 ⎪ 
⎢⎪                                           ⎜- ─ - ────⎟⋅3 ╱  ─── + ────── ⎪ 
⎢⎪   1                   1                   ⎝  2    2  ⎠ ╲╱    16     16   ⎪ 
⎢⎨x: ─ + ───────────────────────────────── - ───────────────────────────────⎬,
⎢⎪   6                      ______________                  3               ⎪ 
⎢⎪         ⎛  1   √3⋅ⅈ⎞    ╱ 169   9⋅√353                                   ⎪ 
⎢⎪       6⋅⎜- ─ - ────⎟⋅3 ╱  ─── + ──────                                   ⎪ 
⎣⎩         ⎝  2    2  ⎠ ╲╱    16     16                                     ⎭ 

 ⎧                        ______________                                    ⎫ 
 ⎪       ⎛  1   √3⋅ⅈ⎞    ╱ 169   9⋅√353                                     ⎪ 
 ⎪       ⎜- ─ + ────⎟⋅3 ╱  ─── + ──────                                     ⎪ 
 ⎪   1   ⎝  2    2  ⎠ ╲╱    16     16              

Let's have a look at higher order equations:

In [14]:
r5 = sp.Eq(3 + x - 2 * x**2 + 5 * x**3 - x**4, 0)
r5

Eq(-x**4 + 5*x**3 - 2*x**2 + x + 3, 0)

In [15]:
s5 = sp.solve(r5, x, dict=True)
[print(i[x].n()) for i in s5]

0.5 - 0.866025403784439*I
0.5 + 0.866025403784439*I
-0.645751311064591
4.64575131106459


[None, None, None, None]

In [16]:
sp.pretty_print(s5)

⎡⎧   1   √3⋅ⅈ⎫  ⎧   1   √3⋅ⅈ⎫                          ⎤
⎢⎨x: ─ - ────⎬, ⎨x: ─ + ────⎬, {x: 2 - √7}, {x: 2 + √7}⎥
⎣⎩   2    2  ⎭  ⎩   2    2  ⎭                          ⎦
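
One way to see why this particular quartic has such tidy roots is to factor it: it happens to split into two quadratics with integer coefficients, $-(x^2 - x + 1)(x^2 - 4x - 3)$, and each factor only needs the quadratic formula. The sketch below checks this; the variable names `quartic` and `factored` are illustrative.

```python
import sympy as sp

x = sp.symbols('x')
quartic = -x**4 + 5*x**3 - 2*x**2 + x + 3  # left-hand side of r5

factored = sp.factor(quartic)
print(factored)

# Each quadratic factor contributes one pair of roots
for quadratic_factor in (x**2 - x + 1, x**2 - 4*x - 3):
    print(quadratic_factor, sp.solve(quadratic_factor, x))
```

A quartic chosen at random will usually not factor like this, which is why the general case looks so much worse.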


So sometimes it looks like things do not get more complicated as the degree of the polynomial gets higher? Though the above example still has surds and complex numbers. 
<br/><br/>
Here is another example: 

In [17]:
r6 = sp.Eq(3 + x - 2 * x**2 + 5 * x**3 -  7 *x**4, 0)
r6

Eq(-7*x**4 + 5*x**3 - 2*x**2 + x + 3, 0)

Let's solve for a numerical solution, and then a solution with surds and complex numbers: 

In [18]:
s6 = sp.solve(r6, x, dict=True)
[print(i[x].n()) for i in s6]

1.00000000000000
0.146399152654902 + 0.848164704395753*I
0.146399152654902 - 0.848164704395753*I
-0.578512591024090


[None, None, None, None]

In [19]:
sp.pretty_print(s6)

⎡        ⎧                                                                   _
⎢        ⎪                                                  ⎛  1   √3⋅ⅈ⎞    ╱ 
⎢        ⎪                                                  ⎜- ─ - ────⎟⋅3 ╱  
⎢        ⎪     2                      80                    ⎝  2    2  ⎠ ╲╱   
⎢{x: 1}, ⎨x: - ── + ───────────────────────────────────── - ──────────────────
⎢        ⎪     21                        ________________                   3 
⎢        ⎪              ⎛  1   √3⋅ⅈ⎞    ╱ 3481   9⋅√3569                      
⎢        ⎪          147⋅⎜- ─ - ────⎟⋅3 ╱  ──── + ───────                      
⎣        ⎩              ⎝  2    2  ⎠ ╲╱   686       98                        

_______________⎫  ⎧                           ________________                
3481   9⋅√3569 ⎪  ⎪          ⎛  1   √3⋅ⅈ⎞    ╱ 3481   9⋅√3569                 
──── + ─────── ⎪  ⎪          ⎜- ─ + ────⎟⋅3 ╱  ──── + ───────                 
686       98   ⎪  ⎪     2    ⎝  2    2  ⎠ ╲╱   686 

Turns out last time we just got lucky. As powers get higher, things tend to get more complicated. And therein lies the problem. Again, what we are seeing here is the classical story around the ways to obtain roots of polynomials. This approach aims to transform these types of equations so that their solutions can be expressed with radicals, and it reflects the orientation of del Ferro, Tartaglia and Cardano. 
<br/><br/>
But is there another way to approach this problem, maybe a way in which we could avoid radicals, to come to a more legitimate solution? Could we adopt another point of view?
<br/><br/>

### Solving the quadratic another way

So let's start again, and see how we might solve these equations in a way that does not involve radicals. 
<br/><br/>
Our overall strategy here will be:

-  Return to the quadratic case, in its most general form of $ c_0 + c_1x + c_2x^2 = 0 $
-  Use a substitution for $x$ and $c_0$ using some terms from a power series 
-  From here, find solutions for $x$ that do not involve radicals (and whose coefficients, interestingly, turn out to involve the Catalan numbers)


Start by creating variables we will need: 

In [20]:
c_0, c_1, c_2, x = sp.symbols('c_0, c_1, c_2, x')

From there, create a general form of the quadratic equation: 

In [21]:
r7 = sp.Eq(c_0  + c_1 * x + c_2 * x**2, 0)
r7

Eq(c_0 + c_1*x + c_2*x**2, 0)

Note that from the outset we will assume that:

- <b>$c_0 \ne 0$</b> - if this was not the case, we could just find a solution by factoring (for example: $3x + x^2 = 0 \Rightarrow x(3 + x) = 0 \Rightarrow x = 0, x = -3$). 
- <b>$c_2 \ne 0$</b> - again, if this was not the case, this pushes us back to the linear equation. 
- <b>$t^6 = 0$</b> - Note we haven't introduced $t$ yet. We will shortly, and this restriction will apply and be explained.

Apart from that, assume nothing! We are not even assuming we know anything about the nature of each of the coefficients $c_0, c_1, c_2$. They are just some kind of mathematical object, maybe a power series of some kind, maybe a poly-number. It doesn't matter to us.

And from here, we want to solve for $x$ in terms of $c_0, c_1$ and $c_2$ and get solutions that do not involve radicals. 



<hr/>
We are going to begin by utilising a power series to see if that will help us. Regardless of the problems that might arise when dealing with power series (which can manifest themselves as some kind of infinite structure that is hard for us to grasp), using a subset of power series terms might allow us to have some insights into solving this quadratic. More specifically, let's use only the first 6 terms, that is, powers of $t$ up to $t^5$. Why 6? No reason except it makes things manageable and might shed some light for future investigation. Given the problem at hand, it's a logical place to start. <br/><br/>
Generally, this is a really powerful strategy in mathematics exploration: letting one mathematical object stand in place of another one, and leveraging off that second object's properties to gain some insight and a different view of what is happening. 
<br/><br/>
Begin by creating some more variables that we are going to need: 

In [22]:
# get some new variables, and go up to degree 5
t, a_0, a_1, a_2, a_3, a_4, a_5 = sp.symbols('t, a_0, a_1, a_2, a_3, a_4, a_5')

Let's make a substitution. Let $c_0 = t$:

In [23]:
r8 = r7.subs(c_0, t)
r8

Eq(c_1*x + c_2*x**2 + t, 0)

Now let's make another substitution. We are going to use some terms from a power series that can take the place of $x$. Note that we are using a small number of terms, just 6 terms, and the number 6 has been arbitrarily chosen to see if we can spot some kind of pattern. Note also that because we are restricting the terms, anything above $t^5$ we will regard as 0. Hence the above assumption that $t^6$ can be regarded as 0. 

So here is our new value of $x$: 

In [24]:
r8_5 = sp.Eq(x, a_0 + a_1*t + a_2* t**2 + a_3 * t**3 + a_4* t**4 + a_5*t**5)
r8_5

Eq(x, a_0 + a_1*t + a_2*t**2 + a_3*t**3 + a_4*t**4 + a_5*t**5)

Let's substitute this new value we have for $x$ into the original equation: 

In [25]:
r9 = r8.subs(r8_5.lhs, r8_5.rhs)
r9

Eq(c_1*(a_0 + a_1*t + a_2*t**2 + a_3*t**3 + a_4*t**4 + a_5*t**5) + c_2*(a_0 + a_1*t + a_2*t**2 + a_3*t**3 + a_4*t**4 + a_5*t**5)**2 + t, 0)

Now expand the equation: 

In [26]:
r10 = r9.expand()
r10

Eq(a_0**2*c_2 + 2*a_0*a_1*c_2*t + 2*a_0*a_2*c_2*t**2 + 2*a_0*a_3*c_2*t**3 + 2*a_0*a_4*c_2*t**4 + 2*a_0*a_5*c_2*t**5 + a_0*c_1 + a_1**2*c_2*t**2 + 2*a_1*a_2*c_2*t**3 + 2*a_1*a_3*c_2*t**4 + 2*a_1*a_4*c_2*t**5 + 2*a_1*a_5*c_2*t**6 + a_1*c_1*t + a_2**2*c_2*t**4 + 2*a_2*a_3*c_2*t**5 + 2*a_2*a_4*c_2*t**6 + 2*a_2*a_5*c_2*t**7 + a_2*c_1*t**2 + a_3**2*c_2*t**6 + 2*a_3*a_4*c_2*t**7 + 2*a_3*a_5*c_2*t**8 + a_3*c_1*t**3 + a_4**2*c_2*t**8 + 2*a_4*a_5*c_2*t**9 + a_4*c_1*t**4 + a_5**2*c_2*t**10 + a_5*c_1*t**5 + t, 0)

Note that we can group the equation in terms of $t$:

In [27]:
r10.lhs.collect(t)

a_0**2*c_2 + a_0*c_1 + 2*a_4*a_5*c_2*t**9 + a_5**2*c_2*t**10 + t**8*(2*a_3*a_5*c_2 + a_4**2*c_2) + t**7*(2*a_2*a_5*c_2 + 2*a_3*a_4*c_2) + t**6*(2*a_1*a_5*c_2 + 2*a_2*a_4*c_2 + a_3**2*c_2) + t**5*(2*a_0*a_5*c_2 + 2*a_1*a_4*c_2 + 2*a_2*a_3*c_2 + a_5*c_1) + t**4*(2*a_0*a_4*c_2 + 2*a_1*a_3*c_2 + a_2**2*c_2 + a_4*c_1) + t**3*(2*a_0*a_3*c_2 + 2*a_1*a_2*c_2 + a_3*c_1) + t**2*(2*a_0*a_2*c_2 + a_1**2*c_2 + a_2*c_1) + t*(2*a_0*a_1*c_2 + a_1*c_1 + 1)

Recall that because $t^6 = 0$, we can truncate this equation and drop every power of $t$ from $t^6$ upwards:

In [28]:
order = 6
coeffs = sp.Poly(r10.lhs, t).all_coeffs()
r11 = sum(t**n * coeffs[-(n+1)] for n in range(order))
r11
# Note on SymPy: to do this we convert the left-hand side to a Poly in t and keep only the coefficients of t**0 to t**5.
# all_coeffs() (unlike coeffs()) includes zero coefficients, so the index always lines up with the power of t. There is probably a more elegant way to do this

a_0**2*c_2 + a_0*c_1 + t**5*(2*a_0*a_5*c_2 + 2*a_1*a_4*c_2 + 2*a_2*a_3*c_2 + a_5*c_1) + t**4*(2*a_0*a_4*c_2 + 2*a_1*a_3*c_2 + a_2**2*c_2 + a_4*c_1) + t**3*(2*a_0*a_3*c_2 + 2*a_1*a_2*c_2 + a_3*c_1) + t**2*(2*a_0*a_2*c_2 + a_1**2*c_2 + a_2*c_1) + t*(2*a_0*a_1*c_2 + a_1*c_1 + 1)
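
The truncation step can also be written as a small reusable function. The sketch below is one possible approach, assuming the rule "treat $t^n$ as 0 for $n \ge$ `order`"; the helper name `truncate` is hypothetical and not part of SymPy or the original notebook.

```python
import sympy as sp

t = sp.symbols('t')

def truncate(expr, var, order):
    """Drop every term of degree >= order in var (hypothetical helper)."""
    expanded = sp.expand(expr)
    # coeff(var, n) picks out the coefficient of var**n;
    # n = 0 returns the part of the expression that does not involve var
    return sum(expanded.coeff(var, n) * var**n for n in range(order))

example = (1 + t + t**2)**2      # expands to 1 + 2t + 3t^2 + 2t^3 + t^4
print(truncate(example, t, 3))   # keeps only the terms below t**3
```

Calling `truncate(r10.lhs, t, 6)` on the expanded equation above should give the same expression as `r11`.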

The last couple of calculations have created an expression, so let's turn it back into an equation set to 0. We will also put it back into the SymPy Poly form again, which has handy methods that will help us work through the calculations: 

In [29]:
r12 = sp.Eq(r11, 0)
r13 = sp.Poly(r12, t)
r13

Poly((2*a_0*a_5*c_2 + 2*a_1*a_4*c_2 + 2*a_2*a_3*c_2 + a_5*c_1)*t**5 + (2*a_0*a_4*c_2 + 2*a_1*a_3*c_2 + a_2**2*c_2 + a_4*c_1)*t**4 + (2*a_0*a_3*c_2 + 2*a_1*a_2*c_2 + a_3*c_1)*t**3 + (2*a_0*a_2*c_2 + a_1**2*c_2 + a_2*c_1)*t**2 + (2*a_0*a_1*c_2 + a_1*c_1 + 1)*t + a_0**2*c_2 + a_0*c_1, t, domain='ZZ[a_0,a_1,a_2,a_3,a_4,a_5,c_1,c_2]')

Now let's start with the very simple case that assumes that $t = 0$. This means that everything multiplied by $t$ becomes 0, which leaves only the constant term of this polynomial in $t$: 

In [30]:
r14 = r13.coeffs()[-1]
r14

a_0**2*c_2 + a_0*c_1

Let's set it to 0:

In [31]:
r14 = sp.Eq(r14, 0)
r14

Eq(a_0**2*c_2 + a_0*c_1, 0)

And now solve it: 

In [32]:
r15 = sp.factor(r14, a_0)
r16 = sp.solve(r15, a_0, dict = True)
sp.pretty_print(r16)

⎡         ⎧    -c₁ ⎫⎤
⎢{a₀: 0}, ⎨a₀: ────⎬⎥
⎣         ⎩     c₂ ⎭⎦


So we end up with two solutions for $a_0$. Let's focus on the first case: 

In [33]:
r16[0]

{a_0: 0}

Note the coefficients of this polynomial in $t$. One quick thing to do would be to substitute in 0 for $a_0$ wherever it appears: 

In [34]:
r13.coeffs()

[2*a_0*a_5*c_2 + 2*a_1*a_4*c_2 + 2*a_2*a_3*c_2 + a_5*c_1,
 2*a_0*a_4*c_2 + 2*a_1*a_3*c_2 + a_2**2*c_2 + a_4*c_1,
 2*a_0*a_3*c_2 + 2*a_1*a_2*c_2 + a_3*c_1,
 2*a_0*a_2*c_2 + a_1**2*c_2 + a_2*c_1,
 2*a_0*a_1*c_2 + a_1*c_1 + 1,
 a_0**2*c_2 + a_0*c_1]

In [35]:
r17 = sp.Poly(r13.subs(a_0, 0), t)
r17

Poly((2*a_1*a_4*c_2 + 2*a_2*a_3*c_2 + a_5*c_1)*t**5 + (2*a_1*a_3*c_2 + a_2**2*c_2 + a_4*c_1)*t**4 + (2*a_1*a_2*c_2 + a_3*c_1)*t**3 + (a_1**2*c_2 + a_2*c_1)*t**2 + (a_1*c_1 + 1)*t, t, domain='ZZ[a_1,a_2,a_3,a_4,a_5,c_1,c_2]')

Also, now that we have a value of $a_0$, we can work iteratively through all the other parts of the polynomial and solve for $a_1, a_2, a_3, a_4$ and $a_5$. Start by turning the polynomial into a system of equations: 

In [36]:
r18 = [sp.Eq(r17.coeffs()[i], 0) for i in range(len(r17.coeffs()))]
r18

[Eq(2*a_1*a_4*c_2 + 2*a_2*a_3*c_2 + a_5*c_1, 0),
 Eq(2*a_1*a_3*c_2 + a_2**2*c_2 + a_4*c_1, 0),
 Eq(2*a_1*a_2*c_2 + a_3*c_1, 0),
 Eq(a_1**2*c_2 + a_2*c_1, 0),
 Eq(a_1*c_1 + 1, 0)]

Now solve the system: 

In [37]:
values_to_solve =  [a_0, a_1, a_2, a_3, a_4, a_5]
r19 = sp.nonlinsolve(r18, values_to_solve)
solutions_as_list = list(list(r19)[0])
r19, solutions_as_list

({(a_0, -1/c_1, -c_2/c_1**3, -2*c_2**2/c_1**5, -5*c_2**3/c_1**7, -14*c_2**4/c_1**9)},
 [a_0,
  -1/c_1,
  -c_2/c_1**3,
  -2*c_2**2/c_1**5,
  -5*c_2**3/c_1**7,
  -14*c_2**4/c_1**9])

In [38]:
s20 = [sp.Eq(solutions_as_list[i], values_to_solve[i]) for i in range(len(solutions_as_list))]
s20

[True,
 Eq(-1/c_1, a_1),
 Eq(-c_2/c_1**3, a_2),
 Eq(-2*c_2**2/c_1**5, a_3),
 Eq(-5*c_2**3/c_1**7, a_4),
 Eq(-14*c_2**4/c_1**9, a_5)]
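
The system has a convenient triangular shape: once $a_0$ is fixed, the coefficient of $t^k$ is linear in $a_k$ and only involves earlier unknowns. So instead of handing everything to `nonlinsolve`, the values can be found one at a time. The following is a self-contained sketch of that idea for the branch $a_0 = 0$; the names `x_series`, `expr` and `known` are illustrative.

```python
import sympy as sp

t, c_1, c_2 = sp.symbols('t c_1 c_2')
a = sp.symbols('a_0:6')  # the symbols a_0, a_1, ..., a_5

# Substitute the six-term series for x into t + c_1*x + c_2*x**2
x_series = sum(a[k] * t**k for k in range(6))
expr = sp.expand(t + c_1 * x_series + c_2 * x_series**2)

known = {a[0]: 0}  # the branch a_0 = 0 chosen in the text
for k in range(1, 6):
    # The coefficient of t**k, with the already-solved values plugged in
    coeff_k = expr.coeff(t, k).subs(known)
    # This is linear in a_k, so there is exactly one solution
    known[a[k]] = sp.simplify(sp.solve(coeff_k, a[k])[0])

for symbol, value in known.items():
    print(symbol, '=', value)
```

This mirrors the "work iteratively" description: each step uses only the values found in the previous steps.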

So now we have values for $a_1, a_2, a_3, a_4$ and $a_5$ in terms of $c_1$ and $c_2$ for the case where $a_0$ is 0. Recall our original plan was to get a value for $x$. So we can now substitute back in to get a value for $x$:

In [39]:
values_to_solve = values_to_solve[1:]
solutions_as_list = solutions_as_list[1:]

r21 = r8_5.subs(values_to_solve[0], solutions_as_list[0])
r21 = r21.subs(values_to_solve[1], solutions_as_list[1])
r21 = r21.subs(values_to_solve[2], solutions_as_list[2])
r21 = r21.subs(values_to_solve[3], solutions_as_list[3])
r21 = r21.subs(values_to_solve[4], solutions_as_list[4])
r21 = r21.subs(a_0, 0)

r21
#SymPy Note: Seeing Scientific Workplace make such light work of these types of tasks suggests there are far more 
# elegant ways to do this which I have not noticed. Feel free to provide something more elegant!

Eq(x, -t/c_1 - c_2*t**2/c_1**3 - 2*c_2**2*t**3/c_1**5 - 5*c_2**3*t**4/c_1**7 - 14*c_2**4*t**5/c_1**9)
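
It is worth checking that this series really does satisfy the quadratic up to the truncation. The sketch below substitutes it back into $t + c_1x + c_2x^2$ and inspects the coefficients of $t^0$ to $t^5$, which should all vanish. The names `x_series`, `residual` and `low_order` are illustrative.

```python
import sympy as sp

t, c_1, c_2 = sp.symbols('t c_1 c_2')

# Truncated series for x, copied from the result above
x_series = (-t/c_1 - c_2*t**2/c_1**3 - 2*c_2**2*t**3/c_1**5
            - 5*c_2**3*t**4/c_1**7 - 14*c_2**4*t**5/c_1**9)

# What is left over when the series is substituted into the quadratic
residual = sp.expand(t + c_1*x_series + c_2*x_series**2)

# Coefficients of t**0 ... t**5 in the residual
low_order = [sp.simplify(residual.coeff(t, n)) for n in range(6)]
print(low_order)

# The first term that survives is the one in t**6
print(sp.simplify(residual.coeff(t, 6)))
```

The leftover terms start at $t^6$, which is exactly what the assumption $t^6 = 0$ discards.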

So we started with a general quadratic in the form of $t + c_1x + c_2x^2 = 0$ with a view to finding a value of $x$ that doesn't involve radicals. And we seem to have done it! Well, in a limited way at least. 
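
As an informal experiment, the truncated series can be tried on the earlier quadratic $2 - 9x + 5x^2 = 0$ by reading it as $t = 2$, $c_1 = -9$, $c_2 = 5$. This is only a sketch: whether, and how quickly, the truncated series approaches a root depends on the coefficients, and nothing here is claimed in general. The variable names are illustrative.

```python
import sympy as sp

# Coefficients of 2 - 9x + 5x^2 = 0, read as t + c_1*x + c_2*x^2 = 0
t_val, c1_val, c2_val = sp.Integer(2), sp.Integer(-9), sp.Integer(5)

numerators = [1, 1, 2, 5, 14]  # numerical coefficients from the series for x

# Term k of the series is -numerators[k] * c_2**k * t**(k+1) / c_1**(2k+1)
series_value = sum(
    -coef * c2_val**k * t_val**(k + 1) / c1_val**(2*k + 1)
    for k, coef in enumerate(numerators)
)

# The smaller of the two roots found earlier with the quadratic formula
exact_root = sp.Rational(9, 10) - sp.sqrt(41) / 10

print(float(series_value), float(exact_root))
```

With only five terms, the two values agree to about three decimal places for this example.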

So it looks like there may be some kind of algebraic way of getting around our usual answers involving square roots. And so what is this series of numbers that appears to be emerging? Specifically, the numerical coefficients attached to the powers of $c_2$: 1, 1, 2, 5, 14? Well, it turns out these numbers can be found in a sequence of numbers that is ubiquitous throughout various fields of mathematics: <a href="https://oeis.org/A000108#:~:text=A000108%20%2D%20OEIS&text=Catalan%20numbers%3A%20C(n),n%2B1)!).&text=Also%20called%20Segner%20numbers.">The Catalan Numbers!</a>
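
The match with the Catalan numbers is easy to confirm for the terms computed so far. The sketch below compares the observed numbers with SymPy's `catalan` function and with the closed form $C_n = \binom{2n}{n} / (n + 1)$; the list names are illustrative.

```python
import sympy as sp

# Numbers read off the series coefficients a_1 ... a_5
observed = [1, 1, 2, 5, 14]

# First five Catalan numbers C_0 ... C_4, from SymPy and from the closed form
from_sympy = [sp.catalan(n) for n in range(5)]
from_formula = [sp.binomial(2*n, n) / (n + 1) for n in range(5)]

print(from_sympy, from_formula)
```

This only checks the first five terms; it does not by itself show that the pattern continues.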

Lots more to investigate! But that is the end of Part 1