**Artificial Intelligence (CS550)**
<br>
Date: **12 February 2020**
<br>
Location: **SU, NEW STEM building**
<br>
Room: **304**

Title: **Seminar №4**
<br>
Speaker: **Dr. Shota Tsiskaridze**

In [1]:
%load_ext autoreload
%autoreload 2

%matplotlib inline

<h2 align="center">Root Finding</h2>

<h3 align="center">Formulation of the problem</h3>

- **Root Finding Problem.**
 - Refers to the general problem of **searching for a solution** of an equation $f(x) = 0$  for some function $f(x)$.
 - This is a very general problem and it comes up a lot in mathematics!
 - For example, if we want to optimize a function $f(x)$ then we need to find critical points and therefore solve the equation ${f}'(x)=0$.
 

- **Analytical solution.**
 - There are **few examples** where there exist exact methods for finding solutions. 
 - You can find the roots of the **quadratic equation**: $$ax^2 + bx + c =0,$$
   simply by applying the formula: $$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$
 - There is a general formula to solve a **cubic equation** and even a **quartic equation**, but the formula is too complicated to be useful.


- **What can we do when no analytical solution is known?** 
 - Use numerical methods to find approximate solutions.
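
As a small illustration of the analytical case, the quadratic formula above can be evaluated directly. This is a minimal sketch; the helper name `quadratic_roots` is hypothetical, and `cmath.sqrt` is used (an assumption of this sketch) so that a negative discriminant also works:

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 (a != 0) via the quadratic formula."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    # Complex square root, so a negative discriminant is handled as well.
    sqrt_disc = cmath.sqrt(b**2 - 4*a*c)
    return (-b + sqrt_disc) / (2*a), (-b - sqrt_disc) / (2*a)

# Roots of x**2 - x - 1 = 0 (returned as complex numbers with zero imaginary part).
r1, r2 = quadratic_roots(1, -1, -1)
print(r1, r2)
```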


<h3 align="center">Bisection Method</h3>

- **Applicability:** The algorithm applies to any continuous function $f(x)$ on an interval $[a,b]$ where the function $f(x)$ changes sign, i.e. $f(a)\cdot f(b) < 0$.
<center><img src="images/Bisection.gif" width="400" height="300" alt="Example" /></center>
- The **bisection method**:
 - divides the interval in two, selects the subinterval where the sign of $f(x)$ changes, and repeats;
 - is based on the **Intermediate Value Theorem**;
 - does not (in general) produce an exact solution of an equation $f(x)=0$;
 - gives an estimate of the absolute error in the approximation;
 - always converges to a root of $f(x) = 0$.

<h3 align="center">Bisection Method: Algorithm</h3>

The bisection method procedure is:
 1. Choose a starting interval $[a_0, b_0]$ such that $f(a_0)\cdot f(b_0) < 0$;
 2. For each sub-interval $[a_n, b_n]$ compute the midpoint $m_n = (a_n + b_n)/2$ and evaluate $f(m_n)$;
 3. Determine the next sub-interval $[a_{n+1}, b_{n+1}]$:
   - if $f(a_n)\cdot f(m_n) < 0$, then $[a_{n+1}, b_{n+1}] = [a_n, m_n]$;
   - if $f(b_n)\cdot f(m_n) < 0$, then $[a_{n+1}, b_{n+1}] = [m_n, b_n]$;
   - if $f(m_n) = 0$, then $m_n$ is an exact root and we stop;
 4. Repeat (2) and (3) until the interval $[a_N, b_N]$ reaches some predetermined length;
 5. Return the midpoint value $m_N$.

<h3 align="center">Bisection Method: Implementation</h3>

In [2]:
def bisection(f,a,b,N):
    """
    Input: f = function, a = float, b = float, N = integer (positive)
    f : the function for which we are trying to approximate a solution f(x)=0.
    a : left border of the initial interval.
    b : right border of the initial interval.
    N : the number of iterations to implement.
    
    Output:
    x_N : the midpoint of the Nth interval computed by the bisection method.
    
    """
    
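    if f(a)*f(b) >= 0:
        # No sign change on [a, b] (or a root exactly at an endpoint).
        print("Bisection method fails.")
        return None
    
    a_n = a
    b_n = b
    for n in range(0, N):
        m_n = (a_n + b_n)/2      # midpoint of the current interval
        f_m_n = f(m_n)
        if f(a_n)*f_m_n < 0:
            b_n = m_n            # sign change in the left half: keep [a_n, m_n]
        elif f(b_n)*f_m_n < 0:
            a_n = m_n            # sign change in the right half: keep [m_n, b_n]
        elif f_m_n == 0:
            print("Found exact solution.")
            return m_n
        else:
            print("Bisection method fails.")
            return None
    return (a_n + b_n)/2         # midpoint of the final interval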
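
The implementation above runs a fixed number of iterations, while step 4 of the algorithm stops once the interval is short enough. A minimal sketch of that variant follows; the name `bisection_tol` and the parameters `tol` and `max_iter` are assumptions introduced here, not part of the original notebook, and the sketch assumes $a < b$:

```python
def bisection_tol(f, a, b, tol=1e-8, max_iter=200):
    """Bisect [a, b] until its length is at most tol; return the final midpoint (or None)."""
    f_a, f_b = f(a), f(b)
    if f_a * f_b >= 0:
        return None  # no guaranteed sign change on [a, b]
    for _ in range(max_iter):
        if b - a <= tol:
            break  # interval is short enough
        m = (a + b) / 2
        f_m = f(m)
        if f_m == 0:
            return m  # landed exactly on a root
        if f_a * f_m < 0:
            b, f_b = m, f_m  # root lies in the left half
        else:
            a, f_a = m, f_m  # root lies in the right half
    return (a + b) / 2

root = bisection_tol(lambda x: x**2 - x - 1, 1, 2, tol=1e-10)
print(root)
```

Caching `f_a` and `f_b` avoids re-evaluating the function at the endpoints in every iteration.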

<h3 align="center">Example: Golden Ratio</h3>

Let's use our function with input parameters $f(x) = x^2 - x - 1$ and $N = 25$ iterations on $[a, b] = [1, 2]$ to approximate the **golden ratio**:

$$\phi = \frac{1 + \sqrt{5}}{2}.$$
The golden ratio $\phi$ is a root of $f(x)$.

In [3]:
f = lambda x: x**2 - x - 1
phi_approx = bisection(f,1,2,25)
phi_approx

1.618033990263939

The absolute error is guaranteed to be less than $(b-a)/2^{N+1} = 2^{-26}$.

In [4]:
error_bound = 2**(-26)
error_bound

1.4901161193847656e-08

Let's verify that the absolute error is within the error bound.

In [5]:
phi_true = (1 + 5**0.5)/2
abs(phi_true - phi_approx) < error_bound

True
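
The same bound can be turned around to estimate how many iterations are needed for a given tolerance. A minimal sketch, assuming the convention used above (the midpoint returned after $N$ iterations has error below $(b-a)/2^{N+1}$); the helper name `bisection_iterations` is hypothetical:

```python
import math

def bisection_iterations(a, b, tol):
    """Smallest N >= 0 such that (b - a) / 2**(N + 1) <= tol."""
    return max(0, math.ceil(math.log2((b - a) / tol) - 1))

# For [1, 2] and the bound 2**(-26) used above, this recovers N = 25.
n_needed = bisection_iterations(1, 2, 2**(-26))
print(n_needed)
```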

<h3 align="center">Secant Method</h3>

- **Applicability:** The algorithm applies to any continuous function $f(x)$ on an interval $[a,b]$ where the function $f(x)$ changes sign, i.e. $f(a)\cdot f(b) < 0$.
<center><img src="images/Secant.gif" width="350" height="300" alt="Example" /></center>
- The **secant method**:
 - is very similar to the bisection method, except that instead of splitting each interval at its midpoint, it splits the interval at the point where the secant line through the endpoints crosses the $x$-axis;
 - in this bracketing form, is also known as the **false position method** (regula falsi);
 - is based on the **Intermediate Value Theorem**;
 - always converges to a root of $f(x) = 0$.





<h3 align="center">Secant Line Formula</h3>

Let $f(x)$ be a continuous function on a closed interval $[a,b]$ such that $f(a)\cdot f(b) < 0$.
<br>
Consider the line connecting the endpoint values $(a, f(a))$ and $(b, f(b))$.
<br> The line connecting these two points is called the **secant line** and is given by the formula:

$$y = \frac{f(b) - f(a)}{b-a}(x-a) + f(a).$$

The point where the secant line crosses the $x$-axis is given by:

$$0 = \frac{f(b) - f(a)}{b-a}(x-a) + f(a),$$

which gives us the solution $x$:

$$x = a - f(a) \frac{b-a}{f(b) - f(a)}.$$
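
The intercept formula translates directly into code. A minimal sketch; the helper name `secant_intercept` is hypothetical, and the golden-ratio function $x^2 - x - 1$ on $[1, 2]$ is used only as an illustration:

```python
def secant_intercept(f, a, b):
    """x-intercept of the line through (a, f(a)) and (b, f(b))."""
    f_a, f_b = f(a), f(b)
    if f_b == f_a:
        raise ZeroDivisionError("horizontal secant line: no x-intercept")
    return a - f_a * (b - a) / (f_b - f_a)

# f(1) = -1 and f(2) = 1, so the secant line crosses the x-axis halfway, at 1.5.
x_cross = secant_intercept(lambda x: x**2 - x - 1, 1, 2)
print(x_cross)
```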




<h3 align="center">Secant Method: Algorithm</h3>

The secant method procedure is:
 1. Choose a starting interval $[a_0, b_0]$ such that $f(a_0)\cdot f(b_0) < 0$;
 2. Compute $f(x_n)$, where $x_n$ is given by the secant line:
 $$x_n = a_n - f(a_n) \frac{b_n-a_n}{f(b_n) - f(a_n)};$$
 3. Determine the next sub-interval $[a_{n+1}, b_{n+1}]$:
   - if $f(a_n)\cdot f(x_n) < 0$, then $[a_{n+1}, b_{n+1}] = [a_n, x_n]$;
   - if $f(b_n)\cdot f(x_n) < 0$, then $[a_{n+1}, b_{n+1}] = [x_n, b_n]$;
   - if $f(x_n) = 0$, then $x_n$ is an exact root and we stop;
 4. Repeat (2) and (3) for a predetermined number of iterations $N$ (the interval length itself need not shrink to zero, since one endpoint may never move);
 5. Return the value $x_N$.

<h3 align="center">Secant Method: Implementation</h3>


In [6]:
def secant(f,a,b,N):
    """
    Input: f = function, a = float, b = float, N = integer (positive)
    f : the function for which we are trying to approximate a solution f(x)=0.
    a : left border of the initial interval.
    b : right border of the initial interval.
    N : the number of iterations to implement.
    
    Output:
    x_N : the value x_N of the Nth interval computed by the secant method.
    
    """
    
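    if f(a)*f(b) >= 0:
        # No sign change on [a, b] (or a root exactly at an endpoint).
        print("Secant method fails.")
        return None
    a_n = a
    b_n = b
    for n in range(0,N):
        # x-intercept of the secant line through (a_n, f(a_n)) and (b_n, f(b_n))
        m_n = a_n - f(a_n)*(b_n - a_n)/(f(b_n) - f(a_n))
        f_m_n = f(m_n)
        if f(a_n)*f_m_n < 0:
            b_n = m_n            # sign change on the left: keep [a_n, m_n]
        elif f(b_n)*f_m_n < 0:
            a_n = m_n            # sign change on the right: keep [m_n, b_n]
        elif f_m_n == 0:
            print("Found exact solution.")
            return m_n
        else:
            print("Secant method fails.")
            return None
    return a_n - f(a_n)*(b_n - a_n)/(f(b_n) - f(a_n))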

<h3 align="center">Example: Supergolden Ratio</h3>

Let's find an approximation of the **supergolden ratio**, the only real root of the function $g(x) = x^3 - x^2 - 1$.

We use our function `secant` with $N = 25$ iterations on $[a, b] = [1, 2]$. The exact value is:

$$\psi = \frac{1 + \sqrt[3]{\frac{29 + 3\sqrt{93}}{2}} + \sqrt[3]{\frac{29 - 3\sqrt{93}}{2}}}{3}.$$

In [7]:
g = lambda x: x**3 - x**2 - 1
g(1), g(2)

(-1, 3)

In [8]:
psi_approx = secant(g, 1, 2, 25)
psi_approx

1.4655712318713536

Let's compare our approximation with the exact solution:

In [9]:
psi_true = (1 + ((29 + 3*93**0.5)/2)**(1/3) + ((29 - 3*93**0.5)/2)**(1/3))/3
error = abs(psi_approx - psi_true)
error

5.4145576910968884e-12
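
It can be instructive to watch the interval endpoints during this run. The sketch below repeats the same update inline (the variable names mirror the implementation above; the loop is an illustration, not a replacement for `secant`). Because $g$ is convex on $[1, 2]$, every secant intercept falls to the left of the root, so only the left endpoint moves and the right endpoint stays at $2$:

```python
g = lambda x: x**3 - x**2 - 1

a_n, b_n = 1.0, 2.0
for n in range(25):
    # x-intercept of the secant line through the current endpoints
    x_n = a_n - g(a_n) * (b_n - a_n) / (g(b_n) - g(a_n))
    g_x = g(x_n)
    if g_x == 0:
        a_n = b_n = x_n  # landed exactly on the root
        break
    if g(a_n) * g_x < 0:
        b_n = x_n        # root lies in [a_n, x_n]
    else:
        a_n = x_n        # root lies in [x_n, b_n]

# The left endpoint approaches the root; the interval length does not go to zero.
print(a_n, b_n, b_n - a_n)
```

This is why a fixed iteration count (or a test on $|g(x_n)|$) is a more practical stopping rule here than the interval length.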

<h3 align="center">Newton's Method</h3>

- **Applicability:** The algorithm applies to any differentiable function $f(x)$, given an initial guess $x_0$ that is reasonably close to a root. Unlike the bisection and secant methods, it does not require an interval on which $f(x)$ changes sign.
<center><img src="images/Newton.gif" width="400" height="300" alt="Example" /></center>
- The **Newton method**:
 - is a root finding method that uses linear approximation;
 - starts from a guess $x_0$ for a solution of $f(x)=0$, computes the linear approximation of $f(x)$ at $x_0$, and takes the $x$-intercept of that approximation as the next guess;
 - usually converges very quickly, which is its main advantage;
 - is not guaranteed to converge, which is its main disadvantage.





<h3 align="center">Newton's Formula</h3>

Let $f(x)$ be a differentiable function. If $x_0$ is near a solution of $f(x) = 0$, then we can approximate $f(x)$ by the tangent line at $x_0$ and compute the $x$-intercept of the tangent line. The equation of the tangent line at $x_0$ is:
$$y = {f}'(x_0)(x-x_0) + f(x_0).$$

The $x$-intercept is the solution $x_1$ of the equation:

$$0 = {f}'(x_0)(x_1-x_0) + f(x_0),$$
and we solve for $x_1$:
$$x_1 = x_0  - \frac{f(x_0)}{{f}'(x_0)}.$$

If we implement this procedure repeatedly, then we obtain a sequence given by the recursive formula:

$$x_{n+1} = x_n - \frac{f(x_n)}{{f}'(x_n)},$$

which (potentially) converges to a solution of the equation $f(x) = 0$.
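
As a quick worked example (an illustration added here, using the golden-ratio function $f(x) = x^2 - x - 1$ with ${f}'(x) = 2x - 1$ and an assumed starting guess $x_0 = 2$), two steps of the recursion give:

$$x_1 = 2 - \frac{f(2)}{{f}'(2)} = 2 - \frac{1}{3} = \frac{5}{3} \approx 1.6667,$$

$$x_2 = \frac{5}{3} - \frac{f(5/3)}{{f}'(5/3)} = \frac{5}{3} - \frac{1/9}{7/3} = \frac{34}{21} \approx 1.6190,$$

which is already close to $\phi \approx 1.6180$.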

<h3 align="center">Newton's Method: Implementation</h3>


In [10]:
def newton(f,Df,x0,eps,imax):
    """
    Input: f = function, Df = function, x0 = float, eps = float, imax = integer (positive)
    f    : the function for which we are trying to approximate a solution f(x)=0.
    Df   : derivative of f(x).
    x0   : initial guess for a solution f(x)=0.
    eps  : stopping criterion is abs(f(x)) < eps.
    imax : maximum number of iterations of Newton's method.
    
    Output:
    x_N : the x-intercept of the tangent line after N iterations,
          or None if the derivative vanishes or imax is exceeded.
    """
    
    xn = x0
    for n in range(0,imax):
        f_x_n = f(xn)
        if abs(f_x_n) < eps:
            print('Found solution after',n,'iterations.')
            return xn
        Df_x_n = Df(xn)
        if Df_x_n == 0:
            print('Zero derivative. No solution found.')
            return None
        xn = xn - f_x_n/Df_x_n
    print('Exceeded maximum iterations. No solution found.')
    return None

<h3 align="center">Example: Supergolden Ratio</h3>

Let's find an approximation of the **supergolden ratio**, the only real root of the function $g(x) = x^3 - x^2 - 1$, this time using **Newton's method**. The exact value is:
 
$$\psi = \frac{1 + \sqrt[3]{\frac{29 + 3\sqrt{93}}{2}} + \sqrt[3]{\frac{29 - 3\sqrt{93}}{2}}}{3}.$$

In [11]:
g = lambda x: x**3 - x**2 - 1
Dg = lambda x: 3*x**2 - 2*x
psi_approx = newton(g,Dg,1,1e-10,20)
psi_approx

Found solution after 6 iterations.


1.4655712318767877
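
As with the secant example, the approximation can be compared with the exact value. The sketch below repeats the Newton update inline from the same starting guess $x_0 = 1$ and records the absolute error before each step; the list name `errors` is an assumption of this sketch:

```python
g = lambda x: x**3 - x**2 - 1
Dg = lambda x: 3*x**2 - 2*x
psi_true = (1 + ((29 + 3*93**0.5)/2)**(1/3) + ((29 - 3*93**0.5)/2)**(1/3))/3

x = 1.0
errors = []
for n in range(7):
    errors.append(abs(x - psi_true))  # error of the n-th iterate
    x = x - g(x) / Dg(x)              # Newton update

for n, err in enumerate(errors):
    print(n, err)
```

Once the iterates are close to the root, the error is roughly squared at each step, which is the fast convergence mentioned above.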

<h3 align="center">Divergent Example</h3>

Newton's method **diverges** in certain cases. For example, it diverges when the tangent line at the root is vertical, as for $h(x) = x^{1/3}$.

In [12]:
h = lambda x: x**(1/3)
Dh = lambda x: (1/3)*x**(-2/3)
approx = newton(h,Dh,0.1,1e-2,1000)

Exceeded maximum iterations. No solution found.
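
The reason is visible in the update itself: for the cube root, $h(x)/{h}'(x) = 3x$, so $x_{n+1} = x_n - 3x_n = -2x_n$ and every step doubles the distance from the root at $0$. The sketch below prints the first few iterates. It uses a real-valued cube root helper (the names `cbrt` and `Dcbrt` are hypothetical), because in Python 3 `x**(1/3)` returns a complex number for negative `x`:

```python
import math

def cbrt(x):
    """Real cube root, defined for negative x as well."""
    return math.copysign(abs(x) ** (1/3), x)

def Dcbrt(x):
    """Derivative of the real cube root for x != 0."""
    return (1/3) * abs(x) ** (-2/3)

x = 0.1
iterates = [x]
for _ in range(5):
    x = x - cbrt(x) / Dcbrt(x)  # Newton update; algebraically equal to -2*x
    iterates.append(x)

print(iterates)
```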


<h3 align="center">Exercises</h3>

1. Consider the function $h(x) = x^3 - x - 1$. The only real root of $h(x)$ is called the **plastic number** and is given by:

$$\varphi = \frac{\sqrt[3]{108 + 12\sqrt{69}}+ \sqrt[3]{108 - 12\sqrt{69}}}{6}.$$

2. Choose $x_0$ and implement $5$ iterations of Newton's method to approximate the plastic number for $h(x)$.
3. Use the exact value above to compute the absolute error after 2 iterations of **Newton's method**.
4. Starting with the sub-interval $[1,2]$, how many iterations of the **bisection method** are required to achieve the same accuracy?
5. Starting with the sub-interval $[1,2]$, how many iterations of the **secant method** are required to achieve the same accuracy?