# 3.1: Solving Nonlinear Equations (Root Finding)

## The Basic Premise

Assume that $f$ is continuous on $[a,b]$, denoted by $f \in C[a,b]$. (NOTE: $C[a,b]$ is the set of all continuous functions on $[a,b]$). Then there are 3 possible cases:

* There is exactly 1 root $x^* \in [a,b]$
* There are multiple roots $x_1^*, x_2^*, \cdots, x_k^* \in [a,b]$
* There are no roots in $[a,b]$

## Iterative Root Finding Methods

The basic idea is that we are given an initial guess $x_0 \in [a,b]$. Then we generate a sequence of iterates $x_1, x_2, x_3, \cdots$ that hopefully converges to $x^*$. We stop iterating at some finite $n$. Use $x_n \approx x^*$ as your final answer.

It is important that you choose a good stopping criterion for the problem that you want to solve. Some examples are listed here:

* Choose fixed $n$ in advance and simply do that many iterations
    * crude and usually a very bad idea
* Stop when $|x_n - x_{n-1}| < AT$ where $AT$ is the absolute tolerance that you choose in advance.
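    * This one is scale-dependent: it is reasonable when the iterates are of moderate size, but a fixed $AT$ can be too loose when the root is tiny and unattainable in floating point when the root is huge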
* Stop when $\frac{|x_n - x_{n-1}|}{|x_n|} < RT$ where $RT$ is the relative tolerance that you choose in advance
    * Textbook claims that this one is more robust if your numbers are small
    * Can combine this with the previous bullet point
        * i.e.: stop when BOTH are true
* Check the residual: $|f(x_n) - f(x^*)| = |f(x_n)| < FT$ where $FT$ is some function tolerance that we define ahead of time
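The criteria above can be sketched as small MATLAB tests. This is only an illustration: the tolerance values, the toy function, and the two "iterates" are made up, and the names `absTest`, `relTest`, `resTest`, and `stepTest` are hypothetical helpers, not part of the course code.

```matlab
% Hypothetical tolerances and a toy function, chosen only for illustration
f  = @(x) x.^2 - 3;
AT = 1e-8;  RT = 1e-8;  FT = 1e-8;

% Each test takes the current iterate xn (and the previous iterate xprev)
absTest = @(xn, xprev) abs(xn - xprev) < AT;
relTest = @(xn, xprev) abs(xn - xprev) < RT * abs(xn);  % same test, written without dividing by xn
resTest = @(xn) abs(f(xn)) < FT;

% Combined step test: stop only when BOTH the absolute and relative tests hold
stepTest = @(xn, xprev) absTest(xn, xprev) && relTest(xn, xprev);

xprev = 1.7320508;        % two made-up consecutive iterates near sqrt(3)
xn    = 1.73205080757;

stepOK = stepTest(xn, xprev);
resOK  = resTest(xn);
disp([stepOK, resOK])
```

Writing the relative test as $|x_n - x_{n-1}| < RT \cdot |x_n|$ is a design choice that avoids a division by zero when $x_n = 0$.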

## NUMERICAL METHOD: Bisection Method

The Bisection Method is a consequence of the IVT (Intermediate Value Theorem).

If $f$ is continuous on $[a,b]$ **AND** $f(a) \cdot f(b) < 0$ (this means that $f(a)$ and $f(b)$ have opposite signs), then there exists a point $x^* \in (a,b)$ such that $f(x^*) = 0$.

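The idea is as follows (assume that there is exactly 1 root in $[a,b]$, and write $a_0 = a$, $b_0 = b$). At each step, keep the half of the interval on which $f$ changes sign:

* STEP 0: Let $x_0 = \frac{a_0+b_0}{2}$. Check $f(x_0)$. If $f(a_0) \cdot f(x_0) < 0$, the root is in the left half, so update $a_1 = a_0, b_1 = x_0$. Otherwise the root is in the right half, so update $a_1 = x_0, b_1 = b_0$.
* STEP 1: Let $x_1 = \frac{a_1+b_1}{2}$. Check $f(x_1)$ and update in the same way (e.g. $a_2 = x_1, b_2 = b_1$ if the root is in the right half).
* STEP 2: Let $x_2 = \frac{a_2+b_2}{2}$. Check $f(x_2)$ and update in the same way (e.g. $a_3 = x_2, b_3 = b_2$ if the root is in the right half).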
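
To make the steps concrete, here is a short trace of STEP 0 through STEP 2. It is a sketch under assumptions: the function $f(x) = x^2 - 3$ and the interval $[0, 2]$ are picked only for illustration.

```matlab
f = @(x) x.^2 - 3;   % illustrative function, root at sqrt(3)
a = 0;  b = 2;       % f(a) < 0 < f(b), so the IVT applies

for n = 0:2
    xn = (a + b)/2;  % midpoint of the current interval
    fprintf('STEP %d: [a, b] = [%g, %g], x_n = %g, f(x_n) = %g\n', n, a, b, xn, f(xn));
    if f(a) * f(xn) < 0
        b = xn;      % sign change in the left half
    else
        a = xn;      % sign change in the right half
    end
end

fprintf('After STEP 2: [a, b] = [%g, %g]\n', a, b);
```

With these choices the interval shrinks from $[0, 2]$ to $[1, 2]$, then $[1.5, 2]$, then $[1.5, 1.75]$, which still contains $\sqrt{3} \approx 1.732$.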

The code for this algorithm can be seen below:

In [6]:
%%file bisectionMethod.m

% f = function handle
% a = left endpoint of interval
% b = right endpoint of interval
% where f(a) * f(b) < 0
% tol = error tolerance on residual

% xn = nth iterate (final approximation)
% n = number of iterations

function [xn, n] = bisectionMethod(f,a,b,tol)

    residual = Inf;                % Initializing residual
    n = 0;                         % # of iterations
    
    while (residual > tol)
    
        xn = (a+b)/2;              % nth iterate is the midpoint of the interval
        residual = abs(f(xn));
        
        if f(a) * f(xn) < 0
            b = xn;
        else
            a = xn;
        end
        
        n = n + 1;                  % Increment loop counter
    
    end

end



In [8]:
f = @(x) x.^2 - 3;

format long
a = 0;
b = 2;
tol = 1e-7;

[x_approx, n] = bisectionMethod(f,a,b,tol)


xstar = sqrt(3) % true root
err = abs(x_approx - xstar)


x_approx =

   1.732050836086273


n =

    25


xstar =

   1.732050807568877


err =

     2.851739600018277e-08




## Why This Algorithm is NOT Super Bueno

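* If you want a specific error tolerance on $|x_n - x^*|$, you can't enforce it: the loop only tests the residual $|f(x_n)|$, and you can't know the number of iterations in advance
* It's really easy to break this script depending on the function that you input (e.g. nothing checks that $f(a) \cdot f(b) < 0$, and the loop never ends if the residual can't get below `tol`)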
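
A small residual does not imply a small error. As an illustration (the rescaled function `g` is a made-up example, not from the course), the residual test already passes at the very first midpoint even though the iterate is far from the root:

```matlab
g   = @(x) 1e-9 * (x.^2 - 3);   % hypothetical rescaled function, same root sqrt(3)
tol = 1e-7;

x0 = (0 + 2)/2;                 % first midpoint on [0, 2]
residual = abs(g(x0));          % this is what the while loop tests

stopsImmediately = residual <= tol;   % the loop condition (residual > tol) is already false
err = abs(x0 - sqrt(3));              % actual error of the returned iterate

fprintf('residual = %g, stops = %d, err = %g\n', residual, stopsImmediately, err);
```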

While these are some issues with the algorithm, we can do some mathy mathz (aka error analysis) to help improve our bisection method algorithm.

## Fixing the Bisection Method

On the $n^{th}$ iteration, we know $x^* \in [a_n, b_n]$ (with $a_0 = a$ and $b_0 = b$). We also know that $x_n = \frac{a_n+b_n}{2}$ SO $|x_n-x^*| \le \frac{1}{2}(b_n-a_n)$. **BL: The interval length is halved at every iteration**, i.e. $b_n - a_n = \frac{1}{2}(b_{n-1}-a_{n-1})$. Applying this repeatedly, we can write the bound as $\frac{1}{2}(b_n-a_n) = \frac{1}{2}\frac{1}{2}(b_{n-1}-a_{n-1}) = \frac{1}{2}\frac{1}{2}\frac{1}{2}(b_{n-2}-a_{n-2}) = \cdots = \frac{1}{2^{n+1}}(b_0-a_0)$.

This makes the absolute error $\epsilon_n = |x_n - x^*| \le \frac{1}{2^{n+1}}(b-a)$. To guarantee $\epsilon_n \le ATOL$, it is enough to make this bound at most $ATOL$ and solve for $n$:

$$
    \begin{aligned}
        \frac{1}{2^{n+1}}(b-a) &\le ATOL\\
        \frac{1}{2}\frac{b-a}{2^n} &\le ATOL\\
        \frac{b-a}{2^n} &\le 2 \cdot ATOL\\
        2^n &\ge \frac{b-a}{2 \cdot ATOL}\\
        n &\ge \log_2\left(\frac{b-a}{2 \cdot ATOL}\right)
    \end{aligned}
$$

Now you can use this formula to know the number of iterations in advance. If the right-hand side is not an integer, then you can simply round up in your code.
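
As a quick check of the formula, here is a sketch that plugs in example values (the interval $[0, 2]$ and $ATOL = 10^{-7}$ are assumptions chosen for illustration) and confirms that the rounded-up $n$ is the first one whose error bound meets the tolerance:

```matlab
a = 0;  b = 2;  atol = 1e-7;          % illustrative values

N = ceil(log2((b - a)/(2*atol)));     % smallest integer n satisfying the bound

boundN    = (b - a)/2^(N+1);          % error bound for x_N
boundPrev = (b - a)/2^N;              % error bound for x_{N-1}

fprintf('N = %d, bound at N = %g, bound at N-1 = %g\n', N, boundN, boundPrev);
```

Here the bound at $N$ is below `atol` while the bound at $N-1$ is still above it, so rounding up gives the smallest safe stopping index.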

Here's an example of a better bisection method:

In [1]:
%%file betterBisectionMethod.m

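% f = function handle
% a = left endpoint of interval
% b = right endpoint of interval
% where f(a) * f(b) < 0
% atol = absolute tolerance on the error |xn - x*|

% xn = nth iterate (final approximation)
% n = index of the final iterate (n+1 midpoints are computed)

function [xn, n] = betterBisectionMethod(f,a,b,atol)

    fa = f(a);
    if fa * f(b) >= 0
        error('f(a) and f(b) must have opposite signs');
    end

    N = max(0, ceil(log2((b-a)/(2 * atol)))); % stopping iteration, from the error bound
    
    for n = 0:N
    
        xn = (a+b)/2;              % nth iterate is the midpoint of the interval
        fxn = f(xn);
        
        if fxn == 0                % landed exactly on the root
            break
        elseif fa * fxn < 0        % sign change in [a, xn]: keep the left half
            b = xn;
        else                       % sign change in [xn, b]: keep the right half
            a = xn;
            fa = fxn;
        end
    
    end

end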



In [None]:
f = @(x) x.^2 - 3;

format long
a = 0;
b = 2;
atol = 1e-7;

[x_approx, n] = betterBisectionMethod(f,a,b,atol)


xstar = sqrt(3) % true root
err = abs(x_approx - xstar)

## The Convergence of the Bisection Algorithm

We saw that the absolute error $\epsilon_n = |x_n - x^*| \le \frac{1}{2^{n+1}}(b-a)$. So:

$$
    0 \le \epsilon_n \le \frac{1}{2} \underbrace{\frac{b-a}{2^n}}_{\approx \epsilon_{n-1}} = \left(\frac{1}{2}\right)^2 \underbrace{\frac{b-a}{2^{n-1}}}_{\approx \epsilon_{n-2}} = \cdots = \left(\frac{1}{2}\right)^n \underbrace{\frac{b-a}{2}}_{\approx \epsilon_0}
$$

By the sandwich theorem, as $n \rightarrow \infty$, $\epsilon_n \rightarrow 0$ so convergence is guaranteed.

**In general**, if an iterative root-finding method satisfies $\epsilon_n \approx p^n\epsilon_0$ for some $0 < p < 1$, then the convergence rate is defined to be:

$$
    RATE = -\log_{10}(p)
$$

Then it will take $k = \text{ceil}\left(\frac{1}{RATE}\right)$ iterations to gain one decimal digit of precision, i.e. to reduce the error by a factor of 10.
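
The $\frac{1}{RATE}$ comes from asking how many iterations $k$ shrink the error by a factor of 10. A short derivation, assuming the model $\epsilon_n \approx p^n\epsilon_0$ holds exactly (note that $\log_{10}(p) < 0$, so dividing by it flips the inequality):

$$
    \begin{aligned}
        \epsilon_{n+k} \approx p^k \epsilon_n &\le \frac{1}{10}\epsilon_n\\
        k \log_{10}(p) &\le -1\\
        k &\ge \frac{1}{-\log_{10}(p)} = \frac{1}{RATE}
    \end{aligned}
$$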

**FOR THE BISECTION ALGORITHM:** $p = \frac{1}{2}$, $RATE = -\log_{10}\frac{1}{2}\approx 0.301$ and we can expect $k = \text{ceil}\left(\frac{1}{-\log_{10}\frac{1}{2}}\right) = \text{ceil}(3.322) = 4$ iterations per decimal digit. This is *linear* convergence, which is pretty slow (a faster algorithm would have *quadratic* or *cubic* convergence). Bisection is guaranteed to converge whenever $f$ is continuous and $f(a) \cdot f(b) < 0$, but it's very slow.
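
The numbers for bisection can be checked directly. This is a minimal sketch using only the definitions above; the variable names are illustrative.

```matlab
p    = 1/2;                 % error reduction factor per bisection step
rate = -log10(p);           % convergence rate
k    = ceil(1/rate);        % iterations needed per decimal digit

shrinkK     = p^k;          % error bound shrinks by this factor over k steps
shrinkKless = p^(k-1);      % one step fewer is not enough for a full digit

fprintf('rate = %.3f, k = %d, p^k = %g, p^(k-1) = %g\n', rate, k, shrinkK, shrinkKless);
```

Four steps shrink the bound by a factor of $\frac{1}{16}$, which is below $\frac{1}{10}$, while three steps only give $\frac{1}{8}$.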