# Root Finding for Systems of Equations: Newton-Raphson
---

GENERAL PROBLEM: find the simultaneous roots of a system of $N$ non-linear functions 
$f_{1}(x_{1},\ldots,x_{N}), \dots, f_{N}(x_{1},\ldots,x_{N})$, in $N$ variables $x_{1},\ldots,x_{N}$. That is, find values of $x_{1},\ldots,x_{N}$ that satisfy the system of equations

\begin{align}
  \left.\begin{array}{l}
    f_{1}(x_{1}, \ldots, x_{N}) = 0 \\
    \quad\vdots \quad\quad\vdots \quad\quad\vdots \\
    f_{N}(x_{1}, \ldots, x_{N}) = 0
  \end{array}\right\}
  \quad\leftrightarrow\quad
  \mathbf{f}(\mathbf{x}) = \mathbf{0}
\end{align}

IDEA: generalize the Newton-Raphson method to systems of equations, leading to a vectorized iteration scheme.

PRE-REQUISITES:   
- Newton-Raphson method for a single variable
- Solving systems of linear equations (LU decomposition, Gaussian elimination, etc)

REFERENCES:
- [1] Burden and Faires, *Numerical Analysis, 7th edition*.
- [2] Ralston and Rabinowitz, *A First Course in Numerical Analysis, 2nd edition*.
- [3] Press et al, *Numerical Recipes: the Art of Scientific Computing, 3rd edition*.
- [4] Stoer and Bulirsch, *Introduction to Numerical Analysis, 2nd edition*.

## 0. Review of Newton-Raphson for a single variable

Briefly recall the Newton-Raphson method for a single variable. We want to find solutions to the non-linear equation $f(x)=0$. If the equation were linear, it would be trivial to solve for $x$. In general, however, when the equation is non-linear it may be difficult to find its roots. The idea behind the Newton-Raphson method is to replace the original non-linear equation with a linear approximation to the function at a given point (i.e., a tangent line), and then solve the resulting linear problem (i.e., find the zero-crossing of that tangent line). When things go right, the solution to the linearized problem gives an improved approximant to the solution of the non-linear problem, which can then be used to construct a new linear problem (i.e., a new tangent line whose zero-crossing is even closer to the sought root). The process is then iterated until the solution is found to the desired precision.

To start the process, we expand the function $f(x)$ around an initial guess $x_{0}$, to first order in $(x-x_{0})$, giving

\begin{align}
  f(x) \approx f(x_{0}) + (x - x_{0})f'(x_{0}).
\end{align}

Evaluating this at the root $x=x_{*}$ makes the left hand side vanish (since $f(x_{*})=0$). Re-arranging gives an approximant for the root

\begin{align}
   x_{*} \approx x_{0} - \frac{f(x_{0})}{f'(x_{0})}.
\end{align}

This is then used as the basis for an iteration scheme, using the iteration formula

\begin{align}
  x_{i+1} = x_{i} - \frac{f(x_{i})}{f'(x_{i})}.
\end{align}
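
As a minimal sketch of this iteration (not part of the original notes), the snippet below applies the formula to $f(x) = x^{2} - 2$, whose positive root is $\sqrt{2}$. The function name `newton_1d`, the stopping rule on the relative step size, and the default values of `tol` and `imax` are illustrative assumptions.

```python
def newton_1d(f, fprime, x0, tol=1e-12, imax=50):
    """Iterate x_{i+1} = x_i - f(x_i)/f'(x_i) until the step is small.

    Hypothetical helper for illustration; returns (root, iterations used).
    """
    x = x0
    for i in range(imax):
        step = f(x) / fprime(x)      # Newton correction f(x_i)/f'(x_i)
        x = x - step                 # x_{i+1}
        if abs(step) <= tol * abs(x):
            return x, i + 1
    raise RuntimeError("no convergence within imax iterations")


# example: f(x) = x^2 - 2 with f'(x) = 2x, starting from x0 = 1
root, n_iter = newton_1d(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root, n_iter)
```

Starting from a guess far from the root, or near a point where $f'(x)=0$, the same loop may wander or fail, which is why the initial guess matters.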

If the derivative of the function is not available, one may approximate it in some way. For example, substituting $f'(x_{i})$ with a forward-difference approximation gives

\begin{align}
  x_{i+1} = x_{i} - \frac{\delta{x_{i}}\,f(x_{i})}{f(x_{i} + \delta{x_{i}}) - f(x_{i})}.
\end{align}

Recall that one needs to take care in choosing $\delta{x_{i}}$ at each iteration in order to keep the solution from diverging on the one hand (if $\delta{x_{i}}$ is taken to be too large), or from being overrun by round-off error (if $\delta{x_{i}}$ is taken to be too small).
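
The trade-off in the step size can be seen in a small experiment. The sketch below (an illustration, with the test function $e^{x}$ and the three step sizes chosen arbitrarily) compares the forward-difference estimate of the derivative at $x=1$ against the exact value.

```python
import math


def forward_difference(f, x, dx):
    """Approximate f'(x) by (f(x + dx) - f(x)) / dx."""
    return (f(x + dx) - f(x)) / dx


x = 1.0
exact = math.exp(x)  # d/dx exp(x) = exp(x)

# one large, one moderate, and one very small step size
errors = {}
for dx in (1e-1, 1e-8, 1e-15):
    errors[dx] = abs(forward_difference(math.exp, x, dx) - exact)
    print(f"dx = {dx:.0e}   |error| = {errors[dx]:.3e}")
```

In double precision the large step is limited by truncation error and the tiny step by round-off, so the intermediate step gives the smallest error of the three.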

## 1. Extending Newton-Raphson to systems of equations

Now we want to find solutions to the *system* of non-linear equations 

\begin{align}
  f_{1}(x_{1}, \ldots, x_{N}) = 0 \\
  \quad\vdots \quad\quad\vdots \quad\quad\vdots \\
  f_{N}(x_{1}, \ldots, x_{N}) = 0
\end{align}

which we can write in vector form as

\begin{align}
  \mathbf{f}(\mathbf{x}) = \mathbf{0}
  \quad,\quad\text{where}\quad\quad
  \mathbf{f}(\mathbf{x}) = 
    \left[\begin{array}{c}
      f_{1}(\mathbf{x})\\
      \vdots \\
      f_{N}(\mathbf{x})
    \end{array}\right]
  \quad,\quad
  \mathbf{x} = (x_{1}, \ldots, x_{N})
\end{align}

Notice that if the system of equations were linear, we could use something akin to Gaussian elimination to solve it. Since the equations are non-linear, the problem is not as easy. However, the basic idea behind Newton-Raphson (and in fact many root-finding methods) is to turn a difficult non-linear problem into a sequence of easier linear problems. Here we simply generalize the process to more than one variable.

As with the single variable case, we start by expanding $\mathbf{f}(\mathbf{x})$ around an initial guess $\mathbf{x}_{0}$, to first order in $(\mathbf{x}-\mathbf{x}_{0})$, giving

\begin{align}
  \mathbf{f}(\mathbf{x}) \approx \mathbf{f}(\mathbf{x}_{0}) 
  + \mathbf{J}(\mathbf{x}_{0})\cdot(\mathbf{x} - \mathbf{x}_{0})
\end{align}

where $\mathbf{J}(\mathbf{x})$ is the Jacobian matrix given by

\begin{align}
  \mathbf{J}(\mathbf{x}) =
  \left[\begin{array}{ccc}
    \frac{\partial f_{1}}{\partial x_{1}}(\mathbf{x}) 
    & \cdots & \frac{\partial f_{1}}{\partial x_{N}}(\mathbf{x}) \\
    \vdots & \ddots & \vdots\\
    \frac{\partial f_{N}}{\partial x_{1}}(\mathbf{x}) 
    & \cdots & \frac{\partial f_{N}}{\partial x_{N}}(\mathbf{x}) \\
  \end{array}\right]
  \quad\quad\text{or}\quad\quad
  J_{ij}(\mathbf{x}) = \frac{\partial f_{i}}{\partial x_{j}}(\mathbf{x}).
\end{align}
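
A hand-coded Jacobian is easy to get wrong, so it can help to compare it against a finite-difference estimate. The following is a minimal sketch of such a check, using the 2D system of Section 4 as the test case; the helper name `jacobian_forward_difference` and the step `h = 1e-7` are assumptions made for illustration.

```python
import numpy as np


def F(X):
    # 2D system from Section 4
    x, y = X
    return np.array([x * x - y - 1.0,
                     (x - 2.0) ** 2 + (y - 0.5) ** 2 - 1.0])


def J_analytic(X):
    # J_ij = d f_i / d x_j, written out by hand
    x, y = X
    return np.array([[2.0 * x, -1.0],
                     [2.0 * (x - 2.0), 2.0 * (y - 0.5)]])


def jacobian_forward_difference(F, X, h=1e-7):
    """Estimate J column by column: J[:, j] ~ (F(X + h e_j) - F(X)) / h."""
    X = np.asarray(X, dtype=float)
    F0 = F(X)
    Jnum = np.zeros((F0.size, X.size))
    for j in range(X.size):
        Xh = X.copy()
        Xh[j] += h                      # perturb only the j-th variable
        Jnum[:, j] = (F(Xh) - F0) / h
    return Jnum


X = np.array([1.22, 0.7])
max_diff = np.max(np.abs(jacobian_forward_difference(F, X) - J_analytic(X)))
print("largest entry-wise difference:", max_diff)
```

A difference much larger than the step `h` would suggest a mistake in one of the analytic partial derivatives.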


Evaluating the expansion at the root $\mathbf{x}=\mathbf{x}_{*}$ makes the left hand side vanish (since $\mathbf{f}(\mathbf{x}_{*})=\mathbf{0}$). Re-arranging, assuming $\mathbf{J}(\mathbf{x}_{0})$ is invertible, gives an approximant for the root

\begin{align}
   \mathbf{x}_{*} \approx \mathbf{x}_{0} - \mathbf{J}^{-1}(\mathbf{x}_{0})\cdot\mathbf{f}(\mathbf{x}_{0}).
\end{align}

This is then used as the basis for an iteration scheme. In close analogy with the single variable case, the iteration equation can be written as

\begin{align}
   \mathbf{x}_{i+1} = \mathbf{x}_{i} - \mathbf{J}^{-1}(\mathbf{x}_{i})\cdot\mathbf{f}(\mathbf{x}_{i}).
\end{align}

In practice, however, it is not necessary to construct $\mathbf{J}^{-1}$ explicitly, which can be computationally expensive. It is enough to solve a linear system for the correction $\mathbf{\delta x}$ at each iteration and then use that to obtain an improved approximant. This two-step process is represented as

\begin{align}
  &\text{Step 1:}\quad \mathbf{J}(\mathbf{x}_{i})\cdot\mathbf{\delta x}_{i} = -\mathbf{f}(\mathbf{x}_{i})
  \quad\text{(solve for $\mathbf{\delta x}_{i}$)}\\
  &\text{Step 2:}\quad \mathbf{x}_{i+1} = \mathbf{x}_{i} + \mathbf{\delta x}_{i}
\end{align}
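
A single pass through the two steps can be sketched as follows, using the 2D system of Section 4 and its initial guess $(1.22, 0.7)$. The names `F` and `J` are local to this illustration; the explicit inverse is computed only to show that both routes give the same correction.

```python
import numpy as np


def F(X):
    # 2D system from Section 4
    x, y = X
    return np.array([x * x - y - 1.0,
                     (x - 2.0) ** 2 + (y - 0.5) ** 2 - 1.0])


def J(X):
    # Jacobian of the system above
    x, y = X
    return np.array([[2.0 * x, -1.0],
                     [2.0 * (x - 2.0), 2.0 * (y - 0.5)]])


X0 = np.array([1.22, 0.7])

# Step 1: solve J(X0) . dX = -F(X0) without forming the inverse
dX = np.linalg.solve(J(X0), -F(X0))

# Step 2: update the approximant
X1 = X0 + dX
print("X1 =", X1)

# for comparison only: the same correction via the explicit inverse
dX_inv = -np.linalg.inv(J(X0)) @ F(X0)
print("same correction:", np.allclose(dX, dX_inv))
```

For a $2\times 2$ system the cost difference is negligible, but for large $N$ solving the linear system directly is generally cheaper and numerically safer than forming $\mathbf{J}^{-1}$.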

Notice that in step 1 we need to find a solution to a system of linear equations using something akin to Gaussian elimination. Although this is not trivial, it is easier than our original problem. What we have done, in other words, is trade the difficult problem of finding a solution to a system of *non-linear* equations for a *sequence* of simpler tasks: that of solving *linear* systems of equations. The catch is that we have to do this repeatedly. No big deal if we have enough time and computing power. (Various methods have been developed to reduce, or shift, this computational effort. See [3] or [4] for further discussion.)

Having solved the linear system at one iteration, we obtain a new and improved approximant to the solution. That is then used to generate a new linear system to be solved. The process is repeated until the solution is found to the desired precision (or the maximum number of allowed iterations is reached). As with the one-dimensional Newton-Raphson method, success depends on starting with a sufficiently close initial approximant. For that reason, the method is best used in tandem with a more globally robust method such as a gradient descent method. That method will be described elsewhere.

## 2. Algorithm

**INPUT**
- $\mathbf{x}_{0}$, initial guess for the root of the system of equations in question.
- TOL, the tolerance that both the relative change in $\mathbf{x}$ (REL) and the residual norm (ABS) are required to meet.
- $i_\mathrm{max}$, maximum number of iterations allowed.

**Initialize loop**
- set $i = 0$

**Loop** while $i \leq i_\mathrm{max}$


- calculate $\mathbf{f}(\mathbf{x}_{i})$, $\mathbf{J}(\mathbf{x}_{i})$


- solve $\mathbf{J}(\mathbf{x}_{i})\cdot\mathbf{\delta x}_{i} = -\mathbf{f}(\mathbf{x}_{i})$


- calculate $\mathbf{x}_{i+1} = \mathbf{x}_{i} + \mathbf{\delta x}_{i}$


- save result


- calculate the relative uncertainty using: REL $= ||\mathbf{x}_{i+1} - \mathbf{x}_{i}||\,\,/\,\, ||\mathbf{x}_{i+1}||$


- calculate ABS $= ||\mathbf{f}(\mathbf{x}_{i+1})||$


- if (REL $\leq$ TOL) and (ABS $\leq$ TOL), stop. Otherwise, continue.


- increment $i \to i + 1$, so that $\mathbf{x}_{i+1}$ becomes the current approximant


**Max iterations reached**
- Print message that max iterations have been reached, and stop.

**OUTPUT**

solution found, or message of failure

## 3. Code

In [12]:
%%writefile newton_raphson2.py
import numpy as np
import numpy.linalg as la 
import sys
def newton_raphson(F, J, X0, TOL, imax):
    """
    Function that searches for solution of a system of equations,
    F(X)=0, using the Newton-Raphson method.
    
    INPUT
    F    : function making up the system of equations 
    J    : Jacobian of first derivatives
    X0   : array of initial guesses for solution
    TOL  : allowed tolerance
    imax : maximum number of iterations
    
    OUTPUT
    solution to within the allowed tolerance, or failure message
    
    """
    
    # initialize output array
    solns = []
    
    # initialize iteration
    XOld = X0 # set initial approximant
    print('Initial guess for solution',X0)
    
    # iterate search using the Newton-Raphson method
    i = 0  # reset iteration number
    while i <= imax:

        # announce start of next iteration
        print('Iteration',i,':')

        # get function values
        FOld = F(XOld)

        # get Jacobian values
        JOld = J(XOld) 
        
        # solve for DX in J*DX = -F
        A = JOld
        b = -FOld
        # linear algebra solver
        DX = la.solve(A, b) 
        
        # update approximate location of root
        XNew = XOld + DX
        print('  approximate location of root at',XNew)
    
        # calculate errors
        XErr = la.norm(XNew - XOld, 2)
        REL  = XErr/la.norm(XNew, 2)
        ABS  = la.norm(F(XNew), 2)

        # save approximant and error
        solns.append([XNew, REL, ABS])

        # check if errors are within the allowed tolerance
        if (REL <= TOL and ABS <= TOL):
            best   = XNew  #best estimate
            uncert = XErr  #uncertainty
            print('SUCCESS! Solution found within the specified tolerance after',i,'iterations.')
            print('Solution is',best,'+/-',uncert)
            return solns
                
        # rotate approximants
        XOld = XNew

        # increment iteration number
        i = i + 1
        
    # print message that max iteration has been reached
    print('FAIL! Max number of iterations has been reached. Stopping.')
    
    return solns

Overwriting newton_raphson2.py


In [13]:
%run newton_raphson2.py

## 4. A simple 2D example

(Ralston and Rabinowitz, example 8.7)

The system of equations

\begin{align}
  & f_{1}(x, y) = x^2 - y - 1 \\
  & f_{2}(x, y) = (x - 2)^2 + (y - 0.5)^2 - 1
\end{align}

has two solutions

\begin{align}
  & r_{1} = [1.54634288332, 1.39117631279] \\
  & r_{2} = [1.06734608581, 0.139227666887]
\end{align}
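
As a quick sanity check (a sketch, not part of the original notebook), the quoted roots can be substituted back into the system. The residual norms should be close to zero, limited by the number of digits quoted.

```python
import numpy as np


def system(XX):
    # same two equations as above
    x, y = XX
    return np.array([x * x - y - 1.0,
                     (x - 2.0) ** 2 + (y - 0.5) ** 2 - 1.0])


r1 = np.array([1.54634288332, 1.39117631279])
r2 = np.array([1.06734608581, 0.139227666887])

# Euclidean norm of f(r) for each quoted root
res1 = np.linalg.norm(system(r1))
res2 = np.linalg.norm(system(r2))
print("||f(r1)|| =", res1)
print("||f(r2)|| =", res2)
```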

In [25]:
# system of equations
def system(XX):
    x = XX[0]
    y = XX[1]
    dim = len(XX)
    f = np.zeros(dim)
    f[0] = x*x - y - 1 
    f[1] = (x - 2)**2 + (y - 0.5)**2 - 1
    return f

# Jacobian matrix elements
def jacobian(XX):
    x = XX[0]
    y = XX[1]
    dim = len(XX)
    dfdx = np.zeros((dim, dim))
    dfdx[0,0] = 2*x
    dfdx[0,1] = -1
    dfdx[1,0] = 2*(x - 2)
    dfdx[1,1] = 2*(y - 0.5)
    return dfdx

In [26]:
# exact solutions
exact1 = np.array([1.54634288332, 1.39117631279])
exact2 = np.array([1.06734608581, 0.139227666887])

# solve system
X0 = np.array([1.22, 0.7])
#X0 = np.array([2.0, 4.0])
TOL = 1e-3
IMAX = 10
solns = newton_raphson(system, jacobian, X0, TOL, IMAX)

Initial guess for solution [1.22 0.7 ]
Iteration 0 :
  approximate location of root at [ 0.4730137  -1.33424658]
Iteration 1 :
  approximate location of root at [ 0.87904998 -0.3921366 ]
Iteration 2 :
  approximate location of root at [1.0200233  0.02057406]
Iteration 3 :
  approximate location of root at [1.06372744 0.12960601]
Iteration 4 :
  approximate location of root at [1.06731826 0.13915537]
Iteration 5 :
  approximate location of root at [1.06734608 0.13922766]
SUCCESS! Solution found within the specified tolerance after 5 iterations.
Solution is [1.06734608 0.13922766] +/- 7.746683083641568e-05


In [27]:
# print solutions
iterations = len(solns)
print('x \t\t y \t\t rel_error \t abs_error')
for i in range(iterations):
    xsoln = solns[i][0][0]
    ysoln = solns[i][0][1]
    rel_err = solns[i][1]
    abs_err = solns[i][2]
    print('%.8f \t %.8f \t %.8f \t %.8f' % (xsoln, ysoln, rel_err, abs_err))

x 		 y 		 rel_error 	 abs_error
0.47301370 	 -1.33424658 	 1.53082940 	 4.72918112
0.87904998 	 -0.39213660 	 1.06579893 	 1.06527159
1.02002330 	 0.02057406 	 0.42747518 	 0.19123899
1.06372744 	 0.12960601 	 0.10961704 	 0.01392959
1.06731826 	 0.13915537 	 0.00947847 	 0.00010488
1.06734608 	 0.13922766 	 0.00007197 	 0.00000001
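
The errors in the table are estimates based on successive iterates and on the residual. Since the root $r_{2}$ is known here, the true error can also be examined. The sketch below hard-codes the iterates printed above (rounded to 8 digits, so the last error cannot be resolved much below $10^{-8}$) and measures their distance to $r_{2}$.

```python
import numpy as np

# quoted root r2 and the iterates printed by the run above
exact2 = np.array([1.06734608581, 0.139227666887])
iterates = np.array([
    [0.4730137, -1.33424658],
    [0.87904998, -0.3921366],
    [1.0200233, 0.02057406],
    [1.06372744, 0.12960601],
    [1.06731826, 0.13915537],
    [1.06734608, 0.13922766],
])

# Euclidean distance of each iterate from r2
true_err = np.linalg.norm(iterates - exact2, axis=1)
for i, e in enumerate(true_err):
    print(f"iteration {i}: |x_i - r2| = {e:.2e}")
```

The distance to the root shrinks at every iteration, slowly at first and then rapidly once the iterate is close to $r_{2}$.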


## 5. A simple 3D example

(Burden and Faires, example 10.2.1)

The system of equations

\begin{align}
  & f_{1}(x, y, z) = 3x - \cos(yz) - 0.5 \\
  & f_{2}(x, y, z) = x^2 - 81(y + 0.1)^2 + \sin(z) + 1.06\\
  & f_{3}(x, y, z) = \exp(-xy) + 20z + \frac{10\pi - 3}{3}
\end{align}

has a solution near

\begin{align}
  r_{1} = [0.5, 0.0, -0.52359877]
\end{align}
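
The quoted value of $z$ agrees with $-\pi/6$ to the digits shown, and direct substitution shows that $(0.5,\, 0,\, -\pi/6)$ satisfies all three equations. The sketch below (an illustrative check, not part of the original notebook) confirms this numerically.

```python
import numpy as np


def system(XX):
    # same three equations as above
    x, y, z = XX
    return np.array([
        3.0 * x - np.cos(y * z) - 0.5,
        x ** 2 - 81.0 * (y + 0.1) ** 2 + np.sin(z) + 1.06,
        np.exp(-x * y) + 20.0 * z + (10.0 * np.pi - 3.0) / 3.0,
    ])


# candidate root with z = -pi/6
candidate = np.array([0.5, 0.0, -np.pi / 6.0])
residual = np.linalg.norm(system(candidate))
print("candidate =", candidate)
print("||f(candidate)|| =", residual)
```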

In [28]:
# system of equations
def system(XX):
    x = XX[0]
    y = XX[1]
    z = XX[2]
    dim = len(XX)
    f = np.zeros(dim)
    f[0] = 3*x - np.cos(y*z) - 0.5 
    f[1] = x**2 - 81.0*(y + 0.1)**2 + np.sin(z) + 1.06
    f[2] = np.exp(-x*y) + 20.0*z + (10.0*np.pi - 3.0)/3.0
    return f

# Jacobian matrix elements
def jacobian(XX):
    x = XX[0]
    y = XX[1]
    z = XX[2]
    dim = len(XX)
    dfdx = np.zeros((dim, dim))
    dfdx[0,0] = 3.0
    dfdx[0,1] = z*np.sin(y*z)
    dfdx[0,2] = y*np.sin(y*z)
    
    dfdx[1,0] = 2.0*x
    dfdx[1,1] = -162.0*(y + 0.1)
    dfdx[1,2] = np.cos(z)

    dfdx[2,0] = -y*np.exp(-x*y)
    dfdx[2,1] = -x*np.exp(-x*y)
    dfdx[2,2] = 20.0

    return dfdx

In [33]:
# exact solutions
exact1 = np.array([0.5, 0.0, -0.52359877])

# solve system
#X0 = np.array([0.1, 0.1, -0.1])
X0 = np.array([0.0, 0.0, 0.0])
TOL = 1e-3
IMAX = 100
solns = newton_raphson(system, jacobian, X0, TOL, IMAX)

Initial guess for solution [0. 0. 0.]
Iteration 0 :
  approximate location of root at [ 0.5        -0.01688881 -0.52359878]
Iteration 1 :
  approximate location of root at [ 0.50001569  0.00172004 -0.52355363]
Iteration 2 :
  approximate location of root at [ 5.00000133e-01  1.45705210e-05 -5.23598394e-01]
Iteration 3 :
  approximate location of root at [ 5.00000000e-01  1.06342783e-09 -5.23598776e-01]
SUCCESS! Solution found within the specified tolerance after 3 iterations.
Solution is [ 5.00000000e-01  1.06342783e-09 -5.23598776e-01] +/- 1.457504577602255e-05


In [30]:
# print results
iterations = len(solns)
print('x \t\t y \t\t z \t\t rel_error \t abs_error')
for i in range(iterations):
    xsoln = solns[i][0][0]
    ysoln = solns[i][0][1]
    zsoln = solns[i][0][2]
    rel_err = solns[i][1]
    abs_err = solns[i][2]
    print('%.8f \t %.8f \t %.8f \t %.8f \t %.8f' % (xsoln, ysoln, zsoln, rel_err, abs_err))

x 		 y 		 z 		 rel_error 	 abs_error
0.49986969 	 0.01946808 	 -0.52148048 	 0.81167414 	 0.34592446
0.50001424 	 0.00158877 	 -0.52355696 	 0.02486308 	 0.02589220
0.50000011 	 0.00001245 	 -0.52359845 	 0.00217813 	 0.00020127
0.50000000 	 0.00000000 	 -0.52359878 	 0.00001720 	 0.00000001
