##  Conjugate Gradient Method:

\\begin{equation}
\min_x \ f(x), \quad f(x) = \tfrac{1}{2} x^T A x - b^T x
\\end{equation}

\\begin{equation}
\text{using } b^T = [4, -8, 16, 1, -2, 9]
\\end{equation}
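
Assuming A is symmetric positive definite (as the solver below also assumes), minimizing f is equivalent to solving a linear system, because the gradient of f vanishes only at the minimizer:

\\begin{equation}
\nabla f(x) = A x - b = 0 \quad \Longleftrightarrow \quad A x = b
\\end{equation}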

In [118]:
import math
import pandas
import numpy as np

**Initializing the matrix and vectors**

In [119]:
x0 = np.matrix([[0],[0],[0],[0],[0],[0]])
A  = np.matrix([[4,0,0,1,0,0],[0,4,0,0,1,0],[0,0,5,0,0,1],[1,0,0,5,0,0],[0,1,0,0,6,0],[0,0,1,0,0,6]]) 
b =  np.matrix([[4],[-8],[16],[1],[-2],[9]])
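
The solver below assumes that A is symmetric positive definite. A minimal way to check that assumption is sketched here; it is written for Python 3 and stores A as a float `np.array`, which is a choice of this sketch rather than the notebook's integer `np.matrix`.

```python
import numpy as np

# Same entries as the matrix A defined above, stored as a float array
# (an assumption of this sketch).
A = np.array([[4, 0, 0, 1, 0, 0],
              [0, 4, 0, 0, 1, 0],
              [0, 0, 5, 0, 0, 1],
              [1, 0, 0, 5, 0, 0],
              [0, 1, 0, 0, 6, 0],
              [0, 0, 1, 0, 0, 6]], dtype=float)

# Symmetry: A must equal its transpose.
is_symmetric = bool(np.allclose(A, A.T))

# Positive definiteness: every eigenvalue of the symmetric matrix is > 0.
# eigvalsh returns the eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(A)
is_positive_definite = bool(eigenvalues[0] > 0)

print(is_symmetric, is_positive_definite)
```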

**Function to compute the minimizer with the conjugate gradient method**

In [120]:
def conjugate_gradient(A, b, x0, tol = 1.0e-8, max_iter = 100):
    """
    A function to solve [A]{x} = {b} linear equation system with the 
    conjugate gradient method.
    
    :param A : array 
        A real symmetric positive definite matrix(assumed)
        
    :param b : vector
        The vector of the system which is given in RHS.
        
    :param x0 : vector
        The starting guess for the solution.
        
    :param max_iter : integer
        Maximum number of iterations. Iteration will stop after max_iter 
        steps even if the specified tolerance has not been achieved.
        
    :param tol : float
        Tolerance to achieve. The algorithm terminates when the norm of
        the absolute residual is below tol.
        
    :var    r0 : vector
                 Residual from the previous iteration; initialized to (b - A * x0)
    
    :var    d  : vector
                 Current search direction; initialized to r0
    
    :var    a  : float
                 Step length along d, computed as (r0T.r0)/(dT.A.d)

    :var    ri : vector
                 Updated residual (r0 - a * A * d), used to check for convergence
    
    :var    x  : vector 
                 Current estimate of the solution, updated every iteration
                 
    :var    beta : float
                 Direction-update scalar, computed as (riT.ri)/(r0T.r0)
    """
    x = x0
    r0 = b - np.dot(A, x)
    d = r0

    # Iterations:
    for i in xrange(max_iter):
        # Caution: with integer inputs under Python 2, "/" is floor division;
        # pass float matrices if true division is intended.
        a = float(np.dot(r0.T, r0)/np.dot(np.dot(d.T, A), d))
        x = x + d*a
        ri = r0 - np.dot(A*a, d)
        
        print "iteration: ",i, "r(i): ",round(np.linalg.norm(ri),5)

        if np.linalg.norm(ri) < tol:
            print "\nConverged Successfully in iterations :",i
            print "The result of vector x:"
            return np.around(x,decimals=10)
        # Named beta so that it does not shadow the right-hand side b.
        beta = float(np.dot(ri.T, ri)/np.dot(r0.T, r0))
        d = ri + beta * d
        r0 = ri
    return x

In [121]:
conjugate_gradient(A, b, x0, tol = 1.0e-10, max_iter = 100)

iteration:  0 r(i):  20.54264
iteration:  1 r(i):  10.48844
iteration:  2 r(i):  5.69484
iteration:  3 r(i):  3.39053
iteration:  4 r(i):  2.18629
iteration:  5 r(i):  1.41935
iteration:  6 r(i):  0.86272
iteration:  7 r(i):  0.48876
iteration:  8 r(i):  0.27119
iteration:  9 r(i):  0.15636
iteration:  10 r(i):  0.09683
iteration:  11 r(i):  0.06273
iteration:  12 r(i):  0.03945
iteration:  13 r(i):  0.02287
iteration:  14 r(i):  0.01234
iteration:  15 r(i):  0.00647
iteration:  16 r(i):  0.00345
iteration:  17 r(i):  0.00197
iteration:  18 r(i):  0.00123
iteration:  19 r(i):  0.00083
iteration:  20 r(i):  0.00055
iteration:  21 r(i):  0.00034
iteration:  22 r(i):  0.0002
iteration:  23 r(i):  0.00011
iteration:  24 r(i):  6e-05
iteration:  25 r(i):  4e-05
iteration:  26 r(i):  2e-05
iteration:  27 r(i):  1e-05
iteration:  28 r(i):  1e-05
iteration:  29 r(i):  0.0
iteration:  30 r(i):  0.0
iteration:  31 r(i):  0.0
iteration:  32 r(i):  0.0
iteration:  33 r(i):  0.0
iteration:  34 r(i)

array([[ 1.],
       [-2.],
       [ 3.],
       [ 0.],
       [-0.],
       [ 1.]])
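
The log above decays much more slowly than conjugate gradient normally does on a 6 x 6 symmetric positive definite system, where exact arithmetic gives termination in at most 6 steps. One likely cause, offered as an assumption rather than a confirmed diagnosis: under Python 2, `/` on the integer matrices defined earlier is floor division, so the first step length evaluates to 422 / 2443 = 0 and the residual norm stays at sqrt(422) ≈ 20.54264, which matches iteration 0. The sketch below is a float-only variant for Python 3; the name `cg_float` and the 1-D array layout are choices of this sketch, not part of the notebook.

```python
import numpy as np

def cg_float(A, b, x0, tol=1.0e-10, max_iter=100):
    """Conjugate gradient on float arrays. Returns (x, number of steps taken)."""
    x = np.array(x0, dtype=float)
    r = b - A @ x                      # initial residual
    d = r.copy()                       # first search direction
    rr = r @ r
    if np.sqrt(rr) < tol:              # x0 already solves the system
        return x, 0
    for k in range(1, max_iter + 1):
        Ad = A @ d
        alpha = rr / (d @ Ad)          # true (float) division
        x = x + alpha * d
        r = r - alpha * Ad
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            return x, k
        d = r + (rr_new / rr) * d      # next conjugate direction
        rr = rr_new
    return x, max_iter

# Same A and b as above, built as float arrays.
A = (np.diag([4.0, 4.0, 5.0, 5.0, 6.0, 6.0])
     + np.diag(np.ones(3), 3) + np.diag(np.ones(3), -3))
b = np.array([4.0, -8.0, 16.0, 1.0, -2.0, 9.0])

x_sol, iters = cg_float(A, b, np.zeros(6))
print(iters, np.around(x_sol, decimals=10))
```

With float inputs the step count should stay close to the dimension of the system, up to rounding.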

**Comments on Results:**

- Convergence criterion: at most 100 iterations or **||ri|| < 10^(-10)** (the tolerance passed in the call above)
- Using the initial guess xT = [0,0,0,0,0,0], the approximated value of x is **[1.0,-2.0,3.0,0,0,1.0]**
- The algorithm converged in **49 iterations** using the above guess vector
- The computed values are cross-validated by plugging them back into the system and checking that the residual norm is ~0.
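
The plug-in check described in the last bullet can be sketched as follows, assuming Python 3 and float arrays. The comparison against `np.linalg.solve` is an extra cross-check added by this sketch.

```python
import numpy as np

# Same A and b as above, built as float arrays.
A = (np.diag([4.0, 4.0, 5.0, 5.0, 6.0, 6.0])
     + np.diag(np.ones(3), 3) + np.diag(np.ones(3), -3))
b = np.array([4.0, -8.0, 16.0, 1.0, -2.0, 9.0])

x = np.array([1.0, -2.0, 3.0, 0.0, 0.0, 1.0])   # value reported above

# A x - b is the gradient of f at x; it is zero at the minimizer.
residual = A @ x - b
print(np.linalg.norm(residual))

# Independent cross-check with a direct solver.
x_direct = np.linalg.solve(A, b)
print(np.allclose(x, x_direct))
```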

**Comment on Method Used:**
- The conjugate gradient method works by generating a set of vectors d that are conjugate with respect to the matrix A. That is, **dTi A dj = 0, i != j**
- The formula for αi (used as **a** here) corresponds to an **exact line search along the direction di (used as d in the code)**
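
Both properties can be checked numerically. The sketch below, assuming Python 3 and float arrays, reruns the iteration while storing every search direction. It then verifies that the off-diagonal entries of D^T A D are ~0 (conjugacy) and that each new residual is orthogonal to the direction just used, which is the first-order condition for an exact line search. The names `directions`, `slopes` and `D` are introduced only for this illustration.

```python
import numpy as np

# Same A and b as above, built as float arrays.
A = (np.diag([4.0, 4.0, 5.0, 5.0, 6.0, 6.0])
     + np.diag(np.ones(3), 3) + np.diag(np.ones(3), -3))
b = np.array([4.0, -8.0, 16.0, 1.0, -2.0, 9.0])

x = np.zeros(6)
r = b - A @ x
d = r.copy()
directions = []   # search directions d_0, d_1, ...
slopes = []       # d_i^T r_{i+1}: slope of f along d_i after the step

for _ in range(6):                    # at most n = 6 steps
    rr = r @ r
    if np.sqrt(rr) < 1e-12:
        break
    Ad = A @ d
    alpha = rr / (d @ Ad)             # exact line-search step length
    x = x + alpha * d
    r = r - alpha * Ad
    directions.append(d)
    slopes.append(d @ r)
    d = r + ((r @ r) / rr) * d        # new array, stored directions stay intact

D = np.column_stack(directions)
G = D.T @ A @ D                       # entry (i, j) is d_i^T A d_j
off_diagonal = G - np.diag(np.diag(G))

print(np.max(np.abs(off_diagonal)))   # ~0 if the directions are A-conjugate
print(np.max(np.abs(slopes)))         # ~0 if each step is an exact line search
```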