# 15.2 Matrix-chain Multiplication

## Matrices must be compatible
We can multiply two matrices $A$ and $B$ only if they are ***compatible***: the number of columns of $A$ must equal the number of rows of $B$. If $A$ is a $p\times q$ matrix and $B$ is a $q\times r$ matrix, then the product $C=AB$ is a $p\times r$ matrix, and computing it with the standard algorithm takes $pqr$ scalar multiplications.

## Matrix multiplication is associative
In computing the product $A_1A_2...A_n$ of $n$ matrices, we can place the parentheses however we like, because matrix multiplication is ***associative***: all parenthesizations yield the same result. However, different parenthesizations of a matrix product can incur very different costs. Consider a chain $\langle A_1, A_2, A_3\rangle$ of three matrices with dimensions $10\times 100$, $100\times 5$, and $5\times 50$, respectively.
* If we multiply according to $A_1(A_2A_3)$, we perform $10\times 100\times 50+100\times 5\times 50=75000$ scalar multiplications
* If we multiply according to $(A_1A_2)A_3$, we perform $10\times 100\times 5+10\times 5\times 50=7500$ scalar multiplications

The second parenthesization is 10 times faster than the first one!
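
The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `mult_cost` is hypothetical and simply returns $pqr$ for a $p\times q$ by $q\times r$ product.

```python
def mult_cost(p, q, r):
    """Scalar multiplications for a (p x q) times (q x r) product (hypothetical helper)."""
    return p * q * r

# Dimensions: A1 is 10x100, A2 is 100x5, A3 is 5x50.
# A1(A2A3): first A2A3 (100x5 by 5x50), then A1 times the 100x50 result.
cost_right = mult_cost(100, 5, 50) + mult_cost(10, 100, 50)
# (A1A2)A3: first A1A2 (10x100 by 100x5), then the 10x5 result times A3.
cost_left = mult_cost(10, 100, 5) + mult_cost(10, 5, 50)

print(cost_right, cost_left)  # 75000 7500
```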

## Matrix-chain multiplication problem
Given a chain $\langle A_1, A_2, ..., A_n\rangle$ of $n$ matrices, where for $i=1,2,...,n$, matrix $A_i$ has dimension $p_{i-1}\times p_i$, fully parenthesize the product $A_1A_2...A_n$ in a way that ***minimizes*** the number of scalar multiplications.

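The number of alternative parenthesizations of a sequence of $n$ matrices is denoted by $P(n)$. When $n\geq 2$, a fully parenthesized product is the product of two fully parenthesized subproducts, and the split between them may occur between the $k$th and $(k+1)$st matrices for any $k=1,2,...,n-1$. This gives the following recurrence: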

$$
\begin{align}
P(n)=\left\{
        \begin{array}{ll}
        1 &\text{if}\ n=1\\
        \sum^{n-1}_{k=1}P(k)P(n-k) &\text{if}\ n\geq 2\\
        \end{array}
        \right.
\end{align}
$$
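
To get a feel for how quickly $P(n)$ grows, the recurrence can be evaluated directly. The sketch below is illustrative only; the function name `num_parenthesizations` is an assumption, and memoization via `functools.lru_cache` is used just to avoid recomputing subproblems.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def num_parenthesizations(n):
    """Evaluate P(n) from the recurrence (hypothetical helper name)."""
    if n == 1:
        return 1
    # Sum over every position k of the outermost split.
    return sum(num_parenthesizations(k) * num_parenthesizations(n - k)
               for k in range(1, n))

print([num_parenthesizations(n) for n in range(1, 8)])
# [1, 1, 2, 5, 14, 42, 132]
```

The values grow exponentially in $n$, which is why checking every parenthesization by brute force is a poor strategy.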

In the matrix-chain multiplication problem, we do not actually multiply the matrices; the goal is only to determine an **order of multiplications** that has the lowest cost.

## Applying dynamic programming
We shall use the **dynamic-programming** method to determine how to optimally parenthesize a matrix chain. In doing so, we shall follow the four-step sequence that we stated at the beginning of this chapter:
1. Characterize the structure of an optimal solution
2. Recursively define the value of an optimal solution
3. Compute the value of an optimal solution
4. Construct an optimal solution from computed information

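### Step 1: The structure of an optimal parenthesization
Suppose that to optimally parenthesize $A_iA_{i+1}...A_j$, we split the product between $A_k$ and $A_{k+1}$. Then the way we parenthesize the "prefix" subchain $A_iA_{i+1}...A_k$ within this optimal parenthesization of $A_iA_{i+1}...A_j$ must be an **optimal parenthesization** of $A_iA_{i+1}...A_k$. Otherwise, substituting a cheaper parenthesization of the prefix would lower the total cost, a contradiction. The same argument applies to the "suffix" subchain $A_{k+1}A_{k+2}...A_j$.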

This optimal substructure allows us to construct an optimal solution to the problem from optimal solutions to subproblems:
1. Build an optimal solution to an instance of the matrix-chain multiplication by **splitting the problem into two subproblems**
2. Find **optimal solutions to subproblem** instances
3. **Combine** these optimal subproblem solutions

When searching for the correct place to split the product, we must consider **all** possible places, so that we are sure of having examined the optimal one.

### Step 2: A recursive solution
A subproblem of the matrix-chain multiplication problem $A_1A_2...A_n$ is to determine the minimum cost of parenthesizing $A_iA_{i+1}...A_j$ for $1\leq i\leq j\leq n$. Let $m[i,j]$ be the minimum number of scalar multiplications needed to compute the matrix $A_{i..j}$. (The optimal cost for the full problem $A_{1..n}$ is thus $m[1,n]$.) We can define $m[i,j]$ recursively:
* If $i=j$, the problem is trivial because the chain has only one matrix $A_i$
    <br>$\Rightarrow m[i,i]=0$ for $i=1,2,...,n$
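* If $i<j$, suppose the optimal parenthesization splits the product between $A_k$ and $A_{k+1}$, where $i\leq k<j$. Then $m[i,j]$ equals the sum of the optimal costs of its two subproblems, $A_{i..k}$ and $A_{k+1..j}$, plus the cost of multiplying these two matrices together. Thus, we have: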

$$
\begin{align}
m[i,j]=m[i,k]+m[k+1,j]+p_{i-1}p_kp_j
\end{align}
$$

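Recall that each matrix $A_i$ is $p_{i-1}\times p_i$, so $A_{i..k}$ is $p_{i-1}\times p_k$ and $A_{k+1..j}$ is $p_k\times p_j$; multiplying them takes $p_{i-1}p_kp_j$ scalar multiplications. Since we do not know the optimal $k$ in advance, we take the minimum over all $j-i$ possible values $k=i,i+1,...,j-1$: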

$$
\begin{align}
m[i,j]=\left\{
        \begin{array}{ll}
        0 &\text{if}\ i=j\\
        \underset{i\leq k<j}{\min} \{m[i,k]+m[k+1,j]+p_{i-1}p_kp_j\} &\text{if}\ i<j\\
        \end{array}       
        \right.
\end{align}
$$
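
The recurrence can be transcribed almost literally into a top-down, memoized function. This is a minimal sketch rather than the tabular procedure: the name `min_chain_cost` is hypothetical, and it assumes `p` is a sequence of $n+1$ dimensions with $A_i$ of size `p[i-1]` by `p[i]`.

```python
from functools import lru_cache

def min_chain_cost(p):
    """Return m[1, n] for the dimension sequence p (hypothetical helper name)."""
    n = len(p) - 1  # number of matrices in the chain

    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:
            return 0  # a single matrix needs no multiplication
        # Try every split point k with i <= k < j and keep the cheapest.
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return m(1, n)

print(min_chain_cost([10, 100, 5, 50]))              # 7500
print(min_chain_cost([30, 35, 15, 5, 10, 20, 25]))   # 15125
```

Without the cache, the same subproblems $m[i,j]$ would be recomputed many times; memoizing each $(i,j)$ pair keeps the number of distinct subproblems at $O(n^2)$.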

### Step 3: Computing the optimal costs
In [1]:
import numpy as np
p=np.array([30,35,15,5,10,20,25])

In [3]:
# Compute the cost table m and the split table s bottom-up, by increasing chain length
def matrix_chain_order(p):
    # n is the number of matrices; p holds the n+1 dimensions
    n=len(p)-1
    # m[i,j] stores the minimum cost of computing A_i..A_j (row and column 0 are unused)
    m=np.zeros((n+1,n+1))
    # s[i,j] records the index k that achieved the optimal cost in computing m[i,j]
    s=np.zeros((n,n+1),dtype=int)
    for l in range(2,n+1): # l is the chain length, l=j-i+1
        for i in range(1,n-l+2): # i runs from 1 to n-l+1
            j=i+l-1              # because l=j-i+1
            m[i,j]=np.inf        # sentinel: any real cost is smaller
            for k in range(i,j): # try every split point i<=k<j
                q=m[i,k]+m[k+1,j]+p[i-1]*p[k]*p[j]
                if q<m[i,j]:
                    m[i,j]=q
                    s[i,j]=k
    return m,s
matrix_chain_order(p)

(array([[    0.,     0.,     0.,     0.,     0.,     0.,     0.],
        [    0.,     0., 15750.,  7875.,  9375., 11875., 15125.],
        [    0.,     0.,     0.,  2625.,  4375.,  7125., 10500.],
        [    0.,     0.,     0.,     0.,   750.,  2500.,  5375.],
        [    0.,     0.,     0.,     0.,     0.,  1000.,  3500.],
        [    0.,     0.,     0.,     0.,     0.,     0.,  5000.],
        [    0.,     0.,     0.,     0.,     0.,     0.,     0.]]),
 array([[0, 0, 0, 0, 0, 0, 0],
        [0, 0, 1, 1, 3, 3, 3],
        [0, 0, 0, 2, 3, 3, 3],
        [0, 0, 0, 0, 3, 3, 3],
        [0, 0, 0, 0, 0, 4, 5],
        [0, 0, 0, 0, 0, 0, 5]]))

### Step 4: Constructing an optimal solution
In [8]:
def print_optimal_parens(s,i,j):
    if i==j:
        print ('A'+str(i),end='')
    else:
        print ('(', end='')
        print_optimal_parens(s,i,s[i,j]) #k=s[i,j], split product Ai...Ak
        print_optimal_parens(s,s[i,j]+1,j) #k+1=s[i,j]+1, split product A(k+1)...Aj 
        print (')',end='')
print_optimal_parens(matrix_chain_order(p)[1],1,6)    

((A1(A2A3))((A4A5)A6))