# Matrix-matrix multiplication by rows

We continue to look at how the FLAMEPy API can be used to implement different matrix-matrix multiplication algorithms.  

First, we create some matrices.

In [1]:
import numpy as np

m = 4
n = 3
k = 5

C = np.matrix( np.random.random( (m, n) ) )
print( 'C = ' )
print( C )

Cold = np.matrix( np.copy( C ) )           # a "hard" copy of C, kept so that C can be restored and checked later
    
A = np.matrix( np.random.random( (m, k) ) )
print( 'A = ' )
print( A )

B = np.matrix( np.random.random( (k, n) ) )
print( 'B = ' )
print( B )

C = 
[[0.68073396 0.98569509 0.10501375]
 [0.51967546 0.78631201 0.49508307]
 [0.54368462 0.70280862 0.10598686]
 [0.31429729 0.07802089 0.11451586]]
A = 
[[0.89005842 0.27064834 0.75206284 0.35586619 0.48055606]
 [0.79858772 0.55317881 0.61630488 0.98104195 0.28963064]
 [0.06398331 0.9227011  0.95929183 0.50538821 0.26555641]
 [0.27559057 0.58197859 0.49382578 0.49327044 0.6058373 ]]
B = 
[[0.36394974 0.39097911 0.2326965 ]
 [0.95608722 0.62158843 0.57330315]
 [0.23617553 0.29450446 0.24878499]
 [0.62727623 0.96501086 0.31917138]
 [0.27205542 0.94674372 0.14142141]]


<h2> The algorithm </h2>

<img src="https://studio.edx.org/c4x/UTAustinX/UT.5.01x/asset/Gemm_nn_unb_var2.png" alt="Matrix-matrix multiplication by rows picture" width="80%">

<h2> The routine <code> Gemm_nn_unb_var2( A, B, C ) </code> </h2>

This routine computes $ C := A B + C $ by rows.  The "\_nn\_" in its name indicates the "No transpose, No transpose" case of matrix multiplication: neither $ A $ nor $ B $ is transposed.  
The name spells this out because the operations $ C := A^T B + C $ ("\_tn\_" or "Transpose, No transpose"), $ C := A B^T + C $ ("\_nt\_" or "No transpose, Transpose"), and $ C := A^T B^T + C $ ("\_tt\_" or "Transpose, Transpose") are also encountered.  
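
To see what "by rows" means, it may help to write out the row partitioning explicitly. The symbols $ a_i^T $ and $ c_i^T $ for the rows of $ A $ and $ C $ are notation introduced here for illustration, with $ m $ the number of rows of $ A $ and $ C $:

$$
A = \begin{pmatrix} a_0^T \\ a_1^T \\ \vdots \\ a_{m-1}^T \end{pmatrix},
\quad
C = \begin{pmatrix} c_0^T \\ c_1^T \\ \vdots \\ c_{m-1}^T \end{pmatrix}
\quad \Longrightarrow \quad
c_i^T := a_i^T B + c_i^T , \quad i = 0, \ldots, m-1 .
$$

Transposing both sides gives the equivalent column form $ c_i := B^T a_i + c_i $, so each row of $ C $ is updated by one matrix-vector multiplication with $ B^T $.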
    
The specific laff function we will use is
<ul>
<li> <code> laff.gemv( trans, alpha, A, x, beta, y ) </code> which computes 
$ y := \alpha A x + \beta y $ or $ y := \alpha A^T x + \beta y $, depending on 
        parameter <code> trans</code>.  In particular, 
        <ul>
        <li>
        <code> laff.gemv( 'No transpose', alpha, A, x, beta, y ) </code> computes $ y := \alpha A x + \beta y $.
            </li>
        <li>
        <code> laff.gemv( 'Transpose', alpha, A, x, beta, y ) </code> computes $ y := \alpha A^T x + \beta y $.
            </li>
            </ul>
    </li>
</ul>
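
As a rough illustration of the two cases listed above, here is a plain-NumPy sketch of the arithmetic. The helper `gemv_reference` is hypothetical and is not part of laff. It also returns a new vector, whereas the routine below relies on `laff.gemv` updating `y` in place.

```python
import numpy as np

def gemv_reference(trans, alpha, A, x, beta, y):
    """Hypothetical NumPy stand-in for the gemv arithmetic (not laff's code)."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float).ravel()   # accept row or column vectors
    y = np.asarray(y, dtype=float).ravel()
    if trans == 'No transpose':
        return alpha * (A @ x) + beta * y    # y := alpha A x + beta y
    if trans == 'Transpose':
        return alpha * (A.T @ x) + beta * y  # y := alpha A^T x + beta y
    raise ValueError("trans must be 'No transpose' or 'Transpose'")

A_demo = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])              # a 3 x 2 matrix

# 'No transpose': x has 2 entries, y has 3 entries
y_n = gemv_reference('No transpose', 2.0, A_demo, [1.0, 1.0], 1.0, [1.0, 1.0, 1.0])

# 'Transpose': x has 3 entries, y has 2 entries
y_t = gemv_reference('Transpose', 1.0, A_demo, [1.0, 0.0, 1.0], 1.0, [1.0, 1.0])

print(y_n)   # 2 * [3, 7, 11] + [1, 1, 1]
print(y_t)   # [6, 8] + [1, 1]
```

Note how the 'Transpose' case swaps the roles of the dimensions: `x` must match the number of rows of `A`, and `y` the number of columns.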

Use the <a href="https://studio.edx.org/c4x/UTAustinX/UT.5.01x/asset/index.html"> Spark webpage</a> to generate a code skeleton.  (Make sure you adjust the name of the routine.)

In [2]:
import flame
import laff

def Gemm_nn_unb_var2(A, B, C):

    AT, \
    AB  = flame.part_2x1(A, \
                         0, 'TOP')

    CT, \
    CB  = flame.part_2x1(C, \
                         0, 'TOP')

    while AT.shape[0] < A.shape[0]:

        A0,  \
        a1t, \
        A2   = flame.repart_2x1_to_3x1(AT, \
                                       AB, \
                                       1, 'BOTTOM')

        C0,  \
        c1t, \
        C2   = flame.repart_2x1_to_3x1(CT, \
                                       CB, \
                                       1, 'BOTTOM')

        #------------------------------------------------------------#

        laff.gemv( 'Transpose', 1.0, B, a1t, 1.0, c1t )

        #------------------------------------------------------------#

        AT, \
        AB  = flame.cont_with_3x1_to_2x1(A0,  \
                                         a1t, \
                                         A2,  \
                                         'TOP')

        CT, \
        CB  = flame.cont_with_3x1_to_2x1(C0,  \
                                         c1t, \
                                         C2,  \
                                         'TOP')

    flame.merge_2x1(CT, \
                    CB, C)



In [3]:
C = np.matrix( np.copy( Cold ) )               # restore C 

Gemm_nn_unb_var2( A, B, C )

print( 'C - ( Cold + A * B )' )
print( C - ( Cold + A * B ) )

C - ( Cold + A * B )
[[0.00000000e+00 0.00000000e+00 1.11022302e-16]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 2.22044605e-16]
 [0.00000000e+00 0.00000000e+00 1.11022302e-16]]


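Bingo! It works! The entries of the difference are zero up to floating-point roundoff (on the order of $ 10^{-16} $).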
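
For comparison, the same by-rows update can be sketched in plain NumPy without the FLAME partitioning. This is only an illustration under stated assumptions: `gemm_by_rows` is a hypothetical helper, the `_demo` arrays are made up for this example, and `C` is assumed to be a floating-point `ndarray` that is updated in place.

```python
import numpy as np

def gemm_by_rows(A, B, C):
    """Hypothetical sketch: update C := A B + C one row at a time."""
    A = np.asarray(A)
    B = np.asarray(B)
    for i in range(A.shape[0]):
        # Row i of C: c_i := B^T a_i + c_i (a transposed matrix-vector multiply)
        C[i, :] += B.T @ A[i, :]
    return C

rng = np.random.default_rng(0)
A_demo = rng.random((4, 5))
B_demo = rng.random((5, 3))
C_demo = rng.random((4, 3))

# Reference result, computed before C_demo is overwritten
C_expected = C_demo + A_demo @ B_demo

gemm_by_rows(A_demo, B_demo, C_demo)

max_diff = np.max(np.abs(C_demo - C_expected))
print('largest absolute difference:', max_diff)
```

The loop body plays the role of the single update line between the dashed comment lines in `Gemm_nn_unb_var2`; the partitioning calls there only keep track of which row is current.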

## Watch the algorithm at work!

Copy the code and paste it into the box on <a href="http://edx-org-utaustinx.s3.amazonaws.com/UT501x/PictureFlame/PictureFLAME.html"> PictureFLAME </a>, a webpage where you can watch your routine in action.  

Disclaimer: we implemented a VERY simple interpreter.  If you do something wrong, we cannot guarantee the results.  But if you do it right, you are in for a treat.

If you want to reset the problem, just click in the box into which you pasted the code and hit "next" again.