# Matrix-matrix multiplication by rows

We continue to look at how the FLAMEJulia API can be used to implement different matrix-matrix multiplication algorithms.  

First, we create some matrices.

In [1]:
m = 4
n = 3
k = 5

C = rand(m, n)
println( "C = " )
C

C = 


4×3 Array{Float64,2}:
 0.089461  0.0807089  0.786385 
 0.775394  0.0109804  0.0878355
 0.840988  0.675529   0.214333 
 0.722499  0.879786   0.897798 

In [2]:
Cold = copy( C ) # an alternative way of doing a "hard" copy, in this case of a matrix
    
A = rand(m, k)
println( "A = " )
A

A = 


4×5 Array{Float64,2}:
 0.553488  0.636557   0.264353  0.0354749  0.62387 
 0.347997  0.545717   0.78843   0.348231   0.485425
 0.714738  0.125028   0.378362  0.370956   0.197038
 0.63489   0.0310842  0.405945  0.143988   0.644408

In [3]:
B = rand(k, n)
println( "B = " )
B

B = 


5×3 Array{Float64,2}:
 0.630872   0.930963  0.563996
 0.947745   0.776101  0.448223
 0.0755745  0.705812  0.617178
 0.504634   0.625457  0.25124 
 0.311762   0.409956  0.572654

## The algorithm

<img src="https://studio.edx.org/c4x/UTAustinX/UT.5.01x/asset/Gemm_nn_unb_var2.png" alt="Matrix-matrix multiplication by rows picture" width="80%">

<h2> The routine <code> Gemm_nn_unb_var2!( A, B, C ) </code> </h2>

This routine computes $ C := A B + C $ by rows: each row of $ C $ is updated with the corresponding row of $ A $ times $ B $.  The "\_nn\_" in the name means that this is the "No transpose, No transpose" case of matrix multiplication.  
The name spells this out because the operations $ C := A^T B + C $ ("\_tn\_" or "Transpose, No transpose"), $ C := A B^T + C $ ("\_nt\_" or "No transpose, Transpose"), and $ C := A^T B^T + C $ ("\_tt\_" or "Transpose, Transpose") are also encountered.  
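
To make "by rows" concrete, here is a minimal sketch in plain Julia that does not use the FLAME or laff APIs. The helper name `gemm_by_rows!` is hypothetical and exists only for this illustration. Each pass through the loop updates one row of $ C $, written as the transposed matrix-vector product $ c_i := B^T a_i + c_i $.

```julia
using LinearAlgebra

# Hypothetical helper (not part of flame.jl or laff): computes C := A B + C
# one row of C at a time.
function gemm_by_rows!(A, B, C)
    for i in 1:size(A, 1)
        # Row i of A and row i of C are treated as column vectors:
        # c_i := B^T a_i + c_i
        C[i, :] .+= B' * A[i, :]
    end
    return C
end

A = [1.0 2.0; 3.0 4.0; 5.0 6.0]   # 3×2
B = [1.0 0.0 2.0; 0.0 1.0 3.0]    # 2×3
C = zeros(3, 3)

gemm_by_rows!(A, B, C)
```

Because `C` starts at zero in this sketch, the result should match `A * B` up to rounding.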
    
The specific laff function we will use is
<ul>
<li> <code> laff.gemv!( trans, alpha, A, x, beta, y ) </code> which computes 
$ y := \alpha A x + \beta y $ or $ y := \alpha A^T x + \beta y $, depending on 
        parameter <code> trans</code>.  In particular, 
        <ul>
        <li>
        <code> laff.gemv!( "No transpose", alpha, A, x, beta, y ) </code> computes $ y := \alpha A x + \beta y $.
            </li>
        <li>
        <code> laff.gemv!( "Transpose", alpha, A, x, beta, y ) </code> computes $ y := \alpha A^T x + \beta y $.
            </li>
            </ul>
    </li>
</ul>
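
As an illustration of the two cases, the sketch below uses `BLAS.gemv!` from Julia's standard `LinearAlgebra` library, which computes the same kind of update. This is an analogy, not the laff implementation: the standard routine takes a character flag (`'N'` or `'T'`) where `laff.gemv!` is shown above taking a string, and the example data are made up.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0; 5.0 6.0]   # 3×2
alpha, beta = 2.0, 1.0

# 'N' (no transpose): y := alpha * A * x + beta * y
# x needs 2 entries (columns of A), y needs 3 (rows of A).
x = [1.0, 1.0]
y = ones(3)
BLAS.gemv!('N', alpha, A, x, beta, y)

# 'T' (transpose): z := alpha * A^T * w + beta * z
# w needs 3 entries (rows of A), z needs 2 (columns of A).
w = [1.0, 0.0, 1.0]
z = ones(2)
BLAS.gemv!('T', alpha, A, w, beta, z)
```

Note how the transpose flag swaps which vector must match the row count of `A` and which must match its column count.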

Use the <a href="https://studio.edx.org/c4x/UTAustinX/UT.5.01x/asset/index.html"> Spark webpage</a> to generate a code skeleton.  (Make sure you adjust the name of the routine.)

In [4]:
include("../flame.jl")
using .flame
include("../laff/laff.jl")
using .laff

function Gemm_nn_unb_var2!(A, B, C)

    AT, 
    AB  = flame.part_2x1(A, 
                         0, "TOP")

    CT, 
    CB  = flame.part_2x1(C, 
                         0, "TOP")

    while size(AT, 1) < size(A, 1)

        A0,  
        a1t, 
        A2   = flame.repart_2x1_to_3x1(AT, 
                                       AB, 
                                       1, "BOTTOM")

        C0,  
        c1t, 
        C2   = flame.repart_2x1_to_3x1(CT, 
                                       CB, 
                                       1, "BOTTOM")

        #------------------------------------------------------------#

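        # Update the current row of C: c1t := a1t * B + c1t, computed as the
        # transposed matrix-vector product c1 := B^T a1 + c1.
        laff.gemv!( "Transpose", 1.0, B, a1t, 1.0, c1t )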

        #------------------------------------------------------------#

        AT, 
        AB  = flame.cont_with_3x1_to_2x1(A0,  
                                         a1t, 
                                         A2,  
                                         "TOP")

        CT, 
        CB  = flame.cont_with_3x1_to_2x1(C0,  
                                         c1t, 
                                         C2,  
                                         "TOP")

    end

    flame.merge_2x1!(CT, 
                     CB, C)

end

Gemm_nn_unb_var2! (generic function with 1 method)

In [5]:
C = copy( Cold )              # restore C 

Gemm_nn_unb_var2!( A, B, C )

println( "C - ( Cold + A * B )" )
C - ( Cold + A * B )

C - ( Cold + A * B )


4×3 Array{Float64,2}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

Bingo! It works!
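
The difference printed above happens to be exactly zero, but two floating-point computations of the same product can differ in the last few bits when the operations are ordered differently. A tolerance-based check is therefore more robust. The sketch below is an assumption-laden stand-in: it uses `BLAS.gemm!` from Julia's standard library in place of `Gemm_nn_unb_var2!` so that it runs without `flame.jl`, and the variable name `err` is ours.

```julia
using LinearAlgebra

m, n, k = 4, 3, 5
A = rand(m, k)
B = rand(k, n)
C = rand(m, n)
Cold = copy(C)                     # keep the original C for the check

# Stand-in for Gemm_nn_unb_var2!( A, B, C ): C := 1.0 * A * B + 1.0 * C
BLAS.gemm!('N', 'N', 1.0, A, B, 1.0, C)

# Compare against the reference result with a tolerance instead of
# expecting every entry of the difference to be exactly 0.0.
err = norm(C - (Cold + A * B))
println("norm of difference = ", err)
println("approximately equal: ", isapprox(C, Cold + A * B))
```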

## Watch the algorithm at work!

Copy the code and paste it into the box on <a href="http://edx-org-utaustinx.s3.amazonaws.com/UT501x/PictureFlame/PictureFLAME.html"> PictureFLAME </a>, a webpage where you can watch your routine in action.  

Disclaimer: we implemented a VERY simple interpreter.  If you do something wrong, we cannot guarantee the results.  But if you do it right, you are in for a treat.

If you want to reset the problem, just click in the box into which you pasted the code and hit "next" again.