# Matrix-vector multiplication via dot products

This notebook walks you through implementing $ y := A x + y $ with algorithms that "march" through the matrix in an alternative way: first via dot products, then via axpys.

## Getting started

We will use some functions that are part of our laff library (of which this function will become a part) as well as some routines from the FLAME API (Application Programming Interface) that allow us to write code that closely resembles how we typeset algorithms using the FLAME notation.  These functions are imported with `include` and `using` statements.

## Algorithm that takes dot products

<img src="https://studio.edx.org/c4x/UTAustinX/UT.5.01x/asset/423_Mvmult_n_unb_var1B.png" alt="Alternative matrix-vector multiplication via dot products algorithm" width="50%">

## The `Mvmult_n_unb_var1B!( A, x, y )` routine

This routine, given $ A \in \mathbb{R}^{n \times n} $, $ x \in \mathbb{R}^n $, and $ y \in \mathbb{R}^n $, computes $ y := A x + y $.  The "_n_" in the name of the routine indicates that this is the "no transpose" matrix-vector multiplication.  The "B" means this is the algorithm that marches through the matrix from top-left to bottom-right.
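In the partitioned notation used by the routine, each iteration exposes one row of $ A $, split into $ a_{10}^T $, $ \alpha_{11} $, and $ a_{12}^T $, together with the matching parts of $ x $. Read this way, the three dot-product updates in the loop body together amount to the following single update of $ \psi_1 $ (a restatement of the code, not an additional step):

$$
\psi_1 := a_{10}^T x_0 + \alpha_{11} \chi_1 + a_{12}^T x_2 + \psi_1
$$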

The specific laff function we will use is
<ul>
<li> <code> laff.dots!( x, y, alpha ) </code> which computes $ \alpha := x^T y + \alpha $.  </li>
</ul>
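To see what the dot-product variant computes without the FLAME partitioning, here is a minimal stand-alone sketch in plain Julia. The name `mvmult_dots_sketch!` is hypothetical and is not part of laff or flame; the sketch only illustrates that each element of $ y $ is updated with the dot product of one row of $ A $ and $ x $.

```julia
# Hypothetical stand-alone helper (not part of laff or flame):
# row-oriented y := A x + y, one dot product per element of y.
function mvmult_dots_sketch!(A::AbstractMatrix, x::AbstractVector, y::AbstractVector)
    m, n = size(A)
    @assert length(x) == n && length(y) == m
    for i in 1:m
        # Accumulate (row i of A) * x into y[i]; in FLAME terms this is
        # psi1 := a10t * x0 + alpha11 * chi1 + a12t * x2 + psi1.
        acc = y[i]
        for j in 1:n
            acc += A[i, j] * x[j]
        end
        y[i] = acc
    end
    return y
end

A = [1.0 2.0; 3.0 4.0]
x = [1.0, 1.0]
y = [10.0, 20.0]
mvmult_dots_sketch!(A, x, y)   # y is now [13.0, 27.0]
```

The FLAME version splits each row's dot product into three pieces (left of the diagonal, the diagonal element, right of the diagonal), but the accumulated value is the same up to rounding.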

Use the <a href="https://studio.edx.org/c4x/UTAustinX/UT.5.01x/asset/index.html"> Spark webpage</a> to generate a code skeleton.  (Make sure you adjust the name of the routine.)

In [1]:
include("../flame.jl")
using .flame
include("../laff/laff.jl")
using .laff

function Mvmult_n_unb_var1B!(A, x, y)

    ATL, ATR,
    ABL, ABR  = flame.part_2x2(A,
                               0, 0, "TL")

    xT,
    xB  = flame.part_2x1(x,
                         0, "TOP")

    yT,
    yB  = flame.part_2x1(y,
                         0, "TOP")

    while size(ATL, 1) < size(A, 1)

        A00,  a01,     A02,
        a10t, alpha11, a12t,
        A20,  a21,     A22   = flame.repart_2x2_to_3x3(ATL, ATR,
                                                       ABL, ABR,
                                                       1, 1, "BR")

        x0,
        chi1,
        x2    = flame.repart_2x1_to_3x1(xT,
                                        xB,
                                        1, "BOTTOM")

        y0,
        psi1,
        y2    = flame.repart_2x1_to_3x1(yT,
                                        yB,
                                        1, "BOTTOM")

        #------------------------------------------------------------#

        laff.dots!( a10t, x0, psi1 )
        laff.dots!( alpha11, chi1, psi1 )
        laff.dots!( a12t, x2, psi1 )

        #------------------------------------------------------------#

        ATL, ATR,
        ABL, ABR  = flame.cont_with_3x3_to_2x2(A00,  a01,     A02,
                                               a10t, alpha11, a12t,
                                               A20,  a21,     A22,
                                               "TL")

        xT,
        xB  = flame.cont_with_3x1_to_2x1(x0,
                                         chi1,
                                         x2,
                                         "TOP")

        yT,
        yB  = flame.cont_with_3x1_to_2x1(y0,
                                         psi1,
                                         y2, 
                                         "TOP")
    end
    
    flame.merge_2x1!(yT,
                    yB, y)

end




Mvmult_n_unb_var1B! (generic function with 1 method)

## Testing

Let's quickly test the routine by creating a 4 x 4 matrix and related vectors, and then performing the computation.

In [3]:
A = rand(4, 4)
x = rand(4)
y = rand(4)
yold = rand(4)   # placeholder; overwritten with a copy of y before the routine is called

println("A before = ")
A

A before = 


4×4 Array{Float64,2}:
 0.268765   0.0611786   0.702223  0.0910037
 0.0122186  0.108919    0.72132   0.0509808
 0.931914   0.356389    0.484053  0.622381 
 0.842972   0.00386145  0.557056  0.977154 

In [4]:
println("x before = ")
x

x before = 


4-element Array{Float64,1}:
 0.22917041031809537
 0.09823300484973818
 0.4827111117103664 
 0.9889603553973529 

In [5]:
println("y before = ")
y

y before = 


4-element Array{Float64,1}:
 0.13156299852946018
 0.9196999960732597 
 0.7676548193702415 
 0.25552170454838996

In [6]:
laff.copy!( y, yold )   # save the original vector y

Mvmult_n_unb_var1B!( A, x, y )

println( "y after =" )
y

y after =


4-element Array{Float64,1}:
 0.6281357317126156
 1.3318067981392459
 1.8653989894997496
 1.684348987125792 

In [7]:
println( "y - ( A * x + yold ) = " )
y - ( A * x + yold )

y - ( A * x + yold ) = 


4-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0

Bingo, it seems to work!  (Notice that we are doing floating point computations, which means that due to rounding you may not get an exact "0", but it should be close.)
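Because an exact zero is not guaranteed, a more robust check compares the residual against a small tolerance. The following is a minimal sketch, assuming a tolerance of a few multiples of machine epsilon scaled by the size of the data; the factor 10 and the variable names are illustrative choices, not part of laff.

```julia
using LinearAlgebra   # provides norm

# Small fixed inputs so the example is reproducible.
A    = [0.1 0.2; 0.3 0.4]
x    = [0.5, 0.6]
yold = [0.7, 0.8]

# Stand-in for the routine's result: accumulate column by column, which
# adds the terms in a different order than the reference A * x + yold.
y = copy(yold)
for j in 1:size(A, 2), i in 1:size(A, 1)
    y[i] += A[i, j] * x[j]
end

residual = norm(y - (A * x + yold))

# Assumed tolerance: a small multiple of machine epsilon, scaled by the data.
tol = 10 * eps(Float64) * (norm(A) * norm(x) + norm(yold))

println("residual = ", residual, ", within tolerance: ", residual <= tol)
```

Julia's built-in `isapprox` (also written `≈`) performs a similar relative comparison and can be used instead of a hand-written tolerance.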

## Watch your code in action!

Copy the code and paste it into the box on <a href="http://edx-org-utaustinx.s3.amazonaws.com/UT501x/PictureFlame/PictureFLAME.html"> PictureFLAME </a>, a webpage where you can watch your routine in action.

Disclaimer: we implemented a VERY simple interpreter.  If you do something wrong, we cannot guarantee the results.  But if you do it right, you are in for a treat.

If you want to reset the problem, just click in the box into which you pasted the code and hit "next" again.

## Algorithm that uses axpys

<img src="https://studio.edx.org/c4x/UTAustinX/UT.5.01x/asset/423_Mvmult_n_unb_var2B.png" alt="Alternative matrix-vector multiplication via axpys algorithm" width="50%">

## The `Mvmult_n_unb_var2B!( A, x, y )` routine

This routine, given $ A \in \mathbb{R}^{n \times n} $, $ x \in \mathbb{R}^n $, and $ y \in \mathbb{R}^n $, computes $ y := A x + y $.  The "_n_" in the name of the routine indicates this is the "no transpose" matrix-vector multiplication.  The "B" means this is the algorithm that marches through matrices from top-left to bottom-right.
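In the partitioned notation used by the routine, each iteration exposes one column of $ A $, split into $ a_{01} $, $ \alpha_{11} $, and $ a_{21} $, and scales it by the current element $ \chi_1 $ of $ x $. The three axpy updates in the loop body can be written as (a restatement of the code, not an additional step):

$$
y_0 := \chi_1 a_{01} + y_0, \qquad
\psi_1 := \chi_1 \alpha_{11} + \psi_1, \qquad
y_2 := \chi_1 a_{21} + y_2
$$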

The specific laff function we will use is
<ul>
<li> <code> laff.axpy!( alpha, x, y ) </code> which computes $ y := \alpha x +  y  $.  </li>
</ul>
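To see what the axpy variant computes without the FLAME partitioning, here is a minimal stand-alone sketch in plain Julia. The name `mvmult_axpy_sketch!` is hypothetical and is not part of laff or flame; the sketch only illustrates that each column of $ A $, scaled by the matching element of $ x $, is added to $ y $.

```julia
# Hypothetical stand-alone helper (not part of laff or flame):
# column-oriented y := A x + y, one axpy per column of A.
function mvmult_axpy_sketch!(A::AbstractMatrix, x::AbstractVector, y::AbstractVector)
    m, n = size(A)
    @assert length(x) == n && length(y) == m
    for j in 1:n
        chi1 = x[j]
        # y := chi1 * (column j of A) + y; in FLAME terms this covers the
        # updates of y0, psi1, and y2 with a01, alpha11, and a21.
        for i in 1:m
            y[i] += chi1 * A[i, j]
        end
    end
    return y
end

A = [1.0 2.0; 3.0 4.0]
x = [1.0, 1.0]
y = [10.0, 20.0]
mvmult_axpy_sketch!(A, x, y)   # y is now [13.0, 27.0]
```

Compared with the dot-product variant, this ordering touches all of $ y $ in every iteration and reads $ A $ one column at a time.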

Use the <a href="https://studio.edx.org/c4x/UTAustinX/UT.5.01x/asset/index.html"> Spark webpage</a> to generate a code skeleton.  (Make sure you adjust the name of the routine.)

In [8]:
include("../flame.jl")
using .flame
include("../laff/laff.jl")
using .laff

function Mvmult_n_unb_var2B!(A, x, y)

    ATL, ATR,
    ABL, ABR  = flame.part_2x2(A,
                               0, 0, "TL")

    xT,
    xB  = flame.part_2x1(x,
                         0, "TOP")

    yT,
    yB  = flame.part_2x1(y,
                         0, "TOP")

    while size(ATL, 1) < size(A, 1)

        A00,  a01,     A02,
        a10t, alpha11, a12t,
        A20,  a21,     A22   = flame.repart_2x2_to_3x3(ATL, ATR,
                                                       ABL, ABR,
                                                       1, 1, "BR")

        x0,
        chi1,
        x2    = flame.repart_2x1_to_3x1(xT,
                                        xB,
                                        1, "BOTTOM")

        y0,
        psi1,
        y2    = flame.repart_2x1_to_3x1(yT,
                                        yB,
                                        1, "BOTTOM")

        #------------------------------------------------------------#
        
        laff.axpy!( chi1, a01, y0 )
        laff.axpy!( chi1, alpha11, psi1 )
        laff.axpy!( chi1, a21, y2 )

        #------------------------------------------------------------#

        ATL, ATR,
        ABL, ABR  = flame.cont_with_3x3_to_2x2(A00,  a01,     A02,
                                               a10t, alpha11, a12t,
                                               A20,  a21,     A22,
                                               "TL")

        xT,
        xB  = flame.cont_with_3x1_to_2x1(x0,
                                         chi1,
                                         x2,
                                         "TOP")

        yT,
        yB  = flame.cont_with_3x1_to_2x1(y0,
                                         psi1,
                                         y2,
                                         "TOP")

    end
    flame.merge_2x1!(yT,
                    yB, y)

end






Mvmult_n_unb_var2B! (generic function with 1 method)

## Testing

Let's quickly test the routine by creating a 4 x 4 matrix and related vectors, and then performing the computation.

In [9]:
A = rand(4, 4)
x = rand(4)
y = rand(4)
yold = rand(4)   # placeholder; overwritten with a copy of y before the routine is called

println("A before = ")
A

A before = 


4×4 Array{Float64,2}:
 0.0853503  0.754428  0.597317  0.903535 
 0.49858    0.426811  0.751957  0.210664 
 0.168557   0.39921   0.242136  0.0613319
 0.572391   0.749162  0.58085   0.167528 

In [10]:
println("x before = ")
x

x before = 


4-element Array{Float64,1}:
 0.8353166420086002  
 0.031829237958661105
 0.3322704912947081  
 0.29037840476104604 

In [13]:
println("y before = ")
y

y before = 


4-element Array{Float64,1}:
 0.5058548754442664  
 0.12217367445167437 
 0.020623276288562753
 0.2471537649253166  

In [14]:
laff.copy!( y, yold )   # save the original vector y

Mvmult_n_unb_var2B!( A, x, y )

println( "y after =" )
y

y after =


4-element Array{Float64,1}:
 1.4802347214421379
 0.6568067109235751
 0.4907152122023921
 0.6406544444572424

In [15]:
println( "y - ( A * x + yold ) = " )
y - ( A * x + yold )

y - ( A * x + yold ) = 


4-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0

Bingo, it seems to work!  (Notice that we are doing floating point computations, which means that due to rounding you may not get an exact "0", but it should be close.)

## Watch your code in action!

Copy the code and paste it into the box on <a href="http://edx-org-utaustinx.s3.amazonaws.com/UT501x/PictureFlame/PictureFLAME.html"> PictureFLAME </a>, a webpage where you can watch your routine in action.

Disclaimer: we implemented a VERY simple interpreter.  If you do something wrong, we cannot guarantee the results.  But if you do it right, you are in for a treat.

If you want to reset the problem, just click in the box into which you pasted the code and hit "next" again.