# M280 Homework 2
* Author: Shuang Gao
* Date: 2018/05/02

## Question 1

1.1 Implement the algorithm with arguments:  $X$  (data, each row is a vectorized image), rank  $r$ , convergence tolerance, and optional starting point.

In [59]:
function nnmf(
    X::Matrix{T},
    r::Int;
    maxiter::Int = 1000,
    tol::T = T(1e-4),
    V::Matrix{T} = rand(T, size(X, 1), r),
    W::Matrix{T} = rand(T, r, size(X, 2))
    ) where T <: AbstractFloat

    # current approximation B = V * W
    B = V * W

    # objective L = ||X - VW||_F^2
    # vecnorm() is the Frobenius norm; norm() of a matrix is the spectral norm
    L = vecnorm(X - B)^2

    for t in 1:maxiter # stop after at most maxiter iterations
        # update V element by element (V is overwritten in place)
        for k in 1:r, i in 1:size(X, 1)
            # update element V_ik
            V[i, k] = V[i, k] * vecdot(X[i, :], W[k, :]) / vecdot(B[i, :], W[k, :])
        end

        # update B with the new V
        B = V * W

        # update W element by element (W is overwritten in place)
        for j in 1:size(X, 2), k in 1:r
            # update element W_kj
            W[k, j] = W[k, j] * vecdot(X[:, j], V[:, k]) / vecdot(B[:, j], V[:, k])
        end

        # update B with the new W
        B = V * W

        # objective after this iteration
        Lnew = vecnorm(X - B)^2

        # stop when the relative change |Lnew - L| / (|L| + 1) falls below tol
        converged = abs(Lnew - L) / (abs(L) + 1) < tol
        L = Lnew
        if converged
            break
        end
    end

    # output: final objective value and the two factors
    return L, V, W
end
    
    

nnmf (generic function with 2 methods)
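
The element-wise loops above compute entries of $XW^T$, $VWW^T$, $V^TX$ and $V^TVW$ one dot product at a time. The sketch below writes one sweep of the same multiplicative updates in matrix form. It is an illustration, not the submitted solution: it assumes Julia 1.x (where `norm` lives in `LinearAlgebra` and is the Frobenius norm for matrices), the helper names `nnmf_step!` and `objective` are made up here, and the small matrices are toy data rather than the face data.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the assignment): one sweep of the
# multiplicative updates, written with matrix products.
function nnmf_step!(V::Matrix{T}, W::Matrix{T}, X::Matrix{T}) where T <: AbstractFloat
    # V_ik <- V_ik * (X W')_ik / (V W W')_ik
    V .*= (X * W') ./ (V * (W * W'))
    # W_kj <- W_kj * (V' X)_kj / (V' V W)_kj, using the updated V
    W .*= (V' * X) ./ ((V' * V) * W)
    return V, W
end

# Objective ||X - VW||_F^2 (norm of a matrix is Frobenius in Julia >= 1.0)
objective(X, V, W) = norm(X - V * W)^2

# Toy data: 4 "images" with 3 "pixels" each, rank r = 2
X = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 10.0; 2.0 1.0 0.5]
V = [0.5 1.0; 1.0 0.5; 0.5 0.5; 1.5 0.5]
W = [1.0 0.5 1.0; 0.5 1.0 2.0]

# Record the objective before and after each of 5 sweeps
history = [objective(X, V, W)]
for t in 1:5
    nnmf_step!(V, W, X)
    push!(history, objective(X, V, W))
end
println(history)
```

The multiplicative updates never increase the objective, so `history` should be non-increasing, and both factors stay nonnegative because every update multiplies by a ratio of nonnegative numbers.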

1.2 Database 1 from the MIT Center for Biological and Computational Learning (CBCL) reduces to a matrix $X$ containing $m = 2{,}429$ gray-scale face images with $n = 19 \times 19 = 361$ pixels per face. Each image (row) is scaled to have mean and standard deviation 0.25.
Read in the nnmf-2429-by-361-face.txt file, e.g., using readdlm() function, and display a couple sample images, e.g., using ImageView.jl package.

In [3]:
# add packages for visualizing images
#Pkg.add("ImageView")
#Pkg.add("Images")
#Pkg.add("TestImages")
#Pkg.build("Cairo")
Pkg.pin("Cairo", v"0.4.0")
#Pkg.update()
using ImageView, Images

# read in the matrix file
path_X = "http://Hua-Zhou.github.io/teaching/biostatm280-2018spring/hw/hw2/nnmf-2429-by-361-face.txt"

X = readdlm(download(path_X), ' ')

# show the first three face images
for i in 1:3
    # reshape the pixels of row i into a 19-by-19 face image
    img = reshape(X[i, :], 19, 19)
    imshow(img)
end
    


[1m[36mINFO: [39m[22m[36mPackage Cairo is already pinned to the selected commit
[39m

Dict{String,Any} with 4 entries:
  "gui"         => Dict{String,Any}(Pair{String,Any}("window", Gtk.GtkWindowLea…
  "roi"         => Dict{String,Any}(Pair{String,Any}("redraw", 145: "map(clim-m…
  "annotations" => 111: "input-38" = Dict{UInt64,Any}() Dict{UInt64,Any} 
  "clim"        => 110: "CLim" = ImageView.CLim{Float64}(0.0, 0.93656) ImageVie…

[1m[91mERROR: [39m[22m[91mCairo.CairoContext must implement reset_transform[39m


1.3  

In [53]:
# read in matrix X
path_X = "http://Hua-Zhou.github.io/teaching/biostatm280-2018spring/hw/hw2/nnmf-2429-by-361-face.txt"
X = readdlm(download(path_X), ' ')

# read in starting matrix V0
path_V0 = "http://Hua-Zhou.github.io/teaching/biostatm280-2018spring/hw/hw2/V0.txt"
V0 = readdlm(download(path_V0), ' ')

# read in starting matrix W0
path_W0 = "http://Hua-Zhou.github.io/teaching/biostatm280-2018spring/hw/hw2/W0.txt"
W0 = readdlm(download(path_W0), ' ');


In [None]:
# when r = 10
V = V0[:, 1:10]
W = W0[1:10, :]

#@time nnmf(X, 10, maxiter = 1000, tol = 1e-4, V = V0, W = W0)
# pass the r = 10 slices V and W, not the full V0 and W0
@time L1, M1, N1 = nnmf(X, 10, maxiter = 1000, tol = 1e-4, V = V, W = W)

In [None]:
# when r = 20
V = V0[:, 1:20]
W = W0[1:20, :]

@time L2, M2, N2 = nnmf(X, 20, maxiter = 1000, tol = 1e-4, V = V, W = W)
# @time nnmf(X, 20, maxiter = 1000, tol = 1e-4, V = V0, W = W0)

In [None]:
# when r = 30
V = V0[:, 1:30]
W = W0[1:30, :]

@time L3, M3, N3 = nnmf(X, 30, maxiter = 1000, tol = 1e-4, V = V, W = W)
# @time nnmf(X, 30, maxiter = 1000, tol = 1e-4, V = V0, W = W0)

In [None]:
# when r = 40
V = V0[:, 1:40]
W = W0[1:40, :]

@time L4, M4, N4 = nnmf(X, 40, maxiter = 1000, tol = 1e-4, V = V, W = W)
# @time nnmf(X, 40, maxiter = 1000, tol = 1e-4, V = V0, W = W0)

In [None]:
# when r = 50
V = V0[:, 1:50]
W = W0[1:50, :]

@time L5, M5, N5 = nnmf(X, 50, maxiter = 1000, tol = 1e-4, V = V, W = W)
# @time nnmf(X, 50, maxiter = 1000, tol = 1e-4, V = V0, W = W0)

1.4 Choose an $r \in \{10, 20, 30, 40, 50\}$ and start your algorithm from a different $V^{(0)}$ and $W^{(0)}$. Do you obtain the same objective value and $(V, W)$? Explain what you find.

* **Choose $r = 10$**

In [None]:
V = V0[:, 41:50]
W = W0[41:50, :]

L6, M6, N6 = nnmf(X, 10, maxiter = 1000, tol = 1e-4, V = V, W = W)
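
When comparing $(V, W)$ across starting points, it helps to remember that the factorization is not unique even at a fixed objective value: for any positive diagonal matrix $D$, $(VD)(D^{-1}W) = VW$, and permuting the columns of $V$ together with the rows of $W$ also leaves the product unchanged. The sketch below checks this on toy matrices. It assumes Julia 1.x with `LinearAlgebra`; the numbers are made up for illustration and are not results from the face data.

```julia
using LinearAlgebra

# Toy nonnegative factors (illustrative values only)
V = [1.0 2.0; 3.0 4.0; 5.0 6.0]
W = [1.0 0.5 2.0; 0.25 1.0 4.0]

# Rescale: multiply column k of V by d_k and divide row k of W by d_k
D = Diagonal([2.0, 4.0])
V_scaled = V * D
W_scaled = inv(D) * W

# Permute: swap the two columns of V and the two rows of W
perm = [2, 1]
V_perm = V[:, perm]
W_perm = W[perm, :]

same_after_scaling = V_scaled * W_scaled ≈ V * W
same_after_permuting = V_perm * W_perm ≈ V * W
factors_differ = !(V_scaled ≈ V) && !(V_perm ≈ V)

println((same_after_scaling, same_after_permuting, factors_differ))
```

So two runs can report different $V$ and $W$ while describing the same approximation $VW$. Comparing the objective values and the products $VW$ is more informative than comparing the factors entry by entry.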


1.5 For the same $r$, start your algorithm from $v_{ik}^{(0)} = w_{kj}^{(0)} = 1$ for all $i, j, k$. Do you obtain the same objective value and $(V, W)$? Explain what you find.

In [None]:
V = ones(2429, 10)
W = ones(10, 361)
L7, M7, N7 = nnmf(X, 10, maxiter = 1000, tol = 1e-4, V = V, W = W)
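
With every starting entry equal to 1, the index $k$ plays no role in the update formulas: all rows of $W$ are identical, so the numerator and denominator of the $V$ update are the same for every $k$, and the columns of $V$ receive the same update. The same argument applies to the rows of $W$. The sketch below illustrates this on toy data using the matrix form of the updates. It is an assumption-laden illustration (toy matrix, 20 sweeps, no stopping rule), not the notebook's run on the face data.

```julia
# Toy data (not the face data): 4 "images" with 3 "pixels" each, rank r = 2
X = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 10.0; 2.0 1.0 0.5]

# All-ones starting point
V = ones(4, 2)
W = ones(2, 3)

# Multiplicative updates in matrix form
for t in 1:20
    V .*= (X * W') ./ (V * (W * W'))
    W .*= (V' * X) ./ ((V' * V) * W)
end

# The symmetry of the starting point is never broken
cols_equal = V[:, 1] ≈ V[:, 2]
rows_equal = W[1, :] ≈ W[2, :]
println((cols_equal, rows_equal))
```

Because the columns of $V$ and the rows of $W$ stay identical, $VW$ is a rank-one matrix no matter how large $r$ is, so this start cannot use the extra components.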

1.6 Plot the basis images (rows of  $W$ ) at rank  $r=50$ . What do you find?
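
One way to view all basis images at once is to tile the rows of $W$ into a single matrix and display that matrix as one image. The sketch below builds such a mosaic. The helper name `basis_mosaic` and its arguments are hypothetical, and the demonstration uses a tiny made-up $W$ (six 2-by-2 "images") rather than the fitted rank-50 factor.

```julia
# Hypothetical helper: tile the rows of W (one basis image per row)
# into a grid with ncols images per grid row. Each row of W must hold
# h * w pixels, reshaped column-major as in reshape(X[i, :], 19, 19).
function basis_mosaic(W::AbstractMatrix, h::Int, w::Int, ncols::Int)
    r = size(W, 1)
    size(W, 2) == h * w || throw(ArgumentError("each row of W must have h*w entries"))
    nrows = cld(r, ncols)                      # grid rows needed for r images
    mosaic = zeros(eltype(W), nrows * h, ncols * w)
    for k in 1:r
        gi, gj = divrem(k - 1, ncols)          # zero-based grid row and column
        rows = gi * h .+ (1:h)
        cols = gj * w .+ (1:w)
        mosaic[rows, cols] = reshape(W[k, :], h, w)
    end
    return mosaic
end

# Toy factor: 6 basis "images" of 2-by-2 pixels, tiled 3 per grid row
W_toy = Matrix(reshape(1.0:24.0, 6, 4))
mosaic = basis_mosaic(W_toy, 2, 2, 3)
println(size(mosaic))
```

For the rank-50 fit, a call such as `basis_mosaic(W, 19, 19, 10)` would give a 95-by-190 matrix (5 grid rows of 10 faces) that can be passed to `imshow`. Rescaling the values for display may be needed.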