# ECE367: PS02 Part 2 -- PageRank

## Framing

$N$ is the number of URLs. $J$ is the $N \times N$ adjacency matrix.

### Power Iteration Method

* Iterative method for computing the dominant eigenvalue and a corresponding eigenvector of a diagonalizable matrix.
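
For reference, one common way to write a single step of the iteration is shown here. The intermediate vector $y$ and the eigenvalue estimate $\lambda$ are notation assumed for this note; $x(k)$ and $A$ are as in the Steps section.

$$y(k+1) = A\,x(k), \qquad x(k+1) = \frac{y(k+1)}{||y(k+1)||_2}, \qquad \lambda(k+1) = x^*(k+1)\,A\,x(k+1)$$

The iterates converge to a dominant eigenvector when one eigenvalue is strictly largest in magnitude and the starting vector $x(0)$ has a nonzero component along its eigenvector.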

## Steps

- [ ] Load the PageRank data from `pagerank_urls.txt` and `pagerank_adj.mat`.
- [ ] Based on adjacency matrix $J$ calculate $$A_{i, j} = \frac{J_{i, j}}{\sum_{k=1}^{N}J_{k,j}}$$
    - [ ] Verify that the columns add to 1.
- [ ] Implement the **power iteration method** (OptM 7.1.1) for 10 iterations.
    - [ ] Calculate $e(k+1) = ||Ax(k+1) - x(k+1)||_2$
    - [ ] Plot $\log(e(k+1))$ vs. $k$
- [ ] Implement **shift-invert power iteration** and **Rayleigh quotient iteration** algorithms (OptM 7.1.2, 7.1.3). 
    - [ ] For shift-invert: $\sigma = 0.99$.
    - [ ] For Rayleigh quotient: $\sigma_1 = \sigma_2 = 0.99$ for first two iterations.
        - [ ] $\sigma_k = \frac{x^*(k) Ax(k)}{x^*(k)x(k)}$ for $k > 2$.
    - [ ] Plot $\log(e(k+1))$ vs. $k$ for each method.
- [ ] List the (page index, PageRank score) tuples for **top 5** and **bottom 5** pages according to PageRank scores.
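
The shift-invert and Rayleigh quotient steps above can be sketched as follows. This is a minimal illustration, not the OptM reference implementation: the function names `shift_invert_sketch`, `rayleigh_sketch`, and `residual` are hypothetical, the starting vector is assumed to be a plain `Vector`, and the small matrix at the end is a made-up column-stochastic example rather than the PageRank data.

```julia
using LinearAlgebra

# Hypothetical helper: e = ||A*x - x||_2, the residual for the eigenvalue 1.
residual(A, x) = norm(A * x - x)

# Shift-invert power iteration with a fixed shift σ.
function shift_invert_sketch(A, x0, σ, num_iters)
    x = x0 / norm(x0)
    F = lu(A - σ * I)            # factor once, reuse in every iteration
    errors = zeros(num_iters)
    for k in 1:num_iters
        y = F \ x                # solve (A - σI) y = x
        x = y / norm(y)
        errors[k] = residual(A, x)
    end
    return x, errors
end

# Rayleigh quotient iteration: the shift is σ0 for the first two iterations,
# then the Rayleigh quotient of the current iterate.
function rayleigh_sketch(A, x0, σ0, num_iters)
    x = x0 / norm(x0)
    errors = zeros(num_iters)
    for k in 1:num_iters
        σ = k <= 2 ? σ0 : dot(x, A * x) / dot(x, x)
        # Caveat: as σ approaches an eigenvalue, A - σI becomes nearly
        # singular and this solve can fail; a sketch, so no safeguard here.
        y = (A - σ * I) \ x
        x = y / norm(y)
        errors[k] = residual(A, x)
    end
    return x, errors
end

# Made-up 2×2 column-stochastic matrix, for illustration only.
A_small = [0.5 0.2; 0.5 0.8]
x_si, e_si = shift_invert_sketch(A_small, [1.0, 1.0], 0.99, 5)
x_rq, e_rq = rayleigh_sketch(A_small, [1.0, 1.0], 0.99, 4)
```

With the notebook's data, a call might look like `shift_invert_sketch(A, vec(x), 0.99, 10)`, assuming `x` is first flattened to a vector. Factoring `A - σI` once is possible only for the fixed shift; the Rayleigh variant has to solve a new system whenever the shift changes.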

In [92]:
# IMPORT BOX #
using Plots
using Plotly
using GR
using SymPy
using MAT
using LinearAlgebra

plotly()

Plots.PlotlyBackend()

In [93]:
file_J = matopen("PS02_dataSet/pagerank_adj.mat")
J = read(file_J)["J"];
typeof(J)

Array{Float64,2}

In [94]:
# Normalization function for J: scale each column of J so that it sums to 1
function normalize_J(J::Array{Float64,2})
    A = zeros(size(J))

    for j = 1:size(J,2)              # iterate over columns
        col_sum = sum(J[:,j])
        if col_sum == 0
            println("Sum of column ",j," equals 0...")
        else
            A[:,j] = J[:,j]/col_sum
        end
    end
    A
end

normalize_J (generic function with 1 method)

In [95]:
A = normalize_J(J);
x = sum(A, dims=1);

In [96]:
good_sum = true

for i = 1:length(x)  # x is 1×N, so size(x,1) would check only the first column
    if abs(x[i] - 1) > 0.00001
        good_sum = false
        println("incorrect sum at ",i," -- x[i] = ",x[i])
    end
end

if good_sum
    println("Sum of A's columns are all 1. Checks out")
else
    println("Sum of A's columns are not all 1.")
end

x = transpose(x) # Turning x into a column vector

Sum of A's columns are all 1. Checks out


2571×1 Transpose{Float64,Array{Float64,2}}:
 1.0
 1.0
 0.9999999999999999
 1.0
 1.0
 1.0
 1.0
 0.9999999999999999
 1.0
 1.0
 1.0
 1.0
 1.0
 ⋮
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0

In [97]:
function get_err(A::Array{Float64,2}, x)
    norm(A*x - x)
end

get_err (generic function with 2 methods)

In [102]:
function power_iteration(A::Array{Float64,2}, x, num_iters)
    x = x./norm(x)               # start from a unit-norm vector

    errors = zeros(num_iters)    # errors[k] holds e(k+1) = ||A x(k+1) - x(k+1)||_2

    for k = 1:num_iters
        y = A*x
        x = y/norm(y)            # x(k+1)
        λ = dot(x, A*x)          # eigenvalue estimate (Rayleigh quotient); not returned
        errors[k] = get_err(A, x)
    end

    return (x, errors)
end

power_iteration (generic function with 2 methods)

In [113]:
x_new, errors = power_iteration(A, x, 10);

In [114]:
Plots.plot(1:length(errors), log.(errors), label="power iteration")
Plots.title!("Power Iteration: Log Error vs. Iteration")
Plots.xlabel!("k")
Plots.ylabel!("log(e(k+1))")
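
For the last step in the checklist, once an eigenvector estimate such as `x_new` is available, the (page index, PageRank score) tuples could be listed along these lines. This is a sketch under stated assumptions: `rank_pages` is a hypothetical helper, scaling the scores to sum to 1 is a convention assumed here, and the three-element vector is a toy input rather than actual results.

```julia
# Hypothetical helper: (page index, score) tuples sorted by descending score.
function rank_pages(scores)
    s = vec(abs.(scores))          # the sign of an eigenvector is arbitrary; drop it
    s = s ./ sum(s)                # assumed convention: scores sum to 1
    order = sortperm(s, rev=true)  # indices from highest to lowest score
    return [(i, s[i]) for i in order]
end

# Toy scores, for illustration only.
ranking = rank_pages([0.1, 0.6, 0.3])
top2 = ranking[1:2]
bottom2 = ranking[end-1:end]
```

With the notebook's variables, `rank_pages(x_new)[1:5]` and `rank_pages(x_new)[end-4:end]` would give the top 5 and bottom 5 pages.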