Here we load the adjacency matrix of a graph with 2790 nodes. Each node is a web page referring to Roswell, NM, and the edges represent links between web pages.

In [1]:
using JLD
vars = load("roswelladj.jld")       # get from the book's website
i = vars["i"];  j = vars["j"];

using SparseArrays
A = sparse(i,j,fill(1.0,size(i)),2790,2790)
varinfo(r"A")                       # to see memory consumption

| name |        size | summary                                  |
|:---- | -----------:|:---------------------------------------- |
| A    | 110.516 KiB | 2790×2790 SparseMatrixCSC{Float64,Int32} |


We may define the density of $\mathbf{A}$ as the number of nonzeros divided by the total number of entries.

In [2]:
m,n = size(A)
@show density = nnz(A) / (m*n);

density = nnz(A) / (m * n) = 0.0010902994565845762
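
The construction and the density calculation can be tried on a graph small enough to check by hand. The following is a minimal sketch; the 3-node edge list and the names `ii`, `jj`, `B`, and `densityB` are hypothetical and not part of the Roswell data.

```julia
using SparseArrays

# Hypothetical edge list for a 3-node graph: 1->2, 2->3, 3->1, 1->3
ii = [1, 2, 3, 1]
jj = [2, 3, 1, 3]

# One stored entry of 1.0 per edge; everything else is a structural zero
B = sparse(ii, jj, fill(1.0, length(ii)), 3, 3)

# Density: stored nonzeros divided by the total number of entries
mB, nB = size(B)
densityB = nnz(B) / (mB * nB)    # 4 nonzeros out of 9 entries
```

Calling `Matrix(B)` displays the dense equivalent, which is convenient for inspecting a matrix this small.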


We can compare the storage space needed for the sparse $\mathbf{A}$ with the space needed for its dense, or full, counterpart. The ratio of sparse to dense storage can never be as small as the density of nonzeros, because the locations of the nonzeros must be stored along with their values. Still, the ratio is quite small here, even though the matrix is not especially large.

In [3]:
F = Matrix(A)
varinfo(r"F")

| name |       size | summary                    |
|:---- | ----------:|:-------------------------- |
| F    | 59.388 MiB | 2790×2790 Array{Float64,2} |


In [4]:
@show storageratio = 154000/59e6;

storageratio = 154000 / 5.9e7 = 0.0026101694915254235
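
To see where the sparse storage goes, one can add up the three arrays that make up the compressed sparse column (CSC) format: the nonzero values, their row indices, and the column pointers. The sketch below does this for a small hypothetical diagonal matrix `Bd`; the helper `cscbytes` is an illustrative name, not a library function, and it ignores the small fixed overhead of the container itself.

```julia
using SparseArrays

# Bytes used by the three arrays of a CSC matrix (illustrative helper)
cscbytes(S::SparseMatrixCSC) =
    sizeof(nonzeros(S)) + sizeof(rowvals(S)) + sizeof(S.colptr)

# Hypothetical 100×100 matrix with one nonzero per column (the diagonal)
idx = collect(1:100)
Bd = sparse(idx, idx, fill(1.0, 100), 100, 100)

sparsebytes = cscbytes(Bd)         # values + row indices + column pointers
densebytes  = sizeof(Matrix(Bd))   # 8 bytes for each of the 100^2 entries

densityBd = nnz(Bd) / length(Bd)
ratioBd   = sparsebytes / densebytes   # larger than the density, since locations are stored too
```

For the matrices in this section, `Base.summarysize(A)` and `Base.summarysize(F)` return byte counts that include container overhead, which avoids entering the sizes by hand.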


Matrix-vector products are also much faster using the sparse form, because operations with structural zeros are skipped.

In [5]:
x = randn(n)
@elapsed for i = 1:200; A*x; end

0.047325839

In [6]:
@elapsed for i = 1:200; F*x; end

1.165482995
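
The reason structural zeros cost nothing can be seen in a hand-written product that walks the CSC arrays directly. This is only a sketch for illustration, with the hypothetical names `cscmatvec`, `Bs`, and `xs`; it is not the implementation used by `A*x`, which should be preferred in practice.

```julia
using SparseArrays

# Illustrative CSC matrix-vector product: work is proportional to nnz(S), not m*n
function cscmatvec(S::SparseMatrixCSC, x::AbstractVector)
    rows = rowvals(S)      # row index of each stored entry
    vals = nonzeros(S)     # value of each stored entry
    y = zeros(promote_type(eltype(S), eltype(x)), size(S, 1))
    for j in 1:size(S, 2)
        for k in nzrange(S, j)          # only the stored entries of column j
            y[rows[k]] += vals[k] * x[j]
        end
    end
    return y
end

# Small hypothetical test case
Bs = sparse([1, 2, 3, 1], [2, 3, 1, 3], [1.0, 2.0, 3.0, 4.0], 3, 3)
xs = [1.0, 10.0, 100.0]
cscmatvec(Bs, xs) ≈ Bs * xs    # true
```

The inner loop visits each stored entry exactly once, so a matrix with density near 0.001 needs roughly a thousandth of the multiplications of its dense counterpart.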

However, the sparse storage format is column-oriented, so operations on rows may take longer than the corresponding operations on columns. (Note: The difference is dramatic in MATLAB, but only modest in Julia, as the timings below show.)

In [7]:
v = A[:,1000]
println("time for replacing columns:")
for i = 1:n; A[:,i]=v; end  # untimed first pass; discarding it improves timing accuracy
@elapsed for i = 1:n; A[:,i]=v; end

time for replacing columns:


0.19418728

In [8]:
r = v'
println("time for replacing rows:")
for i = 1:n; A[i,:]=r; end  # untimed first pass; discarding it improves timing accuracy
@elapsed for i = 1:n; A[i,:]=r; end

time for replacing rows:


0.241511209
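
The asymmetry between rows and columns comes from how the entries are laid out. As a rough sketch (the helpers `colcount` and `rowcount` and the matrix `Bc` are hypothetical names for illustration), counting the stored entries in a column needs only that column's index range, while counting those in a row requires examining every stored row index.

```julia
using SparseArrays

# Stored entries in column j: a direct lookup of that column's range
colcount(S::SparseMatrixCSC, j) = length(nzrange(S, j))

# Stored entries in row i: every stored row index has to be examined
rowcount(S::SparseMatrixCSC, i) = count(==(i), rowvals(S))

# Small hypothetical matrix with entries at (1,2), (2,3), (3,1), (1,3)
Bc = sparse([1, 2, 3, 1], [2, 3, 1, 3], fill(1.0, 4), 3, 3)
colcount(Bc, 3), rowcount(Bc, 1)
```

The column query costs the same regardless of how many nonzeros the matrix holds, whereas the cost of this simple row query grows with `nnz`.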