
# Demo 3: calculating the Frobenius norm, looping over rows vs columns

Demonstrates the effect of stride length, and of row- versus column-major storage, on the time it takes to compute the Frobenius norm.
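
To make "stride" concrete, here is a minimal sketch (not part of the original demo) that inspects how NumPy lays out a small array. It assumes the default row-major ("C") order and the default `float64` dtype; the 3-by-4 shape is chosen only for illustration.

```python
import numpy as np

# Small example matrix; NumPy uses row-major ("C") order by default
A = np.zeros((3, 4))            # float64, so each entry occupies 8 bytes

print(A.flags['C_CONTIGUOUS'])  # True: each row is one contiguous block
print(A.strides)                # (32, 8): 32 bytes to the next row, 8 to the next column

row = A[0, :]   # view with an 8-byte stride (consecutive entries are adjacent in memory)
col = A[:, 0]   # view with a 32-byte stride (each step skips a full row)
print(row.strides, col.strides)  # (8,) (32,)
```

Walking along a row touches adjacent memory, while walking down a column jumps by a whole row each step, which tends to use the cache less efficiently.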

See also the C language demo.

Stephen Becker, Aug 2021, APPM 5650 Randomized Algorithms, University of Colorado Boulder

In [1]:
import numpy as np
rng = np.random.default_rng(12345)

In [24]:
def FrobeniusNormByRow(A, use_blas = True):
  """ Outer loop over rows (inner loop over columns) """
  m,n = A.shape
  nrm = 0.
  if use_blas:
    for row in range(m):
      nrm += np.linalg.norm( A[row,:] )**2  # Euclidean norm of one row (a 1D slice), squared
  else:
    for row in range(m):
      for col in range(n):
        nrm += A[row,col]**2
  return np.sqrt(nrm)

def FrobeniusNormByColumn(A, use_blas = True):
  """ Outer loop over columns (inner loop over rows) """
  m,n = A.shape
  nrm = 0.
  if use_blas:
    for col in range(n):
      nrm += np.linalg.norm( A[:,col] )**2  # Euclidean norm of one column (a 1D slice), squared
  else:
    for col in range(n):
      for row in range(m):
        nrm += A[row,col]**2
  return np.sqrt(nrm)

### Run some experiments

In [26]:
n   = int(1e4)
A   = rng.standard_normal( size=(n,n) )

%time nrm = np.linalg.norm(A)
print(f'The true norm is {nrm-1e4:.6f} + 1e4')

CPU times: user 121 ms, sys: 1.02 ms, total: 122 ms
Wall time: 64.3 ms
The true norm is -1.311721 + 1e4
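
The norm is printed as an offset from $n = 10^4$ because the entries $a_{ij}$ of $A$ are independent standard normals, so the Frobenius norm concentrates near $n$. A short derivation (added here as a sketch, using only standard facts about the chi-square distribution):

$$
\mathbb{E}\,\|A\|_F^2 \;=\; \sum_{i=1}^{n}\sum_{j=1}^{n} \mathbb{E}\,a_{ij}^2 \;=\; n^2,
\qquad
\operatorname{Var}\,\|A\|_F^2 \;=\; 2n^2 .
$$

Since $\|A\|_F^2$ is chi-square with $n^2$ degrees of freedom, a first-order expansion of the square root gives

$$
\|A\|_F \;\approx\; n + \frac{\|A\|_F^2 - n^2}{2n},
\qquad
\operatorname{std}\,\|A\|_F \;\approx\; \frac{\sqrt{2n^2}}{2n} \;=\; \frac{1}{\sqrt{2}} \approx 0.71 ,
$$

so an offset of order one, such as the $-1.31$ printed above, is what one would expect.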


In [22]:
%time nrmRow = FrobeniusNormByRow(A, use_blas = True)
print(f'Looping over rows, the discrepancy in the norm is {nrmRow-nrm:.8f}')

CPU times: user 153 ms, sys: 0 ns, total: 153 ms
Wall time: 154 ms
Looping over rows, the discrepancy in the norm is -0.00000000


In [25]:
%time nrmCol = FrobeniusNormByColumn(A, use_blas = True)
print(f'Looping over columns, the discrepancy in the norm is {nrmCol-nrm:.8f}')

CPU times: user 615 ms, sys: 2.93 ms, total: 618 ms
Wall time: 628 ms
Looping over columns, the discrepancy in the norm is -0.00000000
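
The column loop above is roughly four times slower than the row loop (628 ms versus 154 ms wall time), which is consistent with NumPy's default row-major storage: a row slice is contiguous, a column slice is not. The following sketch, which is not part of the original demo, checks the contiguity flags and shows that a column-major ("Fortran" order) copy reverses the situation. The matrix `B`, its size, and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)             # seed chosen arbitrarily for this sketch
B   = rng.standard_normal( size=(200,300) ) # small stand-in for A
B_f = np.asfortranarray(B)                 # same entries, stored column-major

# Row-major: rows are contiguous, columns are strided
print(B[0,:].flags['C_CONTIGUOUS'], B[:,0].flags['C_CONTIGUOUS'])      # True False
# Column-major: columns are contiguous, rows are strided
print(B_f[0,:].flags['C_CONTIGUOUS'], B_f[:,0].flags['C_CONTIGUOUS'])  # False True

# The storage order does not change the norm (up to rounding)
print(np.isclose(np.linalg.norm(B), np.linalg.norm(B_f)))
```

With a column-major array one would therefore expect the column loop to be the faster one, though the actual ratio depends on the machine and the matrix size.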


### Repeat the experiment without using BLAS
Let's make the matrix smaller so we don't have to wait as long.

Without BLAS, the row and column versions differ much less, because the overhead of the Python `for` loops (Python is interpreted rather than compiled) already dominates the running time.
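
To see how large the per-element loop overhead is compared with a single vectorized call, here is a minimal sketch, not part of the original demo. The matrix `C`, its deliberately small size, and the seed are assumptions made for illustration, and timings will vary by machine.

```python
import time
import numpy as np

rng = np.random.default_rng(0)             # arbitrary seed for this sketch
C   = rng.standard_normal( size=(300,300) ) # deliberately small

t0    = time.perf_counter()
total = 0.
for i in range(C.shape[0]):
  for j in range(C.shape[1]):
    total += C[i,j]**2                     # every iteration pays interpreter and indexing overhead
loop_norm = np.sqrt(total)
t_loop    = time.perf_counter() - t0

t0       = time.perf_counter()
vec_norm = np.sqrt( np.sum(C**2) )         # the whole sum runs inside compiled NumPy code
t_vec    = time.perf_counter() - t0

print(f'loop: {t_loop:.4f} s, vectorized: {t_vec:.6f} s')
print(f'discrepancy between the two: {loop_norm - vec_norm:.2e}')
```

When most of the time goes to interpreter overhead rather than to fetching memory, the order in which entries are visited has little room to matter.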

In [31]:
n   = int(4e3)
A   = rng.standard_normal( size=(n,n) )

%time nrm = np.linalg.norm(A)
print(f'The true norm is {nrm-n:.6f} + ', n)

CPU times: user 18.9 ms, sys: 1.03 ms, total: 20 ms
Wall time: 10.4 ms
The true norm is -0.319010 +  4000


In [32]:
%time nrmRow = FrobeniusNormByRow(A, use_blas = True)
print(f'Looping over rows, the discrepancy in the norm is {nrmRow-nrm:.8f}')

%time nrmRow = FrobeniusNormByRow(A, use_blas = False)
print(f'Looping over rows (no BLAS), the discrepancy in the norm is {nrmRow-nrm:.8f}')

CPU times: user 44.9 ms, sys: 3.03 ms, total: 47.9 ms
Wall time: 51.7 ms
Looping over rows, the discrepancy in the norm is 0.00000000
CPU times: user 10.4 s, sys: 20.1 ms, total: 10.5 s
Wall time: 10.5 s
Looping over rows (no BLAS), the discrepancy in the norm is 0.00000000


In [33]:
%time nrmCol = FrobeniusNormByColumn(A, use_blas = True)
print(f'Looping over columns, the discrepancy in the norm is {nrmCol-nrm:.8f}')

%time nrmCol = FrobeniusNormByColumn(A, use_blas = False)
print(f'Looping over columns (no BLAS), the discrepancy in the norm is {nrmCol-nrm:.8f}')

CPU times: user 107 ms, sys: 2 ms, total: 109 ms
Wall time: 113 ms
Looping over columns, the discrepancy in the norm is 0.00000000
CPU times: user 10.6 s, sys: 18.7 ms, total: 10.6 s
Wall time: 10.6 s
Looping over columns (no BLAS), the discrepancy in the norm is -0.00000000
