# Homework Assignment 5: Recursion and the Spectral Theorem

During the lecture we proved that we can transform an arbitrary real orthogonal matrix $A$ into its canonical form using an orthogonal change-of-basis matrix $Q$ 
$$ \begin{equation*}
       Q^T A Q = \begin{bmatrix}
       \pm 1 & & & & & \\
       & \ddots & & & & \\
       & & \pm 1 & & & \\
       & & & R_1 & & \\
       & & & & \ddots & \\
       & & & & & R_k
   \end{bmatrix} \tag{$\star$}
   \end{equation*} $$
where $R_1, \dots, R_k$ are rotation matrices of the form
   $$\begin{bmatrix}
       \cos(\theta) & -\sin(\theta) \\
       \sin(\theta) & \cos(\theta)
   \end{bmatrix},$$
   where $\cos(\theta) = \text{Re}(\lambda)$ for a non-real eigenvalue $\lambda$ of $A$. The proof was by induction, and it also sketches a procedure for bringing a matrix $A$ into this form. 

Induction proofs can be implemented numerically by a technique called recursion, in which a complicated problem is reduced to smaller instances until we reach a base case whose solution is known and from which the solutions to the larger problems can be derived. You will first be introduced to this way of programming and then tackle the problem of implementing the spectral decomposition. 

In [164]:
# All Import Statements Defined Here
# Note: Do not add to this list anywhere.
# ----------------

import math
import numpy as np
import numpy.linalg as la

## Recursive functions

A recursive function is a function that calls itself until a stopping condition (the base case) is met. Typically, you use a recursive function to divide a big problem that is difficult to solve into smaller problems that are easier to solve. In programming, you'll often find recursive functions used in data structures and algorithms like trees, graphs and binary searches.

The following `fn()` shows the general structure of a recursive function. Here `condition` is a placeholder for the stopping test, and `pass` marks the spot where the base case is handled.

In [162]:
def fn():
    #...
    if condition:
        # stop calling itself
        pass
    else:
        fn()
    #...

Suppose you need to develop a `countdown` function that counts down from a specified number to zero. For example, if you call the function that counts down from 3, it'll show the output

    3
    2
    1
    Liftoff!

The following defines a recursive `countdown()` function and calls it by passing the number 3.

In [165]:
def countdown(start):
    """ Count down from a positive number """
    print(start)
    countdown(start-1)

#countdown(3)

If you execute this program, it will print a bunch of numbers, until an error message appears telling you the maximum recursion depth has been exceeded. The reason is that `countdown` calls itself indefinitely until the system stops it. Since we need to stop counting down when we reach zero, we need to add a condition as follows:

In [166]:
def countdown(start):
    """ Count down from a positive number """
    if start == 0:
        print("Liftoff!")
    else:
        print(start)
        countdown(start-1) 

countdown(3)

3
2
1
Liftoff!


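In this example, `countdown` only calls itself while `start` is nonzero. Once `start` reaches zero, it is time for liftoff. Note that a negative or non-integer `start` never reaches the base case, so the function assumes a non-negative integer. 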
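
The runaway recursion described earlier can be observed safely by catching the exception that Python raises. The sketch below uses a hypothetical helper `countdown_forever` (not part of the assignment) that, like the first version of `countdown`, has no base case; the printing is left out to keep the output short.

```python
import sys

def countdown_forever(start):
    """Hypothetical variant of countdown without a base case (illustration only)."""
    return countdown_forever(start - 1)

try:
    countdown_forever(3)
except RecursionError as err:
    # Python aborts the call chain once the recursion limit is reached
    print("Stopped by Python:", err)

print("Recursion limit:", sys.getrecursionlimit())
```

The limit is a safety net of the interpreter, not a substitute for a base case.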

### Recursive functions to calculate sums

Suppose that you need to calculate a sum of a sequence, e.g. from 1 to 100. In the first assignment we saw a simple way to do this using a for loop:

In [167]:
def arithmetic_sum(n):
    total = 0
    for k in range(1, n+1):
        total += k
    return(total)

print(arithmetic_sum(100))

5050


To apply the recursion technique, you can calculate the sum of the sequence from 1 to n as follows:
- recursive_sum(n) = n + recursive_sum(n-1)
- recursive_sum(n-1) = (n-1) + recursive_sum(n-2)
- ...
- recursive_sum(2) = 2 + recursive_sum(1)
- recursive_sum(1) = 1 + recursive_sum(0)

The function `recursive_sum` keeps calling itself as long as its argument is greater than zero. When it reaches the base case $n=0$, we already know what the output should be: the empty sum, which is 0. The following function defines the recursive version:

In [168]:
def recursive_sum(n):
    if n == 0:
        return(0)
    else:
        return(n + recursive_sum(n-1))
    
print(recursive_sum(100))

5050
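
To see how the calls nest and then unwind, it can help to print a trace. The sketch below is a hypothetical variant, `recursive_sum_traced`, with an extra `depth` parameter used only for indentation; it performs the same recursion as `recursive_sum`.

```python
def recursive_sum_traced(n, depth=0):
    """Same recursion as recursive_sum, with a printed trace (illustration only)."""
    indent = "  " * depth
    print(f"{indent}recursive_sum({n}) called")
    if n == 0:
        result = 0  # base case: the empty sum
    else:
        result = n + recursive_sum_traced(n - 1, depth + 1)
    print(f"{indent}recursive_sum({n}) returns {result}")
    return result

total = recursive_sum_traced(3)
```

Each call waits for the smaller call to return before it can finish its own addition, which is why the "returns" lines appear in reverse order.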


## Problem 1: Getting started with recursive functions (1 point)

Implement the function `power` below in a recursive way, that is, without using the built-in power-operator `**` or the built-in function `pow`. 

*Hint: mimic the function `recursive_sum` above. Think about what the "base case" should be. Moreover you may assume the convention that $0^0 = 1$ for convenience (although this isn't always a natural assumption, see [here](https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero)).*

In [169]:
def power(x, n):
    """
    Returns the result of x raised to the power n (i.e. x^n) 
    in a recursive manner. 
    
    Parameters
    ----------
        x (float): base number
        n (int): non-negative integer to which power x will be raised
        
    Returns
    -------
        y (float): result x raised to the power n
    
    """
    
    # YOUR CODE HERE
    if n == 0:
        return(1)
    else:
        return(x * power(x, n-1))


In [None]:
# You can use this code cell to play around with your functions to make sure
# they do what they are intended to do, i.e. to debug your code. 


In [170]:
# Test cases
y = power(2, 10)
print("2^10 =", y)

y = power(3, 5)
print("3^5 =", y)

y = power(5, 0)
print("5^0 =", y)

2^10 = 1024
3^5 = 243
5^0 = 1


Expected output:

    2^10 = 1024
    3^5 = 243
    5^0 = 1

In [171]:
# AUTOGRADER, BEWARE OF POTENTIAL HIDDEN TEST CASES
# Note that this is not an exhaustive check for correctness.

y1 = power(2, 10)
assert np.allclose(y1, 1024)

y2 = power(3, 5)
assert np.allclose(y2, 243)

y3 = power(5, 0)
assert np.allclose(y3, 1)


## Problem 2: Maximum in a list (1 point)

Implement a recursive function `recursive_maximum` below that returns the maximum value of a non-empty list of integers. You are **not allowed to use** Python's built-in `max` function or any kind of loop (`for`, `while`). The function must use recursion.

In [172]:
def recursive_maximum(lst):
    """
    Recursively computes the maximum element of a non-empty list.

    Parameters
    ----------
        lst (list): A non-empty list of integers.

    Returns
    -------
        (int): The maximum value in the list
    
    """

    # YOUR CODE HERE
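    if len(lst) == 1:
        return(lst[0])
    else:
        # Recurse once and reuse the result; calling recursive_maximum
        # twice per level can make the running time exponential.
        rest_max = recursive_maximum(lst[1:])
        if lst[0] > rest_max:
            return(lst[0])
        else:
            return(rest_max)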

In [None]:
# You can use this code cell to play around with your functions to make sure
# they do what they are intended to do, i.e. to debug your code. 


In [173]:
# Test cases
test_list = [3, 7, 2, 9, 5]
print(f"The maximum element of {test_list} is {recursive_maximum(test_list)}.")

test_list = [42]
print(f"The maximum element of {test_list} is {recursive_maximum(test_list)}.")

The maximum element of [3, 7, 2, 9, 5] is 9.
The maximum element of [42] is 42.


Expected output:

    The maximum element of [3, 7, 2, 9, 5] is 9.
    The maximum element of [42] is 42.

In [174]:
# AUTOGRADER, BEWARE OF POTENTIAL HIDDEN TEST CASES
# Note that this is not an exhaustive check for correctness.

m1 = recursive_maximum([3, 7, 2, 9, 5])
assert np.allclose(m1, 9)

m2 = recursive_maximum([42])
assert np.allclose(m2, 42)


## Problem 3: Permutation sequences (2 points)

The list \[1, 2, ..., n\] contains a total of $n!$ unique permutations. By listing and labeling all of the permutations in ascending order, we get the following sequence for $n=3$:
    
    1. "123"
    2. "132"
    3. "213"
    4. "231"
    5. "312"
    6. "321"

Your task is to implement the function `permutation` below that finds the $k^{\text{th}}$ permutation sequence for a list of distinct digits. For example, the third permutation sequence for \[1, 2, 3\] would be "213". Note that your output should be a string, not an integer!

*Hint 1. Your first realization should be that for each starting number there are (n-1)! possibilities. E.g. for \[1, 2, 3, 4\] the possible permutation sequences are:*
    
    1. "1234"       7. "2134"      13. "3124"      19. "4123"
    2. "1243"       8. "2143"      14. "3142"      20. "4132"
    3. "1324"       9. "2314"      15. "3214"      21. "4213"
    4. "1342"      10. "2341"      16. "3241"      22. "4231"
    5. "1423"      11. "2413"      17. "3412"      23. "4312"
    6. "1432"      12. "2431"      18. "3421"      24. "4321"

*So we see for each starting number there are six possibilities, which is the same as 3!.*

*Hint 2. You can find the starting number for the permutation by considering (k-1) // (n-1)!, that is integer division (how many times does (n-1)! fit into (k-1)). This is because for each possible starting number there are (n-1)! possibilities. So if we divide k-1 by this, we will get the index of the starting number in the list of digits.*

*Hint 3. If you found the starting number, you can remove it from your list of digits. Remember the method `list.remove(item)`. Now you need to repeat the process by applying `permutation` to the remaining digits. Think about what the new `k` should be.*

*Hint 4. What should be the base case in your recursion? Various options are possible.* 

*Hint 5. You can concatenate strings using the `+` operator. So `"4" + "213"` will create `"4213"`*

*Hint 6. You can use the built-in factorial function `math.factorial`.*
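
For small inputs, an answer can be cross-checked against a full enumeration. The sketch below is not a solution to the problem (it is not recursive and does not scale), and `kth_permutation_bruteforce` is a hypothetical helper name; it relies on `itertools.permutations` yielding results in lexicographic order when the input is sorted.

```python
from itertools import permutations

def kth_permutation_bruteforce(digits, k):
    """Reference answer by full enumeration (hypothetical helper, small n only)."""
    # permutations() yields tuples in lexicographic order for sorted input
    all_perms = list(permutations(sorted(digits)))
    return "".join(str(d) for d in all_perms[k - 1])

print(kth_permutation_bruteforce([1, 2, 3], 3))     # 213
print(kth_permutation_bruteforce([1, 2, 3, 4], 9))  # 2314
```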

In [177]:
def permutation(digits, k):
    """
    Returns the kth permutation sequence. 
    
    Parameters
    ----------
        digits (list): list containing n distinct digits between 1 and 9. 
        k (int): integer between 1 and n!. 
        
    Returns
    -------
        s (str): string representing the kth permutation sequence.
    
    """
    
    # YOUR CODE HERE
    if len(digits) == 0:
        return("")
    else:
        n = len(digits)
        index = (k - 1) // math.factorial(n - 1)
        select_digit = digits[index]
        remain_digits = digits[:index] + digits[index + 1:]
        new_k = k - index * math.factorial(n - 1)
        return(str(select_digit) + permutation(remain_digits, new_k))


In [None]:
# You can use this code cell to play around with your functions to make sure
# they do what they are intended to do, i.e. to debug your code. 


In [178]:
# Test cases
digits = list(range(1, 4))
k = 3
print("For [1, 2, 3] the third permutation sequence is", permutation(digits, k))

digits = list(range(1, 5))
k = 9
print("For [1, 2, 3, 4] the ninth permutation sequence is", permutation(digits, k))

digits = list(range(1, 7))
k = 451
print("For [1, 2, 3, 4, 5, 6] the 451st permutation sequence is", permutation(digits, k))

For [1, 2, 3] the third permutation sequence is 213
For [1, 2, 3, 4] the ninth permutation sequence is 2314
For [1, 2, 3, 4, 5, 6] the 451st permutation sequence is 456123


Expected output:
    
    For [1, 2, 3] the third permutation sequence is 213
    For [1, 2, 3, 4] the ninth permutation sequence is 2314
    For [1, 2, 3, 4, 5, 6] the 451st permutation sequence is 456123

In [179]:
# AUTOGRADER, BEWARE OF POTENTIAL HIDDEN TEST CASES
# Note that this is not an exhaustive check for correctness.

assert permutation(list(range(1,4)), 3) == "213"
assert permutation(list(range(1,5)), 1) == "1234"
assert permutation(list(range(1,5)), 9) == "2314"
assert permutation(list(range(1,5)), 24) == "4321"
assert permutation(list(range(1,7)), 451) == "456123"


## Problem 4: Spectral Decomposition of Orthogonal Matrices (6 = 1 + 2 + 3 points)

The proof of the spectral decomposition of an orthogonal matrix (see the reader or the appendix below) uses induction, which closely mirrors the recursive problems discussed earlier. Your task is to implement the spectral decomposition of orthogonal matrices in Python, following the steps of the proof of the spectral theorem. 

Implement the functions `orthonormal_basis`, `base_case` and `spectral_decomposition` below. 

Important instructions:

1. For `orthonormal_basis` it might be useful to think about how you can use the built-in QR factorization `np.linalg.qr` in `'complete'` mode; you can find its documentation [here](https://numpy.org/doc/stable/reference/generated/numpy.linalg.qr.html). Note that this approach might alter the sign of the original columns. Do not worry about this!

2. In each recursive call, use the first eigenvalue and eigenvector from `eVals` and `eVecs` to get results that are consistent with the provided test cases. Mathematically it does not matter which one you choose, but it might be easier to check your work if you stick to this rule. 

3. You can check if an eigenvalue is real or complex using the functions `np.isreal` or `np.iscomplex`, see the documentation [here](https://numpy.org/doc/stable/reference/generated/numpy.isreal.html). **However**, the eigenvalue/eigenvector function `np.linalg.eig` only approximates the actual eigenvalues and eigenvectors, and rounding errors can cause real eigenvalues to be treated as complex eigenvalues. For example, the matrix $\begin{bmatrix} 3 & 0 & 0 \\ 1 & 1 & 2 \\ 1 & -2 & 5 \end{bmatrix}$ has $\lambda=3$ as its only eigenvalue, but Python will return $\lambda=3+0.00000003i, \lambda=3-0.00000003i$ and $\lambda=3$. So **instead** of using `np.isreal` to check if an eigenvalue is real, you should use `np.isclose` to check if the imaginary part of the eigenvalue, which you can find using `np.imag`, is close to zero. 

4. When your matrix has complex eigenvalues, the `eVecs` matrix containing eigenvectors will show real numbers as complex numbers. That is, the number 3 will be treated as 3 + 0j. To make everything print nicely (which makes it easier to verify the test cases), convert the eigenvectors to display real values using `np.real`.

5. You should skip the final step in the proof, namely that of reshuffling the columns of $Q$ to bring the matrix $B$ into the form ($\star$). 
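
The difference between the exact test and the tolerance-based test can be seen on a small array. The sketch below uses hypothetical eigenvalues that mimic the round-off example above, rather than calling `np.linalg.eig`. The tolerance `atol=1e-6` is an assumed value for illustration; whether the default tolerance suffices for the orthogonal matrices in this assignment is not checked here.

```python
import numpy as np

# Hypothetical eigenvalues mimicking the round-off example above
eVals = np.array([3 + 3e-8j, 3 - 3e-8j, 3 + 0j])

exact = np.isreal(eVals)                          # tests imag == 0 exactly
default_tol = np.isclose(np.imag(eVals), 0)       # default atol=1e-8
loose_tol = np.isclose(np.imag(eVals), 0, atol=1e-6)  # assumed looser tolerance

print(exact)
print(default_tol)
print(loose_tol)
```

Since `np.isclose` compares against `atol + rtol * |b|` and here `b = 0`, only the absolute tolerance matters, so it should be chosen with the expected size of the round-off in mind.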

In [180]:
def orthonormal_basis(A):
    """
    Computes the matrix Q whose first k columns form an orthonormal basis for
    the image of A, and extends this to an orthonormal basis of R^n so that
    Q is an orthogonal matrix
    
    Parameters
    ----------
        A (ndarray): n x k matrix with linearly independent columns
        
    Returns
    -------
        Q (ndarray): n x n orthogonal matrix.
    
    """
    
    # YOUR CODE HERE
    Q, R = np.linalg.qr(A, mode='complete')
    return Q
    

In [182]:
def base_case(A):
    """
    Computes the matrices Q and B in the Spectral decomposition of A, 
    that is A = Q B Q^T, in the special cases that A is either
    a 1x1 or 2x2 matrix. The implementation should follow the proof
    in the reader.
    
    Parameters
    ----------
        A (ndarray): an arbitrary 1x1 or 2x2 orthogonal matrix with 
                     real entries.
        
    Returns
    -------
        Q (ndarray): orthogonal matrix.
        B (ndarray): matrix in canonical form (⋆) such that QBQ^T = A.
    
    """
    
    n = A.shape[0]
    eVals, eVecs = la.eig(A)  # The eig function returns normalized eigenvectors
    
    # YOUR CODE HERE
    if n == 1:
        Q = np.array([[1]])
        B = A
    else:
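        # Tolerance-based test, as recommended in the instructions above
        if np.isclose(np.imag(eVals), 0).all():
            # Discard round-off imaginary parts before building Q and B
            Q = np.real(eVecs)
            B = np.diag(np.sign(np.real(eVals)))

            # Sign convention that matches the provided test cases
            if Q[0,0] > 0:
                Q[:,0] = -Q[:,0]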
        else:
            w = eVecs[:, 0]

            x = np.real(w)
            y = np.imag(w)

            Q = np.array([np.sqrt(2)*y/la.norm(w), np.sqrt(2)*x/la.norm(w)]).T
            B = Q.T @ A @ Q
    
    return Q, B


In [183]:
def spectral_decomposition(A):
    """
    Computes the matrices Q and B in a Spectral decomposition of A, 
    that is A = Q B Q^T. The implementation should follow the proof
    in the textbook.
    
    Parameters
    ----------
        A (ndarray): an orthogonal n x n matrix.
        
    Returns
    -------
        Q (ndarray): orthogonal n x n matrix.
        B (ndarray): matrix in canonical form (⋆) such that QBQ^T = A.
    
    """
    
    n = A.shape[0]
    
    if n <= 2:
        Q, B = base_case(A)
        return(Q, B)
    
    eVals, eVecs = la.eig(A)
    
    # YOUR CODE HERE
    lam = eVals[0]
    w = eVecs[:, 0]

    if np.isclose(np.imag(lam), 0):
        v = np.real(w).reshape(-1,1)

        Q1 = orthonormal_basis(v)

        A1 = Q1.T @ A @ Q1
        B1 = A1[1:,1:]
        Q2, B2 = spectral_decomposition(B1)

        Q = Q1 @ np.block([
            [np.array([[1]]), np.zeros((1,n-1))],
            [np.zeros((n-1,1)), Q2]
        ])
        B = np.block([
            [np.array([[np.real(lam)]]), np.zeros((1,n-1))],
            [np.zeros((n-1,1)), B2]
        ])

    else:
        x = np.real(w)
        y = np.imag(w)

        u1 = np.sqrt(2) * y / la.norm(w)
        u2 = np.sqrt(2) * x / la.norm(w)

        V = np.column_stack([u1,u2])
        Q1 = orthonormal_basis(V)

        A1 = Q1.T @ A @ Q1
        B1 = A1[2:,2:]
        Q2, B2 = spectral_decomposition(B1)

        theta = np.angle(lam)
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])

        Q = Q1 @ np.block([
            [np.eye(2), np.zeros((2,n-2))],
            [np.zeros((n-2,2)), Q2]
        ])
        B = np.block([
            [R, np.zeros((2,n-2))],
            [np.zeros((n-2,2)), B2]
        ])

    return np.real(Q), np.real(B)
    

In [None]:
# You can use this code cell to play around with your functions to make sure
# they do what they are intended to do, i.e. to debug your code. 


In [184]:
# Test case for orthonormal_basis
# Note that the sign of the first column has changed, don't worry about this!
np.set_printoptions(precision=2, suppress=True)  # Print zeros and decimals nicely

A = np.array([[0, 1], [1, 1], [2, 1], [2, 1]])
Q = orthonormal_basis(A)
print("A = \n", A, "\n")
print("Q = \n", Q)

A = 
 [[0 1]
 [1 1]
 [2 1]
 [2 1]] 

Q = 
 [[ 0.   -0.9  -0.3  -0.3 ]
 [-0.33 -0.4   0.6   0.6 ]
 [-0.67  0.1   0.35 -0.65]
 [-0.67  0.1  -0.65  0.35]]


Expected output:

    A = 
     [[0 1]
     [1 1]
     [2 1]
     [2 1]] 

    Q = 
     [[ 0.   -0.9  -0.3  -0.3 ]
     [-0.33 -0.4   0.6   0.6 ]
     [-0.67  0.1   0.35 -0.65]
     [-0.67  0.1  -0.65  0.35]]
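
The role of the `'complete'` mode is easiest to see from the shapes of the factors. The sketch below reuses the test matrix above and compares it with the default reduced mode; the variable names are chosen for illustration only.

```python
import numpy as np

A = np.array([[0, 1], [1, 1], [2, 1], [2, 1]], dtype=float)

Q_red, R_red = np.linalg.qr(A)                     # default mode='reduced'
Q_full, R_full = np.linalg.qr(A, mode='complete')

print(Q_red.shape, R_red.shape)    # (4, 2) (2, 2)
print(Q_full.shape, R_full.shape)  # (4, 4) (4, 2)

# The last two columns of Q_full are orthogonal to the columns of A
print(np.allclose(Q_full[:, 2:].T @ A, 0))
```

Only the complete mode returns a square orthogonal matrix, which is what the extension to a full orthonormal basis requires.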

In [185]:
# Test cases for base_case
np.set_printoptions(precision=2, suppress=True)  # Print zeros and decimals nicely

A1 = np.array([[-1]])
Q1, B1 = base_case(A1)
print("Q1 =\n", Q1, "\nB1 =\n", B1, "\n")

A2 = np.array([[0, 1], [1, 0]])
Q2, B2 = base_case(A2)
print("Q2 =\n", Q2, "\nB2 =\n", B2, "\n")

A3 = 1/math.sqrt(2) * np.array([[1, 1], [-1, 1]])
Q3, B3 = base_case(A3)
print("Q3 =\n", Q3, "\nB3 =\n", B3, "\n")

Q1 =
 [[1]] 
B1 =
 [[-1]] 

Q2 =
 [[-0.71 -0.71]
 [-0.71  0.71]] 
B2 =
 [[ 1.  0.]
 [ 0. -1.]] 

Q3 =
 [[0. 1.]
 [1. 0.]] 
B3 =
 [[ 0.71 -0.71]
 [ 0.71  0.71]] 



Expected output:

    Q1 =
     [[1.]] 
    B1 =
     [[-1]] 

    Q2 =
     [[-0.71 -0.71]
     [-0.71  0.71]] 
    B2 =
     [[ 1. -0.]
     [-0. -1.]] 

    Q3 =
     [[0. 1.]
     [1. 0.]] 
    B3 =
     [[ 0.71 -0.71]
     [ 0.71  0.71]] 

In [186]:
# Test cases for spectral_decomposition
np.set_printoptions(precision=2, suppress=True)  # Print zeros and decimals nicely

A1 = 1/3*np.array([[2,1,-2],[1,2,2],[2,-2,1]])
Q1, B1 = spectral_decomposition(A1)
print("Q1 =\n", Q1, "\nB1 =\n",B1, "\n")

A2 = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])
Q2, B2 = spectral_decomposition(A2)
print("Q2 =\n", Q2, "\nB2 =\n",B2, "\n")

A3 = 1/3*np.array([[2,1,-2],[1,2,2],[-2,2,-1]])
Q3, B3 = spectral_decomposition(A3)
print("Q3 =\n", Q3, "\nB3 =\n",B3, "\n")

A4 = 1/2*np.array([[1,1,1,1],[1,1,-1,-1],[1,-1,1,-1],[1,-1,-1,1]])
Q4, B4 = spectral_decomposition(A4)
print("Q4 =\n", Q4, "\nB4 =\n",B4, "\n")

A5 = np.array([[1,0,0,0],[0,0,1,0],[0,-1,0,0],[0,0,0,1]])
Q5, B5 = spectral_decomposition(A5)
print("Q5 =\n", Q5, "\nB5 =\n",B5, "\n")

Q1 =
 [[-0.71 -0.   -0.71]
 [-0.71 -0.    0.71]
 [-0.    1.   -0.  ]] 
B1 =
 [[ 1.    0.    0.  ]
 [ 0.    0.33 -0.94]
 [ 0.    0.94  0.33]] 

Q2 =
 [[-0.71 -0.41  0.58]
 [ 0.71 -0.41  0.58]
 [ 0.    0.82  0.58]] 
B2 =
 [[-0.5  -0.87  0.  ]
 [ 0.87 -0.5   0.  ]
 [ 0.    0.    1.  ]] 

Q3 =
 [[-0.91  0.    0.41]
 [-0.18 -0.89 -0.41]
 [ 0.37 -0.45  0.82]] 
B3 =
 [[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0. -1.]] 

Q4 =
 [[-0.87  0.    0.5  -0.  ]
 [-0.29 -0.82 -0.5  -0.  ]
 [-0.29  0.41 -0.5   0.71]
 [-0.29  0.41 -0.5  -0.71]] 
B4 =
 [[ 1.  0.  0.  0.]
 [ 0.  1.  0.  0.]
 [ 0.  0. -1.  0.]
 [ 0.  0.  0.  1.]] 

Q5 =
 [[ 0.  0.  1.  0.]
 [ 0.  1.  0.  0.]
 [-1.  0.  0.  0.]
 [ 0.  0.  0.  1.]] 
B5 =
 [[ 0. -1.  0.  0.]
 [ 1.  0.  0.  0.]
 [ 0.  0.  1.  0.]
 [ 0.  0.  0.  1.]] 



Expected output:

    Q1 =
     [[-0.71  0.71  0.  ]
     [-0.71 -0.71  0.  ]
     [ 0.   -0.    1.  ]] 
    B1 =
     [[ 1.    0.   -0.  ]
     [-0.    0.33 -0.94]
     [ 0.    0.94  0.33]] 

    Q2 =
     [[-0.71 -0.41  0.58]
     [ 0.71 -0.41  0.58]
     [ 0.    0.82  0.58]] 
    B2 =
     [[-0.5  -0.87  0.  ]
     [ 0.87 -0.5  -0.  ]
     [ 0.    0.    1.  ]] 

    Q3 =
     [[-0.91  0.    0.41]
     [-0.18 -0.89 -0.41]
     [ 0.37 -0.45  0.82]] 
    B3 =
     [[ 1. -0. -0.]
     [ 0.  1.  0.]
     [-0.  0. -1.]] 

    Q4 =
     [[-0.87  0.   -0.   -0.5 ]
     [-0.29 -0.82 -0.    0.5 ]
     [-0.29  0.41 -0.71  0.5 ]
     [-0.29  0.41  0.71  0.5 ]] 
    B4 =
     [[ 1. -0.  0. -0.]
     [-0.  1.  0.  0.]
     [-0.  0.  1. -0.]
     [-0.  0. -0. -1.]] 

    Q5 =
     [[0. 0. 1. 0.]
     [0. 1. 0. 0.]
     [1. 0. 0. 0.]
     [0. 0. 0. 1.]] 
    B5 =
     [[ 0. -1.  0.  0.]
     [ 1.  0.  0.  0.]
     [ 0.  0.  1.  0.]
     [ 0.  0.  0.  1.]]

In [187]:
# AUTOGRADER FOR orthonormal_basis, BEWARE OF POTENTIAL HIDDEN TEST CASES
# Note that this is not an exhaustive check for correctness.
A = np.array([[1], [0]])
Q = orthonormal_basis(A)
assert np.allclose(Q.T @ Q, np.eye(2))
assert np.allclose(Q[:,1:2].T @ A, np.zeros((1,1)))

A1 = np.array([[1], [2], [3]])
Q1 = orthonormal_basis(A1)
assert np.allclose(Q1.T @ Q1, np.eye(3))
assert np.allclose(Q1[:,1:3].T @ A1, np.zeros((2,1)))

A2 = np.array([[0, 1], [1, 1], [2, 1], [2, 1]])
Q2 = orthonormal_basis(A2)
assert np.allclose(Q2.T @ Q2, np.eye(4))
assert np.allclose(Q2[:,2:4].T @ A2, np.zeros((2,2)))


In [188]:
# AUTOGRADER FOR base_case, BEWARE OF POTENTIAL HIDDEN TEST CASES
# Note that this is not an exhaustive check for correctness.
A1 = np.array([[-1]])
Q1, B1 = base_case(A1)
assert np.allclose(Q1, np.eye(1))
assert np.allclose(B1, A1)

A2 = np.array([[0, 1], [1, 0]])
Q2, B2 = base_case(A2)
assert np.allclose(Q2, 1/math.sqrt(2) * np.array([[-1, -1], [-1, 1]]))
assert np.allclose(B2, np.array([[1,0],[0,-1]]))

A3 = 1/math.sqrt(2) * np.array([[1, 1], [-1, 1]])
Q3, B3 = base_case(A3)
assert np.allclose(Q3, np.array([[0,1],[1,0]]))
assert np.allclose(B3, 1/math.sqrt(2) * np.array([[1,-1],[1,1]]))


In [190]:
# AUTOGRADER FOR spectral_decomposition, BEWARE OF POTENTIAL HIDDEN TEST CASES
# Note that this is not an exhaustive check for correctness.
A1 = 1/3*np.array([[2,1,-2],[1,2,2],[2,-2,1]])
Q1, B1 = spectral_decomposition(A1)
assert np.allclose(Q1.T @ Q1, np.eye(3))
assert np.allclose(B1, np.array([[1, 0, 0], [0, 1/3, -math.sqrt(8/9)], [0, math.sqrt(8/9), 1/3]]))
assert np.allclose(Q1 @ B1 @ Q1.T, A1)

A2 = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])
Q2, B2 = spectral_decomposition(A2)
assert np.allclose(Q2.T @ Q2, np.eye(3))
assert np.allclose(B2, np.array([[-0.5, -math.sqrt(3/4), 0], [math.sqrt(3/4), -0.5, 0], [0, 0, 1]]))
assert np.allclose(Q2 @ B2 @ Q2.T, A2)

A3 = 1/3*np.array([[2,1,-2],[1,2,2],[-2,2,-1]])
Q3, B3 = spectral_decomposition(A3)
assert np.allclose(Q3.T @ Q3, np.eye(3))
assert np.allclose(B3, np.array([[1, 0, 0], [0, 1, 0], [0, 0, -1]]))
assert np.allclose(Q3 @ B3 @ Q3.T, A3)

A4 = 1/2*np.array([[1,1,1,1],[1,1,-1,-1],[1,-1,1,-1],[1,-1,-1,1]])
Q4, B4 = spectral_decomposition(A4)
assert np.allclose(Q4.T @ Q4, np.eye(4))
assert np.allclose(B4, np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]))
assert np.allclose(Q4 @ B4 @ Q4.T, A4)


AssertionError: 

## Appendix: Proof of Spectral Decomposition

We prove by induction on $n$ that a real symmetric $n \times n$ matrix $A$ is orthogonally similar to a diagonal matrix, i.e. there exists an orthogonal matrix $Q$ such that $Q^T A Q = D$ is diagonal. 

**Base case:** For a $1 \times 1$ matrix $A$ we can set $Q = [1]$ and the statement is trivially true for $n=1$. 

**Induction step:** Let $A$ be an $(n+1) \times (n+1)$ real symmetric matrix where $n \geq 1$ and assume that the claim is true for all $1 \leq k \leq n$. Let $\lambda_1$ be an eigenvalue of $A$, which exists by the fundamental theorem of algebra and is real because $A$ is symmetric. Let $u_1$ be a unit eigenvector associated with $\lambda_1$. We can extend $\{u_1\}$ to an orthonormal basis $\{u_1, u_2, \dots, u_{n+1}\}$ of $\mathbb{R}^{n+1}$ and form the orthogonal matrix
    $$Q_1 = \begin{bmatrix}
        u_1 & u_2 & \dots & u_{n+1} 
    \end{bmatrix}.$$
The first column of $Q_1^T A Q_1$ is
$$Q_1^T A Q_1 e_1 = Q_1^T A u_1 = Q_1^T(\lambda_1 u_1) = \lambda_1 Q_1^{-1} u_1 = \lambda_1 e_1.$$
Secondly, we note that $Q_1^T A Q_1$ is symmetric, since
$$(Q_1^T A Q_1)^T = Q_1^T A^T (Q_1^T)^T = Q_1^T A Q_1,$$
where we used that $A$ is symmetric. So $Q_1^T A Q_1$ will be a matrix of block form
    $$Q_1^T A Q_1 = \begin{bmatrix}
        \lambda_1 & 0 \\
        0 & A_1
    \end{bmatrix},$$
where $A_1$ is a real symmetric $n \times n$ matrix. By the induction hypothesis applied to $A_1$ there exists an orthogonal matrix $R$ such that $R^T A_1 R = D$ is a diagonal matrix. If we set
    $$Q_2 = \begin{bmatrix}
        1 & 0 \\
        0 & R
    \end{bmatrix} \quad \text{and} \quad Q = Q_1 Q_2,$$
then both $Q_2$ and $Q$ are $(n+1) \times (n+1)$ orthogonal matrices. Moreover
    $$\begin{align*}
        Q^T A Q = Q_2^T Q_1^T A Q_1 Q_2= 
        \begin{bmatrix}
            1 & 0 \\
            0 & R^T
        \end{bmatrix} \begin{bmatrix}
            \lambda_1 & 0 \\
            0 & A_1
        \end{bmatrix}\begin{bmatrix}
            1 & 0 \\
            0 & R
        \end{bmatrix} = \begin{bmatrix}
            \lambda_1 & 0 \\
            0 & R^T A_1 R
        \end{bmatrix} = \begin{bmatrix}
        \lambda_1 & 0 \\
        0 & D
    \end{bmatrix}.
    \end{align*}$$
Hence it follows that there exists an orthogonal matrix $Q$ that brings $A$ into diagonal form, proving our claim. 
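
The induction above translates almost line by line into a recursive function. The following is a minimal sketch under stated assumptions, not the reader's reference implementation: the name `symmetric_decomposition` is hypothetical, `np.linalg.eigh` is used to obtain a real eigenpair, and the basis extension is done with a complete QR factorization. The re-symmetrization of $A_1$ is an added safeguard against round-off and is not part of the proof.

```python
import numpy as np

def symmetric_decomposition(A):
    """Recursively computes Q, D with Q^T A Q = D for real symmetric A (sketch)."""
    n = A.shape[0]
    if n == 1:
        # Base case: Q = [1] and D = A
        return np.eye(1), A.astype(float)

    # One real eigenpair (lambda_1, u_1) of the symmetric matrix A
    eVals, eVecs = np.linalg.eigh(A)
    lam1, u1 = eVals[0], eVecs[:, [0]]

    # Extend {u_1} to an orthonormal basis (first column is +/- u_1)
    Q1, _ = np.linalg.qr(u1, mode='complete')

    # Lower-right block A_1 of Q_1^T A Q_1, re-symmetrized against round-off
    A1 = (Q1.T @ A @ Q1)[1:, 1:]
    A1 = (A1 + A1.T) / 2

    # Induction hypothesis: R^T A_1 R = D_1
    R, D1 = symmetric_decomposition(A1)

    Q2 = np.block([[np.eye(1), np.zeros((1, n - 1))],
                   [np.zeros((n - 1, 1)), R]])
    D = np.block([[np.array([[lam1]]), np.zeros((1, n - 1))],
                  [np.zeros((n - 1, 1)), D1]])
    return Q1 @ Q2, D

A = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]])
Q, D = symmetric_decomposition(A)
print(np.allclose(Q.T @ Q, np.eye(3)), np.allclose(Q @ D @ Q.T, A))
```

The structure (base case, one eigenpair, basis extension, recursive call on the smaller block, block-diagonal assembly) is the same pattern that the orthogonal case follows.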

**Theorem (Spectral theorem for orthogonal matrices)** Let $A$ be an orthogonal $n \times n$ matrix. Then there exists an orthogonal matrix $Q$ such that 
\begin{equation}
   Q^T A Q = \begin{bmatrix}
   \pm 1 & & & & & \\
   & \ddots & & & & \\
   & & \pm 1 & & & \\
   & & & R_1 & & \\
   & & & & \ddots & \\
   & & & & & R_k
\end{bmatrix} \tag{$\star$}
\end{equation}
where $R_1, \dots, R_k$ are rotation matrices of the form
$$\begin{bmatrix}
   \cos(\theta) & -\sin(\theta) \\
   \sin(\theta) & \cos(\theta)
\end{bmatrix},$$
where $\cos(\theta) = \text{Re}(\lambda)$ for a non-real eigenvalue $\lambda$ of $A$. 

**Proof:** Similar to the proof of the spectral theorem for symmetric matrices, we prove the claim again by induction on the dimension $n$. This time around we need to distinguish several base cases, where we first show the claim is true for all $1 \times 1$ and $2 \times 2$ matrices. 

**Base case (1):** An orthogonal $1 \times 1$ matrix $A$ must be of the form $\begin{bmatrix}
        \pm 1
    \end{bmatrix}$. So taking $Q = \begin{bmatrix}
        1
    \end{bmatrix}$ yields that $Q^T A Q = \begin{bmatrix}
        \pm 1
    \end{bmatrix}$ and the statement of the theorem is trivially true for $n=1$.

**Base case (2.1):** For an orthogonal $2 \times 2$ matrix $A$, let's first assume that the eigenvalues $\lambda_1$ and $\lambda_2$ of $A$ are both real, in which case we know that $\lambda_1, \lambda_2 \in \{1, -1\}$. Let $u_1$ be a unit eigenvector of $A$ corresponding to $\lambda_1$, and let $u_2$ be any unit vector that is orthogonal to $u_1$. If we set $Q = \begin{bmatrix}
    u_1 & u_2
\end{bmatrix}$, then $Q$ is an orthogonal matrix. The first column of $Q^T A Q$ is
$$Q^T A Q e_1 = Q^T A u_1 = Q^T (\lambda_1 u_1) = \lambda_1 Q^{-1} u_1 = \lambda_1 e_1.$$
Moreover, note that $Q^T A Q$ is a product of orthogonal matrices, so that $Q^T A Q$ is itself also orthogonal. Since the columns of an orthogonal matrix must be orthonormal, we deduce that the second column of $Q^T A Q$ must be a unit vector orthogonal to $\lambda_1 e_1 = \pm e_1$, i.e. $Q^T A Q e_2 = \pm e_2$. We conclude that
$$Q^T A Q = \begin{bmatrix}
    \pm 1 & 0 \\ 
    0 & \pm 1
\end{bmatrix}.$$
**Base case (2.2):** For an orthogonal $2 \times 2$ matrix $A$, let's now assume that the eigenvalues $\lambda_1$ and $\lambda_2$ are non-real, and let $w = x + iy$ be an eigenvector corresponding to $\lambda_1$. Setting $Q = \begin{bmatrix}
    \frac{\sqrt{2} y}{\|w\|} & \frac{\sqrt{2} x}{\|w\|}
\end{bmatrix}$, we know by Theorem ... that $Q$ is orthogonal and 
$$Q^T A Q = \begin{bmatrix}
    \cos(\theta) & -\sin(\theta) \\
    \sin(\theta) & \cos(\theta)
\end{bmatrix},$$
where $\cos(\theta) = \text{Re}(\lambda_1)$. This shows that the statement of the theorem is also true for $n=2$. 

**Induction hypothesis:** Suppose the claim is true for all $1 \leq k \leq n$, that is for all orthogonal $k \times k$ matrices $A$ there exists an orthogonal matrix $Q$ such that $Q^T A Q$ is of the form $(\star)$.

**Induction step:** Let $A$ be an orthogonal $(n+1) \times (n+1)$ matrix and let $\lambda_1$ be an eigenvalue of $A$, which exists by the fundamental theorem of algebra. We distinguish two cases again.

First, assume $\lambda_1$ is real (so $\lambda_1 \in \{1, -1\}$) and let $u_1$ be an associated unit eigenvector. We can extend $\{u_1\}$ to an orthonormal basis $\{u_1, u_2, \dots, u_{n+1}\}$ of $\mathbb{R}^{n+1}$ and form the orthogonal matrix
$$Q_1 = \begin{bmatrix}
        u_1 & u_2 & \dots & u_{n+1}
    \end{bmatrix}.$$
As in the $2 \times 2$ case we see that the first column of $Q_1^T A Q_1$ will be $\lambda_1 e_1$ and moreover $Q_1^T A Q_1$ is also an orthogonal matrix. This means that $Q_1^T A Q_1$ is of the form
$$\begin{bmatrix}
    \pm 1 & O \\
    O & A_1
\end{bmatrix},$$
where $A_1$ is some $n \times n$ orthogonal matrix. 

Now assume that $\lambda_1$ is non-real and let $w = x + iy$ be an associated eigenvector of $A$. Let $u_1 = \frac{\sqrt{2} y}{\|w\|}$ and $u_2 = \frac{\sqrt{2} x}{\|w\|}$, which we know are orthonormal by \autoref{thm:rotation}. Extend $\{u_1, u_2\}$ to an orthonormal basis $\{u_1, u_2, \dots, u_{n+1}\}$ of $\mathbb{R}^{n+1}$ and define the orthogonal matrix
$$Q_1 = \begin{bmatrix}
    u_1 & u_2 & \dots & u_{n+1}
\end{bmatrix}.$$
By Theorem ... we know that $Q_1^T A Q_1$ is of the form
$$Q_1^T A Q_1 = \begin{bmatrix}
        \cos(\theta) & -\sin(\theta) & X \\
        \sin(\theta) & \cos(\theta) & X \\
        O & O & A_1
    \end{bmatrix},$$
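where $X$ is a $2 \times (n-1)$ matrix and $A_1$ is an $(n-1) \times (n-1)$ matrix. Since $Q_1^T A Q_1$ is orthogonal, its rows are unit vectors; the first two rows already have unit length from the rotation block alone, so $X = O$. It then follows that $A_1$ is an orthogonal matrix, so that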
$$Q_1^T A Q_1 = \begin{bmatrix}
        \cos(\theta) & -\sin(\theta) & O \\
        \sin(\theta) & \cos(\theta) & O\\
        O & O & A_1
    \end{bmatrix}.$$
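
The claim about the leading $2 \times 2$ block can be checked numerically. The sketch below uses a cyclic permutation matrix as an example of an orthogonal matrix with non-real eigenvalues; it only builds $u_1$ and $u_2$ (collected in a matrix called `U` here, a name chosen for illustration) and does not extend them to a full basis.

```python
import numpy as np

# Cyclic permutation matrix: orthogonal, with eigenvalues 1 and exp(±2πi/3)
A = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]], dtype=float)
eVals, eVecs = np.linalg.eig(A)

# Pick an eigenvalue whose imaginary part is largest in absolute value
idx = np.argmax(np.abs(np.imag(eVals)))
lam, w = eVals[idx], eVecs[:, idx]

# u1 and u2 as defined in the proof, with w = x + iy
x, y = np.real(w), np.imag(w)
u1 = np.sqrt(2) * y / np.linalg.norm(w)
u2 = np.sqrt(2) * x / np.linalg.norm(w)
U = np.column_stack([u1, u2])

a, b = np.real(lam), np.imag(lam)
orthonormal = np.allclose(U.T @ U, np.eye(2))
rotation_block = np.allclose(U.T @ A @ U, [[a, -b], [b, a]])
print(orthonormal, rotation_block)
```

Here `a` and `b` play the roles of $\cos(\theta)$ and $\sin(\theta)$ for the chosen eigenvalue.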

So, in general, we have shown that there exists an orthogonal $(n+1) \times (n+1)$ matrix $Q_1$ such that $Q_1^T A Q_1$ is of the form
$$Q_1^T A Q_1 = \begin{bmatrix}
    B & O \\
    O & A_1
\end{bmatrix},$$
where
$$\begin{align}
    B &= \begin{bmatrix}
        \pm 1
    \end{bmatrix} \text{ and $A_1$ is an orthogonal $n \times n$ matrix} && \text{if $\lambda_1$ is real}; \\
    B &= \begin{bmatrix}
        \cos(\theta) & -\sin(\theta) \\
        \sin(\theta) & \cos(\theta)
    \end{bmatrix} \text{ and $A_1$ is an orthogonal $(n-1) \times (n-1)$ matrix} && \text{if $\lambda_1$ is non-real}.
\end{align}$$
In either case, by the induction hypothesis applied to $A_1$, there exists an orthogonal matrix $P$ such that $P^T A_1 P$ is of the form $(\star)$, i.e.
$$P^T A_1 P = \begin{bmatrix}
   \pm 1 & & & & & \\
   & \ddots & & & & \\
   & & \pm 1 & & & \\
   & & & R_1 & & \\
   & & & & \ddots & \\
   & & & & & R_k
\end{bmatrix}.$$
If we set
$$Q_2 = \begin{bmatrix}
   I & O \\
   O & P
\end{bmatrix} \quad \text{and} \quad Q = Q_1 Q_2,$$
then both $Q_2$ and $Q$ are $(n+1) \times (n+1)$ orthogonal matrices. Moreover,
$$Q^T A Q = Q_2^T \left( Q_1^T A Q_1\right) Q_2 = \begin{bmatrix}
   I & O \\
   O & P^T
\end{bmatrix} \begin{bmatrix}
   B & O \\
   O & A_1
\end{bmatrix} \begin{bmatrix}
   I & O \\
   O & P
\end{bmatrix} = \begin{bmatrix}
   B & O \\
   O & P^T A_1 P
\end{bmatrix}.$$
Recalling that $B$ and $P^T A_1 P$ are of the form $(\star)$, after reordering the columns of $Q$, we find that $Q^T A Q$ is of the desired form $(\star)$. This completes the induction step, and we conclude that, for all $n \geq 1$, an orthogonal $n \times n$ matrix is orthogonally similar to $(\star)$.