## Loading libraries and downloading the data

In [1]:
import numpy as np
import pandas as pd

### Warm-up Exercise with Matrices

Each of a set of $m$ patients can exhibit any number of $n$ possible symptoms.  
We represent this as an $m \times n$ matrix $S$, where  
$S_{ij} = 1$ if the $i$-th patient exhibits symptom $j$, and $S_{ij} = 0$ otherwise.

Write simple English explanations for the following matrix expressions:

1. $S \boldsymbol{1}$  
2. $S^T \boldsymbol{1}$  
3. $S^T S$  
4. $S S^T$  

For example: $S \mathbf{1}$ is an $m$-vector whose $i$-th element is ... that patient $i$ has.

In [None]:

# 1
print("S @ 1 is an m-vector whose i-th element is the number of symptoms that patient i has")
# 2
print("S.T @ 1 is an n-vector whose j-th element is the number of patients that have symptom j")
# 3
print("S.T @ S is an n x n matrix whose (i, j) element is the number of patients that have both symptom i and symptom j")
# 4
print("S @ S.T is an m x m matrix whose (i, j) element is the number of symptoms that patients i and j have in common")
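
To make the four expressions concrete, the sketch below evaluates them on a small made-up matrix with 3 patients and 4 symptoms. The name `S_demo` and its entries are illustrative assumptions, not part of the exercise data.

```python
import numpy as np

# Hypothetical data: 3 patients (rows), 4 symptoms (columns)
S_demo = np.array([[1, 0, 1, 0],
                   [1, 1, 0, 0],
                   [0, 1, 1, 1]])
m, n = S_demo.shape

symptoms_per_patient = S_demo @ np.ones(n, dtype=int)    # S 1
patients_per_symptom = S_demo.T @ np.ones(m, dtype=int)  # S^T 1
co_occurrence = S_demo.T @ S_demo    # (i, j): patients with both symptoms i and j
shared_symptoms = S_demo @ S_demo.T  # (i, j): symptoms shared by patients i and j

print(symptoms_per_patient)
print(patients_per_symptom)
print(co_occurrence)
print(shared_symptoms)
```

Note that the diagonal of `co_occurrence` repeats `patients_per_symptom`, and the diagonal of `shared_symptoms` repeats `symptoms_per_patient`, because every entry of the matrix is 0 or 1.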

## Vector Spaces, Bases, and Orthogonal Projections

Determine whether the vectors $\boldsymbol{a}_1$, $\boldsymbol{a}_2$, and $\boldsymbol{a}_3$ defined below:

- form a **basis**, and  
- are **orthogonal**.

Justify your answer.

In [None]:
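a1 = np.array([0,0,-1])
a2 = np.array([1,1,0])/(2)**0.5
a3 = np.array([1,-1,0])/(2)**0.5

A = np.vstack((a1,a2,a3)).T

# If A.T @ A equals the identity, the columns are orthonormal;
# three orthonormal (hence linearly independent) vectors in R^3 form a basis.
# The difference printed below should therefore be the zero matrix.
print(np.round(A.T @ A - np.eye(3),10))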

[[ 0.  0.  0.]
 [ 0.  0. -0.]
 [ 0. -0.  0.]]


Given a vector $\boldsymbol{b}_1 = (4, 2)^\top$, find the projection $\boldsymbol{b}_p$ of $\boldsymbol{b}_2 = (1, 5)^\top$ that is **orthogonal** to $\boldsymbol{b}_1$.  
Verify your calculation by checking that the resulting vector is indeed orthogonal to $\boldsymbol{b}_1$.


In [13]:
b2 = np.array([[1],[5]])
b1 = np.array([[4],[2]])

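# projection of b2 onto the line spanned by b1
bp = ((b2.T @ b1)/(b1.T @ b1)) * b1
print(bp)
# the residual b2 - bp must be orthogonal to bp (and hence to b1)
print((b2 - bp).T @ bp)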

[[2.8]
 [1.4]]
[[8.8817842e-16]]
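
The task statement can also be read as asking for the part of $\boldsymbol{b}_2$ that is itself orthogonal to $\boldsymbol{b}_1$. Under that reading, a minimal sketch splits $\boldsymbol{b}_2$ into a component along $\boldsymbol{b}_1$ and a component perpendicular to it. The names `b_par` and `b_perp` are introduced here only for illustration.

```python
import numpy as np

b1 = np.array([4.0, 2.0])
b2 = np.array([1.0, 5.0])

# Component of b2 along b1 (orthogonal projection onto b1)
b_par = (b2 @ b1) / (b1 @ b1) * b1
# Component of b2 perpendicular to b1
b_perp = b2 - b_par

print(b_par, b_perp)
print(np.isclose(b_perp @ b1, 0.0))  # orthogonality check
```

The two components add back up to `b2`, which is a quick way to confirm that nothing was lost in the decomposition.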


Perform an **orthogonal projection** of the vector $\boldsymbol{v} = (1, 2, 3, 4)^\top$ onto the plane spanned by  
$\boldsymbol{u}_1 = (1, 6, 2, 2)^\top$ and $\boldsymbol{u}_2 = (2, -1, -1, 3)^\top$.  

Refer to the lecture slides for the equations needed to perform these calculations.

**Hint:**  
First, check whether $\boldsymbol{u}_1$ and $\boldsymbol{u}_2$ are orthogonal.


In [23]:
v=np.array([[1,2,3,4]]).T
u1=np.array([[1,6,2,2]]).T
u2=np.array([[2,-1,-1,3]]).T

print(u1.T @ u2)

w = (u1.T @ v) / (u1.T @u1)*u1 + (u2.T @ v) / (u2.T @ u2)*u2
print(w)
np.round((v-w).T @ u1,10)


[[0]]
[[1.8]
 [3. ]
 [0.6]
 [3. ]]


array([[0.]])

Perform an **orthogonal projection** of the vector  
$\boldsymbol{v} = (1, 2, 3, 4)^\top$ onto the plane spanned by  
$\boldsymbol{u}_3 = (1, 1, 2, 3)^\top$ and $\boldsymbol{u}_4 = (2, 0, 1, 1)^\top$.

Refer to the lecture slides for the procedure to compute the projection.  
Try to solve it using a **set of equations**, as demonstrated in the slides.


In [28]:
v=np.array([[1,2,3,4]]).T
u3=np.array([[1,1,2,3]]).T
u4=np.array([[2,0,1,1]]).T
# a little help
U = np.hstack((u3,u4))
print(U)
G = U.T @ U
print(G)
b = U.T @ v
print(b)
c = np.linalg.solve(G, b) # solve the set of equations G c = b without forming an explicit inverse
print(c)
w = U @ c
print(w)



[[1 2]
 [1 0]
 [2 1]
 [3 1]]
[[15  7]
 [ 7  6]]
[[21]
 [ 9]]
[[ 1.53658537]
 [-0.29268293]]
[[0.95121951]
 [1.53658537]
 [2.7804878 ]
 [4.31707317]]
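
As a cross-check of the normal-equations approach, the sketch below compares it with NumPy's least-squares solver on the same vectors. This is only one possible verification, and the variable names `c_normal` and `c_lstsq` are assumptions made for this example.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0, 4.0])
U = np.array([[1.0, 2.0],
              [1.0, 0.0],
              [2.0, 1.0],
              [3.0, 1.0]])  # columns are u3 and u4

# Normal equations: (U^T U) c = U^T v
c_normal = np.linalg.solve(U.T @ U, U.T @ v)
# Least-squares solution of U c ~ v
c_lstsq, *_ = np.linalg.lstsq(U, v, rcond=None)

w = U @ c_normal
print(c_normal, c_lstsq)
print(np.allclose(U.T @ (v - w), 0.0))  # residual is orthogonal to u3 and u4
```

Both routes should give the same coefficients, because the orthogonal projection of `v` onto the column space of `U` is exactly the least-squares fit.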


## Gram–Schmidt Orthogonalization

The **Gram–Schmidt algorithm** produces a set of **orthonormal vectors** from a collection of linearly independent vectors.  
If some of the input vectors are linearly dependent, the algorithm detects this and stops.

Given a set of vectors $\boldsymbol{a}_1, \boldsymbol{a}_2, \dots, \boldsymbol{a}_k$, we first want to find the component of $\boldsymbol{a}_2$ that is **perpendicular** to $\boldsymbol{a}_1$.  
How can we find such a vector?

Given $\boldsymbol{a}_1 = (0, 3)^\top$, find the **orthogonal projection** of $\boldsymbol{a}_2 = (4, 2)^\top$ onto $\boldsymbol{a}_1$, subtract it from $\boldsymbol{a}_2$, and then **rescale** the resulting vector to have **unit length**.  
In other words, perform a **single step of the Gram–Schmidt orthogonalization** process.


In [36]:
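a1 = np.array([[0, 3]]).T
a2 = np.array([[4, 2]]).T

a = ((a2.T @ a1)/(a1.T @ a1)) * a1 # orthogonal projection of a2 onto a1
q2_tilde = a2 - a # component of a2 perpendicular to a1
q2 = q2_tilde / np.linalg.norm(q2_tilde) # rescale to unit length
print(q2)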



From *VMLS* (p. 97), the **Gram–Schmidt orthogonalization** process is defined as follows:

Given $n$-vectors $\boldsymbol{a}_1, \boldsymbol{a}_2, \dots, \boldsymbol{a}_k$:

---

For $i = 1, \dots, k$:

1. **Orthogonalization:**  
   $\tilde{\boldsymbol{q}}_i = \boldsymbol{a}_i - (\boldsymbol{q}_1^\top \boldsymbol{a}_i)\boldsymbol{q}_1 - \cdots - (\boldsymbol{q}_{i-1}^\top \boldsymbol{a}_i)\boldsymbol{q}_{i-1}$

2. **Test for linear dependence:**  
   If $\tilde{\boldsymbol{q}}_i = \boldsymbol{0}$, stop.

3. **Normalization:**  
   $\boldsymbol{q}_i = \tilde{\boldsymbol{q}}_i / \lVert \tilde{\boldsymbol{q}}_i \rVert$

---
For $i = 1$, the orthogonalization step reduces to $\tilde{\boldsymbol{q}}_1 = \boldsymbol{a}_1$.
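
The dependence test in step 2 can be illustrated with a small hand-built example. In the sketch below, `a3` is deliberately constructed as a combination of `a1` and `a2` (all three vectors are made up for this illustration), so its residual after orthogonalization vanishes up to rounding error.

```python
import numpy as np

a1 = np.array([1.0, 0.0, 1.0])
a2 = np.array([0.0, 1.0, 1.0])
a3 = a1 + 2.0 * a2  # deliberately dependent on a1 and a2

# i = 1: only normalization is needed
q1 = a1 / np.linalg.norm(a1)
# i = 2: remove the component along q1, then normalize
q2_tilde = a2 - (q1 @ a2) * q1
q2 = q2_tilde / np.linalg.norm(q2_tilde)
# i = 3: the residual is (numerically) zero because a3 lies in span{q1, q2}
q3_tilde = a3 - (q1 @ a3) * q1 - (q2 @ a3) * q2

print(np.linalg.norm(q3_tilde))
```

In floating-point arithmetic the residual is rarely exactly zero, so an implementation usually compares its norm with a small tolerance rather than with `0`.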

---
**Task**

Complete the function below that performs the Gram-Schmidt orthogonalization for a matrix.

---

In [None]:
def gram_schmidt_orto(A, tol=1e-10):
    A = np.asarray(A, dtype=np.float64) # work in floating point, even for integer input
    n, m = A.shape # n = length of each vector, m = number of vectors
    Q = np.empty((n, m)) # initialize matrix Q
    for i in range(m):
        q = A[:,i].copy() # for the first column this is simply a_1
        for j in range(i):
            q -= (Q[:,j] @ A[:,i]) * Q[:,j] # subtract the component along q_j
        norm = np.linalg.norm(q)
        if norm < tol: # tol is a chosen threshold; exact comparison with 0 is unreliable in floating point
            raise ValueError("The vectors are linearly dependent")
        Q[:,i] = q / norm # normalization
    return Q


Below is a matrix to test it; the result should consist of vectors with $\pm0.5$ as elements.

In [50]:
# matrix to test the orthogonalization:
A = np.array([[-1., -1., 1.], [1.,  3.,  3.], [-1., -1.,  5.], [1.,  3.,  7.]], dtype = np.float64)

A_o = gram_schmidt_orto(A)
print(A_o)

A_o.T @ A_o

[[-0.5  0.5 -0.5]
 [ 0.5  0.5 -0.5]
 [-0.5  0.5  0.5]
 [ 0.5  0.5  0.5]]


array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

Print an answer describing what happens if one of the vectors is a linear combination of the others.

To verify your procedure, you can print the resulting matrices and compare them with the result from `np.linalg.qr()`.
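
One way to carry out this comparison is sketched below. It uses a compact stand-in helper, `gs_sketch`, which is a hypothetical name introduced here and omits the linear-dependence check. The QR factorization is unique only up to the sign of each column, so the columns returned by `np.linalg.qr()` are sign-aligned before comparing.

```python
import numpy as np

def gs_sketch(A):
    """Classical Gram-Schmidt on the columns of A (illustrative helper, no dependence check)."""
    A = np.asarray(A, dtype=float)
    Q = np.zeros_like(A)
    for i in range(A.shape[1]):
        # subtract the components along all previously computed q_j
        q = A[:, i] - Q[:, :i] @ (Q[:, :i].T @ A[:, i])
        Q[:, i] = q / np.linalg.norm(q)
    return Q

A = np.array([[-1., -1., 1.],
              [ 1.,  3., 3.],
              [-1., -1., 5.],
              [ 1.,  3., 7.]])

Q_gs = gs_sketch(A)
Q_np, R_np = np.linalg.qr(A)  # reduced QR: Q_np is 4x3, R_np is 3x3

# Flip the sign of each column of Q_np so that the diagonal of R would be positive
signs = np.sign(np.diag(R_np))
print(np.allclose(Q_gs, Q_np * signs))
```

Gram-Schmidt always produces positive normalization factors, which is why aligning on the sign of the diagonal of `R_np` is enough to make the two results comparable.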


### Orthogonalizing a basis

Orthogonalize the vectors $\boldsymbol{u}_3 = (1, 1, 2, 3)^\top$ and  
$\boldsymbol{u}_4 = (2, 0, 1, 1)^\top$ using the **Gram-Schmidt algorithm**, so that projecting $\boldsymbol{v} = (1, 2, 3, 4)^\top$ onto the resulting orthogonal basis can be done **without computing a matrix inverse**.

Start by generalizing the following code to work with matrices:
```
w = (u3.T @ v)/(u3.T @ u3)*u3 + (u4.T @ v)/(u4.T @ u4) * u4
```
To extract the diagonal of a matrix as a vector, you can use `np.diag`.
You may also notice that some of the dot products involve orthonormal vectors, so they can be simplified as well.

In [None]:
v=np.array([[1,2,3,4]]).T
u3=np.array([[1,1,2,3]]).T
u4=np.array([[2,0,1,1]]).T
# a little help
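U = np.hstack((u3, u4)).astype(float) # float dtype so the normalized columns are not truncated to integers

U_o = gram_schmidt_orto(U)

# the columns of U_o are orthonormal, so U_o.T @ U_o = I and the projection is U_o (U_o.T v)
w = U_o @ (U_o.T @ v)
(w - v).T @ U_o # residual should be orthogonal to both basis vectors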


## Housing Stock

In this section, we will estimate housing prices using the [Real Estate dataset](https://www.kaggle.com/datasets/quantbruce/real-estate-price-prediction?resource=download) from Kaggle.  
The dataset will be downloaded by running the cell below.

In [54]:
real_estate_df = pd.read_csv("https://drive.google.com/uc?id=1lZMjd7v2sbtI91i_In5cm4-5LhX4e1IW", header=0,index_col=0)

Now, we will perform an **orthogonal projection** of the vector `Y house price of unit area` onto the subspace spanned by the remaining features (the other columns of `real_estate_df`).

Your task is to extract these remaining features into a NumPy array `X`, which will store vectors of the subspace.


In [56]:
y = real_estate_df.iloc[:,6].to_numpy()
X = real_estate_df.iloc[:,:6].to_numpy() # extract the rest of columns (just change the subsetting sequence for column written above)
X = np.hstack((X, np.ones((X.shape[0], 1)))) # append column of ones
print(X)

[[2.0129170e+03 3.2000000e+01 8.4878820e+01 ... 2.4982980e+01
  1.2154024e+02 1.0000000e+00]
 [2.0129170e+03 1.9500000e+01 3.0659470e+02 ... 2.4980340e+01
  1.2153951e+02 1.0000000e+00]
 [2.0135830e+03 1.3300000e+01 5.6198450e+02 ... 2.4987460e+01
  1.2154391e+02 1.0000000e+00]
 ...
 [2.0132500e+03 1.8800000e+01 3.9096960e+02 ... 2.4979230e+01
  1.2153986e+02 1.0000000e+00]
 [2.0130000e+03 8.1000000e+00 1.0481010e+02 ... 2.4966740e+01
  1.2154067e+02 1.0000000e+00]
 [2.0135000e+03 6.5000000e+00 9.0456060e+01 ... 2.4974330e+01
  1.2154310e+02 1.0000000e+00]]


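Next, we will **orthonormalize** the columns of `X` so that the projection can be computed without a matrix inverse.  
When the columns of `X_o` are orthonormal, $X_o^\top X_o = I$, and the projection coefficients reduce to $X_o^\top \boldsymbol{y}$. Note that classical Gram–Schmidt can lose orthogonality in floating point when the columns are nearly linearly dependent, so it is worth checking that `X_o.T @ X_o` is close to the identity. Below, store the coefficients of the projection in vector `beta`.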


In [None]:
X_o = gram_schmidt_orto(X)
beta = X_o.T @ y



Now, we can use the `beta` coefficients to estimate the values of `y`.  
Simply multiply the orthogonalized matrix `X_o` by the coefficient vector `beta` and assign the result to `y_est`.

Next, visualize `y_est` versus `y` to evaluate how well the estimated values approximate the actual data.  
The red dashed line represents the line of **perfect predictions**, where the estimated and true values are equal.


In [None]:
import matplotlib.pyplot as plt

y_est = X_o @ beta # estimate y

plt.figure(figsize=(6, 6))
plt.scatter(y, y_est, alpha=0.6, edgecolor='k')
plt.plot([y.min(), y.max()], [y.min(), y.max()], 'r--', label='Perfect estimate')
plt.xlabel('Actual values')
plt.ylabel('Estimated values')
plt.title('Actual vs Estimated Scatter Plot')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()


Finally, compute RMSE of your estimate.

In [None]:
error = y_est - y
RMSE = np.sqrt(np.mean(error**2)) # root mean squared error
print(f"RMSE:{RMSE:.3f}")

In the exercise above, we performed a **linear regression**.  
If we were to obtain new data, we could use the coefficients stored in `beta` to make new predictions.  
However, the new data would still be expressed in the **original (non-orthogonal)** basis, while the regression coefficients correspond to the **orthogonalized** basis.

Therefore, before making predictions, we would need to compute the **transformation matrix** that maps the original basis to the new orthogonalized one, and then use this transformation to correctly project the new data before applying the model. The computation of the transformation matrix will be performed in the next lab.
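
As a rough preview of that idea, the sketch below uses `np.linalg.qr()` as a stand-in for the lab's Gram-Schmidt routine, on synthetic data generated only for this example. If $X = QR$ with $R$ upper triangular, then $Q\boldsymbol{\beta} = X R^{-1}\boldsymbol{\beta}$, so the coefficients in the original basis are obtained by solving $R\,\boldsymbol{\beta}_x = \boldsymbol{\beta}$. The names `X_demo`, `y_demo`, `beta_q`, `beta_x`, and `x_new` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic features plus an intercept column (illustrative data only)
X_demo = np.column_stack((rng.normal(size=(50, 2)), np.ones(50)))
y_demo = X_demo @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

Q, R = np.linalg.qr(X_demo)          # X_demo = Q R, Q has orthonormal columns
beta_q = Q.T @ y_demo                # coefficients in the orthonormal basis
beta_x = np.linalg.solve(R, beta_q)  # coefficients in the original basis

x_new = np.array([0.3, -1.2, 1.0])   # a new observation in the original features
print(x_new @ beta_x)                # prediction for the new observation
print(np.allclose(X_demo @ beta_x, Q @ beta_q))
```

Here $R$ plays the role of the transformation between the two bases, and the final check confirms that both coefficient vectors describe the same fitted values.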


## Matrix Kernel

Now, we will examine the **null space** (also called the **kernel**) of a matrix.

We define a matrix `A` and a vector `b`, which together represent a **system of linear equations**.  
First, determine the **rank** of `A` and the **rank** of the augmented matrix `A|b`.  

Based on these ranks:
- How many solutions does the system have?  
- What geometric object (e.g., point, line, plane) represents the set of solutions?


In [None]:
# Matrix and right hand side vector
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 1., 1.]])
b = np.array([3., 6., 4.])

# ranks
rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack((A, b)))
print(f"rank(A) = {rank_A}, rank([A|b]) = {rank_Ab}")

A particular solution to the system can be obtained using the **pseudoinverse**.  
This solution is also the one with the **smallest norm** among all possible solutions.

Next, use the `scipy.linalg` module to find the **null space** of the matrix.  
The resulting basis vector `ns_A` of the null space is of **unit length** and forms a basis for a **one-dimensional subspace** of $\mathbb{R}^3$.


In [None]:
# Pseudoinverse
x_p = np.linalg.pinv(A) @ b
print("Particular solution x_p:")
print(x_p)
print("Verify that A @ x_p ~ b:", np.allclose(A @ x_p, b))

# null_space computation
from scipy.linalg import null_space

ns_A = null_space(A)
print("Basis of nullspace ker(A) (columns):")
print(ns_A)
print("Verification A @ n ~ 0:", np.allclose(A @ ns_A, 0))

Next, we will find a **solution vector** and a **vector belonging to the kernel** whose elements are **integers**, making the results easier to read and interpret.


In [None]:
ns_A = null_space(A)
nice_null_space = 1/ns_A[0]*ns_A
print(f"null space vector:\n",nice_null_space)
# this can be found by Gauss-Jordan elimination :)
x_nice = np.array([5,-1,0]).reshape(-1,1)
print(f"solution vector:\n", x_nice)


In [None]:
np.round(np.linalg.pinv(A),2)

Now, we can verify that adding any vector from the **null space** of $A$ to a particular solution still yields a valid solution to the system of equations:

$$A(\boldsymbol{x} + \boldsymbol{v}) = \boldsymbol{b}, \quad \boldsymbol{v} \in \ker(A).
$$

Write this equation in Python and experiment with different values of the scalar $t$,  
which you can use to generate new solution vectors of the form:

$$
\boldsymbol{x}_{\text{new}} = \boldsymbol{x} + t \boldsymbol{v}, \quad \text{where } \boldsymbol{v} \in \ker(A).
$$


In [None]:
t = -5
# write the equation above
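
One possible way to experiment with this is sketched below. To keep the example self-contained it takes the kernel vector from the SVD computed by NumPy (the right singular vector belonging to the near-zero singular value) instead of calling `scipy.linalg.null_space`, and it uses the pseudoinverse solution as the particular solution. The names `x_p`, `v`, and `x_new` are chosen for this illustration.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 1., 1.]])
b = np.array([3., 6., 4.])

x_p = np.linalg.pinv(A) @ b  # minimum-norm particular solution
_, _, Vt = np.linalg.svd(A)
v = Vt[-1]                   # unit vector spanning ker(A), since rank(A) = 2

for t in (-5.0, 0.0, 2.5):
    x_new = x_p + t * v
    # every x_new solves the system; its norm grows as |t| grows
    print(t, np.allclose(A @ x_new, b), np.linalg.norm(x_new))
```

Because the pseudoinverse solution is orthogonal to the kernel, the printed norm is smallest at `t = 0`, which matches the minimum-norm property mentioned earlier.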