---
MAT421 - Applied Computational Methods

Arizona State University

Homework #5

Written by Edward Hayes

---
This notebook is a review and an elaboration of the topics covered in the Chapter 1 lecture notes from MAT421.

# Chapter 1. Linear Algebra

## 1.1 Introduction

Linear algebra is used in a variety of fields, and it is especially important in data science, machine learning, and many engineering applications.

## 1.2 Elements of Linear Algebra

Throughout this section, $V=ℝ^n$.




### 1.2.1 Linear Spaces

#### 1.2.1.1 Linear Combinations

A linear combination of vectors $\bar{w}_1,...,\bar{w}_m$ is a new vector formed by multiplying each vector by a scalar and adding the results, $∑_{j=1}^m α_j\bar{w}_j$. The set of all linear combinations of a list of vectors forms a linear subspace.

**Definition: Linear Subspace**\
A linear subspace $U$ of $V$ is a subset $U⊆V$ that is closed under vector addition and scalar multiplication. That is, for all $\bar{u}_1,\bar{u}_2∈U$ and $α∈ℝ$, it holds that

- $\bar{u}_1+\bar{u}_2∈U$
- $α\bar{u}_1∈U$

**Definition: Span**\
Let $\bar{w}_1,...,\bar{w}_m∈V$. The span of {$\bar{w}_1,...,\bar{w}_m$}, denoted span($\bar{w}_1,...,\bar{w}_m$), is the set of all linear combinations of $\bar{w}_j$. That is,

- span($\bar{w}_1,...,\bar{w}_m$) $=\{∑_{j=1}^m α_j\bar{w}_j:α_1,...,α_m∈ℝ\}$

**Definition: Column Space**\
Let $A∈ℝ^{n×m}$ be an $n×m$ matrix with columns $\bar{a}_1,...,\bar{a}_m∈ℝ^n$. The column space of $A$, denoted col($A$), is the span of the columns of $A$, that is,

- col($A$) $=$ span($\bar{a}_1,...,\bar{a}_m$) $⊆ℝ^n$
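
The span and column space definitions above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix values and the helper name `in_column_space` are illustrative and not part of the lecture notes. It uses the fact that $\bar{b}∈$ col($A$) exactly when appending $\bar{b}$ as an extra column does not increase the rank.

```python
import numpy as np

# Columns a_1, a_2 of A span a plane in R^3 (illustrative values).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

def in_column_space(A, b):
    """Return True if b is a linear combination of the columns of A.

    Hypothetical helper: b is in col(A) exactly when the augmented
    matrix [A | b] has the same rank as A.
    """
    augmented = np.column_stack([A, b])
    return bool(np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(A))

b_in = 2.0 * A[:, 0] - 3.0 * A[:, 1]   # built as a linear combination
b_out = np.array([0.0, 0.0, 1.0])      # does not lie in the plane

print(in_column_space(A, b_in))
print(in_column_space(A, b_out))
```

The rank comparison relies on a floating-point tolerance inside `np.linalg.matrix_rank`, so for badly scaled data the answer should be treated as approximate.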

#### 1.2.1.2 Linear Independence and Dimension

**Definition: Linear Independence**\
A list of vectors $\bar{u}_1,...,\bar{u}_m$ is linearly independent if none of them can be written as a linear combination of the others, that is,

- $∀i$, $\bar{u}_i∉$ span({$\bar{u}_j:j\neq{i}$})
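
One way to test this definition numerically is sketched below, assuming NumPy. The function name `is_linearly_independent` and the example vectors are hypothetical. The check stacks the vectors as columns and compares the matrix rank with the number of vectors: the list is linearly independent exactly when the two agree.

```python
import numpy as np

def is_linearly_independent(vectors):
    """Return True if the given list of 1-D arrays is linearly independent.

    Hypothetical helper: the vectors are independent exactly when the
    matrix having them as columns has rank equal to the number of vectors.
    """
    M = np.column_stack(vectors)
    return bool(np.linalg.matrix_rank(M) == M.shape[1])

u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([1.0, 1.0, 0.0])
u3 = u1 + 2.0 * u2          # deliberately a combination of u1 and u2

print(is_linearly_independent([u1, u2]))
print(is_linearly_independent([u1, u2, u3]))
```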

**Definition: Basis of a Space**\
Let $U$ be a linear subspace of $V$. A basis of $U$ is a list of vectors $\bar{u}_1,..,\bar{u}_m$ in $U$ that:

- Spans $U$, that is, $U$=span($\bar{u}_1,..,\bar{u}_m$)
- Is linearly independent

**Dimension Theorem**\
Let $U$ be a linear subspace of $V$. All bases of $U$ have the same length, that is, the same number of elements. We call this number the dimension of $U$ and denote it dim($U$).

### 1.2.2 Orthogonality

#### 1.2.2.1 Orthonormal Bases

**Definition: Norm and Inner Product**

- $⟨\bar{u},\bar{v}⟩=\bar{u}·\bar{v}=∑_{i=1}^n u_iv_i$
- $\|\bar{u}\|=\sqrt{\sum_{i=1}^n u_i^2}$

**Definition: Orthonormal**\
A list of vectors {$\bar{u}_1,..,\bar{u}_m$} is orthonormal if the $\bar{u}_i$'s are pairwise orthogonal and each has norm 1, that is, for all $i$ and all $j\neq{i}$,

- $⟨\bar{u}_i,\bar{u}_j⟩=0$
- $\|\bar{u}_i\|=1$
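
The inner product, norm, and orthonormality conditions above translate directly into NumPy calls. This is a small sketch with made-up vectors; `is_orthonormal` is a hypothetical helper that stacks the vectors as the columns of a matrix $Q$ and checks $Q^TQ=I$, which encodes both conditions at once.

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([-4.0, 3.0])

inner_uv = np.dot(u, v)        # sum_i u_i v_i = 3*(-4) + 4*3
norm_u = np.linalg.norm(u)     # sqrt(3^2 + 4^2)

def is_orthonormal(vectors, tol=1e-10):
    """Return True if the vectors are pairwise orthogonal with norm 1.

    Hypothetical helper: entry (i, j) of Q^T Q is <u_i, u_j>, so the
    list is orthonormal exactly when Q^T Q is the identity matrix.
    """
    Q = np.column_stack(vectors)
    return bool(np.allclose(Q.T @ Q, np.eye(Q.shape[1]), atol=tol))

print(inner_uv, norm_u)
print(is_orthonormal([u, v]))                    # orthogonal, but norms are 5
print(is_orthonormal([u / norm_u, v / norm_u]))  # normalized versions
```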

**Orthonormal Basis Expansion**\
Let $\bar{q}_1,..,\bar{q}_m$ be an orthonormal basis of $U$ and let $\bar{u}∈U$. Then $\bar{u}=∑_{j=1}^m⟨\bar{u},\bar{q}_j⟩\bar{q}_j$.
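
As a quick numerical illustration of the expansion, the sketch below (assuming NumPy, with an orthonormal basis of $ℝ^2$ chosen only for the example) computes the coefficients $⟨\bar{u},\bar{q}_j⟩$ and rebuilds $\bar{u}$ from them.

```python
import numpy as np

# An orthonormal basis of R^2 chosen for illustration.
q1 = np.array([1.0, 1.0]) / np.sqrt(2.0)
q2 = np.array([1.0, -1.0]) / np.sqrt(2.0)
basis = [q1, q2]

u = np.array([2.0, 5.0])

# Coefficient j of the expansion is the inner product <u, q_j>.
coeffs = [np.dot(u, q) for q in basis]

# Rebuild u as sum_j <u, q_j> q_j.
u_rebuilt = sum(c * q for c, q in zip(coeffs, basis))

print(coeffs)
print(u_rebuilt)
```

No linear system has to be solved to find the coefficients, which is the practical benefit of working in an orthonormal basis.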


### 1.2.3 Gram-Schmidt Process

The Gram-Schmidt algorithm is used to obtain an orthonormal basis.

Let $\bar{a}_1,...,\bar{a}_m$ in $ℝ^n$ be linearly independent. Then there exists an orthonormal basis $\bar{q}_1,...,\bar{q}_m$ of span($\bar{a}_1,...,\bar{a}_m$).
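
The notes state the result without listing the steps, so the following is only a sketch of the classical Gram-Schmidt procedure, assuming NumPy. The function name `gram_schmidt`, the tolerance, and the example matrix are illustrative choices. For each column $\bar{a}_i$, it subtracts the projections onto the previously computed $\bar{q}_j$'s and then normalizes what remains.

```python
import numpy as np

def gram_schmidt(A, tol=1e-12):
    """Return Q whose columns are an orthonormal basis of col(A).

    Sketch of classical Gram-Schmidt; assumes the columns of A are
    linearly independent and raises ValueError otherwise.
    """
    A = np.asarray(A, dtype=float)
    n, m = A.shape
    Q = np.zeros((n, m))
    for i in range(m):
        a = A[:, i]
        v = a.copy()
        # Remove the components of a_i along the earlier q_j's.
        for j in range(i):
            v -= np.dot(a, Q[:, j]) * Q[:, j]
        norm = np.linalg.norm(v)
        if norm < tol:
            raise ValueError("columns are not linearly independent")
        Q[:, i] = v / norm
    return Q

A_gs = np.array([[1.0, 1.0],
                 [1.0, 0.0],
                 [0.0, 1.0]])
Q_gs = gram_schmidt(A_gs)
print(Q_gs.T @ Q_gs)   # close to the 2x2 identity
```

In floating-point arithmetic the classical variant can lose orthogonality for nearly dependent columns; library routines such as `np.linalg.qr` are usually preferred in practice.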

### 1.2.4 Eigenvalues and Eigenvectors

**Definition: Eigenvalues and Eigenvectors**\
Let $A∈ℝ^{d×d}$ be a square matrix. Then $λ∈ℝ$ is an eigenvalue of $A$ if there exists a nonzero vector $\bar{x}\neq{\bar{0}}$ such that $A\bar{x}=λ\bar{x}$.

$\bar{x}$ is referred to as an eigenvector.
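
A short numerical check of the definition is sketched below, assuming NumPy. The matrix is an arbitrary symmetric example, chosen because symmetric real matrices always have real eigenvalues, which matches the requirement $λ∈ℝ$ above. `np.linalg.eig` returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors.

```python
import numpy as np

# Symmetric example matrix (illustrative values).
A_eig = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A_eig)

# Verify A x = lambda x for each eigenpair.
for k in range(len(eigenvalues)):
    lam = eigenvalues[k]
    x = eigenvectors[:, k]   # column k pairs with eigenvalue k
    print(lam, np.allclose(A_eig @ x, lam * x))
```

The order in which the eigenvalues are returned is not guaranteed, so code that depends on a particular order should sort them explicitly.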

## 1.3 Linear Regression



### 1.3.1 QR Decomposition

The Gram-Schmidt algorithm produces an orthonormal basis $\bar{q}_1,...,\bar{q}_m$ of span($\bar{a}_1,...,\bar{a}_m$) from a linearly independent list $\bar{a}_1,...,\bar{a}_m$.

Let $A$ and $Q$ be the $n×m$ matrices whose columns are the $\bar{a}_i$'s and the $\bar{q}_i$'s, respectively. The QR decomposition is $A=QR$, where column $i$ of the $m×m$ matrix $R$ contains the coefficients of the linear combination of the $\bar{q}_j$'s that produces $\bar{a}_i$. Because $\bar{a}_i∈$ span($\bar{q}_1,...,\bar{q}_i$), $R$ is upper triangular.
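
The decomposition can be computed with NumPy's built-in routine, as in the sketch below (the example matrix is made up). Note that `np.linalg.qr` is not a Gram-Schmidt implementation, so the signs of the columns of $Q$ and rows of $R$ may differ from a hand computation, but the product $QR$ still reproduces $A$.

```python
import numpy as np

A_qr = np.array([[1.0, 1.0],
                 [1.0, 0.0],
                 [0.0, 1.0]])

# Default "reduced" mode: Q is n x m and R is m x m.
Q, R = np.linalg.qr(A_qr)

print(np.allclose(Q @ R, A_qr))           # A = QR
print(np.allclose(Q.T @ Q, np.eye(2)))    # columns of Q are orthonormal
print(np.allclose(R, np.triu(R)))         # R is upper triangular
```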

### 1.3.2 Least-Squares Problems

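When $A\bar{x}=\bar{b}$ has no exact solution, for example when $A∈ℝ^{n×m}$ has more rows than columns and so cannot be inverted, we instead look for the $\bar{x}$ that makes $A\bar{x}$ as close as possible to $\bar{b}$. This is a least-squares problem.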

A solution to

\begin{equation}
\min_{\bar{x}∈ℝ^m}\|A\bar{x}-\bar{b}\|
\end{equation}

satisfies the normal equations

\begin{equation}
A^TA\bar{x}=A^T\bar{b}
\end{equation}

or, when the columns of $A$ are linearly independent so that $A^TA$ is invertible,

\begin{equation}
\bar{x}=(A^TA)^{-1}A^T\bar{b}
\end{equation}
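
The normal equations can be solved directly and compared against a library solver. The sketch below assumes NumPy and uses a small made-up overdetermined system with linearly independent columns. It solves $A^TA\bar{x}=A^T\bar{b}$ with `np.linalg.solve` rather than forming the inverse explicitly, and cross-checks the result with `np.linalg.lstsq`.

```python
import numpy as np

# Overdetermined system: 3 equations, 2 unknowns (illustrative values).
A_ls = np.array([[1.0, 0.0],
                 [1.0, 1.0],
                 [1.0, 2.0]])
b_ls = np.array([1.0, 2.0, 2.0])

# Solve the normal equations A^T A x = A^T b.
x_normal = np.linalg.solve(A_ls.T @ A_ls, A_ls.T @ b_ls)

# Library least-squares solver for comparison.
x_lstsq, residuals, rank, singular_values = np.linalg.lstsq(A_ls, b_ls, rcond=None)

print(x_normal)
print(x_lstsq)
```

Forming $A^TA$ squares the condition number of the problem, so for ill-conditioned $A$ a QR-based or SVD-based solver such as `np.linalg.lstsq` is generally the safer choice.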

### 1.3.3 Linear Regression

In linear regression, a function is fit to $n$ given data points, $\{(\bar{x}_i,y_i)\}_{i=1}^n$, where $\bar{x}_i=(x_{i1},...,x_{id})$. The coefficients are chosen to minimize the error between the data and the function's predictions.

The error is $\sum_{i=1}^n(y_i-\hat{y}_i)^2$.

The predicted value (function) is $\hat{y}_i=\beta_0+∑_{j=1}^dβ_jx_{ij}$.

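This is a least-squares problem. With $A$ the $n×(d+1)$ matrix whose $i$-th row is $(1,x_{i1},...,x_{id})$, $\bar{β}=(β_0,...,β_d)$ and $\bar{y}=(y_1,...,y_n)$, the coefficients solve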

\begin{equation}
\min_{\beta}\|\bar{y}-A\bar{\beta}\|
\end{equation}
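
A minimal end-to-end sketch of this setup is given below, assuming NumPy. The data are synthetic: the "true" coefficients, noise level, and random seed are arbitrary choices made only for the illustration. The code builds the matrix $A$ with a leading column of ones for $β_0$, solves the least-squares problem, and evaluates the sum of squared errors.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed for reproducibility

# Synthetic data (assumed for illustration): n points with d features.
n, d = 50, 2
X = rng.normal(size=(n, d))
beta_true = np.array([1.0, 2.0, -3.0])  # beta_0, beta_1, beta_2
noise = 0.01 * rng.normal(size=n)
y = beta_true[0] + X @ beta_true[1:] + noise

# Row i of A is (1, x_i1, ..., x_id).
A_reg = np.column_stack([np.ones(n), X])

# Least-squares estimate of the coefficients.
beta_hat, *_ = np.linalg.lstsq(A_reg, y, rcond=None)

y_hat = A_reg @ beta_hat                # predicted values
sse = np.sum((y - y_hat) ** 2)          # sum of squared errors

print(beta_hat)
print(sse)
```

With this little noise the estimated coefficients should land close to `beta_true`; with real data there is no known `beta_true`, and the fit is judged from the residuals alone.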