# Crash Course Lesson 4

In this lesson we will learn about:

* **Change of Basis**
* **Eigenvectors** and **Eigenvalues**
* **Singular Value Decomposition**

When we defined the matrix $M_L$ associated with a linear map $L: \mathbb{R}^n \to \mathbb{R}^m$ we said that the columns of $M_L$ should be the images of each of the **standard basis vectors** $\vec{e}_1, \vec{e_2}, \vec{e}_3, ..., \vec{e}_n$.  We were also implicitly using the standard basis of $\mathbb{R}^m$:  when we write a vector like $\begin{bmatrix} 5 \\ 3 \end{bmatrix}$ we really mean $5\begin{bmatrix}1 \\ 0 \end{bmatrix} + 3 \begin{bmatrix}0 \\ 1\end{bmatrix}$.

However **any** basis is capable of representing any arbitrary vector as a linear combination of the basis vectors, in a unique way.  The coefficients of this linear combination can be thought of as a new set of coordinates for the vector.  

**Example**:  Let $\mathcal{B} = \left( \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right) = (\vec{b}_1, \vec{b}_2)$.  This is a basis of $\mathbb{R}^2$.  Every vector in $\mathbb{R}^2$ can be expressed as a linear combination of these vectors.  We can even make a coordinate system to see how:

<p align = 'middle'>
<img src="crash_course_assets/standard-basis.png" width="400">
<img src="crash_course_assets/new-basis.png" width="400">
</p>



Here we can see that the point 

$$\begin{bmatrix} 5 \\ 3  \end{bmatrix} = 5\begin{bmatrix}1 \\ 0 \end{bmatrix} + 3 \begin{bmatrix}0 \\ 1\end{bmatrix} = 5 \vec{e}_1 + 3 \vec{e}_2$$

could also be thought of as 

$$
2\vec{b}_1 + 3 \vec{b}_2 = 2\begin{bmatrix}1 \\ 0 \end{bmatrix} + 3 \begin{bmatrix}1 \\ 1\end{bmatrix} = \begin{bmatrix} 5 \\ 3  \end{bmatrix}
$$

So the same vector has coordinates $\begin{bmatrix} 5 \\ 3  \end{bmatrix}$ with respect to the standard basis, but has coordinates $\begin{bmatrix} 2 \\ 3  \end{bmatrix}_\mathcal{B}$ with respect to the basis $\mathcal{B}$.



**Definition**  Let $\mathcal{B} = (\vec{b_1}, \vec{b}_2, \vec{b_3}, \dots , \vec{b}_n)$ be a basis of $\mathbb{R}^n$ and $\vec{v} \in \mathbb{R}^n$.  Then we can write $\vec{v}$ uniquely as a linear combination of the vectors from $\mathcal{B}$:

$$
\vec{v} = c_1\vec{b_1} +c_2 \vec{b}_2 + c_3\vec{b}_3 + \dots + c_n \vec{b}_n
$$

We call these coefficients $c_i$ the **coordinates of $\vec{v}$ with respect to $\mathcal{B}$**.  We introduce the following notation as a shorthand for the linear combination:

$$
c_1\vec{b_1} +c_2 \vec{b}_2 + c_3\vec{b}_3 + \dots + c_n \vec{b}_n = \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ \vdots \\ c_n \end{bmatrix}_\mathcal{B}
$$

**Observation**:  Using our understanding of the equivalence between linear combination of column vectors with matrix multiplication, we could also rephrase this as follows:

Let $M_\mathcal{B}$ be the matrix whose columns are the vectors of $\mathcal{B}$ (in standard coordinates).  Then the coordinates of $\vec{v} \in \mathbb{R}^n$ with respect to $\mathcal{B}$ are given by the vector of coefficients $\vec{c}$ which solves the equation 

$$
\vec{v} = M_\mathcal{B} \vec{c}
$$

This can be solved using the inverse of $M_\mathcal{B}$.

$$
\vec{c} = M_\mathcal{B}^{-1} \vec{v}
$$

**Moral**:  To find the coordinates of a vector $\vec{v}$ with respect to a new basis $\mathcal{B}$, just apply the inverse of the matrix whose columns come from $\mathcal{B}$.

In [1]:
# Showing that this works using the example we kicked things off with.

import numpy as np
v = np.array([[5],[3]])
M = np.array([[1,1],[0,1]])
Minv = np.linalg.inv(M)
v_new = np.dot(Minv, v)
print('The coordinates of \n', v, '\n with respect to the columns of \n', M, '\n are \n', v_new )

The coordinates of 
 [[5]
 [3]] 
 with respect to the columns of 
 [[1 1]
 [0 1]] 
 are 
 [[2.]
 [3.]]
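
The same coordinates can also be found without forming the inverse explicitly.  The sketch below assumes `np.linalg.solve` as a drop-in alternative for solving $\vec{v} = M_\mathcal{B} \vec{c}$; it is one reasonable way to do it rather than the method this lesson relies on.

```python
import numpy as np

v = np.array([[5.0], [3.0]])
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # columns are b_1 and b_2

# Solve M c = v directly instead of computing the inverse of M.
c = np.linalg.solve(M, v)
print(c)                     # coordinates of v with respect to B

# Sanity check: rebuilding v from its B-coordinates recovers the original vector.
print(np.allclose(M @ c, v))
```

For larger matrices, solving the system is generally cheaper and numerically safer than computing the full inverse.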


**Exercise 1**:  Find the coordinates of $\begin{bmatrix} 1 \\ 2 \\ 3\end{bmatrix}$ with respect to the basis 

$$\mathcal{B} = \left( \begin{bmatrix} 1 \\ 1 \\ 1\end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 0\end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -2\end{bmatrix} \right)$$

Experience has shown that when confronting a problem where we think linear algebra might help, it is extremely helpful to find bases for the spaces we are interested in which are "custom made" to make our problem easy to work with.

For example, we have had a special focus on orthogonal projection onto a $k$ dimensional subspace of $\mathbb{R}^n$ throughout this crash course.  We can view the Gram-Schmidt process as a way to "custom make" a basis for $\mathbb{R}^n$ where all the basis vectors are orthogonal and the first $k$ basis vectors span the subspace.  We have seen how this is a useful basis for the task of orthogonally projecting onto the subspace!
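
To make the "custom made basis" idea concrete, here is a minimal sketch.  It uses `np.linalg.qr` as a stand-in for Gram-Schmidt (an assumption of this sketch, not the procedure from the earlier lessons), and the subspace and vector are made up for illustration.  In the new orthonormal basis, projecting onto the subspace just means dropping the last coordinates.

```python
import numpy as np

# Hypothetical example: a 2-dimensional subspace of R^3 spanned by the columns of A.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

# mode='complete' returns a 3x3 matrix Q with orthonormal columns;
# its first 2 columns span the same subspace as the columns of A.
Q, _ = np.linalg.qr(A, mode='complete')

v = np.array([1.0, 2.0, 3.0])
c = Q.T @ v            # coordinates of v with respect to the columns of Q
c_proj = c.copy()
c_proj[2:] = 0.0       # projecting = zeroing the coordinates outside the subspace
proj = Q @ c_proj      # back to standard coordinates

# Compare with the projection computed from the normal equations.
proj_direct = A @ np.linalg.solve(A.T @ A, A.T @ v)
print(np.allclose(proj, proj_direct))
```

Here `Q.T` plays the role of the inverse of `Q`, which is valid because the columns of `Q` are orthonormal.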

**Definition**:  We now define the **matrix of a linear map with respect to a choice of basis for the domain and codomain**.  Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be a linear map.  Let $\mathcal{B}_1$ be a basis for $\mathbb{R}^n$ and $\mathcal{B}_2$ be a basis for $\mathbb{R}^m$.  Then we define the matrix for $L$ with respect to these bases to be the matrix $M_{\mathcal{B}_1 \to \mathcal{B}_2}$ whose columns are the $\mathcal{B}_2$ coordinates of the images (under $L$) of the basis vectors from $\mathcal{B}_1$.

**Example**:  Let $L: \mathbb{R}^2 \to \mathbb{R}^3$ have standard matrix

$$
M_L = \begin{bmatrix} 1 & 2 \\ 0 & -1 \\ -1 & 1\end{bmatrix}
$$

Let $\mathcal{B}_1$ be the columns of the matrix $B_1$:

$$
B_1 = \begin{bmatrix} 2  & -1\\ 1 & 1  \end{bmatrix}
$$

Let $\mathcal{B}_2$ be the columns of the matrix $B_2$:

$$
B_2 = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}
$$

Then we can compute the matrix $M_{\mathcal{B}_1 \to \mathcal{B}_2}$ as follows:

Note that the first column is $B_2^{-1}M_L(B_1 \vec{e}_1)$ and the second column is $B_2^{-1}M_L(B_1 \vec{e}_2)$.  But that means that the matrix $M_{\mathcal{B}_1 \to \mathcal{B}_2}$ is given by

$$
M_{\mathcal{B}_1 \to \mathcal{B}_2} = B_2^{-1} M_L B_1
$$





In [2]:
B_1 = np.array([[2,-1],[1,1]])
B_2 = np.array([[1,-1,0],[1, 1, 1],[1, 1, 0]])
B_2_inv = np.linalg.inv(B_2)
M = np.array([[1,2],[0,-1],[-1,1]])
M_new = np.dot(B_2_inv, np.dot(M, B_1))
print(M_new)

[[ 1.5  1.5]
 [-2.5  0.5]
 [ 0.  -3. ]]
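
As a sanity check on the formula $M_{\mathcal{B}_1 \to \mathcal{B}_2} = B_2^{-1} M_L B_1$, the sketch below applies $L$ to a vector along two routes and compares the results.  The test coordinates `c` are arbitrary, and `np.linalg.solve` is used in place of an explicit inverse; both are choices made for this illustration.

```python
import numpy as np

M_L = np.array([[1, 2], [0, -1], [-1, 1]], dtype=float)
B_1 = np.array([[2, -1], [1, 1]], dtype=float)     # same B_1 as in the cell above
B_2 = np.array([[1, -1, 0], [1, 1, 1], [1, 1, 0]], dtype=float)

M_new = np.linalg.solve(B_2, M_L @ B_1)   # equivalent to B_2^{-1} M_L B_1

c = np.array([3.0, -2.0])   # made-up B_1-coordinates of some vector

# Route 1: convert to standard coordinates, then apply L.
via_standard = M_L @ (B_1 @ c)
# Route 2: apply M_new to the B_1-coordinates, then convert the
# resulting B_2-coordinates back to standard coordinates.
via_new_bases = B_2 @ (M_new @ c)

print(np.allclose(via_standard, via_new_bases))
```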


## Eigenvectors

The easiest linear transformations to understand are the *scaling transformations*  $L(\vec{v}) = c\vec{v}$ for some constant $c$.  Visually, they just stretch all vectors by the same constant factor, either enlarging them or reducing them in size.

The next easiest linear transformations to understand scale each coordinate axis independently.  For example the map

$$
L\left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} 2x \\ 3y \end{bmatrix}
$$

stretches the plane horizontally by a factor of 2 and vertically by a factor of 3.

The matrix for such a linear transformation is diagonal:

$$
M_L = \begin{bmatrix} 2 & 0 \\ 0 & 3\end{bmatrix}
$$

If we have a linear transformation from $\mathbb{R}^2 \to \mathbb{R}^2$ which *doesn't* have a diagonal matrix, it can be pretty hard to have a geometric understanding of what it is doing.  You know it is rotating, stretching, and shearing space somehow, but it is certainly not as simple as the diagonal matrix transformations we talked about above.

Often we are able to find a basis of **eigenvectors** for linear transformations.  The matrix with respect to a basis of eigenvectors is diagonal, so it makes understanding the linear transformation much simpler!

**Definition**:  Let $L:\mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation.  A vector $\vec{v}$ is called an **eigenvector** of $L$ if $\vec{v} \neq  \vec{0}$ and there is a scalar $\lambda \in \mathbb{R}$ so that 

$$
L(\vec{v}) = \lambda \vec{v}
$$

This constant $\lambda$ is called the **eigenvalue** associated with $\vec{v}$.

The effect of $L$ on $\textrm{span}(\vec{v})$ is just "simple scaling"!

Note:  While it is not standard terminology I will use the phrase **eigenstuff** of $L$ to refer to the eigenvector/eigenvalue pairs of $L$.  I am doing this because I hope it will catch on.

**Example**  Let $L: \mathbb{R}^2 \to \mathbb{R}^2$ have matrix

$$
M_L = \begin{bmatrix} -1 & 3 \\ 1 & 1 \end{bmatrix}
$$

Then $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector with eigenvalue $2$ and $\begin{bmatrix} -3 \\ 1 \end{bmatrix}$ is an eigenvector with eigenvalue $-2$ (check!)

These vectors are linearly independent so they form a basis of eigenvectors which I will call $\mathcal{B}$.

Let's see how this gives us a better geometric understanding of $L$:


The picture below shows the standard basis vectors $\vec{e}_1$ and $\vec{e}_2$ as solid red and blue vectors.  The images of these vectors under $L$ are dotted red and blue vectors (these are the columns of $M_L$).  It is pretty hard to understand exactly what $L$ is doing to the plane.  It seems like it is flipping it, scaling it, and shearing it somehow, but it is a little baffling.  The linear transformation is "mixing up" the coordinates somehow.

<p align = 'middle'>
<img src="crash_course_assets/bad-basis.png" width="400">
</p>

Now let us look at exactly the same linear transformation, but using the basis of eigenvectors $\mathcal{B}$.  Again the first basis vector is a solid red vector and the second basis vector is a solid blue vector.  The dotted vectors represent where they are mapped by $L$.  This is so much easier to understand!  From this picture, we can see that $L$ just stretches the plane by a factor of $2$ in the direction of the first basis vector, and stretches the plane by a factor of $-2$ (which flips it and stretches it) in the direction of the second basis vector. 

<p align = 'middle'>
<img src="crash_course_assets/good-basis.png" width="400">
</p>

The matrix of $L$ with respect to $\mathcal{B}$ is as simple as possible:

$$
M_{\mathcal{B}} = \begin{bmatrix}  2 & 0 \\ 0 & -2\end{bmatrix}
$$
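
The "(check!)" above can be carried out numerically.  This is a small sketch that verifies the two claimed eigenvectors and the diagonal form, reusing the change-of-basis formula $B^{-1} M_L B$ from earlier in the lesson.

```python
import numpy as np

M_L = np.array([[-1.0, 3.0],
                [1.0, 1.0]])
B = np.array([[1.0, -3.0],
              [1.0, 1.0]])     # columns are the two eigenvectors

# Each column of M_L @ B should be the matching column of B scaled by its eigenvalue.
print(M_L @ B)

# Changing to the eigenvector basis gives the diagonal matrix of eigenvalues.
M_B = np.linalg.inv(B) @ M_L @ B
print(np.round(M_B, 6))
```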

A few facts about eigenvectors:

* If you took an undergraduate linear algebra course you were probably taught to find eigenstuff using determinants and something called the "characteristic equation".  I would like to note that this is **completely impractical** for anything except the smallest matrices.
* It is usually impossible to find "closed form" expressions for eigenstuff:  you will usually not be able to get exact expressions for them.  The best we can do is numerically approximate.
* See [this Wikipedia page](https://en.wikipedia.org/wiki/Eigenvalue_algorithm) to get an idea of the variety of algorithms available for finding eigenstuff.  You don't necessarily need to learn any of these, but you should be aware that if your matrix is special in some way (symmetric, tridiagonal, "sparse", etc.) you may get better performance out of an algorithm which is tailored to that particular case.
* Not all linear transformations have real eigenstuff.  For instance, the matrix

$$
\begin{bmatrix}  0 & -1 \\ 1 & 0
\end{bmatrix}
$$

has the geometric effect of rotation by $90$ degrees (check!).  This doesn't have any "real" eigenstuff.  However, if you allow complex numbers (something we will not consider in this crash course!) every linear transformation does have at least one eigenvector.
* Even allowing complex numbers, not every matrix has a **basis** of eigenvectors, as demonstrated by the counterexample:

$$
\begin{bmatrix} 0 & 1 \\
 0 & 0 \end{bmatrix}
$$

$\begin{bmatrix} 1 \\ 0\end{bmatrix}$ is clearly an eigenvector for this matrix with eigenvalue $0$, but there can be no other eigenstuff:

$$
\begin{bmatrix} 0 & 1 \\
 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 
 \begin{bmatrix}
  y \\ 0 
 \end{bmatrix}
$$

For $\begin{bmatrix} x \\ y \end{bmatrix}$ to be an eigenvector with eigenvalue $\lambda$ we would need $\lambda x = y$ and $\lambda y = 0$.  The second equation forces $\lambda = 0$ or $y = 0$.  If $\lambda = 0$, the first equation gives $y = 0$.  If $y = 0$, the first equation gives $\lambda x = 0$, so either $x = 0$ (not allowed, since eigenvectors cannot be the zero vector) or $\lambda = 0$.  Either way $\lambda = 0$ and $y = 0$, hence the eigenvector is of the form $\begin{bmatrix}x \\ 0 \end{bmatrix} =  x \begin{bmatrix}1 \\ 0 \end{bmatrix}$, which is the eigenvector we already found.  If you want to know more about this situation you should check out the Wikipedia page for [Jordan Canonical Form](https://en.wikipedia.org/wiki/Jordan_normal_form).

* However, we can say that "with probability 1" a random linear map $\mathbb{C}^n \to \mathbb{C}^n$ will have a basis of eigenvectors, so "for almost all data you find in nature" you can be reasonably sure that you can find a basis of eigenvectors.  For the cognoscenti:  The reason is that matrices with repeated eigenvalues form the algebraic variety in $\mathbb{C}^{(n^2)}$ which is the zero set of the discriminant of the characteristic polynomial.  The dimension of this subvariety is less than $n^2$, so it has measure $0$.  The complement of this variety consists of all matrices which are diagonalizable with *distinct* eigenvalues.  So the set of matrices which are *not* diagonalizable with distinct eigenvalues is of measure $0$.
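
The two counterexamples in the list above can be explored numerically.  This is a rough sketch: it uses `np.linalg.eigvals` for the rotation matrix, and for the second matrix it counts independent eigenvectors as $n - \textrm{rank}$ (the dimension of the null space), which is an approach chosen for this illustration.

```python
import numpy as np

# Rotation by 90 degrees: no real eigenvalues, so NumPy returns complex ones.
R = np.array([[0.0, -1.0],
              [1.0, 0.0]])
rot_eigenvalues = np.linalg.eigvals(R)
print(rot_eigenvalues)   # a complex-conjugate pair

# The second counterexample: 0 is its only eigenvalue.
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])
# Eigenvectors for eigenvalue 0 are the nonzero solutions of N x = 0,
# so the eigenspace has dimension n - rank(N).
eigenspace_dim = N.shape[0] - np.linalg.matrix_rank(N)
print(eigenspace_dim)    # fewer than 2 independent eigenvectors means no eigenbasis
```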


We can use numpy to find eigenstuff as follows:

In [3]:
M = np.random.random((3,3)) 
M = M + np.transpose(M)
    # making a random symmetric 3x3 matrix. 
    # I am making it symmetric to ensure that it has a basis of real eigenvectors.  We will learn why in a bit.

eigenstuff = np.linalg.eig(M) 
    # eigenstuff is a tuple.  
    # The index 0 element of this tuple is a numpy array of the eigenvalues.
    # The index 1 element of the tuple is a numpy array whose columns are the eigenvectors.
    # Note that numpy normalizes the eigenvectors:  it always scales each eigenvector to have length 1.
 
print('These are the eigenvalues \n', eigenstuff[0])
print('The columns of this matrix are the eigenvectors \n', eigenstuff[1])

These are the eigenvalues 
 [ 2.95973016  0.37150463 -0.74754508]
The columns of this matrix are the eigenvectors 
 [[ 0.4359467   0.88306026 -0.17365211]
 [ 0.69575767 -0.45308141 -0.55734953]
 [ 0.57085177 -0.1221549   0.81191529]]
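
Since the matrix in this cell is symmetric, a routine tailored to that case is also available.  The sketch below assumes `np.linalg.eigh` and a fixed random seed (both choices made for this illustration); it checks that the eigenvalues come back real and that the eigenvectors are orthonormal.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed, chosen arbitrarily
M = rng.random((3, 3))
M = M + M.T                       # symmetric, as in the cell above

# eigh is designed for symmetric matrices; eigenvalues are returned in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(M)
print(eigenvalues)

# The eigenvector columns are orthonormal.
print(np.allclose(eigenvectors.T @ eigenvectors, np.eye(3)))
# M v = lambda v holds for every column v at once.
print(np.allclose(M @ eigenvectors, eigenvectors * eigenvalues))
```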


Putting our knowledge of change of basis together with the definition of eigenvectors, we can say the following:

**Idea**:  Let $L: \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation.  Assume that $L$ has a basis $\mathcal{B}$ of eigenvectors  (remember:  not all linear transformations *do* have such a basis).  Then the matrix of $L$ with respect to $\mathcal{B}$ is the diagonal matrix of eigenvalues.

$$
M_\mathcal{B} = \begin{bmatrix} 
\lambda_1 & 0 & 0 & \dots & 0\\ 
0 & \lambda_2 & 0 & \dots & 0\\ 
0 & 0 & \lambda_3 & \dots & 0\\ 
\vdots & \vdots & \vdots & \ddots & \vdots\\ 
0 & 0 & 0 & \dots & \lambda_n\\ 
\end{bmatrix}_\mathcal{B}
$$

Let $B$ be the matrix whose columns are the eigenvectors and $M_L$ be the matrix of $L$ with respect to the standard basis.  Then, following our discussion about change of basis for matrices, we have

$$M_\mathcal{B} = B^{-1} M_L B$$

Let's check that in numpy for a random example:


In [4]:
M = np.random.random((3,3)) 
#M = M + np.transpose(M)
    # The symmetrizing step is commented out here, so M is a generic (non-symmetric) random 3x3 matrix.
    # Without symmetry the eigenvalues need not be real:  numpy returns complex eigenvalues and eigenvectors when necessary.

eigenstuff = np.linalg.eig(M) 
    # eigenstuff is a tuple.  
    # The index 0 element of this tuple is a numpy array of the eigenvalues.
    # The index 1 element of the tuple is a numpy array whose columns are the eigenvectors.
    # Note that numpy normalizes the eigenvectors:  it always scales each eigenvector to have length 1.
B = eigenstuff[1]
D = np.diag(eigenstuff[0]) # this makes a diagonal matrix out of the array of eigenvalues.
Binv = np.linalg.inv(B)

print('This is M with respect to the standard basis \n', M, '\n')
print('This is Binverse M B \n', np.round(np.dot(Binv, np.dot(M, B)),6),'\n')
print('This is the diagonal matrix of eigenvalues \n', D, '\n')
print('They agree!')

This is M with respect to the standard basis 
 [[0.63738396 0.61657668 0.22829545]
 [0.39428364 0.59046933 0.35304557]
 [0.76772537 0.15497294 0.73634091]] 

This is Binverse M B 
 [[ 1.484107-0.j       -0.      +0.j       -0.      -0.j      ]
 [-0.      +0.j        0.240044+0.193042j  0.      -0.j      ]
 [-0.      -0.j        0.      +0.j        0.240044-0.193042j]] 

This is the diagonal matrix of eigenvalues 
 [[1.48410671+0.j        0.        +0.j        0.        +0.j       ]
 [0.        +0.j        0.24004375+0.1930425j 0.        +0.j       ]
 [0.        +0.j        0.        +0.j        0.24004375-0.1930425j]] 

They agree!


## Singular Value Decomposition

What if the matrix we are working with is not a square matrix?  Can we still find "nice" bases for the domain and codomain so that the linear transformation "looks like scaling the axes" with respect to these bases?  The answer is a resounding "yes"!  In fact, we obtain a result which is as nice as possible:

**Theorem**:  (Singular Value Decomposition = SVD) Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation.  Then there is an orthonormal basis $\mathcal{V} = \left( \vec{v}_1, \vec{v}_2, \vec{v}_3, \dots, \vec{v}_n \right)$ of $\mathbb{R}^n$ (called "right singular vectors"), an orthonormal basis $\mathcal{U} = \left( \vec{u}_1, \vec{u}_2, \vec{u}_3, \dots, \vec{u}_m \right)$ of $\mathbb{R}^m$  (called "left singular vectors"), and non-negative scalars $\sigma_1, \sigma_2, \sigma_3, \dots, \sigma_r$ with $r = \min(m, n)$ (called the "singular values") so that $L(\vec{v}_j) = \sigma_j \vec{u}_j$ for $j \leq r$, and $L(\vec{v}_j) = \vec{0}$ for any remaining $j > r$.  In other words, the matrix for $L$ with respect to the bases $\mathcal{V}$ and $\mathcal{U}$ only has non-zero entries on the "main diagonal" (the entries whose row and column indices are equal).

This is huge!  It means that, after appropriately **rotating/reflecting** the domain and codomain independently, **every** linear map can be thought of as just independent non-negative scaling of the axes!

A few notes and definitions:

* Consider a matrix $A$ whose columns form an orthonormal basis of $\mathbb{R}^n$.  We can think of $A^\top A$ as computing the dot products of the columns of $A$ with each other.  Since the columns are pairwise orthogonal, the dot product is $0$ unless we are dotting a column with itself (which lands us on the diagonal).  Since the columns have length $1$ we get $1$ in that case.  So $A^\top A = I$.  Conversely, if $A^\top A = I$, then the columns form an orthonormal basis.  We call such a matrix an **orthogonal** matrix.  Note that this means that, for orthogonal matrices only, the inverse is the same as the transpose!
* Using this notion of orthogonal matrices, we can rephrase the statement of the theorem as follows:

**Restatement of Theorem**: Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation.  Then there is an $n \times n$ orthogonal matrix $V$, an $m \times m$ orthogonal matrix $U$, and a $m \times n$ matrix $\Sigma$ which only has non-zero entries on the "main diagonal" so that

$$
\Sigma =  U^{-1} M_L V
$$

Usually this is rewritten as follows:

$$
\begin{align*}
\Sigma &=  U^{-1} M_L V\\
U \Sigma &= M_L V\\
U \Sigma V^{-1} &= M_L\\
M_L &= U \Sigma V^\top \textrm{ since $V^{-1} = V^\top$ for orthogonal matrices}
\end{align*}
$$

Note:  we usually order the bases so that $\sigma_i$ are in descending order.

We can compute these matrices using Numpy as follows:

In [5]:
M = np.random.random((3,2))

svdstuff = np.linalg.svd(M)
    #svdstuff is a tuple of arrays.
    #svdstuff[0] is the matrix U
    #svdstuff[1] is the array of singular values.  Note:  np.diag(svdstuff[1]) only gives a square diagonal matrix, so for a non-square M we build the (3,2) matrix called Sigma above by hand below.
    #svdstuff[2] is, unfortunately, the *transpose* of the matrix V instead of being the matrix V.  Such is life.

U = svdstuff[0]
Sigma = np.zeros((3,2)) #initializing Sigma as an array of zeros with the appropriate shape (3,2).
np.fill_diagonal(Sigma, svdstuff[1]) # Modify Sigma in place so that its diagonal entries equal the singular values stored in svdstuff[1]
V = np.transpose(svdstuff[2])

print('U = \n',U,'\n')
print('Sigma = \n',Sigma,'\n')
print('V = \n', V,'\n')
print('Hopefully these two are equal: \n \n M = \n', M, '\n \n U Sigma V-transpose = \n',np.dot(U, np.dot(Sigma, np.transpose(V))))


U = 
 [[-0.80586036 -0.29846487 -0.51137833]
 [-0.21918337 -0.65193721  0.72590394]
 [-0.55004339  0.69706284  0.45995182]] 

Sigma = 
 [[1.33579627 0.        ]
 [0.         0.17543092]
 [0.         0.        ]] 

V = 
 [[-0.87876235  0.4772596 ]
 [-0.4772596  -0.87876235]] 

Hopefully these two are equal: 
 
 M = 
 [[0.92096785 0.55976535]
 [0.20270369 0.24023813]
 [0.70402939 0.24320387]] 
 
 U Sigma V-transpose = 
 [[0.92096785 0.55976535]
 [0.20270369 0.24023813]
 [0.70402939 0.24320387]]
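
Beyond reconstructing $M$, the individual claims of the theorem can be checked directly.  This sketch uses a fixed $3 \times 2$ matrix (the $M_L$ from the change of basis example, reused here only for reproducibility) and verifies orthonormality of both bases and the relation $L(\vec{v}_j) = \sigma_j \vec{u}_j$.

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [0.0, -1.0],
              [-1.0, 1.0]])

U, s, Vt = np.linalg.svd(M)    # U is 3x3, s holds 2 singular values, Vt is V transposed
V = Vt.T

# Both bases are orthonormal.
print(np.allclose(U.T @ U, np.eye(3)), np.allclose(V.T @ V, np.eye(2)))

# M v_j = sigma_j * u_j for each singular value.
for j in range(len(s)):
    print(np.allclose(M @ V[:, j], s[j] * U[:, j]))

# Singular values are non-negative and sorted in descending order.
print(s)
```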


We will see that the Singular Value Decomposition is useful in data science for feature extraction.  When $X$ is an $N \times k$ matrix recording $N$ observations of $k$ different variables (say, $10000$ rows each representing a person, with $5$ columns representing height, weight, blood pressure, etc.), the right singular vectors of $X$ (the columns of $V$, which live in $\mathbb{R}^k$) define "new features" which are linear combinations of the old features, but which are orthogonal.  The right singular vectors with the largest singular values will be "more relevant" than those with smaller singular values.  If we have a ton of features it is nice to be able to use SVD to do feature reduction in this way.
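
A rough sketch of what such a feature reduction might look like follows.  Everything specific here is an assumption made for illustration: the synthetic data, the sizes `N`, `k`, `r`, the random seed, and the choice to center the columns before taking the SVD.  The new features are obtained by multiplying $X$ by the first `r` columns of $V$.

```python
import numpy as np

rng = np.random.default_rng(0)    # arbitrary seed
N, k, r = 200, 5, 2               # made-up sizes: 200 observations, 5 features, keep 2

# Synthetic data whose 5 columns are noisy mixtures of 2 hidden variables.
latent = rng.normal(size=(N, r))
X = latent @ rng.normal(size=(r, k)) + 0.01 * rng.normal(size=(N, k))
X = X - X.mean(axis=0)            # centering is an assumption of this sketch

U, s, Vt = np.linalg.svd(X, full_matrices=False)
print(s)                          # singular values, in descending order

# New features: coordinates of each observation along the first r columns of V.
X_reduced = X @ Vt[:r].T
print(X_reduced.shape)

# The new feature columns are mutually orthogonal, with squared lengths sigma_j^2.
gram = X_reduced.T @ X_reduced
print(np.allclose(gram, np.diag(s[:r] ** 2)))
```

Passing `full_matrices=False` keeps `U` at shape `(N, k)` instead of `(N, N)`, which matters when the number of observations is large.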

# Exercise Solutions

> **Exercise 1**:  Find the coordinates of $\begin{bmatrix} 1 \\ 2 \\ 3\end{bmatrix}$ with respect to the basis 
> 
> $$\mathcal{B} = \left( \begin{bmatrix} 1 \\ 1 \\ 1\end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 0\end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -2\end{bmatrix} \right)$$

In [6]:
import numpy as np
v = np.array([[1],[2],[3]])
M = np.array([[1,1,0],[1,-1,1],[1,0,-2]])
Minv = np.linalg.inv(M)
v_new = np.dot(Minv, v)
print('The coordinates of \n', v, '\n with respect to the columns of \n', M, '\n are \n', v_new )

The coordinates of 
 [[1]
 [2]
 [3]] 
 with respect to the columns of 
 [[ 1  1  0]
 [ 1 -1  1]
 [ 1  0 -2]] 
 are 
 [[ 1.8]
 [-0.8]
 [-0.6]]


You can check that 

$$1.8\begin{bmatrix} 1 \\ 1 \\ 1\end{bmatrix} - 0.8\begin{bmatrix} 1 \\ -1 \\ 0\end{bmatrix} - 0.6\begin{bmatrix} 0 \\ 1 \\ -2\end{bmatrix} =  \begin{bmatrix} 1 \\ 2 \\ 3\end{bmatrix}$$

So we could write

$$
\begin{bmatrix} 1 \\ 2 \\ 3\end{bmatrix} = \begin{bmatrix} 1.8 \\ -0.8 \\ -0.6\end{bmatrix}_\mathcal{B}
$$