In [9]:
# To start, we're using slightly different libraries than usual
# Jupyter's inline plots don't seem to be ready for 3D yet
import numpy as np
import matplotlib
import matplotlib.pylab as plt
from mpl_toolkits import mplot3d

# Vectors

A **vector** is a geometric entity with a **magnitude** and a **direction**. The prototypical example from physics is the position vector, which describes the location of a point relative to the origin of a coordinate system. Let's look at some vectors.

In [10]:
# A 1-dimensional vector (a point on the number line)
vec1d = np.array([2.1])

# A 2-dimensional vector
vec2d = np.array([2.1,3.6])

# A 3-dimensional vector 
vec3d = np.array([2.1,3.6,1.7])
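
To connect these arrays back to the definition, here is a minimal sketch that recovers the magnitude and direction of the 2D vector. The names `magnitude`, `direction`, and `angle_deg` are illustrative and not part of the original notebook.

```python
import numpy as np

vec2d = np.array([2.1, 3.6])

# Magnitude: the Euclidean length of the vector
magnitude = np.linalg.norm(vec2d)

# Direction: a unit vector pointing the same way
direction = vec2d / magnitude

# In 2D the direction can also be expressed as an angle from the x-axis
angle_deg = np.degrees(np.arctan2(vec2d[1], vec2d[0]))

print(magnitude)
print(direction)
print(angle_deg)
```

Multiplying `magnitude * direction` gives back the original vector, which is one way to see that the two pieces together carry all of the information.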

In [11]:
# Plot our three vectors.

# We want our arrows to originate from the origin
x_origin = np.array([0])
y_origin = np.array([0])
z_origin = np.array([0])

# matplotlib provides two different interfaces for plotting. In previous examples,
# we have used matplotlib.pylab.plot(), which is simpler but less flexible.
# In this more complicated example, we will use the "object-oriented" interface.

fig = plt.figure()   # Q: What is the type of fig?

# ------------------------------------------------------------------------
# 1D plot ----------------------------------------------------------------
ax = fig.add_subplot(1,3,1)  # Q: What is the type of ax?

ax.quiver(x_origin,y_origin,     # Origin of vector
          vec1d[0],0.,           # Components of vector
          angles="xy", scale_units="xy", scale=1.,  # Scaling rules
          headlength = 10, headwidth = 10           # Arrowhead rules
         )
ax.set_xlim([-.1,3.])
ax.set_title("1D vector")
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.grid(True)

# ------------------------------------------------------------------------
# 2D plot ----------------------------------------------------------------
ax = fig.add_subplot(1,3,2)
ax.quiver(x_origin,y_origin,
          vec2d[0],vec2d[1],
          angles="xy", scale_units="xy", scale=1.,
          headlength = 10, headwidth = 10
          )
ax.set_xlim([-.1,3.])
ax.set_ylim([-.1,4.])
ax.set_title("2D vector")
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.grid(True)

# ------------------------------------------------------------------------
# 3D plot ----------------------------------------------------------------
ax = plt.subplot(1,3,3,projection='3d')  # Q: What is the type of ax? Different than above!
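ax.quiver(x_origin,y_origin,z_origin,     # Origin of vector
          vec3d[0],vec3d[1],vec3d[2],     # Components of vector
          pivot='tail',                   # Anchor the tail (not the tip) at the origin
          arrow_length_ratio=.2           # Arrowhead size relative to arrow length
         )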
ax.set_xlim3d([0.,4])
ax.set_ylim3d([0.,4])
ax.set_zlim3d([0.,4])
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
ax.set_title('3D Vector')
plt.grid(True)

plt.show()

# Basic equation of linear algebra

The basic equation upon which much of linear algebra is built is

$$ Ax = b$$

where $A$ is a matrix, $x$ is the vector of unknowns, and $b$ is a column vector (often called the "right hand side"). A few different situations may arise. Let's look at them through an example.

## Situation I: $n \times n$, inverse exists

Suppose you work for a telecom company. Your company offers two tiers of bandwidth service, one for "ordinary users" and one for "power users." The pricing scheme is as follows:
 
<table style="width:50%">
  <tr>
    <td><b>Tier</b></td>
    <td><b>Price</b></td> 
    <td><b>Bandwidth</b></td>
  </tr>
  <tr>
    <td>A</td>
    <td>RM 54</td> 
    <td>3 GB</td>
  </tr>
  <tr>
    <td>B</td>
    <td>RM 84</td> 
    <td>7 GB</td>
  </tr>
</table>

You have been charged with determining the number of customers in each tier to target for marketing. The technology department tells you they have $B=12{,}000$ GB of bandwidth available, and finance tells you the company is aiming for $R=\mathrm{RM}\ 155{,}000$ in revenue. You formulate the following matrix equation:

<br>
<center>
 $ \begin{bmatrix}
54 & 84 \\
3 & 7 
\end{bmatrix} 
\begin{bmatrix}
N_A \\ N_B
\end{bmatrix}
=
\begin{bmatrix}
155000 \\ 12000
\end{bmatrix}
$
</center>

where $N_A$ and $N_B$ are the number of plans sold at tier A and B respectively. 

Happily, $A$ is a **square matrix**, one which has the same number of rows as it has columns. If a matrix is square, it might have an inverse. If the inverse exists, we can solve for the vector of unknowns using the formula $x = A^{-1}b$:

In [12]:
A = np.matrix([[54.,84.],[3.,7.]])
b = np.matrix([[155000.],[12000.]])
x = np.linalg.inv(A)*b
print(x)

[[  611.11111111]
 [ 1452.38095238]]


With this you conclude that the company needs to sell roughly $N_A \approx 611$ tier A plans and $N_B \approx 1452$ tier B plans (rounding to the nearest whole plan).
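
As an aside, forming the inverse explicitly is not the only route. The sketch below uses `np.linalg.solve`, which solves $Ax=b$ directly, and then checks what the rounded plan counts actually deliver. It uses plain arrays and `.dot` rather than `np.matrix`; that choice, and the variable names `x_rounded` and `achieved`, are assumptions made for this illustration.

```python
import numpy as np

A = np.array([[54., 84.],
              [3.,  7.]])
b = np.array([155000., 12000.])

# Solve A x = b without explicitly computing the inverse
x = np.linalg.solve(A, b)
print(x)

# Confirm that the solution reproduces the targets
print(np.allclose(A.dot(x), b))

# Plans are sold in whole numbers, so see what the rounded counts achieve
x_rounded = np.round(x)
achieved = A.dot(x_rounded)
print(x_rounded)
print(achieved)   # revenue and bandwidth for the rounded plan counts
```

The rounded counts land close to, but not exactly on, the revenue and bandwidth targets, which is the price of requiring whole customers.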

**Question:** What happens if finance asks for a revenue of $R=\mathrm{RM}\ 250{,}000$ instead? Edit the matrix equation above and interpret the result.


Let's take a look at the inverse matrix:

In [13]:
print(np.linalg.inv(A))

[[ 0.05555556 -0.66666667]
 [-0.02380952  0.42857143]]


We see that $A^{-1}$ is a matrix with the same dimensions as $A$. When we multiply $A$ times its own inverse, we get the identity matrix: 

In [14]:
print(np.linalg.inv(A)*A)

[[ 1.  0.]
 [ 0.  1.]]


Recall that the identity matrix $I$ when multiplied by any vector $x$ gives back $x$ as the result. This helps us understand how the inverse matrix works. Consider the original equation:
$$
Ax = b
$$

Let's *left-multiply* both sides of the equation by $A^{-1}$. Note that in matrix algebra, left-multiplying and right-multiplying are different operations!

$$
A^{-1} A x = A^{-1} b
$$

$$
I x = A^{-1} b
$$

$$
x = A^{-1} b
$$

The operation of left-multiplying by $A^{-1}$ is analogous to dividing both sides by $A$ in an ordinary algebraic equation.
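
To see why the side of the multiplication matters, here is a small sketch. The matrix `C` is a hypothetical example chosen for this illustration (it is not part of the pricing problem); it simply swaps two entries.

```python
import numpy as np

A = np.array([[54., 84.],
              [3.,  7.]])

# Hypothetical second matrix: a permutation that swaps two entries
C = np.array([[0., 1.],
              [1., 0.]])

right = A.dot(C)   # right-multiplying by C swaps the columns of A
left = C.dot(A)    # left-multiplying by C swaps the rows of A
print(right)
print(left)
print(np.allclose(right, left))   # the two products differ

# Left-multiplying A by its inverse gives the identity (up to round-off)
A_inv = np.linalg.inv(A)
print(np.allclose(A_inv.dot(A), np.eye(2)))
```

Because the order matters in general, the derivation above is careful to apply $A^{-1}$ on the left of both sides.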

## Situation II: $n \times n$, inverse does not exist

The marketing team comes back with a brilliant idea. Instead of making customers buy the larger plan to get the better rate, why not let them buy the smaller plan at the same per-GB rate? Let's reduce the price of tier A, they say, to RM 36 for 3 GB, so that whichever tier a customer chooses, the price is RM 12/GB.

<table style="width:50%">
  <tr>
    <td><b>Tier</b></td>
    <td><b>Price</b></td> 
    <td><b>Bandwidth</b></td>
  </tr>
  <tr>
    <td>A</td>
    <td>RM 36</td> 
    <td>3 GB</td>
  </tr>
  <tr>
    <td>B</td>
    <td>RM 84</td> 
    <td>7 GB</td>
  </tr>
</table>

Again you are charged with figuring out how many of each tier you must sell. You formulate your matrix equation again:

<br>
<center>
 $ \begin{bmatrix}
36 & 84 \\
3 & 7 
\end{bmatrix} 
\begin{bmatrix}
N_A \\ N_B
\end{bmatrix}
=
\begin{bmatrix}
155000 \\ 12000
\end{bmatrix}
$
</center>

No problem. Same as before, right?

In [17]:
A = np.matrix([[36.,84.],[3.,7.]])
b = np.matrix([[155000.],[12000.]])
x = np.linalg.inv(A)*b
print(x)

LinAlgError: Singular matrix

*Wrong!* The inverse of this matrix does not exist! A matrix that has no inverse is called a **singular matrix**.
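
One common way to anticipate this failure is to look at the determinant, which is zero for a singular matrix. The sketch below compares the two pricing matrices; the names `A_I`, `A_II`, `det_I`, and `det_II` are introduced here for illustration only.

```python
import numpy as np

A_I = np.array([[54., 84.],
                [3.,  7.]])    # Situation I pricing
A_II = np.array([[36., 84.],
                 [3.,  7.]])   # Situation II pricing

# For a 2x2 matrix [[a, b], [c, d]] the determinant is a*d - b*c
det_I = np.linalg.det(A_I)     # 54*7 - 84*3
det_II = np.linalg.det(A_II)   # 36*7 - 84*3

print(det_I)
print(det_II)

# Compare against zero with a tolerance, since round-off can leave a tiny residue
print(np.isclose(det_II, 0.0, atol=1e-9))
```

In floating-point work a determinant that is merely very small can also signal trouble, so treat this as a quick diagnostic rather than a rigorous test.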

To understand what's happening, let's depict Situations I and II graphically. It will be helpful to rewrite the matrix equation as follows:

<br>
<center>
 $ \begin{bmatrix}
54\\
3 
\end{bmatrix} 
N_A +
\begin{bmatrix}
84 \\ 7
\end{bmatrix}
N_B
=
\begin{bmatrix}
155000 \\ 12000
\end{bmatrix}
$
</center>

In this form, we see that the matrix-vector product is a **linear combination** of the columns of $A$.

**Checkpoint:** Take a moment to convince yourself that this is equivalent to the matrix form above. 
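
A quick numerical version of this checkpoint, as a sketch: compute the product both ways for the Situation I solution and compare. The names `matrix_form` and `column_form` are illustrative.

```python
import numpy as np

A = np.array([[54., 84.],
              [3.,  7.]])
N_A, N_B = 611.11111111, 1452.38095238   # solution found in Situation I
x = np.array([N_A, N_B])

# Matrix form: the usual matrix-vector product
matrix_form = A.dot(x)

# Column form: each column of A scaled by the matching unknown, then summed
column_form = A[:, 0] * N_A + A[:, 1] * N_B

print(matrix_form)
print(column_form)
print(np.allclose(matrix_form, column_form))
```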

In [None]:
# Left plot
vecA = np.array([54.,3.])*611.11111111
vecB = np.array([84.,7.])*1452.38095238
plt.subplot(1,2,1)
plt.quiver(0., 0., vecA[0], vecA[1],          
          angles="xy", scale_units="xy", scale=1.,  # Scaling rules
          headlength = 10, headwidth = 10)          # Arrowhead rules
plt.quiver(vecA[0], vecA[1], vecB[0], vecB[1],          
          angles="xy", scale_units="xy", scale=1.,  # Scaling rules
          headlength = 10, headwidth = 10)          # Arrowhead rules
plt.plot(155000,12000,'ro')
plt.title("Situation I, inverse exists")
plt.xlabel('Revenue')
plt.ylabel('Bandwidth')
plt.xlim([0,160000])
plt.ylim([0,20000])
plt.grid(True)
plt.xticks(rotation=35)
plt.text(158000,11700,"Target reached")

# Right plot
plt.subplot(1,2,2)
vecA = np.array([36.,3.])*1000
vecB = np.array([84.,7.])*1500
plt.quiver(0., 0., vecA[0], vecA[1],          
          angles="xy", scale_units="xy", scale=1.,  # Scaling rules
          headlength = 10, headwidth = 10)          # Arrowhead rules
plt.quiver(vecA[0], vecA[1], vecB[0], vecB[1],          
          angles="xy", scale_units="xy", scale=1.,  # Scaling rules
          headlength = 10, headwidth = 10)          # Arrowhead rules
plt.plot(155000,12000,'ro')
plt.title("Situation II, singular matrix")
plt.xlabel('Revenue')
plt.ylabel('Bandwidth')
plt.xlim([0,160000])
plt.ylim([0,20000])
plt.grid(True)
plt.xticks(rotation=35)
plt.text(158000,11700,"Target unreachable!")

plt.subplots_adjust(wspace=.4)
plt.show()

In Situation I, we have two non-parallel vectors. By constructing an appropriate linear combination of these vectors, we are able to achieve the target (red circle). 

In Situation II, on the other hand, the two columns of $A$ are parallel: $[36, 3]^T = \tfrac{3}{7}\,[84, 7]^T$. No matter how we combine them, we cannot reach any point off the straight line along which both vectors lie!

We refer to the set of points that can be reached by linear combinations of a matrix's columns as the **span** of the matrix (more formally, its **column space**). In Situation I, the span is the entire plane, whereas in Situation II the span is only the points along one straight line through the origin.
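
The dimension of the span is called the rank, and NumPy can estimate it with `np.linalg.matrix_rank`. The following sketch compares the two situations; the variable names are introduced here for illustration.

```python
import numpy as np

A_I = np.array([[54., 84.],
                [3.,  7.]])    # Situation I: columns are not parallel
A_II = np.array([[36., 84.],
                 [3.,  7.]])   # Situation II: columns are parallel

rank_I = np.linalg.matrix_rank(A_I)    # dimension of the span in Situation I
rank_II = np.linalg.matrix_rank(A_II)  # dimension of the span in Situation II

print(rank_I)
print(rank_II)
```

A rank equal to the number of rows means every target in the plane is reachable; a lower rank means the reachable targets are confined to a line.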

So, unfortunately, the demands of your company cannot be met exactly. Instead, let's get as close as we can. The inverse doesn't exist, so we'll use the **pseudo-inverse** in its place.

In [None]:
A = np.matrix([[36.,84.],[3.,7.]])
b = np.matrix([[155000.],[12000.]])
x = np.linalg.pinv(A)*b
print(x)

Does this $x$ solve our equation $Ax=b$? No, it does not; no exact solution exists. If one did, the two vectors printed below would be equal:

In [None]:
b_approximate = A*x
print(b_approximate)
print(b)

Let's plot this result on top of our vector diagram to see what happened:

In [None]:
# Left plot
vecA = np.array([54.,3.])*611.11111111
vecB = np.array([84.,7.])*1452.38095238
plt.subplot(1,2,1)
plt.quiver(0., 0., vecA[0], vecA[1],          
          angles="xy", scale_units="xy", scale=1.,  # Scaling rules
          headlength = 10, headwidth = 10)          # Arrowhead rules
plt.quiver(vecA[0], vecA[1], vecB[0], vecB[1],          
          angles="xy", scale_units="xy", scale=1.,  # Scaling rules
          headlength = 10, headwidth = 10)          # Arrowhead rules
plt.plot(155000,12000,'ro')
plt.title("Situation I, inverse exists")
plt.xlabel('Revenue')
plt.ylabel('Bandwidth')
plt.xlim([0,160000])
plt.ylim([0,20000])
plt.grid(True)
plt.xticks(rotation=35)
plt.text(158000,11700,"Target reached")

# Right plot
plt.subplot(1,2,2)
vecA = np.array([36.,3.])*668
vecB = np.array([84.,7.])*1559
plt.quiver(0., 0., vecA[0], vecA[1],          
          angles="xy", scale_units="xy", scale=1.,  # Scaling rules
          headlength = 10, headwidth = 10)          # Arrowhead rules
plt.quiver(vecA[0], vecA[1], vecB[0], vecB[1],          
          angles="xy", scale_units="xy", scale=1.,  # Scaling rules
          headlength = 10, headwidth = 10)          # Arrowhead rules
plt.plot(b_approximate[0,0],b_approximate[1,0],'go')  # Scalar entries of the 2x1 matrix
plt.plot(155000.,12000.,'ro')
plt.title("Situation II, singular matrix")
plt.xlabel('Revenue')
plt.ylabel('Bandwidth')
plt.xlim([0,160000])
plt.ylim([0,20000])
plt.grid(True)
plt.xticks(rotation=35)
plt.text(150000,13500,"Nearest approximation")
plt.text(158000,11700,"Target unreachable!")
plt.show()

Actually, this plot is deceptive because the two axes are not drawn to the same scale. Let's make the scales equal and zoom in on the interesting part:

In [None]:
# Zoomed view of the right plot
vecA = np.array([36.,3.])*668

# Pull the scalar coordinates out of the 2x1 matrix b_approximate
bx, by = b_approximate[0,0], b_approximate[1,0]

plt.plot(bx,by,'go')
plt.plot(155000.,12000.,'ro')
plt.title(r"$b_\mathrm{approximate}$ is the closest point to $b$ on the line")
plt.xlabel('Revenue')
plt.ylabel('Bandwidth')
plt.grid(True)
plt.xticks(rotation=35)
plt.text(bx,by,r"$b_{\mathrm{approximate}}$")
plt.text(155000,11700,r"$b$ (Target)")

plt.plot([0.,vecA[0]*6.7],[0.,vecA[1]*6.7],'k--')  # Extend the line along the columns of A
plt.plot([155000,bx],[12000,by],'k--')             # Segment from b to b_approximate

plt.axis('equal')            # Same scale on both axes
plt.xlim([150000,160000])    # Zoom in around the target
plt.ylim([10000,15000])
plt.show()

We see that the line connecting $b_\mathrm{approximate}$ to $b$ is *perpendicular* to the span of matrix $A$. 
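
The perpendicularity can also be checked numerically rather than by eye. In the sketch below, `residual` (a name introduced here) is the vector from $b_\mathrm{approximate}$ to $b$; its dot product with each column of $A$ should vanish up to round-off. Plain arrays are used instead of `np.matrix` for this illustration.

```python
import numpy as np

A = np.array([[36., 84.],
              [3.,  7.]])
b = np.array([155000., 12000.])

# Pseudo-inverse solution and the point it actually reaches
x = np.linalg.pinv(A).dot(b)
b_approximate = A.dot(x)

# Vector from the reached point to the target
residual = b - b_approximate

# Dot product of the residual with each column of A
dots = A.T.dot(residual)
print(residual)
print(dots)   # both entries should be zero up to round-off
```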

**Question:** Why should this be the case?

## Exercise

1. Construct a scenario in which there are three tiers instead of just two. Are you able to reach the target goals for revenue and bandwidth? 
2. Construct a scenario in which there are three targets instead of just two. For instance, suppose each plan is also associated with a certain number of minutes of calling time. Select a number of minutes for tiers A and B (from Situation I) and also a target. Can you reach this goal with two tiers?
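
As a possible starting point for both exercises, here is a sketch that sets up a non-square system and applies the pseudo-inverse. All of the new numbers are made up for illustration: tier C (RM 120 for 12 GB), the calling minutes per plan (100 and 300), and the minutes target (400,000) do not come from the text, so substitute your own.

```python
import numpy as np

# Exercise 1 sketch: three tiers, two targets (a 2x3 system).
# Tier C (RM 120, 12 GB) is a hypothetical addition.
A_wide = np.array([[54., 84., 120.],
                   [3.,  7.,  12.]])
b_two = np.array([155000., 12000.])
x_wide = np.linalg.pinv(A_wide).dot(b_two)
print(x_wide)
print(A_wide.dot(x_wide))   # compare with b_two

# Exercise 2 sketch: two tiers, three targets (a 3x2 system).
# The third row (minutes per plan) and third target are hypothetical.
A_tall = np.array([[54.,  84.],
                   [3.,   7.],
                   [100., 300.]])
b_three = np.array([155000., 12000., 400000.])
x_tall = np.linalg.pinv(A_tall).dot(b_three)
print(x_tall)
print(A_tall.dot(x_tall))   # compare with b_three
```

Compare each product with its target vector, and check whether the plan counts are physically sensible (for example, non-negative), before drawing conclusions.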