### Scalars and Vectors
Scalars are entities that can be measured using a single real number. Examples are height, weight, temperature, etc.
An entity which has both magnitude and direction is called a vector. Force, velocity, movement in a stock price, etc. can all be represented as vectors because, besides the quantity, the direction in which they act is crucial for their complete definition.

Vector algebra in its modern form dates from the late 19th and early 20th centuries, but the underlying concept had been around for hundreds of years before that. Geometrically, a vector is shown as an arrow connecting two points, with the direction of the arrow representing the direction of the vector.

![vector.png](attachment:vector.png)

Vectors with the same length and direction are said to be equal, meaning that merely being at a different position in a coordinate system does not change a vector. 

A vector whose starting and ending points coincide is called a zero vector; it has zero length and no particular direction, so by convention it may be assigned any direction.

##### Why are vectors useful?

Vectors can be used to represent entities and model various phenomena, physical or logical.  
Here are some examples:  
1) In physics, 4-dimensional vectors can be used to denote points in spacetime, where the first 3 components are the 3-d coordinates of space and the 4th component is time.  
2) In machine learning, we typically represent features in the data as vectors. For example, in a dataset containing height and weight details for men and women, each data point is considered a vector with height and weight as its components.  
3) In aerodynamics, quantities like velocity, lift and drag are all treated as vectors, as their directions are as important as their magnitudes.  
  
These are just a few examples, but the point to drive home is that when we start to model a phenomenon, we start by identifying the components within that system, and vectors are a popular and convenient way of thinking about those components. Once we have identified the vectors, we have the whole of vector algebra, which has evolved over centuries, at our disposal to study the interactions between these components.

While the mathematical notation for vectors is more elaborate, we can represent a vector in numpy using a one-dimensional array. When we represent vectors programmatically, we assume that all vectors originate from the origin and hence can be represented by their end coordinates alone.

[fig-vector_representation]

In [1]:
import numpy as np
vector_2d = np.array([1,1]) 
vector_3d = np.array([1,1,1]) 

##### Length of a vector

In [2]:
u = np.array([1,1])
v = np.array([2,1])

[fig-length]

In [3]:
len_u = np.sqrt(np.sum(np.square(u)))
len_v = np.sqrt(np.sum(np.square(v)))
print('||u|| :'+str(len_u))
print('||v|| :'+str(len_v))

||u|| :1.41421356237
||v|| :2.2360679775
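The length (Euclidean norm) of a vector is the square root of the sum of its squared components. As a small sketch, the same value can also be obtained with NumPy's built-in `np.linalg.norm`; the variable names below are chosen only for this illustration.

```python
import numpy as np

u = np.array([1, 1])
v = np.array([2, 1])

# Length from the definition: square root of the sum of squared components
len_u_manual = np.sqrt(np.sum(np.square(u)))

# The same quantity using NumPy's built-in Euclidean norm
len_u_builtin = np.linalg.norm(u)
len_v_builtin = np.linalg.norm(v)

# Both approaches should agree up to floating-point rounding
print(np.isclose(len_u_manual, len_u_builtin))
```

Using `np.linalg.norm` is usually the more readable choice; the manual version is shown only to make the definition explicit.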


##### Algebraic properties

Operations describe how variables interact. Addition, subtraction, multiplication, etc. are common operations that we first learn to apply to real numbers and later to algebraic variables. The nature of these operators can be described by the properties they exhibit. Let's look at some basic properties and how they work on real numbers:  
**Associativity:** Allows you to group the variables differently without changing the result. For example:  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2 + (3 + 4) =  (2 + 3) + 4  or  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2 x (3 x 4)  = (2 x 3) x 4  
So we say that + and x are associative.    
However, 2 - (3 - 2) != (2 - 3) - 2 and 1/(2/3) != (1/2)/3, therefore subtraction and division are not associative.  
**Commutativity:** This property allows you to interchange the order of the variables without changing the result. For example:  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2+3=3+2 and  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2x3=3x2  
but 2-3 != 3-2 and 2/3 != 3/2. So again, addition and multiplication are commutative while subtraction and division are not.  
**Identity and Inverse:** The identity property means there is a member such that applying the operator to it and any variable leaves the variable unchanged. For example, 0 is the identity under addition while 1 is the identity under multiplication. The inverse property means that for each member there is another member such that applying the operator to the pair produces the identity. For example, under addition, for every x, adding -x results in 0, which is the identity.  
As an exercise, think about other operators, such as exponentiation, in terms of the above properties.  
Many more properties can be used to characterize operators, such as distributivity and closure.

#### Algebraic Operations on vectors

##### Addition

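If ***u*** and ***v*** are two vectors, and ***v*** is positioned so that its initial point is at the terminal point of ***u***, then the sum ***u*** + ***v*** is represented by the arrow from the initial point of ***u*** to the terminal point of ***v***.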

![vector_sum.png](attachment:vector_sum.png)

In [18]:
u = np.array([1,1])
v = np.array([2,1])
print(u+v)

[3 2]


Do note that the vectors have to be of the same dimension for this addition to make sense. Let's try the addition when the vectors are of different dimensions.

In [19]:
a = np.array([1,1,1])
b = np.array([2,1])
print(a+b)

ValueError: operands could not be broadcast together with shapes (3,) (2,) 

Vector addition is both associative and commutative.
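As a quick numerical sanity check (not a proof), both properties can be verified on a few sample vectors. The third vector `w` is an arbitrary choice introduced only for this sketch.

```python
import numpy as np

u = np.array([1, 1])
v = np.array([2, 1])
w = np.array([0, 3])  # arbitrary third vector, assumed for illustration

# Commutativity: the order of the operands does not matter
commutative = np.array_equal(u + v, v + u)

# Associativity: the grouping of the operands does not matter
associative = np.array_equal((u + v) + w, u + (v + w))

print(commutative, associative)
```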

##### Scalar Multiplication

We can multiply vectors by scalars, which scales the magnitude of the vectors. Scaling by a negative scalar also reverses the direction.

[fig-vector scaling]

In [20]:
u = np.array([1,1])
print(3*u)

[3 3]


Scalar-vector multiplication is distributive over scalar addition:

(a+b)**u** = a**u** + b**u**

In [21]:
print(5*u)
print(3*u+2*u)

[5 5]
[5 5]

### Dot Product
The dot product is a very important operator for vectors. It is also called the inner product. It is denoted by the '**.**' symbol.  
Angle between two vectors:

![dot.png](attachment:dot.png)

The dot product is related to the angle $\theta$ between the two vectors **u** and **v** by:
$$ u.v = ||u||\ ||v|| \ \cos \theta$$
From the above equation, if theta is 90 degrees, i.e. if **u** and **v** are perpendicular, then cos 90° = 0 and hence the dot product is also 0. To reiterate:
$$ u.v = 0 \ if \ \theta=90^{\circ}$$
Fundamentally, the dot product is a projection: **u.v** is the length of the projection of **u** along **v**, multiplied by the length of **v**.

![dot.png](attachment:dot.png)

Dot product in rectangular coordinates:  
Let the components of **u** and **v** along the orthogonal unit basis vectors **i** and **j** be (ux, uy) and (vx, vy) respectively. The dot product can then be written as:  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**u . v** = ( ux **i** + uy **j** ).( vx **i** + vy **j** )  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= ux vx (**i . i**) + ux vy (**i . j**) + uy vx (**j . i**) + uy vy (**j . j**)  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= ux vx + uy vy  
since **i . i** = **j . j** = 1 and **i . j** = **j . i** = 0. That is, the dot product of two vectors is the sum of the products of their corresponding components.  
An important thing to keep in mind is that vectors can exist without any coordinate system; the above representation is just a convenience in certain scenarios.


The dot product of two vectors produces a scalar. What the dot product gives you can be thought of in various ways: as a measure of similarity between two vectors, or as a measure of how much one vector contributes in the direction of another.  
The dot product is present everywhere in physics and computing, either directly or indirectly, for example in matrix multiplication. Wikipedia has some pointers to uses:
https://en.wikipedia.org/wiki/Dot_product#Physics

In [42]:
u=np.array([1,2,3])
v=np.array([1,2,3])
z=u.dot(v)
print(z)

14
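As a small illustration of the relation between the dot product and the angle, the angle can be recovered by dividing the dot product by the product of the two lengths. The vectors `p` and `q` below, and the use of `np.clip` to guard against rounding, are choices made for this sketch.

```python
import numpy as np

p = np.array([1, 0])
q = np.array([1, 1])

# cos(theta) = (p . q) / (||p|| ||q||)
cos_theta = p.dot(q) / (np.linalg.norm(p) * np.linalg.norm(q))

# Clip guards against tiny floating-point overshoot outside [-1, 1]
theta_deg = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
print(theta_deg)

# The component form gives the same dot product: sum of products of components
component_form = np.sum(p * q)
print(component_form == p.dot(q))
```

Here `q` lies along the diagonal of the first quadrant, so the angle to the x-axis direction `p` should come out at roughly 45 degrees.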


##### Unit (Normalized) Vectors
Vectors of length 1 are called unit vectors. They are also known as normalized vectors. Any vector can be normalized by dividing its individual components by its length.  
$$ N = \frac{X}{||X||} $$

In [34]:
print(u/np.linalg.norm(u))
# Sum of squared components (the squared length); it is 1 when the length is 1
print("Length after normalization: ",0.26726124**2+0.53452248**2+0.80178373**2)

[ 0.26726124  0.53452248  0.80178373]
Length after normalization:  1.0000000017244008


In [35]:
### Dot product for identical vectors
u=np.array([1,2,3])
v=np.array([1,2,3])
u_norm = u/np.linalg.norm(u)
v_norm = v/np.linalg.norm(v)
print(u_norm.dot(v_norm))

1.0


In [40]:
### Dot product for different vectors
u=np.array([1,2,3])
v=np.array([4,9,1])
u_norm = u/np.linalg.norm(u)
v_norm = v/np.linalg.norm(v)
print(u_norm.dot(v_norm))

0.674936558945


In [41]:
### Dot product for perpendicular vectors
u=np.array([1,0,0])
v=np.array([0,1,0])
u_norm = u/np.linalg.norm(u)
v_norm = v/np.linalg.norm(v)
print(u_norm.dot(v_norm))

0.0


### Linear Combinations

Let **v1,v2,v3....vn** be vectors. A linear combination of these vectors is defined as 

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a1**v1** + a2**v2** + a3**v3** + ..... + an**vn**

where a1,a2,a3....an are scalars.

Example: One Linear combination of [2,3,4] and [3,4,5] is

5 x [2,3,4] + 2 x [3,4,5]

= [10,15,20] + [6,8,10]

= [16,23,30]

Example 2: Portfolio  
Assume that you hold a units of stock1 and b units of stock2, where stock1 and stock2 stand for the price of each stock. Then the total value of your portfolio will be:  
a x stock1 + b x stock2
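The two examples above can be sketched in NumPy as follows. The first part reproduces the worked example with [2,3,4] and [3,4,5]; the quantities and prices in the portfolio part are made-up numbers used only for illustration.

```python
import numpy as np

v1 = np.array([2, 3, 4])
v2 = np.array([3, 4, 5])

# Linear combination with coefficients 5 and 2
combo = 5 * v1 + 2 * v2
print(combo)

# Hypothetical portfolio: quantities held and price per unit of each stock
quantities = np.array([10, 4])
prices = np.array([25.0, 50.0])

# Total value is a linear combination of the prices, written as a dot product
portfolio_value = quantities.dot(prices)
print(portfolio_value)
```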

### Linear Equations
A linear equation is an equation of the form  
$$ a_1x_1+a_2x_2+a_3x_3+.....+a_nx_n = c $$
where $x_1, x_2, ..., x_n$ are the n variables and $a_1, a_2, ..., a_n$ and $c$ are known constants.  
A solution of the equation is an ordered list of numbers s1,s2,s3...sn that satisfies the equation when substituted for x1,x2,x3...xn.  

##### System of linear equations
With two variables, the graph of a linear equation is a line. A system of linear equations then corresponds to multiple lines, and an interesting question to ask is where the lines intersect. In other words, which solutions satisfy all the linear equations in the system? There are 3 possible answers to this question:  
1) The lines are parallel .. they do not intersect and thus the system has no solution. Such systems are called inconsistent.  
2) The lines are the same .. there are infinitely many solutions  
3) The lines intersect at exactly one point .. there is exactly one solution  

### Linear Transformations
A linear transformation, T:U→V, is a function that carries elements of the vector space U (the domain) to the vector space V (the codomain), and which has two additional properties:  
$$ T(u_1+u_2)=T(u_1)+T(u_2) \ for \ all \ u_1,u_2∈U  $$  
and
$$ T(αu)=αT(u) \ for \ all \ u∈U \ and \ all \ α∈C  $$ 
In other words, with a linear transformation you should be able to:  
1) Transform the individual vectors first and then add them, or add them first and then apply the transformation, and get the same result   
2) Scale the vector first and then apply the transformation, or apply the transformation first and then scale the result, and get the same result

#### Examples

Let's take a linear transform function
$$ T(x_1,x_2) =  (\begin{array}{c} x_1 + 2*x_2 \\ x_2  \end{array} ) $$

In [131]:
### Implementing transform function (x1,x2) -> (x1+2x2,x2)
def linearTransform(vector2d):
    return np.array([vector2d[0]+2*vector2d[1],vector2d[1]])

In [113]:
u = np.array([1,1])
v = np.array([2,2])
print("========Property 1:========")
print("T(u1+u2)")
print(linearTransform(u+v))
print("T(u1)+T(u2)")
print(linearTransform(u)+linearTransform(v))
print("========Property 2:========")
print("T(au1)")
print(linearTransform(5*u))
print("aT(u1)")
print(5*linearTransform(u))

T(u1+u2)
[9 3]
T(u1)+T(u2)
[9 3]
T(au1)
[15  5]
aT(u1)
[15  5]


Now let's look at a non-linear transform:
$$ T(x_1,x_2) =  (\begin{array}{c} x_1-1 \\ x_2  \end{array} ) $$

In [114]:
### Implementing transform function (x1,x2) -> (x1-1,x2)
def nonLinearTransform(vector2d):
    return np.array([vector2d[0]-1,vector2d[1]])

In [130]:
u = np.array([1,1])
v = np.array([2,2])
print("========Property 1:========")
print("T(u1+u2)")
print(nonLinearTransform(u+v))
print("T(u1)+T(u2)")
print(nonLinearTransform(u)+nonLinearTransform(v))
print("========Property 2:========")
print("T(au1)")
print(nonLinearTransform(5*u))
print("aT(u1)")
print(5*nonLinearTransform(u))

T(u1+u2)
[2 3]
T(u1)+T(u2)
[1 3]
T(au1)
[4 5]
aT(u1)
[0 5]


#### Rotation Functions

In [128]:
import math
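def rotateTransform(vector2d):
    # NOTE: this multiplies each component by cos(60 degrees) = 0.5, so it is a
    # uniform scaling (still a linear transform), not a true rotation.
    return np.array([vector2d[0]*math.cos(math.radians(60)),vector2d[1]*math.cos(math.radians(60))])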

In [129]:
u = np.array([1,1])
v = np.array([2,2])
print("========Property 1:========")
print("T(u1+u2)")
print(rotateTransform(u+v))
print("T(u1)+T(u2)")
print(rotateTransform(u)+rotateTransform(v))
print("========Property 2:========")
print("T(au1)")
print(rotateTransform(5*u))
print("aT(u1)")
print(5*rotateTransform(u))

T(u1+u2)
[ 1.5  1.5]
T(u1)+T(u2)
[ 1.5  1.5]
T(au1)
[ 2.5  2.5]
aT(u1)
[ 2.5  2.5]
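The `rotateTransform` function above multiplies each component by cos 60° = 0.5, which shrinks the vector without turning it. A true counter-clockwise rotation by an angle θ maps (x, y) to (x cos θ - y sin θ, x sin θ + y cos θ). The following is a minimal sketch of that formula; the function name `rotate` and its `degrees` parameter are assumptions made for this example.

```python
import math
import numpy as np

def rotate(vector2d, degrees=60):
    """Rotate a 2-d vector counter-clockwise by the given angle in degrees."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    x, y = vector2d
    return np.array([c * x - s * y, s * x + c * y])

u = np.array([1, 1])
v = np.array([2, 2])

# Property 1: T(u + v) equals T(u) + T(v), up to floating-point rounding
additive = np.allclose(rotate(u + v), rotate(u) + rotate(v))

# Property 2: T(a u) equals a T(u)
homogeneous = np.allclose(rotate(5 * u), 5 * rotate(u))

# Unlike the scaling above, a rotation leaves the length unchanged
length_preserved = np.isclose(np.linalg.norm(rotate(u)), np.linalg.norm(u))

print(additive, homogeneous, length_preserved)
```

Comparisons use `np.allclose` rather than exact equality because the sine and cosine values are floating-point approximations.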


### Matrices
A matrix is a 2-d array of elements. It is a convenient way to represent sets of data.  
Here is an example of a matrix:  
\begin{bmatrix}
    {1}       & {4} & {7} \\
    {2}       & {5} & {8} \\
    {3}       & {6} & {9} \\
\end{bmatrix}  
The rows are the horizontal lines of numbers and the columns are the vertical lines of numbers.  
The size of a matrix is the number of rows by the number of columns. Here the size is 3x3.  
Formally, a matrix with m rows and n columns is represented as  
$$ A = (a_{ij})_{m,n} $$ 
The indices i and j denote the row and column of an entry.   
There are multiple ways of creating matrices in numpy. You can create a 2-d array or directly use the matrix API. We will stick with creating matrices using the array API.

In [140]:
A = np.array([[1,4,7], [2, 5,8],[3,6,9]])
print(A)

[[1 4 7]
 [2 5 8]
 [3 6 9]]


Let's access some elements using the index notation:

In [141]:
print(A[0][0])
print(A[1][2])
print(A[2][2])

1
8
9


#### Matrix Vector Product (MVP)

Let's say A is an mxn matrix with columns A1,A2,A3...An. Then, if u is a vector of size n, the matrix vector product is defined as 

$$
Au = A_1u_1+A_2u_2+A_3u_3+....+A_nu_n
$$

Example:    

$$
A \ = \ \begin{bmatrix}
    {1}       & {4} & {7} \\
    {2}       & {5} & {8} \\
    {3}       & {6} & {9} \\
\end{bmatrix}
\ and \ u = \
\begin{bmatrix}
    {1} \\
    {2} \\
    {3} \\
\end{bmatrix}
$$

$$
= 
1
\begin{bmatrix}
    {1} \\
    {2} \\
    {3} \\
\end{bmatrix}
+
2
\begin{bmatrix}
    {4} \\
    {5} \\
    {6} \\
\end{bmatrix}
+
3
\begin{bmatrix}
    {7} \\
    {8} \\
    {9} \\
\end{bmatrix}
$$

$$
= 
\begin{bmatrix}
    {1} \\
    {2} \\
    {3} \\
\end{bmatrix}
+
\begin{bmatrix}
    {8} \\
    {10} \\
    {12} \\
\end{bmatrix}
+
\begin{bmatrix}
    {21} \\
    {24} \\
    {27} \\
\end{bmatrix}
$$

$$=
\begin{bmatrix}
    {1+8+21}\\
    {2+10+24}\\
    {3+12+27}\\
\end{bmatrix} 
$$

$$
=
\begin{bmatrix}
    {30}\\
    {36}\\
    {42}\\
\end{bmatrix} 
$$

In [150]:
A = np.array([[1,4,7], [2, 5,8],[3,6,9]])
u = np.array([[1],[2],[3]])
print(np.matmul(A,u))

[[30]
 [36]
 [42]]


**Matrices can be used to represent a linear transform. Multiplication of a vector by a matrix performs a linear combination and thus transforms the input vector into an output vector, possibly of a different size.**

$$ 
\begin{bmatrix}
    {1}       & {4} & {7} \\
    {2}       & {5} & {8} \\
    {3}       & {6} & {9} \\
\end{bmatrix}
\begin{bmatrix}
    x_{1} \\
    x_{2} \\
    x_{3} \\
\end{bmatrix}
=
x_{1}
\begin{bmatrix}
    {1} \\
    {2} \\
    {3} \\
\end{bmatrix}
+
x_{2}
\begin{bmatrix}
    {4} \\
    {5} \\
    {6} \\
\end{bmatrix}
+
x_{3}
\begin{bmatrix}
    {7} \\
    {8} \\
    {9} \\
\end{bmatrix}
$$

$$
= 
\begin{bmatrix}
    {x_{1}} \\
    {2x_{1}} \\
    {3x_{1}} \\
\end{bmatrix}
+
\begin{bmatrix}
    {4x_{2}} \\
    {5x_{2}} \\
    {6x_{2}} \\
\end{bmatrix}
+
\begin{bmatrix}
    {7x_{3}} \\
    {8x_{3}} \\
    {9x_{3}} \\
\end{bmatrix}
$$

$$
= 
\begin{bmatrix}
    {x_{1}+4x_{2}+7x_{3}} \\
    {2x_{1}+5x_{2}+8x_{3}} \\
    {3x_{1}+6x_{2}+9x_{3}} \\
\end{bmatrix}
$$

***Matrix operations are linear***

A(x+y) = Ax + Ay  
A(ax)  = aAx

In [154]:
##Property 1
print("Property1: A(x+y) = Ax + Ay  ")
A = np.array([[1,4,7], [2, 5,8],[3,6,9]])
u = np.array([[1],[2],[3]])
v = np.array([[3],[4],[5]])
print("A(x+y)")
print(np.matmul(A,u+v))
print("A(x)+A(y)")
print(np.matmul(A,u)+np.matmul(A,v))

Property1: A(x+y) = Ax + Ay  
A(x+y)
[[ 84]
 [102]
 [120]]
A(x)+A(y)
[[ 84]
 [102]
 [120]]


In [156]:
##Property 2
print("Property2: A(ax) = aA(x)")
A = np.array([[1,4,7], [2, 5,8],[3,6,9]])
u = np.array([[1],[2],[3]])
print("A(ax)")
print(np.matmul(A,3*u))
print("aA(x)")
print(3*np.matmul(A,u))

Property2: A(ax) = aA(x)
A(ax)
[[ 90]
 [108]
 [126]]
aA(x)
[[ 90]
 [108]
 [126]]


##### Deriving a matrix from a linear transformation
Let's do the process in reverse now.

$$
\begin{bmatrix}
    {x_{1}+4x_{2}+7x_{3}} \\
    {5x_{2}+8x_{3}} \\
    {3x_{1}+9x_{3}} \\
\end{bmatrix}
$$

$$
=
\begin{bmatrix}
    {x_{1}} \\
    {0 x_{1}} \\
    {3x_{1}} \\
\end{bmatrix}
+
\begin{bmatrix}
    {4x_{2}} \\
    {5x_{2}} \\
    {0x_{2}} \\
\end{bmatrix}
+
\begin{bmatrix}
    {7x_{3}} \\
    {8x_{3}} \\
    {9x_{3}} \\
\end{bmatrix}
$$

$$
=
x_{1}
\begin{bmatrix}
    {1} \\
    {0} \\
    {3} \\
\end{bmatrix}
+
x_{2}
\begin{bmatrix}
    {4} \\
    {5} \\
    {0} \\
\end{bmatrix}
+
x_{3}
\begin{bmatrix}
    {7} \\
    {8} \\
    {9} \\
\end{bmatrix}
$$

$$
=
\begin{bmatrix}
    {1}       & {4} & {7} \\
    {0}       & {5} & {8} \\
    {3}       & {0} & {9} \\
\end{bmatrix} 
\begin{bmatrix}
    {x_1}\\
    {x_2}\\
    {x_3}\\
\end{bmatrix} 
$$
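The same matrix can be found programmatically: the columns of the matrix are the images of the standard basis vectors under the transformation. The sketch below assumes a helper `T` that implements the transformation derived above; `T`, `M` and `x` are names introduced only for this example.

```python
import numpy as np

def T(x):
    """The transformation (x1, x2, x3) -> (x1+4x2+7x3, 5x2+8x3, 3x1+9x3)."""
    x1, x2, x3 = x
    return np.array([x1 + 4 * x2 + 7 * x3, 5 * x2 + 8 * x3, 3 * x1 + 9 * x3])

# Rows of the identity matrix are the standard basis vectors e1, e2, e3
basis = np.eye(3)

# Each column of M is T applied to one basis vector
M = np.column_stack([T(e) for e in basis])
print(M)

# Multiplying by M should agree with applying T directly
x = np.array([1.0, 2.0, 3.0])
same = np.allclose(M @ x, T(x))
print(same)
```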

### Matrix Algebra

#### Addition of matrices
Let's get down to some operations on matrices. The sum of two matrices is obtained by adding their corresponding entries, and it only makes sense if the matrices are of the same size. Formally  
$$ [A+B]_{ij} = [A]_{ij} + [B]_{ij}   $$

In [158]:
A = np.random.random((2, 2))
B = np.random.random((2, 2))
print("A = ")
print(A)
print("B = ")
print(B)
print("SUM:")
print(A+B)

A = 
[[ 0.97230088  0.80779044]
 [ 0.60804372  0.07976949]]
B = 
[[ 0.92501422  0.7275366 ]
 [ 0.83090743  0.78245292]]
SUM:
[[ 1.8973151   1.53532704]
 [ 1.43895115  0.86222241]]


#### Scalar Multiplication
Scalar multiplication will just scale individual components of A.  
$$ [aA]_{ij} = a[A]_{ij} $$

In [170]:
print(3*A)

[[ 2.91690265  2.42337132]
 [ 1.82413115  0.23930846]]


#### Transpose of a matrix
The transpose of a matrix interchanges the rows and columns of the matrix. It is expressed as  
$$ [A^T]_{ij} = [A]_{ji}  $$

In [172]:
print(A.T)

[[ 0.97230088  0.60804372]
 [ 0.80779044  0.07976949]]


##### Symmetric Matrices
If the transpose of a matrix is equal to the matrix itself, the matrix is called a symmetric matrix.
$$  A^T = A $$

In [174]:
I = np.eye(3)
print(I)
IT = I.T
print(IT)
print(I==IT)

[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]
[[ True  True  True]
 [ True  True  True]
 [ True  True  True]]


Theorems on symmetric matrices:  
1) Symmetric matrices are square.
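The identity matrix is a rather trivial symmetric matrix. As a slightly less trivial sketch, adding any square matrix to its own transpose yields a symmetric matrix, which can be checked numerically. The matrices `M` and `S` are names chosen for this example.

```python
import numpy as np

M = np.array([[1, 4, 7],
              [2, 5, 8],
              [3, 6, 9]])

# M itself is not symmetric, but M + M^T always is
S = M + M.T
print(S)

is_symmetric = np.array_equal(S, S.T)
m_is_symmetric = np.array_equal(M, M.T)
print(is_symmetric, m_is_symmetric)
```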

#### Matrix multiplication
Multiplying matrices is quite different from what one might initially anticipate. The difference arises because matrix multiplication captures the idea of composing (or chaining) linear transformations. We can multiply matrices of different shapes and end up with a matrix of yet another shape.

For simplicity, let's consider two  2x2 matrices u and v as :

$$  
u = \begin{bmatrix}
    {1}  & {3} \\
    {2}  & {4}  \\
\end{bmatrix} 
\ v = \begin{bmatrix}
    {5}  & {7} \\
    {6}  & {8}  \\
\end{bmatrix} 
$$

then the matrix product uv is defined as:

$$
\begin{bmatrix}
    {1}  & {3} \\
    {2}  & {4}  \\
\end{bmatrix}
\begin{bmatrix}
    {5} \\
    {6}
\end{bmatrix}
|
\begin{bmatrix}
    {1}  & {3} \\
    {2}  & {4}  \\
\end{bmatrix}
\begin{bmatrix}
    {7} \\
    {8}
\end{bmatrix}
$$

$$
= 
\begin{bmatrix}
    { 1 \times 5 + 3 \times 6} \\
    { 2 \times 5 + 4 \times 6} 
\end{bmatrix}
|
\begin{bmatrix}
    { 1 \times 7 + 3 \times 8} \\
    { 2 \times 7 + 4 \times 8} 
\end{bmatrix}
$$

$$ =
\begin{bmatrix}
    {23}  & {31} \\
    {34}  & {46}  \\
\end{bmatrix}
$$

In [168]:
u = np.array([[1,3],[2,4]])
v = np.array([[5,7],[6,8]])
print(np.matmul(u, v))

[[23 31]
 [34 46]]


In [169]:
print(np.matmul(A, B))

[[ 1.57059123  1.33944246]
 [ 0.62873015  0.50478992]]


Another way to describe matrix multiplication is that each entry of the result is the dot product of a row of the left matrix with a column of the right matrix. Another important thing to remember is that the operation is only defined if the number of columns of the matrix on the left matches the number of rows of the matrix on the right. The result of the multiplication is a matrix with the same number of rows as the left matrix and the same number of columns as the right matrix, i.e. if u is mxn and v is nxp, then the output will be of shape mxp.

It's worth spending some time on a few examples to let this sink in: practice multiplying a few 2x2 and 3x3 matrices and also some rectangular matrices. Also observe that as the dimensions grow, the multiplication becomes more tedious to carry out.

### * STOP AND PRACTICE *

Now if you have done some examples and felt that multiplying 3x3 matrices is a lot worse than 2x2, think about multiplying 10000x10000 matrices, or million x million matrices. These use cases are not fictional and arise often in practice.  
There are methods that break a big matrix into sub-matrices (blocks), perform the multiplications in parallel, and then combine the results, but we are not going to cover those here. 

#### Properties of matrix multiplication
1) Matrix multiplication is **not** commutative: in general AB != BA. It might even be the case that AB is defined whereas BA is not.  
2) Multiplication distributes over addition: A(B+C) = AB+AC  
3) The transpose of a product is the product of the transposes in reverse order: $$ (AB)^T = B^TA^T   $$
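These properties can be spot-checked numerically on small matrices. This is only a sanity check on one example, not a proof; the matrices `X`, `Y` and `Z` are arbitrary values chosen for the sketch.

```python
import numpy as np

X = np.array([[1, 3], [2, 4]])
Y = np.array([[5, 7], [6, 8]])
Z = np.array([[1, 0], [2, 1]])

# 1) Not commutative: XY and YX differ for this pair
not_commutative = not np.array_equal(X @ Y, Y @ X)

# 2) Distributes over addition: X(Y + Z) = XY + XZ
distributive = np.array_equal(X @ (Y + Z), X @ Y + X @ Z)

# 3) Transpose of a product reverses the order: (XY)^T = Y^T X^T
transpose_rule = np.array_equal((X @ Y).T, Y.T @ X.T)

print(not_commutative, distributive, transpose_rule)
```

The `@` operator is equivalent to `np.matmul` for 2-d arrays; integer entries are used so that exact equality checks are safe.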


#### Inverse of a Matrix

In [176]:
Ainv = np.linalg.inv(A)
print(Ainv)

[[-0.19286069  1.9530152 ]
 [ 1.47008253 -2.35075621]]


In [177]:
print(np.matmul(A, Ainv))

[[  1.00000000e+00   0.00000000e+00]
 [ -8.32667268e-17   1.00000000e+00]]


### Linear Equations

Consider a system of three linear equations in three unknowns:  
$$ a_1x_1+b_1x_2+c_1x_3 = d_1 $$
$$ a_2x_1+b_2x_2+c_2x_3 = d_2 $$
$$ a_3x_1+b_3x_2+c_3x_3 = d_3 $$  
We can represent the same information with matrices as shown below:  
$$ 
\begin{bmatrix}
    a_{1}       & b_{1} & c_{1} \\
    a_{2}       & b_{2} & c_{2} \\
    a_{3}       & b_{3} & c_{3} \\
\end{bmatrix}
\begin{bmatrix}
    x_{1} \\
    x_{2} \\
    x_{3} \\
\end{bmatrix}
=
\begin{bmatrix}
    d_{1} \\
    d_{2} \\
    d_{3} \\
\end{bmatrix}
$$

And we can describe the above form in general as   
$$ Ax=b $$
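When the coefficient matrix is square and invertible, a system in the form Ax=b can be solved numerically with `np.linalg.solve`. The coefficients below are made-up numbers for a small system with a unique solution, and the names `coeffs`, `rhs` and `solution` are assumptions of this sketch.

```python
import numpy as np

# Hypothetical system:
#    x +  y +  z =  6
#        2y + 5z = -4
#   2x + 5y -  z = 27
coeffs = np.array([[1.0, 1.0, 1.0],
                   [0.0, 2.0, 5.0],
                   [2.0, 5.0, -1.0]])
rhs = np.array([6.0, -4.0, 27.0])

# Solve A x = b for x
solution = np.linalg.solve(coeffs, rhs)
print(solution)

# Substituting the solution back should reproduce the right-hand side
residual_ok = np.allclose(coeffs @ solution, rhs)
print(residual_ok)
```

`np.linalg.solve` raises `LinAlgError` when the coefficient matrix is singular, which corresponds to the cases with no solution or infinitely many solutions.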

#### Matrix Factorization
In matrix factorization, we are interested in finding two or more matrices whose product gives the original matrix. This process is also called matrix decomposition.  
There are numerous matrix factorizations, such as LU, QR, rank factorization and Cholesky, and each one breaks the matrix up such that the resulting factors solve a particular class of problems well.  
Wikipedia has a good article which gives an overview of the different techniques:  
https://en.wikipedia.org/wiki/Matrix_decomposition

In [36]:
import scipy.linalg  # import the submodule explicitly; a bare `import scipy` may not load it
P, L, U = scipy.linalg.lu(A)

print("A:")
print(A)

print("P:")
print(P)

print("L:")
print(L)

print("U:")
print(U)

A:
[[ 0.56073128  0.95912425  0.18109907]
 [ 0.97114416  0.44711371  0.73957315]
 [ 0.81032431  0.50232417  0.38825604]]
P:
[[ 0.  1.  0.]
 [ 1.  0.  0.]
 [ 0.  0.  1.]]
L:
[[ 1.          0.          0.        ]
 [ 0.57739242  1.          0.        ]
 [ 0.83440168  0.18439136  1.        ]]
U:
[[ 0.97114416  0.44711371  0.73957315]
 [ 0.          0.70096418 -0.24592486]
 [ 0.          0.         -0.18349862]]


#### Determinant
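The determinant is a function that maps a square matrix to a scalar. Geometrically, the absolute value of the determinant of a 2x2 matrix is the area of the parallelogram spanned by its two column vectors, and for a 3x3 matrix it is the volume of the parallelepiped spanned by its three column vectors.  
http://www.maths.manchester.ac.uk/~lwalker/MATH10000/project-04-part2.pdf  
A 2×2 determinant is defined to be
$$ \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad-bc  $$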

In [69]:
print(np.linalg.det(A))

-0.285374711663
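To connect the 2×2 formula with the library call, the sketch below computes ad-bc by hand for a small matrix and compares it with `np.linalg.det`. The matrix `M` is an arbitrary example chosen for illustration.

```python
import numpy as np

M = np.array([[3.0, 1.0],
              [1.0, 2.0]])

# Unpack the entries as a, b (first row) and c, d (second row)
a, b = M[0]
c, d = M[1]

# Determinant from the 2x2 formula ad - bc
det_formula = a * d - b * c

# Determinant from NumPy; may differ by floating-point rounding
det_numpy = np.linalg.det(M)

print(det_formula, np.isclose(det_formula, det_numpy))
```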


##### Eigenvalues and Eigenvectors
Eigen is a German word meaning "characteristic" or "own". Eigenvectors and eigenvalues bring out important characteristics of a matrix. Just by looking at the eigenvalues and eigenvectors of a matrix, you can tell a lot about the nature of the transformation it represents.   
By definition, for a given nxn matrix A, a non-zero vector x is said to be an eigenvector of A if Ax is a scalar multiple of x, i.e.  
$$ Ax = \lambda x $$
Here lambda is a scalar called the eigenvalue corresponding to x.

##### Computing eigenvalues
Rewriting the definition, the equation can be written as:  
$$ (A-\lambda I) x = 0 $$
where I is the nxn identity matrix. As x by definition cannot be 0, the matrix $A-\lambda I$ must be singular, i.e.  
$$ \det(A-\lambda I) = 0 $$
The above equation is called the characteristic equation of A.  

In [38]:
w, v = np.linalg.eig(A)
print(w)
print(v)

[ 1.86507234+0.j         -0.23448566+0.10950836j -0.23448566-0.10950836j]
[[ 0.55077013+0.j          0.69697853+0.j          0.69697853-0.j        ]
 [ 0.65018887+0.j         -0.48658111+0.11281292j -0.48658111-0.11281292j]
 [ 0.52336097+0.j         -0.48347780-0.17601764j -0.48347780+0.17601764j]]


Applications of eigenvalues and eigenvectors:  
1) Powers of a matrix A  
2) Diagonalization  
3) Transition matrices

#### Positive Definite and Positive Semi-Definite matrices

In [39]:
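def is_pos_def(x):
    # Positive definiteness is checked here for symmetric matrices only;
    # eigvalsh returns the real eigenvalues of a symmetric matrix.
    return np.allclose(x, x.T) and np.all(np.linalg.eigvalsh(x) > 0)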

print(is_pos_def(A))

False


#### Singular Value Decomposition

In [40]:
# Note: np.linalg.svd returns the third factor already transposed, so V here is V^T
U, s, V = np.linalg.svd(A, full_matrices=True)
print("A:")
print(A)

print("U:")
print(U)

print("s:")
print(s)

print("V:")
print(V)

A:
[[ 0.56073128  0.95912425  0.18109907]
 [ 0.97114416  0.44711371  0.73957315]
 [ 0.81032431  0.50232417  0.38825604]]
U:
[[-0.53157978  0.81264395 -0.2388153 ]
 [-0.65598447 -0.57335562 -0.49086425]
 [-0.53582396 -0.10427439  0.83786606]]
s:
[ 1.91000513  0.59726374  0.10949938]
V:
[[-0.71691958 -0.56141631 -0.41332559]
 [-0.31080432  0.78808035 -0.53134737]
 [ 0.62404086 -0.25246996 -0.73948085]]
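A useful check on a singular value decomposition is that multiplying the three factors back together recovers the original matrix. The sketch below does this for a small fixed matrix `M` (chosen for illustration, instead of the random matrix above) and also counts the non-negligible singular values; the `1e-10` threshold is an assumption of this example.

```python
import numpy as np

M = np.array([[1.0, 4.0, 7.0],
              [2.0, 5.0, 8.0],
              [3.0, 6.0, 9.0]])

# svd returns U, the singular values s, and V already transposed (Vh)
U_m, s_m, Vh_m = np.linalg.svd(M, full_matrices=True)

# Rebuild M as U * diag(s) * Vh
reconstructed = U_m @ np.diag(s_m) @ Vh_m
matches = np.allclose(reconstructed, M)

# Number of singular values above a small threshold gives the numerical rank
rank_from_svd = int(np.sum(s_m > 1e-10))

print(matches, rank_from_svd)
```

For this particular `M` the third column equals twice the second minus the first, so one singular value is numerically zero and the rank comes out as 2.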


#### Rank of a matrix

In [41]:
print(np.linalg.matrix_rank(A))

3


#### Constructing orthonormal basis 

In [42]:
#Construct an orthonormal basis for the range of A using SVD
print(scipy.linalg.orth(A))

[[-0.53157978  0.81264395 -0.2388153 ]
 [-0.65598447 -0.57335562 -0.49086425]
 [-0.53582396 -0.10427439  0.83786606]]


##### Find null space of a matrix

In [43]:
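def nullspace(A, atol=1e-13, rtol=0):
    # Basis for the null space of A: the right singular vectors whose
    # singular values are (numerically) zero.
    A = np.atleast_2d(A)
    u, s, vh = np.linalg.svd(A)
    tol = max(atol, rtol * s[0])   # threshold below which a singular value counts as zero
    nnz = (s >= tol).sum()         # number of non-zero singular values (the rank)
    ns = vh[nnz:].conj().T         # the remaining rows of vh span the null space
    return ns
print(nullspace(A))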

[]


#### Reduced Row Echelon Form:

In [70]:
## Adapted from https://rosettacode.org/wiki/Reduced_row_echelon_form
def ReducedRowEchelonForm(M):
    if M.size==0: return
    lead = 0
    rowCount = len(M)
    columnCount = len(M[0])
    for r in range(rowCount):
        if lead >= columnCount:
            return
        i = r
        while M[i][lead] == 0:
            i += 1
            if i == rowCount:
                i = r
                lead += 1
                if columnCount == lead:
                    return
        M[[i, r]] = M[[r, i]]  # swap rows; index arrays copy, so this is safe for NumPy arrays
        lv = M[r][lead]
        M[r] = [ mrx / float(lv) for mrx in M[r]]
        for i in range(rowCount):
            if i != r:
                lv = M[i][lead]
                M[i] = [ iv - lv*rv for rv,iv in zip(M[r],M[i])]
        lead += 1

In [72]:
ReducedRowEchelonForm(B)
print(B)

[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]


Linear Span  
Linear Independence and Basis vectors  
Affine, conical, and convex combinations  
Fundamental Spaces  
