# **Chapter 4 - THE PRELIMINARIES: A CRASHCOURSE**

## **4.2 Linear Algebra**

In [60]:
from mxnet import nd

#### **4.2.1 Scalars**

In [61]:
x = nd.array([3.0])
y = nd.array([2.0])

print('x + y = ', x + y)
print('x * y = ', x * y)
print('x / y = ', x / y)
print('x ** y = ', nd.power(x,y))

x + y =  
[5.]
<NDArray 1 @cpu(0)>
x * y =  
[6.]
<NDArray 1 @cpu(0)>
x / y =  
[1.5]
<NDArray 1 @cpu(0)>
x ** y =  
[9.]
<NDArray 1 @cpu(0)>


#### **4.2.2 Vectors**

In [62]:
x = nd.arange(4)
print('x = ', x)

x =  
[0. 1. 2. 3.]
<NDArray 4 @cpu(0)>


In [63]:
x[3]


[3.]
<NDArray 1 @cpu(0)>

#### **4.2.3 Length, dimensionality and shape**

In [64]:
x.shape

(4,)

In [65]:
a = 2
x = nd.array([1,2,3])
y = nd.array([10,20,30])

# Scalar multiplication and vector addition are elementwise, so the shape is unchanged
print(a * x)
print(a * x + y)


[2. 4. 6.]
<NDArray 3 @cpu(0)>

[12. 24. 36.]
<NDArray 3 @cpu(0)>


#### **4.2.4 Matrices**

In [66]:
A = nd.arange(20).reshape((5,4))
print(A)


[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]
 [12. 13. 14. 15.]
 [16. 17. 18. 19.]]
<NDArray 5x4 @cpu(0)>


In [67]:
print(A.T)


[[ 0.  4.  8. 12. 16.]
 [ 1.  5.  9. 13. 17.]
 [ 2.  6. 10. 14. 18.]
 [ 3.  7. 11. 15. 19.]]
<NDArray 4x5 @cpu(0)>


#### **4.2.5 Tensors**

- Just as vectors generalize scalars, and matrices generalize vectors, we can actually build data structures with even more axes.<br> 
  **Tensors give us a generic way of discussing arrays with an arbitrary number of axes.**<br>
  Vectors, for example, are first-order tensors, and matrices are second-order tensors.

In [68]:
X = nd.arange(24).reshape((2, 3, 4))

print('X.shape =', X.shape)
print('X =', X)

X.shape = (2, 3, 4)
X = 
[[[ 0.  1.  2.  3.]
  [ 4.  5.  6.  7.]
  [ 8.  9. 10. 11.]]

 [[12. 13. 14. 15.]
  [16. 17. 18. 19.]
  [20. 21. 22. 23.]]]
<NDArray 2x3x4 @cpu(0)>
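
The "order" of a tensor is its number of axes. The following is a minimal sketch of that idea, assuming NumPy as a stand-in for `mxnet.nd` (the notebook itself does not use NumPy); the variable names are illustrative only.

```python
import numpy as np

# NumPy is used here only for portability; it is not the library used in the cells above.
scalar = np.array(3.0)                      # zeroth-order tensor: no axes
vector = np.arange(4.0)                     # first-order tensor: one axis
matrix = np.arange(20.0).reshape(5, 4)      # second-order tensor: two axes
tensor3 = np.arange(24.0).reshape(2, 3, 4)  # third-order tensor: three axes

# ndim reports the number of axes, i.e. the order of each tensor
orders = [t.ndim for t in (scalar, vector, matrix, tensor3)]
print(orders)
```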


#### **4.2.6 Basic properties of tensor arithmetic**

In [69]:
a = 2
x = nd.ones(3)
y = nd.zeros(3)

print(x.shape)
print(y.shape)
print((a * x).shape)
print((a * x + y).shape)

(3,)
(3,)
(3,)
(3,)


#### **4.2.7 Sums and means**

In [70]:
print(x)
print(nd.sum(x))


[1. 1. 1.]
<NDArray 3 @cpu(0)>

[3.]
<NDArray 1 @cpu(0)>


In [73]:
print(A)
print(nd.sum(A))


[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]
 [12. 13. 14. 15.]
 [16. 17. 18. 19.]]
<NDArray 5x4 @cpu(0)>

[190.]
<NDArray 1 @cpu(0)>


In [74]:
print(nd.mean(A))
print(nd.sum(A) / A.size)


[9.5]
<NDArray 1 @cpu(0)>

[9.5]
<NDArray 1 @cpu(0)>
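
The cells above reduce over every element at once. Sums and means can also be taken along a single axis; the sketch below illustrates this with NumPy (an assumption made for portability, not the notebook's own library) on a matrix with the same values as `A`.

```python
import numpy as np

A = np.arange(20.0).reshape(5, 4)  # same values as the 5x4 matrix A above

col_sums = A.sum(axis=0)    # collapse the row axis: one sum per column, shape (4,)
row_sums = A.sum(axis=1)    # collapse the column axis: one sum per row, shape (5,)
row_means = A.mean(axis=1)  # mean of each row, shape (5,)

print(col_sums)
print(row_sums)
print(row_means)
```

Summing either set of partial sums recovers the total of 190 printed earlier.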


#### **4.2.8 Dot products**

In [75]:
x = nd.arange(4)
y = nd.ones(4)
print(x, y, nd.dot(x, y))


[0. 1. 2. 3.]
<NDArray 4 @cpu(0)> 
[1. 1. 1. 1.]
<NDArray 4 @cpu(0)> 
[6.]
<NDArray 1 @cpu(0)>


In [76]:
# The dot product equals the sum of the elementwise products
nd.sum(x * y)


[6.]
<NDArray 1 @cpu(0)>
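
One common use of the dot product is a weighted average: when the weights are non-negative and sum to 1, the dot product of the weights with the values is their weighted mean. A small sketch, assuming NumPy in place of `mxnet.nd` and a made-up weight vector `w`:

```python
import numpy as np

x = np.arange(4.0)                  # [0, 1, 2, 3], as in the cell above
w = np.array([0.1, 0.2, 0.3, 0.4])  # hypothetical weights, chosen to sum to 1

weighted_avg = np.dot(w, x)                    # 0*0.1 + 1*0.2 + 2*0.3 + 3*0.4
manual = sum(wi * xi for wi, xi in zip(w, x))  # same quantity, written out term by term

print(weighted_avg, manual)
```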

#### **4.2.9 Matrix-vector products**

In [81]:
A.shape, x.shape

((5, 4), (4,))

In [82]:
nd.dot(A, x)


[ 14.  38.  62.  86. 110.]
<NDArray 5 @cpu(0)>
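
Each entry of a matrix-vector product is the dot product of one row of the matrix with the vector. The sketch below checks this against the result above, assuming NumPy as a stand-in for `mxnet.nd`:

```python
import numpy as np

A = np.arange(20.0).reshape(5, 4)  # same values as A above
x = np.arange(4.0)                 # same values as x above

# One dot product per row of A
by_rows = np.array([np.dot(row, x) for row in A])

# The built-in matrix-vector product
direct = A @ x

print(by_rows)
print(direct)
```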

#### **4.2.10 Matrix-matrix multiplication**

In [85]:
B = nd.ones(shape=(4, 3))
A.shape, B.shape, nd.dot(A, B)

((5, 4), (4, 3), 
 [[ 6.  6.  6.]
  [22. 22. 22.]
  [38. 38. 38.]
  [54. 54. 54.]
  [70. 70. 70.]]
 <NDArray 5x3 @cpu(0)>)
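
In a matrix-matrix product, entry (i, j) is the dot product of row i of the first matrix with column j of the second, so the inner dimensions must agree. A minimal sketch with NumPy (an assumption; the notebook uses `mxnet.nd`), which also shows that reversing the order fails for these shapes:

```python
import numpy as np

A = np.arange(20.0).reshape(5, 4)  # shape (5, 4), same values as A above
B = np.ones((4, 3))                # shape (4, 3), same values as B above

C = A @ B                          # inner dimensions 4 and 4 match, result is (5, 3)

# Rebuild C entry by entry from row-column dot products
C_manual = np.empty((5, 3))
for i in range(5):
    for j in range(3):
        C_manual[i, j] = np.dot(A[i, :], B[:, j])

# B @ A would pair inner dimensions 3 and 5, which do not match
try:
    B @ A
    reversed_ok = True
except ValueError:
    reversed_ok = False

print(C.shape, reversed_ok)
```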

#### **4.2.11 Norms**

In [88]:
# L2 norm
x, nd.norm(x)

(
 [0. 1. 2. 3.]
 <NDArray 4 @cpu(0)>, 
 [3.7416573]
 <NDArray 1 @cpu(0)>)

In [87]:
# L1 norm
nd.sum(nd.abs(x))


[6.]
<NDArray 1 @cpu(0)>
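
Both norms can be written out from their definitions: the L2 norm is the square root of the sum of squares, and the L1 norm is the sum of absolute values. The sketch below does this with NumPy (an assumption, used in place of `mxnet.nd`) and also spot-checks two standard norm properties on a made-up second vector `y`.

```python
import numpy as np

x = np.arange(4.0)                   # [0, 1, 2, 3], as in the cells above
y = np.array([1.0, -2.0, 0.5, 4.0])  # hypothetical second vector

l2 = np.sqrt(np.sum(x ** 2))  # sqrt(0 + 1 + 4 + 9) = sqrt(14)
l1 = np.sum(np.abs(x))        # 0 + 1 + 2 + 3 = 6

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
triangle_ok = np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y)

# Absolute homogeneity: ||a x|| = |a| ||x||
scaling_ok = np.isclose(np.linalg.norm(-3 * x), 3 * np.linalg.norm(x))

print(l2, l1, triangle_ok, scaling_ok)
```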

#### **4.2.12 Norms and objectives**

- While we do not want to get too far ahead of ourselves, it helps to anticipate why these concepts are useful. <br>
  In machine learning we are often trying to solve optimization problems: <br>
  maximize the probability assigned to observed data; <br>
  minimize the distance between predictions and the ground-truth observations; <br>
  assign vector representations to items (like words, products, or news articles) such that the distance between similar items is minimized and the distance between dissimilar items is maximized. <br>
  Oftentimes these objectives, perhaps the most important component of a machine learning algorithm (besides the data itself), are expressed as norms.
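
As one concrete illustration of an objective expressed as a norm, the "distance between predictions and ground truth" can be measured by the squared L2 norm or the L1 norm of the residual. This is a sketch only: the data below are made up, and NumPy is assumed in place of `mxnet.nd`.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])  # hypothetical ground-truth values
y_pred = np.array([1.5, 1.5, 3.0])  # hypothetical predictions

residual = y_pred - y_true          # [0.5, -0.5, 0.0]

squared_loss = np.linalg.norm(residual) ** 2        # squared L2 norm: 0.25 + 0.25 + 0
absolute_loss = np.linalg.norm(residual, ord=1)     # L1 norm: 0.5 + 0.5 + 0

print(squared_loss, absolute_loss)
```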

#### **4.2.13 Intermediate linear algebra**

##### **Basic vector properties**

- **_Additive axioms_** (we assume that x, y, z are all vectors): <br>
  x + y = y + x and (x + y) + z = x + (y + z) and 0 + x = x + 0 = x and (−x) + x = x + (−x) = 0.

- **_Multiplicative axioms_** (we assume that x is a vector and a, b are scalars): <br>
  0 · x = 0 and 1 · x = x and (ab)x = a(bx).

- **_Distributive axioms_** (we assume that x and y are vectors and a, b are scalars): <br>
  a(x + y) = ax + ay and (a + b)x = ax + bx.
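
The axioms above can be spot-checked numerically. A numeric check on a few vectors is not a proof; it only illustrates what each identity says. The vectors and scalars below are arbitrary, and NumPy is assumed in place of `mxnet.nd`.

```python
import numpy as np

# Arbitrary example vectors and scalars
x = np.array([1.0, 2.0, 3.0])
y = np.array([-4.0, 0.5, 2.0])
z = np.array([0.0, 7.0, -1.0])
a, b = 2.0, -0.5
zero = np.zeros(3)

checks = {
    "x + y = y + x": np.allclose(x + y, y + x),
    "(x + y) + z = x + (y + z)": np.allclose((x + y) + z, x + (y + z)),
    "0 + x = x": np.allclose(zero + x, x),
    "(-x) + x = 0": np.allclose((-x) + x, zero),
    "0 * x = 0": np.allclose(0 * x, zero),
    "1 * x = x": np.allclose(1 * x, x),
    "(ab)x = a(bx)": np.allclose((a * b) * x, a * (b * x)),
    "a(x + y) = ax + ay": np.allclose(a * (x + y), a * x + a * y),
    "(a + b)x = ax + bx": np.allclose((a + b) * x, a * x + b * x),
}

for name, ok in checks.items():
    print(name, ok)
```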

##### **Special matrices**

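- **_Symmetric Matrix_**: M⊤ = M, that is, M_ij = M_ji for all i, j.

- **_Antisymmetric Matrix_**: M⊤ = −M, which forces every diagonal entry to be 0.

- **_Diagonally Dominant Matrix_**: in each row, the magnitude of the diagonal entry is at least the sum of the magnitudes of the off-diagonal entries, |M_ii| ≥ Σ_{j≠i} |M_ij|.

- **_Positive Definite Matrix_**: x⊤Mx > 0 for every vector x ≠ 0.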
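
These four properties can be tested numerically. The helper functions below are hypothetical names introduced for illustration, written with NumPy (an assumption; the notebook uses `mxnet.nd`) and using the usual textbook definitions stated in the comments. The positive-definiteness check is restricted to symmetric matrices, where it reduces to all eigenvalues being positive.

```python
import numpy as np

def is_symmetric(M):
    # M equals its transpose
    return np.allclose(M, M.T)

def is_antisymmetric(M):
    # The transpose of M equals -M
    return np.allclose(M.T, -M)

def is_diagonally_dominant(M):
    # Row-wise: |M_ii| >= sum over j != i of |M_ij|
    diag = np.abs(np.diag(M))
    off_diag = np.sum(np.abs(M), axis=1) - diag
    return bool(np.all(diag >= off_diag))

def is_positive_definite(M):
    # For a symmetric M, x^T M x > 0 for all x != 0 iff every eigenvalue is positive
    return bool(is_symmetric(M) and np.all(np.linalg.eigvalsh(M) > 0))

S = np.array([[2.0, -1.0], [-1.0, 2.0]])  # symmetric, eigenvalues 1 and 3
K = np.array([[0.0, 1.0], [-1.0, 0.0]])   # antisymmetric

print(is_symmetric(S), is_diagonally_dominant(S), is_positive_definite(S))
print(is_antisymmetric(K), is_positive_definite(K))
```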

____