# Lesson 1: Introduction to Linear Algebra

---

## 1. What is Linear Algebra?

Linear algebra is the branch of mathematics that studies *vectors*, *matrices*, and *linear transformations*. At its core, it deals with solving systems of linear equations and analyzing problems that can be expressed with such equations.

### Why is Linear Algebra Important?

Linear algebra is a cornerstone of *data science* and *machine learning*. Here's why:

1. **Data Representation:** In data science, data typically consists of numbers. *Vectors* and *matrices* are used to represent this data in an organized manner. For example, a single house's features (size, number of rooms, location) can be represented as a vector, and the features of multiple houses can be represented as a matrix.

2. **Model Building:** Machine learning algorithms use linear algebra concepts to model the relationships between data. Many algorithms, such as *linear regression*, *logistic regression*, and *support vector machines*, are fundamentally based on linear algebra.

3. **Dimensionality Reduction:** Datasets can sometimes have a large number of features. This increases computational cost and can also reduce the performance of a model. Dimensionality reduction techniques, like *Principal Component Analysis (PCA)*, use linear algebra to represent the data with fewer, but more meaningful, features.

4. **Deep Learning:** *Deep learning* models are built from *neural networks*. Most of the computation inside these networks consists of matrix multiplications (linear transformations), interleaved with simple nonlinear activation functions.

**In short:** Linear algebra is a fundamental tool used in data science to understand, model, and process data.

---

## 2. Scalars, Vectors, and Matrices

Let's examine the fundamental building blocks of linear algebra: scalars, vectors, and matrices.

### Scalars

A **scalar** is a single number, as opposed to a collection of numbers such as a vector or a matrix. Examples of scalar quantities include temperature, mass, and time.

Scalars are usually represented by lowercase letters (e.g., *a*, *b*, *x*, *y*).

In [8]:
a = 5       # int (integer)
b = -3.14   # float (floating-point number)

### Vectors

A **vector** is an ordered list of numbers arranged in a single column or a single row. In physics, vectors represent quantities that have both magnitude and direction (like velocity and force). In data science, they often represent the features of a data point.

Vectors are usually represented by bold lowercase letters or with an arrow above them. The elements of a vector are written within square brackets.

$ \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} $ (column vector) or $ \mathbf{v} = \begin{bmatrix} v_1 & v_2 & \dots & v_n \end{bmatrix} $ (row vector)

**Example:**  
A student's exam scores:

$\mathbf{v} = \begin{bmatrix} 85 \\ 92 \\ 78 \end{bmatrix}$

<br>

In [9]:
import numpy as np

v = np.array([85, 92, 78])
print(v)

[85 92 78]
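
The distinction between column and row vectors can be made explicit in NumPy. The sketch below assumes the same exam scores as above; the names `col` and `row` are illustrative. A plain `np.array` built from a flat list is one-dimensional, so it is neither a row nor a column until it is reshaped.

```python
import numpy as np

v = np.array([85, 92, 78])   # 1-D array with 3 elements

col = v.reshape(-1, 1)       # column vector: 3 rows, 1 column
row = v.reshape(1, -1)       # row vector: 1 row, 3 columns

print("v.shape:  ", v.shape)
print("col.shape:", col.shape)
print("row.shape:", row.shape)
```

For the elementwise operations in this lesson a 1-D array is usually enough; the explicit 2-D shapes start to matter once vectors are combined with matrices.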


### Matrices

**Matrices** are two-dimensional arrays of numbers arranged in rows and columns. In machine learning, a *dataset* is commonly stored as a matrix in which each row represents a *data point* and each column represents a *feature*.

Matrices are generally represented by bold uppercase letters (e.g., **A**, **B**, **X**). Their elements are denoted $a_{i,j}$, where *i* is the row number and *j* is the column number.

$
\mathbf{A} = \begin{bmatrix}
a_{1,1} & a_{1,2} & \dots & a_{1,n} \\
a_{2,1} & a_{2,2} & \dots & a_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m,1} & a_{m,2} & \dots & a_{m,n}
\end{bmatrix}
$

**Example:**  
Let's represent the features of multiple houses (size in square meters, number of rooms, age) with a matrix:

$
\mathbf{X} = \begin{bmatrix}
150 & 3 & 10 \\
120 & 2 & 20 \\
200 & 4 & 5
\end{bmatrix}
$

<br>

In [10]:
import numpy as np

X = np.array([
    [150, 3, 10],  # House 1: Area, Number of Rooms, Age
    [120, 2, 20],  # House 2
    [200, 4, 5]    # House 3
])
print(X)

[[150   3  10]
 [120   2  20]
 [200   4   5]]
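
To connect the $a_{i,j}$ notation with code, the following sketch reads the dimensions of the house matrix and picks out single elements, rows, and columns. One caveat: mathematical indices start at 1, while NumPy indices start at 0. The variable names `house_1` and `sizes` are illustrative.

```python
import numpy as np

X = np.array([
    [150, 3, 10],
    [120, 2, 20],
    [200, 4, 5],
])

m, n = X.shape            # m rows (houses), n columns (features)
print("rows:", m, "columns:", n)

# The mathematical element x_{2,3} (row 2, column 3) is X[1, 2] in NumPy.
print("x_{2,3}:", X[1, 2])   # age of the second house: 20

house_1 = X[0, :]         # first row: all features of house 1
sizes = X[:, 0]           # first column: the size of every house
print("House 1:", house_1)
print("Sizes:", sizes)
```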


## 3. Basic Operations

We can perform various mathematical operations on scalars, vectors, and matrices. In this section, we'll examine the most fundamental operations: addition, subtraction, and scalar multiplication.

### Addition and Subtraction

**Vectors:** Corresponding elements of two vectors of the same dimension are added or subtracted to obtain a new vector.

Let $ \mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} $,
$ \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} $.

$ \mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix} $,
$ \mathbf{u} - \mathbf{v} = \begin{bmatrix} u_1 - v_1 \\ u_2 - v_2 \\ \vdots \\ u_n - v_n \end{bmatrix} $

**Example:**  

If $ \mathbf{u} = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} $,
$ \mathbf{v} = \begin{bmatrix} 4 \\ 0 \\ -1 \end{bmatrix} $, then

$ \mathbf{u} + \mathbf{v} = \begin{bmatrix} 1+4 \\ 3+0 \\ 2+(-1) \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \\ 1 \end{bmatrix} $,
$ \mathbf{u} - \mathbf{v} = \begin{bmatrix} 1-4 \\ 3-0 \\ 2-(-1) \end{bmatrix} = \begin{bmatrix} -3 \\ 3 \\ 3 \end{bmatrix} $

<br>

In [11]:
import numpy as np

u = np.array([1, 3, 2])
v = np.array([4, 0, -1])

print("u + v:", u + v)
print("u - v:", u - v)

u + v: [5 3 1]
u - v: [-3  3  3]
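
The requirement that both vectors have the same dimension can be checked directly. In the sketch below, `w` is a hypothetical vector with only two elements; adding it to the three-element `u` makes NumPy raise a `ValueError`.

```python
import numpy as np

u = np.array([1, 3, 2])
w = np.array([4, 0])      # hypothetical vector of a different length

error_message = None
try:
    u + w                 # shapes (3,) and (2,) are incompatible
except ValueError as err:
    error_message = str(err)

print("Cannot add u and w:", error_message)
```

Note that NumPy is more permissive than the mathematical definition in one respect: through broadcasting, a length-1 array or a plain number is silently stretched to match the other operand, so it is worth checking shapes when a result looks suspicious.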


**Matrices:** Corresponding elements of two matrices with the same dimensions (same number of rows and columns) are added or subtracted to obtain a new matrix.


Let $ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} $,
$ \mathbf{B} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} $.

$ \mathbf{A} + \mathbf{B} = \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \end{bmatrix} $,
$ \mathbf{A} - \mathbf{B} = \begin{bmatrix} a_{11}-b_{11} & a_{12}-b_{12} \\ a_{21}-b_{21} & a_{22}-b_{22} \end{bmatrix} $

**Example:**

If $ \mathbf{A} = \begin{bmatrix} 2 & 5 \\ 1 & 0 \end{bmatrix} $,
$ \mathbf{B} = \begin{bmatrix} -1 & 3 \\ 4 & 2 \end{bmatrix} $, then

$ \mathbf{A} + \mathbf{B} = \begin{bmatrix} 2+(-1) & 5+3 \\ 1+4 & 0+2 \end{bmatrix} = \begin{bmatrix} 1 & 8 \\ 5 & 2 \end{bmatrix} $,
$ \mathbf{A} - \mathbf{B} = \begin{bmatrix} 2-(-1) & 5-3 \\ 1-4 & 0-2 \end{bmatrix} = \begin{bmatrix} 3 & 2 \\ -3 & -2 \end{bmatrix} $

<br>

In [12]:
import numpy as np

A = np.array([[2, 5], [1, 0]])
B = np.array([[-1, 3], [4, 2]])

print("A + B:\n", A + B)
print("A - B:\n", A - B)

A + B:
 [[1 8]
 [5 2]]
A - B:
 [[ 3  2]
 [-3 -2]]
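
Because matrix addition works element by element, it inherits the familiar rules of ordinary addition. The sketch below checks a few of them numerically for the matrices above; it illustrates the rules on one example and is not a proof. The zero matrix `Z` and the result names are illustrative.

```python
import numpy as np

A = np.array([[2, 5], [1, 0]])
B = np.array([[-1, 3], [4, 2]])
Z = np.zeros_like(A)      # zero matrix with the same shape as A

commutes = np.array_equal(A + B, B + A)         # order does not matter
zero_is_neutral = np.array_equal(A + Z, A)      # adding zeros changes nothing
undo_subtract = np.array_equal((A - B) + B, A)  # subtraction undoes addition

print("A + B == B + A:  ", commutes)
print("A + 0 == A:      ", zero_is_neutral)
print("(A - B) + B == A:", undo_subtract)
```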


### Scalar Multiplication

**Vectors:** Each element of a vector is multiplied by a scalar to obtain a new vector.

Let $ \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} $ and *c* be a scalar.

$ c\mathbf{v} = \begin{bmatrix} cv_1 \\ cv_2 \\ \vdots \\ cv_n \end{bmatrix} $

**Example:**
If $ \mathbf{v} = \begin{bmatrix} 2 \\ -1 \\ 4 \end{bmatrix} $ and *c* = 3, then

$ 3\mathbf{v} = \begin{bmatrix} 3\cdot2 \\ 3\cdot(-1) \\ 3\cdot4 \end{bmatrix} = \begin{bmatrix} 6 \\ -3 \\ 12 \end{bmatrix} $

<br>

In [13]:
import numpy as np

v = np.array([2, -1, 4])
c = 3

print("c * v:", c * v)

c * v: [ 6 -3 12]


**Matrices:** Each element of a matrix is multiplied by a scalar to obtain a new matrix.

Let $ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} $ and *c* be a scalar.

$ c\mathbf{A} = \begin{bmatrix} c \cdot a_{11} & c \cdot a_{12} \\ c \cdot a_{21} & c \cdot a_{22} \end{bmatrix} $

**Example:**
If $ \mathbf{A} = \begin{bmatrix} 1 & -2 \\ 3 & 4 \end{bmatrix} $ and *c* = -2, then

$ -2\mathbf{A} = \begin{bmatrix} -2\cdot1 & -2\cdot(-2) \\ -2\cdot3 & -2\cdot4 \end{bmatrix} = \begin{bmatrix} -2 & 4 \\ -6 & -8 \end{bmatrix} $

<br>

In [14]:
import numpy as np

A = np.array([[1, -2], [3, 4]])
c = -2

print("c * A:\n", c * A)

c * A:
 [[-2  4]
 [-6 -8]]
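
Scalar multiplication and addition interact in a predictable way: multiplying a sum by a scalar gives the same result as multiplying each term first, and subtraction can be written as adding a matrix scaled by -1. The sketch below checks both statements on one example; the second matrix `B` is borrowed from the addition example and is otherwise an arbitrary choice.

```python
import numpy as np

A = np.array([[1, -2], [3, 4]])
B = np.array([[-1, 3], [4, 2]])   # arbitrary second matrix for illustration
c = -2

lhs = c * (A + B)                 # scale the sum
rhs = c * A + c * B               # sum the scaled matrices
distributes = np.array_equal(lhs, rhs)

subtract_as_add = np.array_equal(A - B, A + (-1) * B)

print("c(A + B) == cA + cB:", distributes)
print("A - B == A + (-1)B: ", subtract_as_add)
```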


## 4. Real-World Example: Processing Customer Data

Suppose an e-commerce company wants to analyze the purchasing habits of its customers, using a dataset like this:

| Customer ID | Age | Spending (€) | Number of Products Purchased |
| :--------: | :-: | :----------: | :----------------------: |
|     1      | 25  |     500      |            3             |
|     2      | 32  |     1200     |            5             |
|     3      | 41  |     850      |            2             |
|     4      | 28  |     300      |            1             |

We can represent this dataset as a matrix:

$
\mathbf{X} = \begin{bmatrix}
25 & 500 & 3 \\
32 & 1200 & 5 \\
41 & 850 & 2 \\
28 & 300 & 1
\end{bmatrix}
$

Here, each row represents a customer, and each column represents a feature (age, spending, number of products).

Now, let's perform some analyses using the basic operations we've learned:

### Average Spending
To find the average spending of customers, we can take the vector representing the "Spending (€)" column, sum its elements, and divide by the number of elements. In other words, the mean is an addition followed by multiplication by the scalar $1/n$, where $n$ is the number of elements.

*   First, take the spending column as a vector: $\mathbf{s} = \begin{bmatrix} 500 \\ 1200 \\ 850 \\ 300 \end{bmatrix}$

*   Sum the elements of this vector: $500 + 1200 + 850 + 300 = 2850$

*   Divide the sum by the number of elements (4): $2850 / 4 = 712.5$

*   So, the average spending of customers is 712.5 €.

In [15]:
import numpy as np

X = np.array([[25, 500, 3],
              [32, 1200, 5],
              [41, 850, 2],
              [28, 300, 1]])

spending = X[:, 1]  # Take all rows (:) and the 2nd column (index 1)
average_spending = np.mean(spending)
print("Average Spending:", average_spending)

Average Spending: 712.5
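
The manual steps above can also be written out with the basic operations alone, which makes it visible that `np.mean` is just a sum followed by multiplication by the scalar $1/n$. The sketch below also computes the mean of every column at once with `axis=0`; the variable names are illustrative.

```python
import numpy as np

X = np.array([[25, 500, 3],
              [32, 1200, 5],
              [41, 850, 2],
              [28, 300, 1]])

s = X[:, 1]                   # spending column as a vector
n = s.size                    # number of customers: 4
total = s.sum()               # 500 + 1200 + 850 + 300 = 2850
average = (1 / n) * total     # scalar multiplication by 1/n

print("Average by hand:", average)
print("Matches np.mean:", np.isclose(average, np.mean(s)))

# Mean of each feature (age, spending, number of products) in one call.
column_means = X.mean(axis=0)
print("Column means:", column_means)
```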


### Age Standardization
Let's say we want to standardize the ages of the customers (rescale them so that they have a mean of 0 and a standard deviation of 1). This can improve the performance of some machine learning algorithms, particularly those that are sensitive to the scale of the features.

*   First, take the age column as a vector: $\mathbf{a} = \begin{bmatrix} 25 \\ 32 \\ 41 \\ 28 \end{bmatrix}$

*   Calculate the mean of this vector: $(25 + 32 + 41 + 28) / 4 = 31.5$

*   Subtract the mean from each age: $\begin{bmatrix} 25-31.5 \\ 32-31.5 \\ 41-31.5 \\ 28-31.5 \end{bmatrix} = \begin{bmatrix} -6.5 \\ 0.5 \\ 9.5 \\ -3.5 \end{bmatrix}$

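*   Calculate the standard deviation of the ages (we'll let Python handle this for now). Subtracting the mean does not change the standard deviation, so it can be computed on either the original or the centered vector; with NumPy's default (population) formula it is $\sqrt{36.25} \approx 6.02$.

*   Divide each centered element by the standard deviation. Dividing by $\sigma$ is the same as multiplying by the scalar $1/\sigma$, so this is also a scalar multiplication.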

In [16]:
import numpy as np

ages = X[:, 0]  # Take all rows and the 1st column (index 0)
mean_age = np.mean(ages)
std_dev = np.std(ages)
standardized_ages = (ages - mean_age) / std_dev
print("Standardized Ages:", standardized_ages)

Standardized Ages: [-1.07959124  0.08304548  1.57786412 -0.58131836]
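
As a sanity check, the sketch below recomputes the standard deviation from the centered ages using only elementwise operations, then standardizes all three columns of the customer matrix at once and confirms that each column ends up with mean 0 and standard deviation 1. It assumes the population formula (dividing by $n$), which is what `np.std` uses by default; some libraries divide by $n - 1$ instead and will give slightly different values.

```python
import numpy as np

X = np.array([[25, 500, 3],
              [32, 1200, 5],
              [41, 850, 2],
              [28, 300, 1]])

# Standard deviation of the ages "by hand" (population formula, divide by n).
ages = X[:, 0]
centered = ages - ages.mean()
std_by_hand = np.sqrt((centered ** 2).sum() / ages.size)
print("Std by hand:", std_by_hand)
print("Matches np.std:", np.isclose(std_by_hand, np.std(ages)))

# Standardize every column: subtract each column's mean, divide by its std.
col_means = X.mean(axis=0)
col_stds = X.std(axis=0)
Z = (X - col_means) / col_stds   # the per-column values are applied to every row

print("Column means ~ 0:", np.allclose(Z.mean(axis=0), 0))
print("Column stds  ~ 1:", np.allclose(Z.std(axis=0), 1))
```

The first column of `Z` reproduces the standardized ages printed above.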
