# 2.3 Linear Algebra

By now, we can load datasets into tensors and manipulate these tensors with basic mathematical operations. To start building sophisticated models, we will also need a few tools from linear algebra. This section offers a gentle introduction to the most essential concepts.

### 2.3.1 Scalars
Most everyday mathematics consists of manipulating numbers one at a time. Formally, we call these values **scalars**.

For example, the temperature in Palo Alto is a balmy $72$ degrees Fahrenheit. If you wanted to convert the temperature to Celsius you would evaluate the expression $c = \frac{5}{9}(f - 32)$, setting $f$ to $72$. In this equation, the values $5$, $9$, and $32$ are constant scalars. The variables $c$ and $f$ in general represent unknown scalars.

We denote scalars by ordinary lower-cased letters (e.g., $x$, $y$, and $z$) and the space of all (continuous) *real-valued* scalars by $\mathbb{R}$.

Remember that the expression $x \in \mathbb{R}$
is a formal way to say that $x$ is a real-valued scalar.
The symbol $\in$ (pronounced "in")
denotes membership in a set.

(**Scalars are implemented as tensors 
that contain only one element.**)
Below, we assign two scalars
and perform the familiar addition, subtraction, multiplication,
division, and exponentiation operations.

In [2]:
import torch

x = torch.tensor(3.0)
y = torch.tensor(2.0)
x+y, x-y, x*y, x/y, x**y

(tensor(5.), tensor(1.), tensor(6.), tensor(1.5000), tensor(9.))
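The temperature conversion from the start of this section can be evaluated in the same way. The following is a minimal sketch; the variable names `f` and `c` simply mirror the symbols in the formula, and `item` is used to pull the result out as an ordinary Python number.

```python
import torch

# Fahrenheit temperature from the example above, stored as a one-element tensor
f = torch.tensor(72.0)

# c = 5/9 * (f - 32); the constants 5, 9, and 32 are plain Python scalars
c = 5 / 9 * (f - 32)

print(c)         # a 0th-order tensor holding roughly 22.2
print(c.item())  # the same value as an ordinary Python float
```

Mixing Python numbers with one-element tensors works here because the arithmetic operators accept both.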

### 2.3.2 Vectors

[**You can think of a vector as a fixed-length array of scalars.**]
We call these scalars the *elements* of the vector.

For example, if we were training a model to predict the risk of a loan defaulting, we might associate each applicant with a vector whose components correspond to quantities like their income, length of employment, or number of previous defaults.

If we were studying the risk of heart attack, each vector might represent a patient and its components might correspond to
their most recent vital signs, cholesterol levels, minutes of exercise per day, etc.

We denote vectors by bold lowercase letters
(e.g., $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$).

Vectors are implemented as $1^{\textrm{st}}$-order tensors.

In general, such tensors can have arbitrary lengths,
subject to memory limitations. Caution: in Python, as in most programming languages, vector indices start at $0$, also known as *zero-based indexing*, whereas in linear algebra subscripts begin at $1$ (one-based indexing).


In [5]:
x = torch.arange(3)

We can refer to an element of a vector by using a subscript.
For example, $x_2$ denotes the second element of $\mathbf{x}$. 
Since $x_2$ is a scalar, we do not bold it.
By default, we visualize vectors 
by stacking their elements vertically.

$$\mathbf{x} =\begin{bmatrix}x_{1}  \\ \vdots  \\x_{n}\end{bmatrix}.$$


Here $x_1, \ldots, x_n$ are elements of the vector.
Later on, we will distinguish between such *column vectors*
and *row vectors* whose elements are stacked horizontally.
Recall that [**we access a tensor's elements via indexing.**]

In [6]:
x[2]

tensor(2)
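Because of the offset between the two conventions, the mathematical element $x_i$ sits at position `i - 1` in code, so `x[2]` above is $x_3$ rather than $x_2$. The sketch below rebuilds the same vector and makes the correspondence explicit; the names `first`, `third`, and `last` are chosen only for illustration.

```python
import torch

x = torch.arange(3)  # elements 0, 1, 2

# Math subscripts are one-based, Python indices are zero-based:
# x_1 is x[0], x_2 is x[1], and x_3 is x[2].
first = x[0]
third = x[2]

# Negative indices count from the end, so x[-1] is also the last element x_3.
last = x[-1]

print(first, third, last)
```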

To indicate that a vector contains $n$ elements,
we write $\mathbf{x} \in \mathbb{R}^n$.

Formally, we call $n$ the *dimensionality* of the vector.
[**In code, this corresponds to the tensor's length**],
accessible via Python's built-in `len` function.

In [7]:
len(x)

3

We can also access the length via the `shape` attribute. The shape is a tuple that indicates a tensor’s length along each axis. Tensors with just one axis have shapes with just one element.

In [8]:
x.shape

torch.Size([3])

The word “dimension” gets overloaded to mean both the number of axes and the length along a particular axis. 

To avoid this confusion, we use **order** to refer to the number of axes and **dimensionality** exclusively to refer to the number of components.
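The distinction can be checked directly in code. In this sketch, `ndim` reports the order (number of axes), while `shape` lists the length along each axis; the tensor names `s`, `v`, and `M` are chosen only for illustration.

```python
import torch

s = torch.tensor(3.0)              # scalar: no axes
v = torch.arange(3)                # vector: one axis of length 3
M = torch.arange(6).reshape(3, 2)  # two axes of lengths 3 and 2

for name, t in [("s", s), ("v", v), ("M", M)]:
    # ndim is the order; shape gives the length along each axis
    print(name, "order:", t.ndim, "shape:", tuple(t.shape))

# len returns the length along the first axis only
print(len(v), len(M))
```

Note that `len` is only a shortcut for the first entry of `shape`, so it coincides with the dimensionality only for tensors with a single axis.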

### 2.3.3 Matrices

Just as scalars are $0^{\textrm{th}}$-order tensors and vectors are $1^{\textrm{st}}$-order tensors, matrices are $2^{\textrm{nd}}$-order tensors.

We denote matrices by bold capital letters (e.g., $\mathbf{X}$, $\mathbf{Y}$, and $\mathbf{Z}$), and represent them in code by tensors with two axes.

The expression $\mathbf{A} \in \mathbb{R}^{m \times n}$
indicates that a matrix $\mathbf{A}$ contains $m \times n$ real-valued scalars, arranged as $m$ rows and $n$ columns.

When $m = n$, we say that a matrix is *square*.

Visually, we can illustrate any matrix as a table. To refer to an individual element, we subscript both the row and column indices, e.g.,
$a_{ij}$ is the value that belongs to $\mathbf{A}$'s
$i^{\textrm{th}}$ row and $j^{\textrm{th}}$ column:

$$\mathbf{A}=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ \end{bmatrix}.$$


In code, we represent a matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$
by a $2^{\textrm{nd}}$-order tensor with shape ($m$, $n$).
[**We can convert any tensor with exactly $mn$ elements
into an $m \times n$ matrix**]
by passing the desired shape to `reshape`:


In [9]:
A = torch.arange(6).reshape(3,2)
A

tensor([[0, 1],
        [2, 3],
        [4, 5]])
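`reshape` only succeeds when the requested shape accounts for exactly as many elements as the input holds. The sketch below illustrates this, along with the `-1` placeholder, which asks PyTorch to infer one axis length from the others; the names `v` and `B` are chosen only for illustration.

```python
import torch

v = torch.arange(6)  # 6 elements

# 3 * 2 == 6, so this works
A = v.reshape(3, 2)

# -1 lets PyTorch infer the missing length (here 6 / 2 == 3)
B = v.reshape(-1, 2)
print(torch.equal(A, B))

# 4 * 2 != 6, so this request is rejected
try:
    v.reshape(4, 2)
except RuntimeError as err:
    print("reshape failed:", err)
```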

Sometimes we want to flip the axes. When we exchange a matrix's rows and columns, the result is called its *transpose*.

Formally, we denote the transpose of a matrix $\mathbf{A}$ by $\mathbf{A}^\top$. If $\mathbf{B} = \mathbf{A}^\top$, then $b_{ij} = a_{ji}$ for all $i$ and $j$.

Thus, the transpose of an $m \times n$ matrix is an $n \times m$ matrix:

$$
\mathbf{A}^\top =
\begin{bmatrix}
    a_{11} & a_{21} & \dots  & a_{m1} \\
    a_{12} & a_{22} & \dots  & a_{m2} \\
    \vdots & \vdots & \ddots  & \vdots \\
    a_{1n} & a_{2n} & \dots  & a_{mn}
\end{bmatrix}.
$$

In code, we can access any (**matrix's transpose**) as follows:

In [10]:
A.T

tensor([[0, 2, 4],
        [1, 3, 5]])
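As a quick sanity check of the definition $b_{ij} = a_{ji}$, the following sketch compares the entries of a matrix and its transpose one by one and confirms that transposing twice returns the original matrix. The names `entries_match` and `round_trip` are chosen only for illustration.

```python
import torch

A = torch.arange(6).reshape(3, 2)
B = A.T  # shape (2, 3)

# Every entry of B at (i, j) equals the entry of A at (j, i)
entries_match = all(
    B[i, j] == A[j, i]
    for i in range(B.shape[0])
    for j in range(B.shape[1])
)

# Transposing twice recovers the original matrix
round_trip = torch.equal(B.T, A)

print(tuple(A.shape), tuple(B.shape), entries_match, round_trip)
```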

[**Symmetric matrices are the subset of square matrices
that are equal to their own transposes:
$\mathbf{A} = \mathbf{A}^\top$.**]
The following matrix is symmetric:

In [12]:
A = torch.tensor([[1, 2, 3], [2, 0, 4], [3, 4, 5]])

In [13]:
A == A.T

tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])
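The elementwise comparison returns one Boolean per entry. To reduce it to a single answer, one option is a small helper such as the hypothetical `is_symmetric` below (it is not a PyTorch function), which also checks that the matrix is square:

```python
import torch

def is_symmetric(M: torch.Tensor) -> bool:
    """Return True if M is a square matrix equal to its transpose."""
    # Only 2nd-order tensors with equal numbers of rows and columns qualify
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False
    # torch.equal compares shapes and all entries, returning a single bool
    return torch.equal(M, M.T)

A = torch.tensor([[1, 2, 3], [2, 0, 4], [3, 4, 5]])
print(is_symmetric(A))                               # symmetric
print(is_symmetric(torch.arange(6).reshape(3, 2)))   # not square
print(is_symmetric(torch.tensor([[1, 2], [3, 4]])))  # square, not symmetric
```

This sketch uses exact equality, which suits integer entries; for floating-point matrices a tolerance-based comparison such as `torch.allclose` would likely be the safer choice.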

Matrices are useful for representing datasets. 

Typically, rows correspond to individual records and columns correspond to distinct attributes.
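As an illustration of this layout, the sketch below stores a tiny made-up dataset in a matrix. The values and the choice of attributes (income, years of employment, previous defaults, echoing the loan example earlier) are invented purely for demonstration.

```python
import torch

# Each row is one applicant (record); each column is one attribute:
# [income in thousands, years of employment, previous defaults]
# The numbers are invented for illustration only.
X = torch.tensor([
    [52.0, 3.0, 0.0],
    [87.0, 10.0, 1.0],
    [34.0, 1.0, 2.0],
    [61.0, 6.0, 0.0],
])

n_records, n_attributes = X.shape

second_applicant = X[1]  # one row: all attributes of a single record
incomes = X[:, 0]        # one column: one attribute across all records

print(n_records, n_attributes)
print(second_applicant)
print(incomes)
```

With this convention, indexing a row retrieves a whole record, while slicing a column retrieves one attribute for every record.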

### 2.3.4 Tensors