# Tensors in Deep Learning: A Simple Mathematical Explanation



In the context of deep learning, **tensors** are fundamental data structures. They generalize scalars, vectors, and matrices to higher dimensions and are central to the functioning of deep neural networks.

#### 1. **Mathematical Definition of Tensors**

Mathematically, a **tensor** is usually defined as a **multilinear map** that takes vectors and covectors (linear functionals on vectors) as its arguments. Scalars, vectors, and matrices are the lowest-order special cases:

- **Scalars**: A scalar is a 0-dimensional tensor (just a number).
- **Vectors**: A vector is a 1-dimensional tensor, a collection of numbers arranged in a single row or column.
- **Matrices**: A matrix is a 2-dimensional tensor, an array of numbers with rows and columns.
- **Higher-Dimensional Tensors**: Tensors with three or more axes (3D, 4D, etc.). The sizes of the axes together make up the tensor's shape.

In pure mathematics, a tensor $T$ is a multilinear map:

$$
T: V_1 \times V_2 \times \cdots \times V_k \rightarrow W
$$

where $V_1, V_2, \dots, V_k$ are vector spaces and $W$ is the output space (often simply the field of scalars, such as $\mathbb{R}$).

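The important idea is that a tensor defines a mapping between vector spaces. The key property is that tensors are **multilinear**: they are linear in each argument separately while the other arguments are held fixed, so that $T(\dots, a u + b v, \dots) = a \, T(\dots, u, \dots) + b \, T(\dots, v, \dots)$.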
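As a small numerical illustration of multilinearity, the sketch below builds a bilinear map $T(u, v) = \sum_{ij} A_{ij} u_i v_j$ from a coefficient array. The array `A`, the test vectors, and the use of NumPy are assumptions chosen for illustration; they do not come from any particular library or model.

```python
import numpy as np

# Hypothetical coefficient array defining T(u, v) = sum_ij A[i, j] * u[i] * v[j]
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

def T(u, v):
    """Bilinear map R^2 x R^2 -> R with components A."""
    return float(np.einsum("ij,i,j->", A, u, v))

u1 = np.array([1.0, 0.0])
u2 = np.array([0.0, 1.0])
v = np.array([2.0, -1.0])
a, b = 3.0, -2.0

# Linearity in the first argument, with the second argument held fixed
lhs = T(a * u1 + b * u2, v)
rhs = a * T(u1, v) + b * T(u2, v)
print(lhs, rhs)  # both equal -4.0

# Scaling BOTH arguments by 2 scales the result by 4, not 2:
# the map is linear in each slot, not linear in the pair (u, v)
print(T(2 * u2, 2 * v), 4 * T(u2, v))  # both equal 8.0
```

The second check is the reason for the word "multilinear" rather than "linear": each argument contributes its own factor.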



#### 2. **Tensors in Deep Learning**

In deep learning, tensors are used to store data, weights, and activations. When we use a **tensor** in deep learning, we are typically working with a **multidimensional array**.

- **Scalars** are single numbers (0D tensor).
- **Vectors** are arrays with one dimension (1D tensor).
- **Matrices** are 2D arrays (2D tensor).
- **Higher-dimensional arrays** (e.g., 3D, 4D arrays) represent more complex data, such as a color image (3D: height, width, and color channels) or a batch of images (4D: one extra axis indexing the images in the batch).
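To make the image and batch shapes concrete, here is a minimal sketch using NumPy arrays as stand-ins for framework tensors. The sizes (32x32 RGB images, batches of 8) are hypothetical. The axis order is a convention: the sketch starts from a channels-last layout and converts it to channels-first.

```python
import numpy as np

# Hypothetical sizes: 32x32 RGB images, 8 images per batch
height, width, channels, batch_size = 32, 32, 3, 8

# One color image: 3 axes (height, width, channels)
image = np.zeros((height, width, channels))

# A batch of images: 4 axes (batch, height, width, channels)
batch = np.zeros((batch_size, height, width, channels))

print(image.ndim, image.shape)  # 3 (32, 32, 3)
print(batch.ndim, batch.shape)  # 4 (8, 32, 32, 3)

# Reorder the axes to a channels-first layout (batch, channels, height, width)
batch_channels_first = np.transpose(batch, (0, 3, 1, 2))
print(batch_channels_first.shape)  # (8, 3, 32, 32)
```

Both layouts hold the same numbers. Only the meaning assigned to each axis differs, so it is worth checking which layout a given layer or library expects.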

#### 3. **Tensor Code Implementation Using Python**

#### PyTorch Example:

```python
import torch

# Scalar (0D tensor)
scalar = torch.tensor(5)
print("Scalar (0D tensor):")
print(scalar)
print(scalar.shape)  # Shape is empty, as it's just a number

# Vector (1D tensor)
vector = torch.tensor([1, 2, 3])
print("\nVector (1D tensor):")
print(vector)
print(vector.shape)  # Shape is (3,), a 1D tensor with 3 elements

# Matrix (2D tensor)
matrix = torch.tensor([[1, 2, 3], [4, 5, 6]])
print("\nMatrix (2D tensor):")
print(matrix)
print(matrix.shape)  # Shape is (2, 3), a 2D tensor with 2 rows and 3 columns

# Higher-Dimensional Tensor (3D tensor)
tensor_3d = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("\nHigher-Dimensional Tensor (3D tensor):")
print(tensor_3d)
print(tensor_3d.shape)  # Shape is (2, 2, 2), a 3D tensor
```

```text
Scalar (0D tensor):
tensor(5)
torch.Size([])

Vector (1D tensor):
tensor([1, 2, 3])
torch.Size([3])

Matrix (2D tensor):
tensor([[1, 2, 3],
        [4, 5, 6]])
torch.Size([2, 3])

Higher-Dimensional Tensor (3D tensor):
tensor([[[1, 2],
         [3, 4]],

        [[5, 6],
         [7, 8]]])
torch.Size([2, 2, 2])
```


#### TensorFlow Example:

```python
import tensorflow as tf

# Scalar (0D tensor)
scalar = tf.constant(5)
print("Scalar (0D tensor):")
print(scalar)
print(scalar.shape)  # Shape is empty, as it's just a number

# Vector (1D tensor)
vector = tf.constant([1, 2, 3])
print("\nVector (1D tensor):")
print(vector)
print(vector.shape)  # Shape is (3,), a 1D tensor with 3 elements

# Matrix (2D tensor)
matrix = tf.constant([[1, 2, 3], [4, 5, 6]])
print("\nMatrix (2D tensor):")
print(matrix)
print(matrix.shape)  # Shape is (2, 3), a 2D tensor with 2 rows and 3 columns

# Higher-Dimensional Tensor (3D tensor)
tensor_3d = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("\nHigher-Dimensional Tensor (3D tensor):")
print(tensor_3d)
print(tensor_3d.shape)  # Shape is (2, 2, 2), a 3D tensor
```


```text
Scalar (0D tensor):
tf.Tensor(5, shape=(), dtype=int32)
()

Vector (1D tensor):
tf.Tensor([1 2 3], shape=(3,), dtype=int32)
(3,)

Matrix (2D tensor):
tf.Tensor(
[[1 2 3]
 [4 5 6]], shape=(2, 3), dtype=int32)
(2, 3)

Higher-Dimensional Tensor (3D tensor):
tf.Tensor(
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]], shape=(2, 2, 2), dtype=int32)
(2, 2, 2)
```


#### 4. **Linking Tensors to Multilinear Maps in Math**

The link to **multilinear maps** is that, once a basis is chosen for each vector space, a multilinear map is fully described by a **multidimensional array** of components, with one axis per argument. The arrays used in deep learning can be read as such component arrays, although frameworks use the word "tensor" for any multidimensional array, whether or not it is being used as a multilinear map.

For instance, in deep learning, the tensor $X$ representing an image can be mapped via a set of weights $W$ (another tensor) to produce an output $Y$ (another tensor). This relationship is linear in $X$ for fixed $W$ and linear in $W$ for fixed $X$, so the map $(W, X) \mapsto Y$ is bilinear, which is a multilinear map with two arguments:

$$
Y = W \cdot X
$$

A neural network applies a transformation of this kind in each layer, usually followed by a bias term and a nonlinear activation function, so the network as a whole is not linear even though each weight multiplication is.
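The linearity of $Y = W \cdot X$ can be checked numerically. The sketch below uses NumPy with small hypothetical weights and inputs. It also applies a ReLU activation, which is an assumption added here to show the kind of nonlinearity commonly placed after the weight multiplication.

```python
import numpy as np

# Hypothetical weights mapping 3 inputs to 2 outputs
W = np.array([[1.0, -1.0, 0.0],
              [0.0, 2.0, 1.0]])
x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([0.0, 1.0, -1.0])

def linear_layer(x):
    """Y = W . X for a single input vector."""
    return W @ x

def relu(z):
    """Element-wise ReLU activation (an illustrative choice)."""
    return np.maximum(z, 0.0)

# The weight multiplication is linear in the input
combined = linear_layer(2 * x1 + 3 * x2)
separate = 2 * linear_layer(x1) + 3 * linear_layer(x2)
is_linear = np.allclose(combined, separate)
print(combined, separate, is_linear)  # both are [-5, 17], so True

# With ReLU applied, negating the input no longer negates the output
relu_is_linear = np.allclose(relu(linear_layer(-x1)), -relu(linear_layer(x1)))
print(relu_is_linear)  # False
```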

In summary:
- A **tensor** is a generalized data structure used in deep learning to represent multi-dimensional data.
- Mathematically, a tensor is a **multilinear map** between vector spaces.
- Tensors generalize scalars, vectors, and matrices to higher dimensions and support operations such as element-wise addition and matrix multiplication along chosen axes.
- In deep learning, these tensors represent input data, model parameters (weights), and activations in neural networks.



#### 5. **Other Applications of Tensors: Differential Geometry**

Differential geometry is the mathematical framework used to study curves, surfaces, and more general spaces. It’s crucial for understanding the curvature of spacetime in general relativity. Here's how tensors come into play:

#### a) **Manifolds**:
In general relativity, spacetime is modeled as a **4-dimensional manifold** (a space that locally looks like flat space, but globally can be curved). Tensors provide the tools to describe the geometry of these manifolds.

#### b) **Metric Tensor** ($ g_{\mu\nu} $):
The **metric tensor** is a key tensor in differential geometry. It describes the structure of spacetime, telling us how distances are measured. In the context of general relativity:
- The **metric** determines how we compute the length of a curve or the distance between two points in spacetime.
- The curvature of spacetime is encoded in how the metric changes from point to point.
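Concretely, the metric gives the squared interval between two nearby points as $ds^2 = g_{\mu\nu} \, dx^\mu \, dx^\nu$. The sketch below evaluates this for the flat (Minkowski) metric. The signature $(+, -, -, -)$, the units with $c = 1$, and the sample displacements are assumptions chosen for illustration.

```python
import numpy as np

# Flat (Minkowski) metric, assumed signature (+, -, -, -), units with c = 1
g = np.diag([1.0, -1.0, -1.0, -1.0])

def interval_squared(dx):
    """ds^2 = g_{mu nu} dx^mu dx^nu for a displacement dx = (dt, dx, dy, dz)."""
    return float(np.einsum("m,mn,n->", dx, g, dx))

timelike = np.array([2.0, 1.0, 0.0, 0.0])  # slower than light
null = np.array([1.0, 1.0, 0.0, 0.0])      # path of a light ray

print(interval_squared(timelike))  # 2^2 - 1^2 = 3.0
print(interval_squared(null))      # 1^2 - 1^2 = 0.0
```

In a curved spacetime the components of `g` would depend on position, and the same contraction would be applied point by point along a curve.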

#### c) **Curvature Tensors**:
To understand how spacetime curves in response to mass and energy, differential geometry uses **curvature tensors**, including:
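- **Riemann curvature tensor** ($R^\rho_{\sigma\mu\nu}$): Describes the full curvature of spacetime at a point, for example how a vector changes when it is parallel-transported around a small closed loop.
- **Ricci tensor** ($R_{\mu\nu}$): The contraction (trace) of the Riemann tensor, $R_{\mu\nu} = R^\rho_{\mu\rho\nu}$. It is the part of the curvature that Einstein's equations relate directly to matter and energy.
- **Ricci scalar** ($R$): The trace of the Ricci tensor, $R = g^{\mu\nu} R_{\mu\nu}$, a single number summarizing the curvature of spacetime at a point.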

#### d) **Geodesics**:
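A **geodesic** is the straightest possible path in a curved space. On an ordinary curved surface it is also the shortest path between two nearby points. In flat space (Euclidean geometry), geodesics are straight lines, but in curved space (spacetime), they can be curved. The Christoffel symbols, which are computed from the metric tensor, determine which paths are geodesics.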

The **geodesic equation**:

$$
\frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\rho\sigma} \frac{dx^\rho}{d\tau} \frac{dx^\sigma}{d\tau} = 0
$$

Where:
- $\frac{d^2 x^\mu}{d\tau^2}$: Coordinate acceleration of an object along its path.
- $\Gamma^\mu_{\rho\sigma}$: **Christoffel symbols** (connection coefficients), which encode how the spacetime metric changes as you move through spacetime. They are built from first derivatives of the metric.
- $\tau$: Proper time (the time experienced by an object moving along the geodesic).

This equation governs the motion of objects under gravity, where gravity is described as the curvature of spacetime rather than a force in the traditional sense.
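As a worked example of the geodesic equation, the sketch below uses SymPy to compute the Christoffel symbols from a metric via $\Gamma^k_{ij} = \frac{1}{2} g^{kl} (\partial_i g_{lj} + \partial_j g_{li} - \partial_l g_{ij})$. To keep the output small it uses the surface of a sphere of radius $r$ instead of a spacetime metric. That choice, and the helper names `christoffel` and `geodesic_acceleration`, are assumptions made for illustration.

```python
import sympy as sp

# Assumed example: a 2-sphere of radius r with coordinates (theta, phi)
theta, phi, r = sp.symbols("theta phi r", positive=True)
coords = [theta, phi]
n = len(coords)

# Metric of the sphere: ds^2 = r^2 dtheta^2 + r^2 sin(theta)^2 dphi^2
g = sp.Matrix([[r**2, 0],
               [0, r**2 * sp.sin(theta)**2]])
g_inv = g.inv()

def christoffel(k, i, j):
    """Gamma^k_{ij} = 1/2 g^{kl} (d_i g_{lj} + d_j g_{li} - d_l g_{ij})."""
    total = sum(
        g_inv[k, l] * (sp.diff(g[l, j], coords[i])
                       + sp.diff(g[l, i], coords[j])
                       - sp.diff(g[i, j], coords[l]))
        for l in range(n)
    )
    return sp.simplify(total / 2)

# Gamma[k][i][j] holds Gamma^k_{ij}
Gamma = [[[christoffel(k, i, j) for j in range(n)]
          for i in range(n)] for k in range(n)]

def geodesic_acceleration(velocity):
    """Geodesic equation solved for d^2 x^k / dtau^2 = -Gamma^k_{ij} v^i v^j."""
    return [
        sp.simplify(-sum(Gamma[k][i][j] * velocity[i] * velocity[j]
                         for i in range(n) for j in range(n)))
        for k in range(n)
    ]

# The two independent non-zero symbols: -sin(theta)*cos(theta) and cos(theta)/sin(theta)
# (SymPy may print equivalent trigonometric forms)
print("Gamma^theta_{phi phi} =", Gamma[0][1][1])
print("Gamma^phi_{theta phi} =", Gamma[1][0][1])

# Motion along the equator: theta = pi/2, velocity purely in the phi direction
acc = geodesic_acceleration([0, 1])
acc_equator = [a.subs(theta, sp.pi / 2) for a in acc]
print("Acceleration on the equator:", acc_equator)
```

The acceleration vanishes on the equator, which is consistent with the equator being a great circle and therefore a geodesic of the sphere.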

#### 6. **Einstein’s Equations**

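Einstein's field equations relate the curvature of spacetime to the matter and energy it contains, and both sides of the equations are tensors:

$$
R_{\mu\nu} - \frac{1}{2} R \, g_{\mu\nu} = \frac{8 \pi G}{c^4} T_{\mu\nu}
$$

Here $T_{\mu\nu}$ is the stress-energy tensor, $G$ is Newton's gravitational constant, and $c$ is the speed of light (the cosmological-constant term is omitted).

- Code example: the cell below only sets up the tensors that appear in the equations (metric tensor, Ricci tensor, and stress-energy tensor) as SymPy matrices with hand-entered values. It does not solve the equations. A real calculation would derive the Ricci tensor from the metric and solve the resulting differential equations: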

```python
from sympy import Matrix

# Metric tensor of flat (Minkowski) spacetime, signature (+, -, -, -)
g = Matrix([[1, 0, 0, 0],
            [0, -1, 0, 0],
            [0, 0, -1, 0],
            [0, 0, 0, -1]])

# Ricci tensor, entered by hand. It is zero here because flat spacetime has no curvature.
# In general it is computed from the Riemann curvature tensor, which is derived from the metric.
R = Matrix([[0, 0, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]])

# Stress-energy tensor with illustrative placeholder values.
# The field equations read: Einstein tensor = constant * stress-energy tensor,
# so a flat metric would actually require T = 0. These values are for display only.
T = Matrix([[0, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]])

print("Metric Tensor (g):")
print(g)

print("\nRicci Tensor (R):")
print(R)

print("\nStress-Energy Tensor (T):")
print(T)
```


```text
Metric Tensor (g):
Matrix([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]])

Ricci Tensor (R):
Matrix([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])

Stress-Energy Tensor (T):
Matrix([[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
```
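The Ricci tensor above was typed in by hand. As a rough sketch of how it could instead be derived from the metric, the code below computes the Christoffel symbols, contracts the Riemann tensor into the Ricci tensor, and takes the Ricci scalar. The helper names `christoffel_symbols`, `ricci_tensor`, and `ricci_scalar` are hypothetical, and the Riemann sign and index conventions are one common choice. A 2-sphere is included as an assumed curved example for contrast with flat spacetime.

```python
import sympy as sp

def christoffel_symbols(g, coords):
    """Return G with G[k][i][j] = Gamma^k_{ij}, computed from the metric g."""
    n = len(coords)
    g_inv = g.inv()
    return [[[sp.simplify(sum(
                 g_inv[k, l] * (sp.diff(g[l, j], coords[i])
                                + sp.diff(g[l, i], coords[j])
                                - sp.diff(g[i, j], coords[l]))
                 for l in range(n)) / 2)
              for j in range(n)] for i in range(n)] for k in range(n)]

def ricci_tensor(g, coords):
    """Ricci tensor R_{sigma nu} = R^rho_{sigma rho nu} (contracted Riemann tensor)."""
    n = len(coords)
    G = christoffel_symbols(g, coords)

    def riemann(rho, sigma, mu, nu):
        # R^rho_{sigma mu nu} = d_mu Gamma^rho_{nu sigma} - d_nu Gamma^rho_{mu sigma}
        #   + Gamma^rho_{mu lam} Gamma^lam_{nu sigma} - Gamma^rho_{nu lam} Gamma^lam_{mu sigma}
        expr = (sp.diff(G[rho][nu][sigma], coords[mu])
                - sp.diff(G[rho][mu][sigma], coords[nu]))
        expr += sum(G[rho][mu][lam] * G[lam][nu][sigma]
                    - G[rho][nu][lam] * G[lam][mu][sigma]
                    for lam in range(n))
        return expr

    rows = [[sp.simplify(sum(riemann(rho, sigma, rho, nu) for rho in range(n)))
             for nu in range(n)] for sigma in range(n)]
    return sp.Matrix(rows)

def ricci_scalar(g, ricci):
    """Ricci scalar R = g^{mu nu} R_{mu nu}."""
    g_inv = g.inv()
    n = g.shape[0]
    return sp.simplify(sum(g_inv[i, j] * ricci[i, j]
                           for i in range(n) for j in range(n)))

# Flat Minkowski metric: the derived Ricci tensor is the zero matrix
t, x, y, z = sp.symbols("t x y z")
g_flat = sp.diag(1, -1, -1, -1)
ricci_flat = ricci_tensor(g_flat, [t, x, y, z])
print(ricci_flat)

# Assumed curved example: 2-sphere of radius r
theta, phi, r = sp.symbols("theta phi r", positive=True)
g_sphere = sp.Matrix([[r**2, 0],
                      [0, r**2 * sp.sin(theta)**2]])
ricci_sphere = ricci_tensor(g_sphere, [theta, phi])
R_sphere = ricci_scalar(g_sphere, ricci_sphere)
print(ricci_sphere)
print(R_sphere)  # mathematically equal to 2/r**2
```

The flat case reproduces the all-zero matrix entered by hand above, while the sphere has constant positive curvature. This brute-force loop is fine for small symbolic metrics, but realistic spacetime metrics usually need more careful simplification.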


#### 7. **Resources:**

- [PyTorch Documentation](https://pytorch.org/docs/stable/tensors.html)
- [TensorFlow Documentation](https://www.tensorflow.org/api_docs/python/tf/Tensor)

