# **What are Tensors?**

**Definition**
A tensor is a multi-dimensional array that generalizes scalars, vectors, and matrices to any number of dimensions and is designed for efficient mathematical computation.

---



## **Real-World Examples of Tensors**

### 1. Scalars (0D Tensors)

* **Represents:** A single value (used for constants or simple metrics).
* **Example:**

  * *Loss Value:* A single scalar result after model evaluation.
  * Example values: `5.0`, `-3.14`

### 2. Vectors (1D Tensors)

* **Represents:** A sequence or list of values.
* **Example:**

  * *Feature Vector:* Word embeddings in NLP as 1D vectors.
  * Example: `[0.12, -0.84, 0.33]` (e.g., from Word2Vec or GloVe)

### 3. Matrices (2D Tensors)

* **Represents:** Tabular or grid-like data.
* **Example:**

  * *Grayscale Images:* Stored as 2D tensors, each entry = pixel intensity.
  * Example:

    ```plaintext
    [[0, 255, 128],
     [34, 90, 180]]
    ```

### 4. 3D Tensors

* **Represents:** Color (RGB) images (height × width × channels).
* **Example:**

  * A 256×256 RGB image: Shape = `[256, 256, 3]`

### 5. 4D Tensors

* **Represents:** Batches of RGB images (adds batch size).
* **Example:**

  * Batch of 32 images, each 128×128 with 3 color channels:
    Shape = `[32, 128, 128, 3]`

### 6. 5D Tensors

* **Represents:** Video data (adds time dimension for frames).
* **Example:**

  * Batch of 10 video clips, each with 16 frames of 64×64 and 3 channels:
    Shape = `[10, 16, 64, 64, 3]`
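
The ranks listed above can be checked directly in PyTorch. This is a minimal sketch that assumes the channels-last layouts used in the examples; the dictionary keys and placeholder values are illustrative only.

```python
import torch

# One placeholder tensor per rank, using the shapes from the examples above.
examples = {
    "scalar (loss value)": torch.tensor(5.0),
    "vector (embedding)": torch.tensor([0.12, -0.84, 0.33]),
    "matrix (grayscale image)": torch.tensor([[0, 255, 128], [34, 90, 180]]),
    "RGB image": torch.zeros(256, 256, 3),
    "batch of RGB images": torch.zeros(32, 128, 128, 3),
    "batch of video clips": torch.zeros(10, 16, 64, 64, 3),
}

for name, t in examples.items():
    # ndim is the number of dimensions (the rank); shape lists the size of each one
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```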

---



## **Why Are Tensors Useful?**

### 1. Mathematical Operations

* **Tensors enable efficient mathematical computations** (addition, multiplication, dot product, etc.) necessary for neural network operations.

### 2. Representation of Real-world Data

* **Data like images, audio, videos, and text can be represented as tensors:**

  * **Images:** Represented as 3D tensors (height × width × channels).
  * **Text:** Tokenized and represented as 2D tensors (sequence length × embedding size) or, when batched, 3D tensors (batch size × sequence length × embedding size).

### 3. Efficient Computations

* **Tensors are optimized for hardware acceleration**, allowing computations on GPUs or TPUs, which are crucial for training deep learning models.

---




## **Where Are Tensors Used in Deep Learning?**

1. **Data Storage**

   * Training data (images, text, etc.) is stored in tensors.

2. **Weights and Biases**

   * The learnable parameters of a neural network (weights, biases) are stored as tensors.

3. **Matrix Operations**

   * Neural networks rely on operations such as matrix multiplication, dot products, and broadcasting, all of which are performed on tensors.

4. **Training Process**

   * During forward passes, tensors flow through the network.
   * Gradients, represented as tensors, are calculated during the backward pass.
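
To make the four roles concrete, here is a small sketch of one forward and one backward pass through a single linear layer. The data, layer size, and loss function are made up for illustration; only standard `torch` calls are used.

```python
import torch

torch.manual_seed(0)

# 1. Data storage: a batch of 4 samples with 3 features each, plus targets
inputs = torch.rand(4, 3)
targets = torch.rand(4, 1)

# 2. Weights and biases: learnable parameters, tracked by autograd
weights = torch.rand(3, 1, requires_grad=True)
bias = torch.zeros(1, requires_grad=True)

# 3. Matrix operations: matmul plus a broadcast bias (forward pass)
predictions = inputs @ weights + bias

# 4. Training process: a scalar loss, then gradients from the backward pass
loss = ((predictions - targets) ** 2).mean()
loss.backward()

print(loss.ndim)            # the loss is a 0D tensor
print(weights.grad.shape)   # gradient has the same shape as weights
print(bias.grad.shape)      # gradient has the same shape as bias
```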


# **Practical Example**

In [1]:
import torch

print(torch.__version__)

2.5.1+cu124


In [3]:
if torch.cuda.is_available():
  print("GPU is Available")
  print("Using GPU :",torch.cuda.get_device_name(0))
else:
  print("GPU not Available")

GPU is Available
Using GPU : Tesla T4


## **Creating a Tensor**

In [None]:
# using empty: allocates uninitialized storage of the given shape; the values shown are whatever was already in that memory
a = torch.empty(2,3)
a

tensor([[2.7543e+28, 4.5696e-41, 2.7543e+28],
        [4.5696e-41, 4.4842e-44, 0.0000e+00]])

In [None]:
# check type
type(a)

torch.Tensor

In [None]:
# using zeros
torch.zeros(2,3)

tensor([[0., 0., 0.],
        [0., 0., 0.]])

In [None]:
# using ones
torch.ones(2,3)

tensor([[1., 1., 1.],
        [1., 1., 1.]])

In [None]:
# using rand: uniform random numbers in the range [0, 1)
torch.rand(2,3)

tensor([[0.5716, 0.9077, 0.0494],
        [0.0350, 0.7057, 0.8845]])

In [None]:
# use of seed
# rand produces different values on every call; setting a manual seed gives the same random numbers each time (reproducibility)

In [None]:
# manual_seed
torch.manual_seed(100)
torch.rand(2,3)

tensor([[0.1117, 0.8158, 0.2626],
        [0.4839, 0.6765, 0.7539]])

In [None]:
torch.manual_seed(100)
torch.rand(2,3)

tensor([[0.1117, 0.8158, 0.2626],
        [0.4839, 0.6765, 0.7539]])
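
`torch.manual_seed` resets the global random state. As an alternative sketch, a `torch.Generator` keeps the seed local to the calls that receive it. The helper name `seeded_rand` is made up for this example.

```python
import torch

def seeded_rand(seed, *shape):
    """Return uniform random values drawn from a private, freshly seeded generator."""
    gen = torch.Generator().manual_seed(seed)
    return torch.rand(*shape, generator=gen)

first = seeded_rand(100, 2, 3)
second = seeded_rand(100, 2, 3)
other = seeded_rand(101, 2, 3)

print(torch.equal(first, second))  # same seed, same values
print(torch.equal(first, other))   # different seed, different values
```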

In [None]:
# using tensor
torch.tensor([[1,2],[3,4]])

tensor([[1, 2],
        [3, 4]])

In [None]:
# other ways

# arange
print("using arange ->", torch.arange(0,10,2))

# using linspace
print("using linspace ->", torch.linspace(0,10,10))

# using eye: generates an identity matrix
print("using eye ->", torch.eye(5))

# using full
print("using full ->", torch.full((3, 3), 5))

using arange -> tensor([0, 2, 4, 6, 8])
using linspace -> tensor([ 0.0000,  1.1111,  2.2222,  3.3333,  4.4444,  5.5556,  6.6667,  7.7778,
         8.8889, 10.0000])
using eye -> tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])
using full -> tensor([[5, 5, 5],
        [5, 5, 5],
        [5, 5, 5]])


## **Tensor Shapes**

In [None]:
x = torch.tensor([[1,2],[3,4]])
x

tensor([[1, 2],
        [3, 4]])

In [None]:
x.shape

torch.Size([2, 2])

In [None]:
# to create another tensor of the same shape
torch.empty_like(x)

tensor([[       450971566080, 8319683848551211643],
        [3180222411935070754, 4189017755886035488]])

In [None]:
torch.zeros_like(x)

tensor([[0, 0],
        [0, 0]])

In [None]:
torch.ones_like(x)

tensor([[1, 1],
        [1, 1]])

In [None]:
torch.rand_like(x)  # raises an error: x holds integers (int64), but rand_like samples floating-point values

RuntimeError: "check_uniform_bounds" not implemented for 'Long'

## **Torch Data Types**

In [None]:
# find data type
x.dtype

torch.int64

In [None]:
# assign data type
torch.tensor([1.3, 4.6, 2.5], dtype=torch.int32)

tensor([1, 4, 2], dtype=torch.int32)

In [None]:
torch.tensor([1, 4, 2], dtype=torch.float32)

tensor([1., 4., 2.])

In [None]:
x.to(torch.float32)

tensor([[1., 2.],
        [3., 4.]])

In [None]:
torch.rand_like(x, dtype=torch.float32) # Specify the dtype to resolve this error

tensor([[0.1633, 0.8267],
        [0.3377, 0.5696]])

----

| **Data Type**             | **Dtype**         | **Description**                                                                                                                                                                |
|---------------------------|-------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **32-bit Floating Point** | `torch.float32`   | Standard floating-point type used for most deep learning tasks. Provides a balance between precision and memory usage.                                                         |
| **64-bit Floating Point** | `torch.float64`   | Double-precision floating point. Useful for high-precision numerical tasks but uses more memory.                                                                               |
| **16-bit Floating Point** | `torch.float16`   | Half-precision floating point. Commonly used in mixed-precision training to reduce memory and computational overhead on modern GPUs.                                            |
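| **BFloat16**              | `torch.bfloat16`  | Brain floating-point format. Keeps the exponent range of `float32` but has fewer mantissa bits than `float16`, so it is less precise yet less prone to overflow. Used in mixed-precision training, especially on TPUs. |
| **8-bit Floating Point**  | `torch.float8_e4m3fn`, `torch.float8_e5m2` | Ultra-low-precision floating point, exposed as format-specific dtypes rather than a single `torch.float8`. Used for experimental applications and extreme memory-constrained environments (less common). |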
| **8-bit Integer**         | `torch.int8`      | 8-bit signed integer. Used for quantized models to save memory and computation in inference.                                                                                   |
| **16-bit Integer**        | `torch.int16`     | 16-bit signed integer. Useful for special numerical tasks requiring intermediate precision.                                                                                    |
| **32-bit Integer**        | `torch.int32`     | Standard signed integer type. Commonly used for indexing and general-purpose numerical tasks.                                                                                  |
| **64-bit Integer**        | `torch.int64`     | Long integer type. Often used for large indexing arrays or for tasks involving large numbers.                                                                                  |
| **8-bit Unsigned Integer**| `torch.uint8`     | 8-bit unsigned integer. Commonly used for image data (e.g., pixel values between 0 and 255).                                                                                    |
| **Boolean**               | `torch.bool`      | Boolean type, stores `True` or `False` values. Often used for masks in logical operations.                                                                                      |
| **Complex 64**            | `torch.complex64` | Complex number type with 32-bit real and 32-bit imaginary parts. Used for scientific and signal processing tasks.                                                               |
| **Complex 128**           | `torch.complex128`| Complex number type with 64-bit real and 64-bit imaginary parts. Offers higher precision but uses more memory.                                                                 |
| **Quantized Integer**     | `torch.qint8`     | Quantized signed 8-bit integer. Used in quantized models for efficient inference.                                                                                              |
| **Quantized Unsigned Integer** | `torch.quint8` | Quantized unsigned 8-bit integer. Often used for quantized tensors in image-related tasks.                                                                                     |
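
The trade-offs in the table can be inspected at runtime. The sketch below uses `torch.finfo`, `torch.iinfo`, and `Tensor.element_size` to print the range, precision, and per-element memory of a few dtypes; the selection of dtypes is arbitrary.

```python
import torch

# Floating-point types: largest value and machine epsilon (precision)
for dtype in (torch.float16, torch.bfloat16, torch.float32, torch.float64):
    info = torch.finfo(dtype)
    size = torch.empty(0, dtype=dtype).element_size()
    print(f"{dtype}: max={info.max:.3e}, eps={info.eps:.3e}, bytes={size}")

# Integer types: smallest and largest value
for dtype in (torch.int8, torch.uint8, torch.int32, torch.int64):
    info = torch.iinfo(dtype)
    size = torch.empty(0, dtype=dtype).element_size()
    print(f"{dtype}: min={info.min}, max={info.max}, bytes={size}")
```

A larger `eps` means coarser precision, and a larger `max` means a wider range, which is how `bfloat16` and `float16` differ despite both using two bytes per element.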

---

## **Mathematical Operations**

### **1. Scalar operation**

In [None]:
x = torch.rand(2,2)
x

tensor([[0.3427, 0.2074],
        [0.0454, 0.4331]])

In [None]:
# addition
print("addition")
print(x + 2)

# subtraction
print("subtraction")
print(x - 2)

# multiplication
print("multiplication")
print(x * 3)

# division
print("division")
print(x / 3)

# int division
print("int division")
print((x * 100)//3)

# mod
print("mod")
print(((x * 100)//3)%2)

# power
print("power")
print(x**2)

addition
tensor([[2.3427, 2.2074],
        [2.0454, 2.4331]])
subtraction
tensor([[-1.6573, -1.7926],
        [-1.9546, -1.5669]])
multiplication
tensor([[1.0281, 0.6221],
        [0.1362, 1.2994]])
division
tensor([[0.1142, 0.0691],
        [0.0151, 0.1444]])
int division
tensor([[11.,  6.],
        [ 1., 14.]])
mod
tensor([[1., 0.],
        [1., 0.]])
power
tensor([[0.1174, 0.0430],
        [0.0021, 0.1876]])
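
A scalar operation is the simplest case of broadcasting, where the smaller operand is virtually expanded to match the larger one. As a sketch with made-up values, the same rule lets a row or a column be combined with a whole matrix:

```python
import torch

x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])       # shape (2, 3)
row = torch.tensor([10.0, 20.0, 30.0])    # shape (3,)
col = torch.tensor([[100.0], [200.0]])    # shape (2, 1)

print(x + 2)     # the scalar is applied to every element
print(x + row)   # the row is added to each row of x
print(x + col)   # the column is added to each column of x
```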


### **2. Element-wise operation**

In [None]:
a = torch.rand(2,3)
b = torch.rand(2,3)

print(a)
print(b)

tensor([[0.5730, 0.7059, 0.3330],
        [0.8424, 0.0514, 0.7820]])
tensor([[0.7460, 0.2880, 0.1249],
        [0.7406, 0.0641, 0.5905]])


In [None]:
# add
print(a + b)

# sub
print(a - b)

# multiply
print(a * b)

# division
print(a / b)

# power
print(a ** b)

# mod
print(a % b)

tensor([[1.3190, 0.9939, 0.4579],
        [1.5831, 0.1155, 1.3725]])
tensor([[-0.1730,  0.4179,  0.2081],
        [ 0.1018, -0.0127,  0.1915]])
tensor([[0.4275, 0.2033, 0.0416],
        [0.6239, 0.0033, 0.4618]])
tensor([[0.7681, 2.4514, 2.6660],
        [1.1374, 0.8024, 1.3243]])
tensor([[0.6601, 0.9046, 0.8717],
        [0.8807, 0.8269, 0.8649]])
tensor([[0.5730, 0.1300, 0.0832],
        [0.1018, 0.0514, 0.1915]])


In [None]:
c = torch.tensor([1, -2, 3, -4])

In [None]:
# abs
torch.abs(c)

tensor([1, 2, 3, 4])

In [None]:
# negative
torch.neg(c)

tensor([-1,  2, -3,  4])

In [None]:
d = torch.tensor([1.9, 2.3, 3.5, 4.4])

In [None]:
# round
torch.round(d)

tensor([2., 2., 4., 4.])

In [None]:
# ceil
torch.ceil(d)

tensor([2., 3., 4., 5.])

In [None]:
# floor
torch.floor(d)

tensor([1., 2., 3., 4.])

In [None]:
# clamp: values below 2 become 2, and values above 3 become 3
torch.clamp(d, min=2, max=3)

tensor([2.0000, 2.3000, 3.0000, 3.0000])

### **3. Reduction operation**

In [None]:
e = torch.randint(size=(2,3), low=0, high=10, dtype=torch.float32)
e

tensor([[9., 7., 1.],
        [7., 6., 3.]])

In [None]:
# sum
torch.sum(e)

tensor(33.)

In [None]:
# sum along columns
torch.sum(e, dim=0)

tensor([16., 13.,  4.])

In [None]:
# sum along rows
torch.sum(e, dim=1)

tensor([17., 16.])

In [None]:
# mean of all elements
print(torch.mean(e))
# mean along columns
print(torch.mean(e, dim=0))

tensor(5.5000)
tensor([8.0000, 6.5000, 2.0000])

In [None]:
# median
torch.median(e)

tensor(6.)

In [None]:
# max and min
print(torch.max(e))
print(torch.min(e))

tensor(9.)
tensor(1.)

In [None]:
# product
torch.prod(e)

tensor(7938.)

In [None]:
# standard deviation
torch.std(e)

tensor(2.9496)

In [None]:
# variance
torch.var(e)

tensor(8.7000)

In [None]:
# argmax
torch.argmax(e)

tensor(0)

In [None]:
# argmin
torch.argmin(e)

tensor(2)
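
The `dim` argument names the dimension that is collapsed by the reduction. The following sketch reuses the values of `e` shown above (hard-coded here so that the block runs on its own) and also shows `keepdim` and how to turn the flat `argmin` index into a row and a column.

```python
import torch

e = torch.tensor([[9., 7., 1.],
                  [7., 6., 3.]])

# dim=0 collapses the rows, leaving one value per column
print(torch.sum(e, dim=0))                 # shape (3,)
# keepdim=True keeps the reduced dimension with size 1
print(torch.sum(e, dim=1, keepdim=True))   # shape (2, 1)

# Without dim, argmin indexes into the flattened tensor
flat_index = torch.argmin(e).item()
row, col = divmod(flat_index, e.shape[1])
print(row, col)

# With dim, argmax returns one index per row
print(torch.argmax(e, dim=1))
```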

### **4. Matrix Operations**

In [None]:
f = torch.randint(size=(2,3), low=0, high=10)
g = torch.randint(size=(3,2), low=0, high=10)

print(f)
print(g)

tensor([[1, 8, 2],
        [0, 2, 8]])
tensor([[9, 8],
        [1, 7],
        [6, 8]])


In [None]:
# matrix multiplication
torch.matmul(f, g)

tensor([[29, 80],
        [50, 78]])

In [None]:
vector1 = torch.tensor([1, 2])
vector2 = torch.tensor([3, 4])

# dot product
torch.dot(vector1, vector2)

tensor(11)

In [None]:
# transpose
torch.transpose(f, 0, 1)

tensor([[1, 0],
        [8, 2],
        [2, 8]])

In [None]:
# determinant (Works only with square matrix)
h = torch.randint(size=(3,3), low=0, high=10, dtype=torch.float32)
torch.det(h)

tensor(45.)

In [None]:
h

tensor([[0., 6., 3.],
        [0., 7., 6.],
        [3., 8., 5.]])

In [None]:
# inverse (Works only with square matrix)
torch.inverse(h)

tensor([[-0.2889, -0.1333,  0.3333],
        [ 0.4000, -0.2000,  0.0000],
        [-0.4667,  0.4000,  0.0000]])
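
As a quick check on the results above, the sketch below rebuilds `h` from the printed values, confirms that `h` times its inverse is close to the identity, and solves a linear system with `torch.linalg.solve`, which is generally preferable to forming the inverse explicitly. The right-hand side `rhs` is an arbitrary example vector.

```python
import torch

h = torch.tensor([[0., 6., 3.],
                  [0., 7., 6.],
                  [3., 8., 5.]])

h_inv = torch.linalg.inv(h)

# A matrix times its inverse should give the identity (up to rounding error)
print(torch.allclose(h @ h_inv, torch.eye(3), atol=1e-4))

# Solve h @ solution = rhs without computing the inverse
rhs = torch.tensor([[3.], [6.], [5.]])
solution = torch.linalg.solve(h, rhs)
print(torch.allclose(h @ solution, rhs, atol=1e-4))
```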

### **5. Comparison operations**

In [None]:
i = torch.randint(size=(2,3), low=0, high=10)
j = torch.randint(size=(2,3), low=0, high=10)

print(i)
print(j)

tensor([[4, 7, 3],
        [2, 6, 7]])
tensor([[9, 6, 8],
        [5, 8, 2]])


In [None]:
# greater than
i > j
# less than
i < j
# greater than or equal to
i >= j
# less than or equal to
i <= j
# equal to
i == j
# not equal to (the notebook displays only this last expression)
i != j

tensor([[True, True, True],
        [True, True, True]])
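
Comparison results are boolean tensors, which are commonly used as masks. The sketch below hard-codes the `i` and `j` values printed above and shows three typical uses.

```python
import torch

i = torch.tensor([[4, 7, 3],
                  [2, 6, 7]])
j = torch.tensor([[9, 6, 8],
                  [5, 8, 2]])

mask = i > j                      # boolean tensor, same shape as i and j

print(i[mask])                    # keep only the elements of i where the mask is True
print(torch.where(mask, i, j))    # element-wise maximum built from the mask
print(mask.sum())                 # True counts as 1, so this counts the matches
```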

### **6. Special Functions**

In [None]:
k = torch.randint(size=(2,3), low=0, high=10, dtype=torch.float32)
k

tensor([[0., 0., 3.],
        [4., 4., 2.]])

In [None]:
# log
torch.log(k)

tensor([[  -inf,   -inf, 1.0986],
        [1.3863, 1.3863, 0.6931]])

In [None]:
# exp
torch.exp(k)

tensor([[ 1.0000,  1.0000, 20.0855],
        [54.5981, 54.5981,  7.3891]])

In [None]:
# sqrt
torch.sqrt(k)

tensor([[0.0000, 0.0000, 1.7321],
        [2.0000, 2.0000, 1.4142]])

In [None]:
# sigmoid
torch.sigmoid(k)

tensor([[0.5000, 0.5000, 0.9526],
        [0.9820, 0.9820, 0.8808]])

In [None]:
# softmax: requires a float tensor and an explicit dimension
torch.softmax(k, dim=0)

tensor([[0.0180, 0.0180, 0.7311],
        [0.9820, 0.9820, 0.2689]])

In [13]:
# relu
torch.relu(k)

tensor([[0., 0., 3.],
        [4., 4., 2.]])
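
To show what `softmax` computes, here is a hand-written version checked against `torch.softmax`. It is only a sketch: the function name `manual_softmax` is made up, and subtracting the maximum is a common numerical-stability trick rather than something taken from the cells above.

```python
import torch

def manual_softmax(t, dim):
    """Softmax along `dim`: exponentiate, then normalize so the values sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow in exp
    shifted = t - t.max(dim=dim, keepdim=True).values
    exps = torch.exp(shifted)
    return exps / exps.sum(dim=dim, keepdim=True)

k = torch.tensor([[0., 0., 3.],
                  [4., 4., 2.]])

result = manual_softmax(k, dim=0)
print(result)
print(torch.allclose(result, torch.softmax(k, dim=0)))
print(result.sum(dim=0))   # each column sums to 1
```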

## **Inplace Operations**

In [14]:
m = torch.rand(2,3)
n = torch.rand(2,3)

print(m)
print(n)

tensor([[0.7536, 0.3707, 0.9506],
        [0.8107, 0.3825, 0.3177]])
tensor([[0.5660, 0.4604, 0.7877],
        [0.9849, 0.1071, 0.9158]])


In [17]:
m + n  # returns a new tensor, which needs extra memory and can be a problem for very large tensors (this cell was run after the in-place add below, so m already holds the sum here)

tensor([[1.8856, 1.2914, 2.5260],
        [2.7804, 0.5966, 2.1492]])

In [15]:
m.add_(n)  # in-place add: the trailing underscore means m itself is modified

tensor([[1.3196, 0.8310, 1.7383],
        [1.7955, 0.4895, 1.2334]])

In [18]:
m # sum is added to m

tensor([[1.3196, 0.8310, 1.7383],
        [1.7955, 0.4895, 1.2334]])

In [20]:
n

tensor([[0.5660, 0.4604, 0.7877],
        [0.9849, 0.1071, 0.9158]])

In [23]:
m.relu_()   # inplace relu operation

tensor([[1.3196, 0.8310, 1.7383],
        [1.7955, 0.4895, 1.2334]])

In [24]:
m

tensor([[1.3196, 0.8310, 1.7383],
        [1.7955, 0.4895, 1.2334]])
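
One way to confirm that an in-place operation reuses memory is to compare `data_ptr()`, the address of a tensor's first element, before and after the operation. This is a small sketch with arbitrary values.

```python
import torch

m = torch.ones(2, 3)
n = torch.full((2, 3), 2.0)

address_before = m.data_ptr()

m.add_(n)                                  # in-place: writes into m's existing memory
print(m.data_ptr() == address_before)      # the same storage is reused

m = m + n                                  # out-of-place: allocates a new tensor
print(m.data_ptr() == address_before)      # m now refers to different storage
```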

## **Copying a Tensor**

In [25]:
a = torch.rand(2,3)
a

tensor([[0.2345, 0.5242, 0.1845],
        [0.6297, 0.4038, 0.9418]])

In [26]:
b = a

In [27]:
b

tensor([[0.2345, 0.5242, 0.1845],
        [0.6297, 0.4038, 0.9418]])

In [29]:
a[0][0] = 0
a

tensor([[0.0000, 0.5242, 0.1845],
        [0.6297, 0.4038, 0.9418]])

In [30]:
b

tensor([[0.0000, 0.5242, 0.1845],
        [0.6297, 0.4038, 0.9418]])

In [31]:
id(a) == id(b)

True

In [33]:
b = a.clone()
b

tensor([[0.0000, 0.5242, 0.1845],
        [0.6297, 0.4038, 0.9418]])

In [34]:
a[0][0] = 10

In [36]:
print(a)
print(b)

tensor([[10.0000,  0.5242,  0.1845],
        [ 0.6297,  0.4038,  0.9418]])
tensor([[0.0000, 0.5242, 0.1845],
        [0.6297, 0.4038, 0.9418]])


In [37]:
id(a) == id(b)

False
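
Besides plain assignment and `clone()`, there is a third case worth knowing: a view. A view is a new tensor object that still shares memory with the original. The sketch below contrasts the three, using arbitrary values.

```python
import torch

a = torch.zeros(2, 3)

alias = a            # same Python object
view = a.view(6)     # different object, same underlying memory
copy = a.clone()     # different object, separate memory

a[0, 0] = 10

print(alias[0, 0].item())   # 10.0: alias is a
print(view[0].item())       # 10.0: the view sees the change
print(copy[0, 0].item())    # 0.0: the clone is unaffected

print(alias is a, view is a, view.data_ptr() == a.data_ptr())
```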

## **Tensor operations on GPU**

In [5]:
torch.cuda.is_available()

True

In [12]:
# use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [13]:
device

device(type='cuda')

In [14]:
torch.rand((2,3), device = device)

tensor([[0.8174, 0.3920, 0.9658],
        [0.0508, 0.4326, 0.2459]], device='cuda:0')

In [22]:
a = torch.rand((2,3))  # created on the CPU
a

tensor([[0.0585, 0.8124, 0.8723],
        [0.5936, 0.7118, 0.0533]])

In [23]:
b = a.to(device)  # moved to GPU
b

tensor([[0.0585, 0.8124, 0.8723],
        [0.5936, 0.7118, 0.0533]], device='cuda:0')

In [27]:
import time

# Define the size of the matrices
size = 10000  # Large size for performance comparison

# Create random matrices on CPU
matrix_cpu1 = torch.randn(size, size)
matrix_cpu2 = torch.randn(size, size)

# Measure time on CPU
start_time = time.time()
result_cpu = torch.matmul(matrix_cpu1, matrix_cpu2)  # Matrix multiplication on CPU
cpu_time = time.time() - start_time

print(f"Time on CPU: {cpu_time:.4f} seconds")

# Move matrices to GPU
matrix_gpu1 = matrix_cpu1.to('cuda')
matrix_gpu2 = matrix_cpu2.to('cuda')

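# Measure time on GPU
# (CUDA kernels launch asynchronously, so synchronize before reading the clock;
# the first GPU call also includes one-time initialization overhead)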
start_time = time.time()
result_gpu = torch.matmul(matrix_gpu1, matrix_gpu2)  # Matrix multiplication on GPU
torch.cuda.synchronize()  # Ensure all GPU operations are complete
gpu_time = time.time() - start_time
print(f"Time on GPU: {gpu_time:.4f} seconds")

# Compare Results
print("\n Speedup (CPU Time / GPU Time)", cpu_time / gpu_time)

Time on CPU: 25.3554 seconds
Time on GPU: 0.5354 seconds

 Speedup (CPU Time / GPU Time) 47.35684399708595
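
Timings like the ones above depend heavily on the hardware and on one-time start-up costs. As a sketch of a slightly more careful measurement, the hypothetical helper `time_matmul` below does a warm-up run, synchronizes around the timed region when running on CUDA, and averages several repeats. It runs on the CPU alone when no GPU is available, and the matrix size is kept small so that it finishes quickly.

```python
import time
import torch

def time_matmul(device, size=512, repeats=3):
    """Average wall-clock seconds for one (size x size) matmul on `device`."""
    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)

    torch.matmul(a, b)                      # warm-up run, not timed
    if device.type == "cuda":
        torch.cuda.synchronize()            # wait for pending GPU work

    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(a, b)
    if device.type == "cuda":
        torch.cuda.synchronize()            # make sure the GPU has finished
    return (time.perf_counter() - start) / repeats

cpu_seconds = time_matmul(torch.device("cpu"))
print(f"CPU: {cpu_seconds:.6f} s per matmul")

if torch.cuda.is_available():
    gpu_seconds = time_matmul(torch.device("cuda"))
    print(f"GPU: {gpu_seconds:.6f} s per matmul")
```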


## **Reshaping Tensors**

In [29]:
a = torch.ones(4,4)
a

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [31]:
# reshape
a.reshape(2,2,2,2)

tensor([[[[1., 1.],
          [1., 1.]],

         [[1., 1.],
          [1., 1.]]],


        [[[1., 1.],
          [1., 1.]],

         [[1., 1.],
          [1., 1.]]]])

In [33]:
# flatten
a.flatten()

tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [34]:
b = torch.rand(2,3,4)
b

tensor([[[0.1154, 0.5564, 0.2507, 0.2598],
         [0.0209, 0.1208, 0.4840, 0.2909],
         [0.5609, 0.4294, 0.9728, 0.8766]],

        [[0.9416, 0.3617, 0.7721, 0.4217],
         [0.5923, 0.6186, 0.3640, 0.6853],
         [0.1000, 0.2020, 0.2899, 0.5878]]])

In [36]:
# permute
b.permute(2,0,1) # new order of the original dimension indices: dim 2 first, then dim 0, then dim 1
b.permute(2,0,1).shape

torch.Size([4, 2, 3])

In [38]:
# unsqueeze: adds a dimension of size 1 at the specified index
# e.g. adding a batch dimension to a single image
c = torch.rand(226,226,3)
c.unsqueeze(0).shape

torch.Size([1, 226, 226, 3])

In [41]:
# squeeze: removes the dimension at the specified index (only if its size is 1)
c = torch.rand(1,20)
c.squeeze(0).shape

torch.Size([20])
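
A common use of these operations is converting an image from the channels-last layout used in the examples above to the channels-first layout that PyTorch convolution layers expect, and then adding a batch dimension. The sketch below uses a random tensor as a stand-in for an image.

```python
import torch

image = torch.rand(226, 226, 3)          # height, width, channels

chw = image.permute(2, 0, 1)             # channels, height, width
batch = chw.unsqueeze(0)                 # add a batch dimension of size 1
print(batch.shape)

# permute returns a view with rearranged strides, so the result here is not contiguous
print(chw.is_contiguous())

# reshape works either way; -1 asks PyTorch to infer that dimension
flat = chw.reshape(3, -1)
print(flat.shape)

# squeeze(0) removes the batch dimension again
print(batch.squeeze(0).shape)
```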

## **NumPy and PyTorch**

In [45]:
import numpy as np

In [46]:
a = torch.tensor([1, 2, 3])
a

tensor([1, 2, 3])

In [52]:
b = a.numpy()  # to convert tensor to numpy

In [53]:
print(b)
type(b)

[1 2 3]


numpy.ndarray

In [54]:
c = np.array([4, 5, 6])
c

array([4, 5, 6])

In [55]:
d = torch.from_numpy(c) # to convert numpy to tensor

In [56]:
print(d)
type(d)

tensor([4, 5, 6])


torch.Tensor
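
One detail the conversions above do not show: on the CPU, both `Tensor.numpy()` and `torch.from_numpy()` share memory with their source, so a change on one side is visible on the other. `torch.tensor()` makes an independent copy instead. A minimal sketch with arbitrary values:

```python
import numpy as np
import torch

a = torch.tensor([1, 2, 3])
b = a.numpy()              # shares memory with a
a[0] = 100
print(b)                   # the NumPy array sees the change

c = np.array([4, 5, 6])
d = torch.from_numpy(c)    # shares memory with c
e = torch.tensor(c)        # independent copy of c
c[0] = 400
print(d)                   # follows c
print(e)                   # unchanged
```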