In [2]:
import torch
print(torch.__version__)

2.5.1+cu121


In [3]:
if torch.cuda.is_available():
  print("GPU is Available")
  print(f'Using GPU: {torch.cuda.get_device_name(0)}')
else:
  print("GPU not available. Using CPU")

GPU is Available
Using GPU: Tesla T4


# 1. Creating Tensors

In [4]:
# Using Empty
a = torch.empty(2,4)

In [5]:
# Check Type
type(a)

torch.Tensor

In [6]:
# Using Zeros
torch.zeros(2,2)

tensor([[0., 0.],
        [0., 0.]])

In [7]:
# Torch Ones
torch.ones(2,3)

tensor([[1., 1., 1.],
        [1., 1., 1.]])

In [8]:
# Use of Rand
torch.rand(2,3)

tensor([[0.4942, 0.2326, 0.2836],
        [0.4720, 0.5913, 0.3662]])

In [10]:
# Use of seed
torch.manual_seed(42)
torch.rand(2,3)

tensor([[0.8823, 0.9150, 0.3829],
        [0.9593, 0.3904, 0.6009]])

In [70]:
torch.manual_seed(100)
torch.rand(2,3)

tensor([[0.1117, 0.8158, 0.2626],
        [0.4839, 0.6765, 0.7539]])

In [71]:
# using tensor
torch.tensor([[1,2,3],[4,5,6]])

tensor([[1, 2, 3],
        [4, 5, 6]])

In [72]:
# other ways

# arange
print("using arange ->", torch.arange(0,10,2))


using arange -> tensor([0, 2, 4, 6, 8])


**`torch.linspace`** is a PyTorch function that generates a sequence of evenly spaced numbers over a specified range. It's often used for tasks like sampling points in a range or generating data for plotting

In [73]:
# using linspace
print("using linspace ->", torch.linspace(0,10,10))


using linspace -> tensor([ 0.0000,  1.1111,  2.2222,  3.3333,  4.4444,  5.5556,  6.6667,  7.7778,
         8.8889, 10.0000])


In [74]:
# using eye ( Creates Identity Matrix)
print("using eye ->", torch.eye(5))

using eye -> tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])


In [75]:
# using full
print("using full ->", torch.full((3, 3), 5))

using full -> tensor([[5, 5, 5],
        [5, 5, 5],
        [5, 5, 5]])


# 2. Tensor Shapes

In [81]:
x = torch.tensor([[1,2,3],[4,5,6]], dtype=torch.float32)
x

tensor([[1., 2., 3.],
        [4., 5., 6.]])

In [77]:
x.shape

torch.Size([2, 3])

**`torch.empty_like`** function creates a new tensor with the same shape, data type, and device as an existing tensor, but without initializing its values (i.e., it contains uninitialized or "garbage" data).

In [78]:
torch.empty_like(x)

tensor([[3616445622929465956, 6066686442253397300, 4121702075501001008],
        [6499315714907844910, 7309453675965983778, 8315168162784306286]])

In [79]:
torch.zeros_like(x)

tensor([[0, 0, 0],
        [0, 0, 0]])

In [80]:
torch.ones_like(x)

tensor([[1, 1, 1],
        [1, 1, 1]])

In [82]:
# To generates rand_like original vector should hace float values
torch.rand_like(x, dtype=torch.float32)

tensor([[0.2627, 0.0428, 0.2080],
        [0.1180, 0.1217, 0.7356]])

# 3. Tensor Data Types

In [83]:
# find data type
x.dtype

torch.float32

In [84]:
# assign data type
torch.tensor([1.0,2.0,3.0], dtype=torch.int32)

tensor([1, 2, 3], dtype=torch.int32)

In [85]:
torch.tensor([1,2,3], dtype=torch.float64)

tensor([1., 2., 3.], dtype=torch.float64)

In [86]:
# using to()
x.to(torch.float32)

tensor([[1., 2., 3.],
        [4., 5., 6.]])

| **Data Type**             | **Dtype**         | **Description**                                                                                                                                                                |
|---------------------------|-------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **32-bit Floating Point** | `torch.float32`   | Standard floating-point type used for most deep learning tasks. Provides a balance between precision and memory usage.                                                         |
| **64-bit Floating Point** | `torch.float64`   | Double-precision floating point. Useful for high-precision numerical tasks but uses more memory.                                                                               |
| **16-bit Floating Point** | `torch.float16`   | Half-precision floating point. Commonly used in mixed-precision training to reduce memory and computational overhead on modern GPUs.                                            |
| **BFloat16**              | `torch.bfloat16`  | Brain floating-point format with reduced precision compared to `float16`. Used in mixed-precision training, especially on TPUs.                                                |
| **8-bit Floating Point**  | `torch.float8`    | Ultra-low-precision floating point. Used for experimental applications and extreme memory-constrained environments (less common).                                               |
| **8-bit Integer**         | `torch.int8`      | 8-bit signed integer. Used for quantized models to save memory and computation in inference.                                                                                   |
| **16-bit Integer**        | `torch.int16`     | 16-bit signed integer. Useful for special numerical tasks requiring intermediate precision.                                                                                    |
| **32-bit Integer**        | `torch.int32`     | Standard signed integer type. Commonly used for indexing and general-purpose numerical tasks.                                                                                  |
| **64-bit Integer**        | `torch.int64`     | Long integer type. Often used for large indexing arrays or for tasks involving large numbers.                                                                                  |
| **8-bit Unsigned Integer**| `torch.uint8`     | 8-bit unsigned integer. Commonly used for image data (e.g., pixel values between 0 and 255).                                                                                    |
| **Boolean**               | `torch.bool`      | Boolean type, stores `True` or `False` values. Often used for masks in logical operations.                                                                                      |
| **Complex 64**            | `torch.complex64` | Complex number type with 32-bit real and 32-bit imaginary parts. Used for scientific and signal processing tasks.                                                               |
| **Complex 128**           | `torch.complex128`| Complex number type with 64-bit real and 64-bit imaginary parts. Offers higher precision but uses more memory.                                                                 |
| **Quantized Integer**     | `torch.qint8`     | Quantized signed 8-bit integer. Used in quantized models for efficient inference.                                                                                              |
| **Quantized Unsigned Integer** | `torch.quint8` | Quantized unsigned 8-bit integer. Often used for quantized tensors in image-related tasks.                                                                                     |


# Mathematical Operations

### 1. Scaler Operations

In [87]:
x = torch.rand(2,2)
x

tensor([[0.7118, 0.7876],
        [0.4183, 0.9014]])

In [88]:
# addition
x + 2
# substraction
x - 2
# multiplication
x * 3
# division
x / 3
# int division
(x * 100)//3
# mod
((x * 100)//3)%2
# power
x**2

tensor([[0.5066, 0.6203],
        [0.1750, 0.8125]])

### 2. Element wise operation

In [89]:
a = torch.rand(2,3)
b = torch.rand(2,3)

print(a)
print(b)

tensor([[0.9969, 0.7565, 0.2239],
        [0.3023, 0.1784, 0.8238]])
tensor([[0.5557, 0.9770, 0.4440],
        [0.9478, 0.7445, 0.4892]])


In [90]:
# add
a + b
# sub
a - b
# multiply
a * b
# division
a / b
# power
a ** b
# mod
a % b

tensor([[0.4411, 0.7565, 0.2239],
        [0.3023, 0.1784, 0.3346]])

In [91]:
c = torch.tensor([1, -2, 3, -4])

In [92]:
# abs
torch.abs(c)

tensor([1, 2, 3, 4])

In [93]:
# negative
torch.neg(c)

tensor([-1,  2, -3,  4])

In [94]:
d = torch.tensor([1.9, 2.3, 3.7, 4.4])

In [95]:
# round
torch.round(d)

tensor([2., 2., 4., 4.])

In [96]:
# ceil
torch.ceil(d)

tensor([2., 3., 4., 5.])

In [97]:
# floor
torch.floor(d)

tensor([1., 2., 3., 4.])

**`torch.clamp`** is a PyTorch function that restricts (clamps) the values of a tensor to a specified range. Any value below the lower bound is set to the lower bound, and any value above the upper bound is set to the upper bound.

In [98]:
# clamp
torch.clamp(d, min=2, max=3)

tensor([2.0000, 2.3000, 3.0000, 3.0000])

### 3. Reduction Operation

In [99]:
e = torch.randint(size=(2,3), low=0, high=10, dtype=torch.float32)
e

tensor([[8., 0., 7.],
        [0., 0., 9.]])

In [100]:
# sum all the values
torch.sum(e)

tensor(24.)

In [101]:
# sum along columns
torch.sum(e, dim=0)

tensor([ 8.,  0., 16.])

In [102]:
torch.sum(e, dim=1)

tensor([15.,  9.])

In [103]:
# mean
torch.mean(e)
# mean along col
torch.mean(e, dim=0)

tensor([4., 0., 8.])

In [104]:
# median
torch.median(e)

tensor(0.)

In [105]:
# max and min
torch.max(e)
torch.min(e)

tensor(0.)

In [106]:
# product
torch.prod(e)

tensor(0.)

In [107]:
# standard deviation
torch.std(e)

tensor(4.4272)

In [108]:
# variance
torch.var(e)

tensor(19.6000)

In [109]:
# argmax  gives the index of largest value in matrix
torch.argmax(e)

tensor(5)

In [None]:
# argmin
torch.argmin(e)

# 3. Matrix Operations

In [110]:
f = torch.randint(size=(2,3), low=0, high=10)
g = torch.randint(size=(3,2), low=0, high=10)

print(f)
print(g)

tensor([[5, 7, 3],
        [9, 4, 0]])
tensor([[5, 7],
        [5, 9],
        [9, 7]])


In [111]:
# matrix multiplcation
torch.matmul(f, g)

tensor([[ 87, 119],
        [ 65,  99]])

In [112]:
vector1 = torch.tensor([1, 2])
vector2 = torch.tensor([3, 4])

# dot product
torch.dot(vector1, vector2)

tensor(11)

In [113]:
# transpose
torch.transpose(f, 0, 1)

tensor([[5, 9],
        [7, 4],
        [3, 0]])

In [114]:
h = torch.randint(size=(3,3), low=0, high=10, dtype=torch.float32)
h

tensor([[5., 9., 8.],
        [9., 7., 9.],
        [2., 6., 7.]])

In [115]:
# determinant
torch.det(h)

tensor(-110.)

In [116]:
torch.inverse(h)

tensor([[ 0.0455,  0.1364, -0.2273],
        [ 0.4091, -0.1727, -0.2455],
        [-0.3636,  0.1091,  0.4182]])

# 4. Comparison Operations

In [118]:
i = torch.randint(size=(2,3), low=0, high=10)
j = torch.randint(size=(2,3), low=0, high=10)

print(i)
print(j)

tensor([[7, 8, 3],
        [6, 1, 5]])
tensor([[5, 0, 4],
        [3, 8, 8]])


In [119]:
# greater than
i > j
# less than
i < j
# equal to
i == j
# not equal to
i != j

tensor([[True, True, True],
        [True, True, True]])

# 5. Special Functions

In [120]:
k = torch.randint(size=(2,3), low=0, high=10, dtype=torch.float32)
k

tensor([[3., 3., 5.],
        [0., 6., 4.]])

In [121]:
# log
torch.log(k)

tensor([[1.0986, 1.0986, 1.6094],
        [  -inf, 1.7918, 1.3863]])

In [122]:
# exp
torch.exp(k)

tensor([[ 20.0855,  20.0855, 148.4132],
        [  1.0000, 403.4288,  54.5981]])

In [123]:
# sqrt
torch.sqrt(k)

tensor([[1.7321, 1.7321, 2.2361],
        [0.0000, 2.4495, 2.0000]])

In [124]:
# sigmoid
torch.sigmoid(k)

tensor([[0.9526, 0.9526, 0.9933],
        [0.5000, 0.9975, 0.9820]])

In [125]:
# softmax
torch.softmax(k, dim=0)

tensor([[0.9526, 0.0474, 0.7311],
        [0.0474, 0.9526, 0.2689]])

In [126]:
# relu
torch.relu(k)

tensor([[3., 3., 5.],
        [0., 6., 4.]])

# 6. Inplace Operations

When we perform element wise opeartions, then the resualtant vector will be new vector and original vectors will remain same.

Sometimes in element wise opearions we want to make changes in orignal tensor, to achieve this we'll use inplace operations

To perform inplace operations just add *underscore* in front of normal operation. Such as
 - `relu` becomes `relu_`
 - `add` becomes `add_`

In [11]:
m = torch.rand(2,3)
n = torch.rand(2,3)

print(m)
print(n)

tensor([[0.2566, 0.7936, 0.9408],
        [0.1332, 0.9346, 0.5936]])
tensor([[0.8694, 0.5677, 0.7411],
        [0.4294, 0.8854, 0.5739]])


In [12]:
m.add_(n)

tensor([[1.1260, 1.3614, 1.6819],
        [0.5626, 1.8200, 1.1675]])

Here you can see that the values of orignal vector `m` are chnaged

In [13]:
m

tensor([[1.1260, 1.3614, 1.6819],
        [0.5626, 1.8200, 1.1675]])

In [14]:
# Relu operation
m.relu_()

tensor([[1.1260, 1.3614, 1.6819],
        [0.5626, 1.8200, 1.1675]])

# 7. Copying Tensor

In [31]:
p = torch.rand(size = (2,3))
p

tensor([[0.7104, 0.9464, 0.7890],
        [0.2814, 0.7886, 0.5895]])

One Simple approach to copy any tensor or matrix is by using assignment (`=`) operator.

But There is one problem in this approach, assignment opeartor doesn't create a copy, it just create a refernce to the matrix. This means changes in original matrix will also be reflected in all the copies, which is sometime very undesirable

In [32]:
ref_copy = p

In [33]:
# ID of ref_copy and orignal matrix p is same
print(id(ref_copy))
print(id(p))

137193278976336
137193278976336


### **`clone`** creates deep copy of orignal matrix

In [34]:
deep_copy = p.clone()

In [35]:
deep_copy[0][0] = 98
deep_copy

tensor([[98.0000,  0.9464,  0.7890],
        [ 0.2814,  0.7886,  0.5895]])

In [36]:
p

tensor([[0.7104, 0.9464, 0.7890],
        [0.2814, 0.7886, 0.5895]])

In [37]:
# Id of deep_copy and orignal tensor p is different
print(id(deep_copy))
print(id(p))

137193278979696
137193278976336


# 8. Tensor Opeartions on GPU

In [38]:
torch.cuda.is_available()

True

In [39]:
device = torch.device('cuda')

In [40]:
# Creating a new tensor on GPU
torch.rand(2,3, device= device)

tensor([[0.6130, 0.0101, 0.3984],
        [0.0403, 0.1563, 0.4825]], device='cuda:0')

There are two ways to perform tensor opeartions on Tensor
 - Make Tensors on GPU using the above method
 - Move the existing CPU Tensors to GPU

In [41]:
# Moving Existiing Tensors to GPU
a_on_GPU = a.to(device)

In [42]:
a_on_GPU

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.]], device='cuda:0')

So, what is benefit of having tensors on GPU?

**Speed**, i would say. To demonstarte this let's take an example

In [46]:
import time

# Define the  size of matrices
size  = 10000

# Create random matrices on CPU
matrix_cpu1 = torch.randn(size,size)
matrix_cpu2 = torch.randn(size,size)

# Measure time on CPU
start_time = time.time()
result_cpu = torch.matmul(matrix_cpu1, matrix_cpu2)
cpu_time = time.time() - start_time

print(f'Time on CPU: {cpu_time:.4f} seconds')

# Move matrices on GPU
matrix_gpu1 = matrix_cpu1.to('cuda')
matrix_gpu2 = matrix_cpu2.to('cuda')

# Measure time on GPU
start_time = time.time()
result_cpu = torch.matmul(matrix_gpu1, matrix_gpu2)
gpu_time = time.time() - start_time

print(f'Time on GPU: {gpu_time:.4f} seconds')

# Compare results
print("\nSpeedup (CPU time / GPU time):", cpu_time / gpu_time)

Time on CPU: 15.6828 seconds
Time on GPU: 0.1241 seconds

Speedup (CPU time / GPU time): 126.41742868893732


Yes, there are differences between GPU tensors and CPU tensors in PyTorch, and whether operations can be performed across them depends on where the tensors are stored.

### Differences Between GPU and CPU Tensors
1. **Device**:
   - CPU tensors reside in the system's main memory.
   - GPU tensors reside in the GPU's memory, which is separate from the system's memory.

2. **Performance**:
   - GPU tensors are optimized for massively parallel operations, making them much faster for large-scale computations like matrix multiplications.
   - CPU tensors are better for small-scale tasks or operations with less parallelism.

3. **Device-Specific Operations**:
   - Some PyTorch operations may behave differently or may only be available on the GPU (e.g., operations using `torch.cuda`).

### Operations Between CPU and GPU Tensors
You **cannot directly perform operations between a CPU tensor and a GPU tensor**. If you try, PyTorch will throw an error like this:
```
RuntimeError: Expected all tensors to be on the same device
```

### Transferring Tensors
To perform operations, both tensors must reside on the same device. You can move tensors between devices using `.to()`, `.cuda()`, or `.cpu()`:

1. **Move CPU tensor to GPU**:
   ```python
   cpu_tensor = torch.tensor([1, 2, 3])
   gpu_tensor = cpu_tensor.to('cuda')  # Or .cuda()
   ```

2. **Move GPU tensor to CPU**:
   ```python
   gpu_tensor = torch.tensor([1, 2, 3], device='cuda')
   cpu_tensor = gpu_tensor.to('cpu')
   ```

3. **Performing Operations**:
   Once the tensors are on the same device, operations will work as expected:
   ```python
   a = torch.tensor([1, 2, 3], device='cuda')
   b = torch.tensor([4, 5, 6], device='cuda')
   result = a + b  # Works because both are on GPU
   ```

4. **Automatic Device Handling**:
   To simplify device management, you can use `.to(device)` dynamically:
   ```python
   device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
   tensor = torch.tensor([1, 2, 3]).to(device)
   ```


# 9. Reshaping Tensors

The `torch.reshape` function in PyTorch reshapes a tensor without changing its data. However, certain conditions must be met for it to work properly:


### **Necessary Conditions for `torch.reshape`:**
1. **Consistency of Total Elements**:
   - The total number of elements in the input tensor must match the total number of elements in the desired shape. This ensures no data is lost or created during reshaping.
   - Formula:
     \[
     \text{Product of dimensions in input shape} = \text{Product of dimensions in desired shape}
     \]

   Example:
   ```python
   a = torch.tensor([[1, 2], [3, 4], [5, 6]])  # Shape: (3, 2)
   b = a.reshape(2, 3)  # Valid, as 3*2 = 2*3
   ```

   Invalid Example:
   ```python
   a = torch.tensor([[1, 2], [3, 4], [5, 6]])  # Shape: (3, 2)
   b = a.reshape(2, 4)  # Error, as 3*2 ≠ 2*4
   ```

2. **Use of `-1`**:
   - You can use `-1` in one dimension of the target shape to let PyTorch automatically infer that dimension based on the total number of elements.
   - Only **one dimension** can be `-1`.

   Example:
   ```python
   a = torch.tensor([[1, 2], [3, 4], [5, 6]])  # Shape: (3, 2)
   b = a.reshape(-1, 3)  # Shape becomes (2, 3)
   ```

   Invalid Example:
   ```python
   a = torch.tensor([[1, 2], [3, 4], [5, 6]])  # Shape: (3, 2)
   b = a.reshape(-1, -1)  # Error, as two dimensions can't be -1
   ```

---






In [47]:
q = torch.ones(4,4)
q

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [50]:
q.reshape(2,2,2,2)

tensor([[[[1., 1.],
          [1., 1.]],

         [[1., 1.],
          [1., 1.]]],


        [[[1., 1.],
          [1., 1.]],

         [[1., 1.],
          [1., 1.]]]])

In [51]:
# Flattens the Tensor to one dimensional Tenor
q.flatten()

tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [52]:
s = torch.rand(2,3,4)
s

tensor([[[0.8727, 0.4904, 0.7581, 0.6359],
         [0.5027, 0.0819, 0.9114, 0.0129],
         [0.8586, 0.4301, 0.7264, 0.7931]],

        [[0.8030, 0.9458, 0.7073, 0.9908],
         [0.0038, 0.2487, 0.1595, 0.3193],
         [0.3111, 0.2432, 0.9495, 0.8161]]])

### Permuataion

**`tensor.permute`** is a PyTorch function that rearranges the dimensions (axes) of a tensor in the order you specify. It doesn't change the data, only how the dimensions are ordered.

**Key Idea**

 - It reorders the axes of the tensor.
 - Useful for changing the shape of data to match specific requirements, such as for different operations or libraries.

 -`permute` changes the order of dimensions but doesn’t modify the underlying data.
 -It differs from view or reshape because it doesn't flatten or reassign data; it just reorders axes.

In [55]:
s.permute(2, 1, 0).shape

torch.Size([4, 3, 2])

### Sequeeze & Unsqueeze

`torch.unsqueeze` function adds a new dimension (axis) of size 1 to a tensor at a specified position. It's useful for aligning tensor dimensions for operations like broadcasting or for reshaping data.



In [57]:
s.shape

torch.Size([2, 3, 4])

Now we can add another dimension at any given index in `s` vector.

In [59]:
# Adding a new dimension at index 1
s.unsqueeze(1).shape

torch.Size([2, 1, 3, 4])

Opposite of `unsqueeze` is `squeeze` function,, which removes dimension t any given index

In [61]:
s.squeeze(1).shape

torch.Size([2, 3, 4])

# 10. Numpy and Pytorch Tensor

Both NumPy arrays and PyTorch tensors are used for handling multidimensional data, but there are key differences between the two in terms of functionality, performance, and use cases.

---

### **Key Differences**

| Feature                | NumPy Array                           | PyTorch Tensor                        |
|------------------------|----------------------------------------|---------------------------------------|
| **Device Support**     | Works only on CPU.                    | Works on both CPU and GPU (via CUDA). |
| **Data Types**         | Limited to basic data types.          | Supports advanced types (e.g., `float16`, `bfloat16`) and gradient-enabled types. |
| **Automatic Gradients**| Not supported.                        | Supports automatic differentiation (`requires_grad=True`). |
| **Performance**        | Limited to single-threaded or CPU-parallel operations. | Optimized for GPUs and large-scale parallel operations. |
| **Deep Learning**      | Not directly usable for training ML models. | Designed for deep learning and seamlessly integrates with libraries like PyTorch. |
| **Interoperability**   | Standard for numerical computation in Python. | Can convert to/from NumPy arrays easily. |
| **Broadcasting**       | Supports broadcasting for arithmetic. | Also supports broadcasting, with behavior similar to NumPy. |

---






In [62]:
import numpy as np

In [63]:
t = torch.tensor([1,2,3])
t

tensor([1, 2, 3])

In [65]:
# Converting pytorch tensor to numpy tensor
numpy_tensor = t.numpy()
numpy_tensor

array([1, 2, 3])

In [66]:
print(type(t))
print(type(numpy_tensor))

<class 'torch.Tensor'>
<class 'numpy.ndarray'>


We can convert numpy array to pytorch Tensor using `from_numpy()` function

In [67]:
numpy_tensor2 = np.array([1,3,57,98])
numpy_tensor2

array([ 1,  3, 57, 98])

In [69]:
numpy_to_torch = torch.from_numpy(numpy_tensor2)
type(numpy_to_torch)

torch.Tensor