# Tensors in Pytorch

### What are tensors?
Anything that arrays (numpy arrays) do in one dimension; tensors do them in n-dimension.

---

### Why are tensors useful?
- Mathematical operations.
- Representation of real-world data.
- **Efficient computations:** Tensors are optimized for hardware acceleration.

---

### Why are are tensors used in DL?
- Data storage.
- Weights and biases.
- Matrix based operations.
- Training process.


### Setup and Import

In [1]:
import torch
print(torch.__version__)

2.5.1+cu124


In [2]:
if torch.cuda.is_available():
  device = torch.device('cuda')
else:
  device = torch.device('cpu')

print(device)

cpu


### Creating Tensors

In [50]:
print(torch.empty(2, 3))
print(torch.zeros(2, 3))
print(torch.ones(2, 3))
print(torch.rand(2, 3))

# To acheive reproducability when it comes to randomly generating tensors
torch.manual_seed(42)
print(torch.rand(2, 3))

# To create a custom tensor
print(torch.tensor([[1, 2, 3], [4, 5, 6]]))

# Other ways
print(torch.arange(0, 10, 2))
print(torch.linspace(0, 10, 5))
print(torch.eye(5))  # To get an identity matrix of shape (5, 5)
print(torch.full((2, 3), 5))

tensor([[4.1306e-09, 4.4710e-41, 7.1780e-34],
        [0.0000e+00, 4.4842e-44, 0.0000e+00]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[0.0916, 0.0399, 0.8603],
        [0.9275, 0.4440, 0.3461]])
tensor([[0.8823, 0.9150, 0.3829],
        [0.9593, 0.3904, 0.6009]])
tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([0, 2, 4, 6, 8])
tensor([ 0.0000,  2.5000,  5.0000,  7.5000, 10.0000])
tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])
tensor([[5, 5, 5],
        [5, 5, 5]])


### Working with Tensors

In [6]:
t = torch.tensor([[1, 2, 3], [4, 5, 6]])

print(t.shape)

# To create a tensors replicating the shape of an existing tensor
print(torch.empty_like(t))
print(torch.zeros_like(t))
print(torch.ones_like(t))
print(torch.rand_like(t)) # This line will generate an error; as rand generates floats (between 0 and 1). For, this you need to understand the data types.

torch.Size([2, 3])
tensor([[              0,       137728928,       118334912],
        [134062563048720,               0,               0]])
tensor([[0, 0, 0],
        [0, 0, 0]])
tensor([[1, 1, 1],
        [1, 1, 1]])


RuntimeError: "check_uniform_bounds" not implemented for 'Long'

### Understanding Tensor Data-types

In [49]:
t = torch.tensor([[1, 2, 3], [4, 5, 6]])
t_1 = torch.tensor([1.0, 2.0, 3.0])

print(t.dtype)
print(t_1.dtype)
print(torch.tensor([1.0, 2.0, 3.0], dtype=torch.int32))
print(torch.tensor([1, 2, 3], dtype=torch.float64))
print(t.to(torch.float64))  # To convert a datatype from one to another

t = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.long)
print(t.dtype)

t = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.double)
print(t.dtype)

torch.int64
torch.float32
tensor([1, 2, 3], dtype=torch.int32)
tensor([1., 2., 3.], dtype=torch.float64)
tensor([[1., 2., 3.],
        [4., 5., 6.]], dtype=torch.float64)
torch.int64
torch.float64


In [48]:
# Therefore, the solution to the above problem can be
print(torch.rand_like(t, dtype=torch.float32))

tensor([[0.2852, 0.1391, 0.2684],
        [0.9310, 0.3423, 0.6405]])


### Performing Operations on Tensors

##### Scalar operations

In [47]:
t = torch.tensor([[1, 2, 3], [4, 5, 6]])

print(t + 2)
print(t - 2)
print(t * 2)
print(t / 2)
print(t**2)
print((t*100)//2)
print(((t*100)//2)%2)

print(t)  # From here you can observe that scalr operations are not inplace

tensor([[3, 4, 5],
        [6, 7, 8]])
tensor([[-1,  0,  1],
        [ 2,  3,  4]])
tensor([[ 2,  4,  6],
        [ 8, 10, 12]])
tensor([[0.5000, 1.0000, 1.5000],
        [2.0000, 2.5000, 3.0000]])
tensor([[ 1,  4,  9],
        [16, 25, 36]])
tensor([[ 50, 100, 150],
        [200, 250, 300]])
tensor([[0, 0, 0],
        [0, 0, 0]])
tensor([[1, 2, 3],
        [4, 5, 6]])


##### Element-wise operations

In [46]:
a = torch.tensor([[1, 2, 3], [4, 5, 6]])
b = torch.tensor([[7, 8, 9], [10, 11, 12]])

print(a + b)
print(a - b)
print(a * b)
print(a / b)
print(a**b)
print((a*100)//b)
print(((a*100)//b)%2)

# Element-wise operations are also not inplace
print(a)
print(b)

tensor([[ 8, 10, 12],
        [14, 16, 18]])
tensor([[-6, -6, -6],
        [-6, -6, -6]])
tensor([[ 7, 16, 27],
        [40, 55, 72]])
tensor([[0.1429, 0.2500, 0.3333],
        [0.4000, 0.4545, 0.5000]])
tensor([[         1,        256,      19683],
        [   1048576,   48828125, 2176782336]])
tensor([[14, 25, 33],
        [40, 45, 50]])
tensor([[0, 1, 1],
        [0, 1, 0]])
tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([[ 7,  8,  9],
        [10, 11, 12]])


##### Reduction operations

In [45]:
t = torch.tensor([1.7, -2.2, 3.5, 4.5, 5.7])

print(t.sum())
print(torch.sum(t, dim=0))  # For, along columns (and dim=1 for along rows)
print(t.prod())
print(torch.prod(t, dim=0))
print(t.mean())
print(torch.mean(t, dim=0))
print(t.std())
print(torch.std(t, dim=0))
print(t.var())
print(torch.var(t, dim=0))
print(t.max())
print(t.min())
print(t.argmin())
print(t.argmax())

print(t)  # Not inplace

tensor(13.2000)
tensor(13.2000)
tensor(-335.7585)
tensor(-335.7585)
tensor(2.6400)
tensor(2.6400)
tensor(3.0770)
tensor(3.0770)
tensor(9.4680)
tensor(9.4680)
tensor(5.7000)
tensor(-2.2000)
tensor(1)
tensor(4)
tensor([ 1.7000, -2.2000,  3.5000,  4.5000,  5.7000])


##### Other operations

In [44]:
t = torch.tensor([1.7, -2.2, 3.5, 4.5, 5.7])

print(t.abs())
print(t.sqrt())
print(t.neg())
print(t.round())
print(t.ceil())
print(t.floor())
print(t.clamp(min=0, max=3))

print(t)  # Also, not inplace

tensor([1.7000, 2.2000, 3.5000, 4.5000, 5.7000])
tensor([1.3038,    nan, 1.8708, 2.1213, 2.3875])
tensor([-1.7000,  2.2000, -3.5000, -4.5000, -5.7000])
tensor([ 2., -2.,  4.,  4.,  6.])
tensor([ 2., -2.,  4.,  5.,  6.])
tensor([ 1., -3.,  3.,  4.,  5.])
tensor([1.7000, 0.0000, 3.0000, 3.0000, 3.0000])
tensor([ 1.7000, -2.2000,  3.5000,  4.5000,  5.7000])


##### Matrix operations

In [43]:
a = torch.tensor([[1, 2], [4, 5]])
b = torch.tensor([[7, 8, 9], [10, 11, 12]])

print(a.matmul(b))
print(torch.matmul(a, b))
print(a @ b)

a_1 = torch.tensor([1, 2, 3])
b_1 = torch.tensor([4, 5, 6])

print(a_1.dot(b_1))
print(torch.dot(a_1, b_1))

print(torch.transpose(a, 0, 1)) # Swapping dimensions 0 (columns) by 1 (rows)

# Here, the following two operation require the data to be in floating point or complex data-types
print(torch.inverse(a.to(torch.double)))
print(torch.det(a.to(torch.double)))

# Also, not inplace
print(a)
print(b)
print(a_1)
print(b_1)

tensor([[27, 30, 33],
        [78, 87, 96]])
tensor([[27, 30, 33],
        [78, 87, 96]])
tensor([[27, 30, 33],
        [78, 87, 96]])
tensor(32)
tensor(32)
tensor([[1, 4],
        [2, 5]])
tensor([[-1.6667,  0.6667],
        [ 1.3333, -0.3333]], dtype=torch.float64)
tensor(-3., dtype=torch.float64)
tensor([[1, 2],
        [4, 5]])
tensor([[ 7,  8,  9],
        [10, 11, 12]])
tensor([1, 2, 3])
tensor([4, 5, 6])


##### Comparision operations

In [42]:
t = torch.tensor([[1, 2, 3], [4, 5, 6]])

print(t == 2)
print(t != 2)
print(t > 2)
print(t < 2)
print(t >= 2)
print(t <= 2)

print(t)  # Not inplace

tensor([[False,  True, False],
        [False, False, False]])
tensor([[ True, False,  True],
        [ True,  True,  True]])
tensor([[False, False,  True],
        [ True,  True,  True]])
tensor([[ True, False, False],
        [False, False, False]])
tensor([[False,  True,  True],
        [ True,  True,  True]])
tensor([[ True,  True, False],
        [False, False, False]])
tensor([[1, 2, 3],
        [4, 5, 6]])


##### Special function based operations

In [52]:
t = torch.tensor([[1, 2, 3], [4, 5, 6]])

print(torch.exp(t))
print(torch.log(t))
print(torch.sin(t))
print(torch.cos(t))
print(torch.tan(t))
print(torch.sigmoid(t))
print(torch.tanh(t))
print(torch.relu(t))

# Here, the softamax function requires the data to be floating point data-type
print(torch.softmax(t.to(torch.double), dim=0))

print(t)  # Not inplace

tensor([[  2.7183,   7.3891,  20.0855],
        [ 54.5981, 148.4132, 403.4288]])
tensor([[0.0000, 0.6931, 1.0986],
        [1.3863, 1.6094, 1.7918]])
tensor([[ 0.8415,  0.9093,  0.1411],
        [-0.7568, -0.9589, -0.2794]])
tensor([[ 0.5403, -0.4161, -0.9900],
        [-0.6536,  0.2837,  0.9602]])
tensor([[ 1.5574, -2.1850, -0.1425],
        [ 1.1578, -3.3805, -0.2910]])
tensor([[0.7311, 0.8808, 0.9526],
        [0.9820, 0.9933, 0.9975]])
tensor([[0.7616, 0.9640, 0.9951],
        [0.9993, 0.9999, 1.0000]])
tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([[0.0474, 0.0474, 0.0474],
        [0.9526, 0.9526, 0.9526]], dtype=torch.float64)
tensor([[1, 2, 3],
        [4, 5, 6]])


##### Inplace operations

In [53]:
a = torch.tensor([[1, 2], [4, 5]])
b = torch.tensor([[7, 8], [10, 11]])

a.add_(b)

print(a)
print(b)

# The underscore represents that you want to perform an inplace operation

tensor([[ 8, 10],
        [14, 16]])
tensor([[ 7,  8],
        [10, 11]])


##### Copying a tensor: Traditional assignment of tensors does work here, but the copy is not a deep copy; meaning that a change to the former will also be reflected in the later. Therefore, we use the clone method here.

In [54]:
a = torch.tensor([[1, 2], [4, 5]])
b = a

a[0][0] = 100

print(a)
print(b)
print(id(a))
print(id(b))

# Therefore...
a = torch.tensor([[1, 2], [4, 5]])
b = a.clone()

a[0][0] = 100

print(a)
print(b)
print(id(a))
print(id(b))

tensor([[100,   2],
        [  4,   5]])
tensor([[100,   2],
        [  4,   5]])
137031499395248
137031499395248
tensor([[100,   2],
        [  4,   5]])
tensor([[1, 2],
        [4, 5]])
137031499383536
137031499383344


##### Reshaping a tensor

In [8]:
t = torch.ones(4, 4)

print(t.reshape(2, 2, 2, 2))

print(t.flatten())

print(t)  # Not, inplace

t = torch.rand(2, 3, 4)

print(t.permute(2, 0, 1).shape)

print(t)  # Not, inplace

# Squeeze
t = torch.rand(1, 1, 1, 2)

print(t.squeeze().shape)

# Unsqueeze
t = torch.rand(2, 3)

print(t.unsqueeze(dim=0).shape)

tensor([[[[1., 1.],
          [1., 1.]],

         [[1., 1.],
          [1., 1.]]],


        [[[1., 1.],
          [1., 1.]],

         [[1., 1.],
          [1., 1.]]]])
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
torch.Size([4, 2, 3])
tensor([[[0.0921, 0.8987, 0.8410, 0.8655],
         [0.7987, 0.1555, 0.0953, 0.7473],
         [0.5916, 0.9969, 0.1437, 0.3681]],

        [[0.4725, 0.4220, 0.8774, 0.8746],
         [0.9291, 0.4621, 0.1010, 0.5196],
         [0.2775, 0.4912, 0.4346, 0.9985]]])
torch.Size([2])
torch.Size([1, 2, 3])


### Creating Tensors on GPU

In [2]:
import torch

In [4]:
print(torch.cuda.is_available())

device = torch.device('cuda')
print(device)

# Creating an tensor directly on the GPU
t = torch.tensor([[1, 2, 3], [4, 5, 6]], device=device)
print(t)

# Moving a tensor from CPU to GPU
t_1 = t.to(device)
print(t_1)

True
cuda
tensor([[1, 2, 3],
        [4, 5, 6]], device='cuda:0')
tensor([[1, 2, 3],
        [4, 5, 6]], device='cuda:0')


##### Comparision

In [5]:
import time

# Checking for GPU availability
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')
    print("GPU not available, using CPU.")

# Defining tensor size
size = 10000

# Creating tensors on CPU and GPU
cpu_tensor = torch.randn(size, size, device='cpu')
gpu_tensor = torch.randn(size, size, device=device)


# Matrix Multiplication on CPU
start_time = time.time()
cpu_result = cpu_tensor @ cpu_tensor
end_time = time.time()
cpu_time = end_time - start_time
print(f"CPU Matrix Multiplication Time: {cpu_time:.4f} seconds")

# Matrix Multiplication on GPU
start_time = time.time()
gpu_result = gpu_tensor @ gpu_tensor
end_time = time.time()
gpu_time = end_time - start_time
print(f"GPU Matrix Multiplication Time: {gpu_time:.4f} seconds")


# Calculating and print speedup
if torch.cuda.is_available():
    speedup = cpu_time / gpu_time
    print(f"Speedup: {speedup:.2f}x")

CPU Matrix Multiplication Time: 16.4210 seconds
GPU Matrix Multiplication Time: 0.1420 seconds
Speedup: 115.62x


### Converting a Tensor from Pytorch and numpy and vice versa

In [10]:
import numpy as np

# PyTorch tensor to NumPy array
tensor = torch.randn(3, 4)
numpy_array = tensor.numpy()
print(f"PyTorch Tensor:\n{tensor}")
print(f"NumPy Array:\n{numpy_array}")


# NumPy array to PyTorch tensor
numpy_array = np.random.rand(2, 3)
tensor = torch.from_numpy(numpy_array)
print(f"NumPy Array:\n{numpy_array}")
print(f"PyTorch Tensor:\n{tensor}")

PyTorch Tensor:
tensor([[-0.9128,  0.6102, -0.8808, -0.7788],
        [-1.1297, -0.9826, -0.7798, -1.4082],
        [ 1.3470,  2.0123, -0.7765, -0.4997]])
NumPy Array:
[[-0.9127962   0.610217   -0.880759   -0.77880424]
 [-1.1297117  -0.98264474 -0.7798132  -1.4082057 ]
 [ 1.3470302   2.0123408  -0.7764823  -0.49972594]]
NumPy Array:
[[0.17819056 0.24060718 0.29297762]
 [0.22062953 0.67330044 0.31739022]]
PyTorch Tensor:
tensor([[0.1782, 0.2406, 0.2930],
        [0.2206, 0.6733, 0.3174]], dtype=torch.float64)


# Understanding the Pytorch Training Pipeline

### Setup and Imports

In [None]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
import torch

### Loading the Dataset

In [None]:
!kaggle datasets download uciml/breast-cancer-wisconsin-data

Dataset URL: https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data
License(s): CC-BY-NC-SA-4.0
Downloading breast-cancer-wisconsin-data.zip to /content
  0% 0.00/48.6k [00:00<?, ?B/s]
100% 48.6k/48.6k [00:00<00:00, 45.3MB/s]


In [None]:
!unzip -qq breast-cancer-wisconsin-data.zip

In [None]:
data = pd.read_csv('/content/data.csv')

In [None]:
data.head()

Unnamed: 0,id,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,...,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst,Unnamed: 32
0,842302,M,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,...,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189,
1,842517,M,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,...,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902,
2,84300903,M,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,...,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758,
3,84348301,M,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,...,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173,
4,84358402,M,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,...,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678,


In [None]:
data.shape

(569, 33)

### Data Preprocessing

In [None]:
data.drop(['id', 'Unnamed: 32'], axis=1, inplace=True)

In [None]:
data.head()

Unnamed: 0,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,...,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
0,M,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,M,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,M,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
3,M,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
4,M,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678


In [None]:
X_train, X_test, y_train, y_test = train_test_split(data.drop('diagnosis', axis=1), data['diagnosis'], test_size=0.2, random_state=42)

In [None]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

In [None]:
encoder = LabelEncoder()
y_train = encoder.fit_transform(y_train)
y_test = encoder.transform(y_test)

In [None]:
X_train_tensor = torch.tensor(X_train, dtype=torch.float64)
X_test_tensor = torch.tensor(X_test, dtype=torch.float64)
y_train_tensor = torch.tensor(y_train)
y_test_tensor = torch.tensor(y_test)

In [None]:
print(type(X_train_tensor)) # For, testing
print(type(y_train_tensor))
print(type(X_test_tensor))
print(type(y_test_tensor))

<class 'torch.Tensor'>
<class 'torch.Tensor'>
<class 'torch.Tensor'>
<class 'torch.Tensor'>


### Model Architecture

In [None]:
class SimpleNN():

  def __init__(self, X):
    self.weights = torch.rand(X.shape[1], 1, dtype=torch.float64, requires_grad=True)
    self.bias = torch.zeros(1, dtype=torch.float64, requires_grad=True)

  def forward(self, X):
    z = torch.matmul(X, self.weights) + self.bias

    return torch.sigmoid(z) # y_pred

  def loss_fn(self, y_pred, y):
    # Clamping prediction to avoid log(0)
    epsilon = 1e-7
    y_pred = torch.clamp(y_pred, epsilon, 1-epsilon)

    return -torch.mean(y * torch.log(y_pred) + (1-y) * torch.log(1-y_pred)) # loss

### Important Hyper-parameters

In [None]:
lr = 0.1
epochs = 50

### Training (Pipeline)

In [None]:
model = SimpleNN(X_train_tensor)

In [None]:
for epoch in range(epochs):

  # Forward pass
  y_pred = model.forward(X_train_tensor)

  # Loss Calculation
  loss = model.loss_fn(y_pred, y_train_tensor)

  # Backward pass
  loss.backward()

  # Updating weights
  with torch.no_grad():
    model.weights -= lr * model.weights.grad
    model.bias -= lr * model.bias.grad

  # Zeroing gradients
  model.weights.grad.zero_()
  model.bias.grad.zero_()

  print(f'Epoch: {epoch+1}, Loss: {loss.item()}')


Epoch: 1, Loss: 3.1221360960148528
Epoch: 2, Loss: 2.9746338239946306
Epoch: 3, Loss: 2.8189572207816824
Epoch: 4, Loss: 2.655172726781337
Epoch: 5, Loss: 2.485839938224061
Epoch: 6, Loss: 2.3179844785636754
Epoch: 7, Loss: 2.1493499605740274
Epoch: 8, Loss: 1.9832522355338689
Epoch: 9, Loss: 1.8219434268387005
Epoch: 10, Loss: 1.664295642925575
Epoch: 11, Loss: 1.5175033487433607
Epoch: 12, Loss: 1.3834030021096522
Epoch: 13, Loss: 1.2637973877549475
Epoch: 14, Loss: 1.1602464904887708
Epoch: 15, Loss: 1.0736994980032526
Epoch: 16, Loss: 1.0040338415600267
Epoch: 17, Loss: 0.9498204151149261
Epoch: 18, Loss: 0.9085331413359746
Epoch: 19, Loss: 0.8772001679150564
Epoch: 20, Loss: 0.8531352283082796
Epoch: 21, Loss: 0.8342676943398079
Epoch: 22, Loss: 0.8191144306203729
Epoch: 23, Loss: 0.8066442851953071
Epoch: 24, Loss: 0.7961496501073025
Epoch: 25, Loss: 0.7871462406737466
Epoch: 26, Loss: 0.7792997545464351
Epoch: 27, Loss: 0.7723748469723182
Epoch: 28, Loss: 0.7662011925247142
Epoc

### Evaluation

In [None]:
with torch.no_grad():
  y_pred = model.forward(X_test_tensor)
  y_pred = (y_pred > 0.5).float()
  accuracy = (y_pred == y_test_tensor).float().mean()

  print(f'Accuracy: {accuracy.item()}')

Accuracy: 0.5689442753791809


# PyTorch NN and `torch.optim` Module

### The use of the said PyTorch modules will help us in improving the fundamental PyTorch training pipeline by:
- Building the neural network using the nn module
- Using built-in activation functions
- Built-in loss functions
- Built-in Optimizers


### Setup and Imports

In [None]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder

### Loading the Dataset

**Note:** Here, we just generating a random dataset to simulate the processes

In [None]:
features = torch.rand(1000, 5)

### Model Architecture

In [None]:
class SimpleModel1(nn.Module):

  def __init__(self, num_features):
    super().__init__()

    self.linear = nn.Linear(num_features, 1)
    self.sigmoid = nn.Sigmoid()

  def forward(self, features):
    output = self.linear(features)
    output = self.sigmoid(output)

    return output

Let's build a more sofisticated model to leverage other PyTorch provided utilities

In [None]:
class SimpleModel2(nn.Module):

  def __init__(self, num_features):
    super().__init__()

    self.linear1 = nn.Linear(num_features, 3)
    self.relu = nn.ReLU()
    self.linear2 = nn.Linear(3, 1)
    self.sigmoid = nn.Sigmoid()

  def forward(self, features):
    output = self.linear1(features)
    output = self.relu(output)
    output = self.linear2(output)
    output = self.sigmoid(output)

    return output

Individually, defining the layers are not considered amoung the best of practices when building deep learning models through PyTorch; therefore, we encapsule the layers within a single sequential module, like:



```
# Within the constructor
self.network = nn.Sequential(
    nn.Linear(num_features, 3),
    nn.ReLU(),
    nn.Linear(3, 1),
    nn.Sigmoid()
)

# Within the forward pass method
return self.network(features)
```

### Training (Pipeline)

In [None]:
model = SimpleModel1(features.shape[1])

In [None]:
model(features) # Forward pass

tensor([[0.3484],
        [0.4229],
        [0.4356],
        [0.4234],
        [0.3726],
        [0.3863],
        [0.3953],
        [0.4213],
        [0.3306],
        [0.3773],
        [0.4033],
        [0.3828],
        [0.3988],
        [0.3934],
        [0.4114],
        [0.4049],
        [0.4666],
        [0.3691],
        [0.3918],
        [0.3603],
        [0.3995],
        [0.3647],
        [0.4382],
        [0.3620],
        [0.4119],
        [0.4113],
        [0.4413],
        [0.4465],
        [0.4098],
        [0.3978],
        [0.4034],
        [0.4257],
        [0.3892],
        [0.3564],
        [0.4149],
        [0.4051],
        [0.4116],
        [0.3133],
        [0.4122],
        [0.4018],
        [0.4389],
        [0.4336],
        [0.3901],
        [0.4149],
        [0.4282],
        [0.3779],
        [0.4243],
        [0.4690],
        [0.4160],
        [0.4011],
        [0.3997],
        [0.3967],
        [0.4347],
        [0.4292],
        [0.3975],
        [0

**Note:** The reason is that we are not calling the forward function the traditional way is that, within the core pytorch module (from which the inheritence has been made), has overriden the `__call__()` magic function.

In [None]:
print(model.linear.weight)
print(model.linear.bias)

Parameter containing:
tensor([[ 0.2996, -0.2684,  0.0810, -0.3107,  0.0780]], requires_grad=True)
Parameter containing:
tensor([-0.3437], requires_grad=True)


If you want to get a summary of your model; do the following

In [None]:
! pip install torchinfo

Collecting torchinfo
  Downloading torchinfo-1.8.0-py3-none-any.whl.metadata (21 kB)
Downloading torchinfo-1.8.0-py3-none-any.whl (23 kB)
Installing collected packages: torchinfo
Successfully installed torchinfo-1.8.0


In [None]:
from torchinfo import summary

summary(model, input_size=(1000, 5))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleModel1                             [1000, 1]                 --
├─Linear: 1-1                            [1000, 1]                 6
├─Sigmoid: 1-2                           [1000, 1]                 --
Total params: 6
Trainable params: 6
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 0.01
Input size (MB): 0.02
Forward/backward pass size (MB): 0.01
Params size (MB): 0.00
Estimated Total Size (MB): 0.03

In [None]:
# Training (Pipeline) for the more sophisticated model
model = SimpleModel2(features.shape[1])

model(features) # Forward pass

print(model.linear1.weight)
print(model.linear1.bias)
print(model.linear2.weight)
print(model.linear2.bias)
print()

print("Summary:")
summary(model, input_size=(1000, 5))

Parameter containing:
tensor([[ 0.3217,  0.3939, -0.2001,  0.3666,  0.1902],
        [-0.0878, -0.3509,  0.0813,  0.3021,  0.2419],
        [ 0.1969,  0.3344,  0.1038,  0.2941, -0.3094]], requires_grad=True)
Parameter containing:
tensor([-0.1663,  0.3343,  0.2039], requires_grad=True)
Parameter containing:
tensor([[-0.3705, -0.0688,  0.3127]], requires_grad=True)
Parameter containing:
tensor([0.1975], requires_grad=True)

Summary:


Layer (type:depth-idx)                   Output Shape              Param #
SimpleModel2                             [1000, 1]                 --
├─Linear: 1-1                            [1000, 3]                 18
├─ReLU: 1-2                              [1000, 3]                 --
├─Linear: 1-3                            [1000, 1]                 4
├─Sigmoid: 1-4                           [1000, 1]                 --
Total params: 22
Trainable params: 22
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 0.02
Input size (MB): 0.02
Forward/backward pass size (MB): 0.03
Params size (MB): 0.00
Estimated Total Size (MB): 0.05

### Now, imporving the previously developed architecture

#### Setup and Imports

In [None]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
import torch
import torch.nn as nn
import torch.optim as optim

#### Loading the Dataset

In [None]:
!kaggle datasets download uciml/breast-cancer-wisconsin-data

Dataset URL: https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data
License(s): CC-BY-NC-SA-4.0
breast-cancer-wisconsin-data.zip: Skipping, found more recently modified local copy (use --force to force download)


In [None]:
!unzip -qq breast-cancer-wisconsin-data.zip

replace data.csv? [y]es, [n]o, [A]ll, [N]one, [r]ename: y


In [None]:
data = pd.read_csv('/content/data.csv')

In [None]:
data.head()

Unnamed: 0,id,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,...,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst,Unnamed: 32
0,842302,M,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,...,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189,
1,842517,M,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,...,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902,
2,84300903,M,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,...,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758,
3,84348301,M,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,...,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173,
4,84358402,M,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,...,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678,


In [None]:
data.shape

(569, 33)

#### Data Preprocessing

In [None]:
data.drop(['id', 'Unnamed: 32'], axis=1, inplace=True)

In [None]:
data.head()

Unnamed: 0,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,...,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
0,M,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,M,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,M,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
3,M,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
4,M,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678


In [None]:
X_train, X_test, y_train, y_test = train_test_split(data.drop('diagnosis', axis=1), data['diagnosis'], test_size=0.2, random_state=42)

In [None]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

In [None]:
encoder = LabelEncoder()
y_train = encoder.fit_transform(y_train)
y_test = encoder.transform(y_test)

In [None]:
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32)

In [None]:
print(type(X_train_tensor)) # For, testing
print(type(y_train_tensor))
print(type(X_test_tensor))
print(type(y_test_tensor))

<class 'torch.Tensor'>
<class 'torch.Tensor'>
<class 'torch.Tensor'>
<class 'torch.Tensor'>


#### Model Architecture

In [None]:
class SimpleNN1(nn.Module):

  def __init__(self, num_features):
    super().__init__()

    self.neural_network = nn.Sequential(
      nn.Linear(num_features, 1),
      nn.Sigmoid()
    )

  def forward(self, X):
    return self.neural_network(X) # y_pred

  # def loss_fn(self, y_pred, y):
  #   # Clamping prediction to avoid log(0)
  #   epsilon = 1e-7
  #   y_pred = torch.clamp(y_pred, epsilon, 1-epsilon)

  #   return -torch.mean(y * torch.log(y_pred) + (1-y) * torch.log(1-y_pred)) # loss

In [None]:
loss_function = nn.BCELoss()

#### Important Hyper-parameters

In [None]:
lr = 0.1
epochs = 50

#### Training (Pipeline)

In [None]:
model = SimpleNN1(X_train_tensor.shape[1])

optimizer = optim.SGD(model.parameters(), lr=lr)  # Here, model.parameters() is an iterator that loops through all of the involved parameter present within the network

In [None]:
for epoch in range(epochs):

  # Forward pass
  y_pred = model(X_train_tensor)

  # Loss Calculation
  loss = loss_function(y_pred, y_train_tensor.view(-1, 1))

  optimizer.zero_grad()

  # Backward pass
  loss.backward()

  # # Updating weights
  # with torch.no_grad():
  #   model.linear.weights -= lr * model.linear.weight.grad
  #   model.linear.bias -= lr * model.linear.bias.grad

  # # Zeroing gradients
  # model.linear.weights.grad.zero_()
  # model.linear.bias.grad.zero_()

  optimizer.step()

  print(f'Epoch: {epoch+1}, Loss: {loss.item()}')


Epoch: 1, Loss: 0.9145975708961487
Epoch: 2, Loss: 0.6562520861625671
Epoch: 3, Loss: 0.5155688524246216
Epoch: 4, Loss: 0.4359382688999176
Epoch: 5, Loss: 0.38459667563438416
Epoch: 6, Loss: 0.3482820689678192
Epoch: 7, Loss: 0.32098859548568726
Epoch: 8, Loss: 0.29958412051200867
Epoch: 9, Loss: 0.28226158022880554
Epoch: 10, Loss: 0.2678986191749573
Epoch: 11, Loss: 0.2557583451271057
Epoch: 12, Loss: 0.2453346848487854
Epoch: 13, Loss: 0.23626773059368134
Epoch: 14, Loss: 0.22829383611679077
Epoch: 15, Loss: 0.2212149053812027
Epoch: 16, Loss: 0.21487906575202942
Epoch: 17, Loss: 0.2091677188873291
Epoch: 18, Loss: 0.20398671925067902
Epoch: 19, Loss: 0.19926045835018158
Epoch: 20, Loss: 0.19492729008197784
Epoch: 21, Loss: 0.19093644618988037
Epoch: 22, Loss: 0.1872459203004837
Epoch: 23, Loss: 0.18382033705711365
Epoch: 24, Loss: 0.18062976002693176
Epoch: 25, Loss: 0.17764882743358612
Epoch: 26, Loss: 0.1748557686805725
Epoch: 27, Loss: 0.1722317934036255
Epoch: 28, Loss: 0.1697

### Evaluation

In [None]:
with torch.no_grad():
  y_pred = model.forward(X_test_tensor)
  y_pred = (y_pred > 0.5).float()
  accuracy = (y_pred == y_test_tensor).float().mean()

  print(f'Accuracy: {accuracy.item()}')

Accuracy: 0.5301631093025208


# Building an ANN Using Pytorch

### Setup and Imports

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset, Dataset

In [None]:
torch.manual_seed(42)

<torch._C.Generator at 0x7d9637962890>

### Loading the Data

In [None]:
!kaggle datasets download zalando-research/fashionmnist

Dataset URL: https://www.kaggle.com/datasets/zalando-research/fashionmnist
License(s): other
Downloading fashionmnist.zip to /content
 99% 68.0M/68.8M [00:00<00:00, 162MB/s]
100% 68.8M/68.8M [00:00<00:00, 148MB/s]


In [None]:
!unzip -qq fashionmnist.zip

In [None]:
data = pd.read_csv('/content/fashion-mnist_train.csv')

In [None]:
data.head()

Unnamed: 0,label,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,...,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783,pixel784
0,2,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,9,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,6,0,0,0,0,0,0,0,5,0,...,0,0,0,30,43,0,0,0,0,0
3,0,0,0,0,1,2,0,0,0,0,...,3,0,0,0,0,1,0,0,0,0
4,3,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### Data Preprocessing

In [None]:
X = data.iloc[:, 1:].values
y = data.iloc[:, 0].values

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
# Scaling
X_train = X_train / 255.0
X_test = X_test / 255.0

In [None]:
class CustomDataset(Dataset):

  def __init__(self, features, labels):
    self.features = torch.tensor(features, dtype=torch.float32)
    self.labels = torch.tensor(labels, dtype=torch.long)

  def __len__(self):
    return len(self.features)

  def __getitem__(self, idx):
    return self.features[idx], self.labels[idx]

In [None]:
train_dataset = CustomDataset(X_train, y_train)
test_dataset = CustomDataset(X_test, y_test)

In [None]:
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

### Model Architecture

In [None]:
class NN(nn.Module):

  def __init__(self, num_features):
    super().__init__()

    self.model = nn.Sequential(
        nn.Linear(num_features, 128),
        nn.ReLU(),
        nn.Linear(128, 64),
        nn.ReLU(),
        nn.Linear(64, 10)
    )

  def forward(self, X):
    return self.model(X)

### Important hyperparameters

In [None]:
epochs = 10
lr = 0.1

### Training (Pipeline)

In [None]:
model = NN(X_train.shape[1])

criterion = nn.CrossEntropyLoss()

optimizer = optim.SGD(model.parameters(), lr=lr)

In [None]:
for epoch in range(epochs):
  total_epoch_loss = 0

  for batch_idx, (features, labels) in enumerate(train_loader):

    y_pred = model(features)

    optimizer.zero_grad()

    loss = criterion(y_pred, labels)

    optimizer.step()

    total_epoch_loss += loss.item()

  print(f'Epoch: {epoch+1}, Average Epoch Loss: {total_epoch_loss/len(train_loader)}')

Epoch: 1, Average Epoch Loss: 2.2953363132476805
Epoch: 2, Average Epoch Loss: 2.2953363134066262
Epoch: 3, Average Epoch Loss: 2.2953363089561463
Epoch: 4, Average Epoch Loss: 2.295336306889852
Epoch: 5, Average Epoch Loss: 2.2953363130887348
Epoch: 6, Average Epoch Loss: 2.2953363183339435
Epoch: 7, Average Epoch Loss: 2.2953363200823467
Epoch: 8, Average Epoch Loss: 2.2953363060951233
Epoch: 9, Average Epoch Loss: 2.295336319128672
Epoch: 10, Average Epoch Loss: 2.2953363149960837


### Evalaution

In [None]:
model.eval()

NN(
  (model): Sequential(
    (0): Linear(in_features=784, out_features=128, bias=True)
    (1): ReLU()
    (2): Linear(in_features=128, out_features=64, bias=True)
    (3): ReLU()
    (4): Linear(in_features=64, out_features=10, bias=True)
  )
)

**Note:** It is important to explicitly tell the model that it is in evaluaiton mode. The puporse of doing this is to ensure that model does not simulate the training process behaviour when we are evaluating or testing our model on unseen data. Doing this is to avoid situations, such as:
- When the behaviour of the dropout layers needs to be nullified; once the model has been trained.
- Secondly, during batch normalization real-time mean and `std` calculations are not required to be made and updated; we just use the one's finalised during training.

In [None]:
total = 0
correct = 0

In [None]:
with torch.no_grad():

  for batch_idx, (features, labels) in enumerate(test_loader):

    y_pred = model(features)

    _, predicted = torch.max(y_pred.data, 1)

    total += labels.size(0)
    correct += (predicted == labels).sum().item()

In [None]:
print(correct/total)

0.10358333333333333
