# Definitions

Batch Size: In machine learning, the training process involves updating the model's parameters based on the gradients computed from a batch of training samples. The batch size determines the number of samples that are processed together before the model's parameters are updated.

For example, with a dataset of 1000 samples and a batch size of 100, each epoch (one full pass over the dataset) consists of 10 iterations, that is, 10 updates of the model's parameters.
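
The arithmetic can be checked with a short sketch. The helper name `updates_per_epoch` is made up for illustration, and it assumes a final partial batch is kept rather than dropped.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of parameter updates in one pass over the data.

    Assumes the last, possibly smaller, batch is still used.
    """
    return math.ceil(num_samples / batch_size)

print(updates_per_epoch(1000, 100))  # 10
print(updates_per_epoch(1000, 64))   # 16 (15 full batches plus one batch of 40)
```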

Reasons for using batches:
* Memory efficiency: with a batch size smaller than the entire dataset, only part of the data has to be held in memory at once.
* Computational efficiency: batch processing can take advantage of parallelism in modern hardware, such as GPUs. By processing multiple samples simultaneously, the computations can be distributed across multiple cores or devices, leading to faster training times.
* More frequent updates: the parameters are updated after every batch rather than once per pass over the data, which can help the model converge toward a (local) minimum sooner.

* Larger batch size: more stable gradient estimates, but each update is computationally more expensive.
* Smaller batch size: more frequent parameter updates and potentially faster convergence, but noisier gradient estimates.
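
The stability claim can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the per-sample "gradients" are synthetic numbers (a true gradient of 1.0 plus Gaussian noise), not gradients from the model in this notebook, and `batch_gradient_spread` is a hypothetical helper.

```python
import random
import statistics

random.seed(0)

# Synthetic per-sample gradients: true gradient 1.0 plus unit Gaussian noise (assumption)
per_sample_grads = [1.0 + random.gauss(0.0, 1.0) for _ in range(10_000)]

def batch_gradient_spread(grads, batch_size, num_batches=500):
    """Standard deviation of the mini-batch mean gradient over random batches."""
    estimates = [
        statistics.fmean(random.sample(grads, batch_size))
        for _ in range(num_batches)
    ]
    return statistics.stdev(estimates)

spread_small = batch_gradient_spread(per_sample_grads, batch_size=4)
spread_large = batch_gradient_spread(per_sample_grads, batch_size=256)

print(f"batch size   4: spread of gradient estimate = {spread_small:.3f}")
print(f"batch size 256: spread of gradient estimate = {spread_large:.3f}")
```

The larger batch averages over more samples, so its estimate of the gradient varies less from batch to batch, at the cost of processing 64 times as many samples per update.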

***

In [2]:
import torch
import torch.nn as nn

In [3]:
# Model Framework
class LogisticRegression(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.linear = nn.Linear(input_size, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        out = self.linear(x)
        out = self.sigmoid(out)  # probabilities
        return out

In [24]:
# Sample dataset
X = torch.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]])
Y = torch.tensor([[0.0], [0.0], [1.0], [1.0]])

# Model hyperparameters
class Params(object):
    def __init__(self,input_size,learning_rate,epochs, threshold):
        self.input_size = input_size
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.threshold = threshold
        
args = Params(input_size=2, learning_rate=0.01, epochs=100, threshold=0.7)

# Initialize the model
model = LogisticRegression(args.input_size)


# Define Loss Function and Optimizer 
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(),lr=args.learning_rate)

# Training loop (full-batch: all four samples are used for every update)
for e in range(args.epochs):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, Y)

    # Backward pass and optimization

    # PyTorch accumulates gradients by default, so the gradients left over
    # from the previous iteration must be zeroed before backpropagating.
    # Otherwise they would add up and distort subsequent parameter updates.
    optimizer.zero_grad()

    # Compute the gradients of the loss with respect to every tensor in the
    # computational graph that requires gradients (automatic differentiation).
    # The results are accumulated in each parameter's .grad attribute.
    loss.backward()

    # Apply the computed gradients to the model's parameters using the chosen
    # optimization algorithm (SGD here). This moves the parameters in the
    # direction that reduces the loss.
    optimizer.step()

    # Print the progress
    if (e + 1) % 10 == 0:
        print(f"Epoch [{e+1}/{args.epochs}], Loss: {loss.item():.4f}")

Epoch [10/100], Loss: 0.9164
Epoch [20/100], Loss: 0.7192
Epoch [30/100], Loss: 0.6488
Epoch [40/100], Loss: 0.6239
Epoch [50/100], Loss: 0.6136
Epoch [60/100], Loss: 0.6082
Epoch [70/100], Loss: 0.6045
Epoch [80/100], Loss: 0.6015
Epoch [90/100], Loss: 0.5986
Epoch [100/100], Loss: 0.5959
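
The training loop above passes all four samples through the model at once, so the batch size equals the dataset size and there is one update per epoch. A mini-batch variant might look like the sketch below. It assumes a batch size of 2 (a value chosen only for illustration), uses `torch.utils.data.DataLoader` to split the data, and substitutes an `nn.Sequential` stand-in for the `LogisticRegression` class so the block runs on its own.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Same toy dataset as in the notebook
X = torch.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]])
Y = torch.tensor([[0.0], [0.0], [1.0], [1.0]])

batch_size = 2  # assumed value for illustration
loader = DataLoader(TensorDataset(X, Y), batch_size=batch_size, shuffle=True)

# Stand-in for the LogisticRegression class: linear layer followed by a sigmoid
model = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

epochs = 100
num_updates = 0
for epoch in range(epochs):
    # Each iteration of the inner loop sees one mini-batch of 2 samples
    for x_batch, y_batch in loader:
        optimizer.zero_grad()
        loss = criterion(model(x_batch), y_batch)
        loss.backward()
        optimizer.step()
        num_updates += 1

print(f"Parameter updates: {num_updates}")  # 2 batches per epoch * 100 epochs = 200
```

With 4 samples and a batch size of 2, the parameters are updated twice per epoch instead of once, which is the "more frequent updates" effect described in the definitions.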


In [21]:
# Test the model
test_input = torch.tensor([[5.0, 6.0]])
predicted = model(test_input)
print(f"Predicted probability: {predicted.item():.4f}")

Predicted probability: 0.7145


The decision threshold that turns a predicted probability into a class label is commonly set to 0.5, but you can adjust it according to your specific needs and the trade-off between precision and recall.

#### Changing the threshold

In [25]:
# `predicted` holds the probability computed for the test input above;
# args.threshold was set to 0.7 when the hyperparameters were defined
(predicted >= args.threshold).float()

tensor([[1.]])

A higher threshold tends to increase precision (reducing false positives), but it may lead to lower recall (missing some true positives). Conversely, a lower threshold increases recall (capturing more true positives) but may reduce precision (increasing false positives).
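
The trade-off can be made concrete with a small worked example. The probabilities and labels below are hypothetical values invented for illustration, not outputs of the model trained above, and `precision_recall` is a helper written for this sketch.

```python
# Hypothetical predicted probabilities and true labels (not from the trained model)
probs = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30]
labels = [1, 1, 0, 1, 0, 0]

def precision_recall(probs, labels, threshold):
    """Return (precision, recall) when predicting 1 for probabilities >= threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

for threshold in (0.5, 0.7):
    precision, recall = precision_recall(probs, labels, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
# threshold=0.5: precision=0.75, recall=1.00
# threshold=0.7: precision=1.00, recall=0.67
```

Raising the threshold from 0.5 to 0.7 removes the one false positive (the sample scored 0.65) but also drops a true positive (the sample scored 0.55), so precision rises while recall falls.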