optim is a submodule within the PyTorch library. It contains various optimization algorithms that are commonly used for training machine learning models, especially neural networks. These optimization algorithms are used to adjust the model's parameters (e.g., weights and biases) during training to minimize the loss function and improve the model's performance.

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np

In [2]:
# Generate some random data
np.random.seed(0)
X = np.random.randn(1000, 5)  # 1000 samples, 5 features
X[:, 0]+=1
X[:, 1]+=2
X[:, 2]+=3
y = 2 * X[:, 0] + 7 * X[:, 1] + 10 * X[:, 2] + np.random.randn(1000)
print(type(X))

<class 'numpy.ndarray'>


In [3]:
print(np.mean(X[:, 2]))

2.979512314250047


In [4]:
# Convert data to PyTorch tensors
X_tensor = torch.tensor(X, dtype=torch.float32)
y_tensor = torch.tensor(y, dtype=torch.float32)

In the constructor (__init__ method) of the LassoRegression class, you pass two parameters:

input_size: This parameter represents the number of features (input dimensions) for your linear regression model.
l1_strength: This parameter represents the strength of L1 regularization, which controls how much the L1 penalty affects the model during training.
Inside the constructor, super(LassoRegression, self).__init__() is used to call the constructor of the parent class (nn.Module). It's important to call the parent class constructor to properly initialize the LassoRegression class as a PyTorch module.

In Python, when you define a subclass and you want to call the constructor of the parent (base) class, you typically use the super() function. The purpose of calling the parent class constructor with super() is to initialize the inherited attributes and behaviors defined in the parent class before adding any additional attributes specific to the subclass.

The syntax you provided, super(LassoRegression, self).__init__(), is a way to call the constructor of the parent class (nn.Module) from within the LassoRegression class constructor. Let me explain the components:

The super() function returns a temporary object of the superclass, which allows you to call its methods. In this case, it returns an object that represents the nn.Module superclass.

This is the current class in which you are defining the constructor (i.e., the subclass). super(LassoRegression, self) specifies that you want to call the constructor of the parent class (nn.Module) in the context of the LassoRegression class.

self refers to the instance of the class itself, in this case, an instance of LassoRegression. When you call super(LassoRegression, self).__init__(), you are invoking the constructor of the parent class (nn.Module) and passing the current instance self as the first argument to that constructor.

The purpose of passing self as the first argument to the parent class constructor is to ensure that the initialization of the parent class is done in the context of the current instance of LassoRegression. This is important because the parent class constructor may need to set up instance-specific attributes and behaviors.

The line super(LassoRegression, self).__init__() is a standard practice in Python when creating subclasses. It ensures that the constructor of the parent class is called properly to initialize inherited attributes and behaviors while allowing you to customize and extend the subclass as needed.

In [5]:
# Define a simple linear regression model with L1 regularization (Lasso)
class LassoRegression(nn.Module):
    def __init__(self, input_size, l1_strength):
        super(LassoRegression, self).__init__()
        self.linear = nn.Linear(input_size, 1, bias=True)
        self.l1_strength = l1_strength

    def forward(self, x):
        return self.linear(x)

In [6]:
# Create the model with L1 regularization
model = LassoRegression(input_size=5, l1_strength=1)

In [7]:
# Define the loss function (mean squared error) with L1 regularization
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

In [8]:
# Training loop
num_epochs = 1000
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(X_tensor)
    loss = criterion(outputs, y_tensor)
    
    # L1 regularization term
    l1_reg = torch.abs(model.linear.weight).sum() * model.l1_strength
    
    # Add L1 regularization to the loss
    total_loss = loss + l1_reg
    
    # Backward pass and optimization
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()

  return F.mse_loss(input, target, reduction=self.reduction)


In [9]:
# Extract the learned coefficients (weights)
learned_coefficients = model.linear.weight.detach().numpy()
print("Learned Coefficients (with L1 regularization):\n", learned_coefficients)

Learned Coefficients (with L1 regularization):
 [[ 4.6241956e+00  5.8891597e+00  5.8989525e+00 -3.9402531e-03
   1.3515234e-03]]


In [10]:
# Define a simple linear regression model with Ridge regularization
class RidgeRegression(nn.Module):
    def __init__(self, input_size, l2_strength):
        super(RidgeRegression, self).__init__()
        self.linear = nn.Linear(input_size, 1,bias = True)
        self.l2_strength = l2_strength

    def forward(self, x):
        return self.linear(x)

In [11]:
# Create the model with Ridge regularization
model = RidgeRegression(input_size=5, l2_strength=1)

In [12]:
# Define the loss function (mean squared error) with Ridge regularization
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

In [13]:
# Training loop
num_epochs = 1000
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(X_tensor)
    loss = criterion(outputs, y_tensor)
    
    # L2 regularization term
    l2_reg = (model.linear.weight ** 2).sum() * model.l2_strength
    
    # Add L2 regularization to the loss
    total_loss = loss + l2_reg
    
    # Backward pass and optimization
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()

In [14]:
# Extract the learned coefficients (weights)
learned_coefficients = model.linear.weight.detach().numpy()
print("Learned Coefficients (with Ridge regularization):\n", learned_coefficients)

Learned Coefficients (with Ridge regularization):
 [[ 4.4364114   5.8936653   5.3988247  -0.17205386  0.06646103]]


In [15]:
# Define a simple linear regression model with Elastic Net regularization
class ElasticNetRegression(nn.Module):
    def __init__(self, input_size, l1_strength, l2_strength):
        super(ElasticNetRegression, self).__init__()
        self.linear = nn.Linear(input_size, 1, bias = True)
        self.l1_strength = l1_strength
        self.l2_strength = l2_strength

    def forward(self, x):
        return self.linear(x)

In [16]:
# Create the model with Elastic Net regularization
model = ElasticNetRegression(input_size=5, l1_strength=1.0, l2_strength=1.0)

In [17]:
# Define the loss function (mean squared error) with Elastic Net regularization
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

In [18]:
# Training loop
num_epochs = 1000
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(X_tensor)
    loss = criterion(outputs, y_tensor)
    
    # L1 regularization term
    l1_reg = torch.abs(model.linear.weight).sum() * model.l1_strength
    
    # L2 regularization term
    l2_reg = (model.linear.weight ** 2).sum() * model.l2_strength
    
    # Add Elastic Net regularization to the loss
    total_loss = loss + l1_reg + l2_reg
    
    # Backward pass and optimization
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()

In [19]:
# Extract the learned coefficients (weights)
learned_coefficients = model.linear.weight.detach().numpy()
print("Learned Coefficients (with Elastic Net regularization):\n", learned_coefficients)

Learned Coefficients (with Elastic Net regularization):
 [[ 3.9642704e+00  5.5403581e+00  5.7475700e+00 -1.7321273e-03
   3.5503949e-04]]


In [20]:
# Define a simple linear regression model
class LinearRegression(nn.Module):
    def __init__(self, input_size):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(input_size, 1, bias=True)

    def forward(self, x):
        return self.linear(x)

In [21]:
# Create the model
model = LinearRegression(input_size=5)

In [22]:
# Define the loss function (mean squared error)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

In [23]:
# Training loop
num_epochs = 10000
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(X_tensor)
    loss = criterion(outputs, y_tensor)
    
    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

In [24]:
# Extract the learned coefficients (weights)
learned_coefficients = model.linear.weight.detach().numpy()
print("Learned Coefficients (Linear Regression without regularization):\n", learned_coefficients)

Learned Coefficients (Linear Regression without regularization):
 [[ 0.0780985   0.23204686  0.2930379  -0.00574475  0.00487926]]
