# Problem: Write a Custom Activation Function with Autograd

### Problem Statement
Implement a **custom activation function**, **Learned-SiLU**, using `torch.autograd.Function`. The activation function should be based on the SiLU formula \( x \cdot \text{sigmoid}(x) \) but include a **learnable slope parameter**. Use this custom activation function in a simple linear regression model.

### Requirements
1. **Define the Custom Activation Function**:
   - Implement a custom activation function, **Learned-SiLU**, where the output is calculated as:
     $$
     \text{Learned-SiLU}(x) = \text{slope} \cdot x \cdot \text{sigmoid}(x)
     $$
   - The **slope** should be a learnable parameter.

2. **Autograd Implementation**:
   - Use `torch.autograd.Function` to define the forward and backward passes for the custom activation function.

3. **Integrate the Activation Function**:
   - Incorporate the custom activation function into a simple linear regression model.
   - Train the model to verify the functionality of the activation function.
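
The hand-written backward pass in requirement 2 needs the partial derivatives of the formula in requirement 1 with respect to both inputs. As a worked derivation, using the standard identity \( \text{sigmoid}'(x) = \text{sigmoid}(x) \cdot (1 - \text{sigmoid}(x)) \) and the product rule:

$$
\frac{\partial\, \text{Learned-SiLU}(x)}{\partial x} = \text{slope} \cdot \text{sigmoid}(x) \cdot \bigl(1 + x \cdot (1 - \text{sigmoid}(x))\bigr)
$$

$$
\frac{\partial\, \text{Learned-SiLU}(x)}{\partial\, \text{slope}} = x \cdot \text{sigmoid}(x)
$$

Assuming the slope is a single scalar shared by every element of the input, its gradient is the sum of the second expression over all elements, each weighted by the incoming gradient.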

### Constraints
- Ensure the **slope parameter** is registered as an `nn.Parameter`, initialized to a sensible value (for example 1.0), and actually updated by the optimizer during training.
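
One way to confirm this constraint is to take a single optimizer step and check that the slope received a gradient and changed value. The snippet below is a minimal sketch under stated assumptions: it uses a standalone `slope` parameter and plain tensor operations rather than the custom function, and the toy data (generated with a slope of 2.0) is purely hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the exercise: a slope registered as a parameter.
slope = nn.Parameter(torch.ones(1))           # initialized to 1.0
optimizer = torch.optim.SGD([slope], lr=0.1)

# Toy data generated with a "true" slope of 2.0 (an assumption for this check).
x = torch.randn(8, 1)
target = 2.0 * x * torch.sigmoid(x)

initial = slope.detach().clone()

# One training step on the mean squared error.
loss = ((slope * x * torch.sigmoid(x) - target) ** 2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

# The slope should have a gradient and should have moved away from 1.0.
slope_has_grad = slope.grad is not None
slope_moved = not torch.equal(slope.detach(), initial)
print(slope_has_grad, slope_moved)
```

The same two checks can be applied to `model.slope` after the first iteration of the training loop.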

<details>
  <summary>💡 Hint</summary>
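  The PyTorch tutorial on defining a custom autograd function shows the `forward`/`backward` pattern and how to use `ctx.save_for_backward`: https://pytorch.org/tutorials/beginner/examples_autograd/two_layer_net_custom_function.html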
</details>
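
For orientation, here is one possible sketch of the pattern the hint points to, applied to the Learned-SiLU formula. It is not necessarily the notebook's intended solution: the class name `LearnedSiLUSketch` is hypothetical, and the slope is assumed to be a one-element tensor shared across all inputs. The final lines use `torch.autograd.gradcheck` to compare the hand-written backward pass against numerical gradients.

```python
import torch


class LearnedSiLUSketch(torch.autograd.Function):
    """Illustrative only: slope * x * sigmoid(x) with a hand-written backward."""

    @staticmethod
    def forward(ctx, x, slope):
        # Save the tensors needed to compute gradients later.
        ctx.save_for_backward(x, slope)
        return slope * x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        x, slope = ctx.saved_tensors
        sig = torch.sigmoid(x)
        # d/dx [slope * x * sig] = slope * sig * (1 + x * (1 - sig))
        grad_x = grad_output * slope * sig * (1 + x * (1 - sig))
        # The slope is shared by every element, so its gradient is summed
        # and reshaped to match the slope's own shape.
        grad_slope = (grad_output * x * sig).sum().reshape(slope.shape)
        return grad_x, grad_slope


# Numerical check in double precision, which gradcheck expects.
x = torch.randn(5, 1, dtype=torch.double, requires_grad=True)
slope = torch.ones(1, dtype=torch.double, requires_grad=True)
gradcheck_ok = torch.autograd.gradcheck(LearnedSiLUSketch.apply, (x, slope))
```

Inside a model's `forward`, a function like this would be invoked as `LearnedSiLUSketch.apply(z, self.slope)`, where `z` is the tensor to activate. Custom functions are called through `.apply`, never by instantiating the class.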


<details>
  <summary>💡 Alternate Implementation?</summary>
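  The same activation can be written as an `nn.Module` built from ordinary tensor operations. Autograd then derives the backward pass automatically, so no hand-written `backward` is needed.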
</details>
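
A minimal sketch of that alternative, assuming a hypothetical module named `LearnedSiLUModule` with a one-element slope parameter. It relies on `torch.nn.functional.silu`, which computes \( x \cdot \text{sigmoid}(x) \), and lets autograd handle differentiation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnedSiLUModule(nn.Module):
    """Hypothetical module form: autograd derives the backward pass."""

    def __init__(self, slope=1.0):
        super().__init__()
        # Registering the slope as a Parameter exposes it to the optimizer.
        self.slope = nn.Parameter(torch.ones(1) * slope)

    def forward(self, x):
        # F.silu(x) computes x * sigmoid(x)
        return self.slope * F.silu(x)


act = LearnedSiLUModule(slope=1.0)
x = torch.linspace(-3, 3, 7).unsqueeze(1)
out = act(x)
out.sum().backward()  # populates act.slope.grad without a custom backward
```

The trade-off is control versus convenience: the module form is shorter and harder to get wrong, while `torch.autograd.Function` is the tool to reach for when the backward pass must be customized.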

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim

In [2]:
# Generate synthetic data
torch.manual_seed(42)
X = torch.rand(100, 1) * 10  # 100 data points between 0 and 10
y = 2 * X + 3 + torch.randn(100, 1)  # Linear relationship with noise

In [3]:
class LearnedSiLUFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, slope):
        # Save the input tensor and slope for backward computation
        ...

    @staticmethod
    def backward(ctx, grad_output):
        # Retrieve the input and slope saved in the forward pass
        ...


# Define the Linear Regression Model
class LinearRegressionModel(nn.Module):
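    def __init__(self, slope=1):
        super().__init__()
        # Single-feature linear layer; its weight and bias are read back after training
        self.linear = nn.Linear(1, 1)
        # Learnable slope for the Learned-SiLU activation
        self.slope = nn.Parameter(torch.ones(1) * slope)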

    def forward(self, x):
        # Use the custom LearnedSiLUFunction
        ...

# Initialize the model, loss function, and optimizer
model = LinearRegressionModel()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Training loop
epochs = 1000
for epoch in range(epochs):
    # Forward pass
    predictions = model(X)
    loss = criterion(predictions, y)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Log progress every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f"Epoch [{epoch + 1}/{epochs}], Loss: {loss.item():.4f}")


Epoch [100/1000], Loss: 1.0552
Epoch [200/1000], Loss: 0.8031
Epoch [300/1000], Loss: 0.7150
Epoch [400/1000], Loss: 0.6826
Epoch [500/1000], Loss: 0.6705
Epoch [600/1000], Loss: 0.6659
Epoch [700/1000], Loss: 0.6642
Epoch [800/1000], Loss: 0.6635
Epoch [900/1000], Loss: 0.6632
Epoch [1000/1000], Loss: 0.6632


In [4]:
# Display the learned parameters
[w, b] = model.linear.parameters()
print(f"Learned weight: {w.item():.4f}, Learned bias: {b.item():.4f}")

# Testing on new data
X_test = torch.tensor([[4.0], [7.0]])
with torch.no_grad():
    predictions = model(X_test)
    print(f"Predictions for {X_test.tolist()}: {predictions.tolist()}")

Learned weight: 1.9557, Learned bias: 2.2181
Predictions for [[4.0], [7.0]]: [[11.04088020324707], [16.907970428466797]]
