In [1]:
from minitorch import *
from minitorch.nn import *
from minitorch.optim import SGD

# Use case: Creating a Multilayer Perceptron (MLP) that learns the AND gate

First, we define our model. It consists of two linear layers. The first maps the 2 inputs to 3 hidden units and passes its output through a ReLU function, which sets negative values to zero. The second maps the 3 hidden units to a single output and passes it through a Sigmoid function, which squashes it into a value between 0 and 1 that we interpret as a probability.

In [2]:
class MLP(Module):
    def __init__(self):
        self.layers = Sequential(
            Linear(2,3, seed=1241241),
            ReLU(),
            Linear(3,1, seed=1241241),
            Sigmoid()
        )
    
    def forward(self, input : Tensor):
        return self.layers(input)
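
To make the data flow through this architecture concrete, here is a minimal pure-Python sketch of the same 2-3-1 forward pass (linear, ReLU, linear, Sigmoid). It does not use Minitorch, and the weights `W1`, `b1`, `W2`, `b2` are hypothetical hand-picked values chosen for illustration, not the ones produced by the seeds above.

```python
import math

def relu(v):
    # Replace every negative entry with zero.
    return [max(0.0, x) for x in v]

def sigmoid(x):
    # Squash a real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def linear(weights, bias, v):
    # weights holds one row per output unit; each row is dotted with the input.
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]

# Hypothetical parameters (assumption: hand-picked, not learned).
W1 = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]  # 2 inputs -> 3 hidden units
b1 = [-1.0, 0.0, 0.0]
W2 = [[4.0, -2.0, -2.0]]                      # 3 hidden units -> 1 output
b2 = [-2.0]

def forward(v):
    hidden = relu(linear(W1, b1, v))
    logit = linear(W2, b2, hidden)[0]
    return sigmoid(logit)

for pair in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pair, forward(pair))
```

With these particular weights, only the input `[1, 1]` produces an output above 0.5, which is the behaviour we want the trained model to learn.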

We instantiate the model.

In [3]:
model = MLP()

Now, we define a loss function. Since we are doing a binary classification task, we'll be using the **Binary Cross Entropy** loss function.

In [4]:
loss = BinaryCrossEntropyLoss()
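
For reference, the binary cross entropy of a predicted probability `p` against a label `y` is `-(y*log(p) + (1-y)*log(1-p))`. The sketch below computes it in plain Python. The function name and the `eps` clamp are assumptions made for this illustration; Minitorch's `BinaryCrossEntropyLoss` may handle numerical edge cases differently.

```python
import math

def binary_cross_entropy(p, y, eps=1e-12):
    # Clamp p away from 0 and 1 so that log() never receives zero
    # (assumption: eps is an illustrative choice, not a Minitorch setting).
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

# A confident correct prediction is penalized far less than a confident wrong one.
print(binary_cross_entropy(0.9, 1))
print(binary_cross_entropy(0.1, 1))
```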

Next, we define an optimization algorithm. As of version *v.1.0.0* of ***Minitorch***, only single-element batches are supported, so we update the weights after every training example using **Stochastic Gradient Descent**.

In [5]:
optim_sgd = SGD(model.parameters(), lr = 0.5)
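
Conceptually, each SGD step moves every parameter a small distance against its gradient, scaled by the learning rate. A minimal sketch of that update rule on plain Python lists follows. The helper `sgd_step` is hypothetical and only illustrates the arithmetic; it is not how Minitorch's `SGD` class is implemented.

```python
def sgd_step(params, grads, lr=0.5):
    # New value = old value - learning rate * gradient, element by element.
    return [p - lr * g for p, g in zip(params, grads)]

params = [1.0, -2.0]
grads = [0.5, -1.0]
print(sgd_step(params, grads, lr=0.5))
```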

Finally, we create a training loop. In each epoch, we iterate over all the training examples. For each example, we reset the gradients, compute the model's output, evaluate the loss, and backpropagate to obtain its gradient. We then use Stochastic Gradient Descent to update the weights.

In [6]:
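data  = Tensor([[0,0], [0,1], [1,0], [1,1]])
label = Tensor([0,0,0,1]) # AND function

n_training_ex = len(data)

epoch = 1000 # number of passes over the training set

for i in range(epoch):
    for idx in range(n_training_ex):

        # Select a single example and its label (batch size of 1)
        d = data[idx]
        l = label[idx]

        # Reset the gradients left over from the previous step
        optim_sgd.zero_grad()

        # Forward pass
        output = model.forward(d)

        # Evaluate the loss for this example
        loss(output, l)

        # Backward pass: compute the gradients of the loss
        loss.backward()

        # Update the weights
        optim_sgd.step()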
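
To show what one pass of this loop does numerically, here is a simplified pure-Python sketch. Note that it swaps the MLP for a single sigmoid neuron (logistic regression), which is enough for AND because the gate is linearly separable, and it computes the gradient by hand instead of calling `backward()`. It relies on the standard result that, for a sigmoid output with binary cross entropy, the gradient of the loss with respect to the pre-activation is `p - y`. The learning rate and epoch count mirror the values used above; everything else is an assumption of this sketch.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [0.0, 0.0, 0.0, 1.0]  # AND function

# Parameters of a single neuron, initialised to zero (assumption).
w = [0.0, 0.0]
b = 0.0
lr = 0.5

for epoch in range(1000):
    for (x1, x2), y in zip(data, labels):
        # Forward pass for one example.
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        # Gradient of BCE with respect to the pre-activation.
        grad = p - y
        # SGD update applied immediately after each example.
        w[0] -= lr * grad * x1
        w[1] -= lr * grad * x2
        b -= lr * grad

predictions = [sigmoid(w[0] * x1 + w[1] * x2 + b) for x1, x2 in data]
print(predictions)
```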

Lastly, we print the model's output for each of the four inputs.

In [8]:
print("Model output for (0,0): ", model.forward(data[0]).item().data)
print("Model output for (0,1): ", model.forward(data[1]).item().data)
print("Model output for (1,0): ", model.forward(data[2]).item().data)
print("Model output for (1,1): ", model.forward(data[3]).item().data)

Model output for (0,0):  1.8408679308744555e-08
Model output for (0,1):  0.00018656888445709013
Model output for (1,0):  0.00017551873734118786
Model output for (1,1):  0.9975393850659847
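
These outputs are probabilities, so a common way to read them as class labels is to threshold at 0.5. The short check below applies that threshold to the values printed above and compares the result with the AND truth table. The 0.5 cut-off is a conventional choice assumed here, not something Minitorch prescribes.

```python
# Model outputs copied from the cell output above.
outputs = {
    (0, 0): 1.8408679308744555e-08,
    (0, 1): 0.00018656888445709013,
    (1, 0): 0.00017551873734118786,
    (1, 1): 0.9975393850659847,
}

# Threshold each probability at 0.5 to obtain a 0/1 prediction.
predicted = {x: int(p >= 0.5) for x, p in outputs.items()}

# Ground truth: bitwise AND of the two inputs.
expected = {x: x[0] & x[1] for x in outputs}

print(predicted == expected)
```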
