# Exercises Deep Learning
First Lecture

## Basic Tensor Operations


In [1]:
import torch
import torch.nn as nn
import numpy as np

In [2]:
x = torch.Tensor(2, 3, 4)


Different ways to create tensors:
- ```torch.zeros```: Creates a tensor filled with zeros
- ```torch.ones```: Creates a tensor filled with ones
- ```torch.rand```: Creates a tensor with random values uniformly sampled between 0 and 1
- ```torch.randn```: Creates a tensor with random values sampled from a normal distribution with mean 0 and variance 1
- ```torch.arange```: Creates a 1-D tensor containing the values of a range; for example, ```torch.arange(N, M)``` gives $N, N+1, \ldots, M-1$
- ```torch.Tensor``` (input list): Creates a tensor from the list elements you provide
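
As a quick illustration of the list above, the following sketch calls each creation function once. The shapes, the seed, and the variable names are arbitrary choices made for this example, not part of the lecture material.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only so the random values repeat between runs

zeros = torch.zeros(2, 3)    # 2x3 tensor filled with 0.0
ones = torch.ones(2, 3)      # 2x3 tensor filled with 1.0
uniform = torch.rand(2, 3)   # uniform samples in [0, 1)
normal = torch.randn(2, 3)   # samples from a standard normal distribution
steps = torch.arange(2, 6)   # values 2, 3, 4, 5 (the end value is excluded)
from_list = torch.Tensor([[1, 2], [3, 4]])  # float tensor built from a nested list

print(zeros.shape, ones.shape, uniform.shape, normal.shape)
print(steps)
print(from_list)
```

Note that ```torch.Tensor``` called with integer sizes, as in ```torch.Tensor(2, 3, 4)``` above, allocates a tensor of that shape without initializing its values.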

You can obtain the shape of a tensor in the same way as in numpy (```x.shape```), or using the ```.size``` method:

In [3]:
shape = x.shape
print("Shape:", x.shape)

size = x.size()
print("Size:", size)

dim1, dim2, dim3 = x.size()
print("Size:", dim1, dim2, dim3)

Shape: torch.Size([2, 3, 4])
Size: torch.Size([2, 3, 4])
Size: 2 3 4


Tensor to Numpy, and Numpy to Tensor


In [4]:
np_arr = np.array([[1, 2], [3, 4]])
tensor = torch.from_numpy(np_arr)

print("Numpy array:", np_arr)
print("PyTorch tensor:", tensor)

Numpy array: [[1 2]
 [3 4]]
PyTorch tensor: tensor([[1, 2],
        [3, 4]], dtype=torch.int32)


In [5]:
tensor = torch.arange(4)
np_arr = tensor.numpy()

print("PyTorch tensor:", tensor)
print("Numpy array:", np_arr)

PyTorch tensor: tensor([0, 1, 2, 3])
Numpy array: [0 1 2 3]
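
One detail worth keeping in mind with these conversions: ```torch.from_numpy``` and ```.numpy()``` (for CPU tensors) share memory with the original object, so an in-place change on one side shows up on the other. The sketch below contrasts this with ```torch.tensor```, which copies the data. The variable names ```shared``` and ```copied``` are only illustrative.

```python
import numpy as np
import torch

np_arr = np.array([1, 2, 3])
shared = torch.from_numpy(np_arr)  # shares memory with np_arr
copied = torch.tensor(np_arr)      # independent copy of the data

np_arr[0] = 100                    # in-place change to the NumPy array

print(shared)  # first element now reflects the change
print(copied)  # still holds the original values
```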


Matrix multiplication

In [6]:
x = torch.arange(6)
x = x.view(2, 3)
print("X", x)

X tensor([[0, 1, 2],
        [3, 4, 5]])


In [7]:
W = torch.arange(9).view(3, 3) # We can also stack multiple operations in a single line
print("W", W)

W tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])


In [8]:
h = torch.matmul(x, W) # Verify the result by calculating it by hand too!
print("h", h)

h tensor([[15, 18, 21],
        [42, 54, 66]])
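
To follow the suggestion of verifying the result by hand, here is a minimal pure-Python sketch of the same product. The helper name ```matmul_by_hand``` is made up for this example; each output entry is the sum of products of a row of ```x``` with a column of ```W```.

```python
x = [[0, 1, 2],
     [3, 4, 5]]
W = [[0, 1, 2],
     [3, 4, 5],
     [6, 7, 8]]

def matmul_by_hand(a, b):
    """Multiply two matrices given as nested lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

h = matmul_by_hand(x, W)
print(h)  # [[15, 18, 21], [42, 54, 66]]
```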


### What about GPUs?

When you create a tensor, it is placed on the CPU by default. To move it to another device, call ```.to()``` with "cuda" or "cpu" as needed.
Note that ```.to()``` does not modify the tensor in place; it returns the tensor on the requested device, so assign the result.

#### How do I know if I have CUDA cores on my computer?
You can ask torch whether CUDA is available:

In [9]:
example_tensor = torch.rand(2,2)
if torch.cuda.is_available():
    print("CUDA is available. You can use GPU for PyTorch.")
    example_tensor = example_tensor.to("cuda")  # keep the returned tensor
else:
    print("CUDA is not available. Using CPU for PyTorch.")
    example_tensor = example_tensor.to("cpu")

CUDA is not available. Using CPU for PyTorch.


### Exercises

#### 1. Create two tensors

   - A 3x3 tensor of random numbers.
   - A 3x3 tensor filled with ones.

In [10]:
#Exercise 1

randTensor = torch.rand(3, 3)
oneTensor = torch.ones(3, 3)

print(randTensor)
print(oneTensor)

tensor([[0.5755, 0.9911, 0.1409],
        [0.6355, 0.2189, 0.8860],
        [0.3453, 0.1567, 0.6224]])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])


#### 2. Perform the following operations

- Add the two tensors.
- Multiply the two tensors element-wise.
- Compute the dot product between the first row of both tensors.
- Find the transpose of the resulting tensor from the element-wise multiplication.

In [11]:
#Exercise 2
total = torch.add(randTensor, oneTensor)  # named total to avoid shadowing Python's built-in sum
print(total)

mul = torch.mul(randTensor, oneTensor)
print(mul)

# Dot product between the first row of each tensor
dotP = torch.dot(randTensor[0], oneTensor[0])
print(dotP)

transpose = torch.transpose(mul, 0, 1)
print(transpose)

tensor([[1.5755, 1.9911, 1.1409],
        [1.6355, 1.2189, 1.8860],
        [1.3453, 1.1567, 1.6224]])
tensor([[0.5755, 0.9911, 0.1409],
        [0.6355, 0.2189, 0.8860],
        [0.3453, 0.1567, 0.6224]])
tensor(1.7075)
tensor([[0.5755, 0.6355, 0.3453],
        [0.9911, 0.2189, 0.1567],
        [0.1409, 0.8860, 0.6224]])


#### 3. Convert the resulting tensor to a NumPy array and back to a PyTorch tensor.

In [12]:
#Exercise 3

npArray = transpose.numpy()
print(npArray)

transAgain = torch.from_numpy(npArray)
print(transAgain)

[[0.5754898  0.6354525  0.34534413]
 [0.9911484  0.21891862 0.15665454]
 [0.14086848 0.8860174  0.6224009 ]]
tensor([[0.5755, 0.6355, 0.3453],
        [0.9911, 0.2189, 0.1567],
        [0.1409, 0.8860, 0.6224]])


## Autograd

1. Create Tensors

In [13]:
x_a = torch.tensor(0., requires_grad=True)
x_b = torch.tensor(0., requires_grad=True)
w_a = torch.tensor(0.9, requires_grad=True)
w_b = torch.tensor(0.9, requires_grad=True)

y = torch.tensor(0., requires_grad=False)



2. Build a computation graph

In [14]:
weighted_a = w_a * x_a
weighted_b = w_b * x_b
sum_unit = weighted_a + weighted_b



3. Activation Function

To keep the example simple and easy to replicate by hand, we will use the sigmoid activation function:

In [15]:
y_hat = torch.sigmoid(sum_unit)
y_hat

tensor(0.5000, grad_fn=<SigmoidBackward0>)

4. Calculate Loss

In [16]:
loss = torch.nn.BCELoss()
output = loss(y_hat, y)

5. Calculate gradients

In [17]:
output.backward()

6. Print out the gradients

In [18]:
print(x_a.grad)
print(x_b.grad)
print(w_a.grad)
print(w_b.grad)

tensor(0.4500)
tensor(0.4500)
tensor(0.)
tensor(0.)
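
These gradients can be reproduced by hand. The sketch below repeats the forward pass in plain Python and applies the chain rule, using the known simplification that for a sigmoid followed by binary cross-entropy the derivative of the loss with respect to the weighted sum is ```y_hat - y```. The names ```z```, ```dloss_dz``` and ```grad_*``` are introduced only for this example.

```python
import math

# Values from the cells above
x_a, x_b = 0.0, 0.0
w_a, w_b = 0.9, 0.9
y = 0.0

# Forward pass
z = w_a * x_a + w_b * x_b            # weighted sum
y_hat = 1.0 / (1.0 + math.exp(-z))   # sigmoid
loss = -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))  # binary cross-entropy

# Backward pass: sigmoid + BCE gives dloss/dz = y_hat - y
dloss_dz = y_hat - y

# Chain rule through z = w_a * x_a + w_b * x_b
grad_x_a = dloss_dz * w_a
grad_x_b = dloss_dz * w_b
grad_w_a = dloss_dz * x_a
grad_w_b = dloss_dz * x_b

print(grad_x_a, grad_x_b, grad_w_a, grad_w_b)  # 0.45 0.45 0.0 0.0
```

The weight gradients are zero because both inputs are zero, while the input gradients equal half of each weight because ```y_hat - y``` is 0.5.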


### Training Loop

In [19]:
learning_rate = 0.1
epochs = 100

input_data = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
target_data = torch.tensor([[0], [0], [0], [1]], dtype=torch.float32)

In [20]:
class ANDGateModel(nn.Module):
    def __init__(self):
        super(ANDGateModel, self).__init__()
        self.linear = nn.Linear(2, 1,bias=True)

    def forward(self, x):
        x = self.linear(x)
        x = torch.sigmoid(x)
        return x


In [21]:
# Initialize the model
model = ANDGateModel()

# Loss function (Binary Cross-Entropy Loss)
loss_fn = torch.nn.BCELoss()

# Optimizer (Adam)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Training loop
for epoch in range(epochs):
    y_hat = model(input_data)
    loss = loss_fn(y_hat, target_data)


    loss.backward() # Backpropagation
    optimizer.step() # Update parameters using the optimizer
    optimizer.zero_grad() # Zero the gradients for the next iteration

    # Print loss and progress every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f"Epoch {epoch + 1}/{epochs}, Loss: {loss.item():.4f}")

# Final weights and bias (optional)
print(f"Final weights: {model.linear.weight.data}")
print(f"Final bias: {model.linear.bias.data}")

# Test the AND gate
with torch.no_grad():
    for i in range(len(input_data)):
        x_a, x_b = input_data[i]
        y_hat = model(torch.tensor([[x_a, x_b]]))  # Model expects a batch
        print(f"Input: {input_data[i].numpy()} -> Predicted Output: {round(y_hat.item())}, Raw Output: {y_hat.item():.4f}")


Epoch 100/100, Loss: 0.0972
Final weights: tensor([[3.9964, 3.9329]])
Final bias: tensor([-5.9941])
Input: [0. 0.] -> Predicted Output: 0, Raw Output: 0.0025
Input: [0. 1.] -> Predicted Output: 0, Raw Output: 0.1129
Input: [1. 0.] -> Predicted Output: 0, Raw Output: 0.1194
Input: [1. 1.] -> Predicted Output: 1, Raw Output: 0.8738


!!! IMPORTANT: This example has a significant issue: the test set is the same as the training set.
This approach is used here solely for ease of explanation and should never be used in a production environment.!!!
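
The raw outputs printed for the AND gate can be checked by hand from the final weights and bias. The following sketch, in plain Python, applies the same linear layer and sigmoid to the four inputs. The parameter values are copied from the printed output of this particular run, and the helper ```sigmoid``` is defined here only for the example.

```python
import math

# Final parameters copied from the printed output of the run above
weights = [3.9964, 3.9329]
bias = -5.9941

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [sigmoid(weights[0] * a + weights[1] * b + bias) for a, b in inputs]

for (a, b), out in zip(inputs, outputs):
    print(f"Input: ({a}, {b}) -> raw {out:.4f}, rounded {round(out)}")
```

Only the input (1, 1) makes the weighted sum positive, which is why it is the only case where the sigmoid output exceeds 0.5.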

### Exercises

#### 1. Replicate the OR Gate using a Neural Network
Objective:
- Train a neural network to approximate the function of an OR gate.
- Compare how changing the weights or biases impacts the output of the network.

| Input 1 | Input 2 | Output (OR) |
| -- | -- | -- |
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |

1. Create the dataset
2. Replicate the architecture from the AND gate example
3. Change the loss function from Binary Cross-Entropy to Mean Squared Error

In [22]:
# Code Here

class ORGateModel(nn.Module):
    def __init__(self):
        super(ORGateModel, self).__init__()
        self.linear = nn.Linear(2, 1,bias=True)

    def forward(self, x):
        x = self.linear(x)
        x = torch.sigmoid(x)
        return x


In [23]:
target_data = torch.tensor([[0], [1], [1], [1]], dtype=torch.float32)

In [24]:
# Initialize the model
model = ORGateModel()

# Loss function (Mean Squared Error)
loss_fn = torch.nn.MSELoss()

# Optimizer (Adam)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Training loop
for epoch in range(epochs):
    y_hat = model(input_data)
    loss = loss_fn(y_hat, target_data)


    loss.backward() # Backpropagation
    optimizer.step() # Update parameters using the optimizer
    optimizer.zero_grad() # Zero the gradients for the next iteration

    # Print loss and progress every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f"Epoch {epoch + 1}/{epochs}, Loss: {loss.item():.4f}")

# Final weights and bias (optional)
print(f"Final weights: {model.linear.weight.data}")
print(f"Final bias: {model.linear.bias.data}")

# Test the OR gate
with torch.no_grad():
    for i in range(len(input_data)):
        x_a, x_b = input_data[i]
        y_hat = model(torch.tensor([[x_a, x_b]]))  # Model expects a batch
        print(f"Input: {input_data[i].numpy()} -> Predicted Output: {round(y_hat.item())}, Raw Output: {y_hat.item():.4f}")


Epoch 100/100, Loss: 0.0085
Final weights: tensor([[4.2707, 4.1543]])
Final bias: tensor([-1.8241])
Input: [0. 0.] -> Predicted Output: 0, Raw Output: 0.1389
Input: [0. 1.] -> Predicted Output: 1, Raw Output: 0.9113
Input: [1. 0.] -> Predicted Output: 1, Raw Output: 0.9203
Input: [1. 1.] -> Predicted Output: 1, Raw Output: 0.9986


In [25]:
# https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html

#### 2. Build and train a network
1. Build a simple fully connected neural network with the following architecture:
    - Input layer with 2 units
    - Hidden layer with 4 units and ReLU activation
    - Output layer with 1 unit
2. Define the following loss function and optimizer:
    - Loss: Mean Squared Error (MSE)
    - Optimizer: Stochastic Gradient Descent (SGD)

The network should mimic $y = 2x_1 + 3x_2$, where $x_1$ and $x_2$ are random inputs
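
Before training, it can help to see that this architecture is able to represent the target function. The sketch below runs the forward pass (linear, ReLU, linear) in plain Python with hand-picked parameters; these values are hypothetical and are not what training would necessarily find. Two hidden units pass the inputs through unchanged, since ReLU leaves non-negative values as they are, and the output layer applies the coefficients 2 and 3.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def linear(weights, bias, v):
    """Compute weights @ v + bias for nested-list weights."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]

# Hand-picked (hypothetical) parameters, chosen only for illustration
W1 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]  # 4 hidden units, 2 inputs
b1 = [0.0, 0.0, 0.0, 0.0]
W2 = [[2.0, 3.0, 0.0, 0.0]]                            # 1 output, 4 hidden units
b2 = [0.0]

def forward(x):
    hidden = relu(linear(W1, b1, x))
    return linear(W2, b2, hidden)[0]

sample = [0.25, 0.5]
print(forward(sample))  # 2 * 0.25 + 3 * 0.5 = 2.0
```

This only works as written for non-negative inputs, which holds for values in the range 0 to 1.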

In [34]:
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split

# Define the neural network
class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        # Define layers here
        self.layer1 =   nn.Linear(2, 4)
        self.activation_function = nn.ReLU()
        self.layer2 = nn.Linear(4, 1)

    def forward(self, x):
        # Define forward pass
        x = self.layer1(x)
        x = self.activation_function(x)
        x = self.layer2(x)
        return x

# Create synthetic data
x = torch.rand(10000, 2)
y = 2 * x[:, 0] + 3 * x[:, 1]
y = y.view(-1, 1)

# Split data into training and test sets (70% train, 30% test)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

# Initialize the model, loss function, and optimizer
model = SimpleNet()
criterion =  torch.nn.MSELoss() # Loss function (MSE)
optimizer =  torch.optim.SGD(model.parameters(), lr=0.01)  # Optimizer (SGD)

# Training loop
for epoch in range(3000):
    model.train()

    # Forward pass
    y_pred = model(x_train)

    # Compute loss
    loss = criterion(y_pred, y_train)  # Compute loss using criterion

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch+1) % 10 == 0:
        print(f'Epoch {epoch+1}, Loss: {loss.item()}')

model.eval()  # Set the model to evaluation mode
with torch.no_grad():
    y_test_pred = model(x_test)  # Get predictions for the test set
    test_loss = criterion(y_test_pred, y_test)  # Compute test loss

    print(f'Test Loss: {test_loss.item()}')

# Show some final predictions
print("Final Predictions (first 5 test samples):")
for i in range(5):
    print(f"Predicted: {y_test_pred[i].item():.4f}, Actual: {y_test[i].item():.4f}")



Epoch 10, Loss: 5.473449230194092
Epoch 20, Loss: 2.632601261138916
Epoch 30, Loss: 1.1225172281265259
Epoch 40, Loss: 0.6394160389900208
Epoch 50, Loss: 0.534110963344574
Epoch 60, Loss: 0.5026615262031555
Epoch 70, Loss: 0.48144975304603577
Epoch 80, Loss: 0.4615555703639984
Epoch 90, Loss: 0.44194427132606506
Epoch 100, Loss: 0.42252400517463684
Epoch 110, Loss: 0.4033094048500061
Epoch 120, Loss: 0.3843275308609009
Epoch 130, Loss: 0.36560824513435364
Epoch 140, Loss: 0.34718233346939087
Epoch 150, Loss: 0.32908087968826294
Epoch 160, Loss: 0.3113352358341217
Epoch 170, Loss: 0.2939767837524414
Epoch 180, Loss: 0.27703624963760376
Epoch 190, Loss: 0.2605436146259308
Epoch 200, Loss: 0.2445271760225296
Epoch 210, Loss: 0.22901363670825958
Epoch 220, Loss: 0.2140275239944458
Epoch 230, Loss: 0.1995907723903656
Epoch 240, Loss: 0.18572242558002472
Epoch 250, Loss: 0.17243848741054535
Epoch 260, Loss: 0.15975160896778107
Epoch 270, Loss: 0.1476708948612213
Epoch 280, Loss: 0.1362019330