# FDU PRML 2024 Fall Assignment 1

Name: `徐一冉`

Student ID: `22307130487`


Please follow the instructions and complete the following exercises using PyTorch.

## 1. Basic Operations of Tensors

In [29]:
import torch

my_first_tensor = torch.ones(3, 4)  # TODO: assign a tensor of shape (3, 4) with all elements equal to 1.0

my_second_tensor = torch.randn(3, 4)  # TODO: assign a random tensor of shape (3, 4) with all elements sampled from a standard normal distribution

their_matrix_product = my_first_tensor @ my_second_tensor.T  # TODO: compute the matrix product of my_first_tensor and the transpose of my_second_tensor (There are multiple ways to do this. Just pick one you like.)

some_meaningless_concatenation = torch.cat([my_first_tensor, my_second_tensor], dim=0)  # TODO: concatenate my_first_tensor and my_second_tensor along the first dimension. (Maybe you should check the documentation of torch.cat)

some_meaningless_stack = torch.stack([my_first_tensor] * 5, dim=0)  # TODO: stack 5 copies of my_first_tensor along a newly created dimension. (Maybe you should check the documentation of torch.stack)

# What is the shape of some_meaningless_stack? Can you imagine the geometric interpretation of stacking 5 matrices of shape (3, 4) along the first dimension?
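
One way to answer the shape question is to print the shapes directly. The sketch below uses the short stand-in names `a` and `b` (not names from the assignment) for the two (3, 4) tensors. Concatenation grows an existing dimension, while stacking creates a new one, so five (3, 4) matrices stacked along dimension 0 can be pictured as five sheets piled into a (5, 3, 4) block.

```python
import torch

a = torch.ones(3, 4)
b = torch.randn(3, 4)

product = a @ b.T                        # (3, 4) @ (4, 3) -> (3, 3)
concatenated = torch.cat([a, b], dim=0)  # grows existing dim 0 -> (6, 4)
stacked = torch.stack([a] * 5, dim=0)    # adds a new leading dim -> (5, 3, 4)

print(product.shape, concatenated.shape, stacked.shape)
```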


## 2. A simple logistic regression

There are four core components in the PyTorch training process: **model**, **loss function**, **optimizer** and **data loader**. In this part, we implement a simple logistic regression model to illustrate them.

### 2.1 Model and Loss Function

In [30]:
# Define a linear layer for logistic regression

class Linear(torch.nn.Module):
	def __init__(self, input_dim, output_dim):
		super().__init__()
		# pass
		# TODO: initialize the weight and bias of the linear layer.
		self.weight = torch.nn.Parameter(torch.randn(input_dim, output_dim))
		self.bias = torch.nn.Parameter(torch.zeros(output_dim))
  
	def forward(self, x):
		# pass
		# TODO: implement the forward function of a linear layer.
		return x @ self.weight + self.bias


def loss_function(y_pred, y):
    # TODO: implement the loss function of logistic regression.
    # Squeeze only the last dimension so that a batch of size 1 keeps shape (1,).
    return torch.nn.functional.binary_cross_entropy_with_logits(y_pred.squeeze(-1), y)
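
As a sanity check on the loss, the sketch below compares `binary_cross_entropy_with_logits` with a hand-written version of the mean binary cross-entropy, $-\frac{1}{N}\sum_i \left[y_i \log \sigma(z_i) + (1 - y_i)\log(1 - \sigma(z_i))\right]$, where $z_i$ is the raw model output. The tensors `logits` and `targets` are made-up illustrative values, not data from the assignment.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(8)                      # raw model outputs, one per sample
targets = torch.randint(0, 2, (8,)).float()  # binary labels stored as floats

builtin = F.binary_cross_entropy_with_logits(logits, targets)

# Manual version: apply the sigmoid, then average the negative log-likelihood.
probs = torch.sigmoid(logits)
manual = -(targets * torch.log(probs) + (1 - targets) * torch.log(1 - probs)).mean()

print(builtin.item(), manual.item())
```

The built-in function fuses the sigmoid and the logarithm, which is numerically safer for large-magnitude logits than the manual form.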

### Synthetic Data

In the real world, we usually have to deal with large-scale datasets. In this assignment, however, we use synthetic data to illustrate the training process. The synthetic data is generated by the following code:

In [31]:
# Generate some random data for binary classification

num_samples = 100
num_features = 2

x_0 = torch.randn(num_samples, num_features) + torch.tensor([2.0, 2.0])
y_0 = torch.zeros(num_samples)

x_1 = torch.randn(num_samples, num_features) + torch.tensor([-2.0, -2.0])
y_1 = torch.ones(num_samples)

x = torch.cat([x_0, x_1], dim=0)
y = torch.cat([y_0, y_1], dim=0)
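
A quick check of what the generated data looks like can be sketched as follows. The fixed seed is an addition for reproducibility and is not part of the original cell. The two classes are Gaussian clouds centred near (2, 2) and (-2, -2), so a single linear boundary should separate them well.

```python
import torch

torch.manual_seed(0)  # assumption: seed added only to make the check repeatable
num_samples, num_features = 100, 2

x_0 = torch.randn(num_samples, num_features) + torch.tensor([2.0, 2.0])
x_1 = torch.randn(num_samples, num_features) + torch.tensor([-2.0, -2.0])
x = torch.cat([x_0, x_1], dim=0)
y = torch.cat([torch.zeros(num_samples), torch.ones(num_samples)], dim=0)

print(x.shape, y.shape)           # features and labels for both classes
print(x[y == 0].mean(dim=0))      # empirical centre of class 0
print(x[y == 1].mean(dim=0))      # empirical centre of class 1
```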

### 2.2 Dataloader

In [32]:
# Define a dataset to feed into the model

class MyDataset(torch.utils.data.Dataset):
	def __init__(self, x, y):
		super().__init__()
		self.x = x
		self.y = y

	def __getitem__(self, index):
		# TODO: implement the __getitem__ function.
		# pass
		return self.x[index], self.y[index]

	def __len__(self):
		# TODO: implement the __len__ function.
		# pass
		return len(self.x) 

dataset = MyDataset(x, y)
dataloder = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)
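
To see what the loader yields, the sketch below rebuilds a dataset of the same form under the hypothetical name `PairDataset` and inspects the batches. Shuffling is turned off here only so that the batch order is predictable. `DataLoader` relies on `__len__` to know how many samples exist and on `__getitem__` to fetch each one, then collates the samples into batched tensors.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class PairDataset(Dataset):  # hypothetical name, same protocol as MyDataset
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __getitem__(self, index):
        return self.x[index], self.y[index]

    def __len__(self):
        return len(self.x)


x = torch.randn(200, 2)
y = torch.zeros(200)
loader = DataLoader(PairDataset(x, y), batch_size=16, shuffle=False)

first_x, first_y = next(iter(loader))
batch_sizes = [len(batch_x) for batch_x, _ in loader]

print(first_x.shape, first_y.shape)  # one batch of features and labels
print(batch_sizes)                   # the final batch holds the remainder
```

With 200 samples and a batch size of 16, the last batch is smaller than the rest, which is why code that reshapes predictions should not assume a fixed batch size.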

### 2.3 Optimizer

In [33]:
my_model = Linear(num_features, 1)

optimizer =  torch.optim.Adam(my_model.parameters(), lr=0.01)  # TODO: initialize an optimizer of your choice.

### Putting it all together

Since this is just a toy experiment, we do not need validation.

In the following code, we expect the training loss to decrease to 0.001 or lower.

In [34]:
# Train the model

for epoch in range(100):
	for batch_x, batch_y in dataloder:
		# TODO: implement the training loop.
		# pass
		optimizer.zero_grad()
		pred = my_model(batch_x)
		loss = loss_function(pred, batch_y)
		loss.backward()
		optimizer.step()
  
	if epoch % 10 == 0:
		print('epoch: {}, loss: {}'.format(epoch, loss.item()))
  
# save the model

torch.save(my_model.state_dict(), 'my_model.pt')

epoch: 0, loss: 1.2072618007659912
epoch: 10, loss: 0.1462070494890213
epoch: 20, loss: 0.029419932514429092
epoch: 30, loss: 0.09361959248781204
epoch: 40, loss: 0.016517918556928635
epoch: 50, loss: 0.05103921517729759
epoch: 60, loss: 0.018556121736764908
epoch: 70, loss: 0.017881684005260468
epoch: 80, loss: 0.011547241359949112
epoch: 90, loss: 0.02400272898375988
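The values printed above are the loss of the last mini-batch in each logged epoch, which helps explain why they fluctuate rather than fall steadily. A minimal sketch of logging the mean loss over the whole epoch is given below. It assumes `torch.nn.Linear` and `BCEWithLogitsLoss` as stand-ins for the custom layer and loss function, regenerates similar synthetic data, and runs only 5 epochs to stay short.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
x = torch.cat([torch.randn(100, 2) + 2.0, torch.randn(100, 2) - 2.0])
y = torch.cat([torch.zeros(100), torch.ones(100)])
loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)

model = torch.nn.Linear(2, 1)  # stand-in for the custom Linear layer
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = torch.nn.BCEWithLogitsLoss()

epoch_losses = []
for epoch in range(5):
    running, seen = 0.0, 0
    for batch_x, batch_y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(batch_x).squeeze(-1), batch_y)
        loss.backward()
        optimizer.step()
        running += loss.item() * len(batch_x)  # weight each batch by its size
        seen += len(batch_x)
    epoch_losses.append(running / seen)
    print(f'epoch: {epoch}, mean loss: {epoch_losses[-1]:.4f}')
```

Weighting by batch size keeps the average correct even when the final batch is smaller than the others.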


## 3. MNIST Classification
Dataset classification with an MLP

In this section, you will use PyTorch to implement a multi-layer perceptron (MLP) model for classifying handwritten digits using the MNIST dataset.


1. Data Loading and Preprocessing:
   - Utilize the `torchvision.datasets` module to load the MNIST dataset.
   - Apply necessary transformations (like `ToTensor` and `Normalize`) to prepare the data for model training. These transformations ensure the data has the correct format and scales, helping with model convergence.
   - Use a DataLoader with a suitable batch size to efficiently manage data feeding into the model.

2. Architecture:
   - Define a simple MLP model with PyTorch's `torch.nn.Module`. A suggested architecture is:
     - An input layer that takes the flattened 28x28 pixel values (784 features).
     - One or more hidden layers with ReLU activations for non-linearity.
     - An output layer with softmax activation for multi-class classification.
   - Make sure to initialize the model appropriately, especially if you're stacking multiple layers.

3. Training:
   - Set up an optimizer (like `Adam` or `SGD`) to minimize the model's error during training. You will also need a loss function, such as `CrossEntropyLoss`, which is well-suited for classification tasks.
   - Write a training loop that performs the following steps:
     - Forward pass: Feed batches through the model to obtain predictions.
     - Compute the loss by comparing predictions with true labels.
     - Backward pass: Calculate gradients for each model parameter.
     - Update the model weights using the optimizer.
   - Periodically log or print the training loss to track progress.

4. Evaluation:
   - After training, evaluate your model on the test set.
   - Compute and print the accuracy metric, and optionally, create a confusion matrix to analyze classification errors.
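
The optional confusion matrix from the evaluation step can be sketched with `torch.bincount`. The helper name `confusion_matrix` and the small example tensors are illustrative assumptions. In practice `pred` and `target` would be collected over the whole test set.

```python
import torch


def confusion_matrix(pred, target, num_classes):
    """Return a matrix whose rows are true classes and columns predicted classes."""
    # Encode each (true, predicted) pair as one flat index, then count occurrences.
    flat = target * num_classes + pred
    counts = torch.bincount(flat, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


target = torch.tensor([0, 1, 2, 2, 1, 0])
pred = torch.tensor([0, 2, 2, 2, 1, 1])

cm = confusion_matrix(pred, target, num_classes=3)
accuracy = cm.diag().sum().item() / cm.sum().item()

print(cm)
print(f'accuracy: {accuracy:.2f}')
```

The diagonal holds the correctly classified samples, so accuracy is the diagonal sum divided by the total count, and off-diagonal entries show which digits are confused with which.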


MNIST: http://yann.lecun.com/exdb/mnist/
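
A note on the softmax output layer in the architecture outline: `torch.nn.CrossEntropyLoss` expects raw, unnormalized scores (logits) and applies log-softmax internally, so a model trained with it typically ends in a plain linear layer. The sketch below checks this on made-up logits and labels.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # 4 samples, 10 class scores each
labels = torch.tensor([3, 0, 7, 9])   # made-up class indices

cross_entropy = torch.nn.CrossEntropyLoss()(logits, labels)

# Equivalent two-step form: log-softmax followed by negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# Softmax is still useful at inference time to read scores as probabilities.
probs = F.softmax(logits, dim=1)

print(cross_entropy.item(), manual.item())
```

Adding an explicit softmax layer before `CrossEntropyLoss` would apply the normalization twice, which tends to flatten gradients and slow training.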

In [35]:
from torchvision import datasets, transforms

transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
    ])
dataset1 = datasets.MNIST('./data', train=True, download=True,
                    transform=transform)
dataset2 = datasets.MNIST('./data', train=False,
                    transform=transform)
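# Note: no batch_size is passed, so both loaders use the DataLoader defaults
# (batch_size=1, shuffle=False); every "batch" is therefore a single image.
train_loader = torch.utils.data.DataLoader(dataset1)
test_loader = torch.utils.data.DataLoader(dataset2)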
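
The effect of the `batch_size` argument can be sketched without downloading MNIST by using random tensors of the same per-image shape. The dataset size of 256 and the batch size of 64 are arbitrary illustrative choices, not values taken from the assignment.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

images = torch.randn(256, 1, 28, 28)    # stand-in for normalized MNIST images
labels = torch.randint(0, 10, (256,))   # stand-in for digit labels
dataset = TensorDataset(images, labels)

default_loader = DataLoader(dataset)                               # defaults
batched_loader = DataLoader(dataset, batch_size=64, shuffle=True)  # mini-batches

print(default_loader.batch_size, len(default_loader))
print(batched_loader.batch_size, len(batched_loader))

data, target = next(iter(batched_loader))
print(data.shape, target.shape)
```

With the defaults, one optimizer step is taken per image, which makes each epoch slow and the per-step loss very noisy. Mini-batches with shuffling usually give smoother training.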

In [36]:
class MLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = torch.nn.Flatten()
        self.layers = torch.nn.Sequential(
            torch.nn.Linear(28 * 28, 512),
            torch.nn.ReLU(),
            torch.nn.Linear(512, 256),
            torch.nn.ReLU(),
            torch.nn.Linear(256, 10)
        )
    
    def forward(self, x):
        x = self.flatten(x)
        return self.layers(x)

# Initialize the model, loss function, and optimizer
model = MLP()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Training loop
def train(model, train_loader, criterion, optimizer):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        
        if batch_idx % 100 == 0:
            print(f'Train Batch: {batch_idx}, Loss: {loss.item():.4f}')

# Evaluation function
def evaluate(model, test_loader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for data, target in test_loader:
            output = model(data)
            predicted = output.argmax(dim=1)
            total += target.size(0)
            correct += (predicted == target).sum().item()
    
    accuracy = 100 * correct / total
    print(f'Test Accuracy: {accuracy:.2f}%')
    return accuracy

# Train the model
num_epochs = 5
for epoch in range(num_epochs):
    print(f'Epoch {epoch+1}/{num_epochs}')
    train(model, train_loader, criterion, optimizer)
    evaluate(model, test_loader)

Epoch 1/5
Train Batch: 0, Loss: 2.1812
Train Batch: 100, Loss: 3.0872
Train Batch: 200, Loss: 0.1913
Train Batch: 300, Loss: 2.3751
Train Batch: 400, Loss: 0.0019
Train Batch: 500, Loss: 4.0902
Train Batch: 600, Loss: 0.0297
Train Batch: 700, Loss: 0.0136
Train Batch: 800, Loss: 0.0057
Train Batch: 900, Loss: 0.3897
Train Batch: 1000, Loss: 0.2384
Train Batch: 1100, Loss: 0.1768
Train Batch: 1200, Loss: 0.0017
Train Batch: 1300, Loss: 0.0427
Train Batch: 1400, Loss: 0.7647
Train Batch: 1500, Loss: 3.4802
Train Batch: 1600, Loss: 1.9543
Train Batch: 1700, Loss: 0.0002
Train Batch: 1800, Loss: 0.0007
Train Batch: 1900, Loss: 0.0572
Train Batch: 2000, Loss: 1.7357
Train Batch: 2100, Loss: 0.0000
Train Batch: 2200, Loss: 0.3284
Train Batch: 2300, Loss: 0.0001
Train Batch: 2400, Loss: 0.0001
Train Batch: 2500, Loss: 0.0001
Train Batch: 2600, Loss: 0.7976
Train Batch: 2700, Loss: 0.0058
Train Batch: 2800, Loss: 0.7353
Train Batch: 2900, Loss: 0.1388
Train Batch: 3000, Loss: 0.5718
Train Batc