<div align="center">

# National Tsing Hua University

### Fall 2023

#### 11210IPT 553000

#### Deep Learning in Biomedical Optical Imaging

## Homework 2

#### 110066515 陳文祺


### ✏️ Task A: Transitioning to Cross-Entropy Loss (20 pts)

In the lab, we used the **Binary Cross-Entropy (BCE) Loss** for a binary classification task. The BCE loss is defined as:

$$ \text{BCE}(y, \hat{y}) = - \left( y \log(\hat{y}) + (1 - y) \log(1 - \hat{y}) \right) $$

Here, $y$ is the true label (0 or 1), and $\hat{y}$ denotes the predicted probability of $y=1$.
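
As a quick numeric check of the formula, the sketch below evaluates BCE for two hand-picked predictions using only the standard library. The function name `bce` and the sample values are illustrative assumptions, not part of the lab code.

```python
import math

def bce(y: int, y_hat: float) -> float:
    """Binary cross-entropy for one sample; y is 0 or 1, y_hat is P(y=1)."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

# A confident correct prediction gives a small loss,
# a confident wrong prediction gives a large one.
loss_good = bce(1, 0.9)   # equals -log(0.9)
loss_bad = bce(1, 0.1)    # equals -log(0.1)
print(f"{loss_good:.4f} {loss_bad:.4f}")
```

Only one of the two terms is active for a given label, so the loss reduces to the negative log of the probability assigned to the true class.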

In this task, we switch the model to **Cross-Entropy (CE) Loss**, the more common choice for classification tasks, especially those with more than two classes. CE loss is defined as:

$$ \text{CE}(y, \hat{y}) = -\sum_{i} y^{(i)} \log(\hat{y}^{(i)}) $$

In this expression, $y$ is the ground-truth label (one-hot encoded over the classes), $\hat{y}$ is the vector of class probabilities predicted by your model, and $i$ is the index of the class.
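
The following framework-free sketch evaluates the CE formula for a single three-class sample. The helper names `softmax` and `cross_entropy` and the logit values are assumptions chosen for illustration; the softmax step reflects the usual way of turning raw model scores into the probabilities $\hat{y}$.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y_onehot, y_hat):
    """CE for one sample: -sum_i y_i * log(y_hat_i)."""
    return -sum(y * math.log(p) for y, p in zip(y_onehot, y_hat) if y > 0)

probs = softmax([2.0, 0.5, -1.0])            # three-class example
loss = cross_entropy([1, 0, 0], probs)       # true class is index 0
print(probs, loss)
```

With a one-hot label, every term except the true class vanishes, so the loss equals `-log(probs[0])` here.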


#### 1. Modify the Loss (3 pts)
Transition to using Cross-Entropy (CE) Loss for the classification task by utilizing PyTorch's built-in functionalities. You can refer to the [official PyTorch documentation](https://pytorch.org/docs/stable/nn.html) for detailed information and guidance to ensure the correct implementation of the CE loss.

In [None]:
import torch.nn as nn

# Replace '...' with the appropriate loss function in PyTorch
loss = nn.CrossEntropyLoss()


#### 2. Modify the Model Architecture (2 pts)
To adapt the original code for use with Cross-Entropy (CE) loss, make necessary modifications to the model architecture. Ensure it is compatible and optimized for the application of CE loss. Consider the number of output nodes and the activation function used in the output layer for effective multi-class classification.

In [None]:
# Modifying the architecture to be compatible with CE loss
ce_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(256*256*1, 256),
    nn.ReLU(),            # must be an instance, not the class itself
    #nn.Linear(256, 1)    # for BCE
    nn.Linear(256, 2)     # for CE
).cuda()

#### 3. Reflection Questions (15 pts, 5 pts for each)
Provide detailed answers to the questions below:

**Q1. Loss Function Comparison:**  
   What are the differences between Binary Cross-Entropy (BCE) loss and Cross-Entropy (CE) loss?

**Q2. Model Architecture Modification:**  
   What motivated the specific changes you made to the model architecture?

**Q3. Adapting to CE Loss:**  
   In the original code configured for BCE loss, two major adjustments are needed for adaptation to CE loss. Analyze and explain the necessity for these changes, referring to the code below.

```python
for images, labels in train_loader:
    images = images.cuda()
    images = images / 255.0
    labels = labels.cuda()
    optimizer.zero_grad()
    outputs = model(images)

    # Change #1: Adaptation to the labels for CE loss
    labels = labels.long()  # Changed from labels.float().unsqueeze(1) for BCE loss

    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    total_loss += loss.item()

    # Change #2: Predictions for CE loss
    train_predicted = outputs.argmax(-1)  # Changed from torch.sigmoid(outputs) > 0.5 for BCE loss
    train_correct += (train_predicted == labels).sum().item()
    
```

#### Put Your Response Here:

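##### 1. BCE loss is designed for binary classification: the model emits a single value, a sigmoid maps it to the probability of class 1, and the target is a float (0 or 1). CE loss generalizes this to any number of classes: the model emits one logit per class, a softmax turns the logits into a probability distribution, and the target is an integer class index. In PyTorch, `nn.CrossEntropyLoss` applies log-softmax internally, so the model should output raw logits. CE also works for two classes; with two output nodes it is mathematically equivalent to BCE applied to the difference of the two logits.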
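
The relationship between the two losses in the two-class case can be checked numerically. The sketch below is a standard-library illustration, not the lab code; the function names and the logit values `z0`, `z1` are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_logit(y, z):
    """BCE where the probability of class 1 is sigmoid(z)."""
    p = sigmoid(z)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def ce_from_logits(target, logits):
    """CE with softmax over the logits, written as log-sum-exp minus the target logit."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]

z0, z1 = -0.3, 1.2                            # hypothetical logits for class 0 and class 1
for y in (0, 1):
    ce = ce_from_logits(y, [z0, z1])
    bce = bce_from_logit(y, z1 - z0)          # BCE on the logit difference
    print(y, round(ce, 6), round(bce, 6))
```

Both columns agree for either label, since a softmax over two logits reduces to a sigmoid of their difference.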


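##### 2. CE loss expects one output node per class, so the output layer becomes `nn.Linear(256, n)` for n classes; here n = 2 for the binary task. n cannot be 1: a softmax over a single node always returns probability 1, so the loss could not distinguish between classes. No sigmoid or softmax layer is added after the output layer, because `nn.CrossEntropyLoss` applies log-softmax to the logits internally.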
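
To see why a single output node cannot work with a softmax-based loss, the sketch below compares a one-node and a two-node output. It uses only the standard library, and the logit values are arbitrary assumptions.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# With one output node, softmax returns probability 1 for every input,
# so the CE loss is always 0 and provides no training signal.
single = [softmax([z])[0] for z in (-5.0, 0.0, 5.0)]

# With two output nodes, the probability of the first class depends on the logits.
double = [softmax([z, 0.0])[0] for z in (-5.0, 0.0, 5.0)]
print(single, double)
```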

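##### 3.1. Change 1: When the targets are class indices, `nn.CrossEntropyLoss` requires a LongTensor of shape `(N,)`, because each index selects the logit (log-probability) of the true class for that sample. The BCE version instead needed float targets of shape `(N, 1)` to match the single-output model, hence `labels.float().unsqueeze(1)`. A valid CE target looks like this, for example:

`target = torch.empty(3, dtype=torch.long).random_(5)`

##### 3.2. Change 2: The CE model outputs one logit per class rather than a single value, so thresholding a sigmoid at 0.5 no longer applies. The predicted class is the index of the largest logit, obtained with `argmax` over the class dimension (`dim=-1`). The loss value itself is averaged over the batch by default (`reduction='mean'`) and is used only for backpropagation, not for producing predictions.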
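
For two classes, taking the argmax over the logits yields the same prediction as thresholding the sigmoid of the logit difference at 0.5. The sketch below illustrates this with plain Python lists; the logit values are hypothetical and the helper names are assumptions.

```python
import math

def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical two-class logits for four samples: [class 0, class 1]
batch_logits = [[1.5, -0.2], [-0.7, 0.9], [0.1, 0.3], [2.0, 1.0]]

pred_argmax = [argmax(row) for row in batch_logits]
pred_threshold = [int(sigmoid(row[1] - row[0]) > 0.5) for row in batch_logits]
print(pred_argmax, pred_threshold)
```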

### ✏️ Task B: Creating an Evaluation Code (20 pts)

Evaluate the performance of a pretrained deep learning model on a test dataset of chest X-ray images stored in `test_normal.npy` and `test_pneumonia.npy`. Each file contains 200 grayscale images of size 256×256: normal chest X-rays in the first and pneumonia chest X-rays in the second. The objective is to calculate the model’s accuracy, defined as the percentage of images correctly classified. To accomplish this, write code that loads the data, processes it, and evaluates the model on this dataset. Make sure each code segment that replaces a `...` placeholder is functional and follows the steps in the instructions.
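
The accuracy definition above can be sketched in a few lines of plain Python. The function name `accuracy` and the toy labels are assumptions for illustration; 1 stands for pneumonia and 0 for normal, following the labeling convention used in Step 1.

```python
def accuracy(predictions, labels):
    """Percentage of predictions that match the labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(int(p == t) for p, t in zip(predictions, labels))
    return 100.0 * correct / len(labels)

# Toy example with 8 samples, 2 of which are misclassified
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_preds = [1, 0, 1, 0, 0, 1, 0, 1]
acc = accuracy(toy_preds, toy_labels)
print(f"{acc}%")
```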

**Note: ⚠️ Make sure to upload your trained model's weights to your working environment if needed.**

### Step 0: Download test dataset

In [None]:
!wget https://raw.githubusercontent.com/TacoXDD/homeworks/master/dataset/test/test_normal.npy
!wget https://raw.githubusercontent.com/TacoXDD/homeworks/master/dataset/test/test_pneumonia.npy

--2023-10-13 09:46:58--  https://raw.githubusercontent.com/TacoXDD/homeworks/master/dataset/test/test_normal.npy
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13107328 (12M) [application/octet-stream]
Saving to: ‘test_normal.npy’


2023-10-13 09:46:59 (328 MB/s) - ‘test_normal.npy’ saved [13107328/13107328]

--2023-10-13 09:46:59--  https://raw.githubusercontent.com/TacoXDD/homeworks/master/dataset/test/test_pneumonia.npy
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13107328 (12M) [application/octet-stream]
Saving to: ‘test_p

### Step 1: Prepare your test dataset

In [None]:
import numpy as np

test_abnormal = np.load('test_pneumonia.npy')
test_normal = np.load('test_normal.npy')

print(f'Shape of test_abnormal: {test_abnormal.shape}')
print(f'Shape of test_normal: {test_normal.shape}')

# For the data having presence of pneumonia assign 1, for the normal ones assign 0.
test_abnormal_labels = np.ones((test_abnormal.shape[0],))
test_normal_labels = np.zeros((test_normal.shape[0],))

x_test = np.concatenate((test_abnormal, test_normal), axis=0)
y_test = np.concatenate((test_abnormal_labels, test_normal_labels), axis=0)

print(f'Shape of x_test: {x_test.shape}')
print(f'Shape of y_test: {y_test.shape}')

Shape of test_abnormal: (200, 256, 256)
Shape of test_normal: (200, 256, 256)
Shape of x_test: (400, 256, 256)
Shape of y_test: (400,)


### Step 2: Load Test Images into PyTorch DataLoader (5 pts)

In [None]:
import torch
from torch.utils.data import DataLoader, TensorDataset

# Convert to PyTorch tensors: float images for the model, integer class labels for CE
x_test = torch.Tensor(x_test).float()
y_test = torch.Tensor(y_test).long()

# Combine the images and labels into a dataset
test_dataset = TensorDataset(x_test, y_test)

# Create a dataloader to load data in batches. Set batch size to 32.
# Shuffling is unnecessary for evaluation; accuracy does not depend on sample order.
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

### Step 3: Prepare Your Trained Model  (5 pts)
- Define the architecture to match exactly with the trained model intended for inference. Ensure strict alignment to avoid errors during evaluation.
- Load the weights from the trained model and set the model to evaluation mode

In [None]:
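import torch.nn as nn

# Declare the model architecture
model = nn.Sequential(
    nn.Flatten(),

    nn.Linear(256*256*1, 64),
    nn.BatchNorm1d(64),
    nn.ReLU(),
    nn.Dropout(0.5),

    # The output size must match the saved checkpoint: 2 logits for a model
    # trained with CE loss, 1 for a model trained with BCE loss. A mismatch
    # makes load_state_dict raise a RuntimeError.
    nn.Linear(64, 2)
    #nn.Linear(64, 1)     # for BCE

).cuda()


# Load the trained weights
model.load_state_dict(torch.load('model_classification.pth'))

# Set the model to evaluation mode (disables dropout, uses BatchNorm running statistics)
model.eval()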

RuntimeError: ignored

### Step 4: Perform Inference and Calculate the Accuracy (10 pts)
- Ensure the image values are processed in a manner consistent with the training phase.
- Use the model that was trained with BCE loss to execute inference on the test dataset.
- Note that inference should be performed on the GPU.
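
When accuracy is computed batch by batch, the correct and total counts should be accumulated and divided once at the end. With 400 images and a batch size of 32, the last batch holds only 16 images, so averaging per-batch accuracies would weight it too heavily. The sketch below illustrates the difference with made-up results; the `batches` helper and the `is_correct` data are assumptions, not outputs of the model.

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Hypothetical results for 400 test images: True where the prediction was correct.
# The first 384 are all correct, the last 16 are all wrong.
is_correct = [True] * 384 + [False] * 16

correct, total, per_batch = 0, 0, []
for batch in batches(is_correct, 32):
    correct += sum(batch)                      # accumulate raw counts
    total += len(batch)
    per_batch.append(100.0 * sum(batch) / len(batch))

overall = 100.0 * correct / total              # accuracy over all 400 images
mean_of_batches = sum(per_batch) / len(per_batch)  # biased by the short last batch
print(overall, round(mean_of_batches, 2))
```

Accumulating counts gives the true overall accuracy, while the mean of per-batch accuracies comes out lower in this example because the 16-image batch counts as much as a full one.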

In [None]:
test_correct = 0
test_total = 0

with torch.no_grad():
    for images, labels in test_loader:
        # Move the batch to the GPU and divide by 255, as in training
        images = images.cuda()
        images = images / 255.
        labels = labels.cuda()

        outputs = model(images)

        # CE model: the predicted class is the index of the largest logit
        predicted = outputs.argmax(-1)
        # A BCE model (single output) would instead use:
        # predicted = (torch.sigmoid(outputs) > 0.5).squeeze(1).long()

        test_correct += (predicted == labels).sum().item()
        test_total += labels.size(0)

print(f'Test accuracy is {100. * test_correct / test_total}%.')


NameError: ignored