# Prerequisites
Please install Python and the required libraries for this exercise.

**Install [PyTorch](https://pytorch.org/) Library**


```
!pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
```

**Import libraries**


```
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms
```

Enable inline plotting for notebooks by running `%matplotlib inline` in a cell.

## Data

For this exercise we simulate single-cell RNA sequencing data using the publicly available [Splatter package in R](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1305-0).

First, load the R magic extension (`rmagic`) by executing this in a cell:

```
%load_ext rpy2.ipython
```
This requires the `rpy2` package to be installed in Python. If you are asked to restart the runtime after installing it, choose `Restart runtime` from the Runtime menu.


Use the `%%R` cell magic when you want everything in a cell to be executed in R. Note that it must be placed on the first line of the cell. Install Bioconductor:

```
%%R
# sim <- splatSimulate(params, mean.rate = 0.6, out.prob = 0.2)
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.15")
```
Install Splatter:
```
%%R
BiocManager::install("splatter")
```

## 1) Pre-processing
### 1.1) Simulate 5000 samples with 2000 genes from two cell types, load the samples into Torch tensors, and provide a DataLoader.
For example, this code loads the MNIST dataset:


```
batch_size = 32

train_dataset = datasets.MNIST('./data', 
                               train=True, 
                               download=True, 
                               transform=transforms.ToTensor())

validation_dataset = datasets.MNIST('./data', 
                                    train=False, 
                                    transform=transforms.ToTensor())

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

validation_loader = torch.utils.data.DataLoader(dataset=validation_dataset, 
                                                batch_size=batch_size, 
                                                shuffle=False)
```
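
For in-memory arrays such as a simulated count matrix, the same loader pattern can be built with `TensorDataset` instead of a torchvision dataset. The block below is a minimal sketch, not the expected solution: the counts are random Poisson draws standing in for the Splatter output, and the labels, the 80/20 split, and the names `sc_train_loader` and `sc_val_loader` are assumptions made for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in data: random Poisson counts, NOT real Splatter output.
n_cells, n_genes = 5000, 2000
generator = torch.Generator().manual_seed(0)
counts = torch.poisson(torch.full((n_cells, n_genes), 2.0), generator=generator)
# Hypothetical cell-type labels (0 or 1), one per cell.
labels = torch.randint(0, 2, (n_cells,), generator=generator)

# Pair each cell's expression vector with its label.
dataset = TensorDataset(counts, labels)

# Hold out 20% of the cells for validation (an assumed split ratio).
n_val = n_cells // 5
train_set, val_set = random_split(
    dataset, [n_cells - n_val, n_val], generator=generator
)

sc_train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
sc_val_loader = DataLoader(val_set, batch_size=32, shuffle=False)

# Each batch is a (features, labels) pair.
x_batch, y_batch = next(iter(sc_train_loader))
```

With real simulated data, `counts` and `labels` would be replaced by tensors created from the matrices exported from R, for example via `torch.from_numpy`.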

### 1.2) Normalize the samples using z-score normalization and plot them beside the raw samples
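
As a reminder, z-score normalization subtracts the mean and divides by the standard deviation. The sketch below assumes normalization is done per gene (per column); the helper name `zscore` and the small `eps` guard against zero variance are illustrative choices, not part of the exercise.

```python
import torch

def zscore(x, eps=1e-8):
    """Standardize each column (gene) to zero mean and unit standard deviation."""
    mean = x.mean(dim=0, keepdim=True)
    std = x.std(dim=0, keepdim=True)  # sample standard deviation (n - 1)
    return (x - mean) / (std + eps)   # eps avoids division by zero

# Tiny example: 3 samples, 2 genes on very different scales.
raw = torch.tensor([[1.0, 10.0],
                    [2.0, 20.0],
                    [3.0, 30.0]])
normalized = zscore(raw)
```

In a train/validation setting, the mean and standard deviation are usually computed on the training set only and then reused for the validation set.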

## 2) Train an MLP on the data to classify RNA samples into one of the two cell types.

### 2.1) Implement the following MLP model in sequential fashion.


```
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28*28, 50)
        self.fc1_drop = nn.Dropout(0.2)
        self.fc2 = nn.Linear(50, 50)
        self.fc2_drop = nn.Dropout(0.2)
        self.fc3 = nn.Linear(50, 10)

    def forward(self, x):
        x = x.view(-1, 28*28)
        x = F.relu(self.fc1(x))
        x = self.fc1_drop(x)
        x = F.relu(self.fc2(x))
        x = self.fc2_drop(x)
        return F.log_softmax(self.fc3(x), dim=1)
```
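
As a hint for the sequential style, a layer stack of this kind can be expressed with `nn.Sequential`. This is one possible sketch rather than the reference answer: the input size of 2000 genes and the two output classes are assumptions taken from the exercise description, and the name `sequential_net` is hypothetical.

```python
import torch
import torch.nn as nn

# Assumed sizes: 2000 genes in, two cell types out.
n_genes, n_classes = 2000, 2

sequential_net = nn.Sequential(
    nn.Linear(n_genes, 50),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(50, 50),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(50, n_classes),  # outputs raw logits, no softmax applied
)

# Forward pass on a random batch of 4 samples.
logits = sequential_net(torch.randn(4, n_genes))
```

This sketch returns raw logits, which is the input `nn.CrossEntropyLoss` expects because it applies log-softmax internally. A model that ends in `log_softmax`, like the class above, is the usual match for `nn.NLLLoss`.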



### 2.2) Instantiate (and name) the MLP model as `model`, the Stochastic Gradient Descent optimizer as `optimizer`, and the cross-entropy loss as `criterion`.


### 2.3) Fix the bug(s) in the `train` function



```
def train(epoch, log_interval=200):
    # Set model to training mode
    model.train()
    
    # Loop over each batch from the training set
    for batch_idx, (data, target) in enumerate(train_loader):
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        
```


## 3) Evaluation
### 3.1) Change the previous code to calculate the test and train loss.

### 3.2) Change the previous code to calculate accuracy, specificity, sensitivity, and AUC for the test set.
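
For reference, the four metrics can be defined from the confusion-matrix counts and from score rankings. The sketch below uses plain Python on a hand-made toy example; the helper names `binary_metrics` and `auc`, the 0.5 decision threshold, and the toy labels and scores are all illustrative assumptions, and both classes are assumed to be present in the labels.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Quadratic in the number of samples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: true labels and predicted probabilities of class 1.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
```

In the notebook, `scores` would come from the model's predicted probability for the positive class on the test set. For large test sets, a library routine such as `sklearn.metrics.roc_auc_score` is a more efficient choice than this pairwise loop.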