# Liver patient prediction
### A prediction/classification model should take into account at least three factors:
### 1. Purpose. 2. Dataset characteristics. 3. Computational & run-time resources.

#### Starting with point #1, my goal is the **highest recall for positive diagnoses**, so that no patient is missed. Nevertheless, I will also try to get good **precision for healthy people**.
#### As for point #2, the liver patients dataset is **small and imbalanced**.

#### Sorry, I don't like oversampling methods such as **SMOTE**. I suspect that the model will learn the pattern of the synthetic SMOTE samples rather than the real generalization that I expect from a good ML algorithm.
#### Because it is a small dataset, I tried the **NearMiss** undersampling method.
#### I got additional improvements by 1. using a **custom loss** function, 2. not using NearMiss, and 3. increasing the number of epochs.


In [None]:
import pandas as pd 
import numpy as np
import torch
import math
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import accuracy_score


## Some data preparation... 

In [None]:
df = pd.read_csv('/kaggle/input/indian-liver-patient-records/indian_liver_patient.csv')
print (df['Dataset'])
df.dropna(inplace=True)

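# Encode Gender as numbers and recode the label: 1 = liver patient, 0 = healthy (originally 2).
# Assigning the result back avoids the chained `inplace=True` call, which newer pandas versions warn about.
df['Gender'] = df['Gender'].map({'Male': 1, 'Female': 2})
df['Dataset'] = df['Dataset'].replace({2: 0})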

#### Let's **normalize** the data:

In [None]:
# Create a min-max scaler, which rescales every column to the [0, 1] range
from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()

# Fit the scaler and transform the data (the result is a NumPy array)
df_scaled = min_max_scaler.fit_transform(df)

# Wrap the result in a DataFrame; the columns become 0..10, with the label in column 10
df_normalized = pd.DataFrame(df_scaled)
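
One caveat: the cell above fits the scaler on all rows, so the minimum and maximum of the rows that are later held out also influence the scaling. Below is a minimal sketch of the alternative, fitting on the training split only. The array `toy` and its values are made up for illustration and are not the liver dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Toy feature matrix (hypothetical values, 10 rows x 2 columns)
toy = np.arange(20, dtype=np.float32).reshape(10, 2)

toy_train, toy_test = train_test_split(toy, test_size=0.2, random_state=1)

# Fit the scaler on the training rows only, then reuse it for the test rows
scaler = MinMaxScaler()
toy_train_scaled = scaler.fit_transform(toy_train)
toy_test_scaled = scaler.transform(toy_test)

print(toy_train_scaled.min(axis=0), toy_train_scaled.max(axis=0))
```

With this approach the test rows may fall slightly outside [0, 1], which is expected.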

#### Let's set aside 20% of the data for **validation**.

In [None]:
train, test = train_test_split(df_normalized, test_size=0.2, random_state=1)

labels = train[10]
train = train.drop(10, axis=1)

labelsTest = test[10]
test = test.drop(10, axis=1)

#### Shifting to PyTorch

In [None]:
x = torch.tensor(train.values.astype(np.float32))
y = torch.tensor(labels.values.astype(np.float32))

x_test = torch.tensor(test.values.astype(np.float32))
y_test = torch.tensor(labelsTest.values.astype(np.float32))

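### Because the dataset is so tiny, the results vary a lot from one run to another (the bias–variance tradeoff at work).
### This makes **hyperparameter tuning** hard, because it is difficult to tell a real improvement from random noise. To make the runs comparable, I fix the random seed: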

In [None]:
torch.manual_seed(123)
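
A small sketch of what the seed buys us: with the same seed, two freshly created layers start from identical weights, so differences between runs come from the settings and not from the random initialization. The layer size here is just an example.

```python
import torch

torch.manual_seed(123)
layer_a = torch.nn.Linear(10, 50)

torch.manual_seed(123)
layer_b = torch.nn.Linear(10, 50)

# Same seed -> same initial weights
same_init = torch.equal(layer_a.weight, layer_b.weight)
print(same_init)
```

Note that the seed has to be set again before every model creation for this to hold. Setting it once at the top only makes the notebook reproducible as a whole.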

## **NearMiss**
### To deal with this small and imbalanced dataset, I will start with NearMiss.
### With the NearMiss method, we undersample the frequent class. The samples that NearMiss keeps are the ones closest to (that is, most similar to) the samples of the rare class. The assumption is that the classifier will need to "work" harder to find good generalization principles, and that the dropped samples will then be easy to classify.

In [None]:
from imblearn.under_sampling import NearMiss
nr = NearMiss()
x_near, y_near = nr.fit_resample(x, y)

x_nm = torch.from_numpy(x_near)
y_nm = torch.from_numpy(y_near)
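
To make the selection rule concrete, here is a rough NumPy sketch of the "version 1" NearMiss idea as I understand it: keep the frequent-class samples whose mean distance to their k nearest rare-class samples is smallest. The function `nearmiss1_sketch` and the toy data are hypothetical and only illustrate the idea. This is not the imblearn implementation.

```python
import numpy as np

def nearmiss1_sketch(X, y, majority_label, minority_label, k=3):
    """Keep the majority samples whose mean distance to their k nearest
    minority samples is smallest, as many as there are minority samples."""
    X_maj = X[y == majority_label]
    X_min = X[y == minority_label]
    # Pairwise Euclidean distances, shape (n_majority, n_minority)
    dists = np.linalg.norm(X_maj[:, None, :] - X_min[None, :, :], axis=2)
    # Mean distance to the k closest minority samples, per majority sample
    mean_k = np.sort(dists, axis=1)[:, :k].mean(axis=1)
    keep = np.argsort(mean_k)[:len(X_min)]
    X_out = np.vstack([X_maj[keep], X_min])
    y_out = np.concatenate([np.full(len(keep), majority_label),
                            np.full(len(X_min), minority_label)])
    return X_out, y_out

# Toy data: 6 majority points (label 1) and 2 minority points (label 0)
X_toy = np.array([[0.0], [0.1], [3.0], [5.0], [6.0], [7.0], [0.05], [0.15]])
y_toy = np.array([1, 1, 1, 1, 1, 1, 0, 0])
X_bal, y_bal = nearmiss1_sketch(X_toy, y_toy, majority_label=1, minority_label=0, k=2)
print(X_bal.ravel(), y_bal)
```

In the toy example only the two majority points that sit right next to the minority points survive, which are exactly the "hard" samples.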

### Let's verify that the classes are balanced

In [None]:
negative_imbalanced=0
positive_imbalanced=0
for i in range(y_nm.shape[0]):
    if y_nm[i] == 0:
        negative_imbalanced += 1
    if y_nm[i] == 1:
        positive_imbalanced += 1
print ('Number of negative samples after balance: ', negative_imbalanced)
print ('Number of positive samples after balance: ', positive_imbalanced)
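
The same count can also be done without a Python loop. A minimal sketch on a made-up label vector (`labels_toy` stands in for `y_nm`):

```python
import torch

# Toy label vector (hypothetical), standing in for y_nm
labels_toy = torch.tensor([0., 1., 1., 0., 1., 0.])

# Comparing gives a boolean tensor; summing it counts the matches
n_negative = int((labels_toy == 0).sum())
n_positive = int((labels_toy == 1).sum())
print(n_negative, n_positive)
```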

### The model: 

In [None]:
model = torch.nn.Sequential(
    torch.nn.Linear(10, 50, bias=True),
    torch.nn.Dropout(0.4),
    torch.nn.ReLU(),
    torch.nn.Linear(50, 20, bias=True),
    torch.nn.Dropout(0.3),
    torch.nn.LeakyReLU(),
    torch.nn.Linear(20, 1, bias=True),
    torch.nn.Sigmoid()
)

### I will use a conventional configuration (Binary Cross Entropy, Adam, ReLU)

In [None]:
criterion = torch.nn.BCELoss()
learning_rate = 1e-3
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

### Let's run!

In [None]:
Losses = []
epochs = 500

for epoch in range(epochs):
    y_pred = model(x_nm)    
    y_nm = torch.reshape(y_nm, (-1, 1))
    loss = criterion(y_pred, y_nm)
    # Collect the loss as a plain float so that it can be plotted
    Losses.append(loss.item())
    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

### Let's see the loss

In [None]:
import matplotlib.pyplot as plt
plt.plot(Losses)
plt.title('Training loss trend')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.show()

### Let's see the recall

In [None]:
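# Get the model predictions for the test set.
# Switch to eval mode so that dropout is disabled, and skip gradient tracking.
model.eval()
with torch.no_grad():
    test_pred = model(x_test)
model.train()  # back to training mode for the cells below

# Round the test predictions to yes/no (1 or 0).
y_out = torch.round(test_pred)

# Match the dimensions of the two vectors:
y_test = torch.reshape(y_test, (-1, 1))

# Count the false/true positives and negatives: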
F_positive, T_positive, F_negative, T_negative = 0, 0, 0, 0
for i in range(y_test.shape[0]):
    if y_out[i] == 0:
        if y_test[i] == 0:
            T_negative += 1
        else:
            F_negative += 1
    if y_out[i] == 1:
        if y_test[i] == 1:
            T_positive += 1
        if y_test[i] == 0:
            F_positive += 1

recall = T_positive / (F_negative + T_positive)
print('The percentage of recall is: ', int(100*recall))

### **scikit-learn** is surely much more elegant **!!**

In [None]:
from sklearn.metrics import classification_report

Label = y_test.detach().numpy()
Prediction = y_out.detach().numpy()

print('Classification report for NearMiss: \n', classification_report(Label, Prediction))
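
If only the four counts are needed, `confusion_matrix` from scikit-learn gives them directly. A small sketch on made-up labels and predictions (`label_toy` and `pred_toy` are not the notebook's results):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, recall_score

# Toy labels and predictions (made up for illustration)
label_toy = np.array([1, 1, 1, 0, 0, 1])
pred_toy = np.array([1, 0, 1, 0, 1, 1])

# For binary labels {0, 1}, ravel() yields TN, FP, FN, TP in this order
tn, fp, fn, tp = confusion_matrix(label_toy, pred_toy).ravel()
recall = tp / (tp + fn)
print(tn, fp, fn, tp, recall)
```

The hand-computed recall should agree with `recall_score(label_toy, pred_toy)`.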


### The precision is better than in other notebooks I saw here, but this result does not match **my main goal**: high recall for liver patient prediction.
### To reach this goal, I will try to **manipulate the loss function** a little bit:

In [None]:
epsilon = 10**-10
def my_loss(output, target, alpha):
    loss = alpha * target * torch.abs(torch.log(output + epsilon)) + torch.abs((1 - target) * torch.log(1 - output + epsilon))
    loss = torch.mean(loss)
    return loss
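
A quick sanity check of what `alpha` does: it multiplies only the term for the positive samples, so with alpha = 1 the loss should be (up to epsilon) the ordinary binary cross entropy, and with a larger alpha every missed patient costs more. The toy predictions and labels below are made up.

```python
import torch

epsilon = 1e-10

def my_loss(output, target, alpha):
    # Same formula as in the cell above
    loss = alpha * target * torch.abs(torch.log(output + epsilon)) \
        + torch.abs((1 - target) * torch.log(1 - output + epsilon))
    return torch.mean(loss)

# Toy predictions and labels (made up)
output = torch.tensor([[0.9], [0.2], [0.6], [0.3]])
target = torch.tensor([[1.0], [0.0], [0.0], [1.0]])

bce = torch.nn.BCELoss()(output, target)
loss_alpha_1 = my_loss(output, target, 1.0)
loss_alpha_30 = my_loss(output, target, 30.0)

print(loss_alpha_1.item(), bce.item())  # these two should nearly coincide
print(loss_alpha_30.item())             # the positive samples now dominate
```

So an alpha above 1 pushes the model toward predicting "patient" more often (higher recall), and an alpha below 1 does the opposite.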

### Redefine the training loop.
##### (I wrap it in a function for quick reuse. Note that it keeps training the same `model` and `optimizer` defined above.)

In [None]:
def run(x, y, x_test, y_test, alpha, epochs):
    # Note: this keeps training the global `model` with the global `optimizer`.
    Losses = []
    # Match the shape of the model output: (n_samples, 1)
    y = torch.reshape(y, (-1, 1))
    model.train()  # enable dropout while training
    for epoch in range(epochs):
        y_pred = model(x)
        loss = my_loss(y_pred, y, alpha)
        # Collect the loss as a plain float so that it can be plotted
        Losses.append(loss.item())
        # Zero gradients, perform a backward pass, and update the weights.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Get the model predictions for the test set (eval mode disables dropout):
    model.eval()
    with torch.no_grad():
        test_pred = model(x_test)

    # Round the test predictions to yes/no (1 or 0).
    y_out = torch.round(test_pred)

    # Match the dimensions of the two vectors:
    y_test = torch.reshape(y_test, (-1, 1))

    plt.plot(Losses)
    plt.title('Training loss trend')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.show()

    Label = y_test.detach().numpy()
    Prediction = y_out.detach().numpy()
    print('Classification report for alpha =', alpha, ': \n', classification_report(Label, Prediction))

In [None]:
run(x_nm, y_nm, x_test, y_test, 0.5, 500)

### Alpha=0.5 compensates for the imbalance of the data and gives greater precision, but the recall is poor.
### Let's see the recall for a higher alpha.

In [None]:
run(x_nm, y_nm, x_test, y_test, 30, 500)

### Adding a 4th layer to the model didn't give me a significant improvement.
### I also tried adding epochs.
#### (PyTorch Lightning has early stopping, but it did not work properly at the time I made this kernel.)
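
Early stopping can also be written by hand in a few lines. Below is a minimal sketch, assuming we stop when the monitored loss has not improved for `patience` consecutive epochs. The class `EarlyStopper` and the loss values are hypothetical and are not part of PyTorch or PyTorch Lightning.

```python
class EarlyStopper:
    """Hypothetical helper: signals a stop when the monitored loss has not
    improved by more than min_delta for `patience` consecutive checks."""

    def __init__(self, patience=20, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float('inf')
        self.bad_checks = 0

    def should_stop(self, loss_value):
        if loss_value < self.best - self.min_delta:
            # Improvement: remember it and reset the counter
            self.best = loss_value
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience

# Made-up loss curve: improves, then stalls
stopper = EarlyStopper(patience=3)
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.60, 0.65]
stopped_at = None
for epoch, value in enumerate(losses):
    if stopper.should_stop(value):
        stopped_at = epoch
        break
print(stopped_at)
```

Inside a training loop one would call `stopper.should_stop(...)` once per epoch, preferably with a loss measured on held-out data rather than the training loss.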

In [None]:
run(x, y, x_test, y_test, 1.4, 1000)

In [None]:
run(x, y, x_test, y_test, 1, 500)

## I'm not sure why the imbalanced loss function gave better results compared to NearMiss.
## Please let me know if you have any insights and/or improvement suggestions.
### *I assume it relates to the {small} size of this dataset.* 
### Thanks for reading,
### Ziv