### **Problem Description :-**
Consider the case of labs in Pune, India. As the number of COVID-19 cases rises day by day, the government has decided to predict how many cases will become critical, so that it can provide more ventilators.

As an AI/ML expert, your task is to predict whether a person will become critical. You are provided with case data such as age, sex, place travelled, immunity level, fever frequency, breathing difficulty level, and blood sugar. Help the government predict the number of critical cases.

**Finding the weights corresponding to the independent variables in the data using gradient descent and a cost function.**
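
Reading off the `compute_cost` and `Optimize_weights_using_gradient_descent` cells of this notebook, the objective and the update rule can be summarized as below. The symbols are introduced here for illustration and are not the notebook's own notation: $\alpha$ stands for `learning_rate`, $\lambda$ for `Lambda`, $M$ for the number of training rows, and $a^{(i)}$ for the predicted probability of row $i$.

```latex
\begin{aligned}
a^{(i)} &= \sigma\!\left(w^{\top} x^{(i)} + b\right), \qquad \sigma(z) = \frac{1}{1 + e^{-z}} \\
J(w, b) &= -\frac{1}{M} \sum_{i=1}^{M} \left[ y^{(i)} \log a^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{(i)}\right) \right] + \frac{\lambda}{2M} \sum_{j} w_j^{2} \\
w &\leftarrow w - \frac{\alpha}{M} \left( \sum_{i=1}^{M} \left(a^{(i)} - y^{(i)}\right) x^{(i)} + \lambda w \right) \\
b &\leftarrow b - \frac{\alpha}{M} \sum_{i=1}^{M} \left(a^{(i)} - y^{(i)}\right)
\end{aligned}
```

The bias $b$ is not regularized, which matches the code: only `W` appears in the penalty term and in the `Lambda*W` part of the update.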

In [1]:
import numpy as np

In [2]:
X = np.genfromtxt("train_X_pr.csv", delimiter=',', dtype=np.float64, skip_header=1)
Y = np.genfromtxt("train_Y_pr.csv", delimiter=',', dtype=np.float64)

In [3]:
def replace_null_values_with_mean(X):
    col_mean = np.nanmean(X, axis=0)
    inds = np.where(np.isnan(X))
    X[inds] = np.take(col_mean, inds[1])
    return X

In [4]:
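def mean_normalize(X):
    for column_index in range(len(X[0])):
        column = X[:, column_index]
        avg = np.mean(column, axis=0)
        # Named col_min/col_max to avoid shadowing the built-in min() and max().
        col_min = np.min(column, axis=0)
        col_max = np.max(column, axis=0)
        X[:, column_index] = (column - avg) / (col_max - col_min)
    return X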

In [5]:
def standardize(X):
    for column_index in range(len(X[0])):
        column = X[:, column_index]
        mean = np.mean(column, axis=0)
        std = np.std(column, axis=0)
        X[:, column_index] = (column - mean) / std
    return X

In [6]:
def min_max_normalize(X):
    for column_index in range(len(X[0])):
        column = X[:, column_index]
        # Named col_min/col_max to avoid shadowing the built-in min() and max().
        col_min = np.min(column, axis=0)
        col_max = np.max(column, axis=0)
        difference = col_max - col_min
        X[:, column_index] = (column - col_min) / difference
    return X

In [7]:
def sigmoid(Z):
    A = 1.0 / (1.0 + np.exp(-Z))
    return A
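
For inputs of large negative value, `np.exp(-Z)` overflows and NumPy emits a runtime warning, which can happen here when the features are left unscaled. The following is a minimal sketch of a variant that avoids the overflow. The name `stable_sigmoid` is hypothetical and not part of the original notebook.

```python
import numpy as np

def stable_sigmoid(Z):
    # Hypothetical helper: exp is only ever evaluated on non-positive
    # values, so it cannot overflow.
    Z = np.asarray(Z, dtype=np.float64)
    exp_neg_abs = np.exp(-np.abs(Z))  # always in [0, 1]
    # For Z >= 0 use 1 / (1 + e^-Z); for Z < 0 use e^Z / (1 + e^Z).
    return np.where(Z >= 0,
                    1.0 / (1.0 + exp_neg_abs),
                    exp_neg_abs / (1.0 + exp_neg_abs))

values = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
```

If SciPy is available, `scipy.special.expit` computes the same function.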

In [8]:
def compute_cost(X, Y, W, b, Lambda):
    M = len(Y)
    Z = np.dot(X, W.T) + b
    A = sigmoid(Z)
    cost = (-1/M) * np.sum(Y * np.log(A) + (1-Y) * np.log(1-A))
    regularization_cost = (Lambda * np.sum(np.square(W))) / (2 * M)
    return cost + regularization_cost

In [9]:
def compute_gradient_of_cost_function(X, Y, W, b):
    Z = np.dot(X, W.T) + b
    A = sigmoid(Z)
    db = np.sum(A - Y)
    dw = np.dot((A - Y).T, X)
    return dw, db
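
One way to gain confidence in the gradient formulas is to compare them with a finite-difference estimate of the cost. The sketch below does this on synthetic data. It is self-contained, so it redefines small helpers (`unregularized_cost`, `analytic_gradient`, both hypothetical names) that mirror the notebook's formulas with `Lambda = 0` and with the gradient divided by the number of rows.

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def unregularized_cost(X, Y, W, b):
    # Mean cross-entropy, i.e. the notebook's cost with Lambda = 0.
    A = sigmoid(np.dot(X, W.T) + b)
    return -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))

def analytic_gradient(X, Y, W, b):
    # Same expressions as compute_gradient_of_cost_function, divided by M
    # so that they are the gradient of the mean cost above.
    A = sigmoid(np.dot(X, W.T) + b)
    M = len(Y)
    return np.dot((A - Y).T, X) / M, np.sum(A - Y) / M

rng = np.random.default_rng(0)  # synthetic data, for illustration only
X = rng.normal(size=(20, 3))
Y = rng.integers(0, 2, size=(20, 1)).astype(np.float64)
W = rng.normal(size=(1, 3))
b = 0.1

dw, db = analytic_gradient(X, Y, W, b)

# Central differences: perturb one parameter at a time by +/- eps.
eps = 1e-6
numeric_dw = np.zeros_like(W)
for j in range(W.shape[1]):
    step = np.zeros_like(W)
    step[0, j] = eps
    numeric_dw[0, j] = (unregularized_cost(X, Y, W + step, b)
                        - unregularized_cost(X, Y, W - step, b)) / (2 * eps)
numeric_db = (unregularized_cost(X, Y, W, b + eps)
              - unregularized_cost(X, Y, W, b - eps)) / (2 * eps)

max_weight_error = np.max(np.abs(dw - numeric_dw))
bias_error = abs(db - numeric_db)
```

Both errors should be tiny if the analytic gradient is consistent with the cost.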

In [10]:
def Optimize_weights_using_gradient_descent(X, Y, learning_rate, Lambda):
    m = len(Y)
    Threshold_value = 0.0000001
    prev_cost, b = 0, 0
    Y = Y.reshape(X.shape[0], 1)
    W = np.zeros((1, X.shape[1]))
    # Iterate until the cost changes by less than Threshold_value.
    while True:
        dw, db = compute_gradient_of_cost_function(X, Y, W, b)
        # L2 regularization is applied to the weights only, not to the bias.
        W = W - (learning_rate * (dw + Lambda*W))/m
        b = b - (learning_rate * db)/m
        cost = compute_cost(X, Y, W, b, Lambda)
        if abs(cost - prev_cost) < Threshold_value:
            break
        prev_cost = cost
    return W, b

In [11]:
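import csv

def save_model(weights, weights_file_name):
    # 'w' overwrites an existing file, so re-running does not append stale rows.
    # The with-block closes the file; no explicit close() is needed.
    with open(weights_file_name, 'w', newline='') as weight_file:
        file_writer = csv.writer(weight_file, delimiter=",")
        file_writer.writerows(weights)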
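
As a rough illustration of how weights saved in this CSV layout can be read back, here is a self-contained round trip. The weight values and the file name `weights_demo.csv` are made up for the example.

```python
import csv
import os
import tempfile

import numpy as np

weights = np.array([[0.5, -1.25, 2.0]])  # made-up values, shape (1, n)

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "weights_demo.csv")
with open(path, "w", newline="") as weight_file:
    csv.writer(weight_file, delimiter=",").writerows(weights)

# ndmin=2 keeps the (1, n) shape even though the file has a single row.
loaded = np.loadtxt(path, delimiter=",", dtype=np.float64, ndmin=2)
```

Each row of the 2-D array becomes one CSV line, so a `(1, n)` weight array is stored as a single line of `n` values.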

In [15]:
X = replace_null_values_with_mean(X)
# With min-max normalization, accuracy = 0.599621118815286
# X = min_max_normalize(X)
# With standardization, accuracy = 0.6547185071844512
# X = standardize(X)
# With mean normalization, accuracy = 0.6569682172544592
# X = mean_normalize(X)
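
The three scaling options differ only in what is subtracted from each column and what it is divided by. A small sketch on a single made-up column shows the difference (the values are for illustration and are not from the dataset):

```python
import numpy as np

column = np.array([1.0, 2.0, 3.0, 6.0])  # made-up feature values

value_range = column.max() - column.min()

# Mean normalization: centre on the mean, divide by the range.
mean_normalized = (column - column.mean()) / value_range
# Standardization: centre on the mean, divide by the standard deviation.
standardized = (column - column.mean()) / column.std()
# Min-max normalization: shift to start at 0, divide by the range.
min_max = (column - column.min()) / value_range
```

Here `mean_normalized` is `[-0.4, -0.2, 0.0, 0.6]` and `min_max` is `[0.0, 0.2, 0.4, 1.0]`, while `standardized` has mean 0 and standard deviation 1.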

**Storing the weights learned from the given data for later prediction :-**

In [23]:
weights, b_value = Optimize_weights_using_gradient_descent(X, Y, learning_rate=0.0001, Lambda=0.1)
weights = np.insert(weights, 0, b_value, axis=1)
weights[0]

array([ 0.26853547,  1.83614455, -0.29501935,  0.00773004,  0.24506314,
        0.11307014, -0.01845501, -0.50970885])

In [21]:
def predict_target_values(test_X, weights):
    b = weights[0]
    weights = weights[1:]
    A = sigmoid(np.dot(test_X, weights) + b)
    Y_prediction = np.where(A >= 0.5, 1, 0)
    return Y_prediction
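
`predict_target_values` expects a flat weight vector whose first entry is the bias, which is the layout produced by `np.insert(weights, 0, b_value, axis=1)` followed by taking row 0. A toy sketch with made-up numbers shows the packing and the resulting predictions:

```python
import numpy as np

W = np.array([[2.0, -1.0]])  # made-up weights, shape (1, n_features)
b = 0.5                      # made-up bias

packed = np.insert(W, 0, b, axis=1)  # shape (1, n_features + 1), bias first
row = packed[0]                      # flat vector: [b, w1, w2]

test_X = np.array([[1.0, 1.0],    # z = 2 - 1 + 0.5 = 1.5
                   [-1.0, 1.0]])  # z = -2 - 1 + 0.5 = -2.5
z = np.dot(test_X, row[1:]) + row[0]
predictions = np.where(1.0 / (1.0 + np.exp(-z)) >= 0.5, 1, 0)
```

The first row has a positive `z` and is labelled 1; the second has a negative `z` and is labelled 0.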

In [25]:
pred_Y = predict_target_values(X, weights[0])

**Using the F1 score to validate the predictions :-**

In [27]:
from sklearn.metrics import f1_score
weighted_f1_score = f1_score(Y, pred_Y, average='weighted')
weighted_f1_score

0.797239313518419
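
With `average='weighted'`, the F1 score is computed per class and then averaged, each class weighted by its share of the true labels (its support). The sketch below reproduces that calculation with NumPy on a tiny made-up example. `weighted_f1` is a hypothetical helper written for illustration, not scikit-learn's implementation.

```python
import numpy as np

def weighted_f1(y_true, y_pred):
    # Hypothetical helper: per-class F1, averaged with weights equal to
    # each class's fraction of the true labels.
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    total = 0.0
    for label in np.unique(y_true):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * np.mean(y_true == label)
    return total

y_true = [1, 1, 1, 0]  # made-up labels
y_pred = [1, 1, 0, 0]  # made-up predictions
score = weighted_f1(y_true, y_pred)
```

In this example class 0 has F1 = 2/3 with weight 1/4 and class 1 has F1 = 0.8 with weight 3/4, so the weighted score is about 0.767.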