# Task 25: Logistic Regression from Scratch



Logistic regression from scratch means building a binary classification model by implementing the
logistic (sigmoid) function and the optimization algorithm by hand. The sigmoid function maps a linear
combination of the features to a probability, and a cost function (here, cross-entropy) measures the
prediction error. Gradient descent then adjusts the model weights step by step to reduce that cost.
Working through these pieces shows the core mechanics of logistic regression that built-in libraries
hide.
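
For reference, the three ingredients named above can be written compactly. This is a sketch in standard notation; the symbols are assumptions chosen to mirror the code in this notebook: $X$ is the feature matrix, $w$ the weight vector, $y$ the 0/1 targets, $m$ the number of samples, and $\alpha$ the learning rate.

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad h = \sigma(Xw)

J(w) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log h_i + (1 - y_i) \log(1 - h_i) \right]

\nabla_w J = \frac{1}{m} X^{\top} (h - y), \qquad w \leftarrow w - \alpha \, \nabla_w J
```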

# Import necessary libraries and dataset

In [35]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
file_path = r'C:\Users\Huawei\Desktop\anemia.csv'
data = pd.read_csv(file_path)



# Define features and target

In [36]:
X = data[['Gender', 'Hemoglobin', 'MCH', 'MCHC', 'MCV']].values
y = data['Result'].values

# Split the data into training and testing sets

In [37]:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Sigmoid function


In [38]:

def sigmoid(z):
    return 1 / (1 + np.exp(-z))
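
With unscaled inputs, `z` can become very large in magnitude, and `np.exp(-z)` may then overflow and emit a runtime warning. One possible guard, shown here as a minimal sketch, is to clip `z` before exponentiating. The name `sigmoid_clipped` and the bound of 500 are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def sigmoid_clipped(z, bound=500.0):
    # Hypothetical variant of sigmoid: clip z so that np.exp never overflows.
    # The bound of 500 is an assumed value; exp(500) is still finite in float64.
    z = np.clip(z, -bound, bound)
    return 1.0 / (1.0 + np.exp(-z))

demo = sigmoid_clipped(np.array([-1000.0, 0.0, 1000.0]))
print(demo)
```

For moderate values of `z` the result is identical to the plain `sigmoid` above; only the extreme tails are affected.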

# Cost function

In [39]:
def compute_cost(X, y, weights):
    m = len(y)
    h = sigmoid(np.dot(X, weights))
    cost = -1/m * (np.dot(y, np.log(h)) + np.dot((1 - y), np.log(1 - h)))
    return cost
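
If a predicted probability reaches exactly 0 or 1, `np.log` returns `-inf` and the cost becomes `inf` or `nan`. A common workaround, sketched below, is to clip the probabilities into a small open interval first. The function name `compute_cost_clipped`, the `eps` value, and the demo arrays are assumptions made for illustration.

```python
import numpy as np

def compute_cost_clipped(X, y, weights, eps=1e-15):
    # Hypothetical variant of compute_cost that avoids log(0).
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-np.dot(X, weights)))
    # Keep probabilities strictly inside (0, 1); eps is an assumed tolerance.
    h = np.clip(h, eps, 1 - eps)
    return -(np.dot(y, np.log(h)) + np.dot(1 - y, np.log(1 - h))) / m

# Made-up demo data, not the anemia dataset.
X_demo = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])
y_demo = np.array([1, 0, 1])

# With all-zero weights every probability is 0.5, so the cost equals log(2).
cost_at_zero = compute_cost_clipped(X_demo, y_demo, np.zeros(2))
print(cost_at_zero)  # approximately 0.693
```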

# Gradient descent

In [46]:
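def gradient_descent(X, y, weights, learning_rate, iterations):
    m = len(y)
    for i in range(iterations):
        # Predicted probabilities for the current weights
        h = sigmoid(np.dot(X, weights))
        # Gradient of the cross-entropy cost with respect to the weights
        gradient = np.dot(X.T, (h - y)) / m
        weights -= learning_rate * gradient
        # Report the cost every 100 iterations
        if i % 100 == 0:
            cost = compute_cost(X, y, weights)
            print(f"Iteration {i}: Cost {cost}")
    return weights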
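
One way to gain confidence in the gradient expression `X.T @ (h - y) / m` is to compare it with a finite-difference approximation of the cost. The sketch below does this on random synthetic data. All helper names and the demo arrays are assumptions for illustration and do not come from the anemia dataset.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(X, y, w):
    # Mean cross-entropy, the same quantity compute_cost returns.
    h = sigmoid(X @ w)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def analytic_gradient(X, y, w):
    # The closed-form gradient used inside gradient_descent.
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def numerical_gradient(X, y, w, step=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = np.zeros_like(w)
    for j in range(len(w)):
        shift = np.zeros_like(w)
        shift[j] = step
        grad[j] = (cost(X, y, w + shift) - cost(X, y, w - shift)) / (2 * step)
    return grad

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = (rng.random(20) > 0.5).astype(float)
w_demo = rng.normal(size=3)

max_diff = np.max(np.abs(analytic_gradient(X_demo, y_demo, w_demo)
                         - numerical_gradient(X_demo, y_demo, w_demo)))
print(max_diff)  # should be very close to zero
```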

# Logistic regression function

In [41]:
def logistic_regression(X_train, y_train, learning_rate=0.01, iterations=1000):
    weights = np.zeros(X_train.shape[1])
    weights = gradient_descent(X_train, y_train, weights, learning_rate, iterations)
    return weights
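
Because the weight vector has one entry per feature column, the model as written has no intercept (bias) term. If an intercept is wanted, one option is to prepend a column of ones to the feature matrix before training and predicting. This is a minimal sketch; `add_intercept` is a hypothetical helper, not part of the original notebook.

```python
import numpy as np

def add_intercept(X):
    # Hypothetical helper: prepend a column of ones so that the first
    # weight acts as the intercept.
    ones = np.ones((X.shape[0], 1))
    return np.hstack([ones, X])

X_demo = np.array([[2.0, 3.0], [4.0, 5.0]])
X_with_bias = add_intercept(X_demo)
print(X_with_bias.shape)  # (2, 3)
```

The same transformation would have to be applied to both the training and the test features so that the weight vector lines up with the columns.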

# Prediction function

In [42]:
def predict(X, weights, threshold=0.5):
    probabilities = sigmoid(np.dot(X, weights))
    return probabilities >= threshold
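
`predict` returns a boolean array. If integer class labels are preferred, the result can be cast with `astype(int)`, as in this small sketch. The name `predict_labels` and the demo values are illustrative assumptions.

```python
import numpy as np

def predict_labels(X, weights, threshold=0.5):
    # Hypothetical variant of predict that returns 0/1 integers.
    probabilities = 1 / (1 + np.exp(-np.dot(X, weights)))
    return (probabilities >= threshold).astype(int)

# Made-up inputs: the two rows give z = 2 and z = -2.
X_demo = np.array([[1.0, 2.0], [1.0, -2.0]])
w_demo = np.array([0.0, 1.0])
labels = predict_labels(X_demo, w_demo)
print(labels)  # [1 0]
```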

# Training

In [43]:
weights = logistic_regression(X_train, y_train)
print("\n")

Iteration 0: Cost 5.0718687456441565
Iteration 100: Cost 2.6206888815654934
Iteration 200: Cost 15.217663043906294
Iteration 300: Cost 4.821840176942691
Iteration 400: Cost 7.926357343821111
Iteration 500: Cost 11.485062401245026
Iteration 600: Cost 1.588197615433473
Iteration 700: Cost 6.179175037535954
Iteration 800: Cost 1.5247849371301763
Iteration 900: Cost 10.832518862343886
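
The cost above jumps up and down instead of decreasing steadily. One plausible cause, offered here as an assumption rather than a diagnosis, is that the features are on very different scales, so a single learning rate overshoots along some directions. A common remedy is to standardize the features using statistics from the training set only. The sketch below uses synthetic numbers, and `standardize` is a hypothetical helper.

```python
import numpy as np

def standardize(X_train, X_test):
    # Hypothetical helper: scale each column to zero mean and unit variance,
    # using the training set's statistics for both splits.
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X_train - mean) / std, (X_test - mean) / std

# Synthetic two-column data on different scales, not the anemia dataset.
rng = np.random.default_rng(0)
X_tr = rng.normal(loc=[14.0, 85.0], scale=[2.0, 9.0], size=(80, 2))
X_te = rng.normal(loc=[14.0, 85.0], scale=[2.0, 9.0], size=(20, 2))

X_tr_s, X_te_s = standardize(X_tr, X_te)
print(X_tr_s.mean(axis=0), X_tr_s.std(axis=0))
```

After scaling, the training columns have mean close to 0 and standard deviation close to 1, which usually makes a fixed learning rate such as 0.01 behave more predictably.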




# Prediction on test set

In [44]:
predictions = predict(X_test, weights)
print("Predictions:", predictions)
print("\n")

Predictions: [ True  True False False]




# Accuracy

In [45]:
# Fraction of correct predictions: compare the model's predictions with the actual values
accuracy = np.mean(predictions == y_test)
print("Accuracy:", accuracy * 100)


Accuracy: 75.0
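
Accuracy alone does not show which kind of error the model makes. A small sketch of counting true and false positives and negatives by hand is given below. The function `confusion_counts` and the demo labels are assumptions for illustration; the demo labels are not the notebook's actual `y_test`.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    # Hypothetical helper: count each outcome type for binary labels.
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))    # predicted positive, actually positive
    tn = int(np.sum(~y_true & ~y_pred))  # predicted negative, actually negative
    fp = int(np.sum(~y_true & y_pred))   # predicted positive, actually negative
    fn = int(np.sum(y_true & ~y_pred))   # predicted negative, actually positive
    return tp, tn, fp, fn

# Made-up labels for demonstration only.
y_true_demo = np.array([1, 0, 0, 1])
y_pred_demo = np.array([True, True, False, False])

tp, tn, fp, fn = confusion_counts(y_true_demo, y_pred_demo)
print(tp, tn, fp, fn)  # 1 1 1 1
```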
