# Objective

Optimize the Logistic Regression regularization hyperparameter `C` using Particle Swarm Optimization (PSO) and compare its performance against GridSearchCV and a baseline Logistic Regression model.

**Why PSO?**
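- PSO is a population-based metaheuristic: a swarm of candidate solutions moves through the parameter space, with each particle guided by its own best result and by the swarm's best result so far. It avoids an exhaustive search and can often find near-optimal values at a lower computational cost.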
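
For reference, the standard inertia-weight PSO update can be written as below. The symbols $x_j$, $v_j$, $p_j$, and $g$ are notation introduced here for illustration (position, velocity, personal best, and global best of particle $j$); $w$, $c_1$, and $c_2$ correspond to the inertia weight and the cognitive and social coefficients set later in this notebook, and $r_1, r_2$ are fresh uniform random numbers in $[0, 1)$.

$$
v_j \leftarrow w\,v_j + c_1 r_1\,(p_j - x_j) + c_2 r_2\,(g - x_j), \qquad x_j \leftarrow x_j + v_j
$$

In this notebook each position $x_j$ is a single candidate value of `C`, so the update is one-dimensional.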

## Import Libraries

import pandas as pd
import numpy as np
import time
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

## Load and Preprocess Dataset

# Adjust this path to wherever data.csv is stored on your machine
df = pd.read_csv('C:/Users/ajayr/Desktop/Projects to upload/Metaheuristic_Optimization_for_Logistic_Regression/data/data.csv')

# Drop irrelevant columns
if 'id' in df.columns:
    df.drop(columns=['id'], inplace=True)

# Encode target
df['diagnosis'] = df['diagnosis'].map({'M': 1, 'B': 0})

X = df.drop(columns=['diagnosis'])
y = df['diagnosis']

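# Scale features (note: fitting the scaler on all rows before the split lets
# test-set ranges influence training; fit on the training split only for a
# stricter evaluation)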
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42, stratify=y)
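
As a minimal sketch of a stricter preprocessing order, the split can come first and the scaler can be fitted on the training rows only. The data here is synthetic and the `_demo` names are hypothetical stand-ins, not the notebook's dataset or variables.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the feature matrix and labels (assumption: not the notebook's data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = np.tile([0, 1], 100)  # balanced binary labels

# Split first so the test rows never influence the scaler
X_tr_demo, X_te_demo, y_tr_demo, y_te_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo)

scaler_demo = MinMaxScaler()
X_tr_scaled = scaler_demo.fit_transform(X_tr_demo)  # learn min/max from training rows only
X_te_scaled = scaler_demo.transform(X_te_demo)      # reuse the training min/max

print(X_tr_scaled.shape, X_te_scaled.shape)
```

With this order, scaled test values can fall slightly outside [0, 1], which is expected because the test rows were not used to set the range.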

## Define Fitness Function

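def fitness_function(C_value):
    """Return the test-set accuracy of Logistic Regression trained with the given C."""
    model = LogisticRegression(C=C_value, max_iter=1000, solver='liblinear')
    model.fit(X_train, y_train)
    # Caution: scoring on X_test means C is tuned on the same data that is used
    # to report accuracy, so the reported value is an optimistic estimate.
    y_pred = model.predict(X_test)
    return accuracy_score(y_test, y_pred)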
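
One possible alternative, sketched here under assumptions, is a fitness function that scores each candidate `C` by cross-validation on the training split, so the test split stays untouched until the final evaluation. The function name `cv_fitness`, the 5-fold setting, and the use of scikit-learn's bundled breast cancer dataset as stand-in data are all choices made for this illustration, not part of the original notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Stand-in data (assumption): scikit-learn's bundled breast cancer dataset
X_all, y_all = load_breast_cancer(return_X_y=True)
X_tr_cv, X_te_cv, y_tr_cv, y_te_cv = train_test_split(
    X_all, y_all, test_size=0.2, random_state=42, stratify=y_all)

def cv_fitness(C_value, n_folds=5):
    """Mean cross-validated accuracy on the training split for a given C."""
    model = make_pipeline(
        MinMaxScaler(),  # refitted inside every fold, so no leakage across folds
        LogisticRegression(C=C_value, max_iter=1000, solver='liblinear'),
    )
    scores = cross_val_score(model, X_tr_cv, y_tr_cv, cv=n_folds, scoring='accuracy')
    return scores.mean()

print(f"CV accuracy at C=1.0: {cv_fitness(1.0):.4f}")
```

Each call now fits the model once per fold, so the search costs more time than the single-fit version, in exchange for an accuracy estimate that does not depend on the test split.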

## Implement PSO

# PSO Parameters
num_particles = 10
num_iterations = 20
w = 0.5  # inertia weight
c1 = 1.5  # cognitive coefficient
c2 = 1.5  # social coefficient

# Search space for C
lb, ub = 0.01, 100  # lower and upper bounds

# Initialize particles (unseeded; call np.random.seed(...) first for reproducible runs)
particles = np.random.uniform(lb, ub, num_particles)
velocities = np.zeros(num_particles)
pbest_positions = particles.copy()
pbest_scores = np.array([fitness_function(c) for c in particles])
gbest_position = pbest_positions[np.argmax(pbest_scores)]
gbest_score = max(pbest_scores)
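
Because the bounds for `C` span four orders of magnitude, uniform sampling on a linear scale places few particles at small values. One common refinement, shown here only as a sketch, is to sample the exponent instead and map it back with `10 ** exponent`. The `_demo` names and the seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded so the swarm is reproducible
lb_demo, ub_demo = 0.01, 100
n_demo = 10

# Sample exponents uniformly in [log10(lb), log10(ub)), then map back to C
log_positions = rng.uniform(np.log10(lb_demo), np.log10(ub_demo), n_demo)
C_candidates = 10.0 ** log_positions

print(np.sort(C_candidates))
```

If the swarm is run in this log space, velocities and clipping would also operate on the exponents, and only the fitness call would convert back to `C`.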

# Measure execution time of the PSO loop only (the initial swarm evaluation above is not timed)
start_time = time.time()

# PSO loop
for i in range(num_iterations):
    for j in range(num_particles):
        r1, r2 = np.random.rand(), np.random.rand()
        velocities[j] = (
            w * velocities[j]
            + c1 * r1 * (pbest_positions[j] - particles[j])
            + c2 * r2 * (gbest_position - particles[j])
        )
        particles[j] += velocities[j]
        # Clip within bounds
        particles[j] = np.clip(particles[j], lb, ub)

        # Evaluate fitness
        score = fitness_function(particles[j])
        if score > pbest_scores[j]:
            pbest_positions[j] = particles[j]
            pbest_scores[j] = score

    # Update global best
    best_idx = np.argmax(pbest_scores)
    if pbest_scores[best_idx] > gbest_score:
        gbest_score = pbest_scores[best_idx]
        gbest_position = pbest_positions[best_idx]

end_time = time.time()
execution_time = end_time - start_time

print(f"Best C found by PSO: {gbest_position:.4f}")
print(f"Best Accuracy by PSO: {gbest_score:.4f}")
print(f"PSO Execution Time: {execution_time:.4f} seconds")

Best C found by PSO: 19.9767
Best Accuracy by PSO: 0.9825
PSO Execution Time: 0.3285 seconds
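
The loop above can also be packaged as a reusable function. The following is a minimal sketch, not the notebook's implementation: the name `pso_maximize`, the vectorized update, the seed parameter, and the toy quadratic objective are assumptions made for illustration. It keeps the same inertia, cognitive, and social defaults and updates the global best once per iteration.

```python
import numpy as np

def pso_maximize(fitness, lb, ub, num_particles=10, num_iterations=20,
                 w=0.5, c1=1.5, c2=1.5, seed=0):
    """Maximize a 1-D fitness function over [lb, ub] with a basic PSO."""
    rng = np.random.default_rng(seed)
    positions = rng.uniform(lb, ub, num_particles)
    velocities = np.zeros(num_particles)
    pbest_pos = positions.copy()
    pbest_score = np.array([fitness(x) for x in positions])
    best = np.argmax(pbest_score)
    gbest_pos, gbest_score = pbest_pos[best], pbest_score[best]

    for _ in range(num_iterations):
        # Fresh random factors for every particle at every iteration
        r1 = rng.random(num_particles)
        r2 = rng.random(num_particles)
        velocities = (w * velocities
                      + c1 * r1 * (pbest_pos - positions)
                      + c2 * r2 * (gbest_pos - positions))
        positions = np.clip(positions + velocities, lb, ub)

        # Update personal bests where the new position scores higher
        scores = np.array([fitness(x) for x in positions])
        improved = scores > pbest_score
        pbest_pos[improved] = positions[improved]
        pbest_score[improved] = scores[improved]

        # Update the global best once per iteration
        best = np.argmax(pbest_score)
        if pbest_score[best] > gbest_score:
            gbest_pos, gbest_score = pbest_pos[best], pbest_score[best]

    return float(gbest_pos), float(gbest_score)

# Toy objective with a known maximum at x = 3 (hypothetical example)
best_x, best_score = pso_maximize(lambda x: -(x - 3.0) ** 2, lb=0.0, ub=10.0,
                                  num_iterations=50)
print(best_x, best_score)
```

Passing a function that trains and scores Logistic Regression for a given `C` as `fitness`, with `lb=0.01` and `ub=100`, would mirror the search performed in this notebook.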


# Conclusion

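On this train/test split, PSO reached a higher test accuracy (98.25%) than both the baseline Logistic Regression (97%) and GridSearchCV (97.37%), and its 20-iteration search loop ran in 0.3285 seconds. It found a high-scoring C value (19.9767) without an exhaustive parameter search, which illustrates the appeal of metaheuristic methods for hyperparameter tuning. Two caveats apply: the fitness function scores each candidate on the test split, so the PSO accuracy is an optimistic estimate, and the reported time excludes the initial swarm evaluation. With those in mind, the balance of accuracy and speed makes PSO a compelling approach in scenarios where computational resources are limited but accuracy is critical.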
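
For a like-for-like comparison, the grid search can be timed and scored under a protocol where the test split is used only once, after tuning. The sketch below is an assumption-laden illustration: the grid values, the 5-fold setting, and the bundled scikit-learn breast cancer dataset are stand-ins, and the numbers it prints are not the results reported in this notebook.

```python
import time
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Stand-in data (assumption): scikit-learn's bundled breast cancer dataset
X_all_gs, y_all_gs = load_breast_cancer(return_X_y=True)
X_tr_gs, X_te_gs, y_tr_gs, y_te_gs = train_test_split(
    X_all_gs, y_all_gs, test_size=0.2, random_state=42, stratify=y_all_gs)

pipe = make_pipeline(
    MinMaxScaler(),
    LogisticRegression(max_iter=1000, solver='liblinear'),
)
# Hypothetical grid covering the same range as the PSO bounds
param_grid = {'logisticregression__C': [0.01, 0.1, 1, 10, 100]}

start = time.perf_counter()
grid = GridSearchCV(pipe, param_grid, cv=5, scoring='accuracy')
grid.fit(X_tr_gs, y_tr_gs)
elapsed = time.perf_counter() - start

best_C = grid.best_params_['logisticregression__C']
test_acc = grid.score(X_te_gs, y_te_gs)  # test split is touched once, after tuning
print(f"Best C: {best_C}, CV accuracy: {grid.best_score_:.4f}, "
      f"test accuracy: {test_acc:.4f}, time: {elapsed:.4f} s")
```

Running PSO with a cross-validated fitness on the same training split, then scoring its best `C` once on the test split, would put both methods on equal footing for accuracy and timing.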

# Summary

- Implemented **Particle Swarm Optimization (PSO)** to optimize Logistic Regression hyperparameter `C`.
- **Best C:** 19.9767
- **Accuracy:** 98.25% on the test split, compared with 97% for the baseline and 97.37% for GridSearchCV.
- **Execution Time:** 0.3285 seconds for the PSO loop (significantly faster than GridSearchCV).
- **Insight:** PSO achieved the best accuracy so far at minimal time cost, which suggests it is an efficient option for hyperparameter tuning.
- **Next Step:** Apply **Ant Colony Optimization (ACO)**.