# Logistic Regression with L1 & L2 Regularization

Author: Monasri

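Objective: Understand and compare L1 (Lasso-style) and L2 (Ridge-style) regularization in logistic regression.

Dataset: Breast Cancer Wisconsin (Diagnostic) dataset, loaded with scikit-learn's `load_breast_cancer` (569 samples, 30 features).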

In [9]:
pip install numpy pandas matplotlib seaborn scikit-learn

In [10]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

## Load Dataset

In [11]:
data = load_breast_cancer()
X = data.data
y = data.target

print('Feature Shape:', X.shape)
print('Target Shape:', y.shape)

Feature Shape: (569, 30)
Target Shape: (569,)


## Train-Test Split

In [12]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

## Feature Scaling

In [13]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
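
The scaler above is fitted on the training split only and then reused on the test split, which keeps test statistics out of preprocessing. A minimal sketch of the same idea under cross-validation, assuming a pipeline is acceptable here (`make_pipeline`, `cross_val_score`, and the 5-fold setting are not part of the original notebook):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# The scaler sits inside the pipeline, so in every fold it is fitted
# on that fold's training part only and then applied to the held-out part.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation is an illustrative choice, not the notebook's setup.
scores = cross_val_score(pipe, X, y, cv=5)
print('Fold accuracies:', scores)
print('Mean CV accuracy:', scores.mean())
```

Bundling the scaler with the model this way removes the need to call `fit_transform` and `transform` by hand for each split.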

## L2 Regularization (Ridge)

In [14]:
model_l2 = LogisticRegression(penalty='l2', solver='lbfgs', max_iter=1000)
model_l2.fit(X_train, y_train)

y_pred_l2 = model_l2.predict(X_test)
print('L2 Accuracy:', accuracy_score(y_test, y_pred_l2))
print(classification_report(y_test, y_pred_l2))

L2 Accuracy: 0.9736842105263158
              precision    recall  f1-score   support

           0       0.98      0.95      0.96        43
           1       0.97      0.99      0.98        71

    accuracy                           0.97       114
   macro avg       0.97      0.97      0.97       114
weighted avg       0.97      0.97      0.97       114





## L1 Regularization (Lasso)

In [15]:
model_l1 = LogisticRegression(penalty='l1', solver='liblinear', max_iter=1000)
model_l1.fit(X_train, y_train)

y_pred_l1 = model_l1.predict(X_test)
print('L1 Accuracy:', accuracy_score(y_test, y_pred_l1))
print(classification_report(y_test, y_pred_l1))

L1 Accuracy: 0.9736842105263158
              precision    recall  f1-score   support

           0       0.95      0.98      0.97        43
           1       0.99      0.97      0.98        71

    accuracy                           0.97       114
   macro avg       0.97      0.97      0.97       114
weighted avg       0.97      0.97      0.97       114
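
Both models report the same accuracy, but the per-class recall values differ, so they do not make the same mistakes. A sketch that prints the confusion matrix for each model, using the `confusion_matrix` function already imported above; the `models` and `matrices` dictionaries are names introduced here for illustration, and the setup is repeated so the cell runs on its own:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Same data preparation as in the cells above.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

models = {
    'L2': LogisticRegression(penalty='l2', solver='lbfgs', max_iter=1000),
    'L1': LogisticRegression(penalty='l1', solver='liblinear', max_iter=1000),
}

matrices = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Rows are true classes, columns are predicted classes (label order 0, 1).
    matrices[name] = confusion_matrix(y_test, model.predict(X_test))
    print(name)
    print(matrices[name])
```

In this dataset label 0 is malignant and label 1 is benign, so the first row of each matrix shows how many malignant cases were caught or missed.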





## Compare Coefficients (Feature Selection Effect)

In [16]:
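# Absolute coefficient values; coef_ has shape (1, 30) for this binary problem
coef_l1 = np.abs(model_l1.coef_)
coef_l2 = np.abs(model_l2.coef_)

# Coefficients with magnitude at or below 1e-5 are counted as zero
print('Number of non-zero coefficients in L1:', np.sum(coef_l1 > 1e-5))
print('Number of non-zero coefficients in L2:', np.sum(coef_l2 > 1e-5))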

Number of non-zero coefficients in L1: 14
Number of non-zero coefficients in L2: 30
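
How many coefficients L1 zeroes out depends on the regularization strength. A rough sketch, assuming the same split and scaling as above, that refits the L1 model for a few values of `C` (the inverse regularization strength) and counts the surviving coefficients; the grid of `C` values and the `nonzero_counts` name are illustrative choices, not part of the original notebook:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
X_train = StandardScaler().fit_transform(X_train)

nonzero_counts = {}
for C in [0.01, 0.1, 1.0, 10.0]:
    # Smaller C means a stronger penalty.
    model = LogisticRegression(penalty='l1', solver='liblinear', C=C, max_iter=1000)
    model.fit(X_train, y_train)
    nonzero_counts[C] = int(np.count_nonzero(model.coef_))
    print(f'C={C}: {nonzero_counts[C]} non-zero coefficients')
```

The counts at `C=1.0` may not match the 14 reported above exactly, since this sketch counts exact zeros rather than applying the `1e-5` threshold.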


### Observation

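- L1 regularization performs feature selection by driving some coefficients to exactly zero: here only 14 of the 30 coefficients remain non-zero.
- L2 regularization shrinks coefficients toward zero but rarely makes them exactly zero: all 30 remain non-zero.
- Both models reach the same test accuracy (0.974, i.e. 111 of 114 test samples), so on this split the L1 model matches the L2 model while using fewer than half of the features.
- Both penalties help reduce overfitting; their strength is set by `C` (the inverse regularization strength), which is left at its default of 1.0 here.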
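
The difference between the two penalties can be made concrete with the objective each model minimizes. The form below is one common way to write it, in the style of the scikit-learn user guide; the symbols $w$, $b$, $x_i$, $y_i \in \{0, 1\}$ and $\hat{p}_i$ are notation introduced here, and the exact scaling used inside the library may differ:

$$
\min_{w, b} \; C \sum_{i=1}^{n} \left[ -y_i \log \hat{p}_i - (1 - y_i) \log(1 - \hat{p}_i) \right] + r(w),
\qquad \hat{p}_i = \frac{1}{1 + e^{-(w^\top x_i + b)}}
$$

$$
r(w) = \lVert w \rVert_1 = \sum_j |w_j| \quad \text{(L1)},
\qquad
r(w) = \tfrac{1}{2} \lVert w \rVert_2^2 = \tfrac{1}{2} \sum_j w_j^2 \quad \text{(L2)}
$$

The L1 term pulls on every coefficient with the same magnitude no matter how small it already is, which can hold a weak coefficient at exactly zero. The gradient of the L2 term is $w_j$ itself, so the pull fades as a coefficient approaches zero and it seldom reaches it. A smaller $C$ gives the penalty more weight relative to the data-fit term.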