In [None]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import pandas as pd

# Load dataset (target encoding: 0 = malignant, 1 = benign)
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target)

# Split into training and testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train logistic regression model
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)

# Predictions
y_pred = model.predict(X_test)

# Evaluation
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))


Accuracy: 0.956140350877193
Confusion Matrix:
 [[39  4]
 [ 1 70]]
Classification Report:
               precision    recall  f1-score   support

           0       0.97      0.91      0.94        43
           1       0.95      0.99      0.97        71

    accuracy                           0.96       114
   macro avg       0.96      0.95      0.95       114
weighted avg       0.96      0.96      0.96       114



In [None]:


# ---

# ### **1. Accuracy: `0.9561`**

# * This means **95.61%** of the predictions made by your logistic regression model were correct.
# * Out of 114 test cases, exactly **109** (39 + 70, the diagonal of the confusion matrix) were predicted correctly.
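
# As a quick check, the accuracy can be recomputed by hand. This is a minimal sketch that
# assumes the four counts are copied from the confusion matrix printed above; the variable
# names are illustrative only.

```python
# Counts copied from the printed confusion matrix [[39, 4], [1, 70]].
correct = 39 + 70            # diagonal entries: correct predictions
total = 39 + 4 + 1 + 70      # every test sample

accuracy = correct / total
print(correct, total, round(accuracy, 4))  # → 109 114 0.9561
```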

# ---

# ### **2. Confusion Matrix**

# ```
# [[39  4]
#  [ 1 70]]
# ```

# Here's how to read it:

# |              | Predicted 0 | Predicted 1 |
# | ------------ | ----------- | ----------- |
# | **Actual 0** | 39          | 4           |
# | **Actual 1** | 1           | 70          |

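# In this dataset `data.target_names` is `['malignant', 'benign']`, so class 0 is **malignant** (cancer) and class 1 is **benign** (no cancer). With the usual convention that label 1 is the "positive" class, "positive" here means benign:

# * **True Negatives (TN)** = 39 → Model correctly predicted 0 (malignant)
# * **False Positives (FP)** = 4 → Model predicted 1 (benign) when the tumor was actually 0 (malignant); these are the missed cancers
# * **False Negatives (FN)** = 1 → Model predicted 0 (malignant) when the tumor was actually 1 (benign)
# * **True Positives (TP)** = 70 → Model correctly predicted 1 (benign)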
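
# The four cells can be unpacked by position. A minimal sketch, assuming the matrix is
# hard-coded from the output above (rows = actual class, columns = predicted class).
# With scikit-learn, `confusion_matrix(y_test, y_pred).ravel()` yields the same
# `tn, fp, fn, tp` order for a binary problem.

```python
# Confusion matrix copied from the output: rows are actual, columns are predicted.
cm = [[39, 4],
      [1, 70]]

# Row 0 = actual class 0, row 1 = actual class 1 (label 1 treated as "positive").
(tn, fp), (fn, tp) = cm

errors = fp + fn  # off-diagonal entries are the misclassifications
print(tn, fp, fn, tp, errors)  # → 39 4 1 70 5
```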

# ---

# ### **3. Classification Report**

# This shows **precision**, **recall**, and **f1-score** for each class (0 and 1):

# | Label             | Precision | Recall | F1-score | Support |
# | ----------------- | --------- | ------ | -------- | ------- |
# | **0** (malignant) | 0.97      | 0.91   | 0.94     | 43      |
# | **1** (benign)    | 0.95      | 0.99   | 0.97     | 71      |

# #### What these mean:

# * **Precision**: Of all samples predicted as a given class, what fraction actually belong to it?

#   * Class 0: 97% of the time, when the model said "malignant", it was right (39 of 40).
#   * Class 1: 95% of the time, when the model said "benign", it was right (70 of 74).

# * **Recall**: Of all samples that actually belong to a given class, what fraction did the model catch?

#   * Class 0: 91% of actual malignant cases were correctly identified (39 of 43).
#   * Class 1: 99% of actual benign cases were correctly identified (70 of 71).

# * **F1-score**: Harmonic mean of precision and recall; it is high only when both are high.

# * **Support**: Number of samples of each class in the test data
#   (43 malignant, 71 benign)
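
# The per-class definitions above can be applied directly to the confusion matrix.
# A minimal sketch, assuming the matrix is hard-coded from the output; `per_class_metrics`
# is a hypothetical helper written for this note, not a scikit-learn function.

```python
# Confusion matrix copied from the output: rows are actual, columns are predicted.
cm = [[39, 4],
      [1, 70]]

def per_class_metrics(cm, k):
    """Return (precision, recall, f1, support) for class k."""
    hits = cm[k][k]                          # samples of class k predicted as k
    predicted_k = sum(row[k] for row in cm)  # column sum: everything predicted as k
    actual_k = sum(cm[k])                    # row sum: everything that really is k
    precision = hits / predicted_k
    recall = hits / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1, actual_k

for k, name in enumerate(["malignant", "benign"]):
    p, r, f1, support = per_class_metrics(cm, k)
    print(f"class {k} ({name}): precision={p:.4f} recall={r:.4f} f1={f1:.4f} support={support}")
```

# The printed values can be compared against the classification report, which shows the
# same quantities to two decimals.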

# ---

# ### **4. Averages at the bottom**

# | Metric Type   | Precision | Recall | F1-score |
# | ------------- | --------- | ------ | -------- |
# | **Macro avg** | 0.96      | 0.95   | 0.95     |
# | **Weighted**  | 0.96      | 0.96   | 0.96     |

# * **Macro avg** = Unweighted mean of the per-class scores (treats both classes equally)
# * **Weighted avg** = Mean of the per-class scores weighted by support, so class 1 (71 samples) counts more than class 0 (43 samples)
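
# The difference between the two averages is easiest to see on recall. A minimal sketch,
# assuming the per-class recalls and supports are taken from the confusion matrix above;
# `macro_avg` and `weighted_avg` are hypothetical helper names.

```python
# Per-class values derived from the confusion matrix [[39, 4], [1, 70]].
support = [43, 71]            # class 0, class 1
recall = [39 / 43, 70 / 71]   # per-class recall

def macro_avg(scores):
    """Plain mean: every class counts equally."""
    return sum(scores) / len(scores)

def weighted_avg(scores, weights):
    """Mean weighted by support: larger classes count more."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

print(round(macro_avg(recall), 2))              # → 0.95
print(round(weighted_avg(recall, support), 2))  # → 0.96
```

# Note that the support-weighted recall works out to 109 / 114, the same value as the accuracy.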

# ---

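# ### Summary:

# Your model is **performing well** on this test split:

# * High **accuracy (95.6%)**: 109 of 114 test samples classified correctly
# * Almost perfect **recall (99%)** for benign tumors (class 1)
# * Lower **recall (91%)** for malignant tumors (class 0): 4 of 43 cancers were predicted as benign, which is the more costly kind of error
# * High **precision** for both classes (0.97 for malignant, 0.95 for benign)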


