Breast Cancer Classification using Logistic Regression

1. Introduction
Breast cancer is one of the most common cancers affecting women worldwide, and early detection and diagnosis play a crucial role in treatment success. This project trains and tests a logistic regression model on the Breast Cancer dataset bundled with scikit-learn to classify tumors as either malignant or benign.

2. Objective
The main objectives of this project are to:

Develop a logistic regression model that accurately classifies breast cancer tumors.
Evaluate the model's performance using relevant metrics.

3. Dataset Description
Dataset: Breast Cancer Wisconsin (Diagnostic) dataset, loaded with scikit-learn's load_breast_cancer.

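Features: The dataset contains 30 numeric features describing cell nuclei in digitized images of fine needle aspirates of breast masses. Ten characteristics (radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry, and fractal dimension) are each reported as a mean, a standard error, and a worst (largest) value.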

Target: The target variable indicates the diagnosis of the tumor:

0: Malignant (212 samples)
1: Benign (357 samples)

Number of Samples: 569
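
As a quick check of the dataset description, the following minimal sketch loads the data and prints its shape and class balance. The variable name `class_counts` is introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data.data, data.target

# Shape is (number of samples, number of features)
print("Shape:", X.shape)

# Map each integer label to its class name and count its samples
class_counts = {
    str(name): int(np.sum(y == label))
    for label, name in enumerate(data.target_names)
}
print("Class counts:", class_counts)

# First few feature names, to see the naming pattern
print("Example features:", list(data.feature_names[:5]))
```

The classes are moderately imbalanced, which is one reason to report precision and recall alongside accuracy.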

4. Methodology
Data Preprocessing:

Load the dataset using scikit-learn.
Split the data into training and testing sets.
Standardize the features, fitting the scaler on the training set only, so that all features are on a comparable scale.

Model Building:

Use logistic regression as the classification model.
Train the model on the training dataset.

Model Evaluation:

Evaluate the model using accuracy, confusion matrix, precision, recall, and F1-score.
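
The steps above can also be sketched as a single scikit-learn pipeline. This is an alternative arrangement, not the layout this notebook uses: `make_pipeline` chains the scaler and the classifier so that the scaler is always fitted on the training data only. The split parameters (`test_size=0.2`, `random_state=100`) are the ones used in this notebook; the names `pipeline` and `test_accuracy` are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load features and labels directly as arrays
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=100
)

# Scaling and classification in one estimator
pipeline = make_pipeline(StandardScaler(), LogisticRegression())
pipeline.fit(X_train, y_train)

# score() returns the mean accuracy on the given data
test_accuracy = pipeline.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")
```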

In [5]:
# Importing necessary libraries
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import pandas as pd

In [6]:
# Load the breast cancer dataset
data = load_breast_cancer()
X = data.data
y = data.target

In [15]:
# Split the data into training set and testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=100)


In [17]:
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
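
To illustrate what the standardization step does, the sketch below checks the per-feature mean and standard deviation after scaling. It assumes the same split as above; the names `X_train_scaled` and `X_test_scaled` are introduced here so the raw arrays are not overwritten.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=100
)

scaler = StandardScaler()
# Learn each feature's mean and standard deviation from the training set only
X_train_scaled = scaler.fit_transform(X_train)
# Reuse the training statistics on the test set to avoid information leakage
X_test_scaled = scaler.transform(X_test)

# Training features end up with mean close to 0 and standard deviation close to 1
print("Max |mean| on train:", np.abs(X_train_scaled.mean(axis=0)).max())
print("Max |std - 1| on train:", np.abs(X_train_scaled.std(axis=0) - 1).max())

# Test features are only approximately centered, because they were
# scaled with statistics estimated from the training set
print("Max |mean| on test:", np.abs(X_test_scaled.mean(axis=0)).max())
```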

In [18]:
# Build and train the logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

In [19]:
# Make predictions
y_pred = model.predict(X_test)
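
Logistic regression produces class probabilities, and `predict` returns the more probable class. The following self-contained sketch, which repeats the notebook's preprocessing under assumed variable names (`X_train_scaled`, `proba`, `labels_from_proba`), shows that relationship with `predict_proba`.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=100
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression()
model.fit(X_train_scaled, y_train)

# One row per sample; columns follow model.classes_ (0 = malignant, 1 = benign)
proba = model.predict_proba(X_test_scaled)

# Picking the most probable class reproduces model.predict
labels_from_proba = model.classes_[np.argmax(proba, axis=1)]
y_pred = model.predict(X_test_scaled)

print("Classes:", model.classes_)
print("Probabilities for the first sample:", proba[0])
print("Matches predict():", np.array_equal(labels_from_proba, y_pred))
```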

In [20]:
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
report = classification_report(y_test, y_pred)

In [21]:
print(f'Accuracy: {accuracy:.2f}')
print('Confusion Matrix:\n', conf_matrix)
print('Classification Report:\n', report)

Accuracy: 0.97
Confusion Matrix:
 [[46  3]
 [ 0 65]]
Classification Report:
               precision    recall  f1-score   support

           0       1.00      0.94      0.97        49
           1       0.96      1.00      0.98        65

    accuracy                           0.97       114
   macro avg       0.98      0.97      0.97       114
weighted avg       0.97      0.97      0.97       114
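
The figures in the classification report follow directly from the confusion matrix. As a worked check, the sketch below recomputes them from the matrix printed above, using scikit-learn's convention that rows are true labels and columns are predicted labels. The variable names are illustrative.

```python
import numpy as np

# Confusion matrix reported above: rows = true class, columns = predicted class
cm = np.array([[46, 3],
               [0, 65]])

correct = np.diag(cm)  # correctly classified samples per class

# Precision: correct predictions divided by all predictions of that class (column sums)
precision = correct / cm.sum(axis=0)
# Recall: correct predictions divided by all true samples of that class (row sums)
recall = correct / cm.sum(axis=1)
# F1-score: harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
# Accuracy: all correct predictions divided by all samples
accuracy = correct.sum() / cm.sum()

for label in range(2):
    print(f"class {label}: precision={precision[label]:.2f}, "
          f"recall={recall[label]:.2f}, f1={f1[label]:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Read this way, the three errors are malignant tumors (class 0) predicted as benign, while no benign tumor was predicted as malignant.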

