# Breast Cancer Detection Project

This notebook uses Python-based machine learning libraries to build a model that predicts whether a patient has breast cancer from their medical attributes. Several models are evaluated and then improved through hyperparameter tuning.

## Problem

> Given clinical parameters about a patient, can we predict whether they have breast cancer or not?

## Data

The data is from Kaggle: https://www.kaggle.com/datasets/utkarshx27/breast-cancer-wisconsin-diagnostic-dataset

## Target

Tune the model and evaluate it using the following techniques and metrics:
* Hyperparameter Tuning
* ROC Curve + Area under the curve (AUC)
* Confusion Matrix
* Cross-validation
* Precision
* Recall
* F1 Score
* Classification Report
* Feature Importance

> Aiming to reach 95% accuracy when predicting whether a patient has breast cancer or not.
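
As a reminder of how several of the metrics above relate to the confusion matrix, here is a minimal sketch in plain Python. The function name `classification_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not results from this notebook. scikit-learn's `precision_score`, `recall_score` and `f1_score` compute the same quantities directly from label arrays.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative and
    true-negative counts, with "positive" meaning "has breast cancer".
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from this notebook).
metrics = classification_metrics(tp=40, fp=2, fn=3, tn=69)

# Compare against the 95% accuracy goal stated above.
meets_target = metrics["accuracy"] >= 0.95
```

For a screening problem like this one, recall deserves as much attention as accuracy, because a false negative means a missed cancer diagnosis.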

## Features

* x.radius_mean - Mean radius of the tumor cells
* x.texture_mean - Mean texture of the tumor cells
* x.perimeter_mean - Mean perimeter of the tumor cells
* x.area_mean - Mean area of the tumor cells
* x.smoothness_mean - Mean smoothness of the tumor cells
* x.compactness_mean - Mean compactness of the tumor cells
* x.concavity_mean - Mean concavity of the tumor cells
* x.concave_points_mean - Mean number of concave portions of the contour of the tumor cells
* x.symmetry_mean - Mean symmetry of the tumor cells
* x.fractal_dimension_mean - Mean "coastline approximation" of the tumor cells
* x.radius_se - Standard error of the radius of the tumor cells
* x.texture_se - Standard error of the texture of the tumor cells
* x.perimeter_se - Standard error of the perimeter of the tumor cells
* x.area_se - Standard error of the area of the tumor cells
* x.smoothness_se - Standard error of the smoothness of the tumor cells
* x.compactness_se - Standard error of the compactness of the tumor cells
* x.concavity_se - Standard error of the concavity of the tumor cells
* x.concave_points_se - Standard error of the number of concave portions of the contour of the tumor cells
* x.symmetry_se - Standard error of the symmetry of the tumor cells
* x.fractal_dimension_se - Standard error of the "coastline approximation" of the tumor cells
* x.radius_worst - Worst (largest) radius of the tumor cells
* x.texture_worst - Worst (most severe) texture of the tumor cells
* x.perimeter_worst - Worst (largest) perimeter of the tumor cells
* x.area_worst - Worst (largest) area of the tumor cells
* x.smoothness_worst - Worst (most severe) smoothness of the tumor cells
* x.compactness_worst - Worst (most severe) compactness of the tumor cells
* x.concavity_worst - Worst (most severe) concavity of the tumor cells
* x.concave_points_worst - Worst (most severe) number of concave portions of the contour of the tumor cells
* x.symmetry_worst - Worst (most severe) symmetry of the tumor cells
* x.fractal_dimension_worst - Worst (most severe) "coastline approximation" of the tumor cells
* y - Target (the diagnosis to predict)
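
The feature names above follow a regular pattern: ten base measurements, each reported as a mean, a standard error, and a worst value. The sketch below rebuilds the 30 names from that pattern so they can be checked against a loaded table. The names `BASE_MEASUREMENTS`, `STATISTICS`, `feature_columns` and `missing_columns` are hypothetical helpers introduced here, and the sketch assumes the CSV columns are spelled exactly as listed above.

```python
# Ten base measurements, in the order used in the feature list above.
BASE_MEASUREMENTS = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry",
    "fractal_dimension",
]

# Each measurement is summarised three ways.
STATISTICS = ["mean", "se", "worst"]

# All "mean" columns first, then "se", then "worst", matching the list above.
feature_columns = [
    f"x.{measurement}_{statistic}"
    for statistic in STATISTICS
    for measurement in BASE_MEASUREMENTS
]
target_column = "y"


def missing_columns(columns):
    """Return the expected column names that are absent from `columns`."""
    present = set(columns)
    expected = feature_columns + [target_column]
    return [name for name in expected if name not in present]
```

Once the data is loaded into a pandas DataFrame, say `df`, calling `missing_columns(df.columns)` should return an empty list if the file matches the description above.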

## Getting Workspace Ready

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
plt.style.use("seaborn-v0_8-dark")

# Models
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC, LinearSVC

# Model selection and hyperparameter tuning
from sklearn.model_selection import (
    train_test_split, cross_val_score, RandomizedSearchCV, GridSearchCV,
)

# Evaluation
from sklearn.metrics import (
    confusion_matrix, classification_report, precision_score,
    recall_score, f1_score, RocCurveDisplay,
)

## Exploring Data