# Task 5

# Build a Decision Tree Classifier

# Description: Develop a decision tree classifier to predict categorical outcomes.

# Step 1: Data Preparation
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the Breast Cancer dataset
breast_cancer = load_breast_cancer()
X = breast_cancer.data
y = breast_cancer.target

# Step 2: Data Splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 3: Model Creation
model = DecisionTreeClassifier(random_state=42)

# Step 4: Model Training
model.fit(X_train, y_train)

# Step 5: Prediction
y_pred = model.predict(X_test)

# Step 6: Model Evaluation
# precision, recall and F1 use the default pos_label=1, i.e. the 'benign' class
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

# Step 7: Visualization
plt.figure(figsize=(15, 10))
# Pass plain lists: some scikit-learn versions reject NumPy arrays for these arguments
plot_tree(model, filled=True,
          feature_names=list(breast_cancer.feature_names),
          class_names=list(breast_cancer.target_names))
plt.title('Decision Tree Visualization')
plt.show()

# Print model performance metrics
print('Accuracy:', accuracy)
print('Precision:', precision)
print('Recall:', recall)
print('F1 Score:', f1)


Task: Build a Decision Tree Classifier
Purpose of the Task:
The purpose of this task was to develop a decision tree classifier that predicts whether a tumor is malignant or benign, using the "Breast Cancer" dataset. Decision trees are interpretable machine learning models that can handle both categorical and numerical data in classification tasks.

Dataset Used:
We used the "Breast Cancer" dataset that ships with scikit-learn (load_breast_cancer). It contains 569 samples with 30 numeric features, each computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. The goal is to predict whether the tumor is malignant (coded as 0) or benign (coded as 1) based on features such as mean radius, mean texture, mean smoothness, and more.
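
The label encoding described above can be checked directly on the loaded dataset. The following is a minimal sketch; the variable names (data, label_names, class_counts) are chosen here for illustration and do not appear in the script.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

# target_names is indexed by label: position 0 is 'malignant', position 1 is 'benign'
label_names = [str(name) for name in data.target_names]

# Count how many samples carry each label
counts = np.bincount(data.target)
class_counts = {name: int(n) for name, n in zip(label_names, counts)}

print("Samples, features:", data.data.shape)
print("Label encoding:", dict(enumerate(label_names)))
print("Class counts:", class_counts)
```

Because benign is coded as 1, it is the class that scikit-learn treats as "positive" by default when computing precision, recall, and F1.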

Evaluation Metrics:
To evaluate the performance of the decision tree classifier, we used the following metrics:

Accuracy: The ratio of correctly predicted instances to the total instances.
Precision: The ratio of correctly predicted positive observations to the total predicted positives.
Recall: The ratio of correctly predicted positive observations to the total actual positives.
F1 Score: The harmonic mean of precision and recall, which balances the two metrics.
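
To make these definitions concrete, the sketch below recomputes the four metrics by hand from confusion-matrix counts and compares them with scikit-learn's functions. The labels in y_true and y_hat are hypothetical values made up for illustration, not results from the breast cancer model. Label 1 is taken as the positive class, which matches scikit-learn's default (pos_label=1).

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

# Hypothetical labels, made up purely for illustration (1 = positive class)
y_true = [1, 1, 1, 1, 0, 0, 0, 1, 0, 1]
y_hat = [1, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()

manual_accuracy = (tp + tn) / (tp + tn + fp + fn)
manual_precision = tp / (tp + fp)
manual_recall = tp / (tp + fn)
# Harmonic mean of precision and recall
manual_f1 = 2 * manual_precision * manual_recall / (manual_precision + manual_recall)

print("Accuracy :", manual_accuracy, accuracy_score(y_true, y_hat))
print("Precision:", manual_precision, precision_score(y_true, y_hat))
print("Recall   :", manual_recall, recall_score(y_true, y_hat))
print("F1 Score :", manual_f1, f1_score(y_true, y_hat))
```

Each line prints the hand-computed value next to the library value, and the two should agree.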

Insights from the Decision Tree:
The decision tree visualization provided insight into the classification process. It shows the hierarchy of splits that the tree uses to make predictions: features tested near the root affect the largest share of samples, which gives a visual indication of their importance. Understanding this structure can help with feature selection, with identifying key features in the dataset, and with interpreting how the model makes predictions. In the visualization, features such as mean concave points and worst concave points were particularly influential in separating malignant from benign tumors, highlighting their significance in the classification task.
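
The importance of each feature can also be read from the fitted model as numbers rather than inferred from the plot. The sketch below refits a tree with the same split and random_state as the script and lists the five largest impurity-based importances, then prints the top levels of the tree as text. The names tree, importances, and top are illustrative, and the exact ranking may vary with the scikit-learn version.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)
tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Impurity-based importances: one value per feature, normalized to sum to 1
importances = tree.feature_importances_

# Indices of the five most important features, largest first
top = np.argsort(importances)[::-1][:5]
for i in top:
    print(f"{data.feature_names[i]:<25} {importances[i]:.3f}")

# Text rendering of the first levels of the tree (the same splits plot_tree draws)
print(export_text(tree, feature_names=list(data.feature_names), max_depth=2))
```

Impurity-based importances summarize how much each feature reduced impurity across all of its splits, so a feature can rank highly without sitting at the root.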

Overall, the decision tree classifier demonstrated good predictive performance in distinguishing between malignant and benign tumors on the held-out test set, making it a useful and interpretable model for this dataset and its features.




