sujay723/Decision-Tree-Algorithms

Decision Tree Classifier

📘 Project Overview

This project implements a Decision Tree Classifier in Python using scikit-learn. The model classifies data points based on their features. Decision trees are among the most interpretable supervised learning algorithms and can be applied to both classification and regression tasks; this project focuses on classification.


📂 Dataset

The dataset is loaded and processed inside the notebook DECISION_TREE.ipynb. This README does not name a specific dataset; it may be a public one (e.g., Iris or Titanic) or a custom CSV file.

Steps Involved:

  1. Data Loading – Import the dataset into a pandas DataFrame.
  2. Data Preprocessing – Handle missing values and encode categorical features. Normalization is optional, since decision tree splits are insensitive to feature scaling.
  3. Data Splitting – Divide the dataset into training and testing subsets (commonly 70-30 or 80-20 split).
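
The three steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: a small made-up DataFrame stands in for the real dataset (which this README does not specify), and the column names `age`, `city`, and `label` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# 1. Data loading: a tiny made-up DataFrame stands in for pd.read_csv(...).
#    The columns "age", "city", and "label" are hypothetical.
df = pd.DataFrame({
    "age":   [22, 35, np.nan, 41, 29, 53, 38, 47, 31, 26],
    "city":  ["A", "B", "A", "C", "B", "C", "A", "B", "C", "A"],
    "label": [0, 1, 0, 1, 0, 1, 1, 1, 0, 0],
})

# 2. Preprocessing: fill the missing numeric value with the column median,
#    then one-hot encode the categorical column.
df["age"] = df["age"].fillna(df["age"].median())
X = pd.get_dummies(df.drop(columns="label"), columns=["city"])
y = df["label"]

# 3. Splitting: hold out 30% of the rows for testing.
#    stratify=y keeps the class proportions similar in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

In a real workflow, statistics such as the median are usually computed on the training subset only, so that no information from the test set leaks into preprocessing.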

🧠 Model Development

Algorithm Used: Decision Tree

A Decision Tree splits the dataset into branches based on feature conditions to predict the target variable.

Key Concepts:

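  • Entropy / Gini Index – Measure the impurity of a node; lower values mean the node is purer (dominated by a single class).
  • Information Gain – The reduction in impurity produced by a split; the candidate split with the highest gain is chosen.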
  • Pruning – Prevents overfitting by limiting the depth or complexity of the tree.
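
To make the impurity measures concrete, here is a small sketch that computes entropy, the Gini index, and information gain with NumPy. The helper names `entropy`, `gini`, and `information_gain` and the toy labels are illustrative assumptions; scikit-learn performs these calculations internally.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of an array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def gini(labels):
    """Gini index of an array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Toy example: a balanced parent node split into two partly mixed children.
parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])
left = np.array([0, 0, 0, 1])
right = np.array([0, 1, 1, 1])

print("Parent entropy:", entropy(parent))
print("Parent Gini:", gini(parent))
print("Information gain of the split:", information_gain(parent, left, right))
```

A split that separated the two classes perfectly would give both children an entropy of 0 and therefore the maximum possible gain, equal to the parent's entropy.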

Steps in the Notebook:

  1. Import libraries (pandas, numpy, sklearn).
  2. Initialize and train a DecisionTreeClassifier from sklearn.tree.
  3. Evaluate model accuracy on test data.
  4. Visualize the trained tree using graphviz or plot_tree() from sklearn.
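
Steps 1 to 3 might look like the sketch below. The Iris dataset and the hyperparameter values are assumptions chosen for illustration only; the notebook's actual dataset and settings may differ.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Iris is used purely as a stand-in dataset (assumption).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# criterion selects the impurity measure ("gini" or "entropy").
# max_depth caps the tree depth, a simple form of pre-pruning.
model = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=42)
model.fit(X_train, y_train)

# score() returns mean accuracy on the given data.
print("Train accuracy:", model.score(X_train, y_train))
print("Test accuracy:", model.score(X_test, y_test))
```

A large gap between train and test accuracy is a common sign of overfitting. Besides `max_depth`, `DecisionTreeClassifier` also accepts `min_samples_leaf` and `ccp_alpha` (cost-complexity pruning) to limit tree complexity.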

📈 Model Evaluation

Common Metrics:

  • Accuracy Score
  • Confusion Matrix
  • Precision, Recall, and F1-Score
  • Cross-Validation (optional)

Example:

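from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# `model` is the fitted classifier; X_test and y_test come from the train/test split.
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))        # rows: true class, columns: predicted class
print(classification_report(y_test, y_pred))   # per-class precision, recall, and F1-score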
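
Cross-validation, listed above as optional, gives a more stable estimate than a single train/test split. A minimal sketch, again assuming the Iris dataset as a stand-in and 5 folds as an arbitrary choice:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Iris is a stand-in dataset (assumption).
X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(max_depth=3, random_state=42)

# cv=5 trains and evaluates the model on 5 different train/validation folds.
# For classifiers, an integer cv uses stratified folds by default.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")

print("Fold accuracies:", scores)
print("Mean accuracy: %.3f (std %.3f)" % (scores.mean(), scores.std()))
```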

🌳 Visualization

Decision Tree visualization helps understand how the model splits data:

from sklearn import tree
import matplotlib.pyplot as plt

# feature_cols: list of feature column names; target_names: list of class labels as strings.
plt.figure(figsize=(15, 10))
tree.plot_tree(model, filled=True, feature_names=feature_cols, class_names=target_names)
plt.show()
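
As an alternative to `plot_tree()`, the learned rules can be printed as text or exported as Graphviz DOT source. The sketch below assumes the Iris dataset as a stand-in and a shallow tree for readability; it only produces the DOT string and does not render it.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz, export_text

# Iris is a stand-in dataset (assumption); max_depth=2 keeps the output short.
iris = load_iris()
model = DecisionTreeClassifier(max_depth=2, random_state=42)
model.fit(iris.data, iris.target)

# Plain-text view of the decision rules, one line per node.
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)

# DOT source for Graphviz; out_file=None returns it as a string.
dot_source = export_graphviz(
    model,
    out_file=None,
    feature_names=list(iris.feature_names),
    class_names=list(iris.target_names),
    filled=True,
)
print(dot_source[:60])
```

Rendering the DOT string to an image requires the `graphviz` Python package together with the Graphviz system binaries, for example via `graphviz.Source(dot_source)`.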

βš™οΈ Requirements

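To run this notebook, install the following dependencies. The graphviz pip package provides only the Python bindings; rendering trees with it also requires the Graphviz system binaries (the dot executable) to be installed separately.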

pip install pandas numpy scikit-learn matplotlib graphviz

🚀 How to Run

  1. Open the notebook DECISION_TREE.ipynb in Jupyter or a compatible environment (e.g., JupyterLab, VS Code, or Google Colab).
  2. Run each cell sequentially.
  3. The notebook will display training accuracy, decision tree plots, and performance metrics.

🧩 Results

  • Accuracy depends on the dataset and its complexity; the notebook reports the score obtained on the test data.
  • Visual inspection of the decision tree reveals the key decision rules the model has learned.
  • The project demonstrates the interpretability and simplicity of Decision Tree models.

πŸ‘¨β€πŸ’» Author

Created by: Sujay Roy
Year: 2025
Project Type: Machine Learning – Decision Tree Implementation

