A rudimentary implementation of various Decision Tree algorithms, written in Python without any external machine learning libraries
Binary Decision Tree and Bagged Decision Trees implemented from scratch in Python
Given a dataset, the goal of the algorithm is to generate a Binary Decision Tree that accurately predicts the value of a new example
At each iteration it calculates the entropy using the formula

H = -[ x · log2(x) + (1 - x) · log2(1 - x) ]

where x is the ratio of true cases to total cases
More about this formula and the calculation of entropy in binary trees can be found under the name binary entropy function
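A minimal sketch of this computation in plain Python (the function name `entropy` and the guard for pure nodes are illustrative choices, not taken from the notebooks):

```python
import math

def entropy(p):
    """Binary entropy, where p is the ratio of true cases to total cases."""
    if p == 0 or p == 1:
        return 0.0  # a pure node (all true or all false) has zero entropy
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
```

For example, entropy(0.5) returns 1.0 (maximum uncertainty), while entropy(1.0) returns 0.0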
It further calculates information gain, based on which it decides where to split
Information gain is calculated as follows:
Assume that before splitting the entropy was Hroot and there were n elements in total
Now the tree splits on some feature, sending a and b elements to the left and right branches respectively, with entropies Hleft and Hright

gain = Hroot - [ (a/n) · Hleft + (b/n) · Hright ]
The goal is to maximise the gain at each iteration and for every branch
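Building on the entropy sketch above, a hedged illustration of the gain computation (it assumes labels are lists of booleans; the names are hypothetical, not the repository's API):

```python
def information_gain(parent, left, right):
    """Entropy reduction from splitting `parent` into `left` and `right` branches.

    Each argument is a list of boolean labels,
    with len(parent) == len(left) + len(right).
    """
    n, a, b = len(parent), len(left), len(right)
    h_root = entropy(sum(parent) / n)
    h_left = entropy(sum(left) / a) if a else 0.0
    h_right = entropy(sum(right) / b) if b else 0.0
    return h_root - ((a / n) * h_left + (b / n) * h_right)
```

A splitter would evaluate this quantity for every candidate feature and keep the split with the highest gain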
Given a dataset, the goal of the algorithm is to generate a set of n Binary Decision Trees to predict the probability of a new example being true or false
Each Binary Decision Tree is trained on its own dataset, generated by sampling with replacement (bootstrapping) from the original dataset
The generation of each Binary Decision Tree follows the same process as before, only on its resampled dataset
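A rough sketch of the bagging step (bootstrap_sample, bagged_probability, and the tree.predict interface are assumed names for illustration, not the repository's actual API):

```python
import random

def bootstrap_sample(dataset):
    """Draw len(dataset) examples with replacement to form one tree's training set."""
    return [random.choice(dataset) for _ in range(len(dataset))]

def bagged_probability(trees, example):
    """Estimate P(true) as the fraction of trees voting True for the example."""
    return sum(tree.predict(example) for tree in trees) / len(trees)
```

Each of the n trees is grown on its own bootstrap sample, and the ensemble's prediction is the average of the individual votes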
An example result of binarydecisiontree.ipynb (implementation of the Binary Decision Tree):

Requirements:
- Python 3.x
- graphviz 9.0.0 (required only for visualisation)
- dsplot 0.9.0 (required only for visualisation)
- Clone the repository: $ git clone https://github.com/AngadBasandrai/decision-tree-python.git
- Open the .ipynb files
To install graphviz:
- Debian/Ubuntu: sudo apt install graphviz
- Fedora/RHEL/CentOS: sudo yum install graphviz
- MacPorts (macOS): sudo port install graphviz
- Homebrew (macOS): brew install graphviz
- To install dsplot, refer to the DSPlot installation guide
NOTE: The notebooks were created in Kaggle and there may be some portability issues, so it is recommended to import them into Kaggle
Made with ❤️ by Angad Basandrai