MDLP-Decision-Tree

Based on a paper by Quinlan and Rivest, “Inferring decision trees using the minimum description lenght principle”. In: Information and computation 80.3 (1989), pp. 227–248.

The main idea is that trying to transmit a theory for how data is classified and the exceptions for that theory using as few bits as possible will result in a theory that will also accurately classify new data. Because if an attribute’s relationship with the object’s class is significant enough to save bits during transmission, it will most likely be a worthwhile addition to a decision tree.

Including the theory in the cost of the transmission takes care of the overfitting problem, because a more complex theory that might classify the training data perfectly by assigning a leaf node to each training object will result in a relatively high theory cost. On the other hand, a too simple theory will result in a high cost for encoding all of the exceptions.

Usage

from mdlp import Tree

train = # list of feature arrays with the label as the last element
test = # list of feature arrays with the label as the last element
tree = Tree(train)
predictions  = [tree.classify(t) for  t  in  test]

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
README.md		README.md
mdlp.py		mdlp.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MDLP-Decision-Tree

Usage

About

Releases

Packages

Languages

FrankRuis/MDLP-Decision-Tree

Folders and files

Latest commit

History

Repository files navigation

MDLP-Decision-Tree

Usage

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages