kc0bfv/DecisionTreeDNA

Decision Tree DNA

A project to visualize classifications in a decision tree.

Goal

A script that takes a Weka J48 decision tree and an ARFF file as input, and builds a graphical representation of which tree node each instance in the ARFF file was assigned to. Here's what it looks like:

Data Plot 0

Data Plot 1

Those are plots of the sample data. Both result from a random selection from the same corpus, so they look very similar.
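
The core idea, assigning each instance to a tree node and counting how often each node is used, can be sketched in a few lines. This is only an illustration and not the code in j48Vis.py: the tuple-based tree, the `find_leaf` and `leaf_counts` helpers, and the leaf names are all made up for the example, and the tree is written by hand rather than parsed from Weka output.

```python
from collections import Counter

# Hypothetical hand-written tree (not parsed from Weka output).
# An internal node is (attribute, threshold, left, right); the left branch
# is taken when the instance's value is <= threshold. A leaf is a string.
TREE = ("petalwidth", 0.6,
        "leaf_setosa",
        ("petalwidth", 1.7, "leaf_versicolor", "leaf_virginica"))


def find_leaf(node, instance):
    """Walk down the tree until a leaf (a plain string) is reached."""
    while not isinstance(node, str):
        attribute, threshold, left, right = node
        node = left if instance[attribute] <= threshold else right
    return node


def leaf_counts(tree, instances):
    """Count how many instances end up at each leaf."""
    return Counter(find_leaf(tree, inst) for inst in instances)


# Made-up instances standing in for the rows of an ARFF file.
INSTANCES = [
    {"petalwidth": 0.2},
    {"petalwidth": 1.3},
    {"petalwidth": 2.1},
    {"petalwidth": 0.4},
]

counts = leaf_counts(TREE, INSTANCES)
print(counts)
```

Per-leaf counts like these are the kind of numbers a plot could then be drawn from.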

Why?

Let's say you've got a training set and a test set, and you built the decision tree from the training set. Now you want to show that the decision tree is used similarly with both the training and the test set. You could give your reader a bunch of numbers, or you could give her a cool graph. Cool graphs are cooler.

The training and test set should probably result in similar decision tree use if:

  • they were selected randomly from a larger corpus
  • the tree isn't overfitted to the training set
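
If you do want a number to go with the graph, one option is to compare the share of instances that reach each leaf in the two sets. The sketch below uses the total variation distance, which is 0 when both sets use the tree identically and 1 when they share no leaves. This is an assumption about what "similar use" could mean, not something the script computes, and the leaf names and counts are invented.

```python
from collections import Counter


def proportions(counts):
    """Turn raw per-leaf counts into fractions of the whole set."""
    total = sum(counts.values())
    return {leaf: n / total for leaf, n in counts.items()}


def total_variation(counts_a, counts_b):
    """Half the sum of absolute differences between per-leaf fractions."""
    pa, pb = proportions(counts_a), proportions(counts_b)
    leaves = set(pa) | set(pb)
    return 0.5 * sum(abs(pa.get(leaf, 0.0) - pb.get(leaf, 0.0))
                     for leaf in leaves)


# Invented per-leaf counts for a training set and a test set.
train = Counter({"leaf_a": 50, "leaf_b": 30, "leaf_c": 20})
test = Counter({"leaf_a": 24, "leaf_b": 16, "leaf_c": 10})

distance = total_variation(train, test)
print(round(distance, 3))
```

Both sets must be non-empty, since the fractions are computed by dividing by the set size.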

Usage

The script does not draw anything yet. For now it only prints the numbers that will later be turned into the graphical output. The included sample files should be all you need to try it out.

j48Vis.py *SampleARFF.arff sampleDecisionTree
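
For readers who haven't seen ARFF before, here is a rough sketch of reading a simple one. It is not the parser used by j48Vis.py. The `parse_arff` function and the sample text are made up for illustration, and it assumes the plain dense format only: no quoted names or values, no commas inside values, and no sparse rows.

```python
def parse_arff(text):
    """Return (attribute_names, rows) from simple, dense ARFF text.

    Assumes unquoted attribute names and values. All values are kept
    as strings, with no type conversion.
    """
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            rows.append(dict(zip(attributes, values)))
        elif lower.startswith("@attribute"):
            attributes.append(line.split()[1])
        elif lower.startswith("@data"):
            in_data = True
    return attributes, rows


# Made-up sample, not one of the files shipped with the project.
SAMPLE = """\
% a tiny example
@relation iris_sample
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor}
@data
0.2,Iris-setosa
1.3,Iris-versicolor
"""

attributes, rows = parse_arff(SAMPLE)
print(attributes)
print(rows)
```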

Requirements

Python 3. It would probably work in Python 2.7 with only a bit of modification.

License

I'm using GPL 2 right now. Why not.
