Merge pull request #3282 from OXPHOS/cookbook_cartree
cookbook - CARTree -  classification tree
karlnapf committed Nov 22, 2016
2 parents 657573c + fbd979c commit 18204b2
Showing 4 changed files with 84 additions and 38 deletions.
47 changes: 47 additions & 0 deletions doc/cookbook/source/examples/multiclass_classifier/cartree.rst
@@ -0,0 +1,47 @@
==================================
Classification And Regression Tree
==================================

Decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value.

Decision trees come in two main types:

- Classification tree, where the predicted outcome is the class to which the data belongs.
- Regression tree, where the predicted outcome is a real number.

The Classification And Regression Tree (CART) algorithm is an umbrella method that can generate both classification trees and regression trees.

In this example, we show how to apply the CART algorithm to a multi-class dataset and predict the labels with a classification tree.

-------
Example
-------

Imagine we have files with training and test data. We create :sgclass:`CDenseFeatures` (here 64-bit floats, aka RealFeatures) and :sgclass:`CMulticlassLabels` as

.. sgexample:: cartree.sg:create_features

We set the type of each predictive attribute (true for nominal, false for ordinal/continuous).

.. sgexample:: cartree.sg:set_attribute_types
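
For a continuous attribute (marked false above), a tree node typically splits the data with a threshold test. The following is a minimal sketch in plain Python, independent of Shogun, that scores candidate thresholds by weighted Gini impurity, a criterion commonly used for classification trees. The function names ``gini`` and ``best_threshold`` and the toy data are hypothetical, and the internals of :sgclass:`CCARTree` may differ.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) of the best binary split
    `value <= threshold` on a single continuous attribute.

    The threshold is None if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # baseline: impurity without splitting
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold fits between two equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2.0, score)
    return best


# Toy data, made up for illustration: one attribute, two classes.
threshold, impurity = best_threshold([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
```

Here the midpoint between 2.0 and 3.0 separates the two classes perfectly, so the weighted impurity of that split is zero. A nominal attribute would instead be split by partitioning its category values, which this sketch does not cover.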

We create an instance of the :sgclass:`CCARTree` classifier by passing it the attribute types and the tree type.
We can also set the number of subsets used in cross-validation and whether to use cross-validation pruning.

.. sgexample:: cartree.sg:create_instance
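
To make the "number of subsets" parameter concrete, here is a generic sketch of how a dataset can be partitioned into k disjoint folds for cross-validation. It is plain Python and only illustrates the idea. The helper ``k_fold_indices`` is hypothetical, and how :sgclass:`CCARTree` actually forms its subsets is not specified in this page.

```python
def k_fold_indices(n_samples, k):
    """Partition range(n_samples) into k disjoint, nearly equal folds."""
    if not 1 <= k <= n_samples:
        raise ValueError("k must be between 1 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# 12 samples and 5 subsets, chosen only for illustration.
folds = k_fold_indices(12, 5)
fold_sizes = [len(fold) for fold in folds]
```

In cross-validation, each fold is held out once for validation while the remaining folds are used for training. Real implementations usually shuffle the sample order before partitioning.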

Then we train and apply it to test data, which here gives :sgclass:`CMulticlassLabels`.

.. sgexample:: cartree.sg:train_and_apply

We can evaluate test performance via e.g. :sgclass:`CMulticlassAccuracy`.

.. sgexample:: cartree.sg:evaluate_accuracy
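
Multiclass accuracy is the fraction of predicted labels that equal the true labels. A minimal sketch of that computation in plain Python follows. The function ``multiclass_accuracy`` and the label values are hypothetical and are not part of the Shogun API.

```python
def multiclass_accuracy(predicted, truth):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(truth):
        raise ValueError("label sequences must have the same length")
    if not truth:
        raise ValueError("at least one label is required")
    correct = sum(1 for p, t in zip(predicted, truth) if p == t)
    return correct / len(truth)


# Made-up labels: four of the five predictions match.
accuracy = multiclass_accuracy([0, 1, 2, 3, 1], [0, 1, 2, 2, 1])
```

With four matches out of five samples, the accuracy is 0.8.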

----------
References
----------

:wiki:`Decision_tree_learning`

:wiki:`Predictive_analytics#Classification_and_regression_trees_.28CART.29`
36 changes: 36 additions & 0 deletions examples/meta/src/multiclass_classifier/cartree.sg
@@ -0,0 +1,36 @@
CSVFile f_feats_train("../../data/classifier_4class_2d_linear_features_train.dat")
CSVFile f_feats_test("../../data/classifier_4class_2d_linear_features_test.dat")
CSVFile f_labels_train("../../data/classifier_4class_2d_linear_labels_train.dat")
CSVFile f_labels_test("../../data/classifier_4class_2d_linear_labels_test.dat")
Math:init_random(1)

#![create_features]
RealFeatures features_train(f_feats_train)
RealFeatures features_test(f_feats_test)
MulticlassLabels labels_train(f_labels_train)
MulticlassLabels labels_test(f_labels_test)
#![create_features]

#![set_attribute_types]
BoolVector ft(2)
ft[0] = False
ft[1] = False
#![set_attribute_types]

#![create_instance]
CARTree classifier(ft, enum EProblemType.PT_MULTICLASS, 5, True)
classifier.set_labels(labels_train)
#![create_instance]

#![train_and_apply]
classifier.train(features_train)
MulticlassLabels labels_predict = classifier.apply_multiclass(features_test)
#![train_and_apply]

#![evaluate_accuracy]
MulticlassAccuracy eval()
real accuracy = eval.evaluate(labels_predict, labels_test)
#![evaluate_accuracy]

# integration testing variables
RealVector output = labels_predict.get_labels()
37 changes: 0 additions & 37 deletions examples/undocumented/python_modular/multiclass_cartree_modular.py

This file was deleted.
