#### This is an implementation of Random Forest with TensorFlow Monitored Session
`MonitoredTrainingSession` takes care of initialising variables, starting queue runners, and setting up the file writers.
It also makes your code easy to distribute, because it behaves differently depending on whether you specified
the running process as a master or not.
For example, you could define something like:

In [1]:
import tensorflow as tf

def run_my_model(train_op, session_args):
    # The session initialises variables and starts queue runners on entry
    with tf.train.MonitoredTrainingSession(**session_args) as sess:
        sess.run(train_op)

that you would call in a non-distributed way: 

In [None]:
run_my_model(train_op,{})

or in a distributed way (see the distributed doc for more information on the inputs):

In [None]:
run_my_model(train_op, {"master": server.target, "is_chief": (FLAGS.task_index == 0)})

**Random forests** or random decision forests are an ensemble learning method for classification, regression 
and other tasks, that operate by constructing a multitude of decision trees at training time and outputting 
the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees. Random decision forests correct for decision trees' habit of overfitting to their training set.

Decision trees are a popular method for various machine learning tasks. Tree learning "comes closest to meeting the requirements for serving as an off-the-shelf procedure for data mining".

In particular, trees that are grown very deep tend to learn highly irregular patterns: they overfit their training sets, i.e. have low bias, but very high variance. Random forests are a way of averaging multiple deep decision trees, trained on different parts of the same training set, with the goal of reducing the variance. This comes at the expense of a small increase in the bias and some loss of interpretability, but generally greatly boosts the performance in the final model.
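
The variance reduction can be made concrete with a standard result for averages of correlated variables. As an illustrative assumption (these symbols are introduced here, not in the text above), suppose each of the $B$ trees has prediction variance $\sigma^2$ and any two trees have pairwise correlation $\rho$. The variance of the averaged prediction is then:

$$
\operatorname{Var}\left(\frac{1}{B}\sum_{b=1}^{B} f_b(x)\right) = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
$$

The second term shrinks as $B$ grows, while the first term remains. Under these assumptions, this is why training the trees on different random samples, which lowers $\rho$, matters as much as adding more trees.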

The training algorithm for random forests applies the general technique of bootstrap aggregating, or bagging, to tree learners. Given a training set X = x1, ..., xn with responses Y = y1, ..., yn, bagging repeatedly (B times) selects a random sample with replacement of the training set and fits trees to these samples:

For b = 1, ..., B:
1. Sample, with replacement, n training examples from X, Y; call these Xb, Yb.
2. Train a classification or regression tree fb on Xb, Yb.

After training, predictions for unseen samples x' can be made by averaging the predictions from all the individual regression trees on x', or by taking the majority vote in the case of classification trees.
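
The bagging procedure above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the TensorForest implementation used later in this notebook: the tree learner is swapped for a one-split decision stump to keep the code short, and the names `fit_stump`, `bagging_fit` and `bagging_predict` as well as the toy data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Fit a one-split classification stump (a stand-in for a full tree)."""
    best = None
    for j in range(len(X[0])):                      # try every feature
        for t in sorted({row[j] for row in X}):     # try every observed threshold
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(yi != l_lab for yi in left)
                      + sum(yi != r_lab for yi in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:                                # no valid split: majority class
        label = Counter(y).most_common(1)[0][0]
        return lambda row: label
    _, j, t, l_lab, r_lab = best
    return lambda row: l_lab if row[j] <= t else r_lab

def bagging_fit(X, y, B, seed=0):
    """Train B learners, each on n examples sampled with replacement."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample Xb, Yb
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees

def bagging_predict(trees, row):
    """Classify by majority vote over the individual learners."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]

# Toy, cleanly separable data (made up for illustration)
X = [[0.0], [1.0], [2.0], [3.0], [7.0], [8.0], [9.0], [10.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
trees = bagging_fit(X, y, B=25)
print(bagging_predict(trees, [0.0]), bagging_predict(trees, [10.0]))
```

For regression, the vote in `bagging_predict` would be replaced by the mean of the individual predictions.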

![title](img/rf.png)

**TensorFlow Operations**, also known as Ops, are nodes that perform computations on or with Tensor objects. After computation, they return zero or more tensors, which can be used by other Ops later in the graph. To create an Operation, you call its constructor in Python, which takes in whatever Tensor parameters are needed for its calculation, known as inputs, as well as any additional information needed to properly create the Op, known as attributes. The Python constructor returns a handle to the Operation’s output (zero or more Tensor objects), and it is this output which can be passed on to other Operations or to **Session.run**.

In [3]:
# Import mnist data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=False)

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz


In [4]:
import tensorflow as tf
from tensorflow.python.ops import resources
from tensorflow.contrib.tensor_forest.python import tensor_forest

# Ignore all GPUs; the TF random forest does not benefit from them.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""

In [5]:
# Parameters
num_steps = 500 # Total steps to train
batch_size = 1024 # The number of samples per batch
num_classes = 10 # The 10 digits
num_features = 784 # Each image is 28x28 pixels
num_trees = 10
max_nodes = 1000

# Input and target data
X = tf.placeholder(tf.float32, shape=[None, num_features])
# For random forest, labels must be integers (the class id), not one-hot vectors
Y = tf.placeholder(tf.int32, shape=[None])
# Create the hyperparameters
hparams = tensor_forest.ForestHParams(num_trees=num_trees, 
                                      num_features=num_features, 
                                      num_classes=num_classes, 
                                      max_nodes=max_nodes).fill()

In [6]:
# the forest is created with hyper parameters
forest_graph = tensor_forest.RandomForestGraphs(hparams)

# creating train and loss graph
train_graph = forest_graph.training_graph(X, Y)
loss_graph = forest_graph.training_loss(X, Y)

# infer graph
infer_op,_,_ = forest_graph.inference_graph(X)
prediction = tf.equal(tf.cast(Y,tf.int64), tf.argmax(infer_op,1))
accuracy = tf.reduce_mean(tf.cast(prediction, tf.float32))
# shared resource initializer
init_vars = tf.group(tf.global_variables_initializer(), resources.initialize_resources(resources.shared_resources()))

INFO:tensorflow:Constructing forest with params = 
INFO:tensorflow:{'num_trees': 10, 'max_nodes': 1000, 'bagging_fraction': 1.0, 'feature_bagging_fraction': 1.0, 'num_splits_to_consider': 28, 'max_fertile_nodes': 0, 'split_after_samples': 250, 'valid_leaf_threshold': 1, 'dominate_method': 'bootstrap', 'dominate_fraction': 0.99, 'model_name': 'all_dense', 'split_finish_name': 'basic', 'split_pruning_name': 'none', 'collate_examples': False, 'checkpoint_stats': False, 'use_running_stats_method': False, 'initialize_average_splits': False, 'inference_tree_paths': False, 'param_file': None, 'split_name': 'less_or_equal', 'early_finish_check_every_samples': 0, 'prune_every_samples': 0, 'num_features': 784, 'num_classes': 10, 'bagged_num_features': 784, 'bagged_features': None, 'regression': False, 'num_outputs': 1, 'num_output_columns': 11, 'base_random_seed': 0, 'leaf_model_type': 0, 'stats_model_type': 0, 'finish_type': 0, 'pruning_type': 0, 'split_type': 0}
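
The `accuracy` op built above takes the highest-scoring class in each row of `infer_op`, compares it with the integer label, and averages the resulting 0/1 values. A TensorFlow-free sketch of the same computation may make this easier to follow. The helper name `accuracy_from_probabilities` and the probability values are made up for illustration.

```python
def accuracy_from_probabilities(probabilities, labels):
    """Fraction of rows whose highest-probability class equals the label."""
    # Index of the largest entry in each row (plays the role of tf.argmax(..., 1))
    predicted = [max(range(len(row)), key=row.__getitem__) for row in probabilities]
    # 1.0 where the prediction matches the label, else 0.0 (tf.equal + tf.cast)
    correct = [float(p == t) for p, t in zip(predicted, labels)]
    # Mean of the 0/1 values (tf.reduce_mean)
    return sum(correct) / len(correct)

# Three examples, three classes (hypothetical values)
probs = [[0.1, 0.7, 0.2],
         [0.8, 0.1, 0.1],
         [0.3, 0.3, 0.4]]
labels = [1, 0, 1]
print(accuracy_from_probabilities(probs, labels))
```

Here the first two rows are classified correctly and the third is not, so two of the three predictions match.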


In [7]:
# start session
sess = tf.train.MonitoredSession()
# initializer
sess.run(init_vars)

# training
for i in range(1, num_steps + 1):
    # get the next batch of MNIST data (both images and labels are needed)
    batch_x, batch_y = mnist.train.next_batch(batch_size)
    _, l = sess.run([train_graph, loss_graph], feed_dict={X: batch_x, Y: batch_y})
    # log the progress
    if i % 50 == 0 or i == 1:
        acc = sess.run(accuracy, feed_dict={X: batch_x, Y: batch_y})
        print('Step %i, Loss: %f, Acc: %f' % (i, l, acc))

# test
test_x, test_y = mnist.test.images, mnist.test.labels
print("Test Accuracy:", sess.run(accuracy, feed_dict={X: test_x, Y: test_y}))

Step 1, Loss: -1.000000, Acc: 0.407227
Step 50, Loss: -252.800003, Acc: 0.867188
Step 100, Loss: -540.200012, Acc: 0.914062
Step 150, Loss: -831.000000, Acc: 0.916016
Step 200, Loss: -1001.000000, Acc: 0.921875
Step 250, Loss: -1001.000000, Acc: 0.929688
Step 300, Loss: -1001.000000, Acc: 0.930664
Step 350, Loss: -1001.000000, Acc: 0.913086
Step 400, Loss: -1001.000000, Acc: 0.928711
Step 450, Loss: -1001.000000, Acc: 0.931641
Step 500, Loss: -1001.000000, Acc: 0.938477
Test Accuracy: 0.9202
