# Tutorial Overview
This set of two tutorials (`uncertaintyforest_running_example.ipynb` and `uncertaintyforest_fig1.ipynb`) explains the UncertaintyForest class. After following both tutorials, you should be able to run UncertaintyForest code on your own machine and generate Figure 1 from [this paper](https://arxiv.org/pdf/1907.00325.pdf).

If you haven't already, take a look at the `Installation-and-Package-Setup-Tutorial.ipynb` tutorial to set up and install the progressive learning package.

# Simply Running the Uncertainty Forest class
## *Goal: Train the UncertaintyForest classifier on some training data and produce a metric of accuracy on some test data*

### 1: First, we'll import required packages and set some parameters for the forest. 

In [32]:
from proglearn.forest import UncertaintyForest
from proglearn.sims import generate_gaussian_parity

In [34]:
# Real Params.
n_train = 10000 # number of training data points
n_test = 1000 # number of testing data points
num_trials = 10 # number of trials
n_estimators = 100 # number of estimators

#### We've done a lot. Can we just run it now? Yes!

### 2: Creating & Training our UncertaintyForest 
First, generate our data:

In [35]:
X, y = generate_gaussian_parity(n_train+n_test)

Now, split that data into training and testing data. We don't want to accidentally train on our test data.

In [36]:
X_train = X[0:n_train] # Takes the first n_train number of data points and saves as X_train
y_train = y[0:n_train] # same as above for the labels
X_test = X[n_train:] # Takes the remainder of the data (n_test data points) and saves as X_test
y_test = y[n_train:] # same as above for the labels
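The slicing split above can be sketched on toy data to confirm that the two parts never overlap and together cover every point. This is only an illustration: `split_head_tail` is a hypothetical helper, not part of `proglearn`, and the toy values are made up rather than produced by `generate_gaussian_parity`.

```python
def split_head_tail(data, n_head):
    """Split a sliceable sequence into its first n_head items and the rest."""
    if not 0 <= n_head <= len(data):
        raise ValueError("n_head must be between 0 and len(data)")
    return data[:n_head], data[n_head:]


# Toy stand-ins for X and y (hypothetical values, not proglearn output).
X_toy = [[0.1, 0.2], [0.3, -0.4], [-0.5, 0.6], [-0.7, -0.8], [0.9, 1.0]]
y_toy = [0, 1, 1, 0, 0]

# Use the same split point for features and labels so they stay aligned.
X_head, X_tail = split_head_tail(X_toy, 3)
y_head, y_tail = split_head_tail(y_toy, 3)

# The two parts have no shared rows and together contain every point.
print(len(X_head), len(X_tail))
```

Because the helper only relies on `len` and slicing, the same idea should carry over to NumPy arrays such as the `X` and `y` used in this tutorial.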

Then, create our forest:

In [37]:
UF = UncertaintyForest(n_estimators = n_estimators)

Then fit our learner:

In [38]:
UF.fit(X_train, y_train)

NameError: name 'max_depth' is not defined

Well, we're done. Exciting, right?

### 3: Producing a Metric of Accuracy for Our Learner
We've now created and trained our learner. To check whether it actually predicts the class labels well, we'll generate fresh test data (from the same distribution as the training data) and see how much of it we classify correctly. Note that this overwrites the `X_test` and `y_test` that were split off in step 2.

In [18]:
X_test, y_test = generate_gaussian_parity(n_test) # creates the test data

In [19]:
predictions = UF.predict(X_test) # predict the class labels of the test data

AttributeError: 'UncertaintyForest' object has no attribute 'lf'

To see the learner's accuracy, we'll now compare the predictions with the actual test labels: count the number of correct predictions and divide by the number of test points.

In [9]:
accuracy = sum(predictions == y_test)/n_test

And, let's take a look at our accuracy:

In [10]:
print(accuracy)

0.933
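For reference, the accuracy computation above is simply the fraction of predictions that match the true labels. A minimal sketch on made-up labels follows; `simple_accuracy` is a hypothetical helper written for illustration, not a `proglearn` function, and the toy lists are not real model output.

```python
def simple_accuracy(predicted, actual):
    """Return the fraction of positions where predicted equals actual."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for p, t in zip(predicted, actual) if p == t)
    return correct / len(actual)


# Made-up predictions and labels: 4 of the 5 positions agree.
toy_predictions = [0, 1, 1, 0, 1]
toy_labels = [0, 1, 0, 0, 1]

toy_accuracy = simple_accuracy(toy_predictions, toy_labels)
print(toy_accuracy)
```

Unlike dividing by a separately stored `n_test`, this version divides by the length of the label sequence itself, so the result stays correct even if the test set size changes.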


Ta-da. That's an uncertainty forest at work. 


## What's next? See a measure of the power of the uncertainty forest by generating Figure 1 from [this paper](https://arxiv.org/pdf/1907.00325.pdf)
### To do this, check out `uncertaintyforest_fig1.ipynb`