<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       DecisionForest and TDDecisionForestPredict functions in Vantage
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial'><b>Introduction</b></p>
<p style = 'font-size:16px;font-family:Arial'>Decision forest functions create predictive models based on the decision tree algorithm for training and prediction. The DecisionForest function is an ensemble algorithm used for classification and regression predictive modeling problems. The DecisionForestPredict function uses the model output by the DecisionForest function to analyze the input data and make predictions. For classification models, it also outputs the probability that each observation is in the predicted class. <br> In this notebook, we will see how to use the DecisionForest and TDDecisionForestPredict functions available in Vantage.</p>
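
<p style = 'font-size:16px;font-family:Arial'>To make the ensemble idea concrete, the following is a minimal plain-Python sketch of how a forest can combine the outputs of its individual trees: averaging for regression, majority vote for classification. The helper names and the per-tree values are hypothetical and only illustrate the general technique; this is not the Vantage implementation.</p>

```python
from collections import Counter
from statistics import mean


def aggregate_regression(tree_predictions):
    """Average the numeric predictions of the individual trees."""
    return mean(tree_predictions)


def aggregate_classification(tree_predictions):
    """Return the majority class and the fraction of trees that voted for it."""
    votes = Counter(tree_predictions)
    label, count = votes.most_common(1)[0]
    return label, count / len(tree_predictions)


# Made-up outputs of four trees for a single observation (illustration only).
print(aggregate_regression([21.0, 23.0, 22.0, 24.0]))
print(aggregate_classification(["yes", "no", "yes", "yes"]))
```

<p style = 'font-size:16px;font-family:Arial'>In this sketch, the vote fraction plays the role of the probability that the observation belongs to the predicted class.</p>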

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>1. Initiate a connection to Vantage</b>

<p style = 'font-size:16px;font-family:Arial'>In this section, we import the required libraries and set environment variables and environment paths (if required).</p>

In [None]:
from teradataml import *

# Modify the following to match the specific client environment settings
display.max_rows = 5

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>1.1 Connect to Vantage</b></p>
<p style = 'font-size:16px;font-family:Arial'>You will be prompted to provide the password. Enter your password, press the Enter key, and then use the down arrow to go to the next cell.</p>

In [None]:
%run -i ../../UseCases/startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)

In [None]:
%%capture
execute_sql('''SET query_band='DEMO=PP_DecisionForest_and_TDDecisionForestPredict_Python.ipynb;' UPDATE FOR SESSION; ''')

<p style = 'font-size:16px;font-family:Arial'>Begin running steps with Shift + Enter keys. </p>

<hr style='height:1px;border:none;'>

<p style = 'font-size:18px;font-family:Arial'><b>1.2 Getting Data for This Demo</b></p>

<p style = 'font-size:16px;font-family:Arial'>Here, we load the example data that is available in the teradataml library and use it to demonstrate the functions.</p>

In [None]:
load_example_data("decisionforest", ["boston"])

<p style = 'font-size:16px;font-family:Arial'>Next is an optional step – if you want to see the status of databases/tables created and space used.</p>

In [None]:
%run -i ../../UseCases/run_procedure.py "call space_report();"        # Takes 10 seconds

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>2. Data Exploration</b>
<p style = 'font-size:16px;font-family:Arial'>Create a "Virtual DataFrame" that points to the data set in Vantage. Check the shape of the DataFrame as well as the data types of all its columns.</p>

In [None]:
boston = DataFrame.from_table("boston")
print("Shape of the data: ", boston.shape)
boston

<p style = 'font-size:16px;font-family:Arial'>Split the data into train and test datasets.</p>

In [None]:
TrainTestSplit_out = TrainTestSplit(data = boston,
                                    id_column = "id",
                                    train_size = 0.80,
                                    test_size = 0.20,
                                    seed = 42)

In [None]:
boston_train=TrainTestSplit_out.result[TrainTestSplit_out.result['TD_IsTrainRow'] == 1].drop(['TD_IsTrainRow'], axis = 1)
boston_test=TrainTestSplit_out.result[TrainTestSplit_out.result['TD_IsTrainRow'] == 0].drop(['TD_IsTrainRow'], axis = 1)

In [None]:
boston_train.shape

In [None]:
boston_test.shape
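
<p style = 'font-size:16px;font-family:Arial'>The cells above filter on the <code>TD_IsTrainRow</code> flag (1 for train, 0 for test). As a rough illustration of what a seeded 80/20 split does, here is a plain-Python sketch. The <code>split_ids</code> helper is hypothetical and does not reproduce how Vantage assigns rows; it only shows that a fixed seed yields the same partition on every run.</p>

```python
import random


def split_ids(ids, train_size=0.8, seed=42):
    """Shuffle ids reproducibly and flag each one as train (1) or test (0)."""
    ids = list(ids)
    shuffled = ids[:]
    random.Random(seed).shuffle(shuffled)  # local generator, global state untouched
    n_train = round(len(shuffled) * train_size)
    train_ids = set(shuffled[:n_train])
    return {i: (1 if i in train_ids else 0) for i in ids}


# Ten made-up row ids (illustration only).
flags = split_ids(range(1, 11))
train = [i for i, flag in flags.items() if flag == 1]
test = [i for i, flag in flags.items() if flag == 0]
print(len(train), len(test))
```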

<p style = 'font-size:16px;font-family:Arial'>Create a DecisionForest model on the training data.<br>Detailed help can be found by passing the function name to the built-in help function.</p>

In [None]:
help(DecisionForest)

In [None]:
# Training the model.
DecisionForest_out = DecisionForest(data = boston_train,
                                    input_columns = ['crim', 'zn', 'indus', 'chas', 'nox', 'rm',
                                                    'age', 'dis', 'rad', 'tax', 'ptratio', 'black',
                                                    'lstat'],
                                    response_column = 'medv',
                                    max_depth = 12,
                                    num_trees = 4,
                                    min_node_size = 1,
                                    mtry = 3,
                                    mtry_seed = 1,
                                    seed = 1,
                                    tree_type = 'REGRESSION')

In [None]:
# Print the result DataFrame.
DecisionForest_out.result
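
<p style = 'font-size:16px;font-family:Arial'>The <code>mtry</code> argument used above limits how many input columns are considered when searching for the best split of a node, and <code>mtry_seed</code> makes that random choice repeatable. The plain-Python sketch below illustrates the idea with the same column names. The <code>candidate_features</code> helper is hypothetical, and the columns it picks will not match what Vantage selects internally.</p>

```python
import random

FEATURES = ['crim', 'zn', 'indus', 'chas', 'nox', 'rm',
            'age', 'dis', 'rad', 'tax', 'ptratio', 'black', 'lstat']


def candidate_features(features, mtry, rng):
    """Pick mtry distinct features to evaluate at one split."""
    return rng.sample(features, mtry)


# A seeded generator stands in for mtry_seed: same seed, same choices.
rng = random.Random(1)
for node in range(3):
    print(node, candidate_features(FEATURES, 3, rng))
```

<p style = 'font-size:16px;font-family:Arial'>Because each split sees a different random subset, the trees in the forest differ from one another, which is what makes averaging their predictions useful.</p>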

<p style = 'font-size:16px;font-family:Arial'>Predict the values using the model created above with the TDDecisionForestPredict function.<br>
Detailed help can be found by passing the function name to the built-in help function.</p>

In [None]:
help(TDDecisionForestPredict)

In [None]:
TDDecisionForestPredict_out = TDDecisionForestPredict(newdata=boston_test,
                                                      object=DecisionForest_out,
                                                      id_column="id")

# Print the result DataFrame.
TDDecisionForestPredict_out.result
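
<p style = 'font-size:16px;font-family:Arial'>Once predictions are available, a common way to judge a regression model is the root mean squared error (RMSE) between the actual and predicted values. The sketch below computes it in plain Python on made-up numbers. The <code>rmse</code> helper is hypothetical, and retrieving the actual and predicted values from the result DataFrame is deliberately left out, since that depends on the output columns in your environment.</p>

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up actual and predicted values (illustration only).
actual = [24.0, 21.6, 34.7]
predicted = [25.0, 20.6, 33.7]
print(round(rmse(actual, predicted), 3))
```

<p style = 'font-size:16px;font-family:Arial'>A lower RMSE means the predictions are closer to the actual values, in the same units as the response column.</p>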

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>3. Cleanup</b>

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'> <b>Databases and Tables </b></p>
<p style = 'font-size:16px;font-family:Arial'>The following code will clean up tables and databases created above.</p>

In [None]:
db_drop_table("boston")

In [None]:
remove_context()

<hr style="height:1px;border:none;">
<p style = 'font-size:16px;font-family:Arial'><b>Links:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>Teradataml Python reference: <a href = 'https://docs.teradata.com/search/all?query=Python+Package+User+Guide&content-lang=en-US'>here</a></li>
    <li>DecisionForest function reference: <a href = 'https://docs.teradata.com/search/all?query=DecisionForest&content-lang=en-US'>here</a></li>
    <li>TDDecisionForestPredict function reference: <a href = 'https://docs.teradata.com/search/all?query=DecisionForestPredict&content-lang=en-US'>here</a></li>
</ul>

<footer style="padding-bottom:35px; border-bottom:3px solid #91A0Ab">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2025. All Rights Reserved
        </div>
    </div>
</footer>