# Understand and utilize `RawData` in ValidMind tests

Test functions in ValidMind can return a special object called *`RawData`*, which holds intermediate or unprocessed data that the test logic produces but does not include in the test's visible output, such as tables or figures.

- The `RawData` feature allows you to customize the output of tests, making it a powerful tool for creating custom tests and post-processing functions.
- `RawData` is useful when running post-processing functions with tests to recompute tabular outputs, redraw figures, or even create new outputs entirely.
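
To make the idea concrete, here is a minimal, library-free sketch of the pattern described above: a test returns its visible output together with a container of raw values, and a post-processing function later uses those raw values to rebuild the output. The names `RawDataSketch`, `mean_score_test`, and `add_range` are hypothetical stand-ins invented for illustration; they are not part of the ValidMind Library, and the real `RawData` object may behave differently.

```python
from typing import Any, Dict, List, Tuple


class RawDataSketch:
    """Hypothetical stand-in for a raw-data holder (NOT ValidMind's actual class)."""

    def __init__(self, **kwargs: Any) -> None:
        # Store each keyword argument as an attribute for later access
        for key, value in kwargs.items():
            setattr(self, key, value)

    def keys(self) -> List[str]:
        # List the names of the stored raw values
        return list(vars(self))


def mean_score_test(scores: List[float]) -> Tuple[Dict[str, float], RawDataSketch]:
    """Toy test: the visible output is a one-row summary; the raw scores ride along."""
    summary = {"mean": sum(scores) / len(scores)}
    return summary, RawDataSketch(scores=scores)


def add_range(summary: Dict[str, float], raw: RawDataSketch) -> Dict[str, float]:
    """Toy post-processing step: extend the summary using the raw scores."""
    return {**summary, "min": min(raw.scores), "max": max(raw.scores)}


summary, raw = mean_score_test([0.2, 0.4, 0.9])
enriched = add_range(summary, raw)
```

The point of the design is that `add_range` never has to rerun the test: everything it needs to produce a new output is already carried by the raw-data container.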

In this notebook, you'll learn how to access, inspect, and utilize `RawData` from ValidMind tests.

## Setup

Before we can run our examples, we need to set up an environment for running tests with the ValidMind Library. Since this notebook focuses on the `RawData` object, this section summarizes the steps rather than explaining them in detail.

To learn more about running tests with ValidMind: **[Run tests and test suites](https://docs.validmind.ai/developer/model-testing/testing-overview.html)**

### Installation and initialization

First, let's make sure that the ValidMind Library is installed and ready to go, and that our Python environment is set up for data analysis:

In [None]:
# Install the ValidMind Library
%pip install -q validmind

# Import the ValidMind Library
import validmind as vm

# Import the `xgboost` library with an alias
import xgboost as xgb


### Load the sample dataset

Then, we'll import a sample ValidMind dataset and preprocess it:

In [None]:
# Import the `customer_churn` sample dataset
from validmind.datasets.classification import customer_churn
raw_df = customer_churn.load_data()

# Preprocess the raw dataset
train_df, validation_df, test_df = customer_churn.preprocess(raw_df)

# Separate features and targets
x_train = train_df.drop(customer_churn.target_column, axis=1)
y_train = train_df[customer_churn.target_column]
x_val = validation_df.drop(customer_churn.target_column, axis=1)
y_val = validation_df[customer_churn.target_column]

# Create an `XGBClassifier` object with early stopping and evaluation metrics
model = xgb.XGBClassifier(
    early_stopping_rounds=10,
    eval_metric=["error", "logloss", "auc"],
)

# Train the model on the training set, using the validation set for early stopping
model.fit(
    x_train,
    y_train,
    eval_set=[(x_val, y_val)],
    verbose=False,
)

### Initialize the ValidMind objects

Before you can run tests, you'll need to initialize ValidMind dataset objects and a ValidMind model object, which can then be passed to other functions for analysis and tests on the data:


In [None]:
# Initialize the raw dataset object
vm_raw_dataset = vm.init_dataset(
    dataset=raw_df,
    input_id="raw_dataset",
    target_column=customer_churn.target_column,
    class_labels=customer_churn.class_labels,
    __log=False,
)

# Initialize the training and test datasets as their own dataset objects
vm_train_ds = vm.init_dataset(
    dataset=train_df,
    input_id="train_dataset",
    target_column=customer_churn.target_column,
    __log=False,
)
vm_test_ds = vm.init_dataset(
    dataset=test_df,
    input_id="test_dataset",
    target_column=customer_churn.target_column,
    __log=False,
)

# Initialize a model object
vm_model = vm.init_model(
    model,
    input_id="model",
    __log=False,
)

# Assign predictions to the datasets
vm_train_ds.assign_predictions(
    model=vm_model,
)

vm_test_ds.assign_predictions(
    model=vm_model,
)

## Examples

Once you're set up, you can run the following examples:

### Using `RawData` from the ROC Curve Test

### Pearson Correlation Matrix

### Precision-Recall Curve

### Using `RawData` in custom tests