# Using deepchecks with an H2O Model

[H2O](https://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html) is a widely used AutoML framework, written in Java but with fully implemented API libraries in several languages, including Python. In this example, we'll demonstrate how to test your H2O model with deepchecks in the same way that you would test any sklearn model.

## Building an H2O Model

First we'll load our dataset: the adult dataset, which can be downloaded from the [UCI machine learning repository](http://archive.ics.uci.edu/ml) and is also available in `deepchecks.tabular.datasets`.

In [None]:
import warnings
warnings.filterwarnings('ignore')
from deepchecks.tabular.datasets.classification import adult

In [None]:
train_df, test_df = adult.load_data(data_format='Dataframe', as_train_test=True)

Next, install h2o:

In [None]:
!pip install h2o

Now we'll build an H2O [H2OGradientBoostingEstimator](https://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/modeling.html#h2o.estimators.gbm.H2OGradientBoostingEstimator) based on this dataset split.

In [None]:
import h2o
h2o.no_progress()
h2o.init()

H2O models can only train and predict on H2OFrame objects, so we first convert the training DataFrame.

In [None]:
train_h2o = h2o.H2OFrame(train_df)

In [None]:
gbm = h2o.estimators.gbm.H2OGradientBoostingEstimator()

We must tell the H2O model which columns are the features and which column is the label.

In [None]:
feature_cols = adult._FEATURES
print(feature_cols)
label_col = adult._target
print(label_col)

In [None]:
gbm.train(x=feature_cols, y=label_col, training_frame=train_h2o, verbose=False)
print('Trained!')

To compare to our new H2O model, we'll download the default scikit-learn model (a pipeline containing a naive Random Forest Classifier) for this task from deepchecks.

In [None]:
skmodel = adult.load_fitted_model()

## Running deepchecks on H2O

Before we can run deepchecks on our H2O model, we have one step remaining - bridging between the API of the H2O model and the scikit-learn API that deepchecks expects.

We have created a wrapper for our H2O model that implements the API deepchecks requires, according to the [supported models](https://docs.deepchecks.com/stable/user-guide/tabular/supported_models.html) guide. This wrapper is available from `deepchecks.tabular.integrations.h2o`.
For a classification problem, the wrapper model generally contains two methods that deepchecks will call: ``predict`` and ``predict_proba``. In addition, it is recommended to define a way to get the model's feature importances, implemented here using the h2o model's internal calculation. Read more about feature importance handling in the [feature importance](https://docs.deepchecks.com/dev/user-guide/tabular/feature_importance.html) guide.
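To make the idea of such a wrapper concrete, here is a minimal sketch of the adapter pattern in plain Python. It is not the actual `H2OWrapper` implementation: `ToyFrameModel`, its `score_rows` method and `SklearnStyleWrapper` are hypothetical names invented for illustration, and the toy model only imitates a model whose native output holds a predicted label plus per-class probabilities.

```python
from typing import Dict, List, Sequence


class ToyFrameModel:
    """Hypothetical stand-in for a model with a non-sklearn API (not real H2O)."""

    def score_rows(self, rows: Sequence[Sequence[float]]) -> List[Dict[str, float]]:
        # One dict per row: the predicted label and the per-class probabilities.
        scored = []
        for row in rows:
            # Toy scoring rule: mean of the features, clipped to [0, 1].
            p1 = min(max(sum(row) / (len(row) or 1), 0.0), 1.0)
            scored.append({"predict": int(p1 >= 0.5), "p0": 1.0 - p1, "p1": p1})
        return scored


class SklearnStyleWrapper:
    """Exposes predict / predict_proba on top of the toy model's native output."""

    def __init__(self, model: ToyFrameModel) -> None:
        self.model = model

    def predict(self, rows: Sequence[Sequence[float]]) -> List[int]:
        # Keep only the predicted label of each row.
        return [int(r["predict"]) for r in self.model.score_rows(rows)]

    def predict_proba(self, rows: Sequence[Sequence[float]]) -> List[List[float]]:
        # Keep only the per-class probabilities, one list per row.
        return [[r["p0"], r["p1"]] for r in self.model.score_rows(rows)]


wrapped = SklearnStyleWrapper(ToyFrameModel())
sample = [[0.1, 0.2], [0.9, 0.8]]
labels = wrapped.predict(sample)
probas = wrapped.predict_proba(sample)
print(labels)
print(probas)
```

The point of the sketch is only that the wrapper translates between two calling conventions; the model itself is left untouched.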

In [None]:
from deepchecks.tabular.integrations.h2o import H2OWrapper
h2o_dc_gbm = H2OWrapper(gbm)

Now we can run deepchecks! We'll start by defining the [deepchecks Dataset](https://docs.deepchecks.com/stable/user-guide/tabular/dataset_object.html) objects and comparing the two models.

In [None]:
from deepchecks.tabular import Dataset

In [None]:
train_ds = Dataset(train_df, label=label_col, features=feature_cols, cat_features=adult._CAT_FEATURES)
test_ds = Dataset(test_df, label=label_col, features=feature_cols, cat_features=adult._CAT_FEATURES)

### Comparing model performance

In [None]:
from deepchecks.tabular.checks import MultiModelPerformanceReport

In [None]:
MultiModelPerformanceReport().run(train_ds, test_ds, [h2o_dc_gbm, skmodel])

We can see that from an overall performance standpoint (considering the f1 metric), the H2O model achieves superior performance compared to the sklearn model, especially in the smaller class.
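As a reminder of what a per-class f1 score measures, the following sketch computes it by hand on a small made-up example. The labels and predictions below are illustrative only and are not taken from the adult dataset or from either model; `f1_for_class` is a helper written for this sketch, not a deepchecks function.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score for a single class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)  # true positives
    fp = sum(1 for t, p in pairs if t != cls and p == cls)  # false positives
    fn = sum(1 for t, p in pairs if t == cls and p != cls)  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up data: class 0 is the larger class, class 1 the smaller one.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]

f1_large = f1_for_class(y_true, y_pred, 0)
f1_small = f1_for_class(y_true, y_pred, 1)
print(f1_large, f1_small)
```

In this toy data each class suffers exactly one false positive and one false negative, yet the smaller class ends up with a much lower f1, which is why the per-class view matters on imbalanced data.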

### Further model performance evaluation

Next, we want a more in-depth evaluation of the H2O model. For example, we can use the SegmentPerformance check to compare the performance of the two models across segments of the sex and relationship status features.

In [None]:
from deepchecks.tabular.checks import SegmentPerformance
from sklearn.utils import parallel_backend
with parallel_backend('threading', n_jobs=2):
    result_h2o = SegmentPerformance(feature_1='sex', feature_2='relationship').run(test_ds, h2o_dc_gbm)
    
result_skl = SegmentPerformance(feature_1='sex', feature_2='relationship').run(test_ds, skmodel)

In [None]:
result_h2o

In [None]:
result_skl

We see that the scikit-learn model, apart from being worse overall, is specifically worse (a drop in accuracy of 0.6) at predicting the income of females with a relationship status of "Others".
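The idea behind a per-segment comparison can be sketched in a few lines of plain Python: group the rows by the values of two features and compute a score within each group. This is only an illustration of the concept under simplifying assumptions (accuracy as the score, no binning of feature values), not how the SegmentPerformance check is implemented; `accuracy_by_segment` and the rows below are made up.

```python
from collections import defaultdict


def accuracy_by_segment(rows, feature_1, feature_2):
    """Accuracy within each (feature_1, feature_2) segment of the data."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        key = (row[feature_1], row[feature_2])
        counts[key] += 1
        hits[key] += int(row["label"] == row["prediction"])
    return {key: hits[key] / counts[key] for key in counts}


# Made-up rows; "label" and "prediction" are hypothetical column names.
rows = [
    {"sex": "Female", "relationship": "Wife", "label": 1, "prediction": 1},
    {"sex": "Female", "relationship": "Wife", "label": 0, "prediction": 1},
    {"sex": "Male", "relationship": "Husband", "label": 1, "prediction": 1},
    {"sex": "Male", "relationship": "Husband", "label": 0, "prediction": 0},
]

segment_accuracy = accuracy_by_segment(rows, "sex", "relationship")
for segment, acc in sorted(segment_accuracy.items()):
    print(segment, acc)
```

Running the same grouping for two models and comparing the resulting dictionaries segment by segment is, conceptually, the comparison made above.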

### Wrapping up

We've demonstrated how you can take your h2o model as is and use deepchecks to evaluate it and compare it to other models, but of course the sky is the limit! Once you've wrapped your h2o model with the H2OWrapper, you can use it with any other deepchecks checks and suites, the same way you can use your sklearn models.
Those of you who want to dive a bit deeper can have a look at the wrapper object itself, located at `deepchecks.tabular.integrations.h2o`, and see that implementing such a wrapper takes very little work. The same can be done for nearly any other custom model out there, making deepchecks easy to use with any tabular use case you may encounter.