## Installation
Run the following commands in a terminal window to install H2O for Python.

Install the dependencies (prepend `sudo` if needed):

```pip install requests```

```pip install tabulate```

```pip install future```

##### Optional, required only for plotting:
```pip install matplotlib```

Note: `requests`, `tabulate`, and `future` are the dependencies required to run H2O. `matplotlib` is optional and only needed for plotting in H2O. A complete list of dependencies is maintained in the following file: https://github.com/h2oai/h2o-3/blob/master/h2o-py/conda/h2o-main/meta.yaml.
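To confirm that the dependencies are visible to the Python interpreter you plan to use, a small check along these lines may help. It is a minimal sketch: the helper name `missing_packages` is hypothetical, and it only tests whether each package can be located on the import path, not whether its version matches what H2O expects.

```python
import importlib.util

# Package names taken from the installation steps; matplotlib is the optional one.
REQUIRED = ["requests", "tabulate", "future"]
OPTIONAL = ["matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing required packages:", missing_packages(REQUIRED) or "none")
print("Missing optional packages:", missing_packages(OPTIONAL) or "none")
```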

Run the following command to remove any existing H2O module for Python.

```pip uninstall h2o```

Use pip to install the latest stable release of the H2O Python module.

```pip install -f http://h2o-release.s3.amazonaws.com/h2o/latest_stable_Py.html h2o```

In [None]:
# Import H2O and the RuleFit estimator, then start (or connect to) a local H2O cluster:
import h2o
from h2o.estimators import H2ORuleFitEstimator

h2o.init()

In [None]:
# Import the titanic dataset and set the column types:
f = "https://s3.amazonaws.com/h2o-public-test-data/smalldata/gbm_test/titanic.csv"
df = h2o.import_file(path=f, col_types={'pclass': "enum", 'survived': "enum"})

# Split the dataset into train and test
train, test = df.split_frame(ratios=[0.8], seed=1)

# Set the predictors and response:
x = ["age", "sibsp", "parch", "fare", "sex", "pclass"]
y = "survived"

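The H2O documentation describes `split_frame` as a probabilistic split rather than an exact one, so with `ratios=[0.8]` the training frame holds roughly, not exactly, 80% of the rows. The sketch below illustrates that idea in plain Python. It is not H2O's implementation: the function name `probabilistic_split` and the use of Python's `random` module are assumptions made for illustration only.

```python
import random


def probabilistic_split(rows, ratio=0.8, seed=1):
    """Assign each row to train with probability `ratio`, otherwise to test."""
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    train_rows, test_rows = [], []
    for row in rows:
        if rng.random() < ratio:
            train_rows.append(row)
        else:
            test_rows.append(row)
    return train_rows, test_rows


train_rows, test_rows = probabilistic_split(list(range(1000)))
# The sizes are close to 800 and 200 but are not guaranteed to match exactly.
print(len(train_rows), len(test_rows))
```

Because every row is assigned independently, the split sizes vary slightly from seed to seed, which is why fixing `seed` matters for reproducible results.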
In [None]:
# Build and train the model:
rfit = H2ORuleFitEstimator(max_rule_length=10,
                           max_num_rules=100,
                           seed=1)
rfit.train(training_frame=train, x=x, y=y)

In [8]:
# Retrieve the rule importance:
rfit.rule_importance()

| variable | coefficient | support | rule |
|---|---|---|---|
| M0T49N14 | 0.8463259 | 0.179635 | (age < 59.41797637939453 or age is NA) & (pclass in {1, 2} or pclass is NA) & (sex in {female}) |
| M0T9N17 | -0.6009129 | 0.5389049 | (age >= 9.522165298461914 or age is NA) & (fare < 52.0334358215332 or fare is NA) & (sex in {male} or sex is NA) |
| M1T28N32 | 0.5592326 | 0.192123 | (fare >= 15.009644508361816 or fare is NA) & (parch < 3.5 or parch is NA) & (sex in {female}) & (sibsp < 2.5 or sibsp is NA) |
| M0T45N20 | -0.4145062 | 0.4726225 | (age >= 9.522165298461914 or age is NA) & (pclass in {2, 3} or pclass is NA) & (sex in {male} or sex is NA) |
| M0T43N13 | 0.3907539 | 0.1853987 | (pclass in {1, 2} or pclass is NA) & (sex in {female}) & (sibsp < 2.5 or sibsp is NA) |
| M2T14N44 | -0.3551698 | 0.4217099 | (age >= 16.38283920288086 or age is NA) & (fare < 26.12078857421875 or fare is NA) & (parch < 0.5 or parch is NA) & (sex in {male} or sex is NA) |
| M0T20N15 | 0.2761472 | 0.1738713 | (age < 55.987640380859375 or age is NA) & (pclass in {1, 2} or pclass is NA) & (sex in {female}) |
| M0T13N14 | -0.1275328 | 0.5427474 | (age < 75.47819519042969 or age is NA) & (parch < 0.5 or parch is NA) & (sex in {male} or sex is NA) |
| M0T3N13 | 0.1261551 | 0.1767531 | (age < 64.48551940917969) & (pclass in {1, 2} or pclass is NA) & (sex in {female}) |
| M1T5N31 | -0.0651541 | 0.4582133 | (age >= 9.522165298461914 or age is NA) & (fare < 48.28102493286133 or fare is NA) & (pclass in {2, 3} or pclass is NA) & (sex in {male} or sex is NA) |

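Each rule in the output is a conjunction of conditions, and a rule "fires" for a row when all of its conditions hold. The sketch below encodes the two highest-ranked rules by hand and sums the coefficients of the rules that fire for a row. Several things here are assumptions: the passenger records are made up, missing values are represented as `None`, and the function names are hypothetical. The sum is only a partial score, since it leaves out the model's intercept, the remaining rules, and any linear terms, so it is not the model's prediction. Reading a positive coefficient as pushing toward `survived = 1` is also an assumption about the sign convention.

```python
def rule_m0t49n14(row):
    """(age < 59.41797637939453 or age is NA) & (pclass in {1, 2} or pclass is NA) & (sex in {female})"""
    age, pclass, sex = row.get("age"), row.get("pclass"), row.get("sex")
    return (
        (age is None or age < 59.41797637939453)
        and (pclass is None or pclass in {"1", "2"})
        and sex == "female"
    )


def rule_m0t9n17(row):
    """(age >= 9.522165298461914 or age is NA) & (fare < 52.0334358215332 or fare is NA) & (sex in {male} or sex is NA)"""
    age, fare, sex = row.get("age"), row.get("fare"), row.get("sex")
    return (
        (age is None or age >= 9.522165298461914)
        and (fare is None or fare < 52.0334358215332)
        and (sex is None or sex == "male")
    )


# (rule, coefficient) pairs copied from the rule importance output.
RULES = [(rule_m0t49n14, 0.8463259), (rule_m0t9n17, -0.6009129)]


def partial_rule_score(row):
    """Sum the coefficients of the hand-coded rules that fire for this row."""
    return sum(coef for rule, coef in RULES if rule(row))


# Made-up passengers, for illustration only.
woman_first_class = {"age": 29.0, "pclass": "1", "sex": "female", "fare": 80.0}
man_third_class = {"age": 30.0, "pclass": "3", "sex": "male", "fare": 10.0}

print(partial_rule_score(woman_first_class))
print(partial_rule_score(man_third_class))
```

The `support` column is the fraction of training rows for which a rule fires, so a rule with a large coefficient but small support affects few passengers.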

In [9]:
# Predict on the test data:
rfit.predict(test)

| predict | p0 | p1 |
|---|---|---|
| 1 | 0.118845 | 0.881155 |
| 0 | 0.715785 | 0.284215 |
| 0 | 0.548647 | 0.451353 |
| 0 | 0.715785 | 0.284215 |
| 0 | 0.715785 | 0.284215 |
| 0 | 0.782249 | 0.217751 |
| 1 | 0.118845 | 0.881155 |
| 1 | 0.118845 | 0.881155 |
| 0 | 0.579991 | 0.420009 |
| 1 | 0.118845 | 0.881155 |

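In the prediction output, `p0` and `p1` are the predicted probabilities of the two classes, and `predict` is the label derived from them. The sketch below shows one way to reproduce the labels from the `p1` column. The 0.5 cutoff is an assumption chosen for illustration: H2O selects its own classification threshold for binomial models, which is not necessarily 0.5, so this happens to agree with the ten rows shown but may not agree in general. The function name `label_from_p1` is hypothetical.

```python
# p1 values copied from the first ten rows of the prediction output.
p1_values = [0.881155, 0.284215, 0.451353, 0.284215, 0.284215,
             0.217751, 0.881155, 0.881155, 0.420009, 0.881155]


def label_from_p1(p1, threshold=0.5):
    """Return 1 when the class-1 probability reaches the threshold, else 0."""
    return 1 if p1 >= threshold else 0


labels = [label_from_p1(p) for p in p1_values]
print(labels)
```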