### Example of RIPPER usage on the congressional voting dataset

In [1]:
import ruleset
import pandas as pd

Load our dataset:

In [2]:
df = pd.read_csv('../datasets/house-votes-84.csv')

Split the data into training and test sets:

In [3]:
from sklearn.model_selection import train_test_split
train, test = train_test_split(df, random_state=0)
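With no `test_size` argument, `train_test_split` holds out 25% of the rows and rounds the test count up. The sketch below only reproduces that arithmetic. It assumes the CSV holds the 435 rows of the UCI Congressional Voting Records data; the notebook does not show the row count, so treat that figure as an assumption. The helper name `default_split_sizes` is made up for illustration.

```python
import math

def default_split_sizes(n_rows, test_fraction=0.25):
    """Return (n_train, n_test) for a split that rounds the test share up."""
    n_test = math.ceil(test_fraction * n_rows)
    return n_rows - n_test, n_test

# 435 rows is an assumption about the CSV, not something shown in the notebook.
n_train, n_test = default_split_sizes(435)
print(n_train, n_test)
```

A test set of 109 rows would be consistent with the accuracy reported further down, since 100/109 is approximately 0.9174.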

Create a ruleset classifier:

In [4]:
ripper_clf = ruleset.RIPPER()
ripper_clf

<RIPPER object (unfit)>

Train the classifier on the training set:

In [5]:
ripper_clf.fit(train, class_feat='Party', random_state=0)
ripper_clf.ruleset_ # Access underlying model

<Ruleset object: [physician-fee-freeze=n] V [adoption-of-the-budget-resolution=y^anti-satellite-test-ban=n]>
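The printed ruleset is a disjunction of rules (`V` reads as OR), and each rule is a conjunction of `feature=value` conditions (`^` reads as AND). As a rough illustration, the sketch below parses that printed form into plain Python lists so the structure is explicit. `parse_ruleset` is a hypothetical helper, not part of the `ruleset` package, and it assumes feature names and values never contain `^`, `=`, or the sequence ` V `.

```python
def parse_ruleset(text):
    """Parse '[a=1^b=2] V [c=3]' into [[('a', '1'), ('b', '2')], [('c', '3')]]."""
    rules = []
    for chunk in text.split(" V "):          # rules are OR-ed together
        body = chunk.strip().strip("[]")     # drop the surrounding brackets
        conds = []
        for cond in body.split("^"):         # conditions are AND-ed together
            feature, value = cond.split("=", 1)
            conds.append((feature, value))
        rules.append(conds)
    return rules

printed = ("[physician-fee-freeze=n] V "
           "[adoption-of-the-budget-resolution=y^anti-satellite-test-ban=n]")

for rule in parse_ruleset(printed):
    print(" AND ".join(f"{feature} == {value!r}" for feature, value in rule))
```

Read this way, a row is predicted positive if it votes `n` on the physician fee freeze, or if it votes `y` on the budget resolution and `n` on the anti-satellite test ban.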

Setting `verbosity` prints the intermediate training steps, so we can see how the ruleset is grown and then optimized:

In [6]:
ripper_clf.verbosity = 1 # Scale of 1-5
ripper_clf.fit(train, class_feat='Party', random_state=0)
ripper_clf.ruleset_


GREW INITIAL RULESET:
[[physician-fee-freeze=n^adoption-of-the-budget-resolution=y] V
[synfuels-corporation-cutback=y^physician-fee-freeze=n] V
[synfuels-corporation-cutback=y^mx-missile=y] V
[adoption-of-the-budget-resolution=y^anti-satellite-test-ban=n] V
[physician-fee-freeze=n] V
[Handicapped-infants=?] V
[synfuels-corporation-cutback=y^superfund-right-to-sue=n]]

optimization run 1 of 2

OPTIMIZED RULESET:
[[physician-fee-freeze=n] V
[synfuels-corporation-cutback=y^physician-fee-freeze=n] V
[synfuels-corporation-cutback=y^mx-missile=y] V
[adoption-of-the-budget-resolution=y^anti-satellite-test-ban=n] V
[physician-fee-freeze=n] V
[Handicapped-infants=?] V
[synfuels-corporation-cutback=y^superfund-right-to-sue=n]]

optimization run 2 of 2

OPTIMIZED RULESET:
[[physician-fee-freeze=n] V
[synfuels-corporation-cutback=y^physician-fee-freeze=n] V
[synfuels-corporation-cutback=y^mx-missile=y] V
[adoption-of-the-budget-resolution=y^anti-satellite-test-ban=n] V
[physician-fee-freeze=n] V


<Ruleset object: [physician-fee-freeze=n] V [adoption-of-the-budget-resolution=y^anti-satellite-test-ban=n]>

How good is our model?

In [7]:
test_X = test.drop('Party', axis=1)
test_y = test['Party']
ripper_clf.score(test_X, test_y) # Default metric is accuracy

0.9174311926605505
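Accuracy is the share of rows whose predicted label matches the true label. A minimal sketch of that calculation, using made-up labels rather than the notebook's data:

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)

# Invented labels for illustration only.
y_true = [True, False, True, True]
y_pred = [True, False, False, True]
print(accuracy(y_true, y_pred))  # 3 of 4 match -> 0.75
```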

We can also score it with other metrics, including those from sklearn:

In [8]:
from sklearn.metrics import precision_score
ripper_clf.score(test_X, test_y, precision_score)

0.9516129032258065
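Precision is the fraction of positive predictions that are actually positive: TP / (TP + FP). The sketch below computes it by hand on invented labels, to show what the number above measures. It is not how `precision_score` is implemented, and returning 0.0 when nothing is predicted positive is a choice made for this sketch.

```python
def precision(y_true, y_pred):
    """TP / (TP + FP); returns 0.0 when no positives are predicted."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p and t)      # true positives
    fp = sum(1 for t, p in zip(y_true, y_pred) if p and not t)  # false positives
    if tp + fp == 0:
        return 0.0
    return tp / (tp + fp)

# Invented labels for illustration only.
y_true = [True, False, True, True, False]
y_pred = [True, True, True, True, False]
print(precision(y_true, y_pred))  # 3 true positives, 1 false positive -> 0.75
```

Because the metric takes `(y_true, y_pred)`, a hand-written function with that signature can presumably be passed to `score` in the same way as `precision_score`; that is an inference from the call above, not something the notebook states.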

To make predictions:

In [9]:
ripper_clf.predict(test_X.tail())

[True, False, True, False, True]

For explainability, we can query the reasons responsible for each prediction:

In [10]:
ripper_clf.predict(test_X.tail(), give_reasons=True)

([True, False, True, False, True],
 [[<Rule object: [physician-fee-freeze=n]>],
  [],
  [<Rule object: [physician-fee-freeze=n]>],
  [],
  [<Rule object: [adoption-of-the-budget-resolution=y^anti-satellite-test-ban=n]>]])
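In the output above, each row predicted `True` is paired with a rule that covers it, and each row predicted `False` gets an empty list because no rule fired. The sketch below mimics that behaviour with plain dictionaries. `RULES`, `covers`, and `predict_with_reasons` are illustrative names, not the `ruleset` API, and the sample rows are invented. This sketch returns every covering rule; the output above does not show whether the library does the same when several rules match.

```python
# The two learned rules, written by hand as {feature: required_value} dicts.
RULES = [
    {"physician-fee-freeze": "n"},
    {"adoption-of-the-budget-resolution": "y", "anti-satellite-test-ban": "n"},
]

def covers(rule, row):
    """A rule covers a row when every one of its conditions holds (AND)."""
    return all(row.get(feature) == value for feature, value in rule.items())

def predict_with_reasons(rows, rules=RULES):
    """Predict True when any rule covers the row (OR), and report which ones."""
    preds, reasons = [], []
    for row in rows:
        fired = [rule for rule in rules if covers(rule, row)]
        preds.append(bool(fired))
        reasons.append(fired)
    return preds, reasons

# Invented rows for illustration only.
rows = [
    {"physician-fee-freeze": "n", "adoption-of-the-budget-resolution": "y",
     "anti-satellite-test-ban": "y"},
    {"physician-fee-freeze": "y", "adoption-of-the-budget-resolution": "n",
     "anti-satellite-test-ban": "y"},
    {"physician-fee-freeze": "y", "adoption-of-the-budget-resolution": "y",
     "anti-satellite-test-ban": "n"},
]

preds, reasons = predict_with_reasons(rows)
for pred, fired in zip(preds, reasons):
    print(pred, fired)
```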