# Rule Applier Spark Example

The Rule Applier is used to apply Iguanas-readable rules to a dataset stored in Spark.

## Requirements

To run this example, you'll need the following:

* A dataset containing the same features used in the rules.

----

## Import packages

In [8]:
from iguanas.rule_application import RuleApplier
from iguanas.metrics.classification import FScore

import databricks.koalas as ks

## Read in data

Let's read in some dummy data using Koalas, which implements the Pandas DataFrame API on top of Apache Spark:

In [9]:
# Load the features into a Koalas dataframe, indexed by the record ID column
X = ks.read_csv(
    'dummy_data/X_train.csv',
    index_col='eid'
)

----

## Apply rules

### Set up class parameters

Now we can set our class parameters for the Rule Applier.

**Please see the class docstring for more information on each parameter.**

In [10]:
params = {
    'rule_strings': {
        'Rule1': "(X['account_number_num_fraud_transactions_per_account_number_1day']>=1)",
        'Rule2': "(X['account_number_num_fraud_transactions_per_account_number_1day']>=1)&(X['account_number_num_fraud_transactions_per_account_number_30day']>=1)",
        'Rule3': "(X['account_number_num_fraud_transactions_per_account_number_1day']>=1)&(X['order_total']>50.87)"
    }
}
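To make the rule-string format concrete, here is a minimal, dependency-free sketch of what such a string expresses: each comparison produces one boolean per record, and `&` combines them elementwise. This assumes the rule string can be read as a plain Python expression over `X`; it is not the Iguanas implementation. The `Col` class, the shortened column names, and the data values are all hypothetical stand-ins for a real dataframe.

```python
class Col:
    """Hypothetical stand-in for a dataframe column (elementwise operators only)."""

    def __init__(self, values):
        self.values = list(values)

    def __ge__(self, other):
        return Col(v >= other for v in self.values)

    def __gt__(self, other):
        return Col(v > other for v in self.values)

    def __and__(self, other):
        return Col(a and b for a, b in zip(self.values, other.values))


# Made-up records; the short column names are placeholders, not the real features.
X = {
    "num_fraud_1day": Col([0, 1, 2, 1]),
    "order_total": Col([120.0, 30.0, 75.5, 50.87]),
}

rule_strings = {
    "Rule1": "(X['num_fraud_1day']>=1)",
    "Rule3": "(X['num_fraud_1day']>=1)&(X['order_total']>50.87)",
}

# Evaluate each rule string with X in scope, then cast the booleans to 0/1 flags.
flags = {
    name: [int(v) for v in eval(rule, {"X": X}).values]
    for name, rule in rule_strings.items()
}
print(flags)
```

Note that the parentheses around each condition matter: `&` binds more tightly than the comparison operators in Python, so every comparison has to be wrapped before the conditions are combined.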

### Instantiate class and run

Once the parameters have been set, we can instantiate the class and run the `transform` method to apply the rules to the dataset.

In [11]:
ara = RuleApplier(**params)
X_rules = ara.transform(X=X)



### Outputs

The `transform` method returns a dataframe with one binary column per rule, showing whether each rule triggered (1) or not (0) for each record in the training dataset:

In [12]:
X_rules.head()

| eid | Rule1 | Rule2 | Rule3 |
|---|---|---|---|
| 867-8837095-9305559 | 0 | 0 | 0 |
| 974-5306287-3527394 | 0 | 0 | 0 |
| 584-0112844-9158928 | 0 | 0 | 0 |
| 956-4190732-7014837 | 0 | 0 | 0 |
| 349-7005645-8862067 | 0 | 0 | 0 |
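Because each column is a 0/1 flag, a natural next step is to count how many records each rule triggered on. The sketch below does this in plain Python on a toy stand-in for the output. The flag values are made up for illustration and do not come from the dummy dataset, and `trigger_counts` is a hypothetical helper, not part of Iguanas.

```python
# Toy stand-in for the rule output: one dict per record, one 0/1 flag per rule.
# These values are invented for illustration only.
rule_flags = [
    {"Rule1": 0, "Rule2": 0, "Rule3": 0},
    {"Rule1": 1, "Rule2": 0, "Rule3": 1},
    {"Rule1": 1, "Rule2": 1, "Rule3": 0},
    {"Rule1": 0, "Rule2": 0, "Rule3": 0},
]


def trigger_counts(rows):
    """Count how many records each rule flagged."""
    counts = {}
    for row in rows:
        for rule, flag in row.items():
            counts[rule] = counts.get(rule, 0) + flag
    return counts


counts = trigger_counts(rule_flags)
# Share of records flagged by each rule.
rates = {rule: n / len(rule_flags) for rule, n in counts.items()}
print(counts)
print(rates)
```

On the dataframe itself, the column-wise `sum()` and `mean()` methods should give the same counts and rates without any custom code.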


----