# CIU - Example: Random Forest, Loan Application Classification
This example demonstrates how to use Py-CIU to explain loan application classification decisions made by a random forest. Let us first create a synthetic dataset of loan applications, with their features and approval decisions. The dataset has the following features:

* ``age``,
* ``assets``,
* ``monthly_income``,
* ``gender_female``,
* ``gender_male``,
* ``gender_other``,
* ``job_type_fixed``,
* ``job_type_none``,
* ``job_type_permanent``.
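
The last six features are one-hot encodings of two categorical attributes, gender and job type. As a minimal sketch of that encoding, a raw record could be turned into the nine-feature vector as shown below. The helper `encode_applicant`, the `CATEGORICAL` list, and the sample applicant are hypothetical and are not part of Py-CIU or its data generator.

```python
# Illustration only: hypothetical helper, not part of Py-CIU.
FEATURE_NAMES = [
    'age', 'assets', 'monthly_income', 'gender_female', 'gender_male',
    'gender_other', 'job_type_fixed', 'job_type_none', 'job_type_permanent'
]
CATEGORICAL = ['gender', 'job_type']  # raw attributes that are one-hot encoded


def encode_applicant(raw):
    """Return the values of `raw` as a list ordered like FEATURE_NAMES."""
    encoded = []
    for name in FEATURE_NAMES:
        if name in raw:
            # Numeric features are copied unchanged.
            encoded.append(raw[name])
            continue
        for attribute in CATEGORICAL:
            prefix = attribute + '_'
            if name.startswith(prefix):
                # 'job_type_fixed' -> attribute 'job_type', value 'fixed'
                value = name[len(prefix):]
                encoded.append(1 if raw[attribute] == value else 0)
                break
    return encoded


# Hypothetical applicant, invented for this sketch.
applicant = {
    'age': 35, 'assets': 12000, 'monthly_income': 3200,
    'gender': 'female', 'job_type': 'permanent'
}
encoded_applicant = encode_applicant(applicant)
print(encoded_applicant)
```

Exactly one column per categorical attribute is set to 1, which is why the encoded dataset has three columns each for gender and job type.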

We import the third-party dependencies, Py-CIU, and a synthetic data generator: 

In [None]:
!pip install -e ../
!pip install scikit-learn

In [None]:
from sklearn.ensemble import RandomForestClassifier

import project_path
from ciu import determine_ciu
from ciu_tests.data_generator import generate_data

Now, we run the generator to create our dataset and use it to create a random forest classifier:

In [None]:
data = generate_data()
train_data = data['train'][1]
test_data_encoded = data['test'][1].drop(['approved'], axis=1)

# Separate the training features from the 'approved' target column.
train_features = train_data.drop(['approved'], axis=1)
labels = train_data[['approved']].values.ravel()

random_forest = RandomForestClassifier(
    n_estimators=1000,
    random_state=42
)
random_forest.fit(train_features, labels)



Now, we take a case and classify it:

In [None]:
feature_names = [
    'age', 'assets', 'monthly_income', 'gender_female', 'gender_male',
    'gender_other', 'job_type_fixed', 'job_type_none', 'job_type_permanent'
]

# Use the first test case as the example to classify.
case = test_data_encoded.values[0]
example_prediction = random_forest.predict([case])
example_prediction_prob = random_forest.predict_proba([case])
print(feature_names)
print(f'Case: {case}; Prediction: {example_prediction}; Probability: {example_prediction_prob}')



We call the CIU function. Note that this requires us to provide a mapping from "raw data" feature names to one-hot encoded feature names:

In [None]:
category_mapping = {
    'gender': ['gender_female', 'gender_male', 'gender_other'],
    'job_type': ['job_type_fixed', 'job_type_none', 'job_type_permanent']
}
ciu = determine_ciu(
    case,
    random_forest,
    [
        [20, 70, True], [-20000, 150000, True], [0, 20000, True],
        [0, 1, True], [0, 1, True], [0, 1, True],
        [0, 1, True], [0, 1, True], [0, 1, True]
    ],
    feature_names,
    1000,
    1,
    category_mapping
)
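
To illustrate what the mapping expresses, the sketch below uses it to translate a one-hot encoded case back into its raw categorical values. The helper `decode_categories` and the sample case are hypothetical. This is only a reading aid and makes no claim about how Py-CIU uses the mapping internally.

```python
category_mapping = {
    'gender': ['gender_female', 'gender_male', 'gender_other'],
    'job_type': ['job_type_fixed', 'job_type_none', 'job_type_permanent']
}
feature_names = [
    'age', 'assets', 'monthly_income', 'gender_female', 'gender_male',
    'gender_other', 'job_type_fixed', 'job_type_none', 'job_type_permanent'
]


def decode_categories(case, feature_names, category_mapping):
    """Return the active one-hot column for each raw categorical feature."""
    values = dict(zip(feature_names, case))
    decoded = {}
    for raw_name, encoded_names in category_mapping.items():
        active = [name for name in encoded_names if values[name] == 1]
        # A valid one-hot group has exactly one active column.
        if len(active) != 1:
            raise ValueError(f'{raw_name} is not one-hot encoded: {active}')
        decoded[raw_name] = active[0]
    return decoded


# Hypothetical encoded applicant, invented for this sketch.
sample_case = [35, 12000, 3200, 1, 0, 0, 0, 0, 1]
decoded = decode_categories(sample_case, feature_names, category_mapping)
print(decoded)
```

Grouping the columns this way lets each raw categorical feature be treated as a single feature rather than as three unrelated binary ones.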

Finally, we display the contextual importance (CI) and contextual utility (CU):

In [None]:
ciu.plot_ci()

In [None]:
ciu.plot_cu()
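
For orientation, contextual importance and utility are commonly defined in the CIU literature as CI = (Cmax - Cmin) / (absmax - absmin) and CU = (out - Cmin) / (Cmax - Cmin). Here Cmin and Cmax are the lowest and highest model outputs observed while one feature is varied and the rest of the case is held fixed, out is the output for the case itself, and absmin and absmax bound the output overall. The sketch below estimates both values by random sampling for a toy scoring function. `toy_score`, `estimate_ci_cu`, and all numbers are hypothetical stand-ins; this is not Py-CIU's implementation.

```python
import random


def toy_score(age, monthly_income):
    """Hypothetical model: a score in [0, 1], driven mostly by income."""
    income_part = min(monthly_income / 20000, 1.0)
    age_part = 1.0 - abs(age - 45) / 25  # in [0, 1] for ages 20 to 70
    return 0.8 * income_part + 0.2 * age_part


def estimate_ci_cu(case, feature, low, high, samples=1000, seed=42):
    """Estimate CI and CU of `feature` by sampling it uniformly in [low, high]."""
    rng = random.Random(seed)
    current = toy_score(**case)
    outputs = [current]
    for _ in range(samples):
        perturbed = dict(case)
        perturbed[feature] = rng.uniform(low, high)
        outputs.append(toy_score(**perturbed))
    c_min, c_max = min(outputs), max(outputs)
    abs_min, abs_max = 0.0, 1.0  # toy_score is bounded to [0, 1]
    ci = (c_max - c_min) / (abs_max - abs_min)
    if c_max > c_min:
        cu = (current - c_min) / (c_max - c_min)
    else:
        # CU is undefined when the feature has no effect; 0.5 is an
        # arbitrary placeholder chosen for this sketch.
        cu = 0.5
    return ci, cu


# Hypothetical case, invented for this sketch.
toy_case = {'age': 30, 'monthly_income': 5000}
ci_income, cu_income = estimate_ci_cu(toy_case, 'monthly_income', 0, 20000)
ci_age, cu_age = estimate_ci_cu(toy_case, 'age', 20, 70)
print(f'monthly_income: CI={ci_income:.2f}, CU={cu_income:.2f}')
print(f'age:            CI={ci_age:.2f}, CU={cu_age:.2f}')
```

In this toy setup income gets the larger CI because it can move the score over a wider range than age can, while CU shows how favourable the case's current value is within that range.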