# Credit Approval Model using a Bayesian Network

Let us look at a credit approval example. Note that the model shown here does not closely follow any real-life approval process; it was built from scratch solely for practice and ease of explanation.

Two factors, Outstanding Loan (OL) and Payment History (PH), are independent of each other and both influence a third factor, Credit Rating (CR). Credit Rating and Income Level (IL) are in turn independent of each other and both influence the Interest Rate (IR) of the credit line extended to a customer. Depending on CR and IL, a customer may receive credit at a premium, par, or discounted interest rate.

The model is shown in the form of a directed graph, with probabilities:

<img src="../images/credit_approval.png" style="height:50vh;">

Create a Bayesian Network and add CPDs to the above model.


## Specify the CPDs

* Using the probabilities shown in the graph above, specify all CPDs for the credit approval model:
    * credit_rating_cpd
    * interest_rate_cpd
    * Outstanding_loan_cpd
    * Payment_history_cpd
    * Income_level_cpd
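The `values` argument of a conditional CPD holds one column per combination of evidence states, and each column has to sum to 1. The following is a minimal sketch in plain Python (no pgmpy needed) that labels the columns of the CR table and checks their sums. It assumes the column order seen in the printed CPD further down, where the last evidence variable (PH) varies fastest; the helper names are illustrative only.

```python
from itertools import product

# CR table as given in this exercise: rows are CR_0..CR_3,
# columns are the (OL, PH) combinations.
values = [[0.85, 0.04, 0.12, 0.02, 0.13, 0.01],
          [0.1, 0.07, 0.65, 0.07, 0.2, 0.04],
          [0.04, 0.75, 0.15, 0.25, 0.45, 0.25],
          [0.01, 0.14, 0.08, 0.66, 0.22, 0.7]]
evidence = ['OL', 'PH']
evidence_card = [3, 2]

# Assumed column order: the last evidence variable varies fastest.
columns = list(product(*(range(card) for card in evidence_card)))

column_sums = []
for idx, states in enumerate(columns):
    label = ', '.join(f'{name}_{state}' for name, state in zip(evidence, states))
    col_sum = sum(row[idx] for row in values)
    column_sums.append(col_sum)
    print(f'column {idx}: {label}  sum = {col_sum:.2f}')
```

A table whose columns do not sum to 1 is not a valid conditional distribution, so a check of this kind is a cheap way to catch transcription mistakes before building the model.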



In [74]:
from pgmpy.factors.discrete import TabularCPD
from pgmpy.models import BayesianModel

credit_rating_cpd = TabularCPD(
                variable='CR',
                variable_card=4,
                values=[[0.85, 0.04, 0.12, 0.02, 0.13, 0.01],
                        [0.1, 0.07, 0.65, 0.07, 0.2, 0.04],
                        [0.04, 0.75, 0.15, 0.25, 0.45, 0.25],
                        [0.01, 0.14, 0.08, 0.66, 0.22, 0.7]],
                evidence=['OL', 'PH'],
                evidence_card=[3, 2])




#hint

In [75]:
credit_rating_cpd = TabularCPD(variable='CR', variable_card=4,
                values=[[0.85, 0.04, 0.12, 0.02, 0.13, 0.01],
                        [0.1, 0.07, 0.65, 0.07, 0.2, 0.04],
                        [0.04, 0.75, 0.15, 0.25, 0.45, 0.25],
                        [0.01, 0.14, 0.08, 0.66, 0.22, 0.7]],
                evidence=['OL', 'PH'], evidence_card=[3, 2])

interest_rate_cpd = TabularCPD(variable='IR', variable_card=3,
                values=[[0.01, 0.05, 0.12, 0.02, 0.05, 0.15, 0.3, 0.4, 0.55, 0.57, 0.83, 0.94],
                        [0.09, 0.7, 0.7, 0.23, 0.4, 0.45, 0.6, 0.55, 0.4, 0.4, 0.15, 0.05],
                        [0.9, 0.25, 0.18, 0.75, 0.55, 0.4, 0.1, 0.05, 0.05, 0.03, 0.02, 0.01]],
                evidence=['CR', 'IL'], evidence_card=[4, 3])

Outstanding_loan_cpd = TabularCPD(variable='OL', variable_card=3, values=[[0.15, 0.55, 0.3]])

Payment_history_cpd = TabularCPD(variable='PH', variable_card=2, values=[[0.8, 0.2]])

Income_level_cpd = TabularCPD(variable='IL', variable_card=3, values=[[0.1, 0.6, 0.3]])


In [76]:
ref_tmp_var = False

import numpy as np
credit_rating_cpd = TabularCPD(
                variable='CR',
                variable_card=4,
                values=[[0.85, 0.04, 0.12, 0.02, 0.13, 0.01],
                        [0.1, 0.07, 0.65, 0.07, 0.2, 0.04],
                        [0.04, 0.75, 0.15, 0.25, 0.45, 0.25],
                        [0.01, 0.14, 0.08, 0.66, 0.22, 0.7]],
                evidence=['OL', 'PH'],
                evidence_card=[3, 2])

interest_rate_cpd = TabularCPD(
                variable='IR',
                variable_card=3,
                values=[[0.01, 0.05, 0.12, 0.02, 0.05, 0.15, 0.3, 0.4, 0.55, 0.57, 0.83, 0.94],
                        [0.09, 0.7, 0.7, 0.23, 0.4, 0.45, 0.6, 0.55, 0.4, 0.4, 0.15, 0.05],
                        [0.9, 0.25, 0.18, 0.75, 0.55, 0.4, 0.1, 0.05, 0.05, 0.03, 0.02, 0.01]],
                evidence=['CR', 'IL'],
                evidence_card=[4, 3])

Outstanding_loan_cpd = TabularCPD(
                variable='OL',
                variable_card=3,
                values=[[0.15, 0.55, 0.3]])

Payment_history_cpd = TabularCPD(
                variable='PH',
                variable_card=2,
                values=[[0.8, 0.2]])

Income_level_cpd = TabularCPD(
                variable='IL',
                variable_card=3,
                values=[[0.1, 0.6, 0.3]])
try:
    if (np.all(credit_rating_cpd.get_values() == credit_rating_cpd.get_values())):
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



## Building the Credit Approval Model

You can specify the dependencies in the Bayesian Network by passing a list of edges to BayesianModel():

``` python
[('OL', 'CR'),
('PH', 'CR'),
('IL', 'IR'),
('CR', 'IR')]
```

* Assign the instance to credit_approval_model.

In [77]:
credit_approval_model = BayesianModel()

Use BayesianModel([('OL', 'CR'),
                             ('PH', 'CR'),
                             ('IL', 'IR'),
                             ('CR', 'IR')])

In [78]:

credit_approval_model = BayesianModel([('OL', 'CR'),
                             ('PH', 'CR'),
                             ('IL', 'IR'),
                             ('CR', 'IR')])


In [79]:
ref_tmp_var = False

a =1
try:
    if a == 1:
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



## Add CPDs

Add CPDs using add_cpds() and validate the model.

In [80]:
credit_approval_model.add_cpds(credit_rating_cpd, interest_rate_cpd, Outstanding_loan_cpd, Payment_history_cpd, Income_level_cpd)

In [81]:
credit_approval_model.check_model()


True

In [82]:
ref_tmp_var = False

a = 1
try:
    if a == 1:
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



## Obtain CPDs, Leaves and Independencies

You can now inspect the model's CPDs, leaves, and independencies.

In [83]:
#

In [84]:
credit_approval_model.get_leaves()

['IR']

In [85]:
ref_tmp_var = False

a = 1
try:
    if a == 1:
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



## Verifying the CPDs

``` python
for cpd in credit_approval_model.get_cpds():
    print("CPD of {variable}:".format(variable=cpd.variable))
    print(cpd)
```

In [86]:
# Iterate over credit_approval_model.get_cpds()


In [87]:
for cpd in credit_approval_model.get_cpds():
    print("CPD of {variable}:".format(variable=cpd.variable))
    print(cpd)

CPD of CR:
╒══════╤══════╤══════╤══════╤══════╤══════╤══════╕
│ OL   │ OL_0 │ OL_0 │ OL_1 │ OL_1 │ OL_2 │ OL_2 │
├──────┼──────┼──────┼──────┼──────┼──────┼──────┤
│ PH   │ PH_0 │ PH_1 │ PH_0 │ PH_1 │ PH_0 │ PH_1 │
├──────┼──────┼──────┼──────┼──────┼──────┼──────┤
│ CR_0 │ 0.85 │ 0.04 │ 0.12 │ 0.02 │ 0.13 │ 0.01 │
├──────┼──────┼──────┼──────┼──────┼──────┼──────┤
│ CR_1 │ 0.1  │ 0.07 │ 0.65 │ 0.07 │ 0.2  │ 0.04 │
├──────┼──────┼──────┼──────┼──────┼──────┼──────┤
│ CR_2 │ 0.04 │ 0.75 │ 0.15 │ 0.25 │ 0.45 │ 0.25 │
├──────┼──────┼──────┼──────┼──────┼──────┼──────┤
│ CR_3 │ 0.01 │ 0.14 │ 0.08 │ 0.66 │ 0.22 │ 0.7  │
╘══════╧══════╧══════╧══════╧══════╧══════╧══════╛
CPD of IR:
╒══════╤══════╤══════╤══════╤══════╤══════╤══════╤══════╤══════╤══════╤══════╤══════╤══════╕
│ CR   │ CR_0 │ CR_0 │ CR_0 │ CR_1 │ CR_1 │ CR_1 │ CR_2 │ CR_2 │ CR_2 │ CR_3 │ CR_3 │ CR_3 │
├──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┼──────┤
│ IL   │ IL_0 │ IL_1 │ IL_2 │ IL_0 │

In [88]:
ref_tmp_var = False

a = 1
try:
    if a == 1:
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



## Computations of Probabilities

``` python

from pgmpy.inference.base import Inference
from pgmpy.factors import factor_product

import itertools


class SimpleInference(Inference):
    def query(self, var, evidence):
        # self.factors is a dict of the form of {node: [factors_involving_node]}
        factors_list = set(itertools.chain(*self.factors.values()))
        product = factor_product(*factors_list)
        reduced_prod = product.reduce(evidence, inplace=False)
        reduced_prod.normalize()
        var_to_marg = set(self.model.nodes()) - set(var) - set([state[0] for state in evidence])
        marg_prod = reduced_prod.marginalize(var_to_marg, inplace=False)
        return marg_prod
```

### Computing CPDs against Evidence

* Query IR|OL=0 and assign to ir.

In [89]:
from pgmpy.inference.base import Inference
from pgmpy.factors import factor_product

import itertools


class SimpleInference(Inference):
    def query(self, var, evidence):
        # self.factors is a dict of the form of {node: [factors_involving_node]}
        factors_list = set(itertools.chain(*self.factors.values()))
        product = factor_product(*factors_list)
        reduced_prod = product.reduce(evidence, inplace=False)
        reduced_prod.normalize()
        var_to_marg = set(self.model.nodes()) - set(var) - set([state[0] for state in evidence])
        marg_prod = reduced_prod.marginalize(var_to_marg, inplace=False)
        return marg_prod

Use SimpleInference(credit_approval_model)

In [90]:
infer = SimpleInference(credit_approval_model)
ir = infer.query(var=['IR'], evidence=[('OL', 0)])
print(ir)

╒══════╤═══════════╕
│ IR   │   phi(IR) │
╞══════╪═══════════╡
│ IR_0 │    0.1626 │
├──────┼───────────┤
│ IR_1 │    0.5751 │
├──────┼───────────┤
│ IR_2 │    0.2623 │
╘══════╧═══════════╛
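
The query above can be cross-checked without pgmpy. The following is a minimal sketch, not the library's implementation: it multiplies the five CPDs into the joint distribution, keeps the terms consistent with the evidence, and normalises. The names `joint` and `enumerate_query` are illustrative, and the column indexing assumes the last evidence variable varies fastest, as in the printed CPDs.

```python
from itertools import product

NODES = ['OL', 'PH', 'IL', 'CR', 'IR']
CARD = {'OL': 3, 'PH': 2, 'IL': 3, 'CR': 4, 'IR': 3}

# CPD tables copied from the cells above.
P_OL = [0.15, 0.55, 0.3]
P_PH = [0.8, 0.2]
P_IL = [0.1, 0.6, 0.3]
P_CR = [[0.85, 0.04, 0.12, 0.02, 0.13, 0.01],
        [0.1, 0.07, 0.65, 0.07, 0.2, 0.04],
        [0.04, 0.75, 0.15, 0.25, 0.45, 0.25],
        [0.01, 0.14, 0.08, 0.66, 0.22, 0.7]]
P_IR = [[0.01, 0.05, 0.12, 0.02, 0.05, 0.15, 0.3, 0.4, 0.55, 0.57, 0.83, 0.94],
        [0.09, 0.7, 0.7, 0.23, 0.4, 0.45, 0.6, 0.55, 0.4, 0.4, 0.15, 0.05],
        [0.9, 0.25, 0.18, 0.75, 0.55, 0.4, 0.1, 0.05, 0.05, 0.03, 0.02, 0.01]]


def joint(ol, ph, il, cr, ir):
    """Joint probability from the chain rule of the network."""
    return (P_OL[ol] * P_PH[ph] * P_IL[il]
            * P_CR[cr][ol * CARD['PH'] + ph]    # column for (OL, PH)
            * P_IR[ir][cr * CARD['IL'] + il])   # column for (CR, IL)


def enumerate_query(var, evidence):
    """Return P(var | evidence) as a list; evidence is a dict {node: state}."""
    totals = [0.0] * CARD[var]
    for states in product(*(range(CARD[n]) for n in NODES)):
        assignment = dict(zip(NODES, states))
        if all(assignment[n] == s for n, s in evidence.items()):
            totals[assignment[var]] += joint(*states)
    norm = sum(totals)
    return [t / norm for t in totals]


ir_given_ol0 = enumerate_query('IR', {'OL': 0})
print([round(p, 4) for p in ir_given_ol0])
```

Rounded to four decimals, the result should agree with the phi(IR) column printed above. Full enumeration only scales to small networks like this one (216 joint states), which is why it is useful as a check rather than as an inference method.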


In [91]:
ref_tmp_var = False

import numpy as np

try:
    if abs(ir.values[0] - 0.1626 ) < 0.1:
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



### Computing CPDs against Evidence of OL and CR

* Query IR|OL=0, CR=0 and assign to ir.

In [92]:
#

Use SimpleInference(credit_approval_model)

In [93]:
infer = SimpleInference(credit_approval_model)
ir = infer.query(var=['IR'], evidence=[('OL', 0),('CR', 0)])
print(ir)

╒══════╤═══════════╕
│ IR   │   phi(IR) │
╞══════╪═══════════╡
│ IR_0 │    0.0670 │
├──────┼───────────┤
│ IR_1 │    0.6390 │
├──────┼───────────┤
│ IR_2 │    0.2940 │
╘══════╧═══════════╛


In [94]:
ref_tmp_var = False

a = 1
try:
    if a == 1:
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



### Computing CPDs against additional Evidence of OL and CR

* Query IR|OL=2, CR=0 and assign to ir.

In [95]:
#

Use SimpleInference(credit_approval_model)

In [96]:
infer = SimpleInference(credit_approval_model)
ir = infer.query(var=['IR'], evidence=[('OL', 2), ('CR', 0)])
print(ir)

╒══════╤═══════════╕
│ IR   │   phi(IR) │
╞══════╪═══════════╡
│ IR_0 │    0.0670 │
├──────┼───────────┤
│ IR_1 │    0.6390 │
├──────┼───────────┤
│ IR_2 │    0.2940 │
╘══════╧═══════════╛
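
The result for OL=2, CR=0 is identical to the one for OL=0, CR=0. A short derivation, sketched here from the structure of the graph, shows why: once CR is observed, OL carries no further information about IR. Since the parents of IR are CR and IL, and IL is independent of OL and CR,

$$
\begin{aligned}
P(IR \mid OL=o,\, CR=c) &= \sum_{il} P(IL=il \mid OL=o,\, CR=c)\; P(IR \mid CR=c,\, IL=il) \\
&= \sum_{il} P(IL=il)\; P(IR \mid CR=c,\, IL=il)
\end{aligned}
$$

The right-hand side does not depend on $o$. For $c = 0$ and the CPD values used in this exercise:

$$
\begin{aligned}
P(IR_0 \mid CR_0) &= 0.1 \cdot 0.01 + 0.6 \cdot 0.05 + 0.3 \cdot 0.12 = 0.067 \\
P(IR_1 \mid CR_0) &= 0.1 \cdot 0.09 + 0.6 \cdot 0.7 + 0.3 \cdot 0.7 = 0.639 \\
P(IR_2 \mid CR_0) &= 0.1 \cdot 0.9 + 0.6 \cdot 0.25 + 0.3 \cdot 0.18 = 0.294
\end{aligned}
$$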


In [97]:
ref_tmp_var = False

a = 1
try:
    if a == 1:
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



## D-separation 

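* Consider all possible paths from any node in A to any node in B
* A path is blocked if it contains a node where the arrows meet head-to-tail or tail-to-tail and the node is in C, or a node where the arrows meet head-to-head and neither the node nor any of its descendants is in C
* If all possible paths are blocked, then A is d-separated from B by C, and $A \perp B \space | \space C$ holds
* Is  $A \perp B \space | \space C$?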

<img src="../images/d-sep-1.png" style="height:60vh;">

The path from a to b is not blocked by f because

*  It is a tail-to-tail node
*  It is not observed

It is not blocked by e because

*  Although it is a head-to-head node, it has a descendant c in the conditioning set

Thus $A \perp B \space | \space C$ does not follow from this graph.
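
The path-blocking test can also be carried out mechanically. The following is a minimal sketch in plain Python that uses an equivalent formulation of d-separation (the moralised ancestral graph) rather than enumerating paths; it is a swapped-in technique, not pgmpy's implementation, and the function name `d_separated` is illustrative. It is applied to the edges of the credit approval network defined earlier.

```python
from itertools import combinations

# Edges of the credit approval network, as passed to BayesianModel() earlier.
EDGES = [('OL', 'CR'), ('PH', 'CR'), ('IL', 'IR'), ('CR', 'IR')]


def d_separated(edges, x, y, given=()):
    """Return True if node x is d-separated from node y by the nodes in `given`."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)

    # 1. Keep only x, y, the conditioning nodes, and all of their ancestors.
    keep, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))

    # 2. Moralise: link every child to its parents and link co-parents together.
    neighbours = {node: set() for node in keep}
    for child in keep:
        pars = parents.get(child, set())
        for p in pars:
            neighbours[p].add(child)
            neighbours[child].add(p)
        for p, q in combinations(pars, 2):
            neighbours[p].add(q)
            neighbours[q].add(p)

    # 3. Remove the conditioning nodes and test whether x can still reach y.
    blocked, seen, stack = set(given), set(), [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node in seen or node in blocked:
            continue
        seen.add(node)
        stack.extend(neighbours[node])
    return True


print(d_separated(EDGES, 'IL', 'OL'))            # no evidence
print(d_separated(EDGES, 'IL', 'OL', ['IR']))    # IR observed (head-to-head node)
print(d_separated(EDGES, 'IR', 'OL', ['CR']))    # CR observed (head-to-tail node)
```

Observing the head-to-head node IR connects IL and OL, while observing the head-to-tail node CR cuts the only path between OL and IR.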

### D-Separation - Example 2

* Is  $A \perp B \space | \space C$?

<img src="../images/d-sep-2.png" style="height:60vh;">



### D-Separation in Credit Approval Model

<img src="../images/credit_approval.png" style="height:50vh;">

* Is $IL \perp OL \space | \space IR$? Change the probabilities and demonstrate your result. Assign the True/False result to IL_OL

In [98]:
infer = SimpleInference(credit_approval_model)

Use OL and toggle IR to observe the independence.

In [99]:
il_ol1 = infer.query(var=['IL'], evidence=[('OL', 1), ('IR', 0)])
il_ol2 = infer.query(var=['IL'], evidence=[('OL', 1), ('IR', 1)])
print(il_ol1, il_ol2)
IL_OL = False

╒══════╤═══════════╕
│ IL   │   phi(IL) │
╞══════╪═══════════╡
│ IL_0 │    0.0610 │
├──────┼───────────┤
│ IL_1 │    0.5508 │
├──────┼───────────┤
│ IL_2 │    0.3882 │
╘══════╧═══════════╛ ╒══════╤═══════════╕
│ IL   │   phi(IL) │
╞══════╪═══════════╡
│ IL_0 │    0.0797 │
├──────┼───────────┤
│ IL_1 │    0.6229 │
├──────┼───────────┤
│ IL_2 │    0.2974 │
╘══════╧═══════════╛
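
Because the statement under test concerns OL, a direct check holds IR fixed and varies OL instead. The following is a minimal sketch in plain Python, assuming the CPD values from earlier cells; the function name `il_given_ol_and_ir0` is illustrative. It sums out PH and CR by hand rather than calling pgmpy, and only needs the IR_0 row of the IR table because IR is fixed at state 0.

```python
# CPD tables copied from earlier cells.
P_PH = [0.8, 0.2]
P_IL = [0.1, 0.6, 0.3]
P_CR = [[0.85, 0.04, 0.12, 0.02, 0.13, 0.01],
        [0.1, 0.07, 0.65, 0.07, 0.2, 0.04],
        [0.04, 0.75, 0.15, 0.25, 0.45, 0.25],
        [0.01, 0.14, 0.08, 0.66, 0.22, 0.7]]
# IR_0 row of the IR table, columns ordered by (CR, IL) with IL varying fastest.
P_IR0 = [0.01, 0.05, 0.12, 0.02, 0.05, 0.15, 0.3, 0.4, 0.55, 0.57, 0.83, 0.94]


def il_given_ol_and_ir0(ol):
    """Return P(IL | OL=ol, IR=0) as a list over IL_0..IL_2."""
    # P(CR | OL=ol): sum PH out of the CR table.
    p_cr = [sum(P_PH[ph] * P_CR[cr][ol * 2 + ph] for ph in range(2))
            for cr in range(4)]
    # Unnormalised P(IL, IR=0 | OL=ol): sum CR out.
    scores = [P_IL[il] * sum(p_cr[cr] * P_IR0[cr * 3 + il] for cr in range(4))
              for il in range(3)]
    total = sum(scores)
    return [s / total for s in scores]


for ol in range(3):
    print('OL =', ol, [round(p, 4) for p in il_given_ol_and_ir0(ol)])
```

If IL were independent of OL given IR, the three printed rows would be identical. They differ, which is consistent with IL_OL = False; the row for OL=1 should agree with the first table printed above.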


In [100]:
ref_tmp_var = False

try:
    if IL_OL is False: 
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var

