# IN-STK5000/9000 - Adaptive methods for data-based decision making
## Credit Project
#### Syed Moeen Ali Naqvi - Geir Severin Rakh Elvatun Langberg - Markus Sverdvik Heiervang  
***

### Part 1: Banker agent
In this notebook, we present and comment on the development of our banker model and measure it against the random banker.

### Task A - Implementing the fit function

(todo)

### Task B - Implementing the predict

(todo)

### Task C - Get best action

Our action space $\mathcal{A}$ is binary: $\mathcal{A} = \{0, 1\} = \{ \text{refuse_loan}, \text{grant_loan} \}$


Assuming that we are maximising utility, a general function would be

$$
\text{best_action}(x) = \underset{a \in \mathcal{A}}{\text{argmax}} \  \mathbf{E} [\mathit{U}(x, a)]
$$

but since our action space is binary and refusing a loan is taken to yield zero expected utility, $\mathbf{E}[\mathit{U}(x, 0)] = 0$, it can be expressed as

$$  
\text{best_action}(x) = \begin{cases}
    1,& \text{if } \mathbf{E}[\mathit{U}(x, 1)] > 0\\
    0,              & \text{otherwise}
\end{cases}
$$



where $\mathbf{E}[\mathit{U}(x, a)]$ is the expected utility of taking action $a$ for the feature vector $x$.

We can translate this into Python code as follows:

```Python
def get_best_action(self, x: pd.Series) -> int:
    # Grant the loan (1) only if its expected utility is positive, otherwise refuse (0)
    return int(self.expected_utility(x, 1) > 0)
```
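
To make the link between the general argmax and the binary shortcut concrete, here is a minimal standalone sketch. It rests on an assumed utility model: granting a loan of amount $m$ for $n$ periods at rate $r$ yields $m((1+r)^n - 1)$ if repaid and $-m$ otherwise, and refusing yields 0. This assumption is consistent with the reference value used in the unit test for `expected_utility` in Task D, but the function names below are hypothetical and this is not the `NameBanker` implementation itself.

```python
ACTIONS = (0, 1)  # 0 = refuse_loan, 1 = grant_loan


def expected_utility(p_return: float, amount: float, duration: float,
                     rate: float, action: int) -> float:
    """Expected utility under the ASSUMED model (hypothetical helper)."""
    if action == 0:
        return 0.0  # assumption: refusing a loan neither gains nor loses money
    gain = amount * ((1 + rate) ** duration - 1)  # interest earned if repaid
    loss = -amount                                # principal lost on default
    return p_return * gain + (1 - p_return) * loss


def best_action_argmax(p_return, amount, duration, rate) -> int:
    # General form: pick the action with the highest expected utility.
    # max() returns the first maximum, so ties resolve to 0 (refuse).
    return max(ACTIONS,
               key=lambda a: expected_utility(p_return, amount, duration, rate, a))


def best_action_threshold(p_return, amount, duration, rate) -> int:
    # Binary shortcut: grant only if granting has positive expected utility.
    return int(expected_utility(p_return, amount, duration, rate, 1) > 0)


# Same inputs as the unit test: p = 0.7, r = 0.05, amount 100, duration 10
print(expected_utility(0.7, 100, 10, 0.05, 1))
```

Under this assumed model, both functions agree for every input, and granting is preferred exactly when the repayment probability exceeds $(1+r)^{-n}$, which is about 0.61 for $r = 0.05$ and $n = 10$.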

### Task D - Documenting the banker  

For this part, we interact with the NameBanker in the cells below.
Before measuring its performance, we run a series of unit tests to check that each method works for a few cases.

In [1]:
from name_banker import NameBanker
import pandas as pd
import unittest
from unittest.mock import MagicMock
from sklearn.utils.validation import check_is_fitted
import numpy as np

# seed for reproducibility
np.random.seed(42)

In [2]:
class TestNameBanker(unittest.TestCase):
    """
    Test case for NameBanker containing unit tests for all its methods
    
    fields:
        r: default sample interest rate
        never_returns: instance of NameBanker that always assumes loan will not be returned
        always_returns: instance of NameBanker that always assumes loan will be returned
        
    """
    r = 0.05
    
    # Agent setup
    never_returns, always_returns = NameBanker(), NameBanker()
    for i, a in (0, never_returns), (1, always_returns):
        a.set_interest_rate(r)
        # Stub the classifier so the return probability is fixed at 0 or 1
        a.predict_proba = MagicMock(return_value=i)
            
    
    def test_set_interest_rate(self):
        banker = NameBanker()
        assert not hasattr(banker, "rate")
        banker.set_interest_rate(self.r)
        assert hasattr(banker, "rate")
        assert banker.rate == self.r
        
    
    def test_predict_proba(self):
        banker = NameBanker()
        banker.fit([[1, 0], 
                    [0, 1]], [1, 2])
        p1 = banker.predict_proba(pd.Series([1, 0]))
        assert p1 <= 0.5
        p2 = banker.predict_proba(pd.Series([0, 1]))
        assert p2 >= 0.5
    
    def test_fit(self):
        decision_maker = NameBanker()
        decision_maker.fit([[0, 0], [0, 0]], [0, 0])
        assert "classifier" in decision_maker.__dict__, \
            "NameBanker should have attribute 'classifier' after calling fit"
        
        check_is_fitted(decision_maker.classifier, "estimators_")
        
    def test_expected_utility(self):
        x = pd.Series({"duration": 10, "amount": 100})
        assert self.never_returns.expected_utility(x, 1) < 0, \
            "Utility must be negative if person does not return loan"
        assert self.always_returns.expected_utility(x, 1) > 0, \
            "Utility must be positive if person returns loan"
        
        proba = 0.7
        decision_maker = NameBanker()
        decision_maker.set_interest_rate(self.r)
        decision_maker.predict_proba = MagicMock(return_value=proba)
        
        estimate = decision_maker.expected_utility(x, 1)
        ground_truth = 14.022623874420933
        assert np.isclose(estimate, ground_truth), \
            f"Estimate should be close to {ground_truth}, was {estimate}"
    
    
    def test_get_best_action(self):
        for d in range(1, 1000, 10):
            for amt in range(1, 50000, 1500):
                x = pd.Series({"duration": d, "amount": amt})    
                assert not self.never_returns.get_best_action(x), \
                    "When probability of return is 0, best action should always be 0"
                assert self.always_returns.get_best_action(x), \
                    "When probability of return is 1, best action should always be 1"
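
The agent setup above replaces `predict_proba` with a `MagicMock`, so the decision logic can be tested without a fitted classifier. A minimal sketch of that stubbing technique follows; `ToyBanker` and its 0.5 decision rule are hypothetical and exist only for this illustration.

```python
from unittest.mock import MagicMock


class ToyBanker:
    """Hypothetical stand-in for NameBanker, used only for this illustration."""

    def predict_proba(self, x):
        raise NotImplementedError("would normally query a fitted classifier")

    def get_best_action(self, x) -> int:
        # Toy rule (an assumption): grant if repayment is more likely than not
        return int(self.predict_proba(x) > 0.5)


banker = ToyBanker()
# Replace the method on this instance with a stub that always returns 1
banker.predict_proba = MagicMock(return_value=1)

action = banker.get_best_action({"duration": 10, "amount": 100})
assert action == 1
# The mock records its calls, so we can check the method was really used
banker.predict_proba.assert_called_once()
```

Because the stub is set on the instance, other instances of the class keep their original method.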

In [3]:
# We'll need these arguments when running the tests in jupyter notebook
unittest.main(argv=["first-arg-is-ignored"], exit=False)

.....
----------------------------------------------------------------------
Ran 5 tests in 2.115s

OK


<unittest.main.TestProgram at 0x7f08c7edad50>

We rewrote the TestLending script into a neat command-line interface so that we can customize the program's parameters.   
It also displays the progress of the training, since the classifier can take some time to fit.  

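In the run below (100 tests, seed 42), our banker outperforms the RandomBanker, with an average utility of about 9.08 million against 5.29 million and an average return on investment of about 221.8 against 133.1.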

In [4]:
# Let's inspect TestLending
!python3 TestLendingV2.py ../../data/credit/D_train.csv --n-tests 100 --seed 42

r=0.017, n_tests=100, seed=42

Testing on class: RandomBanker ...
100%|█████████████████████████████████████████| 100/100 [00:06<00:00, 15.42it/s]
Results:
	Average utility: 5286300.638762048
	Average return on investment: 133.09537160909298

Testing on class: NameBanker ...
100%|█████████████████████████████████████████| 100/100 [03:13<00:00,  1.83s/it]
Results:
	Average utility: 9083920.983178237
	Average return on investment: 221.825440959028
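
The command-line interface invoked above could be sketched with `argparse` roughly as follows. Only the positional data path and the `--n-tests` and `--seed` options appear in the invocation; the defaults, help texts, and the `build_parser` name are assumptions, and this is not the actual content of `TestLendingV2.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options seen in the invocation above."""
    parser = argparse.ArgumentParser(
        description="Compare banker agents on a credit data set.")
    parser.add_argument("data_path", help="path to the training CSV file")
    # Defaults below are assumptions, not taken from the real script
    parser.add_argument("--n-tests", type=int, default=100,
                        help="number of test repetitions")
    parser.add_argument("--seed", type=int, default=None,
                        help="random seed for reproducibility")
    return parser


# Parse an explicit argument list so the sketch also runs inside a notebook
args = build_parser().parse_args(
    ["../../data/credit/D_train.csv", "--n-tests", "100", "--seed", "42"])
print(f"n_tests={args.n_tests}, seed={args.seed}")
```

`argparse` converts `--n-tests` into the attribute `n_tests`, which is why the parsed value is read as `args.n_tests`.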
