## New simulation setup

We assume that neither lender knows the fraction of H-type borrowers in the population or how the signals are generated. Each lender simply runs a learning algorithm driven by the utility it obtains from its actions, and learns the best strategy from that feedback.

The market consists of a fraction $\theta$ of H-type borrowers and a fraction $1-\theta$ of L-type borrowers. There are two lenders, 1 and 2, and each of them receives a signal from each borrower. We let $p_{s_1, s_2}^{t}$ denote the probability that a borrower of true type $t\in \{L, H\}$ gives a signal of $s_1$ to lender 1 and $s_2$ to lender 2, where $s_1,s_2\in \{l, h\}$.

In [1]:
import numpy as np

### Constants

Market size $N$, fraction of H-type borrowers $\theta$, and the signal-generation probabilities. In each matrix, rows index lender 1's signal and columns index lender 2's signal.

$$
P^H = \begin{pmatrix}
p_{hh}^H & p_{hl}^H\\
p_{lh}^H & p_{ll}^H
\end{pmatrix}
$$

$$
P^L = \begin{pmatrix}
p_{hh}^L & p_{hl}^L\\
p_{lh}^L & p_{ll}^L
\end{pmatrix}
$$

In [58]:
N = 100000
Theta = 0.9
Borrower_types = ['H', 'L']
Signals = ['hh', 'hl', 'lh', 'll']
PH = np.array([[0.90, 0.03], 
               [0.02, 0.05]])
PL = np.array([[0.10, 0.06],
               [0.04, 0.80]])
# each joint signal distribution must sum to 1 (tolerant of floating-point rounding)
assert np.isclose(PH.sum(), 1) and np.isclose(PL.sum(), 1)

### Generating borrowers with true types

In [59]:
Borrowers = np.random.choice(Borrower_types, size=N, p=[Theta, 1-Theta])
np.unique(Borrowers, return_counts=True)

(array(['H', 'L'], dtype='<U1'), array([89966, 10034]))

### Generating signals

In [60]:
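def signal(true_type, ph, pl):
    """Draw one joint signal ('hh', 'hl', 'lh' or 'll') given the borrower's true type.

    ph and pl are the flattened (length-4) signal distributions for H and L types.
    """
    if true_type == 'H':
        return np.random.choice(Signals, p=ph)
    elif true_type == 'L':
        return np.random.choice(Signals, p=pl)
    else:
        raise ValueError(f'unknown borrower type: {true_type!r}')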

In [61]:
ph_1d, pl_1d = PH.reshape(-1), PL.reshape(-1)
signals = [signal(b, ph_1d, pl_1d) for b in Borrowers]

In [62]:
lender1_signals = [s[0] for s in signals]
lender2_signals = [s[1] for s in signals]

In [63]:
np.unique(lender1_signals, return_counts=True), np.unique(lender2_signals, return_counts=True)

((array(['h', 'l'], dtype='<U1'), array([85261, 14739])),
 (array(['h', 'l'], dtype='<U1'), array([84203, 15797])))

### Wrapper

In [64]:
def gen_signals_for_one_round():
    borrowers = np.random.choice(Borrower_types, size=N, p=[Theta, 1-Theta])
    signals = [signal(b, ph_1d, pl_1d) for b in borrowers]
    lender1_signals = [s[0] for s in signals]
    lender2_signals = [s[1] for s in signals]
    return borrowers, lender1_signals, lender2_signals

In [65]:
br, l1, l2 = gen_signals_for_one_round()

In [66]:
np.unique(br, return_counts=True), np.unique(l1, return_counts=True), np.unique(l2, return_counts=True)

((array(['H', 'L'], dtype='<U1'), array([90102,  9898])),
 (array(['h', 'l'], dtype='<U1'), array([85411, 14589])),
 (array(['h', 'l'], dtype='<U1'), array([84409, 15591])))
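
The wrapper above calls `np.random.choice` once per borrower, which is slow for large $N$. As a rough sketch of an alternative, the draws can be made for all borrowers at once. The function name `gen_signals_vectorized`, its parameters, and the fixed seed are assumptions for illustration; it returns NumPy arrays rather than lists, and it reuses the same row-major ordering `hh, hl, lh, ll` as `Signals`.

```python
import numpy as np

def gen_signals_vectorized(n_borrowers, theta, ph, pl, rng):
    """Draw true types and both lenders' signals for all borrowers at once."""
    is_high = rng.random(n_borrowers) < theta
    # Draw a joint-signal index under each type's distribution, then select by type.
    idx_h = rng.choice(4, size=n_borrowers, p=ph.reshape(-1))
    idx_l = rng.choice(4, size=n_borrowers, p=pl.reshape(-1))
    idx = np.where(is_high, idx_h, idx_l)
    borrowers = np.where(is_high, 'H', 'L')
    # Index order is hh, hl, lh, ll:
    # lender 1 sees 'h' for indices 0 and 1, lender 2 sees 'h' for indices 0 and 2.
    lender1 = np.where(idx < 2, 'h', 'l')
    lender2 = np.where(idx % 2 == 0, 'h', 'l')
    return borrowers, lender1, lender2

rng = np.random.default_rng(0)  # seed chosen only to make the sketch reproducible
PH = np.array([[0.90, 0.03], [0.02, 0.05]])
PL = np.array([[0.10, 0.06], [0.04, 0.80]])
br, l1, l2 = gen_signals_vectorized(100_000, 0.9, PH, PL, rng)
print((br == 'H').sum(), (l1 == 'h').sum(), (l2 == 'h').sum())
```

Sampling under both type distributions and discarding one draw per borrower wastes some random numbers, but it keeps the code short and avoids Python-level loops.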

In [67]:
print('Lender 1 expected number of h signals', N*Theta*PH[0].sum() + N*(1-Theta)*PL[0].sum())
print('Lender 2 expected number of h signals', N*Theta*PH[:,0].sum() + N*(1-Theta)*PL[:,0].sum())

Lender 1 expected number of h signals 85300.0
Lender 2 expected number of h signals 84200.0
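
The same marginals give a benchmark for how informative each lender's signal is. The sketch below applies Bayes' rule to compute the probability that a borrower is H-type given a lender's own signal. This is an analyst's calculation only: in the setup above the lenders know neither $\theta$ nor the signal probabilities. The helper name `posterior_high` is an assumption for illustration.

```python
import numpy as np

theta = 0.9
PH = np.array([[0.90, 0.03], [0.02, 0.05]])
PL = np.array([[0.10, 0.06], [0.04, 0.80]])

def posterior_high(theta, like_h, like_l):
    """P(type = H | signal) from the per-type signal likelihoods (Bayes' rule)."""
    joint_h = theta * like_h
    return joint_h / (joint_h + (1 - theta) * like_l)

# Lender 1's signal indexes rows; lender 2's signal indexes columns.
# Each result is ordered as [signal h, signal l].
post1 = posterior_high(theta, PH.sum(axis=1), PL.sum(axis=1))
post2 = posterior_high(theta, PH.sum(axis=0), PL.sum(axis=0))
print('Lender 1 P(H | h), P(H | l):', post1)
print('Lender 2 P(H | h), P(H | l):', post2)
```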


In [68]:
from tqdm import tqdm

In [69]:
# option 1: both players are no-regret learners
# algorithm: hedge + propensity scoring

T = 100

# lender chooses r, lends to everyone of type H
# chooses r
# .. more complicated...


# let r_bar be the highest rate any H borrower would accept
r_bar = 0.5
n = 10
rs = np.linspace(0, r_bar, n)  # grid of n candidate interest rates in [0, r_bar]
rng = np.random.default_rng()
# for now: sample r_h, r_l independently
U1 = {
    'h': np.zeros(n), 
    'l': np.zeros(n)
}
U2 = {
    'h': np.zeros(n), 
    'l': np.zeros(n)
}

U1s = []
U2s = []

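eps = 1e-3  # Hedge learning rate
def weights(U):
    # multiplicative weights: higher cumulative utility gives a larger weight
    return (1-eps)**(-U)

def dist(w):
    # normalize weights into a probability distribution
    return w / np.sum(w)

def update_Us(U, u_t):
    # add this round's utilities; return a new dict so stored snapshots of U stay untouched
    return {sig: U[sig] + u_t[sig] for sig in U}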

for t in tqdm(range(T)):
    U1s.append(U1)
    U2s.append(U2)
    
    br, l1, l2 = gen_signals_for_one_round()

    u1_t = {
        'h': np.zeros(n),
        'l': np.zeros(n)
    }
    count1_t = {
        'h': np.zeros(n),
        'l': np.zeros(n)
    }
    u2_t = {
        'h': np.zeros(n),
        'l': np.zeros(n)
    }
    count2_t = {
        'h': np.zeros(n),
        'l': np.zeros(n)
    }
    for i, (b, l1_b, l2_b) in enumerate(zip(br, l1, l2)):
        j1 = rng.choice(n, p=dist(weights(U1[l1_b])))
        j2 = rng.choice(n, p=dist(weights(U2[l2_b])))

        r1 = rs[j1]
        r2 = rs[j2]
        
        u1, u2 = 0, 0

        if r1 <= r2:
            u1 = r1 if b == 'H' else -1
        if r1 >= r2:
            u2 = r2 if b == 'H' else -1
        
        if r1 == r2:
            if r1 == r_bar:
                u1, u2 = 0,0
            else:
                u1, u2 = u1 / 2, u2 / 2

        # update u1_t
        u1_t[l1_b][j1] = u1_t[l1_b][j1] + u1
        count1_t[l1_b][j1] = count1_t[l1_b][j1] + 1

        # update u2_t
        u2_t[l2_b][j2] = u2_t[l2_b][j2] + u2
        count2_t[l2_b][j2] = count2_t[l2_b][j2] + 1

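    # average utility per (signal, rate) pair; pairs never played this round keep utility 0
    u1t_avg = {}
    for sig in count1_t:
        u1t_avg[sig] = np.divide(u1_t[sig], count1_t[sig],
                                 out=np.zeros(n), where=count1_t[sig] > 0)

    u2t_avg = {}
    for sig in count2_t:
        u2t_avg[sig] = np.divide(u2_t[sig], count2_t[sig],
                                 out=np.zeros(n), where=count2_t[sig] > 0)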

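    # update cumulative utilities using Hedge
    # (a shallow copy is passed so the snapshots stored in U1s/U2s are not modified)
    U1 = update_Us(dict(U1), u1t_avg)
    U2 = update_Us(dict(U2), u2t_avg)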

100%|██████████| 100/100 [04:07<00:00,  2.48s/it]
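
To make the per-borrower payoff rule in the loop above easier to inspect, here is a sketch that isolates it as a pure function. The name `round_payoffs` is an assumption for illustration. The logic mirrors the loop: the lender with the lower rate serves the borrower, earning the rate from an H-type and losing 1 on an L-type; ties split the payoff, except that both lenders get 0 when they tie at `r_bar`.

```python
def round_payoffs(r1, r2, borrower_type, r_bar=0.5):
    """Per-borrower payoffs (u1, u2) for offered rates r1, r2 and the true type."""
    def repay(r):
        # an H-type repays at the offered rate; an L-type costs the lender 1
        return r if borrower_type == 'H' else -1

    u1 = repay(r1) if r1 <= r2 else 0
    u2 = repay(r2) if r1 >= r2 else 0
    if r1 == r2:
        if r1 == r_bar:
            u1, u2 = 0, 0            # tie at the cap: neither lender earns anything
        else:
            u1, u2 = u1 / 2, u2 / 2  # otherwise the tie is split evenly
    return u1, u2

print(round_payoffs(0.1, 0.2, 'H'))
print(round_payoffs(0.2, 0.1, 'L'))
print(round_payoffs(0.1, 0.1, 'H'))
print(round_payoffs(0.5, 0.5, 'H'))
```

The exact equality tests are safe in the simulation because both rates are drawn from the same grid `rs`, whose last entry equals `r_bar`.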


In [70]:
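# snapshots are appended at the start of each round, so U1s[-1] and U2s[-1]
# hold the cumulative utilities before the final round's update
U1_final = U1s[-1]
U2_final = U2s[-1]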

In [71]:
print('high dist', dist(weights(U1['h'])))
print('low dist', dist(weights(U1['l'])))

high dist [0.09916585 0.09964245 0.10001871 0.1002824  0.10043737 0.10048398
 0.10041764 0.10024687 0.09996296 0.09934176]
low dist [0.09709674 0.0978501  0.09855931 0.09923853 0.0998419  0.10042803
 0.10100369 0.10150108 0.1019506  0.10253002]


In [72]:
print('high dist', dist(weights(U2['h'])))
print('low dist', dist(weights(U2['l'])))

high dist [0.09917452 0.09965073 0.10002217 0.10028566 0.10043938 0.1004852
 0.10041787 0.10024155 0.09995391 0.09932902]
low dist [0.09721012 0.09794715 0.09864223 0.09925646 0.0998731  0.10046235
 0.10095932 0.10144036 0.10183959 0.1023693 ]
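
All four distributions above are close to uniform. One possible contributor is the learning rate. Per-round average utilities lie in $[-1, 0.5]$, so after $T = 100$ rounds two rates can differ in cumulative utility by at most 150, and with `eps = 1e-3` the ratio between any two weights is at most about $0.999^{-150} \approx 1.16$. The sketch below illustrates how strongly the Hedge distribution depends on `eps`. The two-arm utilities are hypothetical, and `hedge_dist` is an illustrative name, not part of the notebook.

```python
import numpy as np

def hedge_dist(U, eps):
    """Hedge distribution with weights (1 - eps) ** (-U), normalized to sum to 1."""
    # Subtracting max(U) leaves the normalized distribution unchanged
    # and keeps the weights from overflowing for large U.
    w = (1 - eps) ** (-(U - U.max()))
    return w / w.sum()

T = 100
# Hypothetical cumulative utilities: arm 0 earns 0 per round, arm 1 earns 1 per round.
U = np.array([0.0, 1.0]) * T
for eps in (1e-3, 1e-2, 1e-1):
    print(eps, hedge_dist(U, eps))
```

Under these assumptions the better arm gets only slightly more than half the mass at `eps = 1e-3`, and nearly all of it at `eps = 1e-1`, so a larger learning rate or more rounds may be needed before the learned strategies separate visibly.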
