# Rescaling predicted probabilities at one level to conform to the predicted probabilities at another  

This notebook briefly checks the implications of the methods laid out in the EBMA_levels paper in the ViEWS papers sharelatex project.

I find that 

* Rescaling country probs using grid level probs is not useful (the original country level probs are completely ignored)
* Rescaling grid level probs using country level probabilities leads to negative probabilities. 

## Comments

### Country
Under the current methodology, the rescaled country level predicted probability is completely determined by the grid level probabilities. Nothing is actually rescaled: the adjusted q_c is q_c / k, and since k = q_c / prod(q_pg), q_c cancels and only the product of the grid level q's remains.
This can be seen empirically in the output: p_c_rescaled is always the same for the same set of grid level probabilities. The country level probability going in (p_c) has no effect.
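
The cancellation can be sketched in a few self-contained lines. This is only an illustration, not the paper's code: the grid values are the ones used further down in this notebook, the three p_c values are arbitrary, and the name `rescaled` is introduced just for this example.

```python
import numpy as np

# Grid level probabilities as used later in this notebook.
p_pg = np.array([0.1, 0.3, 0.9])
prod_q_pg = np.prod(1 - p_pg)

rescaled = []  # hypothetical helper list, only for this illustration
for p_c in [0.2, 0.5, 0.8]:  # arbitrary country level inputs
    q_c = 1 - p_c
    k = q_c / prod_q_pg       # corrected k
    q_c_rescaled = q_c / k    # q_c cancels, leaving prod_q_pg
    rescaled.append(1 - q_c_rescaled)

print(rescaled)
```

Every entry is the same value, 1 - prod(q_pg), whatever p_c was.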

### Grid level
The rescaled grid level probabilities do satisfy all the conditions laid out in the text, but they are sometimes nonsensical, with values below zero. 
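
Why the values go negative can be read off the scaling step: each grid level q is multiplied by a common factor r, so the rescaled p is negative whenever r * q exceeds 1. A minimal sketch, assuming the same formula for r as in the `rescale` function below; the helper name `grid_scale_factor` is hypothetical and the inputs are illustrative.

```python
import numpy as np

def grid_scale_factor(p_c, p_pg):
    """Common factor r applied to every grid level q (hypothetical helper)."""
    q_pg = 1 - np.asarray(p_pg, dtype=float)
    q_c = 1 - p_c
    return np.exp((np.log(q_c) - np.sum(np.log(q_pg))) / len(q_pg))

p_c = 0.5
p_pg = np.array([0.1, 0.3, 0.9])
q_pg = 1 - p_pg

r = grid_scale_factor(p_c, p_pg)
goes_negative = r * q_pg > 1   # equivalent to p_pg < 1 - 1/r

print("r:", r)
print("cells with negative rescaled p:", goes_negative)
print("rescaled p_pg:", 1 - r * q_pg)
```

Whenever r is above 1, every cell whose original probability lies below 1 - 1/r ends up with a negative rescaled probability.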

In [1]:
import numpy as np 
import matplotlib.pyplot as plt

In [2]:
def rescale(p_c, p_pg, returns='p'):
    """Rescale country and grid probabilities to be consistent.

    This function applies rescaling to both p_c and p_pg to make them
    consistent with each other. The problem is that the rescaled p_pg receives
    negative values when p_c is small and some values of p_pg are small.
    These small values of p_pg, when rescaled, yield negative values.

    Args:
        p_c: country probability
        p_pg: list of grid level probabilities
        returns: 'p' for probabilities of 1 or 'q' for probabilities of 0

    Returns:
        p_c, p_pg: probabilities adjusted for each other
            (or the corresponding q values if returns is 'q')
    """
    
    assert returns in ['p', 'q'], "Supply 'p' or 'q' to returns"
    
    p_pg = np.array(p_pg)
    
    q_c = 1 - p_c 
    q_pg = 1 - p_pg
    N = len(p_pg)
    
    prod_q_pg = np.prod(q_pg)
    
    # This was given as 
    # k = prod_q_pg / q_c
    # in the paper, corrected here
    k =  q_c / prod_q_pg

    k_adjusted_q_c = q_c / k
    k_adjusted_p_c = 1 - k_adjusted_q_c
    
    r = np.exp( (np.log(q_c) - np.sum( np.log(q_pg) )) / N )
    r_adjusted_q_pg = r * q_pg
    r_adjusted_p_pg = 1 - r_adjusted_q_pg 
    
    assert np.isclose(np.prod(r_adjusted_q_pg),q_c), "product of r-adjusted grid level probs not close to country prob"

    if returns == 'p':
        return k_adjusted_p_c, r_adjusted_p_pg
    elif returns == 'q':
        return k_adjusted_q_c, r_adjusted_q_pg
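
The log/exp expression for r in the function above is algebraically the N-th root of q_c / prod(q_pg). The short check below is a sketch with illustrative inputs that compares the two forms and confirms that the scaled grid level q's multiply back to q_c; the names `r_log` and `r_root` are introduced only for this example.

```python
import numpy as np

p_c = 0.5
q_c = 1 - p_c
q_pg = 1 - np.array([0.1, 0.3, 0.9])
N = len(q_pg)

# Form used in rescale(): average in log space, then exponentiate.
r_log = np.exp((np.log(q_c) - np.sum(np.log(q_pg))) / N)

# Equivalent closed form: N-th root of the ratio.
r_root = (q_c / np.prod(q_pg)) ** (1 / N)

print(np.isclose(r_log, r_root))
print(np.isclose(np.prod(r_root * q_pg), q_c))
```

Note that both forms break down at the boundaries: p_c = 1 gives log(0), and any grid level probability equal to 1 makes the product zero.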

In [3]:
# country probabilities from 0.0 to 0.9 in steps of 0.1 (np.arange excludes 1.0)
probs_country = np.arange(0, 1, 0.1)

# some extreme grid probs, two low and one high
probs_grids = [0.1, 0.3, 0.9]
# applying rescaling yields negative probs in grids. 
# p_c is the country level probability going in
for p_c in probs_country:
    p_c_rescaled, p_pg_rescaled = rescale(p_c, probs_grids, 'p')
    print("p_c:", p_c)    
    print("p_c_rescaled:", p_c_rescaled)
    print("p_pg_recaled:", p_pg_rescaled)
    

p_c: 0.0
p_c_rescaled: 0.937
p_pg_recaled: [-1.26184232 -0.7592107   0.74868419]
p_c: 0.1
p_c_rescaled: 0.937
p_pg_recaled: [-1.18378475 -0.69849925  0.75735725]
p_c: 0.2
p_c_rescaled: 0.937
p_pg_recaled: [-1.09970841 -0.63310654  0.76669907]
p_c: 0.3
p_c_rescaled: 0.937
p_pg_recaled: [-1.00829885 -0.56201022  0.77685568]
p_c: 0.4
p_c_rescaled: 0.937
p_pg_recaled: [-0.9077117  -0.48377577  0.78803203]
p_c: 0.5
p_c_rescaled: 0.937
p_pg_recaled: [-0.79522544 -0.39628645  0.80053051]
p_c: 0.6
p_c_rescaled: 0.937
p_pg_recaled: [-0.66653967 -0.29619752  0.81482893]
p_c: 0.7
p_c_rescaled: 0.937
p_pg_recaled: [-0.51415178 -0.17767361  0.83176091]
p_c: 0.8
p_c_rescaled: 0.937
p_pg_recaled: [-0.32273341 -0.02879266  0.85302962]
p_c: 0.9
p_c_rescaled: 0.937
p_pg_recaled: [-0.04985421  0.18344673  0.88334953]


## Verification of p_pg_rescaled back to original p_c
As a check, we set the raw input grid level probs to the rescaled grid level probs obtained for country prob = 0.5, which are

[-0.79522544, -0.39628645,  0.80053051]

Since p_c_rescaled depends only on the grid level probs, it should come out as 0.5 if the rescaled grid level probs are consistent with the p_c they were rescaled to. It does.

In [4]:
p_c = 0.5
probs_grids = [-0.79522544, -0.39628645,  0.80053051]
# these grid probs are already consistent with p_c = 0.5,
# so rescaling them again should leave them (almost) unchanged
p_c_rescaled, p_pg_rescaled = rescale(p_c, probs_grids, 'p')
print("p_c:", p_c)    
print("p_c_rescaled:", p_c_rescaled)
print("p_pg_recaled:", p_pg_rescaled)
if np.isclose(p_c, p_c_rescaled):
    print("p_c and p_c_rescaled are really close!")
else:
    print("p_c and p_c_rescaled are not close")
    

p_c: 0.5
p_c_rescaled: 0.500000011024
p_pg_recaled: [-0.79522545 -0.39628646  0.80053051]
p_c and p_c_rescaled are really close!


## Algebra error in paper

The first draft of the paper contained an algebra error, which is corrected here. The condition to satisfy is

$\hat{q}^c_j = k_j \prod^{N_j}_{i=1} \hat{q}^{pg}_{ij}$

The $k_j$ that satisfies it is

$k_j = \frac{\hat{q}_j^c}{\prod^{N_j}_{i=1} \hat{q}^{pg}_{ij}}$
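
A quick numerical check of the two candidate expressions for k, as a sketch with illustrative values (q_c = 0.5 and the grid level q's used earlier in this notebook). The names `k_draft` and `k_corrected` are introduced only for this example.

```python
import numpy as np

q_c = 0.5
q_pg = np.array([0.9, 0.7, 0.1])
prod_q_pg = np.prod(q_pg)

k_draft = prod_q_pg / q_c       # expression from the first draft
k_corrected = q_c / prod_q_pg   # corrected expression

# The condition is q_c == k * prod(q_pg).
print("draft satisfies condition:    ", np.isclose(k_draft * prod_q_pg, q_c))
print("corrected satisfies condition:", np.isclose(k_corrected * prod_q_pg, q_c))
```

Only the corrected expression reproduces q_c when multiplied by the product of the grid level q's.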
