# My World Bank jottings

In [7]:
import pandas as pd
import numpy as np
import math

## Headcount index
The headcount index measures the proportion of the population that is counted as poor:

$P_{0} = N_{p} / N$

where $N_{p}$ is the number of poor people and $N$ is the total population.

Take expenditure for each individual, assume a poverty line, then count the number of people who fall below it and divide by the total number of people in the sample.

(see page 68)

In [8]:
def get_headcount_index(pl, data):
    # pl is the poverty line
    # data is a dataframe containing survey data, one row per sampled unit
    total_sample = data.shape[0]  # number of rows
    # rows strictly below the poverty line count as poor
    the_poor = data[data['total_expenditure'] < pl].shape[0]
    #print("the poor are ", the_poor)
    # TODO - could add survey weights so the sample scales up to the total population
    # TODO - the data is per household, but is the headcount index about individuals?
    poverty_index = the_poor / total_sample
    return poverty_index

In [9]:
url = "https://virtual-worlds.scot/ou/uk-lcf-subset-2005-6.csv"
storage_options = {'User-Agent': 'Mozilla/9.0'}  # sent as an HTTP header with the request

lcf = pd.read_csv(url, storage_options=storage_options)
#print(lcf.info())
lcf = lcf.iloc[:10]  # take a slice of 10 rows for easier debugging
lcf_sorted = lcf.sort_values(by='total_expenditure').reset_index(drop=True)
#print(lcf_sorted)
# small hand-made sample for checking the formulas by hand
lcf_test = pd.DataFrame({'total_expenditure': [110, 120, 150, 160]})

In [10]:

pi = get_headcount_index(125,lcf_sorted)
print(pi)

0.1
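A possible direction for the weighting TODO, as a sketch only: if each survey row carried a grossing weight (how many population units it stands for), the headcount could be computed as the weighted share of poor rows. The column name `weight`, the function name `get_weighted_headcount_index` and the sample values are all hypothetical; they are not taken from the LCF file.

```python
import pandas as pd

def get_weighted_headcount_index(pl, data, weight_col="weight"):
    # weight_col is a hypothetical column holding a survey weight per row
    weights = data[weight_col]
    poor = data["total_expenditure"] < pl  # True for rows below the poverty line
    # weighted number of poor divided by weighted total
    return float(weights[poor].sum() / weights.sum())

# made-up sample: the first row stands for twice as many units as the others
sample = pd.DataFrame({
    "total_expenditure": [110, 120, 150, 160],
    "weight": [2.0, 1.0, 1.0, 1.0],
})
weighted_p0 = get_weighted_headcount_index(125, sample)
print(weighted_p0)
```

With equal weights this reduces to the unweighted count of poor rows over total rows.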


## Poverty Gap Index

Adds up the extent to which individuals on average fall below the poverty line, and expresses it as a percentage of the poverty line. More specifically, define the poverty gap ($G_i$) as the poverty line ($z$) less actual income ($y_i$) for poor individuals; the gap is considered to be zero for everyone else. Using the indicator function, we have

$G_i = (z - y_i) \times I(y_i < z)$

So basically: take the difference between the poverty line and each person's expenditure (the gap), divide it by the poverty line, and average over the whole population (the non-poor count as zero) to get a national index:

$P_{1} = \frac{1}{N} \sum_{i=1}^{N} \frac{G_i}{z}$

(See page 70)
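A worked example of the gap calculation in plain Python, using a made-up sample of four expenditures and a poverty line of 125 (both are assumptions for illustration):

```python
z = 125  # poverty line
expenditures = [90, 120, 150, 160]  # made-up sample

# G_i = (z - y_i) * I(y_i < z): the gap is zero for anyone at or above the line
gaps = [(z - y) if y < z else 0 for y in expenditures]

# normalise each gap by the poverty line and average over everyone, poor or not
poverty_gap_index = sum(g / z for g in gaps) / len(expenditures)

print(gaps)                         # [35, 5, 0, 0]
print(round(poverty_gap_index, 5))  # 0.08
```

The two poor people are 28% and 4% below the line; spread over all four people that averages to 8%.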

In [6]:
def get_poverty_gap_index(pl, data):
    # pl is the poverty line
    # data is a dataframe containing survey data
    total_sample = data.shape[0]  # number of rows
    # normalised gap (pl - expenditure) / pl, floored at zero for the non-poor;
    # note this adds a "poverty_gap" column to the caller's dataframe
    data["poverty_gap"] = ((pl - data["total_expenditure"]) / pl).clip(lower=0)
    poverty_gap_index = data["poverty_gap"].sum() / total_sample
    #print(data)
    return round(poverty_gap_index, 5)

In [7]:
pgi = get_poverty_gap_index(125, lcf_test)
print(pgi)

0.08


## Poverty Severity Index

This one squares each person's poverty gap (as a proportion of the poverty line) before adding up and averaging over the population, so the poverty of the poorest counts for more.

(see page 71)

In [8]:
def get_poverty_severity_index(pl, data):
    # pl is the poverty line
    # data is a dataframe containing survey data
    total_sample = data.shape[0]  # number of rows
    # square the normalised gap so that deeper poverty gets more weight;
    # note this adds a "poverty_gap_squared" column to the caller's dataframe
    gap = ((pl - data["total_expenditure"]) / pl).clip(lower=0)
    data["poverty_gap_squared"] = gap ** 2
    poverty_severity_index = data["poverty_gap_squared"].sum() / total_sample
    #print(data)
    return round(poverty_severity_index, 5)

In [9]:
psi=get_poverty_severity_index(125, lcf_test)
print(psi)

0.02


In [10]:
print(lcf_test)

   total_expenditure  poverty_gap  poverty_gap_squared
0                 90         0.28               0.0784
1                120         0.04               0.0016
2                150         0.00               0.0000
3                160         0.00               0.0000


(see page 71)
<quote>At the other extreme, one can consider the maximum cost of eliminating poverty, assuming that the policy maker knows nothing about who is poor and who is not.
From the form of the index, it can be seen that the ratio of the minimum cost of eliminating poverty with perfect targeting (that is, Gi) to the maximum cost with no targeting (that is, z, which would involve providing everyone with enough to ensure they are not below the poverty line) is simply the poverty gap index. Thus, this measure is an indicator of the potential savings to the poverty alleviation budget from targeting: the smaller the poverty gap index, the greater the potential economies for a poverty alleviation budget from identifying the characteristics of the poor—using survey or other information—so as to target benefits and programs.</quote>

So if the UK has a poverty line, and we can use the household survey data to calculate a gap $G_i$ for every household, could we then calculate the gross cost of eliminating poverty?
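A rough sketch of that calculation, under strong assumptions: the cost is in whatever units and time period `total_expenditure` uses, it covers only the rows in the sample (scaling up to the UK would need survey weights), and it ignores administration costs and any change in behaviour. The function name `get_cost_of_eliminating_poverty` and the sample values are made up for illustration.

```python
import pandas as pd

def get_cost_of_eliminating_poverty(pl, data):
    expenditure = data["total_expenditure"]
    # perfect targeting: pay each poor row exactly its gap, nothing to the rest
    gaps = (pl - expenditure).clip(lower=0)
    min_cost = float(gaps.sum())
    # no targeting: pay every row the full poverty line
    max_cost = float(pl * len(data))
    return min_cost, max_cost

sample = pd.DataFrame({"total_expenditure": [90, 120, 150, 160]})
min_cost, max_cost = get_cost_of_eliminating_poverty(125, sample)
ratio = min_cost / max_cost  # this ratio is the poverty gap index
print(min_cost, max_cost, ratio)
```

For this sample the minimum cost is 40 and the maximum is 500, so the ratio is 0.08, the same value as the poverty gap index for the same numbers.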



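## Poverty Severity Index (generic)

The same calculation with a general exponent α ≥ 0 on the normalised gap (the Foster-Greer-Thorbecke family of measures).

When parameter α = 0, P0 is simply the headcount index.
When α = 1, the index is the poverty gap index P1,
and when α is set equal to 2, P2 is the poverty severity index.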

(see page 72)
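Written out, with the sum taken over the $q$ people below the poverty line (my transcription, so the notation may differ slightly from the book):

$$P_{\alpha} = \frac{1}{N} \sum_{i=1}^{q} \left( \frac{z - y_i}{z} \right)^{\alpha}, \qquad \alpha \ge 0$$

Summing only over the poor matters when α = 0: each poor person then contributes exactly 1, so the sum is $q$ and the index is $q / N$, the headcount.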

In [11]:
def get_poverty_severity_index_generic(pl, data, alpha):
    # pl is the poverty line
    # data is a dataframe containing survey data
    # alpha sets how sensitive the index is to the depth of poverty; it must be >= 0
    if alpha < 0:
        raise ValueError("alpha must be >= 0")
    total_sample = data.shape[0]  # number of rows
    poor = data["total_expenditure"] < pl
    gap = ((pl - data["total_expenditure"]) / pl).clip(lower=0)
    # only the poor contribute; masking the non-poor avoids 0**0 == 1 when alpha is 0
    data["poverty_gap_alpha"] = (gap ** alpha).where(poor, 0.0)
    poverty_severity_index = data["poverty_gap_alpha"].sum() / total_sample
    #print(data)
    return round(poverty_severity_index, 5)

In [12]:
psig= get_poverty_severity_index_generic(125,lcf_test,1)
print(psig)

0.08
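A quick self-contained check that one function with a parameter α reproduces all three indices. The name `fgt_index` and the sample values are made up for illustration. The sum runs only over people below the line, because in Python `0 ** 0` is `1`, which would otherwise count every non-poor person when α = 0.

```python
def fgt_index(pl, expenditures, alpha):
    if alpha < 0:
        raise ValueError("alpha must be >= 0")
    # one term per poor person; the non-poor contribute nothing to the sum
    poor_terms = [((pl - y) / pl) ** alpha for y in expenditures if y < pl]
    # divide by the whole sample, not just the number of poor
    return sum(poor_terms) / len(expenditures)

sample = [90, 120, 150, 160]
p0 = fgt_index(125, sample, 0)  # headcount index
p1 = fgt_index(125, sample, 1)  # poverty gap index
p2 = fgt_index(125, sample, 2)  # poverty severity index
print(p0, round(p1, 5), round(p2, 5))
```

For this sample the three values come out as 0.5, 0.08 and 0.02.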


## Sen Index

The Sen index can also be written as the average of the headcount and poverty gap measures, weighted by the Gini coefficient of the poor, giving:

$P_{s} = P_{0}G^{p} + P_{1}(1 - G^{p})$

(see page 74)

In [17]:
def get_sen_index(pl, data):
    pov_headcount = get_headcount_index(pl, data)
    pov_gap = get_poverty_gap_index(pl, data)
    # placeholder: use a constant Gini for now, until I work out how to
    # calculate the Gini coefficient of the poor from the data
    gini = 0.7
    sen_index = pov_headcount * gini + pov_gap * (1 - gini)
    return sen_index

In [18]:
sen = get_sen_index(125,lcf_test)
print(sen)

0.374
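One way the Gini placeholder might be filled in, as a sketch: compute an unweighted Gini coefficient over the expenditures of the poor only, using the standard rank-based formula for sorted values. The function names `gini_coefficient` and `gini_of_the_poor` are hypothetical, and survey weights are ignored.

```python
import numpy as np

def gini_coefficient(values):
    # sort ascending so that rank 1 is the poorest
    y = np.sort(np.asarray(values, dtype=float))
    n = y.size
    if n == 0 or y.sum() == 0:
        return 0.0  # convention chosen here: no spread to measure
    ranks = np.arange(1, n + 1)
    # G = 2 * sum(rank_i * y_i) / (n * sum(y)) - (n + 1) / n
    return float(2 * np.sum(ranks * y) / (n * y.sum()) - (n + 1) / n)

def gini_of_the_poor(pl, expenditures):
    y = np.asarray(expenditures, dtype=float)
    # keep only those strictly below the poverty line
    return gini_coefficient(y[y < pl])

gp = gini_of_the_poor(125, [110, 120, 150, 160])
print(gp)
```

The result could replace the constant `0.7` in `get_sen_index`. With only two poor people whose expenditures are close together, the Gini of the poor is very small, so the Sen index ends up close to the poverty gap index.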


### TODO - the Sen Shorrocks Thon index

## Watts Index

![title](./img/watts.png)

where the N individuals in the population are indexed in ascending order of income (or expenditure), and the sum is taken over the q individuals whose income (or expenditure) yi falls below the poverty line z.
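In text form, the formula as I understand it (my transcription, consistent with the code in this section, so the notation may differ slightly from the book):

$$W = \frac{1}{N} \sum_{i=1}^{q} \left[ \ln(z) - \ln(y_i) \right] = \frac{1}{N} \sum_{i=1}^{q} \ln\left( \frac{z}{y_i} \right)$$

Each poor person contributes the log of the ratio of the poverty line to their income, and the total is divided by the whole population $N$, not by $q$.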

The Watts index is increasingly used by researchers because it satisfies all the theoretical properties that one would want in a poverty index. Ravallion and Chen (2001) argue that three axioms are essential to any good measure of poverty:
- Under the focus axiom, the measure should not vary if the income of the nonpoor varies.
- Under the monotonicity axiom, any income gain for the poor should reduce poverty.
- Under the transfer axiom, inequality-reducing transfers among the poor should reduce poverty.

The Watts index satisfies these three axioms.
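The three axioms can be checked numerically on a toy sample. This is only an illustration on made-up numbers, not a proof; the helper name `watts` and the sample values are assumptions.

```python
import math

def watts(pl, incomes):
    # sum of log(pl / income) over the poor, divided by the whole sample
    return sum(math.log(pl / y) for y in incomes if y < pl) / len(incomes)

z = 125
base = watts(z, [90, 120, 150, 160])

# focus: a non-poor income rises from 160 to 300
focus = watts(z, [90, 120, 150, 300])
# monotonicity: the poorest person gains 5
gain = watts(z, [95, 120, 150, 160])
# transfer: 5 moves from the less poor person (120) to the poorer one (90)
transfer = watts(z, [95, 115, 150, 160])

print(focus == base, gain < base, transfer < base)
```

All three comparisons print `True`: the index ignores the non-poor, falls when a poor person gains, and falls when income moves from a less poor person to a poorer one.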


In [2]:
def get_watts_index(pl, data):
    total_sample = data.shape[0]  # number of rows
    watts_total = 0  # running sum of log(pl / expenditure) over the poor
    for expenditure in data["total_expenditure"]:
        # the sum is only over individuals whose income/expenditure falls below the poverty line
        # (expenditure must be greater than zero, otherwise the division fails)
        if expenditure < pl:
            watts_total += math.log(pl / expenditure)
    # finally divide by the total sample, not just the number of poor
    return watts_total / total_sample

In [33]:
wi = get_watts_index(125,lcf_test)
print(wi)

0.04216384150753505


## Time to exit poverty

For the jth person below the poverty line, the expected time to exit poverty (that is, to reach the poverty line), if consumption per capita grows at positive rate g per year, is

![title](./img/timeexit.png)
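In text form, the expression as I understand it (my transcription, consistent with the code in this section, so the notation may differ slightly from the book):

$$t_g^j \approx \frac{\ln(z) - \ln(y_j)}{g} = \frac{\ln(z / y_j)}{g}$$

This comes from solving $y_j (1 + g)^t = z$ for $t$, which gives $t = \ln(z / y_j) / \ln(1 + g)$, and then using $\ln(1 + g) \approx g$ for small $g$.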

Thus, averaged over the whole population (counting the nonpoor as needing zero time), the time taken to exit is the same as the Watts index divided by the expected growth rate of income (or expenditure) of the poor.

In [5]:
def get_time_to_exit(pl, data, growth):
    # growth is the annual growth rate of expenditure, e.g. 0.04 for 4%
    if growth <= 0:
        raise ValueError("growth must be positive")
    poor = data["total_expenditure"] < pl
    # years for each poor row to reach the poverty line; non-poor rows are left as NaN
    data.loc[poor, "time_to_exit"] = np.log(pl / data.loc[poor, "total_expenditure"]) / growth
    return data

In [13]:
tte = get_time_to_exit(125, lcf_test, 0.04)
print(tte)

   total_expenditure  time_to_exit
0                110      3.195834
1                120      1.020550
2                150           NaN
3                160           NaN
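A small check of the link between the per-person exit times and the Watts index: averaging the times over the whole sample, with zero for the non-poor, should give the Watts index divided by the growth rate. The helper names `watts` and `average_time_to_exit` are made up for this sketch.

```python
import math

def watts(pl, expenditures):
    # sum of log(pl / expenditure) over the poor, divided by the whole sample
    return sum(math.log(pl / y) for y in expenditures if y < pl) / len(expenditures)

def average_time_to_exit(pl, expenditures, growth):
    # per-person years to reach the poverty line; the non-poor need zero time
    times = [math.log(pl / y) / growth if y < pl else 0.0 for y in expenditures]
    return sum(times) / len(times)

sample = [110, 120, 150, 160]
avg_years = average_time_to_exit(125, sample, 0.04)
watts_over_g = watts(125, sample) / 0.04
print(round(avg_years, 4), round(watts_over_g, 4))
```

Both values agree (about 1.05 years for this sample), even though the two poor people individually need roughly 3.2 and 1.0 years.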
