## <span style = "color:#1A237E;">Hypothesis Two</span>
### <span style = "color:green;">Null Hypothesis</span>
There is no linear association between a county's poverty gap and the likelihood that its households access credit.
### <span style = "color:green;">Alternative Hypothesis</span>
Households in counties with higher poverty gaps are less likely to access credit.
### <span style = "color:green;">Relevance</span>
The poverty gap measures the depth of poverty: how far, on average, the poor fall below the poverty line. Households that are far below the poverty line may find it harder to access credit, which in turn can deepen their poverty.
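To make "depth of poverty" concrete, here is a minimal sketch of how a poverty gap index is commonly computed, assuming the standard Foster-Greer-Thorbecke definition (the mean shortfall below the poverty line, as a share of the line, with non-poor households counted as zero). The source does not say how the `Poverty Gap (%)` column was derived, so the function name `poverty_gap_index` and the input values are hypothetical and for illustration only.

```python
def poverty_gap_index(consumption, poverty_line):
    """Return the poverty gap index as a percentage.

    Assumes the standard FGT(1) definition: the average shortfall below the
    poverty line, expressed as a share of the line, over ALL households.
    """
    values = list(consumption)
    if not values:
        raise ValueError("consumption must not be empty")
    if poverty_line <= 0:
        raise ValueError("poverty_line must be positive")
    # Households at or above the line contribute a shortfall of zero.
    shortfalls = [max(poverty_line - c, 0.0) / poverty_line for c in values]
    return 100.0 * sum(shortfalls) / len(values)


# Hypothetical consumption values, not taken from the dataset.
household_consumption = [50, 100, 150, 200]
print(poverty_gap_index(household_consumption, poverty_line=100))  # 12.5
```

In this toy example only one of four households is poor, and it sits 50% below the line, so the index is 12.5%. A higher value means the poor are, on average, further from the line.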

In [1]:
# Import required libraries
import pandas as pd
from scipy.stats import pearsonr

In [2]:
# Read data into pandas dataframe
data = pd.read_csv("overall_poverty_est.csv")
data.head()

Unnamed: 0,residence_county,Headcount Rate (%),Distribution of the Poor (%),Poverty Gap (%),Severity of Poverty (%),Population (ths),Number of Poor (ths),Proportion of households that sought credit (%),Number of Households (ths),Proportion of households that sought and accessed credit (%),Number of Households that sought credit (ths),Distribution of the Poor (%).1,Poverty Gap (%).1,Severity of Poverty (%).1,Population (ths).1,Number of Poor (ths).1
0,Baringo,39.6,1.7,9.7,4.2,704,278,44.4,152,98.6,68,2.0,10.8,4.1,704,291
1,Bomet,48.8,2.7,9.3,2.8,916,447,19.5,179,83.8,35,2.1,5.6,1.6,916,300
2,Bungoma,35.7,3.4,9.5,3.6,1553,555,32.8,321,58.0,105,3.5,9.5,3.9,1553,503
3,Busia,69.3,3.6,22.3,9.3,840,583,5.5,177,62.2,10,3.4,17.5,7.2,840,500
4,Elgeyo/Marakwet,43.4,1.2,13.4,5.6,469,204,43.1,99,98.7,43,1.4,10.8,4.0,469,210


In [3]:
# Display number of rows and columns
data.shape

(47, 16)

In [4]:
# Display statistical summary
data.describe()

Unnamed: 0,Headcount Rate (%),Distribution of the Poor (%),Poverty Gap (%),Severity of Poverty (%),Number of Poor (ths),Proportion of households that sought credit (%),Number of Households (ths),Proportion of households that sought and accessed credit (%),Number of Households that sought credit (ths),Distribution of the Poor (%).1,Poverty Gap (%).1,Severity of Poverty (%).1,Number of Poor (ths).1
count,47.0,47.0,47.0,47.0,47.0,47.0,47.0,47.0,47.0,47.0,47.0,47.0,47.0
mean,40.557447,2.129787,12.085106,5.306383,349.042553,33.082979,242.808511,85.814894,81.851064,2.12766,10.314894,4.46383,309.297872
std,16.291085,1.109429,8.496751,5.254911,182.080125,16.18717,225.060343,16.561138,79.732972,1.142404,6.053852,3.569827,166.793795
min,16.7,0.2,2.4,0.5,36.0,5.5,30.0,33.9,4.0,0.2,3.0,0.8,25.0
25%,28.8,1.4,7.35,2.5,231.0,21.3,127.0,84.05,39.5,1.3,6.75,2.55,192.0
50%,35.8,2.0,9.4,3.5,321.0,32.9,210.0,92.5,69.0,2.0,9.1,3.5,287.0
75%,47.45,2.75,13.4,5.8,455.5,43.1,277.5,97.55,108.5,2.65,11.75,4.95,385.5
max,79.4,5.2,46.0,30.8,860.0,66.1,1503.0,99.2,510.0,4.9,32.9,20.4,717.0


In [5]:
# Display column headers
data.columns

Index(['residence_county', 'Headcount Rate (%)',
       'Distribution of the Poor (%)', 'Poverty Gap (%)',
       'Severity of Poverty (%)', 'Population (ths)', 'Number of Poor (ths)',
       'Proportion of households that sought credit (%)',
       'Number of Households (ths)',
       'Proportion of households that sought and accessed credit (%)',
       'Number of Households that sought credit (ths)',
       'Distribution of the Poor (%).1', 'Poverty Gap (%).1',
       'Severity of Poverty (%).1', 'Population (ths).1',
       'Number of Poor (ths).1'],
      dtype='object')

### <span style = "color:green;">Pearson correlation coefficient and p-value</span>

In [6]:
# Calculate Pearson correlation coefficient and p-value
corr, pval = pearsonr(data["Poverty Gap (%)"], data["Proportion of households that sought credit (%)"])

In [7]:
# Print results
print("Pearson correlation coefficient:", corr)
print("p-value:", pval)

Pearson correlation coefficient: -0.4777136954992022
p-value: 0.0006844717800419033
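By default, `pearsonr` reports a two-sided p-value, while the alternative hypothesis above is directional (a negative association). The sketch below shows how a one-sided test could be requested with the `alternative="less"` argument, which assumes SciPy 1.9 or later. So that it runs on its own, it uses only the five rows shown in the `data.head()` output; the numbers it prints are therefore illustrative and are not the full 47-county result.

```python
import pandas as pd
from scipy.stats import pearsonr

# The five rows from data.head() above, copied here only so the sketch is runnable.
sample = pd.DataFrame({
    "Poverty Gap (%)": [9.7, 9.3, 9.5, 22.3, 13.4],
    "Proportion of households that sought credit (%)": [44.4, 19.5, 32.8, 5.5, 43.1],
})

gap = sample["Poverty Gap (%)"]
sought = sample["Proportion of households that sought credit (%)"]

# Two-sided test (the default): H1 is "correlation is not zero".
corr_two, pval_two = pearsonr(gap, sought)

# One-sided test: H1 is "correlation is negative" (requires SciPy >= 1.9).
corr_one, pval_one = pearsonr(gap, sought, alternative="less")

print("Pearson correlation coefficient:", corr_one)
print("two-sided p-value:", pval_two)
print("one-sided p-value:", pval_one)
```

When the sample correlation is negative, the one-sided p-value is half the two-sided one, so on the full dataset replacing `sample` with `data` would only strengthen the evidence reported above.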


### <span style = "color:green;">Observations and Inferences</span>
1. The Pearson correlation coefficient between `Poverty Gap (%)` and `Proportion of households that sought credit (%)` is `-0.4777`, a moderate negative correlation. Across the 47 counties, a higher `Poverty Gap (%)` tends to go with a _lower_ `Proportion of households that sought credit (%)`, and vice versa. Note that this column measures households that sought credit, not those that obtained it, and that a correlation alone does not establish causation.
2. The two-sided `p-value` is `0.000684`, which is below the commonly used threshold of `0.05`. The observed correlation _is statistically significant_ at that level: a correlation this strong would be unlikely to arise by chance if the two variables were unrelated.
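The hypothesis is phrased in terms of access to credit, while the test above uses the share of households that sought credit. As a possible follow-up, the sketch below runs the same correlation against both outcome columns and adds Spearman's rank correlation, which is less sensitive to extreme counties (the summary shows a maximum poverty gap of 46.0 against a median of 9.4). The helper `correlate_with_poverty_gap` is a hypothetical name, and the five-row `sample` is copied from the `data.head()` output only so the code runs on its own; pass the full `data` frame to get meaningful results.

```python
import pandas as pd
from scipy.stats import pearsonr, spearmanr

SOUGHT = "Proportion of households that sought credit (%)"
ACCESSED = "Proportion of households that sought and accessed credit (%)"

# The five rows from data.head() above; illustrative only.
sample = pd.DataFrame({
    "Poverty Gap (%)": [9.7, 9.3, 9.5, 22.3, 13.4],
    SOUGHT: [44.4, 19.5, 32.8, 5.5, 43.1],
    ACCESSED: [98.6, 83.8, 58.0, 62.2, 98.7],
})


def correlate_with_poverty_gap(df, outcome, gap="Poverty Gap (%)"):
    """Return Pearson and Spearman correlations between the gap and an outcome column."""
    r, p = pearsonr(df[gap], df[outcome])
    rho, p_rank = spearmanr(df[gap], df[outcome])
    return {"pearson_r": r, "pearson_p": p, "spearman_rho": rho, "spearman_p": p_rank}


for column in (SOUGHT, ACCESSED):
    print(column)
    print(correlate_with_poverty_gap(sample, column))
```

If the two outcome columns give different pictures on the full dataset, that would suggest distinguishing between the decision to seek credit and the success of the application when interpreting the hypothesis.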