# Inequality, premature mortality and Covid-19

This is the code for a Python workshop in which LSE undergraduate economics students explore the relationship between inequality and premature mortality. Students can build a project around replication data from existing work, or create their own project using the other datasets provided.

The pandemic has exacerbated the inequalities that were already present in society, and brought to light many inequalities such as variation in digital connectivity and access to health care provision. 

__Topics covered using this data__ 
*	Income inequality and mortality 
*   Covid-19 and health outcomes

------

# Motivation 

OECD figures suggest that the UK has among the highest levels of income inequality in the European Union (as measured by the Gini coefficient), although income inequality is lower than in the United States. 

Data published by Eurostat, the statistical office of the European Union, gives a more positive picture, indicating income inequality in the UK is lower than in several other EU countries although it is slightly higher than the EU average.

__The Deaton Review__

In 2019 the Institute for Fiscal Studies, with Professor Angus Deaton, launched the Deaton Review: a 5-year review examining income and wealth inequalities, as well as differences in health outcomes, political power and economic opportunities, in British society and across the world.

You can read more here: https://www.ifs.org.uk/inequality/

------ 

## Data sets 
*	London borough profiles for premature mortality [Link](https://data.london.gov.uk/dataset/london-borough-profiles)
*	Cross-country inequalities (replication data from Prof Eric Neumayer (LSE)) [Link](https://www.dropbox.com/s/cp1gc60xagx4uj5/Article%20for%20AJPH.xlsx?dl=0)
*	England and Wales, Index of multiple deprivation [Link](https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019) plus shape files [Link](https://data-communities.opendata.arcgis.com/search?q=IMD&sort=-modified)
*	UK Covid-19 cases and deaths [Link](https://coronavirus.data.gov.uk/)

## Reading list

Atkinson, A. (2015) Inequality: what can be done? LSE International Inequalities Institute.

Neumayer, E. (2016) Inequalities of Income and Inequalities of Longevity: A Cross-Country Study.

Neumayer, E. (2017) Regional Inequalities in Premature Mortality in Great Britain.

__Covid-19 impact__

Davenport, A. et al. (2020) The geography of the COVID-19 crisis in England.

Deaton, A. (2021) COVID-19 and global income inequality. 

Public Health England (2020) Disparities in the risk and outcomes of COVID-19.

ONS (2020) Updating ethnic contrasts in deaths involving the coronavirus (COVID-19), England and Wales: deaths occurring 2 March to 28 July 2020.



# Example 1: Premature mortality in London

------
This dataset can be used as a stepping stone in understanding the demographic inequality in London. By creating a simple choropleth map, you can begin to make the case for further investigation into poor health outcomes during the pandemic. 

You can use the data from a tutorial, which can be found <a href="https://towardsdatascience.com/lets-make-a-map-using-geopandas-pandas-and-matplotlib-to-make-a-chloropleth-map-dddc31c1983d">here</a>.

Note: there is a mistake in the author's final result. He plots the wrong column, so the map doesn't actually show mortality.


# Import packages 

You will need to import:
* pandas
* numpy
* matplotlib
* geopandas (a new package)
* statsmodels

We are using geopandas to create the choropleth maps. To carry out any econometric analysis, the recommended package is statsmodels. (You'll want to use this when working with the replication data, or for your Covid-19 analysis of inequality and mortality.)

__A word of caution when using geopandas__

Make sure you follow the installation documentation carefully.
You can read it <a href="https://geopandas.readthedocs.io/en/latest/install.html">here</a>.

When using conda, create the separate environment that the documentation calls 'geo_env' and make sure you use it, so that you don't have any conflicts when using geopandas.
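Before starting, it can help to confirm that the active environment actually contains the packages named in this section. The snippet below is a minimal sketch that uses only the standard library; the helper name `missing_packages` is made up for illustration.

```python
import importlib.util

# Packages named in this section of the workshop.
REQUIRED = ["pandas", "numpy", "matplotlib", "geopandas", "statsmodels"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All workshop packages are available.")
```

If geopandas is reported as missing, the most likely cause is that the notebook is running outside the environment you installed it into.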

# Import the data 

The data covers multiple years. According to the site, it was last updated 2 months ago at the time of writing.
You can download the data yourself <a href="https://data.london.gov.uk/dataset/london-borough-profiles">here</a>.
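One way to read the downloaded file is sketched below. It assumes you saved the profiles as a CSV; the function name `load_profiles` and the file name in the comment are hypothetical, so substitute whatever you used. The `encoding` parameter is exposed because CSV downloads are not always UTF-8.

```python
import pandas as pd


def load_profiles(path, encoding="utf-8"):
    """Read the borough profiles CSV into a DataFrame, using the first row as headers."""
    return pd.read_csv(path, header=0, encoding=encoding)


# Example call (hypothetical file name; use the name you saved the download under):
# df = load_profiles("london-borough-profiles.csv")
# df.head()
```

If pandas raises a `UnicodeDecodeError`, trying `encoding="latin-1"` is a reasonable next step.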


# Inspect the data 

This is a good opportunity to understand how to inspect and clean data. In your final projects, you should describe the techniques you used to clean the data and identify any data wrangling you had to do. Try to answer the following questions:

* How many observations are in the data?
* What are the various columns? List them.
* How many missing observations are there?
* How many unique boroughs are there?
* Rename the columns of interest to simpler names (hint: df.rename(index=str, columns={'Area name':'borough',....}))
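The questions above map onto a handful of pandas calls. The sketch below runs them on a small made-up table rather than the real download: 'Area name' comes from the hint, while 'Mortality rate' and all the values are placeholders for illustration.

```python
import pandas as pd

# Toy stand-in for the borough profiles; the values are made up for illustration.
df = pd.DataFrame({
    "Area name": ["Borough A", "Borough B", "Borough C", "Borough B"],
    "Mortality rate": [160.0, None, 210.5, 185.2],
})

# Rename the columns of interest to simpler names.
df = df.rename(columns={"Area name": "borough", "Mortality rate": "mortality"})

n_observations = len(df)               # number of rows
column_names = list(df.columns)        # the column names as a list
missing_per_column = df.isna().sum()   # missing values in each column
n_boroughs = df["borough"].nunique()   # number of distinct boroughs

print(n_observations, column_names, n_boroughs)
print(missing_per_column)
```

`df.info()` and `df.describe()` give a quick overview of the same information in one call each.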

# Import the shape file

You can download the shapefiles here: https://data.london.gov.uk/dataset/statistical-gis-boundary-files-london

Select the option 'statistical-gis-boundaries-london.zip  (27.34 MB)'.
Make sure to move it to the same directory as all your other files and this notebook for ease of use. When saving, make sure the .shx AND .shp files are stored together!
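A shapefile is really a set of companion files, so a quick check before loading can save some confusing errors. The sketch below is one way to do it: the helper names are made up, the `.dbf` check is an addition (that file holds the attribute table, such as borough names), and the path in the comment is an assumption about how the zip unpacks.

```python
from pathlib import Path


def missing_sidecars(shp_path, required=(".shx", ".dbf")):
    """Return the companion-file extensions that are absent next to a .shp file."""
    shp = Path(shp_path)
    return [ext for ext in required if not shp.with_suffix(ext).exists()]


def load_boundaries(shp_path):
    """Load a shapefile into a GeoDataFrame after checking its companion files."""
    missing = missing_sidecars(shp_path)
    if missing:
        raise FileNotFoundError(f"Missing files next to {shp_path}: {missing}")
    import geopandas as gpd  # imported here so the check above works without geopandas
    return gpd.read_file(shp_path)


# Example call (assumed folder layout inside the unzipped download):
# map_df = load_boundaries(
#     "statistical-gis-boundaries-london/ESRI/London_Borough_Excluding_MHW.shp"
# )
```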

# Cleaning the dataframes

# Creating the map

Matplotlib colour map names for the map: https://matplotlib.org/tutorials/colors/colormaps.html

Here you just want to select mortality and map it. 
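A possible shape for the join and the map is sketched below. It assumes the boundary file stores borough names in a column called 'NAME' and that the profiles were renamed to 'borough' and 'mortality'; check both against your own dataframes. The function names are made up, and the toy check at the end uses plain DataFrames, which merge the same way as a GeoDataFrame.

```python
import pandas as pd


def join_profiles(boundaries, profiles, left_on="NAME", right_on="borough"):
    """Left-join borough data onto the boundaries, keeping every boundary row."""
    return boundaries.merge(profiles, how="left", left_on=left_on, right_on=right_on)


def unmatched(joined, column, name_col="NAME"):
    """List boundary names that found no value for `column` after the join."""
    return joined.loc[joined[column].isna(), name_col].tolist()


def plot_choropleth(gdf, column, cmap="Blues"):
    """Draw a choropleth of `column`; expects a geopandas GeoDataFrame."""
    import matplotlib.pyplot as plt  # local import: only needed when plotting
    fig, ax = plt.subplots(figsize=(10, 6))
    gdf.plot(column=column, cmap=cmap, linewidth=0.8, edgecolor="0.8", legend=True, ax=ax)
    ax.set_axis_off()
    ax.set_title(column)
    return fig, ax


# Toy check with made-up rows: Borough B has no matching profile row.
shapes = pd.DataFrame({"NAME": ["Borough A", "Borough B"]})
data = pd.DataFrame({"borough": ["Borough A"], "mortality": [160.0]})
joined = join_profiles(shapes, data)
print(unmatched(joined, "mortality"))
```

Checking `unmatched` before plotting is worthwhile, because boroughs whose names are spelled differently in the two files would otherwise show up as blank areas on the map.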

# Example 2: Inequality of longevity and income in the UK

This dataset was used to examine the effects of market income inequality (income inequality
before taxes and transfers) and income redistribution via taxes and transfers on inequality in longevity.

The corresponding paper can be found here: https://ajph.aphapublications.org/doi/pdf/10.2105/AJPH.2015.302849


__Motivation__

Public policies affect not only health and mortality at the individual level, but also the inequality of longevity, that is, inequality in the number of years lived.

For example, higher tobacco and alcohol taxes reduce consumption, as do nonfiscal regulatory measures such as restrictions on smoking in closed spaces. This reduces avoidable mortality from lung cancer and liver cirrhosis. More directly, governments implement health and safety regulations, influence total health spending and its allocation, and regulate the coverage of health insurance across individuals. All factors that reduce premature deaths also reduce longevity inequality.


# Import the data 

(Note: You have already imported all the relevant packages, so you won't need to do that again.)

We can begin simply by importing the data. 

The data can be downloaded from here: https://www.dropbox.com/s/cp1gc60xagx4uj5/Article%20for%20AJPH.xlsx?dl=0

# Inspect the data 

As before, generate some summary stats to describe the data that you have. This should also help you to understand what the dataset is and how you can use it. 

Consider finding the following: 
* Number of countries, and the names of these countries
* Years the data runs from and to
* Null observations
* Most unequal country (high Gini coefficient)
* Most equal country (low Gini coefficient)
* All the columns and what they describe


The next step would be to test for a few relationships, either by creating some scatter plots first or by running a regression.
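The summary questions and a first regression could look something like the sketch below. The column names ('country', 'year', 'gini') and the toy numbers are placeholders, not the names used in the replication file, and ranking countries by their mean Gini across years is just one possible choice. The helper names are made up.

```python
import pandas as pd


def gini_extremes(df, country_col="country", gini_col="gini"):
    """Return (most unequal, most equal) country by mean Gini over all years."""
    mean_gini = df.groupby(country_col)[gini_col].mean()
    return mean_gini.idxmax(), mean_gini.idxmin()


def fit_ols(df, formula):
    """Fit an OLS regression from a formula such as 'outcome ~ gini'."""
    import statsmodels.formula.api as smf  # local import: only needed for the regression
    return smf.ols(formula, data=df).fit()


# Toy panel; the numbers are made up and only illustrate the calls.
toy = pd.DataFrame({
    "country": ["X", "X", "Y", "Y", "Z", "Z"],
    "year": [2000, 2001, 2000, 2001, 2000, 2001],
    "gini": [0.30, 0.32, 0.45, 0.47, 0.25, 0.26],
})
print(toy["country"].nunique(), toy["year"].min(), toy["year"].max())
print(toy.isna().sum().sum())
print(gini_extremes(toy))
```

With the real data loaded, a call such as `fit_ols(data, "outcome ~ gini").summary()` prints the regression table, where `outcome` stands for whichever longevity-inequality column you choose. A scatter plot of the same two columns with `data.plot.scatter(x="gini", y="outcome")` is a useful first look.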

# Covid-19 inequality and health outcomes
The above datasets give an insight into where and how income inequality and poor health outcomes are prevalent. Using the coronavirus data on hospital admissions or deaths, students can begin to analyse how these outcomes relate to deprivation and poverty.

The Index of Multiple Deprivation could be a useful dataset to begin assessing these relationships. (Link is provided above.) The Annual Survey of Hours and Earnings data from the ONS might be another source of income data at a local authority level. It is available [here](https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/earningsandworkinghours/datasets/placeofresidencebylocalauthorityashetable8).
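As a starting point, deprivation scores and Covid-19 deaths can be joined on a local authority code and compared. The sketch below uses made-up tables: the column names ('la_code', 'imd_score', 'deaths', 'population') and every value are placeholders, so the printed correlation says nothing about the real data.

```python
import pandas as pd


def deaths_per_100k(deaths, population):
    """Crude death rate per 100,000 residents."""
    return deaths / population * 100_000


# Toy local-authority tables; all values are made up for illustration.
imd = pd.DataFrame({"la_code": ["E1", "E2", "E3"], "imd_score": [10.0, 20.0, 30.0]})
covid = pd.DataFrame({
    "la_code": ["E1", "E2", "E3"],
    "deaths": [50, 120, 210],
    "population": [100_000, 150_000, 200_000],
})

# validate="one_to_one" raises an error if a code is duplicated in either table.
merged = imd.merge(covid, on="la_code", how="inner", validate="one_to_one")
merged["death_rate"] = deaths_per_100k(merged["deaths"], merged["population"])

# Pearson correlation between deprivation score and the crude death rate.
correlation = merged["imd_score"].corr(merged["death_rate"])
print(merged[["la_code", "imd_score", "death_rate"]])
print(round(correlation, 3))
```

Crude rates ignore differences in age structure between areas, so an age-standardised rate or a regression with controls would be a natural next step for a project.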