# 002-Article 2-The £80,000 question: Why do the rich believe they are anything but?

In [2]:
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:90% !important; }</style>"))

import csv


def read_csv_as_list_dict(filename, separator, quote):
    """
    Inputs:
    filename  - name of CSV file
    separator - character that separates fields
    quote     - character used to optionally quote fields
    Output:
    Returns a list of dictionaries where each item in the list
    corresponds to a row in the CSV file.  The dictionaries in the
    list map the field names to the field values for that row.
    """
    table = []
    with open(filename, newline='') as csvfile:
        csvreader = csv.DictReader(csvfile, delimiter=separator, quotechar=quote)
        for row in csvreader:
            table.append(row)
    return table


def add_list_of_dictionaries_to_CSV(filename, listofobjects):
    """
    Inputs:
    filename - the csv file to write the data to
    listofobjects - the data to be written out
    """
    
    keys = listofobjects[0].keys()
    with open(filename, 'w', newline='') as output_file:
        dict_writer = csv.DictWriter(output_file, keys)
        dict_writer.writeheader()
        dict_writer.writerows(listofobjects)
    
    return []
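
One detail of these helpers is worth keeping in mind: `csv.DictReader` returns every field as a string, which is why numeric columns are wrapped in `float()` later in this notebook. The following is a minimal round-trip sketch of that behaviour. The file name and the two rows are made-up assumptions for illustration, not project data.

```python
import csv
import os
import tempfile

# Made-up rows purely for illustration (not taken from the project data).
rows = [
    {"Percentile": 11, "Pretax income": 13300},
    {"Percentile": 21, "Pretax income": 15500},
]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.csv")

    # Write the rows out, using the first row's keys as the header.
    with open(path, "w", newline="") as output_file:
        writer = csv.DictWriter(output_file, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

    # Read them back: every value now comes back as a string.
    with open(path, newline="") as csvfile:
        round_tripped = list(csv.DictReader(csvfile))

print(round_tripped[0])
print(float(round_tripped[1]["Pretax income"]))
```

The integers written out come back as `'11'` and `'13300'`, so any arithmetic on imported values needs an explicit conversion first.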

### Background

I first learned about this story the day after Question Time aired on BBC One, when the BBC reality check team published [this](https://www.bbc.co.uk/news/50517136) article. I liked the article as an opportunistic move that used a viral moment to provide people with better data on the UK income distribution. However, the Question Time incident itself initially seemed pretty unremarkable - it could easily be headlined _"one man in sample of one-hundred people significantly misjudges the UK income distribution"_ - not an earth-shaking or even particularly surprising discovery.

But later that week I came across [this tweet](https://twitter.com/mattsmithetc/status/1197812093443067904) from Matthew Smith, a data journalist at YouGov, which shows that people's definition of 'richness' changes as their income levels change. While most people seemed to be judging the £80,000 man as an isolated bit-of-a-plonker, YouGov's data demonstrated that his views were actually more consistent with a typical high-income individual than you might intuitively expect.

The Duke University psychologist Dan Ariely coined the term "predictably irrational" to describe human behaviours that are neither rational nor erratic and random. These behaviours are key to the disciplines of cognitive psychology and behavioural economics. If people's beliefs are scattered randomly around the objective truth, there's not much to be done other than warn those people against overconfidence. But if we can identify the systematic mistakes that lead person after person to the same erroneous beliefs, people can adjust their behaviour and form new cognitive strategies to mitigate these mistakes.

Systematic mistakes are also particularly insidious. Where beliefs scatter around the truth at random, better decisions can be made by pooling the views of multiple people (so that, in the long run, one misconception balances out another). But when everyone is making the same mistakes, we cannot arrive at better decisions and beliefs without behavioural change (even more importantly, systematic errors can be exploited on a mass scale - think of cult followings, advertising and even fascism).

To me, YouGov's data transformed the Question Time story from one of random misperception to one of predictable irrationality. As I researched the subject, I discovered that Labour's tax policy was designed specifically to tip-toe around these misperceptions of medium- to high-income earners. The idea that a policy belonging to the more liberal of the two potential governing parties was being determined, in part, by the systematic misjudgments of the country's wealthiest citizens seemed problematic and a story worth telling.

Of course, our misjudgments of 'richness' are not purely the product of biases; most people simply do not have a good feel for how much income and wealth other people have. Therefore, I wanted the feature to also provide a nuanced look at inequality in the UK.

### Labour's tax policy

You can view the Question Time exchange [here](https://www.bbc.co.uk/news/av/uk-politics-50514656/question-time-tax-row-i-m-one-of-the-people-labour-will-tax-more). Full episodes of Question Time are also available on the BBC website for a year after broadcast - the 21st of November episode is [here](https://www.bbc.co.uk/iplayer/episode/m000bhzc/question-time-2019-21112019).

> the Institute for Fiscal Studies (IFS) shared their 'initial reaction' to the Labour Party's manifesto. They described Labour's plans as _"a very substantial increase in the role of the state"_ and argued that the party needed to _"be clear that the tax increases required to do that will need to be widely shared rather than pretending that everything can be paid for by companies and the rich."_

That report can be viewed [here](https://www.ifs.org.uk/election/2019/article/labour-s-proposed-income-tax-rises-for-high-income-individuals).

### Visualisation 1 - Marginal tax rates explained, Labour's tax plans and recent changes to income tax thresholds

The marginal tax rates for income tax and national insurance are taken from the [government website](https://www.gov.uk/guidance/rates-and-thresholds-for-employers-2020-to-2021). From the 19/20 financial year to 20/21 income tax rates (unusually) remained constant, but the threshold for paying national insurance contributions increased from £8,632 to £9,500. 

It was important to include national insurance because this is an often overlooked component of the tax system. Too regularly, we evaluate the progressivity of the tax system by only considering income tax bands, forgetting the impact of the regressive national insurance tax (ideally we would also consider sales taxes - VAT is regressive since low-income earners spend a higher percentage of their income on essential goods and services, while stamp duty and council tax are both broadly progressive).

Presenting calculations of take-home pay for every single permutation of a person's circumstances - do they live in Scotland or England; are they self-employed or PAYE; are they earning through dividends; are they married, blind or over seventy-five - was clearly impractical, so it was necessary to establish a 'typical' individual for calculation purposes. The visualisation shows take-home pay during the 20/21 financial year for a person whose personal allowance extends to £12,500 (this figure can vary depending on personal circumstances), and whose entire earnings come through PAYE employment.

For the visualisation, I needed to calculate the take-home pay (after income tax only, and after income tax and national insurance) for individuals with different incomes. The code below calculates these figures:

In [6]:
def CreateTakeHomePayArrays(MarginalTaxRatesObject):
    """
    inputs an object where the keys represent thresholds where the marginal tax rate changes and the values are the rate
    of marginal tax paid on income earned below this figure (but above any other key in the array)
    
    the function then calculates the take home pay at each of these thresholds and outputs an object containing
    these values
    """
    
    numbArray = list(MarginalTaxRatesObject.keys())

    outputObject = {}
    
    # for each threshold - take the take-home pay of the previous threshold and add on the difference between
    # that threshold and this one multiplied by one minus the marginal tax rate
    for i, numb in enumerate(numbArray):
        
        if i == 0:
            
            outputObject[numb] = numb
        
        else: 
            
            outputObject[numb] = outputObject[numbArray[i-1]]+(numb-numbArray[i-1])*(1-MarginalTaxRatesObject[numb])
 
    return outputObject
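
As a quick sanity check on the threshold arithmetic, the sketch below recomputes take-home pay for a single salary directly from the band definitions. `take_home_pay` is a hypothetical helper written for this note, not part of the published pipeline, and it assumes the salary does not exceed the highest threshold supplied. The bands are the 2020/21 figures used in the cells that follow, truncated at £100,000.

```python
def take_home_pay(gross, bands):
    """Take-home pay for `gross`, where `bands` maps each band's upper
    threshold to the marginal rate paid on income earned below it.

    Assumes `gross` is no larger than the highest threshold in `bands`.
    """
    net = 0.0
    lower = 0
    for upper, rate in sorted(bands.items()):
        if gross <= lower:
            break
        # Portion of the salary that falls inside this band.
        slice_in_band = min(gross, upper) - lower
        net += slice_in_band * (1 - rate)
        lower = upper
    return net


# 2020/21 bands, truncated at £100,000 for brevity.
income_tax_only = {12500: 0, 50000: 0.2, 100000: 0.4}
income_tax_and_ni = {9500: 0, 12500: 0.12, 50000: 0.32, 100000: 0.42}

print(round(take_home_pay(80000, income_tax_only)))    # 60500
print(round(take_home_pay(80000, income_tax_and_ni)))  # 55040
```

On these assumptions, an £80,000 salary leaves £60,500 after income tax alone and £55,040 once national insurance is included.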

In [None]:
# calculate take-home pay at values between thresholds

def CreateExtraDataPoints(PostTaxKeyDataPoints, MarginalTaxRatesObject):
    """
    this function inputs the object (marginal tax threshold: rate on income below threshold) used in
    CreateTakeHomePayArrays and the output of that function - the take home income at those thresholds.
    
    it then calculates take-home income levels at £500 intervals. 
    
    I transformed the data in this way because it was the simplest solution for allowing readers to hover over
    the line charts and view the take-home income for different pre-tax figures. By creating lots of data points
    javascript (the in-browser programming language) can simply snap to the nearest £500 value and read the
    post-tax value off the imported data, instead of having to interpolate how far between data points the mouse
    was hovering.
    """
    outputArray = []
    
    numbArray = list(MarginalTaxRatesObject.keys())
    
    xVariable = 0
    
    while xVariable <= 250000:
        
        outputObject = {}
    
        for i, numb in enumerate(numbArray[:len(numbArray)-1]):
            
            # identify which thresholds the xVariable falls between
            
            if xVariable > numb and xVariable <= numbArray[i+1]:
                
                if xVariable <= (numb+500):
                    
                    outputObjectB = {}
                    
                    outputObjectB["xVariable"] = numb
                    outputObjectB["yVariable"] = PostTaxKeyDataPoints[numb]
                    outputObjectB["marginalTaxRate"] = MarginalTaxRatesObject[numbArray[i]]
                    
                    outputArray.append(outputObjectB)
                        
                    outputObjectC = {}

                    outputObjectC["xVariable"] = numb
                    outputObjectC["yVariable"] = PostTaxKeyDataPoints[numb]
                    outputObjectC["marginalTaxRate"] = MarginalTaxRatesObject[numbArray[i+1]]

                    outputArray.append(outputObjectC)
                    
                if xVariable not in numbArray or xVariable == 250000:
                
                    outputObject["xVariable"] = xVariable
                    outputObject["yVariable"] = ((xVariable-numb)*(PostTaxKeyDataPoints[numbArray[i+1]])+(numbArray[i+1]-xVariable)*(PostTaxKeyDataPoints[numb]))/(numbArray[i+1]-numb)
                    outputObject["marginalTaxRate"] = MarginalTaxRatesObject[numbArray[i+1]]

                    outputArray.append(outputObject)
    
                break
                
        xVariable += 500
                
    return outputArray
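
The interpolation step inside the function above is ordinary linear interpolation between two thresholds. The sketch below isolates it, alongside one possible way to snap a hovered value to the nearest £500. Both helper names are hypothetical, and the snapping function is only an assumption about what the front end does, written here in Python for consistency with the rest of the notebook.

```python
def interpolate_take_home(gross, lower, upper, net_at_lower, net_at_upper):
    """Linear interpolation of take-home pay between two thresholds."""
    weight = (gross - lower) / (upper - lower)
    return net_at_lower + weight * (net_at_upper - net_at_lower)


def snap_to_step(value, step=500):
    """Nearest multiple of `step` (assumed behaviour of the chart's hover)."""
    return step * round(value / step)


# Take-home pay after income tax only (2020/21) is £42,500 at the £50,000
# threshold and £72,500 at the £100,000 threshold.
net_at_80k = interpolate_take_home(80000, 50000, 100000, 42500, 72500)

print(round(net_at_80k))    # 60500
print(snap_to_step(80240))  # 80000
```

Because take-home pay is linear within a band, interpolating between thresholds gives the same answer as summing the bands directly.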

In [21]:
#output data required for visualisation 1 phase 1 (take-home pay and marginal tax rates including and excluding national insurance)

marginalTaxThresholdsMasterArray = [ 
    
{"name": "001-understandingMarginalTaxRates-incomeTaxOnlyFor2021", "thresholdsObject": {0: 0, 12500: 0, 50000: 0.2, 100000: 0.4, 125000: 0.6, 150000: 0.4, 250000: 0.45}},
{"name": "002-understandingMarginalTaxRates-incomeTaxAndNationalInsuranceFor2021", "thresholdsObject": {0: 0, 9500: 0, 12500: 0.12, 50000: 0.32, 100000: 0.42, 125000: 0.62, 150000: 0.42, 250000: 0.47 }}    
   
]

outputArrayOfObjects = []

for i, obj in enumerate(marginalTaxThresholdsMasterArray):
    
    outputArrayOfObjects.append({"thresholdsObjectCalculated": CreateTakeHomePayArrays(obj["thresholdsObject"])})
    
    outputArrayOfObjects[i]["extraDataPointsArray"] = CreateExtraDataPoints(outputArrayOfObjects[i]["thresholdsObjectCalculated"], obj["thresholdsObject"])
    
    add_list_of_dictionaries_to_CSV("./Data/Visualisation 1 - Understanding marginal tax rates/Outputs/Visualisation 1 - "+obj["name"]+".csv", outputArrayOfObjects[i]["extraDataPointsArray"])

In [22]:
#output data required for visualisation 1 phase 2 (take-home pay after income tax only under Labour's proposed income tax policy)

marginalTaxThresholdsMasterArray = [ 
    
{"name": "003-laboursIncomeTaxPolicy", "thresholdsObject": {0: 0, 12500: 0, 50000: 0.2, 80000: 0.4, 100000: 0.45, 125000: 0.675, 250000: 0.5}},
    
]

outputArrayOfObjects = []

for i, obj in enumerate(marginalTaxThresholdsMasterArray):

    outputArrayOfObjects.append({"thresholdsObjectCalculated": CreateTakeHomePayArrays(obj["thresholdsObject"])})
    
    outputArrayOfObjects[i]["extraDataPointsArray"] = CreateExtraDataPoints(outputArrayOfObjects[i]["thresholdsObjectCalculated"], obj["thresholdsObject"])
    
    add_list_of_dictionaries_to_CSV("./Data/Visualisation 1 - Understanding marginal tax rates/Outputs/Visualisation 1 - "+obj["name"]+".csv", outputArrayOfObjects[i]["extraDataPointsArray"])

Historical income tax thresholds were calculated by taking [this](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/792502/Table-a2.pdf) data from  the government website which provides historical tax rates excluding the personal allowance, and [this](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/792497/Table-a1.pdf) data which provides information on the "typical" tax-free personal allowance and also looks at how various circumstances have affected the personal allowance over time.

The financial years chosen for final publication were those with the biggest changes in policy from the previous year.

The inflation-adjusted tax thresholds are based on the March 2010 (89.4) and March 2020 (108.6) Consumer Price Index figures published by the [Office for National Statistics](https://www.ons.gov.uk/economy/inflationandpriceindices/datasets/consumerpriceinflation). The idea is that if tax thresholds had kept up with inflation, a tax-free personal allowance of £6,475 in 2010 would have risen to:

$$\frac{108.6}{89.4} \times 6475 \approx 7866$$
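
The same ratio applies to every 2010/11 threshold. A minimal sketch of that uplift, assuming each result is rounded to the nearest pound (`uplift_for_inflation` is a hypothetical helper name):

```python
CPI_MARCH_2010 = 89.4
CPI_MARCH_2020 = 108.6


def uplift_for_inflation(amount):
    """Scale a 2010 amount by the CPI ratio and round to the nearest pound."""
    return round(amount * CPI_MARCH_2020 / CPI_MARCH_2010)


# Thresholds from the 2010/11 income tax schedule.
thresholds_2010_11 = [6475, 43875, 100000, 112950, 150000]
adjusted = [uplift_for_inflation(t) for t in thresholds_2010_11]

print(adjusted)  # [7866, 53298, 121477, 137208, 182215]
```

These are the thresholds used for the inflation-adjusted 2010/11 series in the cell that follows.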

In [23]:
#output data required for visualisation 1 phase 3 (historical marginal tax rates)

marginalTaxThresholdsMasterArray = [ 
    
{"name": "004-marginalTaxRates-200910", "thresholdsObject": {0: 0, 6475: 0, 43875: 0.2, 250000: 0.4}},
{"name": "005-marginalTaxRates-201011", "thresholdsObject": {0: 0, 6475: 0, 43875: 0.2, 100000: 0.4,  112950: 0.6, 150000: 0.4, 250000: 0.5}},

{"name": "006-marginalTaxRates-201314", "thresholdsObject": {0: 0, 9440: 0, 41450: 0.2, 100000: 0.4,  118880: 0.6, 150000: 0.4, 250000: 0.45}},
{"name": "007-marginalTaxRates-201516", "thresholdsObject": {0: 0, 10600: 0, 42385: 0.2, 100000: 0.4,  121200: 0.6, 150000: 0.4, 250000: 0.45}},

{"name": "008-marginalTaxRates-201011-inflationAdjusted", "thresholdsObject": {0: 0, 7866: 0, 53298: 0.2, 121477: 0.4,  137208: 0.6, 182215: 0.4, 250000: 0.5}},
]

outputArrayOfObjects = []

for i, obj in enumerate(marginalTaxThresholdsMasterArray):
    
    outputArrayOfObjects.append({"thresholdsObjectCalculated": CreateTakeHomePayArrays(obj["thresholdsObject"])})
    
    outputArrayOfObjects[i]["extraDataPointsArray"] = CreateExtraDataPoints(outputArrayOfObjects[i]["thresholdsObjectCalculated"], obj["thresholdsObject"])
    
    add_list_of_dictionaries_to_CSV("./Data/Visualisation 1 - Understanding marginal tax rates/Outputs/Visualisation 1 - "+obj["name"]+".csv", outputArrayOfObjects[i]["extraDataPointsArray"])


The Institute for Fiscal Studies report showing that high earners have been slipping into higher tax thresholds, because the recovery threshold for the personal allowance has not moved in the past decade, is titled ['Dragging people into higher rates of tax'](https://www.ifs.org.uk/publications/14048) and was published in April 2019.

### Visualisation 2 - What does the UK income distribution look like?

The first graph takes the same data plotted in the BBC reality check article ['General election 2019: Does £80,000 put you in the top 5% of earners?'](https://www.bbc.co.uk/news/50517136). Namely, this is the distribution of income tax payers for the 2016/17 financial year published by [HMRC](https://www.gov.uk/government/statistics/percentile-points-from-1-to-99-for-total-income-before-and-after-tax).

To uplift income levels from 2016/17 to 'projected' current income levels, I used the [Average Weekly Earnings](https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/earningsandworkinghours/datasets/averageweeklyearningsearn01) data published by the Office for National Statistics. It shows that the average weekly earnings for the whole economy across the 2016/17 financial year were £496. In February 2020 (the most up-to-date figures at time of publication) average weekly earnings were £545. Therefore, the uplift figure was $\frac{545}{496}-1=9.8\%$.

The World Inequality Lab data can be accessed [here](https://wid.world/country/united-kingdom/). The most up-to-date data at the time of publication was from 2016. I could have tried to adjust this to the 2016/17 financial year using the Average Weekly Earnings data as a proxy adjustment factor, but I decided not to because I wanted to keep the calculations as simple and easy to follow as possible. For the same reason, I decided to uplift the World Inequality Lab data by the same 9.8% adjustment factor.

### Motivations behind Labour's tax policy

> Because the party was unwilling to ask middle-income earners to pay more, as they do in countries with greater government spending like Finland and Denmark, it necessitated putting a heavy burden on corporation tax to raise revenue.

In Finland, the tax-free personal allowance is €18,100. On additional income up to €27,200 the marginal tax rate is 6%. On additional income up to €44,800 the marginal tax rate is 17.25%. On additional income up to €78,500 the marginal tax rate is 21.25%. And income above €78,500 is taxed at a rate of 31.25%. Additionally, all residents pay a flat local income tax of between 16.5% and 23.5% depending on the municipality ([source](https://taxsummaries.pwc.com/finland/individual/taxes-on-personal-income)).

Government spending is measured here as a share of each country's GDP. These figures are plotted by the data website [Our World In Data](https://ourworldindata.org/government-spending) and show that Denmark and Finland spend 7 and 9 percentage points more of their GDP respectively than the UK.

> Both Owen Jones at The Guardian and Stephen Bush at the New Statesman reported that the £80,000 figure was chosen, instead of a lower figure of £50,000, because internal polling indicated that while few people earn £50,000 (just 10% of taxpayers), many saw it as a realistic future income. 

Owen Jones stated this in ['Corbyn and McDonnell tax radicals? I say they aren’t radical enough'](https://www.theguardian.com/commentisfree/2018/feb/15/corbyn-mcdonnell-tax-radical-labour), an opinion piece published in the Guardian in February 2018. Stephen Bush made the same point in [this](https://www.newstatesman.com/politics/economy/2018/11/how-labour-s-tax-wobbles-may-help-conservatives-win-back-lost-voters) November 2018 article for the New Statesman.

> In April 2017, Shadow Chancellor John McDonnell announced that the rich, by which he meant individuals earning over £70,000 to £80,000 a year, would be expected to pay more under a future Labour government. 

McDonnell said this during an interview on the BBC's Today programme and it was widely reported, including in [this](https://www.theguardian.com/money/2017/apr/19/how-much-earn-rich-70000-labour) article for the Guardian.

> ...YouGov published data showing what the general public thinks constitutes 'rich'.

You can read about YouGov's results and view the raw data [here](https://yougov.co.uk/topics/politics/articles-reports/2017/06/02/how-much-money-do-you-need-earn-year-be-rich).

### Visualisation 3 - Factors other than income that affect a person's financial prosperity

The regional income distribution data was taken from the [provisional 2019 figures](https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/earningsandworkinghours/datasets/placeofresidencebylocalauthorityashetable8) published by the Office for National Statistics as part of the Annual Survey of Hours and Earnings (ASHE). For these figures, the unit is not individuals but full-time employee jobs.

The percentiles in between the provided figures were interpolated using HMRC's UK-wide employee earnings as a proxy. 

For example, if the 11th percentile lower threshold was £20,000 and the 21st percentile lower threshold was £25,000 for Region A, to interpolate the 16th percentile I would look up the 11th, 16th and 21st percentiles in HMRC's UK-wide distribution (£13,300, £14,400 and £15,500 respectively) and use them in the following calculation:

$$ RegionA(p16) = (25000-20000)\frac{14400-13300}{15500-13300}+20000 $$
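
The Region A example can be reproduced in a few lines. This is a minimal sketch of the formula only; `interpolate_with_proxy` is a hypothetical helper, and the Region A figures are the illustrative ones from the example rather than real data.

```python
def interpolate_with_proxy(region_lower, region_upper,
                           proxy_lower, proxy_target, proxy_upper):
    """Place a regional percentile between two published values, using the
    shape of a proxy distribution over the same percentile range."""
    ratio = (proxy_target - proxy_lower) / (proxy_upper - proxy_lower)
    return region_lower + ratio * (region_upper - region_lower)


# Region A: 11th percentile = £20,000, 21st percentile = £25,000.
# Proxy values at the 11th, 16th and 21st percentiles: £13,300, £14,400, £15,500.
region_a_p16 = interpolate_with_proxy(20000, 25000, 13300, 14400, 15500)

print(region_a_p16)  # 22500.0
```

Here the proxy's 16th percentile sits exactly halfway between its 11th and 21st, so the result matches a straight-line interpolation on percentile; the two methods only diverge where the proxy distribution curves.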

This interpolation was conducted using the code below. Ultimately, the approach makes little impact compared to assuming a linear relationship between percentile and pre-tax earnings for the percentiles between published figures.

In [5]:
#Import Data
ASHERegionIncomeData = read_csv_as_list_dict("./Data/Visualisation 3 - Variations in income and wealth/CSV/Visualisation 3 - ASHE percentile data points by region.csv", ",", '"')

HMRCIncomeTaxPayerData = read_csv_as_list_dict("./Data/Visualisation 3 - Variations in income and wealth/CSV/Visualisation 3 - HMRC income tax payer data.csv", ",", '"')

In [12]:
def ConvertArrayOfObjectsToSingleObject(ArrayOfObjects, XFieldName, YFieldName):
    """
    This function simply converts the imported data (which is in list format) into an object.
    """
    
    outputObject = {}
    
    for obj in ArrayOfObjects:
        
        outputObject[obj[XFieldName]] = obj[YFieldName]
    
    return outputObject

HMRCIncomeTaxPayerDataObject = ConvertArrayOfObjectsToSingleObject(HMRCIncomeTaxPayerData, "Percentile", "Pretax income")


def CreateExtraDataPoints(MainDataPointsObject, ProxyDataObject, Classification):
    """
    Takes as input the published data points for a given region, the HMRC full income distribution being used as a proxy
    and the region name (which is simply used to name the outputted object)
    
    The function interpolates income thresholds for the percentiles between the published data points
    """
    outputArray = []
    
    mainDataPointsXValuesArray = [11,21,31,41,51,61,71,81,91]
    
    for numb in range(11,92):
        
        outputObject = {}
        
        outputObject["classification"] = Classification
        outputObject["xVariable"] = numb
        
        if numb in mainDataPointsXValuesArray:
            
            if MainDataPointsObject[str(numb)] == "NULL":
                
                outputObject["yVariable"] = "NULL"
            
            else:
                
                outputObject["yVariable"] = MainDataPointsObject[str(numb)]
            
        else:
            
            for i, d in enumerate(mainDataPointsXValuesArray):
                
                if numb > d and numb < mainDataPointsXValuesArray[i+1]:
                    
                    lowerBound = d
                    upperBound = mainDataPointsXValuesArray[i+1]
                    
                    if MainDataPointsObject[str(lowerBound)] == "NULL" or MainDataPointsObject[str(upperBound)] == "NULL":
                        
                        outputObject["yVariable"] = "NULL"
                        
                    else:
                    
                        lowerBoundYVariable = float(MainDataPointsObject[str(lowerBound)])
                        upperBoundYVariable = float(MainDataPointsObject[str(upperBound)])

                        proxyLowerBoundYVariable = float(ProxyDataObject[str(lowerBound)])
                        proxyUpperBoundYVariable = float(ProxyDataObject[str(upperBound)])

                        proxyDYVariable = float(ProxyDataObject[str(numb)])

                        ratio = (proxyDYVariable - proxyLowerBoundYVariable)/(proxyUpperBoundYVariable - proxyLowerBoundYVariable)


                        outputObject["yVariable"] = ratio*(upperBoundYVariable - lowerBoundYVariable) + lowerBoundYVariable
                    
    
        outputArray.append(outputObject)
    
    return outputArray

In [14]:
OutputArray = []

#for each region, interpolate figures between published data points

for obj in ASHERegionIncomeData:

    OutputArray += CreateExtraDataPoints(obj, HMRCIncomeTaxPayerDataObject, obj["Region"])
    

#Output Data
add_list_of_dictionaries_to_CSV("./Data/Visualisation 3 - Variations in income and wealth/Outputs/Visualisation 3 - percentile earnings data by region.csv", OutputArray)

[]

The wealth distribution data is lifted directly from Figure 5 of the [Total wealth in Great Britain: April 2016 to March 2018](https://www.ons.gov.uk/peoplepopulationandcommunity/personalandhouseholdfinances/incomeandwealth/bulletins/totalwealthingreatbritain/april2016tomarch2018) report published by the Office for National Statistics.

The report is based on the Wealth and Assets Survey and full methodology for the survey can be read [here](https://www.ons.gov.uk/peoplepopulationandcommunity/personalandhouseholdfinances/debt/methodologies/wealthandassetssurveyqmi). 
The survey is based on continuous data collection between April 2016 and March 2018. Net wealth is the sum value of all assets minus the value of all liabilities. According to the Office for National Statistics, the self-evaluation of wealth used by the Wealth and Assets Survey typically yields higher estimates than other methods.

The data for income by household composition comes from [analysis](https://www.gov.uk/government/statistics/family-resources-survey-financial-year-201718) published by the Department for Work and Pensions based on the Family Resources Survey for 2017/18.

### How do our perceptions of richness change as our income levels change?

> YouGov repeated the survey in America in 2018, finding the same results. 67% of those earning below \\$60,000 considered a person earning \\$100,000 to be rich, whereas just 27% of people with incomes above \\$90,000 agreed.

A summary of YouGov's results in America and the raw survey data can be found [here](https://today.yougov.com/topics/economy/articles-reports/2019/01/14/how-much-money-do-you-need-earn-year-be-rich). The study was essentially identical to their UK study a year earlier.

> Richard Reeves, a senior fellow at the Brookings Institution, has described this link between income levels and perceptions of richness as the ' "Me? I'm not rich!" problem.' He has argued that the relationship makes implementing progressive tax reform more difficult.

Reeves talks about the problem in [this](https://www.brookings.edu/opinions/wealth-inequality-and-the-me-im-not-rich-problem/) 2015 article for the Brookings Institution.

> ...For example, in 2015, former US president Barack Obama abandoned his plans to reduce tax breaks on college saving plans. The beneficiaries of the tax were mainly rich: the White House estimated that households with incomes over $200,000 received 70% of the tax breaks. 

In his article for the Brookings Institution, Reeves cites [this](https://blogs.wsj.com/washwire/2015/01/27/what-the-demise-of-obamas-529-plan-says-about-tax-reform/) story from the Wall Street Journal about Obama's abandoned plan to reduce tax breaks on college saving plans.

Guillermo Cruces, Martin Tetaz and Ricardo Perez-Truglia's research paper was titled ['Biased perceptions of income distribution and preferences for redistribution: Evidence from a survey experiment'](https://www.researchgate.net/publication/309823043_Biased_perceptions_of_income_distribution_and_preferences_for_redistribution_Evidence_from_a_survey_experiment) and was published in the Journal of Public Economics in January 2013.

> The British Social Attitudes Survey asked a similar question in 2005. They found that only six in every one-hundred participants believed they were in the richest 20% of households (and just nine in every one-hundred believed they were in the bottom 20%).

Unfortunately, historical versions of the British Social Attitudes Survey are kept behind a steep paywall, but the figure is referenced in an interesting qualitative study of high earners by Ipsos MORI, which can be read [here](https://www.ipsos.com/ipsos-mori/en-uk/opinion-high-earners-2007-qualitative-research).

>There has been extensive research looking at the neurological and psychological effects of poverty. It has shown that, separated from objective levels of income, people's perception of their socioeconomic status - whether they feel poor - has a significant impact on chronic stress levels, physical health, mental health and cognitive development.

For more detail on this subject I would recommend 'The Inner Level: How More Equal Societies Reduce Stress, Restore Sanity and Improve Everyone’s Well-being' by Richard Wilkinson and Kate Pickett. 'Behave: The Biology of Humans at Our Best and Worst' by Robert M. Sapolsky, in my opinion _the_ definitive book on human behaviour, also looks in detail at the neurological implications of perceptions of social status.

### Visualisation 5 - How do earnings vary by occupation?

Like the regional data, the income distribution by occupation data was taken from the [provisional 2019 figures](https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/earningsandworkinghours/datasets/regionbyoccupation2digitsocashetable3) published by the Office for National Statistics as part of the Annual Survey of Hours and Earnings (ASHE).

Three of the occupations to visualise were chosen for me; the earnings of doctors, solicitors and accountants were clearly of interest given Mr Smith's assertion that they all earn in excess of £80,000 a year. For the remaining twenty-one occupations, I aimed to select a diverse range of jobs that would be of interest to the reader. I also tried to select jobs for which there was a high quality of data.

Percentile figures were interpolated using the same method as for Visualisation 3.

In [15]:
ASHEOccupationIncomeData = read_csv_as_list_dict("./Data/Visualisation 5 - Income distribution by occupation/CSV/Visualisation 5 - ASHE percentile data points by occupation.csv", ",", '"')

HMRCIncomeTaxPayerData = read_csv_as_list_dict("./Data/Visualisation 5 - Income distribution by occupation/CSV/Visualisation 5 - HMRC income tax payer data.csv", ",", '"')

#convert array of objects into a single object
HMRCIncomeTaxPayerDataObject = ConvertArrayOfObjectsToSingleObject(HMRCIncomeTaxPayerData, "Percentile", "Pretax income")

OutputArray = []

for obj in ASHEOccupationIncomeData:

    OutputArray += CreateExtraDataPoints(obj, HMRCIncomeTaxPayerDataObject, obj["OccupationCode"])
    

#Output Data
add_list_of_dictionaries_to_CSV("./Data/Visualisation 5 - Income distribution by occupation/Outputs/Visualisation 5 - percentile earnings data by occupation.csv", OutputArray)

[]

The referenced interview with Natalie Schmook for the American magazine Fast Company can be read [here](https://www.fastcompany.com/90330573/who-is-actually-middle-class).

> For example, when Labour first announced their proposed tax increase on incomes over £80,000, the Telegraph described it as a “raid on the middle classes." 

An article published by the Telegraph on May 16th 2017 was titled ['Labour faces £30bn 'black hole' in spending plans as Corbyn launches tax raid on middle classes'](https://www.telegraph.co.uk/news/2017/05/16/general-election-2017-jeremy-corbyns-manifesto-will-hit-nearly/).

> In January, leadership candidate Lisa Nandy (now shadow foreign secretary) was asked by the Evening Standard if she would be prepared to raise the basic income tax rate and replied:<br><br> 
_"I do not believe that you can go to the country and argue for the level of investment in public services that we did at the last election without being honest and clear about where that money comes from. And that does mean raising money through tax revenue."_

That article was published on January 30th and titled ['Lisa Nandy demands Labour Party publishes in full secret report on the anti-Semitism scandal'](https://www.standard.co.uk/news/politics/lisa-nandy-anti-semitism-scandal-secret-report-a4349066.html).