# Data Story: Gender Pay Gap

## Introduction

For more than two centuries, women have been fighting for equal rights and equal treatment, and the issue has become increasingly political.
In many Western societies equal rights for men and women are laid down in law.
This does not mean that men and women are treated equally in practice, but in many countries gender discrimination is against the law.
A well-known difference between men and women is their salaries: how much they get paid for the same amount and quality of work.

This story will explore salaries of software developers, based on data from Stack Overflow (SO), an online community of software developers.
SO runs a yearly survey, asking its members questions about their age, skills, work and interests.
Data from these surveys are available and can be used to analyse the working conditions for these developers.
We will use the surveys from the years 2014 until 2022.

In particular, we will explore the correlation between yearly salary and features like age, country and job type.

## Salary and Age

Salaries tend to rise when people grow older. Let's see what the basic correlation between the two features Salary and Age is.

In [18]:
%run salary_age.ipynb

Clearly, this graph shows a rise in salary when people grow older.
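
One way to put a number on this trend is the Pearson correlation coefficient. The sketch below is not the code behind the graph; it uses only the standard library, and the `ages` and `salaries` lists are made-up values for illustration, not survey data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical (age, yearly salary) values; NOT taken from the SO surveys.
ages = [22, 27, 31, 38, 45, 52]
salaries = [38_000, 52_000, 61_000, 70_000, 78_000, 80_000]

r = pearson(ages, salaries)
print(f"Pearson r between age and salary: {r:.2f}")
```

A value close to 1 would indicate a strong positive linear relation, a value close to 0 no linear relation.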

We expect the same to happen as the developer's experience increases.

In [19]:
%run salary_yearscode.ipynb

In [20]:
%run gender_work_experience_2.ipynb

## The Gender Pay Gap

Based on supplied salaries in the SO surveys, we see that the average salary of men is <span>\$</span> 62,264 and for women it is <span>\$</span> 59,289.

If we define the Gender Pay Gap (GPG) as $\frac{\text{averageSalaryMen} - \text{averageSalaryWomen}}{\text{averageSalaryMen}}$,
the overall GPG is 4.8%, so on average women receive 4.8% less pay than men.
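
As a quick check of this definition, the calculation can be sketched in a few lines of Python. The function name `gender_pay_gap` is an illustrative choice; the two averages are the ones quoted above.

```python
def gender_pay_gap(avg_salary_men: float, avg_salary_women: float) -> float:
    """Return the GPG as a fraction of the average salary of men."""
    if avg_salary_men <= 0:
        raise ValueError("avg_salary_men must be positive")
    return (avg_salary_men - avg_salary_women) / avg_salary_men


# Average salaries quoted in the text above.
gpg = gender_pay_gap(62_264, 59_289)
print(f"Overall GPG: {gpg:.1%}")  # prints "Overall GPG: 4.8%"
```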

In [21]:
%run gender_work_experience.ipynb

Since women are earning, on average, less than men, we are going to search for factors that may influence this difference.

First, let's analyse whether this difference occurs in every age segment and in every country.

In [22]:
%run salaries_gender_age.ipynb
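
A per-segment GPG can be computed by applying the same definition within each group. The following is a minimal sketch, not the notebook's actual code: the row layout `(group, gender, salary)` and the gender labels `"Man"` and `"Woman"` are assumptions, and the sample rows are invented.

```python
from collections import defaultdict


def gpg_by_group(rows):
    """Map each group (e.g. an age segment) to its GPG.

    `rows` is an iterable of (group, gender, salary) tuples. Groups that
    lack salaries for either gender are skipped.
    """
    salaries = defaultdict(lambda: {"Man": [], "Woman": []})
    for group, gender, salary in rows:
        if gender in ("Man", "Woman"):  # assumed labels, see lead-in
            salaries[group][gender].append(salary)

    result = {}
    for group, by_gender in salaries.items():
        men, women = by_gender["Man"], by_gender["Woman"]
        if not men or not women:
            continue
        avg_men = sum(men) / len(men)
        avg_women = sum(women) / len(women)
        if avg_men == 0:
            continue  # GPG is undefined when men's average is zero
        result[group] = (avg_men - avg_women) / avg_men
    return result


# Invented rows for illustration only; NOT survey data.
sample_rows = [
    ("under 25", "Man", 40_000),
    ("under 25", "Woman", 39_000),
    ("25-34", "Man", 65_000),
    ("25-34", "Woman", 58_000),
]
for group, gpg in gpg_by_group(sample_rows).items():
    print(f"{group}: GPG {gpg:.1%}")
```

The same function works for countries by passing the country as the group key.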

### Analysis of GPG for different countries

The graph below shows that the GPG differs between countries. In all countries we see a slow decrease of the GPG.
In the UK the GPG was the highest of all displayed countries, starting at 35% in 2014 and decreasing to 16% in 2022. In the UK a directive (see ...) has been issued requiring organisations to report the GPG. This may raise awareness and help to level out the salaries of men and women.
India shows a very volatile GPG. Neither the number of respondents from India nor the ratio of male to female respondents offers an obvious explanation for this volatility when compared with other countries.
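
To compare volatility between countries with a number rather than by eye, one option is the standard deviation of the year-over-year changes in GPG. This measure is a suggestion of this sketch, not something used in the notebooks, and the two series below are invented for illustration.

```python
from statistics import pstdev


def gpg_volatility(gpg_by_year):
    """Standard deviation of year-over-year GPG changes.

    `gpg_by_year` maps a year to the GPG in percent; the result is in
    percentage points.
    """
    years = sorted(gpg_by_year)
    changes = [gpg_by_year[b] - gpg_by_year[a] for a, b in zip(years, years[1:])]
    if not changes:
        raise ValueError("need GPG values for at least two years")
    return pstdev(changes)


# Invented series for illustration only; NOT survey results.
steady_decline = {2014: 30.0, 2015: 28.0, 2016: 26.0, 2017: 24.0}
erratic = {2014: 10.0, 2015: 30.0, 2016: 5.0, 2017: 25.0}

print(f"steady decline: {gpg_volatility(steady_decline):.1f}")
print(f"erratic:        {gpg_volatility(erratic):.1f}")
```

A steadily decreasing series scores close to zero, while a series that jumps up and down scores high, regardless of its overall level.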

In [23]:
%run salaries_gender_countries.ipynb

## GPG across the world for different age groups

The world map, showing the gaps across the world for different age groups, gives an interesting insight into how the GPG develops with age.
For younger people (less than 25 years old) the gap is low in most countries, but for ages 25-34 the map turns red, indicating high GPGs in many countries.
At higher ages the GPGs decrease again, and they fade out after the age of 55.
The high GPG for people aged 25-34 may be explained by higher starting salaries for men, but the available data from the SO surveys do not enable us to check this.

In [24]:
%run gender_salary_gap_map.ipynb

## GPG in relation to education

The levels of education, displayed in the graph below, show a slightly higher level of education for female developers.
If higher education results in higher salaries, the higher education levels of women should result in higher overall salaries for women.
Since this is not the case, we can conclude that other factors cancel out this positive effect on women's salaries.

In [25]:
%run gender_education.ipynb

## Gender ratio

Another interesting graph, shown below, plots the GPG against the ratio of men to women for specific developer types.
Each developer type (e.g. ....) is shown as a single dot in the graph.
The graph shows that where the ratio increases (more men than women doing this type of job) the GPG tends to decrease.
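
Each dot combines two numbers per developer type: the men-to-women ratio and the GPG. A minimal sketch of how such points could be derived is given below. The input layout and the labels `"type A"` and `"type B"` are hypothetical, and the figures are invented rather than taken from the surveys.

```python
def ratio_gpg_points(stats):
    """Return {developer type: (men-to-women ratio, GPG)}.

    `stats` maps a developer type to a tuple
    (number of men, number of women, average salary men, average salary women).
    Types without women respondents or with a zero average for men are skipped.
    """
    points = {}
    for dev_type, (n_men, n_women, avg_men, avg_women) in stats.items():
        if n_women == 0 or avg_men == 0:
            continue
        ratio = n_men / n_women
        gpg = (avg_men - avg_women) / avg_men
        points[dev_type] = (ratio, gpg)
    return points


# Hypothetical figures for illustration only; NOT taken from the surveys.
example = {
    "type A": (900, 100, 60_000, 57_000),
    "type B": (400, 100, 60_000, 54_000),
}
for dev_type, (ratio, gpg) in ratio_gpg_points(example).items():
    print(f"{dev_type}: ratio {ratio:.1f}, GPG {gpg:.1%}")
```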

Note: We should look into the job types to get a better picture.

In [26]:
%run gender_devtype.ipynb