# Analyze the influence of gender
In this notebook we investigate how the unemployment rate differs between males and females.

In [30]:
import os

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

BASE = os.path.join(os.pardir, "data", "output")

In [19]:
# Import file
df = pd.read_pickle(os.path.join(BASE, "filtered_sex_country.pkl"))
df_t = df.copy()

Let's check the data

In [6]:
df.head()

sex,F,M,T
geo,,,
AT,3.83,4.15,4.01
BE,4.24,4.83,4.56
BG,3.75,4.25,4.03
CH,3.96,3.71,3.81
CY,6.38,5.63,6.0


We prefer to work with `geo` as a regular column instead of as the index

In [3]:
df.reset_index(inplace=True)

In [4]:
df.head()

sex,geo,F,M,T
0,AT,3.83,4.15,4.01
1,BE,4.24,4.83,4.56
2,BG,3.75,4.25,4.03
3,CH,3.96,3.71,3.81
4,CY,6.38,5.63,6.0
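As a small illustration of what `reset_index` does here, the sketch below rebuilds two rows of the table by hand (the frame `indexed` and the result `flat` are illustrative names, not part of the notebook). It returns a new frame instead of modifying in place, and it shows why the header still carries the `sex` label: that is the name of the column axis, which survives the reset.

```python
import pandas as pd

# Hand-built stand-in for the first two rows of the notebook's table
indexed = pd.DataFrame(
    {"F": [3.83, 4.24], "M": [4.15, 4.83], "T": [4.01, 4.56]},
    index=pd.Index(["AT", "BE"], name="geo"),
)
indexed.columns.name = "sex"

# Move the "geo" index into an ordinary column; a fresh 0..n-1 index takes its place
flat = indexed.reset_index()

print(flat.columns.tolist())
print(flat.columns.name)
```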


In [11]:
df["geo"].unique()

array(['AT', 'BE', 'BG', 'CH', 'CY', 'CZ', 'DE', 'DK', 'EA', 'EA18',
       'EA19', 'EE', 'EL', 'ES', 'EU27_2007', 'EU27_2020', 'EU28', 'FI',
       'FR', 'HR', 'HU', 'IE', 'IS', 'IT', 'LT', 'LU', 'LV', 'MT', 'NL',
       'NO', 'PL', 'PT', 'RO', 'SE', 'SI', 'SK', 'TR', 'UK', 'US'],
      dtype=object)

Filter for Germany (DE), Greece (EL) and the EU28 countries combined (EU28)

In [23]:
df.loc[(df["geo"] == "EU28") | (df["geo"] == "DE")  | (df["geo"] == "EL")]

sex,geo,F,M,T
6,DE,3.16,3.67,3.44
12,EL,19.05,12.74,15.51
16,EU28,5.72,5.12,5.4
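The same filter can be written more compactly with `Series.isin`, which scales better when the list of regions grows. This is a minimal sketch on a hand-built frame (`df_demo` and `regions` are illustrative names; the values are copied from the tables above).

```python
import pandas as pd

# Small stand-in for the notebook's df after reset_index()
df_demo = pd.DataFrame({
    "geo": ["AT", "DE", "EL", "EU28"],
    "F": [3.83, 3.16, 19.05, 5.72],
    "M": [4.15, 3.67, 12.74, 5.12],
    "T": [4.01, 3.44, 15.51, 5.40],
})

# Keep only the rows whose geo code is in the list
regions = ["DE", "EL", "EU28"]
selected = df_demo[df_demo["geo"].isin(regions)]

print(selected)
```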


Difference in the unemployment rate between males and females for the EU28 countries, Greece and Germany

In [55]:
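# Row labels 6, 12 and 16 are DE, EL and EU28 (see the filtered table above).
# Gender gaps are kept positive: F - M where the female rate is higher, M - F for Germany.
diff_eu28 = df["F"][16] - df["M"][16]
diff_greece = df["F"][12] - df["M"][12]
diff_germany = df["M"][6] - df["F"][6]

# Total rates (T) per region
t_eu28 = df["T"][16]
t_greece = df["T"][12]
t_germany = df["T"][6]

# Differences in total rates between the regions, in percentage points
diff_t_germany_eu28 = df["T"][16] - df["T"][6]
diff_t_greece_eu28 = df["T"][12] - df["T"][16]
diff_t_greece_germany = df["T"][12] - df["T"][6]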

print(f"""The table this analysis is based on contains the mean over all years. This analysis focuses
on the difference between the unemployment rates of males and females for Germany, Greece and the EU28 countries
combined. The rates refer to the active population, meaning people who are employed or who are unemployed and
available for work. This already excludes other factors, e.g. maternity/paternity leave and stay-at-home parents.

The total unemployment rate for males and females together (T) across all EU28 countries is {round(t_eu28, 2)}.
Germany does better in this regard: its total unemployment rate is
{round(diff_t_germany_eu28, 2)} percentage points lower than that of the EU28 countries. Greece has the highest unemployment rate of the three, {round(t_greece, 2)},
which is {round(diff_t_greece_eu28, 2)} percentage points higher than the EU28 countries and {round(diff_t_greece_germany, 2)}
percentage points higher than Germany.

For all EU28 countries combined, we see a slight difference in the unemployment rate
for males and females. The overall difference for all EU28 countries is {round(diff_eu28, 2)} percentage points.
For Germany, the difference between the two genders is smaller than for all EU28 countries combined, namely {round(diff_germany, 2)}.
In addition, in Germany the unemployment rate for males is slightly higher than for females.
The biggest difference is visible in Greece, where the gap is {round(diff_greece, 2)}. The unemployment rate
among females is much higher than among males. A 2009 research paper states that in Greece there is clear evidence
of gender differences and of female employment discrimination (article: 'Gender employment
discrimination: Greece and the United Kingdom', written by Ilias Livanos, Çagri Yalkin, Imanol Nuñez, 2009).""")


The table this analysis is based on contains the mean over all years. This analysis focuses
on the difference between the unemployment rates of males and females for Germany, Greece and the EU28 countries
combined. The rates refer to the active population, meaning people who are employed or who are unemployed and
available for work. This already excludes other factors, e.g. maternity/paternity leave and stay-at-home parents.

The total unemployment rate for males and females together (T) across all EU28 countries is 5.4.
Germany does better in this regard: its total unemployment rate is
1.96 percentage points lower than that of the EU28 countries. Greece has the highest unemployment rate of the three, 15.51,
which is 10.11 percentage points higher than the EU28 countries and 12.07
percentage points higher than Germany.

For all EU28 countries combined, we see a slight difference in the unemployment rate
for males and females. The overall difference for all EU28
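The differences above are read off by positional row label (6, 12, 16), which breaks silently if the row order changes. A possible alternative, sketched here on a hand-built frame with the values from the filtered table, is to index by country code and compute all gaps at once. The names `rates`, `gap` and `vs_eu28` are illustrative, and unlike the notebook's `diff_germany`, the gap is always female minus male, so a negative value means the male rate is higher.

```python
import pandas as pd

# Values copied from the filtered table; geo codes serve as the index
rates = pd.DataFrame(
    {"F": [3.16, 19.05, 5.72], "M": [3.67, 12.74, 5.12], "T": [3.44, 15.51, 5.40]},
    index=pd.Index(["DE", "EL", "EU28"], name="geo"),
)

# Gender gap in percentage points: female rate minus male rate
gap = (rates["F"] - rates["M"]).round(2)

# Total rate of each region relative to the EU28 total
vs_eu28 = (rates["T"] - rates.loc["EU28", "T"]).round(2)

print(gap)
print(vs_eu28)
```

Looking values up with `rates.loc["EL", "T"]` keeps the code readable and independent of row positions.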

## Transpose data for Tableau
To make it easier to build a stacked bar graph in Tableau, we transpose the data so that each region becomes a column.

In [20]:
# Transpose
df_t = df_t.T

# Drop other countries
keep = ["DE", "EL", "EU28"]
drop = [col for col in df_t.columns if col not in keep]
df_t.drop(columns=drop, inplace=True)

In [25]:
# Clear the axis labels (the names of the column axis and the index)
df_t.columns.name = ""
df_t.index.name = ""

In [26]:
df_t

,DE,EL,EU28
F,3.16,19.05,5.72
M,3.67,12.74,5.12
T,3.44,15.51,5.4
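The transpose-then-drop steps can also be expressed as select-then-transpose, which avoids building the list of columns to drop. The sketch below uses a hand-built frame with values from the tables above; `wide`, `keep` and `transposed` are illustrative names, not the notebook's variables.

```python
import pandas as pd

# Stand-in for the original frame: geo codes as index, one column per sex
wide = pd.DataFrame(
    {
        "F": [3.83, 3.16, 19.05, 5.72],
        "M": [4.15, 3.67, 12.74, 5.12],
        "T": [4.01, 3.44, 15.51, 5.40],
    },
    index=pd.Index(["AT", "DE", "EL", "EU28"], name="geo"),
)
wide.columns.name = "sex"

# Select the regions of interest first, then transpose
keep = ["DE", "EL", "EU28"]
transposed = wide.loc[keep].T

# Remove the axis names so they do not end up in the exported header
transposed.index.name = None
transposed.columns.name = None

print(transposed)
```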


In [31]:
# Export the file to CSV
df_t.to_csv(os.path.join(BASE, 
                         "output",
                         "sex_transposed.csv"))