# Pay gap between men and women

![](https://cdn.prod.website-files.com/5f4f67c5950db17954dd4f52/6422fc2899987b74dc89c4b6_diferencias%20salariales.webp)

### Problem definition

The earnings gap between men and women is a problem that affects women in every country. There is an ongoing debate about whether the gender gap is a thing of the past or still exists today. This EDA therefore aims to show the situation over the last decade and to identify the occupational sectors with the largest differences.

### About the dataset

This dataset contains the average hourly earnings of employees by sex and occupation, based on Indicator 8.5.1 (average hourly earnings of employees, by sex, age and occupation). The data come from ILOSTAT (International Labour Organization), a statistics portal that collects social and labour data.


### Key questions

1. Is there a difference in earnings between men and women?

2. Is there any occupation in which women earn more than men?

3. In which occupational sector is the earnings gap largest?

4. Which country has the largest pay differences?

5. Is there any pattern across countries regarding the pay gap?

6. Has the pay gap changed in recent years?


### Libraries

In [9]:
# Data handling
import numpy as np
import pandas as pd
# Visualization
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.graph_objects as go
# Statistics
import scipy.stats as stats
from scipy.stats import f_oneway, pearsonr, ttest_ind, ttest_rel, shapiro, levene, mannwhitneyu, kruskal, chi2_contingency
# Utilities
import warnings
warnings.filterwarnings('ignore')  # silence all warnings in the notebook output
import sys
import os



### Data loading

In [10]:
df=pd.read_csv('/Users/isaromobru/Desktop/EDA/data/raw/SDG_0851_SEX_OCU_NB_A-filtered-2024-12-17.csv')

## 1. First look

### Structure of the raw DataFrame

In [11]:
pd.set_option('display.max_colwidth', None)  # show the full text of wide columns
df

Unnamed: 0,ref_area.label,source.label,indicator.label,sex.label,classif1.label,time,obs_value,obs_status.label,note_classif.label,note_indicator.label,note_source.label
0,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Total,2018,293.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
1,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Skill levels 3 and 4 ~ high,2018,1315.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
2,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Skill level 2 ~ medium,2018,1109.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
3,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Skill level 1 ~ low,2018,189.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
4,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Male,Occupation (Skill level): Total,2018,303.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
...,...,...,...,...,...,...,...,...,...,...,...
7741,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,Occupation (ISCO-08): 5. Service and sales workers,2016,16.90,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7742,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,"Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers",2016,21.91,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7743,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,Occupation (ISCO-08): 7. Craft and related trades workers,2016,24.95,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7744,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,"Occupation (ISCO-08): 8. Plant and machine operators, and assemblers",2016,25.40,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"


1. Dimensions of the dataset

In [52]:
df.shape

(7746, 11)

2. Column names

In [104]:
df.columns

Index(['ref_area.label', 'source.label', 'indicator.label', 'sex.label',
       'classif1.label', 'time', 'obs_value', 'obs_status.label',
       'note_classif.label', 'note_indicator.label', 'note_source.label'],
      dtype='object')

3. Memory usage and general information

In [105]:
df.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7746 entries, 0 to 7745
Data columns (total 11 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   ref_area.label        7746 non-null   object 
 1   source.label          7746 non-null   object 
 2   indicator.label       7746 non-null   object 
 3   sex.label             7746 non-null   object 
 4   classif1.label        7746 non-null   object 
 5   time                  7746 non-null   int64  
 6   obs_value             7690 non-null   float64
 7   obs_status.label      319 non-null    object 
 8   note_classif.label    0 non-null      float64
 9   note_indicator.label  7746 non-null   object 
 10  note_source.label     7746 non-null   object 
dtypes: float64(2), int64(1), object(8)
memory usage: 665.8+ KB


4. Overview of the DataFrame

In [6]:
df.head()

Unnamed: 0,ref_area.label,source.label,indicator.label,sex.label,classif1.label,time,obs_value,obs_status.label,note_classif.label,note_indicator.label,note_source.label
0,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Total,2018,293.0,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
1,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Skill levels 3 and 4 ~ high,2018,1315.0,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
2,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Skill level 2 ~ medium,2018,1109.0,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
3,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Skill level 1 ~ low,2018,189.0,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
4,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Male,Occupation (Skill level): Total,2018,303.0,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"


In [7]:
df.dtypes

ref_area.label           object
source.label             object
indicator.label          object
sex.label                object
classif1.label           object
time                      int64
obs_value               float64
obs_status.label         object
note_classif.label      float64
note_indicator.label     object
note_source.label        object
dtype: object

In [108]:
df.tail()

Unnamed: 0,ref_area.label,source.label,indicator.label,sex.label,classif1.label,time,obs_value,obs_status.label,note_classif.label,note_indicator.label,note_source.label
7741,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,Occupation (ISCO-08): 5. Service and sales workers,2016,16.9,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7742,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,"Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers",2016,21.91,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7743,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,Occupation (ISCO-08): 7. Craft and related trades workers,2016,24.95,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7744,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,"Occupation (ISCO-08): 8. Plant and machine operators, and assemblers",2016,25.4,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7745,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,Occupation (ISCO-08): 9. Elementary occupations,2016,16.32,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"


In [110]:
df.isnull().sum()

ref_area.label             0
source.label               0
indicator.label            0
sex.label                  0
classif1.label             0
time                       0
obs_value                 56
obs_status.label        7427
note_classif.label      7746
note_indicator.label       0
note_source.label          0
dtype: int64

There are 56 missing values in 'obs_value', and the columns 'obs_status.label' (7427 missing out of 7746) and 'note_classif.label' (7746 missing out of 7746) are practically empty.
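The absolute counts are easier to judge as a share of the rows. From the output above, 56 of 7746 is about 0.7 %, 7427 of 7746 is about 95.9 %, and `note_classif.label` is 100 % missing. A minimal sketch of that summary follows, using a small toy frame with made-up values (the `"Unreliable"` status label is hypothetical) instead of the real `df`:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data; all values are illustrative only.
toy = pd.DataFrame({
    "obs_value": [293.0, np.nan, 1109.0, 189.0],
    "obs_status.label": [np.nan, np.nan, np.nan, "Unreliable"],  # hypothetical label
    "note_classif.label": [np.nan] * 4,
})

# Absolute and relative number of missing values per column.
missing = pd.DataFrame({
    "n_missing": toy.isna().sum(),
    "pct_missing": (toy.isna().mean() * 100).round(2),
})
print(missing.sort_values("pct_missing", ascending=False))
```

Applied to the real DataFrame, the same two expressions (`df.isna().sum()` and `df.isna().mean()`) would give the shares quoted above.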

## 2. Data types

|Variable name|Proposed type|Description|Current type|Missing|
|---|---|---|---|---|
|ref_area.label|categorical|Country|object|0|
|source.label|categorical|Type of survey|object|0|
|indicator.label|categorical|SDG indicator for average hourly earnings|object|0|
|sex.label|categorical|Sex (male, female) and total|object|0|
|classif1.label|categorical|Occupation classification (skill level or ISCO-08 group)|object|0|
|time|datetime|Reference year of the data|int|0|
|obs_value|float|Average hourly earnings|float|56|
|obs_status.label|--|---|object|7427|
|note_classif.label|--|---|float|7746|
|note_indicator.label|categorical|Notes on the indicator, including the currency|object|0|
|note_source.label|str|Additional information about the source (reference period, activity coverage, etc.)|object|0|



### Cardinality

In [8]:
cardinalidades = df.nunique()
cardinalidades_porcentaje = (df.nunique() / len(df)) * 100
# Classify using map and a lambda: 'Baja' (low) < 10, 'Media' (medium) <= 50, 'Alta' (high) otherwise
categorias = cardinalidades.map(lambda x: 'Baja' if x < 10 else ('Media' if x <= 50 else 'Alta'))
cardinalidad=pd.DataFrame({'card_abs':cardinalidades,'card_relat(%)':cardinalidades_porcentaje,'Tipo':categorias})
cardinalidad

Unnamed: 0,card_abs,card_relat(%),Tipo
ref_area.label,39,0.503486,Media
source.label,14,0.180738,Media
indicator.label,1,0.01291,Baja
sex.label,3,0.03873,Baja
classif1.label,28,0.361477,Media
time,11,0.142009,Media
obs_value,3741,48.295895,Alta
obs_status.label,2,0.02582,Baja
note_classif.label,0,0.0,Baja
note_indicator.label,52,0.671314,Alta


1. High-cardinality columns:
- obs_value: 3741 distinct values. It holds the observed data, so it should be numeric (it is already float64).
- note_indicator.label (52 distinct values) and ref_area.label (39 distinct values, "Media" by the thresholds above): categorical variables that are useful for grouping or segmenting the data.
2. Low-cardinality columns:
- sex.label, obs_status.label and source.label (the last one has 14 values, "Media" by the thresholds above): they represent clear categories and can be converted to the category dtype to save memory.
3. Redundant or empty columns:
- note_classif.label has no data and can be dropped.
- indicator.label has a single unique value, so it adds no variability.

### Appropriate data types for each variable

Based on the observations above, the data types need to be changed to make the data easier to handle. As an additional change, the variables are renamed so that they are easier to identify.

In [111]:
# Convert the object columns to categorical
text_columns = ['ref_area.label', 'source.label', 'indicator.label', 'sex.label', 'classif1.label', 
                'obs_status.label', 'note_indicator.label', 'note_source.label']

for col in text_columns:
    df[col] = df[col].astype('category')
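The reason the `category` dtype saves space is that each distinct label is stored once and the rows only hold small integer codes. A minimal sketch on a toy series follows (illustrative values that mimic `sex.label`; the real column is not used here):

```python
import pandas as pd

# Toy column: three distinct labels repeated many times (illustrative only).
sexo_obj = pd.Series(["Sex: Total", "Sex: Male", "Sex: Female"] * 1000, dtype="object")
sexo_cat = sexo_obj.astype("category")

# deep=True also counts the memory used by the Python strings themselves.
bytes_obj = sexo_obj.memory_usage(deep=True)
bytes_cat = sexo_cat.memory_usage(deep=True)
print(f"object:   {bytes_obj} bytes")
print(f"category: {bytes_cat} bytes")
```

The saving is largest for columns with few distinct values relative to the number of rows. For a column where almost every value is different, the conversion would bring little or no benefit.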


### The time variable

In [112]:
df.dtypes

ref_area.label          category
source.label            category
indicator.label         category
sex.label               category
classif1.label          category
time                       int64
obs_value                float64
obs_status.label        category
note_classif.label       float64
note_indicator.label    category
note_source.label       category
dtype: object

In [113]:
df['time']=pd.to_datetime(df['time'],format='%Y').dt.year

In [114]:
df['time'].sort_values(ascending=False)

844     2023
5264    2023
854     2023
855     2023
840     2023
        ... 
720     2013
721     2013
722     2013
723     2013
724     2013
Name: time, Length: 7746, dtype: int32

### Naming consistency

The variables are renamed to more intuitive Spanish names in snake_case.

In [115]:
df.rename(columns={
    'ref_area.label': 'País',
    'time': 'año',
    'sex.label': 'sexo',
    'obs_value': 'prom_ganancias',
    'classif1.label': 'nivel_laboral',
    'note_indicator.label': 'moneda',
}, inplace=True)
df


Unnamed: 0,País,source.label,indicator.label,sexo,nivel_laboral,año,prom_ganancias,obs_status.label,note_classif.label,moneda,note_source.label
0,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Total,2018,293.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
1,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Skill levels 3 and 4 ~ high,2018,1315.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
2,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Skill level 2 ~ medium,2018,1109.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
3,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Total,Occupation (Skill level): Skill level 1 ~ low,2018,189.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
4,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Male,Occupation (Skill level): Total,2018,303.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
...,...,...,...,...,...,...,...,...,...,...,...
7741,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,Occupation (ISCO-08): 5. Service and sales workers,2016,16.90,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7742,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,"Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers",2016,21.91,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7743,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,Occupation (ISCO-08): 7. Craft and related trades workers,2016,24.95,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7744,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Sex: Female,"Occupation (ISCO-08): 8. Plant and machine operators, and assemblers",2016,25.40,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"


The category labels are changed so that they are easier to handle.

In [116]:
df['sexo'] = df['sexo'].replace({
    'Sex: Male': 'Hombres',
    'Sex: Female': 'Mujeres',
    'Sex: Total': 'Total'})
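Since `sexo` was converted to a categorical column earlier, relabeling its values amounts to renaming its categories. Recent pandas versions appear to recommend `Series.cat.rename_categories` over `replace` for that purpose (any related warning would be hidden here because warnings are silenced). A minimal sketch on a toy series standing in for `df['sexo']`:

```python
import pandas as pd

# Toy categorical series standing in for df['sexo'] (illustrative values).
sexo = pd.Series(
    ["Sex: Total", "Sex: Male", "Sex: Female", "Sex: Male"], dtype="category"
)

# Same mapping as in the cell above, applied to the categories themselves.
etiquetas = {"Sex: Male": "Hombres", "Sex: Female": "Mujeres", "Sex: Total": "Total"}
sexo = sexo.cat.rename_categories(etiquetas)

print(sexo.tolist())
print(sexo.dtype)
```

The column stays categorical and only the labels change, which is the same visible result as the `replace` call.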

In [117]:
df

Unnamed: 0,País,source.label,indicator.label,sexo,nivel_laboral,año,prom_ganancias,obs_status.label,note_classif.label,moneda,note_source.label
0,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Total,Occupation (Skill level): Total,2018,293.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
1,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Total,Occupation (Skill level): Skill levels 3 and 4 ~ high,2018,1315.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
2,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Total,Occupation (Skill level): Skill level 2 ~ medium,2018,1109.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
3,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Total,Occupation (Skill level): Skill level 1 ~ low,2018,189.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
4,Albania,ES - Structure of Earnings Survey,SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Hombres,Occupation (Skill level): Total,2018,303.00,,,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
...,...,...,...,...,...,...,...,...,...,...,...
7741,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Mujeres,Occupation (ISCO-08): 5. Service and sales workers,2016,16.90,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7742,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Mujeres,"Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers",2016,21.91,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7743,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Mujeres,Occupation (ISCO-08): 7. Craft and related trades workers,2016,24.95,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7744,Ukraine,"ES - Survey on wages of employees by gender, age, education and Occupational group",SDG indicator 8.5.1 - Average hourly earnings of employees by sex (Local currency),Mujeres,"Occupation (ISCO-08): 8. Plant and machine operators, and assemblers",2016,25.40,,,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"


### Variable cleanup

In [118]:
# Drop 'source.label' and 'indicator.label' because they give no relevant information, and 'obs_status.label' and 'note_classif.label' because they are (almost) empty
df.drop(columns=['source.label','indicator.label','obs_status.label', 'note_classif.label'], inplace=True)



In [119]:
df

Unnamed: 0,País,sexo,nivel_laboral,año,prom_ganancias,moneda,note_source.label
0,Albania,Total,Occupation (Skill level): Total,2018,293.00,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
1,Albania,Total,Occupation (Skill level): Skill levels 3 and 4 ~ high,2018,1315.00,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
2,Albania,Total,Occupation (Skill level): Skill level 2 ~ medium,2018,1109.00,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
3,Albania,Total,Occupation (Skill level): Skill level 1 ~ low,2018,189.00,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
4,Albania,Hombres,Occupation (Skill level): Total,2018,303.00,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
...,...,...,...,...,...,...,...
7741,Ukraine,Mujeres,Occupation (ISCO-08): 5. Service and sales workers,2016,16.90,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7742,Ukraine,Mujeres,"Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers",2016,21.91,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7743,Ukraine,Mujeres,Occupation (ISCO-08): 7. Craft and related trades workers,2016,24.95,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"
7744,Ukraine,Mujeres,"Occupation (ISCO-08): 8. Plant and machine operators, and assemblers",2016,25.40,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH),"Geographical coverage: Total national, excluding some areas"


Looking at the DataFrame, the values are expressed in different currencies.

In [120]:
df['moneda'].values 

['Currency: ALB - Lek (ALL)', 'Currency: ALB - Lek (ALL)', 'Currency: ALB - Lek (ALL)', 'Currency: ALB - Lek (ALL)', 'Currency: ALB - Lek (ALL)', ..., 'Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH)', 'Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH)', 'Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH)', 'Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH)', 'Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: UKR - Hryvnia (UAH)']
Length: 7746
Categories (52, object): ['Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH -

A decision therefore has to be made about how to treat the data: only the countries in the euro area will be kept.
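One possible way to implement that selection is to pull the three-letter currency code out of the `moneda` text and filter on it. The sketch below is only an illustration under two assumptions: that every note contains a `Currency: ... (XXX)` fragment like the ones shown above, and that euro rows carry the code `EUR` (no euro row appears in the output shown so far, so the last toy row is hypothetical). The column name `codigo_moneda` is also made up for this example.

```python
import pandas as pd

# Toy frame: the first three notes are copied from the output above,
# the last row is a hypothetical euro-area entry.
toy = pd.DataFrame({
    "País": ["Albania", "Ukraine", "Bosnia and Herzegovina", "Hypothetical euro country"],
    "moneda": [
        "Currency: ALB - Lek (ALL)",
        "Job coverage: Main job currently held | Working time arrangement coverage: "
        "Full-time and part time workers | Currency: UKR - Hryvnia (UAH)",
        "Central tendency measure: Mean or mean of the midpoints of the intervals provided "
        "| Currency: BIH - Marka (BAM) | Break in series: Methodology revised",
        "Currency: XXX - Euro (EUR)",  # assumed format, not taken from the data
    ],
})

# Capture the 3-character code in parentheses inside the "Currency:" fragment.
# [^|]* keeps the match within that fragment of the pipe-separated note.
toy["codigo_moneda"] = toy["moneda"].str.extract(
    r"Currency:[^|]*\((\w{3})\)", expand=False
)

# Keep only the rows expressed in euros.
zona_euro = toy[toy["codigo_moneda"] == "EUR"]
print(toy[["País", "codigo_moneda"]])
print(zona_euro["País"].tolist())
```

Before relying on such a filter it would be worth checking `value_counts()` of the extracted code, since a country's currency can change during the period covered.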

## 3. Identify duplicate values or obvious errors

### Exact duplicates

In [121]:
# Identify exact duplicates
duplicados_exactos = df[df.duplicated()]

# Number of exact duplicates
n_duplicados_exactos = len(duplicados_exactos)

duplicados_exactos.head()


Unnamed: 0,País,sexo,nivel_laboral,año,prom_ganancias,moneda,note_source.label


### Partial duplicates

In [122]:
# Identify partial duplicates based on a subset of columns
columnas_a_verificar = ['País', 'año', 'sexo']
duplicados_parciales = df[df.duplicated(subset=columnas_a_verificar)]

# Number of partial duplicates
len(duplicados_parciales)

duplicados_parciales.head()
# Look for obvious errors in the data (encoding and parsing problems)
# Check for badly decoded values (UTF-8 or ASCII), i.e. whether any
# column contains strange non-printable characters
# Note: applymap is deprecated since pandas 2.1 in favour of DataFrame.map
errores_codificacion = df.applymap(lambda x: isinstance(x, str) and not x.isprintable())

# Summary of results
{
    "Duplicados Exactos": len(duplicados_exactos),
    "Duplicados Parciales": len(duplicados_parciales),
    "Errores de Codificación": errores_codificacion.any().sum()  # Number of columns with errors
}


{'Duplicados Exactos': 0,
 'Duplicados Parciales': 7224,
 'Errores de Codificación': np.int64(0)}
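The 7224 partial duplicates are probably not real repeats: the key `['País', 'año', 'sexo']` leaves out `nivel_laboral`, so every occupation row after the first one for a given country, year and sex is flagged. The sketch below illustrates this on a hypothetical toy frame (`toy` and its values are invented for the example, not taken from the dataset).

```python
import pandas as pd

# Hypothetical toy data that reuses the notebook's column names
toy = pd.DataFrame({
    "País": ["Spain", "Spain", "Spain", "Spain"],
    "año": [2018, 2018, 2018, 2018],
    "sexo": ["Mujeres", "Mujeres", "Hombres", "Hombres"],
    "nivel_laboral": ["Total", "1. Managers", "Total", "1. Managers"],
    "prom_ganancias": [10.0, 20.0, 12.0, 25.0],
})

# Same key as in the cell above: different occupations count as repeats
n_sin_ocupacion = toy.duplicated(subset=["País", "año", "sexo"]).sum()

# Key extended with the occupation category: each row is now unique
n_con_ocupacion = toy.duplicated(
    subset=["País", "año", "sexo", "nivel_laboral"]
).sum()

print(n_sin_ocupacion, n_con_ocupacion)
```

If the extended key still reports duplicates on the real data, those rows would be the ones worth inspecting by hand.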

### Missing values

The `obs_Value` column (shown here as `prom_ganancias`) has 56 missing values, and the reason for them needs to be investigated.

In [124]:
filas_con_perdidos = df[df.isna().any(axis=1)]
filas_con_perdidos


Unnamed: 0,País,sexo,nivel_laboral,año,prom_ganancias,moneda,note_source.label
392,Bosnia and Herzegovina,Mujeres,Occupation (Skill level): Not elsewhere classified,2023,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM) | Break in series: Methodology revised,Repository: ILO-STATISTICS - Micro data processing
425,Bosnia and Herzegovina,Mujeres,Occupation (ISCO-08): 0. Armed forces occupations,2023,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM) | Break in series: Methodology revised,Repository: ILO-STATISTICS - Micro data processing
584,Bosnia and Herzegovina,Mujeres,Occupation (Skill level): Not elsewhere classified,2016,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM),Repository: ILO-STATISTICS - Micro data processing
613,Bosnia and Herzegovina,Mujeres,"Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers",2016,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM),Repository: ILO-STATISTICS - Micro data processing
617,Bosnia and Herzegovina,Mujeres,Occupation (ISCO-08): 0. Armed forces occupations,2016,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM),Repository: ILO-STATISTICS - Micro data processing
632,Bosnia and Herzegovina,Mujeres,Occupation (Skill level): Not elsewhere classified,2015,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM),Repository: ILO-STATISTICS - Micro data processing
661,Bosnia and Herzegovina,Mujeres,"Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers",2015,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM),Repository: ILO-STATISTICS - Micro data processing
665,Bosnia and Herzegovina,Mujeres,Occupation (ISCO-08): 0. Armed forces occupations,2015,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM),Repository: ILO-STATISTICS - Micro data processing
680,Bosnia and Herzegovina,Mujeres,Occupation (Skill level): Not elsewhere classified,2014,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM),Repository: ILO-STATISTICS - Micro data processing
709,Bosnia and Herzegovina,Mujeres,"Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers",2014,,Central tendency measure: Mean or mean of the midpoints of the intervals provided | Currency: BIH - Marka (BAM),Repository: ILO-STATISTICS - Micro data processing


In [74]:
df[df['País']=='Spain']

Unnamed: 0,País,sexo,nivel_laboral,año,prom_ganancias,moneda,note_source.label
1884,Spain,Total,Occupation (Skill level): Total,2021,17.0,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: ESP - Euro (EUR),"Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees"
1885,Spain,Total,Occupation (Skill level): Skill levels 3 and 4 ~ high,2021,79.0,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: ESP - Euro (EUR),"Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees"
1886,Spain,Total,Occupation (Skill level): Skill level 2 ~ medium,2021,68.0,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: ESP - Euro (EUR),"Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees"
1887,Spain,Total,Occupation (Skill level): Skill level 1 ~ low,2021,11.0,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: ESP - Euro (EUR),"Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees"
1888,Spain,Hombres,Occupation (Skill level): Total,2021,17.0,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: ESP - Euro (EUR),"Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees"
...,...,...,...,...,...,...,...
2257,Spain,Mujeres,Occupation (ISCO-08): 5. Service and sales workers,2013,10.0,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: ESP - Euro (EUR),"Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees"
2258,Spain,Mujeres,"Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers",2013,0.0,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: ESP - Euro (EUR),"Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees"
2259,Spain,Mujeres,Occupation (ISCO-08): 7. Craft and related trades workers,2013,10.0,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: ESP - Euro (EUR),"Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees"
2260,Spain,Mujeres,"Occupation (ISCO-08): 8. Plant and machine operators, and assemblers",2013,11.0,Job coverage: Main job currently held | Working time arrangement coverage: Full-time and part time workers | Currency: ESP - Euro (EUR),"Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees"



This shows which countries have missing data; in other countries, whole years are missing. Possible approaches:
- Remove those countries
- Apply a treatment to the missing data
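The two options above can be sketched as follows. This is only an illustration under stated assumptions: the `toy` frame, the country labels `A`, `B`, `C` and the `umbral` threshold of 0.5 are hypothetical, and the notebook does not say which treatment is finally applied.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data that reuses the notebook's column names
toy = pd.DataFrame({
    "País": ["A", "A", "B", "B", "C", "C"],
    "año": [2014, 2018, 2014, 2018, 2014, 2018],
    "prom_ganancias": [10.0, 11.0, np.nan, np.nan, 9.0, np.nan],
})

# Share of missing earnings per country
pct_perdidos = toy["prom_ganancias"].isna().groupby(toy["País"]).mean()

# Option 1: drop countries whose missing share exceeds a threshold
umbral = 0.5  # hypothetical cut-off
paises_validos = pct_perdidos[pct_perdidos <= umbral].index
sin_paises = toy[toy["País"].isin(paises_validos)]

# Option 2: keep every country and drop only the rows without earnings
sin_filas = toy.dropna(subset=["prom_ganancias"])

print(pct_perdidos)
```

Looking at the share per country first helps to tell apart countries that lack a few occupation categories from countries with no usable data at all.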

## Occupation

In [125]:
df['nivel_laboral'].unique()

['Occupation (Skill level): Total', 'Occupation (Skill level): Skill levels 3 and 4 ~ high', 'Occupation (Skill level): Skill level 2 ~ medium', 'Occupation (Skill level): Skill level 1 ~ low', 'Occupation (ISCO-08): Total', ..., 'Occupation (ISCO-88): 7. Craft and related trades workers', 'Occupation (ISCO-88): 8. Plant and machine operators and assemblers', 'Occupation (ISCO-88): 9. Elementary occupations', 'Occupation (ISCO-08): X. Not elsewhere classified', 'Occupation (ISCO-88): 0. Armed forces']
Length: 28
Categories (28, object): ['Occupation (ISCO-08): 0. Armed forces occupations', 'Occupation (ISCO-08): 1. Managers', 'Occupation (ISCO-08): 2. Professionals', 'Occupation (ISCO-08): 3. Technicians and associate professionals', ..., 'Occupation (Skill level): Skill level 1 ~ low', 'Occupation (Skill level): Skill level 2 ~ medium', 'Occupation (Skill level): Skill levels 3 and 4 ~ high', 'Occupation (Skill level): Total']

In [126]:
len(df['nivel_laboral'].unique())

28

In [127]:
# .copy() so that the later relabelling does not write into a view of df
filtered_data_isco_08 = df[df['nivel_laboral'].str.contains('ISCO-08)', regex=False)].copy()

In [130]:
filtered_data_isco_08.head()

Unnamed: 0,País,sexo,nivel_laboral,año,prom_ganancias,moneda,note_source.label
12,Albania,Total,Occupation (ISCO-08): Total,2018,293.0,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
13,Albania,Total,Occupation (ISCO-08): 1. Managers,2018,568.0,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
14,Albania,Total,Occupation (ISCO-08): 2. Professionals,2018,399.0,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
15,Albania,Total,Occupation (ISCO-08): 3. Technicians and associate professionals,2018,348.0,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"
16,Albania,Total,Occupation (ISCO-08): 4. Clerical support workers,2018,333.0,Currency: ALB - Lek (ALL),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Establishment size coverage: All establishments with at least 10 employees"


In [132]:
filtered_data_isco_08['nivel_laboral'].value_counts()

nivel_laboral
Occupation (ISCO-08): Total                                                    516
Occupation (ISCO-08): 1. Managers                                              504
Occupation (ISCO-08): 3. Technicians and associate professionals               504
Occupation (ISCO-08): 2. Professionals                                         504
Occupation (ISCO-08): 9. Elementary occupations                                504
Occupation (ISCO-08): 4. Clerical support workers                              504
Occupation (ISCO-08): 7. Craft and related trades workers                      504
Occupation (ISCO-08): 5. Service and sales workers                             504
Occupation (ISCO-08): 8. Plant and machine operators, and assemblers           502
Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers    449
Occupation (ISCO-08): 0. Armed forces occupations                              288
Occupation (ISCO-08): X. Not elsewhere classified                        

A filter was applied to choose which occupational classification to use in the analysis. The data mixes two classifications, ISCO-08 and ISCO-88. After consulting the documentation of the statistics portal the data comes from, I chose ISCO-08 because it is the more recent one. The labels are also replaced to make plotting easier.

In [144]:
filtered_data_isco_08['nivel_laboral'] = filtered_data_isco_08['nivel_laboral'].replace({
    'Occupation (ISCO-08): Total': 'Total',
    'Occupation (ISCO-08): 1. Managers': 'Directores y gerentes',
    'Occupation (ISCO-08): 2. Professionals': 'Profesionales científicos e intelectuales',
    'Occupation (ISCO-08): 3. Technicians and associate professionals': 'Técnicos y profesionales de nivel medio',
    'Occupation (ISCO-08): 4. Clerical support workers': 'Personal de apoyo administrativo',
    'Occupation (ISCO-08): 5. Service and sales workers': 'Trabajadores de los servicios y vendedores de comercios y mercados',
    'Occupation (ISCO-08): 6. Skilled agricultural, forestry and fishery workers': 'Agricultores y trabajadores calificados agropecuarios, forestales y pesqueros',
    'Occupation (ISCO-08): 7. Craft and related trades workers': 'Oficiales, operarios y artesanos de artes mecánicas y de otros oficios',
    'Occupation (ISCO-08): 8. Plant and machine operators, and assemblers': 'Operadores de instalaciones y máquinas y ensambladores',
    'Occupation (ISCO-08): 9. Elementary occupations': 'Ocupaciones elementales',
    'Occupation (ISCO-08): 0. Armed forces occupations': 'Ocupaciones militares',
    'Occupation (ISCO-08): X. Not elsewhere classified': 'No clasificado'
    })

In [145]:
print(filtered_data_isco_08['nivel_laboral'].unique())


['Total', 'Directores y gerentes', 'Profesionales científicos e intelectuales', 'Técnicos y profesionales de nivel medio', 'Personal de apoyo administrativo', ..., 'Oficiales, operarios y artesanos de artes mecánicas y de otros oficios', 'Operadores de instalaciones y máquinas y ensambladores', 'Ocupaciones elementales', 'Ocupaciones militares', 'No clasificado']
Length: 12
Categories (27, object): ['Ocupaciones militares', 'Directores y gerentes', 'Profesionales científicos e intelectuales', 'Técnicos y profesionales de nivel medio', ..., 'Occupation (Skill level): Skill level 1 ~ low', 'Occupation (Skill level): Skill level 2 ~ medium', 'Occupation (Skill level): Skill levels 3 and 4 ~ high', 'Occupation (Skill level): Total']
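The output above still lists 27 categories even though only 12 labels remain, because a categorical column keeps the categories of the rows that were filtered out. A minimal sketch of one way to clean this up, on a hypothetical `niveles` series that reuses three labels from the dataset:

```python
import pandas as pd

# Hypothetical small series; the labels are taken from the dataset
niveles = pd.Series(
    [
        "Occupation (ISCO-08): Total",
        "Occupation (ISCO-08): 1. Managers",
        "Occupation (ISCO-88): 9. Elementary occupations",
    ],
    dtype="category",
)

# Keep only ISCO-08 rows; the ISCO-88 category is still registered
isco_08 = niveles[niveles.str.contains("ISCO-08)", regex=False)].copy()
n_antes = len(isco_08.cat.categories)

# Drop categories that no longer appear, then relabel the remaining ones
isco_08 = isco_08.cat.remove_unused_categories()
isco_08 = isco_08.cat.rename_categories({
    "Occupation (ISCO-08): Total": "Total",
    "Occupation (ISCO-08): 1. Managers": "Directores y gerentes",
})
n_despues = len(isco_08.cat.categories)

print(n_antes, n_despues)
```

Removing the unused categories also keeps empty groups out of later `groupby` results and plot legends.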


## Countries

In [146]:
df.head(), df.columns

(      País     sexo                                          nivel_laboral  \
 0  Albania    Total                        Occupation (Skill level): Total   
 1  Albania    Total  Occupation (Skill level): Skill levels 3 and 4 ~ high   
 2  Albania    Total       Occupation (Skill level): Skill level 2 ~ medium   
 3  Albania    Total          Occupation (Skill level): Skill level 1 ~ low   
 4  Albania  Hombres                        Occupation (Skill level): Total   
 
     año  prom_ganancias                     moneda  \
 0  2018           293.0  Currency: ALB - Lek (ALL)   
 1  2018          1315.0  Currency: ALB - Lek (ALL)   
 2  2018          1109.0  Currency: ALB - Lek (ALL)   
 3  2018           189.0  Currency: ALB - Lek (ALL)   
 4  2018           303.0  Currency: ALB - Lek (ALL)   
 
                                                                                                                                                                                                

In [147]:
eu_data= filtered_data_isco_08[filtered_data_isco_08['moneda'].str.contains('Euro', regex=False)]

Because the values are reported in different currencies, they cannot be compared directly, so I filter for the countries that use the euro.

(Germany, Austria, Belgium, Cyprus, Slovakia, Slovenia, Spain, Estonia, Finland, France, Greece, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, the Netherlands and Portugal.)

In [148]:
print(eu_data.isnull().sum())

País                  0
sexo                  0
nivel_laboral         0
año                   0
prom_ganancias       19
moneda                0
note_source.label     0
dtype: int64


In [149]:
eu_data.head()

Unnamed: 0,País,sexo,nivel_laboral,año,prom_ganancias,moneda,note_source.label
51,Austria,Total,Total,2018,17.42,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
52,Austria,Total,Directores y gerentes,2018,36.58,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
53,Austria,Total,Profesionales científicos e intelectuales,2018,24.81,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
54,Austria,Total,Técnicos y profesionales de nivel medio,2018,19.93,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
55,Austria,Total,Personal de apoyo administrativo,2018,16.48,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"


### Years

In [150]:

years_by_country = eu_data.groupby('País')['año'].unique()
print(years_by_country)


País
Albania                                                                                                                 []
Austria                                                                                                       [2018, 2014]
Belarus                                                                                                                 []
Belgium                                                                                     [2018, 2017, 2016, 2015, 2014]
Bosnia and Herzegovina                                                                                                  []
Bulgaria                                                                                                            [2014]
Croatia                                                                                                                 []
Czechia                                                                                                                 []
Denmark    

In [151]:
eu_data.head(10)

Unnamed: 0,País,sexo,nivel_laboral,año,prom_ganancias,moneda,note_source.label
51,Austria,Total,Total,2018,17.42,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
52,Austria,Total,Directores y gerentes,2018,36.58,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
53,Austria,Total,Profesionales científicos e intelectuales,2018,24.81,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
54,Austria,Total,Técnicos y profesionales de nivel medio,2018,19.93,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
55,Austria,Total,Personal de apoyo administrativo,2018,16.48,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
56,Austria,Total,Trabajadores de los servicios y vendedores de comercios y mercados,2018,12.29,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
57,Austria,Total,"Oficiales, operarios y artesanos de artes mecánicas y de otros oficios",2018,15.59,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
58,Austria,Total,Operadores de instalaciones y máquinas y ensambladores,2018,15.11,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
59,Austria,Total,Ocupaciones elementales,2018,11.66,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"
60,Austria,Hombres,Total,2018,19.03,Working time arrangement coverage: Full-time and part time workers | Currency: AUT - Euro (EUR),"Data reference period: October | Economic activity coverage: Excluding agriculture, public administration, and activities of households as employers and of extraterritorial organisations and bodies | Reference group coverage: Employees | Establishment size coverage: All establishments with at least 10 employees"


Only the countries that use the euro were kept so that the analysis is done in a single currency. Because each country reports data for different years, the cross-country analysis will use 2014 or 2018, the most recent years available for most countries, together with the regression for Spain.
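One way to back up the choice of year is to count how many countries report each year. The sketch below does this on a hypothetical `toy` frame; the country and year combinations are invented for illustration and do not reproduce the real coverage.

```python
import pandas as pd

# Hypothetical coverage table that reuses the notebook's column names
toy = pd.DataFrame({
    "País": ["Austria", "Austria", "Belgium", "Belgium", "Belgium", "Spain"],
    "año": [2018, 2014, 2018, 2016, 2014, 2014],
})

# Number of distinct countries reporting in each year
paises_por_año = (
    toy.groupby("año")["País"].nunique().sort_values(ascending=False)
)

# Years with the widest coverage (ties kept), then the most recent of them
mejor_cobertura = paises_por_año[paises_por_año == paises_por_año.max()].index
año_elegido = max(mejor_cobertura)

print(paises_por_año)
print(año_elegido)
```

On the real `eu_data`, the same count would show whether 2014 or 2018 covers more euro countries.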

In [152]:
eu_data.to_csv('datos_procesados3.csv', index=False)