<a href="https://colab.research.google.com/github/cristiandarioortegayubro/BA/blob/main/eda_02.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![logo](https://github.com/cristiandarioortegayubro/BA/blob/main/dba.png?raw=true)

## **Exploratory Data Analysis**

**This Colab notebook carries out several data preprocessing tasks...**

## **Updating the required modules**

In [1]:
!pip install scikit-learn --upgrade

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


## **Loading modules**

### **For data analysis**

In [2]:
import pandas as pd
import numpy as np

### **For data preprocessing**

In [38]:
from sklearn.preprocessing import LabelEncoder

In [42]:
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import RobustScaler

## **Loading the data**

In [3]:
url = "https://raw.githubusercontent.com/LucaAPiattelli/Diplomatura_Business_Analytics_UDA/main/Modulo_08_Aprendizaje_Automatico/visualizacion.csv"

In [4]:
analisis = pd.read_csv(url)
analisis.head()

Unnamed: 0,satisfaction_level,last_evaluation,number_project,average_montly_hours,time_spend_company,Work_accident,left,promotion_last_5years,sales,salary
0,0.38,0.53,2,157,3,0,1,0,sales,low
1,0.8,0.86,5,262,6,0,1,0,sales,medium
2,0.11,0.88,7,272,4,0,1,0,sales,medium
3,0.72,0.87,5,223,5,0,1,0,sales,low
4,0.37,0.52,2,159,3,0,1,0,sales,low


**The columns are renamed so they are easier to interpret...**

In [5]:
analisis.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14999 entries, 0 to 14998
Data columns (total 10 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   satisfaction_level     14999 non-null  float64
 1   last_evaluation        14999 non-null  float64
 2   number_project         14999 non-null  int64  
 3   average_montly_hours   14999 non-null  int64  
 4   time_spend_company     14999 non-null  int64  
 5   Work_accident          14999 non-null  int64  
 6   left                   14999 non-null  int64  
 7   promotion_last_5years  14999 non-null  int64  
 8   sales                  14999 non-null  object 
 9   salary                 14999 non-null  object 
dtypes: float64(2), int64(6), object(2)
memory usage: 1.1+ MB


In [6]:
analisis.rename(columns={"satisfaction_level":"niveldesatisfaccion",
                         "last_evaluation":"ultimaevaluacion",
                         "number_project":"numerosdeproyectos",
                         "average_montly_hours":"horasmensualespromedio",
                         "time_spend_company":"tiempoenlaempresa",
                         "Work_accident":"accidentedetrabajo",
                         "left":"abandono",
                         "promotion_last_5years":"promocionultimos5años",
                         "sales":"ventas",
                         "salary":"sueldo"}, inplace=True)

In [7]:
analisis.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14999 entries, 0 to 14998
Data columns (total 10 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   niveldesatisfaccion     14999 non-null  float64
 1   ultimaevaluacion        14999 non-null  float64
 2   numerosdeproyectos      14999 non-null  int64  
 3   horasmensualespromedio  14999 non-null  int64  
 4   tiempoenlaempresa       14999 non-null  int64  
 5   accidentedetrabajo      14999 non-null  int64  
 6   abandono                14999 non-null  int64  
 7   promocionultimos5años   14999 non-null  int64  
 8   ventas                  14999 non-null  object 
 9   sueldo                  14999 non-null  object 
dtypes: float64(2), int64(6), object(2)
memory usage: 1.1+ MB
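
By default, `rename` silently skips any key in the mapping that does not match an existing column. That is easy to miss with names such as `average_montly_hours`. The following is a minimal sketch on a hypothetical two-column stand-in (`demo` is not the notebook's data): passing `errors="raise"` makes pandas fail loudly on an unknown key.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's DataFrame (two columns only).
demo = pd.DataFrame({"average_montly_hours": [157, 262], "left": [1, 1]})

# Without inplace=True, rename returns a new DataFrame and leaves demo untouched.
renombrado = demo.rename(
    columns={"average_montly_hours": "horasmensualespromedio", "left": "abandono"},
    errors="raise",
)

# "average_monthly_hours" (spelled correctly) is not a column of demo,
# so errors="raise" turns the silent no-op into a KeyError.
try:
    demo.rename(columns={"average_monthly_hours": "horas"}, errors="raise")
    error_detectado = False
except KeyError:
    error_detectado = True
```

Assigning the result to a new name instead of using `inplace=True` also keeps the original column names available for comparison.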


# **Data cleaning and transformation**

## **Filtering data**

In [8]:
analisis.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14999 entries, 0 to 14998
Data columns (total 10 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   niveldesatisfaccion     14999 non-null  float64
 1   ultimaevaluacion        14999 non-null  float64
 2   numerosdeproyectos      14999 non-null  int64  
 3   horasmensualespromedio  14999 non-null  int64  
 4   tiempoenlaempresa       14999 non-null  int64  
 5   accidentedetrabajo      14999 non-null  int64  
 6   abandono                14999 non-null  int64  
 7   promocionultimos5años   14999 non-null  int64  
 8   ventas                  14999 non-null  object 
 9   sueldo                  14999 non-null  object 
dtypes: float64(2), int64(6), object(2)
memory usage: 1.1+ MB


In [9]:
analisis.head(2)

Unnamed: 0,niveldesatisfaccion,ultimaevaluacion,numerosdeproyectos,horasmensualespromedio,tiempoenlaempresa,accidentedetrabajo,abandono,promocionultimos5años,ventas,sueldo
0,0.38,0.53,2,157,3,0,1,0,sales,low
1,0.8,0.86,5,262,6,0,1,0,sales,medium


In [10]:
analisis.rename(columns={"ventas":"sector"}, inplace=True)

In [11]:
analisis.filter(["sector", "sueldo"])

Unnamed: 0,sector,sueldo
0,sales,low
1,sales,medium
2,sales,medium
3,sales,low
4,sales,low
...,...,...
14994,support,low
14995,support,low
14996,support,low
14997,support,low


In [12]:
analisis.sueldo

0           low
1        medium
2        medium
3           low
4           low
          ...  
14994       low
14995       low
14996       low
14997       low
14998       low
Name: sueldo, Length: 14999, dtype: object

In [13]:
analisis.filter([0,1,2,5,7,19],axis=0)

Unnamed: 0,niveldesatisfaccion,ultimaevaluacion,numerosdeproyectos,horasmensualespromedio,tiempoenlaempresa,accidentedetrabajo,abandono,promocionultimos5años,sector,sueldo
0,0.38,0.53,2,157,3,0,1,0,sales,low
1,0.8,0.86,5,262,6,0,1,0,sales,medium
2,0.11,0.88,7,272,4,0,1,0,sales,medium
5,0.41,0.5,2,153,3,0,1,0,sales,low
7,0.92,0.85,5,259,5,0,1,0,sales,low
19,0.76,0.89,5,262,5,0,1,0,sales,low
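
`filter` keeps only the requested labels that actually exist and ignores the rest, whereas `.loc` with a list of labels raises an error if any label is missing. The sketch below illustrates the difference on a hypothetical three-row frame (`demo` is an assumption for illustration only).

```python
import pandas as pd

# Hypothetical three-row sample; index labels are 0, 1, 2.
demo = pd.DataFrame({
    "sector": ["sales", "sales", "support"],
    "sueldo": ["low", "medium", "low"],
})

# Label 19 does not exist: filter drops it without complaint.
por_filter = demo.filter([0, 2, 19], axis=0)

# .loc is strict: a missing label raises KeyError.
try:
    demo.loc[[0, 2, 19]]
    loc_fallo = False
except KeyError:
    loc_fallo = True
```

Which behaviour is preferable depends on whether a missing row should count as an error in the analysis.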


In [14]:
analisis[analisis.sector=="support"]

Unnamed: 0,niveldesatisfaccion,ultimaevaluacion,numerosdeproyectos,horasmensualespromedio,tiempoenlaempresa,accidentedetrabajo,abandono,promocionultimos5años,sector,sueldo
46,0.40,0.55,2,147,3,0,1,0,support,low
47,0.57,0.70,3,273,6,0,1,0,support,low
48,0.40,0.54,2,148,3,0,1,0,support,low
49,0.43,0.47,2,147,3,0,1,0,support,low
50,0.13,0.78,6,152,2,0,1,0,support,low
...,...,...,...,...,...,...,...,...,...,...
14994,0.40,0.57,2,151,3,0,1,0,support,low
14995,0.37,0.48,2,160,3,0,1,0,support,low
14996,0.37,0.53,2,143,3,0,1,0,support,low
14997,0.11,0.96,6,280,4,0,1,0,support,low
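
The boolean mask used above can be extended to several conditions. The sketch below assumes a small hypothetical sample with the same column names (`demo` is not the full dataset) and combines conditions with `&`, selects columns through `.loc`, and matches several values with `isin`.

```python
import pandas as pd

# Hypothetical sample that reuses the notebook's column names.
demo = pd.DataFrame({
    "niveldesatisfaccion": [0.38, 0.80, 0.40, 0.13],
    "sector": ["sales", "sales", "support", "support"],
    "sueldo": ["low", "medium", "low", "low"],
})

# Each condition goes in parentheses; & means "and", | means "or".
mascara = (demo["sector"] == "support") & (demo["niveldesatisfaccion"] < 0.2)

# .loc takes the row mask and, optionally, the list of columns to keep.
insatisfechos = demo.loc[mascara, ["sector", "niveldesatisfaccion"]]

# isin() keeps the rows whose value is any of the listed ones.
sueldos = demo[demo["sueldo"].isin(["medium", "high"])]
```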


## **Missing values**

In [15]:
url2 = "https://raw.githubusercontent.com/LucaAPiattelli/Diplomatura_Business_Analytics_UDA/main/Modulo_08_Aprendizaje_Automatico/empleados.csv"

In [16]:
empleados = pd.read_csv(url2)

In [17]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Alan Smith,45.0,,,Operations,G3,723
1,Sandro Kumar,,16000.0,F,Finance,G0,520
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
5,Samara Sharma,,62000.0,,Sales,G3,649
6,Joaquin Fleiman,54.0,,F,Operations,G3,53
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709


### **Dropping NaN values**

In [18]:
empleados.dropna(how="any", inplace=True)

In [19]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709
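
Dropping every row with any missing value removed four of the nine employees. Before dropping, it can help to count the gaps per column and to restrict the drop to the columns that matter. The following is a minimal sketch on a three-row sample copied from the first rows of the table (the name `demo` is an assumption).

```python
import numpy as np
import pandas as pd

# First three rows of the employee table, reduced to three columns.
demo = pd.DataFrame({
    "nombre": ["Alan Smith", "Sandro Kumar", "Jacinto Morgan"],
    "edad": [45.0, np.nan, 32.0],
    "sueldo": [np.nan, 16000.0, 35000.0],
})

# Number of missing values in each column.
faltantes = demo.isna().sum()

# Drop a row only when "edad" is missing; gaps in other columns are kept.
solo_con_edad = demo.dropna(subset=["edad"])

# For comparison: how="any" keeps only the fully populated rows.
sin_faltantes = demo.dropna(how="any")
```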


### **Filling in NaN values**

In [24]:
empleados = pd.read_csv(url2)
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Alan Smith,45.0,,,Operations,G3,723
1,Sandro Kumar,,16000.0,F,Finance,G0,520
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
5,Samara Sharma,,62000.0,,Sales,G3,649
6,Joaquin Fleiman,54.0,,F,Operations,G3,53
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709


In [25]:
empleados["edad"] = empleados.edad.fillna(empleados.edad.mean())
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Alan Smith,45.0,,,Operations,G3,723
1,Sandro Kumar,40.428571,16000.0,F,Finance,G0,520
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
5,Samara Sharma,40.428571,62000.0,,Sales,G3,649
6,Joaquin Fleiman,54.0,,F,Operations,G3,53
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709


In [26]:
empleados = pd.read_csv(url2)
empleados["edad"] = empleados.edad.fillna(empleados.edad.mean()).round(0)
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Alan Smith,45.0,,,Operations,G3,723
1,Sandro Kumar,40.0,16000.0,F,Finance,G0,520
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
5,Samara Sharma,40.0,62000.0,,Sales,G3,649
6,Joaquin Fleiman,54.0,,F,Operations,G3,53
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709


In [27]:
empleados = pd.read_csv(url2)
empleados["edad"] = empleados.edad.fillna(empleados.edad.median())
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Alan Smith,45.0,,,Operations,G3,723
1,Sandro Kumar,45.0,16000.0,F,Finance,G0,520
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
5,Samara Sharma,45.0,62000.0,,Sales,G3,649
6,Joaquin Fleiman,54.0,,F,Operations,G3,53
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709


In [28]:
empleados["sueldo"] = empleados.sueldo.fillna(empleados.sueldo.median())
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Alan Smith,45.0,52000.0,,Operations,G3,723
1,Sandro Kumar,45.0,16000.0,F,Finance,G0,520
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
5,Samara Sharma,45.0,62000.0,,Sales,G3,649
6,Joaquin Fleiman,54.0,52000.0,F,Operations,G3,53
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709
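
The cells above fill one column at a time and leave `sexo` untouched. One option is to pass `fillna` a dictionary with one fill value per column and to use the most frequent value (the mode) for the categorical column. This is a sketch on a hypothetical four-row sample; filling `sexo` with the mode is an assumption for illustration, not something the notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with gaps in numeric and categorical columns.
demo = pd.DataFrame({
    "edad": [45.0, np.nan, 32.0, 45.0],
    "sueldo": [np.nan, 16000.0, 35000.0, 65000.0],
    "sexo": [np.nan, "F", "M", "F"],
})

# One fill value per column: median for numbers, mode for categories.
valores = {
    "edad": demo["edad"].median(),
    "sueldo": demo["sueldo"].median(),
    "sexo": demo["sexo"].mode()[0],
}

# fillna with a dict fills each column with its own value.
completo = demo.fillna(valores)
```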


## **Encoding**

In [29]:
empleados = pd.read_csv(url2)
empleados.dropna(how="any", inplace=True)
empleados.reset_index(drop=True, inplace=True)
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
1,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
2,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
3,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
4,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709


In [30]:
empleadoscodificados = pd.get_dummies(empleados["sexo"])

In [31]:
empleadoscodificados

Unnamed: 0,F,M
0,0,1
1,1,0
2,1,0
3,1,0
4,0,1


In [32]:
empleados = empleados.join(empleadoscodificados)

In [33]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance,F,M
0,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674,0,1
1,Ernesto Chin,45.0,65000.0,F,Sales,G3,556,1,0
2,Fernanda Patel,30.0,42000.0,F,Operations,G2,711,1,0
3,Juana Wilkis,54.0,52000.0,F,Finance,G3,901,1,0
4,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709,0,1


**Or all in a single line... first we load the dataframe**

In [34]:
empleados = pd.read_csv(url2)
empleados.dropna(how="any", inplace=True)
empleados.reset_index(drop=True, inplace=True)
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
1,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
2,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
3,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
4,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709


**And now, the single line**

In [35]:
empleados = empleados.join(pd.get_dummies(empleados["sexo"]))

In [36]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance,F,M
0,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674,0,1
1,Ernesto Chin,45.0,65000.0,F,Sales,G3,556,1,0
2,Fernanda Patel,30.0,42000.0,F,Operations,G2,711,1,0
3,Juana Wilkis,54.0,52000.0,F,Finance,G3,901,1,0
4,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709,0,1
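
`get_dummies` can also take the whole DataFrame plus a `columns` list. It then replaces the original column and prefixes the new ones with its name, so no `join` is needed. The sketch below uses a three-row sample from the table above (the name `demo` is an assumption). Because recent pandas versions return boolean dummies by default, `dtype=int` is passed to get the 0/1 values shown above.

```python
import pandas as pd

# Three rows from the cleaned employee table, reduced to two columns.
demo = pd.DataFrame({
    "nombre": ["Jacinto Morgan", "Ernesto Chin", "Fernanda Patel"],
    "sexo": ["M", "F", "F"],
})

# columns=["sexo"] encodes only that column and drops the original;
# dtype=int yields 0/1 instead of False/True.
codificado = pd.get_dummies(demo, columns=["sexo"], dtype=int)
```

The prefixed names (`sexo_F`, `sexo_M`) make it easier to tell which variable each indicator came from once several columns are encoded.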


In [37]:
empleados = pd.read_csv(url2)
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Alan Smith,45.0,,,Operations,G3,723
1,Sandro Kumar,,16000.0,F,Finance,G0,520
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711
5,Samara Sharma,,62000.0,,Sales,G3,649
6,Joaquin Fleiman,54.0,,F,Operations,G3,53
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709


In [39]:
etiqueta_ordenada = LabelEncoder()

In [40]:
empleados["nivel_cod"] = etiqueta_ordenada.fit_transform(empleados["nivel"])

In [41]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance,nivel_cod
0,Alan Smith,45.0,,,Operations,G3,723,2
1,Sandro Kumar,,16000.0,F,Finance,G0,520,0
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674,1
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556,2
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711,1
5,Samara Sharma,,62000.0,,Sales,G3,649,2
6,Joaquin Fleiman,54.0,,F,Operations,G3,53,2
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901,2
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709,3
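
`LabelEncoder` numbers the classes in sorted order of the values that are present. Since no employee has level G1, G2 is coded as 1 in the output above. If the codes should follow the full scale, the order can be declared explicitly. The sketch below uses a pandas ordered categorical and assumes the scale runs from G0 to G4; that list is an assumption, as the source does not enumerate the levels. scikit-learn's `OrdinalEncoder` accepts an explicit `categories` argument for the same purpose.

```python
import pandas as pd

# The "nivel" column as it appears in the employee table.
niveles = pd.Series(["G3", "G0", "G2", "G3", "G2", "G3", "G3", "G3", "G4"])

# Assumed full scale; G1 is listed even though no row uses it.
escala_niveles = ["G0", "G1", "G2", "G3", "G4"]

# An ordered categorical stores each value as its position in the scale.
nivel_cat = pd.Categorical(niveles, categories=escala_niveles, ordered=True)
nivel_cod = pd.Series(nivel_cat.codes, index=niveles.index)
```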


## **Scaling the data features**

In [43]:
escala = StandardScaler()

In [44]:
escala.fit(empleados["performance"].values.reshape(-1,1))

StandardScaler()

In [45]:
empleados["escala_perform"]=escala.transform(empleados["performance"].values.reshape(-1,1))

In [46]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance,nivel_cod,escala_perform
0,Alan Smith,45.0,,,Operations,G3,723,2,0.505565
1,Sandro Kumar,,16000.0,F,Finance,G0,520,0,-0.408053
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674,1,0.285037
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556,2,-0.246032
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711,1,0.451558
5,Samara Sharma,,62000.0,,Sales,G3,649,2,0.172522
6,Joaquin Fleiman,54.0,,F,Operations,G3,53,2,-2.509823
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901,2,1.306668
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709,3,0.442557


In [47]:
escalaminmax = MinMaxScaler()

In [48]:
escalaminmax.fit(empleados["performance"].values.reshape(-1,1))

MinMaxScaler()

In [49]:
empleados["escala_minmax"]=escalaminmax.transform(empleados["performance"].values.reshape(-1,1))

In [50]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance,nivel_cod,escala_perform,escala_minmax
0,Alan Smith,45.0,,,Operations,G3,723,2,0.505565,0.790094
1,Sandro Kumar,,16000.0,F,Finance,G0,520,0,-0.408053,0.550708
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674,1,0.285037,0.732311
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556,2,-0.246032,0.59316
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711,1,0.451558,0.775943
5,Samara Sharma,,62000.0,,Sales,G3,649,2,0.172522,0.70283
6,Joaquin Fleiman,54.0,,F,Operations,G3,53,2,-2.509823,0.0
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901,2,1.306668,1.0
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709,3,0.442557,0.773585


In [51]:
escala_robusta = RobustScaler()

In [52]:
escala_robusta.fit(empleados["performance"].values.reshape(-1,1))

RobustScaler()

In [53]:
empleados["escala_robusta"]=escala_robusta.transform(empleados["performance"].values.reshape(-1,1))

In [54]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance,nivel_cod,escala_perform,escala_minmax,escala_robusta
0,Alan Smith,45.0,,,Operations,G3,723,2,0.505565,0.790094,0.316129
1,Sandro Kumar,,16000.0,F,Finance,G0,520,0,-0.408053,0.550708,-0.993548
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674,1,0.285037,0.732311,0.0
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556,2,-0.246032,0.59316,-0.76129
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711,1,0.451558,0.775943,0.23871
5,Samara Sharma,,62000.0,,Sales,G3,649,2,0.172522,0.70283,-0.16129
6,Joaquin Fleiman,54.0,,F,Operations,G3,53,2,-2.509823,0.0,-4.006452
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901,2,1.306668,1.0,1.464516
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709,3,0.442557,0.773585,0.225806
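
The three scaled columns above can be reproduced by hand, which makes the difference between the scalers easier to see. The following NumPy sketch applies the formulas with each scaler's default settings to the `performance` values from the table. It illustrates the arithmetic and is not a replacement for the scikit-learn objects.

```python
import numpy as np

# The "performance" column from the employee table.
performance = np.array([723, 520, 674, 556, 711, 649, 53, 901, 709], dtype=float)

# StandardScaler: subtract the mean, divide by the population std (ddof=0).
estandar = (performance - performance.mean()) / performance.std()

# MinMaxScaler: map the minimum to 0 and the maximum to 1.
minmax = (performance - performance.min()) / (performance.max() - performance.min())

# RobustScaler: subtract the median, divide by the interquartile range.
q1, q3 = np.percentile(performance, [25, 75])
robusta = (performance - np.median(performance)) / (q3 - q1)
```

The single low score of 53 sets the minimum, so min-max scaling squeezes the other eight values into the range from about 0.55 to 1.0. The robust scaler depends only on the median and quartiles, so that one value does not affect how the rest are spread.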


## **Transforming the data features**

In [55]:
empleados = pd.read_csv(url2)

In [56]:
def grado_performance(performance):
    # Map a numeric score to a letter grade:
    # A for 700 or more, B for 500 up to (but not including) 700, C below 500.
    if performance >= 700:
        return "A"
    elif performance >= 500:
        return "B"
    else:
        return "C"

In [57]:
empleados["grado_performance"] = empleados.performance.apply(grado_performance)

In [58]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance,grado_performance
0,Alan Smith,45.0,,,Operations,G3,723,A
1,Sandro Kumar,,16000.0,F,Finance,G0,520,B
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674,B
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556,B
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711,A
5,Samara Sharma,,62000.0,,Sales,G3,649,B
6,Joaquin Fleiman,54.0,,F,Operations,G3,53,C
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901,A
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709,A


In [59]:
empleados = pd.read_csv(url2)

In [60]:
empleados.performance = empleados.performance.apply(grado_performance)

In [61]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance
0,Alan Smith,45.0,,,Operations,G3,A
1,Sandro Kumar,,16000.0,F,Finance,G0,B
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,B
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,B
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,A
5,Samara Sharma,,62000.0,,Sales,G3,B
6,Joaquin Fleiman,54.0,,F,Operations,G3,C
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,A
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,A
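
The same grading can be expressed without a Python function by binning the scores with `pd.cut`. This is a sketch of an alternative, not the notebook's own method. The bin edges mirror the thresholds in `grado_performance`, and `right=False` makes each bin include its lower edge, so a score of exactly 700 is an A and exactly 500 is a B.

```python
import numpy as np
import pandas as pd

# The "performance" column from the employee table.
performance = pd.Series([723, 520, 674, 556, 711, 649, 53, 901, 709])

# Bins: [-inf, 500) -> "C", [500, 700) -> "B", [700, inf) -> "A".
grados = pd.cut(
    performance,
    bins=[-np.inf, 500, 700, np.inf],
    labels=["C", "B", "A"],
    right=False,
)
```

The result is a categorical Series, so the grades keep their C < B < A order for sorting and comparison.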


## **Splitting columns**

In [62]:
empleados = pd.read_csv(url2)

In [63]:
empleados["primer_nombre"] = empleados.nombre.str.split(" ").map(lambda var: var[0])

In [64]:
empleados["ultimo_nombre"] = empleados.nombre.str.split(" ").map(lambda var: var[1])

In [65]:
empleados

Unnamed: 0,nombre,edad,sueldo,sexo,sector,nivel,performance,primer_nombre,ultimo_nombre
0,Alan Smith,45.0,,,Operations,G3,723,Alan,Smith
1,Sandro Kumar,,16000.0,F,Finance,G0,520,Sandro,Kumar
2,Jacinto Morgan,32.0,35000.0,M,Finance,G2,674,Jacinto,Morgan
3,Ernesto Chin,45.0,65000.0,F,Sales,G3,556,Ernesto,Chin
4,Fernanda Patel,30.0,42000.0,F,Operations,G2,711,Fernanda,Patel
5,Samara Sharma,,62000.0,,Sales,G3,649,Samara,Sharma
6,Joaquin Fleiman,54.0,,F,Operations,G3,53,Joaquin,Fleiman
7,Juana Wilkis,54.0,52000.0,F,Finance,G3,901,Juana,Wilkis
8,Leonardo Doberti,23.0,98000.0,M,Sales,G4,709,Leonardo,Doberti
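
Indexing with `var[1]` works here because every name has exactly two words; a single-word name would raise an `IndexError`. The sketch below uses `str.split` with `expand=True`, which returns one column per part and leaves a missing value where a part is absent. The sample frame and the single-word name "Cher" are hypothetical and used only to show that case.

```python
import pandas as pd

# Two names from the table plus a hypothetical single-word name.
demo = pd.DataFrame({"nombre": ["Alan Smith", "Juana Wilkis", "Cher"]})

# n=1 splits on the first space only; expand=True returns a DataFrame
# with one column per part (0 and 1).
partes = demo["nombre"].str.split(" ", n=1, expand=True)

demo["primer_nombre"] = partes[0]
demo["ultimo_nombre"] = partes[1]
```

With `n=1`, everything after the first space stays together, so a name with three words would keep its last two words in `ultimo_nombre`.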
