# **1. Data cleaning**

## 1.1 Importing data into Python
**Importing libraries**

The libraries most commonly used for data science are:

In [1]:
# Importing libraries
import pandas as pd
import os
import math as mt
import numpy as np
import scipy
import matplotlib.pyplot as plt
import seaborn
import statistics
from scipy import stats

**Importing data from Excel**

To import data into Python from a Microsoft Excel workbook, use the **read_excel** function from **pandas**. Its syntax is:
*dataframe_name = pd.read_excel("path/filename", sheet_name = "sheet_name")*

In [2]:
data_banco= pd.read_excel('Data_Banco_v2.xlsx', sheet_name='Data')
data_banco.head(5)

Unnamed: 0,Sucursal,Cajero,ID_Transaccion,Transaccion,Entidad_Bancaria,Tiempo_Servicio_seg,Satisfaccion,Monto,Interes
0,62,4820,2,Cobro/Pago,Internacional,311.0,Muy Bueno,28893.0,335.97
1,62,4820,2,Cobro/Pago,Nacional,156.0,Malo,167069.0,137.23
2,62,4820,2,Cobro/Pago,Nacional,248.0,Regular,317249.0,143.3
3,62,4820,2,Cobro/Pago,Nacional,99.0,Regular,1764.92,42.55
4,62,4820,2,Cobro/Pago,Nacional,123.0,Muy Bueno,1835.69,199.47


In [3]:
data_banco.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24299 entries, 0 to 24298
Data columns (total 9 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   Sucursal             24299 non-null  int64  
 1   Cajero               24299 non-null  int64  
 2   ID_Transaccion       24299 non-null  int64  
 3   Transaccion          24299 non-null  object 
 4   Entidad_Bancaria     24299 non-null  object 
 5   Tiempo_Servicio_seg  24299 non-null  float64
 6   Satisfaccion         24299 non-null  object 
 7   Monto                24299 non-null  object 
 8   Interes              24299 non-null  float64
dtypes: float64(2), int64(3), object(4)
memory usage: 1.7+ MB


In [4]:
data_sucursal= pd.read_excel('Data_Banco_v2.xlsx', sheet_name='Data_Sucursal')
data_sucursal.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 4 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   ID_Sucursal    5 non-null      int64 
 1   Sucursal       5 non-null      object
 2   Nuevo_Sistema  5 non-null      object
 3   Plataforma     5 non-null      object
dtypes: int64(1), object(3)
memory usage: 292.0+ bytes


In [5]:
data_cajero= pd.read_excel('Data_Banco_v2.xlsx', sheet_name='Data_Cajero')
data_cajero.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 27 entries, 0 to 26
Data columns (total 5 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   Cajero           27 non-null     int64 
 1   Edad             27 non-null     int64 
 2   Sexo             27 non-null     object
 3   Nivel_Formacion  27 non-null     object
 4   Ingreso          27 non-null     int64 
dtypes: int64(3), object(2)
memory usage: 1.2+ KB


In [6]:
data_banco.isna().sum()

Sucursal               0
Cajero                 0
ID_Transaccion         0
Transaccion            0
Entidad_Bancaria       0
Tiempo_Servicio_seg    0
Satisfaccion           0
Monto                  0
Interes                0
dtype: int64

In [7]:
data_banco.duplicated()

0        False
1        False
2        False
3        False
4        False
         ...  
24294    False
24295    False
24296    False
24297    False
24298    False
Length: 24299, dtype: bool

## 1.2 Understanding the data
To understand the data we have available, we can:
*   Look at the first rows of each dataframe with the DataFrame method *.head()*

*dataframe_name.head()*
*   Get a summary table with the *.info()* method

*data_banco.info()*
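
As a small illustration of these inspection steps, here is a sketch on a toy dataframe. The dataframe `toy` and its values are hypothetical and only imitate a few columns of `data_banco`. It also shows that `.isna().sum()` and `.duplicated().sum()` reduce the checks to one number per column and one number for the whole table, which is usually easier to read than the full boolean series.

```python
import pandas as pd

# Toy data (hypothetical values) imitating a few columns of data_banco
toy = pd.DataFrame({
    "Sucursal": [62, 62, 85, 85, 85],
    "Transaccion": ["Deposito", "Deposito", "Cobro/Pago", None, "Deposito"],
    "Monto": [100.0, 100.0, 250.5, 80.0, 100.0],
})

print(toy.head(3))  # first three rows
toy.info()          # column names, non-null counts and dtypes

# Missing values in each column
missing_per_column = toy.isna().sum()

# Number of rows that repeat an earlier row exactly
n_duplicates = toy.duplicated().sum()

print(missing_per_column)
print("Duplicated rows:", n_duplicates)
```

If duplicates do show up, `toy.drop_duplicates()` returns a copy without them; whether dropping is appropriate depends on whether repeated rows are genuine repeated transactions.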


In [8]:
# Correcting a numeric variable (decimal comma to decimal point)
data_banco['Monto'] = pd.to_numeric(data_banco['Monto'].replace(',','.',regex = True))
data_banco.head(5)
data_banco.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24299 entries, 0 to 24298
Data columns (total 9 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   Sucursal             24299 non-null  int64  
 1   Cajero               24299 non-null  int64  
 2   ID_Transaccion       24299 non-null  int64  
 3   Transaccion          24299 non-null  object 
 4   Entidad_Bancaria     24299 non-null  object 
 5   Tiempo_Servicio_seg  24299 non-null  float64
 6   Satisfaccion         24299 non-null  object 
 7   Monto                24299 non-null  float64
 8   Interes              24299 non-null  float64
dtypes: float64(3), int64(3), object(3)
memory usage: 1.7+ MB
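
The conversion above can be sketched on a small series to make each step visible. The values in `monto_raw`, including the invalid entry `"sin dato"`, are hypothetical. Passing `errors="coerce"` is an addition that the notebook cell does not use: it is one possible way to keep the conversion from failing when a value cannot be parsed.

```python
import pandas as pd

# Hypothetical raw values: decimal commas, a plain float and one invalid entry
monto_raw = pd.Series(["2889,3", "1670,69", 1764.92, "sin dato"], dtype="object")

# Cast to string so the replacement applies to every element,
# then swap the decimal comma for a decimal point
monto_txt = monto_raw.astype(str).str.replace(",", ".", regex=False)

# errors="coerce" turns anything that still cannot be parsed into NaN
monto_num = pd.to_numeric(monto_txt, errors="coerce")

print(monto_num)
print("Values that could not be converted:", monto_num.isna().sum())
```

Counting the resulting `NaN` values right after the conversion is a cheap check that no rows were silently lost.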


## 1.3 Data type transformation
We will apply the recommended data type checks:
*   Conversion to *string*
*   Validation of numeric data
*   Validation of date and/or categorical data

In [9]:
data_banco['Satisfaccion'].unique()

array(['Muy Bueno', 'Malo', 'Regular', 'Bueno', 'Muy Malo'], dtype=object)

In [10]:
# Categorical variables (nominal and ordinal)
data_banco['Satisfaccion'] = pd.Categorical(data_banco['Satisfaccion'],
               categories = ['Muy Malo', 'Malo', 'Regular', 'Bueno', 'Muy Bueno'],
               ordered = True)
data_banco.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24299 entries, 0 to 24298
Data columns (total 9 columns):
 #   Column               Non-Null Count  Dtype   
---  ------               --------------  -----   
 0   Sucursal             24299 non-null  int64   
 1   Cajero               24299 non-null  int64   
 2   ID_Transaccion       24299 non-null  int64   
 3   Transaccion          24299 non-null  object  
 4   Entidad_Bancaria     24299 non-null  object  
 5   Tiempo_Servicio_seg  24299 non-null  float64 
 6   Satisfaccion         24299 non-null  category
 7   Monto                24299 non-null  float64 
 8   Interes              24299 non-null  float64 
dtypes: category(1), float64(3), int64(3), object(2)
memory usage: 1.5+ MB


In [11]:
data_banco['Entidad_Bancaria'] = pd.Categorical(data_banco['Entidad_Bancaria'],
  categories = ['Nacional', 'Internacional'],
  ordered = False)
data_banco.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24299 entries, 0 to 24298
Data columns (total 9 columns):
 #   Column               Non-Null Count  Dtype   
---  ------               --------------  -----   
 0   Sucursal             24299 non-null  int64   
 1   Cajero               24299 non-null  int64   
 2   ID_Transaccion       24299 non-null  int64   
 3   Transaccion          24299 non-null  object  
 4   Entidad_Bancaria     24299 non-null  category
 5   Tiempo_Servicio_seg  24299 non-null  float64 
 6   Satisfaccion         24299 non-null  category
 7   Monto                24299 non-null  float64 
 8   Interes              24299 non-null  float64 
dtypes: category(2), float64(3), int64(3), object(1)
memory usage: 1.3+ MB
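
To show what `ordered = True` changes in practice, here is a minimal sketch on a short series. The responses in `answers` are hypothetical, and the label `"Excelente"` is deliberately outside the declared categories to show how such values are handled.

```python
import pandas as pd

levels = ["Muy Malo", "Malo", "Regular", "Bueno", "Muy Bueno"]

# Hypothetical responses; "Excelente" is not one of the declared categories
answers = pd.Series(["Bueno", "Malo", "Muy Bueno", "Excelente", "Regular"])

satisfaccion = pd.Series(pd.Categorical(answers, categories=levels, ordered=True))

# Labels outside `categories` become missing values
print("Missing after conversion:", satisfaccion.isna().sum())

# Ordered categories support comparisons, min/max and sorting by rank
at_least_bueno = satisfaccion >= "Bueno"
print(at_least_bueno.tolist())
print(satisfaccion.min(), satisfaccion.max())
print(satisfaccion.sort_values().tolist())
```

Because unknown labels turn into missing values without an error, it can be worth comparing `.isna().sum()` before and after the conversion.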


# **2. Creating new variables**
1.   Create the variable Monto_total = Monto + Interes
2.   Create the variable Tiempo_Servicio_min = Tiempo_Servicio_seg / 60



In [12]:
# Monto_total (direct calculation)
data_banco['Monto_total'] = data_banco['Monto'] + data_banco['Interes']
data_banco.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24299 entries, 0 to 24298
Data columns (total 10 columns):
 #   Column               Non-Null Count  Dtype   
---  ------               --------------  -----   
 0   Sucursal             24299 non-null  int64   
 1   Cajero               24299 non-null  int64   
 2   ID_Transaccion       24299 non-null  int64   
 3   Transaccion          24299 non-null  object  
 4   Entidad_Bancaria     24299 non-null  category
 5   Tiempo_Servicio_seg  24299 non-null  float64 
 6   Satisfaccion         24299 non-null  category
 7   Monto                24299 non-null  float64 
 8   Interes              24299 non-null  float64 
 9   Monto_total          24299 non-null  float64 
dtypes: category(2), float64(4), int64(3), object(1)
memory usage: 1.5+ MB


In [13]:
#Monto_total = data_banco['Monto'] + data_banco ['Interes']
#data_banco['Monto_total_2'] = Monto_total


In [14]:
# Tiempo_Servicio_min
# dataframe = dataframe.assign(new_column = lambda_function)
# lambda parameters: expression_using_parameters
# lambda dataframe: expression_using_dataframe_columns
data_banco = data_banco.assign(Tiempo_Servicio_min = lambda df : df['Tiempo_Servicio_seg']/60)
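
The `assign` pattern can also create several columns in one call. The sketch below uses a hypothetical three-row dataframe `toy` with the same column names as `data_banco`; it illustrates the technique and is not a replacement for the cells in this notebook.

```python
import pandas as pd

# Hypothetical rows with the same column names as data_banco
toy = pd.DataFrame({
    "Tiempo_Servicio_seg": [311.0, 156.0, 99.0],
    "Monto": [2889.30, 1670.69, 1764.92],
    "Interes": [335.97, 137.23, 42.55],
})

# Both columns are created in a single call; assign returns a new dataframe
# and leaves `toy` unchanged
toy_ext = toy.assign(
    Monto_total=lambda df: df["Monto"] + df["Interes"],
    Tiempo_Servicio_min=lambda df: df["Tiempo_Servicio_seg"] / 60,
)

print(toy_ext)
print("Columns of the original:", list(toy.columns))
```

Direct assignment with `toy["new_column"] = ...` modifies the dataframe in place, whereas `assign` returns a copy, which makes it convenient inside method chains.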

In [15]:
data_banco['Tiempo_Servicio_min_2']=data_banco['Tiempo_Servicio_seg']/60
data_banco.head(5)

Unnamed: 0,Sucursal,Cajero,ID_Transaccion,Transaccion,Entidad_Bancaria,Tiempo_Servicio_seg,Satisfaccion,Monto,Interes,Monto_total,Tiempo_Servicio_min,Tiempo_Servicio_min_2
0,62,4820,2,Cobro/Pago,Internacional,311.0,Muy Bueno,2889.3,335.97,3225.27,5.183333,5.183333
1,62,4820,2,Cobro/Pago,Nacional,156.0,Malo,1670.69,137.23,1807.92,2.6,2.6
2,62,4820,2,Cobro/Pago,Nacional,248.0,Regular,3172.49,143.3,3315.79,4.133333,4.133333
3,62,4820,2,Cobro/Pago,Nacional,99.0,Regular,1764.92,42.55,1807.47,1.65,1.65
4,62,4820,2,Cobro/Pago,Nacional,123.0,Muy Bueno,1835.69,199.47,2035.16,2.05,2.05


# **3. Descriptive statistics**

## 3.1 Measures of central tendency

In [16]:
# Mean
data_banco['Tiempo_Servicio_min'].mean()

2.5929998872252464

In [17]:
# Median
data_banco['Tiempo_Servicio_min'].median()

2.040871505892167

## 3.2 Measures of position

In [18]:
# Minimum
data_banco['Tiempo_Servicio_min'].min()

0.302196172877495

In [19]:
# Maximum
data_banco['Tiempo_Servicio_min'].max()

26.7116386425825

In [20]:
# Quartiles
# dataframe[column].quantile(np.arange(0, 1.25, 0.25))
data_banco['Tiempo_Servicio_min'].quantile(np.arange(0, 1.25, 0.25))

0.00     0.302196
0.25     1.261520
0.50     2.040872
0.75     3.295508
1.00    26.711639
Name: Tiempo_Servicio_min, dtype: float64

## 3.3 Measures of dispersion

In [21]:
data_banco['Tiempo_Servicio_min'].var()

4.0006305210855775

In [22]:
data_banco['Tiempo_Servicio_min'].std()

2.0001576240600585
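
One detail worth keeping in mind when reading these two numbers: by default pandas computes the sample variance (dividing by n - 1, `ddof=1`), while NumPy's `np.var` computes the population variance (dividing by n, `ddof=0`). The sketch below shows the difference on a hypothetical series of five service times.

```python
import numpy as np
import pandas as pd

# Hypothetical service times in minutes
tiempos = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0])

var_sample = tiempos.var()              # pandas default: ddof=1 (divides by n - 1)
var_population = tiempos.var(ddof=0)    # divides by n
var_numpy = np.var(tiempos.to_numpy())  # NumPy default: ddof=0

print(var_sample, var_population, var_numpy)

# The standard deviation is the square root of the variance
print(tiempos.std(), var_sample ** 0.5)
```

With 24,299 rows the two versions are almost identical, but the gap is noticeable on small groups.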

## 3.4 Summary in Python

In [23]:
data_banco['Tiempo_Servicio_min'].describe()

count    24299.000000
mean         2.593000
std          2.000158
min          0.302196
25%          1.261520
50%          2.040872
75%          3.295508
max         26.711639
Name: Tiempo_Servicio_min, dtype: float64

In [24]:
# For all numeric columns
data_banco.describe()

Unnamed: 0,Sucursal,Cajero,ID_Transaccion,Tiempo_Servicio_seg,Monto,Interes,Monto_total,Tiempo_Servicio_min,Tiempo_Servicio_min_2
count,24299.0,24299.0,24299.0,24299.0,24299.0,24299.0,24299.0,24299.0,24299.0
mean,208.112968,2918.810774,4.433968,155.579993,1996.156149,86.404638,2082.560786,2.593,2.593
std,176.493661,1755.399031,2.9952,120.009457,816.146998,99.591424,856.154931,2.000158,2.000158
min,62.0,56.0,2.0,18.13177,53.82,0.0,59.71,0.302196,0.302196
25%,85.0,472.0,3.0,75.691187,1417.73,0.0,1476.03,1.26152,1.26152
50%,85.0,3678.0,3.0,122.45229,2087.43,50.11,2172.79,2.040872,2.040872
75%,443.0,3983.0,3.0,197.730457,2482.09,147.425,2592.235,3.295508,3.295508
max,586.0,5286.0,10.0,1602.698319,6278.02,756.92,6632.72,26.711639,26.711639
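
Note that the table above only contains numeric columns: with its default arguments, `describe()` skips text and categorical columns whenever the dataframe has at least one numeric column. A minimal sketch of how to include them, using a hypothetical dataframe `toy`:

```python
import pandas as pd

# Hypothetical rows mixing one numeric and one text column
toy = pd.DataFrame({
    "Monto_total": [3225.27, 1807.92, 3315.79, 1807.47],
    "Transaccion": ["Cobro/Pago", "Deposito", "Deposito", "Deposito"],
})

# Default behaviour: numeric columns only
numeric_summary = toy.describe()

# include="all" adds count, unique, top and freq for non-numeric columns;
# statistics that do not apply to a column are shown as NaN
full_summary = toy.describe(include="all")

print(numeric_summary)
print(full_summary)
```

For non-numeric columns, `top` is the most frequent value and `freq` is how many times it appears.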


# **4. Summary tables**

## 4.1 Summary statistics

In [25]:
# data_frame[list of statistic variables + grouping variables].groupby(list of grouping variables).method

data_banco[['Satisfaccion','Tiempo_Servicio_seg']].groupby('Satisfaccion').mean()

Unnamed: 0_level_0,Tiempo_Servicio_seg
Satisfaccion,Unnamed: 1_level_1
Muy Malo,138.094382
Malo,144.352672
Regular,157.659044
Bueno,161.619952
Muy Bueno,164.758238
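
When more than one statistic per group is needed, `.agg()` with named aggregations is an alternative to calling `.mean()` alone. The sketch below uses a hypothetical dataframe `toy`; the output column names (`n`, `media_seg`, `mediana_seg`) are illustrative choices. Passing `observed=False` explicitly is an assumption made here so that categories with no rows still appear in the result, regardless of the default in the installed pandas version.

```python
import pandas as pd

levels = ["Malo", "Regular", "Bueno"]

# Hypothetical rows; no transaction was rated "Regular"
toy = pd.DataFrame({
    "Satisfaccion": pd.Categorical(
        ["Bueno", "Malo", "Bueno", "Malo", "Bueno"],
        categories=levels,
        ordered=True,
    ),
    "Tiempo_Servicio_seg": [120.0, 90.0, 180.0, 110.0, 150.0],
})

summary = (
    toy.groupby("Satisfaccion", observed=False)  # keep categories with no rows
       .agg(
           n=("Tiempo_Servicio_seg", "count"),
           media_seg=("Tiempo_Servicio_seg", "mean"),
           mediana_seg=("Tiempo_Servicio_seg", "median"),
       )
)

print(summary)
```

The rows follow the category order, not alphabetical order, which is one practical benefit of having converted `Satisfaccion` to an ordered categorical.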


In [26]:
data_banco.groupby('Satisfaccion').describe()

Unnamed: 0_level_0,Sucursal,Sucursal,Sucursal,Sucursal,Sucursal,Sucursal,Sucursal,Sucursal,Cajero,Cajero,...,Tiempo_Servicio_min,Tiempo_Servicio_min,Tiempo_Servicio_min_2,Tiempo_Servicio_min_2,Tiempo_Servicio_min_2,Tiempo_Servicio_min_2,Tiempo_Servicio_min_2,Tiempo_Servicio_min_2,Tiempo_Servicio_min_2,Tiempo_Servicio_min_2
Unnamed: 0_level_1,count,mean,std,min,25%,50%,75%,max,count,mean,...,75%,max,count,mean,std,min,25%,50%,75%,max
Satisfaccion,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2
Muy Malo,3009.0,224.932868,184.443997,62.0,85.0,85.0,443.0,586.0,3009.0,2909.480891,...,2.938995,13.721921,3009.0,2.301573,1.702266,0.30249,1.159079,1.85,2.938995,13.721921
Malo,4474.0,212.907689,175.071826,62.0,85.0,85.0,267.0,586.0,4474.0,2909.79392,...,3.062627,22.276138,4474.0,2.405878,1.825518,0.333333,1.195739,1.923507,3.062627,22.276138
Regular,4639.0,208.380685,175.58899,62.0,85.0,85.0,443.0,586.0,4639.0,2917.800604,...,3.341692,15.50357,4639.0,2.627651,2.009998,0.333333,1.283333,2.056655,3.341692,15.50357
Bueno,5915.0,199.975486,172.646539,62.0,85.0,85.0,267.0,586.0,5915.0,2951.393238,...,3.41368,20.800597,5915.0,2.693666,2.099949,0.316781,1.288512,2.104748,3.41368,20.800597
Muy Bueno,6262.0,204.093261,177.262283,62.0,85.0,85.0,443.0,586.0,6262.0,2899.707601,...,3.509022,26.711639,6262.0,2.745971,2.120134,0.302196,1.330326,2.137101,3.509022,26.711639


## 4.2 Frequency tables

In [27]:
data_banco[['Satisfaccion','Entidad_Bancaria','Tiempo_Servicio_seg', 'Monto_total']].groupby(['Satisfaccion','Entidad_Bancaria']).describe()

Unnamed: 0_level_0,Unnamed: 1_level_0,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total
Unnamed: 0_level_1,Unnamed: 1_level_1,count,mean,std,min,25%,50%,75%,max,count,mean,std,min,25%,50%,75%,max
Satisfaccion,Entidad_Bancaria,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2
Muy Malo,Nacional,2562.0,137.798329,101.276232,20.0,70.0,112.026396,175.87981,823.315231,2562.0,2045.868833,824.077783,114.04,1460.525,2132.305,2566.26,6168.01
Muy Malo,Internacional,447.0,139.791224,107.032346,18.149392,66.779243,108.203744,178.919602,647.732891,447.0,2021.013982,814.110237,250.1,1452.93,2144.46,2472.73,4730.47
Malo,Nacional,3840.0,144.071278,109.448714,20.0,71.136946,115.251455,183.77524,1336.568307,3840.0,2044.195422,850.339064,104.33,1433.4625,2143.34,2553.965,6287.65
Malo,Internacional,634.0,146.057012,110.100079,22.0,73.295833,117.380454,180.859295,858.797733,634.0,2061.044763,860.746307,149.66,1438.03,2114.64,2550.2525,5659.76
Regular,Nacional,3974.0,157.943961,121.021344,20.0,77.0,123.637468,200.519707,930.214187,3974.0,2072.724887,865.924109,59.71,1452.23,2168.105,2591.685,6044.42
Regular,Internacional,665.0,155.956394,118.125051,20.0,77.0,121.648149,200.047365,907.700386,665.0,2104.646526,853.354254,199.6,1517.41,2193.0,2615.74,6090.35
Bueno,Nacional,5052.0,161.755885,125.643235,19.006876,78.556383,126.264829,204.384199,1248.035801,5052.0,2103.611625,866.515034,64.37,1485.885,2202.09,2622.425,6632.72
Bueno,Internacional,863.0,160.824201,128.118946,20.0,71.283859,126.418729,209.430861,1060.395312,863.0,2138.000243,893.293968,95.98,1494.32,2183.22,2641.28,5763.46
Muy Bueno,Nacional,5364.0,164.733128,126.579587,18.13177,80.809364,128.699744,211.369865,1602.698319,5364.0,2106.083967,856.33901,88.87,1514.375,2187.58,2610.4,6420.75
Muy Bueno,Internacional,898.0,164.908228,130.971216,21.016823,76.01897,125.568178,206.159985,947.596243,898.0,2112.081704,841.125352,134.19,1546.7775,2199.28,2632.2625,5540.65


In [28]:
data_banco[['Satisfaccion','Entidad_Bancaria','Tiempo_Servicio_seg', 'Monto_total']].groupby(['Satisfaccion','Entidad_Bancaria']).describe().reset_index()

Unnamed: 0_level_0,Satisfaccion,Entidad_Bancaria,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Tiempo_Servicio_seg,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count,mean,std,min,25%,50%,75%,max,count,mean,std,min,25%,50%,75%,max
0,Muy Malo,Nacional,2562.0,137.798329,101.276232,20.0,70.0,112.026396,175.87981,823.315231,2562.0,2045.868833,824.077783,114.04,1460.525,2132.305,2566.26,6168.01
1,Muy Malo,Internacional,447.0,139.791224,107.032346,18.149392,66.779243,108.203744,178.919602,647.732891,447.0,2021.013982,814.110237,250.1,1452.93,2144.46,2472.73,4730.47
2,Malo,Nacional,3840.0,144.071278,109.448714,20.0,71.136946,115.251455,183.77524,1336.568307,3840.0,2044.195422,850.339064,104.33,1433.4625,2143.34,2553.965,6287.65
3,Malo,Internacional,634.0,146.057012,110.100079,22.0,73.295833,117.380454,180.859295,858.797733,634.0,2061.044763,860.746307,149.66,1438.03,2114.64,2550.2525,5659.76
4,Regular,Nacional,3974.0,157.943961,121.021344,20.0,77.0,123.637468,200.519707,930.214187,3974.0,2072.724887,865.924109,59.71,1452.23,2168.105,2591.685,6044.42
5,Regular,Internacional,665.0,155.956394,118.125051,20.0,77.0,121.648149,200.047365,907.700386,665.0,2104.646526,853.354254,199.6,1517.41,2193.0,2615.74,6090.35
6,Bueno,Nacional,5052.0,161.755885,125.643235,19.006876,78.556383,126.264829,204.384199,1248.035801,5052.0,2103.611625,866.515034,64.37,1485.885,2202.09,2622.425,6632.72
7,Bueno,Internacional,863.0,160.824201,128.118946,20.0,71.283859,126.418729,209.430861,1060.395312,863.0,2138.000243,893.293968,95.98,1494.32,2183.22,2641.28,5763.46
8,Muy Bueno,Nacional,5364.0,164.733128,126.579587,18.13177,80.809364,128.699744,211.369865,1602.698319,5364.0,2106.083967,856.33901,88.87,1514.375,2187.58,2610.4,6420.75
9,Muy Bueno,Internacional,898.0,164.908228,130.971216,21.016823,76.01897,125.568178,206.159985,947.596243,898.0,2112.081704,841.125352,134.19,1546.7775,2199.28,2632.2625,5540.65


In [29]:
data_banco[['Entidad_Bancaria','Satisfaccion']].value_counts()

Entidad_Bancaria  Satisfaccion
Nacional          Muy Bueno       5364
                  Bueno           5052
                  Regular         3974
                  Malo            3840
                  Muy Malo        2562
Internacional     Muy Bueno        898
                  Bueno            863
                  Regular          665
                  Malo             634
                  Muy Malo         447
dtype: int64

In [30]:
# Crosstab
pd.crosstab(index = data_banco['Entidad_Bancaria'],
            columns = data_banco['Satisfaccion'],
            normalize = 'all')*100

Satisfaccion,Muy Malo,Malo,Regular,Bueno,Muy Bueno
Entidad_Bancaria,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
Nacional,10.543644,15.803119,16.354582,20.790979,22.074983
Internacional,1.839582,2.609161,2.736738,3.551586,3.695625
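
The `normalize` argument decides which total each cell is divided by. With `'all'`, as above, the whole table sums to 100; `'index'` makes each row sum to 100 and `'columns'` makes each column sum to 100. A minimal sketch on a hypothetical four-row dataframe (the helper `pct` is defined only for this example):

```python
import pandas as pd

# Hypothetical rows
toy = pd.DataFrame({
    "Entidad_Bancaria": ["Nacional", "Nacional", "Nacional", "Internacional"],
    "Satisfaccion": ["Bueno", "Malo", "Bueno", "Bueno"],
})


def pct(normalize):
    """Percentage crosstab of Entidad_Bancaria by Satisfaccion."""
    table = pd.crosstab(
        index=toy["Entidad_Bancaria"],
        columns=toy["Satisfaccion"],
        normalize=normalize,
    )
    return table * 100


pct_all = pct("all")       # share of the grand total
pct_rows = pct("index")    # share within each Entidad_Bancaria
pct_cols = pct("columns")  # share within each Satisfaccion level

print(pct_all)
print(pct_rows)
print(pct_cols)
```

Row percentages are usually the ones to read when the groups have very different sizes, as is the case for Nacional and Internacional in `data_banco`.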


In [31]:
pd.crosstab(index = [data_banco['Entidad_Bancaria'], data_banco['Transaccion']],
            columns = [data_banco['Satisfaccion'], data_banco['Sucursal']],
            normalize = 'all')*100

Unnamed: 0_level_0,Satisfaccion,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Malo,Malo,Malo,Malo,Malo,...,Bueno,Bueno,Bueno,Bueno,Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno
Unnamed: 0_level_1,Sucursal,62,85,267,443,586,62,85,267,443,586,...,62,85,267,443,586,62,85,267,443,586
Entidad_Bancaria,Transaccion,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2,Unnamed: 22_level_2
Nacional,Cambio de cheque,0.283962,0.539117,0.493847,0.271616,0.168731,0.444463,0.98358,0.995926,0.333347,0.234578,...,0.703733,2.020659,0.88481,1.070003,0.349809,0.596732,2.316968,0.843656,1.522696,0.432117
Nacional,Cobro/Pago,0.197539,0.201654,0.230462,0.098769,0.069962,0.279847,0.485617,0.333347,0.193424,0.078193,...,0.362155,1.56385,0.345693,0.604963,0.098769,0.321001,1.81489,0.333347,0.73254,0.15227
Nacional,Deposito,1.004157,3.654471,1.164657,1.185234,0.979464,1.407465,5.543438,1.625581,1.851928,1.012387,...,1.341619,7.226635,1.316927,2.008313,0.893041,1.131734,7.979752,0.925964,2.016544,0.954772
Internacional,Cambio de cheque,0.065846,0.094654,0.098769,0.045269,0.016462,0.057616,0.189308,0.144039,0.057616,0.016462,...,0.082308,0.36627,0.139923,0.172847,0.065846,0.107,0.31277,0.111116,0.218116,0.094654
Internacional,Cobro/Pago,0.024692,0.041154,0.057616,0.0535,0.004115,0.049385,0.102885,0.094654,0.028808,0.012346,...,0.061731,0.308655,0.049385,0.086423,0.020577,0.061731,0.308655,0.078193,0.107,0.028808
Internacional,Deposito,0.197539,0.584386,0.172847,0.242808,0.139923,0.246924,0.794271,0.296308,0.304539,0.214001,...,0.263385,1.246965,0.172847,0.353924,0.1605,0.246924,1.382773,0.148154,0.296308,0.193424


In [32]:
pd.crosstab(index = [data_banco['Entidad_Bancaria'], data_banco['Transaccion']],
            columns = [data_banco['Satisfaccion'], data_banco['Sucursal'], data_banco['Cajero']],
            normalize = 'all')*100

Unnamed: 0_level_0,Satisfaccion,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,...,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno
Unnamed: 0_level_1,Sucursal,62,62,62,62,85,85,85,85,85,85,...,443,586,586,586,586,586,586,586,586,586
Unnamed: 0_level_2,Cajero,4820,5174,5211,5286,70,357,472,3678,3983,4837,...,4208,56,63,87,299,2623,3023,3327,4353,4424
Entidad_Bancaria,Transaccion,Unnamed: 2_level_3,Unnamed: 3_level_3,Unnamed: 4_level_3,Unnamed: 5_level_3,Unnamed: 6_level_3,Unnamed: 7_level_3,Unnamed: 8_level_3,Unnamed: 9_level_3,Unnamed: 10_level_3,Unnamed: 11_level_3,Unnamed: 12_level_3,Unnamed: 13_level_3,Unnamed: 14_level_3,Unnamed: 15_level_3,Unnamed: 16_level_3,Unnamed: 17_level_3,Unnamed: 18_level_3,Unnamed: 19_level_3,Unnamed: 20_level_3,Unnamed: 21_level_3,Unnamed: 22_level_3
Nacional,Cambio de cheque,0.107,0.004115,0.057616,0.115231,0.032923,0.107,0.156385,0.098769,0.090539,0.0535,...,0.637886,0.1605,0.0,0.004115,0.176962,0.020577,0.016462,0.020577,0.008231,0.024692
Nacional,Cobro/Pago,0.061731,0.004115,0.0535,0.078193,0.024692,0.028808,0.069962,0.032923,0.028808,0.016462,...,0.300424,0.078193,0.008231,0.0,0.041154,0.004115,0.012346,0.0,0.004115,0.004115
Nacional,Deposito,0.353924,0.037039,0.300424,0.31277,0.201654,0.584386,0.893041,0.736656,1.049426,0.189308,...,0.695502,0.325116,0.028808,0.008231,0.378616,0.098769,0.045269,0.008231,0.004115,0.057616
Internacional,Cambio de cheque,0.032923,0.0,0.024692,0.008231,0.004115,0.016462,0.057616,0.004115,0.008231,0.004115,...,0.090539,0.032923,0.004115,0.0,0.032923,0.008231,0.004115,0.004115,0.0,0.008231
Internacional,Cobro/Pago,0.012346,0.0,0.004115,0.008231,0.0,0.008231,0.020577,0.0,0.008231,0.004115,...,0.032923,0.020577,0.0,0.0,0.008231,0.0,0.0,0.0,0.0,0.0
Internacional,Deposito,0.086423,0.012346,0.057616,0.041154,0.024692,0.115231,0.111116,0.123462,0.1605,0.049385,...,0.098769,0.057616,0.0,0.004115,0.107,0.012346,0.0,0.0,0.0,0.012346


In [33]:
pd.crosstab(index = data_banco['Entidad_Bancaria'],
            columns = data_banco['Satisfaccion'],
            values = data_banco['Monto_total'],
            aggfunc = ('min','max'))

Unnamed: 0_level_0,max,max,max,max,max,min,min,min,min,min
Satisfaccion,Muy Malo,Malo,Regular,Bueno,Muy Bueno,Muy Malo,Malo,Regular,Bueno,Muy Bueno
Entidad_Bancaria,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2
Nacional,6168.01,6287.65,6044.42,6632.72,6420.75,114.04,104.33,59.71,64.37,88.87
Internacional,4730.47,5659.76,6090.35,5763.46,5540.65,250.1,149.66,199.6,95.98,134.19


In [34]:
pd.crosstab(index = [data_banco['Entidad_Bancaria'], data_banco['Transaccion']],
            columns = [data_banco['Satisfaccion'], data_banco['Sucursal']],
            values = data_banco['Monto_total'],
            aggfunc = ('min','max','std'))

Unnamed: 0_level_0,Unnamed: 1_level_0,max,max,max,max,max,max,max,max,max,max,...,std,std,std,std,std,std,std,std,std,std
Unnamed: 0_level_1,Satisfaccion,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Malo,Malo,Malo,Malo,Malo,...,Bueno,Bueno,Bueno,Bueno,Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno
Unnamed: 0_level_2,Sucursal,62,85,267,443,586,62,85,267,443,586,...,62,85,267,443,586,62,85,267,443,586
Entidad_Bancaria,Transaccion,Unnamed: 2_level_3,Unnamed: 3_level_3,Unnamed: 4_level_3,Unnamed: 5_level_3,Unnamed: 6_level_3,Unnamed: 7_level_3,Unnamed: 8_level_3,Unnamed: 9_level_3,Unnamed: 10_level_3,Unnamed: 11_level_3,Unnamed: 12_level_3,Unnamed: 13_level_3,Unnamed: 14_level_3,Unnamed: 15_level_3,Unnamed: 16_level_3,Unnamed: 17_level_3,Unnamed: 18_level_3,Unnamed: 19_level_3,Unnamed: 20_level_3,Unnamed: 21_level_3,Unnamed: 22_level_3
Nacional,Cambio de cheque,3541.52,5170.97,5203.7,4323.11,3351.01,3673.49,4220.48,4877.58,4472.59,3165.95,...,754.408329,859.589707,806.769993,761.041294,807.628056,689.170953,835.656598,825.662049,924.837577,683.058178
Nacional,Cobro/Pago,3774.48,5010.33,6168.01,4078.59,4085.79,3633.0,5394.94,5903.72,6278.98,4878.16,...,913.439056,931.131984,905.541838,1013.26187,648.171313,706.034321,965.495709,890.576169,1014.143722,805.483312
Nacional,Deposito,3628.1,4877.87,4666.78,4064.2,3804.37,3369.86,6287.65,4337.98,5070.34,3657.51,...,699.13926,821.543042,842.015411,797.184955,690.108107,687.729524,795.696282,804.979969,812.750379,753.188298
Internacional,Cambio de cheque,2743.16,4172.49,4489.88,4303.85,3070.31,3354.14,3853.55,4161.96,3501.57,3061.64,...,683.636571,756.512399,813.332766,765.763904,746.785866,716.49676,803.460504,933.251384,915.765024,737.211725
Internacional,Cobro/Pago,2340.93,4730.47,4202.01,3345.9,1461.78,3240.33,4291.53,4411.27,3756.15,2792.61,...,1117.092798,1060.081102,842.412457,1382.063625,487.969026,689.08598,873.406329,840.504958,997.600985,511.123945
Internacional,Deposito,3659.59,3924.85,4424.05,3875.83,3181.06,3077.51,5659.76,4251.26,4681.38,3000.02,...,672.645614,846.74464,905.434993,832.446288,696.191317,723.35946,793.916482,827.326044,809.157366,737.097962


In [35]:
pd.crosstab(index = [data_banco['Entidad_Bancaria'], data_banco['Transaccion']],
            columns = [data_banco['Satisfaccion'], data_banco['Sucursal']],
            values = data_banco['Monto_total'],
            aggfunc = ('mean'))

Unnamed: 0_level_0,Satisfaccion,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Malo,Malo,Malo,Malo,Malo,...,Bueno,Bueno,Bueno,Bueno,Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno
Unnamed: 0_level_1,Sucursal,62,85,267,443,586,62,85,267,443,586,...,62,85,267,443,586,62,85,267,443,586
Entidad_Bancaria,Transaccion,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2,Unnamed: 22_level_2
Nacional,Cambio de cheque,2024.87087,2250.053817,2268.751833,2320.939545,1880.92122,2007.889074,2234.251757,2191.300331,2236.70037,1847.927368,...,1809.614035,2328.502648,2218.217767,2214.471077,1826.343412,1924.236,2207.671474,2332.191902,2235.710351,1883.52419
Nacional,Cobro/Pago,2046.653125,2842.692449,2727.502857,2207.56875,2146.262353,2085.866912,2669.200085,2631.321975,2956.595106,2150.663684,...,2092.098636,2689.196237,2656.291071,2789.243265,2094.45375,2104.585769,2560.653333,2610.118148,2644.838989,2269.218108
Nacional,Deposito,1812.322664,2050.882601,1997.838304,1968.569757,1802.632143,1699.360205,2048.811403,1982.536709,2011.029267,1639.485447,...,1712.924785,2013.957278,1950.198719,1998.891885,1764.654378,1738.388509,2025.384956,1965.381689,1974.852143,1712.432931
Internacional,Cambio de cheque,1589.861875,2385.933478,2070.034167,2274.514545,1702.435,2183.645714,2017.801087,2504.664857,2091.245714,2571.73,...,2068.782,2128.08618,2271.415882,2117.297143,1831.94875,1704.684615,2228.274079,2027.23037,2309.906415,1776.826522
Internacional,Cobro/Pago,2012.905,2834.87,2452.595,2289.489231,1461.78,2160.758333,2735.8208,2498.276522,2343.275714,2555.476667,...,2236.307333,2693.585867,3258.44,2662.936667,1937.088,2261.479333,2578.015733,2581.548947,2828.108462,2180.43
Internacional,Deposito,1772.318958,2034.199859,2102.794286,1997.837966,1631.284706,1770.0145,2001.742694,2005.56625,1986.784054,1851.356154,...,1723.520156,2087.520924,2163.333333,2134.291628,1573.024359,1837.3005,2010.465685,2102.483889,2098.884306,1856.835106


In [36]:
pd.pivot_table(data_banco,
               index = ['Entidad_Bancaria', 'Transaccion'],
               columns = ['Satisfaccion', 'Sucursal'],
               values = ['Monto_total'],
               aggfunc = ('mean'))

Unnamed: 0_level_0,Unnamed: 1_level_0,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total,Monto_total
Unnamed: 0_level_1,Satisfaccion,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Muy Malo,Malo,Malo,Malo,Malo,Malo,...,Bueno,Bueno,Bueno,Bueno,Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno,Muy Bueno
Unnamed: 0_level_2,Sucursal,62,85,267,443,586,62,85,267,443,586,...,62,85,267,443,586,62,85,267,443,586
Entidad_Bancaria,Transaccion,Unnamed: 2_level_3,Unnamed: 3_level_3,Unnamed: 4_level_3,Unnamed: 5_level_3,Unnamed: 6_level_3,Unnamed: 7_level_3,Unnamed: 8_level_3,Unnamed: 9_level_3,Unnamed: 10_level_3,Unnamed: 11_level_3,Unnamed: 12_level_3,Unnamed: 13_level_3,Unnamed: 14_level_3,Unnamed: 15_level_3,Unnamed: 16_level_3,Unnamed: 17_level_3,Unnamed: 18_level_3,Unnamed: 19_level_3,Unnamed: 20_level_3,Unnamed: 21_level_3,Unnamed: 22_level_3
Nacional,Cambio de cheque,2024.87087,2250.053817,2268.751833,2320.939545,1880.92122,2007.889074,2234.251757,2191.300331,2236.70037,1847.927368,...,1809.614035,2328.502648,2218.217767,2214.471077,1826.343412,1924.236,2207.671474,2332.191902,2235.710351,1883.52419
Nacional,Cobro/Pago,2046.653125,2842.692449,2727.502857,2207.56875,2146.262353,2085.866912,2669.200085,2631.321975,2956.595106,2150.663684,...,2092.098636,2689.196237,2656.291071,2789.243265,2094.45375,2104.585769,2560.653333,2610.118148,2644.838989,2269.218108
Nacional,Deposito,1812.322664,2050.882601,1997.838304,1968.569757,1802.632143,1699.360205,2048.811403,1982.536709,2011.029267,1639.485447,...,1712.924785,2013.957278,1950.198719,1998.891885,1764.654378,1738.388509,2025.384956,1965.381689,1974.852143,1712.432931
Internacional,Cambio de cheque,1589.861875,2385.933478,2070.034167,2274.514545,1702.435,2183.645714,2017.801087,2504.664857,2091.245714,2571.73,...,2068.782,2128.08618,2271.415882,2117.297143,1831.94875,1704.684615,2228.274079,2027.23037,2309.906415,1776.826522
Internacional,Cobro/Pago,2012.905,2834.87,2452.595,2289.489231,1461.78,2160.758333,2735.8208,2498.276522,2343.275714,2555.476667,...,2236.307333,2693.585867,3258.44,2662.936667,1937.088,2261.479333,2578.015733,2581.548947,2828.108462,2180.43
Internacional,Deposito,1772.318958,2034.199859,2102.794286,1997.837966,1631.284706,1770.0145,2001.742694,2005.56625,1986.784054,1851.356154,...,1723.520156,2087.520924,2163.333333,2134.291628,1573.024359,1837.3005,2010.465685,2102.483889,2098.884306,1856.835106
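
The last two cells produce the same means: `pd.crosstab` with `values` and `aggfunc` and `pd.pivot_table` are two routes to the same table. The sketch below reproduces this on a hypothetical dataframe `toy`. It passes `values` as a string rather than a list, which avoids the extra `Monto_total` column level seen in the output above.

```python
import pandas as pd

# Hypothetical rows
toy = pd.DataFrame({
    "Entidad_Bancaria": ["Nacional", "Nacional", "Internacional",
                         "Internacional", "Nacional"],
    "Satisfaccion": ["Bueno", "Malo", "Bueno", "Bueno", "Bueno"],
    "Monto_total": [100.0, 200.0, 300.0, 500.0, 300.0],
})

# crosstab receives the columns themselves (Series)
by_crosstab = pd.crosstab(
    index=toy["Entidad_Bancaria"],
    columns=toy["Satisfaccion"],
    values=toy["Monto_total"],
    aggfunc="mean",
)

# pivot_table receives the dataframe plus column names
by_pivot = pd.pivot_table(
    toy,
    index="Entidad_Bancaria",
    columns="Satisfaccion",
    values="Monto_total",
    aggfunc="mean",
)

print(by_crosstab)
print(by_pivot)
```

Combinations with no rows (here Internacional and Malo) appear as `NaN` in both tables.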


# **5. Merging data**

## 5.1 The merge function/method

The signature of the `merge` function is:

*pd.merge(dataframe1, dataframe2, how = tipo_union, left_on = 'llave_left', right_on = 'llave_right')*

As a DataFrame method, the signature is:

*dataframe1.merge(dataframe2, how = tipo_union, left_on = 'llave_left', right_on = 'llave_right')*

Here `how` selects the join type ('inner', 'left', 'right' or 'outer'), and `left_on` / `right_on` name the key column in each DataFrame.
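
As a minimal sketch of the two call styles, the block below runs the same inner join once through `pd.merge` and once through the `merge` method. The tables `customers` and `purchases` and their columns are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical toy tables, used only for this illustration
customers = pd.DataFrame({'customer_id': [1, 2, 3],
                          'name': ['Ana', 'Luis', 'Marta']})
purchases = pd.DataFrame({'customer': [1, 1, 3],
                          'units': [10, 5, 7]})

# Function form: both DataFrames are passed as arguments
by_function = pd.merge(customers, purchases, how='inner',
                       left_on='customer_id', right_on='customer')

# Method form: the calling DataFrame plays the role of the left table
by_method = customers.merge(purchases, how='inner',
                            left_on='customer_id', right_on='customer')

# Both forms should produce the same result
print(by_function.equals(by_method))
```

Which form to use is mostly a matter of style; the method form chains more naturally when several joins follow one another.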

In [37]:
# Example DataFrames
df1 = pd.DataFrame({
  'Nombre': ['Jose', 'Claudia', 'Ruben', 'Cecilia', 'Patricia'],
  'Edad' : [30,35,29,39,28],
  'Ciudad': ['Santa Tecla', 'Sonsonate', 'Antiguo Cuscatlan', 'Santa Tecla', 'Soyapango']
    })

df2 = pd.DataFrame({
  'Nombre_2': ['Jose', 'Claudia','Rodrigo'],
  'Unidades' : [130,250,30]
    })

df3 = pd.DataFrame({
  'Nombre': ['Jose', 'Ricardo','Lucas'],
  'Unidades' : [130,250,30]
    })

In [38]:
# Inner join
# pd.merge(dataframe1, dataframe2, how = tipo_union, left_on = 'llave_left', right_on = 'llave_right')
pd.merge(df1, df2, how = 'inner', left_on ='Nombre', right_on = 'Nombre_2')

Unnamed: 0,Nombre,Edad,Ciudad,Nombre_2,Unidades
0,Jose,30,Santa Tecla,Jose,130
1,Claudia,35,Sonsonate,Claudia,250


In [41]:
# Right join
pd.merge(df1, df2, how = 'right', left_on ='Nombre', right_on = 'Nombre_2')

Unnamed: 0,Nombre,Edad,Ciudad,Nombre_2,Unidades
0,Jose,30.0,Santa Tecla,Jose,130
1,Claudia,35.0,Sonsonate,Claudia,250
2,,,,Rodrigo,30


In [42]:
# Full join
pd.merge(df1, df2, how = 'outer', left_on ='Nombre', right_on = 'Nombre_2')

Unnamed: 0,Nombre,Edad,Ciudad,Nombre_2,Unidades
0,Jose,30.0,Santa Tecla,Jose,130.0
1,Claudia,35.0,Sonsonate,Claudia,250.0
2,Ruben,29.0,Antiguo Cuscatlan,,
3,Cecilia,39.0,Santa Tecla,,
4,Patricia,28.0,Soyapango,,
5,,,,Rodrigo,30.0
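
The cells above show inner, right and full joins, but not the left join that is used on the real data afterwards. The sketch below fills that gap with the same `df1` and `df2`, repeated here so the block runs on its own. The `indicator=True` argument is an addition that is not part of the original notebook: it appends a `_merge` column stating whether each row was found in both tables or only in the left one.

```python
import pandas as pd

# Same df1 and df2 as in the cells above, repeated so this block is self-contained
df1 = pd.DataFrame({
    'Nombre': ['Jose', 'Claudia', 'Ruben', 'Cecilia', 'Patricia'],
    'Edad': [30, 35, 29, 39, 28],
    'Ciudad': ['Santa Tecla', 'Sonsonate', 'Antiguo Cuscatlan', 'Santa Tecla', 'Soyapango']
})
df2 = pd.DataFrame({
    'Nombre_2': ['Jose', 'Claudia', 'Rodrigo'],
    'Unidades': [130, 250, 30]
})

# Left join: every row of df1 is kept; columns from df2 are NaN where no match exists
left_join = pd.merge(df1, df2, how='left',
                     left_on='Nombre', right_on='Nombre_2',
                     indicator=True)  # adds the _merge column (an addition to the notebook)
print(left_join)
```

Rodrigo appears only in `df2`, so a left join drops him, while Ruben, Cecilia and Patricia stay with missing values in `Nombre_2` and `Unidades`.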


**Returning to the dataset**

Which type of join do we need for our dataset?

In [43]:
data_banco.head(1)

Unnamed: 0,Sucursal,Cajero,ID_Transaccion,Transaccion,Entidad_Bancaria,Tiempo_Servicio_seg,Satisfaccion,Monto,Interes,Monto_total,Tiempo_Servicio_min,Tiempo_Servicio_min_2
0,62,4820,2,Cobro/Pago,Internacional,311.0,Muy Bueno,2889.3,335.97,3225.27,5.183333,5.183333


In [44]:
# Left join of data_banco with data_cajero: every transaction is kept
data_banco_consolidado = pd.merge(data_banco, data_cajero, how ='left', left_on = 'Cajero', right_on='Cajero')

In [45]:
data_banco_consolidado.head()

Unnamed: 0,Sucursal,Cajero,ID_Transaccion,Transaccion,Entidad_Bancaria,Tiempo_Servicio_seg,Satisfaccion,Monto,Interes,Monto_total,Tiempo_Servicio_min,Tiempo_Servicio_min_2,Edad,Sexo,Nivel_Formacion,Ingreso
0,62,4820,2,Cobro/Pago,Internacional,311.0,Muy Bueno,2889.3,335.97,3225.27,5.183333,5.183333,42,F,Bachiller,1995
1,62,4820,2,Cobro/Pago,Nacional,156.0,Malo,1670.69,137.23,1807.92,2.6,2.6,42,F,Bachiller,1995
2,62,4820,2,Cobro/Pago,Nacional,248.0,Regular,3172.49,143.3,3315.79,4.133333,4.133333,42,F,Bachiller,1995
3,62,4820,2,Cobro/Pago,Nacional,99.0,Regular,1764.92,42.55,1807.47,1.65,1.65,42,F,Bachiller,1995
4,62,4820,2,Cobro/Pago,Nacional,123.0,Muy Bueno,1835.69,199.47,2035.16,2.05,2.05,42,F,Bachiller,1995
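
A left join keeps every transaction only if each cashier appears at most once in the lookup table; a duplicated key would multiply rows. One way to guard against that, sketched here on hypothetical miniature tables (`transactions` and `cashiers`, with an invented second cashier ID), is the `validate` argument of `pd.merge` together with a row-count check. This is a suggested safeguard, not something the original notebook does.

```python
import pandas as pd

# Hypothetical miniature stand-ins for data_banco and data_cajero
transactions = pd.DataFrame({
    'Cajero': [4820, 4820, 3678],   # 3678 is an invented ID for illustration
    'Monto': [2889.30, 1670.69, 500.00],
})
cashiers = pd.DataFrame({
    'Cajero': [4820, 3678],
    'Edad': [42, 35],               # invented age for the second cashier
})

# validate='many_to_one' raises pandas.errors.MergeError
# if a Cajero value is duplicated in the right table
consolidated = pd.merge(transactions, cashiers, how='left',
                        on='Cajero', validate='many_to_one')

# A left join against a unique key must preserve the number of rows
assert len(consolidated) == len(transactions)

# Count transactions whose cashier was not found in the lookup table
unmatched = int(consolidated['Edad'].isna().sum())
print(unmatched)
```

When both tables use the same key name, `on='Cajero'` is equivalent to passing `left_on='Cajero', right_on='Cajero'` and avoids the repetition.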


In [46]:
# Rename the columns of a DataFrame
data_sucursal.rename(columns = {'Sucursal':'Nombre_Sucursal', 'ID_Sucursal': 'Sucursal'}, inplace = True)
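
Judging from the rename above, `data_sucursal` appears to store the branch key as `ID_Sucursal` and the branch name as `Sucursal`, whereas `data_banco` uses `Sucursal` for the key. Renaming lets the following merge use a single `on = 'Sucursal'`. The sketch below uses a hypothetical one-row stand-in for `data_sucursal` and shows the non-destructive variant: without `inplace = True`, `rename` returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

# Hypothetical one-row stand-in for data_sucursal
branches = pd.DataFrame({'ID_Sucursal': [62], 'Sucursal': ['Sonsonate']})

# Both renames are applied at once, so the old 'Sucursal' column
# does not collide with the new one
renamed = branches.rename(columns={'Sucursal': 'Nombre_Sucursal',
                                   'ID_Sucursal': 'Sucursal'})

print(list(branches.columns))  # original column names are untouched
print(list(renamed.columns))
```

Assigning the result to a new name makes it safe to re-run the cell, whereas an in-place rename changes the outcome of a second execution because the columns have already been renamed.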

In [47]:
data_banco_final = pd.merge(data_banco_consolidado, data_sucursal, how = 'left', on = 'Sucursal')

In [48]:
data_banco_final.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 24299 entries, 0 to 24298
Data columns (total 19 columns):
 #   Column                 Non-Null Count  Dtype   
---  ------                 --------------  -----   
 0   Sucursal               24299 non-null  int64   
 1   Cajero                 24299 non-null  int64   
 2   ID_Transaccion         24299 non-null  int64   
 3   Transaccion            24299 non-null  object  
 4   Entidad_Bancaria       24299 non-null  category
 5   Tiempo_Servicio_seg    24299 non-null  float64 
 6   Satisfaccion           24299 non-null  category
 7   Monto                  24299 non-null  float64 
 8   Interes                24299 non-null  float64 
 9   Monto_total            24299 non-null  float64 
 10  Tiempo_Servicio_min    24299 non-null  float64 
 11  Tiempo_Servicio_min_2  24299 non-null  float64 
 12  Edad                   24299 non-null  int64   
 13  Sexo                   24299 non-null  object  
 14  Nivel_Formacion        24299 non-null 

In [50]:
data_banco_final.head(5)
# In the end, the three tables have been merged into a single one

Unnamed: 0,Sucursal,Cajero,ID_Transaccion,Transaccion,Entidad_Bancaria,Tiempo_Servicio_seg,Satisfaccion,Monto,Interes,Monto_total,Tiempo_Servicio_min,Tiempo_Servicio_min_2,Edad,Sexo,Nivel_Formacion,Ingreso,Nombre_Sucursal,Nuevo_Sistema,Plataforma
0,62,4820,2,Cobro/Pago,Internacional,311.0,Muy Bueno,2889.3,335.97,3225.27,5.183333,5.183333,42,F,Bachiller,1995,Sonsonate,No,A
1,62,4820,2,Cobro/Pago,Nacional,156.0,Malo,1670.69,137.23,1807.92,2.6,2.6,42,F,Bachiller,1995,Sonsonate,No,A
2,62,4820,2,Cobro/Pago,Nacional,248.0,Regular,3172.49,143.3,3315.79,4.133333,4.133333,42,F,Bachiller,1995,Sonsonate,No,A
3,62,4820,2,Cobro/Pago,Nacional,99.0,Regular,1764.92,42.55,1807.47,1.65,1.65,42,F,Bachiller,1995,Sonsonate,No,A
4,62,4820,2,Cobro/Pago,Nacional,123.0,Muy Bueno,1835.69,199.47,2035.16,2.05,2.05,42,F,Bachiller,1995,Sonsonate,No,A
