## **Data Processing and Feature Creation**

## Data Preparation and Analysis

This section of the notebook covers the setup and preliminary steps for data preparation and analysis. The notebook relies on the following libraries and modules:

- **json**: handling JSON files and data.
- **pandas**: data manipulation and analysis.
- **os**: interacting with the operating system, specifically file and directory management.
- **numpy**: numerical operations and array manipulation.
- **datetime**: working with date and time data.
- **sys**: manipulating the Python runtime environment (here, the module search path).
- **chardet**: character-encoding detection.
- **sweetviz**: generating exploratory data analysis visualizations and reports.
- **utilities_meli**: custom utility functions specific to this project.
- **feature_engine**: provides `RareLabelEncoder`, which groups infrequent categories.

The `module_path` variable sets the path to the project's Python scripts. It is appended to `sys.path` so that the custom modules can be imported and used.
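
The path registration described above can be sketched as a small helper. This is only an illustration: `add_to_sys_path` is a hypothetical name, not part of the project, and the directory passed to it is up to the reader.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory):
    """Append a directory to sys.path once (hypothetical helper, not in the project)."""
    resolved = str(Path(directory).resolve())
    # Avoid duplicate entries when the cell is re-run.
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved


# Example: register the current directory; a second call leaves sys.path unchanged.
registered = add_to_sys_path(".")
size_after_first_call = len(sys.path)
add_to_sys_path(".")
size_after_second_call = len(sys.path)
```

Resolving the path first makes the membership check robust to relative paths, which is one reason to prefer this over appending the raw string.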

In [1]:
import os
import json
import pandas as pd
import numpy as np # type: ignore
from datetime import datetime
import sys

module_path = os.path.abspath(os.path.join('/Users/juanmanuelpaiba/Documents/Juan_Paiba/new_or_used_algorithm_MELI/', 'python_scripts'))
if module_path not in sys.path:
    sys.path.append(module_path)
import utilities_meli # type: ignore
#!pip install feature_engine
from feature_engine.encoding import RareLabelEncoder

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/juanmanuelpaiba/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [2]:
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

os.getcwd()
os.chdir(path="/Users/juanmanuelpaiba/Documents/Juan_Paiba/new_or_used_algorithm_MELI")

- `pd.read_parquet` reads the Parquet file (`df_eda.parquet`) generated in an earlier notebook. This file contains the data that was processed and cleaned specifically for the EDA.


In [3]:
# Read the Parquet file back into a DataFrame
df_products = pd.read_parquet("data/Outputs/df_eda.parquet")

# Display the first few rows of the loaded DataFrame
print("Loaded Training DataFrame head:")
df_products.head()

Loaded Training DataFrame head:


Unnamed: 0,warranty,condition,base_price,seller_id,site_id,listing_type_id,price,buying_mode,parent_item_id,category_id,last_updated,international_delivery_mode,id,official_store_id,differential_pricing,accepts_mercadopago,original_price,currency_id,title,automatic_relist,date_created,stop_time,status,video_id,catalog_product_id,subtitle,initial_quantity,start_time,permalink,sold_quantity,available_quantity,country_name,country_id,state_name,state_id,city_name,city_id,local_pick_up,free_shipping,mode,dimensions,descrip_mdo_0,id_mdo_0,type_mdo_0,season_name,gender_name,target
0,,new,80.0,8208882349,MLA,bronze,80.0,buy_it_now,MLA6553902747,MLA126406,2015-09-05T20:42:58.000Z,none,MLA4695330653,,,True,,ARS,Auriculares Samsung Originales Manos Libres Ca...,False,2015-09-05T20:42:53.000Z,2015-11-04 20:42:53,active,,,,1,2015-09-05 20:42:53,http://articulo.mercadolibre.com.ar/MLA4695330...,0,1,Argentina,AR,Capital Federal,AR-C,San Cristóbal,TUxBQlNBTjkwNTZa,True,False,not_specified,,Transferencia bancaria,MLATB,G,,,1
1,NUESTRA REPUTACION,used,2650.0,8141699488,MLA,silver,2650.0,buy_it_now,MLA7727150374,MLA10267,2015-09-26T18:08:34.000Z,none,MLA7160447179,,,True,,ARS,Cuchillo Daga Acero Carbón Casco Yelmo Solinge...,False,2015-09-26T18:08:30.000Z,2015-11-25 18:08:30,active,,,,1,2015-09-26 18:08:30,http://articulo.mercadolibre.com.ar/MLA7160447...,0,1,Argentina,AR,Capital Federal,AR-C,Buenos Aires,,True,False,me2,,Transferencia bancaria,MLATB,G,,,0
2,,used,60.0,8386096505,MLA,bronze,60.0,buy_it_now,MLA6561247998,MLA1227,2015-09-09T23:57:10.000Z,none,MLA7367189936,,,True,,ARS,"Antigua Revista Billiken, N° 1826, Año 1954",False,2015-09-09T23:57:07.000Z,2015-11-08 23:57:07,active,,,,1,2015-09-09 23:57:07,http://articulo.mercadolibre.com.ar/MLA7367189...,0,1,Argentina,AR,Capital Federal,AR-C,Boedo,TUxBQkJPRTQ0OTRa,True,False,me2,,Transferencia bancaria,MLATB,G,,,0
3,,new,580.0,5377752182,MLA,silver,580.0,buy_it_now,,MLA86345,2015-10-05T16:03:50.306Z,none,MLA9191625553,,,True,,ARS,Alarma Guardtex Gx412 Seguridad Para El Automo...,False,2015-09-28T18:47:56.000Z,2015-12-04 01:13:16,active,,,,1,2015-09-28 18:47:56,http://articulo.mercadolibre.com.ar/MLA9191625...,0,1,Argentina,AR,Capital Federal,AR-C,Floresta,TUxBQkZMTzg5MjFa,True,False,me2,,Transferencia bancaria,MLATB,G,,,1
4,MI REPUTACION.,used,30.0,2938071313,MLA,bronze,30.0,buy_it_now,MLA3133256685,MLA41287,2015-08-28T13:37:41.000Z,none,MLA7787961817,,,True,,ARS,Serenata - Jennifer Blake,False,2015-08-24T22:07:20.000Z,2015-10-23 22:07:20,active,,,,1,2015-08-24 22:07:20,http://articulo.mercadolibre.com.ar/MLA7787961...,0,1,Argentina,AR,Buenos Aires,AR-B,Tres de febrero,TUxBQ1RSRTMxODE5NA,True,False,not_specified,,Transferencia bancaria,MLATB,G,,,0


## Adjustments to the warranty Variable

The cell below applies two functions from the `utilities_meli` module to every value of the `warranty` column of `df_products`: `clean_warranty`, which produces the new `warranty_cleaned` column, and `classify_condition`, which produces `warranty_type`. It then counts and displays the unique values of `warranty_type`.

`clean_warranty` standardizes the free-text warranty terms found in the `warranty` column. It assigns each description to one of the categories '3 meses', '6 meses', '12 meses', 'sin garantía', 'si', 'otros', or 'missing', depending on the keywords found in the warranty string:

 * Null values (NaN) are mapped to 'missing'.
 * The warranty string is lowercased so that comparisons are consistent.
 * Keywords such as 'mes', 'año', 'sí', 'con', and 'sin garantía' determine the category.
 * Any text that matches none of these categories is labeled 'otros'.

This standardization makes it easier to analyze and group products by their warranty conditions later on.
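
The rules listed above can be sketched as follows. This is a hypothetical stand-in, not the actual `utilities_meli.clean_warranty`: the exact keywords, their order of precedence, and the mapping of '1 año' to '12 meses' are assumptions made for illustration.

```python
import pandas as pd


def clean_warranty_sketch(value):
    """Hypothetical approximation of clean_warranty; keyword rules are assumed."""
    # Null values are reported as 'missing'.
    if pd.isna(value):
        return "missing"
    # Lowercase for consistent keyword comparison.
    text = str(value).lower().strip()
    if "sin garant" in text:
        return "sin garantía"
    if "3 mes" in text:
        return "3 meses"
    if "6 mes" in text:
        return "6 meses"
    if "12 mes" in text or "1 año" in text:
        return "12 meses"
    # A bare affirmative, or text starting with 'si'/'sí'/'con', counts as 'si'.
    if text in ("si", "sí") or text.startswith(("si ", "sí ", "con ")):
        return "si"
    return "otros"


examples = pd.Series([None, "Sin garantía", "Garantía de 6 meses", "NUESTRA REPUTACION"])
cleaned = examples.apply(clean_warranty_sketch)
```

Checking 'sin garantía' before the affirmative keywords matters, because a negative phrase can otherwise be caught by a broader positive rule.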



In [4]:
df_products['warranty_cleaned'] = df_products['warranty'].apply(utilities_meli.clean_warranty)
df_products['warranty_type'] = df_products['warranty'].apply(utilities_meli.classify_condition)

df_products['warranty_type'].value_counts()

missing    60896
unknown    36977
new         1373
used         754
Name: warranty_type, dtype: int64

## Adjustments to the title Variable

These lines clean and structure the information in the `title` column of the DataFrame so that it is easier to analyze and manipulate later in the project.

 * The `clean_title` function is intended to normalize product titles into standard categories: 'new', 'used', 'missing' when relevant information is absent, and 'otros' for any other case. The cell below does not call it; `title_type` is built with `classify_condition`, whose output contains only 'new', 'used', and 'unknown'.

 * The `extract_first_two_words` function extracts the first two significant words of each product title. This gives a short summary of the product that is useful for labeling and initial analysis.

 * The `extract_first_word` function extracts the first word of each product title, a quick way to identify general categories or main themes.

 * The `extract_first_three_words` function, also applied below, does the same for the first three words.
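
A minimal sketch of these title helpers follows. The names `first_n_words` and `classify_condition_sketch`, the small stopword list, and the condition keywords are all assumptions made for illustration. The project's own functions live in `utilities_meli` and, judging by the NLTK log above, probably use NLTK's Spanish stopwords.

```python
import re

import pandas as pd

# Tiny illustrative stopword list (assumption; not the project's actual list).
STOPWORDS = {"de", "del", "la", "el", "los", "las", "para", "con", "y", "en"}


def first_n_words(title, n):
    """Return the first n significant lowercase words of a title (hypothetical helper)."""
    if not isinstance(title, str):
        return "missing"
    tokens = re.findall(r"\w+", title.lower())
    significant = [token for token in tokens if token not in STOPWORDS]
    return " ".join(significant[:n])


def classify_condition_sketch(text):
    """Label text as 'new', 'used', 'unknown', or 'missing' (keywords are assumed)."""
    if pd.isna(text):
        return "missing"
    tokens = set(re.findall(r"\w+", str(text).lower()))
    if tokens & {"nuevo", "nueva", "nuevos", "nuevas"}:
        return "new"
    if tokens & {"usado", "usada", "usados", "usadas"}:
        return "used"
    return "unknown"


two_words = first_n_words("Pastilla de Freno Delantera", 2)
three_words = first_n_words("Antigua Revista Billiken, N° 1826, Año 1954", 3)
label = classify_condition_sketch("Campera Cuero Usada")
```

Matching whole tokens rather than substrings avoids false positives such as a keyword appearing inside a longer word.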

In [5]:
df_products['first_two_words_title'] = df_products['title'].apply(utilities_meli.extract_first_two_words)
df_products['first_word_title'] = df_products['title'].apply(utilities_meli.extract_first_word)
df_products['first_three_words_title'] = df_products['title'].apply(utilities_meli.extract_first_three_words)
df_products['title_type'] = df_products['title'].apply(utilities_meli.classify_condition)

In [6]:
df_products['title_type'].value_counts()

unknown    92937
new         4786
used        2277
Name: title_type, dtype: int64

## Adjustments to the date_created Variable

First, the rare values in the `base_price` column (such as 11111111 or 123456789) are replaced. Then the `date_created` column is converted to datetime, and a new column, `date_created_month`, stores the year and month of creation of each product in compact form (`%Y%m`) for later analysis.

The cell also derives further features from `date_created` in the `df_products_00` DataFrame. It extracts the month and the day of the week into the `month` and `weekday` columns, and creates `year_month`, a text column holding the year and month (`%Y-%m`). It then concatenates `year_month` with `status`, `listing_type_id`, `state_id`, and the boolean variables converted to strings (`automatic_relist`, `accepts_mercadopago`, `local_pick_up`, `free_shipping`), producing combined variables that support temporal analysis and categorization. Finally, it drops columns that are no longer needed, such as `permalink`, `seller_id`, `title`, `warranty`, and `condition`.
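
The transformations described above can be tried on a toy DataFrame. In this sketch, `replace_rare_values_sketch` is a hypothetical stand-in for `utilities_meli.replace_rare_values`; replacing the rare values with NaN is an assumption, since the notebook does not show what the real function substitutes.

```python
import numpy as np
import pandas as pd


def replace_rare_values_sketch(df, column, rare_values, replacement=np.nan):
    """Hypothetical stand-in: replace listed values in one column (NaN is assumed)."""
    out = df.copy()
    out[column] = out[column].replace(rare_values, replacement)
    return out


toy = pd.DataFrame({
    "base_price": [80.0, 11111111.0, 60.0],
    "date_created": [
        "2015-09-05T20:42:53.000Z",
        "2015-09-26T18:08:30.000Z",
        "2015-08-24T22:07:20.000Z",
    ],
    "status": ["active", "active", "paused"],
    "free_shipping": [False, True, False],
})

toy = replace_rare_values_sketch(toy, "base_price", [11111111])

# Convert to datetime, then derive calendar features.
toy["date_created"] = pd.to_datetime(toy["date_created"])
toy["month"] = toy["date_created"].dt.month
toy["weekday"] = toy["date_created"].dt.weekday  # Monday=0 ... Sunday=6
toy["year_month"] = toy["date_created"].dt.strftime("%Y-%m")

# Combine the period with a categorical and a boolean (cast to string) variable.
toy["concat_status"] = toy["year_month"] + "_" + toy["status"]
toy["concat_var_freesh"] = toy["year_month"] + "_" + toy["free_shipping"].astype(str)
```

The boolean column has to be cast to string before concatenation, because adding a string Series to a boolean Series raises a `TypeError`.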

In [7]:
# List of rare values to replace
rare_values_to_replace = [-2147483648, 11111111, 1111111111, 8888888, 9000000, 123456789, 112111111]

# Replace rare values in 'base_price' column
df_products_00 = utilities_meli.replace_rare_values(df_products, 'base_price', rare_values_to_replace)

# Year and month of 'date_created' in compact form (e.g. '201509')
df_products_00['date_created_month'] = pd.to_datetime(df_products_00['date_created']).dt.strftime('%Y%m')
# Convert 'date_created' to datetime
df_products_00['date_created'] = pd.to_datetime(df_products_00['date_created'])
# Month and weekday variables
df_products_00['month'] = df_products_00['date_created'].dt.month
df_products_00['weekday'] = df_products_00['date_created'].dt.weekday
# 'year_month' as text
df_products_00['year_month'] = df_products_00['date_created'].dt.strftime('%Y-%m')
# status
df_products_00['concat_status'] = df_products_00['year_month'] + '_' + df_products_00['status']
# listing_type_id
df_products_00['concat_var_lt'] = df_products_00['year_month'] + '_' + df_products_00['listing_type_id']
# state_id
df_products_00['concat_var_state'] = df_products_00['year_month'] + '_' + df_products_00['state_id']
# automatic_relist
df_products_00['automatic_relist_str'] = df_products_00['automatic_relist'].astype(str)
df_products_00['concat_var_autrelist'] = df_products_00['year_month'] + '_' + df_products_00['automatic_relist_str']
# accepts_mercadopago
df_products_00['accepts_mercadopago_str'] = df_products_00['accepts_mercadopago'].astype(str)
df_products_00['concat_var_accmdopag'] = df_products_00['year_month'] + '_' + df_products_00['accepts_mercadopago_str']
# local_pick_up
df_products_00['local_pick_up_str'] = df_products_00['local_pick_up'].astype(str)
df_products_00['concat_var_localpu'] = df_products_00['year_month'] + '_' + df_products_00['local_pick_up_str']
# free_shipping
df_products_00['free_shipping_str'] = df_products_00['free_shipping'].astype(str)
df_products_00['concat_var_freesh'] = df_products_00['year_month'] + '_' + df_products_00['free_shipping_str']

df_products_00 = df_products_00.drop(columns=[
    'permalink', 'seller_id', 'warranty', 'condition', 'site_id',
    'international_delivery_mode', 'parent_item_id', 'last_updated', 'id',
    'title', 'catalog_product_id', 'dimensions', 'city_name', 'stop_time',
    'start_time'])


## Handling and Removing Variables with Null Values

This block performs three steps:

 * It computes the percentage of missing values for each variable in `df_products_00` with `(df_products_00.isnull().sum() / len(df_products_00)) * 100` and prints the result in descending order with `print(miss_perc.sort_values(ascending=False))`.
 * It drops the columns `differential_pricing`, `original_price`, and `official_store_id`, which are almost entirely null, together with `date_created`, whose information is already captured by the derived date features.
 * It recomputes the null counts with `vl_nul_column = df_products_00.isnull().sum()` and prints, via `print(column_nulos)`, the variables that still contain nulls along with the number of missing values in each.

Handling incomplete variables explicitly at this stage keeps them from distorting later analysis and modeling.
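
The same missing-value summary can be reproduced on a small example. The toy data below is invented for illustration; only the pandas calls mirror the notebook.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "subtitle": [np.nan, np.nan, np.nan, np.nan],
    "video_id": ["abc", np.nan, np.nan, np.nan],
    "price": [80.0, 2650.0, 60.0, 580.0],
})

# Percentage of missing values per column; the mean of a boolean mask
# equals isnull().sum() / len(df).
miss_perc = toy.isnull().mean() * 100
print(miss_perc.sort_values(ascending=False))

# Keep only the columns that still contain at least one null.
null_counts = toy.isnull().sum()
columns_with_nulls = null_counts[null_counts > 0]
print(columns_with_nulls)
```

Using `.mean()` on the boolean mask is a compact equivalent of dividing the null count by the number of rows.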

In [8]:
#############################################
# Percentage of missing values per variable
#############################################
miss_perc = (df_products_00.isnull().sum() / len(df_products_00)) * 100
print(miss_perc.sort_values(ascending=False))

differential_pricing       100.000
subtitle                   100.000
original_price              99.857
official_store_id           99.182
video_id                    97.015
gender_name                 89.115
season_name                 87.575
type_mdo_0                  30.559
id_mdo_0                    30.559
descrip_mdo_0               30.559
concat_var_autrelist         0.000
title_type                   0.000
free_shipping_str            0.000
concat_var_localpu           0.000
local_pick_up_str            0.000
warranty_cleaned             0.000
warranty_type                0.000
first_two_words_title        0.000
first_word_title             0.000
first_three_words_title      0.000
date_created_month           0.000
automatic_relist_str         0.000
month                        0.000
weekday                      0.000
year_month                   0.000
concat_status                0.000
target                       0.000
concat_var_accmdopag         0.000
accepts_mercadopago_

In [9]:
df_products_00 = df_products_00.drop(columns=['differential_pricing','original_price',
                                              'official_store_id','date_created'])

In [10]:
# Number of null values in each column
vl_nul_column = df_products_00.isnull().sum()

# Columns that have null values
column_nulos = vl_nul_column[vl_nul_column > 0]
print("Variables with null values:")
print(column_nulos)

Variables with null values:
video_id          97015
subtitle         100000
descrip_mdo_0     30559
id_mdo_0          30559
type_mdo_0        30559
season_name       87575
gender_name       89115
dtype: int64


In [11]:
col_categoricas = df_products_00.select_dtypes('object').columns
col_categoricas

Index(['listing_type_id', 'buying_mode', 'category_id', 'currency_id',
       'status', 'video_id', 'country_name', 'country_id', 'state_name',
       'state_id', 'city_id', 'mode', 'descrip_mdo_0', 'id_mdo_0',
       'type_mdo_0', 'season_name', 'gender_name', 'warranty_cleaned',
       'warranty_type', 'first_two_words_title', 'first_word_title',
       'first_three_words_title', 'title_type', 'date_created_month',
       'year_month', 'concat_status', 'concat_var_lt', 'concat_var_state',
       'automatic_relist_str', 'concat_var_autrelist',
       'accepts_mercadopago_str', 'concat_var_accmdopag', 'local_pick_up_str',
       'concat_var_localpu', 'free_shipping_str', 'concat_var_freesh'],
      dtype='object')

In [12]:
df_products_00.first_two_words_title.value_counts()

kit x2                                                     295
50 suspensores                                             180
samsung galaxy                                             158
kit imprimible                                             143
manoenpez vinilo                                           136
lpr pastilla                                               122
disco vinilo                                               116
pastilla freno                                              94
cartas magic                                                87
hot wheels                                                  85
pastillas freno                                             85
bomba agua                                                  84
12 suspensores                                              81
libro digital                                               78
campera cuero                                               77
faro trasero                                           

## New Variables from Groupings and Measures of Central Tendency

The `calculate_group_stats` function computes eight statistics (mean, minimum, maximum, variance, median, standard deviation, first quartile, and third quartile) of one column, grouped by another column of a DataFrame. It groups the data by `group_column`, selects `value_column`, and uses group-wise transforms to build one new column per statistic, each aligned with the rows of the original DataFrame `df`. The new columns are then concatenated to the original DataFrame, and the extended DataFrame is returned.

In the cell below the function is applied to `base_price`, `initial_quantity`, and `sold_quantity` for each of twelve grouping columns, which yields names such as `mean_base_price_mode` and `q3_sold_quantity_warranty_type`.
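
A minimal sketch of such a function, assuming it is built on `groupby(...).transform`, is shown below. `calculate_group_stats_sketch` is a hypothetical reimplementation, not the code in `utilities_meli`. The column naming pattern `{stat}_{value_column}_{group_column}` is taken from the output of the notebook.

```python
import pandas as pd


def calculate_group_stats_sketch(df, group_column, value_column):
    """Hypothetical reimplementation: add per-group statistics as new columns."""
    grouped = df.groupby(group_column)[value_column]
    suffix = f"{value_column}_{group_column}"
    # transform() returns one value per original row, so the results align with df.
    stats = pd.DataFrame({
        f"mean_{suffix}": grouped.transform("mean"),
        f"min_{suffix}": grouped.transform("min"),
        f"max_{suffix}": grouped.transform("max"),
        f"var_{suffix}": grouped.transform("var"),
        f"median_{suffix}": grouped.transform("median"),
        f"std_{suffix}": grouped.transform("std"),
        f"q1_{suffix}": grouped.transform(lambda s: s.quantile(0.25)),
        f"q3_{suffix}": grouped.transform(lambda s: s.quantile(0.75)),
    }, index=df.index)
    return pd.concat([df, stats], axis=1)


toy = pd.DataFrame({
    "mode": ["me2", "me2", "not_specified"],
    "base_price": [100.0, 300.0, 80.0],
})
result = calculate_group_stats_sketch(toy, "mode", "base_price")
```

With eight statistics per call, looping over twelve grouping columns and three value columns adds 12 × 3 × 8 = 288 columns, which explains the width of the `head()` output. Groups with a single row get NaN for variance and standard deviation, because pandas uses the sample formulas (`ddof=1`) by default.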

In [13]:
columns_to_group = ['mode', 'status', 'listing_type_id', 'state_id', 'automatic_relist',
                    'accepts_mercadopago', 'local_pick_up', 'free_shipping', 'warranty_cleaned', 'weekday','title_type','warranty_type']

value_column = 'base_price'
value_column_1 = 'initial_quantity'
value_column_2 = 'sold_quantity'

for column in columns_to_group:
    df_products_00 = utilities_meli.calculate_group_stats(df_products_00, column, value_column)
    df_products_00 = utilities_meli.calculate_group_stats(df_products_00, column, value_column_1)
    df_products_00 = utilities_meli.calculate_group_stats(df_products_00, column, value_column_2)

In [14]:
df_products_00.head()

Unnamed: 0,base_price,listing_type_id,price,buying_mode,category_id,accepts_mercadopago,currency_id,automatic_relist,status,video_id,subtitle,initial_quantity,sold_quantity,available_quantity,country_name,country_id,state_name,state_id,city_id,local_pick_up,free_shipping,mode,descrip_mdo_0,id_mdo_0,type_mdo_0,season_name,gender_name,target,warranty_cleaned,warranty_type,first_two_words_title,first_word_title,first_three_words_title,title_type,date_created_month,month,weekday,year_month,concat_status,concat_var_lt,concat_var_state,automatic_relist_str,concat_var_autrelist,accepts_mercadopago_str,concat_var_accmdopag,local_pick_up_str,concat_var_localpu,free_shipping_str,concat_var_freesh,mean_base_price_mode,min_base_price_mode,max_base_price_mode,var_base_price_mode,median_base_price_mode,std_base_price_mode,q1_base_price_mode,q3_base_price_mode,mean_initial_quantity_mode,min_initial_quantity_mode,max_initial_quantity_mode,var_initial_quantity_mode,median_initial_quantity_mode,std_initial_quantity_mode,q1_initial_quantity_mode,q3_initial_quantity_mode,mean_sold_quantity_mode,min_sold_quantity_mode,max_sold_quantity_mode,var_sold_quantity_mode,median_sold_quantity_mode,std_sold_quantity_mode,q1_sold_quantity_mode,q3_sold_quantity_mode,mean_base_price_status,min_base_price_status,max_base_price_status,var_base_price_status,median_base_price_status,std_base_price_status,q1_base_price_status,q3_base_price_status,mean_initial_quantity_status,min_initial_quantity_status,max_initial_quantity_status,var_initial_quantity_status,median_initial_quantity_status,std_initial_quantity_status,q1_initial_quantity_status,q3_initial_quantity_status,mean_sold_quantity_status,min_sold_quantity_status,max_sold_quantity_status,var_sold_quantity_status,median_sold_quantity_status,std_sold_quantity_status,q1_sold_quantity_status,q3_sold_quantity_status,mean_base_price_listing_type_id,min_base_price_listing_type_id,max_base_price_listing_type_id,var_base_price_listing_type_id,median_base_pri
ce_listing_type_id,std_base_price_listing_type_id,q1_base_price_listing_type_id,q3_base_price_listing_type_id,mean_initial_quantity_listing_type_id,min_initial_quantity_listing_type_id,max_initial_quantity_listing_type_id,var_initial_quantity_listing_type_id,median_initial_quantity_listing_type_id,std_initial_quantity_listing_type_id,q1_initial_quantity_listing_type_id,q3_initial_quantity_listing_type_id,mean_sold_quantity_listing_type_id,min_sold_quantity_listing_type_id,max_sold_quantity_listing_type_id,var_sold_quantity_listing_type_id,median_sold_quantity_listing_type_id,std_sold_quantity_listing_type_id,q1_sold_quantity_listing_type_id,q3_sold_quantity_listing_type_id,mean_base_price_state_id,min_base_price_state_id,max_base_price_state_id,var_base_price_state_id,median_base_price_state_id,std_base_price_state_id,q1_base_price_state_id,q3_base_price_state_id,mean_initial_quantity_state_id,min_initial_quantity_state_id,max_initial_quantity_state_id,var_initial_quantity_state_id,median_initial_quantity_state_id,std_initial_quantity_state_id,q1_initial_quantity_state_id,q3_initial_quantity_state_id,mean_sold_quantity_state_id,min_sold_quantity_state_id,max_sold_quantity_state_id,var_sold_quantity_state_id,median_sold_quantity_state_id,std_sold_quantity_state_id,q1_sold_quantity_state_id,q3_sold_quantity_state_id,mean_base_price_automatic_relist,min_base_price_automatic_relist,max_base_price_automatic_relist,var_base_price_automatic_relist,median_base_price_automatic_relist,std_base_price_automatic_relist,q1_base_price_automatic_relist,q3_base_price_automatic_relist,mean_initial_quantity_automatic_relist,min_initial_quantity_automatic_relist,max_initial_quantity_automatic_relist,var_initial_quantity_automatic_relist,median_initial_quantity_automatic_relist,std_initial_quantity_automatic_relist,q1_initial_quantity_automatic_relist,q3_initial_quantity_automatic_relist,mean_sold_quantity_automatic_relist,min_sold_quantity_automatic_relist,max_sold_quantity_automatic_r
elist,var_sold_quantity_automatic_relist,median_sold_quantity_automatic_relist,std_sold_quantity_automatic_relist,q1_sold_quantity_automatic_relist,q3_sold_quantity_automatic_relist,mean_base_price_accepts_mercadopago,min_base_price_accepts_mercadopago,max_base_price_accepts_mercadopago,var_base_price_accepts_mercadopago,median_base_price_accepts_mercadopago,std_base_price_accepts_mercadopago,q1_base_price_accepts_mercadopago,q3_base_price_accepts_mercadopago,mean_initial_quantity_accepts_mercadopago,min_initial_quantity_accepts_mercadopago,max_initial_quantity_accepts_mercadopago,var_initial_quantity_accepts_mercadopago,median_initial_quantity_accepts_mercadopago,std_initial_quantity_accepts_mercadopago,q1_initial_quantity_accepts_mercadopago,q3_initial_quantity_accepts_mercadopago,mean_sold_quantity_accepts_mercadopago,min_sold_quantity_accepts_mercadopago,max_sold_quantity_accepts_mercadopago,var_sold_quantity_accepts_mercadopago,median_sold_quantity_accepts_mercadopago,std_sold_quantity_accepts_mercadopago,q1_sold_quantity_accepts_mercadopago,q3_sold_quantity_accepts_mercadopago,mean_base_price_local_pick_up,min_base_price_local_pick_up,max_base_price_local_pick_up,var_base_price_local_pick_up,median_base_price_local_pick_up,std_base_price_local_pick_up,q1_base_price_local_pick_up,q3_base_price_local_pick_up,mean_initial_quantity_local_pick_up,min_initial_quantity_local_pick_up,max_initial_quantity_local_pick_up,var_initial_quantity_local_pick_up,median_initial_quantity_local_pick_up,std_initial_quantity_local_pick_up,q1_initial_quantity_local_pick_up,q3_initial_quantity_local_pick_up,mean_sold_quantity_local_pick_up,min_sold_quantity_local_pick_up,max_sold_quantity_local_pick_up,var_sold_quantity_local_pick_up,median_sold_quantity_local_pick_up,std_sold_quantity_local_pick_up,q1_sold_quantity_local_pick_up,q3_sold_quantity_local_pick_up,mean_base_price_free_shipping,min_base_price_free_shipping,max_base_price_free_shipping,var_base_price_free_shipping,median_ba
se_price_free_shipping,std_base_price_free_shipping,q1_base_price_free_shipping,q3_base_price_free_shipping,mean_initial_quantity_free_shipping,min_initial_quantity_free_shipping,max_initial_quantity_free_shipping,var_initial_quantity_free_shipping,median_initial_quantity_free_shipping,std_initial_quantity_free_shipping,q1_initial_quantity_free_shipping,q3_initial_quantity_free_shipping,mean_sold_quantity_free_shipping,min_sold_quantity_free_shipping,max_sold_quantity_free_shipping,var_sold_quantity_free_shipping,median_sold_quantity_free_shipping,std_sold_quantity_free_shipping,q1_sold_quantity_free_shipping,q3_sold_quantity_free_shipping,mean_base_price_warranty_cleaned,min_base_price_warranty_cleaned,max_base_price_warranty_cleaned,var_base_price_warranty_cleaned,median_base_price_warranty_cleaned,std_base_price_warranty_cleaned,q1_base_price_warranty_cleaned,q3_base_price_warranty_cleaned,mean_initial_quantity_warranty_cleaned,min_initial_quantity_warranty_cleaned,max_initial_quantity_warranty_cleaned,var_initial_quantity_warranty_cleaned,median_initial_quantity_warranty_cleaned,std_initial_quantity_warranty_cleaned,q1_initial_quantity_warranty_cleaned,q3_initial_quantity_warranty_cleaned,mean_sold_quantity_warranty_cleaned,min_sold_quantity_warranty_cleaned,max_sold_quantity_warranty_cleaned,var_sold_quantity_warranty_cleaned,median_sold_quantity_warranty_cleaned,std_sold_quantity_warranty_cleaned,q1_sold_quantity_warranty_cleaned,q3_sold_quantity_warranty_cleaned,mean_base_price_weekday,min_base_price_weekday,max_base_price_weekday,var_base_price_weekday,median_base_price_weekday,std_base_price_weekday,q1_base_price_weekday,q3_base_price_weekday,mean_initial_quantity_weekday,min_initial_quantity_weekday,max_initial_quantity_weekday,var_initial_quantity_weekday,median_initial_quantity_weekday,std_initial_quantity_weekday,q1_initial_quantity_weekday,q3_initial_quantity_weekday,mean_sold_quantity_weekday,min_sold_quantity_weekday,max_sold_quantity_weekday,var_sol
d_quantity_weekday,median_sold_quantity_weekday,std_sold_quantity_weekday,q1_sold_quantity_weekday,q3_sold_quantity_weekday,mean_base_price_title_type,min_base_price_title_type,max_base_price_title_type,var_base_price_title_type,median_base_price_title_type,std_base_price_title_type,q1_base_price_title_type,q3_base_price_title_type,mean_initial_quantity_title_type,min_initial_quantity_title_type,max_initial_quantity_title_type,var_initial_quantity_title_type,median_initial_quantity_title_type,std_initial_quantity_title_type,q1_initial_quantity_title_type,q3_initial_quantity_title_type,mean_sold_quantity_title_type,min_sold_quantity_title_type,max_sold_quantity_title_type,var_sold_quantity_title_type,median_sold_quantity_title_type,std_sold_quantity_title_type,q1_sold_quantity_title_type,q3_sold_quantity_title_type,mean_base_price_warranty_type,min_base_price_warranty_type,max_base_price_warranty_type,var_base_price_warranty_type,median_base_price_warranty_type,std_base_price_warranty_type,q1_base_price_warranty_type,q3_base_price_warranty_type,mean_initial_quantity_warranty_type,min_initial_quantity_warranty_type,max_initial_quantity_warranty_type,var_initial_quantity_warranty_type,median_initial_quantity_warranty_type,std_initial_quantity_warranty_type,q1_initial_quantity_warranty_type,q3_initial_quantity_warranty_type,mean_sold_quantity_warranty_type,min_sold_quantity_warranty_type,max_sold_quantity_warranty_type,var_sold_quantity_warranty_type,median_sold_quantity_warranty_type,std_sold_quantity_warranty_type,q1_sold_quantity_warranty_type,q3_sold_quantity_warranty_type
0,80.0,bronze,80.0,buy_it_now,MLA126406,True,ARS,False,active,,,1,0,1,Argentina,AR,Capital Federal,AR-C,TUxBQlNBTjkwNTZa,True,False,not_specified,Transferencia bancaria,MLATB,G,,,1,missing,missing,auriculares samsung,auriculares,auriculares samsung originales,new,201509,9,5,2015-09,2015-09_active,2015-09_bronze,2015-09_AR-C,False,2015-09_False,True,2015-09_True,True,2015-09_True,False,2015-09_False,59757.420878,1.0,2222222000.0,109329100000000.0,350.0,10456050.0,100.0,1500.0,27.159948,1,9999,106143.142293,1.0,325.796167,1.0,1.0,1.848199,0,2606,716.946028,0.0,26.775848,0.0,0.0,5174.568608,0.84,6500000.0,2879726000.0,250.0,53663.074155,90.0,800.0,34.051664,1,9999,174780.468169,1.0,418.06754,1.0,2.0,2.354534,0,8676,1872.264296,0.0,43.269669,0.0,0.0,36286.220461,0.84,2222222000.0,78174290000000.0,198.0,8841622.0,80.0,570.0,33.546161,1,9999,139594.226027,1.0,373.623107,1.0,3.0,0.607393,0,418,20.901961,0.0,4.571866,0.0,0.0,3422.331883,0.84,5330000.0,2192546000.0,200.0,46824.63,80.0,600.1875,43.14827,1,9999,231224.476175,1.0,480.858062,1.0,3.0,2.857901,0,8676,2829.409344,0.0,53.192193,0.0,0.0,28707.192126,0.84,2222222000.0,51819250000000.0,240.0,7198559.0,90.0,790.0,29.555743,1,9999,143619.380848,1.0,378.971478,1.0,2.0,2.067133,0,8676,1689.833881,0.0,41.107589,0.0,0.0,24059.579477,0.84,2222222000.0,50503410000000.0,240.0,7106575.0,90.0,750.0,35.86707,1,9999,181301.913849,1.0,425.795625,1.0,3.0,2.451386,0,8676,1863.230928,0.0,43.165159,0.0,0.0,29267.464656,0.84,2222222000.0,62068990000000.0,249.9,7878388.0,95.0,760.0,36.248124,1,9999,203978.474193,1.0,451.639762,1.0,3.0,2.665703,0,8676,2148.890092,0.0,46.356122,0.0,0.0,28213.54735,0.84,2222222000.0,50921090000000.0,239.9,7135901.0,90.0,780.0,31.741638,1,9999,152024.035565,1.0,389.902598,1.0,2.0,2.255589,0,8676,1839.791329,0.0,42.892789,0.0,0.0,7704.202487,0.84,6500000.0,4583176000.0,265.0,67699.159408,100.0,890.0,18.568001,1,9999,43605.488011,1.0,208.819271,1.0,1.0,1.601402,0,8676,1839.508069,0.0,42.889487,0.0,0.0,4449.1637
63,1.0,2004105.0,1768778000.0,230.0,42056.845548,85.0,728.4,27.287118,1,9999,137730.231544,1.0,371.120239,1.0,2.0,2.191097,0,2299,1116.765369,0.0,33.41804,0.0,0.0,5160.768991,1.0,1559237.0,2240846000.0,250.0,47337.57,109.0,750.0,13.186586,1,9999,64945.557028,1.0,254.844182,1.0,3.0,1.492896,0,907,364.446031,0.0,19.09047,0.0,0.0,7704.202487,0.84,6500000.0,4583176000.0,265.0,67699.16,100.0,890.0,18.568001,1,9999,43605.488011,1.0,208.819271,1.0,1.0,1.601402,0,8676,1839.508069,0.0,42.889487,0.0,0.0
1,2650.0,silver,2650.0,buy_it_now,MLA10267,True,ARS,False,active,,,1,0,1,Argentina,AR,Capital Federal,AR-C,,True,False,me2,Transferencia bancaria,MLATB,G,,,0,otros,unknown,cuchillo daga,cuchillo,cuchillo daga acero,unknown,201509,9,5,2015-09,2015-09_active,2015-09_silver,2015-09_AR-C,False,2015-09_False,True,2015-09_True,True,2015-09_True,False,2015-09_False,740.527782,0.84,239999.0,9144042.0,199.99,3023.912,82.0,508.25,41.108282,1,9999,231641.066864,1.0,481.291042,1.0,3.0,2.76137,0,8676,2768.851374,0.0,52.619876,0.0,0.0,5174.568608,0.84,6500000.0,2879726000.0,250.0,53663.074155,90.0,800.0,34.051664,1,9999,174780.468169,1.0,418.06754,1.0,2.0,2.354534,0,8676,1872.264296,0.0,43.269669,0.0,0.0,28727.390143,1.0,6500000.0,21082240000.0,525.0,145197.3,200.0,1999.0,23.78758,1,9999,156604.280342,1.0,395.732587,1.0,4.0,7.179065,0,8676,8981.900117,0.0,94.772887,0.0,3.0,3422.331883,0.84,5330000.0,2192546000.0,200.0,46824.63,80.0,600.1875,43.14827,1,9999,231224.476175,1.0,480.858062,1.0,3.0,2.857901,0,8676,2829.409344,0.0,53.192193,0.0,0.0,28707.192126,0.84,2222222000.0,51819250000000.0,240.0,7198559.0,90.0,790.0,29.555743,1,9999,143619.380848,1.0,378.971478,1.0,2.0,2.067133,0,8676,1689.833881,0.0,41.107589,0.0,0.0,24059.579477,0.84,2222222000.0,50503410000000.0,240.0,7106575.0,90.0,750.0,35.86707,1,9999,181301.913849,1.0,425.795625,1.0,3.0,2.451386,0,8676,1863.230928,0.0,43.165159,0.0,0.0,29267.464656,0.84,2222222000.0,62068990000000.0,249.9,7878388.0,95.0,760.0,36.248124,1,9999,203978.474193,1.0,451.639762,1.0,3.0,2.665703,0,8676,2148.890092,0.0,46.356122,0.0,0.0,28213.54735,0.84,2222222000.0,50921090000000.0,239.9,7135901.0,90.0,780.0,31.741638,1,9999,152024.035565,1.0,389.902598,1.0,2.0,2.255589,0,8676,1839.791329,0.0,42.892789,0.0,0.0,984.394862,1.0,1000000.0,82512650.0,199.0,9083.647509,79.0,589.99,48.190886,1,9999,249118.281844,1.0,499.117503,1.0,4.0,3.292268,0,1373,637.955845,0.0,25.257788,0.0,0.0,4449.163763,1.0,2004105.0,1768778000.0,230.0,42056.845548,85.0,728.4,27.2
87118,1,9999,137730.231544,1.0,371.120239,1.0,2.0,2.191097,0,2299,1116.765369,0.0,33.41804,0.0,0.0,28983.790709,0.84,2222222000.0,53138300000000.0,250.0,7289602.0,90.0,800.0,36.9874,1,9999,187120.375078,1.0,432.574127,1.0,3.0,2.499424,0,8676,1941.504242,0.0,44.062504,0.0,0.0,61467.609255,1.0,2222222000.0,133549700000000.0,215.0,11556370.0,76.0,766.0,61.581821,1,9999,397073.613822,1.0,630.137774,1.0,4.0,3.712227,0,6065,1864.324394,0.0,43.177823,0.0,0.0
2,60.0,bronze,60.0,buy_it_now,MLA1227,True,ARS,False,active,,,1,0,1,Argentina,AR,Capital Federal,AR-C,TUxBQkJPRTQ0OTRa,True,False,me2,Transferencia bancaria,MLATB,G,,,0,missing,missing,antigua revista,antigua,antigua revista billiken,unknown,201509,9,2,2015-09,2015-09_active,2015-09_bronze,2015-09_AR-C,False,2015-09_False,True,2015-09_True,True,2015-09_True,False,2015-09_False,740.527782,0.84,239999.0,9144042.0,199.99,3023.912,82.0,508.25,41.108282,1,9999,231641.066864,1.0,481.291042,1.0,3.0,2.76137,0,8676,2768.851374,0.0,52.619876,0.0,0.0,5174.568608,0.84,6500000.0,2879726000.0,250.0,53663.074155,90.0,800.0,34.051664,1,9999,174780.468169,1.0,418.06754,1.0,2.0,2.354534,0,8676,1872.264296,0.0,43.269669,0.0,0.0,36286.220461,0.84,2222222000.0,78174290000000.0,198.0,8841622.0,80.0,570.0,33.546161,1,9999,139594.226027,1.0,373.623107,1.0,3.0,0.607393,0,418,20.901961,0.0,4.571866,0.0,0.0,3422.331883,0.84,5330000.0,2192546000.0,200.0,46824.63,80.0,600.1875,43.14827,1,9999,231224.476175,1.0,480.858062,1.0,3.0,2.857901,0,8676,2829.409344,0.0,53.192193,0.0,0.0,28707.192126,0.84,2222222000.0,51819250000000.0,240.0,7198559.0,90.0,790.0,29.555743,1,9999,143619.380848,1.0,378.971478,1.0,2.0,2.067133,0,8676,1689.833881,0.0,41.107589,0.0,0.0,24059.579477,0.84,2222222000.0,50503410000000.0,240.0,7106575.0,90.0,750.0,35.86707,1,9999,181301.913849,1.0,425.795625,1.0,3.0,2.451386,0,8676,1863.230928,0.0,43.165159,0.0,0.0,29267.464656,0.84,2222222000.0,62068990000000.0,249.9,7878388.0,95.0,760.0,36.248124,1,9999,203978.474193,1.0,451.639762,1.0,3.0,2.665703,0,8676,2148.890092,0.0,46.356122,0.0,0.0,28213.54735,0.84,2222222000.0,50921090000000.0,239.9,7135901.0,90.0,780.0,31.741638,1,9999,152024.035565,1.0,389.902598,1.0,2.0,2.255589,0,8676,1839.791329,0.0,42.892789,0.0,0.0,7704.202487,0.84,6500000.0,4583176000.0,265.0,67699.159408,100.0,890.0,18.568001,1,9999,43605.488011,1.0,208.819271,1.0,1.0,1.601402,0,8676,1839.508069,0.0,42.889487,0.0,0.0,7195.266137,1.0,6500000.0,6426513000.0,260.0,8
0165.532772,100.0,900.0,37.361497,1,9999,226110.967462,1.0,475.51127,1.0,2.0,2.010947,0,1074,340.380653,0.0,18.449408,0.0,0.0,28983.790709,0.84,2222222000.0,53138300000000.0,250.0,7289602.0,90.0,800.0,36.9874,1,9999,187120.375078,1.0,432.574127,1.0,3.0,2.499424,0,8676,1941.504242,0.0,44.062504,0.0,0.0,7704.202487,0.84,6500000.0,4583176000.0,265.0,67699.16,100.0,890.0,18.568001,1,9999,43605.488011,1.0,208.819271,1.0,1.0,1.601402,0,8676,1839.508069,0.0,42.889487,0.0,0.0
3,580.0,silver,580.0,buy_it_now,MLA86345,True,ARS,False,active,,,1,0,1,Argentina,AR,Capital Federal,AR-C,TUxBQkZMTzg5MjFa,True,False,me2,Transferencia bancaria,MLATB,G,,,1,missing,missing,alarma guardtex,alarma,alarma guardtex gx412,unknown,201509,9,0,2015-09,2015-09_active,2015-09_silver,2015-09_AR-C,False,2015-09_False,True,2015-09_True,True,2015-09_True,False,2015-09_False,740.527782,0.84,239999.0,9144042.0,199.99,3023.912,82.0,508.25,41.108282,1,9999,231641.066864,1.0,481.291042,1.0,3.0,2.76137,0,8676,2768.851374,0.0,52.619876,0.0,0.0,5174.568608,0.84,6500000.0,2879726000.0,250.0,53663.074155,90.0,800.0,34.051664,1,9999,174780.468169,1.0,418.06754,1.0,2.0,2.354534,0,8676,1872.264296,0.0,43.269669,0.0,0.0,28727.390143,1.0,6500000.0,21082240000.0,525.0,145197.3,200.0,1999.0,23.78758,1,9999,156604.280342,1.0,395.732587,1.0,4.0,7.179065,0,8676,8981.900117,0.0,94.772887,0.0,3.0,3422.331883,0.84,5330000.0,2192546000.0,200.0,46824.63,80.0,600.1875,43.14827,1,9999,231224.476175,1.0,480.858062,1.0,3.0,2.857901,0,8676,2829.409344,0.0,53.192193,0.0,0.0,28707.192126,0.84,2222222000.0,51819250000000.0,240.0,7198559.0,90.0,790.0,29.555743,1,9999,143619.380848,1.0,378.971478,1.0,2.0,2.067133,0,8676,1689.833881,0.0,41.107589,0.0,0.0,24059.579477,0.84,2222222000.0,50503410000000.0,240.0,7106575.0,90.0,750.0,35.86707,1,9999,181301.913849,1.0,425.795625,1.0,3.0,2.451386,0,8676,1863.230928,0.0,43.165159,0.0,0.0,29267.464656,0.84,2222222000.0,62068990000000.0,249.9,7878388.0,95.0,760.0,36.248124,1,9999,203978.474193,1.0,451.639762,1.0,3.0,2.665703,0,8676,2148.890092,0.0,46.356122,0.0,0.0,28213.54735,0.84,2222222000.0,50921090000000.0,239.9,7135901.0,90.0,780.0,31.741638,1,9999,152024.035565,1.0,389.902598,1.0,2.0,2.255589,0,8676,1839.791329,0.0,42.892789,0.0,0.0,7704.202487,0.84,6500000.0,4583176000.0,265.0,67699.159408,100.0,890.0,18.568001,1,9999,43605.488011,1.0,208.819271,1.0,1.0,1.601402,0,8676,1839.508069,0.0,42.889487,0.0,0.0,4046.895197,1.0,1800000.0,1054643000.0,250.0,32475
.261583,93.0,840.0,35.520899,1,9999,172526.750835,1.0,415.363396,1.0,3.0,2.429202,0,6065,2405.503677,0.0,49.045934,0.0,0.0,28983.790709,0.84,2222222000.0,53138300000000.0,250.0,7289602.0,90.0,800.0,36.9874,1,9999,187120.375078,1.0,432.574127,1.0,3.0,2.499424,0,8676,1941.504242,0.0,44.062504,0.0,0.0,7704.202487,0.84,6500000.0,4583176000.0,265.0,67699.16,100.0,890.0,18.568001,1,9999,43605.488011,1.0,208.819271,1.0,1.0,1.601402,0,8676,1839.508069,0.0,42.889487,0.0,0.0
4,30.0,bronze,30.0,buy_it_now,MLA41287,True,ARS,False,active,,,1,0,1,Argentina,AR,Buenos Aires,AR-B,TUxBQ1RSRTMxODE5NA,True,False,not_specified,Transferencia bancaria,MLATB,G,,,0,otros,unknown,serenata jennifer,serenata,serenata jennifer blake,unknown,201508,8,0,2015-08,2015-08_active,2015-08_bronze,2015-08_AR-B,False,2015-08_False,True,2015-08_True,True,2015-08_True,False,2015-08_False,59757.420878,1.0,2222222000.0,109329100000000.0,350.0,10456050.0,100.0,1500.0,27.159948,1,9999,106143.142293,1.0,325.796167,1.0,1.0,1.848199,0,2606,716.946028,0.0,26.775848,0.0,0.0,5174.568608,0.84,6500000.0,2879726000.0,250.0,53663.074155,90.0,800.0,34.051664,1,9999,174780.468169,1.0,418.06754,1.0,2.0,2.354534,0,8676,1872.264296,0.0,43.269669,0.0,0.0,36286.220461,0.84,2222222000.0,78174290000000.0,198.0,8841622.0,80.0,570.0,33.546161,1,9999,139594.226027,1.0,373.623107,1.0,3.0,0.607393,0,418,20.901961,0.0,4.571866,0.0,0.0,71170.310216,1.0,2222222000.0,141043800000000.0,310.0,11876190.0,100.0,1099.0,25.832634,1,9999,111688.754919,1.0,334.198676,1.0,2.0,1.92023,0,2606,495.010345,0.0,22.248828,0.0,0.0,28707.192126,0.84,2222222000.0,51819250000000.0,240.0,7198559.0,90.0,790.0,29.555743,1,9999,143619.380848,1.0,378.971478,1.0,2.0,2.067133,0,8676,1689.833881,0.0,41.107589,0.0,0.0,24059.579477,0.84,2222222000.0,50503410000000.0,240.0,7106575.0,90.0,750.0,35.86707,1,9999,181301.913849,1.0,425.795625,1.0,3.0,2.451386,0,8676,1863.230928,0.0,43.165159,0.0,0.0,29267.464656,0.84,2222222000.0,62068990000000.0,249.9,7878388.0,95.0,760.0,36.248124,1,9999,203978.474193,1.0,451.639762,1.0,3.0,2.665703,0,8676,2148.890092,0.0,46.356122,0.0,0.0,28213.54735,0.84,2222222000.0,50921090000000.0,239.9,7135901.0,90.0,780.0,31.741638,1,9999,152024.035565,1.0,389.902598,1.0,2.0,2.255589,0,8676,1839.791329,0.0,42.892789,0.0,0.0,984.394862,1.0,1000000.0,82512650.0,199.0,9083.647509,79.0,589.99,48.190886,1,9999,249118.281844,1.0,499.117503,1.0,4.0,3.292268,0,1373,637.955845,0.0,25.257788,0.0,0.0,4046.895197,1.0,18
00000.0,1054643000.0,250.0,32475.261583,93.0,840.0,35.520899,1,9999,172526.750835,1.0,415.363396,1.0,3.0,2.429202,0,6065,2405.503677,0.0,49.045934,0.0,0.0,28983.790709,0.84,2222222000.0,53138300000000.0,250.0,7289602.0,90.0,800.0,36.9874,1,9999,187120.375078,1.0,432.574127,1.0,3.0,2.499424,0,8676,1941.504242,0.0,44.062504,0.0,0.0,61467.609255,1.0,2222222000.0,133549700000000.0,215.0,11556370.0,76.0,766.0,61.581821,1,9999,397073.613822,1.0,630.137774,1.0,4.0,3.712227,0,6065,1864.324394,0.0,43.177823,0.0,0.0


## Transforming Categorical Variables into Dummies

This step identifies rare (infrequent) categories in selected categorical variables of the DataFrame `df_products_00` and groups them under a single label. The list `var_dummies` holds the names of the categorical variables to transform. The `RareLabelEncoder` class from `feature_engine` is then configured with four parameters: `tol=0.03`, so any category that appears in less than 3% of the rows is treated as rare; `n_categories=2`, the minimum number of distinct categories a variable needs before the encoder looks for rare labels (below that threshold, all categories are kept); `replace_with='Rare'`, the label assigned to the grouped categories; and `missing_values='ignore'`, so missing values are skipped instead of raising an error. The encoder is fitted with `encoder.fit(df_products_00)` and the transformed data is stored in `df_products_01`. Grouping infrequent categories keeps the number of dummy columns manageable and reduces the noise that sparsely represented categories can introduce into machine learning models.

In [15]:
# categorical (object-dtype) columns available for transformation
df_products_00.select_dtypes('object').columns

Index(['listing_type_id', 'buying_mode', 'category_id', 'currency_id',
       'status', 'video_id', 'country_name', 'country_id', 'state_name',
       'state_id', 'city_id', 'mode', 'descrip_mdo_0', 'id_mdo_0',
       'type_mdo_0', 'season_name', 'gender_name', 'warranty_cleaned',
       'warranty_type', 'first_two_words_title', 'first_word_title',
       'first_three_words_title', 'title_type', 'date_created_month',
       'year_month', 'concat_status', 'concat_var_lt', 'concat_var_state',
       'automatic_relist_str', 'concat_var_autrelist',
       'accepts_mercadopago_str', 'concat_var_accmdopag', 'local_pick_up_str',
       'concat_var_localpu', 'free_shipping_str', 'concat_var_freesh'],
      dtype='object')

In [16]:
# categorical variables to transform
# ('warranty_type' is not in this list, so its categories are not grouped)
var_dummies = ['listing_type_id', 'buying_mode', 'category_id', 'currency_id',
            'status', 'video_id', 'country_name', 'country_id', 'state_name',
            'state_id', 'city_id', 'mode', 'descrip_mdo_0', 'id_mdo_0',
            'type_mdo_0', 'season_name', 'gender_name', 'warranty_cleaned',
            'first_two_words_title', 'first_word_title', 'first_three_words_title',
            'title_type', 'date_created_month', 'year_month', 'concat_status',
            'concat_var_lt', 'concat_var_state', 'automatic_relist_str',
            'concat_var_autrelist', 'accepts_mercadopago_str',
            'concat_var_accmdopag', 'local_pick_up_str', 'concat_var_localpu',
            'free_shipping_str', 'concat_var_freesh']

# group rare or infrequent categories into a new category called "Rare" (or any other user-supplied name)
encoder = RareLabelEncoder(tol=0.03, n_categories=2, variables=var_dummies, replace_with='Rare', missing_values='ignore')

encoder.fit(df_products_00)
# transform the data
df_products_01 = encoder.transform(df_products_00)
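
The grouping rule can be illustrated without the library. The following is a minimal, pure-Python sketch of the idea, not `feature_engine`'s implementation: the helper name `group_rare_labels` is hypothetical, the handling of the `n_categories` boundary reflects an assumption about the library's behavior, and missing values are not handled.

```python
from collections import Counter


def group_rare_labels(values, tol=0.03, n_categories=2, replace_with="Rare"):
    """Replace labels whose relative frequency is below `tol` (hypothetical helper)."""
    counts = Counter(values)
    # Assumption: variables with no more than `n_categories` distinct labels
    # are left untouched, i.e. all of their labels count as frequent.
    if len(counts) <= n_categories:
        return list(values)
    total = len(values)
    frequent = {label for label, n in counts.items() if n / total >= tol}
    return [v if v in frequent else replace_with for v in values]


# Toy data: "used" and "other" each make up 1% of the rows, below tol=0.03.
labels = ["unknown"] * 93 + ["new"] * 5 + ["used", "other"]
print(Counter(group_rare_labels(labels)))
```

On this toy input the two 1% labels are merged into `Rare`, while a two-label variable such as `['ARS', 'ARS', 'USD']` would be returned unchanged.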



In [17]:
##############################
# Manual validation of variables
##############################
conteo_valores = df_products_01['title_type'].value_counts()
print(conteo_valores)

unknown    92937
new         4786
Rare        2277
Name: title_type, dtype: int64


In [18]:
###########################################
## Convert categorical variables to dummies
###########################################
df_products_01 = pd.get_dummies(df_products_01)
# Replace NaN with 0
df_products_01 = df_products_01.fillna(0)
# Cast every column to int: dummy indicators become 1/0,
# but float columns are also truncated (e.g. 0.84 becomes 0)
df_products_01 = df_products_01.astype(int)
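
Because `astype(int)` applies to every column, the float aggregates are truncated as well (the `head()` output shows a minimum of 0.84 stored as 0). If that loss of precision is not intended, one possible alternative is to request integer dummies directly through the `dtype` argument of `pd.get_dummies`, which leaves numeric columns untouched. This is a sketch on toy data, not the notebook's method; the two-row DataFrame is invented for illustration.

```python
import pandas as pd

# Toy frame: one float aggregate and one categorical column (illustrative values).
toy = pd.DataFrame({
    "min_base_price_mode": [0.84, 1.0],
    "title_type": ["unknown", "new"],
})

# Integer dummies are produced directly; the float column keeps its decimals.
dummies = pd.get_dummies(toy, dtype=int)

# For comparison: casting the whole frame truncates the float column.
truncated = dummies.astype(int)

print(dummies)
print(truncated)
```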

In [19]:
print(df_products_01.shape)
df_products_01.head()

(100000, 424)


Unnamed: 0,base_price,price,accepts_mercadopago,automatic_relist,subtitle,initial_quantity,sold_quantity,available_quantity,local_pick_up,free_shipping,target,month,weekday,mean_base_price_mode,min_base_price_mode,max_base_price_mode,var_base_price_mode,median_base_price_mode,std_base_price_mode,q1_base_price_mode,q3_base_price_mode,mean_initial_quantity_mode,min_initial_quantity_mode,max_initial_quantity_mode,var_initial_quantity_mode,median_initial_quantity_mode,std_initial_quantity_mode,q1_initial_quantity_mode,q3_initial_quantity_mode,mean_sold_quantity_mode,min_sold_quantity_mode,max_sold_quantity_mode,var_sold_quantity_mode,median_sold_quantity_mode,std_sold_quantity_mode,q1_sold_quantity_mode,q3_sold_quantity_mode,mean_base_price_status,min_base_price_status,max_base_price_status,var_base_price_status,median_base_price_status,std_base_price_status,q1_base_price_status,q3_base_price_status,mean_initial_quantity_status,min_initial_quantity_status,max_initial_quantity_status,var_initial_quantity_status,median_initial_quantity_status,std_initial_quantity_status,q1_initial_quantity_status,q3_initial_quantity_status,mean_sold_quantity_status,min_sold_quantity_status,max_sold_quantity_status,var_sold_quantity_status,median_sold_quantity_status,std_sold_quantity_status,q1_sold_quantity_status,q3_sold_quantity_status,mean_base_price_listing_type_id,min_base_price_listing_type_id,max_base_price_listing_type_id,var_base_price_listing_type_id,median_base_price_listing_type_id,std_base_price_listing_type_id,q1_base_price_listing_type_id,q3_base_price_listing_type_id,mean_initial_quantity_listing_type_id,min_initial_quantity_listing_type_id,max_initial_quantity_listing_type_id,var_initial_quantity_listing_type_id,median_initial_quantity_listing_type_id,std_initial_quantity_listing_type_id,q1_initial_quantity_listing_type_id,q3_initial_quantity_listing_type_id,mean_sold_quantity_listing_type_id,min_sold_quantity_listing_type_id,max_sold_quantity_listing_type_id,var_sold_qua
ntity_listing_type_id,median_sold_quantity_listing_type_id,std_sold_quantity_listing_type_id,q1_sold_quantity_listing_type_id,q3_sold_quantity_listing_type_id,mean_base_price_state_id,min_base_price_state_id,max_base_price_state_id,var_base_price_state_id,median_base_price_state_id,std_base_price_state_id,q1_base_price_state_id,q3_base_price_state_id,mean_initial_quantity_state_id,min_initial_quantity_state_id,max_initial_quantity_state_id,var_initial_quantity_state_id,median_initial_quantity_state_id,std_initial_quantity_state_id,q1_initial_quantity_state_id,q3_initial_quantity_state_id,mean_sold_quantity_state_id,min_sold_quantity_state_id,max_sold_quantity_state_id,var_sold_quantity_state_id,median_sold_quantity_state_id,std_sold_quantity_state_id,q1_sold_quantity_state_id,q3_sold_quantity_state_id,mean_base_price_automatic_relist,min_base_price_automatic_relist,max_base_price_automatic_relist,var_base_price_automatic_relist,median_base_price_automatic_relist,std_base_price_automatic_relist,q1_base_price_automatic_relist,q3_base_price_automatic_relist,mean_initial_quantity_automatic_relist,min_initial_quantity_automatic_relist,max_initial_quantity_automatic_relist,var_initial_quantity_automatic_relist,median_initial_quantity_automatic_relist,std_initial_quantity_automatic_relist,q1_initial_quantity_automatic_relist,q3_initial_quantity_automatic_relist,mean_sold_quantity_automatic_relist,min_sold_quantity_automatic_relist,max_sold_quantity_automatic_relist,var_sold_quantity_automatic_relist,median_sold_quantity_automatic_relist,std_sold_quantity_automatic_relist,q1_sold_quantity_automatic_relist,q3_sold_quantity_automatic_relist,mean_base_price_accepts_mercadopago,min_base_price_accepts_mercadopago,max_base_price_accepts_mercadopago,var_base_price_accepts_mercadopago,median_base_price_accepts_mercadopago,std_base_price_accepts_mercadopago,q1_base_price_accepts_mercadopago,q3_base_price_accepts_mercadopago,mean_initial_quantity_accepts_mercadopago,min_initial_quant
ity_accepts_mercadopago,max_initial_quantity_accepts_mercadopago,var_initial_quantity_accepts_mercadopago,median_initial_quantity_accepts_mercadopago,std_initial_quantity_accepts_mercadopago,q1_initial_quantity_accepts_mercadopago,q3_initial_quantity_accepts_mercadopago,mean_sold_quantity_accepts_mercadopago,min_sold_quantity_accepts_mercadopago,max_sold_quantity_accepts_mercadopago,var_sold_quantity_accepts_mercadopago,median_sold_quantity_accepts_mercadopago,std_sold_quantity_accepts_mercadopago,q1_sold_quantity_accepts_mercadopago,q3_sold_quantity_accepts_mercadopago,mean_base_price_local_pick_up,min_base_price_local_pick_up,max_base_price_local_pick_up,var_base_price_local_pick_up,median_base_price_local_pick_up,std_base_price_local_pick_up,q1_base_price_local_pick_up,q3_base_price_local_pick_up,mean_initial_quantity_local_pick_up,min_initial_quantity_local_pick_up,max_initial_quantity_local_pick_up,var_initial_quantity_local_pick_up,median_initial_quantity_local_pick_up,std_initial_quantity_local_pick_up,q1_initial_quantity_local_pick_up,q3_initial_quantity_local_pick_up,mean_sold_quantity_local_pick_up,min_sold_quantity_local_pick_up,max_sold_quantity_local_pick_up,var_sold_quantity_local_pick_up,median_sold_quantity_local_pick_up,std_sold_quantity_local_pick_up,q1_sold_quantity_local_pick_up,q3_sold_quantity_local_pick_up,mean_base_price_free_shipping,min_base_price_free_shipping,max_base_price_free_shipping,var_base_price_free_shipping,median_base_price_free_shipping,std_base_price_free_shipping,q1_base_price_free_shipping,q3_base_price_free_shipping,mean_initial_quantity_free_shipping,min_initial_quantity_free_shipping,max_initial_quantity_free_shipping,var_initial_quantity_free_shipping,median_initial_quantity_free_shipping,std_initial_quantity_free_shipping,q1_initial_quantity_free_shipping,q3_initial_quantity_free_shipping,mean_sold_quantity_free_shipping,min_sold_quantity_free_shipping,max_sold_quantity_free_shipping,var_sold_quantity_free_shipping,medi
an_sold_quantity_free_shipping,std_sold_quantity_free_shipping,q1_sold_quantity_free_shipping,q3_sold_quantity_free_shipping,mean_base_price_warranty_cleaned,min_base_price_warranty_cleaned,max_base_price_warranty_cleaned,var_base_price_warranty_cleaned,median_base_price_warranty_cleaned,std_base_price_warranty_cleaned,q1_base_price_warranty_cleaned,q3_base_price_warranty_cleaned,mean_initial_quantity_warranty_cleaned,min_initial_quantity_warranty_cleaned,max_initial_quantity_warranty_cleaned,var_initial_quantity_warranty_cleaned,median_initial_quantity_warranty_cleaned,std_initial_quantity_warranty_cleaned,q1_initial_quantity_warranty_cleaned,q3_initial_quantity_warranty_cleaned,mean_sold_quantity_warranty_cleaned,min_sold_quantity_warranty_cleaned,max_sold_quantity_warranty_cleaned,var_sold_quantity_warranty_cleaned,median_sold_quantity_warranty_cleaned,std_sold_quantity_warranty_cleaned,q1_sold_quantity_warranty_cleaned,q3_sold_quantity_warranty_cleaned,mean_base_price_weekday,min_base_price_weekday,max_base_price_weekday,var_base_price_weekday,median_base_price_weekday,std_base_price_weekday,q1_base_price_weekday,q3_base_price_weekday,mean_initial_quantity_weekday,min_initial_quantity_weekday,max_initial_quantity_weekday,var_initial_quantity_weekday,median_initial_quantity_weekday,std_initial_quantity_weekday,q1_initial_quantity_weekday,q3_initial_quantity_weekday,mean_sold_quantity_weekday,min_sold_quantity_weekday,max_sold_quantity_weekday,var_sold_quantity_weekday,median_sold_quantity_weekday,std_sold_quantity_weekday,q1_sold_quantity_weekday,q3_sold_quantity_weekday,mean_base_price_title_type,min_base_price_title_type,max_base_price_title_type,var_base_price_title_type,median_base_price_title_type,std_base_price_title_type,q1_base_price_title_type,q3_base_price_title_type,mean_initial_quantity_title_type,min_initial_quantity_title_type,max_initial_quantity_title_type,var_initial_quantity_title_type,median_initial_quantity_title_type,std_initial_quantity_titl
e_type,q1_initial_quantity_title_type,q3_initial_quantity_title_type,mean_sold_quantity_title_type,min_sold_quantity_title_type,max_sold_quantity_title_type,var_sold_quantity_title_type,median_sold_quantity_title_type,std_sold_quantity_title_type,q1_sold_quantity_title_type,q3_sold_quantity_title_type,mean_base_price_warranty_type,min_base_price_warranty_type,max_base_price_warranty_type,var_base_price_warranty_type,median_base_price_warranty_type,std_base_price_warranty_type,q1_base_price_warranty_type,q3_base_price_warranty_type,mean_initial_quantity_warranty_type,min_initial_quantity_warranty_type,max_initial_quantity_warranty_type,var_initial_quantity_warranty_type,median_initial_quantity_warranty_type,std_initial_quantity_warranty_type,q1_initial_quantity_warranty_type,q3_initial_quantity_warranty_type,mean_sold_quantity_warranty_type,min_sold_quantity_warranty_type,max_sold_quantity_warranty_type,var_sold_quantity_warranty_type,median_sold_quantity_warranty_type,std_sold_quantity_warranty_type,q1_sold_quantity_warranty_type,q3_sold_quantity_warranty_type,listing_type_id_Rare,listing_type_id_bronze,listing_type_id_free,listing_type_id_gold_special,listing_type_id_silver,buying_mode_Rare,buying_mode_buy_it_now,category_id_MLA1227,category_id_Rare,currency_id_ARS,currency_id_USD,status_Rare,status_active,status_paused,video_id_QQNfOicE_o8,video_id_Rare,country_name_,country_name_Argentina,country_id_,country_id_AR,state_name_Buenos Aires,state_name_Capital Federal,state_name_Rare,state_id_AR-B,state_id_AR-C,state_id_Rare,city_id_,city_id_Rare,city_id_TUxBQlBBTDI1MTVa,mode_Rare,mode_custom,mode_me2,mode_not_specified,descrip_mdo_0_Efectivo,descrip_mdo_0_Rare,descrip_mdo_0_Transferencia 
bancaria,id_mdo_0_MLAMO,id_mdo_0_MLATB,id_mdo_0_Rare,type_mdo_0_G,type_mdo_0_Rare,season_name_,season_name_All-Season,season_name_Hombre,season_name_Mujer,season_name_Niñas,season_name_Niños,season_name_Rare,gender_name_,gender_name_All-Season,gender_name_Autumn-Winter,gender_name_Rare,gender_name_Spring-Summer,warranty_cleaned_12 meses,warranty_cleaned_Rare,warranty_cleaned_missing,warranty_cleaned_otros,warranty_cleaned_si,warranty_cleaned_sin garantía,warranty_type_missing,warranty_type_new,warranty_type_unknown,warranty_type_used,first_two_words_title_Rare,first_word_title_Rare,first_three_words_title_Rare,title_type_Rare,title_type_new,title_type_unknown,date_created_month_201508,date_created_month_201509,date_created_month_201510,date_created_month_Rare,year_month_2015-08,year_month_2015-09,year_month_2015-10,year_month_Rare,concat_status_2015-08_active,concat_status_2015-09_active,concat_status_2015-10_active,concat_status_Rare,concat_var_lt_2015-08_bronze,concat_var_lt_2015-08_free,concat_var_lt_2015-09_bronze,concat_var_lt_2015-09_free,concat_var_lt_2015-09_silver,concat_var_lt_2015-10_bronze,concat_var_lt_2015-10_free,concat_var_lt_Rare,concat_var_state_2015-08_AR-B,concat_var_state_2015-08_AR-C,concat_var_state_2015-09_AR-B,concat_var_state_2015-09_AR-C,concat_var_state_2015-10_AR-B,concat_var_state_2015-10_AR-C,concat_var_state_Rare,automatic_relist_str_False,automatic_relist_str_True,concat_var_autrelist_2015-08_False,concat_var_autrelist_2015-09_False,concat_var_autrelist_2015-10_False,concat_var_autrelist_Rare,accepts_mercadopago_str_False,accepts_mercadopago_str_True,concat_var_accmdopag_2015-08_True,concat_var_accmdopag_2015-09_True,concat_var_accmdopag_2015-10_True,concat_var_accmdopag_Rare,local_pick_up_str_False,local_pick_up_str_True,concat_var_localpu_2015-08_False,concat_var_localpu_2015-08_True,concat_var_localpu_2015-09_False,concat_var_localpu_2015-09_True,concat_var_localpu_2015-10_False,concat_var_localpu_2015-10_True,concat_var_localpu_R
are,free_shipping_str_False,free_shipping_str_True,concat_var_freesh_2015-08_False,concat_var_freesh_2015-09_False,concat_var_freesh_2015-10_False,concat_var_freesh_Rare
0,80,80,1,0,0,1,0,1,1,0,1,9,5,59757,1,2222222222,109329072070789,350,10456054,100,1500,27,1,9999,106143,1,325,1,1,1,0,2606,716,0,26,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,36286,0,2222222222,78174286476326,198,8841622,80,570,33,1,9999,139594,1,373,1,3,0,0,418,20,0,4,0,0,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,4449,1,2004105,1768778257,230,42056,85,728,27,1,9999,137730,1,371,1,2,2,0,2299,1116,0,33,0,0,5160,1,1559237,2240845748,250,47337,109,750,13,1,9999,64945,1,254,1,3,1,0,907,364,0,19,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,0,1,0,0,0,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1,1,0,1,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0
1,2650,2650,1,0,0,1,0,1,1,0,0,9,5,740,0,239999,9144042,199,3023,82,508,41,1,9999,231641,1,481,1,3,2,0,8676,2768,0,52,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,28727,1,6500000,21082242084,525,145197,200,1999,23,1,9999,156604,1,395,1,4,7,0,8676,8981,0,94,0,3,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,984,1,1000000,82512652,199,9083,79,589,48,1,9999,249118,1,499,1,4,3,0,1373,637,0,25,0,0,4449,1,2004105,1768778257,230,42056,85,728,27,1,9999,137730,1,371,1,2,2,0,2299,1116,0,33,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,61467,1,2222222222,133549725152187,215,11556371,76,766,61,1,9999,397073,1,630,1,4,3,0,6065,1864,0,43,0,0,0,0,0,0,1,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,1,0,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,1,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0
2,60,60,1,0,0,1,0,1,1,0,0,9,2,740,0,239999,9144042,199,3023,82,508,41,1,9999,231641,1,481,1,3,2,0,8676,2768,0,52,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,36286,0,2222222222,78174286476326,198,8841622,80,570,33,1,9999,139594,1,373,1,3,0,0,418,20,0,4,0,0,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,7195,1,6500000,6426512644,260,80165,100,900,37,1,9999,226110,1,475,1,2,2,0,1074,340,0,18,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,0,1,0,0,0,0,1,1,0,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0
3,580,580,1,0,0,1,0,1,1,0,1,9,0,740,0,239999,9144042,199,3023,82,508,41,1,9999,231641,1,481,1,3,2,0,8676,2768,0,52,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,28727,1,6500000,21082242084,525,145197,200,1999,23,1,9999,156604,1,395,1,4,7,0,8676,8981,0,94,0,3,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,4046,1,1800000,1054642614,250,32475,93,840,35,1,9999,172526,1,415,1,3,2,0,6065,2405,0,49,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,0,0,0,0,1,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0
4,30,30,1,0,0,1,0,1,1,0,0,8,0,59757,1,2222222222,109329072070789,350,10456054,100,1500,27,1,9999,106143,1,325,1,1,1,0,2606,716,0,26,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,36286,0,2222222222,78174286476326,198,8841622,80,570,33,1,9999,139594,1,373,1,3,0,0,418,20,0,4,0,0,71170,1,2222222222,141043796917680,310,11876186,100,1099,25,1,9999,111688,1,334,1,2,1,0,2606,495,0,22,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,984,1,1000000,82512652,199,9083,79,589,48,1,9999,249118,1,499,1,4,3,0,1373,637,0,25,0,0,4046,1,1800000,1054642614,250,32475,93,840,35,1,9999,172526,1,415,1,3,2,0,6065,2405,0,49,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,61467,1,2222222222,133549725152187,215,11556371,76,766,61,1,9999,397073,1,630,1,4,3,0,6065,1864,0,43,0,0,0,1,0,0,0,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,1,0,0,1,0,0,0,1,0,0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,1,1,1,0,0,1,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,1,0,0,0,0,1,0,1,0,0,0,0,0,1,0,1,0,0,0


## Mathematical Functions and New Distributions

The function `create_transformed_columns` applies several transformations (square, square root, logarithm, and so on) to a given column of a DataFrame. Each transformation adds a new column holding the transformed values, which gives alternative views of the original variable's distribution. This is useful for reducing skew, bringing variables onto comparable scales, and preparing the data for predictive models or further statistical analysis.

In [20]:
df_products_01 = utilities_meli.create_transformed_columns(df_products_01,'price')
df_products_01 = utilities_meli.create_transformed_columns(df_products_01,'initial_quantity')
df_products_01 = utilities_meli.create_transformed_columns(df_products_01,'sold_quantity')
df_products_01 = utilities_meli.create_transformed_columns(df_products_01,'available_quantity')
df_products_01 = utilities_meli.create_transformed_columns(df_products_01,'var_base_price_mode')
df_products_01 = utilities_meli.create_transformed_columns(df_products_01,'var_base_price_title_type')
df_products_01 = utilities_meli.create_transformed_columns(df_products_01,'var_base_price_warranty_cleaned')

  result = getattr(ufunc, method)(*inputs, **kwargs)
  result = getattr(ufunc, method)(*inputs, **kwargs)
  result = getattr(ufunc, method)(*inputs, **kwargs)
  result = getattr(ufunc, method)(*inputs, **kwargs)
  result = getattr(ufunc, method)(*inputs, **kwargs)
  result = getattr(ufunc, method)(*inputs, **kwargs)
  result = getattr(ufunc, method)(*inputs, **kwargs)
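The source of `utilities_meli.create_transformed_columns` is not shown in this notebook, so the following is only a minimal sketch of what such a helper might look like. The function name `create_transformed_columns_sketch`, the `_log1p` suffix, and the toy values are assumptions made for illustration. The sketch casts the column to `float64` first, because some power columns in the output of cell 21 (for example the negative values in `var_base_price_title_type_cube`) are consistent with 64-bit integer overflow.

```python
import numpy as np
import pandas as pd


def create_transformed_columns_sketch(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Hypothetical stand-in for utilities_meli.create_transformed_columns."""
    out = df.copy()
    # Cast to float64 so powers of large integers do not wrap around int64.
    x = out[column].astype("float64")

    out[f"{column}_square"] = x ** 2
    out[f"{column}_cube"] = x ** 3
    # Clip at zero so sqrt and log1p never see negative inputs.
    out[f"{column}_sqrt"] = np.sqrt(x.clip(lower=0))
    # log1p is defined at zero, unlike log.
    out[f"{column}_log1p"] = np.log1p(x.clip(lower=0))
    # Mask zeros so the reciprocal is NaN there instead of inf.
    out[f"{column}_reciprocal"] = 1.0 / x.where(x != 0)
    out[f"{column}_tanh"] = np.tanh(x)
    return out


# Toy data (made up for this example, not taken from the project dataset).
demo = pd.DataFrame({"price": [0, 80, 2650, 109329072070789]})
demo = create_transformed_columns_sketch(demo, "price")
```

Masking zeros and clipping negatives before the call is one way to avoid the kind of NumPy runtime warnings that usually accompany `log` or division applied to zero; whether the project helper does this is not known from the notebook.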


Interaction columns are then added to the DataFrame `df_products_01` (cell 22). `price_x_initial_quantity` is the product of `price` and `initial_quantity`, that is, the total value of a listing's initial stock. `price_x_sold_quantity` is the product of `price` and `sold_quantity`, the total value of the units sold. `price_2_x_sold_quantity` multiplies the squared price (`price_square`) by `sold_quantity`. The same cell also creates `price_4_x_sold_quantity` and `price_4_x_initial_quantity`, which multiply the fourth power of the price (`price_fourth`) by `sold_quantity` and `initial_quantity`, respectively.

In [21]:
print(df_products_01.shape)
df_products_01.head()

(100000, 506)


Unnamed: 0,base_price,price,accepts_mercadopago,automatic_relist,subtitle,initial_quantity,sold_quantity,available_quantity,local_pick_up,free_shipping,target,month,weekday,mean_base_price_mode,min_base_price_mode,max_base_price_mode,var_base_price_mode,median_base_price_mode,std_base_price_mode,q1_base_price_mode,q3_base_price_mode,mean_initial_quantity_mode,min_initial_quantity_mode,max_initial_quantity_mode,var_initial_quantity_mode,median_initial_quantity_mode,std_initial_quantity_mode,q1_initial_quantity_mode,q3_initial_quantity_mode,mean_sold_quantity_mode,min_sold_quantity_mode,max_sold_quantity_mode,var_sold_quantity_mode,median_sold_quantity_mode,std_sold_quantity_mode,q1_sold_quantity_mode,q3_sold_quantity_mode,mean_base_price_status,min_base_price_status,max_base_price_status,var_base_price_status,median_base_price_status,std_base_price_status,q1_base_price_status,q3_base_price_status,mean_initial_quantity_status,min_initial_quantity_status,max_initial_quantity_status,var_initial_quantity_status,median_initial_quantity_status,std_initial_quantity_status,q1_initial_quantity_status,q3_initial_quantity_status,mean_sold_quantity_status,min_sold_quantity_status,max_sold_quantity_status,var_sold_quantity_status,median_sold_quantity_status,std_sold_quantity_status,q1_sold_quantity_status,q3_sold_quantity_status,mean_base_price_listing_type_id,min_base_price_listing_type_id,max_base_price_listing_type_id,var_base_price_listing_type_id,median_base_price_listing_type_id,std_base_price_listing_type_id,q1_base_price_listing_type_id,q3_base_price_listing_type_id,mean_initial_quantity_listing_type_id,min_initial_quantity_listing_type_id,max_initial_quantity_listing_type_id,var_initial_quantity_listing_type_id,median_initial_quantity_listing_type_id,std_initial_quantity_listing_type_id,q1_initial_quantity_listing_type_id,q3_initial_quantity_listing_type_id,mean_sold_quantity_listing_type_id,min_sold_quantity_listing_type_id,max_sold_quantity_listing_type_id,var_sold_qua
ntity_listing_type_id,median_sold_quantity_listing_type_id,std_sold_quantity_listing_type_id,q1_sold_quantity_listing_type_id,q3_sold_quantity_listing_type_id,mean_base_price_state_id,min_base_price_state_id,max_base_price_state_id,var_base_price_state_id,median_base_price_state_id,std_base_price_state_id,q1_base_price_state_id,q3_base_price_state_id,mean_initial_quantity_state_id,min_initial_quantity_state_id,max_initial_quantity_state_id,var_initial_quantity_state_id,median_initial_quantity_state_id,std_initial_quantity_state_id,q1_initial_quantity_state_id,q3_initial_quantity_state_id,mean_sold_quantity_state_id,min_sold_quantity_state_id,max_sold_quantity_state_id,var_sold_quantity_state_id,median_sold_quantity_state_id,std_sold_quantity_state_id,q1_sold_quantity_state_id,q3_sold_quantity_state_id,mean_base_price_automatic_relist,min_base_price_automatic_relist,max_base_price_automatic_relist,var_base_price_automatic_relist,median_base_price_automatic_relist,std_base_price_automatic_relist,q1_base_price_automatic_relist,q3_base_price_automatic_relist,mean_initial_quantity_automatic_relist,min_initial_quantity_automatic_relist,max_initial_quantity_automatic_relist,var_initial_quantity_automatic_relist,median_initial_quantity_automatic_relist,std_initial_quantity_automatic_relist,q1_initial_quantity_automatic_relist,q3_initial_quantity_automatic_relist,mean_sold_quantity_automatic_relist,min_sold_quantity_automatic_relist,max_sold_quantity_automatic_relist,var_sold_quantity_automatic_relist,median_sold_quantity_automatic_relist,std_sold_quantity_automatic_relist,q1_sold_quantity_automatic_relist,q3_sold_quantity_automatic_relist,mean_base_price_accepts_mercadopago,min_base_price_accepts_mercadopago,max_base_price_accepts_mercadopago,var_base_price_accepts_mercadopago,median_base_price_accepts_mercadopago,std_base_price_accepts_mercadopago,q1_base_price_accepts_mercadopago,q3_base_price_accepts_mercadopago,mean_initial_quantity_accepts_mercadopago,min_initial_quant
ity_accepts_mercadopago,max_initial_quantity_accepts_mercadopago,var_initial_quantity_accepts_mercadopago,median_initial_quantity_accepts_mercadopago,std_initial_quantity_accepts_mercadopago,q1_initial_quantity_accepts_mercadopago,q3_initial_quantity_accepts_mercadopago,mean_sold_quantity_accepts_mercadopago,min_sold_quantity_accepts_mercadopago,max_sold_quantity_accepts_mercadopago,var_sold_quantity_accepts_mercadopago,median_sold_quantity_accepts_mercadopago,std_sold_quantity_accepts_mercadopago,q1_sold_quantity_accepts_mercadopago,q3_sold_quantity_accepts_mercadopago,mean_base_price_local_pick_up,min_base_price_local_pick_up,max_base_price_local_pick_up,var_base_price_local_pick_up,median_base_price_local_pick_up,std_base_price_local_pick_up,q1_base_price_local_pick_up,q3_base_price_local_pick_up,mean_initial_quantity_local_pick_up,min_initial_quantity_local_pick_up,max_initial_quantity_local_pick_up,var_initial_quantity_local_pick_up,median_initial_quantity_local_pick_up,std_initial_quantity_local_pick_up,q1_initial_quantity_local_pick_up,q3_initial_quantity_local_pick_up,mean_sold_quantity_local_pick_up,min_sold_quantity_local_pick_up,max_sold_quantity_local_pick_up,var_sold_quantity_local_pick_up,median_sold_quantity_local_pick_up,std_sold_quantity_local_pick_up,q1_sold_quantity_local_pick_up,q3_sold_quantity_local_pick_up,mean_base_price_free_shipping,min_base_price_free_shipping,max_base_price_free_shipping,var_base_price_free_shipping,median_base_price_free_shipping,std_base_price_free_shipping,q1_base_price_free_shipping,q3_base_price_free_shipping,mean_initial_quantity_free_shipping,min_initial_quantity_free_shipping,max_initial_quantity_free_shipping,var_initial_quantity_free_shipping,median_initial_quantity_free_shipping,std_initial_quantity_free_shipping,q1_initial_quantity_free_shipping,q3_initial_quantity_free_shipping,mean_sold_quantity_free_shipping,min_sold_quantity_free_shipping,max_sold_quantity_free_shipping,var_sold_quantity_free_shipping,medi
an_sold_quantity_free_shipping,std_sold_quantity_free_shipping,q1_sold_quantity_free_shipping,q3_sold_quantity_free_shipping,mean_base_price_warranty_cleaned,min_base_price_warranty_cleaned,max_base_price_warranty_cleaned,var_base_price_warranty_cleaned,median_base_price_warranty_cleaned,std_base_price_warranty_cleaned,q1_base_price_warranty_cleaned,q3_base_price_warranty_cleaned,mean_initial_quantity_warranty_cleaned,min_initial_quantity_warranty_cleaned,max_initial_quantity_warranty_cleaned,var_initial_quantity_warranty_cleaned,median_initial_quantity_warranty_cleaned,std_initial_quantity_warranty_cleaned,q1_initial_quantity_warranty_cleaned,q3_initial_quantity_warranty_cleaned,mean_sold_quantity_warranty_cleaned,min_sold_quantity_warranty_cleaned,max_sold_quantity_warranty_cleaned,var_sold_quantity_warranty_cleaned,median_sold_quantity_warranty_cleaned,std_sold_quantity_warranty_cleaned,q1_sold_quantity_warranty_cleaned,q3_sold_quantity_warranty_cleaned,mean_base_price_weekday,min_base_price_weekday,max_base_price_weekday,var_base_price_weekday,median_base_price_weekday,std_base_price_weekday,q1_base_price_weekday,q3_base_price_weekday,mean_initial_quantity_weekday,min_initial_quantity_weekday,max_initial_quantity_weekday,var_initial_quantity_weekday,median_initial_quantity_weekday,std_initial_quantity_weekday,q1_initial_quantity_weekday,q3_initial_quantity_weekday,mean_sold_quantity_weekday,min_sold_quantity_weekday,max_sold_quantity_weekday,var_sold_quantity_weekday,median_sold_quantity_weekday,std_sold_quantity_weekday,q1_sold_quantity_weekday,q3_sold_quantity_weekday,mean_base_price_title_type,min_base_price_title_type,max_base_price_title_type,var_base_price_title_type,median_base_price_title_type,std_base_price_title_type,q1_base_price_title_type,q3_base_price_title_type,mean_initial_quantity_title_type,min_initial_quantity_title_type,max_initial_quantity_title_type,var_initial_quantity_title_type,median_initial_quantity_title_type,std_initial_quantity_titl
e_type,q1_initial_quantity_title_type,q3_initial_quantity_title_type,mean_sold_quantity_title_type,min_sold_quantity_title_type,max_sold_quantity_title_type,var_sold_quantity_title_type,median_sold_quantity_title_type,std_sold_quantity_title_type,q1_sold_quantity_title_type,q3_sold_quantity_title_type,mean_base_price_warranty_type,min_base_price_warranty_type,max_base_price_warranty_type,var_base_price_warranty_type,median_base_price_warranty_type,std_base_price_warranty_type,q1_base_price_warranty_type,q3_base_price_warranty_type,mean_initial_quantity_warranty_type,min_initial_quantity_warranty_type,max_initial_quantity_warranty_type,var_initial_quantity_warranty_type,median_initial_quantity_warranty_type,std_initial_quantity_warranty_type,q1_initial_quantity_warranty_type,q3_initial_quantity_warranty_type,mean_sold_quantity_warranty_type,min_sold_quantity_warranty_type,max_sold_quantity_warranty_type,var_sold_quantity_warranty_type,median_sold_quantity_warranty_type,std_sold_quantity_warranty_type,q1_sold_quantity_warranty_type,q3_sold_quantity_warranty_type,listing_type_id_Rare,listing_type_id_bronze,listing_type_id_free,listing_type_id_gold_special,listing_type_id_silver,buying_mode_Rare,buying_mode_buy_it_now,category_id_MLA1227,category_id_Rare,currency_id_ARS,currency_id_USD,status_Rare,status_active,status_paused,video_id_QQNfOicE_o8,video_id_Rare,country_name_,country_name_Argentina,country_id_,country_id_AR,state_name_Buenos Aires,state_name_Capital Federal,state_name_Rare,state_id_AR-B,state_id_AR-C,state_id_Rare,city_id_,city_id_Rare,city_id_TUxBQlBBTDI1MTVa,mode_Rare,mode_custom,mode_me2,mode_not_specified,descrip_mdo_0_Efectivo,descrip_mdo_0_Rare,descrip_mdo_0_Transferencia 
bancaria,id_mdo_0_MLAMO,id_mdo_0_MLATB,id_mdo_0_Rare,type_mdo_0_G,type_mdo_0_Rare,season_name_,season_name_All-Season,season_name_Hombre,season_name_Mujer,season_name_Niñas,season_name_Niños,season_name_Rare,gender_name_,gender_name_All-Season,gender_name_Autumn-Winter,gender_name_Rare,gender_name_Spring-Summer,warranty_cleaned_12 meses,warranty_cleaned_Rare,warranty_cleaned_missing,warranty_cleaned_otros,warranty_cleaned_si,warranty_cleaned_sin garantía,warranty_type_missing,warranty_type_new,warranty_type_unknown,warranty_type_used,first_two_words_title_Rare,first_word_title_Rare,first_three_words_title_Rare,title_type_Rare,title_type_new,title_type_unknown,date_created_month_201508,date_created_month_201509,date_created_month_201510,date_created_month_Rare,year_month_2015-08,year_month_2015-09,year_month_2015-10,year_month_Rare,concat_status_2015-08_active,concat_status_2015-09_active,concat_status_2015-10_active,concat_status_Rare,concat_var_lt_2015-08_bronze,concat_var_lt_2015-08_free,concat_var_lt_2015-09_bronze,concat_var_lt_2015-09_free,concat_var_lt_2015-09_silver,concat_var_lt_2015-10_bronze,concat_var_lt_2015-10_free,concat_var_lt_Rare,concat_var_state_2015-08_AR-B,concat_var_state_2015-08_AR-C,concat_var_state_2015-09_AR-B,concat_var_state_2015-09_AR-C,concat_var_state_2015-10_AR-B,concat_var_state_2015-10_AR-C,concat_var_state_Rare,automatic_relist_str_False,automatic_relist_str_True,concat_var_autrelist_2015-08_False,concat_var_autrelist_2015-09_False,concat_var_autrelist_2015-10_False,concat_var_autrelist_Rare,accepts_mercadopago_str_False,accepts_mercadopago_str_True,concat_var_accmdopag_2015-08_True,concat_var_accmdopag_2015-09_True,concat_var_accmdopag_2015-10_True,concat_var_accmdopag_Rare,local_pick_up_str_False,local_pick_up_str_True,concat_var_localpu_2015-08_False,concat_var_localpu_2015-08_True,concat_var_localpu_2015-09_False,concat_var_localpu_2015-09_True,concat_var_localpu_2015-10_False,concat_var_localpu_2015-10_True,concat_var_localpu_R
are,free_shipping_str_False,free_shipping_str_True,concat_var_freesh_2015-08_False,concat_var_freesh_2015-09_False,concat_var_freesh_2015-10_False,concat_var_freesh_Rare,price_square,price_cube,price_fourth,price_sqrt,price_log,price_reciprocal,price_tanh,price_inverse_sqrt,price_bin,price_sigmoid,price_arcsin,initial_quantity_square,initial_quantity_cube,initial_quantity_fourth,initial_quantity_sqrt,initial_quantity_log,initial_quantity_reciprocal,initial_quantity_tanh,initial_quantity_inverse_sqrt,initial_quantity_boxcox,initial_quantity_bin,initial_quantity_sigmoid,initial_quantity_arcsin,sold_quantity_square,sold_quantity_cube,sold_quantity_fourth,sold_quantity_sqrt,sold_quantity_log,sold_quantity_reciprocal,sold_quantity_tanh,sold_quantity_inverse_sqrt,sold_quantity_bin,sold_quantity_sigmoid,sold_quantity_arcsin,available_quantity_square,available_quantity_cube,available_quantity_fourth,available_quantity_sqrt,available_quantity_log,available_quantity_reciprocal,available_quantity_tanh,available_quantity_inverse_sqrt,available_quantity_boxcox,available_quantity_bin,available_quantity_sigmoid,available_quantity_arcsin,var_base_price_mode_square,var_base_price_mode_cube,var_base_price_mode_fourth,var_base_price_mode_sqrt,var_base_price_mode_log,var_base_price_mode_reciprocal,var_base_price_mode_tanh,var_base_price_mode_inverse_sqrt,var_base_price_mode_boxcox,var_base_price_mode_bin,var_base_price_mode_sigmoid,var_base_price_mode_arcsin,var_base_price_title_type_square,var_base_price_title_type_cube,var_base_price_title_type_fourth,var_base_price_title_type_sqrt,var_base_price_title_type_log,var_base_price_title_type_reciprocal,var_base_price_title_type_tanh,var_base_price_title_type_inverse_sqrt,var_base_price_title_type_boxcox,var_base_price_title_type_bin,var_base_price_title_type_sigmoid,var_base_price_title_type_arcsin,var_base_price_warranty_cleaned_square,var_base_price_warranty_cleaned_cube,var_base_price_warranty_cleaned_fourth,var_base_price_warranty_cle
aned_sqrt,var_base_price_warranty_cleaned_log,var_base_price_warranty_cleaned_reciprocal,var_base_price_warranty_cleaned_tanh,var_base_price_warranty_cleaned_inverse_sqrt,var_base_price_warranty_cleaned_boxcox,var_base_price_warranty_cleaned_bin,var_base_price_warranty_cleaned_sigmoid,var_base_price_warranty_cleaned_arcsin
0,80,80,1,0,0,1,0,1,1,0,1,9,5,59757,1,2222222222,109329072070789,350,10456054,100,1500,27,1,9999,106143,1,325,1,1,1,0,2606,716,0,26,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,36286,0,2222222222,78174286476326,198,8841622,80,570,33,1,9999,139594,1,373,1,3,0,0,418,20,0,4,0,0,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,4449,1,2004105,1768778257,230,42056,85,728,27,1,9999,137730,1,371,1,2,2,0,2299,1116,0,33,0,0,5160,1,1559237,2240845748,250,47337,109,750,13,1,9999,64945,1,254,1,3,1,0,907,364,0,19,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,0,1,0,0,0,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1,1,0,1,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,6400,512000,40960000,8.944272,4.382027,0.0125,1.0,0.111803,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,399039851001513241,3951759870702142973,8366459324215131249,10456050.0,32.325383,9.146698e-15,1.0,9.563837e-08,20.129988,9,1.0,,5021389666329679504,-6183866105918272192,1992273427699142912,47337.57,21.530119,4.462601e-10,1.0,2.112487e-05,13066080000000.0,0,1.0,,2558759859875250240,5635488737196113408,-2508130037736140800,67699.159404,22.2456
58,2.181893e-10,1.0,1.5e-05,10.01129,0,1.0,
1,2650,2650,1,0,0,1,0,1,1,0,0,9,5,740,0,239999,9144042,199,3023,82,508,41,1,9999,231641,1,481,1,3,2,0,8676,2768,0,52,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,28727,1,6500000,21082242084,525,145197,200,1999,23,1,9999,156604,1,395,1,4,7,0,8676,8981,0,94,0,3,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,984,1,1000000,82512652,199,9083,79,589,48,1,9999,249118,1,499,1,4,3,0,1373,637,0,25,0,0,4449,1,2004105,1768778257,230,42056,85,728,27,1,9999,137730,1,371,1,2,2,0,2299,1116,0,33,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,61467,1,2222222222,133549725152187,215,11556371,76,766,61,1,9999,397073,1,630,1,4,3,0,6065,1864,0,43,0,0,0,0,0,0,1,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,1,0,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,1,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,7022500,18609625000,49315506250000,51.478151,7.882315,0.000377,1.0,0.019426,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,83613504097764,8248886215034505832,-2664835518784670960,3023.912,16.028613,1.093608e-07,1.0,0.0003306975,12.534841,0,1.0,,-2697387914582162544,3699261575273694400,-8750533939605753600,7289602.0,31.603919,1.881882e-14,1.0,1.371817e-07,2.109661e+19,9,1.0,,6808337740073104,-3141375632200001856,-9027633774132113152,9083.6475
05,18.228462,1.211935e-08,1.0,0.00011,9.284387,0,1.0,
2,60,60,1,0,0,1,0,1,1,0,0,9,2,740,0,239999,9144042,199,3023,82,508,41,1,9999,231641,1,481,1,3,2,0,8676,2768,0,52,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,36286,0,2222222222,78174286476326,198,8841622,80,570,33,1,9999,139594,1,373,1,3,0,0,418,20,0,4,0,0,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,7195,1,6500000,6426512644,260,80165,100,900,37,1,9999,226110,1,475,1,2,2,0,1074,340,0,18,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,0,1,0,0,0,0,1,1,0,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,3600,216000,12960000,7.745967,4.094345,0.016667,1.0,0.129099,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,83613504097764,8248886215034505832,-2664835518784670960,3023.912,16.028613,1.093608e-07,1.0,0.0003306975,12.534841,0,1.0,,-2697387914582162544,3699261575273694400,-8750533939605753600,7289602.0,31.603919,1.881882e-14,1.0,1.371817e-07,2.109661e+19,9,1.0,,2558759859875250240,5635488737196113408,-2508130037736140800,67699.159404,22.245658,2.181893e
-10,1.0,1.5e-05,10.01129,0,1.0,
3,580,580,1,0,0,1,0,1,1,0,1,9,0,740,0,239999,9144042,199,3023,82,508,41,1,9999,231641,1,481,1,3,2,0,8676,2768,0,52,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,28727,1,6500000,21082242084,525,145197,200,1999,23,1,9999,156604,1,395,1,4,7,0,8676,8981,0,94,0,3,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,4046,1,1800000,1054642614,250,32475,93,840,35,1,9999,172526,1,415,1,3,2,0,6065,2405,0,49,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,0,0,0,0,1,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,336400,195112000,113164960000,24.083189,6.363028,0.001724,1.0,0.041523,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,83613504097764,8248886215034505832,-2664835518784670960,3023.912,16.028613,1.093608e-07,1.0,0.0003306975,12.534841,0,1.0,,-2697387914582162544,3699261575273694400,-8750533939605753600,7289602.0,31.603919,1.881882e-14,1.0,1.371817e-07,2.109661e+19,9,1.0,,2558759859875250240,5635488737196113408,-2508130037736140800,67699.159404,22.24565
8,2.181893e-10,1.0,1.5e-05,10.01129,0,1.0,
4,30,30,1,0,0,1,0,1,1,0,0,8,0,59757,1,2222222222,109329072070789,350,10456054,100,1500,27,1,9999,106143,1,325,1,1,1,0,2606,716,0,26,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,36286,0,2222222222,78174286476326,198,8841622,80,570,33,1,9999,139594,1,373,1,3,0,0,418,20,0,4,0,0,71170,1,2222222222,141043796917680,310,11876186,100,1099,25,1,9999,111688,1,334,1,2,1,0,2606,495,0,22,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,984,1,1000000,82512652,199,9083,79,589,48,1,9999,249118,1,499,1,4,3,0,1373,637,0,25,0,0,4046,1,1800000,1054642614,250,32475,93,840,35,1,9999,172526,1,415,1,3,2,0,6065,2405,0,49,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,61467,1,2222222222,133549725152187,215,11556371,76,766,61,1,9999,397073,1,630,1,4,3,0,6065,1864,0,43,0,0,0,1,0,0,0,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,1,0,0,1,0,0,0,1,0,0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,1,1,1,0,0,1,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,1,0,0,0,0,1,0,1,0,0,0,0,0,1,0,1,0,0,0,900,27000,810000,5.477226,3.401197,0.033333,1.0,0.182574,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,399039851001513241,3951759870702142973,8366459324215131249,10456050.0,32.325383,9.146698e-15,1.0,9.563837e-08,20.129988,9,1.0,,-2697387914582162544,3699261575273694400,-8750533939605753600,7289602.0,31.603919,1.881882e-14,1.0,1.371817e-07,2.109661e+19,9,1.0,,6808337740073104,-3141375632200001856,-90276337741321
13152,9083.647505,18.228462,1.211935e-08,1.0,0.00011,9.284387,0,1.0,


In [22]:
# Interaction features: price times the quantity columns
df_products_01['price_x_initial_quantity'] = df_products_01['price']*df_products_01['initial_quantity']
df_products_01['price_x_sold_quantity'] = df_products_01['price']*df_products_01['sold_quantity']
# Powers of price (created by create_transformed_columns) times the quantity columns
df_products_01['price_2_x_sold_quantity'] = df_products_01['price_square']*df_products_01['sold_quantity']
df_products_01['price_4_x_sold_quantity'] = df_products_01['price_fourth']*df_products_01['sold_quantity']
df_products_01['price_4_x_initial_quantity'] = df_products_01['price_fourth']*df_products_01['initial_quantity']
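As a small worked illustration of these interaction columns, the sketch below builds three of them on a toy DataFrame from a mapping of new column name to the pair of columns being multiplied. The `toy` frame, its values, and the `interactions` mapping are assumptions introduced for this example; only the column names follow the notebook.

```python
import pandas as pd

# Toy data (made up for this example, not taken from the project dataset).
toy = pd.DataFrame(
    {
        "price": [80.0, 2650.0, 60.0],
        "initial_quantity": [1, 1, 3],
        "sold_quantity": [0, 2, 1],
    }
)
toy["price_square"] = toy["price"] ** 2

# New column name -> (left factor, right factor).
interactions = {
    "price_x_initial_quantity": ("price", "initial_quantity"),
    "price_x_sold_quantity": ("price", "sold_quantity"),
    "price_2_x_sold_quantity": ("price_square", "sold_quantity"),
}

for new_col, (left, right) in interactions.items():
    # Element-wise product of the two source columns.
    toy[new_col] = toy[left] * toy[right]
```

Keeping the pairs in one mapping makes it harder to assign the same column twice by accident and easier to add further products later.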

In [23]:
df_products_01.head()

Unnamed: 0,base_price,price,accepts_mercadopago,automatic_relist,subtitle,initial_quantity,sold_quantity,available_quantity,local_pick_up,free_shipping,target,month,weekday,mean_base_price_mode,min_base_price_mode,max_base_price_mode,var_base_price_mode,median_base_price_mode,std_base_price_mode,q1_base_price_mode,q3_base_price_mode,mean_initial_quantity_mode,min_initial_quantity_mode,max_initial_quantity_mode,var_initial_quantity_mode,median_initial_quantity_mode,std_initial_quantity_mode,q1_initial_quantity_mode,q3_initial_quantity_mode,mean_sold_quantity_mode,min_sold_quantity_mode,max_sold_quantity_mode,var_sold_quantity_mode,median_sold_quantity_mode,std_sold_quantity_mode,q1_sold_quantity_mode,q3_sold_quantity_mode,mean_base_price_status,min_base_price_status,max_base_price_status,var_base_price_status,median_base_price_status,std_base_price_status,q1_base_price_status,q3_base_price_status,mean_initial_quantity_status,min_initial_quantity_status,max_initial_quantity_status,var_initial_quantity_status,median_initial_quantity_status,std_initial_quantity_status,q1_initial_quantity_status,q3_initial_quantity_status,mean_sold_quantity_status,min_sold_quantity_status,max_sold_quantity_status,var_sold_quantity_status,median_sold_quantity_status,std_sold_quantity_status,q1_sold_quantity_status,q3_sold_quantity_status,mean_base_price_listing_type_id,min_base_price_listing_type_id,max_base_price_listing_type_id,var_base_price_listing_type_id,median_base_price_listing_type_id,std_base_price_listing_type_id,q1_base_price_listing_type_id,q3_base_price_listing_type_id,mean_initial_quantity_listing_type_id,min_initial_quantity_listing_type_id,max_initial_quantity_listing_type_id,var_initial_quantity_listing_type_id,median_initial_quantity_listing_type_id,std_initial_quantity_listing_type_id,q1_initial_quantity_listing_type_id,q3_initial_quantity_listing_type_id,mean_sold_quantity_listing_type_id,min_sold_quantity_listing_type_id,max_sold_quantity_listing_type_id,var_sold_qua
ntity_listing_type_id,median_sold_quantity_listing_type_id,std_sold_quantity_listing_type_id,q1_sold_quantity_listing_type_id,q3_sold_quantity_listing_type_id,mean_base_price_state_id,min_base_price_state_id,max_base_price_state_id,var_base_price_state_id,median_base_price_state_id,std_base_price_state_id,q1_base_price_state_id,q3_base_price_state_id,mean_initial_quantity_state_id,min_initial_quantity_state_id,max_initial_quantity_state_id,var_initial_quantity_state_id,median_initial_quantity_state_id,std_initial_quantity_state_id,q1_initial_quantity_state_id,q3_initial_quantity_state_id,mean_sold_quantity_state_id,min_sold_quantity_state_id,max_sold_quantity_state_id,var_sold_quantity_state_id,median_sold_quantity_state_id,std_sold_quantity_state_id,q1_sold_quantity_state_id,q3_sold_quantity_state_id,mean_base_price_automatic_relist,min_base_price_automatic_relist,max_base_price_automatic_relist,var_base_price_automatic_relist,median_base_price_automatic_relist,std_base_price_automatic_relist,q1_base_price_automatic_relist,q3_base_price_automatic_relist,mean_initial_quantity_automatic_relist,min_initial_quantity_automatic_relist,max_initial_quantity_automatic_relist,var_initial_quantity_automatic_relist,median_initial_quantity_automatic_relist,std_initial_quantity_automatic_relist,q1_initial_quantity_automatic_relist,q3_initial_quantity_automatic_relist,mean_sold_quantity_automatic_relist,min_sold_quantity_automatic_relist,max_sold_quantity_automatic_relist,var_sold_quantity_automatic_relist,median_sold_quantity_automatic_relist,std_sold_quantity_automatic_relist,q1_sold_quantity_automatic_relist,q3_sold_quantity_automatic_relist,mean_base_price_accepts_mercadopago,min_base_price_accepts_mercadopago,max_base_price_accepts_mercadopago,var_base_price_accepts_mercadopago,median_base_price_accepts_mercadopago,std_base_price_accepts_mercadopago,q1_base_price_accepts_mercadopago,q3_base_price_accepts_mercadopago,mean_initial_quantity_accepts_mercadopago,min_initial_quant
ity_accepts_mercadopago,max_initial_quantity_accepts_mercadopago,var_initial_quantity_accepts_mercadopago,median_initial_quantity_accepts_mercadopago,std_initial_quantity_accepts_mercadopago,q1_initial_quantity_accepts_mercadopago,q3_initial_quantity_accepts_mercadopago,mean_sold_quantity_accepts_mercadopago,min_sold_quantity_accepts_mercadopago,max_sold_quantity_accepts_mercadopago,var_sold_quantity_accepts_mercadopago,median_sold_quantity_accepts_mercadopago,std_sold_quantity_accepts_mercadopago,q1_sold_quantity_accepts_mercadopago,q3_sold_quantity_accepts_mercadopago,mean_base_price_local_pick_up,min_base_price_local_pick_up,max_base_price_local_pick_up,var_base_price_local_pick_up,median_base_price_local_pick_up,std_base_price_local_pick_up,q1_base_price_local_pick_up,q3_base_price_local_pick_up,mean_initial_quantity_local_pick_up,min_initial_quantity_local_pick_up,max_initial_quantity_local_pick_up,var_initial_quantity_local_pick_up,median_initial_quantity_local_pick_up,std_initial_quantity_local_pick_up,q1_initial_quantity_local_pick_up,q3_initial_quantity_local_pick_up,mean_sold_quantity_local_pick_up,min_sold_quantity_local_pick_up,max_sold_quantity_local_pick_up,var_sold_quantity_local_pick_up,median_sold_quantity_local_pick_up,std_sold_quantity_local_pick_up,q1_sold_quantity_local_pick_up,q3_sold_quantity_local_pick_up,mean_base_price_free_shipping,min_base_price_free_shipping,max_base_price_free_shipping,var_base_price_free_shipping,median_base_price_free_shipping,std_base_price_free_shipping,q1_base_price_free_shipping,q3_base_price_free_shipping,mean_initial_quantity_free_shipping,min_initial_quantity_free_shipping,max_initial_quantity_free_shipping,var_initial_quantity_free_shipping,median_initial_quantity_free_shipping,std_initial_quantity_free_shipping,q1_initial_quantity_free_shipping,q3_initial_quantity_free_shipping,mean_sold_quantity_free_shipping,min_sold_quantity_free_shipping,max_sold_quantity_free_shipping,var_sold_quantity_free_shipping,medi
an_sold_quantity_free_shipping,std_sold_quantity_free_shipping,q1_sold_quantity_free_shipping,q3_sold_quantity_free_shipping,mean_base_price_warranty_cleaned,min_base_price_warranty_cleaned,max_base_price_warranty_cleaned,var_base_price_warranty_cleaned,median_base_price_warranty_cleaned,std_base_price_warranty_cleaned,q1_base_price_warranty_cleaned,q3_base_price_warranty_cleaned,mean_initial_quantity_warranty_cleaned,min_initial_quantity_warranty_cleaned,max_initial_quantity_warranty_cleaned,var_initial_quantity_warranty_cleaned,median_initial_quantity_warranty_cleaned,std_initial_quantity_warranty_cleaned,q1_initial_quantity_warranty_cleaned,q3_initial_quantity_warranty_cleaned,mean_sold_quantity_warranty_cleaned,min_sold_quantity_warranty_cleaned,max_sold_quantity_warranty_cleaned,var_sold_quantity_warranty_cleaned,median_sold_quantity_warranty_cleaned,std_sold_quantity_warranty_cleaned,q1_sold_quantity_warranty_cleaned,q3_sold_quantity_warranty_cleaned,mean_base_price_weekday,min_base_price_weekday,max_base_price_weekday,var_base_price_weekday,median_base_price_weekday,std_base_price_weekday,q1_base_price_weekday,q3_base_price_weekday,mean_initial_quantity_weekday,min_initial_quantity_weekday,max_initial_quantity_weekday,var_initial_quantity_weekday,median_initial_quantity_weekday,std_initial_quantity_weekday,q1_initial_quantity_weekday,q3_initial_quantity_weekday,mean_sold_quantity_weekday,min_sold_quantity_weekday,max_sold_quantity_weekday,var_sold_quantity_weekday,median_sold_quantity_weekday,std_sold_quantity_weekday,q1_sold_quantity_weekday,q3_sold_quantity_weekday,mean_base_price_title_type,min_base_price_title_type,max_base_price_title_type,var_base_price_title_type,median_base_price_title_type,std_base_price_title_type,q1_base_price_title_type,q3_base_price_title_type,mean_initial_quantity_title_type,min_initial_quantity_title_type,max_initial_quantity_title_type,var_initial_quantity_title_type,median_initial_quantity_title_type,std_initial_quantity_titl
e_type,q1_initial_quantity_title_type,q3_initial_quantity_title_type,mean_sold_quantity_title_type,min_sold_quantity_title_type,max_sold_quantity_title_type,var_sold_quantity_title_type,median_sold_quantity_title_type,std_sold_quantity_title_type,q1_sold_quantity_title_type,q3_sold_quantity_title_type,mean_base_price_warranty_type,min_base_price_warranty_type,max_base_price_warranty_type,var_base_price_warranty_type,median_base_price_warranty_type,std_base_price_warranty_type,q1_base_price_warranty_type,q3_base_price_warranty_type,mean_initial_quantity_warranty_type,min_initial_quantity_warranty_type,max_initial_quantity_warranty_type,var_initial_quantity_warranty_type,median_initial_quantity_warranty_type,std_initial_quantity_warranty_type,q1_initial_quantity_warranty_type,q3_initial_quantity_warranty_type,mean_sold_quantity_warranty_type,min_sold_quantity_warranty_type,max_sold_quantity_warranty_type,var_sold_quantity_warranty_type,median_sold_quantity_warranty_type,std_sold_quantity_warranty_type,q1_sold_quantity_warranty_type,q3_sold_quantity_warranty_type,listing_type_id_Rare,listing_type_id_bronze,listing_type_id_free,listing_type_id_gold_special,listing_type_id_silver,buying_mode_Rare,buying_mode_buy_it_now,category_id_MLA1227,category_id_Rare,currency_id_ARS,currency_id_USD,status_Rare,status_active,status_paused,video_id_QQNfOicE_o8,video_id_Rare,country_name_,country_name_Argentina,country_id_,country_id_AR,state_name_Buenos Aires,state_name_Capital Federal,state_name_Rare,state_id_AR-B,state_id_AR-C,state_id_Rare,city_id_,city_id_Rare,city_id_TUxBQlBBTDI1MTVa,mode_Rare,mode_custom,mode_me2,mode_not_specified,descrip_mdo_0_Efectivo,descrip_mdo_0_Rare,descrip_mdo_0_Transferencia 
bancaria,id_mdo_0_MLAMO,id_mdo_0_MLATB,id_mdo_0_Rare,type_mdo_0_G,type_mdo_0_Rare,season_name_,season_name_All-Season,season_name_Hombre,season_name_Mujer,season_name_Niñas,season_name_Niños,season_name_Rare,gender_name_,gender_name_All-Season,gender_name_Autumn-Winter,gender_name_Rare,gender_name_Spring-Summer,warranty_cleaned_12 meses,warranty_cleaned_Rare,warranty_cleaned_missing,warranty_cleaned_otros,warranty_cleaned_si,warranty_cleaned_sin garantía,warranty_type_missing,warranty_type_new,warranty_type_unknown,warranty_type_used,first_two_words_title_Rare,first_word_title_Rare,first_three_words_title_Rare,title_type_Rare,title_type_new,title_type_unknown,date_created_month_201508,date_created_month_201509,date_created_month_201510,date_created_month_Rare,year_month_2015-08,year_month_2015-09,year_month_2015-10,year_month_Rare,concat_status_2015-08_active,concat_status_2015-09_active,concat_status_2015-10_active,concat_status_Rare,concat_var_lt_2015-08_bronze,concat_var_lt_2015-08_free,concat_var_lt_2015-09_bronze,concat_var_lt_2015-09_free,concat_var_lt_2015-09_silver,concat_var_lt_2015-10_bronze,concat_var_lt_2015-10_free,concat_var_lt_Rare,concat_var_state_2015-08_AR-B,concat_var_state_2015-08_AR-C,concat_var_state_2015-09_AR-B,concat_var_state_2015-09_AR-C,concat_var_state_2015-10_AR-B,concat_var_state_2015-10_AR-C,concat_var_state_Rare,automatic_relist_str_False,automatic_relist_str_True,concat_var_autrelist_2015-08_False,concat_var_autrelist_2015-09_False,concat_var_autrelist_2015-10_False,concat_var_autrelist_Rare,accepts_mercadopago_str_False,accepts_mercadopago_str_True,concat_var_accmdopag_2015-08_True,concat_var_accmdopag_2015-09_True,concat_var_accmdopag_2015-10_True,concat_var_accmdopag_Rare,local_pick_up_str_False,local_pick_up_str_True,concat_var_localpu_2015-08_False,concat_var_localpu_2015-08_True,concat_var_localpu_2015-09_False,concat_var_localpu_2015-09_True,concat_var_localpu_2015-10_False,concat_var_localpu_2015-10_True,concat_var_localpu_R
are,free_shipping_str_False,free_shipping_str_True,concat_var_freesh_2015-08_False,concat_var_freesh_2015-09_False,concat_var_freesh_2015-10_False,concat_var_freesh_Rare,price_square,price_cube,price_fourth,price_sqrt,price_log,price_reciprocal,price_tanh,price_inverse_sqrt,price_bin,price_sigmoid,price_arcsin,initial_quantity_square,initial_quantity_cube,initial_quantity_fourth,initial_quantity_sqrt,initial_quantity_log,initial_quantity_reciprocal,initial_quantity_tanh,initial_quantity_inverse_sqrt,initial_quantity_boxcox,initial_quantity_bin,initial_quantity_sigmoid,initial_quantity_arcsin,sold_quantity_square,sold_quantity_cube,sold_quantity_fourth,sold_quantity_sqrt,sold_quantity_log,sold_quantity_reciprocal,sold_quantity_tanh,sold_quantity_inverse_sqrt,sold_quantity_bin,sold_quantity_sigmoid,sold_quantity_arcsin,available_quantity_square,available_quantity_cube,available_quantity_fourth,available_quantity_sqrt,available_quantity_log,available_quantity_reciprocal,available_quantity_tanh,available_quantity_inverse_sqrt,available_quantity_boxcox,available_quantity_bin,available_quantity_sigmoid,available_quantity_arcsin,var_base_price_mode_square,var_base_price_mode_cube,var_base_price_mode_fourth,var_base_price_mode_sqrt,var_base_price_mode_log,var_base_price_mode_reciprocal,var_base_price_mode_tanh,var_base_price_mode_inverse_sqrt,var_base_price_mode_boxcox,var_base_price_mode_bin,var_base_price_mode_sigmoid,var_base_price_mode_arcsin,var_base_price_title_type_square,var_base_price_title_type_cube,var_base_price_title_type_fourth,var_base_price_title_type_sqrt,var_base_price_title_type_log,var_base_price_title_type_reciprocal,var_base_price_title_type_tanh,var_base_price_title_type_inverse_sqrt,var_base_price_title_type_boxcox,var_base_price_title_type_bin,var_base_price_title_type_sigmoid,var_base_price_title_type_arcsin,var_base_price_warranty_cleaned_square,var_base_price_warranty_cleaned_cube,var_base_price_warranty_cleaned_fourth,var_base_price_warranty_cle
aned_sqrt,var_base_price_warranty_cleaned_log,var_base_price_warranty_cleaned_reciprocal,var_base_price_warranty_cleaned_tanh,var_base_price_warranty_cleaned_inverse_sqrt,var_base_price_warranty_cleaned_boxcox,var_base_price_warranty_cleaned_bin,var_base_price_warranty_cleaned_sigmoid,var_base_price_warranty_cleaned_arcsin,price_x_initial_quantity,price_x_sold_quantity,price_2_x_sold_quantity,price_4_x_sold_quantity,price_4_x_initial_quantity
0,80,80,1,0,0,1,0,1,1,0,1,9,5,59757,1,2222222222,109329072070789,350,10456054,100,1500,27,1,9999,106143,1,325,1,1,1,0,2606,716,0,26,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,36286,0,2222222222,78174286476326,198,8841622,80,570,33,1,9999,139594,1,373,1,3,0,0,418,20,0,4,0,0,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,4449,1,2004105,1768778257,230,42056,85,728,27,1,9999,137730,1,371,1,2,2,0,2299,1116,0,33,0,0,5160,1,1559237,2240845748,250,47337,109,750,13,1,9999,64945,1,254,1,3,1,0,907,364,0,19,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,0,1,0,0,0,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1,1,0,1,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,6400,512000,40960000,8.944272,4.382027,0.0125,1.0,0.111803,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,399039851001513241,3951759870702142973,8366459324215131249,10456050.0,32.325383,9.146698e-15,1.0,9.563837e-08,20.129988,9,1.0,,5021389666329679504,-6183866105918272192,1992273427699142912,47337.57,21.530119,4.462601e-10,1.0,2.112487e-05,13066080000000.0,0,1.0,,2558759859875250240,5635488737196113408,-2508130037736140800,67699.159404,22.2456
58,2.181893e-10,1.0,1.5e-05,10.01129,0,1.0,,80,0,0,0,40960000
1,2650,2650,1,0,0,1,0,1,1,0,0,9,5,740,0,239999,9144042,199,3023,82,508,41,1,9999,231641,1,481,1,3,2,0,8676,2768,0,52,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,28727,1,6500000,21082242084,525,145197,200,1999,23,1,9999,156604,1,395,1,4,7,0,8676,8981,0,94,0,3,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,984,1,1000000,82512652,199,9083,79,589,48,1,9999,249118,1,499,1,4,3,0,1373,637,0,25,0,0,4449,1,2004105,1768778257,230,42056,85,728,27,1,9999,137730,1,371,1,2,2,0,2299,1116,0,33,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,61467,1,2222222222,133549725152187,215,11556371,76,766,61,1,9999,397073,1,630,1,4,3,0,6065,1864,0,43,0,0,0,0,0,0,1,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,1,0,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,1,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,7022500,18609625000,49315506250000,51.478151,7.882315,0.000377,1.0,0.019426,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,83613504097764,8248886215034505832,-2664835518784670960,3023.912,16.028613,1.093608e-07,1.0,0.0003306975,12.534841,0,1.0,,-2697387914582162544,3699261575273694400,-8750533939605753600,7289602.0,31.603919,1.881882e-14,1.0,1.371817e-07,2.109661e+19,9,1.0,,6808337740073104,-3141375632200001856,-9027633774132113152,9083.6475
05,18.228462,1.211935e-08,1.0,0.00011,9.284387,0,1.0,,2650,0,0,0,49315506250000
2,60,60,1,0,0,1,0,1,1,0,0,9,2,740,0,239999,9144042,199,3023,82,508,41,1,9999,231641,1,481,1,3,2,0,8676,2768,0,52,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,36286,0,2222222222,78174286476326,198,8841622,80,570,33,1,9999,139594,1,373,1,3,0,0,418,20,0,4,0,0,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,7195,1,6500000,6426512644,260,80165,100,900,37,1,9999,226110,1,475,1,2,2,0,1074,340,0,18,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,0,1,0,0,0,0,1,1,0,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,3600,216000,12960000,7.745967,4.094345,0.016667,1.0,0.129099,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,83613504097764,8248886215034505832,-2664835518784670960,3023.912,16.028613,1.093608e-07,1.0,0.0003306975,12.534841,0,1.0,,-2697387914582162544,3699261575273694400,-8750533939605753600,7289602.0,31.603919,1.881882e-14,1.0,1.371817e-07,2.109661e+19,9,1.0,,2558759859875250240,5635488737196113408,-2508130037736140800,67699.159404,22.245658,2.181893e
-10,1.0,1.5e-05,10.01129,0,1.0,,60,0,0,0,12960000
3,580,580,1,0,0,1,0,1,1,0,1,9,0,740,0,239999,9144042,199,3023,82,508,41,1,9999,231641,1,481,1,3,2,0,8676,2768,0,52,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,28727,1,6500000,21082242084,525,145197,200,1999,23,1,9999,156604,1,395,1,4,7,0,8676,8981,0,94,0,3,3422,0,5330000,2192546357,200,46824,80,600,43,1,9999,231224,1,480,1,3,2,0,8676,2829,0,53,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,4046,1,1800000,1054642614,250,32475,93,840,35,1,9999,172526,1,415,1,3,2,0,6065,2405,0,49,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,7704,0,6500000,4583176184,265,67699,100,890,18,1,9999,43605,1,208,1,1,1,0,8676,1839,0,42,0,0,0,0,0,0,1,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1,1,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,1,0,0,336400,195112000,113164960000,24.083189,6.363028,0.001724,1.0,0.041523,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,83613504097764,8248886215034505832,-2664835518784670960,3023.912,16.028613,1.093608e-07,1.0,0.0003306975,12.534841,0,1.0,,-2697387914582162544,3699261575273694400,-8750533939605753600,7289602.0,31.603919,1.881882e-14,1.0,1.371817e-07,2.109661e+19,9,1.0,,2558759859875250240,5635488737196113408,-2508130037736140800,67699.159404,22.24565
8,2.181893e-10,1.0,1.5e-05,10.01129,0,1.0,,580,0,0,0,113164960000
4,30,30,1,0,0,1,0,1,1,0,0,8,0,59757,1,2222222222,109329072070789,350,10456054,100,1500,27,1,9999,106143,1,325,1,1,1,0,2606,716,0,26,0,0,5174,0,6500000,2879725527,250,53663,90,800,34,1,9999,174780,1,418,1,2,2,0,8676,1872,0,43,0,0,36286,0,2222222222,78174286476326,198,8841622,80,570,33,1,9999,139594,1,373,1,3,0,0,418,20,0,4,0,0,71170,1,2222222222,141043796917680,310,11876186,100,1099,25,1,9999,111688,1,334,1,2,1,0,2606,495,0,22,0,0,28707,0,2222222222,51819250992331,240,7198558,90,790,29,1,9999,143619,1,378,1,2,2,0,8676,1689,0,41,0,0,24059,0,2222222222,50503409723708,240,7106575,90,750,35,1,9999,181301,1,425,1,3,2,0,8676,1863,0,43,0,0,29267,0,2222222222,62068994884541,249,7878387,95,760,36,1,9999,203978,1,451,1,3,2,0,8676,2148,0,46,0,0,28213,0,2222222222,50921086046018,239,7135901,90,780,31,1,9999,152024,1,389,1,2,2,0,8676,1839,0,42,0,0,984,1,1000000,82512652,199,9083,79,589,48,1,9999,249118,1,499,1,4,3,0,1373,637,0,25,0,0,4046,1,1800000,1054642614,250,32475,93,840,35,1,9999,172526,1,415,1,3,2,0,6065,2405,0,49,0,0,28983,0,2222222222,53138299393708,250,7289602,90,800,36,1,9999,187120,1,432,1,3,2,0,8676,1941,0,44,0,0,61467,1,2222222222,133549725152187,215,11556371,76,766,61,1,9999,397073,1,630,1,4,3,0,6065,1864,0,43,0,0,0,1,0,0,0,0,1,0,1,1,0,0,1,0,0,1,0,1,0,1,1,0,0,1,0,0,0,1,0,0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,1,1,1,0,0,1,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,1,0,0,0,0,1,0,1,0,0,0,0,0,1,0,1,0,0,0,900,27000,810000,5.477226,3.401197,0.033333,1.0,0.182574,0,1.0,,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,0,0,0,0.0,,0.0,0.0,0.0,0,0.5,0.0,1,1,1,1.0,0.0,1.0,0.761594,1.0,0.0,0,0.731059,1.570796,399039851001513241,3951759870702142973,8366459324215131249,10456050.0,32.325383,9.146698e-15,1.0,9.563837e-08,20.129988,9,1.0,,-2697387914582162544,3699261575273694400,-8750533939605753600,7289602.0,31.603919,1.881882e-14,1.0,1.371817e-07,2.109661e+19,9,1.0,,6808337740073104,-3141375632200001856,-90276337741321
13152,9083.647505,18.228462,1.211935e-08,1.0,0.00011,9.284387,0,1.0,,30,0,0,0,810000


In [24]:
# Save the engineered feature table as Parquet for the modelling stage.
df_products_01.to_parquet("data/Outputs/df_modelling.parquet")
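
Several `_square` and `_fourth` columns in the preview above contain negative values (for example `var_base_price_warranty_cleaned_fourth`). An even power of a real number cannot be negative, so these values are consistent with silent int64 wraparound in the polynomial features. The sketch below shows one way to flag such columns before saving. The helper name `find_overflow_suspects` and its suffix list are hypothetical and not part of `utilities_meli`. The sample frame copies three columns from the first two preview rows.

```python
import pandas as pd


def find_overflow_suspects(df, even_suffixes=("_square", "_fourth")):
    """Return even-power columns that contain negative values.

    Hypothetical helper: a negative value in an even-power column
    cannot be a valid result, so it hints at integer overflow.
    """
    suspects = []
    for col in df.columns:
        is_even_power = str(col).endswith(even_suffixes)
        if is_even_power and pd.api.types.is_numeric_dtype(df[col]):
            if (df[col] < 0).any():
                suspects.append(col)
    return suspects


# Values copied from the first two rows of the preview above.
preview = pd.DataFrame({
    "price_square": [6400, 7022500],
    "var_base_price_mode_fourth": [8366459324215131249, -2664835518784670960],
    "var_base_price_warranty_cleaned_fourth": [-2508130037736140800, -9027633774132113152],
})

suspects = find_overflow_suspects(preview)
print(suspects)
```

If columns are flagged on the full `df_products_01`, one option is to cast the base column to `float64` before raising it to a power (for example `df[col].astype("float64") ** 4`). This gives up exact integer precision but avoids wraparound.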