# NULL ANALYSIS

The goal of this section is to analyze which variables contain nulls and decide how to handle them.

<br>

It answers questions such as:
- Which variables have nulls?
- What is the best treatment for those nulls?

<br>

---

## General Setup

1. Load libraries.
2. Set notebook styles.
3. Ingest the dataset.

In [67]:
import sys
import os
import statistics

import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt



sys.path.append(os.path.abspath(os.path.join('..', '..', 'src', 'utils')))
import utils as ut

In [68]:
# Style settings
plt.style.use("ggplot")
sns.set_palette("viridis")
plt.rcParams["figure.figsize"] = (9,6)

In [80]:
wines = pd.read_csv("../../src/data/transformed/wines_transformed.csv")
pd.set_option('display.max_columns', None)
print(wines.shape)
wines.head(3)

(2026, 118)


Unnamed: 0,wine_link,name,year,winery,rating,rating_qty,price,body,tannis,sweetness,acidity,style,alcohol,image,ageing,black fruit,citrus,dried fruit,earthy,floral,oaky,red fruit,spices,tree fruit,tropical,vegetal,yeasty,any junk food will do,aperitif,appetizers and snacks,beef,blue cheese,cured meat,"game (deer, venison)",goat's milk cheese,lamb,lean fish,mature and hard cheese,mild and soft cheese,mushrooms,pasta,pork,poultry,"rich fish (salmon, tuna etc)",shellfish,spicy food,veal,vegetarian,Albariño,Barbera,Bonarda,Béquignol Noir,Cabernet Franc,Cabernet Sauvignon,Cereza,Chardonnay,Chenin Blanc,Criolla Grande,Garnacha,Gewürztraminer,Grenache,Grüner Veltliner,Malbec,Malvasia,Marsanne,Mencia,Merlot,Moscatel,Mourvedre,Pais,Pedro Ximenez,Petit Verdot,Pinot Gris,Pinot Noir,Riesling,Roussanne,Sangiovese,Sauvignon Blanc,Shiraz/Syrah,Sémillon,Tannat,Tempranillo,Torrontés,Trousseau,Verdejo,Viognier,Agrelo,Argentina,Brazil,Cafayate Valley,Calchaqui Valley,Campanha,Famatina,Gualtallary,La Consulta,La Rioja,Las Compuertas,Lujan de Cuyo,Lunlunta,Maipu,Mendoza,Paraje Altamira,Patagonia,Pedernal Valley,Perdriel,Rio Grande do Sul,Rio Negro,Salta,San Carlos,San Juan,San Rafael,Serra Gaúcha,Tulum Valley,Tunuyán,Tupungato,Uco Valley,Vale dos Vinhedos,Vista Flores
0,https://www.vivino.com/US/en/luigi-bosca-parai...,Paraiso,2020.0,Luigi Bosca,4.8,582.0,188.33,0.7343,0.509,0.1361,0.4474,Argentinian Cabernet Sauvignon - Malbec,,https://images.vivino.com/thumbs/_Bf6JTwYRpSX6...,0.0,0.35,0.0,0.0,0.125,0.05,0.325,0.05,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,https://www.vivino.com/US/en/catena-zapata-est...,Estiba Reservada,2015.0,Catena Zapata,4.7,297.0,675.0,0.7417,0.5583,0.1434,0.5445,Argentinian Bordeaux Blend,0.14,https://images.vivino.com/thumbs/Yt464jw0QS-ug...,0.0241,0.2008,0.008,0.012,0.0964,0.0241,0.4378,0.0843,0.1004,0.0,0.0,0.004,0.008,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,https://www.vivino.com/US/en/catena-zapata-est...,Estiba Reservada,2017.0,Catena Zapata,4.7,219.0,580.0,0.7417,0.5583,0.1434,0.5445,Argentinian Bordeaux Blend,,https://images.vivino.com/thumbs/Yt464jw0QS-ug...,0.0241,0.2008,0.008,0.012,0.0964,0.0241,0.4378,0.0843,0.1004,0.0,0.0,0.004,0.008,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Null Analysis

`pairing`: several wines have no food pairings. We should drop them, since they are not useful in our app.

`grapes`: if a wine has no grape information at all, we can drop it.

`price`: some wines have null prices. I looked into it and I am not confident in what Vivino shows for them. Dropping them seems the safest option (and there are only a few).

`rating_qty`: quite a few wines have very few ratings. For those, we can fill the nulls with the minimum `rating_qty` in the dataset.

`alcohol` and tastes: we fill the field with the mean alcohol or taste value per grape.

`year`: we take the median year per winery (assuming that a winery's wines have vintages close to the median of its catalog).

In [70]:
pd.DataFrame(wines.isna().sum(), columns=["nulls"]).sort_values("nulls", ascending=False).T

Unnamed: 0,alcohol,rating_qty,tannis,sweetness,body,style,acidity,mushrooms,pasta,pork,poultry,mature and hard cheese,lamb,lean fish,beef,blue cheese,cured meat,"game (deer, venison)",goat's milk cheese,any junk food will do,appetizers and snacks,aperitif,veal,vegetarian,shellfish,spicy food,"rich fish (salmon, tuna etc)",mild and soft cheese,year,price,Cabernet Franc,Béquignol Noir,Cabernet Sauvignon,Barbera,Albariño,Bonarda,Petit Verdot,Pedro Ximenez,Pais,Mourvedre,Moscatel,Merlot,Mencia,Marsanne,Malvasia,Malbec,Grüner Veltliner,Grenache,Gewürztraminer,Garnacha,Criolla Grande,Chenin Blanc,Chardonnay,Cereza,Shiraz/Syrah,Sémillon,Pinot Gris,Pinot Noir,Riesling,Roussanne,Sangiovese,Sauvignon Blanc,Torrontés,Trousseau,Tannat,Tempranillo,Verdejo,Viognier,spices,tree fruit,rating,name,tropical,yeasty,vegetal,winery,wine_link,image,ageing,black fruit,citrus,dried fruit,earthy,floral,oaky,red fruit,Agrelo,Argentina,Brazil,Cafayate Valley,Calchaqui Valley,Campanha,Famatina,Gualtallary,La Consulta,La Rioja,Las Compuertas,Lujan de Cuyo,Lunlunta,Maipu,Mendoza,Paraje Altamira,Patagonia,Pedernal Valley,Perdriel,Rio Grande do Sul,Rio Negro,Salta,San Carlos,San Juan,San Rafael,Serra Gaúcha,Tulum Valley,Tunuyán,Tupungato,Uco Valley,Vale dos Vinhedos,Vista Flores
nulls,1047,420,363,81,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,52,28,17,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
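The raw counts above are easier to weigh against the 2,026 rows when expressed as a share of the dataset. A minimal sketch, assuming a hypothetical helper `null_summary` (not part of the notebook) and a toy frame with made-up values standing in for `wines`:

```python
import numpy as np
import pandas as pd


def null_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return null count and null share per column, keeping only columns with nulls."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "nulls": counts,
        "pct": (counts / len(df) * 100).round(2),
    })
    return summary[summary["nulls"] > 0].sort_values("nulls", ascending=False)


# Toy frame standing in for `wines` (values are made up for illustration)
toy = pd.DataFrame({
    "alcohol": [np.nan, 0.14, np.nan, np.nan],
    "price": [188.33, np.nan, 580.0, 20.0],
    "rating": [4.8, 4.7, 4.7, 4.1],
})
summary = null_summary(toy)
print(summary)
```

Calling `null_summary(wines)` on the real dataset would list only the columns that need treatment, which keeps the table short enough to read without transposing it.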


*pairings + grapes + price*

In [71]:
# Drop rows that have no pairings, no grapes, or no price

wines_clean = wines.copy()

pairings = [
    "aperitif", "appetizers and snacks", "beef", "blue cheese", "cured meat",
    "game (deer, venison)", "goat's milk cheese", "lamb", "lean fish",
    "mature and hard cheese", "mild and soft cheese", "mushrooms", "pasta",
    "pork", "poultry", "rich fish (salmon, tuna etc)", "shellfish",
    "spicy food", "veal", "vegetarian"
]

grapes = [
    "Albariño", "Barbera", "Bonarda", "Béquignol Noir", "Cabernet Franc",
    "Cabernet Sauvignon", "Cereza", "Chardonnay", "Chenin Blanc", "Criolla Grande",
    "Garnacha", "Gewürztraminer", "Grenache", "Grüner Veltliner", "Malbec",
    "Malvasia", "Marsanne", "Mencia", "Merlot", "Moscatel", "Mourvedre", "Pais",
    "Pedro Ximenez", "Petit Verdot", "Pinot Gris", "Pinot Noir", "Riesling",
    "Roussanne", "Sangiovese", "Sauvignon Blanc", "Shiraz/Syrah", "Sémillon",
    "Tannat", "Tempranillo", "Torrontés", "Trousseau", "Verdejo", "Viognier"
]

wines_clean = wines_clean[~wines_clean[pairings].isna().all(axis=1)]

wines_clean = wines_clean[~wines_clean[grapes].isna().all(axis=1)]

wines_clean = wines_clean[~wines_clean["price"].isna()]

# Check remaining nulls
pd.DataFrame(wines_clean.isna().sum(), columns=["nulls"]).sort_values("nulls", ascending=False).T

Unnamed: 0,alcohol,rating_qty,tannis,sweetness,year,name,rating,winery,wine_link,body,price,acidity,style,image,ageing,black fruit,citrus,dried fruit,earthy,floral,oaky,red fruit,spices,tree fruit,tropical,vegetal,yeasty,any junk food will do,aperitif,appetizers and snacks,beef,blue cheese,cured meat,"game (deer, venison)",goat's milk cheese,lamb,lean fish,mature and hard cheese,mild and soft cheese,mushrooms,pasta,pork,poultry,"rich fish (salmon, tuna etc)",shellfish,spicy food,veal,vegetarian,Albariño,Barbera,Bonarda,Béquignol Noir,Cabernet Franc,Cabernet Sauvignon,Cereza,Chardonnay,Chenin Blanc,Criolla Grande,Garnacha,Gewürztraminer,Grenache,Grüner Veltliner,Malbec,Malvasia,Marsanne,Mencia,Merlot,Moscatel,Mourvedre,Pais,Pedro Ximenez,Petit Verdot,Pinot Gris,Pinot Noir,Riesling,Roussanne,Sangiovese,Sauvignon Blanc,Shiraz/Syrah,Sémillon,Tannat,Tempranillo,Torrontés,Trousseau,Verdejo,Viognier,Agrelo,Argentina,Brazil,Cafayate Valley,Calchaqui Valley,Campanha,Famatina,Gualtallary,La Consulta,La Rioja,Las Compuertas,Lujan de Cuyo,Lunlunta,Maipu,Mendoza,Paraje Altamira,Patagonia,Pedernal Valley,Perdriel,Rio Grande do Sul,Rio Negro,Salta,San Carlos,San Juan,San Rafael,Serra Gaúcha,Tulum Valley,Tunuyán,Tupungato,Uco Valley,Vale dos Vinhedos,Vista Flores
nulls,993,393,302,27,24,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
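The three boolean-mask filters in the cell above can also be written with `DataFrame.dropna`, using `subset` to pick the columns and `how="all"` to drop a row only when every listed column is null. A small sketch on a toy frame (the column subset and values are illustrative, not taken from the dataset):

```python
import numpy as np
import pandas as pd

# Toy frame: row 1 has no pairings, row 2 has no grapes, row 3 has no price
toy = pd.DataFrame({
    "beef": [1.0, np.nan, 0.0, 1.0],
    "pasta": [0.0, np.nan, 1.0, 0.0],
    "Malbec": [1.0, 1.0, np.nan, 0.0],
    "Merlot": [0.0, 0.0, np.nan, 1.0],
    "price": [10.0, 12.0, 15.0, np.nan],
})
toy_pairings = ["beef", "pasta"]
toy_grapes = ["Malbec", "Merlot"]

# dropna version: one call per group of columns
toy_clean = (
    toy
    .dropna(subset=toy_pairings, how="all")
    .dropna(subset=toy_grapes, how="all")
    .dropna(subset=["price"])
)

# Boolean-mask version, mirroring the notebook cell
mask_version = toy[~toy[toy_pairings].isna().all(axis=1)]
mask_version = mask_version[~mask_version[toy_grapes].isna().all(axis=1)]
mask_version = mask_version[~mask_version["price"].isna()]

print(toy_clean)
```

Both versions keep the same rows; the `dropna` form just states the intent ("drop when all of these are missing") more directly.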


*rating_qty*

In [72]:
# Impute nulls with the minimum rating count
min_rating_qty = wines_clean["rating_qty"].min()
wines_clean["rating_qty"] = wines_clean["rating_qty"].fillna(min_rating_qty)

# Check remaining nulls
pd.DataFrame(wines_clean.isna().sum(), columns=["nulls"]).sort_values("nulls", ascending=False).T

Unnamed: 0,alcohol,tannis,sweetness,year,name,wine_link,rating_qty,rating,winery,body,price,acidity,style,image,ageing,black fruit,citrus,dried fruit,earthy,floral,oaky,red fruit,spices,tree fruit,tropical,vegetal,yeasty,any junk food will do,aperitif,appetizers and snacks,beef,blue cheese,cured meat,"game (deer, venison)",goat's milk cheese,lamb,lean fish,mature and hard cheese,mild and soft cheese,mushrooms,pasta,pork,poultry,"rich fish (salmon, tuna etc)",shellfish,spicy food,veal,vegetarian,Albariño,Barbera,Bonarda,Béquignol Noir,Cabernet Franc,Cabernet Sauvignon,Cereza,Chardonnay,Chenin Blanc,Criolla Grande,Garnacha,Gewürztraminer,Grenache,Grüner Veltliner,Malbec,Malvasia,Marsanne,Mencia,Merlot,Moscatel,Mourvedre,Pais,Pedro Ximenez,Petit Verdot,Pinot Gris,Pinot Noir,Riesling,Roussanne,Sangiovese,Sauvignon Blanc,Shiraz/Syrah,Sémillon,Tannat,Tempranillo,Torrontés,Trousseau,Verdejo,Viognier,Agrelo,Argentina,Brazil,Cafayate Valley,Calchaqui Valley,Campanha,Famatina,Gualtallary,La Consulta,La Rioja,Las Compuertas,Lujan de Cuyo,Lunlunta,Maipu,Mendoza,Paraje Altamira,Patagonia,Pedernal Valley,Perdriel,Rio Grande do Sul,Rio Negro,Salta,San Carlos,San Juan,San Rafael,Serra Gaúcha,Tulum Valley,Tunuyán,Tupungato,Uco Valley,Vale dos Vinhedos,Vista Flores
nulls,993,302,27,24,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
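Filling with the minimum makes imputed rows indistinguishable from wines that genuinely have the minimum count. If that distinction matters downstream, one option is to record a flag before filling. This is only a sketch: the `rating_qty_imputed` column is a hypothetical addition, not something the notebook creates, and the values are made up.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({"rating_qty": [582.0, np.nan, 25.0, np.nan]})

# Remember which rows were missing before overwriting them (hypothetical column)
toy["rating_qty_imputed"] = toy["rating_qty"].isna()

# Same imputation rule as the notebook: fill with the observed minimum
toy["rating_qty"] = toy["rating_qty"].fillna(toy["rating_qty"].min())

print(toy)
```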


_alcohol, tannis, sweetness_

In [73]:
# Imputes each NaN taste value with the mean of the per-grape means of the wine's grapes
def impute_taste(df, grapes, columns):
    for col in columns:
        # Dictionary with the mean of this column per grape
        grape_taste_mean = {}

        # Overall mean, used as a fallback
        grape_taste_mean["General"] = round(df[col].mean(), 4)

        for grape in grapes:
            taste_mean = df.loc[df[grape] == 1, col].mean()
            grape_taste_mean[grape] = round(taste_mean, 4)

        for idx, row in df.iterrows():
            if pd.notna(row[col]):
                continue

            grape_taste_mix = []

            for grape in grapes:
                if row[grape] == 1:
                    mean = grape_taste_mean.get(grape)
                    if not np.isnan(mean):
                        grape_taste_mix.append(mean)
                    else:
                        grape_taste_mix.append(grape_taste_mean.get("General"))
            # Average the per-grape means over the grapes in this wine
            if grape_taste_mix:
                df.at[idx, col] = round(np.mean(grape_taste_mix), 4)
            else:
                df.at[idx, col] = grape_taste_mean.get("General")
    return df


In [74]:
grapes = ["Albariño", "Barbera", "Bonarda", "Béquignol Noir", "Cabernet Franc",
          "Cabernet Sauvignon", "Cereza", "Chardonnay", "Chenin Blanc", "Criolla Grande",
          "Garnacha", "Gewürztraminer", "Grenache", "Grüner Veltliner", "Malbec",
          "Malvasia", "Marsanne", "Mencia", "Merlot", "Moscatel", "Mourvedre", "Pais",
          "Pedro Ximenez", "Petit Verdot", "Pinot Gris", "Pinot Noir", "Riesling",
          "Roussanne", "Sangiovese", "Sauvignon Blanc", "Shiraz/Syrah", "Sémillon",
          "Tannat", "Tempranillo", "Torrontés", "Trousseau", "Verdejo", "Viognier"]

wines_clean = impute_taste(wines_clean, grapes, ["alcohol", "tannis", "sweetness"])

# Check remaining nulls
pd.DataFrame(wines_clean.isna().sum(), columns=["nulls"]).sort_values("nulls", ascending=False).T

Unnamed: 0,year,wine_link,name,winery,rating,rating_qty,price,body,tannis,sweetness,acidity,style,alcohol,image,ageing,black fruit,citrus,dried fruit,earthy,floral,oaky,red fruit,spices,tree fruit,tropical,vegetal,yeasty,any junk food will do,aperitif,appetizers and snacks,beef,blue cheese,cured meat,"game (deer, venison)",goat's milk cheese,lamb,lean fish,mature and hard cheese,mild and soft cheese,mushrooms,pasta,pork,poultry,"rich fish (salmon, tuna etc)",shellfish,spicy food,veal,vegetarian,Albariño,Barbera,Bonarda,Béquignol Noir,Cabernet Franc,Cabernet Sauvignon,Cereza,Chardonnay,Chenin Blanc,Criolla Grande,Garnacha,Gewürztraminer,Grenache,Grüner Veltliner,Malbec,Malvasia,Marsanne,Mencia,Merlot,Moscatel,Mourvedre,Pais,Pedro Ximenez,Petit Verdot,Pinot Gris,Pinot Noir,Riesling,Roussanne,Sangiovese,Sauvignon Blanc,Shiraz/Syrah,Sémillon,Tannat,Tempranillo,Torrontés,Trousseau,Verdejo,Viognier,Agrelo,Argentina,Brazil,Cafayate Valley,Calchaqui Valley,Campanha,Famatina,Gualtallary,La Consulta,La Rioja,Las Compuertas,Lujan de Cuyo,Lunlunta,Maipu,Mendoza,Paraje Altamira,Patagonia,Pedernal Valley,Perdriel,Rio Grande do Sul,Rio Negro,Salta,San Carlos,San Juan,San Rafael,Serra Gaúcha,Tulum Valley,Tunuyán,Tupungato,Uco Valley,Vale dos Vinhedos,Vista Flores
nulls,24,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
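`impute_taste` walks the frame row by row with `iterrows`, which is fine at this size but gets slow on larger data. The same rule (average the per-grape means over the grapes present in the wine, falling back to the overall mean) can be expressed as a matrix product. The sketch below uses a hypothetical function name, `impute_taste_vectorized`, and a toy frame; unlike the notebook function it returns a copy and rounds only the final value, so results may differ in the last decimal.

```python
import numpy as np
import pandas as pd


def impute_taste_vectorized(df, grapes, columns):
    """Fill NaNs in `columns` with the mean of the per-grape means of each wine's grapes."""
    df = df.copy()
    # 1.0 where the wine contains the grape, 0.0 otherwise (NaN counts as absent)
    grape_mask = df[grapes].eq(1).to_numpy(dtype=float)
    grapes_per_wine = grape_mask.sum(axis=1)

    for col in columns:
        general_mean = df[col].mean()
        # Mean of `col` over the wines containing each grape; overall mean if undefined
        grape_means = np.array([df.loc[df[g] == 1, col].mean() for g in grapes])
        grape_means = np.where(np.isnan(grape_means), general_mean, grape_means)

        # Sum of per-grape means for each wine, divided by its number of grapes
        with np.errstate(invalid="ignore", divide="ignore"):
            blend = grape_mask @ grape_means / grapes_per_wine
        # Wines with no grape flagged fall back to the overall mean
        blend = np.where(grapes_per_wine > 0, blend, general_mean)

        fill = pd.Series(blend, index=df.index).round(4)
        df[col] = df[col].fillna(fill)
    return df


# Toy data (made-up values): row 3 is a Malbec/Merlot blend, row 4 has no grape flagged
toy = pd.DataFrame({
    "Malbec": [1, 1, 0, 1, 0],
    "Merlot": [0, 0, 1, 1, 0],
    "alcohol": [0.14, 0.16, 0.12, np.nan, np.nan],
})
result = impute_taste_vectorized(toy, ["Malbec", "Merlot"], ["alcohol"])
print(result)
```

In the toy data the Malbec mean is 0.15 and the Merlot mean is 0.12, so the blend in row 3 receives their average, while row 4 receives the overall mean of the observed values.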


*year*

In [75]:
# Imputes the median year per winery to each wine whose year is NaN
def impute_year(df, wineries, winery_col="winery", year_col="year"):
    # Dictionary with the median year per winery
    winery_year_median = {}

    # Overall median, used as a fallback
    winery_year_median["General"] = round(df[year_col].median(), 4)

    for winery in wineries:
        winery_median = df[df[winery_col] == winery][year_col].median()
        if not pd.isna(winery_median):
            winery_year_median[winery] = round(winery_median, 4)
        else:
            winery_year_median[winery] = winery_year_median.get("General")

    for idx, row in df[df[year_col].isna()].iterrows():
        winery_year = winery_year_median.get(row[winery_col])
        if winery_year is not None:
            df.at[idx, year_col] = winery_year
        else:
            df.at[idx, year_col] = winery_year_median.get("General")
    return df

In [76]:
# Impute year with the median per winery
wineries = list(set(wines_clean[wines_clean["year"].isna()]["winery"]))
wines_clean = impute_year(wines_clean, wineries)

# Check remaining nulls
pd.DataFrame(wines_clean.isna().sum(), columns=["nulls"]).sort_values("nulls", ascending=False).T



Unnamed: 0,wine_link,name,year,winery,rating,rating_qty,price,body,tannis,sweetness,acidity,style,alcohol,image,ageing,black fruit,citrus,dried fruit,earthy,floral,oaky,red fruit,spices,tree fruit,tropical,vegetal,yeasty,any junk food will do,aperitif,appetizers and snacks,beef,blue cheese,cured meat,"game (deer, venison)",goat's milk cheese,lamb,lean fish,mature and hard cheese,mild and soft cheese,mushrooms,pasta,pork,poultry,"rich fish (salmon, tuna etc)",shellfish,spicy food,veal,vegetarian,Albariño,Barbera,Bonarda,Béquignol Noir,Cabernet Franc,Cabernet Sauvignon,Cereza,Chardonnay,Chenin Blanc,Criolla Grande,Garnacha,Gewürztraminer,Grenache,Grüner Veltliner,Malbec,Malvasia,Marsanne,Mencia,Merlot,Moscatel,Mourvedre,Pais,Pedro Ximenez,Petit Verdot,Pinot Gris,Pinot Noir,Riesling,Roussanne,Sangiovese,Sauvignon Blanc,Shiraz/Syrah,Sémillon,Tannat,Tempranillo,Torrontés,Trousseau,Verdejo,Viognier,Agrelo,Argentina,Brazil,Cafayate Valley,Calchaqui Valley,Campanha,Famatina,Gualtallary,La Consulta,La Rioja,Las Compuertas,Lujan de Cuyo,Lunlunta,Maipu,Mendoza,Paraje Altamira,Patagonia,Pedernal Valley,Perdriel,Rio Grande do Sul,Rio Negro,Salta,San Carlos,San Juan,San Rafael,Serra Gaúcha,Tulum Valley,Tunuyán,Tupungato,Uco Valley,Vale dos Vinhedos,Vista Flores
nulls,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
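The per-winery median with an overall fallback can also be computed with `groupby(...).transform("median")`, which returns one median per row aligned to the original index. A minimal sketch, assuming a hypothetical function name `impute_year_by_winery` and made-up toy data:

```python
import numpy as np
import pandas as pd


def impute_year_by_winery(df, winery_col="winery", year_col="year"):
    """Fill missing years with the winery median, then with the overall median."""
    df = df.copy()
    # Median year of each row's winery (NaN when the winery has no known years)
    winery_median = df.groupby(winery_col)[year_col].transform("median")
    overall_median = df[year_col].median()
    df[year_col] = df[year_col].fillna(winery_median).fillna(overall_median)
    return df


# Toy data: winery "A" has known years, winery "B" has none
toy = pd.DataFrame({
    "winery": ["A", "A", "A", "C", "B"],
    "year": [2015.0, 2017.0, np.nan, 2020.0, np.nan],
})
result = impute_year_by_winery(toy)
print(result)
```

As with the notebook function, a median over an even number of vintages can be fractional (for example 2015.5), so the column may need rounding if integer years are required.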


*Final dataset save*

In [82]:
print(wines_clean.shape)
ut.save_csv(wines_clean, path="../../src/data/transformed/" , filename="wines_clean.csv")

(1955, 118)
Archivo guardado en: ../../src/data/transformed/wines_clean.csv
