# **Day 45: Winequality-red**

This dataset contains information on 1,599 samples of Portuguese red wine (Vinho Verde). It pairs physicochemical measurements (obtained by laboratory analysis) with sensory evaluations (quality scores assigned by experts).

The main objective is to predict **wine quality** (scored from 0 to 10) from the wine's chemical properties.

$\bigoplus$ **Meaning of the variables**

+  fixed acidity: Fixed (non-volatile) acids (tartaric, malic)
+  volatile acidity: Acetic acid (too high = vinegar taste)
+  citric acid: Citric acid content (freshness and balance)
+  residual sugar: Sugar remaining after fermentation (g/L)
+  chlorides: Salt content (salty/metallic taste)
+  free sulfur dioxide: Free SO₂ (antioxidant and preservative)
+  total sulfur dioxide: Total SO₂ (free + bound)
+  density: Density (related to alcohol and sugar content)
+  pH: Acidity level (0-14 scale)
+  sulphates: Potassium sulphate (preservative)
+  alcohol: Alcohol content (% by volume)
+  quality: Sensory score given by experts, from 0 (very poor) to 10 (excellent)
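
Because total SO₂ is the sum of the free and bound forms, the bound fraction can be derived from the two measured columns. The snippet below is a minimal sketch of that arithmetic, assuming a DataFrame that uses the column names listed above. The helper `add_bound_so2` and the column name `bound sulfur dioxide` are hypothetical and not part of the dataset; the two sample rows reuse values from the first rows returned by `data.head()`.

```python
import pandas as pd


def add_bound_so2(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a derived 'bound sulfur dioxide' column.

    'bound sulfur dioxide' is a hypothetical column name (not in the dataset),
    computed as total SO2 minus free SO2.
    """
    out = df.copy()
    out["bound sulfur dioxide"] = (
        out["total sulfur dioxide"] - out["free sulfur dioxide"]
    )
    return out


# Values copied from the first two rows of the data.head() output
sample = pd.DataFrame({
    "free sulfur dioxide": [11.0, 25.0],
    "total sulfur dioxide": [34.0, 67.0],
})

result = add_bound_so2(sample)
print(result)
```

Whether such a derived column helps the prediction is an open question; it is shown here only to illustrate how the two SO₂ variables relate.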

## 0. Loading the libraries

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error
import plotly.express as px
import statsmodels.api as sm


## 1. Loading the dataset

In [2]:
data = pd.read_csv("../data/winequality-red.csv")
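
Right after loading, it can be useful to confirm that the file contains the expected variables and no missing values. The following is a sketch only: `EXPECTED_COLUMNS` and `check_columns` are hypothetical names introduced for illustration, and a one-row in-memory CSV stands in for `../data/winequality-red.csv` so that the example runs without the file.

```python
import io

import pandas as pd

# The twelve variables described in the introduction
EXPECTED_COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol", "quality",
]


def check_columns(df, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from df."""
    return [col for col in expected if col not in df.columns]


# Stand-in for the real file: the header plus the first row of the preview
csv_text = (
    ",".join(EXPECTED_COLUMNS) + "\n"
    "7.4,0.7,0.0,1.9,0.076,11.0,34.0,0.9978,3.51,0.56,9.4,5\n"
)
demo = pd.read_csv(io.StringIO(csv_text))

missing = check_columns(demo)
print("Missing columns:", missing)
print("Shape (rows, columns):", demo.shape)
print("Missing values per column:")
print(demo.isna().sum())
```

On the real data, the same checks would be called as `check_columns(data)`, `data.shape`, and `data.isna().sum()`.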


In [3]:
data.head()

| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7.4 | 0.7 | 0.0 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5 |
| 1 | 7.8 | 0.88 | 0.0 | 2.6 | 0.098 | 25.0 | 67.0 | 0.9968 | 3.2 | 0.68 | 9.8 | 5 |
| 2 | 7.8 | 0.76 | 0.04 | 2.3 | 0.092 | 15.0 | 54.0 | 0.997 | 3.26 | 0.65 | 9.8 | 5 |
| 3 | 11.2 | 0.28 | 0.56 | 1.9 | 0.075 | 17.0 | 60.0 | 0.998 | 3.16 | 0.58 | 9.8 | 6 |
| 4 | 7.4 | 0.7 | 0.0 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5 |
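
Rows 0 and 4 of this preview are identical in every column, which suggests the dataset may contain fully duplicated rows. The sketch below shows one way to count them with `DataFrame.duplicated()`, using three rows copied from the preview (a subset of columns, for brevity). Whether such rows are accidental repeats or distinct wines with identical measurements is not stated in the source, so dropping them is a judgment call rather than a required step.

```python
import pandas as pd

# Rows 0, 1 and 4 of the preview, restricted to four columns for brevity
preview = pd.DataFrame({
    "fixed acidity": [7.4, 7.8, 7.4],
    "volatile acidity": [0.70, 0.88, 0.70],
    "alcohol": [9.4, 9.8, 9.4],
    "quality": [5, 5, 5],
})

# duplicated() flags each repeat of an earlier row; the first occurrence stays False
dup_mask = preview.duplicated()
n_duplicates = int(dup_mask.sum())
print(f"Fully duplicated rows: {n_duplicates}")

# drop_duplicates() keeps the first occurrence of each distinct row
deduplicated = preview.drop_duplicates()
print(deduplicated)
```

On the full dataset, the equivalent count would be `data.duplicated().sum()`.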


In [4]:
data.describe().transpose()

variable,count,mean,std,min,25%,50%,75%,max
fixed acidity,1596.0,8.31416,1.732203,4.6,7.1,7.9,9.2,15.6
volatile acidity,1596.0,0.527954,0.179176,0.12,0.39,0.52,0.64,1.58
citric acid,1596.0,0.270276,0.193894,0.0,0.09,0.26,0.42,0.79
residual sugar,1596.0,2.535558,1.405515,0.9,1.9,2.2,2.6,15.5
chlorides,1596.0,0.08712,0.045251,0.012,0.07,0.079,0.09,0.611
free sulfur dioxide,1596.0,15.858396,10.460554,1.0,7.0,14.0,21.0,72.0
total sulfur dioxide,1596.0,46.382206,32.839138,6.0,22.0,38.0,62.0,289.0
density,1596.0,0.996744,0.001888,0.99007,0.9956,0.996745,0.997833,1.00369
pH,1596.0,3.311917,0.153346,2.86,3.21,3.31,3.4,4.01
sulphates,1596.0,0.656385,0.163057,0.33,0.55,0.62,0.73,1.98
