# Wine Project: Data Analysis and Classification
**Goal:** Apply machine learning models to classify red wine quality from its chemical features.  
**Tools:** Python, pandas, scikit-learn, matplotlib, seaborn.


## Load Dataset
Import the dataset and take a first look at its structure.


## Exploratory Data Analysis
Visualize feature distributions and correlations to better understand the data.
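
One way to make the correlation step concrete is to rank every feature by how strongly it correlates with the target. The sketch below is only an illustration: the helper name `rank_correlations` is hypothetical, and the small `toy` table holds made-up values, not rows from the wine dataset.

```python
import pandas as pd


def rank_correlations(df, target="quality"):
    """Return each feature's Pearson correlation with `target`, strongest first."""
    # Keep numeric columns only, then take the target's column of the matrix
    corr = df.select_dtypes("number").corr()[target].drop(target)
    # Order by absolute value so strong negative correlations rank high too
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


# Toy values for illustration only; NOT taken from the wine dataset.
toy = pd.DataFrame({
    "alcohol": [9.0, 10.0, 11.0, 12.0, 13.0],
    "pH": [3.5, 3.2, 3.4, 3.3, 3.1],
    "quality": [4, 5, 5, 6, 7],
})
ranked = rank_correlations(toy)
print(ranked)
```

In the notebook itself, the same helper could be called as `rank_correlations(df)` once the dataset is loaded. For a visual overview, seaborn's `sns.heatmap(df.corr(), annot=True)` is a common alternative.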

## Model Training
Apply machine learning models (e.g., logistic regression, decision tree) to classify wine.
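
The training step might look roughly like the sketch below. It rests on several assumptions that are not part of the original notebook: the target is binarized as "good" when `quality >= 6`, 20% of the rows are held out for testing, and the tree depth is capped at 5. The function name `train_models` is hypothetical, and the `demo` table is random placeholder data so the block runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def train_models(df, threshold=6, seed=42):
    """Fit two classifiers on a binary 'good wine' target (quality >= threshold)."""
    X = df.drop(columns="quality")
    # Assumption: treat the task as binary classification (good vs. not good)
    y = (df["quality"] >= threshold).astype(int)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    models = {
        # Logistic regression is scale-sensitive, so standardize features first
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        # Assumption: a shallow tree (max_depth=5) to limit overfitting
        "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=seed),
    }
    for model in models.values():
        model.fit(X_train, y_train)
    return models, X_test, y_test


# Random placeholder data so the sketch runs on its own; NOT the real dataset.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "alcohol": rng.uniform(8, 14, 200),
    "pH": rng.uniform(2.8, 4.0, 200),
})
demo["quality"] = np.where(demo["alcohol"] > 11, 6, 5)

models, X_test, y_test = train_models(demo)
```

With the real data, the call would be `train_models(df)` after the CSV has been read. The returned test split can then be passed to whatever evaluation code is used.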


## Model Evaluation
Evaluate model performance using accuracy, confusion matrix, and other metrics.
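
As a minimal sketch of the metrics named above, the hypothetical helper `evaluate_predictions` below bundles accuracy, the confusion matrix, and a per-class report from scikit-learn. The labels are toy values chosen for illustration, not model output from this notebook.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix


def evaluate_predictions(y_true, y_pred):
    """Collect common classification metrics in one dictionary."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        # Rows are true classes, columns are predicted classes
        "confusion_matrix": confusion_matrix(y_true, y_pred),
        # Precision, recall, and F1 per class
        "report": classification_report(y_true, y_pred, zero_division=0),
    }


# Toy labels for illustration only; NOT results from the wine models.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

results = evaluate_predictions(y_true, y_pred)
print("Accuracy:", results["accuracy"])
print("Confusion matrix:\n", results["confusion_matrix"])
print(results["report"])
```

For a fitted classifier, `y_pred` would come from `model.predict(X_test)` on the held-out split.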


In [None]:
# Upload the dataset from the local machine (this works only in Google Colab)
from google.colab import files

uploaded = files.upload()


In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# The file is semicolon-delimited, so pass the separator explicitly
df = pd.read_csv("winequality-red.txt", sep=";")

# Preview the first five rows
df.head()


In [None]:
# Explore the dataset
print("Dataset shape:", df.shape)
print("Columns:", df.columns.tolist())
print("Missing values:\n", df.isnull().sum())

# Summary statistics
df.describe()


In [None]:
# Simple analysis
# Count of each quality score
print("Quality counts:\n", df['quality'].value_counts().sort_index())

# Average wine quality
print("Average quality:", df['quality'].mean())

# Correlation between alcohol and quality
print("Alcohol vs Quality correlation:\n", df[['alcohol','quality']].corr())


In [None]:
# Visualizations

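# Bar chart of wine quality: scores are integers, so counting each value
# avoids the empty bins that hist(bins=10) can leave between scores
df['quality'].value_counts().sort_index().plot(kind="bar")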
plt.xlabel("Wine Quality")
plt.ylabel("Count")
plt.title("Distribution of Wine Quality")
plt.show()

# Scatter plot: Alcohol vs Quality
df.plot(kind="scatter", x="alcohol", y="quality", alpha=0.5)
plt.title("Alcohol vs Wine Quality")
plt.show()


## Mini Conclusion
- Most wines have a quality rating of 5–6.
- Alcohol content appears positively correlated with quality, based on the correlation and scatter plot above.
- Other chemical properties, such as acidity and pH, may also influence quality, but they are not examined here.
