# 🐧 Palmer Penguins: A Data Science Recipe

This notebook explores the Palmer Penguins dataset with **pandas** and **Matplotlib**. It covers loading the data, handling missing values, and visualizing species counts and bill measurements, which is the groundwork for later modeling.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset
url = "https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv"
penguins = pd.read_csv(url)
penguins.head()

## 🧹 Step 1: Data Cleaning
We'll check for missing values, then drop the rows that contain them. Filling (imputing) the gaps is the alternative when losing rows is too costly.

In [None]:
# Column dtypes and non-null counts
penguins.info()
# Number of missing values per column
penguins.isnull().sum()

In [None]:
# Drop rows with missing values
penguins_clean = penguins.dropna()
penguins_clean.shape
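
Dropping rows is the simplest option, but it discards every measurement in a row that has even one gap. A minimal sketch of the filling alternative follows, assuming median imputation for numeric columns and the most frequent value for text columns. The small `sample` frame and the `fill_missing` helper are hypothetical stand-ins for illustration; they are not part of the dataset or of pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in rows that reuse the dataset's column names
sample = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo"],
    "bill_length_mm": [39.1, np.nan, 46.1, 50.0],
    "sex": ["male", "female", None, "female"],
})

def fill_missing(df):
    """Return a copy with numeric gaps set to the column median and
    text gaps set to the column's most frequent value."""
    filled = df.copy()
    for col in filled.columns:
        if pd.api.types.is_numeric_dtype(filled[col]):
            filled[col] = filled[col].fillna(filled[col].median())
        else:
            # mode() ignores missing values; this assumes the column
            # has at least one non-missing entry
            filled[col] = filled[col].fillna(filled[col].mode().iloc[0])
    return filled

sample_filled = fill_missing(sample)
print(sample_filled)
```

Whether imputation is appropriate depends on how many values are missing and why. With only a few incomplete rows, dropping them as above is usually the lower-risk choice.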

## 📊 Step 2: Visualizations with Matplotlib
Let's plot how many penguins belong to each species and how bill length relates to bill depth.

In [None]:
# Bar chart of species count
penguins_clean['species'].value_counts().plot(kind='bar', title='Species Count')
plt.xlabel('Species')
plt.ylabel('Count')
plt.show()
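
The bar chart shows absolute counts. The same `value_counts` call can also report each species' share of the data. A minimal sketch, using a hypothetical `species` series in place of `penguins_clean['species']`:

```python
import pandas as pd

# Hypothetical stand-in for penguins_clean['species']
species = pd.Series(["Adelie", "Adelie", "Gentoo", "Chinstrap"], name="species")

counts = species.value_counts()                # absolute counts per species
shares = species.value_counts(normalize=True)  # fractions that sum to 1

# Both results are indexed by species name, so they align row by row
summary = pd.DataFrame({"count": counts, "share": shares})
print(summary)
```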

In [None]:
# Scatter plot: Bill Length vs Bill Depth
for species in penguins_clean['species'].unique():
    subset = penguins_clean[penguins_clean['species'] == species]
    plt.scatter(subset['bill_length_mm'], subset['bill_depth_mm'], label=species)

plt.xlabel('Bill Length (mm)')
plt.ylabel('Bill Depth (mm)')
plt.title('Bill Length vs Bill Depth by Species')
plt.legend()
plt.show()
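
One way to carry the cleaned data toward modeling is to separate the target from the features and one-hot encode the text columns. The sketch below assumes `species` is the target and uses a hypothetical `sample` frame with the dataset's column names. The choice of features is illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for penguins_clean
sample = pd.DataFrame({
    "species": ["Adelie", "Gentoo", "Chinstrap", "Gentoo"],
    "island": ["Torgersen", "Biscoe", "Dream", "Biscoe"],
    "bill_length_mm": [39.1, 46.1, 46.5, 50.0],
    "bill_depth_mm": [18.7, 13.2, 17.9, 16.3],
    "sex": ["male", "female", "female", "male"],
})

# Assumed target: the species label
y = sample["species"]

# Features: every other column, with text columns expanded
# into one indicator column per category
X = pd.get_dummies(sample.drop(columns="species"), columns=["island", "sex"])

print(X.columns.tolist())
print(X.shape, y.shape)
```

On the real data, the same two steps would be applied to `penguins_clean` instead of `sample`.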