# 🧹 3.4 Cleaning Data

Data should be cleaned before any analysis. Typical steps are handling missing values, correcting column types, and renaming or recoding columns.

**Common operations:**
- Detect missing values: `df.isnull().sum()`
- Fill or drop missing values: `fillna()`, `dropna()`
- Convert data types: `astype()`
- Rename columns: `df.rename()`
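
As a rough illustration, the operations listed above can be chained on a small table. This is only a sketch: the `toy` DataFrame, its column names (`wt`, `n`) and the new names chosen in `rename()` are made up for the example and are not part of the course data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only
toy = pd.DataFrame({
    'wt': ['10.5', '12.0', None],   # weights stored as text, one missing
    'n': [1.0, np.nan, 3.0],        # counts stored as floats, one missing
})

# Detect missing values: one count per column
missing = toy.isnull().sum()

# Fill missing counts with 0, then drop rows that still lack a weight
filled = toy.fillna({'n': 0})
complete = filled.dropna(subset=['wt'])

# Convert data types: text weights to float, float counts to int
typed = complete.astype({'wt': float, 'n': int})

# Rename columns to more descriptive names
tidy = typed.rename(columns={'wt': 'weight_kg', 'n': 'sightings'})

print(missing)
print(tidy)
```

Whether to fill or drop a missing value depends on the variable: here a missing count is assumed to mean zero, while a row without a weight is treated as unusable.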

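```python
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/ggkuhnle/data-analysis/main/data/hippos_cleaned.csv')
# Coerce non-numeric weights to NaN, then tidy the species labels
df['Weight_kg'] = pd.to_numeric(df['Weight_kg'], errors='coerce')
df['Species'] = df['Species'].str.strip().str.title()
# Drop rows whose weight is missing or could not be parsed
df = df.dropna(subset=['Weight_kg'])
```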

## 🧪 Exercise
Replace all zero nutrient values with `NaN` and rename columns for clarity.

### <details><summary>Advanced: Using `apply()` for Complex Cleaning</summary>
With `df.apply(..., axis=1)` you can run a custom function on each row, which is useful for recoding that the built-in methods do not cover.
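```python
def clean_group(row):
    group = row['Group']
    if pd.isna(group):  # leave missing values as-is; .strip() would fail on NaN
        return group
    # Trim whitespace, lowercase, then recode 'foods' as 'Food'
    return group.strip().lower().replace('foods', 'Food')

df['Group'] = df.apply(clean_group, axis=1)
```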
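
For a recoding this simple, the same steps as `clean_group` can probably be written without `apply()`, using the vectorised `.str` methods on the column. The sketch below assumes a small made-up `demo` DataFrame; its values are hypothetical and only stand in for a real `Group` column.

```python
import numpy as np
import pandas as pd

# Hypothetical example values; a real `Group` column may look different
demo = pd.DataFrame({'Group': ['  Dairy Foods ', 'SNACK FOODS', np.nan]})

# Same steps as clean_group, applied to the whole column at once.
# The .str methods pass missing values through as NaN.
demo['Group'] = (
    demo['Group']
    .str.strip()
    .str.lower()
    .str.replace('foods', 'Food', regex=False)
)

print(demo)
```

Row-wise `apply()` remains the more flexible option when the new value depends on several columns of the same row.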
</details>