You're right: checking **unique values** is a critical part of data cleaning, and I should have called it out explicitly. Here's where it fits in the process:

---

### ✅ Updated Data Cleaning Workflow (with Unique Value Checks)

#### **1. Understand the Data Structure**

* `df.shape`, `df.info()`, `df.describe(include='all')`
* `df.dtypes` to inspect types
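
As a rough sketch of this first pass, assuming a small made-up table (the column names `order_id`, `status`, and `freight_cost` are placeholders, not your real schema):

```python
import pandas as pd

# Hypothetical sample data standing in for your real table.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "status": ["shipped", "Shipped", None, "cancelled"],
    "freight_cost": [12.5, -3.0, 7.25, None],
})

print(df.shape)                    # (rows, columns)
df.info()                          # dtypes, non-null counts, memory usage
print(df.describe(include="all"))  # summary statistics for every column
print(df.dtypes)                   # dtype of each column
```

`df.info()` prints directly, so it does not need to be wrapped in `print()`.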

---

#### **2. Check Unique Values**

Check *before* cleaning missing values, to:

* Spot unexpected or inconsistent categories
* See if a column is constant (often a candidate for dropping) or low-cardinality (often a candidate for encoding)

**What to do:**

```python
for col in df.columns:
    print(f"\nColumn: {col}")
    # Top 10 most frequent values, counting NaN as its own value
    print(df[col].value_counts(dropna=False).head(10))
    # Number of distinct values, again counting NaN
    print(f"Unique values: {df[col].nunique(dropna=False)}")
```

✅ Focus on:

* **Categorical/text columns**: inconsistent labels, typos, mixed casing
* **Numeric columns**: unrealistic values (e.g., negative freight cost)
* **Datetime columns**: strange formats, default timestamps (e.g., 1970-01-01)
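
Here is one way those three checks might look in practice. This is only a sketch on made-up data: the column names and the "negative freight" and "1970-01-01" rules are illustrative assumptions taken from the examples above.

```python
import pandas as pd

# Hypothetical sample data; column names are placeholders.
df = pd.DataFrame({
    "status": ["shipped", "Shipped ", "cancelled", "shipped"],
    "freight_cost": [12.5, -3.0, 7.25, 9.0],
    "shipped_at": pd.to_datetime(
        ["2023-05-01", "1970-01-01", "2023-05-03", "2023-05-04"]
    ),
})

# Text: compare label counts before and after normalizing case and whitespace.
# A drop in the count means some labels differ only by casing or spacing.
normalized = df["status"].str.strip().str.lower()
raw_labels = df["status"].nunique()
clean_labels = normalized.nunique()
print(f"status labels: {raw_labels} raw, {clean_labels} after normalizing")

# Numeric: rows with an unrealistic (negative) freight cost.
negative_freight = df[df["freight_cost"] < 0]
print(negative_freight)

# Datetime: rows sitting exactly on the Unix epoch, a common default value.
epoch_rows = df[df["shipped_at"] == pd.Timestamp("1970-01-01")]
print(epoch_rows)
```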

---

#### **3. Handle Column Types**

Convert types (`datetime`, `category`, etc.) once you're sure values are consistent.
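
A minimal sketch of the conversions, assuming hypothetical columns that arrived as strings. Using `errors="coerce"` is my choice here, not the only option: it turns unparseable values into `NaT`/`NaN` so you can count them instead of hitting an exception.

```python
import pandas as pd

# Hypothetical sample data; every column starts out as strings.
df = pd.DataFrame({
    "order_date": ["2023-05-01", "2023-05-02", "not a date"],
    "status": ["shipped", "cancelled", "shipped"],
    "freight_cost": ["12.5", "7.25", "n/a"],
})

# Unparseable entries become NaT / NaN rather than raising.
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
df["freight_cost"] = pd.to_numeric(df["freight_cost"], errors="coerce")

# Low-cardinality text column stored as a categorical.
df["status"] = df["status"].astype("category")

print(df.dtypes)
print(df.isna().sum())  # how many values failed to convert
```

If the coerced columns show more missing values than before, go back to the unique-value check to see which raw strings caused them.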

---

#### **4. Handle Missing Values**

Now that you understand the unique values and distribution, clean them properly:

* Drop, fill, or flag based on meaning
* Re-run `df.isna().sum()` after each change to confirm the result
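
To make "drop, fill, or flag" concrete, here is a sketch on made-up data. Which strategy fits which column is an assumption on my part (a missing `delivered_at` is treated as "not delivered yet", a missing `customer_id` as unusable, a missing `freight_cost` as fillable with the median); your data may call for different choices.

```python
import pandas as pd

# Hypothetical sample data; column names are placeholders.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": ["a", None, "c", "d"],
    "freight_cost": [12.5, 9.0, None, 10.0],
    "delivered_at": pd.to_datetime(["2023-05-03", None, None, "2023-05-06"]),
})

print(df.isna().sum())  # missing values per column, before cleaning

# Flag: here a missing delivery date carries meaning, so keep it as a feature.
df["is_delivered"] = df["delivered_at"].notna()

# Drop: rows without a customer cannot be used in this example.
df = df.dropna(subset=["customer_id"]).copy()

# Fill: replace missing freight cost with the column median.
df["freight_cost"] = df["freight_cost"].fillna(df["freight_cost"].median())

remaining = df.isna().sum()
print(remaining)  # confirm what is still missing after the changes
```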

---

#### **5. Fix Duplicates**

* `.duplicated()` and `.drop_duplicates()`
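
A short sketch of both calls, assuming a hypothetical `order_id` column that should be unique. Checking duplicates on a key column as well as on whole rows is my addition; whether `order_id` is really a key in your table is something to confirm first.

```python
import pandas as pd

# Hypothetical sample data with one repeated order.
df = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "status": ["shipped", "cancelled", "cancelled", "shipped"],
})

# Rows that are exact copies of an earlier row.
n_exact = df.duplicated().sum()

# Rows that repeat the (assumed) key column, even if other columns differ.
n_key = df.duplicated(subset=["order_id"]).sum()
print(f"exact duplicates: {n_exact}, duplicate order_id: {n_key}")

# Keep the first occurrence of each order_id and rebuild the index.
deduped = df.drop_duplicates(subset=["order_id"], keep="first").reset_index(drop=True)
print(deduped)
```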

---

#### **6. Standardize Text/Categories**

* Lowercase, strip whitespace
* Rename or group rare categories
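
Something like the following would cover both bullets. The spelling map and the "fewer than 2 rows counts as rare" threshold are arbitrary assumptions for the example.

```python
import pandas as pd

# Hypothetical sample data with messy labels.
df = pd.DataFrame({
    "status": [" Shipped", "shipped", "SHIPPED ", "cancelled",
               "canceled", "lost", "returned"],
})

# Lowercase and strip surrounding whitespace.
df["status"] = df["status"].str.strip().str.lower()

# Rename known spelling variants to one label (the mapping is an assumption).
df["status"] = df["status"].replace({"canceled": "cancelled"})

# Group labels that appear fewer than 2 times into "other".
counts = df["status"].value_counts()
rare = counts[counts < 2].index
df["status_grouped"] = df["status"].where(~df["status"].isin(rare), "other")

print(df["status_grouped"].value_counts())
```

Standardizing labels can turn near-duplicates into exact duplicates, so it may be worth re-running the duplicate check afterwards.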

---

#### **7. Feature Engineering**

* Date/time decomposition
* Flags from `NaN`
* Duration calculations
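
A sketch of the three ideas on made-up timestamps; `ordered_at` and `delivered_at` are placeholder column names.

```python
import pandas as pd

# Hypothetical sample data; the second order has no delivery timestamp.
df = pd.DataFrame({
    "ordered_at": pd.to_datetime(
        ["2023-05-01 10:00", "2023-05-02 15:30", "2023-05-06 09:00"]
    ),
    "delivered_at": pd.to_datetime(
        ["2023-05-03 10:00", None, "2023-05-09 21:00"]
    ),
})

# Date/time decomposition.
df["order_year"] = df["ordered_at"].dt.year
df["order_month"] = df["ordered_at"].dt.month
df["order_weekday"] = df["ordered_at"].dt.dayofweek  # Monday = 0

# Flag derived from NaN: was the order delivered at all?
df["is_delivered"] = df["delivered_at"].notna()

# Duration in days; stays NaN where delivered_at is missing.
delta = df["delivered_at"] - df["ordered_at"]
df["delivery_days"] = delta.dt.total_seconds() / 86400

print(df)
```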

---

#### **8. Filter Bad Rows**

* Invalid status
* Unrealistic entries
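
One possible shape for this step, assuming a hypothetical list of valid statuses and treating negative freight cost as unrealistic. Keeping the rejected rows in their own DataFrame is my suggestion, so you can inspect what was removed.

```python
import pandas as pd

# Hypothetical set of allowed statuses; replace with your own.
VALID_STATUSES = {"shipped", "cancelled", "delivered"}

# Hypothetical sample data.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "status": ["shipped", "???", "delivered", "cancelled"],
    "freight_cost": [12.5, 8.0, -3.0, 7.25],
})

valid_status = df["status"].isin(VALID_STATUSES)
realistic = df["freight_cost"] >= 0
keep = valid_status & realistic

rejected = df[~keep]                     # set aside for review
clean = df[keep].reset_index(drop=True)  # rows that pass both rules

print(f"kept {len(clean)} rows, rejected {len(rejected)}")
```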

---

#### **9. Save & Repeat for Other Tables**
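
As a sketch, wrapping the steps in a function makes them easy to repeat per table. `clean_table` below is a hypothetical stand-in (it only tidies column names and drops exact duplicates), and the table names and CSV output are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd


def clean_table(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical placeholder for the cleaning steps above."""
    out = df.copy()
    out.columns = out.columns.str.strip().str.lower()
    return out.drop_duplicates().reset_index(drop=True)


# Hypothetical tables; in practice these would be loaded from your sources.
tables = {
    "orders": pd.DataFrame({
        "Order_ID": [1, 2, 2],
        "Status": ["shipped", "cancelled", "cancelled"],
    }),
    "customers": pd.DataFrame({
        "Customer_ID": ["a", "b"],
        "City": ["Lyon", "Oslo"],
    }),
}

# A temporary directory keeps the example self-contained.
out_dir = Path(tempfile.mkdtemp())
saved = {}
for name, table in tables.items():
    cleaned = clean_table(table)
    path = out_dir / f"{name}_clean.csv"
    cleaned.to_csv(path, index=False)
    saved[name] = path
    print(f"{name}: {len(table)} -> {len(cleaned)} rows, saved to {path}")
```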

---

**Checking unique values is the diagnostic step that prevents bad assumptions.** Thanks for pointing it out. It belongs in the top 3 steps of your cleaning process, before you convert types or handle missing values.

Would you like me to generate a reusable cleaning template with unique-value checking built in?
