# 📍 Problem Title: Managers with at Least 5 Direct Reports

🔗 [LeetCode Problem 570 – Managers with at Least 5 Direct Reports](https://leetcode.com/problems/managers-with-at-least-5-direct-reports/description/?envType=study-plan-v2&envId=30-days-of-pandas)

---

## 📝 Problem Description
We are given a table `Employee` with the following schema:

| Column     | Type |
|------------|------|
| id         | int  |
| name       | str  |
| department | str  |
| managerId  | int  |

- `id` is the primary key for this table.
- Each row contains an employee’s `id`, `name`, their `department`, and their `managerId`.
- `managerId` points to another employee’s `id` (the manager).
- If `managerId` is `null`, that employee has no manager.
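
To make the null case concrete, here is a minimal sketch of how a missing manager might look in pandas. The three-row table and the variable names are illustrative assumptions, and the nullable `Int64` dtype is assumed so that ids stay integers even when some are missing.

```python
import pandas as pd

# Hypothetical three-row table: John has no manager, the others report to John.
employee = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "name": ["John", "Dan", "James"],
        "department": ["A", "A", "A"],
        "managerId": [None, 1, 1],
    }
).astype({"id": "Int64", "managerId": "Int64"})

# isna() flags employees whose managerId is null, i.e. who have no manager.
no_manager = employee.loc[employee["managerId"].isna(), "name"].tolist()

# groupby drops null keys by default, so John's own row is not counted
# as a report of some "null" manager.
report_counts = employee.groupby("managerId").size()

print(no_manager)
print(report_counts.to_dict())
```

Because null keys are dropped during grouping, no separate filtering step is needed before counting direct reports.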

We need to find the **names of managers who have at least 5 direct reports**.

---

## 🧾 Example

**Input:**

| id | name  | department | managerId |
|----|-------|------------|-----------|
| 1  | John  | A          | null      |
| 2  | Dan   | A          | 1         |
| 3  | James | A          | 1         |
| 4  | Amy   | A          | 1         |
| 5  | Anne  | A          | 1         |
| 6  | Ron   | A          | 1         |

**Output:**

| name |
|------|
| John |

---

## 🧠 Key Concepts
- **Group employees by `managerId`** to count direct reports per manager.
- **Filter managers with ≥ 5 direct reports**.
- **Join back with employees** to retrieve manager names from their `id`.
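
The "join back" idea can be sketched with an explicit `merge`. This is one possible variant, not necessarily the one used later in these notes; the function name `managers_via_merge` and the `min_reports` parameter are assumptions introduced for illustration.

```python
import pandas as pd


def managers_via_merge(employee: pd.DataFrame, min_reports: int = 5) -> pd.DataFrame:
    """Names of managers with at least `min_reports` direct reports (merge-based sketch)."""
    # Count rows per managerId; rows with a null managerId are dropped by groupby.
    counts = (
        employee.groupby("managerId")
        .size()
        .reset_index(name="direct_reports")
    )
    # Keep only the managers that reach the threshold.
    busy = counts[counts["direct_reports"] >= min_reports]
    # Inner join: manager ids on the left, employee ids on the right.
    joined = busy.merge(
        employee[["id", "name"]],
        left_on="managerId",
        right_on="id",
        how="inner",
    )
    return joined[["name"]]


# The example input from the problem statement.
rows = [
    [1, "John", "A", None],
    [2, "Dan", "A", 1],
    [3, "James", "A", 1],
    [4, "Amy", "A", 1],
    [5, "Anne", "A", 1],
    [6, "Ron", "A", 1],
]
employee = pd.DataFrame(
    rows, columns=["id", "name", "department", "managerId"]
).astype({"id": "Int64", "managerId": "Int64"})

result = managers_via_merge(employee)
print(result)
```

An inner join also drops any `managerId` that does not correspond to an existing `id`, so only managers present in the table are returned.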

---

## 🐼 Pandas Outline
1. Use `groupby("managerId").size()` to count employees under each manager.
2. Convert the result into a DataFrame with `.reset_index(name="direct_reports")`.
3. Filter rows where `direct_reports >= 5`.
4. Match the filtered manager IDs against the `id` column in the employees table.
5. Return only the `name` column.

---

## 🧪 Example Flow

**Step 1: Group employees by `managerId` and count**
| managerId | direct_reports |
|-----------|----------------|
| 1         | 5              |

**Step 2: Filter managers with at least 5 reports**
| managerId | direct_reports |
|-----------|----------------|
| 1         | 5              |

**Step 3: Match managerId back to employee id**
| id | name | department | managerId |
|----|------|------------|-----------|
| 1  | John | A          | null      |

**Step 4: Select the manager’s name**
| name |
|------|
| John |
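
The four intermediate tables above can be reproduced one step at a time. The sketch below assumes the example input is loaded into a DataFrame called `employee`; the `step1` to `step4` names are illustrative only.

```python
import pandas as pd

# The example input from the problem statement.
rows = [
    [1, "John", "A", None],
    [2, "Dan", "A", 1],
    [3, "James", "A", 1],
    [4, "Amy", "A", 1],
    [5, "Anne", "A", 1],
    [6, "Ron", "A", 1],
]
employee = pd.DataFrame(
    rows, columns=["id", "name", "department", "managerId"]
).astype({"id": "Int64", "managerId": "Int64"})

# Step 1: group employees by managerId and count.
step1 = employee.groupby("managerId").size().reset_index(name="direct_reports")

# Step 2: filter managers with at least 5 reports.
step2 = step1[step1["direct_reports"] >= 5]

# Step 3: match managerId back to employee id.
step3 = employee[employee["id"].isin(step2["managerId"])]

# Step 4: select the manager's name.
step4 = step3[["name"]]

for label, frame in [("Step 1", step1), ("Step 2", step2), ("Step 3", step3), ("Step 4", step4)]:
    print(label)
    print(frame, end="\n\n")
```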

---

✅ This approach returns every manager who directly manages **5 or more employees**. Employees with a null `managerId` never enter the count, because `groupby` drops null keys by default.
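
One way to check this on edge cases is a small made-up table with a manager at exactly 5 reports, a second manager who shares the same name, and a manager with only 4 reports. The data and the helper name `busy_manager_names` are assumptions for illustration, not part of the problem statement.

```python
import pandas as pd


def busy_manager_names(employee: pd.DataFrame, min_reports: int = 5) -> pd.DataFrame:
    """Hypothetical helper: same group, filter, and match steps as outlined above."""
    counts = employee.groupby("managerId").size().reset_index(name="direct_reports")
    busy = counts[counts["direct_reports"] >= min_reports]
    return employee[employee["id"].isin(busy["managerId"])][["name"]]


# Three managers without a manager of their own; two of them share a name.
rows = [
    [1, "Alex", "A", None],
    [2, "Alex", "B", None],
    [13, "Sam", "C", None],
]
rows += [[i, f"emp{i}", "A", 1] for i in range(3, 8)]      # 5 reports for id 1
rows += [[i, f"emp{i}", "B", 2] for i in range(8, 13)]     # 5 reports for id 2
rows += [[i, f"emp{i}", "C", 13] for i in range(14, 18)]   # 4 reports for id 13

employee = pd.DataFrame(
    rows, columns=["id", "name", "department", "managerId"]
).astype({"id": "Int64", "managerId": "Int64"})

names = busy_manager_names(employee)["name"].tolist()
print(names)
```

Because the match is done on `id` rather than on `name`, both managers called Alex are kept, while Sam, with only 4 reports, is excluded.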

In [12]:
import pandas as pd


data = [
    [101, 'John', 'A', None],
    [102, 'Dan', 'A', 101],
    [103, 'James', 'A', 101],
    [104, 'Amy', 'A', 101],
    [105, 'Anne', 'A', 101],
    [106, 'Ron', 'B', 101],
]
employee = pd.DataFrame(data, columns=['id', 'name', 'department', 'managerId']).astype(
    {'id': 'Int64', 'name': 'object', 'department': 'object', 'managerId': 'Int64'}
)

In [28]:
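def find_managers(employee: pd.DataFrame) -> pd.DataFrame:
    # Count direct reports per manager; rows with a null managerId are skipped.
    counts = (
        employee.groupby("managerId")
        .size()
        .reset_index(name="direct_reports")
    )
    # Keep only managers with at least 5 direct reports.
    managers = counts[counts["direct_reports"] >= 5]
    # Look up the remaining manager ids in the employee table and return their names.
    return employee[employee["id"].isin(managers["managerId"])][["name"]]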

In [29]:
find_managers(employee)

| name |
|------|
| John |
