# 596. Classes More Than 5 Students

### Difficulty
**Easy**

---

## Problem Statement

Given a `Courses` table, write a **SQL query** to find all the classes that have **at least five students**.

Return the result table in **any order**.

---

## Table Schema

### **Table: Courses**
| Column Name | Type    |
|-------------|---------|
| `student`   | `varchar` |
| `class`     | `varchar` |

- (`student`, `class`) is the **primary key** (unique combination of columns).
- Each row represents:
  - The **name of a student** (`student`), and
  - The **class** (`class`) in which they are enrolled.

---

## Example

### **Input**
#### **Courses table:**
| student | class     |
|---------|-----------|
| A       | Math      |
| B       | English   |
| C       | Math      |
| D       | Biology   |
| E       | Math      |
| F       | Computer  |
| G       | Math      |
| H       | Math      |
| I       | Math      |

---

### **Output**
| class   |
|---------|
| Math    |

---

### **Explanation**
- **Math** has 6 students, so it is included in the output.
- **English**, **Biology**, and **Computer** each have only 1 student, so they are not included.

---

## **Constraints**
- Each `student` is enrolled in exactly one `class`.

---

## Solution

In [1]:
import pandas as pd

In [2]:
def find_classes(courses: pd.DataFrame) -> pd.DataFrame:
    # Group by 'class' and count the number of students in each class
    grouped = courses.groupby('class')['student'].count()

    # Keep only the classes with at least 5 students
    result = grouped[grouped >= 5].reset_index()

    # Keep only the 'class' column to match the expected output
    result = result[['class']]
    return result
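
The primary key on (`student`, `class`) guarantees that no row is repeated, so `count()` is enough here. As a defensive sketch only, assuming input that might contain duplicate rows (an assumption the problem statement rules out), distinct students could be counted instead. The name `find_classes_distinct` is hypothetical.

```python
import pandas as pd


def find_classes_distinct(courses: pd.DataFrame) -> pd.DataFrame:
    # Count distinct students per class, so a repeated
    # (student, class) row is only counted once
    counts = courses.groupby('class')['student'].nunique()

    # Keep classes with at least 5 distinct students and
    # return only the 'class' column
    return counts[counts >= 5].reset_index()[['class']]
```

On input that respects the primary key, this should return the same classes as `find_classes`.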

### **Complexity Analysis**

#### **1️⃣ Time Complexity**
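- **Grouping and Counting:**
  - `groupby('class')` hashes each row's `class` value to assign it to a group, taking **O(n)** on average, where `n` is the number of rows.
  - `count()` tallies the non-null `student` values per group in one pass over the rows, which is also **O(n)**. It yields `k` counts, where `k` is the number of unique classes.
- **Filtering:**
  - Filtering the `k` counts and resetting the index takes **O(k)**.

**Total Time Complexity:** **O(n + k)** = **O(n)**, since `k ≤ n`. With the default `sort=True`, pandas also sorts the `k` group keys, which adds **O(k log k)**; this is negligible when `k` is much smaller than `n`.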
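
To make the **O(n + k)** bound concrete, here is a minimal pure-Python sketch of the same two phases: one pass over the rows to count, then one pass over the distinct classes to filter. It illustrates the counting idea only and is not how pandas is implemented internally; the function name `classes_with_min_students` is hypothetical.

```python
from collections import Counter


def classes_with_min_students(rows, min_students=5):
    # Phase 1: one pass over the n rows, O(n) hash updates on average
    counts = Counter(cls for _student, cls in rows)
    # Phase 2: one pass over the k distinct classes, O(k)
    return [cls for cls, n in counts.items() if n >= min_students]


rows = [
    ("A", "Math"), ("B", "English"), ("C", "Math"),
    ("D", "Biology"), ("E", "Math"), ("F", "Computer"),
    ("G", "Math"), ("H", "Math"), ("I", "Math"),
]
print(classes_with_min_students(rows))  # ['Math']
```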

#### **2️⃣ Space Complexity**
- Temporary memory is required to store:
  - Grouped results (counts): **O(k)**, where `k` is the number of unique classes.
  - Filtered results: **O(k)**.

**Total Space Complexity:** **O(k)**.

---

### **Comparison with SQL Solution**

| **Aspect**           | **Pandas Solution**          | **SQL Solution**                          |
|-----------------------|------------------------------|-------------------------------------------|
| **Time Complexity**   | **O(n)**                    | **O(n)**                                  |
| **Space Complexity**  | **O(k)**                    | **O(k)**                                  |
| **Readability**       | Compact and Pythonic        | Compact and declarative                  |
| **Performance**       | Efficient for in-memory data| Efficient for database queries           |
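
The SQL side of this comparison is not shown in the notebook. One common formulation uses `GROUP BY` with `HAVING`; the sketch below runs it from Python through the standard-library `sqlite3` module so it can be executed alongside the pandas code. Using SQLite and an in-memory database is an assumption made for illustration; other engines may need small syntax changes.

```python
import sqlite3

# Example rows from the Courses table above
rows = [
    ("A", "Math"), ("B", "English"), ("C", "Math"),
    ("D", "Biology"), ("E", "Math"), ("F", "Computer"),
    ("G", "Math"), ("H", "Math"), ("I", "Math"),
]

# In-memory SQLite database (an assumption for this sketch)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Courses (student TEXT, class TEXT, "
    "PRIMARY KEY (student, class))"
)
conn.executemany("INSERT INTO Courses VALUES (?, ?)", rows)

# Group rows by class and keep groups with at least 5 students
query = """
    SELECT class
    FROM Courses
    GROUP BY class
    HAVING COUNT(student) >= 5
"""
sql_result = [row[0] for row in conn.execute(query)]
conn.close()

print(sql_result)  # ['Math']
```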

---

In [3]:
data = {
    'student': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'class': ['Math', 'English', 'Math', 'Biology', 'Math', 'Computer', 'Math', 'Math', 'Math']
}
courses = pd.DataFrame(data)

In [4]:
courses

|   | student | class    |
|---|---------|----------|
| 0 | A       | Math     |
| 1 | B       | English  |
| 2 | C       | Math     |
| 3 | D       | Biology  |
| 4 | E       | Math     |
| 5 | F       | Computer |
| 6 | G       | Math     |
| 7 | H       | Math     |
| 8 | I       | Math     |


In [5]:
find_classes(courses)

|   | class |
|---|-------|
| 0 | Math  |
