diff --git a/LeetCode SQL 50 Solution/185. Department Top Three Salaries/185. Department Top Three Salaries.md b/LeetCode SQL 50 Solution/185. Department Top Three Salaries/185. Department Top Three Salaries.md deleted file mode 100644 index 289efac..0000000 --- a/LeetCode SQL 50 Solution/185. Department Top Three Salaries/185. Department Top Three Salaries.md +++ /dev/null @@ -1,112 +0,0 @@ -# 185. Department Top Three Salaries - -### Question -``` - Table: Employee - -+--------------+---------+ -| Column Name | Type | -+--------------+---------+ -| id | int | -| name | varchar | -| salary | int | -| departmentId | int | -+--------------+---------+ -``` -id is the primary key (column with unique values) for this table. -departmentId is a foreign key (reference column) of the ID from the Department table. -Each row of this table indicates the ID, name, and salary of an employee. It also contains the ID of their department. - - -Table: Department -``` -+-------------+---------+ -| Column Name | Type | -+-------------+---------+ -| id | int | -| name | varchar | -+-------------+---------+ -``` -id is the primary key (column with unique values) for this table. -Each row of this table indicates the ID of a department and its name. - - -A company's executives are interested in seeing who earns the most money in each of the company's departments. A high earner in a department is an employee who has a salary in the top three unique salaries for that department. - -Write a solution to find the employees who are high earners in each of the departments. - -Return the result table in any order. - -The result format is in the following example. 
- - - -Example 1: -``` -Input: -Employee table: -+----+-------+--------+--------------+ -| id | name | salary | departmentId | -+----+-------+--------+--------------+ -| 1 | Joe | 85000 | 1 | -| 2 | Henry | 80000 | 2 | -| 3 | Sam | 60000 | 2 | -| 4 | Max | 90000 | 1 | -| 5 | Janet | 69000 | 1 | -| 6 | Randy | 85000 | 1 | -| 7 | Will | 70000 | 1 | -+----+-------+--------+--------------+ -``` -``` -Department table: -+----+-------+ -| id | name | -+----+-------+ -| 1 | IT | -| 2 | Sales | -+----+-------+ -Output: -+------------+----------+--------+ -| Department | Employee | Salary | -+------------+----------+--------+ -| IT | Max | 90000 | -| IT | Joe | 85000 | -| IT | Randy | 85000 | -| IT | Will | 70000 | -| Sales | Henry | 80000 | -| Sales | Sam | 60000 | -+------------+----------+--------+ -``` -Explanation: -In the IT department: -- Max earns the highest unique salary -- Both Randy and Joe earn the second-highest unique salary -- Will earns the third-highest unique salary - -In the Sales department: -- Henry earns the highest salary -- Sam earns the second-highest salary -- There is no third-highest salary as there are only two employees - - -Constraints: - -There are no employees with the exact same name, salary and department. - - -## Solution - - -```MySQL -# Write your MySQL query statement below -SELECT d.Name as Department, -e.Name as Employee, -e.Salary as Salary -FROM Department d, Employee e -WHERE( - SELECT COUNT(distinct Salary) - FROM Employee - WHERE Salary > e.Salary AND DepartmentId = d.Id -) < 3 AND e.DepartmentId = d.Id -ORDER BY d.Id, e.Salary desc; -``` \ No newline at end of file diff --git a/LeetCode SQL 50 Solution/185. Department Top Three Salaries/185. Department Top Three Salaries.py b/LeetCode SQL 50 Solution/185. Department Top Three Salaries/185. Department Top Three Salaries.py new file mode 100644 index 0000000..ff2e501 --- /dev/null +++ b/LeetCode SQL 50 Solution/185. Department Top Three Salaries/185. 
Department Top Three Salaries.py @@ -0,0 +1,18 @@ +### **Pandas Solution** + +import pandas as pd + +def department_top_three_salaries(employee: pd.DataFrame, department: pd.DataFrame) -> pd.DataFrame: + # Merge employee and department tables + employee = employee.merge(department, left_on='departmentId', right_on='id', suffixes=('', '_dept')) + + # Rank employees' salaries within each department + employee['rank'] = employee.groupby('departmentId')['salary'].rank(method='dense', ascending=False) + + # Filter top 3 salaries in each department + result = employee[employee['rank'] <= 3][['name_dept', 'name', 'salary']] + + # Rename columns to match the expected output + result.columns = ['Department', 'Employee', 'Salary'] + + return result \ No newline at end of file diff --git a/LeetCode SQL 50 Solution/185. Department Top Three Salaries/readme.md b/LeetCode SQL 50 Solution/185. Department Top Three Salaries/readme.md new file mode 100644 index 0000000..36117c3 --- /dev/null +++ b/LeetCode SQL 50 Solution/185. Department Top Three Salaries/readme.md @@ -0,0 +1,169 @@ +# **185. Department Top Three Salaries** + +## **Problem Statement** +You are given two tables: `Employee` and `Department`. + +### **Employee Table** +```rb ++--------------+---------+ +| Column Name | Type | ++--------------+---------+ +| id | int | +| name | varchar | +| salary | int | +| departmentId | int | ++--------------+---------+ +``` +- `id` is the primary key. +- `departmentId` is a foreign key referencing `id` in the `Department` table. +- Each row represents an employee with their `id`, `name`, `salary`, and `departmentId`. + +### **Department Table** +```rb ++-------------+---------+ +| Column Name | Type | ++-------------+---------+ +| id | int | +| name | varchar | ++-------------+---------+ +``` +- `id` is the primary key. +- Each row represents a department with its `id` and `name`. 
+ +### **Task:** +Find employees who have a salary in the **top three unique salaries** in their respective departments. + +## **Example 1:** +### **Input:** +#### **Employee Table** +``` ++----+-------+--------+--------------+ +| id | name | salary | departmentId | ++----+-------+--------+--------------+ +| 1 | Joe | 85000 | 1 | +| 2 | Henry | 80000 | 2 | +| 3 | Sam | 60000 | 2 | +| 4 | Max | 90000 | 1 | +| 5 | Janet | 69000 | 1 | +| 6 | Randy | 85000 | 1 | +| 7 | Will | 70000 | 1 | ++----+-------+--------+--------------+ +``` +#### **Department Table** +```rb ++----+-------+ +| id | name | ++----+-------+ +| 1 | IT | +| 2 | Sales | ++----+-------+ +``` +### **Output:** +```rb ++------------+----------+--------+ +| Department | Employee | Salary | ++------------+----------+--------+ +| IT | Max | 90000 | +| IT | Joe | 85000 | +| IT | Randy | 85000 | +| IT | Will | 70000 | +| Sales | Henry | 80000 | +| Sales | Sam | 60000 | ++------------+----------+--------+ +``` + +--- + +## **Solution Approaches** + +### **SQL Solution (Using Self Join)** +```sql +SELECT d.Name as Department, + e.Name as Employee, + e.Salary as Salary +FROM Department d, Employee e +WHERE ( + SELECT COUNT(DISTINCT Salary) + FROM Employee + WHERE Salary > e.Salary AND DepartmentId = d.Id +) < 3 AND e.DepartmentId = d.Id +ORDER BY d.Id, e.Salary DESC; +``` +**Explanation:** +- For each employee, we count how many distinct salaries are greater than theirs. +- If fewer than 3 salaries are greater, the employee is in the **top three**. +- We filter results by department and order by salary in descending order. 
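The counting rule above can be checked against the example data. The following is a minimal sketch, assuming Python's built-in `sqlite3` module as a stand-in for MySQL. The explicit `JOIN`, the inner alias `e2`, and the extra `e.name` tie-breaker in `ORDER BY` are choices made here for readability and a deterministic row order; they are not part of the original query.

```python
import sqlite3

# In-memory database loaded with the example rows from the problem statement.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Employee (
    id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, departmentId INTEGER
);
INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
INSERT INTO Employee VALUES
    (1, 'Joe', 85000, 1), (2, 'Henry', 80000, 2), (3, 'Sam', 60000, 2),
    (4, 'Max', 90000, 1), (5, 'Janet', 69000, 1), (6, 'Randy', 85000, 1),
    (7, 'Will', 70000, 1);
""")

# An employee is kept when fewer than 3 distinct salaries in the same
# department are strictly greater than their own salary.
query = """
SELECT d.name AS Department, e.name AS Employee, e.salary AS Salary
FROM Department d
JOIN Employee e ON e.departmentId = d.id
WHERE (
    SELECT COUNT(DISTINCT e2.salary)
    FROM Employee e2
    WHERE e2.salary > e.salary AND e2.departmentId = d.id
) < 3
ORDER BY d.id, e.salary DESC, e.name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Janet is excluded because three distinct IT salaries (90000, 85000, 70000) exceed hers, while Joe and Randy both qualify since only one distinct salary is higher than 85000.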
+ +--- + +### **SQL Solution (Using Window Functions)** +```sql +WITH RankedSalaries AS ( + SELECT e.name AS Employee, + e.salary AS Salary, + d.name AS Department, + DENSE_RANK() OVER (PARTITION BY e.departmentId ORDER BY e.salary DESC) AS rnk + FROM Employee e + JOIN Department d ON e.departmentId = d.id +) +SELECT Department, Employee, Salary +FROM RankedSalaries +WHERE rnk <= 3; +``` +**Explanation:** +- We use `DENSE_RANK()` to assign a rank to salaries within each department. +- `PARTITION BY departmentId` ensures ranking is specific to each department. +- Employees with `rnk <= 3` are returned. + +--- + +### **Pandas Solution** +```python +import pandas as pd + +def department_top_three_salaries(employee: pd.DataFrame, department: pd.DataFrame) -> pd.DataFrame: + # Merge employee and department tables + employee = employee.merge(department, left_on='departmentId', right_on='id', suffixes=('', '_dept')) + + # Rank employees' salaries within each department + employee['rank'] = employee.groupby('departmentId')['salary'].rank(method='dense', ascending=False) + + # Filter top 3 salaries in each department + result = employee[employee['rank'] <= 3][['name_dept', 'name', 'salary']] + + # Rename columns to match the expected output + result.columns = ['Department', 'Employee', 'Salary'] + + return result +``` +**Explanation:** +- Merge the `Employee` and `Department` tables. +- Rank salaries within each department using `.rank()`. +- Filter the top 3 ranked salaries per department. + +--- + +## **File Structure** +``` +📂 LeetCode185 +│── 📜 problem_statement.md +│── 📜 sql_self_join_solution.sql +│── 📜 sql_window_function_solution.sql +│── 📜 pandas_solution.py +│── 📜 README.md +``` +- `problem_statement.md` → Contains the problem description and constraints. +- `sql_self_join_solution.sql` → Contains the SQL solution using self-join. +- `sql_window_function_solution.sql` → Contains the SQL solution using `DENSE_RANK()`. 
+- `pandas_solution.py` → Contains the Pandas solution for Python users. +- `README.md` → Provides an overview of the problem and solutions. + +--- + +## **Useful Links** +- [LeetCode Problem 185](https://leetcode.com/problems/department-top-three-salaries/) +- [SQL DENSE_RANK() Function](https://www.w3schools.com/sql/sql_functions.asp) +- [Pandas Rank Documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rank.html) + +--- + diff --git a/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/196. Delete Duplicate Emails.py b/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/196. Delete Duplicate Emails.py new file mode 100644 index 0000000..7bc23de --- /dev/null +++ b/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/196. Delete Duplicate Emails.py @@ -0,0 +1,5 @@ +import pandas as pd + +def delete_duplicate_emails(person: pd.DataFrame) -> None: + person.sort_values(by='id', inplace=True) # so keep='first' retains the smallest id + person.drop_duplicates(subset=['email'], keep='first', inplace=True) \ No newline at end of file diff --git a/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/196. Delete Duplicate Emails.sql b/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/196. Delete Duplicate Emails.sql index 214afb4..b9596df 100644 --- a/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/196. Delete Duplicate Emails.sql +++ b/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/196. Delete Duplicate Emails.sql @@ -1,56 +1,4 @@ -196. Delete Duplicate Emails -Solved -Easy -Topics -Companies -SQL Schema -Pandas Schema -Table: Person - -+-------------+---------+ -| Column Name | Type | -+-------------+---------+ -| id | int | -| email | varchar | -+-------------+---------+ -id is the primary key (column with unique values) for this table. -Each row of this table contains an email. The emails will not contain uppercase letters. - - -Write a solution to delete all duplicate emails, keeping only one unique email with the smallest id. 
- -For SQL users, please note that you are supposed to write a DELETE statement and not a SELECT one. - -For Pandas users, please note that you are supposed to modify Person in place. - -After running your script, the answer shown is the Person table. The driver will first compile and run your piece of code and then show the Person table. The final order of the Person table does not matter. - -The result format is in the following example. - - - -Example 1: - -Input: -Person table: -+----+------------------+ -| id | email | -+----+------------------+ -| 1 | john@example.com | -| 2 | bob@example.com | -| 3 | john@example.com | -+----+------------------+ -Output: -+----+------------------+ -| id | email | -+----+------------------+ -| 1 | john@example.com | -| 2 | bob@example.com | -+----+------------------+ -Explanation: john@example.com is repeated two times. We keep the row with the smallest Id = 1. - - - +-- 196. Delete Duplicate Emails # Write your MySQL query statement below # Write your MySQL query statement below DELETE p2 FROM Person p1 diff --git a/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/readme.md b/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/readme.md new file mode 100644 index 0000000..8e355db --- /dev/null +++ b/LeetCode SQL 50 Solution/196. Delete Duplicate Emails/readme.md @@ -0,0 +1,118 @@ + +# **196. Delete Duplicate Emails** + +## **Problem Statement** +You are given a table called `Person`, which stores email addresses. + +### **Person Table** +``` ++-------------+---------+ +| Column Name | Type | ++-------------+---------+ +| id | int | +| email | varchar | ++-------------+---------+ +``` +- `id` is the **primary key**. +- Each row contains an **email address**. +- All emails are in **lowercase**. + +### **Task:** +Delete all **duplicate emails**, keeping only **one unique email** with the **smallest id**. 
+ +--- + +## **Example 1:** +### **Input:** +#### **Person Table** +``` ++----+------------------+ +| id | email | ++----+------------------+ +| 1 | john@example.com | +| 2 | bob@example.com | +| 3 | john@example.com | ++----+------------------+ +``` +### **Output:** +``` ++----+------------------+ +| id | email | ++----+------------------+ +| 1 | john@example.com | +| 2 | bob@example.com | ++----+------------------+ +``` +### **Explanation:** +- `john@example.com` appears **twice**. +- We keep the row with the **smallest `id`** (`id = 1`). +- The duplicate (`id = 3`) is **deleted**. + +--- + +## **Solution Approaches** + +### **SQL Solution (Using Self Join)** +```sql +DELETE p2 FROM Person p1 +JOIN Person p2 +ON p1.email = p2.email AND p1.id < p2.id; +``` +**Explanation:** +- `p1` and `p2` refer to the **same table** (`Person`). +- We **join** them on `email` to find duplicates. +- If `p1.id < p2.id`, we delete `p2`, keeping the row with the **smallest id**. + +--- + +### **SQL Solution (Using Subquery)** +```sql +DELETE FROM Person +WHERE id NOT IN ( + SELECT min_id FROM (SELECT MIN(id) AS min_id FROM Person GROUP BY email) AS smallest_ids +); +``` +**Explanation:** +- We **group** by `email` and **select the smallest `id`** for each email; the wrapping derived table (`smallest_ids`) is needed because MySQL does not allow a subquery to read directly from the table being deleted from. +- The `DELETE` statement removes rows whose `id` is **not in** this list. + +--- + +### **Pandas Solution** +```python +import pandas as pd + +def delete_duplicate_emails(person: pd.DataFrame) -> None: + person.sort_values(by='id', inplace=True) # so keep='first' retains the smallest id + person.drop_duplicates(subset=['email'], keep='first', inplace=True) +``` +**Explanation:** +- `sort_values(by='id', inplace=True)` followed by `drop_duplicates(subset=['email'], keep='first', inplace=True)`: + - Keeps only **the first occurrence** of each email, which after sorting is the row with the **smallest id**. + - Ensures **modification happens in place**. + +--- + +## **File Structure** +``` +📂 LeetCode196 +│── 📜 problem_statement.md +│── 📜 sql_self_join_solution.sql +│── 📜 sql_subquery_solution.sql +│── 📜 pandas_solution.py +│── 📜 README.md +``` +- `problem_statement.md` → Contains the problem description. 
+- `sql_self_join_solution.sql` → Contains the SQL solution using **JOIN**. +- `sql_subquery_solution.sql` → Contains the SQL solution using **Subquery**. +- `pandas_solution.py` → Contains the Pandas solution for Python users. +- `README.md` → Provides an overview of the problem and solutions. + +--- + +## **Useful Links** +- [LeetCode Problem 196](https://leetcode.com/problems/delete-duplicate-emails/) +- [SQL DELETE Statement](https://www.w3schools.com/sql/sql_delete.asp) +- [Pandas drop_duplicates()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html) + +---
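The subquery-based deletion for problem 196 can be exercised end to end. This is a minimal sketch, assuming Python's built-in `sqlite3` module as a stand-in for MySQL; the derived-table alias `smallest_ids` and the column alias `min_id` are names chosen here for illustration. The rows are inserted out of `id` order on purpose, to show that the result depends on `MIN(id)` rather than on row order. The MySQL-specific `DELETE p2 FROM ... JOIN ...` form is not shown because SQLite does not accept that syntax.

```python
import sqlite3

# In-memory table holding the example rows, deliberately inserted out of id order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INTEGER PRIMARY KEY, email TEXT);
INSERT INTO Person VALUES
    (3, 'john@example.com'), (2, 'bob@example.com'), (1, 'john@example.com');
""")

# Keep the smallest id per email and delete every other row. The inner
# SELECT is wrapped in a derived table so the statement also avoids
# MySQL's restriction on selecting from the table being modified.
conn.execute("""
DELETE FROM Person
WHERE id NOT IN (
    SELECT min_id
    FROM (SELECT MIN(id) AS min_id FROM Person GROUP BY email) AS smallest_ids
)
""")

remaining = conn.execute("SELECT id, email FROM Person ORDER BY id").fetchall()
print(remaining)
```

Only the rows with `id` 1 and 2 remain, matching the expected output table in the example.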