diff --git a/LeetCode SQL 50 Solution/1789. Primary Department for Each Employee/readme.md b/LeetCode SQL 50 Solution/1789. Primary Department for Each Employee/readme.md
index e69de29..db5ce3b 100644
--- a/LeetCode SQL 50 Solution/1789. Primary Department for Each Employee/readme.md
+++ b/LeetCode SQL 50 Solution/1789. Primary Department for Each Employee/readme.md
@@ -0,0 +1,128 @@
+# 🏢 Primary Department for Each Employee - LeetCode 1789
+
+## 📌 Problem Statement
+You are given a table **Employee** that contains the following columns:
+
+- **employee_id**: The ID of the employee.
+- **department_id**: The ID of the department to which the employee belongs.
+- **primary_flag**: An ENUM ('Y', 'N').
+  - If `primary_flag` is `'Y'`, then the department is the primary department for that employee.
+  - If `primary_flag` is `'N'`, then the department is not primary.
+
+**Note:**
+- An employee can belong to multiple departments. When an employee joins multiple departments, they decide which one is their primary (set to `'Y'`).
+- If an employee belongs to only one department, then their `primary_flag` is `'N'`, but that department is still considered their primary department.
+
+Your task is to **report all employees with their primary department**.
+For employees who belong to only one department, report that department.
+
+Return the result table in **any order**.
+
+---
+
+## 📊 Table Structure
+
+### **Employee Table**
+| Column Name   | Type    |
+| ------------- | ------- |
+| employee_id   | int     |
+| department_id | int     |
+| primary_flag  | varchar |
+
+- `(employee_id, department_id)` is the **primary key** for this table.
+
+---
+
+## 📊 Example 1:
+
+### **Input:**
+#### **Employee Table**
+| employee_id | department_id | primary_flag |
+| ----------- | ------------- | ------------ |
+| 1           | 1             | N            |
+| 2           | 1             | Y            |
+| 2           | 2             | N            |
+| 3           | 3             | N            |
+| 4           | 2             | N            |
+| 4           | 3             | Y            |
+| 4           | 4             | N            |
+
+### **Output:**
+| employee_id | department_id |
+| ----------- | ------------- |
+| 1           | 1             |
+| 2           | 1             |
+| 3           | 3             |
+| 4           | 3             |
+
+### **Explanation:**
+- **Employee 1** belongs to only one department (1), so department 1 is their primary.
+- **Employee 2** belongs to departments 1 and 2. The row with `primary_flag = 'Y'` indicates that department 1 is their primary.
+- **Employee 3** belongs only to department 3.
+- **Employee 4** belongs to departments 2, 3, and 4. The row with `primary_flag = 'Y'` indicates that department 3 is their primary.
+
+---
+
+## 🖥 SQL Solution
+
+### ✅ **Approach:**
+- **Step 1:** For employees who have `primary_flag = 'Y'`, choose those rows.
+- **Step 2:** For employees who belong to only one department (`COUNT(*) = 1` per `employee_id`), return that row. `MIN(department_id)` picks the single department of such a group and keeps the query valid when MySQL's `ONLY_FULL_GROUP_BY` mode is enabled.
+- Combine the results using `UNION DISTINCT`, which removes the duplicate produced when a single-department employee is also flagged `'Y'`.
+
+```sql
+SELECT employee_id, department_id
+FROM Employee
+WHERE primary_flag = 'Y'
+UNION DISTINCT
+SELECT employee_id, MIN(department_id)
+FROM Employee
+GROUP BY employee_id
+HAVING COUNT(*) = 1;
+```
+
+---
+
+## 🐍 Python (Pandas) Solution
+
+### ✅ **Approach:**
+1. **Group** the DataFrame by `employee_id`.
+2. For each group:
+   - If any row has `primary_flag == 'Y'`, choose the first such row.
+   - Otherwise (i.e., employee belongs to only one department), choose that row.
+3. Return the resulting DataFrame with only `employee_id` and `department_id`.
+
+```python
+import pandas as pd
+
+def primary_department(employees: pd.DataFrame) -> pd.DataFrame:
+    def select_primary(group):
+        # If there's any row with primary_flag 'Y', choose the first one
+        if (group['primary_flag'] == 'Y').any():
+            return group[group['primary_flag'] == 'Y'].iloc[0]
+        else:
+            # For employees with only one department
+            return group.iloc[0]
+
+    result = employees.groupby('employee_id').apply(select_primary).reset_index(drop=True)
+    return result[['employee_id', 'department_id']]
+```
+
+---
+
+## 📁 File Structure
+```
+📂 Primary-Department
+│── README.md
+│── solution.sql
+│── solution_pandas.py
+│── test_cases.sql
+│── sample_data.csv
+```
+
+---
+
+## 🔗 Useful Links
+- 📖 [LeetCode Problem](https://leetcode.com/problems/primary-department-for-each-employee/)
+- 🔍 [MySQL UNION Operator](https://www.w3schools.com/sql/sql_union.asp)
+- 🐍 [Pandas GroupBy Documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html)
diff --git a/LeetCode SQL 50 Solution/1907. Count Salary Categories/readme.md b/LeetCode SQL 50 Solution/1907. Count Salary Categories/readme.md
index e69de29..e5fc42d 100644
--- a/LeetCode SQL 50 Solution/1907. Count Salary Categories/readme.md
+++ b/LeetCode SQL 50 Solution/1907. Count Salary Categories/readme.md
@@ -0,0 +1,184 @@
+# 💰 Count Salary Categories - LeetCode 1907
+
+## 📌 Problem Statement
+You are given a table **Accounts** that contains information about bank accounts, including their monthly income.
+Your task is to calculate the number of bank accounts in each salary category.
+
+The salary categories are defined as follows:
+- **"Low Salary"**: Salaries strictly less than \$20,000.
+- **"Average Salary"**: Salaries in the inclusive range [\$20,000, \$50,000].
+- **"High Salary"**: Salaries strictly greater than \$50,000.
+
+The result table must contain **all three categories**. If there are no accounts in a category, return 0.
+
+Return the result in **any order**.
+
+---
+
+## 📊 Table Structure
+
+### **Accounts Table**
+| Column Name | Type |
+| ----------- | ---- |
+| account_id  | int  |
+| income      | int  |
+
+- `account_id` is the **primary key** for this table.
+- Each row contains the monthly income for one bank account.
+
+---
+
+## 📊 Example 1:
+
+### **Input:**
+#### **Accounts Table**
+| account_id | income |
+| ---------- | ------ |
+| 3          | 108939 |
+| 2          | 12747  |
+| 8          | 87709  |
+| 6          | 91796  |
+
+### **Output:**
+| category       | accounts_count |
+| -------------- | -------------- |
+| Low Salary     | 1              |
+| Average Salary | 0              |
+| High Salary    | 3              |
+
+### **Explanation:**
+- **Low Salary**: Account with income 12747.
+- **Average Salary**: No accounts have an income in the range [20000, 50000].
+- **High Salary**: Accounts with incomes 108939, 87709, and 91796.
+
+---
+
+## 🖥 SQL Solution
+
+### ✅ **Approach:**
+1. **CTE "S"**: Create a static table with the three salary categories.
+   ```sql
+   WITH S AS (
+       SELECT 'Low Salary' AS category
+       UNION
+       SELECT 'Average Salary'
+       UNION
+       SELECT 'High Salary'
+   ),
+   ```
+   - This defines the three salary categories to ensure every category appears in the final result.
+
+2. **CTE "T"**: Categorize each account from the **Accounts** table using a `CASE` statement and count the number of accounts in each category.
+   ```sql
+   T AS (
+       SELECT
+           CASE
+               WHEN income < 20000 THEN 'Low Salary'
+               WHEN income > 50000 THEN 'High Salary'
+               ELSE 'Average Salary'
+           END AS category,
+           COUNT(1) AS accounts_count
+       FROM Accounts
+       GROUP BY 1
+   )
+   ```
+   - The `CASE` statement assigns a salary category based on the income.
+   - `COUNT(1)` counts the number of accounts in each category.
+
+3. **Final SELECT with LEFT JOIN**: Combine the static category table `S` with the computed counts from `T` to ensure every category is included, using `IFNULL` to convert any missing count to 0.
+   ```sql
+   SELECT S.category, IFNULL(T.accounts_count, 0) AS accounts_count
+   FROM S
+   LEFT JOIN T USING (category);
+   ```
+
+### ✅ **Complete SQL Query:**
+```sql
+WITH S AS (
+    SELECT 'Low Salary' AS category
+    UNION
+    SELECT 'Average Salary'
+    UNION
+    SELECT 'High Salary'
+),
+T AS (
+    SELECT
+        CASE
+            WHEN income < 20000 THEN 'Low Salary'
+            WHEN income > 50000 THEN 'High Salary'
+            ELSE 'Average Salary'
+        END AS category,
+        COUNT(1) AS accounts_count
+    FROM Accounts
+    GROUP BY 1
+)
+SELECT S.category, IFNULL(T.accounts_count, 0) AS accounts_count
+FROM S
+LEFT JOIN T USING (category);
+```
+
+---
+
+## 🐍 Python (Pandas) Solution
+
+### ✅ **Approach:**
+1. **Categorize Accounts**: Create a new column `category` in the DataFrame by applying the salary conditions.
+2. **Group and Count**: Group by the `category` column and count the number of accounts.
+3. **Merge with Static Categories**: Ensure all three salary categories appear by merging with a predefined DataFrame that contains all categories, filling missing counts with 0.
+
+```python
+import pandas as pd
+
+def count_salary_categories(accounts: pd.DataFrame) -> pd.DataFrame:
+    # Define the salary categorization function
+    def categorize(income):
+        if income < 20000:
+            return 'Low Salary'
+        elif income > 50000:
+            return 'High Salary'
+        else:
+            return 'Average Salary'
+
+    # Apply categorization on a copy so the caller's DataFrame is not modified
+    accounts = accounts.assign(category=accounts['income'].apply(categorize))
+
+    # Count accounts in each category
+    counts = accounts.groupby('category').size().reset_index(name='accounts_count')
+
+    # Define static categories DataFrame
+    categories = pd.DataFrame({
+        'category': ['Low Salary', 'Average Salary', 'High Salary']
+    })
+
+    # Merge to ensure all categories are present, fill missing values with 0
+    result = categories.merge(counts, on='category', how='left')
+    result['accounts_count'] = result['accounts_count'].fillna(0).astype(int)
+
+    return result
+
+# Example usage:
+# df = pd.read_csv("sample_accounts.csv")
+# print(count_salary_categories(df))
+```
+
+---
+
+## 📁 File Structure
+```
+📂 Count-Salary-Categories
+│── README.md
+│── solution.sql
+│── solution_pandas.py
+│── test_cases.sql
+│── sample_accounts.csv
+```
+
+---
+
+## 🔗 Useful Links
+- 📖 [LeetCode Problem](https://leetcode.com/problems/count-salary-categories/)
+- 📝 [MySQL WITH Clause (CTE)](https://www.w3schools.com/sql/sql_with.asp)
+- 🔍 [MySQL IFNULL Function](https://www.w3schools.com/sql/func_mysql_ifnull.asp)
+- 🐍 [Pandas GroupBy Documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html)
+```
+
diff --git a/LeetCode SQL 50 Solution/1934. Confirmation Rate/readme.md b/LeetCode SQL 50 Solution/1934. Confirmation Rate/readme.md
index e69de29..eab61d4 100644
--- a/LeetCode SQL 50 Solution/1934. Confirmation Rate/readme.md
+++ b/LeetCode SQL 50 Solution/1934. Confirmation Rate/readme.md
@@ -0,0 +1,173 @@
+# 📩 Confirmation Rate - LeetCode 1934
+
+## 📌 Problem Statement
+You are given two tables: **Signups** and **Confirmations**.
+
+- The **Signups** table contains the signup time for each user.
+- The **Confirmations** table records each confirmation request made by a user along with the outcome (either `'confirmed'` or `'timeout'`).
+
+The **confirmation rate** for a user is defined as:
+\[
+\text{confirmation rate} = \frac{\text{Number of confirmed messages}}{\text{Total number of confirmation requests}}
+\]
+If a user did not request any confirmation messages, their confirmation rate is defined as 0.
+
+Your task is to calculate the confirmation rate for each user and round it to two decimal places.
+
+Return the result table in **any order**.
+
+---
+
+## 📊 Table Structure
+
+### **Signups Table**
+| Column Name | Type     |
+| ----------- | -------- |
+| user_id     | int      |
+| time_stamp  | datetime |
+
+- `user_id` is unique for each user.
+
+### **Confirmations Table**
+| Column Name | Type     |
+| ----------- | -------- |
+| user_id     | int      |
+| time_stamp  | datetime |
+| action      | ENUM     |
+
+- `(user_id, time_stamp)` is the primary key.
+- `action` is either `'confirmed'` or `'timeout'`.
+- `user_id` in Confirmations is a foreign key to Signups.
+
+---
+
+## 📊 Example 1:
+
+### **Input:**
+#### **Signups Table**
+| user_id | time_stamp          |
+| ------- | ------------------- |
+| 3       | 2020-03-21 10:16:13 |
+| 7       | 2020-01-04 13:57:59 |
+| 2       | 2020-07-29 23:09:44 |
+| 6       | 2020-12-09 10:39:37 |
+
+#### **Confirmations Table**
+| user_id | time_stamp          | action    |
+| ------- | ------------------- | --------- |
+| 3       | 2021-01-06 03:30:46 | timeout   |
+| 3       | 2021-07-14 14:00:00 | timeout   |
+| 7       | 2021-06-12 11:57:29 | confirmed |
+| 7       | 2021-06-13 12:58:28 | confirmed |
+| 7       | 2021-06-14 13:59:27 | confirmed |
+| 2       | 2021-01-22 00:00:00 | confirmed |
+| 2       | 2021-02-28 23:59:59 | timeout   |
+
+### **Output:**
+| user_id | confirmation_rate |
+| ------- | ----------------- |
+| 6       | 0.00              |
+| 3       | 0.00              |
+| 7       | 1.00              |
+| 2       | 0.50              |
+
+### **Explanation:**
+- **User 6** did not request any confirmation messages, so the rate is **0.00**.
+- **User 3** made 2 requests; both were timeouts, so the rate is **0.00**.
+- **User 7** made 3 requests; all were confirmed, so the rate is **1.00**.
+- **User 2** made 2 requests; 1 confirmed and 1 timeout, so the rate is **0.50**.
+
+---
+
+## 🖥 SQL Solutions
+
+### ✅ **Solution 1: Using Shorthand Boolean Expressions**
+#### **Explanation:**
+- `SUM(action = 'confirmed')` counts the number of rows where the action is `'confirmed'` (in MySQL, boolean expressions return 1 if true, 0 if false).
+- `COUNT(1)` counts the joined rows for the user: one per confirmation request, or a single all-`NULL` row when the user has no requests.
+- We use a `LEFT JOIN` between **Signups** and **Confirmations** so that users without any confirmation requests are included (their rate becomes 0).
+- `IFNULL` handles users with no confirmation requests: for them `action` is `NULL`, so the `SUM` and therefore the quotient are `NULL`, which `IFNULL` replaces with 0.
+
+```sql
+SELECT
+    user_id,
+    ROUND(IFNULL(SUM(action = 'confirmed') / COUNT(1), 0), 2) AS confirmation_rate
+FROM Signups
+LEFT JOIN Confirmations USING (user_id)
+GROUP BY user_id;
+```
+
+### ✅ **Solution 2: Using a CASE Statement**
+#### **Explanation:**
+- The `CASE` statement explicitly counts 1 for `'confirmed'` actions and 0 otherwise.
+- The rest of the query logic remains similar.
+
+```sql
+SELECT
+    user_id,
+    ROUND(IFNULL(SUM(CASE WHEN action = 'confirmed' THEN 1 ELSE 0 END) / COUNT(1), 0), 2) AS confirmation_rate
+FROM Signups
+LEFT JOIN Confirmations USING (user_id)
+GROUP BY user_id;
+```
+
+---
+
+## 🐍 Python (Pandas) Solution
+
+### ✅ **Approach:**
+1. **Merge** the **Signups** and **Confirmations** DataFrames on `user_id` using a left join, so that all users are included.
+2. **Count** the total number of confirmation requests and the number of confirmed requests for each user.
+3. **Calculate** the confirmation rate as the number of confirmed requests divided by the total requests.
+4. **Handle** users with no confirmation requests by setting their rate to 0.
+5. **Round** the confirmation rate to two decimal places.
+
+```python
+import pandas as pd
+
+def confirmation_rate(signups: pd.DataFrame, confirmations: pd.DataFrame) -> pd.DataFrame:
+    # Merge the dataframes to include all users from signups
+    merged = pd.merge(signups, confirmations, on='user_id', how='left')
+
+    # Group by user_id and calculate total requests and confirmed requests
+    summary = merged.groupby('user_id').agg(
+        total_requests=('action', 'count'),
+        confirmed_requests=('action', lambda x: (x == 'confirmed').sum())
+    ).reset_index()
+
+    # Calculate confirmation rate; if total_requests is 0, rate is 0.
+    summary['confirmation_rate'] = summary.apply(
+        lambda row: round(row['confirmed_requests'] / row['total_requests'], 2) if row['total_requests'] > 0 else 0.00,
+        axis=1
+    )
+
+    # Select the relevant columns
+    result = summary[['user_id', 'confirmation_rate']]
+    return result
+
+# Example usage:
+# signups_df = pd.read_csv("signups.csv")
+# confirmations_df = pd.read_csv("confirmations.csv")
+# print(confirmation_rate(signups_df, confirmations_df))
+```
+
+---
+
+## 📁 File Structure
+```
+📂 Confirmation-Rate
+│── README.md
+│── solution.sql
+│── solution_pandas.py
+│── test_cases.sql
+│── sample_signups.csv
+│── sample_confirmations.csv
+```
+
+---
+
+## 🔗 Useful Links
+- 📖 [LeetCode Problem](https://leetcode.com/problems/confirmation-rate/)
+- 📝 [MySQL IFNULL Function](https://www.w3schools.com/sql/func_mysql_ifnull.asp)
+- 🔍 [MySQL ROUND Function](https://www.w3schools.com/sql/func_mysql_round.asp)
+- 🐍 [Pandas GroupBy Documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html)
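
The confirmation-rate rule in the last README (confirmed requests divided by total requests, with 0 for users who made no requests) can be checked against the Example 1 data without pandas. The sketch below is not part of the diff: the function name `confirmation_rates` and the variables `sample_signups` and `sample_confirmations` are hypothetical, and the inputs are plain Python lists rather than the DataFrames the README's solution takes.

```python
from collections import defaultdict


def confirmation_rates(signup_ids, confirmations):
    """Return {user_id: rate} for every signed-up user.

    signup_ids: iterable of user ids (hypothetical stand-in for Signups).
    confirmations: iterable of (user_id, action) pairs
                   (hypothetical stand-in for Confirmations).
    """
    totals = defaultdict(int)     # confirmation requests per user
    confirmed = defaultdict(int)  # requests that ended as 'confirmed'
    for user_id, action in confirmations:
        totals[user_id] += 1
        if action == 'confirmed':
            confirmed[user_id] += 1

    rates = {}
    for user_id in signup_ids:
        total = totals.get(user_id, 0)
        # Users with no requests get 0, mirroring the LEFT JOIN + IFNULL logic.
        rates[user_id] = round(confirmed.get(user_id, 0) / total, 2) if total else 0.0
    return rates


# Data from Example 1 of the Confirmation Rate README.
sample_signups = [3, 7, 2, 6]
sample_confirmations = [
    (3, 'timeout'), (3, 'timeout'),
    (7, 'confirmed'), (7, 'confirmed'), (7, 'confirmed'),
    (2, 'confirmed'), (2, 'timeout'),
]
rates = confirmation_rates(sample_signups, sample_confirmations)
print(rates)
```

Iterating over the signup ids, rather than over the confirmations, plays the role of the `LEFT JOIN`: every signed-up user appears in the result even when no confirmation row exists for them.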