Employee and Manager Salaries

Find employees who are earning more than their managers. Output the employee's first name along with the corresponding salary.

In [1]:
import pandas as pd
import numpy as np

In [3]:
employee = pd.read_csv("../CSV/employee.csv")
employee.head()

Unnamed: 0,id,first_name,last_name,age,sex,employee_title,department,salary,target,bonus,email,city,address,manager_id
0,5,Max,George,26,M,Sales,Sales,1300,200,150,Max@company.com,California,2638 Richards Avenue,1
1,13,Katty,Bond,56,F,Manager,Management,150000,0,300,Katty@company.com,Arizona,,1
2,11,Richerd,Gear,57,M,Manager,Management,250000,0,300,Richerd@company.com,Alabama,,1
3,10,Jennifer,Dion,34,F,Sales,Sales,1000,200,150,Jennifer@company.com,Alabama,,13
4,19,George,Joe,50,M,Manager,Management,100000,0,300,George@company.com,Florida,1003 Wyatt Street,1


In [4]:
merged = pd.merge(employee, employee, left_on="manager_id", right_on="id")
merged.head()

Unnamed: 0,id_x,first_name_x,last_name_x,age_x,sex_x,employee_title_x,department_x,salary_x,target_x,bonus_x,...,sex_y,employee_title_y,department_y,salary_y,target_y,bonus_y,email_y,city_y,address_y,manager_id_y
0,5,Max,George,26,M,Sales,Sales,1300,200,150,...,F,Manager,Management,200000,0,300,Allen@company.com,California,1069 Ventura Drive,1
1,13,Katty,Bond,56,F,Manager,Management,150000,0,300,...,F,Manager,Management,200000,0,300,Allen@company.com,California,1069 Ventura Drive,1
2,11,Richerd,Gear,57,M,Manager,Management,250000,0,300,...,F,Manager,Management,200000,0,300,Allen@company.com,California,1069 Ventura Drive,1
3,10,Jennifer,Dion,34,F,Sales,Sales,1000,200,150,...,F,Manager,Management,150000,0,300,Katty@company.com,Arizona,,1
4,19,George,Joe,50,M,Manager,Management,100000,0,300,...,F,Manager,Management,200000,0,300,Allen@company.com,California,1069 Ventura Drive,1


In [5]:
result = merged[merged["salary_x"] > merged["salary_y"]][["first_name_x", "salary_x"]]
result

Unnamed: 0,first_name_x,salary_x
2,Richerd,250000


Solution Walkthrough
In this problem, we have a dataset called employee that contains information about employees such as their first name, salary, and their manager's ID. We need to find employees who are earning more than their managers and output their first name along with their salary. The solution involves merging the dataset with itself based on the manager ID, filtering the results based on the salary condition, and finally selecting the required columns.

Understanding The Data
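The dataset `employee` contains several columns (id, first_name, last_name, age, sex, employee_title, department, salary, target, bonus, email, city, address, and manager_id). This problem only needs four of them: `id`, `first_name`, `salary`, and `manager_id`. Each row represents an employee, and the `manager_id` column holds the `id` of that employee's manager.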
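
As a quick check of the structure described above, the sketch below builds a small stand-in dataframe and confirms that the columns the solution relies on are present and that every `manager_id` points at an existing `id`. The rows are hypothetical toy data loosely based on the sample output (the first name Allen is inferred from the email address shown there), and `needed`, `missing`, and `orphans` are illustrative names that do not appear in the original notebook.

```python
import pandas as pd

# Hypothetical toy rows loosely modeled on the sample output; not the real file.
employee = pd.DataFrame({
    "id": [1, 5, 13, 11, 10],
    "first_name": ["Allen", "Max", "Katty", "Richerd", "Jennifer"],
    "salary": [200000, 1300, 150000, 250000, 1000],
    "manager_id": [1, 1, 1, 1, 13],
})

# Columns the solution relies on.
needed = ["id", "first_name", "salary", "manager_id"]
missing = [col for col in needed if col not in employee.columns]

# Rows whose manager_id matches no id; an inner merge would drop these.
orphans = employee[~employee["manager_id"].isin(employee["id"])]

print("missing columns:", missing)
print("employees without a matching manager:", len(orphans))
```

On the real file, a non-empty `orphans` frame would mean some employees silently disappear from the merged result.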

The Problem Statement
We need to find employees who are earning more than their managers and output their first name along with their salary.

Breaking Down The Code
Let's break down the solution code step by step for better understanding:

1. Import the necessary libraries:

import pandas as pd
import numpy as np
- `pandas` is a library used for data manipulation and analysis.
- `numpy` is a library used for numerical operations and computations. It is imported in the notebook, but this solution does not actually use it.
2. Merge the dataset with itself based on the manager ID:

merged = pd.merge(
    employee, employee, left_on="manager_id", right_on="id"
)
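- `pd.merge` combines two dataframes on one or more key columns. By default it performs an inner join, so an employee whose `manager_id` matches no `id` is dropped from the result.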
- The first argument `employee` is the left dataframe that we want to merge.
- The second argument `employee` is the right dataframe that we want to merge.
- `left_on` is the column from the left dataframe to merge on, which is `manager_id` in this case.
- `right_on` is the column from the right dataframe to merge on, which is `id` in this case.
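
The self-merge can be easier to read with explicit suffixes instead of the default `_x` and `_y`. The following is a minimal sketch on hypothetical toy rows (not the real CSV). The suffix names `_emp` and `_mgr` are an illustrative choice, passed through the `suffixes` parameter of `merge`.

```python
import pandas as pd

# Hypothetical toy rows; employee 1 is their own manager.
employee = pd.DataFrame({
    "id": [1, 5, 13, 11, 10],
    "first_name": ["Allen", "Max", "Katty", "Richerd", "Jennifer"],
    "salary": [200000, 1300, 150000, 250000, 1000],
    "manager_id": [1, 1, 1, 1, 13],
})

# Left side = the employee, right side = that employee's manager.
merged = employee.merge(
    employee,
    left_on="manager_id",
    right_on="id",
    suffixes=("_emp", "_mgr"),  # assumed names, replacing the default _x/_y
)

print(merged[["first_name_emp", "salary_emp", "first_name_mgr", "salary_mgr"]])
```

Each row of `merged` now pairs one employee with their manager, so the two salaries sit side by side in the same row.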
3. Filter the results based on the salary condition:

result = merged[merged["salary_x"] > merged["salary_y"]]
- `merged["salary_x"]` is the salary column from the left dataframe, i.e. the employee's salary. Because both dataframes share the same column names, pandas appends the default suffixes `_x` (left) and `_y` (right).
- `merged["salary_y"]` is the salary column from the right dataframe, i.e. the manager's salary.
- `merged["salary_x"] > merged["salary_y"]` compares the two columns row by row and produces a boolean Series that is `True` where the employee earns more than their manager.
- `result = merged[merged["salary_x"] > merged["salary_y"]]` uses that boolean Series to keep only the rows that satisfy the condition and assigns them to the `result` variable.
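
To see what the comparison produces before it is used as a filter, the sketch below stores it in a variable first. It runs on hypothetical toy rows rather than the real CSV, and `earns_more` is an illustrative name that is not in the original notebook.

```python
import pandas as pd

# Hypothetical toy rows; employee 1 is their own manager.
employee = pd.DataFrame({
    "id": [1, 5, 13, 11, 10],
    "first_name": ["Allen", "Max", "Katty", "Richerd", "Jennifer"],
    "salary": [200000, 1300, 150000, 250000, 1000],
    "manager_id": [1, 1, 1, 1, 13],
})

merged = pd.merge(employee, employee, left_on="manager_id", right_on="id")

# Boolean Series: True where the employee's salary exceeds the manager's.
earns_more = merged["salary_x"] > merged["salary_y"]

# Indexing with the boolean Series keeps only the True rows.
result = merged[earns_more]

print(earns_more.sum(), "employee(s) earn more than their manager")
```

Note that the comparison is strict, so an employee who earns exactly as much as their manager (including someone listed as their own manager) is not selected.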
4. Select the required columns:

result = result[["first_name_x", "salary_x"]]
- `result[['first_name_x','salary_x']]` selects the `first_name_x` and `salary_x` columns from the `result` dataframe.
- `first_name_x` is the first name of the employee and `salary_x` is the salary of the employee.
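
As an alternative sketch, the row filter and the column selection can be done in a single `.loc` call, and the `_x` suffixes can be removed from the output with `rename`. This is one possible variation on hypothetical toy rows, not the notebook's own code. The output column names `first_name` and `salary` are an assumption about what a reader might prefer.

```python
import pandas as pd

# Hypothetical toy rows; employee 1 is their own manager.
employee = pd.DataFrame({
    "id": [1, 5, 13, 11, 10],
    "first_name": ["Allen", "Max", "Katty", "Richerd", "Jennifer"],
    "salary": [200000, 1300, 150000, 250000, 1000],
    "manager_id": [1, 1, 1, 1, 13],
})

merged = pd.merge(employee, employee, left_on="manager_id", right_on="id")

# .loc[row_mask, column_list] filters rows and selects columns in one step.
result = merged.loc[
    merged["salary_x"] > merged["salary_y"],
    ["first_name_x", "salary_x"],
].rename(columns={"first_name_x": "first_name", "salary_x": "salary"})

print(result)
```

Using `.loc` avoids chaining two indexing operations, which is the form pandas generally recommends when a filtered frame may be modified later.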
Bringing It All Together
The complete code is as follows:

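import pandas as pd
import numpy as np  # imported in the notebook; not used below

# Load the data (same path as in the notebook cell above)
employee = pd.read_csv("../CSV/employee.csv")

# Self-merge: pair each employee (_x) with their manager (_y)
merged = pd.merge(
    employee, employee, left_on="manager_id", right_on="id"
)
# Keep employees who earn more than their manager, then select the output columns
result = merged[merged["salary_x"] > merged["salary_y"]][["first_name_x", "salary_x"]]
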
This code merges the employee dataframe with itself based on the manager ID, filters the merged dataframe based on the salary condition, and selects the required columns (first name and salary) for the final result.

Conclusion
A self-merge on `manager_id` and `id`, followed by a salary comparison and a column selection, finds the employees who earn more than their managers. On this dataset the only match is Richerd, with a salary of 250000.