# 🧑‍💼 Python Assignment: Employee Data Processing

## 📁 Dataset: `Employee.csv`

The dataset contains employee information for an organization. Each record includes:

- **EID** – Employee ID  
- **Employee Name**  
- **Gender**  
- **Salary**  
- **Date of Joining**  
- **Department**

---

## 🎯 Objectives

### 1. Greet an employee using their first name based on their ID
- **Input**: `E123D`  
- **Output**: `"Hello, Mr Nitin"`

---

### 2. Calculate the average salary of each department
- Write a function that groups employees by department and returns the average salary.

---

### 3. Display employee details by ID
- **Input**: `E123D`  
- **Output**: `Nitin Bhole | 01-11-2016 | Products`  
- Fields to display:
  - Employee Name
  - Date of Joining
  - Department

---

### 4. Sort employees from oldest to newest
- Sort based on **Date of Joining** in ascending order.

---

## ✅ Notes
- Use `pandas` for reading and processing the CSV file.
- Ensure proper formatting and error handling for missing or invalid IDs.
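
The error-handling note above could be approached along these lines. This is a minimal sketch, not the assignment's required solution. The helper name `find_employee`, the ID pattern (an `E`, three digits, one uppercase letter, inferred only from sample IDs such as `E123D`), and the two-row stand-in frame are all assumptions made for illustration.

```python
import re
from typing import Optional

import pandas as pd

# Assumed ID shape, inferred from sample IDs like "E123D"; adjust if the real data differs.
EID_PATTERN = re.compile(r"E\d{3}[A-Z]")


def find_employee(dataframe: pd.DataFrame, emp_id: object) -> Optional[pd.Series]:
    """Return the matching row, or None if the ID is malformed or absent."""
    if not isinstance(emp_id, str):
        return None
    emp_id = emp_id.strip().upper()  # tolerate stray spaces and lowercase input
    if not EID_PATTERN.fullmatch(emp_id):
        return None
    match = dataframe[dataframe["EID"] == emp_id]
    return None if match.empty else match.iloc[0]


# Two rows taken from the dataset, used here as a stand-in for Employee.csv
sample = pd.DataFrame(
    {
        "EID": ["E123D", "E876G"],
        "Employee Name": ["Nitin Bhole", "Mayur Kulkarni"],
    }
)

found = find_employee(sample, " e123d ")   # valid after normalisation
missing = find_employee(sample, "E000X")   # well-formed but not in the data
invalid = find_employee(sample, "12345")   # does not match the assumed pattern
```

Returning `None` for both malformed and unknown IDs keeps the calling code simple. A stricter variant could raise distinct exceptions for the two cases.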

In [1]:
# Import the libraries used in this notebook

import pandas as pd

In [2]:
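# Load the dataset with pandas.
# Note: the assignment names the file `Employee.csv`; on case-sensitive
# file systems the name passed here must match the file on disk exactly.

df = pd.read_csv('employee.csv')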
df

Unnamed: 0,EID,Employee Name,Gender,Salary,Date of Joining,Department
0,E123D,Nitin Bhole,M,1900000,01-11-2016,Products
1,E876G,Mayur Kulkarni,M,1200000,03-06-2016,Semantics
2,E798G,Anurag Pandey,M,1200000,04-07-2016,IT
3,E123B,Pankaj Bansal,M,900000,01-12-2015,Research
4,E098Z,Rajendra Prabhu,M,1000000,11-11-2015,Semantics
5,E294M,Sumanth Mishra,M,1800000,04-09-2017,Research
6,E972L,Sai Bodha,M,1100000,11-11-2017,Semantics
7,E184A,Kushmakar Krishna,M,1100000,01-01-2015,Products
8,E104S,Amarnath Patil,M,600000,29-11-2015,Engineering
9,E093V,Shashidhar Reddy,M,700000,29-11-2018,Engineering


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14 entries, 0 to 13
Data columns (total 6 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   EID              14 non-null     object
 1   Employee Name    14 non-null     object
 2   Gender           14 non-null     object
 3   Salary           14 non-null     int64 
 4   Date of Joining  14 non-null     object
 5   Department       14 non-null     object
dtypes: int64(1), object(5)
memory usage: 804.0+ bytes


In [4]:
df.describe()

Unnamed: 0,Salary
count,14.0
mean,1392857.0
std,1126918.0
min,300000.0
25%,925000.0
50%,1100000.0
75%,1500000.0
max,5000000.0


In [5]:
df.isnull().sum()

EID                0
Employee Name      0
Gender             0
Salary             0
Date of Joining    0
Department         0
dtype: int64

In [6]:
df.duplicated().sum()

0
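
`df.duplicated()` only flags rows that are identical in every column. Since the ID lookups in the tasks assume each EID identifies one employee, it may also be worth checking that column on its own. A small sketch, using a hypothetical three-row frame with a deliberately repeated ID:

```python
import pandas as pd

# Hypothetical data: the second E123D row is invented to show the check firing
demo = pd.DataFrame(
    {
        "EID": ["E123D", "E876G", "E123D"],
        "Salary": [1900000, 1200000, 1000000],
    }
)

full_row_duplicates = int(demo.duplicated().sum())   # rows identical in every column
ids_are_unique = demo["EID"].is_unique               # False if any EID repeats
repeated_ids = demo.loc[demo["EID"].duplicated(keep=False), "EID"].tolist()

print(full_row_duplicates)  # 0
print(ids_are_unique)       # False
print(repeated_ids)         # ['E123D', 'E123D']
```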

In [7]:
df.shape

(14, 6)

In [8]:
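# Task 1: Given an employee ID, greet the employee by first name.
# Input - E123D
# Output - "Hello, Mr Nitin"

def greet_employee(emp_id):
    # Look up the row whose EID matches (parameter renamed to avoid shadowing the built-in `id`)
    emp = df[df['EID'] == emp_id]
    if emp.empty:
        return "Employee not found"

    row = emp.iloc[0]
    name = row['Employee Name'].split()[0]   # first name only
    gender = row['Gender'].upper()

    # Map the gender code to a title; unknown codes get no title
    title = {"M": "Mr", "F": "Ms"}.get(gender, "")

    # Join only the non-empty parts so a missing title leaves no double space
    return "Hello, " + " ".join(part for part in (title, name) if part)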

greet_employee("E092E")

'Hello, Ms Mitali'

In [9]:
# Task 2: Function for calculating the average salary of each department.

def average_salary_by_department(dataframe):
    avg_salary = dataframe.groupby("Department")['Salary'].mean()
    return avg_salary

avg_salaries = average_salary_by_department(df)
print(avg_salaries)

Department
Engineering     650000.0
IT             1200000.0
Products       2075000.0
Research       1350000.0
Semantics      1200000.0
Name: Salary, dtype: float64
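
As a sanity check, the same grouping can be reproduced on a few rows and compared by hand. The sketch below copies the four Engineering and Research rows from the preview in cell 2 into an inline frame (the variable names `subset` and `check` are illustrative). The two means, (600000 + 700000) / 2 and (900000 + 1800000) / 2, agree with the Engineering and Research values printed above.

```python
import pandas as pd

# Rows copied from the dataset preview (Engineering and Research only)
subset = pd.DataFrame(
    {
        "Employee Name": ["Pankaj Bansal", "Sumanth Mishra", "Amarnath Patil", "Shashidhar Reddy"],
        "Salary": [900000, 1800000, 600000, 700000],
        "Department": ["Research", "Research", "Engineering", "Engineering"],
    }
)

# Same operation as average_salary_by_department, applied to the subset
check = subset.groupby("Department")["Salary"].mean()
print(check.to_dict())  # {'Engineering': 650000.0, 'Research': 1350000.0}
```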


In [10]:
# Task 3: Given an employee ID, display the employee's Name, Date of Joining and Department.
# Input - E123D
# Output - Nitin Bhole | 01-11-2016 | Products

def employee_details(emp_id):
    emp = df[df['EID'] == emp_id]
    if emp.empty:
        return "Employee not found"

    row = emp.iloc[0]
    name = row['Employee Name']   # full name, as the expected output shows
    department = row['Department']
    date = row['Date of Joining']

    return f"{name} | {date} | {department}"

employee = employee_details("E123D")
print(employee)

Nitin Bhole | 01-11-2016 | Products


In [11]:
# Task 4: Function to sort employees from oldest to newest based on Date of Joining

def sort_employees_by_joining(dataframe):
    # Work on a copy so the caller's DataFrame keeps its original string dates
    result = dataframe.copy()
    # Dates are stored as DD-MM-YYYY, so parse them day-first
    result['Date of Joining'] = pd.to_datetime(result['Date of Joining'], dayfirst=True)
    return result.sort_values(by='Date of Joining')

# Example usage
sorted_employees = sort_employees_by_joining(df)
print(sorted_employees[['EID', 'Employee Name', 'Date of Joining']])

      EID      Employee Name Date of Joining
12  E620Y     Kumar Abhishek      2014-12-01
7   E184A  Kushmakar Krishna      2015-01-01
4   E098Z    Rajendra Prabhu      2015-11-11
8   E104S     Amarnath Patil      2015-11-29
3   E123B      Pankaj Bansal      2015-12-01
1   E876G     Mayur Kulkarni      2016-06-03
2   E798G      Anurag Pandey      2016-07-04
0   E123D        Nitin Bhole      2016-11-01
5   E294M     Sumanth Mishra      2017-09-04
6   E972L          Sai Bodha      2017-11-11
10  E983Q     Chakshu Bansal      2017-12-04
9   E093V   Shashidhar Reddy      2018-11-29
13  E109U     Sanket Gibbidi      2019-01-11
11  E092E       Mitali Gupta      2019-05-02
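
The day-first parsing matters because a string like `01-11-2016` is ambiguous on its own. The sketch below illustrates the difference and shows one possible alternative to `dayfirst=True`: an explicit `format` string, which assumes every date in the file follows DD-MM-YYYY, combined with `errors="coerce"` so unparseable values become `NaT` instead of raising. The sample strings are illustrative, and `"not a date"` is invented to show the coercion.

```python
import pandas as pd

# The same string read two ways
day_first = pd.to_datetime("01-11-2016", format="%d-%m-%Y")
month_first = pd.to_datetime("01-11-2016", format="%m-%d-%Y")
print(day_first.month)    # 11 (1 November)
print(month_first.month)  # 1  (11 January)

# Explicit format on a column; the invalid entry is coerced to NaT
raw = pd.Series(["01-11-2016", "29-11-2015", "not a date"])
parsed = pd.to_datetime(raw, format="%d-%m-%Y", errors="coerce")
print(parsed.isna().tolist())  # [False, False, True]
```

Rows whose date came back as `NaT` can then be inspected or dropped before sorting.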


In [12]:
sorted_employees

Unnamed: 0,EID,Employee Name,Gender,Salary,Date of Joining,Department
12,E620Y,Kumar Abhishek,M,1100000,2014-12-01,Semantics
7,E184A,Kushmakar Krishna,M,1100000,2015-01-01,Products
4,E098Z,Rajendra Prabhu,M,1000000,2015-11-11,Semantics
8,E104S,Amarnath Patil,M,600000,2015-11-29,Engineering
3,E123B,Pankaj Bansal,M,900000,2015-12-01,Research
1,E876G,Mayur Kulkarni,M,1200000,2016-06-03,Semantics
2,E798G,Anurag Pandey,M,1200000,2016-07-04,IT
0,E123D,Nitin Bhole,M,1900000,2016-11-01,Products
5,E294M,Sumanth Mishra,M,1800000,2017-09-04,Research
6,E972L,Sai Bodha,M,1100000,2017-11-11,Semantics
