# Overview

**The Problem Presented:**
XYZ company manages thousands of employees spread across dozens of projects in different countries. During 2024, executive management began to notice worrying indicators:
*   High employee turnover rates in some locations
*   Some teams under excessive pressure while others have hardly any tasks
*   Operational costs rising steadily without a clear explanation
*   Difficulty predicting the efficiency of human resources in new projects

Management immediately asked the analysis team to intervene. The challenge, however, is not only analysis: it is building an intelligent, dynamic system that supports real-time decision-making.

**Required:**
You are now a data analyst within an operational emergency team, and your tasks are:
*   **Understand the complete picture:**
    Start from raw data scattered among employee files, project details, daily tasks, and attendance and departure records.
*   **Data processing and linking:**
*   **Analysis of the root causes of the problem:**
    *   What is the reason for the high employee turnover?
    *   Why does productivity vary between locations?
    *   How are tasks distributed? Is there a problem with scheduling?
*   **Designing "What-if" scenarios:**
    What would happen if:
    *   Employees were rotated between projects?
    *   Working hours were increased or decreased?
    *   Task distribution were changed based on efficiency?

*   **Creating an intelligent automation system:**
    Management does not want a one-time analysis but a living model, one that is ready to support decision-making at any time, updates its results and analyses automatically whenever the data changes, and saves the operational teams time and effort.
*   **Proposing practical solutions for the year 2025:**
    Present an operational plan based on data and realistic analysis.
    Explain with numbers how resource allocation can be improved and waste reduced.

**Deadline: Sunday, June 1, 2025**

What matters most is the methodology, the depth of the analysis, the ability to present outputs in a way that reflects your thinking process, and the ability to turn chaos into a clear, executable vision.

# Data Gathering
The dataset is split across four CSV files, each loaded as one table. The table names and their columns are listed below.

- **Attendance(Employee ID, Date, Check In, Check Out)**

- **Employees(Employee ID, Position, Department, Country, Date of Joining, Status, Daily Working Hours)**

- **Projects(Project ID, Project Name, Location, Current Headcount, Required Headcount, Start Date, End Date)**

- **Tasks(Employee ID, Project ID, Task Type, Status, Date, Assigned By)**

# KPIs relevant to our data
- `Average Employee Tenure`:
Calculate the difference between each employee's last attendance date and their joining date, then take the average of those differences across all employees.
- `Turnover Rate`:
This is calculated by dividing the total number of resigned employees by the total number of employees.
- `Active Rate`:
This is calculated by dividing the total number of active employees by the total number of employees.
- `On-leave Rate`:
This is calculated by dividing the total number of on-leave employees by the total number of employees.
- `Open Task Rate`:
$$
\frac{\text{Total Number of Open Tasks}}{\text{Total Number of Tasks}}
$$

- `Escalated Task Rate`:
$$
\frac{\text{Total Number of Escalated Tasks}}{\text{Total Number of Tasks}}
$$
- `Completion Task Rate`:
$$
\frac{\text{Total Number of Completed Tasks}}{\text{Total Number of Tasks}}
$$
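
The three task rates above could be written as DAX measures along the following lines. This is a minimal sketch, not the measures used in the report: it assumes that `Tasks[Status]` stores the literal values `"Open"`, `"Escalated"`, and `"Completed"`, and `Total Tasks` is a helper measure introduced here for illustration.

```dax
Total Tasks = COUNTROWS(Tasks)

Open Task Rate =
// Assumption: open tasks are stored with Status = "Open"
DIVIDE(
    CALCULATE(COUNTROWS(Tasks), Tasks[Status] = "Open"),
    [Total Tasks]
)

Escalated Task Rate =
// Assumption: escalated tasks are stored with Status = "Escalated"
DIVIDE(
    CALCULATE(COUNTROWS(Tasks), Tasks[Status] = "Escalated"),
    [Total Tasks]
)

Completion Task Rate =
// Assumption: completed tasks are stored with Status = "Completed"
DIVIDE(
    CALCULATE(COUNTROWS(Tasks), Tasks[Status] = "Completed"),
    [Total Tasks]
)
```

`DIVIDE` returns a blank instead of an error when the denominator is zero, which keeps visuals clean when a filter leaves no tasks in scope.
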
- Actual daily working hours versus standard working hours, to assess whether employees work more than their standard hours and whether this correlates with turnover.
    - Calculate the difference between check-out and check-in times, average it per employee, and compare the result to that employee's standard working hours. Then check whether employees who left the company worked more hours than average.
- `Employee Shortage Rate`:
Tracks the shortfall in headcount per project and whether it affects the employee turnover rate.
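
The document does not give a formula for the Employee Shortage Rate. One plausible definition, stated here as an assumption, is the gap between required and current headcount divided by the required headcount. A sketch of that definition as a DAX measure:

```dax
Employee Shortage Rate =
// Assumed definition: (required - current) / required
DIVIDE(
    SUM(Projects[Required Headcount]) - SUM(Projects[Current Headcount]),
    SUM(Projects[Required Headcount])
)
```

Because it is built from sums, the measure works both per project (when sliced by `Project ID`) and across all projects. Under this definition, the averages reported in the exploration section (31.5 current versus 77.9 required) would correspond to an overall shortage of roughly 60%.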

# Exploratory Data Analysis

### Data Assessment and Initial Exploration
* There are 300 employees in our company.
* The most common position is technician, and the least common is coordinator.
* The largest department is Field Service, and the smallest is Logistics.
* Most employees work in Egypt, and the fewest work in KSA.
* The first employee joined the company on January 2nd, 2022, and the last employee joined on December 28th, 2023.
* Approximately 33% of employees resigned, 33% are active, and 33% are on leave.
* Standard daily working hours range from a minimum of 6 to a maximum of 9.
* Most employees work more than 8 hours a day, which is above the average of 7.633 hours.
* The attendance table contains data for the period from 1/1/2024 to 4/30/2024 (4 months).
* Check-in times range from 7:30 (earliest) to 8:30 (latest), and check-out times range from 1:30 (earliest) to 6:30 (latest).
* There are 20 projects. Most of them are located in Cairo (9 projects).
* Current headcount per project ranges from 11 to 48, with an average of 31.5.
* Required headcount per project ranges from 53 to 99, with an average of 77.9. Since the lowest required headcount (53) exceeds the highest current headcount (48), every project is understaffed.
* The Tasks table contains data for the period from 1/1/2024 to 4/30/2024 (4 months), the same time period as the Attendance table.

# Data Cleaning
- In the `Attendance` table
    * Created a new column, **Actual Working Hours**, as the difference between the check-out and check-in times.
    * Created a new table by grouping on `Employee ID` and calculating the average working hours and the last attendance date per employee.
    * Joined the resulting table with the `Employees` table using an inner join on `Employee ID`.
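
    The steps above could also be approximated with DAX calculated columns. This is a sketch of an alternative, not the group-and-join procedure described above: it assumes `Check In` and `Check Out` are Time or DateTime columns recorded on the same day, and it writes the per-employee results straight onto `Employees` instead of building an intermediate table.

    ```dax
    Actual Working Hours =
    // Calculated column on Attendance: shift length in hours
    DATEDIFF(Attendance[Check In], Attendance[Check Out], MINUTE) / 60

    Avg. Actual Working Hours =
    // Calculated column on Employees; FILTER matches on Employee ID,
    // so no model relationship is required
    AVERAGEX(
        FILTER(Attendance, Attendance[Employee ID] = Employees[Employee ID]),
        Attendance[Actual Working Hours]
    )

    Last Attendance Date =
    // Calculated column on Employees: most recent attendance record
    MAXX(
        FILTER(Attendance, Attendance[Employee ID] = Employees[Employee ID]),
        Attendance[Date]
    )
    ```

    One difference from an inner join: employees with no attendance rows are kept, with blank values in the two new columns, rather than being dropped.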

- In the `Employees` table
    * Created a new column, `Employee Tenure (Months)`, as the difference between `Last Attendance Date` and `Date of Joining`.
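
    A possible DAX version of this column, together with the `Average Employee Tenure` KPI built on it, is sketched below. It assumes `Last Attendance Date` already exists as a column on `Employees`; the blank check is an added safeguard for employees without attendance records.

    ```dax
    Employee Tenure (Months) =
    // Calculated column on Employees; left blank when there is no attendance date
    IF(
        NOT ISBLANK(Employees[Last Attendance Date]),
        DATEDIFF(Employees[Date of Joining], Employees[Last Attendance Date], MONTH)
    )

    Average Employee Tenure =
    // Measure: mean tenure in months (blank tenures are ignored by AVERAGE)
    AVERAGE(Employees[Employee Tenure (Months)])
    ```

    Note that `DATEDIFF` with `MONTH` counts the month boundaries crossed, not full elapsed months, so January 31 to February 1 counts as 1.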


- In the newly created table (`_Measures`)
    * Calculated a new measure called `Working Hours Variance`, the percentage difference between the average actual working hours per employee and the standard daily working hours.
    $$
    100 \times \frac{\text{Avg. Actual Working Hours} - \text{Daily Working Hours}}{\text{Daily Working Hours}}
    $$
        - Positive values → Overload: the employee is working more than the standard (overtime).

        - Negative values → Underload: the employee is working less than the standard.

        - 0% → The employee is working exactly the standard daily hours.

      DAX formula:

      ```dax
      Working Hours Variance =
      // Returns a fraction (e.g. 0.05); format the measure as a percentage
      // to match the 100x factor in the formula above.
      DIVIDE(
          SUMX(
              Employees,
              Employees[Avg. Actual Working Hours] - Employees[Daily Working Hours]
          ),
          SUM(Employees[Daily Working Hours])
      )
      ```
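
      As a worked illustration with made-up numbers (not taken from the dataset), suppose employee A averages 9.0 actual hours against an 8-hour standard and employee B averages 5.7 hours against a 6-hour standard. The DAX measure sums the differences before dividing, so for the two employees together it gives:

      $$
      \frac{(9.0 - 8) + (5.7 - 6)}{8 + 6} = \frac{0.7}{14} = 0.05 = 5\%
      $$

      whereas averaging the two per-employee percentages gives:

      $$
      \frac{1}{2}\left(\frac{9.0 - 8}{8} + \frac{5.7 - 6}{6}\right) = \frac{12.5\% - 5\%}{2} = 3.75\%
      $$

      The two agree for a single employee, but at group level the measure weights each employee by their standard hours, which is worth keeping in mind when reading totals.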
        
    * `Total Employees`
      ```dax
      Total Employees = COUNT(Employees[Employee ID])
      ```
    * `Active Employees Rate`

      ```dax
      Active Employees Rate =
      DIVIDE(
          CALCULATE(
              COUNT(Employees[Employee ID]),
              Employees[Status] = "Active"
          ),
          [Total Employees]
      )
      ```

    * `Resigned Employees Rate`

      ```dax
      Resigned Employees Rate =
      DIVIDE(
          CALCULATE(
              COUNT(Employees[Employee ID]),
              Employees[Status] = "Resigned"
          ),
          [Total Employees]
      )
      ```

    * `On-leave Employees Rate`

      ```dax
      On-leave Employees Rate =
      DIVIDE(
          CALCULATE(
              COUNT(Employees[Employee ID]),
              Employees[Status] = "On Leave"
          ),
          [Total Employees]
      )
      ```