### SUMMARY OF THE PROJECT:

This data science project analyzes employee performance at INX Future Inc. The primary objective is to predict employees' performance ratings from factors present in their data, such as total work experience, gender, department, and current role.

The project has four goals:
1. Department-wise performance
2. The top 3 factors affecting employee performance
3. A trained model that can predict employee performance from input factors. This will be used when hiring employees.
4. Recommendations to improve employee performance, based on insights from the analysis.

The provided dataset comprises 1200 records with 28 columns each. These columns are either qualitative or quantitative: 9 are qualitative and 19 are quantitative. One of them is the employee ID, which is alphanumeric and was deemed irrelevant for predicting performance ratings.

This is a supervised multiclass classification problem with 27 features and one target variable (Performance Rating). The dataset has no missing values (Null or NaN).

The dataset contained a few outliers, which were removed using the Interquartile Range (IQR) method. Categorical features were converted to numerical values using manual encoding.
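The two preprocessing steps above can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual code: the 1.5 multiplier on the IQR, the sample rows, the `Gender` column name, and the `Male`/`Female` mapping are all assumptions made for the example.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, column, k=1.5):
    """Keep only the rows whose value in `column` lies inside the IQR fences."""
    lower, upper = iqr_bounds([row[column] for row in rows], k)
    return [row for row in rows if lower <= row[column] <= upper]


def manual_encode(rows, column, mapping):
    """Replace each category in `column` with its number from `mapping`."""
    return [{**row, column: mapping[row[column]]} for row in rows]


# Hypothetical sample rows; the values are invented for illustration only.
years = [0, 1, 1, 2, 2, 3, 4, 15]
genders = ["Male", "Female", "Male", "Female", "Male", "Male", "Female", "Male"]
rows = [
    {"Gender": g, "YearsSinceLastPromotion": y} for g, y in zip(genders, years)
]

cleaned = drop_outliers(rows, "YearsSinceLastPromotion")
# Assumed mapping; the project's real encoding values are not given in this summary.
encoded = manual_encode(cleaned, "Gender", {"Female": 0, "Male": 1})
print(len(rows), "rows before,", len(encoded), "rows after outlier removal")
```

Dropping rows is only one way to handle outliers; capping values at the fences is a common alternative when records are scarce.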

The following machine learning algorithms were used to build models:
1. Logistic Regression
2. Support Vector Machine
3. Decision Tree
4. Random Forest
5. Gradient Boosting
6. K-Nearest Neighbour

Of these, **Random Forest and Gradient Boosting** performed best, with an **accuracy score of about 90%**.

To determine the features that most influence performance ratings, the **Feature Importance** scores provided by the machine learning models were used.

Overall, the project successfully achieves its objectives through the application of machine learning models and visualization techniques, providing valuable insights into employee performance and guiding recommendations for improvement.

### BUSINESS CASE 1: To give insights on Employee Performance for different Departments

**Plot 1: INSIGHTS:**

From the count plot, we can infer that: 

1. Most employees work in the **Sales Department**, followed by the **Development Department** and then the **Research & Development Department**. Each of these departments has between 330 and 370 employees.
2. Far fewer employees work in the Human Resources, Finance and Data Science departments than in the other three.
3. The **Data Science department** has the fewest employees, with **nearly 20**.


**Plot 2: INSIGHTS:**

The count plot provides insights into the distribution of performance ratings across different departments within the organization. Some insights based on the visualization are as follows:

1. **Sales Department:**
This department has a relatively high count of employees with a performance rating of 3, indicating that a significant portion of its employees perform moderately to well.
There is also a considerable count of employees with a performance rating of 2, suggesting some underperformance that might need to be addressed.

2. **Development Department:**
This department shows a similar distribution of performance ratings, with most employees rated 3. This indicates that the majority of employees in this department perform above average.
This department also has the highest number of employees rated 4.

3. **Research & Development Department:**
The count of employees with performance ratings of 3 is notably higher in the Research & Development department compared to other departments, indicating a strong performance trend in this department.
There is also a significant count of employees with performance ratings of 2, suggesting that some improvement opportunities might exist.

4. **Human Resources Department:**
This department exhibits a relatively balanced distribution of performance ratings, with counts spread across ratings 2, 3 and 4. Most of its employees perform at an average level, with a rating of 3.

5. **Finance Department:**
This department also shows a balanced distribution of ratings, with a mix of performance ratings across the range of 2, 3, and 4.

6. **Data Science Department:**
This department has a relatively high proportion of employees with a performance rating of 3, indicating that most of its employees perform above average, though this still leaves room for further improvement.

These insights provide an overview of the performance distribution within each department, highlighting areas of strength as well as potential improvement opportunities.

**Plot 3: INSIGHTS:**

From the plot, we can deduce the following insights:

1. In every department, most employees received a **rating of 3**.
2. In the **Sales Department**, approximately 240 employees received a rating of 3, followed by nearly 80 with a rating of 2. About 30 employees received a rating of 4.
3. The **Human Resources Department** has few employees overall, so only around 37 and 5 employees received ratings of 3 and 4 respectively. Around 10 employees received the lowest rating of 2.
4. In the **Development Department**, most employees received a rating of 3 (nearly 300), followed by a rating of 4 (nearly 50). Very few employees in this department received a rating of 2. **Overall, this department's performance appears very good compared to other departments**.
5. The **Data Science Department** has the fewest employees; most received a rating of 3, followed by 4 (around 20). Only 1 employee in the whole department received a rating of 2.
6. In the **Research & Development Department**, nearly 240 employees received a rating of 3, followed by a rating of 2 (nearly 60-70). Around 40 employees received a rating of 4.
7. In the **Finance Department**, most employees (approx. 30) received a rating of 3, followed by a rating of 2; very few received a rating of 4.
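Because department sizes differ widely (from nearly 20 employees to several hundred), raw counts are hard to compare across departments. The sketch below shows one way to tabulate counts and within-department shares per rating. It uses only the standard library, and the `records` list is invented sample data, not the project's dataset.

```python
from collections import Counter


def rating_counts(records):
    """Count employees for each (department, rating) pair."""
    return Counter(records)


def rating_shares(records):
    """Express each (department, rating) count as a share of its department."""
    counts = rating_counts(records)
    totals = Counter(dept for dept, _ in records)
    return {(dept, rating): n / totals[dept] for (dept, rating), n in counts.items()}


# Hypothetical (department, rating) pairs for illustration only.
records = [
    ("Sales", 3), ("Sales", 3), ("Sales", 2), ("Sales", 4),
    ("Finance", 3), ("Finance", 2),
]

counts = rating_counts(records)
shares = rating_shares(records)
for (dept, rating), share in sorted(shares.items()):
    print(f"{dept:<8} rating {rating}: {counts[(dept, rating)]} employees ({share:.0%})")
```

Comparing shares rather than counts makes a small department such as Data Science directly comparable with a large one such as Sales.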

**Plot 4: INSIGHTS:**

From the plot, the following insights can be drawn:

1. There are more **Male** than **Female** employees in every department.
2. Overall, **Female** employees have outperformed **Male** employees across all departments.
3. No female employee received the lowest rating of 2 in the Data Science department.
4. Most male and female employees received a rating of 3 in each department.
5. Only **Male** employees received a rating of 4 in the Finance department.
6. The **largest numbers of employees** receiving a **rating of 4** are in the **Development Department and Research & Development Department**.

**Plot 5: INSIGHTS:**

From the plot, we can infer that:

1. **Most employees** who received a rating of 3 or 4 have around 7 to 10 years of work experience.
2. Most employees in the **Data Science** department received a **rating of 3**, irrespective of their work experience.
3. Employees in the **Research & Development** department tend to perform better with **increasing work experience**: the more experienced they are, the better their performance. Employees with **more than 28 years of experience** performed better, and **none** of them received a rating of 2.

### BUSINESS CASE 2: To show Top 3 important factors affecting employee performance

**Feature Importance Method: INSIGHTS:**

Using the Feature Importance method with both the Random Forest and Gradient Boosting models, and after analyzing the count plot visualization, the three most important features affecting the performance rating are, in order of importance:

1. Employee Last Salary Hike Percent
2. Employee Environment Satisfaction
3. Years Since Last Promotion
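A ranking like the one above can be read from a fitted tree ensemble. The following is a minimal sketch assuming scikit-learn is available; it uses synthetic data from `make_classification` and placeholder feature names, so its output does not reproduce the project's ranking.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the employee data (assumption: 3 rating classes).
X, y = make_classification(
    n_samples=300,
    n_features=6,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    random_state=0,
)
# Placeholder names; the real dataset's columns would go here.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# feature_importances_ holds one impurity-based score per feature, summing to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
top_three = ranked[:3]
for name, score in top_three:
    print(f"{name}: {score:.3f}")
```

`GradientBoostingClassifier` exposes the same `feature_importances_` attribute, so the ranking step is identical for both models.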

**Plot: INSIGHTS:**

From the count plot for all of the above features, it appears that the most significant factors which are influencing employee performance are:

1. **EmpLastSalaryHikePercent:** Employees who received higher salary hike percentages tend to have higher performance ratings. These employees more often achieved a rating of 4 (the top rating).

2. **EmpEnvironmentSatisfaction:** Employees who report higher levels of satisfaction with their work environment tend to have higher performance ratings.

3. **YearsSinceLastPromotion:** Employees who were promoted within the last two years tend to show higher performance ratings of 3 and 4.

After analyzing the count plot visualization, the features above can be identified as those whose categories consistently show a higher proportion of employees with high performance ratings of 3 and 4. Most employees fall within ratings 3 and 4.

### BUSINESS CASE 3: To Create a Trained Model which can Predict the Employee Performance based on factors as inputs which will be used to hire employees

**INSIGHTS:**

The Machine Learning Algorithms which were used to build models for the provided dataset are as follows:

1. Logistic Regression
2. Support Vector Machine
3. Decision Tree
4. Random Forest
5. Gradient Boosting
6. K-Nearest Neighbour

Of all the algorithms applied, **Random Forest and Gradient Boosting** performed best, with accuracy scores between 88% and 92%.
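A comparison of this kind can be sketched as below, assuming scikit-learn is available. The data is synthetic, and the 75/25 stratified split and default hyperparameters are assumptions for illustration, so the printed accuracies will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the employee data (assumption: 3 rating classes).
X, y = make_classification(
    n_samples=400,
    n_features=8,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    random_state=0,
)
# Hold out 25% of the rows for evaluation, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "K-Nearest Neighbour": KNeighborsClassifier(),
}

# Fit every model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {score:.2%}")
```

Scoring all models on the same held-out split keeps the comparison fair; cross-validation would give a more stable estimate on a dataset of this size.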

### BUSINESS CASE 4: To give recommendations to Improve the Employee Performance based on Insights from Analysis

**INSIGHTS:**

1. **Enhancing employee satisfaction** with their work environment is crucial for improving overall performance. The company should prioritize initiatives aimed at creating a positive and supportive workplace atmosphere.

2. **Offering salary hikes** can serve as a motivating factor for employees, boosting their performance both financially and psychologically. Adequate compensation is essential for maintaining employee morale and engagement.

3. **Providing opportunities for promotion** enables employees to take on more responsibilities and develop leadership skills, leading to improved performance. Recognizing and rewarding deserving employees with promotions can drive motivation and commitment.

4. It is essential to **consider years of experience in the current role** when hiring new employees. Taking this into account helps ensure that new hires are adequately skilled and experienced for their positions, contributing to better performance outcomes.

5. **Maintaining a healthy work-life balance** is instrumental in determining employee performance. Supporting employees in managing their personal and professional lives effectively can positively impact their productivity and job satisfaction.

6. When recruiting for HR positions, **prioritizing female candidates** may yield favorable results, as they tend to perform well compared to their male counterparts in certain roles. Embracing diversity and inclusivity in hiring practices can enhance overall team performance.

7. Departments such as Development and Data Science exhibit higher performance levels than others. **Identifying the strengths and weaknesses of different departments** allows for targeted interventions and resource allocation to optimize overall organizational performance.