## Problem Statement Asked by Hope AI and Solution Provided by Solomon
![image.png](attachment:image.png)

## A) Achieving This with AI:
- Use predictive analytics and machine learning to forecast employee attrition (resignation).
- Collect historical employee data and train a classification model (e.g., logistic regression, random forest) to predict the likelihood of resignation from features such as satisfaction, tenure, and performance.
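
As a rough illustration of the classification idea above, the sketch below fits a logistic regression by plain gradient descent in NumPy. It is not the project's actual model: the toy data, the assumed relationship between satisfaction, tenure, and resignation, and the helper names `fit_logistic_regression` and `predict_proba` are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, lr=0.1, n_iter=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)              # predicted resignation probability
        error = p - y                       # gradient of log loss w.r.t. the logit
        w -= lr * (X.T @ error) / n_samples
        b -= lr * error.mean()
    return w, b

def predict_proba(X, w, b):
    return sigmoid(X @ w + b)

# Hypothetical toy data: satisfaction (1-4) and tenure in years.
rng = np.random.default_rng(0)
n = 500
satisfaction = rng.integers(1, 5, n)
tenure = rng.integers(0, 15, n)

# Assumed relationship, for illustration only:
# lower satisfaction and shorter tenure raise the resignation risk.
true_logit = 1.5 - 1.0 * satisfaction - 0.15 * tenure
resigned = (rng.random(n) < sigmoid(true_logit)).astype(float)

# Standardize features so gradient descent behaves well.
X = np.column_stack([satisfaction, tenure]).astype(float)
X = (X - X.mean(axis=0)) / X.std(axis=0)

w, b = fit_logistic_regression(X, resigned)
risk = predict_proba(X, w, b)               # one risk score per employee
print("weights:", w, "bias:", b)
```

In practice a library implementation (for example scikit-learn's `LogisticRegression` or `RandomForestClassifier`) would likely replace the hand-written training loop; the loop is shown only to make the mechanics visible.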

## B) 3-Stage Problem Identification:
1. Data Collection: Gather employee data (demographics, performance, satisfaction, attendance, etc.)
2. Feature Engineering & Analysis: Identify key factors influencing resignation, preprocess data.
3. Model Development & Evaluation: Build and validate a predictive model to identify at-risk employees.
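
The preprocessing part of stage 2 might look like the sketch below. It assumes column names matching the dummy dataset in section D and uses a hypothetical four-row sample; the encoding choices (0/1 for `OverTime`, one-hot for `Department`) are one reasonable option, not the only one.

```python
import pandas as pd

# Hypothetical mini-sample with the same column names as the dummy dataset.
sample = pd.DataFrame({
    'Department': ['IT', 'HR', 'Sales', 'IT'],
    'OverTime': ['Yes', 'No', 'No', 'Yes'],
    'YearsAtCompany': [10, 3, 2, 9],
    'ResignedNextMonth': [0, 0, 1, 0],
})

# Separate the target from the input features.
target = sample['ResignedNextMonth']
features = sample.drop(columns='ResignedNextMonth')

# Binary column: map Yes/No to 1/0.
features['OverTime'] = features['OverTime'].map({'Yes': 1, 'No': 0})

# Nominal column: one-hot encode, one indicator column per department.
features = pd.get_dummies(features, columns=['Department'], dtype=int)

print(features)
```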

## C) Project Name: "Employee Attrition Prediction System"

## D) Dummy Dataset Creation

### Explanation of `np.random.choice([0, 1], num_employees, p=[0.9, 0.1])`

- `np.random.choice([0, 1], num_employees, p=[0.9, 0.1])` generates a random array of length `num_employees` (here, 100), where each element is either `0` or `1`.
- The `p=[0.9, 0.1]` argument specifies the probability of selecting each value:
    - `0` (Stay) is chosen with 90% probability.
    - `1` (Will resign) is chosen with 10% probability.
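- This simulates employee resignation: most employees are labeled as staying and a small fraction as resigning next month.
- Because the labels are drawn independently of every other column, the dummy data is suitable for testing the pipeline, not for learning real attrition patterns.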
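
A quick way to see what `p=[0.9, 0.1]` does is to compare the share of `1`s in a small and a large draw. The seed and sample sizes below are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only for repeatability

labels_small = np.random.choice([0, 1], 100, p=[0.9, 0.1])
labels_large = np.random.choice([0, 1], 100_000, p=[0.9, 0.1])

# With 100 draws the share of 1s can drift noticeably from 10%;
# with 100,000 draws it settles close to the requested probability.
print("share of 1s (n=100):    ", labels_small.mean())
print("share of 1s (n=100000): ", labels_large.mean())
```

With only 100 employees, the number of simulated resignations will vary from seed to seed, so an exact count of 10 should not be expected.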

In [None]:
import pandas as pd
import numpy as np


np.random.seed(42)
num_employees = 100

dummy_data = pd.DataFrame({
    'EmployeeID': range(1, num_employees + 1),
    'Age': np.random.randint(22, 60, num_employees),
    'Department': np.random.choice(['IT', 'HR', 'Finance', 'Sales', 'Operations'], num_employees),
    'YearsAtCompany': np.random.randint(0, 15, num_employees),
    'JobSatisfaction': np.random.randint(1, 5, num_employees),  # 1: Low, 4: High
    'PerformanceRating': np.random.randint(1, 5, num_employees),
    'MonthlyIncome': np.random.randint(15000, 100000, num_employees),
    'OverTime': np.random.choice(['Yes', 'No'], num_employees),
    'ResignedNextMonth': np.random.choice([0, 1], num_employees, p=[0.9, 0.1])  # 1: Will resign, 0: Stay
})

# Display the first few rows
print(dummy_data.head())

   EmployeeID  Age  Department  YearsAtCompany  JobSatisfaction  \
0           1   50          IT              10                3   
1           2   36     Finance               3                2   
2           3   29  Operations               2                4   
3           4   42     Finance               9                3   
4           5   40          IT               2                3   

   PerformanceRating  MonthlyIncome OverTime  ResignedNextMonth  
0                  4          74638      Yes                  0  
1                  4          88666      Yes                  0  
2                  2          82215      Yes                  0  
3                  4          84042      Yes                  0  
4                  2          28284       No                  0  
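
Since roughly 90% of the labels are `0`, accuracy alone can be misleading when a model is evaluated later. The sketch below scores a trivial baseline that predicts "stay" for everyone. Note that `y_true` here is a fresh draw for illustration and will not match the `ResignedNextMonth` column above, because the random stream is consumed differently.

```python
import numpy as np

np.random.seed(42)
# Illustrative labels with the same 90/10 split (not identical to dummy_data).
y_true = np.random.choice([0, 1], 100, p=[0.9, 0.1])

# Baseline "model": predict 0 (stay) for every employee.
y_pred = np.zeros_like(y_true)

accuracy = (y_pred == y_true).mean()
true_pos = int(((y_pred == 1) & (y_true == 1)).sum())
actual_pos = int((y_true == 1).sum())
# Recall: share of actual resignations the baseline catches.
recall = true_pos / actual_pos if actual_pos else float('nan')

print("accuracy:", accuracy)
print("recall for class 1:", recall)
```

The baseline scores high accuracy while catching no resignations, which is why metrics such as recall or precision on the positive class are usually worth reporting alongside accuracy for this kind of problem.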
