This notebook calculates the attrition rate from a dataset using pandas. It loads the data, explores it, handles any missing values, and then calculates the attrition rate.

Attrition Rate Analysis

This code will:

Load the dataset.
Explore it to ensure the necessary column (Attrition) exists.
Handle any missing values.
Calculate the attrition rate.
Save the summary as a CSV file for documentation.

In [5]:
import pandas as pd

# Step 1: Load the dataset
file_path = 'greendestination.csv'
data = pd.read_csv(file_path)

# Step 2: Data Exploration
# Display the first few rows to understand the structure of the data
print("Dataset Overview:")
print(data.head())



Dataset Overview:
   Age Attrition     BusinessTravel  DailyRate              Department  \
0   41       Yes      Travel_Rarely       1102                   Sales   
1   49        No  Travel_Frequently        279  Research & Development   
2   37       Yes      Travel_Rarely       1373  Research & Development   
3   33        No  Travel_Frequently       1392  Research & Development   
4   27        No      Travel_Rarely        591  Research & Development   

   DistanceFromHome  Education EducationField  EmployeeCount  EmployeeNumber  \
0                 1          2  Life Sciences              1               1   
1                 8          1  Life Sciences              1               2   
2                 2          2          Other              1               4   
3                 3          4  Life Sciences              1               5   
4                 2          1        Medical              1               7   

   ...  RelationshipSatisfaction StandardHours  StockOpt

In [7]:
# Check for missing values
print("\nMissing Values:")
print(data.isnull().sum())

# Ensure the 'Attrition' column exists
if 'Attrition' not in data.columns:
    raise ValueError("The dataset does not contain an 'Attrition' column.")




Missing Values:
Age                         0
Attrition                   0
BusinessTravel              0
DailyRate                   0
Department                  0
DistanceFromHome            0
Education                   0
EducationField              0
EmployeeCount               0
EmployeeNumber              0
EnvironmentSatisfaction     0
Gender                      0
HourlyRate                  0
JobInvolvement              0
JobLevel                    0
JobRole                     0
JobSatisfaction             0
MaritalStatus               0
MonthlyIncome               0
MonthlyRate                 0
NumCompaniesWorked          0
Over18                      0
OverTime                    0
PercentSalaryHike           0
PerformanceRating           0
RelationshipSatisfaction    0
StandardHours               0
StockOptionLevel            0
TotalWorkingYears           0
TrainingTimesLastYear       0
WorkLifeBalance             0
YearsAtCompany              0
YearsInCurrentRole     
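The existence check above confirms that the column is present, but not that it holds only the expected labels. A minimal sketch of a stricter check follows; the helper name check_attrition_column, the allowed labels ("Yes", "No"), and the toy frame are assumptions made for illustration, not part of the original notebook.

```python
import pandas as pd

def check_attrition_column(df, column="Attrition", allowed=("Yes", "No")):
    """Hypothetical helper: raise ValueError if the column is missing
    or contains labels outside `allowed`."""
    if column not in df.columns:
        raise ValueError(f"The dataset does not contain an '{column}' column.")
    # Labels present in the data, ignoring missing values
    found = set(df[column].dropna().unique())
    unexpected = found - set(allowed)
    if unexpected:
        raise ValueError(f"Unexpected values in '{column}': {sorted(unexpected)}")
    return sorted(found)

# Toy frame standing in for the real dataset (values are illustrative only)
sample = pd.DataFrame({"Attrition": ["Yes", "No", "No", None]})
labels = check_attrition_column(sample)
print(labels)
```

Running the check before the calculation means a misspelled or differently cased label raises an error instead of silently lowering the count of 'Yes' rows.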

In [9]:

# Step 3: Handle Missing Data (if any)
# Drop rows where 'Attrition' is missing so they are not counted as employees.
# The check above reports 0 missing values in this column, so no rows are removed here.
data = data.dropna(subset=['Attrition'])
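To show what this step does when labels really are missing, here is a small sketch on toy data. The toy values, and the assumption that labels might carry stray whitespace or inconsistent capitalisation, are illustrative only; the dataset above shows neither problem.

```python
import pandas as pd

# Toy data with one missing label and inconsistent formatting (illustrative only)
toy = pd.DataFrame({"Attrition": ["Yes", " no", None, "YES", "No"]})

# Normalise labels first: trim whitespace and use consistent capitalisation
cleaned = toy.copy()
cleaned["Attrition"] = cleaned["Attrition"].str.strip().str.capitalize()

# Then drop rows where the label is still missing
before = len(cleaned)
cleaned = cleaned.dropna(subset=["Attrition"])
dropped = before - len(cleaned)

print(f"Dropped {dropped} row(s) with no Attrition label")
print(cleaned["Attrition"].tolist())
```

Reporting how many rows were dropped makes it clear what denominator the attrition rate is based on.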


In [10]:
# Step 4: Calculate Attrition Rate
# Total number of employees
total_employees = len(data)

# Number of employees with Attrition = 'Yes'
attrition_count = int((data['Attrition'] == 'Yes').sum())

# Attrition rate formula
attrition_rate = (attrition_count / total_employees) * 100



In [11]:
# Step 5: Display Results
print("\nTotal Employees:", total_employees)
print("Employees with Attrition (Yes):", attrition_count)
print(f"Attrition Rate: {attrition_rate:.2f}%")




Total Employees: 1470
Employees with Attrition (Yes): 237
Attrition Rate: 16.12%
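The rate is the number of 'Yes' rows divided by the total number of rows, times 100. As a sketch, the same calculation can be wrapped in a function that uses the mean of a boolean Series; the function name attrition_rate_pct and the toy frame are assumptions for illustration. The last lines recompute the rate from the counts printed above (237 of 1470).

```python
import pandas as pd

def attrition_rate_pct(df, column="Attrition", positive="Yes"):
    """Hypothetical helper: share of rows whose label equals `positive`, in percent."""
    if len(df) == 0:
        raise ValueError("Cannot compute a rate for an empty dataset.")
    # The mean of a boolean Series is the fraction of True values
    return float((df[column] == positive).mean() * 100)

# Toy frame: 1 of 4 employees left (illustrative only)
toy = pd.DataFrame({"Attrition": ["Yes", "No", "No", "No"]})
toy_rate = attrition_rate_pct(toy)

# Cross-check of the figures printed above: 237 leavers out of 1470 employees
reported_rate = round(237 / 1470 * 100, 2)

print(toy_rate, reported_rate)
```

The empty-dataset guard matters if the cleaning step ever removes every row, since the division would otherwise fail or return NaN.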


In [13]:
# Step 6: Save Results to a File (Optional)
result = {
    "Total Employees": total_employees,
    "Employees with Attrition (Yes)": attrition_count,
    "Attrition Rate (%)": round(attrition_rate, 2)
}

# Convert results to a DataFrame and save as a CSV file
result_df = pd.DataFrame([result])
output_path = 'attrition_rate_summary.csv'
result_df.to_csv(output_path, index=False)
print(f"\nAttrition summary saved to {output_path}")


Attrition summary saved to attrition_rate_summary.csv
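Because output_path is a relative path, the file is written to the notebook's current working directory. The sketch below shows one way to confirm where a relative path resolves and that the summary survives a round trip through CSV. It writes to a temporary directory, which is an assumption made so the example leaves no files behind; the summary values are the ones reported above.

```python
import os
import tempfile
import pandas as pd

# Same summary values as reported above
result = {
    "Total Employees": 1470,
    "Employees with Attrition (Yes)": 237,
    "Attrition Rate (%)": 16.12,
}

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "attrition_rate_summary.csv")
    pd.DataFrame([result]).to_csv(path, index=False)
    reloaded = pd.read_csv(path)

# os.path.abspath shows where a relative output path would resolve
resolved = os.path.abspath("attrition_rate_summary.csv")
print(resolved)
print(reloaded.to_dict(orient="records")[0])
```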
