# Task Breakdown

1. Data Understanding & Insights
- Load and explore the dataset using pandas and matplotlib/seaborn.
- Perform exploratory data analysis (EDA):
    - Find missing values, data types, and unique categories.
    - Visualize key relationships:
        - Attrition vs. Age
        - Attrition vs. MonthlyIncome
        - Attrition vs. Department
        - Correlation heatmap for numerical features
    - Write 3–5 key insights from your observations (e.g., “Younger employees tend to leave more frequently.”).
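The EDA checks above could start from something like the following sketch. The data here is a small synthetic stand-in (an assumption, since the dataset's file name and full schema are not stated); only the column names `Age`, `MonthlyIncome`, `Department`, and `Attrition` come from the task description.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataset.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Age": rng.integers(20, 60, size=n).astype(float),
    "MonthlyIncome": rng.normal(6000, 2000, size=n).round(0),
    "Department": rng.choice(["Sales", "R&D", "HR"], size=n),
    "Attrition": rng.choice(["Yes", "No"], size=n, p=[0.2, 0.8]),
})
df.loc[::25, "Age"] = np.nan  # inject a few missing values for illustration

# One table answering "missing values, data types, unique categories".
summary = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "n_missing": df.isna().sum(),
    "n_unique": df.nunique(),
})

# Numeric views of the relationships that the plots would show.
age_by_attrition = df.groupby("Attrition")["Age"].agg(["mean", "median"])
attrition_rate_by_dept = df["Attrition"].eq("Yes").groupby(df["Department"]).mean()
corr = df.select_dtypes(include="number").corr()

print(summary)
print(age_by_attrition)
print(attrition_rate_by_dept)
```

For the plots themselves, seaborn calls such as `sns.boxplot(data=df, x="Attrition", y="Age")`, `sns.countplot(data=df, x="Department", hue="Attrition")`, and `sns.heatmap(corr, annot=True)` are one reasonable option.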

2. Data Preprocessing (Pipeline Implementation)
- Split the dataset into train and test sets (80:20).
- Use a ColumnTransformer and Pipeline to perform:
    - Numerical preprocessing: SimpleImputer(strategy='mean') + StandardScaler
    - Categorical preprocessing: SimpleImputer(strategy='most_frequent') + OneHotEncoder(handle_unknown='ignore')
- Combine preprocessing and model into a single pipeline for each model type.
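A minimal sketch of the preprocessing steps listed above, wired into a single pipeline with Logistic Regression as the example model. The toy data, the `stratify=y` choice, and `random_state=42` are assumptions added for illustration; they are not prescribed by the task.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for the real dataset.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Age": rng.integers(20, 60, size=n).astype(float),
    "MonthlyIncome": rng.normal(6000, 2000, size=n),
    "Department": rng.choice(["Sales", "R&D", "HR"], size=n),
    "Attrition": rng.choice(["Yes", "No"], size=n, p=[0.2, 0.8]),
})
df.loc[::25, "Age"] = np.nan  # some missing values for the imputer to fill

X = df.drop(columns="Attrition")
y = df["Attrition"].map({"No": 0, "Yes": 1})

# 80:20 split; stratifying keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

numeric_cols = X.select_dtypes(include="number").columns.tolist()
categorical_cols = X.select_dtypes(exclude="number").columns.tolist()

numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])
categorical_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric_pipe, numeric_cols),
    ("cat", categorical_pipe, categorical_cols),
])

# Preprocessing and model in one object, so the imputer, scaler, and
# encoder are fitted on the training data only.
clf = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression(max_iter=1000)),
])
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Swapping the `"model"` step for another estimator gives the per-model pipelines the task asks for while reusing the same `preprocess` definition.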

3. Model Implementation
- Train and evaluate the following models using the preprocessed data:
    - Linear Regression (for predicting MonthlyIncome as an additional regression subtask)
    - Logistic Regression (for Attrition classification)
    - Regularized Linear Models:
        - Ridge
        - LASSO
        - Elastic Net
    - Decision Tree Classifier
- Use GridSearchCV to tune hyperparameters such as:
    - Ridge/LASSO alpha values
    - ElasticNet l1_ratio
    - DecisionTree max_depth and min_samples_split
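The tuning step above might look like the sketch below. It assumes the regularized linear models are applied to the MonthlyIncome regression subtask, because scikit-learn's `Ridge`, `Lasso`, and `ElasticNet` are regressors; for Attrition classification, `LogisticRegression(penalty="elasticnet", solver="saga")` would be the closest analog. The toy data, grid values, and scoring choices are illustrative assumptions, and imputation is omitted because the toy data has no missing values.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: income grows with age, younger staff leave more often.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Age": rng.integers(20, 60, size=n).astype(float),
    "Department": rng.choice(["Sales", "R&D", "HR"], size=n),
})
df["MonthlyIncome"] = 100.0 * df["Age"] + rng.normal(0, 500, size=n)
df["Attrition"] = (rng.random(n) < np.where(df["Age"] < 30, 0.5, 0.1)).astype(int)


def make_preprocessor(numeric_cols, categorical_cols):
    """Scale numeric columns and one-hot encode categorical ones."""
    return ColumnTransformer([
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])


# Regression subtask: tune ElasticNet alpha and l1_ratio.
reg = Pipeline([
    ("preprocess", make_preprocessor(["Age"], ["Department"])),
    ("model", ElasticNet(max_iter=10000)),
])
reg_grid = {
    "model__alpha": [0.01, 0.1, 1.0, 10.0],
    "model__l1_ratio": [0.2, 0.5, 0.8],
}
reg_search = GridSearchCV(reg, reg_grid, cv=5, scoring="neg_root_mean_squared_error")
reg_search.fit(df[["Age", "Department"]], df["MonthlyIncome"])

# Classification task: tune tree depth and min_samples_split.
tree = Pipeline([
    ("preprocess", make_preprocessor(["Age", "MonthlyIncome"], ["Department"])),
    ("model", DecisionTreeClassifier(random_state=0)),
])
tree_grid = {
    "model__max_depth": [2, 4, 6, None],
    "model__min_samples_split": [2, 10, 20],
}
tree_search = GridSearchCV(tree, tree_grid, cv=5, scoring="roc_auc")
tree_search.fit(df[["Age", "MonthlyIncome", "Department"]], df["Attrition"])

print("ElasticNet best params:", reg_search.best_params_)
print("DecisionTree best params:", tree_search.best_params_)
```

The `model__` prefix in each grid key refers to the pipeline step named `"model"`, which is how `GridSearchCV` reaches parameters of an estimator nested inside a `Pipeline`.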

4. Model Evaluation
- Use appropriate metrics:
    - For regression: RMSE, R² Score
    - For classification: Accuracy, Precision, Recall, F1 Score, ROC-AUC
- Create a comparison dataframe summarizing all models’ performances.
- Plot ROC curves for classification models.
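One way to collect these metrics into a comparison dataframe is sketched below. It uses `make_classification` as a synthetic stand-in for the preprocessed Attrition data (an assumption), and the helper name `regression_metrics` is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score, f1_score, mean_squared_error, precision_score,
    r2_score, recall_score, roc_auc_score, roc_curve,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def regression_metrics(y_true, y_pred):
    """RMSE and R² for the regression subtask (hypothetical helper)."""
    rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))
    return {"rmse": rmse, "r2": float(r2_score(y_true, y_pred))}


# Synthetic, imbalanced binary data standing in for the Attrition target.
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

rows, roc_points = [], {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    rows.append({
        "model": name,
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
    })
    fpr, tpr, _ = roc_curve(y_test, y_score)
    roc_points[name] = (fpr, tpr)  # points to plot as the ROC curve

comparison = (
    pd.DataFrame(rows).set_index("model").sort_values("roc_auc", ascending=False)
)
print(comparison.round(3))
```

The `(fpr, tpr)` pairs in `roc_points` can be drawn with `plt.plot(fpr, tpr, label=name)`, or the curves can be produced directly with `RocCurveDisplay.from_estimator(model, X_test, y_test)`.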

5. Insights & Interpretation
- Identify which model performs best and why.
- Discuss which features most strongly influence Attrition (use `.coef_` or `.feature_importances_`).
- Write a short business interpretation of the findings, for example:
    - “Employees with low satisfaction and longer overtime hours are more likely to leave.”
    - “Elastic Net performed best, with a balanced bias-variance trade-off.”

💡 Stretch Goals (Optional)

- Automate the workflow end to end with Pipeline + GridSearchCV.
- Perform feature selection using model coefficients or recursive feature elimination.
- Visualize the Decision Tree structure using plot_tree().
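As a starting point for the feature-selection and tree-visualization goals, the sketch below runs recursive feature elimination with `RFE` and prints a fitted tree as text. It uses synthetic data and placeholder feature names (both assumptions), and it swaps in `export_text` so the example runs without a plotting backend; `plot_tree(tree)` accepts the same fitted estimator when matplotlib is available.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the preprocessed Attrition features.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

# Recursive feature elimination: repeatedly drop the weakest coefficient.
selector = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=3)
selector.fit(X, y)
selected = [name for name, keep in zip(feature_names, selector.support_) if keep]
print("Selected features:", selected)

# Shallow tree so that the printed structure stays readable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
tree_text = export_text(tree, feature_names=feature_names)
print(tree_text)
```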