# Machine Learning Salary Disparity Analysis

This notebook walks through a salary disparity analysis of job market data using five machine learning models.

## Analysis Components:
1. **KMeans Clustering** - Job market segmentation
2. **Multiple Linear Regression** - Salary prediction
3. **Random Forest Regression** - Salary prediction
4. **Logistic Regression** - Above-average salary classification
5. **Random Forest Classification** - Job categorization
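
As a rough illustration of component 4, the binary target for above-average salary classification could be derived by comparing each salary to the mean. This is a minimal sketch under stated assumptions, not the project's actual implementation: the function name `label_above_average` and the sample salaries are hypothetical, and the real analyzer may use a different threshold or a Spark-based pipeline.

```python
from statistics import mean


def label_above_average(salaries):
    """Return 1 for each salary strictly above the mean, else 0.

    Hypothetical helper for illustration; not part of the project code.
    """
    avg = mean(salaries)
    return [1 if s > avg else 0 for s in salaries]


# Made-up salaries, for illustration only
example_salaries = [60_000, 85_000, 120_000, 95_000]
labels = label_above_average(example_salaries)
print(labels)
```

A classifier such as logistic regression would then be trained to predict these labels from the job features.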

## Focus Areas:
- Salary disparity patterns across job segments
- Feature importance for salary prediction
- Job seeker recommendations based on findings
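
One simple way to quantify disparity across job segments is to compare median salaries per segment. The sketch below assumes each record is a `(segment, salary)` pair. The segment labels, the sample data, and both helper names are hypothetical, and the notebook itself may measure disparity differently.

```python
from collections import defaultdict
from statistics import median


def median_salary_by_segment(records):
    """Group (segment, salary) pairs and return the median salary per segment."""
    groups = defaultdict(list)
    for segment, salary in records:
        groups[segment].append(salary)
    return {segment: median(values) for segment, values in groups.items()}


def disparity_ratio(medians):
    """Ratio of the highest segment median to the lowest."""
    return max(medians.values()) / min(medians.values())


# Made-up records, for illustration only; "A" and "B" stand in for cluster labels
example_records = [
    ("A", 50_000), ("A", 70_000),
    ("B", 90_000), ("B", 110_000), ("B", 130_000),
]
medians = median_salary_by_segment(example_records)
ratio = disparity_ratio(medians)
print(medians, round(ratio, 2))
```

A ratio close to 1 would suggest similar pay across segments, while larger values point to segments worth examining in more detail.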

In [None]:
# Setup and imports
import sys
import os
import warnings
warnings.filterwarnings('ignore')

# Add project root to path
project_root = os.path.abspath('..')
if project_root not in sys.path:
    sys.path.insert(0, project_root)

# Import ML components
from src.ml.salary_disparity import SalaryDisparityAnalyzer
from src.core.analyzer import create_spark_analyzer
from src.core.processor import JobMarketDataProcessor
from src.config.settings import Settings
from src.utils.spark_utils import create_spark_session

print("✓ Imports successful")
print(f"Project root: {project_root}")