# Elastic Net Regression in Predicting Obesity Risk

## Gladys Murage

## College of Business, Engineering, and Technology, National University

## DDS8555 v1: PREDICTIVE ANALYSIS (3602869492)

## Dr MOHAMED NABEEL

## April 20, 2025



# Regularized Regression and Elastic Net Implementation
Regularized regression reduces overfitting by adding a penalty on the size of the coefficients to the loss function. There are three main types:
1. Ridge regression (L2 regularization) adds a penalty proportional to the sum of the squared coefficients.
2. Lasso regression (L1 regularization) adds a penalty proportional to the sum of the absolute values of the coefficients.
3. Elastic Net combines both the L1 and L2 penalties.
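
For reference, the three penalties can be written as one objective. The form below follows the parameterization used by scikit-learn's `ElasticNet`, where $\alpha$ sets the overall penalty strength and $\rho$ (the `l1_ratio` argument) sets the mix between L1 and L2. The symbols $w$ (coefficients), $X$ (feature matrix), $y$ (target), and $n$ (number of samples) are notation introduced here for illustration.

$$
\min_{w} \; \frac{1}{2n}\lVert y - Xw \rVert_2^2 \;+\; \alpha\,\rho\,\lVert w \rVert_1 \;+\; \frac{\alpha\,(1-\rho)}{2}\,\lVert w \rVert_2^2
$$

Setting $\rho = 1$ leaves only the Lasso (L1) term, and setting $\rho = 0$ leaves only the Ridge-style (L2) term.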

## Elastic Net is particularly useful when:
1. There are many correlated features.
2. Automatic feature selection is needed, as in Lasso regression.
3. Multicollinearity must be handled, as in Ridge regression.

## This Elastic Net model will:
1. Automatically handle feature scaling and encoding.
2. Search for the best balance between L1 and L2 regularization.
3. Provide interpretable feature importance through its coefficients.
4. Generate numeric predictions that can be mapped back to the original class labels.
5. Show how ElasticNet, a regression model, can be adapted to a classification task.
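
The last two points, turning a regression output back into class labels, can be sketched on toy data. This is a minimal illustration under stated assumptions, not the notebook's pipeline: the data, the label names (`a_low`, `b_mid`, `c_high`), and the variable names are all hypothetical. The idea is to round each continuous prediction to the nearest integer code and clip it to the valid range before inverting the label encoding.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.preprocessing import LabelEncoder

# Toy data (hypothetical): one feature and three ordered classes.
X_toy = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
labels = np.array(["a_low", "a_low", "b_mid", "b_mid", "c_high", "c_high"])

# LabelEncoder assigns codes in sorted label order: a_low=0, b_mid=1, c_high=2.
le_toy = LabelEncoder()
y_codes = le_toy.fit_transform(labels)

model = ElasticNet(alpha=0.01, l1_ratio=0.5).fit(X_toy, y_codes)

# Continuous predictions, including one point far outside the training range.
raw = model.predict(np.array([[0.5], [2.5], [50.0]]))

# Round to the nearest code, then clip so every code is a valid class index.
codes = np.clip(np.rint(raw), 0, len(le_toy.classes_) - 1).astype(int)
pred_labels = le_toy.inverse_transform(codes)
print(pred_labels)
```

Without the clipping step, an out-of-range prediction such as the third one would round to a code that the encoder has never seen, and `inverse_transform` would raise an error.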

# Import libraries
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import LabelEncoder

# Load data
train = pd.read_csv('Otrain.csv')
test = pd.read_csv('Otest.csv')

# Separate features and target
X = train.drop('NObeyesdad', axis=1)
y = train['NObeyesdad']

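# Encode target variable for regression
# Note: LabelEncoder assigns integer codes in sorted (alphabetical) label order,
# which is not necessarily the order of obesity severity
le = LabelEncoder()
y_encoded = le.fit_transform(y)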

# Identify categorical and numerical columns
cat_cols = X.select_dtypes(include=['object']).columns.tolist()
num_cols = X.select_dtypes(include=['number']).columns.tolist()

# Preprocessing pipeline 
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), num_cols),
        ('cat', OneHotEncoder(handle_unknown='ignore'), cat_cols)
    ])

# Train-test split
X_train, X_val, y_train, y_val = train_test_split(X, y_encoded, test_size=0.2, random_state=42)

# ElasticNet model
elastic = ElasticNet(random_state=42)

# Parameter grid for tuning
param_grid = {
    'alpha': [0.001, 0.01, 0.1, 1, 10],
    'l1_ratio': [0.1, 0.3, 0.5, 0.7, 0.9],
    'max_iter': [1000, 5000]
}

# Create pipeline
pipeline = Pipeline([
    ('preprocessor', preprocessor),
    ('regressor', GridSearchCV(elastic, param_grid, cv=5, scoring='neg_mean_squared_error'))
])

# Train model
pipeline.fit(X_train, y_train)

# Get best parameters
best_params = pipeline.named_steps['regressor'].best_params_
print(f"Best parameters: {best_params}")

# Evaluate on validation set
y_pred = pipeline.predict(X_val)
print(f"\nValidation MSE: {mean_squared_error(y_val, y_pred):.4f}")
print(f"Validation R2: {r2_score(y_val, y_pred):.4f}")

# Feature importance
feature_names = (num_cols + 
                list(pipeline.named_steps['preprocessor']
                    .named_transformers_['cat']
                    .get_feature_names_out(cat_cols)))

coefs = pipeline.named_steps['regressor'].best_estimator_.coef_
importance_df = pd.DataFrame({'Feature': feature_names, 'Coefficient': coefs})
importance_df = importance_df.sort_values('Coefficient', key=abs, ascending=False)

print("\nTop 10 Most Important Features:")
print(importance_df.head(10))

# Prepare test predictions if test data exists
if not test.empty:
    test_pred = pipeline.predict(test)
    # If you need to convert back to original classes, round and clip to valid codes:
    # codes = np.clip(test_pred.round(), 0, len(le.classes_) - 1).astype(int)
    # test_pred_labels = le.inverse_transform(codes)
    
    # For regression output, you might want to keep as continuous values
    submission = pd.DataFrame({'Predicted': test_pred})
    submission.to_csv('elasticnet_predictions.csv', index=False)
    print("\nTest predictions saved to 'elasticnet_predictions.csv'")

Best parameters: {'alpha': 0.001, 'l1_ratio': 0.7, 'max_iter': 1000}

Validation MSE: 2.5107
Validation R2: 0.3109

Top 10 Most Important Features:
                              Feature  Coefficient
23                    CALC_Frequently     1.583043
18                            CAEC_no     1.480652
16                    CAEC_Frequently    -0.678778
3                              Weight     0.677520
15                        CAEC_Always    -0.586172
30                     MTRANS_Walking     0.464617
11  family_history_with_overweight_no    -0.340397
13                            FAVC_no     0.335165
14                           FAVC_yes    -0.321621
1                                 Age     0.289111

Test predictions saved to 'elasticnet_predictions.csv'


# Model Performance Metrics
## Validation MSE: 2.5107
1. The average squared difference between predicted and actual encoded values is approximately 2.51.
2. Because MSE is in squared units, the typical error is its square root (the RMSE), √2.51 ≈ 1.58. With seven classes encoded as the integers 0 to 6, predictions are off by roughly one and a half class codes on average, not 2.51.
3. This is a moderate error for a 7-class problem.
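
The conversion from MSE to an error in class-code units is a one-line calculation. A small sketch using the reported value; the comparison against the 0 to 6 code range assumes the seven classes were encoded as consecutive integers, which is what `LabelEncoder` produces.

```python
import math

mse = 2.5107          # validation MSE reported above
code_range = 6        # seven classes encoded as 0..6 (assumption stated in the lead-in)

rmse = math.sqrt(mse)                 # typical error in class-code units
relative_error = rmse / code_range    # error as a fraction of the full code range

print(f"RMSE: {rmse:.2f}")
print(f"RMSE as a share of the code range: {relative_error:.1%}")
```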

## Validation R²: 0.3109
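1. Only 31.09% of the variance in the encoded obesity levels is explained by the Elastic Net model.
2. This indicates that the model captures some signal but likely misses important predictors.
3. Low R² values are common for linear models on complex biological data without feature engineering.
4. Part of the gap may also come from the target encoding: `LabelEncoder` numbers the classes in alphabetical order, which may not match their order of severity.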
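
The two reported metrics are linked, because `r2_score` computes 1 − SS_res / SS_tot, which equals 1 − MSE / Var(y) when Var(y) is the population variance of the validation targets. The sketch below uses that identity to back out the quantities implied by the reported numbers. These are derived values, not measurements taken from the data.

```python
mse = 2.5107   # reported validation MSE
r2 = 0.3109    # reported validation R²

# Rearranging R² = 1 - MSE / Var(y) gives the implied target variance.
implied_variance = mse / (1.0 - r2)

# RMSE of a baseline that always predicts the mean of the validation targets.
baseline_rmse = implied_variance ** 0.5
model_rmse = mse ** 0.5

print(f"Implied target variance: {implied_variance:.2f}")
print(f"Mean-only baseline RMSE: {baseline_rmse:.2f}")
print(f"Model RMSE:              {model_rmse:.2f}")
```

Read this way, the model reduces the typical error from about 1.91 class codes (mean-only baseline) to about 1.58.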

## Feature Importance Analysis
### Top Predictive Features:
#### CALC_Frequently with a coefficient of 1.58
1. Frequent alcohol consumption is associated with higher predicted obesity levels.
2. This is the largest positive coefficient in the model.

#### CAEC_no with a coefficient of 1.48
1. Not consuming food between meals is associated with higher predicted obesity levels.
2. This may suggest that irregular eating patterns correlate with weight gain.

#### CAEC_Frequently with a coefficient of -0.68
1. Frequent between-meal consumption is associated with lower predicted obesity levels.
2. This may indicate healthier snacking habits.

#### Weight with a coefficient of 0.68
1. As expected, higher weight predicts higher obesity levels.
2. Surprisingly, Weight is not the top predictor in this Elastic Net model. Note that Weight is standardized, so its coefficient is per standard deviation, while the one-hot features are 0/1 indicators; the magnitudes are therefore not directly comparable.

#### CAEC_Always with a coefficient of -0.59
1. Constant between-meal eating is associated with lower predicted obesity levels.
2. This could reflect individuals with a high metabolism.

## Key Insights:
1. Eating and drinking patterns dominate (the CALC and CAEC features).
2. Behavioral factors outweigh pure biometrics: Weight ranks only fourth by coefficient magnitude.
3. Some relationships are unexpected, such as "no between-meal eating" being associated with higher obesity levels.
4. Expected strong predictors, such as exercise frequency or diet quality, are missing from the top features.

## Recommendations for Improvement
1. Feature engineering, for example with the following Python code.
Create derived features (the first has the form of the body mass index, weight divided by height squared):
X['weight_height_ratio'] = X['Weight']/(X['Height']**2)
X['meal_frequency'] = X['FCVC'] + X['NCP']

2. Model tuning, for example with the following Python code.
Expand the hyperparameter search:
param_grid = {
    'alpha': np.logspace(-4, 2, 20),
    'l1_ratio': [0, 0.25, 0.5, 0.75, 1],
    'selection': ['cyclic', 'random']
}
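
A related tuning detail: in the training code, `GridSearchCV` sits inside the `Pipeline`, so the scaler and encoder are fit once on all training rows before the cross-validation splits are made. An alternative arrangement, sketched below, searches over the whole pipeline so that preprocessing is refit inside each fold. The data here is synthetic and hypothetical (only the column names mimic the dataset), and the grid is deliberately small to keep the example fast.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (hypothetical values).
rng = np.random.default_rng(0)
n = 120
X_demo = pd.DataFrame({
    "Weight": rng.normal(80, 15, n),
    "Height": rng.normal(1.7, 0.1, n),
    "CAEC": rng.choice(["no", "Sometimes", "Frequently"], n),
})
y_demo = 0.05 * X_demo["Weight"] + rng.normal(0, 0.5, n)

preprocessor = ColumnTransformer([
    ("num", StandardScaler(), ["Weight", "Height"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["CAEC"]),
])

# The estimator is a plain ElasticNet; the search wraps the full pipeline.
pipe = Pipeline([
    ("preprocessor", preprocessor),
    ("regressor", ElasticNet(max_iter=5000, random_state=42)),
])

# The "regressor__" prefix routes each parameter to the ElasticNet step.
param_grid = {
    "regressor__alpha": np.logspace(-3, 1, 5),
    "regressor__l1_ratio": [0.25, 0.5, 0.75],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X_demo, y_demo)
print(search.best_params_)
```

With this layout, the fitted model is available as `search.best_estimator_`, and its coefficients can be read from `search.best_estimator_.named_steps["regressor"].coef_`.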
## Alternative approaches to consider:
1. Gradient-boosted trees (for example, XGBoost), which may better capture non-linear relationships.
2. Ordinal regression, instead of treating the classes as numeric values.
3. Collection of more detailed dietary and exercise data.
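
A lighter step toward respecting the class order is to encode the target with an explicit severity ranking rather than relying on `LabelEncoder`, which numbers labels alphabetically. The sketch below is hedged: the class names and their order are assumptions about the `NObeyesdad` column and should be checked against `train['NObeyesdad'].unique()` before use.

```python
import pandas as pd

# Assumed class names, listed from least to most severe (verify against the data).
severity_order = [
    "Insufficient_Weight",
    "Normal_Weight",
    "Overweight_Level_I",
    "Overweight_Level_II",
    "Obesity_Type_I",
    "Obesity_Type_II",
    "Obesity_Type_III",
]
label_to_code = {label: code for code, label in enumerate(severity_order)}

# Small hypothetical sample of target labels.
y_demo = pd.Series(["Normal_Weight", "Obesity_Type_I", "Overweight_Level_I"])
y_ordinal = y_demo.map(label_to_code)

# For comparison: the codes that an alphabetical sort would assign.
alphabetical = {label: code for code, label in enumerate(sorted(severity_order))}
y_alphabetical = y_demo.map(alphabetical)

print(pd.DataFrame({
    "label": y_demo,
    "ordinal": y_ordinal,
    "alphabetical": y_alphabetical,
}))
```

Under these assumed names, the alphabetical codes place the obesity classes below the overweight classes, so a regression on those codes would be fitting a scale that does not increase with severity.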

## Behavioral Interpretation:
The counterintuitive CAEC findings suggest several possibilities:
1. People reporting no between-meal eating may be under-reporting.
2. These individuals may eat larger, less frequent meals.
3. The results warrant consultation with a domain expert.

## Conclusion:
The Elastic Net model shows moderate predictive power but reveals interesting patterns in eating behaviors that could inform both better modeling and potential interventions. 
The relatively low R² suggests substantial unexplained variance, arising either from missing features or from non-linear relationships that a linear model cannot capture.