
# 📊 Creating Fake DataFrames for Explanation  

After the training process, we will have three key datasets:  

- **Training DataFrame**: Used to train the model.  
- **Validation DataFrame**: Used to tune the model's hyperparameters.  
- **Testing DataFrame**: Used to evaluate the final model performance.  

For the sake of this example, let's assume this process is already completed. Below, we generate sample DataFrames to demonstrate how to use the ML evaluation library. 📝🔍  

⚠ **Note**:

- `validation_dataframes` is required for the library to operate.
- `train_dataframes` and `test_dataframes` are optional but useful for more comprehensive model evaluations.



In [0]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, accuracy_score

# Generate synthetic data for demonstration
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Simulate some categorical and continuous features for segmentation
np.random.seed(42)
age = np.random.randint(20, 60, size=1000)
income = np.random.randint(30000, 100000, size=1000)
gender = np.random.choice(['M', 'F'], size=1000)
region = np.random.choice(['Urban', 'Suburban', 'Rural'], size=1000)
education = np.random.choice(['High School', "Bachelor's", "Master's", 'PhD'], size=1000)
marital_status = np.random.choice(['Single', 'Married', 'Divorced'], size=1000)

# Add a timestamp column with random dates within a range
start_date = pd.Timestamp('2023-01-01')
end_date = pd.Timestamp('2023-12-31')
timestamps = pd.to_datetime(np.random.randint(start_date.value // 10**9, end_date.value // 10**9, size=1000), unit='s')

# Add those features to the data
features_df = pd.DataFrame({
    'Age': age,
    'Income': income,
    'Gender': gender,
    'Region': region,
    'Education Level': education,
    'Marital Status': marital_status,
    'Timestamp': timestamps
})

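# Slice the synthetic features to match the sizes of the train, val, and test sets.
# The slices are positional and not tied to the rows picked by train_test_split,
# which is fine here because the features are random and independent of X and y.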
train_features = features_df.iloc[:len(X_train)]
val_features = features_df.iloc[len(X_train):len(X_train)+len(X_val)]
test_features = features_df.iloc[len(X_train)+len(X_val):]

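# Instantiate four models (fixed seeds keep reruns reproducible)
model1 = LogisticRegression(max_iter=1000)
model2 = RandomForestClassifier(random_state=42)
model3 = SVC(probability=True, random_state=42)  # probability=True enables predict_proba
# use_label_encoder is deprecated in recent XGBoost releases, so it is omitted here
model4 = XGBClassifier(eval_metric='logloss', random_state=42)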

# Fit models
model1.fit(X_train, y_train)
model2.fit(X_train, y_train)
model3.fit(X_train, y_train)
model4.fit(X_train, y_train)

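# Function to create predictions dataframe
# Note: dataset_type only labels the call site ("train", "val", "test"); it is not written to the output.
def create_predictions_df(model, X, y, features, dataset_type):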
    return pd.DataFrame({
        "ID": np.arange(len(X)),
        "Age": features['Age'].values,
        "Income": features['Income'].values,
        "Gender": features['Gender'].values,
        "Region": features['Region'].values,
        "Education Level": features['Education Level'].values,
        "Marital Status": features['Marital Status'].values,
        "Timestamp": features['Timestamp'].values,
        "Target (True Label)": y,
        "Prediction": model.predict(X),
        "Prediction_Probabilities y(1)": model.predict_proba(X)[:, 1].round(2)
    })

# Generate predictions for each model
logistic_train = create_predictions_df(model1, X_train, y_train, train_features, "train")
logistic_val = create_predictions_df(model1, X_val, y_val, val_features, "val")
logistic_test = create_predictions_df(model1, X_test, y_test, test_features, "test")

rf_train = create_predictions_df(model2, X_train, y_train, train_features, "train")
rf_val = create_predictions_df(model2, X_val, y_val, val_features, "val")
rf_test = create_predictions_df(model2, X_test, y_test, test_features, "test")

svm_train = create_predictions_df(model3, X_train, y_train, train_features, "train")
svm_val = create_predictions_df(model3, X_val, y_val, val_features, "val")
svm_test = create_predictions_df(model3, X_test, y_test, test_features, "test")

xgb_train = create_predictions_df(model4, X_train, y_train, train_features, "train")
xgb_val = create_predictions_df(model4, X_val, y_val, val_features, "val")
xgb_test = create_predictions_df(model4, X_test, y_test, test_features, "test")


#### Display what the DataFrame looks like
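
As a minimal sketch of the layout, the snippet below builds a small stand-in DataFrame with the same columns that `create_predictions_df` returns and previews it. The three rows are invented purely for illustration; in the notebook itself, calling `logistic_val.head()` shows the real predictions.

```python
import pandas as pd

# Stand-in rows with the same columns that create_predictions_df produces.
# The values are made up for illustration only.
sample_df = pd.DataFrame({
    "ID": [0, 1, 2],
    "Age": [34, 52, 27],
    "Income": [48000, 91000, 36500],
    "Gender": ["F", "M", "F"],
    "Region": ["Urban", "Rural", "Suburban"],
    "Education Level": ["Bachelor's", "PhD", "High School"],
    "Marital Status": ["Single", "Married", "Divorced"],
    "Timestamp": pd.to_datetime(["2023-02-14", "2023-07-01", "2023-11-23"]),
    "Target (True Label)": [1, 0, 1],
    "Prediction": [1, 0, 0],
    "Prediction_Probabilities y(1)": [0.87, 0.12, 0.45],
})

# Column names and dtypes first, then the first rows
print(sample_df.dtypes)
print(sample_df.head())
```

The last three columns carry the true label, the predicted label, and the predicted probability of class 1, while the remaining columns are available for segmentation.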


## ⚙️ Parameters of the ML Evaluation Framework

### 1. 🏷️ `model_names` (*List of Strings*)  

- **📌 Description**: A list containing the names of machine learning models to be evaluated.  
- **📝 Example**: `["logistic_regression", "random_forest"]`  

### 2. 📊 `train_dataframes` (*List of Pandas DataFrames, optional*)  

- **📌 Description**: A list of training datasets corresponding to each model. Optional parameter.  
- **📝 Example**: `[training_dataframe_model_1, training_dataframe_model_2]`  

### 3. 🧪 `validation_dataframes` (*List of Pandas DataFrames, required*)  

- **📌 Description**: A list of validation datasets that will be used to assess model performance. **Mandatory** for the library to function.  
- **📝 Example**: `[validation_dataframe_model_1, validation_dataframe_model_2]`  

### 4. 🎯 `test_dataframes` (*List of Pandas DataFrames, optional*)  

- **📌 Description**: A list of test datasets used for final evaluation after model training. Optional parameter.  
- **📝 Example**: `[test_dataframe_model_1, test_dataframe_model_2]`  

### ⚠️ **Important Notes**  

- ✅ `validation_dataframes` is **required** for the library to operate.  
- ⚡ `train_dataframes` and `test_dataframes` are **optional** but useful for more comprehensive model evaluations.  
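
The rules above can be checked before handing the lists to the library. The sketch below is a hypothetical pre-flight helper (`check_evaluation_inputs` is not part of the library). It assumes that every supplied list should hold one pandas DataFrame per model name, which matches the examples above but is not confirmed by the library itself.

```python
import pandas as pd

def check_evaluation_inputs(model_names, validation_dataframes,
                            train_dataframes=None, test_dataframes=None):
    """Hypothetical pre-flight check; not part of the evaluation library."""
    # validation_dataframes is the only mandatory list
    if not validation_dataframes:
        raise ValueError("validation_dataframes is required")

    named_lists = {
        "train_dataframes": train_dataframes,
        "validation_dataframes": validation_dataframes,
        "test_dataframes": test_dataframes,
    }
    for name, frames in named_lists.items():
        if frames is None:
            continue  # optional list not supplied
        # Assumption: one DataFrame per model name
        if len(frames) != len(model_names):
            raise ValueError(
                f"{name} has {len(frames)} entries, expected {len(model_names)}"
            )
        for frame in frames:
            if not isinstance(frame, pd.DataFrame):
                raise TypeError(f"{name} must contain only pandas DataFrames")
    return True

# Usage with two tiny stand-in frames
names = ["logistic_regression", "random_forest"]
val_frames = [pd.DataFrame({"Prediction": [0, 1]}),
              pd.DataFrame({"Prediction": [1, 1]})]
print(check_evaluation_inputs(names, val_frames))
```

Failing early with a clear message is usually easier to debug than an error raised deep inside the evaluation run.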


### 🔹 Step 1: Define Model Names and Load Data  

To use this library, you first need to 🏗️ define the model names and 📥 load the relevant datasets.  
Store the 🏋️ training, 🧪 validation, and 🎯 test DataFrames in lists, with one entry per model in the same order as `model_names`.


#### Classification parameters

In [0]:

# 🏷️ List of Model Names  
model_names = ["Old Model (Logistic Regression)", "Challenger Model Random Forest", "Challenger Model SVM", "Challenger Model XGBoost"]  

# 📊 Training DataFrames for Each Model  
train_dataframes = [logistic_train, rf_train, svm_train, xgb_train]  

# 🧪 Validation DataFrames for Model Performance Assessment  
validation_dataframes = [logistic_val, rf_val, svm_val, xgb_val]  

# 🎯 Test DataFrames for Final Evaluation  
test_dataframes = [logistic_test, rf_test, svm_test, xgb_test]  
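
Before running the full evaluation, it can help to sanity-check the lists with a quick per-model summary. The sketch below is a hypothetical helper (`summarize_binary_predictions` is not part of the library) that computes accuracy, precision, and recall for class 1 from the `Target (True Label)` and `Prediction` columns. It is demonstrated on two tiny stand-in frames with invented labels.

```python
import pandas as pd

TARGET_COL = "Target (True Label)"
PRED_COL = "Prediction"

def summarize_binary_predictions(model_names, dataframes):
    """Hypothetical helper: accuracy, precision and recall (class 1) per model."""
    rows = []
    for name, df in zip(model_names, dataframes):
        y_true = df[TARGET_COL]
        y_pred = df[PRED_COL]
        tp = int(((y_true == 1) & (y_pred == 1)).sum())  # true positives
        fp = int(((y_true == 0) & (y_pred == 1)).sum())  # false positives
        fn = int(((y_true == 1) & (y_pred == 0)).sum())  # false negatives
        rows.append({
            "model": name,
            "accuracy": float((y_true == y_pred).mean()),
            # NaN when the denominator is zero (no predicted or actual positives)
            "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
            "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        })
    return pd.DataFrame(rows).set_index("model")

# Tiny stand-in frames with invented labels
demo_names = ["model_a", "model_b"]
demo_frames = [
    pd.DataFrame({TARGET_COL: [1, 0, 1, 0], PRED_COL: [1, 0, 0, 0]}),
    pd.DataFrame({TARGET_COL: [1, 0, 1, 0], PRED_COL: [1, 1, 1, 0]}),
]
summary = summarize_binary_predictions(demo_names, demo_frames)
print(summary)
```

In the notebook, `summarize_binary_predictions(model_names, validation_dataframes)` would produce the same kind of table for the four fitted models.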


### 🛠️ Step 2: Modify Configuration Files (if needed)  

If custom configurations are required, you can adjust the ⚙️ settings by editing the YAML file:  

📂 **model_evaluation/settings/model_evaluation_settings.yaml**  

```yaml
output_directory: "../eda_lib/model_evaluation/output/"
# 📂 The directory where all output files (e.g., logs, evaluation metrics) generated during model evaluation will be saved.
# ⚠️ Ensure this path exists beforehand and has the necessary write permissions to avoid runtime issues.

parameters:

    ai_feedback: False
    # 🤖 A boolean flag to indicate whether AI-generated feedback will be utilized during the evaluation process.
    # ✅ Set to `True` to include AI feedback integration, or leave as `False` to disable it.
    # ⚠️ Note: The AI feedback feature is not fully tuned yet; keeping this set to `False` is recommended for now.

    task_type: "classification"
    # 🎯 Defines the type of task being evaluated. The selected task type impacts the metrics and logic applied during evaluation.
    # 🔹 Possible options:
    #   - "classification" 🏷️: For tasks such as binary or multi-class classification.
    #   - "regression" 📈: For tasks that predict continuous values.

time_based_analysis_parameters:

    activate_analysis: True
    # ⏳ Boolean flag to activate or deactivate time-based analysis.
    # ✅ Set to `True` to enable evaluations that consider temporal trends in the data.

    timestamp_column: "Timestamp"
    # 📅 The name of the column containing timestamp information in the dataset.
    # ⏰ This column will be used to perform time-based segmentations.

    number_of_months: 4
    # 📆 Specifies the time interval (in months) to use for the time-based analysis.
    # 🔹 Examples:
    #   - Use `3` for quarterly analysis. 📊
    #   - Use `12` for annual analysis. 📅

non_time_based_analysis_parameters:

    activate_analysis: True
    # 🏷️ Boolean flag to enable or disable non-time-based segmentation analysis.
    # ✅ Set to `True` to include evaluations based on subcategories defined in the dataset.

    subcategory_threshold: 10
    # 🔢 The maximum number of subcategories to process during non-time-based segmentation analysis.
    # ⚠️ If the number of subcategories exceeds this threshold, optimizations or exclusions may be applied.

    segmentation_column: "Gender"  
    # 🏷️ The column name used for segmentation in non-time-based analysis.
    # 🔹 Examples:
    #   - Use "Gender" 🚻 for gender-based segmentation.
    #   - Use other categorical columns as needed for task-specific segmentation.

display_settings:

    side_bar_title: "📊 Master ML Evaluation Framework"
    # 🖥️ The title displayed in the sidebar of the evaluation interface.
    # 🎨 Customize this title to reflect the organization or framework's branding.

    side_bar_logo: "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTBv8lPPqw_8NVRq01U8UhNguSO-Z6gdTlJjA&s"
    # 🖼️ The URL of the logo to be displayed in the sidebar of the evaluation interface.
    # ⚠️ Ensure the URL is accessible and points to a valid image file to avoid display errors.
```



### 🧑‍💻 Step 3: Running the Model Evaluation 📊  


##### 🧠 Call the `MlEvaluationApp` Class  

To perform model evaluation, we will call the **`MlEvaluationApp`** class from the following module:  
`model_evaluation.model_evaluation_lib.model_evaluation_app_runner`.

Once the class is instantiated, its `run_app` method executes the analysis.

The output results will be saved in the directory:  
`model_evaluation/output`.  



In [0]:
from model_evaluation.model_evaluation_lib.model_evaluation_app_runner import MlEvaluationApp


app = MlEvaluationApp(
    models=model_names,
    training_dataframes_list=train_dataframes,
    validation_dataframes_list=validation_dataframes,
    test_dataframes_list=test_dataframes,
    user_settings_path='model_evaluation/settings/model_evaluation_settings.yaml',
    project_root_name="eda_lib",
    base_path="/Workspace/Repos/mohamed-naceur.mahmoud@telefonica.de",
    project_folder_name="classification",
    output_path="model_evaluation/output_folder",
    project_name="classification use case"
)

In [0]:
app.run_app()
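
Once `run_app()` has finished, one simple way to confirm that results were written is to list the files in the output directory. The sketch below uses only the standard library. `list_output_files` is a hypothetical helper, and it makes no assumption about which files the library produces, so it is demonstrated on a throwaway directory. Pass whichever output directory your setup uses.

```python
from pathlib import Path
import tempfile

def list_output_files(output_dir):
    """Return the sorted relative paths of all files under output_dir."""
    root = Path(output_dir)
    if not root.is_dir():
        return []  # nothing written yet, or the path is wrong
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )

# Demonstration on a temporary directory with one placeholder file
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "reports").mkdir()
    (Path(tmp) / "reports" / "example.txt").write_text("placeholder")
    found = list_output_files(tmp)
    print(found)
```

An empty list after a run is a hint to double-check the configured output path and its write permissions.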