### Pillar Creation  

After the final weights are determined, we compute **pillar values** from the harmonized dataset.  
These weighted pillar values feed the downstream modeling and analysis.  
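
As a rough illustration of the idea, the sketch below computes each pillar as a weighted sum of scaled metrics. The metric names, pillar names, and weights are invented for the example, and normalizing the weights within each pillar is an assumption; the project's `scoring` function may work differently.

```python
import pandas as pd

# Hypothetical scaled metrics (names and values are illustrative only).
scaled = pd.DataFrame(
    {
        "awareness": [0.2, 0.4, 0.6],
        "consideration": [0.1, 0.3, 0.5],
        "satisfaction": [0.8, 0.6, 0.4],
    }
)

# Hypothetical mapping of pillar -> {metric: weight}.
weights = {
    "mind_share": {"awareness": 3.0, "consideration": 1.0},
    "experience": {"satisfaction": 1.0},
}


def compute_pillars(data: pd.DataFrame, pillar_weights: dict) -> pd.DataFrame:
    """Return one column per pillar: the weighted sum of its metrics."""
    pillars = {}
    for pillar, metric_weights in pillar_weights.items():
        w = pd.Series(metric_weights, dtype=float)
        w = w / w.sum()  # assumption: weights are normalized within a pillar
        # Multiply each metric column by its weight, then sum across columns.
        pillars[pillar] = data[w.index].mul(w, axis=1).sum(axis=1)
    return pd.DataFrame(pillars, index=data.index)


pillar_values = compute_pillars(scaled, weights)
print(pillar_values)
```

Normalizing inside each pillar keeps pillar values on the same scale as the input metrics, which makes pillars with different numbers of metrics comparable.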

### Importance Modeling  

Once the pillars are constructed, we perform an **importance analysis** based on historical trends:  

- **Trend Analysis**: Examining past pillar data to identify patterns over time.  
- **Hierarchical Importance Modeling**: Building a structured importance model for different levels in the hierarchy to understand the contribution of each pillar.  

This step ensures a data-driven approach to prioritizing key metrics within the framework. 📊  
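
The two steps above can be sketched on synthetic data. This is a minimal illustration, not the pipeline's implementation: the pillar names, the 12-week window, and the use of standardized least-squares coefficients as an importance measure are all assumptions chosen to keep the example small. The actual `importance_run_parallel_processing` function may use a different model.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic weekly pillar data (hypothetical names; not project output).
pillars = pd.DataFrame(
    {
        "pillar_a": rng.normal(size=n),
        "pillar_b": rng.normal(size=n),
    },
    index=pd.date_range("2020-01-04", periods=n, freq="W-SAT"),
)

# Synthetic target that depends more strongly on pillar_a than on pillar_b.
noise = rng.normal(scale=0.1, size=n)
target = 3.0 * pillars["pillar_a"] + 1.0 * pillars["pillar_b"] + noise

# Trend analysis: smooth each pillar with a 12-week rolling mean.
trend = pillars.rolling(window=12, min_periods=1).mean()

# Importance (stand-in technique): standardized least-squares coefficients,
# converted to absolute values and normalized so they sum to 1.
X = (pillars - pillars.mean()) / pillars.std()
y = (target - target.mean()) / target.std()
coef, *_ = np.linalg.lstsq(X.to_numpy(), y.to_numpy(), rcond=None)
relative_importance = pd.Series(
    np.abs(coef) / np.abs(coef).sum(), index=pillars.columns
)
print(relative_importance.round(2))
```

For a hierarchical version, the same fit could be repeated per level (for example, once per group via `groupby`), producing one importance vector for each node in the hierarchy.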


In [1]:
import sys
import os

project_path = os.path.abspath("..")

if project_path not in sys.path:
    sys.path.append(project_path)


import numpy as np
import pandas as pd

from src.brand_health_centre.data_preparation import data_prepare

# Point config_path at your local config.yml and adjust its settings as described in the documentation.
config_path = r"D:\BRAND_HUB_PROJECT\brandhub-capability\src\brand_health_centre\config.yml"
scaled_data, idv_list, config, paths = data_prepare(config_file_path=config_path)

paths

All required columns are present in the DataFrame.
All independent variables in idv_list are present in the data.
Minimum date: 2017-01-07 00:00:00
Maximum date: 2025-01-11 00:00:00
Dropped_columns: [('vend

{'filtered_data_path': './output\\filtered_data.csv',
 'no_null_imputed_data_path': './output\\no_null_imputed_data.csv',
 'scaled_data_path': './output\\scaled_data.csv',
 'cfa_fit_data_path': './output\\cfa_fit_data.csv',
 'rf_fit_data_path': 'output\\rf_fit_data.csv',
 'rf_act_pred_data_path': 'output\\rf_act_pred_data.csv',
 'pillar_weights_path': 'output\\pillar_weights.csv',
 'pillar_data_path': 'output\\pillar_data.csv',
 'trend_past_data_path': 'output\\trend_data.csv',
 'scaled_score_data_path': 'output\\scaled_score_data.csv',
 'imp_rf_fit_data_path': 'output\\imp_rf_fit_data.csv',
 'imp_rf_act_pred_data_path': 'output\\imp_rf_act_pred_data.csv',
 'score_card_final_df_path': 'output\\score_card_final_df.csv',
 'relative_imp_model_results_path': 'output\\relative_imp_model_results.csv'}

In [5]:
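import pandas as pd

# Load the CFA and RF fit outputs from earlier pipeline steps,
# using the paths returned by data_prepare instead of hard-coded strings.
cfa_df = pd.read_csv(paths["cfa_fit_data_path"])
rf_df = pd.read_csv(paths["rf_fit_data_path"])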

In [None]:
from src.brand_health_centre.score import scoring
from src.brand_health_centre.pillar_importance import importance_run_parallel_processing, scorecard_format

## Pillar Creation
pillar_weights, pillar_data, trend_past_data, scaled_score_data = scoring(
    cfa_df, rf_df, scaled_data, idv_list, config, paths
)

## Importance Model
imp_rf_df, imp_rf_act_pred_df = importance_run_parallel_processing(
    scaled_data, trend_past_data, idv_list, config, paths
)

## Scorecard Creation
scorecard, pillar_relative_importance = scorecard_format(
    config,
    pillar_weights,
    scaled_data,
    scaled_score_data,
    imp_rf_df,
    paths,
)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  cfa_filtered.rename(
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  cfa_filtered[config["cfa_target_col"]] = cfa_filtered[
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 18 concurrent workers.
[Parallel(n_jobs=-1)]: Done   2 out of   8 | elapsed:  1.2min remaining:  3.6min
[Parallel(n_jobs=-1)]: Done   3 out of   8 | elapsed:  1.2min remaining:  2.0min
[Parallel(n_jobs=-1)]: Done   4 out of   8 | elapsed:  1.2min remaining:  1.2min
[Parallel(n_jobs=-1)]: Done   5 out of   8 | elapsed:  1.2min remaining:   44.8s
[Parallel(n_jobs=-1)]: Done   6 out of   8 | elaps