### Pillar Creation  

After determining the final weights, we compute **pillar values** from the harmonized dataset.  
These weighted pillar values are the inputs to the downstream importance modeling and scorecard.  
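
The internals of `scoring` are not shown in this notebook, so the following is only a minimal sketch of one way a weighted pillar value could be computed. It assumes the pillar score is the sum of `weight * value` over the metrics in a pillar, which is consistent with the `metric_contribution` column in the scorecard output (`0.43911 * 0.823293 ≈ 0.361516`). The helper name `weighted_pillar_scores`, the metric names `m1` to `m3`, and all numbers are hypothetical.

```python
import pandas as pd

# Hypothetical long-format inputs; column names mirror the pillar_weights
# and scorecard outputs in this notebook, the values are illustrative only.
weights = pd.DataFrame({
    "pillar": ["advocacy", "advocacy", "advocacy"],
    "metric": ["m1", "m2", "m3"],
    "weight": [0.5, 0.3, 0.2],
})
values = pd.DataFrame({
    "date": ["2022-08-06"] * 3 + ["2022-08-13"] * 3,
    "pillar": ["advocacy"] * 6,
    "metric": ["m1", "m2", "m3"] * 2,
    "value": [0.8, 0.6, 0.9, 0.7, 0.5, 1.0],
})


def weighted_pillar_scores(values: pd.DataFrame, weights: pd.DataFrame) -> pd.DataFrame:
    """Assumed aggregation: pillar score = sum(weight * value) per date and pillar."""
    merged = values.merge(weights, on=["pillar", "metric"], how="left")
    merged["metric_contribution"] = merged["weight"] * merged["value"]
    return (
        merged.groupby(["date", "pillar"], as_index=False)["metric_contribution"]
        .sum()
        .rename(columns={"metric_contribution": "score"})
    )


pillar_scores = weighted_pillar_scores(values, weights)
print(pillar_scores)
```

Keeping the per-metric contribution as its own column before aggregating makes it easy to trace how much each metric adds to a pillar score.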

### Importance Modeling  

Once the pillars are constructed, we perform an **importance analysis** based on historical trends:  

- **Trend Analysis**: Examining past pillar data to identify patterns over time.  
- **Hierarchical Importance Modeling**: Fitting an importance model at each level of the hierarchy (in the outputs below, each vendor, brand, and category combination) to estimate the contribution of each pillar.  

The resulting importances give a data-driven basis for prioritizing pillars within the framework. 📊  
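
As a hedged illustration of the trend step: the `trend_past` values displayed later in this notebook agree with an expanding (cumulative) mean of `score` over time, for example `(0.814644 + 0.799405) / 2 ≈ 0.807024`. The sketch below reproduces that behaviour with pandas on the first three advocacy rows. Whether the library computes the trend this way is an assumption inferred from those rows, not something the notebook confirms.

```python
import pandas as pd

# First three advocacy rows from the trend_past_data output in this notebook.
scores = pd.DataFrame({
    "date": pd.to_datetime(["2022-08-06", "2022-08-13", "2022-08-20"]),
    "pillar": ["advocacy"] * 3,
    "score": [0.814644, 0.799405, 0.818642],
})

# Assumed definition: trend_past is the running mean of all scores up to and
# including the current date, computed separately for each pillar.
scores = scores.sort_values(["pillar", "date"])
scores["trend_past"] = scores.groupby("pillar")["score"].transform(
    lambda s: s.expanding().mean()
)
print(scores)
```

In the full dataset the grouping would presumably also include `vendor`, `brand`, and `category`; they are omitted here only to keep the example short.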


In [1]:
import sys
import os

# Add the project root (one level up) to sys.path so the `src` package is importable
project_path = os.path.abspath("..")

if project_path not in sys.path:
    sys.path.append(project_path)


import numpy as np
import pandas as pd

from src.brand_health_centre.data_preparation import data_prepare

# Point config_path at your own config.yml and adjust its settings as described in the documentation
config_path = r"D:\BRAND_HUB_PROJECT\brandhub-capability\src\brand_health_centre\config.yml"
scaled_data, idv_list, config, paths = data_prepare(config_file_path=config_path)

paths

All required columns are present in the DataFrame.
All independent variables in idv_list are present in the data.
Minimum date: 2017-01-07 00:00:00
Maximum date: 2025-01-11 00:00:00
Dropped_columns: [('vend

{'filtered_data_path': './output\\filtered_data.csv',
 'no_null_imputed_data_path': './output\\no_null_imputed_data.csv',
 'scaled_data_path': './output\\scaled_data.csv',
 'cfa_fit_data_path': './output\\cfa_fit_data.csv',
 'rf_fit_data_path': 'output\\rf_fit_data.csv',
 'rf_act_pred_data_path': 'output\\rf_act_pred_data.csv',
 'pillar_weights_path': 'output\\pillar_weights.csv',
 'pillar_data_path': 'output\\pillar_data.csv',
 'trend_past_data_path': 'output\\trend_data.csv',
 'scaled_score_data_path': 'output\\scaled_score_data.csv',
 'imp_rf_fit_data_path': 'output\\imp_rf_fit_data.csv',
 'imp_rf_act_pred_data_path': 'output\\imp_rf_act_pred_data.csv',
 'score_card_final_df_path': 'output\\score_card_final_df.csv',
 'relative_imp_model_results_path': 'output\\relative_imp_model_results.csv'}

In [2]:
import numpy as np
import pandas as pd

# Load the CFA and RF fit outputs, reusing the paths returned by data_prepare
cfa_df = pd.read_csv(paths["cfa_fit_data_path"])
rf_df = pd.read_csv(paths["rf_fit_data_path"])

In [3]:
from src.brand_health_centre.score import scoring
from src.brand_health_centre.pillar_importance import importance_run_parallel_processing, scorecard_format

# Pillar creation: pillar weights, pillar scores, past trend, and scaled scores
pillar_weights, pillar_data, trend_past_data, scaled_score_data = scoring(
    cfa_df, rf_df, scaled_data, idv_list, config, paths
)

# Importance model: fitted in parallel, one job per group
imp_rf_df, imp_rf_act_pred_df = importance_run_parallel_processing(
    scaled_data, trend_past_data, idv_list, config, paths
)

# Scorecard creation: final scorecard and relative pillar importance
scorecard, pillar_relative_importance = scorecard_format(
    config,
    pillar_weights,
    scaled_data,
    scaled_score_data,
    imp_rf_df,
    paths,
)

  from .autonotebook import tqdm as notebook_tqdm
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  cfa_filtered.rename(
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  cfa_filtered[config["cfa_target_col"]] = cfa_filtered[
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 18 concurrent workers.
[Parallel(n_jobs=-1)]: Done   2 out of   8 | elapsed:  1.4min remaining:  4.2min
[Parallel(n_jobs=-1)]: Done   3 out of   8 | elapsed:  1.4min remaining:  2.3min
[Parallel(n_jobs=-1)]: Done   4 out of   8 | elapsed:  1.4min remaining:  1.4min
[Parallel(n_jobs=-1)]: Done   5 out of   8 | elapsed:  1.4min remaining:   51.6s


In [4]:
pillar_weights.head()

Unnamed: 0,vendor,brand,category,pillar,metric,weight
0,vendor_1,brand_1,category_1,advocacy,directions_funnel_metrics_advocacy_t2b_buyers,0.43911
1,vendor_1,brand_1,category_1,advocacy,directions_strategic_measures_brand_love_index,0.292462
2,vendor_1,brand_1,category_1,advocacy,social_percent_positive_neutral,0.268428
3,vendor_1,brand_1,category_1,awareness,directions_awareness_total_awareness_net_mentions,0.516854
4,vendor_1,brand_1,category_1,awareness,directions_awareness_unaided_awareness_net_men...,0.35306
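
The three advocacy weights shown above add up to 1 (`0.43911 + 0.292462 + 0.268428`). If weights are meant to sum to 1 within every pillar, which is an assumption based only on those rows, a quick sanity check could look like the sketch below. The tolerance and the variable names are illustrative choices.

```python
import numpy as np
import pandas as pd

# Advocacy rows copied from the pillar_weights output above.
weights = pd.DataFrame({
    "vendor": ["vendor_1"] * 3,
    "brand": ["brand_1"] * 3,
    "category": ["category_1"] * 3,
    "pillar": ["advocacy"] * 3,
    "metric": [
        "directions_funnel_metrics_advocacy_t2b_buyers",
        "directions_strategic_measures_brand_love_index",
        "social_percent_positive_neutral",
    ],
    "weight": [0.43911, 0.292462, 0.268428],
})

group_cols = ["vendor", "brand", "category", "pillar"]
weight_sums = weights.groupby(group_cols)["weight"].sum()

# Flag any pillar whose weights do not sum to 1 within a small tolerance
# (the tolerance allows for the six-decimal rounding in the displayed values).
off_target = weight_sums[~np.isclose(weight_sums, 1.0, atol=1e-4)]
print(off_target if not off_target.empty else "All pillar weights sum to 1")
```

Running the same check on the full `pillar_weights` frame would show whether the assumption holds for every vendor, brand, category, and pillar.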


In [5]:
pillar_data.head()

Unnamed: 0,date,vendor,brand,category,pillar,score
0,2022-08-06,vendor_1,brand_1,category_1,advocacy,0.814644
1,2022-08-06,vendor_1,brand_1,category_1,awareness,0.480895
2,2022-08-06,vendor_1,brand_1,category_1,brand_perceptions,0.552029
3,2022-08-06,vendor_1,brand_1,category_1,consideration,0.386132
4,2022-08-06,vendor_1,brand_1,category_1,loyalty,0.187977


In [6]:
trend_past_data.head()

Unnamed: 0,date,vendor,brand,category,pillar,score,trend_past
0,2022-08-06,vendor_1,brand_1,category_1,advocacy,0.814644,0.814644
48,2022-08-13,vendor_1,brand_1,category_1,advocacy,0.799405,0.807024
96,2022-08-20,vendor_1,brand_1,category_1,advocacy,0.818642,0.810897
144,2022-08-27,vendor_1,brand_1,category_1,advocacy,0.817758,0.812612
192,2022-09-03,vendor_1,brand_1,category_1,advocacy,0.791072,0.808304


In [7]:
scaled_score_data.head()

Unnamed: 0,date,vendor,brand,category,pillar,score,scaled_score
0,2022-08-06,vendor_1,brand_1,category_1,advocacy,0.814644,100.693859
1,2022-08-06,vendor_1,brand_1,category_1,awareness,0.480895,110.230947
2,2022-08-06,vendor_1,brand_1,category_1,brand_perceptions,0.552029,96.711517
3,2022-08-06,vendor_1,brand_1,category_1,consideration,0.386132,76.24356
4,2022-08-06,vendor_1,brand_1,category_1,loyalty,0.187977,78.957858


In [8]:
imp_rf_df.head()

Unnamed: 0,shap_features,feature_importance,shap_values,model_type,latest_dv,r2_score_train,mape_train,r2_score_fold,mape_fold,r2_score_hold_out,mape_hold_out,r2_score_all,mape_all,best_params_gridsearchcv,vendor,brand,category
0,product_feedback,0.222312,9e-05,RandomForest,0.008493,0.792248,0.017836,0.196617,0.032471,0.339447,0.03004,0.755797,0.01898,"{'max_depth': 5, 'max_features': 2, 'n_estimat...",vendor_1,brand_1,category_1
1,advocacy,0.193339,4.8e-05,RandomForest,0.008493,0.792248,0.017836,0.196617,0.032471,0.339447,0.03004,0.755797,0.01898,"{'max_depth': 5, 'max_features': 2, 'n_estimat...",vendor_1,brand_1,category_1
2,awareness,0.168368,5.7e-05,RandomForest,0.008493,0.792248,0.017836,0.196617,0.032471,0.339447,0.03004,0.755797,0.01898,"{'max_depth': 5, 'max_features': 2, 'n_estimat...",vendor_1,brand_1,category_1
3,brand_perceptions,0.159768,3.7e-05,RandomForest,0.008493,0.792248,0.017836,0.196617,0.032471,0.339447,0.03004,0.755797,0.01898,"{'max_depth': 5, 'max_features': 2, 'n_estimat...",vendor_1,brand_1,category_1
4,loyalty,0.148345,3.4e-05,RandomForest,0.008493,0.792248,0.017836,0.196617,0.032471,0.339447,0.03004,0.755797,0.01898,"{'max_depth': 5, 'max_features': 2, 'n_estimat...",vendor_1,brand_1,category_1


In [9]:
scorecard.head()

Unnamed: 0,date,vendor,brand,category,pillar,metric,weight,value,metric_contribution,year,month,score,scaled_score
0,2022-08-13,vendor_1,brand_1,category_1,advocacy,directions_funnel_metrics_advocacy_t2b_buyers,0.43911,0.823293,0.361516,2022,8,0.799405,100.496579
1,2022-08-20,vendor_1,brand_1,category_1,advocacy,directions_funnel_metrics_advocacy_t2b_buyers,0.43911,0.823293,0.361516,2022,8,0.818642,103.417276
2,2022-08-27,vendor_1,brand_1,category_1,advocacy,directions_funnel_metrics_advocacy_t2b_buyers,0.43911,0.823293,0.361516,2022,8,0.817758,101.626006
3,2022-12-10,vendor_1,brand_1,category_1,advocacy,directions_funnel_metrics_advocacy_t2b_buyers,0.43911,0.840237,0.368956,2022,12,0.806603,100.691554
4,2023-02-04,vendor_1,brand_1,category_1,advocacy,directions_funnel_metrics_advocacy_t2b_buyers,0.43911,0.796484,0.349744,2023,2,0.797505,102.377937


In [10]:
pillar_relative_importance.head()

Unnamed: 0,vendor,brand,category,shap_features,shap_values,relative_importance
0,vendor_1,brand_1,category_1,product_feedback,9e-05,0.297662
1,vendor_1,brand_1,category_1,advocacy,4.8e-05,0.157523
2,vendor_1,brand_1,category_1,awareness,5.7e-05,0.188706
3,vendor_1,brand_1,category_1,brand_perceptions,3.7e-05,0.121336
4,vendor_1,brand_1,category_1,loyalty,3.4e-05,0.112731
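
The displayed `relative_importance` values are roughly proportional to `shap_values` (for example, `9e-05 : 4.8e-05` against `0.297662 : 0.157523`), which is consistent with a share-of-total normalization within each vendor, brand, and category group. The exact formula used by `scorecard_format` is not shown here, so the following is only a sketch of that assumed normalization, with hypothetical pillar names and SHAP magnitudes.

```python
import pandas as pd

# Hypothetical SHAP magnitudes for one vendor/brand/category group.
shap_df = pd.DataFrame({
    "vendor": ["vendor_1"] * 4,
    "brand": ["brand_1"] * 4,
    "category": ["category_1"] * 4,
    "shap_features": ["pillar_a", "pillar_b", "pillar_c", "pillar_d"],
    "shap_values": [4.0, 3.0, 2.0, 1.0],
})

group_cols = ["vendor", "brand", "category"]

# Assumed normalization: each pillar's absolute SHAP value divided by the
# total absolute SHAP value of its group, so shares sum to 1 per group.
abs_shap = shap_df["shap_values"].abs()
group_total = abs_shap.groupby([shap_df[c] for c in group_cols]).transform("sum")
shap_df["relative_importance"] = abs_shap / group_total
print(shap_df[["shap_features", "relative_importance"]])
```

Under this assumption the shares in each group sum to 1. The five rows shown by `head()` sum to about 0.878, so the remaining pillars in that group would account for the rest.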
