#### Phase 3: Risk Scoring and Integration

###### Objective: Combine the individual risk factors (on_time_delivery_rate, quality_score, and geopolitical_risk_score) into a single total_risk_score for each supplier. This score is the core of the project's logic and business impact.
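
Written out, the weighted combination implemented in the script below is a linear blend of three normalized risk components. The symbols $d_i$, $q_i$, and $g_i$ are shorthand introduced here for the on_time_delivery_rate, quality_score, and geopolitical_risk_score of record $i$; they do not appear in the code.

$$
\text{total\_risk\_score}_i = 0.45\,(1 - d_i) + 0.45\left(1 - \frac{q_i}{100}\right) + 0.10\,g_i
$$

Assuming $d_i \in [0, 1]$, $q_i \in [0, 100]$, and $g_i \in [0, 1]$, each term in parentheses lies in $[0, 1]$, and because the weights sum to 1 the total does as well.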

In [1]:
import pandas as pd
import numpy as np

# --- Configuration for Risk Scoring ---
DELIVERY_RISK_WEIGHT = 0.45
QUALITY_RISK_WEIGHT = 0.45
GEOPOLITICAL_RISK_WEIGHT = 0.10

# Ensure weights sum to 1.0
if not np.isclose(DELIVERY_RISK_WEIGHT + QUALITY_RISK_WEIGHT + GEOPOLITICAL_RISK_WEIGHT, 1.0):
    raise ValueError("Risk weights must sum to 1.0")

# --- Main Script Execution ---

if __name__ == "__main__":
    # Load the final dataset from the previous phase
    try:
        df = pd.read_csv('final_supplier_data.csv')
    except FileNotFoundError:
        print("Error: 'final_supplier_data.csv' not found. Please run the previous phase's script.")
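        # exit() is provided by the site module for interactive use; raise SystemExit instead.
        raise SystemExit(1)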

    # --- Data Cleaning and Preparation (a quick review before scoring) ---
    # Handle the missing values introduced in Phase 1 by imputing each column's median,
    # a simple strategy that is less sensitive to outliers than the mean.
    # Assign the result back to the column: calling fillna(inplace=True) on df[col]
    # is deprecated chained assignment and will no longer modify df in pandas 3.0.
    for col in ['on_time_delivery_rate', 'quality_score']:
        df[col] = df[col].fillna(df[col].median())

    # --- Calculate Individual Risk Scores ---
    
    # 1. Convert on_time_delivery_rate to a risk score (0-1 scale).
    # A higher rate means lower risk, so we invert it.
    df['delivery_risk_score'] = 1 - df['on_time_delivery_rate']

    # 2. Convert quality_score (assumed to be on a 0-100 scale) to a risk score (0-1 scale).
    # A higher score means lower risk, so we normalize it to 0-1 and invert it.
    df['quality_risk_score'] = 1 - (df['quality_score'] / 100)
    
    # The geopolitical_risk_score is already on a 0-1 scale, so no change is needed.
    
    # --- Calculate the Total Risk Score using the weighted algorithm ---
    df['total_risk_score'] = (
        df['delivery_risk_score'] * DELIVERY_RISK_WEIGHT +
        df['quality_risk_score'] * QUALITY_RISK_WEIGHT +
        df['geopolitical_risk_score'] * GEOPOLITICAL_RISK_WEIGHT
    )
    
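    # --- Categorize suppliers based on their total risk score ---
    # Bins are left-closed (right=False): Low [0, 0.2), Medium [0.2, 0.4), High [0.4, inf).
    # The open-ended top edge keeps a score of exactly 1.0 in 'High Risk'
    # instead of leaving it uncategorized (NaN).
    bins = [0, 0.2, 0.4, np.inf]
    labels = ['Low Risk', 'Medium Risk', 'High Risk']
    df['risk_category'] = pd.cut(df['total_risk_score'], bins=bins, labels=labels, right=False)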
    
    # --- Final Dataset Preparation ---
    # We can drop the intermediate risk scores to make the final dataset cleaner for the dashboard.
    final_df_for_dashboard = df.drop(columns=['delivery_risk_score', 'quality_risk_score'])

    # Save the final dataset to a new CSV file
    final_df_for_dashboard.to_csv('dashboard_data.csv', index=False)
    
    print("Supplier risk scores calculated and categorized successfully!")
    print("Final dashboard data saved as 'dashboard_data.csv'.")
    print("\nHere's a preview of the final data ready for visualization:")
    print(final_df_for_dashboard.head())
    print("\nRisk category distribution:")
    print(final_df_for_dashboard['risk_category'].value_counts())
    
    print("\nWe are now ready for the final and most impactful phase: Dashboard Creation!")


Supplier risk scores calculated and categorized successfully!
Final dashboard data saved as 'dashboard_data.csv'.

Here's a preview of the final data ready for visualization:
  supplier_id        date  on_time_delivery_rate  quality_score  \
0     SUP-001  2024-01-01               0.758253      96.411776   
1     SUP-002  2024-01-01               0.968237      93.419158   
2     SUP-003  2024-01-01               0.812107      96.564508   
3     SUP-004  2024-01-01               0.909005      92.406934   
4     SUP-005  2024-01-01               0.824811      92.840992   

  supplier_country  geopolitical_risk_score  total_risk_score risk_category  
0          Vietnam                     0.40          0.164933      Low Risk  
1          Germany                     0.25          0.068907      Low Risk  
2           Turkey                     0.75          0.175011      Low Risk  
3            India                     0.45          0.120116      Low Risk  
4          Vietnam              
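
As a quick sanity check on the weighted formula, the following sketch recomputes `total_risk_score` for the first three rows of the preview above. The helper name `compute_total_risk` and the `WEIGHTS` dictionary are hypothetical (they are not part of the original script); the weight values are copied from its configuration block.

```python
import pandas as pd

# Weights copied from the configuration block of the script above.
WEIGHTS = {"delivery": 0.45, "quality": 0.45, "geopolitical": 0.10}


def compute_total_risk(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: weighted sum of the three 0-1 risk components."""
    delivery_risk = 1 - df["on_time_delivery_rate"]  # higher rate -> lower risk
    quality_risk = 1 - df["quality_score"] / 100     # 0-100 score -> 0-1 risk
    return (
        delivery_risk * WEIGHTS["delivery"]
        + quality_risk * WEIGHTS["quality"]
        + df["geopolitical_risk_score"] * WEIGHTS["geopolitical"]
    )


# First three rows of the preview output above (inputs as displayed, i.e. rounded).
preview = pd.DataFrame({
    "supplier_id": ["SUP-001", "SUP-002", "SUP-003"],
    "on_time_delivery_rate": [0.758253, 0.968237, 0.812107],
    "quality_score": [96.411776, 93.419158, 96.564508],
    "geopolitical_risk_score": [0.40, 0.25, 0.75],
})
preview["total_risk_score"] = compute_total_risk(preview)
print(preview[["supplier_id", "total_risk_score"]])
```

Because the inputs are taken from the rounded preview, the recomputed scores should agree with the displayed `total_risk_score` values only up to small rounding differences.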

In [4]:
final_df_for_dashboard.tail(10)

     supplier_id        date  on_time_delivery_rate  quality_score  supplier_country  geopolitical_risk_score  total_risk_score  risk_category
590      SUP-041  2024-11-26               0.815812      99.799939       South Korea                      0.5          0.133785       Low Risk
591      SUP-042  2024-11-26               0.986155      87.137433            Mexico                     0.55          0.119112       Low Risk
592      SUP-043  2024-11-26                0.84824      92.483322            Taiwan                      0.7          0.172117       Low Risk
593      SUP-044  2024-11-26               0.964584      94.272336       South Korea                      0.5          0.091712       Low Risk
594      SUP-045  2024-11-26               0.805189      95.536975            Taiwan                      0.7          0.177749       Low Risk
595      SUP-046  2024-11-26               0.870698       93.39473             India                     0.45           0.13291       Low Risk
596      SUP-047  2024-11-26               0.757472      85.146563            Mexico                     0.55          0.230978    Medium Risk
597      SUP-048  2024-11-26                 0.9064       89.89692           Germany                     0.25          0.112584       Low Risk
598      SUP-049  2024-11-26               0.838446      92.765675           Vietnam                      0.4          0.145254       Low Risk
599      SUP-050  2024-11-26               0.957446      86.317997            Turkey                     0.75          0.155718       Low Risk
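
Because `pd.cut` is called with `right=False`, each bin includes its left edge and excludes its right edge, so a score that lands exactly on the top edge falls outside every bin. The sketch below compares a top edge of 1.0 with an open-ended top edge (`np.inf`) on a few boundary scores. The score values are illustrative and are not project data.

```python
import numpy as np
import pandas as pd

labels = ["Low Risk", "Medium Risk", "High Risk"]

# Illustrative scores placed on or near the bin edges (not project data).
scores = pd.Series([0.0, 0.199, 0.2, 0.399, 0.4, 1.0])

# Top edge at 1.0: the intervals are [0, 0.2), [0.2, 0.4), [0.4, 1.0),
# so a score of exactly 1.0 is left uncategorized (NaN).
closed_top = pd.cut(scores, bins=[0, 0.2, 0.4, 1.0], labels=labels, right=False)

# Open-ended top edge: [0.4, inf) also covers a score of exactly 1.0.
open_top = pd.cut(scores, bins=[0, 0.2, 0.4, np.inf], labels=labels, right=False)

comparison = pd.DataFrame(
    {"score": scores, "top_edge_1.0": closed_top, "top_edge_inf": open_top}
)
print(comparison)
```

A score of exactly 1.0 would require every component to be at its worst value, so this edge case is unlikely in practice, but an uncategorized row would silently drop out of `value_counts()` on the dashboard data.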
