In [4]:
import pandas as pd

# Load moderator data
normalized_data = pd.read_excel("../EDA/Datasets/moderator-data-cleaned.xlsx")

# Creating a Composite Scoring System for Moderators

**Normalization**

Since the four metrics (`Productivity`, `Utilisation %`, `handling time` and `accuracy`) are measured on different scales, they must be normalized to a common range. This ensures that no single metric disproportionately affects the composite score.

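We will use Min-Max scaling to normalize `Productivity` and `handling time` to the range 0 to 1. `Utilisation %` is a percentage, so it is simply divided by 100, and `accuracy` is used as-is on the assumption that it already lies between 0 and 1.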
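Min-Max scaling maps each value `x` to `(x - min) / (max - min)`, so the smallest value becomes 0 and the largest becomes 1. The sketch below illustrates the idea on made-up numbers. The helper `min_max_scale` and its `higher_is_better` flag are hypothetical names introduced here, and returning 0.5 for a constant column is an arbitrary choice for this sketch, not something the dataset requires.

```python
import pandas as pd


def min_max_scale(values: pd.Series, higher_is_better: bool = True) -> pd.Series:
    """Scale a numeric Series to [0, 1]; flip it when lower raw values are better."""
    value_range = values.max() - values.min()
    if value_range == 0:
        # A constant column carries no ranking information; return a neutral 0.5
        # (an arbitrary choice for this sketch) instead of dividing by zero.
        return pd.Series(0.5, index=values.index)
    scaled = (values - values.min()) / value_range
    return scaled if higher_is_better else 1 - scaled


# Made-up handling times, for illustration only
example = pd.Series([30.0, 45.0, 60.0])
print(min_max_scale(example, higher_is_better=False).tolist())  # [1.0, 0.5, 0.0]
```

Flipping the scaled values with `1 - scaled` is what lets a "lower is better" metric such as `handling time` contribute in the same direction as the other metrics.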

**Scoring System**

We will assume each metric has an equal weight of 1/4. Hence, the composite score will be computed by taking the average of the normalized values for each moderator.
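With equal weights, the composite score is just the row-wise mean of the four normalized columns. The sketch below writes the weights out explicitly so that they could be changed later. The helper `composite_score` is a hypothetical name, and the two rows of values are made up for illustration.

```python
import pandas as pd

# Column names follow the normalized columns used in this notebook.
NORM_COLUMNS = ["Productivity_norm", "Utilisation_norm", "handling_time_norm", "accuracy_norm"]


def composite_score(df: pd.DataFrame, weights: dict) -> pd.Series:
    """Weighted average of normalized metric columns; weights are rescaled to sum to 1."""
    w = pd.Series(weights, dtype=float)
    w = w / w.sum()
    # Multiply each column by its weight (aligned on column name), then sum per row
    return (df[w.index] * w).sum(axis=1)


# Two made-up moderators, for illustration only
demo = pd.DataFrame({
    "Productivity_norm": [0.8, 0.4],
    "Utilisation_norm": [0.6, 0.9],
    "handling_time_norm": [0.7, 0.5],
    "accuracy_norm": [0.9, 0.6],
})
equal_weights = {column: 0.25 for column in NORM_COLUMNS}
demo["composite_score"] = composite_score(demo, equal_weights)
print(demo["composite_score"].round(3).tolist())  # [0.75, 0.6]
```

With all weights set to 1/4, the result matches `df[NORM_COLUMNS].mean(axis=1)`.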

In [5]:
# Normalizing the four metrics
normalized_data['Productivity_norm'] = (normalized_data['Productivity'] - normalized_data['Productivity'].min()) / \
                                      (normalized_data['Productivity'].max() - normalized_data['Productivity'].min())

normalized_data['Utilisation_norm'] = normalized_data['Utilisation %'] / 100  # Already a percentage, so just scaling it between 0 and 1

# Handling time is inverted since lower is better
normalized_data['handling_time_norm'] = 1 - ((normalized_data['handling time'] - normalized_data['handling time'].min()) / \
                                            (normalized_data['handling time'].max() - normalized_data['handling time'].min()))

# Assuming accuracy is already a proportion between 0 and 1. If not, we'd need to normalize it similarly to the other metrics.
normalized_data['accuracy_norm'] = normalized_data['accuracy']

# Computing the composite score by taking the average of the normalized values
normalized_data['composite_score'] = normalized_data[['Productivity_norm', 'Utilisation_norm', 'handling_time_norm', 'accuracy_norm']].mean(axis=1)

# Sorting the data by the composite score in descending order to see the top moderators
sorted_moderators = normalized_data[['moderator', 'composite_score']].drop_duplicates().sort_values(by='composite_score', ascending=False)

sorted_moderators.head(10)  # Displaying top 10 moderators based on composite score


| row index | moderator | composite_score |
|---|---|---|
| 344 | 6073125 | 0.727262 |
| 392 | 1743843711746081 | 0.708418 |
| 198 | 1714052438738946 | 0.681299 |
| 775 | 1774283521390594 | 0.679771 |
| 706 | 9879733 | 0.676433 |
| 56 | 5338213 | 0.667758 |
| 31 | 8585325 | 0.661654 |
| 1194 | 1912879 | 0.661326 |
| 243 | 1737386135461890 | 0.659245 |
| 403 | 1734415103300609 | 0.657748 |


- The moderator with ID `6073125` has the highest composite score (0.727), making them the top-ranked moderator under equal weighting of the four metrics: Productivity, Utilisation %, handling time, and accuracy.
- The table above lists the 10 moderators with the highest composite scores.
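
The `drop_duplicates()` call only removes rows in which both the moderator ID and the composite score are identical, so a moderator who has several rows with different scores could appear more than once in the ranking. Whether that happens in this dataset is not shown here. If it does, one option, sketched below on made-up data, is to aggregate to one score per moderator before sorting. Using the mean is an assumption and may not be the right aggregation for every use case.

```python
import pandas as pd

# Made-up scores; moderator 101 appears twice with different scores
scores = pd.DataFrame({
    "moderator": [101, 101, 202, 303],
    "composite_score": [0.70, 0.60, 0.68, 0.55],
})

# One row per moderator: mean score, highest first
ranking = (
    scores.groupby("moderator", as_index=False)["composite_score"]
    .mean()
    .sort_values(by="composite_score", ascending=False)
    .reset_index(drop=True)
)
print(ranking)
```

In this example moderator 101 ends up with a single averaged score of 0.65 and ranks second, behind moderator 202.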