# Evaluation


The problem uses a newly developed metric that combines several submetrics to balance overall performance with various aspects of unintended bias.

The submetrics are:


**1.   Overall AUC:** This is the ROC-AUC for the full evaluation set.

**2.   Bias AUCs:** To measure unintended bias, we again calculate the ROC-AUC on three specific subsets of the test set for each identity, each capturing a different aspect of unintended bias.


*   *a) Subgroup AUC:* Here, we restrict the test set to only the examples that mention the specific identity subgroup. A low value in this metric means the model does a poor job of distinguishing between toxic and non-toxic comments that mention the identity.

*   *b) BPSN (Background Positive, Subgroup Negative) AUC:* Here, we restrict the test set to the non-toxic examples that mention the identity and the toxic examples that do not. A low value in this metric means that the model confuses non-toxic examples that mention the identity with toxic examples that do not, likely meaning that the model predicts higher toxicity scores than it should for non-toxic examples mentioning the identity.

*   *c) BNSP (Background Negative, Subgroup Positive) AUC:* Here, we restrict the test set to the toxic examples that mention the identity and the non-toxic examples that do not. A low value here means that the model confuses toxic examples that mention the identity with non-toxic examples that do not, likely meaning that the model predicts lower toxicity scores than it should for toxic examples mentioning the identity.
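
The three subsets above can be sketched in plain Python for a single identity. This is only an illustration, not the official scoring code: `roc_auc`, `bias_aucs`, and the toy data are hypothetical names and values, and labels are assumed to be already binarized.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        return float("nan")  # AUC is undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bias_aucs(labels, scores, in_subgroup):
    """Return the three bias AUCs for one identity (hypothetical helper)."""
    rows = list(zip(labels, scores, in_subgroup))

    def auc_where(keep):
        kept = [(y, s) for y, s, g in rows if keep(y, g)]
        return roc_auc([y for y, _ in kept], [s for _, s in kept])

    return {
        # only examples that mention the identity
        "subgroup": auc_where(lambda y, g: g),
        # non-toxic examples that mention it + toxic examples that do not
        "bpsn": auc_where(lambda y, g: (g and not y) or (not g and y)),
        # toxic examples that mention it + non-toxic examples that do not
        "bnsp": auc_where(lambda y, g: (g and y) or (not g and not y)),
    }


# Toy data: 1 = toxic, 0 = non-toxic
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
in_subgroup = [True, True, False, False]

overall = roc_auc(labels, scores)
result = bias_aucs(labels, scores, in_subgroup)
```

In this toy data the non-toxic comment that mentions the identity (score 0.8) is ranked above the toxic comment that does not (score 0.7), so the BPSN AUC drops to 0.0 while the subgroup and BNSP AUCs stay at 1.0.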







In short, the final score is an average of four AUCs. Plain binary cross-entropy does not reflect that, so we will use a custom loss function instead.
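
As a minimal sketch of that averaging, assume each of the four AUCs has already been reduced to a single number. How per-identity values are aggregated is not specified here, so `final_score` and its inputs are hypothetical.

```python
def final_score(overall_auc, subgroup_auc, bpsn_auc, bnsp_auc):
    """Plain average of the four AUCs, each weighted 1/4 (an assumption of this sketch)."""
    return (overall_auc + subgroup_auc + bpsn_auc + bnsp_auc) / 4


# Hypothetical values: a model that ranks well overall but has a poor BPSN AUC
score = final_score(0.75, 1.0, 0.0, 1.0)
```

A single weak bias AUC pulls the average down noticeably, which is why the loss below pays extra attention to samples that feed several AUCs.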

# Loss Function

There are two main changes to the loss function:

__1. Weight each sample:__

The main idea is:

Each sample participates in some of these AUCs. __A sample that participates in 3 AUCs is more important than a sample that participates in 2 AUCs__, since giving a bad score to that sample affects the overall score more.

So we calculate the weight of each sample based on how many AUCs it belongs to.


In [0]:
# Overall: every sample participates
weights = np.ones((len(train_df),)) / 4

# Subgroup: samples that mention at least one identity
# (np.int was removed in NumPy 1.24; the builtin int is equivalent)
weights += (train_df[identity_columns].fillna(0).values >= 0.5).sum(axis=1).astype(bool).astype(int) / 4

# Background Positive, Subgroup Negative
weights += (((train_df['target'].values >= 0.5).astype(bool).astype(int) +
   (train_df[identity_columns].fillna(0).values < 0.5).sum(axis=1).astype(bool).astype(int)) > 1).astype(bool).astype(int) / 4

# Background Negative, Subgroup Positive
weights += (((train_df['target'].values < 0.5).astype(bool).astype(int) +
   (train_df[identity_columns].fillna(0).values >= 0.5).sum(axis=1).astype(bool).astype(int)) > 1).astype(bool).astype(int) / 4

# Used later to normalize the loss
loss_weight = 1.0 / weights.mean()
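
To make the weighting concrete, here is a pure-Python sketch that mirrors the four conditions of the cell above on a toy dataset with two identity columns. `sample_weight` and the toy rows are hypothetical; `None` stands in for a missing value, treated as 0 as with `fillna(0)`.

```python
def sample_weight(target, identities):
    """Weight of one sample: 1/4 for each condition it satisfies."""
    ids = [0.0 if v is None else v for v in identities]  # mirrors fillna(0)
    toxic = target >= 0.5
    mentions_identity = any(v >= 0.5 for v in ids)
    some_identity_absent = any(v < 0.5 for v in ids)

    w = 0.25                                          # overall
    w += 0.25 * mentions_identity                     # subgroup
    w += 0.25 * (toxic and some_identity_absent)      # "background positive" term
    w += 0.25 * ((not toxic) and mentions_identity)   # "subgroup positive" term
    return w


# (target, [identity_1, identity_2]) toy rows
rows = [
    (0.8, [0.9, None]),  # toxic, mentions an identity
    (0.1, [0.9, 0.0]),   # non-toxic, mentions an identity
    (0.8, [0.0, 0.0]),   # toxic, no identity
    (0.1, [None, 0.0]),  # non-toxic, no identity
]
weights = [sample_weight(t, ids) for t, ids in rows]
loss_weight = 1.0 / (sum(weights) / len(weights))
```

Here the weights come out as 0.75, 0.75, 0.5 and 0.25, so the samples that mention an identity count three times as much as a non-toxic sample with no identity mention.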

In [0]:
y_train = np.vstack([train_df['target'], weights]).T

__2. Auxiliary Target:__

Because the data also has several additional toxicity subtype attributes (severe_toxicity, obscene, threat, insult, identity_attack, sexual_explicit) that are highly correlated with the target, we also use the toxicity probabilities of these attributes as auxiliary targets.

In [0]:
y_aux_train = train_df[['target', 'severe_toxicity', 'obscene', 'identity_attack', 'insult', 'threat', 'sexual_explicit']]

__The Loss Function:__

- We will use a custom loss function to calculate loss for `y_train` outputs.

In [0]:
def custom_loss(y_true, y_pred):
    # y_true[:, 0] holds the target, y_true[:, 1] the per-sample weight (see y_train)
    return binary_crossentropy(K.reshape(y_true[:,0],(-1,1)), y_pred) * y_true[:,1]

- Losses for the `y_aux_train` outputs will be calculated using the usual `binary_crossentropy`.
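
The effect of the custom loss on the main output can be sketched numerically without Keras. This is an illustration only: `bce`, `weighted_losses`, and the clipping constant `eps` are assumptions of the sketch, and each row is a `(target, weight)` pair, matching the two columns of `y_train`.

```python
import math


def bce(y, p, eps=1e-7):
    """Binary cross-entropy for one prediction; p is clipped to avoid log(0)."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def weighted_losses(y_true_rows, y_pred):
    """Per-sample cross-entropy multiplied by the weight stored next to the target."""
    return [bce(target, p) * weight for (target, weight), p in zip(y_true_rows, y_pred)]


# Two equally uncertain predictions, but the first sample carries three times the weight
y_true_rows = [(1.0, 0.75), (0.0, 0.25)]
y_pred = [0.5, 0.5]
losses = weighted_losses(y_true_rows, y_pred)
```

Both predictions have the same unweighted cross-entropy (ln 2), so any difference between the two losses comes purely from the sample weights.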

The overall loss will be calculated as follows:



$$\mathrm{finalLoss} = \mathrm{loss}(\mathrm{yTrain}) \times \mathrm{lossWeight} + \mathrm{loss}(\mathrm{yAuxTrain})$$
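
A small numeric sketch shows why the main loss is multiplied by the loss weight. The sample weights average less than 1, which shrinks the weighted main loss; multiplying by the reciprocal of the mean weight restores its scale relative to the auxiliary loss. All numbers below are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


sample_weights = [0.75, 0.75, 0.5, 0.25]   # hypothetical per-sample weights
per_sample_bce = [0.2, 0.4, 0.1, 0.3]      # hypothetical unweighted main losses
aux_loss = 0.25                            # hypothetical auxiliary loss

loss_weight = 1.0 / mean(sample_weights)

# Weighted main loss, averaged over the batch
main_loss = mean([w * l for w, l in zip(sample_weights, per_sample_bce)])

# finalLoss = loss(yTrain) * lossWeight + loss(yAuxTrain)
final_loss = main_loss * loss_weight + aux_loss

# After rescaling, the effective sample weights average to 1
effective_mean_weight = mean([w * loss_weight for w in sample_weights])
```

In Keras, a combination like this is commonly expressed through the `loss_weights` argument of `model.compile`; whether that is how the model here is compiled is not shown in this section.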