# Part 2: Flexibility Makes Life Much Easier -- LSTM Playground

LSTM is very important for us.

LSTM is our baseline model: we compare other architectures against it to judge whether they are worth using.

We did two interesting things with the LSTM model:

### 1. We want to be faster
A big difficulty is that even the LSTM model takes a long time to train. We need to run a large number of experiments, but with the pipeline from the public Kaggle kernels, every change takes 2 hours to evaluate.

One optimization that speeds up training is to reduce the padding length. The current padding length for the sentences is 220, which is longer than 99% of the sentences in the dataset, yet most samples have a length of 100 or less.

So we sort the dataset by sentence length and group sentences of similar length together. At training time, each group is padded only to the length of its longest sentence, which saves a lot of time.

![padding](Pictures/padding.png)<br/>
<br/>
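The grouping idea can be sketched on toy data as follows. This is a minimal illustration, not the notebook's code: `bucket_and_pad` is a hypothetical helper, it splits the sorted data into equal-sized buckets with `np.array_split` (the notebook uses hand-picked boundaries), and it left-pads with zeros to mimic the default behaviour of Keras `pad_sequences`.

```python
import numpy as np

def bucket_and_pad(sequences, n_buckets, max_len):
    """Sort sequences by length, split them into buckets, and pad each bucket
    only to its own longest sequence (capped at max_len).
    Assumes n_buckets <= len(sequences)."""
    order = sorted(range(len(sequences)), key=lambda i: len(sequences[i]))
    buckets = []
    for chunk in np.array_split(order, n_buckets):
        chunk = [int(i) for i in chunk]
        longest = max(len(sequences[i]) for i in chunk)
        width = max(1, min(longest, max_len))
        padded = np.zeros((len(chunk), width), dtype=np.int64)
        for row, i in enumerate(chunk):
            tokens = sequences[i][-width:]  # keep the last `width` tokens
            if tokens:
                padded[row, -len(tokens):] = tokens  # left-pad with zeros
        buckets.append((chunk, padded))
    return buckets

toy = [[1, 2, 3, 4, 5, 6], [7], [8, 9], [1, 2, 3], [4, 5, 6, 7, 8, 9, 10, 11]]
for indices, padded in bucket_and_pad(toy, n_buckets=2, max_len=6):
    print(indices, padded.shape)
```

With a single global padding length of 6, the toy data would occupy 5 x 6 = 30 positions; the two buckets above occupy 3 x 3 + 2 x 6 = 21.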

### 2. Statistical features are worth a try
Actually, statistical features didn't give us any boost in the end.

However, adding statistical features to the model seems to give a higher score during the first 3 epochs; once the model has converged, the final score is no better than that of the LSTM model without them.

We still think this is an interesting finding. Maybe statistical features are easy for the model to capture and predict with, but they are not strong features, so in the later stages of training they are outperformed by better features.

# Fast LSTM Model With Statistical Features

The following model is an interesting pipeline we built that can train a good LSTM in a relatively short time using the techniques just mentioned. (A comparable LSTM in a popular Kaggle kernel takes **about 1 hour** to train, but this pipeline takes only **20 mins** per model.)

Model Architecture:
![LSTM](Pictures/LSTM.png)

In [5]:
from contextlib import contextmanager
import os
import random
import re
import string
import time
import warnings
from tqdm.notebook import tqdm  # tqdm._tqdm_notebook is the deprecated import path

import numpy as np
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn import metrics
from sklearn.model_selection import train_test_split


from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

import torch
import torch.nn as nn
import torch.utils.data
from torch.optim.optimizer import Optimizer
from torch.nn import functional as F


## Configurations

In [6]:
len(pd.read_csv(TRAIN_DATA))

1804874

In [3]:
EMBEDDING_FASTTEXT = './input/crawl-300d-2M.vec'
TRAIN_DATA = './data/train.csv'
TEST_DATA = './data/test.csv'
SAMPLE_SUBMISSION = './data/sample_submission.csv'

embed_size = 300
max_features = 1000000
maxlen = 220

batch_size = 512
train_epochs = 8
n_splits = 5

seed = 2333

## Helper Functions


In [74]:
def custom_loss(data, targets):
    ''' Define custom loss function for weighted BCE on 'target' column '''
    bce_loss_1 = nn.BCEWithLogitsLoss(weight=targets[:,1:2])(data[:,:1],targets[:,:1])
    bce_loss_2 = nn.BCEWithLogitsLoss()(data[:,1:],targets[:,2:])
    return (bce_loss_1 * loss_weight) + bce_loss_2

In [75]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

In [76]:
@contextmanager
def timer(msg):
    t0 = time.time()
    print(f'[{msg}] start.')
    yield
    elapsed_time = time.time() - t0
    print(f'[{msg}] done in {elapsed_time / 60:.2f} min.')

In [77]:
def seed_torch(seed=1029):
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
seed_torch()

In [78]:
puncts = [',', '.', '"', ':', ')', '(', '-', '!', '?', '|', ';', "'", '$', '&', '/', '[', ']',
          '>', '%', '=', '#', '*', '+', '\\', '•', '~', '@', '£', '·', '_', '{', '}', '©', '^',
          '®', '`', '<', '→', '°', '€', '™', '›', '♥', '←', '×', '§', '″', '′', 'Â', '█',
          '½', 'à', '…', '“', '★', '”', '–', '●', 'â', '►', '−', '¢', '²', '¬', '░', '¶',
          '↑', '±', '¿', '▾', '═', '¦', '║', '―', '¥', '▓', '—', '‹', '─', '▒', '：', '¼',
          '⊕', '▼', '▪', '†', '■', '’', '▀', '¨', '▄', '♫', '☆', 'é', '¯', '♦', '¤', '▲',
          'è', '¸', '¾', 'Ã', '⋅', '‘', '∞', '∙', '）', '↓', '、', '│', '（', '»', '，', '♪',
          '╩', '╚', '³', '・', '╦', '╣', '╔', '╗', '▬', '❤', 'ï', 'Ø', '¹', '≤', '‡', '√']


In [79]:
misspell_dict = {"aren't": "are not", "can't": "cannot", "couldn't": "could not",
                 "didn't": "did not", "doesn't": "does not", "don't": "do not",
                 "hadn't": "had not", "hasn't": "has not", "haven't": "have not",
                 "he'd": "he would", "he'll": "he will", "he's": "he is",
                 "i'd": "I had", "i'll": "I will", "i'm": "I am", "isn't": "is not",
                 "it's": "it is", "it'll": "it will", "i've": "I have", "let's": "let us",
                 "mightn't": "might not", "mustn't": "must not", "shan't": "shall not",
                 "she'd": "she would", "she'll": "she will", "she's": "she is",
                 "shouldn't": "should not", "that's": "that is", "there's": "there is",
                 "they'd": "they would", "they'll": "they will", "they're": "they are",
                 "they've": "they have", "we'd": "we would", "we're": "we are",
                 "weren't": "were not", "we've": "we have", "what'll": "what will",
                 "what're": "what are", "what's": "what is", "what've": "what have",
                 "where's": "where is", "who'd": "who would", "who'll": "who will",
                 "who're": "who are", "who's": "who is", "who've": "who have",
                 "won't": "will not", "wouldn't": "would not", "you'd": "you would",
                 "you'll": "you will", "you're": "you are", "you've": "you have",
                 "'re": " are", "wasn't": "was not", "we'll": "we will", "tryin'": "trying"}

In [80]:
# re.sub() passes the match object for each match of the pattern to its second argument when that argument is a function

def _get_misspell(misspell_dict):
    misspell_re = re.compile('(%s)' % '|'.join(misspell_dict.keys()))
    return misspell_dict, misspell_re


def replace_typical_misspell(text):
    misspellings, misspellings_re = _get_misspell(misspell_dict)

    def replace(match):
        return misspellings[match.group(0)]

    return misspellings_re.sub(replace, text) ##
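The callback form of `re.sub` used above can be seen in isolation with a small standard-library sketch. The two-entry dictionary and the `expand` helper are illustrative only; `re.escape` is added here as a precaution and is not part of the notebook's version.

```python
import re

# Illustrative subset of a contraction table (not the full misspell_dict)
contractions = {"can't": "cannot", "it's": "it is"}

# One alternation pattern that matches any key
pattern = re.compile("|".join(re.escape(key) for key in contractions))

def expand(text):
    # re.sub calls the lambda once per match and substitutes its return value
    return pattern.sub(lambda match: contractions[match.group(0)], text)

print(expand("it's late and i can't stay"))
```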

In [81]:
def clean_text(x):  # put spaces around punctuation marks to separate them from the surrounding text
    x = str(x)
    for punct in puncts + list(string.punctuation):
        if punct in x:
            x = x.replace(punct, f' {punct} ')
    return x


def clean_numbers(x):  # replace every run of digits with a space
    return re.sub(r'\d+', ' ', x)


def get_coefs(word, *arr):
    return word, np.asarray(arr, dtype='float32')


def load_fasttext(word_index):
    embeddings_index = dict(get_coefs(*o.strip().split(' ')) for o in open(EMBEDDING_FASTTEXT,encoding='utf8'))

    # word_index = tokenizer.word_index
    nb_words = min(max_features, len(word_index)+1)
    embedding_matrix = np.zeros((nb_words, embed_size))

    for word, i in word_index.items():
        if i >= max_features:
            continue
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            embedding_matrix[i] = embedding_vector

    return embedding_matrix

In [82]:
# Convert identity columns to booleans
def convert_to_bool(df, col_name):
    df[col_name] = np.where(df[col_name] >= 0.5, True, False)
    
def convert_dataframe_to_bool(df):
    bool_df = df.copy()
    for col in identity_columns:
        convert_to_bool(bool_df, col)
    return bool_df

In [83]:
def load_and_prec():
    train = pd.read_csv(TRAIN_DATA, index_col='id')
    test = pd.read_csv(TEST_DATA, index_col='id')

    # lower
    train['comment_text'] = train['comment_text'].str.lower()
    test['comment_text'] = test['comment_text'].str.lower()

    # clean misspellings
    train['comment_text'] = train['comment_text'].apply(replace_typical_misspell)
    test['comment_text'] = test['comment_text'].apply(replace_typical_misspell)

    # clean the text
    train['comment_text'] = train['comment_text'].apply(clean_text)
    test['comment_text'] = test['comment_text'].apply(clean_text)

    # clean numbers
    train['comment_text'] = train['comment_text'].apply(clean_numbers)
    test['comment_text'] = test['comment_text'].apply(clean_numbers)
    
    # fill up the missing values
    train_x = train['comment_text'].fillna('_##_').values
    test_x = test['comment_text'].fillna('_##_').values
    
    # tokenize the sentences
    tokenizer = Tokenizer(num_words=max_features)
    tokenizer.fit_on_texts(list(train_x))
    train_x = tokenizer.texts_to_sequences(train_x)
    test_x = tokenizer.texts_to_sequences(test_x)

    # pad the sentences
    train_x = pad_sequences(train_x, maxlen=maxlen)
    test_x = pad_sequences(test_x, maxlen=maxlen)
    
    # get the target values
    identity_columns = [
        'male', 'female', 'homosexual_gay_or_lesbian', 'christian', 'jewish',
        'muslim', 'black', 'white', 'psychiatric_or_mental_illness']
    train_y = (train['target'].values > 0.5).astype(int)
    train_y_identity = train[identity_columns].values

    # shuffling the data
    np.random.seed(seed)
    train_idx = np.random.permutation(len(train_x))

    train_x = train_x[train_idx]
    train_y = train_y[train_idx]
    train_y_identity = train_y_identity[train_idx]

    return train_x, train_y, train_y_identity, test_x, tokenizer.word_index

## Preprocessing Pipeline

In [84]:
%%time
identity_columns = [
    'male', 'female', 'homosexual_gay_or_lesbian', 'christian', 'jewish',
    'muslim', 'black', 'white', 'psychiatric_or_mental_illness']
aux_columns = ['target', 'severe_toxicity', 'obscene', 'identity_attack', 'insult', 'threat']


train = pd.read_csv(TRAIN_DATA, index_col='id')
test = pd.read_csv(TEST_DATA, index_col='id')


# Convert identity columns to booleans
train = convert_dataframe_to_bool(train)

# lower
train['comment_text'] = train['comment_text'].str.lower()
test['comment_text'] = test['comment_text'].str.lower()

# clean misspellings
train['comment_text'] = train['comment_text'].apply(replace_typical_misspell)
test['comment_text'] = test['comment_text'].apply(replace_typical_misspell)

# clean the text
train['comment_text'] = train['comment_text'].apply(clean_text)
test['comment_text'] = test['comment_text'].apply(clean_text)

# clean numbers
train['comment_text'] = train['comment_text'].apply(clean_numbers)
test['comment_text'] = test['comment_text'].apply(clean_numbers)

# fill up the missing values
train['comment_text'] = train['comment_text'].fillna('_##_').values
test['comment_text'] = test['comment_text'].fillna('_##_').values

# tokenize the sentences
tokenizer = Tokenizer(num_words=max_features)
tokenizer.fit_on_texts(list(train['comment_text']))



Wall time: 3min 15s


In [85]:
# get the target values
train['target'] = (train['target'].values > 0.5).astype(int)

## Statistic Features

In [86]:
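# Note: comment_text was lowercased and its punctuation space-separated above, so
# 'capitals' and 'num_smilies' come out as 0 here; compute them on the raw text
# if they are meant to carry signal.
train['total_length'] = train['comment_text'].apply(len)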
train['capitals'] = train['comment_text'].apply(lambda comment: sum(1 for c in comment if c.isupper()))
train['caps_vs_length'] = train.apply(lambda row: float(row['capitals'])/float(row['total_length']),axis=1)
train['num_exclamation_marks'] = train['comment_text'].apply(lambda comment: comment.count('!'))
train['num_question_marks'] = train['comment_text'].apply(lambda comment: comment.count('?'))
train['num_punctuation'] = train['comment_text'].apply(lambda comment: sum(comment.count(w) for w in '.,;:'))
train['num_symbols'] = train['comment_text'].apply(lambda comment: sum(comment.count(w) for w in '*&$%'))
train['num_words'] = train['comment_text'].apply(lambda comment: len(comment.split()))
train['num_unique_words'] = train['comment_text'].apply(lambda comment: len(set(w for w in comment.split())))
train['words_vs_unique'] = train['num_unique_words'] / train['num_words']
train['num_smilies'] = train['comment_text'].apply(lambda comment: sum(comment.count(w) for w in (':-)', ':)', ';-)', ';)')))

In [87]:
test['total_length'] = test['comment_text'].apply(len)
test['capitals'] = test['comment_text'].apply(lambda comment: sum(1 for c in comment if c.isupper()))
test['caps_vs_length'] = test.apply(lambda row: float(row['capitals'])/float(row['total_length']),axis=1)
test['num_exclamation_marks'] = test['comment_text'].apply(lambda comment: comment.count('!'))
test['num_question_marks'] = test['comment_text'].apply(lambda comment: comment.count('?'))
test['num_punctuation'] = test['comment_text'].apply(lambda comment: sum(comment.count(w) for w in '.,;:'))
test['num_symbols'] = test['comment_text'].apply(lambda comment: sum(comment.count(w) for w in '*&$%'))
test['num_words'] = test['comment_text'].apply(lambda comment: len(comment.split()))
test['num_unique_words'] = test['comment_text'].apply(lambda comment: len(set(w for w in comment.split())))
test['words_vs_unique'] = test['num_unique_words'] / test['num_words']
test['num_smilies'] = test['comment_text'].apply(lambda comment: sum(comment.count(w) for w in (':-)', ':)', ';-)', ';)')))

In [88]:
statistic_columns = \
['total_length','capitals','caps_vs_length','num_exclamation_marks',\
 'num_question_marks','num_punctuation','num_symbols','num_words','num_unique_words','words_vs_unique','num_smilies']

## Standardize

In [90]:
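from sklearn.preprocessing import Normalizer
# Note: Normalizer rescales each row (sample) to unit L2 norm; it does not standardize
# each feature column. StandardScaler would be the per-feature alternative.
scaler = Normalizer()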
train[statistic_columns] = train[statistic_columns].fillna(0)
test[statistic_columns] =test[statistic_columns].fillna(0)
scaler.fit(train[statistic_columns])
train[statistic_columns] = scaler.transform(train[statistic_columns])
test[statistic_columns] = scaler.transform(test[statistic_columns])

## Train Valid Split

In [95]:
# split out validation set
train, valid = train_test_split(train, test_size=0.06,random_state=2333,stratify=\
                                np.vstack([np.any(train[identity_columns],axis=1),train['target']]).T)
print('%d train comments, %d validate comments' % (len(train), len(valid)))

1696581 train comments, 108293 validate comments


## Sorting TrainSet

In [96]:
train_x = tokenizer.texts_to_sequences(train['comment_text'])

In [97]:
train['length'] =np.array([len(sentence) for sentence in train_x])

In [98]:
sorted_train = train.sort_values(by=['length'],ascending=True)

In [99]:
sorted_train.head(100)

id,target,comment_text,severe_toxicity,obscene,identity_attack,insult,threat,asian,atheist,bisexual,black,buddhist,christian,female,heterosexual,hindu,homosexual_gay_or_lesbian,intellectual_or_learning_disability,jewish,latino,male,muslim,other_disability,other_gender,other_race_or_ethnicity,other_religion,other_sexual_orientation,physical_disability,psychiatric_or_mental_illness,transgender,white,created_date,publication_id,parent_id,article_id,rating,funny,wow,sad,likes,disagree,sexual_explicit,identity_annotator_count,toxicity_annotator_count,total_length,capitals,caps_vs_length,num_exclamation_marks,num_question_marks,num_punctuation,num_symbols,num_words,num_unique_words,words_vs_unique,num_smilies,length
5444896,0,? ? ? ?,0.0,0.0,0.0,0.0,0.0,,,,False,,False,False,,,False,,False,,False,False,,,,,,,False,,False,2017-06-20 10:38:11.305896+00,54,5442386.0,345960,approved,0,0,0,0,0,0.0,0,10,0.961069,0.0,0.0,0.0,0.192214,0.0,0.0,0.192214,0.048053,0.012013,0.0,0
390222,0,?,0.0,0.0,0.0,0.0,0.0,,,,False,,False,False,,,False,,False,,False,False,,,,,,,False,,False,2016-07-16 16:48:41.188958+00,21,390062.0,141486,approved,0,0,0,0,0,0.0,0,10,0.928477,0.0,0.0,0.0,0.185695,0.0,0.0,0.185695,0.185695,0.185695,0.0,0
555273,0,; - ),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,False,0.0,False,False,0.0,0.0,False,0.0,False,0.0,False,False,0.0,0.0,0.0,0.0,0.0,0.0,False,0.0,False,2016-10-26 22:59:27.084162+00,53,555108.0,149566,approved,0,0,0,1,0,0.0,4,4,0.958315,0.0,0.0,0.0,0.0,0.063888,0.0,0.191663,0.191663,0.063888,0.0,0
5167560,0,- - ! ! !,0.0,0.0,0.0,0.0,0.0,,,,False,,False,False,,,False,,False,,False,False,,,,,,,False,,False,2017-04-23 12:34:52.471319+00,54,,328976,approved,0,0,0,0,0,0.0,0,6,0.981872,0.0,0.0,0.092051,0.0,0.0,0.0,0.153418,0.061367,0.012273,0.0,0
1825271,0,; - ),0.0,0.0,0.0,0.0,0.0,,,,False,,False,False,,,False,,False,,False,False,,,,,,,False,,False,2017-03-06 21:52:16.791716+00,53,1085729.0,317162,approved,0,0,0,2,0,0.0,0,10,0.958315,0.0,0.0,0.0,0.0,0.063888,0.0,0.191663,0.191663,0.063888,0.0,0
595083,0,?,0.0,0.0,0.0,0.0,0.0,,,,False,,False,False,,,False,,False,,False,False,,,,,,,False,,False,2016-11-12 05:19:45.870267+00,22,595008.0,151364,approved,0,0,0,0,0,0.0,0,10,0.928477,0.0,0.0,0.0,0.185695,0.0,0.0,0.185695,0.185695,0.185695,0.0,0
5167671,0,.,0.0,0.0,0.0,0.0,0.0,,,,False,,False,False,,,False,,False,,False,False,,,,,,,False,,False,2017-04-23 13:14:17.691301+00,54,,329025,rejected,0,0,0,0,0,0.0,0,10,0.928477,0.0,0.0,0.0,0.0,0.185695,0.0,0.185695,0.185695,0.185695,0.0,0
716706,0,,0.0,0.0,0.0,0.0,0.0,,,,False,,False,False,,,False,,False,,False,False,,,,,,,False,,False,2016-12-16 04:39:54.372746+00,54,674299.0,155130,approved,0,0,0,0,0,0.0,0,4,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
6216305,0,,0.0,0.0,0.0,0.0,0.0,,,,False,,False,False,,,False,,False,,False,False,,,,,,,False,,False,2017-10-24 23:29:28.346830+00,55,6216208.0,392936,approved,1,0,0,0,0,0.0,0,4,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0
5347758,0,.,0.0,0.0,0.0,0.0,0.0,,,,False,,False,False,,,False,,False,,False,False,,,,,,,False,,False,2017-06-02 20:17:29.620335+00,54,,340193,rejected,0,0,0,0,0,0.0,0,10,0.928477,0.0,0.0,0.0,0.0,0.185695,0.0,0.185695,0.185695,0.185695,0.0,0


## Custom Weight

In [100]:
# Overall
weights = np.ones((len(train),)) / 4
# Subgroup
# np.int was removed from recent NumPy releases; the builtin int is equivalent here
weights += (train[identity_columns].fillna(0).values>=0.5).sum(axis=1).astype(bool).astype(int) / 4
# Background Positive, Subgroup Negative
weights += (( (train['target'].values>=0.5).astype(bool).astype(int) +
   (train[identity_columns].fillna(0).values<0.5).sum(axis=1).astype(bool).astype(int) ) > 1 ).astype(bool).astype(int) / 4
# Background Negative, Subgroup Positive
weights += (( (train['target'].values<0.5).astype(bool).astype(int) +
   (train[identity_columns].fillna(0).values>=0.5).sum(axis=1).astype(bool).astype(int) ) > 1 ).astype(bool).astype(int) / 4
loss_weight = 1.0 / weights.mean()
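To make the four weight terms concrete, here is a small NumPy sketch that applies the same expressions to four made-up rows. `sample_weights` is a hypothetical helper, and the toy `target` and `identities` arrays are invented for illustration; only the arithmetic follows the cell above.

```python
import numpy as np

def sample_weights(target, identities):
    """Mirror the four 1/4 terms above for a (n,) target and (n, k) identity matrix."""
    is_toxic = target >= 0.5
    any_identity = (identities >= 0.5).any(axis=1)
    # As written above, this term fires when at least one identity column is below 0.5
    any_non_identity = (identities < 0.5).any(axis=1)
    weights = np.ones(len(target)) / 4            # overall
    weights += any_identity / 4                   # subgroup
    weights += (is_toxic & any_non_identity) / 4  # background positive, subgroup negative
    weights += (~is_toxic & any_identity) / 4     # background negative, subgroup positive
    return weights

target = np.array([1.0, 0.0, 1.0, 0.0])
identities = np.array([[0, 0], [1, 0], [1, 1], [0, 0]], dtype=float)
weights = sample_weights(target, identities)
print(weights, 1.0 / weights.mean())
```

A non-toxic comment that mentions an identity (the second row) receives the largest weight, which is the case the bias metric penalizes most when it is misclassified.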

In [101]:
def custom_loss(data, targets):
    ''' Define custom loss function for weighted BCE on 'target' column '''
    bce_loss_1 = nn.BCEWithLogitsLoss(weight=targets[:,1:2])(data[:,:1],targets[:,:1])
    bce_loss_2 = nn.BCEWithLogitsLoss()(data[:,1:],targets[:,2:])
    return (bce_loss_1 * loss_weight) + bce_loss_2

## Binning Sentences of Different Lengths into a List of DataFrames

In [102]:
boundary_list = [0,300000,700000, 1000000,1200000,1350000,1500000, len(train)]
sorted_train_list = []
# slice the length-sorted frame into consecutive bins; the slice end is exclusive,
# so the last boundary is len(train) to keep the final row
for start, end in zip(boundary_list[:-1], boundary_list[1:]):
    sorted_train_list.append(sorted_train[start:end])

## Stat Features

In [103]:
sorted_train_stat_list = []
for sorted_train in sorted_train_list:
    temp_train_stat = sorted_train[statistic_columns].values
    sorted_train_stat_list.append(temp_train_stat)

In [104]:
test_stat = test[statistic_columns].values
valid_stat = valid[statistic_columns].values

## tokenize the sentences

In [105]:
sorted_train_x_list = [] # a list of train_x
for sorted_train in sorted_train_list:
    temp_train_x = tokenizer.texts_to_sequences(sorted_train['comment_text'])
    sorted_train_x_list.append(temp_train_x)

In [106]:
test_x = tokenizer.texts_to_sequences(test['comment_text'])
valid_x = tokenizer.texts_to_sequences(valid['comment_text'])

## Padding & Set Y

In [107]:
padded_train_x_list = []
PD_size_list = []
for sorted_train_x in sorted_train_x_list:
    PD_size = min(max(len(s) for s in sorted_train_x), maxlen)  # longest sequence in the subset, capped at maxlen
    PD_size_list.append(PD_size)
    temp_padded_train_x = pad_sequences(sorted_train_x, maxlen=PD_size)
    padded_train_x_list.append(temp_padded_train_x)
print(PD_size_list)

[13, 28, 46, 65, 86, 122, 220]
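A rough way to see what these per-bin padding lengths buy is to count padded positions. The sketch below uses only numbers printed in this notebook: the bin boundaries, the padding lengths above, and the training-set size of 1,696,581 reported after the split (used here as the final boundary). It counts array cells only and is not a timing measurement.

```python
# Bin boundaries and per-bin padding lengths as reported in this notebook
boundaries = [0, 300000, 700000, 1000000, 1200000, 1350000, 1500000, 1696581]
pad_sizes = [13, 28, 46, 65, 86, 122, 220]
maxlen = 220

bin_sizes = [end - start for start, end in zip(boundaries[:-1], boundaries[1:])]
bucketed_cells = sum(n * width for n, width in zip(bin_sizes, pad_sizes))
global_cells = sum(bin_sizes) * maxlen
ratio = bucketed_cells / global_cells
print(f"{bucketed_cells:,} vs {global_cells:,} padded positions ({ratio:.1%})")
```

Under these numbers the binned layout needs roughly 31% of the positions that a uniform padding length of 220 would require, so the recurrent layers process about a third as many time steps per epoch.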


In [108]:
valid_x = pad_sequences(valid_x, maxlen=maxlen)
test_x = pad_sequences(test_x, maxlen=maxlen)

In [109]:
valid_y = valid[['target','target', 'severe_toxicity', 'obscene', 'identity_attack', 'insult', 'threat']].values

In [110]:
sorted_train_y_list = []
for sorted_train in sorted_train_list:
    temp_train_y = sorted_train[['target','target', 'severe_toxicity', 'obscene', 'identity_attack', 'insult', 'threat']].values
    sorted_train_y_list.append(temp_train_y)

## Build Up Network

In [111]:
class SpatialDropout(nn.Dropout2d):
    def forward(self, x):
        x = x.unsqueeze(2)    # (N, T, 1, K)
        x = x.permute(0, 3, 2, 1)  # (N, K, 1, T)
        x = super(SpatialDropout, self).forward(x)  # (N, K, 1, T), some features are masked
        x = x.permute(0, 3, 2, 1)  # (N, T, 1, K)
        x = x.squeeze(2)  # (N, T, K)
        return x
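The effect of `SpatialDropout` is that a whole embedding channel is dropped for every time step of a sample, rather than individual values. A NumPy sketch of that behaviour follows, assuming the usual inverted-dropout scaling of `1 / (1 - p)`; the `spatial_dropout` function is illustrative and is not the PyTorch implementation.

```python
import numpy as np

def spatial_dropout(x, p, rng):
    """Drop entire feature channels of x with shape (N, T, K); requires p < 1."""
    # One keep/drop decision per (sample, channel), broadcast over the T axis
    keep = rng.random((x.shape[0], 1, x.shape[2])) >= p
    return x * keep / (1.0 - p)

x = np.ones((4, 5, 8))  # (batch, time steps, embedding size)
out = spatial_dropout(x, p=0.3, rng=np.random.default_rng(0))
print(out[0])  # each column is either all zeros or all 1 / 0.7
```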

In [112]:
max_features = min(len(tokenizer.word_index) + 1,max_features)
print(max_features)

289642


In [113]:
num_aux_targets = 6
NUM_MODELS = 2
LSTM_UNITS = 128
DENSE_HIDDEN_UNITS = 4 * LSTM_UNITS

In [114]:
class NeuralNet(nn.Module):
    def __init__(self, embedding_matrix, num_aux_targets,):
        super(NeuralNet, self).__init__()
        embed_size = embedding_matrix.shape[1]
        
        self.embedding = nn.Embedding(max_features, embed_size)
        self.embedding.weight = nn.Parameter(torch.tensor(embedding_matrix, dtype=torch.float32))
        self.embedding.weight.requires_grad = False
        self.embedding_dropout = SpatialDropout(0.3)
        
        self.lstm1 = nn.LSTM(embed_size, LSTM_UNITS, bidirectional=True, batch_first=True)
        self.lstm2 = nn.LSTM(LSTM_UNITS * 2, LSTM_UNITS, bidirectional=True, batch_first=True)
    
        self.linear1 = nn.Linear(DENSE_HIDDEN_UNITS+64, DENSE_HIDDEN_UNITS+64)
        self.linear2 = nn.Linear(DENSE_HIDDEN_UNITS+64, DENSE_HIDDEN_UNITS+64)
        
        self.linear_stat = nn.Linear(11, 64)
        
        
        self.linear_out = nn.Linear(DENSE_HIDDEN_UNITS+64, 1)
        self.linear_aux_out = nn.Linear(DENSE_HIDDEN_UNITS+64, num_aux_targets)
        
    def forward(self, x,stat_x,lengths=None):
        h_embedding = self.embedding(x.long())
        h_embedding = self.embedding_dropout(h_embedding)
        
        h_lstm1, _ = self.lstm1(h_embedding)
        h_lstm2, _ = self.lstm2(h_lstm1)
        
        
        # global average pooling
        avg_pool = torch.mean(h_lstm2, 1)
        # global max pooling
        max_pool, _ = torch.max(h_lstm2, 1)
        #stat feature out
        stat_out = self.linear_stat(stat_x)
        
        
        h_conc = torch.cat((max_pool, avg_pool,stat_out), 1)
        h_conc_linear1  = F.relu(self.linear1(h_conc))
        h_conc_linear2  = F.relu(self.linear2(h_conc))
        
        hidden = h_conc + h_conc_linear1 + h_conc_linear2
        
        result = self.linear_out(hidden)
        aux_result = self.linear_aux_out(hidden)
        out = torch.cat([result, aux_result], 1)
        
        return out

In [27]:
# class NeuralNet(nn.Module):
#     def __init__(self, embedding_matrix):
#         super(NeuralNet, self).__init__()

#         lstm_hidden_size = 120
#         gru_hidden_size = 60
        
#         self.gru_hidden_size = gru_hidden_size
#         self.embedding = nn.Embedding(max_features, embed_size)
#         self.embedding.weight = nn.Parameter(torch.tensor(embedding_matrix, dtype=torch.float32))
#         self.embedding.weight.requires_grad = False
#         self.embedding_dropout = nn.Dropout2d(0.25)

#         self.lstm = nn.LSTM(embed_size, lstm_hidden_size, bidirectional=True, batch_first=True)
#         self.gru = nn.GRU(lstm_hidden_size * 2, gru_hidden_size, bidirectional=True, batch_first=True)

#         self.linear = nn.Linear(gru_hidden_size * 6, 32)
#         self.relu = nn.ReLU()
#         self.dropout = nn.Dropout(0.25)
#         self.linear_aux_out = nn.Linear(32, 5)
#         self.out = nn.Linear(32, 1)

#     def apply_spatial_dropout(self, h_embedding):
#         h_embedding = h_embedding.transpose(1, 2).unsqueeze(2)
#         h_embedding = self.embedding_dropout(h_embedding).squeeze(2).transpose(1, 2)
#         return h_embedding

#     def forward(self, x):
#         h_embedding = self.embedding(x)
#         h_embedding = self.apply_spatial_dropout(h_embedding)

#         h_lstm, _ = self.lstm(h_embedding)
#         h_gru, hh_gru = self.gru(h_lstm)

#         hh_gru = hh_gru.view(-1, self.gru_hidden_size * 2)  # flatten the last hidden state: N x (GRUOUT * 2)

#         avg_pool = torch.mean(h_gru, 1)    # N * GRUOUT*2
#         max_pool, _ = torch.max(h_gru, 1)  # N * GRUOUT*2

#         conc = torch.cat((hh_gru, avg_pool, max_pool), 1)
#         conc = self.relu(self.linear(conc))
#         conc = self.dropout(conc)
#         aux_result = self.linear_aux_out(conc)
#         result = self.out(conc)
#         out = torch.cat([result, aux_result], 1)
#         return out

## Evaluation Functions

In [115]:
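SUBGROUP_AUC = 'subgroup_auc'
BPSN_AUC = 'bpsn_auc'  # stands for background positive, subgroup negative
BNSP_AUC = 'bnsp_auc'  # stands for background negative, subgroup positive
TOXICITY_COLUMN = 'target'  # assumed: not defined elsewhere in this notebook; 'target' is the label column used above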

def compute_auc(y_true, y_pred):
    try:
        return metrics.roc_auc_score(y_true, y_pred)
    except ValueError:
        return np.nan

def compute_subgroup_auc(df, subgroup, label, model_name):
    subgroup_examples = df[df[subgroup]]
    return compute_auc(subgroup_examples[label], subgroup_examples[model_name])

def compute_bpsn_auc(df, subgroup, label, model_name):
    """Computes the AUC of the within-subgroup negative examples and the background positive examples."""
    subgroup_negative_examples = df[df[subgroup] & ~df[label]]
    non_subgroup_positive_examples = df[~df[subgroup] & df[label]]
    # DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
    examples = pd.concat([subgroup_negative_examples, non_subgroup_positive_examples])
    return compute_auc(examples[label], examples[model_name])

def compute_bnsp_auc(df, subgroup, label, model_name):
    """Computes the AUC of the within-subgroup positive examples and the background negative examples."""
    subgroup_positive_examples = df[df[subgroup] & df[label]]
    non_subgroup_negative_examples = df[~df[subgroup] & ~df[label]]
    # DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
    examples = pd.concat([subgroup_positive_examples, non_subgroup_negative_examples])
    return compute_auc(examples[label], examples[model_name])

def compute_bias_metrics_for_model(dataset,
                                   subgroups,
                                   model,
                                   label_col,
                                   include_asegs=False):
    """Computes per-subgroup metrics for all subgroups and one model."""
    records = []
    for subgroup in subgroups:
        record = {
            'subgroup': subgroup,
            'subgroup_size': len(dataset[dataset[subgroup]])
        }
        record[SUBGROUP_AUC] = compute_subgroup_auc(dataset, subgroup, label_col, model)
        record[BPSN_AUC] = compute_bpsn_auc(dataset, subgroup, label_col, model)
        record[BNSP_AUC] = compute_bnsp_auc(dataset, subgroup, label_col, model)
        records.append(record)
    return pd.DataFrame(records).sort_values('subgroup_auc', ascending=True)


def calculate_overall_auc(df, model_name):
    true_labels = df[TOXICITY_COLUMN]
    predicted_labels = df[model_name]
    return metrics.roc_auc_score(true_labels, predicted_labels)

def power_mean(series, p):
    total = sum(np.power(series, p))
    return np.power(total / len(series), 1 / p)

def get_final_metric(bias_df, overall_auc, POWER=-5, OVERALL_MODEL_WEIGHT=0.25):
    bias_score = np.average([
        power_mean(bias_df[SUBGROUP_AUC], POWER),
        power_mean(bias_df[BPSN_AUC], POWER),
        power_mean(bias_df[BNSP_AUC], POWER)
    ])
    return (OVERALL_MODEL_WEIGHT * overall_auc) + ((1 - OVERALL_MODEL_WEIGHT) * bias_score)


#bias_metrics_df = compute_bias_metrics_for_model(validate_df, identity_columns, MODEL_NAME, TOXICITY_COLUMN)
#bias_metrics_df
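The power mean with `p = -5` is what makes the final metric sensitive to the weakest subgroup. The following sketch restates `power_mean` and the final blend in NumPy so it runs on its own; `final_metric` is a simplified stand-in that takes plain lists, and the AUC values are made up for illustration.

```python
import numpy as np

def power_mean(values, p):
    values = np.asarray(values, dtype=float)
    return np.power(np.mean(np.power(values, p)), 1.0 / p)

def final_metric(subgroup_aucs, bpsn_aucs, bnsp_aucs, overall_auc,
                 power=-5, overall_weight=0.25):
    bias_score = np.mean([power_mean(subgroup_aucs, power),
                          power_mean(bpsn_aucs, power),
                          power_mean(bnsp_aucs, power)])
    return overall_weight * overall_auc + (1 - overall_weight) * bias_score

aucs = [0.9, 0.9, 0.5]  # invented values: one subgroup performs poorly
print(np.mean(aucs), power_mean(aucs, -5))
```

For these invented values the arithmetic mean is about 0.77 while the power mean is about 0.61, much closer to the worst subgroup's 0.5.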

## Debugging Setting

In [116]:
DEBUGGING = False

## Loading Embedding_Matrix

In [117]:
embedding_matrix = load_fasttext(tokenizer.word_index)

## Setup Dataset & Optimizer

In [119]:
# with timer('load data'):
#     train_x, train_y, train_y_identity, test_x, word_index = load_and_prec()
#train_preds = np.zeros((len(train_x)))

seed_torch(seed)

# train_loader_list # 
train_loader_list = []
for x_train,y_train,stat_train in zip(padded_train_x_list,sorted_train_y_list,sorted_train_stat_list):
    temp_stat_train = torch.tensor(stat_train, dtype=torch.float32).cuda()
    temp_x_train_cuda = torch.tensor(x_train, dtype=torch.long).cuda()
    temp_y_train_cuda = torch.tensor(y_train, dtype=torch.float32).cuda()
    temp_train = torch.utils.data.TensorDataset(temp_x_train_cuda, temp_y_train_cuda,temp_stat_train)
    temp_train_loader = torch.utils.data.DataLoader(temp_train, batch_size=batch_size, shuffle=True)
    train_loader_list.append(temp_train_loader)

    
    
# testset & validset
x_test_cuda = torch.tensor(test_x, dtype=torch.long).cuda()
stat_test = torch.tensor(test_stat, dtype=torch.float32).cuda()
x_valid_cuda = torch.tensor(valid_x, dtype=torch.long).cuda()
y_valid_cuda = torch.tensor(valid_y, dtype=torch.float32).cuda()
stat_valid = torch.tensor(valid_stat, dtype=torch.float32).cuda()



test_ds = torch.utils.data.TensorDataset(x_test_cuda,stat_test)
valid_ds = torch.utils.data.TensorDataset(x_valid_cuda, y_valid_cuda,stat_valid)
test_loader = torch.utils.data.DataLoader(test_ds, batch_size=batch_size, shuffle=False)
valid_loader = torch.utils.data.DataLoader(valid_ds, batch_size=batch_size, shuffle=False)

# model optimizer loss_fn
model = NeuralNet(embedding_matrix,num_aux_targets)
model.cuda()
loss_fn = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.01)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda epoch: 0.65 ** epoch)
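The `LambdaLR` factor of `0.65 ** epoch` implies the following learning rates, assuming `scheduler.step()` is called once at the end of each epoch. The training loop in this notebook does not appear to make that call, in which case the rate would stay at its initial value. The sketch is plain arithmetic and does not use PyTorch.

```python
base_lr = 0.01
decay = 0.65
train_epochs = 8

# Learning rate in effect during each epoch if the scheduler steps once per epoch
learning_rates = [base_lr * decay ** epoch for epoch in range(train_epochs)]
for epoch, lr in enumerate(learning_rates, start=1):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```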

## Training Model

In [120]:
valid_preds = []  # one array of validation predictions per epoch
test_preds = []   # one array of test predictions per epoch
seed_torch()
for epoch in range(train_epochs):
    print('epoch ' + str(epoch + 1) + ' start')
    print()
    start_time = time.time()
    model.train()
    avg_loss = 0.

    # One loader per group of similar-length sentences
    for loader_idx, train_loader in enumerate(train_loader_list, start=1):
        print('train_loader: ' + str(loader_idx) + '/' + str(len(train_loader_list)))
        for x_batch, y_batch, stat_batch in tqdm(train_loader, disable=False):
            y_pred = model(x_batch, stat_batch)
            loss = loss_fn(y_pred, y_batch)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            avg_loss += loss.item()
    # Sum of per-batch losses divided by len(train), times 100:
    # a scaled monitoring value, not the mean loss per batch.
    avg_loss = avg_loss / len(train) * 100

    # Evaluation: column 0 of the model output is kept as the prediction
    model.eval()
    valid_preds_epoch = np.zeros(len(valid_x))
    test_preds_epoch = np.zeros(len(test_x))
    avg_val_loss = 0.

    for i, (x_batch, y_batch, stat_batch) in enumerate(valid_loader):
        with torch.no_grad():
            y_pred = model(x_batch, stat_batch)

        avg_val_loss += loss_fn(y_pred, y_batch).item()
        valid_preds_epoch[i * batch_size:(i + 1) * batch_size] = sigmoid(y_pred.cpu().numpy())[:, 0]
    avg_val_loss = avg_val_loss / len(valid) * 100
    elapsed_time = time.time() - start_time
    print('Epoch {}/{} \t loss={:.6f} \t val_loss={:.6f} \t time={:.2f}s'.format(
        epoch + 1, train_epochs, avg_loss, avg_val_loss, elapsed_time))

    for i, (x_batch, stat_batch) in enumerate(test_loader):
        with torch.no_grad():
            y_pred = model(x_batch, stat_batch)

        test_preds_epoch[i * batch_size:(i + 1) * batch_size] = sigmoid(y_pred.cpu().numpy())[:, 0]

    # Keep every epoch's predictions for the checkpoint ensemble
    valid_preds.append(valid_preds_epoch)
    test_preds.append(test_preds_epoch)




Output (progress-bar widget residue removed; every epoch runs the same 7 train loaders with 586, 782, 586, 391, 293, 293 and 384 batches):

Epoch 1/8 	 loss=0.084264 	 val_loss=0.016836 	 time=263.97s
Epoch 2/8 	 loss=0.016680 	 val_loss=0.016144 	 time=263.44s
Epoch 3/8 	 loss=0.016038 	 val_loss=0.016080 	 time=263.00s
Epoch 4/8 	 loss=0.015898 	 val_loss=0.016000 	 time=262.83s
Epoch 5/8 	 loss=0.015806 	 val_loss=0.015942 	 time=263.60s
Epoch 6/8 	 loss=0.015766 	 val_loss=0.016063 	 time=262.95s
Epoch 7/8 	 loss=0.015715 	 val_loss=0.015921 	 time=264.47s
Epoch 8/8 	 loss=0.015711 	 val_loss=0.015859 	 time=261.13s
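
The seven loaders in the log come from grouping sentences of similar length so that each group is padded only to its own longest sentence. The cell that builds those groups is not shown in this section, so the following is only a minimal sketch of the idea on plain Python lists. The function name `bucket_and_pad`, the equal-size split, the padding index 0 and the choice of post-padding are all assumptions made for illustration.

```python
from typing import List, Sequence

PAD_ID = 0  # assumed padding index


def bucket_and_pad(sequences: Sequence[List[int]], num_buckets: int) -> List[List[List[int]]]:
    """Sort by length, split into contiguous groups, pad each group to its own max length."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    ordered = sorted(sequences, key=len)
    # Ceiling division so that every sequence lands in some bucket.
    bucket_size = max(1, -(-len(ordered) // num_buckets))
    buckets = []
    for start in range(0, len(ordered), bucket_size):
        group = ordered[start:start + bucket_size]
        max_len = len(group[-1])  # the group is sorted, so the last item is the longest
        # Post-padding is an assumption; the notebook may pad on the other side.
        buckets.append([seq + [PAD_ID] * (max_len - len(seq)) for seq in group])
    return buckets


if __name__ == "__main__":
    toy = [[5, 6], [1], [7, 8, 9, 10], [2, 3, 4], [11, 12, 13, 14, 15, 16]]
    for bucket in bucket_and_pad(toy, num_buckets=2):
        print("padded length", len(bucket[0]), "->", bucket)
```

In the toy example the three shortest sequences are padded to length 3 instead of 6, which is the same saving the real pipeline gets by not padding every sentence to 220 tokens.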


## Save Submission

In [124]:
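submission = pd.read_csv(SAMPLE_SUBMISSION, index_col='id')
# Start from zero instead of relying on the placeholder values in the sample file
submission['prediction'] = 0.0
# Checkpoint ensemble: weighted average of the per-epoch test predictions.
# zip() stops at the shorter input, so surplus epochs or surplus weights are ignored.
for preds, weight in zip(test_preds, checkpoint_weights):
    submission['prediction'] += weight * preds / sum(checkpoint_weights)

submission.reset_index(drop=False, inplace=True)
submission.to_csv('submission_stat1.csv', index=False)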
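
The loop above is a weighted average over epochs (a checkpoint ensemble). As a small stand-alone illustration, the sketch below does the same blending on plain lists, with two deliberate differences that are assumptions of this example rather than the notebook's behaviour: it refuses mismatched lengths instead of letting `zip()` truncate silently, and it normalizes by the weights it actually uses. The name `weighted_checkpoint_average` is hypothetical.

```python
from typing import List, Sequence


def weighted_checkpoint_average(preds_per_epoch: Sequence[Sequence[float]],
                                weights: Sequence[float]) -> List[float]:
    """Weighted average of per-epoch prediction vectors (one weight per epoch)."""
    if not preds_per_epoch:
        raise ValueError("need at least one checkpoint")
    if len(preds_per_epoch) != len(weights):
        raise ValueError("need exactly one weight per checkpoint")
    total = float(sum(weights))
    blended = [0.0] * len(preds_per_epoch[0])
    for preds, weight in zip(preds_per_epoch, weights):
        for j, p in enumerate(preds):
            blended[j] += weight * p / total
    return blended


if __name__ == "__main__":
    # Two toy "epochs" with two samples each; the later epoch gets three times the weight.
    print(weighted_checkpoint_average([[0.0, 1.0], [1.0, 1.0]], [1, 3]))
```

Doubling weights such as `[1, 2, 4, 8, 16]` give the last epoch about half of the total vote while still letting earlier checkpoints smooth the result.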

In [68]:
bias_metrics_df = compute_bias_metrics_for_model(valid, identity_columns, MODEL_NAME, 'target')
bias_metrics_df

| | bnsp_auc | bpsn_auc | subgroup | subgroup_auc | subgroup_size |
|---|---|---|---|---|---|
| 4 | 0.938544 | 0.934018 | jewish | 0.853959 | 471 |
| 2 | 0.96956 | 0.881154 | homosexual_gay_or_lesbian | 0.857245 | 638 |
| 5 | 0.962121 | 0.904256 | muslim | 0.858823 | 1240 |
| 7 | 0.977015 | 0.86768 | white | 0.8696 | 1514 |
| 6 | 0.972047 | 0.883081 | black | 0.869641 | 861 |
| 0 | 0.970387 | 0.936914 | male | 0.932326 | 2662 |
| 3 | 0.95669 | 0.958484 | christian | 0.934374 | 2392 |
| 1 | 0.970325 | 0.943786 | female | 0.937337 | 3193 |
| 8 | 0.973128 | 0.936599 | psychiatric_or_mental_illness | 0.940696 | 289 |


In [69]:
def get_valid_score(valid, valid_preds, checkpoint_weights):
    """Print the final metric for each epoch and for the weighted checkpoint ensemble."""
    valid['checkpoint_ensemble'] = 0
    # zip() stops at the shorter input, so only epochs that have a weight are scored
    for i, (preds, weight) in enumerate(zip(valid_preds, checkpoint_weights)):
        valid[MODEL_NAME] = preds  # note: overwrites valid[MODEL_NAME] on every epoch
        bias_metrics_df = compute_bias_metrics_for_model(valid, identity_columns, MODEL_NAME, 'target')
        valid['checkpoint_ensemble'] += weight * preds / sum(checkpoint_weights)
        print('epoch', i + 1, ' ', get_final_metric(bias_metrics_df, calculate_overall_auc(valid, MODEL_NAME)))
    print()
    bias_metrics_df = compute_bias_metrics_for_model(valid, identity_columns, 'checkpoint_ensemble', 'target')
    print('checkpoint_ensemble', get_final_metric(bias_metrics_df, calculate_overall_auc(valid, 'checkpoint_ensemble')))
    print()
    print(bias_metrics_df)

get_valid_score(valid, valid_preds, checkpoint_weights)

epoch 1   0.9009383211813122
epoch 2   0.9259373020997725
epoch 3   0.9304881199631917
epoch 4   0.937013339568904
epoch 5   0.9350978162470152

checkpoint_ensemble 0.9379267666909455

   bnsp_auc  bpsn_auc                       subgroup  subgroup_auc  \
5  0.957845  0.916561                         muslim      0.856900   
2  0.966657  0.894133      homosexual_gay_or_lesbian      0.858535   
4  0.938025  0.941693                         jewish      0.861111   
6  0.969307  0.897905                          black      0.871900   
7  0.973744  0.886161                          white      0.875182   
0  0.968045  0.941860                           male      0.931535   
3  0.953682  0.963757                      christian      0.936245   
1  0.968905  0.949091                         female      0.939486   
8  0.972766  0.939936  psychiatric_or_mental_illness      0.944925   

   subgroup_size  
5           1240  
2            638  
4            471  
6            861  
7           1514  
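
The per-subgroup AUCs in the table are collapsed into a single number by `get_final_metric`, which is defined elsewhere in the notebook. Assuming it follows the competition's benchmark definition, each AUC column is summarized with a generalized power mean with exponent -5, which weighs the weakest subgroups much more heavily than a plain average does. The sketch below shows only that step, using the `subgroup_auc` column printed above; the exponent and the helper name `power_mean` are assumptions here.

```python
from typing import Sequence


def power_mean(values: Sequence[float], p: float) -> float:
    """Generalized (power) mean; a negative p pulls the result toward the smallest value."""
    if not values:
        raise ValueError("values must be non-empty")
    if p == 0:
        raise ValueError("p must be non-zero")
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


# subgroup_auc column of the checkpoint_ensemble table printed above
subgroup_auc = [0.856900, 0.858535, 0.861111, 0.871900, 0.875182,
                0.931535, 0.936245, 0.939486, 0.944925]

arithmetic = sum(subgroup_auc) / len(subgroup_auc)
penalized = power_mean(subgroup_auc, p=-5)  # p = -5 is an assumption, see the text above

if __name__ == "__main__":
    print(f"arithmetic mean: {arithmetic:.6f}")
    print(f"power mean (p=-5): {penalized:.6f}")
```

Because the power mean sits below the arithmetic mean whenever the values differ, improving the weakest subgroups (here muslim, homosexual_gay_or_lesbian and jewish) moves the score more than improving subgroups that are already above 0.93.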


## History

In [36]:
MODEL_NAME = 'my_model'
TOXICITY_COLUMN = 'target'
valid[MODEL_NAME] = valid_preds[-1]
checkpoint_weights = [1,2,4,8,16]

In [36]:
MODEL_NAME = 'my_model'
TOXICITY_COLUMN = 'target'
valid[MODEL_NAME] = valid_preds[-1]
checkpoint_weights = [1,2,4,8,6]

In [37]:
bias_metrics_df = compute_bias_metrics_for_model(valid, identity_columns, MODEL_NAME, 'target')
bias_metrics_df

| | bnsp_auc | bpsn_auc | subgroup | subgroup_auc | subgroup_size |
|---|---|---|---|---|---|
| 2 | 0.965841 | 0.882439 | homosexual_gay_or_lesbian | 0.84392 | 638 |
| 5 | 0.954735 | 0.911399 | muslim | 0.848208 | 1240 |
| 4 | 0.944002 | 0.936243 | jewish | 0.868391 | 471 |
| 7 | 0.970519 | 0.88861 | white | 0.873238 | 1514 |
| 6 | 0.967548 | 0.899446 | black | 0.880675 | 861 |
| 8 | 0.97066 | 0.931121 | psychiatric_or_mental_illness | 0.930763 | 289 |
| 0 | 0.966378 | 0.941899 | male | 0.930929 | 2662 |
| 1 | 0.964808 | 0.948661 | female | 0.934906 | 3193 |
| 3 | 0.956695 | 0.959054 | christian | 0.937313 | 2392 |


In [38]:
def get_valid_score(valid, valid_preds, checkpoint_weights):
    """Print the final metric for each epoch and for the weighted checkpoint ensemble."""
    valid['checkpoint_ensemble'] = 0
    # zip() stops at the shorter input, so only epochs that have a weight are scored
    for i, (preds, weight) in enumerate(zip(valid_preds, checkpoint_weights)):
        valid[MODEL_NAME] = preds  # note: overwrites valid[MODEL_NAME] on every epoch
        bias_metrics_df = compute_bias_metrics_for_model(valid, identity_columns, MODEL_NAME, 'target')
        valid['checkpoint_ensemble'] += weight * preds / sum(checkpoint_weights)
        print('epoch', i + 1, ' ', get_final_metric(bias_metrics_df, calculate_overall_auc(valid, MODEL_NAME)))
    print()
    bias_metrics_df = compute_bias_metrics_for_model(valid, identity_columns, 'checkpoint_ensemble', 'target')
    print('checkpoint_ensemble', get_final_metric(bias_metrics_df, calculate_overall_auc(valid, 'checkpoint_ensemble')))
    print()
    print(bias_metrics_df)

get_valid_score(valid, valid_preds, checkpoint_weights)

epoch 1   0.9050295295042003
epoch 2   0.9227181478177864
epoch 3   0.9267708920861492
epoch 4   0.93422739182796
epoch 5   0.9356015373894764

checkpoint_ensemble 0.9358183533894483

   bnsp_auc  bpsn_auc                       subgroup  subgroup_auc  \
2  0.970074  0.872003      homosexual_gay_or_lesbian      0.849250   
5  0.953282  0.920600                         muslim      0.851970   
4  0.941277  0.940569                         jewish      0.867944   
7  0.971398  0.885049                          white      0.869587   
6  0.969736  0.892075                          black      0.874517   
0  0.968635  0.939578                           male      0.931760   
8  0.971237  0.935441  psychiatric_or_mental_illness      0.935090   
1  0.967386  0.947359                         female      0.936860   
3  0.955539  0.960847                      christian      0.937091   

   subgroup_size  
2            638  
5           1240  
4            471  
7           1514  
6            861  
0