## Multioutput model (without context)

Now we need a model to detect the type of hate.

In [1]:
%load_ext autoreload
%autoreload 2

from hatedetection import load_datasets

train_dataset, dev_dataset, test_dataset = load_datasets()


Let's keep only the comments that are HATEFUL.

In [2]:
from sklearn.model_selection import train_test_split
import pandas as pd

train_dataset = train_dataset.filter(lambda x: x["HATEFUL"] > 0)
dev_dataset = dev_dataset.filter(lambda x: x["HATEFUL"] > 0)
test_dataset = test_dataset.filter(lambda x: x["HATEFUL"] > 0)



## Binary Cross Entropy Loss

Given our set of categories $C$, we want a multi-task setup that uses the following loss:

$$
J(y, \hat{y}) = \frac{1}{|C|}\sum\limits_{c \in C} J_c(y, \hat{y})
$$

That is, for each instance, the loss is the average of the per-category losses for `MUJER`, `RACISMO`, etc.
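
The averaging described above can be sketched on a toy batch. This is a minimal illustration, not the training code: the batch size, the number of categories, and the label values are made up, and `reduction="none"` is used only to expose the per-category terms $J_c$ before averaging.

```python
import torch
from torch.nn.functional import binary_cross_entropy_with_logits

torch.manual_seed(0)

# Hypothetical sizes: 4 instances, 3 categories (|C| = 3).
logits = torch.randn(4, 3)
labels = torch.tensor([[1., 0., 0.],
                       [0., 1., 1.],
                       [0., 0., 0.],
                       [1., 1., 0.]])

# J_c for every instance and category, with no reduction applied.
per_category = binary_cross_entropy_with_logits(logits, labels, reduction="none")

# J for each instance: the average over the categories in C.
per_instance = per_category.mean(dim=1)

# The default "mean" reduction averages over all elements. Every instance has
# the same number of categories, so this equals the batch mean of J.
batch_loss = binary_cross_entropy_with_logits(logits, labels)

print(per_instance.shape)
print(torch.allclose(per_instance.mean(), batch_loss))
```

Because the default reduction already averages over both the batch and the category dimension, a plain `BCEWithLogitsLoss()` matches the formula without extra code.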

In [3]:
import torch
from torch.nn import BCEWithLogitsLoss
from torch.nn.functional import binary_cross_entropy_with_logits


# Suppose we have a batch of 32 instances,
# each with one logit per category (8 here)

logits = torch.randn(32, 8)
labels = torch.Tensor([[1, 1, 1, 1, 0, 0, 0, 0] for _ in range(32)])


loss_fct = BCEWithLogitsLoss()
loss_fct(logits, labels)


tensor(0.7815)

In [4]:
logits = torch.Tensor([[-10, -9, -10]])
target = torch.zeros(1, 3)

loss_fct(
    logits,
    target,
)

tensor(7.1403e-05)

Is this doing what we expect? Let's check.

Binary cross entropy is

$- [y \log \hat{y} + (1-y) \log (1-\hat{y}) ]$

In [5]:
# torch.nn.functional.sigmoid is deprecated in favor of torch.sigmoid
pred = torch.sigmoid(logits)

losses = -(target * torch.log(pred) + (1 - target) * torch.log(1-pred))

losses.mean()



tensor(7.1410e-05)

Great, the two values match!

What about the weight?

In [6]:
# torch.nn.functional.sigmoid is deprecated in favor of torch.sigmoid
pred = torch.sigmoid(logits)

weights = torch.Tensor([0.5, 0.1, 0.4])

losses = -(target * torch.log(pred) + (1 - target) * torch.log(1-pred))

loss_fct = BCEWithLogitsLoss(pos_weight=weights)

(losses * weights).sum(), loss_fct(logits, target)

(tensor(5.3217e-05), tensor(7.1526e-05))



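Hmm, the two values do not match. **CHECK THIS**: `pos_weight` only scales the positive-target term $y \log \hat{y}$, so with an all-zero `target` it has no effect and the result stays at the unweighted mean (up to floating-point rounding), whereas the hand-computed value is a weighted sum.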
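
A possible way to settle the question is to compare the two weighting arguments of `BCEWithLogitsLoss` side by side. The sketch below reuses the same logits, targets, and weights as the cell above; the loose tolerance `rtol=0.05` is an assumption chosen to absorb float32 rounding on values this small.

```python
import torch
from torch.nn import BCEWithLogitsLoss

logits = torch.tensor([[-10.0, -9.0, -10.0]])
target = torch.zeros(1, 3)
weights = torch.tensor([0.5, 0.1, 0.4])

# Element-wise binary cross entropy computed by hand.
pred = torch.sigmoid(logits)
losses = -(target * torch.log(pred) + (1 - target) * torch.log(1 - pred))

plain = BCEWithLogitsLoss()(logits, target)
with_pos_weight = BCEWithLogitsLoss(pos_weight=weights)(logits, target)
with_weight = BCEWithLogitsLoss(weight=weights)(logits, target)

# pos_weight multiplies only the y * log(sigmoid(x)) term, which is zero here,
# so the result should agree with the unweighted loss.
print(torch.allclose(plain, with_pos_weight, rtol=0.05))

# weight rescales every element before the default "mean" reduction,
# so it should agree with the mean (not the sum) of losses * weights.
print(torch.allclose(with_weight, (losses * weights).mean(), rtol=0.05))
```

If per-category weighting of every term is what is wanted, `weight` looks like the closer fit; `pos_weight` is meant for rebalancing positives against negatives within each category.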

## Classification

We use our model `hatedetection.BertForSequenceMultiClassification`, a slight modification of the `transformers` sequence classifier.

It is already trained, so we just load it.

In [7]:
from transformers import AutoTokenizer
from hatedetection import BertForSequenceMultiClassification, extended_hate_categories

model_name = "../models/bert-non-contextualized-hate-category-es/"

model = BertForSequenceMultiClassification.from_pretrained(
    model_name,
    return_dict=True, num_labels=len(extended_hate_categories)
)

model.eval();
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Cap inputs at 128 tokens
tokenizer.model_max_length = 128

I build the trainer anyway, just to run the evaluation.


In [8]:
def tokenize(batch, context=True, padding='max_length', truncation=True):
    """
    Apply tokenization

    Arguments:
    ---------

    context: boolean (default True)
        Whether to pass the context to the tokenizer as a first segment, before the text
    padding, truncation:
        Forwarded to the tokenizer
    """

    if context:
        args = [batch['context'], batch['text']]
    else:
        args = [batch['text']]

    # Forward the arguments instead of hard-coding them
    return tokenizer(*args, padding=padding, truncation=truncation)

batch_size = 32
eval_batch_size = 16

my_tokenize = lambda x: tokenize(x, context=False)

train_dataset = train_dataset.map(my_tokenize, batched=True, batch_size=batch_size)
dev_dataset = dev_dataset.map(my_tokenize, batched=True, batch_size=eval_batch_size)
test_dataset = test_dataset.map(my_tokenize, batched=True, batch_size=eval_batch_size)




In [9]:

def format_dataset(dataset):
    def get_category_labels(examples):
        return {'labels': torch.Tensor([examples[cat] for cat in extended_hate_categories])}
    dataset = dataset.map(get_category_labels)
    dataset.set_format(type='torch', columns=['input_ids', 'token_type_ids', 'attention_mask', 'labels'])
    return dataset

train_dataset = format_dataset(train_dataset)
dev_dataset = format_dataset(dev_dataset)
test_dataset = format_dataset(test_dataset)
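
To make the label format concrete, here is a small sketch of what `get_category_labels` produces for a single row. The category order is an assumption inferred from the metric names reported in the evaluation output of this notebook (the real list lives in `hatedetection.extended_hate_categories`), and the example row is made up.

```python
# Assumed category order; the actual list is
# hatedetection.extended_hate_categories.
categories = ["CALLS", "WOMEN", "LGBTI", "RACISM", "CLASS",
              "POLITICS", "DISABLED", "APPEARANCE", "CRIMINAL"]


def to_multi_hot(example, categories):
    """Return one float per category, in the order given by `categories`."""
    return [float(example[cat]) for cat in categories]


# Hypothetical row: a comment annotated as RACISM and CRIMINAL.
example = {cat: 0 for cat in categories}
example["RACISM"] = 1
example["CRIMINAL"] = 1

labels = to_multi_hot(example, categories)
print(labels)
```

Each row ends up with a fixed-length vector of zeros and ones, which is the target shape that `BCEWithLogitsLoss` expects alongside one logit per category.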



In [10]:
from hatedetection.metrics import compute_category_metrics
from transformers import Trainer, TrainingArguments



training_args = TrainingArguments(
    output_dir='./results',
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=eval_batch_size,
    evaluation_strategy="epoch",
    do_eval=False,
    weight_decay=0.01,
    logging_dir='./logs',
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)

results = []

trainer = Trainer(
    model=model,
    args=training_args,
    compute_metrics=compute_category_metrics,
    train_dataset=train_dataset,
    eval_dataset=dev_dataset,
)

Just a hack to make the output look nice.

In [57]:
import pandas as pd
pd.options.display.max_columns = 40
pd.set_option('display.float_format', lambda x: '%.5f' % x)

df_results = pd.DataFrame([trainer.evaluate(dev_dataset)])

df_results.T

| metric | value |
|---|---|
| eval_loss | 0.19228 |
| eval_calls_f1 | 0.87460 |
| eval_women_f1 | 0.83564 |
| eval_lgbti_f1 | 0.84403 |
| eval_racism_f1 | 0.91676 |
| eval_class_f1 | 0.80978 |
| eval_politics_f1 | 0.82779 |
| eval_disabled_f1 | 0.86148 |
| eval_appearance_f1 | 0.90425 |
| eval_criminal_f1 | 0.85754 |
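
For reference, a per-category F1 report of this shape could be computed roughly as follows. The definition used by `compute_category_metrics` is not shown in this notebook, so this sketch assumes a binary F1 on the positive class of each category; the function names and the toy data are hypothetical.

```python
def f1_score(y_true, y_pred):
    """Binary F1 on the positive class; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def per_category_f1(y_true, y_pred, names):
    """Score each label column independently and key the result by category name."""
    return {
        f"{name.lower()}_f1": f1_score([row[i] for row in y_true],
                                       [row[i] for row in y_pred])
        for i, name in enumerate(names)
    }


# Toy multi-label data: rows are comments, columns are categories.
names = ["WOMEN", "RACISM"]
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [0, 1]]

scores = per_category_f1(y_true, y_pred, names)
print(scores)
```

Scoring each column on its own keeps a frequent category from masking a weak result on a rare one.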


In [12]:

from hatedetection.predict import predict_category

def predict(sentence, context=None):
    return predict_category(model, tokenizer, sentence, context=context)
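
The internals of `predict_category` are not shown here. A common way to decode a multi-label classifier, and the assumption behind this sketch, is to apply a sigmoid to each logit independently and keep the categories above a threshold. The function name, the threshold of 0.5, and the logits are all hypothetical.

```python
import math


def decode_categories(logits, categories, threshold=0.5):
    """Keep every category whose sigmoid probability exceeds `threshold`."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return [cat for cat, p in zip(categories, probs) if p > threshold]


categories = ["CALLS", "WOMEN", "RACISM"]

# Made-up logits for one sentence: only the last two are positive.
selected = decode_categories([-2.0, 0.3, 4.1], categories)
print(selected)

# When every logit is negative, no category is returned.
nothing = decode_categories([-3.0, -1.0, -0.5], categories)
print(nothing)
```

Under this scheme a sentence can receive several categories at once, or none at all, which is consistent with the outputs in the cells that follow.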

In [13]:
predict("Negro chorro de mierda")

['RACISM', 'CRIMINAL']

In [14]:
predict("Quiere bija")

['WOMEN']

In [15]:
predict("Es una tortillera")

['WOMEN']

In [16]:
predict("Le falta verga")

['WOMEN']

In [17]:
predict("Metanles bala!")

['CALLS', 'RACISM', 'CRIMINAL']

In [18]:
predict("Volvete a Israel!")

['RACISM']

In [19]:
predict("Volvete a Arabia, musulmán!")

['RACISM']

In [20]:
predict("Habría que limitar el acceso inmediatamente de todo oriental")

['CALLS', 'RACISM']

In [21]:
predict("Te llamás Raúl querido")

['LGBTI', 'APPEARANCE']

In [22]:
predict("No seas trolo viejo")

['LGBTI', 'APPEARANCE']

In [23]:
predict("A esta TAPU hay que agarrarla de a varios para que sepa lo que es bueno")

['CALLS', 'WOMEN', 'CRIMINAL']

Here it does not find the call to action... now, if we change it slightly...

In [24]:
predict("A esta negra de mierda hay que agarrarla de a varios para que sepa lo que es bueno")

['CALLS', 'WOMEN', 'RACISM', 'CRIMINAL']

In [25]:
predict("Hay que encerrar a todos estos homosexuales de mierda")

['CALLS', 'LGBTI']

In [26]:
predict("Qué asco de putos")

['LGBTI']

In [27]:
predict("Hay que matar a todos los gorilas")

['CALLS', 'POLITICS']

It seems to correlate every call to action with criminals.

In [28]:
predict("orientales")

['RACISM']

In [29]:
predict("Estos come murciélagos te apagan la heladera a la tarde")

['RACISM']

In [30]:
predict("Negros chorros de mierda, cuarenta ladrillos y arreglamos esto")

['CALLS', 'CRIMINAL']

In [31]:
predict("negras al paredón!")

['CALLS', 'WOMEN', 'RACISM', 'CRIMINAL']

In [32]:
predict("No queremos rojitos acá")

['RACISM', 'CLASS']

In [33]:
predict("No les gusta laburar eh")

['CLASS']

In [34]:
predict("Agarrá la pala!")

['CLASS']

In [35]:
predict("Agarrá la pala, vago!")

['CLASS']

In [36]:
predict("Menos ganas de laborar eh")

['CLASS']

In [37]:
predict("Viven del Estado, esta gente sinceramente no tiene remedio")

['CLASS']

In [38]:
predict("Tenés dos pelotas entre las piernas amigazo")

['LGBTI']

In [39]:
predict("jajajajaj pero ni por puta casualidad laburar, no?")

['CLASS']

In [40]:
predict("más topu imposible")

[]

In [41]:
predict("hay que mandarlos a un campo de concentración a estos musulmanes")

['CALLS', 'RACISM']

In [42]:
predict("Fuera rusos!")

['CALLS', 'RACISM']

In [43]:
predict("Todos los indios son una caca que sólo merece morir")

['RACISM']

In [44]:
predict("los judíos merecen morir")

['RACISM']

In [45]:
predict("los chorros merecen morir")

['CALLS', 'CRIMINAL']

In [46]:
predict("los piqueteros merecen morir")

['POLITICS']

In [47]:
predict("los palestinos merecen morir")

['RACISM']

In [48]:
predict("los homosexuales merecen morir")

['LGBTI', 'RACISM']

In [49]:
predict("los homosexuales tienen que ser aislados de la sociedad")

['LGBTI']

In [50]:
predict("los presos tienen que ser aislados de la sociedad")

['CALLS', 'CRIMINAL']

In [51]:
predict("los criminales tienen que ser aislados de la sociedad")

['CALLS', 'RACISM', 'CRIMINAL']

In [52]:
predict("los españoles tienen que ser aislados de la sociedad")

['CALLS', 'RACISM']

In [53]:
predict("los chinos tienen que ser aislados de la sociedad")

['CALLS', 'RACISM']

In [54]:
predict("los italianos tienen que ser aislados de la sociedad")

['CALLS', 'RACISM']

In [55]:
predict("los italianos deben ser fusilados")

['CALLS', 'RACISM']