# Special lecture: Hate speech detection with language models (LLMs)

## What is a language model?
- Paper: https://seongmin-mun.github.io/MyWebsite/Seongmin/Resources/6.Thesis/PhD/manuscript/PhD_Dissertation_Manuscripts.pdf

### Example sentence

![S1](https://seongmin-mun.github.io/AjouUniversityCourse/2024/English%20Language%20and%20Literature/v4/image/S1.png)

### RNN (i.e., Recurrent Neural Network)
- Learning representations by back-propagating errors (1986): https://www.nature.com/articles/323533a0

![S2](https://seongmin-mun.github.io/AjouUniversityCourse/2024/English%20Language%20and%20Literature/v4/image/S2.png)

### Attention - Google
- Attention Is All You Need (2017): https://arxiv.org/abs/1706.03762
- NeurIPS: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
- Summary: https://gbdai.tistory.com/46

![S3](https://seongmin-mun.github.io/AjouUniversityCourse/2024/English%20Language%20and%20Literature/v4/image/S3.png)

![S4](https://seongmin-mun.github.io/AjouUniversityCourse/2024/English%20Language%20and%20Literature/v4/image/S4.png)

![S5](https://seongmin-mun.github.io/AjouUniversityCourse/2024/English%20Language%20and%20Literature/v4/image/S5.png)

### BERT (i.e., Bidirectional Encoder Representations from Transformers) - Google
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2019): https://arxiv.org/abs/1810.04805
- NAACL: https://aclanthology.org/N19-1423/
- Summary: https://misconstructed.tistory.com/43
- Model repository (Hugging Face): https://huggingface.co/docs/transformers/model_doc/bert

### GPT (i.e., Generative Pre-training) - OpenAI
- GPT1 (Improving Language Understanding by Generative Pre-Training; 2018): https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf
- GPT2 (Language Models are Unsupervised Multitask Learners; 2019): https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf
- GPT3 (Language Models are Few-Shot Learners; 2020): https://arxiv.org/pdf/2005.14165
- Summary: https://mr-waguwagu.tistory.com/27
- Summary: https://lcyking.tistory.com/entry/%EB%85%BC%EB%AC%B8%EB%A6%AC%EB%B7%B0-GPT-1Improving-Language-Understandingby-Generative-Pre-Training%EC%9D%98-%EC%9D%B4%ED%95%B4
- Model repository (Hugging Face): https://huggingface.co/docs/transformers/model_doc/gpt2

![S6](https://seongmin-mun.github.io/AjouUniversityCourse/2024/English%20Language%20and%20Literature/v4/image/S6.png)

## What is a hate speech corpus?

### Hate Speech Dataset from a White Supremacy Forum
- ACL: https://aclanthology.org/W18-5102
- Paper: https://aclanthology.org/W18-5102.pdf
- Data: https://github.com/Vicomtech/hate-speech-dataset

### Hate speech classification criteria
- a) deliberate attack
- b) directed towards a specific group of people
- c) motivated by aspects of the group’s identity
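
The three criteria above can be read as a checklist. The sketch below assumes that a sentence is labelled as hate speech only when all three hold at once; the `Annotation` record and its field names are hypothetical and are not part of the dataset's files.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical record of one annotator's judgement for a sentence."""
    deliberate_attack: bool     # criterion (a)
    targets_group: bool         # criterion (b)
    identity_motivated: bool    # criterion (c)


def is_hate_speech(a: Annotation) -> bool:
    # Assumption: the label is "hate" only if all three criteria are met.
    return a.deliberate_attack and a.targets_group and a.identity_motivated


print(is_hate_speech(Annotation(True, True, True)))    # True
print(is_hate_speech(Annotation(True, False, True)))   # False
```

Under this reading, an insult aimed at a single individual, with no reference to a group identity, would fail criteria (b) and (c) and be labelled as non-hate.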

### Example sentences
- Hate speech: “Poor white kids being forced to treat apes and parasites as their equals.”
- Non-hate speech: “Where can I find NS (National Socialism) speeches and music, also historical, in mp3 format for free download on the net.”

# Building a hate speech detection model with GPT

## Setting the path

In [None]:
# Mount Google Drive to this Notebook instance.
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Setting the parameters (1)
- Explainability of neural networks for child language: Agent-First strategy in comprehension of Korean active transitive construction: https://onlinelibrary.wiley.com/doi/10.1111/desc.13405

In [None]:
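## Parameter setting
# Number of training epochs (full passes over the training data).
setEpoch = 10
# Learning rate: how much the model weights change at each update. A larger value gives bigger parameter updates per step; a smaller value gives smaller steps.
setLearningRate = 0.0001
# Batch size: the number of samples used for one weight update.
# For example, if the dataset has 10,000 samples and the batch size is 100, the model processes the batches one after another and performs 100 weight updates per epoch.
setBatch = 16
# Maximum sentence length (in tokens) used for training; longer inputs are truncated.
setMaxLength = 256
# Epsilon: a small constant that the AdamW optimizer adds to the denominator of its update for numerical stability (it avoids division by zero). It does not add randomness and is not an overfitting control.
setEpsilon = 1e-8
# Seed for random number generation. Using the same seed as a previous study starts from the same random state, which makes that study reproducible.
setSeed = 42
# Number of classification labels.
labelNumber = 2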
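
The batch-size arithmetic and the role of epsilon can be checked with a few lines of plain Python. This is a minimal sketch, not the notebook's own code: `updates_per_epoch` and `adam_step` are hypothetical helper names, and `adam_step` shows only the final division of an Adam-style update, assuming the bias-corrected moment estimates `m_hat` and `v_hat` are already available.

```python
import math


def updates_per_epoch(num_samples: int, batch_size: int) -> int:
    # The last batch may be smaller than batch_size, so round up.
    return math.ceil(num_samples / batch_size)


def adam_step(m_hat: float, v_hat: float, lr: float = 0.0001, eps: float = 1e-8) -> float:
    # Size of one Adam-style update; eps keeps the denominator away from zero.
    return lr * m_hat / (math.sqrt(v_hat) + eps)


print(updates_per_epoch(10_000, 100))   # 100, the example from the comment above
print(updates_per_epoch(1_000, 16))     # 63 (1,000 is a hypothetical dataset size)
print(adam_step(0.0, 0.0))              # 0.0 rather than a division-by-zero error
```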

## Checking the hardware

In [None]:
import tensorflow as tf

# Get the GPU device name.
device_name = tf.test.gpu_device_name()

# The device name should look like the following:
if device_name == '/device:GPU:0':
    print('Found GPU at: {}'.format(device_name))
else:
    raise SystemError('GPU device not found')
import torch

# If there's a GPU available...
if torch.cuda.is_available():

    # Tell PyTorch to use the GPU.
    device = torch.device("cuda")

    print('There are %d GPU(s) available.' % torch.cuda.device_count())

    print('We will use the GPU:', torch.cuda.get_device_name(0))

# If not...
else:
    print('No GPU available, using the CPU instead.')
    device = torch.device("cpu")

Found GPU at: /device:GPU:0
There are 1 GPU(s) available.
We will use the GPU: Tesla T4


## Installing required packages

In [None]:
!pip install transformers
!pip install keras_preprocessing

Collecting keras_preprocessing
  Downloading Keras_Preprocessing-1.1.2-py2.py3-none-any.whl (42 kB)
Installing collected packages: keras_preprocessing
Successfully installed keras_preprocessing-1.1.2


## Importing required packages

In [1]:
import tensorflow as tf
import torch

from transformers import BertTokenizer
from transformers import BertForSequenceClassification, BertConfig
from torch.optim import AdamW  # transformers.AdamW is deprecated and absent from recent releases
from transformers import get_linear_schedule_with_warmup
from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler
# from keras.preprocessing.sequence import pad_sequences
from keras_preprocessing.sequence import pad_sequences
from sklearn.model_selection import train_test_split

import pandas as pd
import numpy as np
import random
import time
import datetime

ModuleNotFoundError: No module named 'keras_preprocessing'

## Loading the training and test data

In [None]:
#1. Build Dataset

# Build the test set

testFile = "drive/My Drive/v4/data/test.csv"

fr_test = open(testFile, 'r')
contents_test = fr_test.readlines()
fr_test.close()

testSet = []
numTest = 0
for content in contents_test:
  content = content.replace("\n","")
  if numTest == 0:
    pass
  else:
    testSet.append(content)
  numTest += 1

test = pd.DataFrame(columns=('Label', 'Sentence'))
i = 0
for content in testSet:
  infos = content.split(",")
  label = int(infos[6])
  sentence = infos[5]
  test.loc[i] = [label, sentence]
  i = i + 1

print(test)


# Build the training set

trainFile = "drive/My Drive/v4/data/train.csv"

fr_train = open(trainFile, 'r')
contents_train = fr_train.readlines()
fr_train.close()


trainSet = []
numTrain = 0
for content in contents_train:
  content = content.replace("\n","")
  if numTrain == 0:
    pass
  else:
    trainSet.append(content)
  numTrain += 1

train = pd.DataFrame(columns=('Label', 'Sentence'))
i = 0
for content in trainSet:
  infos = content.split(",")
  label = int(infos[6])
  sentence = infos[5]
  train.loc[i] = [label, sentence]
  i = i + 1

print(train)


     Label                                           Sentence
0        1  All of our tim hortons are staffed by those na...
1        1  Sadly I think the government would step in if ...
2        1  A few weeks ago i started looking into doing t...
3        1  Yes Muslims do breed like vermin  i believe th...
4        1  Maybe it had something to do with the Soviets ...
..     ...                                                ...
195      0  Wells is accused of killing 19 year old Erin J...
196      0  Do they discharge soldiers from the military i...
197      0  I also do n't want harassment in real life  wh...
198      0  I'm looking for any white females in the fort ...
199      0                      I live hour south of Regina .

[200 rows x 2 columns]
      Label                                           Sentence
0         1  With Black girls it takes a baseball bat - who...
1         1  They laugh at whites who obey the law  bother ...
2         1  Maybe God was punishing them f
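
The loading loops above split each line on every comma, which only works if no sentence contains a comma. As a hedged alternative, the standard `csv` module keeps quoted fields intact. The sketch below uses a made-up in-memory sample; the header names `c0` to `c4`, `Sentence`, and `Label` are hypothetical, and only the positions (sentence at index 5, label at index 6) are taken from the code above.

```python
import csv
import io

# Hypothetical two-row sample with the layout the loops above assume:
# the sentence in column index 5 and the label in column index 6.
sample = io.StringIO(
    'c0,c1,c2,c3,c4,Sentence,Label\n'
    'a,b,c,d,e,"Hello, world",0\n'
    'a,b,c,d,e,Some other text,1\n'
)

reader = csv.reader(sample)
next(reader)  # skip the header row
rows = [(int(r[6]), r[5]) for r in reader]
print(rows)  # [(0, 'Hello, world'), (1, 'Some other text')]
```

With a real file, the same reader could be built from `open(path, newline='')` instead of the in-memory sample.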

## Reshaping the data for deep learning (1)

In [None]:
import os
import pandas as pd
from torch.utils.data import Dataset


class Dataset(Dataset):
    def __init__(self, data, Dtype):
        super().__init__()
        self.data = data
        self.Dtype = Dtype

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        record = self.data.iloc[index]
        text = record['Sentence']
        if self.Dtype == "train":
            return {'Sentence': text, 'label': record['Label']}
        else:
            return {'Sentence': text, 'label': '0'}

train_dataset = Dataset(train, "train")
test_dataset = Dataset(test, "test")

print(train_dataset)
print(test_dataset)

<__main__.Dataset object at 0x7fb952512770>
<__main__.Dataset object at 0x7fb952511720>


## Loading the pretrained model
- GPT2Config: https://huggingface.co/transformers/v2.11.0/model_doc/gpt2.html

In [None]:
# 2. Model and Tokenizer
# https://github.com/SKT-AI/KoGPT2
from transformers import set_seed, GPT2LMHeadModel, PreTrainedTokenizerFast, GPT2ForSequenceClassification, GPT2Config

set_seed(731)  # Note: the seed is fixed at 731 here; the setSeed value defined earlier is not used
model_config = GPT2Config.from_pretrained('openai-community/gpt2', num_labels=labelNumber) # Binary Classification
model = GPT2ForSequenceClassification.from_pretrained('openai-community/gpt2', config=model_config)

tokenizer = PreTrainedTokenizerFast.from_pretrained("openai-community/gpt2",
  bos_token='</s>', eos_token='</s>', unk_token='<unk>',
  pad_token='<pad>', mask_token='<mask>')
tokenizer.padding_side = "left" # Very important: the classification head reads the last token, so padding must go on the left
tokenizer.pad_token = tokenizer.eos_token


model.resize_token_embeddings(len(tokenizer))
model.config.pad_token_id = model.config.eos_token_id

print(model)

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at openai-community/gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'GPT2Tokenizer'. 
The class this function is called from is 'PreTrainedTokenizerFast'.


GPT2ForSequenceClassification(
  (transformer): GPT2Model(
    (wte): Embedding(50261, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (score): Linear(in_features=768, out_features=2, bias=False)
)
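
The `padding_side = "left"` setting in the cell above matters because GPT-2's sequence classification head reads the representation of the last token in each row. The plain-Python sketch below illustrates the idea with made-up token ids; `left_pad` is a hypothetical helper and not the tokenizer's actual implementation.

```python
def left_pad(batch, pad_id, max_len=None):
    """Left-pad lists of token ids and build the matching attention masks."""
    if max_len is not None:
        batch = [ids[:max_len] for ids in batch]   # truncate long sequences
    width = max(len(ids) for ids in batch)
    input_ids = [[pad_id] * (width - len(ids)) + ids for ids in batch]
    attention_mask = [[0] * (width - len(ids)) + [1] * len(ids) for ids in batch]
    return input_ids, attention_mask


# Hypothetical token ids; 0 stands in for the padding id.
ids, mask = left_pad([[11, 12, 13], [21]], pad_id=0)
print(ids)    # [[11, 12, 13], [0, 0, 21]]
print(mask)   # [[1, 1, 1], [0, 0, 1]]
```

With padding on the left, the final position of every row holds a real token, so reading the last position never lands on padding.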


## Reshaping the data for deep learning (2)

In [None]:
#3. Data Collator
class Gpt2ClassificationCollator(object):
    def __init__(self, tokenizer, max_seq_len=None):
        self.tokenizer = tokenizer
        self.max_seq_len = max_seq_len

        return

    def __call__(self, sequences):
        texts = [sequence['Sentence'] for sequence in sequences]
        labels = [int(sequence['label']) for sequence in sequences]
        inputs = self.tokenizer(text=texts,
                                return_tensors='pt',
                                padding=True,
                                truncation=True,
                                max_length=self.max_seq_len)
        inputs.update({'labels': torch.tensor(labels)})

        return inputs

gpt2classificationcollator = Gpt2ClassificationCollator(tokenizer=tokenizer,
                                                        max_seq_len=setMaxLength)
print(gpt2classificationcollator)

<__main__.Gpt2ClassificationCollator object at 0x7fb9525119c0>


In [None]:
#4. DataLoader
from torch.utils.data import DataLoader, random_split

train_size = int(len(train_dataset) * 0.9)
val_size = len(train_dataset) - train_size
train_dataset, val_dataset = random_split(train_dataset, [train_size, val_size])

train_dataloader = DataLoader(dataset=train_dataset,
                              batch_size=setBatch,
                              shuffle=True,
                              collate_fn=gpt2classificationcollator)
val_dataloader = DataLoader(dataset=val_dataset,
                            batch_size=setBatch,
                            shuffle=False,
                            collate_fn=gpt2classificationcollator)
test_dataloader = DataLoader(dataset=test_dataset,
                            batch_size=setBatch,
                            shuffle=False,
                            collate_fn=gpt2classificationcollator)

print(test_dataloader)

<torch.utils.data.dataloader.DataLoader object at 0x7fb952629540>
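
The 90/10 split in the cell above can be sketched without PyTorch. This is an illustration of the arithmetic only: `split_sizes` and `random_split_indices` are hypothetical helpers, the dataset size of 1,000 is made up, and the shuffling will not reproduce the exact partition that `random_split` produces.

```python
import random


def split_sizes(n: int, train_fraction: float = 0.9):
    # Same arithmetic as the cell above: round the training share down,
    # and give the remainder to validation.
    train_size = int(n * train_fraction)
    return train_size, n - train_size


def random_split_indices(n: int, train_fraction: float = 0.9, seed: int = 42):
    indices = list(range(n))
    random.Random(seed).shuffle(indices)     # seeded, so the split is repeatable
    train_size, _ = split_sizes(n, train_fraction)
    return indices[:train_size], indices[train_size:]


train_idx, val_idx = random_split_indices(1000)   # 1000 is a hypothetical size
print(len(train_idx), len(val_idx))               # 900 100
```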


## Setting the parameters (2)

In [None]:
#5. Optimizer & Lr Scheduler
from torch.optim import AdamW  # transformers.AdamW is deprecated and absent from recent releases
from transformers import get_cosine_schedule_with_warmup

total_epochs = setEpoch

param_optimizer = list(model.named_parameters())
no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']
optimizer_grouped_parameters = [
    {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)], 'weight_decay': 0.01},
    {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)], 'weight_decay': 0.0}
]
optimizer = AdamW(optimizer_grouped_parameters,
                  lr=setLearningRate,
                  eps=setEpsilon)

num_train_steps = len(train_dataloader) * total_epochs
num_warmup_steps = int(num_train_steps * 0.1)

lr_scheduler = get_cosine_schedule_with_warmup(optimizer,
                                              num_warmup_steps=num_warmup_steps,
                                              num_training_steps = num_train_steps)
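
To see what the scheduler does to the learning rate, the sketch below computes a linear warmup followed by a half-cosine decay in plain Python. It is meant to mirror the default shape of `get_cosine_schedule_with_warmup` as far as I understand it; `cosine_warmup_multiplier` is a hypothetical name and the step counts are made up.

```python
import math


def cosine_warmup_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Factor applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear warmup from 0 to 1.
        return step / max(1, warmup_steps)
    # Cosine decay from 1 to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


base_lr = 0.0001             # setLearningRate from the notebook
total, warmup = 1000, 100    # hypothetical step counts (10% warmup, as above)
for step in (0, 50, 100, 550, 1000):
    print(step, base_lr * cosine_warmup_multiplier(step, warmup, total))
```

The learning rate starts at zero, reaches its full value at the end of warmup, and falls back to zero at the final step.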



## Training the model (as functions)

In [None]:
#6. Train & Validation
import torch

def train(dataloader, optimizer, scheduler, device_):
    global model
    model.train()

    prediction_labels = []
    true_labels = []

    total_loss = []

    for batch in dataloader:
        true_labels += batch['labels'].numpy().flatten().tolist()
        batch = {k:v.type(torch.long).to(device_) for k, v in batch.items()}


        outputs = model(**batch)
        loss, logits = outputs[:2]
        logits = logits.detach().cpu().numpy()
        total_loss.append(loss.item())

        optimizer.zero_grad()
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0) # prevent exploding gradient

        optimizer.step()
        scheduler.step()

        prediction_labels += logits.argmax(axis=-1).flatten().tolist()

    return true_labels, prediction_labels, total_loss

def validation(dataloader, device_):
    global model
    model.eval()

    prediction_labels = []
    true_labels = []

    embedding_outputs = []

    total_loss = []

    outputs = []

    for batch in dataloader:
        true_labels += batch['labels'].numpy().flatten().tolist()
        batch = {k:v.type(torch.long).to(device_) for k, v in batch.items()}

        with torch.no_grad():
            outputs = model(**batch)
            loss, logits = outputs[:2]
            logits = logits.detach().cpu().numpy()
            total_loss.append(loss.item())

            prediction_labels += logits.argmax(axis=-1).flatten().tolist()

            embedding_outputs += logits.tolist()

    return true_labels, prediction_labels, total_loss, outputs, embedding_outputs
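
The `clip_grad_norm_` call in `train` rescales the gradients when their combined norm exceeds the limit of 1.0. The sketch below shows that idea on a flat list of numbers; `clip_by_global_norm` is a hypothetical helper, and the small `eps` term is an assumption made here to avoid dividing by zero.

```python
import math


def clip_by_global_norm(grads, max_norm=1.0, eps=1e-6):
    """Scale all gradients so that their combined L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


clipped, norm = clip_by_global_norm([3.0, 4.0])   # norm is 5.0, so the scale is about 0.2
print(norm, clipped)
small, _ = clip_by_global_norm([0.3, 0.4])        # norm is about 0.5, so nothing changes
print(small)
```

Because every gradient is multiplied by the same factor, the direction of the update is preserved and only its length is limited.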

## Defining a function to label the predictions

In [5]:
#7. Predicted label
def outreault(guess):
  guess = int(guess)
  outClass = ""
  if guess == 0:
      outClass = "noHate"
  elif guess == 1:
      outClass = "hate"

  return outClass
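
The function above maps a class index to a name. The index itself comes from taking the argmax over the model's two logits, and a softmax turns those logits into probabilities. A minimal sketch in plain Python, with hypothetical helper names and made-up logits:

```python
import math

LABEL_NAMES = {0: "noHate", 1: "hate"}   # same mapping as the function above


def softmax(logits):
    peak = max(logits)                                # subtract the max for stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)   # argmax
    return LABEL_NAMES[label], probs[label]


print(predict([-1.2, 2.3]))   # hypothetical logits for one sentence
```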

## Running the training

In [None]:
#8. Run
from sklearn.metrics import classification_report, accuracy_score

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)

all_loss = {'train_loss': [], 'val_loss': []}
all_acc = {'train_acc': [], 'val_acc': []}
outputs = []

for epoch in range(total_epochs):
    y, y_pred, train_loss = train(train_dataloader, optimizer, lr_scheduler, device)
    train_acc = accuracy_score(y, y_pred)

    y, y_pred, val_loss, outputs, logits_labels = validation(val_dataloader, device)
    val_acc = accuracy_score(y, y_pred)

    all_loss['train_loss'] += train_loss
    all_loss['val_loss'] += val_loss

    all_acc['train_acc'].append(train_acc)
    all_acc['val_acc'].append(val_acc)

    print("")
    print('======== Epoch {:} / {:} ========'.format(epoch + 1, total_epochs))
    print('Training...')

    print(f'Epoch: {epoch}, train_loss: {torch.tensor(train_loss).mean():.3f}, train_acc: {train_acc:.3f}, val_loss: {torch.tensor(val_loss).mean():.3f}, val_acc: {val_acc:.3f}')

print("")
print("Training complete!")


Training...
Epoch: 0, train_loss: 0.750, train_acc: 0.540, val_loss: 0.797, val_acc: 0.583

Training...
Epoch: 1, train_loss: 0.622, train_acc: 0.662, val_loss: 0.546, val_acc: 0.700

Training...
Epoch: 2, train_loss: 0.508, train_acc: 0.768, val_loss: 0.393, val_acc: 0.811

Training...
Epoch: 3, train_loss: 0.435, train_acc: 0.820, val_loss: 0.387, val_acc: 0.850

Training...
Epoch: 4, train_loss: 0.288, train_acc: 0.883, val_loss: 0.455, val_acc: 0.828

Training...
Epoch: 5, train_loss: 0.231, train_acc: 0.921, val_loss: 0.447, val_acc: 0.822

Training...
Epoch: 6, train_loss: 0.146, train_acc: 0.946, val_loss: 0.621, val_acc: 0.833

Training...
Epoch: 7, train_loss: 0.112, train_acc: 0.960, val_loss: 0.776, val_acc: 0.839

Training...
Epoch: 8, train_loss: 0.103, train_acc: 0.972, val_loss: 0.860, val_acc: 0.839

Training...
Epoch: 9, train_loss: 0.081, train_acc: 0.978, val_loss: 0.881, val_acc: 0.844

Training complete!
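
In the log above, the training loss keeps falling while the validation loss is lowest at epoch 3 and rises afterwards, which is a common sign of overfitting. One way to act on this is to keep the checkpoint with the lowest validation loss, or to stop early. The sketch below uses the validation losses copied from the log; `early_stopping_epoch` and the patience of 2 are illustrative assumptions and are not part of the notebook.

```python
# Validation loss per epoch, copied from the log above (epochs 0 to 9).
val_loss = [0.797, 0.546, 0.393, 0.387, 0.455, 0.447, 0.621, 0.776, 0.860, 0.881]

best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
print(best_epoch, val_loss[best_epoch])   # 3 0.387


def early_stopping_epoch(losses, patience=2):
    """Return the epoch at which training would stop under a simple patience rule."""
    best, waited = float("inf"), 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(losses) - 1


print(early_stopping_epoch(val_loss))     # 5
```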


# Classifying sentences with the trained model

## Entering the test sentences

In [None]:
inputList = ["Hi, it's a honor to see you here!","Humans and negroes use restaurants in entirely different ways .", "Ukraine needs to rid itself of these Jewish Oligarchs ."]
print(inputList)

["Hi, it's a honor to see you here!", 'Humans and negroes use restaurants in entirely different ways .', 'Ukraine needs to rid itself of these Jewish Oligarchs .']


## Preparing the input sentences, running the model, and checking the results

In [None]:
def inputRefined(listdata):
  data = pd.DataFrame(columns=('Label', 'Sentence'))
  i = 0
  for content in listdata:
    label = 0
    sentence = content
    data.loc[i] = [label, sentence]
    i = i + 1
  return data

In [None]:
test_input = inputRefined(inputList)
test_input_dataset = Dataset(test_input, "test")
test_input_dataloader = DataLoader(dataset=test_input_dataset, batch_size=setBatch, shuffle=False, collate_fn=gpt2classificationcollator)
y, y_pred, val_loss, outputs, logits_labels = validation(test_input_dataloader, device)

for each in range(0, len(inputList)):
    guess = str(y_pred[each])
    print("sentence: ", test_input['Sentence'][each], "   predictedLabel: ", str(guess) + " (" + outreault(guess) + ")")

sentence:  Hi, it's a honor to see you here!    predictedLabel:  0 (noHate)
sentence:  Humans and negroes use restaurants in entirely different ways .    predictedLabel:  1 (hate)
sentence:  Ukraine needs to rid itself of these Jewish Oligarchs .    predictedLabel:  1 (hate)


## Saving and uploading the model
- huggingface: https://huggingface.co/simonmun

In [None]:
model.save_pretrained('drive/My Drive/v4/model/')
tokenizer.save_pretrained('drive/My Drive/v4/model/')

('drive/My Drive/v4/model/tokenizer_config.json',
 'drive/My Drive/v4/model/special_tokens_map.json',
 'drive/My Drive/v4/model/tokenizer.json')

## Using the uploaded model

In [2]:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="simonmun/HSC_gpt2")

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [3]:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("simonmun/HSC_gpt2")
model = AutoModelForSequenceClassification.from_pretrained("simonmun/HSC_gpt2")

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [9]:
import torch

inputs = ["Hello, my dog is cute", "Humans and negroes use restaurants in entirely different ways ."]

for each in inputs:

  # Tokenize one sentence; the name avoids shadowing the built-in input()
  encoded = tokenizer(each, return_tensors="pt")

  with torch.no_grad():
      logits = model(**encoded).logits
      # print(logits)
      prediction_labels = logits.argmax(axis=-1).flatten().tolist()
      guess = prediction_labels[0]
      print("The sentence", "'"+each+"'", "is classified as", outreault(guess) + ".")

The sentence 'Hello, my dog is cute' is classified as noHate.
The sentence 'Humans and negroes use restaurants in entirely different ways .' is classified as hate.
