### Model Compression and Building Efficient Transformers

1. **Knowledge distillation**
2. **Quantization**
3. **Pruning**
4. Graph optimization with **ONNX**

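### Intent Detection Task

- A lightweight model can detect and classify a customer's intent quickly, so the requested information is delivered without delay <br>
    (e.g., a chatbot that identifies the customer's intent and provides account information without a human agent)
- If a query does not match any of the predefined intents, the system must return a fallback response.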
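The fallback behaviour described in the list above can be sketched roughly as follows. This is a minimal illustration, not the notebook's implementation: `FALLBACK_RESPONSE`, `RESPONSES`, `respond`, `fake_classifier`, and the 0.5 threshold are all hypothetical, and it assumes the out-of-scope class is labelled `oos`. The stand-in classifier only mimics the list-of-dicts shape that a text-classification pipeline returns.

```python
# Hypothetical names and values, chosen only for illustration.
FALLBACK_RESPONSE = "Sorry, I didn't understand that. Could you rephrase?"
RESPONSES = {
    "car_rental": "Sure, let's find you a rental vehicle.",
    "transfer": "Okay, let's set up your transfer.",
}


def respond(query, classify, threshold=0.5):
    """Return the canned response for the predicted intent, or the fallback."""
    top = classify(query)[0]  # top prediction: {"label": ..., "score": ...}
    label, score = top["label"], top["score"]
    # Fall back when the query is out of scope, unknown, or low confidence
    if label == "oos" or label not in RESPONSES or score < threshold:
        return FALLBACK_RESPONSE
    return RESPONSES[label]


def fake_classifier(query):
    """Stand-in for the real pipeline so the sketch runs without a model."""
    if "rent" in query:
        return [{"label": "car_rental", "score": 0.55}]
    return [{"label": "oos", "score": 0.90}]


print(respond("i'd like to rent a vehicle", fake_classifier))
print(respond("what is the meaning of life", fake_classifier))
```

Passing the classifier in as an argument keeps the routing logic independent of the model, so the same function could be reused with a compressed model later.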

In [1]:
from transformers import pipeline

bert_ckpt = "transformersbook/bert-base-uncased-finetuned-clinc"
pipe = pipeline("text-classification", model=bert_ckpt)

Device set to use cuda:0


In [2]:
query = """hey, i'd like to rent a vehicle
from nov 1st to nov 15th in paris 
and i need a 15 passenger van"""

pipe(query)

[{'label': 'car_rental', 'score': 0.5490033626556396}]

### Defining a Helper Class for Efficient Benchmarking

In [3]:
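class PerformanceBenchMark:
    """Collects size, latency, and accuracy metrics for a single pipeline."""

    def __init__(self, pipeline, dataset, optim_type="BERT baseline"):
        self.pipeline = pipeline      # text-classification pipeline to evaluate
        self.dataset = dataset        # dataset split used for evaluation
        self.optim_type = optim_type  # key under which the metrics are stored

    def compute_accuracy(self):
        # Placeholder; expected to return a dict of metrics
        pass

    def compute_size(self):
        # Placeholder; expected to return a dict of metrics
        pass

    def time_pipeline(self):
        # Placeholder; expected to return a dict of metrics
        pass

    def run_benchmark(self):
        # Merge the three metric dicts under the optimization type's name
        metrics = {}
        metrics[self.optim_type] = self.compute_size()
        metrics[self.optim_type].update(self.time_pipeline())
        metrics[self.optim_type].update(self.compute_accuracy())

        return metrics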
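The class above leaves `time_pipeline` as a placeholder. One possible way to fill it in is sketched below as a standalone function. It is only an assumption about how latency could be measured: `time_callable`, `fake_pipeline`, the warm-up count, the run count, and the returned key names are hypothetical and not taken from the notebook.

```python
import statistics
from time import perf_counter


def time_callable(fn, query, n_warmup=10, n_runs=100):
    """Measure the latency of fn(query) in milliseconds."""
    # Warm-up calls so that one-off costs do not skew the measurements
    for _ in range(n_warmup):
        fn(query)

    latencies = []
    for _ in range(n_runs):
        start = perf_counter()
        fn(query)
        latencies.append((perf_counter() - start) * 1000)

    # Returning a dict matches what run_benchmark merges with .update()
    return {
        "time_avg_ms": statistics.mean(latencies),
        "time_std_ms": statistics.pstdev(latencies),
    }


def fake_pipeline(text):
    """Stand-in for the real pipeline so the sketch runs without a model."""
    return [{"label": "car_rental", "score": 0.55}]


timings = time_callable(fake_pipeline, "i need a 15 passenger van")
print(f"Average latency (ms): {timings['time_avg_ms']:.4f} "
      f"+/- {timings['time_std_ms']:.4f}")
```

Inside the class, the same loop would call `self.pipeline` instead of `fn`. Timing a single query keeps the measurement simple, but it does not capture how latency varies with input length.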

In [36]:
# Load the dataset
from datasets import load_dataset

clinic = load_dataset("clinc_oos", "plus")

In [37]:
intents = clinic["test"].features["intent"]
sample_feature_int = clinic["test"][42]["intent"]

intents.int2str(sample_feature_int)

'transfer'
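The `int2str` call above maps an integer class id back to its intent name, and `str2int` goes the other way. The toy class below imitates that round trip to make the mapping explicit. `TinyClassLabel` and the three-name list are hypothetical. The real `intent` feature holds many more intents, and its ids will differ from the ones used here.

```python
class TinyClassLabel:
    """Minimal stand-in that mimics an int2str / str2int round trip."""

    def __init__(self, names):
        self.names = list(names)
        # Reverse lookup from intent name to integer id
        self._str2int = {name: idx for idx, name in enumerate(self.names)}

    def int2str(self, idx):
        return self.names[idx]

    def str2int(self, name):
        return self._str2int[name]


# Hypothetical subset of intent names; the ids are illustrative only
toy_intents = TinyClassLabel(["transfer", "car_rental", "oos"])

print(toy_intents.int2str(0))
print(toy_intents.str2int("car_rental"))
```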

In [38]:
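# Metrics
from evaluate import load

accuracy_score = load("accuracy")

def compute_accuracy(self):
    """Compute the pipeline's accuracy on self.dataset."""
    preds, labels = [], []
    for example in self.dataset:
        # The pipeline returns a list of dicts; take the top prediction's label
        pred = self.pipeline(example["text"])[0]["label"]
        label = example["intent"]
        # Convert the predicted label name to its integer id before comparing
        preds.append(intents.str2int(pred))
        labels.append(label)
    accuracy = accuracy_score.compute(predictions=preds, references=labels)
    print(f"Accuracy: {accuracy['accuracy']:.3f}")
    return accuracy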

PerformanceBenchMark.compute_accuracy = compute_accuracy
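An accuracy loop of this kind can be sanity-checked without loading the model or the dataset. The sketch below runs the same steps against small stand-ins: predict, convert the label name to an id, then compare with the reference. `accuracy_from_pipeline`, `toy_pipeline`, `toy_dataset`, and `toy_label_ids` are all hypothetical, and accuracy is computed by hand as the fraction of matching predictions rather than through the `evaluate` library.

```python
def accuracy_from_pipeline(pipeline, dataset, str2int):
    """Fraction of examples whose top predicted intent matches the reference."""
    preds, labels = [], []
    for example in dataset:
        # Pipeline output is a list of dicts; take the top prediction's label
        top_label = pipeline(example["text"])[0]["label"]
        preds.append(str2int(top_label))
        labels.append(example["intent"])
    correct = sum(p == l for p, l in zip(preds, labels))
    return {"accuracy": correct / len(labels)}


# Hypothetical label ids and examples, for illustration only
toy_label_ids = {"transfer": 0, "car_rental": 1}
toy_dataset = [
    {"text": "send 100 dollars to savings", "intent": 0},
    {"text": "i need to rent a van", "intent": 1},
    {"text": "book me a car for friday", "intent": 1},
    {"text": "move money to checking", "intent": 0},
]


def toy_pipeline(text):
    """Keyword-based stand-in; it misses the third example on purpose."""
    label = "car_rental" if "rent" in text else "transfer"
    return [{"label": label, "score": 1.0}]


result = accuracy_from_pipeline(toy_pipeline, toy_dataset, toy_label_ids.get)
print(result["accuracy"])  # 0.75
```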