In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
cd /content/drive/MyDrive/

/content/drive/MyDrive


In [None]:
# Transformers installation
! pip install transformers datasets
# To install from source instead of the last release, comment the command above and uncomment the following one.
# ! pip install git+https://github.com/huggingface/transformers.git

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.25.1-py3-none-any.whl (5.8 MB)
Collecting datasets
  Downloading datasets-2.8.0-py3-none-any.whl (452 kB)
Collecting huggingface-hub<1.0,>=0.10.0
  Downloading huggingface_hub-0.11.1-py3-none-any.whl (182 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
Collecting multiprocess
  Downloading multiprocess-0.70.14-py38-none-any.whl (132 kB)
Collecting xxhash
  Downloading xxhash-3.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.w

# Text classification

In [None]:
#@title
from IPython.display import HTML

HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/leNG9fN9FQU?rel=0&amp;controls=0&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>')

Text classification is a common NLP task that assigns a label or class to text. Some of the largest companies run text classification in production for a wide range of practical applications. One of the most popular forms of text classification is sentiment analysis, which assigns a label like 🙂 positive, 🙁 negative, or 😐 neutral to a sequence of text. 

This guide will show you how to:

1. Finetune [DistilBERT](https://huggingface.co/distilbert-base-uncased) on the [SST-2](https://huggingface.co/datasets/sst2) dataset to determine whether a sentence from a movie review is positive or negative.
2. Use your finetuned model for inference.

<Tip>

See the text classification [task page](https://huggingface.co/tasks/text-classification) for more information about other forms of text classification and their associated models, datasets, and metrics.

</Tip>

Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install transformers datasets evaluate
```

We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:

In [None]:
from huggingface_hub import notebook_login

# Use a token with write access. Enter it at the prompt instead of storing it in the notebook.
notebook_login()

Token is valid.
Your token has been saved in your configured git credential helpers (store).
Your token has been saved to /root/.huggingface/token
Login successful


## Load SST-2 dataset

Start by loading the SST-2 (Stanford Sentiment Treebank) dataset from the 🤗 Datasets library:

In [None]:
from datasets import load_dataset

# The original guide uses the Internet Movie Database (IMDb) dataset:
# imdb = load_dataset("imdb")

# This notebook uses SST-2 instead: https://huggingface.co/datasets/sst2
sst2 = load_dataset('sst2')

# Alternative: load SST-2 as part of the GLUE benchmark.
# sst2 = load_dataset('glue', 'sst2')



  0%|          | 0/3 [00:00<?, ?it/s]

Then take a look at an example:

In [None]:
'sentence' in sst2['train'].features

True

In [None]:
sst2['train'].features

{'idx': Value(dtype='int32', id=None),
 'sentence': Value(dtype='string', id=None),
 'label': ClassLabel(names=['negative', 'positive'], id=None)}

In [None]:
sst2["test"][2]

{'idx': 2,
 'sentence': 'by the end of no such thing the audience , like beatrice , has a watchful affection for the monster .',
 'label': -1}

In [None]:
sst2["train"][2]

{'idx': 2,
 'sentence': 'that loves its characters and communicates something rather beautiful about human nature ',
 'label': 1}

In [None]:
sst2["validation"][2]

{'idx': 2,
 'sentence': 'allows us to hope that nolan is poised to embark a major career as a commercial yet inventive filmmaker . ',
 'label': 1}

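There are three fields in this dataset:

- `idx`: the index of the example within its split.
- `sentence`: the text of a sentence taken from a movie review.
- `label`: a value that is either `0` for a negative review or `1` for a positive review. In the `test` split every label is `-1` because the test labels are not public.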

## Preprocess

The next step is to load a DistilBERT tokenizer to preprocess the `sentence` field:

* A brief description of DistilBERT
  - This model is a distilled version of the [BERT base model](https://huggingface.co/bert-base-uncased). 
  - It was introduced in this [paper](https://arxiv.org/pdf/1910.01108).
  - The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation).
  - This model is uncased: it does not make a difference between english and English.

In [None]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--distilbert-base-uncased/snapshots/1c4513b2eedbda136f57676a34eea67aba266e5c/config.json
Model config DistilBertConfig {
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.25.1",
  "vocab_size": 30522
}

loading file vocab.txt from cache at /root/.cache/huggingface/hub/models--distilbert-base-uncased/snapshots/1c4513b2eedbda136f57676a34eea67aba266e5c/vocab.txt
loading file tokenizer.json from cache at /root/.cache/huggingface/hub/models--distilbert-base-uncased/snapsh

Create a preprocessing function to tokenize `sentence` and truncate sequences to be no longer than DistilBERT's maximum input length:

In [None]:
def preprocess_function(examples):
    # IMDb stores its text in the `text` column:
    # return tokenizer(examples["text"], truncation=True)
    # SST-2 stores its text in the `sentence` column:
    return tokenizer(examples["sentence"], truncation=True)

To apply the preprocessing function over the entire dataset, use 🤗 Datasets [map](https://huggingface.co/docs/datasets/main/en/package_reference/main_classes#datasets.Dataset.map) function. You can speed up `map` by setting `batched=True` to process multiple elements of the dataset at once:

* Apply the preprocessing function to the entire dataset
* Use the [map](https://huggingface.co/docs/datasets/main/en/package_reference/main_classes#datasets.Dataset.map) function to do so
* Set `batched=True` --> multiple elements of the dataset are processed at once --> faster

In [None]:
tokenized_sst2 = sst2.map(preprocess_function, batched=True)



  0%|          | 0/1 [00:00<?, ?ba/s]



Now create a batch of examples using [DataCollatorWithPadding](https://huggingface.co/docs/transformers/main/en/main_classes/data_collator#transformers.DataCollatorWithPadding). It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.

In [None]:
from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
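
To see why dynamic padding saves work, the following is a minimal pure-Python sketch, not the collator's actual implementation. The helper `pad_batch` and the token-id sequences are hypothetical and only for illustration; the pad id `0` and the maximum length `512` are taken from the DistilBERT config printed earlier.

```python
from typing import List

PAD_ID = 0        # pad_token_id from the DistilBERT config shown above
MAX_LENGTH = 512  # max_position_embeddings from the same config


def pad_batch(batch: List[List[int]], length: int) -> List[List[int]]:
    """Right-pad every sequence in the batch with PAD_ID up to `length` (hypothetical helper)."""
    return [seq + [PAD_ID] * (length - len(seq)) for seq in batch]


# Hypothetical token-id sequences of different lengths (not real SST-2 data).
batch = [[101, 2023, 102], [101, 2023, 2001, 1037, 102]]

# Dynamic padding: pad only up to the longest sequence in this batch.
dynamic = pad_batch(batch, max(len(seq) for seq in batch))

# Static padding: pad every sequence up to the model maximum.
static = pad_batch(batch, MAX_LENGTH)

dynamic_tokens = sum(len(seq) for seq in dynamic)
static_tokens = sum(len(seq) for seq in static)
print(dynamic_tokens, static_tokens)
```

For this toy batch the model would process 10 token positions with dynamic padding and 1024 with static padding, which is why short inputs such as SST-2 sentences benefit from it.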

## Evaluate

### method 1

Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [accuracy](https://huggingface.co/spaces/evaluate-metric/accuracy) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):

In [None]:
pip install evaluate

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting evaluate
  Downloading evaluate-0.4.0-py3-none-any.whl (81 kB)
Installing collected packages: evaluate
Successfully installed evaluate-0.4.0


In [None]:
import evaluate

accuracy = evaluate.load("accuracy")

Downloading builder script:   0%|          | 0.00/4.20k [00:00<?, ?B/s]

Then create a function that passes your predictions and labels to [compute](https://huggingface.co/docs/evaluate/main/en/package_reference/main_classes#evaluate.EvaluationModule.compute) to calculate the accuracy:

In [None]:
import numpy as np

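# Function for computing accuracy
def compute_metrics(eval_pred):
    # Model predictions (logits) and ground-truth labels
    predictions, labels = eval_pred

    # np.argmax() / np.argmin() return the index of the largest / smallest value:
    # axis=None : index into the array flattened to 1-D
    # axis=1 : compare the elements within each row (one result per row)
    # axis=0 : compare the elements within each column (one result per column)
    predictions = np.argmax(predictions, axis=1)

    # Compute the accuracy with the evaluate library
    return accuracy.compute(predictions=predictions, references=labels)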

Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
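
As a rough illustration of what the row-wise argmax and the accuracy computation do, here is a dependency-free sketch. The helpers `argmax_rows` and `accuracy_from_logits` and the toy logits are hypothetical; they only mirror the idea, not the Evaluate library's code.

```python
from typing import Dict, List, Sequence


def argmax_rows(logits: Sequence[Sequence[float]]) -> List[int]:
    """For each row, return the index of its largest value (like np.argmax(..., axis=1))."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def accuracy_from_logits(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> Dict[str, float]:
    """Turn per-class scores into class ids and report the fraction that match the labels."""
    predicted = argmax_rows(logits)
    correct = sum(int(p == y) for p, y in zip(predicted, labels))
    return {"accuracy": correct / len(labels)}


# Hypothetical logits for four examples with two classes each.
toy_logits = [[2.0, -1.0], [-0.5, 0.3], [0.1, 0.2], [1.5, 1.4]]
toy_labels = [0, 1, 0, 0]

result = accuracy_from_logits(toy_logits, toy_labels)
print(result)
```

Three of the four toy predictions match their labels, so the sketch reports an accuracy of 0.75. The result is a dictionary because the Trainer expects metric names mapped to values.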

### method 2

In [None]:
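import numpy as np

def compute_metrics(eval_pred):
    # Model predictions (logits) and ground-truth labels
    predictions, labels = eval_pred

    # Convert the logits to class ids before comparing them with the labels
    predictions = np.argmax(predictions, axis=1)

    # Trainer expects a dictionary that maps metric names to values
    return {"accuracy": float(np.mean(predictions == np.array(labels)))}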

## Train

Before you start training your model, create a map of the expected ids to their labels with `id2label` and `label2id`:

In [None]:
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
label2id = {"NEGATIVE": 0, "POSITIVE": 1}
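
One way to keep the two mappings consistent is to derive one from the other, so a label added later cannot end up in only one of them. This is a small sketch of that idea, not something the guide requires:

```python
# The id-to-label mapping used in this notebook.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

# Derive the reverse mapping instead of typing it by hand.
label2id = {label: idx for idx, label in id2label.items()}

# Round-trip check: every id maps to a label and back to the same id.
round_trip_ok = all(label2id[id2label[idx]] == idx for idx in id2label)
print(label2id, round_trip_ok)
```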

<Tip>

If you aren't familiar with finetuning a model with the [Trainer](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer), take a look at the basic tutorial [here](https://huggingface.co/docs/transformers/main/en/tasks/../training#train-with-pytorch-trainer)!

</Tip>

You're ready to start training your model now! Load DistilBERT with [AutoModelForSequenceClassification](https://huggingface.co/docs/transformers/main/en/model_doc/auto#transformers.AutoModelForSequenceClassification) along with the number of expected labels, and the label mappings:

* Load DistilBERT
* Pass the expected number of labels and the label mappings along with it

In [None]:
from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
)

loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--distilbert-base-uncased/snapshots/1c4513b2eedbda136f57676a34eea67aba266e5c/config.json
Model config DistilBertConfig {
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "id2label": {
    "0": "NEGATIVE",
    "1": "POSITIVE"
  },
  "initializer_range": 0.02,
  "label2id": {
    "NEGATIVE": 0,
    "POSITIVE": 1
  },
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.25.1",
  "vocab_size": 30522
}

loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--distilbert-base-uncased/snapshots/1c4513b2eedbda136f57676a

At this point, only three steps remain:

1. Define your training hyperparameters in [TrainingArguments](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.TrainingArguments). The only required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [Trainer](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer) will evaluate the accuracy and save the training checkpoint.
2. Pass the training arguments to [Trainer](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer) along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [train()](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer.train) to finetune your model.

### Method of Training 1 ( original method )
* Evaluate once per epoch and save a checkpoint once per epoch.

In [None]:
training_args = TrainingArguments(
    # The output directory where the model predictions and checkpoints will be written.
    output_dir="distilbert-base-uncased_1",

    # The initial learning rate for AdamW optimizer.
    learning_rate=2e-5,   # learning rate of the AdamW optimizer

    # The batch size per GPU/TPU core/CPU for training.
    per_device_train_batch_size=16,

    # The batch size per GPU/TPU core/CPU for evaluation.
    per_device_eval_batch_size=16,

    # Total number of training epochs to perform 
    # (if not an integer, will perform the decimal part percents of the last epoch before stopping training).
    num_train_epochs=2,

    # The weight decay to apply (if not zero) to all layers except all bias and LayerNorm weights in AdamW optimizer.
    weight_decay=0.01,

    # The evaluation strategy to adopt during training. Possible values are:
    #   "no": No evaluation is done during training.
    #   "steps": Evaluation is done (and logged) every eval_steps.
    #   "epoch": Evaluation is done at the end of each epoch.
    evaluation_strategy="epoch",    # evaluate once per epoch during training

    # The checkpoint save strategy to adopt during training. Possible values are:
    #   "no": No save is done during training.
    #   "epoch": Save is done at the end of each epoch.
    #   "steps": Save is done every save_steps.
    save_strategy="epoch",    # save a checkpoint once per epoch

    # Whether or not to load the best model found during training at the end of training.
    load_best_model_at_end=True,    # reload the best checkpoint found during training once training ends

    # Whether or not to push the model to the Hub every time the model is saved. 
    # If this is activated, output_dir will begin a git directory synced with the repo (determined by hub_model_id) 
    # and the content will be pushed each time a save is triggered (depending on your save_strategy). 
    # Calling save_model() will also trigger a push.
    push_to_hub=True,
)

from transformers import Trainer

trainer = Trainer(
    # The model to train, evaluate or use for predictions. If not provided, a model_init must be passed.
    model=model,

    # The TrainingArguments that configure this training run.
    args=training_args,

    # The dataset to use for training. 
    train_dataset=tokenized_sst2["train"],

    # The dataset to use for evaluation.
    # eval_dataset=tokenized_sst2["test"],
    eval_dataset=tokenized_sst2["validation"],

    # The tokenizer used to preprocess the data.
    tokenizer=tokenizer,

    # The function to use to form a batch from a list of elements of train_dataset or eval_dataset.
    data_collator=data_collator,

    # The function that will be used to compute metrics at evaluation. 
    # Must take a EvalPrediction and return a dictionary string to metric values.
    compute_metrics=compute_metrics,
)

trainer.train()

PyTorch: setting up devices
The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
Cloning https://huggingface.co/mdj1412/distilbert-base-uncased_1 into local empty directory.
The following columns in the training set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: idx, sentence. If idx, sentence are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 67349
  Num Epochs = 2
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 8420
  Number of trainable parameters = 66955010
You're using a DistilBertToke

Epoch,Training Loss,Validation Loss,Accuracy
1,0.1846,0.308542,0.908257
2,0.1141,0.336396,0.913991


The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: idx, sentence. If idx, sentence are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 872
  Batch size = 16
Saving model checkpoint to distilbert-base-uncased_1/checkpoint-4210
Configuration saved in distilbert-base-uncased_1/checkpoint-4210/config.json
Model weights saved in distilbert-base-uncased_1/checkpoint-4210/pytorch_model.bin
tokenizer config file saved in distilbert-base-uncased_1/checkpoint-4210/tokenizer_config.json
Special tokens file saved in distilbert-base-uncased_1/checkpoint-4210/special_tokens_map.json
tokenizer config file saved in distilbert-base-uncased_1/tokenizer_config.json
Special tokens file saved in distilbert-base-uncased_1/special_tokens_map.json
The following columns in the evaluation set don't have a correspo

TrainOutput(global_step=8420, training_loss=0.1762660916797339, metrics={'train_runtime': 649.2512, 'train_samples_per_second': 207.467, 'train_steps_per_second': 12.969, 'total_flos': 1227916383406536.0, 'train_loss': 0.1762660916797339, 'epoch': 2.0})
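
The step counts in the log above follow from the dataset size and batch size. The sketch below reproduces that arithmetic, assuming a single device and no gradient accumulation, as the log reports:

```python
import math

num_examples = 67349   # "Num examples" in the training log
batch_size = 16        # per_device_train_batch_size on a single device
num_epochs = 2         # num_train_epochs

# The last batch of each epoch is smaller, so round up.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
last_batch_size = num_examples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, total_steps, last_batch_size)
```

This matches the `checkpoint-4210` directory saved after the first epoch and the `Total optimization steps = 8420` line.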

### Method of Training 2 ( per step )
* Evaluate every fixed number of steps and save a checkpoint every fixed number of steps.

In [None]:
training_args = TrainingArguments(
    # The output directory where the model predictions and checkpoints will be written.
    output_dir="distilbert-base-uncased_2",

    # The initial learning rate for AdamW optimizer.
    learning_rate=2e-5,   # learning rate of the AdamW optimizer

    # The batch size per GPU/TPU core/CPU for training.
    per_device_train_batch_size=16,

    # The batch size per GPU/TPU core/CPU for evaluation.
    per_device_eval_batch_size=16,

    # Total number of training epochs to perform 
    # (if not an integer, will perform the decimal part percents of the last epoch before stopping training).
    num_train_epochs=2,

    # The weight decay to apply (if not zero) to all layers except all bias and LayerNorm weights in AdamW optimizer.
    weight_decay=0.01,

    # The evaluation strategy to adopt during training. Possible values are:
    #   "no": No evaluation is done during training.
    #   "steps": Evaluation is done (and logged) every eval_steps.
    #   "epoch": Evaluation is done at the end of each epoch.
    evaluation_strategy="steps",    # evaluate every eval_steps steps during training
    eval_steps=500,

    # The checkpoint save strategy to adopt during training. Possible values are:
    #   "no": No save is done during training.
    #   "epoch": Save is done at the end of each epoch.
    #   "steps": Save is done every save_steps.
    save_strategy="steps",    # save a checkpoint every save_steps steps
    save_steps=500,

    # Whether or not to load the best model found during training at the end of training.
    load_best_model_at_end=True,    # reload the best checkpoint found during training once training ends

    # Whether or not to push the model to the Hub every time the model is saved. 
    # If this is activated, output_dir will begin a git directory synced with the repo (determined by hub_model_id) 
    # and the content will be pushed each time a save is triggered (depending on your save_strategy). 
    # Calling save_model() will also trigger a push.
    push_to_hub=True,
)

from transformers import Trainer 

trainer = Trainer(
    # The model to train, evaluate or use for predictions. If not provided, a model_init must be passed.
    model=model,

    # The TrainingArguments that configure this training run.
    args=training_args,

    # The dataset to use for training. 
    train_dataset=tokenized_sst2["train"],

    # The dataset to use for evaluation.
    # eval_dataset=tokenized_sst2["test"],
    eval_dataset=tokenized_sst2["validation"],

    # The tokenizer used to preprocess the data.
    tokenizer=tokenizer,

    # The function to use to form a batch from a list of elements of train_dataset or eval_dataset.
    data_collator=data_collator,

    # The function that will be used to compute metrics at evaluation. 
    # Must take a EvalPrediction and return a dictionary string to metric values.
    compute_metrics=compute_metrics,
)

trainer.train()
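
To get a feel for the step-based schedule, the sketch below lists the steps at which evaluation and checkpointing would happen. It assumes the run has the same 8420 total optimization steps as the epoch-based run, since the data, batch size, and epoch count are unchanged. It also checks that every checkpoint coincides with an evaluation, which, to my understanding, `load_best_model_at_end=True` requires (`save_steps` should be a round multiple of `eval_steps`).

```python
total_steps = 8420  # assumed: same as the epoch-based run logged earlier
eval_steps = 500
save_steps = 500

# Steps at which an evaluation or a checkpoint is triggered.
eval_points = list(range(eval_steps, total_steps + 1, eval_steps))
save_points = list(range(save_steps, total_steps + 1, save_steps))

# Every checkpoint needs a metric to be ranked by, so it must land on an evaluation step.
checkpoints_aligned = set(save_points) <= set(eval_points)

print(len(eval_points), eval_points[:3], eval_points[-1], checkpoints_aligned)
```

Under these assumptions the model is evaluated 16 times instead of twice, which gives a finer-grained choice of best checkpoint at the cost of extra evaluation time.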

### Save model to the Hub


<Tip>

[Trainer](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer) applies dynamic padding by default when you pass `tokenizer` to it. In this case, you don't need to specify a data collator explicitly.

</Tip>

Once training is completed, share your model to the Hub with the [push_to_hub()](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer.push_to_hub) method so everyone can use your model:


* Once training is complete, use the `push_to_hub()` function to share the model on the Hub so that everyone can use it

In [None]:
# Appears to save the model and push it to the Hugging Face Hub
# trainer.push_to_hub()

# Save the model locally
model.save_pretrained('distilbert-base-uncased_1')
# model.save_pretrained('distilbert-base-uncased_2')

Configuration saved in distilbert-base-uncased_2/config.json
Model weights saved in distilbert-base-uncased_2/pytorch_model.bin


<Tip>

For a more in-depth example of how to finetune a model for text classification, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb).

</Tip>

## Inference

Great, now that you've finetuned a model, you can use it for inference!

Grab some text you'd like to run inference on:

In [None]:
text = "This was a masterpiece. Not completely faithful to the books, but enthralling from beginning to end. Might be my favorite of the three."

### Not recommended ( using the pipeline module )

The simplest way to try out your finetuned model for inference is to use it in a [pipeline()](https://huggingface.co/docs/transformers/main/en/main_classes/pipelines#transformers.pipeline). Instantiate a `pipeline` for sentiment analysis with your model, and pass your text to it:

In [None]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="stevhliu/my_awesome_model")
classifier(text)

### Recommended ( load the tokenizer and pretrained model directly and run them with PyTorch )

You can also manually replicate the results of the `pipeline` if you'd like:

* The steps that `pipeline` performs can also be run by hand.

Tokenize the text and return PyTorch tensors:

#### Loading from the Hugging Face Hub

In [None]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_model")

inputs = tokenizer(text, return_tensors="pt")
tokenizer, inputs

Downloading:   0%|          | 0.00/360 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/232k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/711k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/125 [00:00<?, ?B/s]

loading file vocab.txt from cache at /root/.cache/huggingface/hub/models--stevhliu--my_awesome_model/snapshots/cace37cc724b081ec363d716c73be5125ef2453d/vocab.txt
loading file tokenizer.json from cache at /root/.cache/huggingface/hub/models--stevhliu--my_awesome_model/snapshots/cace37cc724b081ec363d716c73be5125ef2453d/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at /root/.cache/huggingface/hub/models--stevhliu--my_awesome_model/snapshots/cace37cc724b081ec363d716c73be5125ef2453d/special_tokens_map.json
loading file tokenizer_config.json from cache at /root/.cache/huggingface/hub/models--stevhliu--my_awesome_model/snapshots/cace37cc724b081ec363d716c73be5125ef2453d/tokenizer_config.json


(PreTrainedTokenizerFast(name_or_path='stevhliu/my_awesome_model', vocab_size=30522, model_max_len=512, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}),
 {'input_ids': tensor([[  101,  2023,  2001,  1037, 17743,  1012,  2025,  3294, 11633,  2000,
           1996,  2808,  1010,  2021,  4372,  2705,  7941,  2989,  2013,  2927,
           2000,  2203,  1012,  2453,  2022,  2026,  5440,  1997,  1996,  2093,
           1012,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
          1, 1, 1, 1, 1, 1, 1, 1]])})

Pass your inputs to the model and return the `logits`:

In [None]:
from transformers import AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("stevhliu/my_awesome_model")

with torch.no_grad():
    logits = model(**inputs).logits

Downloading:   0%|          | 0.00/538 [00:00<?, ?B/s]

loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--stevhliu--my_awesome_model/snapshots/cace37cc724b081ec363d716c73be5125ef2453d/config.json
Model config DistilBertConfig {
  "_name_or_path": "stevhliu/my_awesome_model",
  "activation": "gelu",
  "architectures": [
    "DistilBertForSequenceClassification"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.25.1",
  "vocab_size": 30522
}



Downloading:   0%|          | 0.00/268M [00:00<?, ?B/s]

loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--stevhliu--my_awesome_model/snapshots/cace37cc724b081ec363d716c73be5125ef2453d/pytorch_model.bin
All model checkpoint weights were used when initializing DistilBertForSequenceClassification.

All the weights of DistilBertForSequenceClassification were initialized from the model checkpoint at stevhliu/my_awesome_model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use DistilBertForSequenceClassification for predictions without further training.


Get the class with the highest probability, and use the model's `id2label` mapping to convert it to a text label:

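* What is a logit?
  - The log-odds function: logit(p) = log(p / (1 - p)).
  - The logit function maps a probability between 0 and 1 to a value between -∞ and ∞.
  - It is the inverse of the sigmoid function, which squashes any real value into the range between 0 and 1.
  - Because its domain is the interval from 0 to 1, the logit function is commonly used when reasoning about probabilities.
  - In this notebook, `logits` are the raw, unnormalized scores the model outputs for each class.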

In [None]:
logits

tensor([[-3.9755,  3.6133]])
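
To connect these scores to probabilities, the sketch below uses only the standard library. It shows that sigmoid and logit undo each other and applies a softmax to the two logits printed above. The function names are illustrative and are not part of Transformers.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1); the inverse of sigmoid."""
    return math.log(p / (1.0 - p))


def softmax(scores: List[float]) -> List[float]:
    """Turn a list of logits into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


# sigmoid(logit(p)) recovers p (up to floating-point error).
round_trip = sigmoid(logit(0.8))

# The logits printed above for the example text.
probs = softmax([-3.9755, 3.6133])
print(round_trip, probs)
```

The second class receives almost all of the probability mass, so taking the argmax of the logits selects class 1.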

In [None]:
predicted_class_id = logits.argmax().item()
model.config.id2label[predicted_class_id]

'LABEL_1'
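
The generic name `LABEL_1` appears because the config printed for `stevhliu/my_awesome_model` contains no `id2label` mapping, in which case Transformers configs fall back to names of the form `LABEL_<id>`. The sketch below imitates that fallback in plain Python (the helper `default_id2label` is hypothetical) and shows how the mapping defined earlier in this notebook yields a readable name instead.

```python
from typing import Dict


def default_id2label(num_labels: int) -> Dict[int, str]:
    """Imitate the generic label names used when a checkpoint stores no mapping (hypothetical helper)."""
    return {i: f"LABEL_{i}" for i in range(num_labels)}


# The mapping defined earlier in this notebook.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

predicted_class_id = 1  # the argmax of the logits for the example text

generic_name = default_id2label(2)[predicted_class_id]
readable_name = id2label[predicted_class_id]
print(generic_name, readable_name)
```

A checkpoint that was created with `id2label` and `label2id`, as in the training section of this notebook, stores the mapping in its config and returns the readable name directly.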

#### Loading from a local directory

In [None]:
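from transformers import BertConfig, BertForMaskedLM, BertTokenizer
import torch

# Note: the checkpoint saved above is a DistilBERT sequence classifier, so the
# BERT masked-LM classes used here do not match it (see the warning in the output).

# Case 1 (not used): load the saved weights directly
# model = BertForMaskedLM.from_pretrained('distilbert-base-uncased_1')

# Case 2: build the model from the saved config only.
# This creates randomly initialized weights; the fine-tuned weights are not loaded.
config = BertConfig.from_pretrained('distilbert-base-uncased_1')
model = BertForMaskedLM(config)

# Case 1 (not used): BertTokenizer expects a vocabulary file, not the input text
# tokenizer = BertTokenizer(text)
# inputs = tokenizer(text)

# Case 2: load the tokenizer files saved in the local directory
tokenizer = BertTokenizer.from_pretrained('distilbert-base-uncased_1')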

loading configuration file distilbert-base-uncased_1/config.json
You are using a model of type distilbert to instantiate a model of type bert. This is not supported for all configurations of models and can yield errors.
Model config BertConfig {
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForSequenceClassification"
  ],
  "attention_dropout": 0.1,
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "dim": 768,
  "dropout": 0.1,
  "hidden_act": "gelu",
  "hidden_dim": 3072,
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "NEGATIVE",
    "1": "POSITIVE"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "NEGATIVE": 0,
    "POSITIVE": 1
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "n_heads": 12,
  "n_layers": 6,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_emb

In [None]:
model.eval()

In [None]:
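import numpy as np

inputs = torch.tensor(tokenizer.encode(text)).unsqueeze(0)     # Batch size 1
# inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs).logits

# A masked-LM head returns one score per vocabulary entry for every token,
# so these logits have shape (batch, sequence_length, vocab_size).
print(logits)

# argmax() over the flattened tensor is generally not 0 or 1,
# so the id2label lookup below raises the KeyError shown in the output.
predicted_class_id = logits.argmax().item()
model.config.id2label[predicted_class_id]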

tensor([[[ 0.0000,  0.3624, -0.7317,  ...,  0.0051,  0.2768, -0.2314],
         [ 0.0000,  0.1416, -0.7633,  ...,  0.1097,  0.3072,  0.4791],
         [ 0.0000, -0.1719, -0.8926,  ...,  0.5877,  0.6686, -0.3162],
         ...,
         [ 0.0000, -0.1460, -0.1179,  ...,  0.5388,  0.6286,  0.9924],
         [ 0.0000,  1.0961, -0.5175,  ...,  0.1077,  0.6400, -0.1589],
         [ 0.0000,  1.0531, -0.4761,  ...,  0.3810,  0.0362,  0.9155]]])


KeyError: ignored
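
The `KeyError` comes from loading a DistilBERT sequence classifier through BERT masked-LM classes. A possible fix, sketched below, is to load the local directory with the same `Auto` classes used for the Hub checkpoint. The function name `classify_from_local_dir` is hypothetical, and the sketch assumes the directory holds the config, weights, and tokenizer files written during training.

```python
def classify_from_local_dir(text: str, model_dir: str = "distilbert-base-uncased_1") -> str:
    """Load the fine-tuned classifier saved in `model_dir` and return the predicted label name."""
    # Imported here so the function can be defined even where the libraries are not installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # The Auto classes read config.json and pick the matching DistilBERT classes.
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)

    # argmax over the class dimension yields an id that exists in id2label.
    predicted_class_id = logits.argmax(dim=-1).item()
    return model.config.id2label[predicted_class_id]
```

Calling `classify_from_local_dir(text)` should then return one of the names stored in the saved config, `NEGATIVE` or `POSITIVE`, rather than raising a `KeyError`.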

# Practice

In [None]:
predicted_class_id

1

In [None]:
id2label[predicted_class_id]

'POSITIVE'

In [None]:
model.config.id2label[predicted_class_id]

'LABEL_1'

In [None]:
model

In [None]:
model.config