# Trying Out LLMs

### ✨ Lab Overview

1) Purpose
- In this lab you will work hands-on with currently available LLMs.
- You will load the checkpoints (ckpt) of publicly released pre-trained/fine-tuned models and run several tasks with them (Hugging Face pipeline), then learn how to use ChatGPT on the web and through the API.


2) Learning goals
- Load and use the checkpoint of an already trained model.
- Use the web version of ChatGPT.
- Call and use the ChatGPT API.

### Lab Outline

1. Using public model checkpoints
  * 1-1. Loading a checkpoint directly
  * 1-2. Using pipeline
2. Using ChatGPT
  * 2-1. Web
  * 2-2. API

In [1]:
%pip install openai

Successfully installed distro-1.9.0 jiter-0.5.0 openai-1.43.0
Note: you may need to restart the kernel to use updated packages.


In [8]:
import torch
from transformers import pipeline, set_seed
from transformers import GPT2LMHeadModel, GPT2Tokenizer
from transformers import AutoTokenizer, AutoModelForCausalLM

In [2]:
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # load the tokenizer
model = GPT2LMHeadModel.from_pretrained("gpt2")  # load the model checkpoint


In [3]:
# Check that a real input is tokenized as expected
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")  # tokenize the input text
print(inputs)  # input_ids, attention_mask

{'input_ids': tensor([[15496,    11,   616,  3290,   318, 13779]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1]])}


In [4]:
# Feed the tokenized input to the model and inspect what generate() returns
greedy_output = model.generate(**inputs, max_new_tokens=40)
print(greedy_output)  # token IDs; decode them to read the actual sentence

# Decoding turns the generated token IDs back into text
print(tokenizer.decode(greedy_output[0]))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


tensor([[15496,    11,   616,  3290,   318, 13779,    13,   314,  1101,   407,
          1654,   611,   673,   338,   257, 26188,   393,   407,    13,   314,
          1101,   407,  1654,   611,   673,   338,   257,  3290,   393,   407,
            13,   314,  1101,   407,  1654,   611,   673,   338,   257,  3290,
           393,   407,    13,   198,   198,    40]])
Hello, my dog is cute. I'm not sure if she's a puppy or not. I'm not sure if she's a dog or not. I'm not sure if she's a dog or not.

I
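
The repeated sentence in the output above is typical of greedy decoding: with no sampling arguments, `generate` takes the single highest-scoring token at every step, so once the text falls back into an earlier pattern it tends to follow the same path again. The sketch below imitates this with a toy model that only looks at the previous token. `NEXT_SCORES` and `greedy_decode` are hypothetical names, and the scores are made up for illustration rather than taken from GPT-2.

```python
# Toy next-token scores keyed by the previous token (hypothetical numbers).
NEXT_SCORES = {
    "I'm": {"not": 0.9, "sure": 0.1},
    "not": {"sure": 0.8, "a": 0.2},
    "sure": {".": 0.6, "if": 0.4},
    ".": {"I'm": 0.7, "She": 0.3},
}


def greedy_decode(start, max_new_tokens):
    """Repeatedly append the single highest-scoring next token."""
    tokens = [start]
    for _ in range(max_new_tokens):
        candidates = NEXT_SCORES.get(tokens[-1])
        if not candidates:
            break  # no known continuation for this token
        tokens.append(max(candidates, key=candidates.get))
    return tokens


output = greedy_decode("I'm", max_new_tokens=8)
print(" ".join(output))  # I'm not sure . I'm not sure . I'm
```

In `transformers`, `generate` arguments such as `do_sample=True` or `no_repeat_ngram_size` are commonly used to reduce this kind of loop.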


In [6]:
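# Attempt to pass several sentences at once.
# Caution: encode() expects a single text. Given a list of strings, GPT2Tokenizer treats
# each string as one already-split token. Neither sentence is in the vocabulary, so both
# map to the unknown token, which for GPT-2 is <|endoftext|> (see the output below).
# For real batching, set tokenizer.pad_token = tokenizer.eos_token, call
# tokenizer(sequences, return_tensors="pt", padding=True), and use model.generate(**encoded).
sequences = [
    "I've been waiting for a HuggingFace course my whole life.",
    "This course is amazing!",
]

encoded = tokenizer.encode(sequences, return_tensors="pt")

with torch.no_grad():
    generated_ids = model.generate(
        encoded,
        do_sample=False,  # greedy decoding
        min_length=10,
        max_length=50,  # total length in tokens, prompt included
    )

print(tokenizer.decode(generated_ids[0]))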

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.

<|endoftext|><|endoftext|>The U.S. Department of Justice has filed a lawsuit against the company that owns the video game company, Electronic Arts, alleging that the company violated antitrust laws by selling the game to a third party.

The lawsuit, filed in
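
The attention-mask warning above appears because `generate` received bare token IDs. When prompts of different lengths are batched, they are padded to a common length and the attention mask marks which positions hold real tokens. The following is a minimal pure-Python sketch of that idea, assuming left padding and reusing ID 50256 (the `eos_token_id` shown in the log) as the padding ID. `pad_batch` is a hypothetical helper, not a `transformers` function.

```python
PAD_ID = 50256  # assumption: reuse GPT-2's eos_token_id as the padding ID


def pad_batch(batch, pad_id=PAD_ID):
    """Left-pad lists of token IDs to equal length and build attention masks."""
    width = max(len(ids) for ids in batch)
    input_ids, attention_mask = [], []
    for ids in batch:
        n_pad = width - len(ids)
        input_ids.append([pad_id] * n_pad + list(ids))
        attention_mask.append([0] * n_pad + [1] * len(ids))  # 0 = padding, 1 = real token
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Token IDs for "Hello, my dog is cute" (from the earlier output) and its first two tokens
batch = pad_batch([[15496, 11, 616, 3290, 318, 13779], [15496, 11]])
print(batch["input_ids"])
print(batch["attention_mask"])
```

In `transformers`, calling the tokenizer with `padding=True` (once a pad token is set) returns both fields. Left padding is the usual choice for decoder-only models, so that generated tokens directly follow the prompt.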


#### 1-2. Using the Hugging Face pipeline

> Use `pipeline` from the Hugging Face `transformers` package to load a model checkpoint more simply.



In [9]:
# pipeline makes running a task shorter and easier
# Text generation task
set_seed(42)
generator = pipeline('text-generation', model='gpt2')

# max_length: maximum length of the generated text in tokens (prompt included)
# num_return_sequences: number of candidate texts to generate
generator("Hello, I'm a language model,", max_length=30, num_return_sequences=5)

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "Hello, I'm a language model, but what I'm really doing is making a human-readable document. There are other languages, but those are"},
 {'generated_text': "Hello, I'm a language model, not a syntax model. That's why I like it. I've done a lot of programming projects.\n"},
 {'generated_text': "Hello, I'm a language model, and I'll do it in no time!\n\nOne of the things we learned from talking to my friend"},
 {'generated_text': "Hello, I'm a language model, not a command line tool.\n\nIf my code is simple enough:\n\nif (use (string"},
 {'generated_text': "Hello, I'm a language model, I've been using Language in all my work. Just a small example, let's see a simplified example."}]
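
The five candidates differ from one another, which indicates that the pipeline samples tokens here instead of always taking the top one. `set_seed(42)` fixes the random state so that a rerun in the same environment can reproduce the draws. The sketch below illustrates the idea with Python's `random` module. `TOKEN_PROBS` and `sample_tokens` are hypothetical, and the probabilities are invented for illustration.

```python
import random

# Hypothetical next-token distribution (not GPT-2's real probabilities).
TOKEN_PROBS = {"but": 0.4, "not": 0.3, "and": 0.2, "I've": 0.1}


def sample_tokens(n, seed):
    """Draw n tokens from TOKEN_PROBS using a dedicated seeded generator."""
    rng = random.Random(seed)
    tokens = list(TOKEN_PROBS)
    weights = list(TOKEN_PROBS.values())
    return rng.choices(tokens, weights=weights, k=n)


first = sample_tokens(5, seed=42)
second = sample_tokens(5, seed=42)
print(first)
print(first == second)  # same seed, same draws
```

Without a fixed seed, each run would generally produce different candidates, which is why seeding is useful when results need to be compared across runs.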