### Generating text with a pipeline

In [1]:
import torch
from transformers import pipeline



In [2]:
pipe = pipeline(
    task="text-generation",
    model="openai-community/gpt2",
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,  # bfloat16 on GPU to cut memory use and speed up inference
    truncation=True  # truncate input prompts that exceed the model's maximum length
)

Device set to use cuda


In [3]:
inputs = "I am learning about tokenizers."
outputs = pipe(inputs)
print(outputs)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'I am learning about tokenizers. There were two token markets that were invented a long time ago, and those exchanges worked quite well, but now they are being attacked by other exchanges who don\'t want to acknowledge their existence," said Yagoda.'}]


In [4]:
print(outputs[0]['generated_text'])

I am learning about tokenizers. There were two token markets that were invented a long time ago, and those exchanges worked quite well, but now they are being attacked by other exchanges who don't want to acknowledge their existence," said Yagoda.
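
The pipeline returns a list with one dict per generated sequence, and by default `generated_text` begins with the prompt itself. A minimal sketch of keeping only the continuation follows; the helper name `continuation_only` and the `sample_outputs` value are illustrative assumptions, not part of the notebook. The text-generation pipeline also accepts `return_full_text=False`, which should give a similar result without post-processing.

```python
def continuation_only(outputs, prompt):
    """Return the generated continuations with the prompt prefix removed."""
    texts = []
    for item in outputs:
        text = item["generated_text"]
        # Strip the prompt only when the text really starts with it
        if text.startswith(prompt):
            text = text[len(prompt):]
        texts.append(text.lstrip())
    return texts


# Hypothetical value shaped like the pipeline output above
sample_outputs = [
    {"generated_text": "I am learning about tokenizers. There were two token markets."}
]
print(continuation_only(sample_outputs, "I am learning about tokenizers."))
# ['There were two token markets.']
```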


In [5]:
inputs = "I am learning about tokenizers."
outputs = pipe(inputs, min_length=30, max_length=100)  # set the minimum and maximum length in tokens (both include the prompt)
print(outputs[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


I am learning about tokenizers. I am going to try it out and try with my other ideas.

I wanted to give another example to illustrate how to develop systems using a smart contract to incentivize tokens, so I asked my friends over and over.

"It is a validating algorithm for the right price and it is the only way the supply will shrink."

So far, they gave me some data on 3,000 ETHs and an average price of $1
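
In `generate`, `min_length` and `max_length` count the prompt tokens as well as the newly generated ones, whereas `max_new_tokens` counts only the new tokens. The arithmetic can be sketched as below. `new_token_budget` is a hypothetical helper, and the prompt length of 7 tokens is an assumed figure rather than a measured one.

```python
def new_token_budget(prompt_tokens, min_length, max_length):
    """Return (min_new, max_new): how many tokens may still be generated
    when min_length and max_length both include the prompt."""
    if min_length > max_length:
        raise ValueError("min_length must not exceed max_length")
    min_new = max(0, min_length - prompt_tokens)
    max_new = max(0, max_length - prompt_tokens)
    return min_new, max_new


# Assumed prompt length of 7 tokens; the real count depends on the tokenizer
print(new_token_budget(prompt_tokens=7, min_length=30, max_length=100))  # (23, 93)
```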


---
### Trying a different model
: BERT

In [6]:
pipe = pipeline(
    task="text-generation",
    model="google-bert/bert-base-uncased",
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,  # bfloat16 on GPU to cut memory use and speed up inference
    truncation=True  # truncate input prompts that exceed the model's maximum length
)


If you want to use `BertLMHeadModel` as a standalone, add `is_decoder=True.`



Device set to use cuda


In [7]:
inputs = "I am learning about tokenizers."
outputs = pipe(inputs, min_length=30, max_length=100)  # set the minimum and maximum length in tokens (both include the prompt)
print(outputs[0]['generated_text'])

I am learning about tokenizers...........................................................................................


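> BERT is a bidirectional encoder model. It is very good at encoding and understanding the meaning of text, but it is not suited to generating text token by token.
> BERT is used for tasks such as text classification, natural language inference, question answering, and named entity recognition.
> It is therefore important to understand both the nature of the task and the architecture of the model, and to choose a model that fits.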
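
The structural difference behind this can be illustrated with attention masks. This is a simplified sketch in plain Python, not the actual BERT or GPT-2 implementation, and the function names are illustrative. A decoder-style (causal) model lets position `i` attend only to positions up to `i`, which is what makes left-to-right generation possible, while an encoder-style (bidirectional) model lets every position see the whole sequence.

```python
def causal_mask(n):
    """Mask for a decoder: row i may attend to columns 0..i only."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Mask for an encoder: every position may attend to every position."""
    return [[1] * n for _ in range(n)]


print("causal (GPT-style):")
for row in causal_mask(4):
    print(row)

print("bidirectional (BERT-style):")
for row in bidirectional_mask(4):
    print(row)
```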

---
### Generating Korean text

In [8]:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM

In [9]:
# Load the KoGPT2 model and tokenizer
model_name = 'skt/kogpt2-base-v2'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)


In [10]:
pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,  # bfloat16 on GPU to cut memory use and speed up inference
    truncation=True  # truncate input prompts that exceed the model's maximum length
)

Device set to use cuda


In [13]:
# Korean input sentence ("It is raining right now, and ...")
inputs = "지금 비가 오는데"
outputs = pipe(inputs, min_length=30, max_length=100)
print(outputs[0]['generated_text'])

지금 비가 오는데 다들 조심하셔야겠습니다.
어쨌든 영동과 전북동해안으로는 동풍의 영향으로 계속해서 비 조심하셔야겠습니다.
어제가 가장 더웠던 때는 제법 강한 비구름이 내렸다고 합니다.
어제는 비가 내리면서 그쳤는데 오늘은 다시 비가 오기 시작하면서 조금 쌀쌀하겠습니다.
강원산간에서는 오늘부터 내일 새벽까지 최고 20mm까지 강하게 내리겠습니다.
비는 내일 오후가 되면 대부분 그칠 것으로 보입니다.
내일 새벽 서울경기


---
### Trying another model similar to GPT-3
: GPT-3 is not available on Hugging Face and has to be used through the OpenAI API.

In [None]:
# mistralai/Mistral-7B-Instruct-v0.1 on Hugging Face
from huggingface_hub import login
login(token="___token Id ____")
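
Hard-coding an access token in a notebook makes it easy to leak. One possible alternative, sketched under the assumption that the token has been exported in an environment variable (here `HF_TOKEN`; adjust the name to your setup), is to read it at run time. The helper names `read_hf_token` and `login_from_env` are hypothetical.

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Read a Hugging Face access token from an environment variable."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your access token")
    return token


def login_from_env(env_var="HF_TOKEN"):
    """Log in to the Hugging Face Hub without hard-coding the token."""
    # Imported lazily so read_hf_token has no third-party dependency
    from huggingface_hub import login

    login(token=read_hf_token(env_var))
```

Calling `login_from_env()` in place of the `login(token=...)` line above keeps the secret out of the notebook file.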

In [15]:
# Load the tokenizer and model
model_name = 'mistralai/Mistral-7B-Instruct-v0.1'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32
)


In [26]:
pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    device=0,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,  # bfloat16 on GPU to cut memory use and speed up inference
    truncation=True  # truncate input prompts that exceed the model's maximum length
)

Device set to use cuda:0


In [29]:
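# Generation test
prompt = "Good morning."

outputs = pipe(
    prompt,
    max_new_tokens=100,      # generate at most 100 tokens beyond the prompt
    temperature=0.7,         # values below 1 sharpen the token distribution
    do_sample=True,          # sample instead of greedy decoding
    top_p=0.9,               # nucleus sampling threshold
    repetition_penalty=1.1   # discourage repeating earlier tokens
)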

print(outputs[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Good morning. I'm really excited to be here today, and I'm honored to have the opportunity to share my thoughts with you all.

I believe that we are at a very important time in our history as a species. With the rapid pace of technological advancement, it is easy to feel overwhelmed by the changes that are happening around us. But I think it is important to remember that technology is simply a tool, and like any tool, it can be used for good or for bad
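
What `temperature` and `top_p` do to the next-token distribution can be sketched on a toy example. This is a simplified illustration in plain Python, not the `transformers` implementation, and it leaves out `repetition_penalty`. The function names and the toy logits are assumptions made for the example.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; temperature < 1 sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalise over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Sample one token id using temperature scaling followed by top-p filtering."""
    nucleus = top_p_filter(softmax_with_temperature(logits, temperature), top_p)
    ids = list(nucleus)
    return rng.choices(ids, weights=[nucleus[i] for i in ids], k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
print(sample_token(toy_logits, rng=random.Random(0)))
```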


---
### Another model

In [30]:
# Load the tokenizer and model
model_name = 'EleutherAI/gpt-j-6B'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32
)


Some weights of the model checkpoint at EleutherAI/gpt-j-6B were not used when initializing GPTJForCausalLM: ['transformer.h.0.attn.bias', 'transformer.h.0.attn.masked_bias', 'transformer.h.1.attn.bias', 'transformer.h.1.attn.masked_bias', ...

In [37]:
pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    device=1,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,  # bfloat16 on GPU to cut memory use and speed up inference
    truncation=True  # truncate input prompts that exceed the model's maximum length
)

Device set to use cuda:1


In [39]:
# Generation test with a Korean prompt ("This morning I ...")
prompt = "나는 오늘 아침에"

outputs = pipe(
    prompt,
    max_new_tokens=100
)

print(outputs[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


나는 오늘 아침에
세이콘을 먹으니까
빵을 안 타고 마시는 게
그렇지 않도 뭘 생각하기 시작할 거야
우
