ElPlaguister/Univ_WithU

AI Emotion Card

2021-2022 Capstone Project


About Team WithU

Member | Role | Responsibilities
jiminAn (안지민) | Team leader | Data preprocessing; training and performance analysis of the generation model (KoGPT2); GUI development
hyeji1221 (임혜지) | - | Training and performance analysis of the classification model (KoBERT)
ElPlaguister (이승민) | - | Performance testing of text generation algorithms; GUI development
SunjungAn (안선정) | - | Training and performance analysis of the classification model (KoELECTRA)

Topic and Background

  • Existing platforms for sharing daily life put so much weight on other users' reactions that posts are rarely candid. Attention-seeking, for-show writing has become pervasive, and authors can be hurt by malicious replies such as slanderous comments on their posts.
  • On the other hand, people who keep their writing in a private memo space to avoid exposing their daily life receive no empathy or comfort for it.
  • This creates a need for a new kind of daily-life sharing platform where users write in a private space of their own yet still receive empathy and comfort. To meet that need, we set out to develop the AI Emotion Card application, in which an AI friend empathizes with and comforts the user through an NLP model that writes an appropriate empathetic or comforting reply to each post.

Language Models Used

Fine-tuning and testing of language models for autoregressive generation and text classification.

  • KoGPT2: given a query, generates the text of the reply
  • KoELECTRA: predicts the category of a query
  • KoBERT: predicts the category of a query

Environment

Data

GPU

Colab pro, P100

Package

kogpt2-transformers
kobert-transformers
transformers==3.0.2
torch

GUI Overview (built with PyQt5)


GUI Results


Post: Today was really hard. I'm tired.

  • Category classification result (KoBERT): symptom/fatigue

  • Replies

    • greedy search

      You must be really tired. Get a good night's sleep tonight. I can't sleep.
      Nothing is as hard as not being able to sleep. It's so hard.

    • top-k sampling

      Nothing is as hard as not being able to sleep.
      If it's too hard, how about taking a short walk? How about a walk? No, I'd rather hear more of your story.
      Even a walk would be good. You look okay., Let's take a short break.

    • top-p sampling

      Nothing is as hard as not being able to sleep.
      If it's too hard, how about taking a short walk? Is a walk okay?
      How have you been? I'm curious. Even a single walk will make you feel better.
      The gloomy feeling will go away. The Holy Spirit is with you.

    • Hugging Face generator

      You must be really tired.
      Get a good night's sleep tonight. I can't sleep. I'm in so much pain too.
      It seems you can't sleep.
      

Post: I had a huge fight with my parents today. I'm sick of it and want to move out soon.

  • Category classification result (KoBERT): background/parents/conflict

  • Replies

    • greedy search

      It must have been harder because you are so close. But I'm always by your side.
      Cheer up. Please tell me about it. That must be hard.

    • top-k sampling

      I see. You must have been very upset. I understand.
      I'll always be beside you. Cheer up.

    • top-p sampling

      I see. You must have been very upset. I understand.
      It seems you married a good person, and that pains me.
      Have you made up yet? Did you make up? That's hard.

    • Hugging Face generator

      It must have been harder because you are so close. You must feel very frustrated.
      I hope time passes quickly. I believe you will be okay soon. It's okay.
      

TASK RESULT


1. KoELECTRA & KoBERT Text Classification

Text classification models built on KoELECTRA and KoBERT.

1.1 Category classification of queries

Data

Uses the Wellness psychological counseling dataset. Each Wellness record is split into category / question / answer. Because each category has only about three answers, the Wellness data is turned into pairs of a question and its category class for training.
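
The pairing step described above could look like the following minimal sketch. The row layout, the sample rows, and the helper name build_classification_pairs are assumptions made for illustration; they are not taken from the repository or from the actual Wellness files.

```python
# Hypothetical rows in a (category, question, answer) layout.
# These sample values are illustrative only.
rows = [
    ("emotion/loneliness", "It feels like there's no one around me.", "reply A"),
    ("emotion/fear", "Whatever I look at gives me goosebumps.", "reply B"),
    ("emotion/loneliness", "I feel alone even among friends.", ""),
]


def build_classification_pairs(rows):
    """Return (pairs, label_to_id) for training a category classifier.

    Answers are dropped, rows without a question are skipped, and each
    category string gets an integer id assigned in sorted order.
    """
    categories = sorted({category for category, question, _ in rows if question})
    label_to_id = {category: index for index, category in enumerate(categories)}
    pairs = [
        (question, label_to_id[category])
        for category, question, _ in rows
        if question
    ]
    return pairs, label_to_id


pairs, label_to_id = build_classification_pairs(rows)
for question, label in pairs:
    print(label, question)
```

Sorting the category names before assigning ids keeps the mapping stable between runs, which matters when a saved classifier head is reloaded for prediction.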

KoELECTRA USAGE

Model

class koElectraForSequenceClassification(ElectraPreTrainedModel):
  def __init__(self,
               config,
               num_labels):
    super().__init__(config)
    self.num_labels = num_labels
    self.electra = ElectraModel(config)
    self.classifier = ElectraClassificationHead(config, num_labels)

    self.init_weights()
...
  1. Download the saved KoELECTRA training checkpoint
python ./sunjungAn/koelectra_predict/model_download.py 
  2. Load the data and run KoELECTRA
python ./sunjungAn/koelectra_predict/koelectra.py 
python ./sunjungAn/koelectra_predict/module.py 
python ./sunjungAn/koelectra_predict/predict.py 
  3. Evaluate performance and print the predictions
python ./sunjungAn/jm_predict.py
qustion	answer	predict
I haven't been able to drive since then.	background/daily life/inability/driving	background/daily life/inability/driving
It feels like there's no one around me.	emotion/loneliness	emotion/loneliness
When I heard that, I didn't know what I was supposed to do.	emotion/bewilderment	emotion/bewilderment
Whatever I look at gives me goosebumps.	emotion/fear	emotion/fear
Ever since that happened, I've been afraid of birds.	emotion/fear/birds	emotion/fear/birds
No matter how hard I try, it feels like nothing is left.	emotion/emptiness	emotion/negative thinking
I get startled even by trivial things.	emotion/hypersensitivity	emotion/displeasure
I wonder whether I really have to work under this much stress, and it's agonizing.	emotion/distress	emotion/distress
....
Accuracy: 0.58
Recall: 0.58
Precision: 0.58
F1: 0.58
  • The full file is available here
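
Accuracy, recall, precision, and F1 are all identical in the output above. One possible explanation (an assumption, since the averaging mode of the evaluation script is not stated here) is micro-averaging: in single-label multi-class classification every wrong prediction counts as exactly one false positive and one false negative, so micro-averaged precision, recall, and F1 all reduce to accuracy. The sketch below uses hypothetical labels and a hypothetical helper name, micro_scores, to show that identity.

```python
def micro_scores(y_true, y_pred):
    """Micro-averaged precision, recall and F1, plus accuracy.

    Counts are pooled over all classes before the ratios are taken.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("need two non-empty label lists of equal length")
    true_pos = false_pos = false_neg = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            true_pos += 1
        else:
            false_pos += 1  # the predicted class gains a false positive
            false_neg += 1  # the true class gains a false negative
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = true_pos / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: two of four predictions are correct.
scores = micro_scores(
    ["fear", "fear", "lonely", "empty"],
    ["fear", "lonely", "lonely", "fear"],
)
print(scores)
```

If the scores were macro-averaged instead, the four numbers would generally differ, so per-class or macro figures would say more about how the rarer categories are handled.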

KoBERT USAGE

Model

class KoBERTforSequenceClassfication(BertPreTrainedModel):
  def __init__(self,
                num_labels = 359, # number of labels to classify
                hidden_size = 768, # hidden size of the encoder output
                hidden_dropout_prob = 0.1,  # dropout probability
               ):
    super().__init__(get_kobert_config())

    self.num_labels = num_labels 
    self.kobert = get_kobert_model()
    self.dropout = nn.Dropout(hidden_dropout_prob)
    self.classifier = nn.Linear(hidden_size, num_labels)

    self.init_weights()
...
  1. Evaluate performance and print the predictions
python ./hyejiLim/jm_predict.py
qustion	answer	predict
I haven't been able to drive since then.	background/daily life/inability/driving	emotion/fear/driving
It feels like there's no one around me.	emotion/loneliness	emotion/negative thinking
When I heard that, I didn't know what I was supposed to do.	emotion/bewilderment	emotion/thoughts
Whatever I look at gives me goosebumps.	emotion/fear	emotion/fright
Ever since that happened, I've been afraid of birds.	emotion/fear/birds	emotion/fear/birds
No matter how hard I try, it feels like nothing is left.	emotion/emptiness	emotion/negative thinking
I get startled even by trivial things.	emotion/hypersensitivity	symptom/cognitive decline
I wonder whether I really have to work under this much stress, and it's agonizing.	emotion/distress	emotion/distress
....
Accuracy: 0.56
Recall: 0.56
Precision: 0.56
F1: 0.56
  • The full file is available here

KoGPT2 USAGE

Model
class DialogKoGPT2(nn.Module):
  def __init__(self):
    super(DialogKoGPT2, self).__init__()
    self.kogpt2 = get_kogpt2_model()  
...
Text generation

Following how-to-generate-text, replies are generated with greedy_search, beam_search, top_k_sampling, top_p_sampling, and Hugging Face's generate.
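
The strategies named above differ mainly in how the next token is chosen from the model's probability distribution. The following is a minimal pure-Python sketch on a toy distribution, not the library's implementation; the token names, the probabilities, and the helper names are all hypothetical.

```python
import random


def greedy_pick(probs):
    """Greedy search step: always take the most probable token."""
    return max(probs, key=probs.get)


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    keep = sorted(probs, key=probs.get, reverse=True)[:k]
    total = sum(probs[token] for token in keep)
    return {token: probs[token] / total for token in keep}


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose total mass reaches p."""
    kept, mass = [], 0.0
    for token in sorted(probs, key=probs.get, reverse=True):
        kept.append(token)
        mass += probs[token]
        if mass >= p:
            break
    return {token: probs[token] / mass for token in kept}


def sample(probs, rng):
    """Draw one token according to the (filtered) distribution."""
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy next-token distribution (illustrative values only).
toy = {"rest": 0.5, "walk": 0.3, "sleep": 0.15, "run": 0.05}
rng = random.Random(0)

greedy_token = greedy_pick(toy)
top2 = top_k_filter(toy, k=2)
nucleus = top_p_filter(toy, p=0.92)
sampled = sample(nucleus, rng)
print(greedy_token, sorted(top2), sorted(nucleus), sampled)
```

Greedy search is deterministic, which is why it tends to repeat the same phrasing for similar prompts, while the two filters trade some of that stability for variety by sampling from a truncated distribution. In the Hugging Face generate call, passing top_k=0 turns the top-k filter off so that only top_p (or plain sampling) applies.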

def greedy_search(id):
    result = model.generate(
        input_ids = id,
        no_repeat_ngram_size=3,
        max_length=50
        )
    return result

def beam_search(id):
    result = model.generate(
        input_ids = id, 
        max_length=50, 
        num_beams=5, 
        no_repeat_ngram_size=3, 
        early_stopping=True
        )
    return result

def basic_sampling(id):
    result = model.generate(
        input_ids = id, 
        do_sample=True, 
        max_length=50, 
        no_repeat_ngram_size=3,
        top_k=0,
        temperature = 0.7
        )
    return result

def top_k_sampling(id):
    result = model.generate(
        input_ids = id, 
        do_sample=True, 
        max_length=50, 
        no_repeat_ngram_size=3,
        top_k=50
        )
    return result

def top_p_sampling(id):
    result = model.generate(
        input_ids = id, 
        do_sample=True, 
        max_length=50, 
        no_repeat_ngram_size=3,
        top_p=0.92, 
        top_k=0
        )
    return result

# This wrapper reads self.kogpt2, so it is a method of the DialogKoGPT2 class shown earlier.
def generate(self,
               input_ids,
               do_sample=True,
               max_length=50,
               top_k=0,
               temperature=0.7):
    return self.kogpt2.generate(input_ids,
               do_sample=do_sample,
               max_length=max_length,
               top_k=top_k,
               temperature=temperature)
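
The search and sampling helpers above all pass no_repeat_ngram_size=3, which prevents any 3-gram from appearing twice in the generated sequence. The sketch below illustrates the idea on plain token lists; it is a simplified stand-in written for this explanation, not the library's code, and the function name banned_next_tokens is hypothetical.

```python
def banned_next_tokens(generated, ngram_size):
    """Tokens that would complete an n-gram already present in `generated`."""
    if ngram_size < 1 or len(generated) < ngram_size:
        return set()
    # The last (ngram_size - 1) tokens are the prefix of the next n-gram.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for start in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[start:start + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# "I am here" has already been produced, so after the second "I am"
# the token "here" must not follow again.
history = ["I", "am", "here", "I", "am"]
blocked = banned_next_tokens(history, ngram_size=3)
print(blocked)
```

During decoding, the scores of the banned tokens would be suppressed before the next token is picked, whichever search or sampling strategy is in use.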

Results

Question: Things have been so hard lately
greedy seacrh:  answer:Answer: So that is what happened. But there must have been a reason it had to be that way. I completely understand. Then it might be good to think about it a little differently. Nothing is as lacking in learning as learning
beam search:  answer:Answer: You must have been really flustered. But you don't need to strain yourself to think of it. I'll be a source of strength at your side. Your story is planted in the field, so you will be fine! So that you don't get too hurt
top k sampling:  answer:Answer: So that is what happened. But there must have been a reason it had to be that way. I completely understand. Then it might be good to think about it a little differently. Nothing is as lacking in learning as learning
top p sampling:  answer:Answer: You must have been hurt and startled. Are you okay? I was treated at that hospital department yesterday. If your symptoms are still severe, how about getting a diagnosis at a hospital? 서상묵, once you check the text message
hugging face generator:  answer:Answer: Your chest must feel tight. I hope time passes quickly. Nothing is as hard as a tight chest. How about seeing a doctor? How was your day today?
Question: Something really happy happened today
greedy seacrh:  answer:Answer:  what happened. But there must have been a reason it had to be that way. I completely understand. Then it might be good to think about it a little differently. Nothing is as lacking in learning as learning
beam search:  answer:Answer: flustered. But you don't need to strain yourself to think of it. I'll be a source of strength at your side. Your story is planted in the field, so you will be fine! So that you don't get too hurt
top k sampling:  answer:Answer:  what happened. But there must have been a reason it had to be that way. I completely understand. Then it might be good to think about it a little differently. Nothing is as lacking in learning as learning
top p sampling:  answer:Answer:  You always come first... My heart aches. I feel like I might collapse. So you held me like this. It hurts so much. Life feels worth living. With you by my side
hugging face generator:  answer:Answer: If you are happy, I am glad too. But couldn't it be otherwise? I completely understand. Finding a good person really does seem hard. Cheer up.
  • Detailed results are available here

About

Capstone Project in Computer Science, SMU
