
Dialogue Generation Models


Introduction

  • This is a repository of pretrained dialogue generation models (GPT-2 and Meena) of Pingpong, ScatterLab.
  • You can refer to our blog post for detailed pre-training processes and experiment results.
  • Try our Korean demo and Japanese demo to chat with the models.

Downloads

  • You can download the pretrained GPT-2 and Meena models from the Releases page.
    • base_gpt_trained_on_dialogue_data_kr.pth
      • Base-size GPT-2 trained only on Korean dialogue data
    • large_gpt_trained_on_dialogue_data_kr.pth
      • Large-size GPT-2 trained only on Korean dialogue data
    • base_gpt_trained_on_wiki_and_dialogue_data_kr.pth
      • Base-size GPT-2 trained on Korean dialogue data, Wikipedia, and Namuwiki
    • large_gpt_trained_on_wiki_and_dialogue_data_kr.pth (Recommended)
      • Large-size GPT-2 trained on Korean dialogue data, Wikipedia, and Namuwiki
    • base_meena_trained_on_filtered_data_kr.pth
      • Base-size Meena trained on filtered Korean dialogue data
    • large_meena_trained_on_filtered_data_kr.pth (Recommended)
      • Large-size Meena trained on filtered Korean dialogue data
    • base_meena_trained_on_non_filtered_data_kr.pth
      • Base-size Meena trained on unfiltered Korean dialogue data
    • large_meena_trained_on_non_filtered_data_kr.pth
      • Large-size Meena trained on unfiltered Korean dialogue data
    • base_meena_trained_on_filtered_data_jp.pth
      • Base-size Meena trained on roughly 500 million Japanese everyday-conversation examples
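The release files above are standard PyTorch checkpoints. As a minimal sketch of how such a file is read (assuming the `.pth` files are state dicts saved with `torch.save`; the key name below is illustrative, not the actual checkpoint layout):

```python
import torch

# The release .pth files are opened with torch.load. This sketch round-trips
# a dummy state dict the same way; the real checkpoints are loaded identically
# (just with far more parameters).
dummy = {"token_embedding.weight": torch.zeros(4, 8)}
torch.save(dummy, "dummy.pth")

state = torch.load("dummy.pth", map_location="cpu")
print(sorted(state.keys()))  # → ['token_embedding.weight']
```

For the real files, pass the checkpoint path via `--pretrained-model-path` to the example scripts below rather than loading it by hand.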

Usage

  • GPT
PYTHONPATH=. python examples/run_gpt.py \
        --pretrained-model-path $PRETRAINED_MODEL_PATH \
        --model-config-path $MODEL_CONFIG_PATH \
        --tokenizer-model-path $TOKENIZER_MODEL_PATH \
        --decoding-method $DECODING_METHOD
  • Meena
PYTHONPATH=. python examples/run_meena.py \
        --pretrained-model-path $PRETRAINED_MODEL_PATH \
        --model-config-path $MODEL_CONFIG_PATH \
        --tokenizer-model-path $TOKENIZER_MODEL_PATH \
        --decoding-method $DECODING_METHOD
  • We implement two decoding methods, Top-p Sampling and Beam Search, as examples.
    • There is a trade-off between accuracy (sensibleness) and diversity (specificity) across the two decoding methods.
    • Beam Search is a good choice if you prefer accurate answers; Top-p Sampling is a good choice if you prefer diverse answers.
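To make the diversity side of the trade-off concrete, here is a framework-free sketch of nucleus (top-p) sampling over a single logit vector. The function name and the NumPy implementation are illustrative; the repository's actual decoding code lives in the example scripts above.

```python
import numpy as np

def top_p_sample(logits, p=0.9, rng=None):
    """Sample a token id from `logits` using nucleus (top-p) sampling."""
    if rng is None:
        rng = np.random.default_rng()
    # Convert logits to probabilities (numerically stable softmax).
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Sort tokens by probability, descending.
    order = np.argsort(probs)[::-1]
    sorted_probs = probs[order]
    # Keep the smallest prefix of tokens whose cumulative mass reaches p.
    cutoff = np.searchsorted(np.cumsum(sorted_probs), p) + 1
    kept = order[:cutoff]
    # Renormalize over the kept tokens and sample one.
    kept_probs = probs[kept] / probs[kept].sum()
    return int(rng.choice(kept, p=kept_probs))

# With one dominant token and p=0.9, only that token survives the cutoff.
token = top_p_sample(np.array([10.0, 0.0, 0.0]), p=0.9)
print(token)  # → 0
```

Lowering `p` shrinks the candidate set toward greedy decoding (more sensible, less varied); raising it admits more of the tail (more varied, less sensible).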

Notes

Korean

  • ๋ชจ๋ธ์˜ ์ƒ์„ฑ ๊ฒฐ๊ณผ๋Š” ํ•™์Šต์„ ๋ฐ”ํƒ•์œผ๋กœ ํ•œ ์˜ˆ์ธก ๊ฒฐ๊ณผ์ด๋ฉฐ ์Šค์บํ„ฐ๋žฉ/ํ•‘ํํŒ€์˜ ์˜๊ฒฌ๊ณผ ๋ฌด๊ด€ํ•ฉ๋‹ˆ๋‹ค.
  • ๋ชจ๋ธ์˜ ์ƒ์„ฑ ๊ฒฐ๊ณผ๋Š” ๊ฐ€์ƒ์˜ ๋Œ€ํ™” ์ƒ์„ฑ ๊ฒฐ๊ณผ์ด๋ฉฐ ์‚ฌ์‹ค ์—ฌ๋ถ€๋ฅผ ๋‹ด๋ณดํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
  • ์Šค์บํ„ฐ๋žฉ/ํ•‘ํํŒ€์€ ๊ณต๊ฐœํ•œ ๋ชจ๋ธ์˜ ์ƒ์„ฑ ๊ฒฐ๊ณผ์— ๋Œ€ํ•œ ์ฑ…์ž„์„ ์ง€์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
  • ๋ณธ ๋ ˆํฌ์ง€ํ† ๋ฆฌ๋Š” ๋ชจ๋ธ์˜ ์‚ฌ์ „ ํ•™์Šต ์ฝ”๋“œ๋ฅผ ํฌํ•จํ•˜๊ณ  ์žˆ์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
  • ๊ณต๊ฐœํ•œ ๋ชจ๋ธ์€ ์› ๋…ผ๋ฌธ์—์„œ ์ œ์•ˆ๋œ GPT-2 ๋ฐ Meena ๋ชจ๋ธ๊ณผ ์‚ฌ์ด์ฆˆ ๋ฐ ๊ตฌ์กฐ์ ์œผ๋กœ ์ผ๋ถ€ ์ฐจ์ด๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๊ณต๊ฐœํ•œ ๋ชจ๋ธ์€ ๋Œ€๋Ÿ‰์˜ ์นดํ†ก ๋ฐ์ดํ„ฐ๋ฅผ ์ด์šฉํ•œ ์‚ฌ์ „ํ•™์Šต๋งŒ ์™„๋ฃŒํ•œ ์ƒํƒœ์ด๊ธฐ ๋•Œ๋ฌธ์— ์‹ค์‚ฌ์šฉ์„ ํ•  ๋•Œ๋Š” ๋ชจ๋ธ์„ ์›ํ•˜๋Š” ๋ชฉ์ ์— ๋งž๊ฒŒ ํŒŒ์ธํŠœ๋‹ํ•œ ๋’ค ์‚ฌ์šฉํ•˜์‹œ๋Š” ๊ฒƒ์„ ๊ถŒ์žฅ๋“œ๋ฆฝ๋‹ˆ๋‹ค.
  • ๋ชจ๋ธ์˜ ์ƒ์—…์  ํ™œ์šฉ์— ๋Œ€ํ•ด์„œ๋Š” support@pingpong.us๋กœ ๋ฌธ์˜ ๋ถ€ํƒ๋“œ๋ฆฝ๋‹ˆ๋‹ค.

Japanese

  • The models' outputs are predictions produced by statistical machine learning, and utterances unrelated to fact may be generated. These outputs do not represent the company's decisions or judgments.
  • The company accepts no liability whatsoever for any loss or damage arising from use of the released models.
  • This repository does not include the source code for the models' pre-training.
  • The released models differ in part from the GPT-2 and Meena models proposed in the original papers, in both size and architecture.
  • The released models have only been pre-trained on everyday-conversation data; for real-world use, we recommend performing additional fine-tuning for your purpose first.
  • For commercial use of the models, please contact support@pingpong.us.

License

The pretrained models and the codes in this repository are distributed under the terms of the Apache-2.0 License.

Citation

If you use our software for research, please cite:

@misc{pingpong2020dial_gen_models,
  author = {Chaehun Park and Sangwoo Seo and Dawoon Jung},
  title = {dialogue-generation-models},
  year = {2019},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/pingpong-ai/dialogue-generation-models}}
}

References

@techreport{radford2019gpt2,
    title={Language Models are Unsupervised Multitask Learners},
    author={Alec Radford and Jeffrey Wu and Rewon Child and David Luan and Dario Amodei and Ilya Sutskever},
    institution={OpenAI},
    year={2019}
}
@misc{adiwardana2020meena,
    title={Towards a Human-like Open-Domain Chatbot},
    author={Daniel Adiwardana and Minh-Thang Luong and David R. So and Jamie Hall and Noah Fiedel and Romal Thoppilan and Zi Yang and Apoorv Kulshreshtha and Gaurav Nemade and Yifeng Lu},
    year={2020},
    eprint={2001.09977},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Acknowledgments

For training the models, we used Cloud TPUs provided by the TensorFlow Research Cloud program.
