
Install the ParlAI package

python setup.py develop

Download Korean Wizard of Wikipedia (KoWoW)

Download the wizard_of_wikipedia_ko.tar.gz archive, then extract it and move it into data/:

tar xvzf wizard_of_wikipedia_ko.tar.gz
mv wizard_of_wikipedia_ko data/
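
Before training, it can help to confirm that the extracted directory ended up where the mv command above puts it. The following is a minimal sketch; check_kowow_data is a hypothetical helper name, not part of ParlAI or KoWoW, and it only checks that the directory exists, not its contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ParlAI or KoWoW): report whether the
# extracted KoWoW directory exists. The default path assumes the commands
# above were run from the repository root.
check_kowow_data() {
  local data_dir="${1:-data/wizard_of_wikipedia_ko}"
  if [ -d "${data_dir}" ]; then
    echo "found: ${data_dir}"
  else
    echo "missing: ${data_dir}" >&2
    return 1
  fi
}
```

Calling check_kowow_data with no argument checks data/wizard_of_wikipedia_ko relative to the current directory; a non-zero return status means the archive still needs to be extracted or moved.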

Language combinations

Name	Knowledge	Utterance
ko	Korean	Korean
ke	Korean	English
ek	English	Korean
en	English	English
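
The table above can be mirrored in a small lookup so that a mistyped LANG_TYPE is caught before a long training run starts. This is only a sketch; describe_lang is a hypothetical helper name, and the mapping is copied from the table rather than read from the dataset.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a language-combination name to the knowledge and
# utterance languages listed in the table above.
describe_lang() {
  case "$1" in
    ko) echo "knowledge=Korean utterance=Korean" ;;
    ke) echo "knowledge=Korean utterance=English" ;;
    ek) echo "knowledge=English utterance=Korean" ;;
    en) echo "knowledge=English utterance=English" ;;
    *)  echo "unknown language combination: $1" >&2; return 1 ;;
  esac
}

LANG_TYPE=ek
describe_lang "${LANG_TYPE}"   # prints: knowledge=English utterance=Korean
```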

Test splits

Name	Topic
random_split	Seen
topic_split	Unseen
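
The split name is passed to ParlAI as the last component of the task string. The sketch below maps the Seen/Unseen labels to a task string. task_for_split is a hypothetical helper name. The topic_split string is the one used by the commands later in this README; the random_split string is an assumption made by analogy and should be checked against the task definition.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a Seen/Unseen label into a ParlAI task string.
# Assumption: random_split follows the same naming pattern as topic_split.
task_for_split() {
  case "$1" in
    seen)   echo "wizard_of_wikipedia_ko:generator:random_split" ;;
    unseen) echo "wizard_of_wikipedia_ko:generator:topic_split" ;;
    *)      echo "unknown split: $1" >&2; return 1 ;;
  esac
}

TASK="$(task_for_split unseen)"
echo "${TASK}"   # prints: wizard_of_wikipedia_ko:generator:topic_split
```

The resulting value could then be passed as -t "${TASK}" in place of the hard-coded task string.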

Training on KoWoW (ek dataset)

CUDA_VIS_DEV=0
LANG_TYPE=ek

NUM_EPOCHS=10
MODEL_TYPE=T5EndToEndTwoAgent
MODEL_NAME=mymodel

CUDA_VISIBLE_DEVICES="${CUDA_VIS_DEV}" parlai train_model \
  -t wizard_of_wikipedia_ko:generator:topic_split \
  --ln ${LANG_TYPE} \
  -m projects.wizard_of_wikipedia_ko.generator.t5:${MODEL_TYPE} \
  -mf model_data/model/${MODEL_NAME} \
  --t5-model-arch "KETI-AIR/ke-t5-base" \
  --t5-encoder-model-arch "KETI-AIR/ke-t5-small" \
  --log-every-n-secs 10 \
  --validation-patience 12 \
  --validation-metric ppl \
  --validation-metric-mode min \
  --validation-every-n-epochs 5 \
  -bs 4 \
  --max_knowledge 32 \
  --num-epochs ${NUM_EPOCHS} \
  --tensorboard-log true \
  --tensorboard-logdir ./model_data/tf_logs/${MODEL_NAME}

Testing on the topic split (Unseen) of the KoWoW ek dataset

CUDA_VIS_DEV=0
LANG_TYPE=ek
MODEL_NAME=mymodel

CUDA_VISIBLE_DEVICES="${CUDA_VIS_DEV}" parlai eval_model \
  -t wizard_of_wikipedia_ko:generator:topic_split \
  -dt test \
  --ln ${LANG_TYPE} \
  -mf model_data/model/${MODEL_NAME} \
  -bs 4 \
  --inference beam \
  --beam-size 4

Display samples - topic split (Unseen)

To display the predicted knowledge alongside each sample, include the enc_output field in --display-add-fields.

CUDA_VIS_DEV=0
LANG_TYPE=ek
MODEL_NAME=mymodel
NUM_EXAMPLES=300

CUDA_VISIBLE_DEVICES="${CUDA_VIS_DEV}" parlai display_model \
  -t wizard_of_wikipedia_ko:generator:topic_split \
  -dt test \
  --ln ${LANG_TYPE} \
  -mf ./model_data/model/${MODEL_NAME} \
  -bs 1 \
  --inference beam \
  --beam-size 4 \
  --display-add-fields checked_sentence,enc_output \
  -n ${NUM_EXAMPLES}

Acknowledgement

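This research was supported by funding from the Korean government (Ministry of Science and ICT). (Institute of Information & Communications Technology Planning & Evaluation (IITP), No. 2022-0-00320, Development of AI-based one-on-one complex dialogue technology through situation awareness and user understanding)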
