This repository has been archived by the owner on Oct 21, 2020. It is now read-only.

A project for the Zalo AI Challenge 2019, Vietnamese Wikipedia Question Answering task.

namnv1113/Nanibot_ZaloAIChallenge2019_VietnameseWikiQA

Zalo AI Challenge 2019 - Vietnamese Wikipedia Question Answering

General

This repository contains the work of the Nanibot Team on the Vietnamese Wikipedia Question Answering task of the Zalo AI Challenge 2019.

The work in this repository is based on previous work on a similar task (question answering for UIT regulations).

Structure

  • QASystem and Ultilities contain the source code, the base model, the fine-tuned models, and the dataset used in this project. A guide on how to set up the project and reproduce the results is also provided.
  • Dataset contains the dataset used in this project.

Team Members

How to run

Details on how to train and predict with the model are described here.

What we have tried

  • Applied BERT as the baseline for the QA problem defined by Zalo
  • Augmented the data with the SQuAD dataset via translation and de-noising, which gave a 1% F1 boost over the baseline model
  • Tried to improve BERT with different architectures (BERT + TextCNN, BERT with an additional fully-connected layer (1), (2)), but these yielded no improvement
  • Tried different loss functions for the classification problem ((squared) hinge loss, KLD loss, and focal loss) along with label smoothing, but these yielded no improvement
  • Augmented the data using back-translation
  • Applied multilingual RoBERTa to the problem
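
For readers unfamiliar with the loss variants listed above, the following is a minimal sketch of label smoothing, focal loss, and squared hinge loss for a single example. It assumes a binary (0/1) label and a predicted probability or raw score. The function names and default values for `eps`, `gamma`, and `alpha` are illustrative assumptions, not the settings used in this repository.

```python
import math

_EPS = 1e-7  # clipping constant to keep log() finite (illustrative choice)


def smooth_label(y, eps=0.1):
    """Binary label smoothing: pull a hard target y in {0, 1} toward 0.5."""
    return y * (1.0 - eps) + 0.5 * eps


def binary_cross_entropy(p, y):
    """Cross-entropy between predicted probability p and a (possibly soft) target y."""
    p = min(max(p, _EPS), 1.0 - _EPS)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a hard label y in {0, 1}.

    Down-weights well-classified examples by the factor (1 - p_t) ** gamma.
    """
    p = min(max(p, _EPS), 1.0 - _EPS)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def squared_hinge_loss(score, y):
    """Squared hinge loss on a raw score, with y in {0, 1} mapped to {-1, +1}."""
    t = 2 * y - 1
    return max(0.0, 1.0 - t * score) ** 2


if __name__ == "__main__":
    p, y = 0.9, 1
    print("BCE:", binary_cross_entropy(p, y))
    print("BCE with smoothed label:", binary_cross_entropy(p, smooth_label(y)))
    print("Focal:", focal_loss(p, y))
    print("Squared hinge (score=0.5):", squared_hinge_loss(0.5, y))
```

With `gamma=0` and `alpha=0.5`, the focal loss reduces to half the ordinary cross-entropy, which is a convenient sanity check when swapping losses.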

Our solution achieved an F1 score of 79.15% on the public test set, ranking 11th on the public leaderboard of the Zalo AI Challenge 2019 for the Vietnamese Wiki Question Answering problem.
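
As a reminder of what the F1 metric measures, here is a minimal sketch that computes F1 for binary predictions as the harmonic mean of precision and recall. It assumes 0/1 labels with 1 as the positive class. The official scoring script of the challenge is not part of this README, so the function below is only an illustration of the standard definition.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels, treating 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels for illustration only, not data from the challenge.
    y_true = [1, 1, 0, 0, 1]
    y_pred = [1, 0, 0, 1, 1]
    print(f1_score(y_true, y_pred))
```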
