Neural-Space/indic-transformers

MergedQUAD Dataset

MergedQUAD consists of splits for SQuAD-style question answering in Hindi. It combines examples taken from other multilingual SQuAD-style question answering datasets such as XQuAD and TyDi QA. The dataset was introduced in our paper, "Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages", which has been accepted as a workshop paper at ML-RSA (NeurIPS 2020). The paper presents an exhaustive study of transformer-based architectures on Indian languages such as Hindi, Bengali and Telugu. Our models are available on the HuggingFace model hub.
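Since the splits are described as SQuAD-based, a minimal sketch of reading them might look like the following. It assumes the files follow the standard SQuAD v1.1 JSON layout (`data` → `paragraphs` → `qas` → `answers`), which this README does not confirm. The helper names `load_squad_file` and `iter_squad_examples` and the tiny inline sample are hypothetical and are not taken from the dataset.

```python
import json
from typing import Any, Dict, Iterator


def load_squad_file(path: str) -> Dict[str, Any]:
    """Read a SQuAD-style JSON file (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_squad_examples(squad: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per question from a SQuAD v1.1-style dict."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                yield {
                    "id": qa["id"],
                    "question": qa["question"],
                    "context": context,
                    # Each answer is (text, character offset into context).
                    "answers": [
                        (a["text"], a["answer_start"]) for a in qa["answers"]
                    ],
                }


# Hand-made sample in the assumed layout (NOT an actual MergedQUAD record).
answer = "दिल्ली"
sample = {
    "data": [
        {
            "title": "sample",
            "paragraphs": [
                {
                    "context": answer + " भारत की राजधानी है।",
                    "qas": [
                        {
                            "id": "sample-0",
                            "question": "भारत की राजधानी क्या है?",
                            "answers": [{"text": answer, "answer_start": 0}],
                        }
                    ],
                }
            ],
        }
    ]
}

examples = list(iter_squad_examples(sample))
for ex in examples:
    text, start = ex["answers"][0]
    # Sanity check: the offset should point at the answer span in the context.
    print(ex["id"], ex["context"][start:start + len(text)] == text)
```

For a real split, `iter_squad_examples(load_squad_file(path))` would be the equivalent call, with `path` pointing at whichever file the repository provides. The offset check is worth keeping when merging examples from several sources, because `answer_start` is a character index and can drift if the text is re-normalized.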

Citation

If you use this work, please cite:

@misc{jain2020indictransformers,
      title={Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages}, 
      author={Kushal Jain and Adwait Deshpande and Kumar Shridhar and Felix Laumann and Ayushman Dash},
      year={2020},
      eprint={2011.02323},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
