GigaBERT

This repo contains pre-trained models and the code-switched data generation script for GigaBERT:

@inproceedings{lan2020gigabert,
  author     = {Lan, Wuwei and Chen, Yang and Xu, Wei and Ritter, Alan},
  title      = {An Empirical Study of Pre-trained Transformers for Arabic Information Extraction},
  booktitle  = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year       = {2020}
} 

Fine-tuning Experiments

Please check Yang Chen's GitHub for code and data.


Checkpoints

The pre-trained models can be found here: GigaBERT-v3 and GigaBERT-v4.

Please contact Wuwei Lan for code-switched GigaBERT models with different configurations.
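
Since GigaBERT is a BERT-style model, the checkpoints can typically be loaded with the Hugging Face Transformers library. The snippet below is a minimal sketch, assuming the checkpoint is in standard BERT/Transformers format; the model identifier is a placeholder to replace with your downloaded checkpoint path or hub ID.

from transformers import AutoTokenizer, AutoModel

# Placeholder: substitute the local path or hub ID of the GigaBERT checkpoint you downloaded.
model_id = "path/to/gigabert-checkpoint"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# Encode a short sentence and inspect the contextual embeddings.
inputs = tokenizer("GigaBERT is pre-trained on Arabic and English text.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)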

License

Apache License 2.0

Acknowledgement

This material is based in part on research sponsored by IARPA via the BETTER program (2019-19051600004).