achyudh and Ashutosh-Adhikari Integrate BERT into Hedwig (#29) (#11)
* Fix package imports

* Update README.md

* Fix bug due to TAR/AR attribute check

* Add BERT models

* Add BERT tokenizer

* Return logits from the model.py

* Remove unused classes in models/bert

* Return logits from the model.py (#12)

* Remove unused classes in models/bert (#13)

* Add initial main file

* Add args for BERT

* Add partial support for BERT

* Initialize training and optimization

* Draft the structure of Trainers for BERT

* Remove duplicate tokenizer

* Add utils

* Move optimization to utils

* Add more structure for trainer

* Refactor the trainer (#15)

* Refactor the trainer

* Add more edits

* Add support for our datasets

* Add evaluator

* Split data4bert module into multiple processors

* Refactor BERT tokenizer

* Integrate BERT into Castor framework (#17)

* Remove unused classes in models/bert

* Split data4bert module into multiple processors

* Refactor BERT tokenizer

* Add multilabel support in BertTrainer

* Add multilabel support in BertEvaluator

* Add get_test_samples method in dataset processors

* Fix args.py for BERT

* Add support for Reuters, IMDB datasets for BERT

* Revert "Integrate BERT into Castor framework (#17)"

This reverts commit e4244ec.

* Fix paths to datasets in dataset classes and args

* Add SST dataset

* Add hedwig-data instructions to README.md

* Fix KimCNN README

* Fix RegLSTM README

* Fix typos in README

* Remove trec_eval from README

* Add tensorboardX to requirements.txt

* Rename processors module to bert_processors

* Add method to print metrics after training

* Add model check-pointing and early stopping for BERT

* Add logos

* Update README.md

* Fix code comments in classification trainer

* Add support for AAPD, Sogou, AGNews and Yelp2014

* Fix bug that deleted saved models

* Update README for HAN

* Update README for XML-CNN

* Remove redundant TODOs from the READMEs

* Fix logo in README.md

* Update README for Char-CNN

* Fix all the READMEs

* Resolve conflict

* Fix typos

* Re-add SST2 processor

* Add support for evaluating trained model

* Update args.py

* Resolve issues due to DataParallel wrapper on saved model

* Remove redundant Yelp processor

* Fix bug for safely creating the saving directory

* Change checkpoint paths to timestamps

* Remove unwanted string.strip() from tokenizer

* Create save path if it doesn't exist

* Decouple model checkpoints from code

* Remove model choice restrictions for BERT

* Remove model/distill driver

* Simplify checkpoint directory creation
Latest commit 7d24958 Apr 14, 2019