This repository has been archived by the owner on Jan 29, 2020. It is now read-only.

bentrevett/bag-of-tricks-for-efficient-text-classification


Bag of Tricks for Efficient Text Classification

Implementation of Bag of Tricks for Efficient Text Classification in PyTorch using TorchText
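The model in the paper is essentially a linear classifier over averaged word (and n-gram) embeddings. A minimal PyTorch sketch of that idea, assuming a batch-first tensor of token indices (the class and argument names here are illustrative, not necessarily those used in this repository):

```python
import torch
import torch.nn as nn

class FastText(nn.Module):
    """Bag-of-embeddings classifier: embed, average over the sequence, project."""
    def __init__(self, vocab_size, embedding_dim, output_dim, pad_idx=1):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_idx)
        self.fc = nn.Linear(embedding_dim, output_dim)

    def forward(self, text):
        # text: [batch size, seq len]
        embedded = self.embedding(text)   # [batch size, seq len, embedding dim]
        pooled = embedded.mean(dim=1)     # average embeddings across the sequence
        return self.fc(pooled)            # [batch size, output dim]
```

Because the only learned components are an embedding table and one linear layer, both training and inference are fast even with large vocabularies.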

Things you can try:

  • Use n-grams by setting N_GRAMS > 1. Note: this slows down pre-processing.
  • Reduce the vocabulary size by setting VOCAB_MAX_SIZE or by increasing VOCAB_MIN_FREQ.
  • Train on truncated sequences by setting MAX_LENGTH.
  • Change the tokenizer to a built-in one, such as the spaCy tokenizer, by setting TOKENIZER = 'spacy'. Note: this slows down pre-processing considerably.
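Appending n-grams to each example's token list is the paper's central trick, and explains why N_GRAMS > 1 slows pre-processing. A minimal sketch of what that expansion could look like (make_ngrams is a hypothetical helper for illustration; the repository's actual pre-processing may differ):

```python
def make_ngrams(tokens, n_grams):
    """Append all 2-grams up to n_grams-grams (joined with spaces) to the
    original token list, so they share one embedding table with the unigrams."""
    result = list(tokens)
    for n in range(2, n_grams + 1):
        for i in range(len(tokens) - n + 1):
            result.append(' '.join(tokens[i:i + n]))
    return result
```

For example, `make_ngrams(['the', 'cat', 'sat'], 2)` yields the three unigrams followed by `'the cat'` and `'cat sat'`. The extra tokens enlarge the vocabulary, which is why capping it with VOCAB_MAX_SIZE or VOCAB_MIN_FREQ matters more when n-grams are enabled.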

