add vocab_size to sentencepiece
aayux committed Nov 19, 2018
1 parent 69ed7b4 commit 5ffbe8b
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion ulmfit/pretrain_lm.py
@@ -70,7 +70,7 @@ def pretrain_lm(dir_path, lang='en', cuda_id=0, qrnn=True, subword=False, max_vo
read_file(trn_path, 'train')
read_file(val_path, 'valid')

- sp = get_sentencepiece(dir_path, trn_path, name)
+ sp = get_sentencepiece(dir_path, trn_path, name, vocab_size=max_vocab)

data_lm = TextLMDataBunch.from_csv(dir_path, **sp)
itos = data_lm.train_ds.vocab.itos
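
The change above forwards `max_vocab` to the SentencePiece helper as `vocab_size`. The body of `get_sentencepiece` is not shown in this diff, so the following is only a minimal sketch of how such a value might be turned into a SentencePiece trainer flag. The helper name `build_spm_train_args` and the file names are hypothetical and are not part of this repository; only the `--input`, `--model_prefix`, and `--vocab_size` flags belong to SentencePiece itself.

```python
def build_spm_train_args(input_path, model_prefix, vocab_size=None):
    """Hypothetical helper: assemble a SentencePiece trainer flag string.

    When vocab_size is None the flag is omitted, so the trainer would
    fall back to its own default vocabulary size.
    """
    flags = [
        f"--input={input_path}",
        f"--model_prefix={model_prefix}",
    ]
    if vocab_size is not None:
        # Forward the caller's limit explicitly, as the commit does
        # with vocab_size=max_vocab.
        flags.append(f"--vocab_size={int(vocab_size)}")
    return " ".join(flags)


# Illustrative values only; the paths and the size are assumptions.
args = build_spm_train_args("train.txt", "sp-demo", vocab_size=30000)
print(args)
```

A string built this way could then be handed to `sentencepiece.SentencePieceTrainer.Train(args)`, assuming the `sentencepiece` package is installed and the input file exists.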