
Added PubMed embeddings computed by @jessepeng #519

Merged
merged 1 commit into release-0.4.1 from GH-518-pubmed-flair on Feb 19, 2019

Conversation

alanakbik
Collaborator

@jessepeng computed a character LM over PubMed abstracts and shared the models with us. This PR adds them as FlairEmbeddings.

Init with:

from flair.embeddings import FlairEmbeddings

embeddings_f = FlairEmbeddings('pubmed-forward')
embeddings_b = FlairEmbeddings('pubmed-backward')
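
For example, a quick usage sketch with both directions stacked (the example sentence is just a placeholder):

from flair.data import Sentence
from flair.embeddings import FlairEmbeddings, StackedEmbeddings

# hypothetical example sentence from the biomedical domain
sentence = Sentence('The BRCA1 gene is associated with breast cancer .')

# stack the forward and backward PubMed embeddings
stacked = StackedEmbeddings([
    FlairEmbeddings('pubmed-forward'),
    FlairEmbeddings('pubmed-backward'),
])

# each token now carries the concatenated forward/backward embedding
stacked.embed(sentence)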

@alanakbik alanakbik merged commit aaa8a2f into release-0.4.1 Feb 19, 2019
@alanakbik alanakbik deleted the GH-518-pubmed-flair branch February 19, 2019 13:28
@khituras
Contributor

khituras commented Feb 19, 2019

Is the size of the hidden layer(s) and the number of layers known for these models? This would be interesting information for comparative experiments.

@alanakbik
Collaborator Author

Hi @khituras - I believe the model was trained with a hidden size of 1150, 3 layers, and BPTT truncated at a sequence length of 240. It was trained on only a 5% sample of PubMed abstracts up to 2015, which is 1,219,734 abstracts.

@jessepeng is this correct?

@jessepeng

Yes, this is correct. Below are the hyperparameters used for training:
• 3-layer LSTM
• Hidden size 1150
• Embedding size 200
• Dropout 0.5
• Sequence length 240
• Learning rate 20
• Batch size 100
• Annealing 0.25
• Gradient clipping 0.25
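
For reference, a rough sketch of how these settings would map onto Flair's LanguageModelTrainer (corpus and output paths are placeholders; the anneal_factor/clip keyword names are an assumption and may differ between Flair versions):

from flair.data import Dictionary
from flair.models import LanguageModel
from flair.trainers.language_model_trainer import LanguageModelTrainer, TextCorpus

# default character dictionary shipped with Flair
dictionary = Dictionary.load('chars')

# character-level PubMed corpus, forward direction (placeholder path)
corpus = TextCorpus('/path/to/pubmed/corpus',
                    dictionary,
                    True,
                    character_level=True)

# language model with the hyperparameters listed above
language_model = LanguageModel(dictionary,
                               is_forward_lm=True,
                               hidden_size=1150,
                               nlayers=3,
                               embedding_size=200,
                               dropout=0.5)

trainer = LanguageModelTrainer(language_model, corpus)
trainer.train('resources/language_model/pubmed-forward',  # placeholder output path
              sequence_length=240,
              mini_batch_size=100,
              learning_rate=20,
              anneal_factor=0.25,   # assumed keyword for LR annealing
              clip=0.25)            # assumed keyword for gradient clipping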

@khituras
Contributor

@jessepeng Thank you so much for this specification. Was there a specific evaluation strategy that led you to choose these parameters?
@alanakbik Will these details be included in the documentation for the embeddings? I think that would be very important - for any embedding, actually - so that users know what they are working with and whether it would make sense to train embeddings themselves with different parameters.

@alanakbik
Collaborator Author

Yes, good point - we'll add this to the documentation with the release!

@pinal-patel

Could you share the statistics of the test and validation datasets, and the perplexity on each?

@jessepeng

@khituras No, I chose most of those parameters because they are the standard parameters of Flair. I did, however, choose the number of layers and hidden dimensions to match a word-level LM I also trained on the same corpus. The architecture and hyperparameters for that LM follow Merity et al. 2017.

@pinal-patel The dataset, consisting of the aforementioned 1,219,734 abstracts, was split 60/10/30 into train/validation/test sets. The perplexities on train/val/test were 2.15/2.08/2.07 for the forward model and 2.19/2.10/2.09 for the backward model.

@shreyashub

@jessepeng Did you start the training from scratch on PubMed abstracts, or did you fine-tune a model already trained on Wikipedia or a similar dataset?
Also, how long did training take and on what hardware?

@shreyashub

@jessepeng ?
@alanakbik, if I need to further train these embeddings on more data, what changes need to be made to Tutorial 9?

@jessepeng

@shreyashub I started training from scratch. I trained each direction for about 10 days on a GeForce GTX Titan X.

@alanakbik
Collaborator Author

Hello @shreyashub, to fine-tune an existing LanguageModel, you only need to load an existing model instead of instantiating a new one. The rest of the training code remains the same as in Tutorial 9:

from flair.data import Dictionary
from flair.embeddings import FlairEmbeddings
from flair.trainers.language_model_trainer import LanguageModelTrainer, TextCorpus

# instantiate an existing LM, such as one from the FlairEmbeddings
language_model = FlairEmbeddings('news-forward-fast').lm

# fine-tuning reuses the existing character dictionary and direction
dictionary: Dictionary = language_model.dictionary
is_forward_lm = language_model.is_forward_lm

# get your corpus, processed at the character level and in the same direction
corpus = TextCorpus('/path/to/your/corpus',
                    dictionary,
                    is_forward_lm,
                    character_level=True)

# use the model trainer to fine-tune this model on your corpus
trainer = LanguageModelTrainer(language_model, corpus)

trainer.train('resources/taggers/language_model',
              sequence_length=10,
              mini_batch_size=10,
              max_epochs=10)

Note that when you fine-tune, you automatically use the same character dictionary as before and automatically copy the direction (forward/backward).

@shreyashub

shreyashub commented Jun 29, 2019

Since PooledFlairEmbeddings('pubmed-forward').lm does not exist, do we train with FlairEmbeddings and then use the result in PooledFlairEmbeddings? But I don't think that makes sense. What can I do? @alanakbik

@alanakbik
Collaborator Author

Yes, that works - the pooled variant just builds on top of FlairEmbeddings, so you can train with FlairEmbeddings('pubmed-forward').lm and then use the resulting embeddings either as FlairEmbeddings or as PooledFlairEmbeddings.
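
For example, a minimal sketch, assuming the fine-tuned model was saved to a checkpoint such as 'resources/taggers/language_model/best-lm.pt' (placeholder path) and that PooledFlairEmbeddings accepts a FlairEmbeddings instance:

from flair.data import Sentence
from flair.embeddings import FlairEmbeddings, PooledFlairEmbeddings

# load the fine-tuned character LM from its checkpoint (placeholder path)
fine_tuned = FlairEmbeddings('resources/taggers/language_model/best-lm.pt')

# the same model can also back the pooled variant
pooled = PooledFlairEmbeddings(fine_tuned)

# embed a (hypothetical) example sentence with the pooled embeddings
sentence = Sentence('Protein kinase inhibitors were evaluated in clinical trials .')
pooled.embed(sentence)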
