Spanish clinical flair embeddings #2292

matirojasg · 2021-06-03T18:15:30Z

Hi, in the research group where I work (http://pln.cmm.uchile.cl/grav/en), we have trained flair embeddings in Spanish in the clinical context.

We fine-tuned an existing LM (es-forward/es-backward, provided by @iamyihwa) for more than a week on a Chilean clinical dataset (created by the same group) of around 50 million words. The perplexity values are good, and text sampled from the model reads close to natural language.
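
For readers unfamiliar with the perplexity figure mentioned above, here is a minimal sketch of how it is commonly computed: the exponential of the average negative log-likelihood per token. The function name `perplexity` and the toy input are illustrative assumptions, not code from our training run.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities.

    Defined as exp of the mean negative log-likelihood; lower is better.
    `log_probs` is a hypothetical input: one log-probability per token
    (or per character, for a character-level LM such as Flair's).
    """
    if not log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(log_probs) / len(log_probs)
    return math.exp(avg_nll)


# Toy check: a model that assigns probability 0.25 to every character
# is as uncertain as a uniform choice among four characters.
uniform = [math.log(0.25)] * 8
print(perplexity(uniform))
```

A perplexity near 4 in the toy case means the model is, on average, choosing among about four equally likely characters at each step.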

We would like to know the steps to follow to upload these models to the site. Do we have to test it on the NER task for some medical dataset in Spanish and show you the results?

To our knowledge, there is no flair embedding model in Spanish in the clinical context :)

If you want to know more about what this corpus is about, you can see the paper published last year: https://www.aclweb.org/anthology/2020.clinicalnlp-1.32/

alanakbik · 2021-06-03T19:06:14Z

Hello @matirojasg, thanks for offering to add the models. People would surely find this useful!

The standard way would be to put the model on a server and open a pull request that adds auto-downloading functionality to the FlairEmbeddings class. If you like, I can put your models on our faculty server and also do the pull request (this is mostly how we've been doing it). Alternatively, you can do the PR and/or put the models on your own server. Either option is fine with me!
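
As a rough illustration of what "auto-downloading functionality" means, the sketch below maps a short model name to a download URL and a local cache path. It is not the actual FlairEmbeddings code: the short names, file names, cache location, and the helper `resolve_model` are all assumptions made for illustration, and the base URL is the server address given later in this thread.

```python
from pathlib import Path

# Server address as given later in this thread.
BASE_URL = "https://flair.informatik.hu-berlin.de/resources/embeddings/flair"

# Hypothetical short names and file names, for illustration only.
MODEL_FILES = {
    "es-clinical-forward": "es-clinical-forward.pt",
    "es-clinical-backward": "es-clinical-backward.pt",
}


def resolve_model(name, cache_dir=Path.home() / ".flair" / "embeddings"):
    """Map a model identifier to (download_url, local_path).

    Known short names resolve to a URL on the server plus a path in the
    local cache. Anything else is treated as a path to a local model
    file, so the URL is None. No network access happens here.
    """
    key = name.lower()
    if key not in MODEL_FILES:
        return None, Path(name)
    filename = MODEL_FILES[key]
    return f"{BASE_URL}/{filename}", Path(cache_dir) / filename


url, local_path = resolve_model("es-clinical-forward")
print(url)
print(local_path)
```

A caller would download `url` to `local_path` only if the file is not already cached, then load the model from disk. Keeping the mapping in one dictionary means a new model needs only a new entry.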

matirojasg · 2021-06-03T20:05:22Z

Thank you for the quick response.

"If you want, I can put your models on our faculty server and also do the pull request (that's how we've been doing it most of the time)." I prefer this option.

Should I share the files with you via Google Drive? Which files exactly do you need?

alanakbik · 2021-06-06T11:55:55Z

Hello @matirojasg, yes, if you send me an email with a link to a Drive folder containing the models, I can put them on our server. Thanks again!

matirojasg · 2021-06-07T01:15:10Z

Here is the link to the clinical models in Spanish. Let me know if any file is missing or if you can't access the drive.

https://drive.google.com/drive/folders/1M1b5FzZqEebTF7B2l58GQvciF4SXP5dT?usp=sharing

Thanks!

alanakbik · 2021-06-14T12:25:37Z

Hi @matirojasg, I put them on our server: https://flair.informatik.hu-berlin.de/resources/embeddings/flair/

Would you like to do the PR for integration into Flair, or should I?

matirojasg · 2021-06-14T13:57:04Z

Could you do it, please? Thank you :)

GH-2292: add support for Spanish clinical Flair embeddings

codemaster-22 · 2021-07-11T12:50:36Z

Hi @matirojasg, could you suggest a corpus size for fine-tuning the 'news-forward' language model on English tweets? I am currently planning to use around 50 million words, as you mentioned. Would that be enough?

stale · 2021-11-09T09:51:29Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

matirojasg added the "question" label (Further information is requested) Jun 3, 2021

alanakbik added a commit that referenced this issue Jun 29, 2021

GH-2292: add support for Spanish clinical Flair embeddings

2b42a03

alanakbik added a commit that referenced this issue Jul 1, 2021

Merge pull request #2323 from flairNLP/GH-2292-spanish-clinical

8b9cf18

stale bot added the "wontfix" label (This will not be worked on) Nov 9, 2021

stale bot closed this as completed Nov 16, 2021
