MedMentions Dictionary file created #3

Closed
saranyakrishm opened this issue May 5, 2022 · 1 comment
Comments

@saranyakrishm

Hello Team
Thank you for your great contribution. Could you please briefly explain how the MedMentions dictionary file used to run the evaluations was created?

Thanks
Saranya

@hardyqr
Collaborator

hardyqr commented May 10, 2022

Hi Saranya, thanks for your interest in our work!
We downloaded UMLS2017AA's MRCONSO file and extracted the CUI-name pairs in the following form:

C0079564||htlv-ii rex protein
C0162999||oleoylamine
C0347197||benign mouth neoplasm
...

In the end, the dictionary file should contain around 7.4M unique lines (after removing duplicates) and around 3.4M unique concepts (CUIs). Hope this helps!

Best,
Fangyu
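
For readers who want to reproduce this, the extraction described above could be sketched roughly as follows. This is a minimal sketch, not the maintainers' actual script. It assumes the standard pipe-delimited MRCONSO.RRF layout, where the CUI is field 0, the language code (LAT) is field 1, and the name string (STR) is field 14. Lowercasing is inferred from the sample lines above; the optional `language` filter is an assumption, since the reply does not say whether non-English names were kept. The function names and file paths are hypothetical.

```python
# Column positions in MRCONSO.RRF (pipe-delimited).
CUI_COL, LAT_COL, STR_COL = 0, 1, 14


def extract_cui_name_pairs(lines, language=None, lowercase=True):
    """Yield unique 'CUI||name' strings from MRCONSO.RRF-style lines.

    language:  optional LAT code such as "ENG" (assumption: the thread
               does not state whether a language filter was applied).
    lowercase: lowercase names, as the sample lines in the thread suggest.
    """
    seen = set()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= STR_COL:
            continue  # skip malformed or empty rows
        if language is not None and fields[LAT_COL] != language:
            continue
        name = fields[STR_COL].strip()
        if lowercase:
            name = name.lower()
        if not name:
            continue
        pair = f"{fields[CUI_COL]}||{name}"
        if pair not in seen:  # drop duplicated lines
            seen.add(pair)
            yield pair


def build_dictionary(mrconso_path, out_path, **kwargs):
    """Write the deduplicated pairs to out_path, one per line."""
    with open(mrconso_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for pair in extract_cui_name_pairs(src, **kwargs):
            dst.write(pair + "\n")


if __name__ == "__main__":
    # Hypothetical paths; point these at your own UMLS 2017AA download.
    build_dictionary("MRCONSO.RRF", "medmentions_dictionary.txt")
```

If the resulting line and CUI counts differ noticeably from the figures quoted above, the language filter and lowercasing choices are the first things to revisit.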

@hardyqr hardyqr closed this as completed May 19, 2022