How to convert text to Input.pubtator (NER) required by BioRED #8

Open
Khyati-Microcrispr opened this issue Jun 4, 2024 · 4 comments

@Khyati-Microcrispr

Hi,

BioRED ran without problems; thank you for your help. I have one more favor to ask: how can I perform named entity recognition (NER) and entity linking in the format BioRED requires for relation prediction? My input data consists of text, titles, and PubMed IDs. I tried AIONER but could not get it to work, and the issue I raised on AIONER's GitHub has had no reply. Could you share working AIONER code and the environment setup, including the CUDA and cuDNN versions? I am on Ubuntu 22.04 with an RTX 4090 GPU. Alternatively, if there is another way to accomplish this task, please let me know.

@ptlai
Collaborator

ptlai commented Jun 5, 2024

Hi @Khyati-Microcrispr,

AIONER does not link entities to their corresponding concept identifiers (e.g., NCBI Gene IDs), but BioREx relies on those identifiers. Within PubTator3, we have integrated several normalization tools, including GNorm2, TaggerOne, the NLM-Chem model, and tmVar3, to support the normalization process (https://www.ncbi.nlm.nih.gov/research/pubtator3/api). If you only need PubMed abstracts, we have already processed them, and the results are available at https://ftp.ncbi.nlm.nih.gov/pub/lu/PubTator3. For questions about the AIONER tool, you may contact Dr. Luo (lingluo@dlut.edu.cn).
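For readers who already have entity mentions and concept identifiers from some NER and normalization step, the sketch below shows one way to write them in the general PubTator text layout: a `PMID|t|title` line, a `PMID|a|abstract` line, then one tab-separated line per annotation with character offsets counted over the title, a single space, and the abstract. This is only an illustration under assumptions, not the BioREx authors' code: the helper names (`Annotation`, `locate`, `to_pubtator`, `build_example`), the PMID, the example text, and the type labels and identifier strings are all hypothetical, and the exact labels BioREx expects should be checked against the sample input shipped with the repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    start: int        # character offset into "title + ' ' + abstract"
    end: int          # exclusive end offset
    mention: str      # surface text of the entity
    entity_type: str  # type label (assumed; check what your pipeline expects)
    identifier: str   # concept ID from a normalization tool (assumed format)


def locate(text: str, mention: str, entity_type: str,
           identifier: str, search_from: int = 0) -> Annotation:
    """Hypothetical helper: build an Annotation by searching for the mention."""
    start = text.find(mention, search_from)
    if start < 0:
        raise ValueError(f"mention {mention!r} not found in text")
    return Annotation(start, start + len(mention), mention,
                      entity_type, identifier)


def to_pubtator(pmid: str, title: str, abstract: str,
                annotations: List[Annotation]) -> str:
    """Serialize one document and its annotations as PubTator-style text."""
    text = f"{title} {abstract}"
    lines = [f"{pmid}|t|{title}", f"{pmid}|a|{abstract}"]
    for ann in sorted(annotations, key=lambda a: a.start):
        # Guard against offsets that do not match the text they point at.
        if text[ann.start:ann.end] != ann.mention:
            raise ValueError(f"offset mismatch for {ann.mention!r}")
        lines.append("\t".join([pmid, str(ann.start), str(ann.end),
                                ann.mention, ann.entity_type,
                                ann.identifier]))
    # Documents are separated by a blank line.
    return "\n".join(lines) + "\n\n"


def build_example() -> str:
    pmid = "12345"  # placeholder, not a real record
    title = "BRCA1 mutations in breast cancer."
    abstract = "We studied BRCA1 carriers."
    text = f"{title} {abstract}"
    annotations = [
        locate(text, "BRCA1", "Gene", "672"),
        locate(text, "breast cancer", "Disease", "D001943"),
        # Second BRCA1 mention: start searching after the title.
        locate(text, "BRCA1", "Gene", "672", search_from=len(title)),
    ]
    return to_pubtator(pmid, title, abstract, annotations)


if __name__ == "__main__":
    print(build_example(), end="")
```

The offset check in `to_pubtator` is there because mismatched offsets are an easy mistake when titles and abstracts are concatenated by hand; failing early is usually preferable to feeding misaligned annotations into a relation extraction model.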

@Khyati-Microcrispr
Author

Khyati-Microcrispr commented Jul 5, 2024 via email

@ptlai
Collaborator

ptlai commented Jul 9, 2024

Hi @Khyati-Microcrispr,

We processed all PubMed abstracts, totaling around 37 million, but only a quarter of the abstracts contained relations.

@Khyati-Microcrispr
Author

Khyati-Microcrispr commented Aug 5, 2024 via email
