how to convert text to Input.pubtator (NER) required by BIORED #8
Comments
AIONER does not link entities to their corresponding concept identifiers (e.g., NCBI gene IDs). However, BioREx relies on these concept identifiers. Within PubTator3, we have integrated several normalization tools, including GNorm2, TaggerOne, the NLM-Chem model, and tmVar3, to support the normalization process (https://www.ncbi.nlm.nih.gov/research/pubtator3/api). If you just want to process PubMed abstracts, we have processed them, and the results can be accessed at https://ftp.ncbi.nlm.nih.gov/pub/lu/PubTator3. For questions regarding the AIONER tool, you may contact Dr. Luo (lingluo@dlut.edu.cn).
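For readers who want already-normalized annotations for a handful of PubMed IDs rather than the bulk FTP files, a minimal sketch of a request to the PubTator3 API might look like the following. The export endpoint path and the `pmids` query parameter are assumptions based on my reading of the API page linked above and should be checked there; `build_export_url` and `fetch_pubtator` are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: export endpoint as described on the PubTator3 API page;
# confirm the exact path and parameters there before relying on it.
EXPORT_URL = (
    "https://www.ncbi.nlm.nih.gov/research/pubtator3-api/"
    "publications/export/pubtator"
)


def build_export_url(pmids):
    """Return the export URL for an iterable of PubMed IDs."""
    query = urlencode({"pmids": ",".join(str(p) for p in pmids)}, safe=",")
    return f"{EXPORT_URL}?{query}"


def fetch_pubtator(pmids, timeout=30):
    """Download annotations for the given PMIDs as PubTator-format text."""
    with urlopen(build_export_url(pmids), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_pubtator([33325008])` would perform a network request, so it is worth batching PMIDs and respecting any rate limits the API page states.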
Hi, may I ask how many papers you have processed? Using the FTP site, I was
only able to get relations for 9 million papers.
(In reply to Po-Ting Lai's comment of Wed, 5 Jun 2024, 22:50.)
Hi @Khyati-Microcrispr, we processed all PubMed abstracts, about 37 million in total, but only about a quarter of them contain relations.
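The two figures in this exchange are consistent: a quarter of roughly 37 million abstracts is about 9.25 million, which matches the 9 million papers with relations seen on the FTP site. A minimal sketch for checking this count on a downloaded relation file is below. It assumes each line is tab-separated with the PMID in the first column; `count_pmids` is a hypothetical helper, and the sample rows are made-up placeholders rather than real data.

```python
def count_pmids(lines):
    """Count distinct PMIDs, assuming the PMID is the first tab-separated field."""
    pmids = set()
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            pmids.add(line.split("\t", 1)[0])
    return len(pmids)


# Made-up placeholder rows (not real PubTator3 data): two relations for one
# paper and one relation for another.
sample = [
    "111\tassociate\tGene|1\tDisease|D1",
    "111\tassociate\tGene|2\tDisease|D1",
    "222\tassociate\tChemical|C1\tDisease|D2",
]
n_with_relations = count_pmids(sample)

# Back-of-the-envelope check of the numbers quoted in the thread.
expected = 37_000_000 // 4
```

For a real file, the same function can be fed a line iterator such as `gzip.open(path, "rt")`, where `path` points at the downloaded relation file.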
Hi, I hope this message finds you well.
I am writing to address a concern regarding the identification of
chemicals, antibodies, and peptides in PubTator. Specifically, I have
encountered issues where certain entities, such as "Tirzepatide" and
"Pascolizumab," do not have unique IDs or are clustered with numerous other
entities.
For example:
- Searching for "Tirzepatide" returns multiple entries without unique IDs.
- Similarly, "Pascolizumab" results in clusters of entities without clear unique identification.
Here are some search results from the dataset:
- *Tirzepatide:*
33325008 Chemical - cTirzepatide|twincretin|DPP4i|aTirzepatide|TZP|anti-hyperglycaemic agents|bTirzepatide|SGLT2i|oral anti-hyperglycaemic medication|OAM|SU|diacid PubTator3
- *Pascolizumab:*
27637004 Chemical - pitakinra|lebrikizunab|pascolizumab PubTator3
24032029 Chemical - PIP|steroidsensitive|molecular|inositol triphosphate|SCH 900117|agents|CNTO|RG4934|pascolizumab|pathogen|molecular pattern molecules PubTator3
32380052 Chemical - SPMs|resolvins|ICS|SB010|beta2-agonists|SABA|inhaled corticosteroids|microbial associated molecular patterns|anti|LABA|muscarinic acetylcholine receptor antagonist|Aerovant|leukotriene (LT)B4|methylene and toluidine blue|pascolizumab|MAMPs PubTator3
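The rows above can be read as five fields: PMID, entity type, concept ID, a pipe-separated list of mentions, and the resource. A minimal sketch for finding rows where a term appears but no identifier was assigned is below. It assumes the fields are tab-separated in the downloaded file and that `-` in the third column means "no concept ID"; both are assumptions inferred from the rows quoted here, and `parse_line` and `unnormalized_hits` are hypothetical helper names.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    pmid: str
    entity_type: str
    concept_id: str
    mentions: list
    resource: str


def parse_line(line):
    """Split one row into its five fields (ASSUMED to be tab-separated)."""
    pmid, entity_type, concept_id, mentions, resource = line.rstrip("\n").split("\t")
    return Annotation(pmid, entity_type, concept_id, mentions.split("|"), resource)


def unnormalized_hits(lines, term):
    """Return PMIDs of rows that mention `term` but carry no concept ID ('-')."""
    needle = term.lower()
    hits = []
    for line in lines:
        if not line.strip():
            continue
        ann = parse_line(line)
        # Substring match, so "Tirzepatide" also matches "cTirzepatide".
        if ann.concept_id == "-" and any(needle in m.lower() for m in ann.mentions):
            hits.append(ann.pmid)
    return hits


# One row quoted in this thread, re-joined with tabs.
row = "27637004\tChemical\t-\tpitakinra|lebrikizunab|pascolizumab\tPubTator3"
hits = unnormalized_hits([row], "Pascolizumab")
```

Grouping rows this way makes it clear that the "clusters" are simply all unnormalized chemical mentions from one paper sharing a single `-` row, not a claim that the mentions are synonyms.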
Could you please provide insights on whether there are plans to resolve
these issues, particularly regarding the assignment of unique IDs to these
entities and their relationships? Any improvements or updates in this
regard would be greatly appreciated.
Thank you for your attention to this matter.
Best regards,
Khyati
(In reply to Po-Ting Lai's comment of Wed, 10 Jul 2024, 00:52.)
Hi,
BioRED ran efficiently; thank you for your help. I have one more favor to ask: how can I perform named entity recognition (NER) and entity linking in the format BioRED requires for relation prediction? My input data consists of text, titles, and PubMed IDs. I tried AIONER, but it is not working, and the issue I raised on AIONER's GitHub has had no reply. Could you please provide the correct AIONER code and environment setup, including the CUDA and cuDNN versions? I am using Ubuntu 22.04 with an RTX 4090 GPU. Alternatively, if there is another way to accomplish this task, please let me know.
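As a starting point for the conversion asked about here, the sketch below writes titles and texts in the plain PubTator layout: a `PMID|t|title` line, a `PMID|a|abstract` line, and a blank line between documents. It covers only the text part. The entity annotation lines (tab-separated PMID, start offset, end offset, mention, type, identifier) still have to come from NER and normalization tools, and this sketch does not produce them. The function names and the `pmid`/`title`/`text` dictionary keys are hypothetical.

```python
def to_pubtator(pmid, title, abstract):
    """Format one document as PubTator title and abstract lines."""
    # Collapse newlines and repeated whitespace so each field stays on one line.
    title = " ".join(title.split())
    abstract = " ".join(abstract.split())
    return f"{pmid}|t|{title}\n{pmid}|a|{abstract}\n"


def write_pubtator(docs, path):
    """Write documents (dicts with 'pmid', 'title', 'text') to a PubTator file."""
    with open(path, "w", encoding="utf-8") as fh:
        for doc in docs:
            fh.write(to_pubtator(doc["pmid"], doc["title"], doc["text"]))
            fh.write("\n")  # a blank line separates documents


example = to_pubtator("123", "A  title", "Some\ntext")
```

Whitespace is normalized before writing because annotation offsets are conventionally counted over the title and abstract as they appear in the file, so any tool that adds annotations afterwards should run on this normalized text.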