How to download all the chemical compounds and their related data for an organism from LOTUS? #27
Comments
Hi! Thank you for your issue. We support a lot of custom searches (see https://lotus.naturalproducts.net/documentation), but not the specific one you requested. We might provide a SPARQL endpoint in the future to handle such requests; in the meantime, querying Wikidata directly seems a good option. I prepared a query you can easily adapt to your needs: https://w.wiki/5GSw. You can download the results directly as a tabular file there. Another option is to use https://pubchem.ncbi.nlm.nih.gov/classification/#hid=115 and search there directly; they also offer CSV download. More generally, the compounds' names are automatically generated, so we would advise being very cautious with them. Best
Thank you for your quick response and valuable suggestion. Can you check and tell me where I went wrong?
You were almost there! I think the query you want is: https://w.wiki/5Ggd. Yours was again querying the whole of Wikidata for molecules.
Thanks for the correction and insights.
A search for "Gentiana" on the LOTUS web page returned 483 natural products. Can you please explain the reason for the difference?
Hi, not exactly: the query I wrote for you returns structure-organism pairs, so the same structure can appear multiple times. If you want to reduce it to distinct structures, see https://w.wiki/5J73. Hope this clarifies things.
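To illustrate why the two counts differ, here is a minimal Python sketch of collapsing structure-organism pairs into distinct structures. The rows are hypothetical placeholders, not real LOTUS or Wikidata results, and `distinct_structures` is a helper name made up for this example.

```python
from collections import defaultdict


def distinct_structures(pairs):
    """Group (structure, organism) pairs by structure.

    Returns a dict mapping each structure to the set of organisms
    it was reported in, so len(result) is the distinct-structure count.
    """
    grouped = defaultdict(set)
    for structure, organism in pairs:
        grouped[structure].add(organism)
    return dict(grouped)


# Hypothetical rows: one structure reported in two taxa, another in one.
pairs = [
    ("STRUCTURE_A", "TAXON_X"),
    ("STRUCTURE_A", "TAXON_Y"),
    ("STRUCTURE_B", "TAXON_X"),
]

by_structure = distinct_structures(pairs)
print(len(pairs), "pairs,", len(by_structure), "distinct structures")
```

A pair-level query counts `STRUCTURE_A` twice here, while a distinct-structure query counts it once.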
Thank you
I'm trying to do something similar and following your examples, when I run:
I get 20968 results. However, when I try to include CAS ID and InChIKey information with the following:
I only get 7967 results. I imagine this is because the latter query doesn't return compounds without a CAS ID or InChIKey. Is it possible to return all metabolites found in the taxa and leave missing values for those properties as NaN?
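One common way to keep rows with missing properties in SPARQL is to wrap each optional property in an `OPTIONAL { ... }` block, so a compound is still returned when that property is absent. The sketch below is an assumption about what such a query could look like, not the query behind the short links in this thread. It uses the Wikidata properties P703 (found in taxon), P171 (parent taxon), P235 (InChIKey), and P231 (CAS Registry Number). The function names and the User-Agent string are made up for this example, and the taxon QID is left as a parameter for you to fill in.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_SPARQL_URL = "https://query.wikidata.org/sparql"


def build_query(taxon_qid):
    """Build a query where InChIKey and CAS number are optional columns."""
    return f"""
SELECT DISTINCT ?compound ?compoundLabel ?inchikey ?cas WHERE {{
  ?taxon wdt:P171* wd:{taxon_qid} .   # the taxon itself or any descendant
  ?compound wdt:P703 ?taxon .        # compound found in that taxon
  OPTIONAL {{ ?compound wdt:P235 ?inchikey . }}   # keep row if missing
  OPTIONAL {{ ?compound wdt:P231 ?cas . }}        # keep row if missing
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
"""


def run_query(query):
    """Send the query to the Wikidata endpoint and return the parsed JSON."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        WIKIDATA_SPARQL_URL + "?" + params,
        headers={"User-Agent": "lotus-issue-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def rows_from_bindings(result_json, columns):
    """Flatten SPARQL JSON results; unbound variables become None."""
    rows = []
    for binding in result_json["results"]["bindings"]:
        rows.append({c: binding.get(c, {}).get("value") for c in columns})
    return rows
```

Calling `rows_from_bindings(run_query(build_query(qid)), ["compound", "compoundLabel", "inchikey", "cas"])` with the QID of your taxon should give one dict per result row, with `None` where a property is missing. If you load the rows into pandas, those `None` values show up as missing values.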
I have an organism, and I want to download all the chemical compounds related to that organism, along with their SMILES strings and the species that produce them.
I searched on the web page and found all the chemical compound entries related to that organism. I then downloaded the SDF file, which was the only download option available, and converted it to Excel format.
But I realized that the file was missing the compound names.
What I want is: compound name, SMILES, and the species in which the compound is present.
Is it possible to get this from the LOTUS database by any means?
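For the SDF-to-spreadsheet step described above, here is a minimal pure-Python sketch that reads the title line and the `> <FIELD>` data items of each SDF record and writes them to CSV. It assumes a standard SDF layout with records terminated by `$$$$`. The field name `SMILES` and the sample record are hypothetical, and the fields actually present in a LOTUS export may differ. If the title line is empty in the export, the name column will be empty too.

```python
import csv
import io
import re

# Matches data headers such as "> <SMILES>" or ">  <NAME> (1)".
FIELD_HEADER = re.compile(r"^>.*?<([^>]+)>")


def _record_from_lines(lines):
    """Turn the lines of one SDF record into a dict of title + data fields."""
    record = {"title": lines[0].strip() if lines else ""}
    i = 0
    while i < len(lines):
        match = FIELD_HEADER.match(lines[i])
        i += 1
        if not match:
            continue
        values = []
        # A data item runs until the next blank line.
        while i < len(lines) and lines[i].strip():
            values.append(lines[i].strip())
            i += 1
        record[match.group(1)] = "\n".join(values)
    return record


def parse_sdf_fields(sdf_text):
    """Split SDF text on '$$$$' lines and parse each record."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            if current:
                records.append(_record_from_lines(current))
            current = []
        else:
            current.append(line)
    if any(line.strip() for line in current):  # last record without '$$$$'
        records.append(_record_from_lines(current))
    return records


def records_to_csv(records, columns):
    """Write the chosen columns to CSV text; missing fields stay empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical one-record SDF, for illustration only.
sample_sdf = "\n".join([
    "hypothetical compound", "  sketch", "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000", "M  END",
    "> <SMILES>", "CCO", "", "$$$$", "",
])
sample_records = parse_sdf_fields(sample_sdf)
sample_csv = records_to_csv(sample_records, ["title", "SMILES"])
```

The resulting CSV text can be saved to a file and opened in Excel. The SDF export does not necessarily carry the producing species, so that column may still need to come from a Wikidata query.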