Add DTXSIDs into RMassBank identifiers #215

Closed
schymane opened this issue May 7, 2019 · 8 comments · Fixed by #229

Comments

@schymane
Member

schymane commented May 7, 2019

The EPA now has preliminary web services available to retrieve DTXSIDs by InChIKey. We should add this to RMassBank so that DTXSIDs are retrieved for new records (note that the services may change again in the future).

https://github.com/MassBank/RMassBank/blob/master/R/webAccess.R

https://actorws.epa.gov/actorws/chemIdentifier/v01/resolve?identifier=IKHGUXGNUITLKF-UHFFFAOYSA-N
https://actorws.epa.gov/actorws/chemIdentifier/v01/resolve.json?identifier=IKHGUXGNUITLKF-UHFFFAOYSA-N
https://actorws.epa.gov/actorws/chemIdentifier/v01/resolve.xml?identifier=IKHGUXGNUITLKF-UHFFFAOYSA-N
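
For illustration, the three links above differ only in the format suffix, so a lookup by InChIKey could be sketched roughly as follows. This is a minimal Python sketch, not RMassBank code: the helper names `build_resolve_url` and `fetch_resolve_response` are hypothetical, the layout of the JSON reply is not assumed (it is returned as parsed, without picking out the DTXSID), and the endpoint is preliminary and may change.

```python
import json
import urllib.parse
import urllib.request

# Base URL copied from the links above; the service is preliminary and may move.
BASE_URL = "https://actorws.epa.gov/actorws/chemIdentifier/v01/resolve"


def build_resolve_url(inchikey, fmt="json"):
    """Build the resolve URL for an InChIKey (hypothetical helper).

    fmt is "json", "xml", or "" for the service's default format.
    """
    suffix = "." + fmt if fmt else ""
    query = urllib.parse.urlencode({"identifier": inchikey})
    return BASE_URL + suffix + "?" + query


def fetch_resolve_response(inchikey, timeout=10):
    """Return the parsed JSON reply, or None on a network error or non-JSON reply.

    The structure of the reply is deliberately not interpreted here.
    """
    try:
        url = build_resolve_url(inchikey, "json")
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers URLError/HTTPError and timeouts; ValueError covers bad JSON.
        return None


print(build_resolve_url("IKHGUXGNUITLKF-UHFFFAOYSA-N"))
```

Returning `None` instead of raising means a caller can treat an unreachable service the same way as a missing identifier and simply leave the field out.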

@adelenelai would you be interested in looking into this?

We are initially looking at doing this on the database side, to get DTXSIDs into records that have already been created:
MassBank/MassBank-data#66
and
MassBank/MassBank-web#80

@adelenelai

working on it

@meier-rene

We have added this in MassBank. If you like, you can skip this, because it's easy to post-process records in MassBank.

@tsufz
Member

tsufz commented May 14, 2019

@meier-rene I don't agree with processing everything in MassBank only. MassBank records should also be usable in a private repository with parsing tools, without MassBank being involved. DTXSID is an important identifier, so it should be handled by RMassBank.

@adelenelai

Hi, the EPA web services seem to be down at the moment; the links in @schymane's original post don't work. @ChemConnector
Are there new URLs that work?

@ChemConnector

ChemConnector commented May 16, 2019 via email

@schymane
Member Author

The services should be up again. @adelenelai, can you check whether they work for you now?
See also MassBank/MassBank-data#68 for updates

@adelenelai

Note (mostly to self, also for documentation):

A DTXSID may not exist for a particular compound, because not all stereochemical forms of a compound are present in the CompTox Dashboard.

Assuming the web services and the Dashboard are in sync (see MassBank/MassBank-data#68), any attempt to manually curate the infolists for DTXSID in Step 2 of mbWorkflow will not succeed either: the DTXSID simply does not exist!

Therefore, if the final MB record does not contain the field CH$LINK: COMPTOX, this should not be perceived as a bug; rather, it reflects the non-existence of that particular DTXSID. (This behaviour was modelled on that of other pre-existing identifiers, e.g. CHEBI.)

Whether an explicit declaration should be incorporated for these cases, e.g. CH$LINK: COMPTOX **none found**, is another issue...

In future: if the DTXSID does come into existence after the MB record has been generated and uploaded, Rene's post-processing would be handy.
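
The omission behaviour described in this note can be sketched as follows. This is an illustrative Python sketch only, not the RMassBank implementation: `build_link_lines` is a hypothetical helper, and `DTXSID0000000` is a placeholder value rather than a real identifier.

```python
def build_link_lines(identifiers):
    """Return CH$LINK lines for the identifiers that were actually found.

    identifiers maps a database name (e.g. "COMPTOX") to its ID, or to
    None / "" when the lookup found nothing. Missing IDs produce no line
    at all, rather than an explicit "none found" entry.
    """
    lines = []
    for database, value in identifiers.items():
        if value:
            lines.append("CH$LINK: " + database + " " + value)
    return lines


# Placeholder values for illustration: ChEBI lookup found nothing, CompTox did.
example = {"CHEBI": None, "COMPTOX": "DTXSID0000000"}
for line in build_link_lines(example):
    print(line)
```

With this approach, a record for a compound without a DTXSID simply has no CH$LINK: COMPTOX line, which is the same outcome as for a missing CHEBI entry.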

@schymane
Member Author

schymane commented Jun 3, 2019

Yes, it is common for identifiers to be missing, so modelling this on the way ChEBI identifiers are handled is the right way to go ... I don't think we need an explicit declaration, because the absence of a corresponding entry in the infolist already makes it clear that the identifier is missing. Some identifiers, like KEGG, ChEBI and LipidMaps, have very few matches, and explicit statements would get annoying over time... and yes, Rene's workflow will catch those that do appear later.
The validation should account for potential identifier deprecation over time ... but this is another (trickier) topic ... @meier-rene
