Scraping GC retention indices from NIST #152

Closed
Aariq opened this issue May 25, 2018 · 4 comments

Comments

@Aariq
Collaborator

Aariq commented May 25, 2018

It would be amazing if there were a function to get literature retention indices from webbook.nist.gov using CAS numbers or other identifiers. I'd be happy to work on this function, but would love some feedback on how it should work first. Right now I'm thinking there would be an argument for CAS numbers, and maybe the following options:

  1. An option to choose which type of RI. webbook.nist.gov has three different retention indices: Kovats, Normal Alkane, and Van Den Dool and Kratz.
  2. An option for polar or non-polar column.
  3. An option to specify column types (e.g. DB-5, Supelcowax-10, etc.)
  4. A query will sometimes match many different literature RIs, so an option to return either an average or all of them (as a data frame with references and comments).
  5. Possibly some way of prioritizing the above arguments? For example, if a Van Den Dool and Kratz RI isn't listed, get the Normal Alkane one.
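
To make the interplay of these options concrete, here is a rough sketch of the filter-then-fall-back logic, written in Python purely for illustration. The record layout, the field names (`type`, `polarity`, `column`, `ri`), the function name `select_ri`, and the numeric values are all made up for the example; nothing here reflects what NIST actually returns or how the final function should look.

```python
from statistics import mean

def select_ri(records, type_priority, polarity=None, column=None, average=False):
    """Filter literature RIs, then fall back through RI types in priority order."""
    # Options 2 and 3: optional filters on column polarity and column name.
    pool = [
        r for r in records
        if (polarity is None or r["polarity"] == polarity)
        and (column is None or r["column"] == column)
    ]
    # Option 5: try each RI type in order and stop at the first one with matches.
    for ri_type in type_priority:
        matches = [r for r in pool if r["type"] == ri_type]
        if matches:
            if average:
                # Option 4: collapse the matches into a single mean value.
                return {"type": ri_type, "ri": mean(r["ri"] for r in matches), "n": len(matches)}
            return matches
    # Nothing matched any requested type.
    return None if average else []

# Made-up records for one compound (values are placeholders, not literature data).
records = [
    {"type": "Normal alkane", "polarity": "non-polar", "column": "DB-5", "ri": 1030},
    {"type": "Normal alkane", "polarity": "non-polar", "column": "DB-5", "ri": 1034},
    {"type": "Kovats", "polarity": "polar", "column": "Supelcowax-10", "ri": 1200},
]

result = select_ri(
    records,
    type_priority=["Van Den Dool and Kratz", "Normal alkane"],
    polarity="non-polar",
    average=True,
)
print(result)
```

In this example no Van Den Dool and Kratz value exists for a non-polar column, so the sketch falls back to the two Normal alkane records and averages them.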

I'd love to hear thoughts and feedback.

@eduardszoecs
Member

I've found some info on the net:

They don't seem to provide an API :( So you need to go the web-scraping way...

Please be nice to the server (restrict the calls to at most 3 per second, or draw the delay between calls from a gamma distribution, to avoid getting blocked).
Some XPath (via the xml2 package) is definitely beneficial.
See chemid.R and pan.R here for examples (the code is quite old, newer methods might be available).
htmltidy might also be useful for development.
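
As a rough illustration of the throttling and XPath tips above, here is a small Python sketch. Everything in it is an assumption made for the example: the helper names (`polite_delay`, `fetch_all`, `extract_ri`), the gamma parameters, and above all the sample markup, which is invented and does not reproduce the real NIST page layout. The `fetch` argument is passed in by the caller, so the sketch itself never touches the network.

```python
import random
import time
import xml.etree.ElementTree as ET

def polite_delay(min_interval=1 / 3, shape=2.0, scale=0.25):
    """Seconds to wait before the next request.

    The fixed floor keeps the rate at or below 3 calls per second; the
    gamma-distributed extra (assumed parameters) makes the timing irregular.
    """
    return min_interval + random.gammavariate(shape, scale)

def fetch_all(urls, fetch, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive calls."""
    pages = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(polite_delay())
        pages.append(fetch(url))
    return pages

# Invented, well-formed markup standing in for a results table.
SAMPLE = """<table>
  <tr><th>Active phase</th><th>I</th></tr>
  <tr><td>DB-5</td><td class="ri">1030</td></tr>
  <tr><td>DB-5</td><td class="ri">1034</td></tr>
</table>"""

def extract_ri(markup):
    """Pull the numeric cells out of the table with an XPath-style query."""
    root = ET.fromstring(markup)
    return [float(td.text) for td in root.findall(".//td[@class='ri']")]

print(extract_ri(SAMPLE))
```

Real pages are rarely well-formed XML, so an HTML-aware parser is needed in practice; in R, `xml2::read_html()` followed by `xml2::xml_find_all()` plays the role that `ET.fromstring()` and `findall()` play here.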

Hope this helps,

Edi

@eduardszoecs
Member

Related PR: #154

@eduardszoecs
Member

@Aariq This can be closed, as now implemented by you?

@Aariq
Collaborator Author

Aariq commented Jun 25, 2018

yes, it can be closed.

Aariq closed this as completed Jun 25, 2018