Fixed stripping of . from extracted numbers #136
Right now numbers are not extracted correctly. For example, in pp_query, `extr_num("Melting Pt : -44.6 deg C")` strips the "." and gives -446. My fix keeps the "." characters. It will still fail if there is a dot elsewhere in the string, though: `extr_num("Melting Pt : -44.6 deg C.")` gives NA. Alternatively, you could use `readr::parse_number`, which also handles that case.
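For illustration, here is a minimal base R sketch of one way to avoid both problems: match the first signed decimal number instead of stripping non-numeric characters. The name `extract_first_number` and the regular expression are assumptions made for this example. This is not the `extr_num` implementation from this PR.

```r
# Hypothetical helper (not the package's extr_num()): return the first
# signed decimal number found in each element of a character vector.
extract_first_number <- function(x) {
  # Optional sign, one or more digits, optional fractional part.
  pattern <- "[-+]?[0-9]+(\\.[0-9]+)?"
  hit <- regexpr(pattern, x)
  out <- rep(NA_real_, length(x))
  # Elements with no match (or NA input) stay NA.
  found <- !is.na(hit) & hit != -1
  # regmatches() returns only the matched substrings, in order.
  out[found] <- as.numeric(regmatches(x, hit))
  out
}

extract_first_number("Melting Pt : -44.6 deg C")   # -44.6
extract_first_number("Melting Pt : -44.6 deg C.")  # -44.6
```

Because the trailing "." is not preceded by a digit, it never becomes part of the match. One limitation of this sketch is that a hyphen used as a range separator (for example "10-20") would yield only the first number.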
Hmm. This strips the - instead.
OK fixed that.
Codecov Report
@@ Coverage Diff @@
## master #136 +/- ##
=======================================
Coverage ? 0%
=======================================
Files ? 17
Lines ? 1622
Branches ? 0
=======================================
Hits ? 0
Misses ? 1622
Partials ? 0
Evidently CS sometimes does not escape special characters properly. Here is a fix for the problem I found. I could not find a general way to fix this; I guess if there were one, parsers would already do it.
Some ChemSpider records are missing epiSuite data. For those, the parsing will fail.
Cherry-picked all suggestions.