One vs. two word esters #4
Original comment by Daniel Lowe (Bitbucket: dan2097, GitHub: dan2097). Formally, in IUPAC nomenclature only the space-separated version is an ester.
Original comment by Steve Chapman (Bitbucket: isomerdesign). I agree with each of your points. The worry in this case, for instance, is that the substance is listed correctly in the Misuse of Drugs Act but incorrectly in the ACMD report that recommended its addition (http://www.homeoffice.gov.uk/publications/alcohol-drugs/drugs/acmd1/acmd-report-agonists?view=Binary), which causes confusion. I suppose what I'd like is a Google-type intervention, like the "did you mean finite **state** machine" shown when one mistypes finite **stale** machine, or *some* indication that the name is suspect. Another concern is that a missing locant defaults to 2, e.g. phenyldecanoate = 2-phenyldecanoate. Omitting a locant seems increasingly frowned upon by IUPAC unless there is pretty much no possible ambiguity, which is not the case here. Consider the difference between 3-hexyl decanoate = hex-3-yl decanoate and 3-hexyldecanoate = 3-(hexyl)decanoate. Even if a missing locant does not defeat the parser, couldn't it whine a little about it?
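The "did you mean" idea above can be illustrated with a small sketch. This is only a toy string check, not OPSIN's API or parsing logic: the function name `suspicious_name_warnings` and the regular expression are hypothetical, and it only catches the simplest pattern (a single unspaced word of lowercase letters where an "-yl" prefix runs straight into an "-ate" word with no locant).

```python
import re

# Hypothetical pattern: a locant-free "...yl" prefix fused to a "...ate" word.
# Names containing digits, hyphens, or spaces deliberately do not match.
_FUSED_ESTER = re.compile(r"^(?P<prefix>[a-z]+yl)(?P<acid>[a-z]+ate)$")


def suspicious_name_warnings(name):
    """Return a list of warning strings for a possibly mistyped ester name."""
    warnings = []
    match = _FUSED_ESTER.match(name.strip().lower())
    if match:
        prefix, acid = match.group("prefix"), match.group("acid")
        warnings.append(
            f"no locant given for '{prefix}'; did you mean '{prefix} {acid}'?"
        )
    return warnings


if __name__ == "__main__":
    for candidate in ("phenyldecanoate", "hexyl acetate", "2-phenyldecanoate"):
        print(candidate, "->", suspicious_name_warnings(candidate))
```

A real implementation would need to work on the parsed name rather than on raw text, since a check like this knows nothing about which words are actually substituents or acids.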
Original comment by Daniel Lowe (Bitbucket: dan2097, GitHub: dan2097). Adding detection for ambiguity would be nice, although doing so rigorously is not completely straightforward. For example, hexyl is not ambiguous even though there are non-equivalent carbons from which a hydrogen could be removed. If ambiguity detection were introduced, I would be keen to keep the number of false positives to an absolute minimum. A charge imbalance could be a good reason to produce a warning (although such structures do exist in some databases), but actually suggesting a cause or solution would require adding a rule to detect this particular problem. While I would be happy to accept contributions to this area of the project, I don't think I am going to be able to find the time to look into it personally (my PhD is currently focusing on the automatic extraction of chemical reactions).
Original comment by Steve Chapman (Bitbucket: isomerdesign). Thank you, Daniel. I agree it's not a pressing issue; I just felt it should be noted, really. The fused ring numbering problem is more important.
Original comment by Daniel Lowe (Bitbucket: dan2097, GitHub: dan2097). I'm not sure whether or not it's more important, but from a completionist point of view the deficiency in fused ring numbering is very annoying.
Original comment by Daniel Lowe (Bitbucket: dan2097, GitHub: dan2097). I have added heuristics for treating cases where the space is missing as esters. This version is now up on the web service for testing.
The last rule is required as there is only one possible position for substitution on these structures. The detection of ambiguity is pretty good although not completely foolproof (because things like double bonds have not been formally assigned yet, rather than because of problems with the atom environment perception algorithm). I'm a bit dubious about this heuristic as it can result in different interpretations of otherwise very similar names, e.g. diethylmalonate --> not ester, diethylsuccinate --> ester, but ethylsuccinate --> not ester (as the position for the ethyl is unambiguous).
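The ambiguity part of the heuristic described above can be sketched as a toy model. The assumption here is that an unspaced name is read as an ester only when the substituents could be placed on the acid in more than one non-equivalent way. This is not OPSIN's implementation: the function names, the per-carbon hydrogen counts, and the hand-written symmetry permutations are all illustrative, and it covers only the ambiguity test, not the full set of rules mentioned in the comment.

```python
from itertools import product


def distinct_placements(capacities, symmetries, n_substituents):
    """Count symmetry-distinct ways to put identical substituents on a parent.

    capacities: substitutable hydrogens on each candidate carbon.
    symmetries: permutations of the carbon indices that map the parent onto
                itself; must include the identity and be closed (a group).
    """
    seen = set()
    for counts in product(*(range(cap + 1) for cap in capacities)):
        if sum(counts) != n_substituents:
            continue
        # Canonical form: the smallest image under all symmetry permutations.
        images = [tuple(counts[i] for i in perm) for perm in symmetries]
        seen.add(min(images))
    return len(seen)


def treat_as_ester(capacities, symmetries, n_substituents):
    """Toy rule: ambiguous placement suggests a missing space, so an ester."""
    return distinct_placements(capacities, symmetries, n_substituents) > 1


# Malonate: one CH2 carbon. Succinate: two equivalent CH2 carbons.
MALONATE = ([2], [(0,)])
SUCCINATE = ([2, 2], [(0, 1), (1, 0)])

if __name__ == "__main__":
    print("diethylmalonate ester?", treat_as_ester(*MALONATE, 2))
    print("diethylsuccinate ester?", treat_as_ester(*SUCCINATE, 2))
    print("ethylsuccinate ester?", treat_as_ester(*SUCCINATE, 1))
```

Under these assumptions the sketch reproduces the three interpretations quoted above: two ethyl groups on succinate can sit on the same carbon or on different carbons, whereas the other two cases have only one possible placement.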
Original report by Steve Chapman (Bitbucket: isomerdesign).
Omitting the space makes a difference:
[9-Hydroxy-6-methyl-3-(5-phenylpentan-2-yl)oxy-5,6,6a,7,8,9,10,10a-octahydrophenanthridin-1-yl]acetate
[9-Hydroxy-6-methyl-3-(5-phenylpentan-2-yl)oxy-5,6,6a,7,8,9,10,10a-octahydrophenanthridin-1-yl] acetate
Simpler cases exhibit the same behaviour: hexylacetate vs. hexyl acetate.