Parse edits #179
Well, I think I have the parsing.py and associated test issues sorted. I got stuck on the balancing and reaction modules, and think it would be best if someone more familiar with those could help out. For now, the code does what I need for my purposes, but I am happy to help finish this up where I can.
Thank you for working on this. This looks good, with one unfortunate exception: radicals. I had forgotten that I used "." to indicate radicals. A colon would be confusing here, and would probably be interpreted as a diradical. The only way I see around this issue is to parse "." in a context-sensitive manner, i.e. when between two numbers it's a decimal point, and when leading (and in front of a letter?) it's a radical. Perhaps someone interested in seeing this get into master will take over the work here.
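A minimal sketch of the context-sensitive rule described above, in plain Python rather than in the pyparsing grammar that parsing.py uses. The function name `classify_dots` and the labels it returns are hypothetical and not part of chempy; the rule for "leading" is read literally as "first character of the string".

```python
def classify_dots(formula):
    """Label every '.' in a formula string (hypothetical helper, not chempy API).

    Rules assumed from the discussion:
      - between two digits       -> "decimal"
      - leading, before a letter -> "radical"
      - anything else            -> "ambiguous"
    """
    kinds = []
    for i, ch in enumerate(formula):
        if ch != ".":
            continue
        prev_ch = formula[i - 1] if i > 0 else ""
        next_ch = formula[i + 1] if i + 1 < len(formula) else ""
        if prev_ch.isdigit() and next_ch.isdigit():
            kinds.append((i, "decimal"))
        elif i == 0 and next_ch.isalpha():
            kinds.append((i, "radical"))
        else:
            kinds.append((i, "ambiguous"))
    return kinds


print(classify_dots(".NO2"))             # leading dot before a letter
print(classify_dots("Ca2.832Fe0.6285"))  # dots between digits
print(classify_dots("Na2CO3.7H2O"))      # old-style hydrate dot
```

Note that under these rules the "." in an old-style hydrate such as "Na2CO3.7H2O" also sits between two digits and would be read as a decimal point, which is consistent with moving hydrates to a separate ":" notation.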
I intend to work on this more, handling it generally like you suggest. I
have the grammar worked out, at least on paper. I’m glad you mentioned the
bit about radicals as I had not considered it yet and will include it.
I had stopped at this point to think about the data structure of the parsed
output. Right now, chempy.util.parsing._get_formula_parser().parseString()
returns a list of element and count pairs. But in _aqueous.py,
ions_from_formula() indicates that parsing ions out of the formula is a
goal and as currently implemented, that would require another parser to
produce ions instead of elements. My thinking was to tie all this together
in parsing.py with the parser and helper functions, and use a dict or class
to hold the original string, the composition, the state, the charge, etc.
and to decide if the compound is ionic or not and store the lists of ions,
complexes, etc. This would make some things easy, like naming ionic
compounds or removing spectator ions from a reaction and hopefully won't
make anything difficult. Thoughts?
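To make the proposal concrete, here is one possible shape for such a container, written as a dataclass. Everything in it is an assumption for illustration: the class name `ParsedFormula`, its field names, and the helper `spectator_ions` do not exist in chempy, and the instances are built by hand rather than by a parser.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ParsedFormula:
    """Hypothetical result object for a parsed formula (not chempy API)."""
    original: str                                   # the string as written
    composition: Dict[str, float] = field(default_factory=dict)
    state: Optional[str] = None                     # e.g. "aq", "s", "l", "g"
    charge: int = 0
    ions: List["ParsedFormula"] = field(default_factory=list)

    @property
    def is_ionic(self) -> bool:
        # Treated as ionic if the parser recorded constituent ions.
        return bool(self.ions)


def spectator_ions(reactants: List[ParsedFormula],
                   products: List[ParsedFormula]) -> Set[str]:
    """Ions present in aqueous species on both sides of a reaction."""
    def dissolved(species: List[ParsedFormula]) -> Set[str]:
        return {ion.original for s in species if s.state == "aq" for ion in s.ions}
    return dissolved(reactants) & dissolved(products)


# Hand-built example: AgNO3(aq) + NaCl(aq) -> AgCl(s) + NaNO3(aq)
ag = ParsedFormula("Ag+", {"Ag": 1}, charge=1)
na = ParsedFormula("Na+", {"Na": 1}, charge=1)
cl = ParsedFormula("Cl-", {"Cl": 1}, charge=-1)
no3 = ParsedFormula("NO3-", {"N": 1, "O": 3}, charge=-1)

agno3 = ParsedFormula("AgNO3(aq)", {"Ag": 1, "N": 1, "O": 3}, state="aq", ions=[ag, no3])
nacl = ParsedFormula("NaCl(aq)", {"Na": 1, "Cl": 1}, state="aq", ions=[na, cl])
agcl = ParsedFormula("AgCl(s)", {"Ag": 1, "Cl": 1}, state="s", ions=[ag, cl])
nano3 = ParsedFormula("NaNO3(aq)", {"Na": 1, "N": 1, "O": 3}, state="aq", ions=[na, no3])

print(sorted(spectator_ions([agno3, nacl], [agcl, nano3])))
```

With the ions stored next to the composition, removing spectator ions becomes a set intersection instead of a second parsing pass.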
On Sat, Aug 8, 2020 at 12:22 Bjorn ***@***.***> wrote:
Thank you for working on this. This looks good, but unfortunately with the
exception for radicals. I had forgot I used "." to indicate radicals. Here
a colon is confusing, and would probably be interpreted as a diradical. The
only way I see around this issue is to parse "." in a context sensitive
manner, i.e. when between two numbers: it's a decimal point, and when
leading (and in front of a letter?) it's a radical. Perhaps someone
interested in seeing this getting into master will take over the work here.
Implemented changes to parsing.py as discussed in #176. Changed the notation for hydrates to ":" instead of ".".
Changed HTML, LaTeX, and Unicode parsing to match these changes.
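As a rough illustration of the new hydrate notation, the sketch below splits a formula on ":" and reads an optional leading count from each hydrate part. It assumes strings of the form "Na2CO3:7H2O"; the helper name `split_hydrate` is hypothetical, and the actual grammar in parsing.py may differ.

```python
import re

# Optional leading count (integer or decimal), then the rest of the part.
_HYDRATE_PART = re.compile(r"^(\d+(?:\.\d+)?)?(.+)$")


def split_hydrate(formula):
    """Split 'parent:NH2O' into (parent, [(count, formula), ...]).

    Hypothetical helper, not chempy API. A missing count is taken as 1.
    """
    parent, *rest = formula.split(":")
    parts = []
    for part in rest:
        m = _HYDRATE_PART.match(part)
        if m is None:
            raise ValueError("empty hydrate part in %r" % formula)
        count = float(m.group(1)) if m.group(1) else 1.0
        parts.append((count, m.group(2)))
    return parent, parts


print(split_hydrate("Na2CO3:7H2O"))
print(split_hydrate("CaSO4:0.5H2O"))
```

Because ":" now marks the hydrate boundary, a "." inside the count (as in "0.5") can be read as a decimal point without ambiguity.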