-
Notifications
You must be signed in to change notification settings - Fork 274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: add support for molecular fields (AMRex) #4908
ENH: add support for molecular fields (AMRex) #4908
Conversation
Hi! Welcome, and thanks for opening this pull request. We have some guidelines for new pull requests, and soon you'll hear back about the results of our tests and continuous integration checks. Thank you for your contribution! |
yt/frontends/boxlib/fields.py
Outdated
lab = r"X\left(%s%s\right)" | ||
tex_label = lab % spec_match.groups()[::-1] | ||
|
||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bare except statements should be avoided, please only catch specific exceptions. The code in the try
branch should also be limited to the part that's expected to sometimes raise exceptions.
Thanks for doing this, @simonguichandut ! It's tough for us to anticipate all the different representations, so making sure we're up to date on how codes represent molecules and aligning that with your expectations is really important. Thank you! |
I tested this with a Castro and MAESTROeX plotfile and it works. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this ! Here's a couple questions and suggestions.
yt/frontends/boxlib/fields.py
Outdated
|
||
except AttributeError as e: | ||
# Catch exception cases for e.g. molecules (H2O, C6H6) | ||
print("Could not parse species ", field) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not use print
to log errors. You could use ytLogger.error
here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really an "error" and ytLogger prints a whole traceback, not sure that is the ideal solution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really an "error"
Can you clarify ? To me, seeing return
inside an except
blocks means we've encountered a recoverable exception (error).
ytLogger prints a whole traceback
I'm not sure what you mean here. ytLogger
is meant to print messages to the console and we generally don't include (even partial) tracebacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I'm using the wrong terminology. I'm just trying to say that the "error" in this case is not a "big deal". The species is not recognized, so we won't try to produce a nice tex label, that's it. ytLogger prints out very long error message and the whole call stack, which feels like overkill.
Here's what I get
python $MAESTROEX_DIR/Util/yt/plotsinglevar.py plt_0000100 "X(H2O)"
--- Logging error ---
Traceback (most recent call last):
File "/Users/simon/anaconda3/lib/python3.11/logging/__init__.py", line 1110, in emit
msg = self.format(record)
^^^^^^^^^^^^^^^^^^^
File "/Users/simon/anaconda3/lib/python3.11/logging/__init__.py", line 953, in format
return fmt.format(record)
^^^^^^^^^^^^^^^^^^
File "/Users/simon/anaconda3/lib/python3.11/logging/__init__.py", line 687, in format
record.message = record.getMessage()
^^^^^^^^^^^^^^^^^^^
File "/Users/simon/anaconda3/lib/python3.11/logging/__init__.py", line 377, in getMessage
msg = msg % self.args
~~~~^~~~~~~~~~~
TypeError: not all arguments converted during string formatting
Call stack:
File "/Users/simon/Desktop/codes/AMREX-Astro/MAESTROeX/Util/yt/plotsinglevar.py", line 94, in <module>
plot_single_var(args.plotfile, args.outfile, args.variables, use_log, args.norm, args.minimum, args.maximum)
File "/Users/simon/Desktop/codes/AMREX-Astro/MAESTROeX/Util/yt/plotsinglevar.py", line 52, in plot_single_var
plots = yt.SlicePlot(ds, norm_axis, var_names)
File "/Users/simon/Desktop/codes/yt-git/yt/visualization/plot_window.py", line 1821, in __init__
(bounds, center, display_center) = get_window_parameters(
File "/Users/simon/Desktop/codes/yt-git/yt/visualization/plot_window.py", line 70, in get_window_parameters
width = ds.coordinates.sanitize_width(axis, width, None)
File "/Users/simon/Desktop/codes/yt-git/yt/geometry/coordinates/coordinate_handler.py", line 300, in sanitize_width
self.ds.index
File "/Users/simon/Desktop/codes/yt-git/yt/data_objects/static_output.py", line 613, in index
self.create_field_info()
File "/Users/simon/Desktop/codes/yt-git/yt/data_objects/static_output.py", line 665, in create_field_info
self.field_info.setup_fluid_fields()
File "/Users/simon/Desktop/codes/yt-git/yt/frontends/boxlib/fields.py", line 475, in setup_fluid_fields
nice_name, tex_label = _nice_species_name(field)
File "/Users/simon/Desktop/codes/yt-git/yt/frontends/boxlib/fields.py", line 534, in _nice_species_name
ytLogger.error("Could not parse species ", field)
Message: 'Could not parse species '
Arguments: ('X(H2O)',)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The traceback isn't a feature of the logger, but shows a mistake in your call to ytLogger.error
:
Instead of
ytLogger.error("Could not parse species ", field)
write
ytLogger.error("Could not parse species %s", field)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, thank you!
yt/frontends/boxlib/fields.py
Outdated
# if the species field is a descriptive name, then the match | ||
# on the integer will be blank | ||
# modify the tex string in this case to remove spurious tex spacing | ||
lab = r"X\left(^{%s}%s\right)" | ||
if spec_match.groups()[-1] == "": | ||
lab = r"X\left(%s%s\right)" | ||
tex_label = lab % spec_match.groups()[::-1] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the existence of #4845, where this file is moved, we're going to get merge conflict anyway so let's make their resolution as easy as possible by keeping the diff minimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure I understand, you are only suggesting to remove the blank line here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think #4845 is ready to merge, right?
should we wait on anything else in that PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure I understand, you are only suggesting to remove the blank line here?
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we wait on anything else in that PR?
I just want to honor the "one week or so" delay I promised in my last call for objections. Reviews came in faster than I anticipated, but there's no harm in waiting a couple days more, I think.
yt/frontends/boxlib/fields.py
Outdated
# sometimes we make up descriptive names (e.g. ash). | ||
# In niche cases, we might have a molecule with a number | ||
# in the middle (e.g. H2O). We ignore those. | ||
if any(char.isdigit() for char in field) and field[-2].isdigit(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand what this branch is supposed to capture. Both examples in the comment (C12
, H20
) check field[-2].isdigit()
, so they both go through, but what I understand from the comment is that C12
should go, and H2O
shouldn't. What am I missing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the strings in the plotfiles are X(C12), X(H2O). Previously the last character before the ")" parenthesis would always be a number, but not in the H2O case, which will raise an error in weight = int(weight)
a few lines down. Note that this block of code is for the moment superfluous, element and weight are not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment should reflect that then. Ideally the code would be expressive enough that a comment wouldn't even be needed.
Are you saying the whole block is superfluous ? If so, I think it should be removed, or at least not touched in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was put in there so eventually we could use the element and weight variables (@zingale ?)
yt/yt/frontends/boxlib/fields.py
Lines 509 to 510 in d7209f9
# Here we can, later, add number density using 'element' and | |
# 'weight' inferred above |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only made this change so that the code wouldn't break when there is a species like H2O.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't the ideal fix be to extend the regexp so that it doesn't break for these cases ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I mentioned this when I submitted the PR. I tried for a while but it was too difficult for me. Ideally you would also make the numbers subscripts instead of superscript for the tex label for molecules. Again, having molecules in the first place is a very niche use of castro/maestro...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks for clarifying. I suggest we slightly rephrase the logging error message to make it clear that it's a missing feature, not an actual bug, so that readers might choose to give it a try too :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote a brief something, but please feel free to edit
pre-commit.ci autofix |
yt/frontends/boxlib/fields.py
Outdated
# In niche cases, we might have a molecule with a number | ||
# in the middle (e.g. X(H2O)). Check if the last character before | ||
# the ) is a digit. | ||
if field[-2].isdigit(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still find this too fragile, but digging for a way to make it more robust, I think I ended up solving the problem of parsing molecules. Do you mind if I push to your branch ? I could also open a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(disclaimer: I'm attending a conference this week so I might not be able to follow up on this immediately)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, whatever you think is best
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I'll set myself a reminder to execute this by next week, thank you for your patience !
I'm going to follow through my comment from last week now. First, I'll rebase the branch to resolve merge conflicts. |
e2ab034
to
7db142a
Compare
Here goes nothing |
yt/frontends/amrex/fields.py
Outdated
# Here we can, later, add number density using 'element' and | ||
# 'weight' inferred above |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I was refactoring this branch, I realized it had really been left in a noop "TODO" state for 10 years and I feel like it's not worth keeping it around given that it creates discussion around code that's actually not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also remove the same code in CastroFieldInfo.setup_fluid_fields()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Actually a separable problem so I'll open another PR. Thank you !
6f3b8da
to
474b747
Compare
I ended up radically re-orienting the PR's implementation. Hopefully the behavior I implemented is suitable. |
474b747
to
818d6d5
Compare
Ok, CI is green now, so let's open back for review. @simonguichandut what do you think ? |
The TeX formatting is a bit wonky now: molecules and descriptive names are missing the "X" that the isotopes have, and omega dot for isotopes is rendered like |
Good points ! Separation of concerns FTW ! better now ? |
Yes, much better. Can we also add the tex formatting to CastroFieldInfo? |
done ! |
ba536da
to
f196605
Compare
thanks @yut23 for your input ! Now I think we should give @simonguichandut a couple weeks to comments before we can merge. |
yt/frontends/amrex/fields.py
Outdated
return rf"^{{{count}}}{element}" | ||
|
||
def _to_tex_molecule(self) -> str: | ||
return "".join(rf"{element}_{{{count}}}" for element, count in self._spec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch. I adjusted tests and fixed this bug as you suggested !
edb0f30
to
7abf640
Compare
Looks great to me! I've tested on MAESTROeX plotfiles, with and without molecules, and it works as expected. |
7abf640
to
bac589d
Compare
Awesome, I squashed all my commits together, let me know if I should do the same with yours ! |
I'm not sure what this means/does, but either way I will delete my branch and sync with the main repo! :) |
I'm proposing to rewrite the branch history so it contains just one commit. With some tools (mainly |
Co-authored-by: Clément Robert <cr52@protonmail.com>
bac589d
to
5ce793a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
How does this handle something like "H2", with is either an isotope (deuterium) or a molecule (molecular H)? |
H2 does get recognized as the isotope |
Yes, that's intentional, because from what I understand molecules are rarely seen in this context, but it's a shame that there's no way to tell for sure from just looking at the string |
no, I think it's fine. Just wanted to double check. |
Alright let's just merge then. Thanks all ! |
Hooray! Congratulations on your first merged pull request! We hope we keep seeing you around! 🎆 |
PR Summary
The boxlib dataset parsing (for CASTRO/MAESTROeX) are currently set up for usual atomic species like "He4" or "C12", or descriptive names like "ash" with no numbers. The parsing breaks for e.g. molecules like "H2O" or "C6H6". This fix escapes those cases.
Ideally, we could have some fancy regex to correctly parse the molecules (and create proper a TeX string), but this is a very niche use of those codes.
PR Checklist