Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include [-+] in regular expression to allow negative box indices #2936

Merged
merged 13 commits into from
Oct 15, 2020

Conversation

RevathiJambunathan
Copy link
Contributor

PR Summary

The boxlib data output from a WarpX simulation had negative box indices which was leading to the following error

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-da7165ce561d> in <module>
      2 ds = yt.load(plotfile)  # Load the from yt.frontends.boxlib.data_structures import AMReXDatasetplotfile
      3 #ds.domain_dimensions
----> 4 ds.field_list # Print all available quantities

~/.local/lib/python3.8/site-packages/yt/data_objects/static_output.py in field_list(self)
    562     @property
    563     def field_list(self):
--> 564         return self.index.field_list
    565 
    566     def create_field_info(self):

~/.local/lib/python3.8/site-packages/yt/data_objects/static_output.py in index(self)
    507             if self._index_class is None:
    508                 raise RuntimeError("You should not instantiate Dataset.")
--> 509             self._instantiated_index = self._index_class(
    510                 self, dataset_type=self.dataset_type
    511             )

~/.local/lib/python3.8/site-packages/yt/frontends/boxlib/data_structures.py in __init__(self, ds, dataset_type)
   1503 class WarpXHierarchy(BoxlibHierarchy):
   1504     def __init__(self, ds, dataset_type="boxlib_native"):
-> 1505         super(WarpXHierarchy, self).__init__(ds, dataset_type)
   1506 
   1507         is_checkpoint = True

~/.local/lib/python3.8/site-packages/yt/frontends/boxlib/data_structures.py in __init__(self, ds, dataset_type)
    344         self.particle_headers = {}
    345 
--> 346         GridIndex.__init__(self, ds, dataset_type)
    347         self._cache_endianness(self.grids[-1])
    348 

~/.local/lib/python3.8/site-packages/yt/geometry/geometry_handler.py in __init__(self, ds, dataset_type)
     34 
     35         mylog.debug("Setting up domain geometry.")
---> 36         self._setup_geometry()
     37 
     38         mylog.debug("Initializing data grid data IO")

~/.local/lib/python3.8/site-packages/yt/geometry/grid_geometry_handler.py in _setup_geometry(self)
     38 
     39         mylog.debug("Parsing index.")
---> 40         self._parse_index()
     41 
     42         mylog.debug("Constructing grid objects.")

~/.local/lib/python3.8/site-packages/yt/frontends/boxlib/data_structures.py in _parse_index(self)
    433             for gi in range(ngrids):
    434                 # components within it
--> 435                 start, stop = _our_dim_finder.match(next(level_header_file)).groups()
    436                 # fix for non-3d data
    437                 # note we append '0' to both ends b/c of the '+1' in dims below

AttributeError: 'NoneType' object has no attribute 'groups'

This PR fixes this bug by adding [-+]?\d to the python regular expression to account for negative indices as well.
Thank you to @atmyers for the suggestion!
With the addition in this PR, we are now able to visualize plotfile data that has negative box indices.
Screenshot from 2020-10-07 21-19-01

PR Checklist

  • pass black --check yt/
  • pass isort . --check --diff
  • pass flake8 yt/
  • pass flynt yt/ --fail-on-change --dry-run -e yt/extern
  • New features are documented, with docstrings and narrative docs
  • Adds a test for any bugs fixed. Adds tests for new features.

@welcome
Copy link

welcome bot commented Oct 8, 2020

Hi! Welcome, and thanks for opening this pull request. We have some guidelines for new pull requests, and soon you'll hear back about the results of our tests and continuous integration checks. Thank you for your contribution!

@neutrinoceros neutrinoceros added code frontends Things related to specific frontends bug labels Oct 8, 2020
@neutrinoceros
Copy link
Member

Hi @RevathiJambunathan, thank you for this bugfix !
I want to point out first that currently yt.load() never recognizes any dataset as AMReX, see #2806 (the PR is currently on hold but I'll get back at it soon).

I'm curious to see what you get with

ds = yt.load(yourfile)
print(type(ds))

while #2806 isn't dealt with, you can force the AMReXDataset type with

from yt.frontends.boxlib.api import AMReXDataset
ds = AMReXDataset(yourfile)

Now for the present PR itself, I acknowledge this is a bugfix, however I'm wondering if the showcased sliceplot is indeed correct. Namely, I'm surprised that it turns out more that half "empty". Is this what you expect, and if so, why is the data so flat in most of the plot ?

Thanks !

@neutrinoceros
Copy link
Member

neutrinoceros commented Oct 8, 2020

Another point is that flake8 and black are complaining about the lines exceeding 88 chars.
Indeed it seems to me that those regex are somewhat repetitive and could be compacted (this was the case even before your PR, I'm not blaming you :) ).
If you feel like improving this, please do so ! otherwise we can simply silence the linters by appending # noqa E501 to the lines in question.

edit: I meant appending, not prepending

@neutrinoceros
Copy link
Member

Ok one last thing for now, likely the most important one:
this breaks several tests (see report here https://tests.yt-project.org/job/yt_py38_git/1574/), you'll likely need to make it more robust to avoid these boundary effects.

@RevathiJambunathan
Copy link
Contributor Author

Hi @neutrinoceros ,

Thank you for taking a look at the PR!

Here is the output of

ds = yt.load(yourfile)
print(type(ds))

Screenshot from 2020-10-08 06-50-35
It recognizes that the plotfile has boxlib datastructure and in a WarpX dataset.

Yes, I tried AMReXDataset(plotfile) as well and it recognizes the output as an AMReX dataset.
Screenshot from 2020-10-08 06-55-32

Now for the present PR itself, I acknowledge this is a bugfix, however I'm wondering if the showcased sliceplot is indeed correct. Namely, I'm surprised that it turns out more that half "empty". Is this what you expect, and if so, why is the data so flat in most of the plot ?

I understand it is surprising :) However, this is expected for the simulation that I was running.
This is a Lorentz-transformed lab-frame output from a boosted-simulation in WarpX and for the small-period of the simulation for which we printed the output, only a fraction of the lab-frame data is obtained.
Previously, we used to customize the output format for these simulations and some information about the current diagnostics can be found here.
https://warpx.readthedocs.io/en/latest/visualization/backtransformed_diags.html?highlight=back%20tra#visualizing-back-transformed-diagnostics
However, we are in the process of enhancing our diagnostics and ensuring that the data can be written out in plotfile and OpenPMD format.
ECP-WarpX/WarpX#1167
When adding this capability to WarpX, I had visualization issues and with the proposed fix in this PR, I was able to visualize the available lab-frame data in plotfile format.
To summarize -- for the coarse and short simulation that I used for testing and visualizing the plotfile, with negative box indices, the above output is expected -- since only a fraction of the lab-frame data is obtained from the short boosted-frame simulation.

Another point is that flake8 and black are complaining about the lines exceeding 88 chars.
Indeed it seems to me that those regex are somewhat repetitive and could be compacted (this was the case even before your PR, I'm not blaming you :) ).
If you feel like improving this, please do so ! otherwise we can simply silence the linters by prepending # noqa E501 to the lines in question.

Sure! 👍

Ok one last thing for now, likely the most important one:
this breaks several tests (see report here https://tests.yt-project.org/job/yt_py38_git/1574/), you'll likely need to make it more robust to avoid these boundary effects.

Yes! will fix it.

Thank you for reviewing the PR!

@neutrinoceros
Copy link
Member

To summarize -- for the coarse and short simulation that I used for testing and visualizing the plotfile, with negative box indices, the above output is expected -- since only a fraction of the lab-frame data is obtained from the short boosted-frame simulation.

Awesome, thanks for your detailed comment ! I was not expecting this explanation at all and I'm glad the output isn't buggy. :)

@neutrinoceros
Copy link
Member

Good job on fixing the side-effects of this change !

@cphyc
Copy link
Member

cphyc commented Oct 13, 2020

Congrats on the fix! I am however confused as to why there is no need to escape de + in your regexp (i.e. use [-\+]? instead of [-+]?). Do you know why this work?

@RevathiJambunathan
Copy link
Contributor Author

RevathiJambunathan commented Oct 14, 2020

@cphyc Thank you for catching this. It worked because '+' was not explicitly used for the values.
Now it is just [-]?. Thank you to @atmyers for the offline discussions on regexp.

@RevathiJambunathan
Copy link
Contributor Author

@neutrinoceros I tried to make the regex compact and in the process fixed a style error with spaces before and after + operator.
I am not sure why the style check named "black" is failing and not sure how to fix it. Any suggestions? Thank you

@neutrinoceros
Copy link
Member

neutrinoceros commented Oct 14, 2020

black is complaining because you're not using it to auto format your code and it can tell :)

You can make it happy with

pip install black
black yt/

But we can also solve this here directly
/format

edit: well this didn't work

@neutrinoceros
Copy link
Member

/black

@neutrinoceros
Copy link
Member

Oh my, this bot is all over the place, it changed so much more files than it was supposed to (44 !).
I'll revert this.

@neutrinoceros
Copy link
Member

Ok I manually applied black myself to the branch. Should be good now.
One final remark:

Now it is just [-]?

I'm not an expert in regexs, but sn't [-]? strictly equivalent to -? ?

Copy link
Member

@cphyc cphyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments to reduce the degree of wizardry behind the regular expressions! Other than that, thanks you so much for getting your hands in these regexps!

yt/frontends/boxlib/data_structures.py Outdated Show resolved Hide resolved
yt/frontends/boxlib/data_structures.py Outdated Show resolved Hide resolved
yt/frontends/boxlib/data_structures.py Outdated Show resolved Hide resolved
RevathiJambunathan and others added 3 commits October 14, 2020 08:42
Co-authored-by: Corentin Cadiou <contact@cphyc.me>
Co-authored-by: Corentin Cadiou <contact@cphyc.me>
Copy link
Member

@neutrinoceros neutrinoceros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty of replacing the last commit ("make black happy") because it was touching unrelated docstrings and I prefer we avoid making the history of those confusing.
Otherwise congratz, the PR is now not only a bugfix but it improves the quality and readability of this code, well played !
I'll let @cphyc merge this if he's happy too :)

@cphyc cphyc merged commit 9071f0d into yt-project:master Oct 15, 2020
@welcome
Copy link

welcome bot commented Oct 15, 2020

Hooray! Congratulations on your first merged pull request! We hope we keep seeing you around! 🎆

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug code frontends Things related to specific frontends
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants