
Add Parthenon frontend #4323

Open · wants to merge 7 commits into base: main

Conversation

@pgrete (Contributor) commented Feb 3, 2023

PR Summary

Add a yt frontend for codes based on the Parthenon AMR framework (https://github.com/parthenon-hpc-lab/parthenon), such as AthenaPK (https://github.com/parthenon-hpc-lab/athenapk), Phoebus (https://github.com/lanl/phoebus), and KHARMA (https://github.com/AFD-Illinois/kharma).
Note that the mesh structure in Parthenon is heavily derived from Athena++, which is why we used the Athena++ frontend as a baseline.

We've been using this frontend for quite some time now (without any recent major changes), so we consider it stable enough to open it for review and merge into yt proper.

At this point I'm looking for

  • general feedback
  • feedback on how to generally handle/differentiate between Parthenon (as a framework) and downstream-code-related functions/fields/... (the current version of this frontend is geared towards AthenaPK, and I'd like to abstract this)
  • (assuming a test dataset is desired), how to upload/handle/provide the dataset

PR Checklist

  • New features are documented, with docstrings and narrative docs
  • Adds a test for any bugs fixed. Adds tests for new features.

@neutrinoceros neutrinoceros self-requested a review February 3, 2023 18:19
@matthewturk (Member)

This is awesome! Thank you for contributing it. I'll provide a review ASAP.

@neutrinoceros (Member)

Congrats @pgrete! I'll make a first pass for QA and will try to focus on stuff that isn't already reported by linters (see pre-commit.ci's report).

Resolved review threads: pyproject.toml, yt/frontends/parthenon/data_structures.py (4), yt/frontends/parthenon/tests/test_outputs.py
@neutrinoceros (Member)

  • feedback on how to generally handle/differentiate between Parthenon (as a framework) and downstream-code-related functions/fields/... (the current version of this frontend is geared towards AthenaPK, and I'd like to abstract this)

I think we have one example of a framework-as-a-data-format already: the boxlib frontend. In my experience, it's indeed the one for which automatic determination of the exact format is hardest to implement and maintain, and the reason why we added a hint keyword argument to yt.load. I hope we'll be able to have a cleaner distinction between formats with this frontend, but if the binary data format is identical across codes, I think one practical approach is to reserve some header in the binary files to encode a unique key (like the name of the code) that maps to a known format. Otherwise it may be impossible to guess.
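For illustration, this is roughly what the hint mechanism looks like from the user side (a sketch; the filename and hint string here are made up, and the exact matching rules may differ):

import yt

# `hint` narrows automatic format detection when several frontends could
# plausibly claim the same file (both arguments below are illustrative).
ds = yt.load("plt00010", hint="castro")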

At this point, and I'm not saying that to discourage this pull request, you may want to consider distributing the frontend as an extension package, which can help if frequent contributions are expected and you want to be free from our relatively slow release cycle.

  • (assuming a test dataset is desired), how to upload/handle/provide the dataset

Yes, that would be a necessary step for us to merge this PR.
The process requires a companion PR to our website repo. The most recent example of such a PR is yt-project/website#115.

@neutrinoceros added the labels "code frontends" (Things related to specific frontends) and "new feature" (Something fun and new!) on Feb 4, 2023
Comment on lines 284 to 300
gamma_attrs = {
    key: val
    for key, val in self._handle["Params"].attrs.items()
    if key.endswith("AdiabaticIndex")
}
if "gamma" in self.specified_parameters:
    self.gamma = float(self.specified_parameters["gamma"])
elif len(gamma_attrs) >= 1:
    self.gamma = next(iter(gamma_attrs.values()))
else:
    self.gamma = 5.0 / 3.0
@pgrete (Contributor, Author) commented Mar 13, 2023

@forrestglines what's the intention behind this logic, i.e., why create the gamma_attrs dict rather than checking whether AdiabaticIndex is already in Params?
(same below for He_mass_fraction)

(Contributor)

It's due to the actual parameter being Hydro/AdiabaticIndex in AthenaPK, which will be different for the other Parthenon codes.

If this logic is too opaque, we should instead look for Hydro/AdiabaticIndex for AthenaPK and implement the adiabatic index as needed for the other codes.

(Contributor)

As it's supposed to work right now, an adiabatic index passed in via gamma= takes priority over an adiabatic index defined in the parameters of the Parthenon simulation, falling back to 5./3. if nothing else is found.

(Contributor)

The He_mass_fraction piece isn't working since at some point we stopped outputting He_mass_fraction and started including mbar_over_kb instead. We could back out He_mass_fraction from mbar_over_kb instead.
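One possible inversion, as a minimal sketch assuming a fully ionized H/He gas with no metals (the EOS assumption and the function name are illustrative, not necessarily what AthenaPK does):

# CGS constants (a real implementation would take these from unyt/yt).
k_B = 1.380649e-16   # Boltzmann constant, erg/K
m_h = 1.6735575e-24  # hydrogen atom mass, g

def he_mass_fraction_from_mbar_over_kb(mbar_over_kb):
    """Invert mbar_over_kb = mu * m_h / k_B for the He mass fraction Y."""
    mu = mbar_over_kb * k_B / m_h  # mean molecular weight, dimensionless
    # Fully ionized H/He gas with X + Y = 1 and no metals:
    #   1/mu = 2X + 3Y/4 = 2 - 5Y/4  =>  Y = 0.8 * (2 - 1/mu)
    return 0.8 * (2.0 - 1.0 / mu)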

@forrestglines (Contributor) commented Mar 24, 2023

I'm trying to fix reading in He_mass_fraction from mbar_over_kb but I'm getting nonsensical results. I'll have to look at it later with fresh eyes. I'll push what I'm working with in case someone can see what I'm doing wrong.

@pgrete (Contributor, Author) commented Mar 13, 2023

  • (assuming a test dataset is desired), how to upload/handle/provide the dataset

yes that would be a necessary step for us to merge this PR. The process requires a companion PR to our website repo. The most recent example of such a PR would be yt-project/website#115

Regarding the dataset, am I assuming correctly that the HDF5 file should not use (default) compression, i.e., deflate?

@matthewturk (Member)

@pgrete I think as long as it doesn't require hdf5plugin it should be OK to use deflate.
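For illustration, "deflate" is HDF5's built-in gzip filter, which h5py reads and writes without the external hdf5plugin package (the filename and dataset below are made up):

import h5py
import numpy as np

# gzip ("deflate") is a built-in HDF5 filter, so a file compressed this way
# stays readable with plain h5py.
with h5py.File("sample_data.phdf", "w") as f:
    f.create_dataset("prim", data=np.arange(100.0), compression="gzip")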

@neutrinoceros (Member)

I think that the reason CI is now failing is that #4344 was merged in between runs. Add test_outputs.py to the list of files that pytest should ignore to resolve this.

@pgrete (Contributor, Author) commented Mar 13, 2023

I think that the reason CI is now failing is that #4344 was merged in between runs. Add test_outputs.py to the list of files that pytest should ignore to resolve this.

I'm currently figuring out some useful tests to add (with the appropriate data file).

@pgrete (Contributor, Author) commented Mar 13, 2023

Looks like I'm stuck.
How can I actually run tests?

I first thought that the tests are answer tests, so I should call them using a class (like TestTipsy, described at https://yt-project.org/doc/developing/testing.html#how-to-run-the-answer-tests), but then I realized that the current test_outputs.py does not define a class (or I don't see where it does) -- similar to the test_outputs.py files of other frontends.

Then I tried pytest (after reinstalling the project to fix dependencies) and only got:

$ pytest yt/frontends/parthenon/tests
========================= test session starts =========================
platform linux -- Python 3.10.9, pytest-7.2.2, pluggy-1.0.0 -- /usr/bin/python
cachedir: .pytest_cache
rootdir: /home/pgrete/src/yt, configfile: pyproject.toml
plugins: typeguard-2.13.3, anyio-3.6.2
collected 0 items

======================== no tests ran in 0.01s ========================

So what am I missing?

@pgrete (Contributor, Author) commented Mar 17, 2023

I'm a little stuck here.
I realized that the failing test was run with py38, so I thought that my Python environment might be too new (also with respect to the nose PR for Python >3.10).

So I installed a clean env with Python 3.8 and the "full" version of yt: pip install -e .[full]

Trying the test now results in:

$ pytest yt/frontends/parthenon/tests/test_outputs.py
========================= test session starts =========================
platform linux -- Python 3.8.16, pytest-7.2.2, pluggy-1.0.0 -- /home/pgrete/miniconda3/envs/py3.8/bin/python
cachedir: .pytest_cache
rootdir: /home/pgrete/src/yt, configfile: pyproject.toml
collected 0 items / 1 error

=============================== ERRORS ================================
____ ERROR collecting yt/frontends/parthenon/tests/test_outputs.py ____
yt/frontends/parthenon/tests/test_outputs.py:11: in <module>
    from yt.utilities.answer_testing.framework import (
yt/utilities/answer_testing/framework.py:24: in <module>
    from nose.plugins import Plugin
../../miniconda3/envs/py3.8/lib/python3.8/site-packages/nose/__init__.py:1: in <module>
    from nose.core import collector, main, run, run_exit, runmodule
../../miniconda3/envs/py3.8/lib/python3.8/site-packages/nose/core.py:12: in <module>
    from nose.loader import defaultTestLoader
../../miniconda3/envs/py3.8/lib/python3.8/site-packages/nose/loader.py:21: in <module>
    from nose.importer import Importer, add_path, remove_path
../../miniconda3/envs/py3.8/lib/python3.8/site-packages/nose/importer.py:12: in <module>
    from imp import find_module, load_module, acquire_lock, release_lock
../../miniconda3/envs/py3.8/lib/python3.8/imp.py:31: in <module>
    warnings.warn("the imp module is deprecated in favour of importlib; "
E   DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
======================= short test summary info =======================
ERROR yt/frontends/parthenon/tests/test_outputs.py - DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!
========================== 1 error in 0.29s ===========================

What am I missing?

@neutrinoceros (Member)

Sorry for the late reply. GenericArrayTest is a subclass of AnswerTestingTest and as such, it cannot be run with pytest yet (we are still in the middle of the migration process).
As a result, your tests can only be run with Python 3.8 or 3.9 (since nose isn't compatible with Python 3.10 and later).

@pgrete (Contributor, Author) commented Mar 17, 2023

So I'd call python tests/nose_runner.py (as that's the command run by the py38 GitHub check)?
And, if yes, can I run just the Parthenon frontend tests, or do I need to run all tests?

@neutrinoceros (Member)

tests/nose_runner.py is a convenience script. I'm sure it's possible to run only a fraction of the tests.
I'm actually very confused to see the docs recommend using pytest for that
(https://yt-project.org/docs/dev/developing/testing.html#how-to-run-the-answer-tests),

so clearly I'm not the expert here. Does the documentation help? If not, maybe @Xarthisius would be of better guidance.

@pgrete (Contributor, Author) commented Mar 20, 2023

I figured it out (for reference: nosetests -v --with-answer-testing --local --local-dir $HOME/src/yt-test-data --answer-store --answer-name=local-parthenon yt.frontends.parthenon). Going back to the old docs helped.

I just added a plain Parthenon 2D test (which of course will fail given that the data is not available online).

What do the next steps look like in general (in terms of when to upload/open a PR for the reference files) and the combined review?

With respect to this PR, I plan to make the following two additions:

  • a 3D example (of the 2D type above), but with dx != dy != dz
  • a 3D AthenaPK example (incl. changes to work with a code hint) with support for units

Do you recommend any other additional tests?
Is something else required to complete this PR?

@neutrinoceros (Member)

What do the next steps look like in general (in terms of when to upload/open a PR for the reference files) and the combined review?

AFAICT it seems mature enough that submitting datasets for upload now wouldn't be inappropriate.

Do you recommend any other additional tests?

I think that frontend tests should be as focused as possible on things that might go wrong specifically with the data format at hand. Tests exposing problems encountered during development are also very welcome, but don't feel like you need to be 100% exhaustive.

Is sth else required to complete this PR?

Eventually we'll want to have 2 approvals from maintainers, but you're close to getting mine already, since my initial comments were all addressed! I'll need to review again, and I'm ready to do so as soon as needed. Feel free to ping me if you'd rather get my review before the datasets are uploaded; otherwise I'll just wait for tests to go green!

@pgrete (Contributor, Author) commented Apr 13, 2023

In the process of adding data files from two other codes, I'm currently wondering:
a) whether (and how) yt might handle data from general relativistic simulations, i.e., data that might carry a metric around
b) whether (and how) multi-material simulation data is supported (e.g., different equations of state for different fields)

@matthewturk (Member)

@pgrete For a), this should be feasible (@chrishavlin has done some work along the same lines, integrating pressure to get spatial coordinates), but it might need some collaborative thinking, which I'm up for. For b), this should be feasible, even in the worst case, by using multiple "field types". Either way, both are high priorities, so let's try!

Comment on lines +85 to +87
for mom_field_name in ["MomentumDensity", "cons_momentum_density_"]:
    mom_field = ("parthenon", f"{mom_field_name}{i+1}")
    if mom_field in self.field_list:
@pgrete (Contributor, Author)

I updated quite a bit of the field logic (moving things to known_other_fields above) and am now wondering whether the registered/resolved fields are potentially already available here.
That would simplify the logic here (and at several places below): we could just check whether ("gas", "momentum_density_x") is known, rather than look up our own names in self.field_list.

@forrestglines (Contributor)

Just pushed some quick changes so that coordinate ordering for cylindrical and spherical coordinates follows Athena++'s conventions, which AthenaPK will inherit.

(This is the last functional addition we had planned for this PR)

@@ -17,7 +17,9 @@
 from .fields import ParthenonFieldInfo

 geom_map = {
-    "UniformCartesian": Geometry.CARTESIAN,
+    "UniformCartesian": (Geometry.CARTESIAN, ("x", "y", "z")),
+    "UniformCylindrical": (Geometry.CYLINDRICAL, ("r", "theta", "z")),
@neutrinoceros (Member)

@forrestglines I suspect this line is the only reason for your latest commit, so I should point out that you can just use Geometry.POLAR here, so you don't need to specify axis_order at all.

@forrestglines (Contributor)

Thanks @neutrinoceros, that would certainly allow a bit of code cleanup. Unfortunately, changing to Geometry.POLAR breaks my scripts that use fixed resolution buffers. For example, this code to make a slice plot works with both Geometry.POLAR and the current solution with Geometry.CYLINDRICAL:

ds = yt.load(filename)

slc = yt.SlicePlot(ds, "z", ("gas", "density"))

but this code below only works with my current solution with Geometry.CYLINDRICAL:

ds = yt.load(filename)

slc = yt.SlicePlot(ds, "z", ("gas", "density"))
frb = slc.data_source.to_frb(width=(4, "cm"), resolution=512, center=(0, 0))
frb["r"]

with an error being thrown by pixelize_cylinder:

File ~/installs-chicoma/miniconda3/envs/py39/lib/python3.9/site-packages/yt/visualization/fixed_resolution.py:600, in CylindricalFixedResolutionBuffer._generate_image_and_mask(self, item)
    597 @override
    598 def _generate_image_and_mask(self, item) -> None:
    599     buff = np.zeros(self.buff_size, dtype="f8")
--> 600     mask = pixelize_cylinder(
    601         buff,
    602         self.data_source["r"],
    603         self.data_source["dr"],
    604         self.data_source["theta"],
    605         self.data_source["dtheta"],
    606         self.data_source[item].astype("float64"),
    607         self.radius,
    608         return_mask=True,
    609     )
    610     self.data[item] = ImageArray(
    611         buff, units=self.data_source[item].units, info=self._get_info(item)
    612     )
    613     self.mask[item] = mask

File ~/installs-chicoma/miniconda3/envs/py39/lib/python3.9/site-packages/yt/utilities/lib/pixelization_routines.pyx:552, in yt.utilities.lib.pixelization_routines.pixelize_cylinder()

ValueError: Buffer has wrong number of dimensions (expected 2, got 1)

Suggestions? Is this something to be fixed in the frontend or in yt? I can provide the data file for this cylindrical dataset; it will be included with our tests.

(Member)

Hard to tell for sure without the data file, but I can have a look if you can share it.

@forrestglines (Contributor)

Thanks! I've pushed the small changes to switch from Geometry.CYLINDRICAL to Geometry.POLAR.

Here's the file I was testing on (with a .txt extension due to GitHub's limitations):

disk_cyl.prim.00000.phdf.txt

(Member)

Thank you! I can reproduce locally, which is a great start. I'll try to debug this today.

(Member)

Actually, it's a yt bug; I can also reproduce it with AMRVAC data:

import yt

ds = yt.load_sample("bw_polar_2d")
slc = yt.SlicePlot(ds, "z", ("gas", "density"))
frb = slc.data_source.to_frb(width=(4, "cm"), resolution=512, center=(0, 0))
frb["r"]

I'll open a dedicated issue.

(Contributor)

I saw issue #4790, but the bug is not related to 2D datasets: I also explored this with a 3D dataset and got the same error.

disk_cyl.3d.prim.00000.phdf.txt

(Contributor)

Fantastic that we have a bug fix started from this so soon. While the fix did eliminate the FRB error, the polar coordinates are still not behaving like the cylindrical coordinates. In some comparison plots between Athena++ and AthenaPK, using FRBs of a slice generated with the same code, I get plots looking like this:

[screenshot: side-by-side slice plots; the AthenaPK/POLAR panel shows only one quadrant, the Athena++/CYLINDRICAL panel shows the full disk]

My Parthenon frontend (using Geometry.POLAR with your new bugfix) is only getting one quadrant, while the Athena++ frontend (using Geometry.CYLINDRICAL) is getting the full disk as expected.

Let me work on a better reproducer, perhaps one using an existing frontend in yt. I'm not convinced either way whether this is a Parthenon frontend bug or a polar-geometry bug, but it's still a problem that changing from cylindrical to polar changes my plots.

@chrishavlin (Contributor) commented Jan 25, 2024

Just FYI, I hit the button for the tests to run, but you'll likely get unrelated failures after your merge with main (e.g., #4785).

@forrestglines (Contributor)

I've pushed @neutrinoceros's bug fix to this PR, although I'm still working on the polar issue. You can advise on how you'd like me to fix the git history later if the merges get messy.

@neutrinoceros (Member)

Where do the tarballs actually go?

They'll be installed on the server by someone with admin rights (not me).

What's the hash I need to add to the registry? Some flavor of checksum?

See pooch docs for how to produce hashes.
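For instance, pooch itself can compute the checksum (a sketch; the tarball name is made up):

import pooch

# Compute the sha256 hash that goes into the sample-data registry.
print(pooch.file_hash("athenapk_cluster.tar.gz", alg="sha256"))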

I assume this won't work until steps 1 and 2 are done?

Correct!

@neutrinoceros (Member)

I'll try to review thoroughly again soon (probably not before the weekend).

@forrestglines (Contributor)

I've pushed the sha256 hashes of the tarballs. yt.load_sample is working as I would expect it to:

ds = yt.load_sample("athenapk_cluster")
slc = yt.SlicePlot(ds, "z", fields=["density"])
slc.show()

ds = yt.load_sample("athenapk_disk")
slc = yt.SlicePlot(ds, "z", fields=["density"])
slc.show()

ds = yt.load_sample("parthenon_advection")
slc = yt.SlicePlot(ds, "z", fields=["advected_0_0"])
slc.show()

@forrestglines (Contributor)

Needed a small fix to the name of a dataset that I had changed as I created the tarball. My nose test passes locally now:

nosetests -v --with-answer-testing --local --local-dir $HOME/trash/yt-test-data --answer-store --answer-name=local-parthenon yt.frontends.parthenon
# Runs through 63 tests quickly

@Xarthisius (Member) commented Feb 27, 2024

Please add the following change to trigger answer testing on yt infra:

diff --git a/tests/tests.yaml b/tests/tests.yaml
index 25e310d87..a87af7b7b 100644
--- a/tests/tests.yaml
+++ b/tests/tests.yaml
@@ -164,6 +164,11 @@ answer_tests:
   local_nc4_cm1_002: # PR  2176, 2998
     - yt/frontends/nc4_cm1/tests/test_outputs.py:test_cm1_mesh_fields
 
+  local_parthenon_001: # PR 4323
+    - yt/frontends/parthenon/tests/test_outputs.py:test_loading_data
+    - yt/frontends/parthenon/tests/test_outputs.py:test_cluster
+    - yt/frontends/parthenon/tests/test_outputs.py:test_derived_fields
+
 other_tests:
   unittests:
      # keep in sync with nose_ignores.txt

Please note that only those 3 tests are answer tests, i.e., they generate data that can be stored and compared. All the other tests in test_outputs.py are unit tests and should use the @requires_file decorator instead of @requires_ds.
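For illustration, a rough sketch of the distinction (the dataset path and test bodies are hypothetical):

from yt.testing import requires_file
from yt.utilities.answer_testing.framework import data_dir_load, requires_ds

sample = "parthenon_advection/advection_2d.out0.final.phdf"  # hypothetical path

# Unit test: only needs the file to be present locally; nothing is stored.
@requires_file(sample)
def test_loading():
    ds = data_dir_load(sample)
    assert ds is not None

# Answer test: its results are stored and compared against a gold standard,
# so it must also be listed in tests/tests.yaml.
@requires_ds(sample)
def test_fields():
    ds = data_dir_load(sample)
    ...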

@forrestglines (Contributor)

  Please note that only those 3 tests are answer tests, i.e., they generate data that can be stored and compared. All the other tests in test_outputs.py are unit tests and should use the @requires_file decorator instead of @requires_ds.

Done. Does this mean only these tests are run during CI? Are the unit tests covered during yt's normal CI?

@Xarthisius (Member)

  Done. Does this mean only these tests are run during CI? Are the unit tests covered during yt's normal CI?

Unit tests are detected automatically. Answer tests have to be specified in tests.yaml.

@BenWibking

"Full testsuite (nose)" fails because it says "there is no old answer available." (https://tests.yt-project.org/job/yt_py310_git/7946/).

Is there a problem here, or does it just need to be re-run?

@BenWibking

Is there a way to auto-format to satisfy the checks here: https://results.pre-commit.ci/run/github/88776493/1709230377.pe6ltUfYTG-No8lHXu3CpA?

@Xarthisius (Member)

@yt-fido test this please.

@neutrinoceros (Member)

@BenWibking Yes, the magic comment is found at the bottom of the page you linked.
Though I don't expect you have clearance to trigger the bot, as you're not the PR's author. I'll let @forrestglines take care of it.

There's exactly one lint error that cannot be auto-fixed, though:

yt/frontends/parthenon/io.py:13:15: C400 Unnecessary generator (rewrite as a `list` comprehension)

Please fix this one manually.
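For reference, C400 flags building a list by passing a generator expression to list(); the generic fix looks like this (the actual expression in io.py may differ):

# flagged by C400: unnecessary generator
names = list(fname for ftype, fname in fields)

# fix: use a list comprehension directly
names = [fname for ftype, fname in fields]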

@forrestglines (Contributor)

pre-commit.ci autofix

@BenWibking

Did the bot... react with a thumbs-down emoji?

@neutrinoceros (Member)

That's new. Let me see if I can convince it with maintainer authority.

@neutrinoceros (Member)

pre-commit.ci autofix

@BenWibking

It looks like the docs failed to build: https://tests.yt-project.org/job/yt_docs_new/2762/console

@forrestglines (Contributor)

This broken docs build doesn't look like our doing; should we just merge main? (I'm mostly asking reviewers whether a simple merge from main would be better than another rebase.)

/var/jenkins_home/workspace/yt_docs_new/doc/source/reference/field_list.rst:26: ERROR: Unknown directive type "notebook-cell".

.. notebook-cell::

  import yt
  ds = yt.load("Enzo_64/DD0043/data0043")
  for i in sorted(ds.field_list):
    print(i)

Sphinx parallel build error:
nbsphinx.NotebookError: RuntimeError in cookbook/fits_xray_images.ipynb:
Kernel died before replying to kernel_info
make: *** [Makefile:51: html] Error 2
Process leaked file descriptors. See https://www.jenkins.io/redirect/troubleshooting/process-leaked-file-descriptors for more information
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording fingerprints
Recording fingerprints
Setting status of 4bc3027b7b862d76a69d45a3099962f97c1d134c to FAILURE with url https://tests.yt-project.org/job/yt_docs_new/2762/ and message: 'Build finished. '
Using context: docs
Finished: FAILURE

@Xarthisius (Member)

@yt-fido test this please

@Xarthisius (Member)

  This broken docs build doesn't look like our doing; should we just merge main? (I'm mostly asking reviewers whether a simple merge from main would be better than another rebase.)

Don't worry about it; it's not blocking this PR. I re-triggered the docs build and it will likely pass this time.

@neutrinoceros (Member)

Not sure that's a deterministic error. Let me try again:
@yt-fido test this please

@neutrinoceros (Member)

Docs built alright, but this time the regular tests failed (also flaky).
@yt-fido test this please

@neutrinoceros (Member)

All green now! I hope I'll find time to review this week, but with a major NumPy release candidate imminent, I may not. Thank you for your patience!

@chrishavlin (Contributor) left a comment

Generally looks good to me! Left a number of questions and comments, but nothing too critical.

Note that only AthenaPK data is currently automatically converted to the
standard fields known by yt.
For other codes, the raw data of the fields stored in the output file is accessible
and a conversion between those fields and yt standard fields need to be done manually.
@chrishavlin (Contributor)

Suggested change
and a conversion between those fields and yt standard fields need to be done manually.
and a conversion between those fields and yt standard fields needs to be done manually.


* Reading particle data from Parthenon output is currently not supported.
* Only Cartesian coordinate systems are currently supported.
* Other than periodic boundary conditions are currently not supported by the
@chrishavlin (Contributor)

I think this sentence could be clearer, e.g.: "Only periodic boundary conditions are currently supported."

zrat = self._handle["Info"].attrs["RootGridDomain"][8]
if xrat != 1.0 or yrat != 1.0 or zrat != 1.0:
    self.logarithmic = True
    raise ValueError(
@chrishavlin (Contributor)

I think a NotImplementedError is more appropriate here:

Suggested change
raise ValueError(
raise NotImplementedError(

self.logarithmic = False
self._magnetic_factor = get_magnetic_normalization(magnetic_normalization)

self.geometry = geom_map[self._handle["Info"].attrs["Coordinates"]]
@chrishavlin (Contributor)

The new docs say only Cartesian coordinates are supported -- is that still correct? If so, this should probably raise a NotImplementedError here as well; otherwise, update the docs.

Comment on lines +224 to +226
xmin, xmax = self._handle["Info"].attrs["RootGridDomain"][0:2]
ymin, ymax = self._handle["Info"].attrs["RootGridDomain"][3:5]
zmin, zmax = self._handle["Info"].attrs["RootGridDomain"][6:8]
@chrishavlin (Contributor)

Just to double check -- does Parthenon always store dummy values for unused dimensions in 1D or 2D simulations? I.e., in a 2D simulation, will zmin and zmax always be defined? This question also relates to the dimensionality calculation happening further down.

Comment on lines +113 to +115
self.grids = np.empty(self.num_grids, dtype="object")
for i in range(num_grids):
    self.grids[i] = self.grid(i, self, levels[i])
@chrishavlin (Contributor)

The __init__ methods of the self.grid objects won't reference the grid_right_edge, etc. arrays built above until _populate_grid_objects gets called, so I think you can simply move the self.grids[i] assignment into the prior loop and avoid this second iteration over num_grids.
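A minimal sketch of the suggested refactor, assuming the elided body is the existing loop that fills the per-grid arrays:

# Build each grid object in the same pass that fills its arrays, instead of
# iterating over num_grids a second time.
self.grids = np.empty(self.num_grids, dtype="object")
for i in range(num_grids):
    # ... fill self.grid_left_edge[i], self.grid_right_edge[i],
    #     self.grid_dimensions[i], and levels[i] as in the existing loop ...
    self.grids[i] = self.grid(i, self, levels[i])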

Comment on lines +89 to +91
self.grid_left_edge = np.zeros((num_grids, 3), dtype="float64")
self.grid_right_edge = np.zeros((num_grids, 3), dtype="float64")
self.grid_dimensions = np.zeros((num_grids, 3), dtype="int32")
@chrishavlin (Contributor)

So I'm pretty sure you can actually delete these lines, as these attributes will already be allocated:

_parse_index gets called by _setup_geometry during initialization of the index. Since _setup_geometry is not being overridden, here's the inherited _setup_geometry:

def _setup_geometry(self):
    mylog.debug("Counting grids.")
    self._count_grids()
    mylog.debug("Initializing grid arrays.")
    self._initialize_grid_arrays()
    mylog.debug("Parsing index.")
    self._parse_index()
    mylog.debug("Constructing grid objects.")
    self._populate_grid_objects()
    mylog.debug("Re-examining index")
    self._initialize_level_stats()

That _initialize_grid_arrays method will also be inherited and looks like:

def _initialize_grid_arrays(self):
    mylog.debug("Allocating arrays for %s grids", self.num_grids)
    self.grid_dimensions = np.ones((self.num_grids, 3), "int32")
    self.grid_left_edge = self.ds.arr(
        np.zeros((self.num_grids, 3), self.float_type), "code_length"
    )
    self.grid_right_edge = self.ds.arr(
        np.ones((self.num_grids, 3), self.float_type), "code_length"
    )
    self.grid_levels = np.zeros((self.num_grids, 1), "int32")
    self.grid_particle_count = np.zeros((self.num_grids, 1), "int32")

So these 3 lines just overwrite existing arrays with newly allocated ones unnecessarily. And since _initialize_grid_arrays instantiates unyt arrays with units of code_length, you can also delete the casting to unyt arrays happening further down (I'll point it out in a comment).

Comment on lines +110 to +111
self.grid_left_edge = self.ds.arr(self.grid_left_edge, "code_length")
self.grid_right_edge = self.ds.arr(self.grid_right_edge, "code_length")
@chrishavlin (Contributor)

If you delete the allocations noted in my previous comment you can also delete these lines.

ds = f[f"/{dname}"]
ind = 0
for chunk in chunks:
    if self.ds.logarithmic:
@chrishavlin (Contributor)

The ds initialization raises a NotImplementedError when ds.logarithmic is True. Should that still be the case? If this code is not functional yet, can you add an in-line comment?

ng,
)
last_dname = None
for field in fields:
@chrishavlin (Contributor)

So most (all other?) frontends loop over chunks, then fields, e.g.:

for chunk in chunks:
    # do stuff
    for field in fields:
        # get this field's data for this chunk
        # apply selector

The main reason is that yt's parallel chunking works by sending chunks to different processors, where field data is selected. So if you eventually want the Parthenon frontend to work with yt's parallel operations, you'll want to invert the loop order here. I don't think this needs to be done for this PR, but a refactor in the future might be warranted...

(Contributor)

I have tried to use yt's parallel functionality before, and have not been able to get it to work, so this might explain why. It would be ideal if this could work. Is inverting the loop order the only change that's needed to get it working?

@chrishavlin (Contributor)

So I think I should have said "to work more efficiently with yt's parallel operations" -- generally it's better to process all the fields within a chunk rather than to process every chunk for each field. But after thinking about it more, it might actually not be too important here, since it's reading from a single HDF5 file -- when chunks are spread across files, putting the chunk loop on the inside would incur the extra cost of opening/closing the same file multiple times.

In terms of getting parallel operations to work -- I don't think there should be more changes. When you run a parallel-enabled function with MPI, I believe the chunks argument comes into this function with only the chunks associated with the MPI process running the function. I can test that out though; I've been meaning to get back into some parallel-indexing-related work...
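To make the loop-order discussion concrete, here is a minimal sketch of the chunk-outer pattern (the method shape follows typical yt grid frontends, but _read_grid_field is a placeholder, not the PR's actual I/O):

import numpy as np

def _read_fluid_selection(self, chunks, selector, fields, size):
    # One output buffer and one fill offset per requested field.
    rv = {field: np.empty(size, dtype="float64") for field in fields}
    offsets = {field: 0 for field in fields}
    for chunk in chunks:            # outer loop: the unit of parallel work
        for grid in chunk.objs:
            for field in fields:    # inner loop: all fields for this chunk
                data = self._read_grid_field(grid, field)  # placeholder I/O
                offsets[field] += grid.select(
                    selector, data, rv[field], offsets[field]
                )
    return rv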

@neutrinoceros (Member) left a comment

Sorry for taking so long to get my review in. I note that there's also another, mostly unaddressed (from what I can see) review from @chrishavlin. Both need to be addressed before we can merge this.
Thank you for your patience!



@derived_field(name="density", units="g*cm**-3", sampling_type="cell")
def __density(field, data):
@neutrinoceros (Member)

Identifiers with double leading underscores are mangled in Python, which isn't generally needed or desirable
see https://docs.python.org/3/tutorial/classes.html#private-variables

Suggested change
def __density(field, data):
def _density(field, data):
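For reference, a tiny illustration of what the mangling does (it applies to double-underscore names defined inside a class body; the class name here is made up):

class FieldContainer:
    def __density(self, data):  # stored on the class as _FieldContainer__density
        return data

assert "_FieldContainer__density" in vars(FieldContainer)
assert "__density" not in vars(FieldContainer)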

Comment on lines +25 to +28
_cis = np.fromiter(
    chain.from_iterable(product([0, 1], [0, 1], [0, 1])), dtype=np.int64, count=8 * 3
)
_cis.shape = (8, 3)
@neutrinoceros (Member)

The resulting array is small and simple enough that I'd rather see it hard-coded for readability.

Suggested change
_cis = np.fromiter(
    chain.from_iterable(product([0, 1], [0, 1], [0, 1])), dtype=np.int64, count=8 * 3
)
_cis.shape = (8, 3)
# fmt: off
_cis = np.array(
    [
        [0, 0, 0],
        [0, 0, 1],
        [0, 1, 0],
        [0, 1, 1],
        [1, 0, 0],
        [1, 0, 1],
        [1, 1, 0],
        [1, 1, 1],
    ],
    dtype="int64",
)
# fmt: on

"Logarithmic grids not yet supported in Parthenon frontend."
)
else:
    self._index_class = ParthenonHierarchy
@neutrinoceros (Member)

_index_class is a class attribute; it should be set in the class body directly.

Comment on lines +85 to +86
if self.ds.logarithmic:
    pass
@neutrinoceros (Member)

This does nothing. Was the intention to return, raise, warn, or do something else?

)
else:
    self._index_class = ParthenonHierarchy
    self.logarithmic = False
@neutrinoceros (Member)

This attribute seems to be meant for internal usage only, so I suggest making it private.

)
elif ("parthenon", "TotalEnergyDensity") in self.field_list or (
    "parthenon",
    "cons_total_energy_density",
@neutrinoceros (Member)

same as above (formatting)

Suggested change
"cons_total_energy_density",
"cons_total_energy_density"

Comment on lines +30 to +33
def _read_particles(
    self, fields_to_read, type, args, grid_list, count_list, conv_factors
):
    pass
@neutrinoceros (Member)

This method is not required, so I'd rather not commit no-op code.

Suggested change
def _read_particles(
    self, fields_to_read, type, args, grid_list, count_list, conv_factors
):
    pass

Comment on lines +37 to +38
if any((ftype != "parthenon" for ftype, fname in fields)):
    raise NotImplementedError
@neutrinoceros (Member)

I'm not sure this is reachable, and I find it a bit confusing

Suggested change
if any((ftype != "parthenon" for ftype, fname in fields)):
    raise NotImplementedError

for gs in grid_sequences(chunk.objs):
    start = gs[0].id - gs[0]._id_offset
    end = gs[-1].id - gs[-1]._id_offset + 1
    # Deprecate this fallback on next major Parthenon release
@neutrinoceros (Member)

Can you clarify what "next" means in this comment? What's the reference version?
Or maybe we don't need to deprecate the fallback at all; it seems like a reasonable amount of legacy code to carry around. But it would still be nice if the comment indicated which versions of Parthenon it is for.

for gs in grid_sequences(chunk.objs):
    start = gs[0].id - gs[0]._id_offset
    end = gs[-1].id - gs[-1]._id_offset + 1
    # Deprecate this fallback on next major Parthenon release
@neutrinoceros (Member)

same as above.

Labels: code frontends (Things related to specific frontends), new feature (Something fun and new!)
7 participants