Adding more features for Enzo, yt, and rockstar #205

anchwr · 2022-07-18T13:18:25Z

I've added a handler for Enzo simulations run with the rockstar halo finder, a few extra properties for yt, and the option to translate some of the quantities calculated by rockstar into physical units.

apontzen

Thank you! This looks really neat, as we discussed before.

I spotted a few things that I've commented on individually, all very minor.

The failing integration test should start passing if you merge the latest master branch -- it is related to changes there rather than anything that looks wrong here.

I was wondering whether you have any smallish sample files that you would be willing to add to the testing files distribution? This way we could add them to the test database build to guard against us inadvertently breaking things later, as most of us are not using yt so it might fly under the radar.

apontzen · 2022-07-18T14:26:40Z

tangos/input_handlers/halo_stat_files/__init__.py

@@ -206,8 +217,37 @@ def filename(cls, timestep_filename):
        if basename.startswith("snapshot_"):
            timestep_id = int(basename[9:])
            return os.path.join(dirname, "out_%d.list"%timestep_id)
+        elif basename.startswith(("RD","DD")):


Why does the basename starting with RD/DD indicate that rockstar was run with yt?

Good point! I've separated all of the DD/RD checks out from the checks for datasets.txt. The new structure is: check for 'snapshot_' format, check for the existence of 'datasets.txt', then check for 'RD/DD' format. I guess we could also move the datasets.txt check above the 'snapshot_' check in case outputs with this format were run with yt's rockstar, but I don't know enough about GADGET outputs to know if that's useful.
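The order of checks described in this reply can be sketched roughly as follows. This is only an illustration, not the code in the PR: the function name `classify_rockstar_layout` and its return labels are hypothetical, and it assumes datasets.txt sits in the same directory as the timestep.

```python
import os


def classify_rockstar_layout(timestep_filename):
    """Return the label of the first check that matches, or None.

    Hypothetical helper: it only illustrates the order of the checks
    (snapshot_ prefix, then datasets.txt, then RD/DD prefix).
    """
    dirname = os.path.dirname(timestep_filename)
    basename = os.path.basename(timestep_filename)

    # 1. GADGET-style names such as snapshot_010
    if basename.startswith("snapshot_"):
        return "snapshot"
    # 2. datasets.txt present (assumed here to live next to the timestep)
    if os.path.exists(os.path.join(dirname, "datasets.txt")):
        return "datasets.txt"
    # 3. Enzo-style names such as RD0010 or DD0010
    if basename.startswith(("RD", "DD")):
        return "RD/DD"
    return None
```

With this ordering, an Enzo output next to a datasets.txt file is treated as a yt-run rockstar catalogue, while a snapshot_ output never reaches the datasets.txt check, which is the open question raised at the end of the reply.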

apontzen · 2022-07-19T14:13:39Z

tangos/examples/misc.py

+import re
+
+
+def timestep_index(self,tstep,**kwargs):


I think this is potentially confusing because some people may ingest simulations with only some time steps from the original sequence. If I understand what you're doing here correctly, it's pulling out the probable output number from the filename of the simulation itself. If so, I'm not sure this is worth including, as it may cause more confusion than help.

Agreed! I removed this.

apontzen · 2022-07-19T14:14:40Z

tangos/input_handlers/halo_stat_files/__init__.py

@@ -206,8 +217,37 @@ def filename(cls, timestep_filename):
        if basename.startswith("snapshot_"):
            timestep_id = int(basename[9:])
            return os.path.join(dirname, "out_%d.list"%timestep_id)
+        elif basename.startswith(("RD","DD")):


What is the significance of starting with RD or DD -- why does this indicate that rockstar was run with yt? Would it be ok just to look for the existence of datasets.txt as your next check does?

See above - you're right, no reason only Enzo outputs can use yt's rockstar!

apontzen · 2022-07-19T14:15:50Z

tangos/input_handlers/yt.py

+
+    def load_timestep_without_caching(self, ts_extension, mode=None):
+        from yt.data_objects.particle_filters import add_particle_filter
+        if mode!=None:


This should probably be mode is not None for style
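For context, PEP 8 recommends comparing to None with is / is not, partly because != can be overridden by the object being compared. A small self-contained illustration (the AlwaysEqual class is a toy invented for this example, not anything in tangos or yt):

```python
class AlwaysEqual:
    """Toy class whose comparison operators ignore the other operand."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False


mode = AlwaysEqual()

# The overridden operator claims the object is "not different" from None...
print(mode != None)      # False
# ...whereas the identity test cannot be overridden.
print(mode is not None)  # True
```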

apontzen · 2022-07-19T14:24:56Z

tangos/properties/yt/basic.py

+        potids = dbid[masses<halo_entry['Mvir']]
+        return np.array([db.get_halo(x) for x in potids[host_mask]])
+
+class GetTimestepName(LivePropertyCalculation):


I think this already exists as step_path so I would prefer not to duplicate it here under another name

You can find the existing implementation in live_calculation.builtin_functions.IntrinsicProperties

Excellent! Removed this.

apontzen · 2022-07-19T14:28:18Z

tangos/properties/yt/basic.py

+
+class FindCenter(PropertyCalculation):
+    """Returns center array in halo finder units"""
+    names = "Center"


Ideally names of properties implemented in tangos don't have a capital letter in them unless there is a particular reason. e.g. M200 deserves a capital because it is referring to a variable name with the capital, but Center to my mind doesn't because it's just a normal word (so I'd call it center). I realise there are a few exceptions, but generally this is true. Is it possible to make the property names lower case here, or will that break your workflow?

That's fine! I've changed the variable names so that most are lower case. The only ones I left as is were those that are translated directly out of rockstar (e.g., X_Mpc). Is this okay? Or would you prefer that these be lower case, as well?

apontzen · 2022-07-19T14:30:53Z

tangos/tools/consistent_trees_importer.py

+            if match:
+                # Check whether datasets.txt exists (i.e., if rockstar was run with yt)
+                if os.path.exists(os.path.join(basedir, "datasets.txt")):
+                    with open(os.path.join(basedir, "datasets.txt")) as f:


I notice you've used this datasets.txt file in a few different places. It may be that the logic about its use deserves to be in a utility function somewhere, and called from all the different places rather than reimplemented each time. Not 100% necessary, just would be neater if it's possible.

I took a shot at writing a quick utility function for this. For now, it's in tangos/util/read_datasets_file.py. Is this the right place for it? (and does the function itself look okay? It seems to work as expected, but I know there might be a more efficient way to accomplish the same thing)
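A minimal sketch of what such a shared helper could look like is below. It is not the function added in the PR: the name `read_datasets_index` is hypothetical, and the file layout (one "dataset path, integer index" pair per line, with # comment lines) is an assumption that should be checked against a real datasets.txt.

```python
import os


def read_datasets_index(basedir, filename="datasets.txt"):
    """Map dataset basename -> integer index, or return None if the file is absent.

    Assumes each non-comment line is '<dataset path> <integer index>',
    separated by whitespace.
    """
    path = os.path.join(basedir, filename)
    if not os.path.exists(path):
        return None

    index_for = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and the assumed comment header
            # Split once from the right so the index is the last token
            ds_path, index = line.rsplit(None, 1)
            index_for[os.path.basename(ds_path)] = int(index)
    return index_for
```

Returning None when the file is missing lets each caller fall back to its existing behaviour for rockstar runs made without yt.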

anchwr · 2022-08-10T04:27:10Z

Thank you so much for your feedback - I'm sorry it's taken me so long to get back around to this! I finally had a chance to do some work on the PR over the last couple of days. I've messaged Molly and Jason about sending you a few sample files. We ran some very small dwarf volumes about a year ago that I think might work and which I doubt they'll mind sharing. I'll let you know when I hear back from them!

anchwr · 2022-08-20T15:14:10Z

It turns out even our dwarf volumes are fairly hefty. However, there are a couple of low resolution Enzo datasets that are both very small and publicly available, so I ran rockstar and my fork of tangos on a few timesteps from one of them. Everything seems to have gone smoothly! I've uploaded all of the relevant data here. Do you think this would work as an addition to the testing files distribution?

apontzen · 2022-08-20T19:52:12Z

Thanks so much for doing this! It is actually much smaller than I meant by 'smallish' (I thought GB would still be fine) but that's not a problem - it can still be added as part of the test database build. Leave that with me and I will try to merge this soon. The code looks great now.

anchwr · 2022-08-21T18:40:37Z

Awesome!

apontzen · 2022-09-13T09:49:48Z

Hi @anchwr -- I've just got back to merging this, and I am wondering if you could guide me on what can be tested using your small snapshots. I can tangos add them successfully, but then it would be good to import/calculate some properties and merger trees to test all is working well. What should I expect to work?

anchwr · 2022-09-13T15:30:09Z

Oh! Good question. Here are the commands I used to make the db:

tangos add enzo.tinycosmo --handler=yt.YtInputHandler --min-particles 100
tangos import-consistent-trees --for enzo.tinycosmo --with-ids
tangos import-properties Mvir Rvir X Y Z VX VY VZ --for enzo.tinycosmo
tangos import-properties Mvir_Msun Rvir_kpc X_Mpc Y_Mpc Z_Mpc --for enzo.tinycosmo
tangos write center center_Mpc --for enzo.tinycosmo
mpiexec -np 8 tangos write Mgas Mcoldgas Mstar contamfrac --for enzo.tinycosmo --backend mpi4py

We probably don't need all of that, as some of it is really testing the same set of functions as other bits, but it should hopefully all work!

apontzen · 2022-09-14T11:35:24Z

This is perfect, thanks.

When you use tangos write, do you see warnings along the lines of check_deleted failed? I am getting these, and I think it means yt is not allowing the individual snapshots to be garbage-collected, which could result in spiralling memory usage. Was wondering if you saw these also.

(You can see an example here: https://github.com/pynbody/tangos/actions/runs/3051994309/jobs/4920876522#step:10:1244)

Final work to merge #205

apontzen · 2022-09-14T13:36:55Z

Thanks very much for all this work. It's now merged. We should deal with the check_deleted failed warnings separately; I'll be interested to hear your thoughts on them.

anchwr · 2022-09-14T16:12:45Z

When you use tangos write, do you see warnings along the lines of check_deleted failed? I am getting these, and I think it means yt is not allowing the individual snapshots to be garbage-collected, which could result in spiralling memory usage. Was wondering if you saw these also.

(You can see an example here: https://github.com/pynbody/tangos/actions/runs/3051994309/jobs/4920876522#step:10:1244)

Oh that's bizarre (and a bit worrying)! I don't get those warnings. Maybe I can experiment a bit with yt and gc and see if there are certain circumstances under which they just don't cooperate.
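One generic way to experiment with this kind of problem is to hold only a weak reference to an object, delete it, force a collection, and see whether the weak reference has gone dead. The sketch below uses only the standard library; the Snapshot class and the `released_after_delete` helper are stand-ins invented for the example and do not reproduce how tangos or yt implement their checks.

```python
import gc
import weakref


class Snapshot:
    """Stand-in for a loaded snapshot object (not a yt or tangos class)."""


def released_after_delete(obj_factory, leak_into=None):
    """Create an object, drop the local reference, and report whether it was freed."""
    obj = obj_factory()
    ref = weakref.ref(obj)
    if leak_into is not None:
        # Simulate a cache that keeps the object alive behind our back
        leak_into.append(obj)
    del obj
    gc.collect()
    return ref() is None


print(released_after_delete(Snapshot))       # True: nothing else held it

cache = []
print(released_after_delete(Snapshot, cache))  # False: the cache still holds it
```

When the result is False, `gc.get_referrers` on the surviving object can help identify what is still holding it.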

apontzen · 2022-09-14T16:16:52Z

Yes, if you have any insights let me know!

apontzen · 2022-09-14T16:18:21Z

PS it may be a yt version thing? I am using v4.0.5 and so is the github action.

anchwr · 2022-09-14T17:28:14Z

Oh, maybe! I'm on the 4.1 dev version. I will try installing 4.0.5 and see if the warnings appear.

anchwr · 2022-09-14T18:08:08Z

No warnings so far! I'll keep thinking about it, though.

anchwr added 14 commits July 12, 2022 17:11

Added more functionality for yt and rockstar

722a4a7

Updated to use correct get_halo method

3c0d687

Updated to use correct get_halo import path

d2e74e8

Updated to use correct get_halo import path

55182df

Updated fallback for rockstar not run with yt

561655d

Updated fallback for rockstar not run with yt

284d25b

Updated fallback for rockstar not run with yt

8467fe6

Fixed import, typo

8ffcc36

Fixed overwrite

e38f514

Fixed file ordering for rockstar run without yt

7ba7277

Fixed file ordering for rockstar run without yt

cd89fb0

Added physical unit translations for rockstar

1f77e47

Added physical unit translations for rockstar

9a36a85

Fixed rockstar output search path for yt

fb6e5ef

apontzen reviewed Jul 19, 2022

anchwr added 3 commits August 8, 2022 21:45

Merge branch 'pynbody:master' into master

ce57bc6

Made datasets check broader, wrote util function for it

c427bc4

Removed print statements

24db730

apontzen added a commit that referenced this pull request Sep 14, 2022

Merge pull request #207 from pynbody/yt-merge

d15f666

Final work to merge #205

apontzen merged commit 24db730 into pynbody:master Sep 14, 2022
