Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange square holes in slices and projections of Arepo data #3672

Closed
jzuhone opened this issue Nov 16, 2021 · 18 comments · Fixed by #3788
Closed

Strange square holes in slices and projections of Arepo data #3672

jzuhone opened this issue Nov 16, 2021 · 18 comments · Fixed by #3788
Assignees
Labels

Comments

@jzuhone
Copy link
Contributor

jzuhone commented Nov 16, 2021

Bug report

Bug summary

Often, but not always, making slices and/or projections of Arepo data creates square/rectangle-shaped gaps in the image. See the code below for examples.

A couple of weeks ago @matthewturk and I worked on this but were not able to find the source of the gaps. We've convincingly ruled out the pixelization routines themselves for slices and projections, leading us to suspect that something has gone wrong in data selection. As you can see below, the behavior is quite general, applying to SlicePlot, ProjectionPlot, and ParticleProjectionPlot, and it can also be seen if you examine the underlying objects themselves (e.g, make a scatter plot of the particle positions in the YTSlice data).

Oddly, sometimes the gaps go away if you change the center or the width of the image (sometimes only very slightly).

Code for reproduction

The dataset used in this example can be downloaded via curl -JO http://use.yt/upload/165c65a1, but the problem appears in many of these datasets.

The code can be found in this notebook:

https://gist.github.com/jzuhone/c4b41eb06947f28e220a364713947995

Version Information

  • Operating System: macOS, Linux
  • Python Version: 3.9
  • yt version: 4.0.1, installed from source
  • Other Libraries (if applicable): N/A
@neutrinoceros
Copy link
Member

I'm going to triage this as a "viz" bug because it's clear that visualisations are affected, but don't hesitate to apply any more appropriate labels instead.

@jzuhone
Copy link
Contributor Author

jzuhone commented Nov 16, 2021

Even simpler script showing that this is entirely about data selection:

https://gist.github.com/jzuhone/c2cc2621347b2f188b3d84296970af0b

@neutrinoceros
Copy link
Member

Indeed it's pretty clearly not viz that's the problem. Any idea if this bug is frontend specific yet ?

@matthewturk
Copy link
Member

matthewturk commented Nov 16, 2021 via email

@jzuhone
Copy link
Contributor Author

jzuhone commented Nov 16, 2021

@matthewturk but why would it manifest itself in squares like that?

@matthewturk
Copy link
Member

matthewturk commented Nov 16, 2021 via email

@jzuhone
Copy link
Contributor Author

jzuhone commented Nov 16, 2021

So Arepo computes the "smoothing length" by taking the equivalent sphere volume of each Voronoi cell, computing a radius from that sphere, and multiplying by a small factor. IOW, it's not a standard SPH smoothing length.

Is there anything in the selection or chunking code that assumes SPH things like kernels or number of nearest neighbors, instead of just using the snoothing length that it's given by the frontend to select on?

@matthewturk
Copy link
Member

That's a good question. I looked into it and I was unable to find anything like that. I'll try to recreate my checks again today to really dive into this.

@jzuhone
Copy link
Contributor Author

jzuhone commented Nov 17, 2021

I plan on playing around with a couple of things to see if I can help narrow this down.

@matthewturk
Copy link
Member

I used this script, which in a very uncool fashion accepts arguments via int(sys.argv[-1]) to update the index order and compare; additionally, I disabled guessing the mi1 and mi2 values by inserting a return right after self.regions.find_collisions_coarse() in particle_geometry_handler.py around line 227.

import sys
import os
import glob
import yt

for fn in sorted(glob.glob("*.ewah*")):
    print(fn)
    os.unlink(fn)

index_order = (int(sys.argv[-2]), int(sys.argv[-1]))

ds = yt.load("snapshot_250.hdf5", index_order=index_order)

_, c = ds.find_min(("PartType1","Potential"))

slc = yt.SlicePlot(ds, "z", ("gas","density"), center=c, width=(1500.0,"kpc"))
print(slc.save(f"slice_{index_order[0]}_{index_order[1]}"))

I generated these plots, which I've alt-texted with their values for index_order.

slice_1_1_Slice_z_density
slice_1_2_Slice_z_density
slice_2_1_Slice_z_density
slice_2_2_Slice_z_density
slice_3_1_Slice_z_density
slice_3_2_Slice_z_density
slice_4_1_Slice_z_density
slice_4_2_Slice_z_density
slice_5_1_Slice_z_density
slice_5_2_Slice_z_density

The upshot here is that I am not really sure what's happening. I will continue to investigate.

@matthewturk
Copy link
Member

I have realized that one thing we have not looked at is the potential asymmetry of the particles we are missing. Specifically, do the "missing" particles tend to exist all on the positive side, all on the negative side, etc, and are they out of the slice based on just the slicing coordinate?

@matthewturk
Copy link
Member

I've spent a good amount of time on this, and here is what I found to work, although I have to admit that I think it may be masking a different problem.

Here is a script, with some inline comments, that I ran:

https://gist.github.com/555124c31c389acfc95966cf1c073741

Note that I had to disable the auto-resetting of the index order to make this operate as expected, by making the conditional on line 228 of particle_geometry_handler.py always fail.

What I found was that the following patch produced correct behavior for all tested scenarios:

diff --git a/yt/geometry/particle_oct_container.pyx b/yt/geometry/particle_oct_container.pyx
index e4c05453c..3489c4902 100644
--- a/yt/geometry/particle_oct_container.pyx
+++ b/yt/geometry/particle_oct_container.pyx
@@ -852,16 +852,18 @@ cdef class ParticleBitmap:
         cdef ewah_word_type w
         this_collection = BoolArrayCollection()
         cdef ewah_bool_array *refined_arr = NULL
+        out_collection = BoolArrayCollection()
         for it1 in coarse_refined_map:
             mi1 = it1.first
             refined_arr = &this_collection.ewah_coll[0][mi1]
-            this_collection.ewah_keys[0].set(mi1)
-            this_collection.ewah_refn[0].set(mi1)
+            out_collection.ewah_keys[0].set(mi1)
+            out_collection.ewah_refn[0].set(mi1)
             buf = &it1.second
             for vec_i in range(buf.sizeInBytes() / sizeof(ewah_word_type)):
                 w = buf.getWord(vec_i)
                 refined_arr.addWord(w)
-        out_collection = BoolArrayCollection()
         in_collection._logicalor(this_collection, out_collection)
         return out_collection

But I have reason to believe that the main impact of this patch is not to update out_collection but rather to disable the setting on this_collection. I have inspected the _logicalor method and I did not find any errors (including in the order of setting values, which I was briefly concerned about, as compressed EWAH arrays have to be set in increasing order.) What I think may be happening is that by not having this_collection set in the visited areas, it doesn't set any refined values in the coarse region, thus marking the entire coarse cell as being marked. (This essentially negates the entire ghost zone refined zone checks, I believe.)

So that's where I've gotten.

@jzuhone
Copy link
Contributor Author

jzuhone commented Jan 25, 2022

I still find it very strange that we only seem to see it on Arepo data. Should we keep thinking about if there is some kind of special thing about it that might interact with this?

@langmm
Copy link
Contributor

langmm commented Jan 25, 2022

At @matthewturk's request, I am taking a look at this today/tomorrow to see if the bitmap indexing is involved. One idea I am chasing down is the treatment of the refined ghost zones for the Arepo "smoothing lengths". One consequence of estimating the voronoi cells as spheres in the hsml estimation may be that some particles' neighbors are missed in the refined ghost zones if their cells are overly oblong. If this is the cause, the only way to account for this may be to use a larger radius (multiple smoothing lengths) for expanding ghost zones for Arepo datasets.

@jzuhone
Copy link
Contributor Author

jzuhone commented Jan 25, 2022

Thanks @langmm!

@jzuhone
Copy link
Contributor Author

jzuhone commented Feb 2, 2022

@langmm were you able to look into this yet?

@langmm
Copy link
Contributor

langmm commented Feb 2, 2022

@jzuhone I was able to take a look, but I am still unsure of the cause. Scaling up the hsml used for selecting files/particles without adjusting the hsml used in the pixelation eliminated any gaps, but resulted in a pixelized image for a scale factor of 10. What was odd is that a scale factor of 2 resulted in more files being selected than a factor of 10 which would indicate to me that there is an error in the bitmap file selection. I will be checking those routines today.

@langmm
Copy link
Contributor

langmm commented Feb 7, 2022

It turns out this was a bug in the refined bitmap index creation where some coarse cells never got refined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants