Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Projection of periodic boxes with specified width smaller than boxsize. #3916

Closed
benopp99 opened this issue May 9, 2022 · 46 comments
Closed

Comments

@benopp99
Copy link

benopp99 commented May 9, 2022

I've seemingly found an issue with periodic box wrapping in ProjectionPlot. Below is a 25 Mpc/h IllustrisTNG periodic volume, and I kept getting maps zoomed into halos cut off at the periodic boundary. This was seemingly confounding because when I run a whole box, the wrapper works perfectly. After a lot of tinkering, I think it has to do with the width argument being not equal to the domain. I made images of 23000, 24000, 25000, and 26000 kpc/h centered at (22000, 22000, 22000), which I think serves as a good diagnostic of the problem. A loss gap opens up at the periodic boundary (i.e. 25000) and grows as the width becomes smaller (23000 is upper left, 24000 upper right, and so on). Therefore for a ProjectionPlot zoomed on a specific halo near the periodic boundary, you often incorrectly miss the halo. Here's the code snippet (I used FITSProjection which is a wrapper for ProjectionPlot, which has same issue):

    my_slice = ds.box(left_edge, right_edge) 

    Pos = [22000,22000,22000]
    Width = 23000

    proj_fits_slice = yt.FITSProjection(ds, "z", ("gas","O_p6_number_density"), data_source=my_slice, image_res=npix,center=Pos,width=Width)

Note this is in all types of periodic simulations- EAGLE, Simba too. Maybe I'm doing something wrong, and if there is a quick fix, let me know.

I've added a ProjectionPlot of oxygen (instead of O^6+) to more clearly show the 23000 case.

Version Information
python 3.9.5
yt 4.1.dev0

Note that this issue has been seen in earlier version of yt going back a couple years. i.e. before yt 4.0

gaps_1
gaps_2

@neutrinoceros
Copy link
Member

Triaging as "viz: 2D" because this bug manifests itself in a ProjectionPlot, though it seems likely that this is really a particle selection bug.

@neutrinoceros
Copy link
Member

@benopp99 Could you please share more of the script you're using to produce this buggy image ? Also, can you confirm that you're dealing with a particle field ?

@benopp99
Copy link
Author

benopp99 commented May 9, 2022

Yes, this is a particle field (AREPO). (Same with EAGLE, Simba).

Here's a script that you can adapt another simulation to. If you need this simulation snapshot, it can be made available.

One more thing-- I use trident, but this is not an issue with trident, so you can do an oxygen map with the ProjectionPlot routine and avoid loading/installing trident.

import trident
import sys
import astropy
import numpy as np
import h5py
import os.path

sim = sys.argv[1]
run = sys.argv[2]
snap = sys.argv[3]
axis = "z"

npix = 4000
nslices = 6

snapshot_filename = "/mnt/ceph/users/camels/Sims/" + sim + "/" + run + "/snap_" + snap + ".hdf5"
output_basename = sim + "." + run + ".snap_" + snap

ds = yt.load(snapshot_filename)

print("ds= ",ds.domain_left_edge,ds.domain_right_edge,ds.domain_right_edge[1],ds.domain_right_edge[2])

for i in range(nslices):
    slice_frac_low = i*1./nslices
    slice_frac_high = (i+1)*1./nslices
    print("slice_frac_low, slice_frac_high= ",slice_frac_low, slice_frac_high)

    left_edge = [ds.domain_left_edge[0],ds.domain_left_edge[1],slice_frac_low*ds.domain_right_edge[2]]
    right_edge = [ds.domain_right_edge[0],ds.domain_right_edge[1],slice_frac_high*ds.domain_right_edge[2]]

    my_slice = ds.box(left_edge, right_edge) 

    trident.add_ion_fields(ds, ions=['O VII'])

    Pos = [22000,22000,22000]
    Width = 23000
#    Width = (32686, 'kpc')

    proj = yt.ProjectionPlot(ds, "z", ("gas","O_number_density"), data_source=my_slice,center=Pos,width=Width,buff_size=[npix,npix])
    proj.save(name=output_basename + '.trident_O.slice%dof%d.png'%(i+1,nslices))

    proj_fits_slice = yt.FITSProjection(ds, "z", ("gas","O_p6_number_density"), data_source=my_slice, image_res=npix,center=Pos,width=Width)
    proj_fits_slice.writeto(output_basename + '.trident_OVII.slice%dof%d.fits'%(i+1,nslices),overwrite=True)

@neutrinoceros
Copy link
Member

yes I think having the exact dataset could help as a first step

@benopp99
Copy link
Author

benopp99 commented May 9, 2022

Let's see if this works. It's 3 GB. via the wormhole app.
https://wormhole.app/Nmvv4#7VKmczrb6lwxME-ZAje0fQ

@neutrinoceros
Copy link
Member

Thanks for sharing the dataset! Your script is still a little far from a minimal reproducer however, and it is proving quite difficult for me to reproduce your problem so far. In particular, I don't know how you're able to use yt 4.1dev with trident. To my knowledge, the latest version is not compatible with yt 4.1 (trident-project/trident#180).
Actually it seems like there are two possibilities here

  1. trident isn't required to reproduce your problem, in which case please try and simplify your script
  2. trident is required to reproduce, in which case it's very likely to be a trident bug rather than a yt one

Your script also necessitates some adjustments to be usable on another machine, it'd be good to simplify this too so we can focus on actually inquiring the problem.

@benopp99
Copy link
Author

trident is not necessary as this is a problem with projection plots. Let me simplify the script.

@benopp99
Copy link
Author

benopp99 commented May 11, 2022

import yt
import sys

sim = sys.argv[1]
run = sys.argv[2]
snap = sys.argv[3]
axis = "z"

npix = 4000
nslices = 6

#snapshot_filename = sim + "/" + run + "/snap_" + snap + ".hdf5"
snapshot_filename = "snap_" + snap + ".hdf5"
output_basename = sim + "." + run + ".snap_" + snap

ds = yt.load(snapshot_filename)

print("ds= ",ds.domain_left_edge,ds.domain_right_edge,ds.domain_right_edge[1],ds.domain_right_edge[2])

for i in range(nslices):
    slice_frac_low = i*1./nslices
    slice_frac_high = (i+1)*1./nslices
    print("slice_frac_low, slice_frac_high= ",slice_frac_low, slice_frac_high)

    left_edge = [ds.domain_left_edge[0],ds.domain_left_edge[1],slice_frac_low*ds.domain_right_edge[2]]
    right_edge = [ds.domain_right_edge[0],ds.domain_right_edge[1],slice_frac_high*ds.domain_right_edge[2]]

    my_slice = ds.box(left_edge, right_edge) 

    Pos = [22000,22000,22000]
    Width = 23000

    proj = yt.ProjectionPlot(ds, "z", ("gas","O_number_density"), data_source=my_slice,center=Pos,width=Width,buff_size=[npix,npix])
    proj.save(name=output_basename + '.O.slice%dof%d.png'%(i+1,nslices))

@neutrinoceros
Copy link
Member

neutrinoceros commented May 11, 2022

With this version of the script all I'm getting are uniform pictures such as this one

(click to unroll)

oui lol snap_032 O slice4of6

edit: I see now it's now actually uniform, but I couldn't distinguish the few colored dots from dust on my screen... that's a first 😄

@benopp99
Copy link
Author

benopp99 commented May 11, 2022

Hmmm... that color scale does seem strange as it goes from 10^18 to 0, so it wouldn't distinguish that gap. Can you set the scale bar manually in your case? I'm trying to see if the structures are aligned, and I don't think they are.
If you send me back the updated script, I'll rerun it too.

edit: I reran it myself without trident, and attached below, so you can grab that if you want.

@benopp99
Copy link
Author

I reran the simpler script (below), without trident. I still get the gap.

import yt
import sys

snap = sys.argv[1]
axis = "z"

npix = 4000
nslices = 6

snapshot_filename = "snap_" + snap + ".hdf5"
output_basename = "snap_" + snap

ds = yt.load(snapshot_filename)

print("ds= ",ds.domain_left_edge,ds.domain_right_edge,ds.domain_right_edge[1],ds.domain_right_edge[2])

for i in range(nslices):
    slice_frac_low = i*1./nslices
    slice_frac_high = (i+1)*1./nslices
    print("slice_frac_low, slice_frac_high= ",slice_frac_low, slice_frac_high)

    left_edge = [ds.domain_left_edge[0],ds.domain_left_edge[1],slice_frac_low*ds.domain_right_edge[2]]
    right_edge = [ds.domain_right_edge[0],ds.domain_right_edge[1],slice_frac_high*ds.domain_right_edge[2]]

    my_slice = ds.box(left_edge, right_edge) 

    Pos = [22000,22000,22000]
    Width = 23000

    proj = yt.ProjectionPlot(ds, "z", ("gas","O_number_density"), data_source=my_slice,center=Pos,width=Width,buff_size=[npix,npix])
    proj.save(name=output_basename + '.O.slice%dof%d.png'%(i+1,nslices))

image

@neutrinoceros
Copy link
Member

I've been making a lot of changes related to color scaling lately. For reference, I'm running your script with yt 4.0.3, which should behave as the current 4.1dev regarding color scaling.

Anyway here's a further simplified version of your script

import yt

ds = yt.load("snap_032.hdf5")

left_edge = [ds.domain_left_edge[0],ds.domain_left_edge[1],0*ds.domain_left_edge[0]]
right_edge = [ds.domain_right_edge[0],ds.domain_right_edge[1],0.5*ds.domain_right_edge[2]]

my_slice = ds.box(left_edge, right_edge) 

Pos = [22000,22000,22000]
Width = 23000

proj = yt.ProjectionPlot(ds, "z", ("gas","O_number_density"), data_source=my_slice,center=Pos,width=Width,buff_size=(50,50))
proj.set_log(("gas","O_number_density"), True, linthresh=1e-8)
proj.save("test.png")

So now I think I got it
test

@benopp99
Copy link
Author

benopp99 commented May 11, 2022

Nice, that's the issue with the colorbar not being scaled. I like that you did it with fewer pixels, so that it runs faster.

I would guess that in the loss gap, the particles are being mapped twice, once positive and once negative, and cancelling each other out. But that's just an uneducated guess and I haven't looked at the code.

@neutrinoceros
Copy link
Member

I think an important hint is that we're setting Width = 23000 (kpc) and the plot actually covers more than 30Mpc in width.

@benopp99
Copy link
Author

So no that's in code units, and that is 23000/h (h=0.6711). I spent a lot of time checking if this is a units issue, and as far as I can tell, it's not, because I changed everything (ds.domain_edges, etc.) to physical kpc, and other similar tests. It can be explored, but I feel like I exhausted those avenues.

@neutrinoceros
Copy link
Member

Oh I see, I incorrectly assumed that the length unit was kpc. Nevermind.

@neutrinoceros
Copy link
Member

neutrinoceros commented May 11, 2022

I think I found a reproducer that doesn't require your 3Gb dataset:

import numpy as np
import yt


def fake_ds(hsml_factor):
    npart = 3**3
    x = np.empty(npart)
    y = np.empty(npart)
    z = np.empty(npart)

    tot = 0
    for i in range(0, 3):
        for j in range(0, 3):
            for k in range(0, 3):
                x[tot] = i + 0.5
                y[tot] = j + 0.5
                z[tot] = k + 0.5
                tot += 1

    data = {
        "particle_position_x": (x, "cm"),
        "particle_position_y": (y, "cm"),
        "particle_position_z": (z, "cm"),
        "particle_mass": (np.ones(npart), "g"),
        "particle_velocity_x": (np.zeros(npart), "cm/s"),
        "particle_velocity_y": (np.zeros(npart), "cm/s"),
        "particle_velocity_z": (np.zeros(npart), "cm/s"),
        "smoothing_length": (0.05 * np.ones(npart) * hsml_factor, "cm"),
        "density": (np.ones(npart), "g/cm**3"),
        "temperature": (np.ones(npart), "K"),
    }

    bbox = np.array([[0, 3], [0, 3], [0, 3]])

    return yt.load_particles(data=data, length_unit=1.0, bbox=bbox)


ds = fake_ds(10)

for w in (3, 2, 1.5):
    proj = yt.ProjectionPlot(
        ds, "z", ("gas", "density"), center=(2.8, 2.8, 2.8), width=w,
    )
    proj.save(f"/tmp/test_{w=}.png")

outputs

click to expand

test_w=3
test_w=2
test_w=1 5

So we can see that the further we zoom in, the more data we miss.
Do you think this correctly mimics your situation ?

one last note for today: this issue seems specific to projections indeed. I don't see any problem with the SlicePlot equivalent.

@benopp99
Copy link
Author

Interesting to see this. And yes, I see that as well- the more you zoom in, the more that is missed (which is the other image I showed above from the progression from 23000-26000). I can make more of those from the 3GB dataset to show you the progression more clearly, because it is revealing.
Yours is a different pattern than I expected (I would expect whole columns and rows removed, but the 2nd one shows only the corner particle missing).

Let me get back to you on this in a little bit.

@benopp99
Copy link
Author

This is the behavior I see. The gap opens up at the periodic boundary and grows at double the rate of the inverse of the width (when it is smaller than the boxsize) away from the boundary. e.g. width=25 Mpc/h- no gap, 24 Mpc/h- gap from 25-27 Mpc/h, 23 Mpc/h- gap 25-29 Mpc/h, etc. Or so it seems.

@neutrinoceros
Copy link
Member

So I've tried tracking this down but for now my results are limited. I know that the following region of the code is hit at pixelisation time (at least in my minimal example)

if isinstance(data_source, YTParticleProj):

but so far I haven't been able to find where the actual particle selection is performed. At this point I don't think it's my best shot to just try and find this myself so let's ask for help. @matthewturk, any pointers ?

@neutrinoceros
Copy link
Member

just a quick note that this problem seems to be loosely related (at least thematically) to #2639 and #3386

@benopp99
Copy link
Author

So taking a smaller width than the whole box behaves like the bounding box issue in #2639? And the box periodicity is turned off in such a case now?

@matthewturk's comment from that thread suggests that regions are prevented from spanning periodic boundaries: "We don't currently allow bbox selections to run off the edge of a domain (i.e., like we do with ds.region), so this shouldn't block anything that works. Is that right?"

I believe this issue happens when I don't explicitly set data_source to be a ds.box region, but I can recheck this if necessary. I am using ds.box to have limited depth in the box (or what I call a slice).

@benopp99
Copy link
Author

benopp99 commented May 18, 2022

Just to check, I do get the same issue without the region (ds.box), so does this not have to do with bounding box?

proj = yt.ProjectionPlot(ds, "z", ("gas","O_number_density"),center=Pos,width=Width,buff_size=[npix,npix])

Width is 23000 (vs. 25000 total boxsize).
image

@neutrinoceros
Copy link
Member

Sorry about the confusion, indeed these problems are not directly related. What I meant to point out is that they both are caused by how particle selection works with periodicity.
This may give us insight as for where to look in the code base.

@neutrinoceros
Copy link
Member

neutrinoceros commented May 21, 2022

Making progress ! I investigated a bit further and I'm almost certain that the problematic selection happens here:

proj_reg = data_source.ds.region(
left_edge=le,
right_edge=re,
center=data_source.center,
data_source=data_source.data_source,
)
proj_reg.set_field_parameter("axis", data_source.axis)
buff = np.zeros(size, dtype="float64")
if weight is None:
for chunk in proj_reg.chunks([], "io"):

For context:

  • this is right before the pixelisation method is called, proj_reg.chunks([], "io") yields some subsections of the dataset (and seems to omit some)
  • proj_region is a YTRegion object
  • the proj_region.chunk method is defined in YTSelectionContainer
  • in this scope, data_source is a YTParticleProj object

AFAICT, all these objects are correctly initialised, regarding bounds and periodicity. The only problem seems to be with chunking.

@matthewturk
Copy link
Member

@neutrinoceros This is sounding somewhat familiar. I have not been able to be checked-in to this as much as I would like, so I will ask a few leading questions -- what happens if you change proj_reg to be all_data instead? Do you still see the errors? I'm also trying to think about whether or not this is related to periodicity distance calculations in the kernel versus in the IO chunking/selection. What happens if you use a slightly different selector, like a sphere centered at the same place but with a radius big enough to encompass it all?

@neutrinoceros
Copy link
Member

what happens if you change proj_reg to be all_data instead? Do you still see the errors?

taking my example from #3916 (comment), I obtain

This ! (click to expand)

test_w=2

and taking my example from #3916 (comment), I obtain

That ! (click to expand)

test

So no errors in sight here !
Using a sphere selector gets me the same result (first test), and I'm not sure what the second test tells us, but here's what I get:

(click to expand)

test

So if I'm reading this right, it confirms my hunch that the IO selection is where the bug is, would you agree ?

@benopp99
Copy link
Author

Happy to test this out on my system.

Is this the line that needs to be changed?

                    for chunk in proj_reg.chunks([], "io"):

to something like:

                    for chunk in all_data.chunks([], "io")

?

@neutrinoceros
Copy link
Member

The simplest change is override proj_reg = data_source.ds.all_data(), but beware this isn't an actual solution

@benopp99
Copy link
Author

benopp99 commented May 26, 2022

Okay, I'm trying this override. It does solve the problem of the gaps, but I think it is selecting everything, e.g. if I use a region like a box to limit the depth of the projection, I believe it is not using that region to limit the particle selection. But I'll check to make sure. Though you probably can answer that this is the case when using all_data().

Update: Yes, it looks to take the entire depth.

@benopp99
Copy link
Author

benopp99 commented Jun 1, 2022

Hi @neutrinoceros and @matthewturk So the simple override is not a longterm solution. Is there a fix that can be made so one can take regions of limited depth?

@jzuhone
Copy link
Contributor

jzuhone commented Aug 3, 2022

@benopp99 @matthewturk @neutrinoceros I am also seeing this creep up in another situation so I am going to take a whack at it. The main issue here is that when you specify a smaller rectangular region across a periodic boundary, it does not select all of the particles. It actually doesn't have anything to do with projections per se

Frankly, this is quite a major bug so I am surprised it's not getting reported more often or isn't getting caught in a test.

@matthewturk
Copy link
Member

matthewturk commented Aug 3, 2022 via email

@jzuhone
Copy link
Contributor

jzuhone commented Aug 3, 2022

@matthewturk I am worried that something is going wrong here:

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cdivision(True)
cdef int select_sphere(self, np.float64_t pos[3], np.float64_t radius) nogil:
# adapted from http://stackoverflow.com/a/4579192/1382869
cdef int i
cdef np.float64_t p
cdef np.float64_t r2 = radius**2
cdef np.float64_t dmin = 0
for i in range(3):
if self.periodicity[i] and self.check_period[i]:
p = pos[i] + self.right_edge_shift[i]
else:
p = pos[i]
if p < self.left_edge[i]:
dmin += (p - self.left_edge[i])**2
elif pos[i] > self.right_edge[i]:
dmin += (p - self.right_edge[i])**2
return int(dmin <= r2)

@matthewturk
Copy link
Member

matthewturk commented Aug 3, 2022 via email

@neutrinoceros
Copy link
Member

Sorry I haven't been able to follow this. Thank you John for picking it up, I agree this is a major bug.

@jzuhone
Copy link
Contributor

jzuhone commented Aug 4, 2022

Hi @benopp99 can you re-post the file that was originally here:

https://wormhole.app/Nmvv4#7VKmczrb6lwxME-ZAje0fQ

It's not available anymore.

@jzuhone
Copy link
Contributor

jzuhone commented Aug 4, 2022

This bit of logic at 0e206f0
partially fixes @neutrinoceros's example, but still doesn't get us all of the way. The particles on the left boundary are still clipped out in the middle image:

test_w=3

test_w=2

test_w=1 5

@benopp99
Copy link
Author

benopp99 commented Aug 4, 2022 via email

@jzuhone
Copy link
Contributor

jzuhone commented Aug 4, 2022

I think if we can figure out the simple example we should be in good shape

@benopp99
Copy link
Author

benopp99 commented Aug 4, 2022 via email

@jzuhone
Copy link
Contributor

jzuhone commented Aug 4, 2022

@benopp99 it would be good to have it just to make sure it's fixed for your case. PR incoming

@neutrinoceros
Copy link
Member

Fixed with #4050

@benopp99
Copy link
Author

Just an epilogue. Now that I have 4.0.5 uploaded, the boundary wrapping works whereas it did not in 4.0.3. See this IllustrisTNG100 example. Thanks for the fix!

TNG100 yt_4 0 3_4 0 5 comparison

@neutrinoceros
Copy link
Member

Yup, the fix was the main motivation for 4.0.5
I'm happy this actually works for you in production !

@chummels
Copy link
Member

Awesome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants