
VELOCIraptor zeros all fields with metallicity of star-forming gas #75

Closed
EvgeniiChaikin opened this issue Mar 31, 2021 · 17 comments
Labels
bug Something isn't working workaround available

Comments

@EvgeniiChaikin

Describe the bug
In short, when I introduce a new gas internal property field to the VELOCIraptor config file, VELOCIraptor begins to zero all fields containing the metallicity of star-forming gas.

More precisely, I have two VR config files, where the diff between the two is

177,181c177,181
< Gas_internal_property_names=DensitiesAtLastSupernovaEvent,GraphiteMasses,SilicatesMasses,AtomicHydrogenMasses,IonisedHydrogenMasses,MolecularHydrogenMasses,HydrogenMasses,HeliumMasses,
< Gas_internal_property_index_in_file=0,0,0,0,0,0,0,0,
< Gas_internal_property_input_output_unit_conversion_factors=1.0e10,1.0e10,1.0e10,1.0e10,1.0e10,1.0e10,1.0e10,1.0e10,
< Gas_internal_property_calculation_type=max,aperture_total,aperture_total,aperture_total,aperture_total,aperture_total,aperture_total,aperture_total,
< Gas_internal_property_output_units=solar_mass/MPc3,solar_mass,solar_mass,solar_mass,solar_mass,solar_mass,solar_mass,solar_mass,
---
> Gas_internal_property_names=DensitiesAtLastSupernovaEvent,GraphiteMasses,SilicatesMasses,AtomicHydrogenMasses,IonisedHydrogenMasses,MolecularHydrogenMasses,HydrogenMasses,HeliumMasses,IronOverHydrogenMasses,
> Gas_internal_property_index_in_file=0,0,0,0,0,0,0,0,0,
> Gas_internal_property_input_output_unit_conversion_factors=1.0e10,1.0e10,1.0e10,1.0e10,1.0e10,1.0e10,1.0e10,1.0e10,1.0e10,
> Gas_internal_property_calculation_type=max,aperture_total,aperture_total,aperture_total,aperture_total,aperture_total,aperture_total,aperture_total,aperture_total,
> Gas_internal_property_output_units=solar_mass/MPc3,solar_mass,solar_mass,solar_mass,solar_mass,solar_mass,solar_mass,solar_mass,solar_mass,

i.e. compared to config 1, config 2 has an additional field IronOverHydrogenMasses.

After I run VR

../VR_ICRAR_27_March/VELOCIraptor-STF/stf -C vrconfig_3dfof_subhalos_SO_hydro_1.cfg  -i colibre_0023 -o halo_v1_0023 -I 2
../VR_ICRAR_27_March/VELOCIraptor-STF/stf -C vrconfig_3dfof_subhalos_SO_hydro_2.cfg  -i colibre_0023 -o halo_v2_0023 -I 2

and look into the output fields, I find


IPython 6.3.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from velociraptor import load

In [2]: import numpy as np

In [3]: output_from_config1 = load("halo_v1_0023.properties.0")

In [4]: output_from_config2 = load("halo_v2_0023.properties.0")

In [5]: np.max(output_from_config1.apertures.zmet_gas_sf_100_kpc)
Out[5]: unyt_quantity(0.01275099, '83.33*dimensionless')

In [6]: np.max(output_from_config2.apertures.zmet_gas_sf_100_kpc)
Out[6]: unyt_quantity(0., '83.33*dimensionless')

where one can see that in the latter case (config 2) the field apertures.zmet_gas_sf_100_kpc has no positive values. This is true not only for the 100-kpc aperture but for all aperture sizes.

To Reproduce

  • Path to the VR build I use: /cosma7/data/dp004/dc-chai1/VR_ICRAR_27_March/VELOCIraptor-STF (version 64de17bff6925f47f3ebe8f8195108801d661d95)
  • Path to the SWIFT snapshot: /cosma7/data/dp004/dc-chai1/test_VR/colibre_0023.hdf5
  • Path to the two VR configs: /cosma7/data/dp004/dc-chai1/test_VR/vrconfig_3dfof_subhalos_SO_hydro_?.cfg, where ? is 1 for config 1 and 2 for config 2
  • The VR output files I used to draw this conclusion can be found in the same folder.

@MatthieuSchaller

@EvgeniiChaikin
Author

Update: the masses of star-forming gas are also zero if VR is run with config 2.

@rtobar

rtobar commented Apr 6, 2021

This sounds like a problem similar to #72, but @EvgeniiChaikin clearly tested with the latest fixes for #72. That, plus the fact that @MatthieuSchaller was mentioned in the original description, suggests this is a different, though probably closely related, problem.

In #72 (comment) there is a reference to some values not being zero, though, so again I suppose this is slightly different.

@rtobar added the bug label on Apr 6, 2021
@rtobar

rtobar commented Apr 6, 2021

I tried to reproduce this on cosma6, but it actually worked for me. My build has MPI, OpenMP, and HYDRO all ON. What were your compilation flags?

$> d=/cosma7/data/dp004/dc-chai1/test_VR

$> # Running against the two configurations
$> builds/71/stf -C $d/vrconfig_3dfof_subhalos_SO_hydro_1.cfg -i $d/colibre_0023 -I 2 -o halo_v1_0023 > /dev/null
$> builds/71/stf -C $d/vrconfig_3dfof_subhalos_SO_hydro_2.cfg -i $d/colibre_0023 -I 2 -o halo_v2_0023 > /dev/null

$> # First numbers from original output for Aperture_Zmet_gas_100_kpc
$> h5dump -d Aperture_Zmet_gas_100_kpc halo_v1_0023.properties.0 | head --lines 10
HDF5 "halo_v1_0023.properties.0" {
DATASET "Aperture_Zmet_gas_100_kpc" {
   DATATYPE  H5T_IEEE_F64LE
   DATASPACE  SIMPLE { ( 823 ) / ( 823 ) }
   DATA {
   (0): 0.00119531, 0.00411753, 0.000572314, 0.000104163, 0.000112529,
   (5): 5.33842e-05, 0.000107015, 0.000100115, 1.05497e-05, 9.58872e-06,
   (10): 8.96096e-06, 2.11294e-05, 1.57962e-05, 9.3072e-06, 4.0905e-06,
   (15): 2.95208e-05, 1.55529e-05, 8.73288e-06, 3.35401e-06, 2.06113e-05,
   (20): 6.30811e-06, 5.24397e-06, 0, 1.0879e-05, 5.93846e-06, 3.55144e-06,

$> # Full diff between first and second run for Aperture_Zmet_gas_100_kpc (i.e.: all numbers are equal)
$> diff -Naur <(h5dump -d Aperture_Zmet_gas_100_kpc halo_v1_0023.properties.0) <(h5dump -d Aperture_Zmet_gas_100_kpc halo_v2_0023.properties.0) 
--- /dev/fd/63  2021-04-06 09:46:29.416308901 +0100
+++ /dev/fd/62  2021-04-06 09:46:29.417308911 +0100
@@ -1,4 +1,4 @@
-HDF5 "halo_v1_0023.properties.0" {
+HDF5 "halo_v2_0023.properties.0" {
 DATASET "Aperture_Zmet_gas_100_kpc" {
    DATATYPE  H5T_IEEE_F64LE
    DATASPACE  SIMPLE { ( 823 ) / ( 823 ) }

$> # The actual numbers you requested
$> h5dump -d Aperture_IronOverHydrogenMasses_aperture_total_gas_10_kpc halo_v2_0023.properties.0 | head --lines 10
HDF5 "halo_v2_0023.properties.0" {
DATASET "Aperture_IronOverHydrogenMasses_aperture_total_gas_10_kpc" {
   DATATYPE  H5T_IEEE_F64LE
   DATASPACE  SIMPLE { ( 823 ) / ( 823 ) }
   DATA {
   (0): 1.24204e-05, 0.000109075, 1.9672e-05, 6.68841e-06, 2.95815e-06,
   (5): 1.57079e-06, 3.03353e-06, 1.79867e-06, 5.43869e-07, 5.85223e-07,
   (10): 8.79976e-07, 7.09117e-07, 9.37009e-07, 0, 3.45762e-07, 1.26667e-06,
   (16): 3.49815e-08, 4.42882e-07, 1.32802e-08, 1.03888e-06, 2.37694e-07,
   (21): 2.18679e-07, 0, 5.73632e-07, 1.18798e-07, 1.8089e-07, 2.28771e-08,

@rtobar

rtobar commented Apr 7, 2021

I just tried again with -DVR_MPI=OFF and obtained the exact same results as in my previous comment, so I'm still unable to reproduce this problem.

And actually, I just compared the two original output files pointed at by @EvgeniiChaikin, and their Aperture_Zmet_gas_100_kpc datasets also look identical:

$> d=/cosma7/data/dp004/dc-chai1/test_VR
$> diff -Naur <(h5dump -d Aperture_Zmet_gas_100_kpc $d/halo_v1_0023.properties.0) <(h5dump -d Aperture_Zmet_gas_100_kpc $d/halo_v2_0023.properties.0) 
--- /dev/fd/63  2021-04-07 09:55:22.992640634 +0100
+++ /dev/fd/62  2021-04-07 09:55:22.993640643 +0100
@@ -1,4 +1,4 @@
-HDF5 "/cosma7/data/dp004/dc-chai1/test_VR/halo_v1_0023.properties.0" {
+HDF5 "/cosma7/data/dp004/dc-chai1/test_VR/halo_v2_0023.properties.0" {
 DATASET "Aperture_Zmet_gas_100_kpc" {
    DATATYPE  H5T_IEEE_F64LE
    DATASPACE  SIMPLE { ( 823 ) / ( 823 ) }

@EvgeniiChaikin maybe there's a problem with the python tools you are using to read these files? Or maybe I'm missing something obvious from the original description of the problem?

@EvgeniiChaikin
Author

EvgeniiChaikin commented Apr 7, 2021

Hi @rtobar,

Thank you for helping me out!

I compiled VR with
cmake -DVR_USE_GAS=ON -DVR_USE_STAR=ON -DVR_USE_BH=ON

I saved the output from the compilation process into
/cosma7/data/dp004/dc-chai1/VR_ICRAR_27_March/VELOCIraptor-STF/compile_output.txt

This bug is probably not related to the python tools I am using, because I was also able to reproduce it using h5dump:

[dc-chai1@login7b [cosma7] test_VR]$ h5dump -d Aperture_Zmet_gas_sf_100_kpc halo_v1_0023.properties.0 | head --lines 10
HDF5 "halo_v1_0023.properties.0" {
DATASET "Aperture_Zmet_gas_sf_100_kpc" {
   DATATYPE  H5T_IEEE_F64LE
   DATASPACE  SIMPLE { ( 823 ) / ( 823 ) }
   DATA {
   (0): 0.00428788, 0.012751, 0.00263531, 0.00478277, 0, 0.00130498,
   (6): 0.000917185, 0, 0, 0, 0, 0, 0.00342874, 0, 0.000424479, 0.00703778,
   (16): 0, 0, 0, 0.000731791, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.00384946,
   (32): 0, 0, 0, 0, 0, 0, 0, 0.000778238, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   (51): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
[dc-chai1@login7b [cosma7] test_VR]$ h5dump -d Aperture_Zmet_gas_sf_100_kpc halo_v2_0023.properties.0 | head --lines 10
HDF5 "halo_v2_0023.properties.0" {
DATASET "Aperture_Zmet_gas_sf_100_kpc" {
   DATATYPE  H5T_IEEE_F64LE
   DATASPACE  SIMPLE { ( 823 ) / ( 823 ) }
   DATA {
   (0): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   (23): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   (46): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   (69): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   (92): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

I presume the reason you were not able to reproduce the bug so far is that you have been using apertures.zmet_gas_100_kpc whereas I use apertures.zmet_gas_sf_100_kpc. That is, the problem seems to occur only in the fields related to star-forming gas.
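For completeness, the same all-zero check can be sketched directly with h5py, independently of both the velociraptor package and h5dump. The files and values below are synthetic stand-ins (the real catalogues hold 823 entries each); only the dataset name matches the real VR output:

```python
import h5py
import numpy as np

# Synthetic stand-ins for the two catalogues (hypothetical values);
# only the dataset name matches the real VR output files.
demo = {
    "halo_v1_demo.properties.0": [0.00428788, 0.012751, 0.0, 0.00263531],
    "halo_v2_demo.properties.0": [0.0, 0.0, 0.0, 0.0],
}
for name, values in demo.items():
    with h5py.File(name, "w") as f:
        f.create_dataset("Aperture_Zmet_gas_sf_100_kpc", data=np.array(values))

# The check itself: an all-zero dataset signals the bug
maxima = {}
for name in demo:
    with h5py.File(name, "r") as f:
        maxima[name] = float(f["Aperture_Zmet_gas_sf_100_kpc"][...].max())
    print(name, "max:", maxima[name])
```

Applied to the real halo_v?_0023.properties.0 files, a zero maximum for the config-2 output reproduces the symptom seen with h5dump above.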

@rtobar

rtobar commented Apr 8, 2021

@EvgeniiChaikin ahhhhh, what a silly mistake on my side! I lost the sf bit in the dataset name. I'll re-run things to double-check and confirm.

@rtobar

rtobar commented Apr 8, 2021

I was now able to reproduce this, sorry it took so long!

This seems to be a problem specific to MPI builds. I reproduced it with an MPI-enabled build and then with an MPI-disabled one, and only the former showed the problem:

$> # build with and without MPI:
$> for mpi in ON OFF; do builddir=builds/75-mpi-$mpi; cmake -B $builddir -DVR_USE_GAS=ON -DVR_USE_STAR=ON -DVR_USE_BH=ON -DVR_MPI=$mpi; cmake --build $builddir -j 16; done

$> # Run both executables against the two config files
$> for mpi in ON OFF; do builddir=builds/75-mpi-$mpi; for config in 1 2; do $builddir/stf -C $d/vrconfig_3dfof_subhalos_SO_hydro_$config.cfg -i $d/colibre_0023 -I 2 -o mpi-$mpi-config-$config > /dev/null; done; done

$> # Compare things now
$> compare_dataset() { diff -Naur <(h5dump -d $1 $2) <(h5dump -d $1 $3) | grep 'DATA {' -A 100; }

$> # MPI=ON has differences
$> compare_dataset Aperture_Zmet_gas_sf_100_kpc mpi-ON-config-{1,2}.properties.0 | head 
    DATA {
-   (0): 0.00428788, 0.012751, 0.00263531, 0.00478277, 0, 0.00130498,
-   (6): 0.000917185, 0, 0, 0, 0, 0, 0.00342874, 0, 0.000424479, 0.00703778,
-   (16): 0, 0, 0, 0.000731791, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.00384946,
-   (32): 0, 0, 0, 0, 0, 0, 0, 0.000778238, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-   (51): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-   (74): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-   (97): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-   (120): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-   (142): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

$> # MPI=OFF has no differences; and MPI ON/OFF are the same for config 1
$> compare_dataset Aperture_Zmet_gas_sf_100_kpc mpi-OFF-config-{1,2}.properties
$> compare_dataset Aperture_Zmet_gas_sf_100_kpc mpi-OFF-config-1.properties mpi-ON-config-1.properties.0
$>

This will hopefully help constrain the search for the underlying problem, and it points to a workaround for small runs that fit on a single node.

@EvgeniiChaikin
Author

Thanks a lot @rtobar !

I mostly feed VR with small cosmological boxes, so this workaround is perfect for me!

@MatthieuSchaller

Could the recent fix to the negative densities over MPI have magically fixed this?

@MatthieuSchaller

@JBorrow is that not something we see with the EAGLE runs?

@JBorrow

JBorrow commented May 11, 2021

Master from 2021-04-13 works for us; this plot shows Z_{gas, sf} against M_*, using the MPI-only version of the code.

[Plot: stellar_mass_gas_sf_metallicity_30 — Z_{gas, sf} against M_*]

@MatthieuSchaller

Is that version then different from the one @EvgeniiChaikin uses?

@MatthieuSchaller

@EvgeniiChaikin (when you have time) the fix (#79) that is now in master may have helped here.

@rtobar

rtobar commented May 28, 2021

I wanted to try this out with the original config/inputs reported in this ticket, but they were gone. Hopefully the fix in #79 has some positive effect on this problem -- the fact that MPI builds worked in some cases and not in others could indeed indicate an uninitialised-variable problem.
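As a generic illustration of that failure mode (not VR's actual code): a reduction buffer that is never zeroed gives arbitrary, build- and run-dependent results, while a properly initialised one is deterministic. Here numpy's np.empty stands in for an uninitialised C/C++ buffer:

```python
import numpy as np

partials = [np.arange(4.0), np.full(4, 0.5)]  # e.g. per-rank partial sums

good = np.zeros(4)       # reduction buffer correctly initialised to zero
for p in partials:
    good += p
print(good)              # deterministic: [0.5 1.5 2.5 3.5]

bad = np.empty(4)        # uninitialised: initial contents are arbitrary
for p in partials:
    bad += p
# 'bad' may match 'good', be garbage, or happen to come out zero,
# depending on whatever was left in memory -- exactly the kind of
# build/run-dependent behaviour observed in this ticket.
```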

@MatthieuSchaller

Agreed, that seems plausible.

@EvgeniiChaikin
Author

I re-created the same test as in the description.

I ran it for two versions of VR:

  • 8f380fcd4b53c4b1063eaa294e270e30610daf4b (May 26)
  • 64de17bff6925f47f3ebe8f8195108801d661d95 (Mar 24)

In both cases, VR was compiled with MPI.

As before, the older version produces zero metallicities for the star-forming gas; that is not the case for the newer version (May 26). To confirm this result, I repeated the test with 5 different snapshots and never found the star-forming gas to have zero metallicities in the catalogues produced by the newer VR version.

I thus conclude that the bug has been fixed and close this thread.

Thanks everyone for the help!

@rtobar

rtobar commented Jun 6, 2021

Great news @EvgeniiChaikin, thanks for retrying this!
