Remove duplicate particles in mergers #32

VictorForouhar · 2024-04-26T09:04:50Z

Do merging checks after unbinding an object, rather than the whole set of objects. This ensures we can update the boundness of particles before unbound ones are passed to the parent of the subhalo. In some cases, this leads to particle duplication.

VictorForouhar · 2024-04-26T11:32:55Z

Running first test with COLIBRE example box.

Duh

VictorForouhar · 2024-04-27T12:35:21Z

Early testing suggests new merging method does indeed decrease number of duplicate particles. We achieve the same number as tests where merging between subhaloes are disabled.

As a comparison, the master version produces:

Testing presence of duplicate particles in HBTplus at snapshot index 10.
Opening HBT catalogue: /cosma7/data/dp004/dc-foro1/colibre/hbt_testing/merged_master/010/SubSnap_010.{file_nr}.hdf5 --- DONE
1170 particles out of 8695713 are duplicate.
60 unique subhalos out of 57064 share particles.

Master version without merging:

Testing presence of duplicate particles in HBTplus at snapshot index 10. 
Opening HBT catalogue: /cosma7/data/dp004/dc-foro1/colibre/hbt_testing/removing_duplicates_no_mergers/010/SubSnap_010.{file_nr}.hdf5 --- DONE
70 particles out of 8673697 are duplicate.
13 unique subhalos out of 57064 share particles.

New merging version:

Testing presence of duplicate particles in HBTplus at snapshot index 10. 
Opening HBT catalogue: /cosma7/data/dp004/dc-foro1/colibre/hbt_testing/new_merging_criteria_fix/010/SubSnap_010.{file_nr}.hdf5 --- DONE
70 particles out of 8676564 are duplicate.
13 unique subhalos out of 57064 share particles.

VictorForouhar · 2024-04-27T12:37:05Z

There are some small differences in the timing and number of mergers. I am assuming this is due to the fact that the master version only checks merging with the phase-space information of the first round of unbinding.

The new version uses up-to-date phase-space information, e.g. if a subhalo can become overlapping with its parent after accreting particles from its satellite. The master version does not take into account the newly accreted particles. Will check a few examples next week.

VictorForouhar · 2024-04-27T12:40:02Z

I also have an idea concerning the remaining set of duplicate particles, I will test it next week.

src/subhalo_unbind.cpp

VictorForouhar · 2024-04-29T08:21:14Z

This new version results in slightly more mergers relative to the original version. Below is the number of mergers at a given snapshot. Overall, 25 more objects are identified as merged by snapshot 12 in the new version, an increase of 0.6%.

Would have affected absolute position values, and potentially the position dispersion (if the core of a subhalo is split across a PCB)

To check values relative to old calculations. Run disabling the new merging scheme, to obtain the same subhalo-subhalo comparisons.

VictorForouhar · 2024-04-29T09:22:29Z

Running a test with the asserts added in 3c7967e results in no crashes, indicating the core calculations and phase-space distances are the same the old and new versions. Prior to ad15256, they asserts would crash. I will now remove those temporary asserts.

VictorForouhar · 2024-04-29T09:23:45Z

I will re-run a test to see how the merger rates vary between old and new versions. In particular, see how #32 (comment) changes now that the spatial distance calculations are done correctly.

Should only be used for runs using the same FOF and particle information!

VictorForouhar · 2024-04-29T11:28:07Z

After the PCB correction, they are now in better agreement (note difference in y-axis scale of bottom plot). The number of merged objects is the same in both runs.

VictorForouhar · 2024-04-29T11:49:05Z

Looking into differences in merging of matched objects. Some appear to be disrupted in the old but merged in the new (first plot). Others have experienced mergers in the old but not in the new (second plot). I now believe my previous statement concerning the updating the phase-space properties after unbinding a single object vs all can also suppress mergers. It all depends on where the newly accreted particles shift the core to.

Information available within Unbind, whereas previous method relied on a helper array that did not have the same ordering.

VictorForouhar · 2024-04-29T13:05:53Z

First example shown above is caused due to a small difference in Nbound of the parent (+1 particle is bound), and the PhaseSpaceDistance goes from > 2 to < 2.

jchelly · 2024-04-29T15:57:23Z

Running this on the colibre test with gcc address sanitizer it stops with a heap buffer overflow error at subhalo_tracking.cpp:679. There's a full stack trace (although with lots of repetition due to MPI) in /cosma7/data/dp004/jch/HBT-colibre/merging/remove_duplicate_particles_in_mergers/logs/job.HBT.6937740.out. I'll try it in DDT.

Change made to help debugging.

This would occur because we tried to access element -1 in the halo array.

VictorForouhar · 2024-04-30T09:35:13Z

Running with address sanitizer now works after 200d30f. Will submit a FLAMINGO test now, which also crashed last night at this point.

Doing so before storing ids in merger tree can lead to some tracers being missing

Conflicts: HBT.cpp

VictorForouhar · 2024-05-01T12:14:47Z

The memory issues we currently have appear to be within the merger tree routines. Whilst that is getting solved, I will work on getting the last mechanism for particles duplication under control.

* Masking function that gives preference to shallower subhaloes in the hierarchy * Added masking from bottom to top * Update value of TracerIndex to reflect change in Particle vector ordering * Added function call to mask particles from the source subhalo Called after saving bound properties. * Skip FOF groups that only have a central or an unbound component * Comments * Moved location of CleanTracks Makes more sense to be placed within save routine. * Fixed array indexing for hierarchical relationships At the stage where we want to do the mask cleaning, the NestedSubhalos values have been globalised and thus do not reflect the location in the local Subhalo array. * tracer_counter only asserted for subhalos with Nbound > 0 * Update Nbound value after mask clean step. Outdated value would lead to memory issues in Subhalo_t::KickNullParticles, which relies of Nbound instead of Particles.size(). We should really make Nbound consistent throughout the code!! * Nasty bug with iterator start Shout out to Rob for spotting it! * Value of Nbound now means the number of remaining bound particles in the masked subset. * VERY crude way to allocate sufficient memory in ExclusionList * Split IO into two distinct steps: bound and source particle saving This allows me to clean the particles after saving bound properties, but before saving the source information used for restarts * Removed stray timer * More comments * Fix in duplicate particle testing script. Would only account for one of the TrackIds hosting a duplicate particle. * Applied formatter * Number of tracers used for merger tree default to minimum number of tracers This value defaults to 10. * Made the masking work. - New debug checks added - Masking from top to bottom and viceversa are called separately, which solves cases where the initial central guess was incorrect. - Fixed a bug when checking whether the number of bound particles is changing when masking * Formatting * Avoid using MemberTable in CleanTracks() * Move CleanTracks() call to just before AssignHosts This avoids problems with restarting because we don't write an out of date Nbound to the SubSnap files. It also means that we use all of the bound particles from computing halo positions and velocities for deciding centrals. * Fix concerning orphan masses in central identification Previously, newly created orphans would retain their bound masses, meaning that they could remain as centrals even if another resolved subhalo was present. This fix makes all orphans have zero mass, in line with their zero bound particles. * Updated comments --------- Co-authored-by: John Helly <j.c.helly@durham.ac.uk>

src/subhalo_merge.cpp

src/subhalo_unbind.cpp

VictorForouhar · 2024-05-17T13:32:44Z

Made the suggested changes. @robjmcgibbon

robjmcgibbon

\o/

jchelly · 2024-05-17T14:43:32Z

If I run the colibre test with asserts enabled I get a crash with this branch:

HBT: /cosma7/data/dp004/jch/HBT/crash/HBTplus/src/subhalo_tracking.cpp:1173: void SubhaloMasker_t::MaskTopBottom(HBTInt, std::vector<Subhalo_t>&, const MappedIndexTable_t<long int, long int>&): Assertion `insert_status.second == true' failed.

VictorForouhar · 2024-05-17T14:47:43Z

Okay, thanks for flagging. Weird that it was not triggered before... could it be some of the latest commits?

jchelly · 2024-05-17T14:51:00Z

It was processing snapshot 4:

saving 4086 subhalos to ./test_output//003
Conversion factor from SWIFT length units to 0.681 Mpc/h = 1
Conversion factor from SWIFT mass units to 6.81e+09 Msun/h = 1
Conversion factor from SWIFT velocity units to 1 km/s = 1
Null group ID is 2147483647
Number of ranks per node reading simultaneously is 28
33223360 particles loaded at Snapshot 4(36)
10103 groups loaded at snapshot 4(36)
HBT: /cosma7/data/dp004/jch/HBT/crash/HBTplus/src/subhalo_tracking.cpp:1173: void SubhaloMasker_t::MaskTopBottom(HBTInt, std::vector<Subhalo_t>&, const MappedIndexTable_t<long int, long int>&): Assertion `insert_status.second == true' failed.

VictorForouhar · 2024-05-17T16:41:21Z

I ran it with the last commit of the branch doing the masking, de59340d623044424df2ee9e2c258926aaf22c4c, and it did not crash. Merging these two branches must have caused the issue.

VictorForouhar · 2024-05-17T16:42:17Z

Which is weird, since that branch had already incorporated changes in the merging algorithm.

VictorForouhar · 2024-05-17T19:23:16Z

Branch immediately after merging worked. The issue is likely caused by the subsequent commits.

It would produce duplicate particles since we'd be adding the bound component twice. Then the assert in the masking would fail due to the presence of duplicates and the fail to insert some particles.

VictorForouhar · 2024-05-17T19:43:18Z

Latest commit fixes the assert failure. Essentially, I was calling the MergeTo routine twice if it ran in DEBUG mode.

VictorForouhar added 4 commits April 25, 2024 15:47

Comments to explain recursive unbinding function

d055410

Added pseudocode

85bdb5b

Added function arguments and return values

04f44a0

Added recursive loop

265c88d

VictorForouhar mentioned this pull request Apr 26, 2024

Duplicate particles #29

Closed

VictorForouhar added 7 commits April 26, 2024 10:28

Added logic to handle merger testing and property calculation.

a478c25

Refactored new merging code.

49ecfa2

Added missing flag

8f91862

Modified variable name to more make self-explanatory

e8343f1

Declared Subhalo_t variables used in new merging method

815121a

Added function definition to measure core phase space properties

a6cb466

Added function calls to compute phase space properties of the core

b90a88f

VictorForouhar added 2 commits April 26, 2024 15:24

More refactoring

4b5f5c9

Do not check phase-space overlap between an subhalo and itself

c3280ec

Duh

VictorForouhar commented Apr 28, 2024

View reviewed changes

src/subhalo_unbind.cpp Outdated Show resolved Hide resolved

VictorForouhar added 2 commits April 29, 2024 09:58

Added missing PCB correction

ad15256

Would have affected absolute position values, and potentially the position dispersion (if the core of a subhalo is split across a PCB)

Added (temporary) tests of phase space distance calculations

3c7967e

To check values relative to old calculations. Run disabling the new merging scheme, to obtain the same subhalo-subhalo comparisons.

VictorForouhar added 2 commits April 29, 2024 10:38

Remove temporary asserts

a7283ce

Helper function to identify TrackIds across different HBT catalogues

c47d8c3

Should only be used for runs using the same FOF and particle information!

New merging method does not need to globalise SinkTrackId

fe3f5cf

Information available within Unbind, whereas previous method relied on a helper array that did not have the same ordering.

VictorForouhar added 2 commits April 30, 2024 09:19

Lambda criterion for removing foreing FOF particles defined separately

1f335f1

Change made to help debugging.

Fixed bug involving hostless subgroups when constraining them to a FOF

200d30f

This would occur because we tried to access element -1 in the halo array.

VictorForouhar added 3 commits April 30, 2024 16:02

Added assert to check if we have found all required IDs for merger_tree

34575e2

Re-order where single FOF constrains are applied.

7c6901f

Doing so before storing ids in merger tree can lead to some tracers being missing

Merge branch 'master' into remove_duplicate_particles_in_mergers

d17ef3c

Conflicts: HBT.cpp

VictorForouhar and others added 3 commits May 3, 2024 09:11

Pass bound particles instead of bound + unbound in mergers

478d27d

Merge branch 'master' into remove_duplicate_particles_in_mergers

f7799f0

robjmcgibbon reviewed May 17, 2024

View reviewed changes

src/subhalo_merge.cpp Outdated Show resolved Hide resolved

src/subhalo_merge.cpp Outdated Show resolved Hide resolved

src/subhalo_unbind.cpp Outdated Show resolved Hide resolved

VictorForouhar added 5 commits May 17, 2024 10:23

Remove special treatment of orphans in RecursiveUnbind

7f9b6fc

Removed commented out code

7d34a4b

Changed default method to insert particles in MergeTo

4ed0651

Formatter

bf66c3a

Missing call to starting Particle iterator

68aa55f

VictorForouhar requested a review from robjmcgibbon May 17, 2024 13:32

robjmcgibbon approved these changes May 17, 2024

View reviewed changes

Fixed double call of MergeTo when running in DEBUG

9a74133

It would produce duplicate particles since we'd be adding the bound component twice. Then the assert in the masking would fail due to the presence of duplicates and the fail to insert some particles.

VictorForouhar merged commit b8d5f0a into master May 20, 2024

VictorForouhar deleted the remove_duplicate_particles_in_mergers branch May 20, 2024 09:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove duplicate particles in mergers #32

Remove duplicate particles in mergers #32

VictorForouhar commented Apr 26, 2024

VictorForouhar commented Apr 26, 2024

VictorForouhar commented Apr 27, 2024

VictorForouhar commented Apr 27, 2024 •

edited

Loading

VictorForouhar commented Apr 27, 2024

VictorForouhar commented Apr 29, 2024 •

edited

Loading

VictorForouhar commented Apr 29, 2024

VictorForouhar commented Apr 29, 2024

VictorForouhar commented Apr 29, 2024

VictorForouhar commented Apr 29, 2024

VictorForouhar commented Apr 29, 2024

jchelly commented Apr 29, 2024

VictorForouhar commented Apr 30, 2024

VictorForouhar commented May 1, 2024

VictorForouhar commented May 17, 2024

robjmcgibbon left a comment •

edited

Loading

jchelly commented May 17, 2024

VictorForouhar commented May 17, 2024

jchelly commented May 17, 2024

VictorForouhar commented May 17, 2024

VictorForouhar commented May 17, 2024

VictorForouhar commented May 17, 2024

VictorForouhar commented May 17, 2024

Remove duplicate particles in mergers #32

Remove duplicate particles in mergers #32

Conversation

VictorForouhar commented Apr 26, 2024

VictorForouhar commented Apr 26, 2024

VictorForouhar commented Apr 27, 2024

VictorForouhar commented Apr 27, 2024 • edited Loading

VictorForouhar commented Apr 27, 2024

VictorForouhar commented Apr 29, 2024 • edited Loading

VictorForouhar commented Apr 29, 2024

VictorForouhar commented Apr 29, 2024

VictorForouhar commented Apr 29, 2024

VictorForouhar commented Apr 29, 2024

VictorForouhar commented Apr 29, 2024

jchelly commented Apr 29, 2024

VictorForouhar commented Apr 30, 2024

VictorForouhar commented May 1, 2024

VictorForouhar commented May 17, 2024

robjmcgibbon left a comment • edited Loading

Choose a reason for hiding this comment

jchelly commented May 17, 2024

VictorForouhar commented May 17, 2024

jchelly commented May 17, 2024

VictorForouhar commented May 17, 2024

VictorForouhar commented May 17, 2024

VictorForouhar commented May 17, 2024

VictorForouhar commented May 17, 2024

VictorForouhar commented Apr 27, 2024 •

edited

Loading

VictorForouhar commented Apr 29, 2024 •

edited

Loading

robjmcgibbon left a comment •

edited

Loading