
SO list offsets are wrong/counterintuitive #108

Closed
bwvdnbro opened this issue Jul 8, 2021 · 6 comments
Labels
bug Something isn't working

Comments

bwvdnbro commented Jul 8, 2021

Describe the bug
The offsets stored in the SO list output seem inconsistent with the SO sizes in the same file. My (possibly wrong) expectation is that the particle IDs belonging to SO i are stored in pIDs[offset[i] : offset[i]+size[i]], but that is not the case.

To Reproduce
The problem is best illustrated with the following Python snippet, which can be applied to any HDF5 SO list output:

import sys

import h5py

SOfile = h5py.File("<RANDOM SO_LIST FILE.hdf5>", "r")
ofs = SOfile["Offset"][:]        # claimed offset of each SO into the particle ID list
siz = SOfile["SO_size"][:]       # number of particles in each SO
pIDs = SOfile["Particle_IDs"][:]
# The last offset plus the last size should equal the total number of particle IDs
if not ofs[-1] + siz[-1] == pIDs.shape[0]:
  print("Wrong final size ({0}=/={1})!".format(ofs[-1] + siz[-1], pIDs.shape[0]))
# Each offset should equal the previous offset plus the previous size
for i in range(1, ofs.shape[0]):
  if not ofs[i-1] + siz[i-1] == ofs[i]:
    print("Wrong offset ({0} {1}, {2} {3})!".format(ofs[i-1], siz[i-1], ofs[i], siz[i]))
    sys.exit(1)

This produces two error messages: the first because the final offset is not separated from the end of the particle ID list by the size of the final SO, and the second because an offset does not match the previous offset plus the previous size.

Expected behavior
ofs[i] == ofs[i-1]+siz[i-1]
Put differently, I would expect ofs to be equivalent to

import numpy

# i.e. the exclusive prefix sum of the SO sizes
ofs = numpy.cumsum(siz)
ofs[1:] = ofs[:-1]
ofs[0] = 0

Log files
Not applicable.

Environment (please complete the following information):
Irrelevant for this problem.

Additional context
I think the problem is situated in io.cxx:1450, where the value of SOpids[0] seems to be (incorrectly) omitted from the loop.

rtobar commented Jul 9, 2021

Thanks @bwvdnbro for recording the issue here.

tl;dr: you are almost right. Yes, there is an element being skipped when building offset from SOpids, but the missing index is 1, not 0. We can quickly patch this now, or you can wait a bit longer for a more complete solution.

The longer story: as I mentioned in Slack, I had actually been looking at this code recently in the context of trying to reduce the maximum memory usage (see #71). The code has some extra complexity because the SOpids and SOtypes arrays-of-vectors are originally allocated in GetSOMasses with N + 1 elements:

vector<Int_t> *SOpartlist = new vector<Int_t>[ngroup+1];
vector<int> *SOparttypelist = NULL;
#if defined(GASON) || defined(STARON) || defined(BHON) || defined(HIGHRES)
SOparttypelist=new vector<int>[ngroup+1];
#endif

Then they are 1-indexed:

for (i=1;i<=ngroup;i++)

SOpartlist[i].resize(llindex);
#if defined(GASON) || defined(STARON) || defined(BHON) || defined(HIGHRES)
SOparttypelist[i].resize(llindex);
#endif
for (j=0;j<llindex;j++) SOpartlist[i][j]=SOpids[indices[j]];
#if defined(GASON) || defined(STARON) || defined(BHON) || defined(HIGHRES)
for (j=0;j<llindex;j++) SOparttypelist[i][j]=typeparts[indices[j]];
#endif

The same pattern repeats in GetInclusiveMasses.

Skipping element 0 is then fine, because it doesn't contain anything. What is being skipped, though, is the size of element 1 in the loop populating offset:

for (Int_t i=2;i<=ngroups;i++) offset[i]=offset[i-1]+SOpids[i].size();

In that line the indexing should happen with SOpids[i - 1].
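
In other words, the corrected line would look roughly like this (a sketch keeping the surrounding 1-based indexing, not necessarily the final code):

// offset[1] stays 0; each later offset adds the size of the *previous* group
for (Int_t i=2;i<=ngroups;i++) offset[i]=offset[i-1]+SOpids[i-1].size();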

As part of the changes that I've been working on for #71 I've also changed the indexing strategy for SOpids and SOtypes to be 0-based, which in turn makes their manipulation much clearer in other places. My changes have actually also fixed this problem, although I hadn't noticed it before you raised this issue. If you are in a hurry we can quickly try a patch now, but if you're happy to wait a bit longer for the more complete solution, a fix for this problem will be included when I merge my changes.
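
For illustration only, here is a minimal sketch of what the offset calculation can look like once SOpids is a 0-based vector-of-vectors (make_offsets is a hypothetical helper, long stands in for VELOCIraptor's Int_t, and the actual branch code may well differ):

#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// The offsets are just the exclusive prefix sum of the per-group sizes.
std::vector<std::size_t> make_offsets(const std::vector<std::vector<long>> &SOpids)
{
    std::vector<std::size_t> sizes(SOpids.size());
    std::transform(SOpids.begin(), SOpids.end(), sizes.begin(),
                   [](const std::vector<long> &ids) { return ids.size(); });
    std::vector<std::size_t> offsets(sizes.size());
    std::exclusive_scan(sizes.begin(), sizes.end(), offsets.begin(), std::size_t{0});
    return offsets;
}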

bwvdnbro commented Jul 9, 2021

Sorry, I meant element 1 and not 0 (my mind is just not used to thinking about 1 as the first element).

There is absolutely no hurry, I am currently using the numpy.cumsum workaround in my script.

rtobar added the bug label Jul 9, 2021
rtobar added a commit that referenced this issue Jul 12, 2021
Spherical overdensity information (Particle IDs and types) are collected
by two different routines: GetInclusiveMasses and GetSOMasses. Both
routines use the following technique: first, for "ngroup" groups, a C
array-of-vectors (AOV) of "ngroup + 1" elements was allocated via "new"
(which was eventually delete[]'ed). Vectors for each group are filled
independently as groups are processed; the loop over groups is
1-indexed, and indexing into these AOVs happens using the
same iteration variable, meaning that their 0 element is skipped.
Finally, these AOVs are passed to WriteSOCatalog for writing.
WriteSOCatalog is aware of the 1-based indexing, and additionally it
flattens the AOVs into a single vector (thus duplicating their memory
requirement) to finally write the data into the output HDF5 file.

This commit originally aimed to reduce the memory overhead of the final
writing of data into the HDF5 file (see #71). The main change required
to achieve this is to perform the flattening of data at separate times,
such that particle IDs and types are not flattened at the same time,
but one after the other, with memory from the first flattening operation
being released before the second flattening happens, thus reducing the
maximum memory requirement of the application. This goal was achieved.
However, while performing these changes two things became clear:
firstly, that using a vector-of-vectors (VOV) was a better interface
than an AOV (due to automatic memory management), and secondly that the
1-based indexing of the original AOVs introduced much complexity in the
code. The scope of these changes was then broadened to cover these two
extra changes, and therefore this commit considerably grew in size. In
particular the 0-indexing of the VOVs allowed us to more easily use more
std algorithms that clarify the intent in certain places of the code.

There are other minor changes that have also been included in this
commit, mostly to reduce variable scopes, reduce code duplication, and
such. Assertions have also been sprinkled here and there to add further
assurance that the code is working as expected.

As an unintended side-effect, this commit also fixed the
wrongly-calculated Offset dataset, which was off by one index in the
original values. This problem was reported in #108, and seems to have
always been there.
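
To make the memory argument concrete, here is a minimal sketch of the one-dataset-at-a-time flattening described above (write_dataset is a hypothetical stand-in for the HDF5 calls in WriteSOCatalog, and "Particle_types" is an assumed dataset name):

#include <cstdio>
#include <vector>

// Concatenate all per-group vectors into one contiguous vector.
template <typename T>
std::vector<T> flatten(const std::vector<std::vector<T>> &groups)
{
    std::vector<T> flat;
    for (const auto &g : groups)
        flat.insert(flat.end(), g.begin(), g.end());
    return flat;
}

// Hypothetical stand-in for the real HDF5 write.
void write_dataset(const char *name, const std::vector<long long> &data)
{
    std::printf("writing %s (%zu entries)\n", name, data.size());
}

void write_so_particles(const std::vector<std::vector<long long>> &SOpids,
                        const std::vector<std::vector<long long>> &SOtypes)
{
    {
        auto flat = flatten(SOpids);
        write_dataset("Particle_IDs", flat);
    } // the flattened IDs are released here...
    {
        auto flat = flatten(SOtypes);          // ...before the types are flattened,
        write_dataset("Particle_types", flat); // so only one copy is alive at a time
    }
}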
rtobar commented Jul 13, 2021

@bwvdnbro please have a look at the changes in the issue-71 branch, which should now yield a correct Offset dataset. Please let me know if you now see the correct values (and that nothing else breaks); after confirmation I'll merge this to master.

@bwvdnbro

Hi @rtobar, I will take a look at this tomorrow. I have never actually run VELOCIraptor before, so it might take a while; so far I have only been looking at output provided by someone else.

@bwvdnbro

@rtobar, I can now confirm that the changes in the issue-71 branch fix the offsets and do not affect any of the other quantities we look at.

rtobar commented Jul 28, 2021

Branch merged now, thanks for testing.
