Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mu fcs hit clear #301

Merged
merged 33 commits into from
Feb 28, 2022
Merged

Mu fcs hit clear #301

merged 33 commits into from
Feb 28, 2022

Conversation

jdbrice
Copy link
Contributor

@jdbrice jdbrice commented Feb 16, 2022

Attempt to modify TClonesArray cleanup to prevent memory leak on StMuFcsHit

jdbrice and others added 30 commits June 11, 2021 10:16
update from upstream on Friday, June 11, 2021
Updated StarGeo.xml to be able to acces this in dev2021
… add a Clear on StMuFcsHit to clean up memory - may effect all TClonesArrays
@@ -28,6 +28,11 @@ class StMuFcsHit : public TObject {
unsigned short ns, unsigned short ehp, unsigned short dep, unsigned short ch,
float e);
~StMuFcsHit();

virtual void Clear (Option_t * opt=""){
TObject::Clear(opt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TObject::Clear() seems to be empty... At least
https://root.cern.ch/doc/master/classTObject.html#a0ccdc7dd6ee02ee79c46bfab0e1be291
shows only links to reimplementations. I don't know what this line do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, it is not needed. I will remove

@@ -242,7 +242,7 @@ void StMuDstMaker::clearArrays()
__NMTDARRAYS__+__NFGTARRAYS__;

for ( int i=0; i<ezIndex; i++) {
mAArrays[i]->Clear();
mAArrays[i]->Clear("C");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change affect all classes in MuDst Arrays. I guess we can try..

@genevb
Copy link
Contributor

genevb commented Feb 16, 2022

Test job to see the memory leak (was showing 273,940 bytes lost in 5 events from StMuFcsHit - biggest and last item in the valgrind report):

valgrind --leak-check=full root4star -b -q -l bfc.C\(5,\"pp2022,StiCA,BEmcChkStat,epdhit,\-hitfilt,\-fcsCluster,\-fcsPoint\",\"/star/data03/daq/2022/027/23027048/st_physics_23027048_raw_1000002.daq\"\)

From what I see, this patch does resolve the memory leak issue. I do not know whether the "C" argument in Clear() affects other collections.

-Gene

@genevb
Copy link
Contributor

genevb commented Feb 16, 2022

valgrind also reported another similar, but smaller memory leak. It appears that StMuEvent is also stored in a collection (of size 1?), and is never deleted on each event. However, it has a TArrayI data member whose contents are set in this line:

mL2Result.Set(event->triggerData()->l2ResultLength(),(const Int_t*) event->triggerData()->l2Result());

This form of TArrayI::Set() instantiates an array on the heap that is typically deleted in the destructor:
Set() : https://root.cern/root/html534/src/TArrayI.cxx.html#oaNEsD
Destructor : https://root.cern/root/html534/TArrayI.html#TArrayI:_TArrayI

I guess you could create a Clear() function for the StMuEventclass as well to delete the array:

mL2Result.Set(0);

I've been using a run for my tests that doesn't have many detectors in it, so I may be missing other collections with the same issue. I'll try processing a typical production run and see what it shows.

-Gene

@genevb
Copy link
Contributor

genevb commented Feb 16, 2022

Just looking for other TArray data members of MuDst classes, I see there's one called mHits in StMuEmcCluster, but someone already wrote a Clear() function that calls Set(0) :-)

void StMuEmcCluster::Clear(Option_t*)
{
  mEta=0;mPhi=0;mSigmaEta=0;mSigmaPhi=0;mEnergy=0;mNHits=0;
  mHits.Set(0);
}

-Gene

@jdbrice
Copy link
Contributor Author

jdbrice commented Feb 16, 2022

Just looking for other TArray data members of MuDst classes, I see there's one called mHits in StMuEmcCluster, but someone already wrote a Clear() function that calls Set(0) :-)

void StMuEmcCluster::Clear(Option_t*)
{
  mEta=0;mPhi=0;mSigmaEta=0;mSigmaPhi=0;mEnergy=0;mNHits=0;
  mHits.Set(0);
}

-Gene

This is exactly why I think adding the "C" option should be added - I always thought clearing the contained objects was the default. So the code you show here was maybe never even called since we did not use "C" option previously.
I am still looking into the way ROOT handles all of this, maybe there is a more elegant solution where the dtors are called, if so I'll investigate that.

@genevb
Copy link
Contributor

genevb commented Feb 16, 2022

Other leaks when not calling destructors found from running valgrind on data with other detectors:

valgrind --leak-check=full root4star -b -q -l bfc.C\(4,\"pp2022,StiCA,BEmcChkStat,epdhit,\-hitfilt,\-fcsCluster,\-fcsPoint\",\"/star/data03/daq/2022/046/23046005/st_physics_adc_23046005_raw_7500007.daq\"\)

StMutEvent has an StEventSummary data member, which itself has several TArray* members. By not calling desturctors, the arrays that are stored on the heap are never deleted. Because the classes are data members, not pointers to the classes, you can't delete the StEventSummary instance inside a StMuEvent::Clear() function. You would also need to create a StEventSummary::Clear() in which all of its TArray* data members' used Set(0).

Next up is StMuETofHeader, which uses STL vector instances that can also put things on the heap, and so must be deleted. Again, they are data members, so someone may have to create a StMuETofHeader::Clear() function that that calls vector::clear() for the vector data members.

-Gene

@jdbrice
Copy link
Contributor Author

jdbrice commented Feb 16, 2022

Other leaks when not calling destructors found from running valgrind on data with other detectors:

valgrind --leak-check=full root4star -b -q -l bfc.C\(4,\"pp2022,StiCA,BEmcChkStat,epdhit,\-hitfilt,\-fcsCluster,\-fcsPoint\",\"/star/data03/daq/2022/046/23046005/st_physics_adc_23046005_raw_7500007.daq\"\)

StMutEvent has an StEventSummary data member, which itself has several TArray* members. By not calling desturctors, the arrays that are stored on the heap are never deleted. Because the classes are data members, not pointers to the classes, you can't delete the StEventSummary instance inside a StMuEvent::Clear() function. You would also need to create a StEventSummary::Clear() in which all of its TArray* data members' used Set(0).

Next up is StMuETofHeader, which uses STL vector instances that can also put things on the heap, and so must be deleted. Again, they are data members, so someone may have to create a StMuETofHeader::Clear() function that that calls vector::clear() for the vector data members.

-Gene

Can you send me your valgrind report file?
I will work on adding the needed Clear functions.

Copy link
Contributor

@klendathu2k klendathu2k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StMuFcsHit.h line 47. mData is deferenced w/out checking to see that it is not a nullptr.

It would be cleaner for StMuFcsHit (and StEvent/StFcsHit) to hold an instance of TArrayS, rather than constantly having to call destruct on TArrayS during "Clear".

@jdbrice
Copy link
Contributor Author

jdbrice commented Feb 16, 2022

I agree. However, Akio and I were unsure about this (maybe it is causing the other issue with MuDst) - i.e. can TClonesArray hold an object of unknown size?

@klendathu2k
Copy link
Contributor

klendathu2k commented Feb 16, 2022 via email

@genevb
Copy link
Contributor

genevb commented Feb 16, 2022 via email

@jdbrice
Copy link
Contributor Author

jdbrice commented Feb 17, 2022

OK commit 2a7e328 has updates for StMuEvent (and StEventSummary) and StMuETofHeader.

I didnt run a full valgrind but I did test to make sure the ::Clear(...) was being called.

I also changed StMuFcsHit to use a TArrayS instead of a TArrayS*

@akioogawa
Copy link
Contributor

Okey. I see that TArray does not violate TConesArray... I guess its same for TRefArray in StMuFcsCuster and this is nothing to do with StMuFcsCluster reading crash...

@genevb
Copy link
Contributor

genevb commented Feb 17, 2022

I didnt run a full valgrind ...

I ran valgrind and now it only shows MuDst leaks with the STL vectors in StMuETofHeader, which is surprising given that you do clear the vectors, and I confirmed that the Clear() is called, though for some reason it only appeared to be called 4 times in 5 events, so perhaps just one event gets lost?

Anyhow, this is really small and won't add up to more than a few MB over 50k events even. I can try to track it down later this evening, but I'm fine with moving forward as it is now. Thanks for the work, @jdbrice .

-Gene

@genevb
Copy link
Contributor

genevb commented Feb 18, 2022

I got it....

STL vectors have both a capacity (actual memory allocation) and a size (in-use portion of the allocation). While clear() sets the size to zero, capacity remains at the discretion of the STL implementation. The best bet for reducing the capacity to zero is to call clear(), and then call shrink_to_fit(), which brings brings the capacity down to fit the size (though my impression is that this isn't guaranteed, it has the highest probability of achieving the effect; there is an erase() function, but it too may or may not actually reduce the capacity). Of course deleting the vector guarantees that the allocation is freed, but that's not an option here.

Anyhow, I printed the capacities of the two vectors before and after the clear() and indeed they are unchanged. I then tried shrink_to_fit() and the capacity goes to zero! Yay! I am running in valgrind now and I am quite optimistic this memory leak will be gone.

As for STL maps, there doesn't seem to be this size vs. capacity business, and clear() does the appropriate job. And valgrind shows no leak from them.

So in the end, it appears that what we want is:

virtual void Clear( Option_t *opt = "" ){
    // ensure allocations are destroyed
    mRocGdpbTs.clear();
    mRocStarTs.clear();
    mMissMatchFlagVec.clear();
    mGoodEventFlagVec.clear();
    mMissMatchFlagVec.shrink_to_fit();
    mGoodEventFlagVec.shrink_to_fit();
}

-Gene

@genevb
Copy link
Contributor

genevb commented Feb 18, 2022

I am running in valgrind now and I am quite optimistic this memory leak will be gone.

Well, valgrind still reported on the two vectors in StMuETofHeader as memory leaks, but says of them:

==26042== 0 bytes in 4 blocks are definitely lost in loss record 5 of 270,908
...
==26042== 0 bytes in 4 blocks are definitely lost in loss record 6 of 270,908

I'll take memory leaks of size 0 bytes for the win any day :-)

-Gene

@jdbrice
Copy link
Contributor Author

jdbrice commented Feb 18, 2022

I am running in valgrind now and I am quite optimistic this memory leak will be gone.

Well, valgrind still reported on the two vectors in StMuETofHeader as memory leaks, but says of them:

==26042== 0 bytes in 4 blocks are definitely lost in loss record 5 of 270,908
...
==26042== 0 bytes in 4 blocks are definitely lost in loss record 6 of 270,908

I'll take memory leaks of size 0 bytes for the win any day :-)

-Gene

Agreed :)

@plexoos
Copy link
Member

plexoos commented Feb 18, 2022

Freeing the memory and allocating again for every event is potentially a more expensive operation than reusing the same chunk of memory. This is probably a reason why we have zero calls to std::vector::shrink_to_fit() in our code.

@genevb
Copy link
Contributor

genevb commented Feb 18, 2022 via email

@jdbrice
Copy link
Contributor Author

jdbrice commented Feb 21, 2022

@ullrich-bnl Can you comment or approve these changes

@akioogawa
Copy link
Contributor

@ullrich-bnl? please?

@starsdong starsdong merged commit f7b824a into star-bnl:main Feb 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants