Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use openMP to parallelize TOF calibration #6568

Merged
merged 3 commits into from Aug 11, 2021

Conversation

chiarazampolli
Copy link
Collaborator

Done with @noferini .


Fix include of omp

Remove trailing whitespaces

again tabs and trailing whitespaces...

...again?...

...tabs...

Whitespaces

debugging

fix

Removing wrong call

moving functions to source file to use OpenMP

Does not work, but let's commit

move fitter to channel offset from cosmics to TLinearFitter for omp (#39)

calling FixParameter only once

move to local fitter for cosmics (#40)

removing obsolete code - but needs optimization

fix in memory allocation TOF calib (#41)

Some fixes and parallelization of the finalizeSlotWithTracks

Formatting changes

leftover in formatting

clang-format

davidrohr
davidrohr previously approved these changes Jul 14, 2021
Copy link
Collaborator

@davidrohr davidrohr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without checking in detail, looks ok, just one question below


#ifdef WITH_OPENMP
if (mNThreads < 1) {
mNThreads = std::min(omp_get_max_threads(), NMAXTHREADS);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how many sectors does TOF have? Is it really 20?
Since you parallelize over the sectors, I think it would make sense to set NMAXTHREADS to the number of tof sectors?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are 18. The 20 is the number that I got on my desktop. I can change it to 18, sure.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to impose a limit, I would put nSectors, 20 doesn't make sense since with parallelization over the sectors you cannot run 20 threads.
But in principle you also do not need to impose a limit

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, only 18 threads will be created at max in any case. 20 was just the limit I had on my desktop, so what omp_get_max_threads() was returning. Then I thought that if you have something else running, this number will be less, so I used the minimum between the two. But if this is exactly what is done automatically, then I would remove this setting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, now I checked the code. Here the logic is: if you don't set the number of threads from outside, then you want that it parallelize as much as it can. If I just put:
mNThreads = omp_get_max_threads();
I guess it will know that the max is 18 from the fact that we parallelize the loop over the 18 sectors, right?

noferini
noferini previously approved these changes Jul 14, 2021
Copy link
Collaborator

@shahor02 shahor02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
I've added a lot of comments some 10 days age, but forgot to submit them, sorry.
Apart from the technical comments below, a general one: you are adding to general use math library a fitGaus function which in fact has nothing to do with Gaus: you are fitting the log of input data to user provided function which will be equivalent to a guassian fit provided the user supplied TLinearFitter& fitter happens to be pol2.
What is the point of making a function with such specific requirements a general purpose function? I think it either should be defined a private method of your class, or better, implemented properly w/o user - supplied TLinearFitter: in the end, you are fitting a set of point to pol2, it takes a few lines to do this (one loop and 1 3x3 matrix inversion).

Comment on lines 274 to 277
TVectorD par(3);
TVectorD sigma(3);
TMatrixD A(3, 3);
TMatrixD b(3, 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not use dynamic objects where fixed size arrays would work. The matrices A, b are always filled but used only for npoints==3 for a trivial multiplication. I think these structures should be filled only if npoints==3 and if matrices to be used, it is more appropriate to use root (templated fixed size) SMatrix.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I work on this, or should the author address the comment?

fitter.Eval();
fitter.GetParameters(par);
fitter.GetCovarianceMatrix(mat);
chi2 = fitter.GetChisquare() / Double_t(npoints);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not using proper reduced chi2 definition, i.e. chi2 = fitter.GetChisquare() / (npoints - 3); ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I am not really the author of the function.

Int_t npoints = 0;
for (Int_t ibin = 0; ibin < nBins; ibin++) {
Float_t entriesI = arr[ibin];
if (entriesI > 1) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why >1 and not >0?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

// use fitter for more than three points
fitter.Eval();
fitter.GetParameters(par);
fitter.GetCovarianceMatrix(mat);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that this cov.matrix makes no sense for the final params of the fit: it is calculated for the params of pol2, which you then transform to gaussian params.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And same again.

mStripOffsetFunction.append("-");
mStripOffsetFunction.append(Form("(x > %d && x < %d)", irow + 96, irow + 96 + 1));
}
mStripOffsetFunction.append(Form(") "));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity, what does this monster function do?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function allows to fit the differences between the peaks for each channel. Using such function allows to get better results since we have 96 channels per strip to be calibrated, but we can use 144 points (all pairs between adjacent channels, adjacent in X or Z).

Comment on lines 405 to 407
memset(&fracUnderPeak[0], 0, sizeof(fracUnderPeak));
memset(isChON, 0, 96);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not defining directly fracUnderPeak[Geo::NPADS]={0} and isChON[96]={0} ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to reset it for every strip, so I need memset, no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could simply define them as local (in the strip loop) variable initialized by {0}, should be faster than calling memset.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, I will commit all changes in one go, for simplicity.


LOG(DEBUG) << "N real params = " << nparams << ", fitter has " << localFitter.GetNumberFreeParameters() << " free parameters, " << localFitter.GetNumberTotalParameters() << " total parameters, " << localFitter.GetNpoints() << " points";
//LOG(INFO) << "Sector = " << sector << ", strip = " << istrip << " fitted by thread = " << ithread << ": ready to fit";
printf("Sector = %d, strip = %d, fitted by thread = %d: ready to fit\n", sector, istrip, ithread);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid using printf, use LOG

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry. The problem was that with LOG and the thread we did not see the logs for some reason, so then we changed them to printf, but then forgot to change them again.
Do you know if it is normal that the LOG did not show?

}

if (fitValues[2] < 0) {
fitValues[2] = -fitValues[2];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how this can be?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually do not remember :-) @noferini , do you recall if we had seen fitValues[2] (which is the sigma) negative?

@chiarazampolli
Copy link
Collaborator Author

Hi @shahor02 ,

Before going to the specific comments:

Apart from the technical comments below, a general one: you are adding to general use math library a fitGaus function which in fact has nothing to do with Gaus: you are fitting the log of input data to user provided function which will be equivalent to a guassian fit provided the user supplied TLinearFitter& fitter happens to be pol2.
What is the point of making a function with such specific requirements a general purpose function? I think it either should be defined a private method of your class, or better, implemented properly w/o user - supplied TLinearFitter: in the end, you are fitting a set of point to pol2, it takes a few lines to do this (one loop and 1 3x3 matrix inversion).

Actually the fitGaus existed before, we only added another signature of the same. And it is used in several other classes.

Chiara

@shahor02
Copy link
Collaborator

Hi @chiarazampolli

The original general purpose MathUtils/fit.h has a fitGaus method which indeed fits data to gauss and does not provide the error matrix.
You are adding another function fitGaus which actually does no fit to gaus, (but fits the log of provided data to user-supplied TLinearFitter), extracts wrong covariance matrix and is certainly not optimal from the performance point of view. What's the point of doing this? If you need something quick and dirty, better move this function to your local code and suppress the wrong definition of the output cov.matrix. But, better to add a proper function for fast Gauss fit, usable with openMP.
If you want, I can provide the code.

@chiarazampolli
Copy link
Collaborator Author

Hi @shahor02 ,

It is for sure my ignorance, but I am not sure I understand your comment: apart from passing the matrix, which we might even be able to avoid, why is the fitGaus that we added different from the other one? We only pass the TLinearFitter, since this is probably a static object that with threads gets messed up.

I would be very interested to see how to create the fitting function, if this is the way to go. Since next week we will be back to CERN, maybe we can find a slot offline to have a look together? Or if you have no time, you can point me to some documentation/examples.

Thanks,

Chiara

@shahor02
Copy link
Collaborator

Hi @chiarazampolli

The original version is still (in an inefficient way) fitting the logs of values to pol2 (defined in the function), which is equivalent to fitting the points to gaussian (provided the values are positive) and does not pretend to extract the errors.
With your modification this function 1) does not fit the gaussian if the provided TLinearFitter does not happen to be pol2,
2) the matrix you pretend to extract is wrong by construction. This is not how the common-use utility should behave.
On top of that, it the speed is so important that you use openMP, better to use optimize the function at 1st place.
I will write simpler code for log-normal fit in one of the next days.

@chiarazampolli
Copy link
Collaborator Author

Hi @shahor02 .
Thanks, now I understand better. I committed for now the other fixes. I would propose to keep this PR open till we have the more appropriate fit procedure.
Cheers,
Chiara

@shahor02
Copy link
Collaborator

Hi @chiarazampolli
could you try to use the new function from #6659? With it you don't need to prebook bunch of linear fitters and you are free to extract or not a proper cov. matrix.
We can discuss Tue. afternoon, probably after 3:30 pm.

Fix include of omp

Remove trailing whitespaces

again tabs and trailing whitespaces...

...again?...

...tabs...

Whitespaces

debugging

fix

Removing wrong call

moving functions to source file to use OpenMP

Does not work, but let's commit

move fitter to channel offset from cosmics to TLinearFitter for omp (#39)

calling FixParameter only once

move to local fitter for cosmics (#40)

removing obsolete code - but needs optimization

fix in memory allocation TOF calib (#41)

Some fixes and parallelization of the finalizeSlotWithTracks

Formatting changes

leftover in formatting

clang-format

Fixing braces
@chiarazampolli
Copy link
Collaborator Author

Hi @shahor02 ,

I changed the fitting as we discussed, using your new fitGaus with the medMAD to reject outliers.

Thanks for this development, it is very nice!

Chiara

LOG(DEBUG) << "Strip = " << istrip << " fitted by thread = " << ithread << ", goodpoints = " << goodpoints << ", number of free parameters = "
<< localFitter.GetNumberFreeParameters() << ", NDF = " << goodpoints - localFitter.GetNumberFreeParameters();

if (goodpoints <= localFitter.GetNumberFreeParameters()) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ciao @noferini ,

Note that here I replaced: if (goodpoints < localFitter.GetNumberFreeParameters()) with if (goodpoints <= localFitter.GetNumberFreeParameters()) (note the "<=").

@chiarazampolli
Copy link
Collaborator Author

Hello @TimoWilken ,

All tests are red, but no error is found... Any hint? Other PRs have issues, see #6772

Thanks,

Chiara

@noferini
Copy link
Collaborator

+1

@noferini
Copy link
Collaborator

Dear all,
we are close to deploy the calibration topology flp-epn-cal for TOF.
We need this PR as soon as possible to inject the changes and test the Workflow at P2 if there are no major comments.

Copy link
Collaborator

@shahor02 shahor02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgot to merge this, sorry.

@shahor02 shahor02 merged commit f579fbd into AliceO2Group:dev Aug 11, 2021
@chiarazampolli chiarazampolli deleted the TOFcalibParallel branch May 16, 2022 07:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants