Optimization in TheoreticalSpectrumGenerator and MSSpectrum + Fixing multithreaded SimpleSearchEngine #4709
TheoreticalSpectrumGenerator.cpp
…ty to emplace_back
@enetz what do you think about these changes? I think you also made some attempts to optimize TSG.
I've taken another look at the threading problem. An omp atomic was missing, but adding it didn't solve the whole problem.
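For readers unfamiliar with this kind of bug, here is a minimal sketch of what a missing omp atomic typically looks like. The function countAbove and its parameters are hypothetical and are not taken from the SimpleSearchEngine code; the sketch only illustrates a shared counter that is updated inside a parallel loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: count how many scores exceed a threshold.
// Not OpenMS code; it only illustrates protecting a shared counter.
std::size_t countAbove(const std::vector<double>& scores, double threshold)
{
  std::size_t count = 0;
  const long n = static_cast<long>(scores.size());

  #pragma omp parallel for
  for (long i = 0; i < n; ++i)
  {
    if (scores[i] > threshold)
    {
      // Without the atomic, concurrent increments of 'count' race
      // and updates can be lost when more than one thread runs.
      #pragma omp atomic
      ++count;
    }
  }
  return count;
}
```

When compiled without OpenMP support the pragmas are ignored and the loop runs serially, so the result is the same either way.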
I've found that the best n hits for each scan are sorted using AnnotatedHit_::hasBetterScore(). The problem is that this comparison only considers the scores of the compared hits. If two sequences have the same score for one scan (which happens fairly often), there is no rule for how to order them. That made the SimpleSearchEngine produce different results when using multiple threads. I also had to add a sort of the peptide_ids vector in postProcessHits_() to make sure the idXML file always lists the hits in the same order (since using threads "shuffles" peptide_ids). The results with these changes differ slightly from what the SimpleSearchEngine produced (using one thread) before the changes were made. But at least the results are now always the same, independent of how many threads are used.
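The tie-breaking idea can be sketched roughly as follows. This is not the actual AnnotatedHit_ code: the Hit struct, the choice of the sequence string as the tie-breaker, and the helper bestHits are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a scored hit; not the OpenMS AnnotatedHit_ struct.
struct Hit
{
  std::string sequence;
  double score;
};

// Score-only comparison: hits with equal scores have no defined order,
// so the outcome depends on the order in which hits were collected.
bool hasBetterScore(const Hit& a, const Hit& b)
{
  return a.score > b.score;
}

// Total order: higher score first, ties broken by sequence (assumed tie-breaker).
bool hasBetterScoreWithTieBreak(const Hit& a, const Hit& b)
{
  if (a.score != b.score) return a.score > b.score;
  return a.sequence < b.sequence;
}

// Keep the n best hits; the result no longer depends on the input order.
std::vector<Hit> bestHits(std::vector<Hit> hits, std::size_t n)
{
  n = std::min(n, hits.size());
  std::partial_sort(hits.begin(), hits.begin() + n, hits.end(),
                    hasBetterScoreWithTieBreak);
  hits.erase(hits.begin() + n, hits.end());
  return hits;
}
```

With a total order, two threads that collect the same hits in a different order still end up keeping the same top n, which is the property the comment above is after.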
@@ -143,5 +147,8 @@ namespace OpenMS
double pre_int_;
double pre_int_H2O_;
double pre_int_NH3_;

// formula.toString() is extremely expensive, so we use a member map to remember what String belongs to which formula
//mutable std::map<EmpiricalFormula, String> formula_str_cache_;
what about this comment line?
While profiling the SimpleSearchEngine, we have found several points to optimize:
This is how we optimized the above points:
Also, a test for the new sort function was added to the MSSpectrum_test.
Results:
Depending on the parameters for the SimpleSearchEngine, the speed-up lies between 7% and 20%.
More information on the progress of this pull request can be seen here