Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DBSCAN 3D cluster algorithm #139

Merged

Conversation

osahin
Copy link

@osahin osahin commented Aug 31, 2017

Implementation of the DBScan algorithm (Ester et al) for linking the 2D clusters.
Required parameters (to be optimized) cluster distance, min number of 2D clusters.
To do:

  • Optimize the parameters
  • Improve the performance using a ranked list.

void clusterizeDBSCAN( const edm::PtrVector<l1t::HGCalCluster> & clustersPtr,
l1t::HGCalMulticlusterBxCollection & multiclusters);

void findNeighbor( const edm::PtrVector<l1t::HGCalCluster> & clustersPtrs,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this function is not meant to be used outside the class, put it private

const l1t::HGCalCluster & cluster,
std::vector<int> & neighborList);

bool isNeighbor( const l1t::HGCalCluster & clu1,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this function is not meant to be used outside the class, put it private

{
int iclu = 0;

for(edm::PtrVector<l1t::HGCalCluster>::iterator clu = clustersPtrs.begin(); clu != clustersPtrs.end(); ++clu, ++iclu){
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This iterator is const, so better to write it explicitly using const_iterator

std::vector<std::vector<int>> neighborList;
int iclu = 0, imclu = 0;

for(edm::PtrVector<l1t::HGCalCluster>::iterator clu = clustersPtrs.begin(); clu != clustersPtrs.end(); ++clu, ++iclu){
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This iterator is const, so better to write it explicitly using const_iterator

multiclustersTmp.emplace_back( *clu );
merged.at(iclu) = true;

for(unsigned int neighNo = 0; neighNo < neighborList.at(iclu).size(); neighNo++){
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A range-based loop would make the code more readable, to avoid having all the neighborList.at(iclu).at(neighNo)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent half a day on this, I don't think range-based loops work here (note that range is not constant). I would be more than happy to try if you have a suggestion.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, right. In that case maybe you can define an integer at the beginning of the loop:

int neighbor = neighborList.at(iclu).at(neighNo);

to be used later in the loop.

@@ -65,7 +65,11 @@

C3d_parValues = cms.PSet( dR_multicluster = cms.double(0.01), # dR in normalized plane used to clusterize C2d
minPt_multicluster = cms.double(0.5), # minimum pt of the multicluster (GeV)
calibSF_multicluster = cms.double(1.084)
calibSF_multicluster = cms.double(1.084),
type_multicluster = cms.string('DBSCANC3d'), # 'dRC3d' for the cone algorithm
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This algo should not be used as default for the moment since no performance studies have been shown yet.

calibSF_multicluster = cms.double(1.084)
calibSF_multicluster = cms.double(1.084),
type_multicluster = cms.string('DBSCANC3d'), # 'dRC3d' for the cone algorithm
dist_dbscan_multicluster = cms.double(0.03),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly this parameter plays the same role as dR_multicluster. If this is the case one single parameter could be used.

std::vector<int> neighbors;

if(!visited.at(iclu)){
visited.at(iclu)=true;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are mixed tab and space indents from here down to the end of the function. Please use only spaces.



void HGCalMulticlusteringImpl::findNeighbor( const edm::PtrVector<l1t::HGCalCluster> & clustersPtrs,
const l1t::HGCalCluster & cluster,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixed tab and space indents. Please use only spaces.

}else {
edm::LogWarning("ParameterError") << "Unknown Multiclustering type '" << typeMulticluster
<< "'. Using DBSCAN instead.\n";
multiclusteringAlgoType_ = DBSCANC3d;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This algo should not be used as default for the moment since no performance studies have been shown yet.

@jbsauvan
Copy link

jbsauvan commented Sep 1, 2017

There are still mixed spaces and tabs

@jbsauvan jbsauvan merged commit 7947a82 into PFCal-dev:hgc-tpg-devel-CMSSW_9_3_0_pre4 Sep 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants