
Cluster2TP assoc on GPU #105

Merged (20 commits, Jul 31, 2018)
Conversation

@VinInn commented Jul 26, 2018

This PR mostly contains a cluster-to-trackingParticle association on GPU.
It includes the possibility to dump all hits on the CPU with the corresponding TP, by adding to the config

process.tpClusterProducerHeterogeneous.dumpCSV = True

or, of course,

process.tpClusterProducerHeterogeneousPixelTrackingOnly.dumpCSV = True

for our workflows.

It also includes a "proto" doublet code, ready to produce Cells to be consumed by the CA.

I prefer that this is merged now.
We will proceed to create Cells and use them in the CA later.

@VinInn commented Jul 26, 2018

OK, to get a list of hits that can be compared, one can use:

grep "HIT" dump2.log | cut -d' ' -f2,4-15 | sort -g -k1 -k10 -k3 > zmumu2.csv

@VinInn commented Jul 26, 2018

Indeed, zmumu does not reproduce. The columns are
ev det charge xg yg zg rg iphi tkId pt n1 tkId2 pt2

diff zmumu1.csv zmumu2.csv | less

3341d3340
< 8 1830 99663 2.479492 13.942786 -50.228298 14.161539 14547 337 118 5 2 312
3502a3502
> 8 1830 99663 2.479492 13.942786 -50.228298 14.161539 14547 2 312 3 337 118
12420a12421
> 25 263 3560705 -1.036580 -7.205391 20.692194 7.279571 -17876 370 750 70 367 3577
12696d12696
< 25 263 3560705 -1.036580 -7.205391 20.692194 7.279571 -17876 367 3577 70 370 750
34684c34684
< 59 34 53883 -2.440561 1.777732 -9.959659 3.019382 26200 0 0 0 0 0
---
> 59 34 79036 -1.781883 2.457852 -12.177644 3.035810 22928 0 0 0 0 0
38699a38700

etc

Oops, no: it is the clus2TP that does not fully reproduce in the case of multiple TPs...
ev 59, det 34 instead seems a real issue,
so select only those with no second TP:

diff zmumu1.csv zmumu2.csv | grep "0 0 0"    
< 59 34 53883 -2.440561 1.777732 -9.959659 3.019382 26200 0 0 0 0 0
> 59 34 79036 -1.781883 2.457852 -12.177644 3.035810 22928 0 0 0 0 0
< 408 31 275932 -1.483108 2.348596 19.877537 2.777681 22260 0 0 0 0 0
> 408 31 380389 -1.003347 2.485823 21.884169 2.680676 20385 0 0 0 0 0
diff zmumu1.csv zmumu3.csv | grep "0 0 0"  
< 243 86 15754 2.166721 -2.250007 13.777572 3.123653 -8388 0 0 0 0 0
> 243 86 44356 2.109987 -2.308259 17.822094 3.127316 -8659 0 0 0 0 0
< 408 31 275932 -1.483108 2.348596 19.877537 2.777681 22260 0 0 0 0 0
> 408 31 380389 -1.003347 2.485823 21.884169 2.680676 20385 0 0 0 0 0
diff zmumu2.csv zmumu3.csv | grep "0 0 0"
< 59 34 79036 -1.781883 2.457852 -12.177644 3.035810 22928 0 0 0 0 0
> 59 34 53883 -2.440561 1.777732 -9.959659 3.019382 26200 0 0 0 0 0
< 243 86 15754 2.166721 -2.250007 13.777572 3.123653 -8388 0 0 0 0 0
> 243 86 44356 2.109987 -2.308259 17.822094 3.127316 -8659 0 0 0 0 0

@makortel left a comment

Spotted a few things that could be cleaned up, otherwise looks good to me.

count = step;
}
return first;
}

@makortel:

This is the same as in

template<typename RandomIt, typename T, typename Compare = less<T>>
constexpr
RandomIt lower_bound(RandomIt first, RandomIt last, const T& value, Compare comp={})

right?

@VinInn (Author):

Yes, but the cudastd one does not compile; it seems to require __device__ __host__, at least in this context.

@makortel:

Ok. Should we then consider decorating the cudastd ones with __device__ __host__? (possibly in a later PR)

@VinInn (Author):

Definitely!
I'd prefer we first find a location for a macro, or something, that guarantees __device__ __host__ are not defined if a non-CUDA compiler is used...
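As a concrete illustration of what is being asked for, here is a minimal sketch, assuming a hypothetical HOST_DEVICE macro and a cuda_std::less helper; this is not the code merged in this PR:

```cpp
// Sketch only: HOST_DEVICE and cuda_std::less are assumed names, not the PR's code.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE  // expands to nothing for a non-CUDA compiler
#endif

namespace cuda_std {

  // host/device-callable stand-in for std::less
  template <typename T>
  struct less {
    HOST_DEVICE constexpr bool operator()(const T& lhs, const T& rhs) const { return lhs < rhs; }
  };

  // binary search usable from both host code and kernels
  template <typename RandomIt, typename T, typename Compare = less<T>>
  HOST_DEVICE constexpr RandomIt lower_bound(RandomIt first, RandomIt last, const T& value, Compare comp = {}) {
    auto count = last - first;
    while (count > 0) {
      auto step = count / 2;
      auto it = first + step;
      if (comp(*it, value)) {
        first = it + 1;
        count -= step + 1;
      } else {
        count = step;
      }
    }
    return first;
  }

}  // namespace cuda_std
```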

@VinInn (Author):

Actually, this is the error message:

/home/vin/GPUDoublets/CMSSW_10_2_0_pre6_Patatrack/src/HeterogeneousCore/CUDAUtilities/interface/cudastdAlgorithm.h(46): error: calling a __device__ function("operator()") from a __host__ __device__ function("lower_bound") is not allowed
          detected during instantiation of "RandomIt cuda_std::lower_bound(RandomIt, RandomIt, const T &, Compare) [with RandomIt=const std::array<uint32_t, 4UL> *, T=std::array<uint32_t, 4UL>, Compare=lambda [](const std::array<uint32_t, 4UL> &, const std::array<uint32_t, 4UL> &)->bool]" 
/home/vin/GPUDoublets/CMSSW_10_2_0_pre6_Patatrack/src/SimTracker/TrackerHitAssociation/plugins/ClusterSLOnGPU.cu(75): here

Pretty bizarre.

@makortel:

Maybe the lambda gets declared only as __device__?

@makortel commented Jul 26, 2018:

Found this https://devblogs.nvidia.com/new-compiler-features-cuda-8/
What would happen with

auto less = [] __host__ __device__ (...)->bool{

?

@VinInn (Author):

Yes, in a global function I can mark it device host:

-  auto less = [](std::array<uint32_t,4> const & a, std::array<uint32_t,4> const & b)->bool {
+  auto less = [] __device__ __host__ (std::array<uint32_t,4> const & a, std::array<uint32_t,4> const & b)->bool {

OK, fine, it compiles. I will make a new PR; you can judge how ugly it is...
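For reference, a self-contained reproduction of the pattern; this is a sketch, not PR code. __host__ __device__ lambdas need nvcc's --expt-extended-lambda flag (the CUDA 8 feature described in the post linked above), and calling std::array members from device code additionally needs --expt-relaxed-constexpr:

```cpp
// compile as: nvcc --expt-extended-lambda --expt-relaxed-constexpr test_lambda.cu
#include <array>
#include <cstdint>

int main() {
  // lambda marked callable from both host and device code
  auto less = [] __host__ __device__ (std::array<uint32_t, 4> const& a,
                                      std::array<uint32_t, 4> const& b) -> bool {
    return a[0] < b[0] || (!(b[0] < a[0]) && a[1] < b[1]);
  };
  std::array<uint32_t, 4> x{{1, 2, 0, 0}}, y{{1, 3, 0, 0}};
  return less(x, y) ? 0 : 1;  // exercises the host-side path
}
```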

const std::array<uint32_t,4> me{{id,ch,0,0}};

auto less = [](std::array<uint32_t,4> const & a, std::array<uint32_t,4> const & b)->bool {
return a[0]<b[0] || ( !(b[0]<a[0]) && a[1]<b[1]); // in this context we do not care of [2]

@makortel:

I'm not sure I understand the logic. !(b[0]<a[0]) is equivalent to a[0]<=b[0], which, given the left side of ||, has the same effect as a[0]==b[0]. I find the latter easier to understand.

@VinInn (Author):

Yes, this is the standard way to code lexicographic ordering in std, when the only requirement is the existence of operator< (not operator==).

@makortel:

Good point, thanks. On the other hand in this case the compared types are uint32_t, but ok.
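A tiny self-contained illustration of the pattern under discussion (the helper name lexLess is made up here): lexicographic ordering built from operator< alone.

```cpp
#include <cassert>

// Strict weak ordering on (a0,a1) vs (b0,b1) using only operator<.
// On the right branch, a0 < b0 is already known to be false, so
// !(b0 < a0) plays the role of a0 == b0 without requiring operator==.
template <typename T>
bool lexLess(T a0, T a1, T b0, T b1) {
  return a0 < b0 || (!(b0 < a0) && a1 < b1);
}

int main() {
  assert(lexLess(1, 9, 2, 0));   // first key decides
  assert(lexLess(1, 5, 1, 6));   // first keys tie, second key decides
  assert(!lexLess(1, 5, 1, 5));  // equal keys: not less
}
```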


cudaCheck(cudaMalloc((void**) & slgpu.me_d, sizeof(ClusterSLGPU)));
cudaCheck(cudaMemcpyAsync(slgpu.me_d, &slgpu, sizeof(ClusterSLGPU), cudaMemcpyDefault, stream.id()));
cudaCheck(cudaDeviceSynchronize());

@makortel:

IIUC this synchronization is not needed.
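A sketch of the reasoning (consumeClusterSL is a hypothetical kernel name): operations queued on one CUDA stream execute in submission order, so later kernels on the same stream already see the copied data; a device-wide sync is only needed when the host itself must wait.

```cpp
cudaCheck(cudaMalloc((void**)&slgpu.me_d, sizeof(ClusterSLGPU)));
cudaCheck(cudaMemcpyAsync(slgpu.me_d, &slgpu, sizeof(ClusterSLGPU), cudaMemcpyDefault, stream.id()));
// A kernel launched afterwards on the same stream is ordered after the copy,
// so it sees slgpu.me_d fully populated; no cudaDeviceSynchronize() is needed:
consumeClusterSL<<<blocks, threads, 0, stream.id()>>>(slgpu.me_d);
// A sync would only be required before the host reads back device results.
```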

cudaCheck(cudaMalloc((void**) & slgpu.n2_d,(ClusterSLGPU::MaxNumModules*256)*sizeof(uint32_t)));


cudaCheck(cudaMalloc((void**) & slgpu.me_d, sizeof(ClusterSLGPU)));

@makortel:

Are these freed anywhere?

@VinInn (Author):

oopsss, no.

@VinInn (Author):

done
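For the record, a sketch of what the fix amounts to (the destructor placement and names are an assumption based on the file paths in this PR): each cudaMalloc in the setup gets a matching cudaFree.

```cpp
// Hypothetical sketch: mirror the allocations done at construction time.
ClusterSLOnGPU::~ClusterSLOnGPU() {
  cudaCheck(cudaFree(slgpu.n2_d));
  cudaCheck(cudaFree(slgpu.me_d));
}
```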


assert(sl.me_d);
simLink<<<blocks, threadsPerBlock, 0, stream.id()>>>(dd.me_d,ndigis, hh.gpu_d, sl.me_d,n);
cudaStreamSynchronize(stream.id());

@makortel:

IIUC this synchronization is not needed (even for the dump below).

@VinInn (Author):

It is needed for the dump below, in case of other printfs (it can go inside the if).

@makortel:

Why? dumpLink below is launched asynchronously on the same CUDA stream, so I'd expect it to work without this synchronization.

@VinInn (Author):

It is printf that requires synchronization to dump the buffer to the host; otherwise it will overwrite the circular one on the device.
At least, this is what I understood (and observed).

@makortel:

Ok, so you want to protect against any potential earlier printf? Then yes, please move to inside the if (with a comment explaining the need).
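The behaviour described above matches the documented device-side printf model: output is written to a fixed-size FIFO on the device and only flushed to the host at synchronization points, so unflushed records can be overwritten if the buffer wraps. A standalone sketch:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

__global__ void chatty() {
  printf("block %d thread %d\n", blockIdx.x, threadIdx.x);
}

int main() {
  // Device printf writes into a ring buffer (default size ~1 MB);
  // enlarge it if many records may accumulate between flushes.
  cudaDeviceSetLimit(cudaLimitPrintfFifoSize, 16 * 1024 * 1024);
  chatty<<<2, 4>>>();
  // The buffer is flushed to stdout at sync points such as this one.
  cudaDeviceSynchronize();
  return 0;
}
```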


iEvent.put<Output>(std::move(output), [legacy](const GPUProduct& hits, CPUProduct& cpu) {
cpu = *legacy; delete legacy;
});

@makortel:

Nice example in favor of #100.

@VinInn (Author):

INDEED

@makortel:
+1 from me

@VinInn commented Jul 26, 2018

Some level of irreproducibility exists.
The number of clusters/hits is always the same:

[innocent@vinzen0]/home/vin/mc2018/crash% wc  zmumu6.csv                          
  276370  3869180 20537503 zmumu6.csv
[innocent@vinzen0]/home/vin/mc2018/crash% wc  zmumu5.csv
  276370  3869180 20537501 zmumu5.csv
[innocent@vinzen0]/home/vin/mc2018/crash% wc  zmumu4.csv
  276370  3869180 20537502 zmumu4.csv

Details change in a few cases; some can be attributed to cluster2TP (to be investigated):

[innocent@vinzen0]/home/vin/mc2018/crash% grep "5 0 126468" zmumu5.csv
5 0 126468 2.908309 0.966696 -25.288895 3.064762 3347 4 443 13 0 0 0
[innocent@vinzen0]/home/vin/mc2018/crash% grep "5 0 126468" zmumu6.csv
5 0 126468 2.908309 0.966696 -25.288895 3.064762 3347 4 443 12 0 0 1

Others really seem to come from the clusterizer (always for clusters not associated to any TP???):

[innocent@vinzen0]/home/vin/mc2018/crash% diff zmumu6.csv zmumu4.csv | grep " 0$" 
> 5 0 126468 2.908309 0.966696 -25.288895 3.064762 3347 4 443 13 0 0 0
< 59 34 79036 -1.781883 2.457852 -12.177644 3.035810 22928 0 0 0 0 0 0
> 59 34 53883 -2.440561 1.777732 -9.959659 3.019382 26200 0 0 0 0 0 0
< 480 49 22916 -3.064693 -0.377123 -18.870319 3.087810 -31490 0 0 0 0 0 0
> 480 49 188751 -2.813137 -1.259513 -18.027351 3.082225 -28377 0 0 0 0 0 0
[innocent@vinzen0]/home/vin/mc2018/crash% diff zmumu5.csv zmumu4.csv | grep " 0$" 
> 119 840 53604 -8.183054 13.835355 -22.581303 16.074184 21956 0 0 0 0 0 0
< 119 840 82958 -8.744591 13.499733 -22.080933 16.084486 22381 0 0 0 0 0 0
< 176 55 96198 -3.022953 -0.524170 19.582039 3.068061 -30976 0 0 0 0 0 0
> 176 55 109700 -2.934326 -0.835302 20.598766 3.050902 -29875 0 0 0 0 0 0
< 408 31 380389 -1.003347 2.485823 21.884169 2.680676 20385 0 0 0 0 0 0
> 408 31 275932 -1.483108 2.348596 19.877537 2.777681 22260 0 0 0 0 0 0
< 450 270 39491 0.347789 -6.659621 13.322131 6.668696 -15840 0 0 0 0 0 0
> 450 270 395927 1.614992 -6.495625 17.991888 6.693380 -13842 0 0 0 0 0 0
< 480 49 22916 -3.064693 -0.377123 -18.870319 3.087810 -31490 0 0 0 0 0 0
> 480 49 188751 -2.813137 -1.259513 -18.027351 3.082225 -28377 0 0 0 0 0 0

@fwyzard commented Jul 30, 2018

Validation summary

Reference release CMSSW_10_2_0_pre6 at a674e1f
Development branch CMSSW_10_2_X_Patatrack at 64e6201
Testing PRs:

makeTrackValidationPlots.py plots

/RelValTTbar_13/CMSSW_10_2_0_pre6-PU25ns_102X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

/RelValZMM_13/CMSSW_10_2_0_pre6-102X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

DQM GUI plots

/RelValTTbar_13/CMSSW_10_2_0_pre6-PU25ns_102X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

/RelValZMM_13/CMSSW_10_2_0_pre6-102X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

logs and nvprof/nvvp profiles

/RelValTTbar_13/CMSSW_10_2_0_pre6-PU25ns_102X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

/RelValZMM_13/CMSSW_10_2_0_pre6-102X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

Logs

The full log is available at https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/e36b10437a73fea9d58141f501e488d387cd5645/log .

@@ -1,6 +1,15 @@
#ifndef HeterogeneousCore_CUDAUtilities_cudastdAlgorithm_h
#define HeterogeneousCore_CUDAUtilities_cudastdAlgorithm_h

#ifdef __CUDACC__

@VinInn does it work as well if you replace the whole block with just

#include <cuda_runtime.h>

?


Yes, #include <cuda_runtime.h> is enough, as it #defines away the CUDA-specific attributes when not building for CUDA (i.e. if __CUDACC__ is not defined).

The downside is that one must <use name="cuda"/> in the BuildFile, to let the compiler find cuda_runtime.h in the first place.
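In other words, a sketch (not the merged code) of what the header looks like after the replacement; the cuda_std::min helper is illustrative:

```cpp
#ifndef HeterogeneousCore_CUDAUtilities_cudastdAlgorithm_h
#define HeterogeneousCore_CUDAUtilities_cudastdAlgorithm_h

// Defines __host__/__device__ for nvcc and #defines them away otherwise;
// requires <use name="cuda"/> in the BuildFile so the header is found.
#include <cuda_runtime.h>

namespace cuda_std {
  // compiles in .cc (g++/clang++) and .cu (nvcc) translation units alike
  template <typename T>
  __host__ __device__ constexpr const T& min(const T& a, const T& b) {
    return b < a ? b : a;
  }
}

#endif
```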

@VinInn (Author):

so what is the decision?

@VinInn commented Jul 31, 2018:

Strictly speaking, clients of HeterogeneousCore/CUDAUtilities should use it (in tests as well).


I've tried to change the #defines, but I ran into link errors: my guess is that some code sees __host__ __device__ as __attribute__((host)) __attribute__((device)), while other code sees an empty #define, and the two symbols do not match...

I think the best solution would be either to #include <cuda_runtime.h>, or to patch the CUDA API wrappers to include that one instead of some internal CUDA includes.

@VinInn (Author):

I moved to #include <cuda_runtime.h>
and fixed the other `__NVCC__`.

#ifndef SimTrackerTrackerHitAssociationClusterHeterogeneousProduct_H
#define SimTrackerTrackerHitAssociationClusterHeterogeneousProduct_H

#ifndef __NVCC__

please check for __CUDACC__ rather than __NVCC__
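The distinction, as a comment-only sketch: __CUDACC__ says the translation unit is being compiled as CUDA, which is what a shared header needs to know; __NVCC__ merely says the nvcc driver is involved.

```cpp
#ifdef __CUDACC__
// Compiled as CUDA (nvcc processing a .cu file): __host__/__device__
// attributes, kernel launch syntax, and device code are available.
#else
// Compiled by a plain host compiler, or by nvcc passing a pure C++ file
// through to the host compiler: provide host-only declarations.
#endif
```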


namespace trackerHitAssociationHeterogeneousProduct {

#ifndef __NVCC__

ditto

ClusterSLGPU * gpu_d=nullptr;
};

#ifndef __NVCC__

ditto

@makortel commented Aug 1, 2018

Fixed in #111.

fwyzard later referenced this pull request in a series of commits (Oct 2020 to Apr 2021): "Implement a heterogeneous Cluster-to-TrackingParticle associator running on the GPU."