GPU Cellular Automaton #48

felicepantaleo · 2018-05-23T09:15:17Z

Preliminary version of the GPU CA.
I'll make it a HeterogeneousEDProducer, put the copies in the correct place, and remove the commented-out code.
@rovere @fwyzard @VinInn @makortel

cmsbot · 2018-05-23T09:15:37Z

A new Pull Request was created by @felicepantaleo (Felice Pantaleo) for CMSSW_10_2_X_Patatrack.

It involves the following packages:

RecoPixelVertexing/PixelTriplets
RecoTracker/TkHitPairs

@cmsbot, @fwyzard can you please review it and eventually sign? Thanks.

cms-bot commands are listed here

makortel

I took a cursory look and also have some general comments.

In general the reformatting is nice, but
- there are some cases where IMHO it would be clearer to avoid line breaks
- in principle it would be nicer to have the reformatting in its own PR
There is a lot of copy-paste (in addition to what there was already), but that is probably best to deal with later when the dust settles (i.e. after migrating to HeterogeneousEDProducer etc)

makortel · 2018-05-23T09:21:40Z

RecoPixelVertexing/PixelTriplets/plugins/CAGraph.h


-	std::vector<int> theOuterLayerPairs;
-	std::vector<int> theInnerLayerPairs;
+  std::string name() const { return theName; }


This should return const std::string&.

Eventually it would be nice to avoid strings as the layer identifiers.
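
A minimal sketch of the suggested accessor, using a hypothetical stand-in class `Layer` (not the actual type from `CAGraph.h`). Returning `const std::string&` hands out a reference to the member instead of copying the string on every call; the helper `nameIsNotCopied` exists only to make that visible.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a graph layer; only the accessor matters here.
class Layer {
public:
  explicit Layer(std::string name) : theName(std::move(name)) {}

  // Returns a reference to the member: no copy is made on each call.
  const std::string& name() const { return theName; }

private:
  std::string theName;
};

// True if two calls to name() refer to the same underlying string object.
inline bool nameIsNotCopied(const Layer& layer) {
  return &layer.name() == &layer.name();
}
```

With the original by-value signature, the two addresses compared in the helper would belong to two distinct temporaries.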

makortel · 2018-05-23T09:22:29Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitQuadrupletGenerator.cc

+  edm::ParameterSet comparitorPSet =
+      cfg.getParameter<edm::ParameterSet>("SeedComparitorPSet");
+  std::string comparitorName =
+      comparitorPSet.getParameter<std::string>("ComponentName");


Could the line breaks be avoided?

makortel · 2018-05-23T09:23:04Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitQuadrupletGenerator.cc

  }
 }

-void CAHitQuadrupletGenerator::fillDescriptions(edm::ParameterSetDescription& desc) {
+void CAHitQuadrupletGenerator::fillDescriptions(
+    edm::ParameterSetDescription &desc) {


Could the line break be avoided?

makortel · 2018-05-23T09:32:06Z

RecoTracker/TkHitPairs/interface/RecHitsSortedInPhi.h

@@ -39,15 +39,15 @@ class RecHitsSortedInPhi {
  typedef std::pair<HitIter,HitIter>            Range;

  using DoubleRange = std::array<int,4>;
-  
+


All changes in this file are reformatting, so in principle could be avoided in this PR.

makortel · 2018-05-23T09:34:55Z

RecoPixelVertexing/PixelTriplets/plugins/GPUSimpleVector.h

+#include <cuda.h>
+#include <cuda_runtime.h>
+
+template <int maxSize, class T> struct GPUSimpleVector {


How is this different from
https://github.com/cms-patatrack/cmssw/blob/CMSSW_10_2_X_Patatrack/HeterogeneousCore/CUDAUtilities/interface/GPUSimpleVector.h
? (ok, I see int maxSize template parameter)

Anyway it would be better placed (eventually) in HeterogeneousCore/CUDAUtilities.

Indeed, can we avoid having two GPUSimpleVector classes ?

We discussed with Felice and I agree the use case is valid (one GPU allocation vs. a GPU allocation per hit). The intention of the class is to provide a vector-like interface on top of a (dynamically-allocated) array, close to what

cmssw/FWCore/Utilities/interface/VecArray.h

Lines 27 to 28 in 655e4ed

template <typename T, unsigned int N>
class VecArray {

does on the CPU.

I'd suggest to treat this class similarly, i.e. rename to e.g. GPUVecArray (and reorder the template parameters as <class, int>) and move to HeterogeneousCore/CUDAUtilities/interface.
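
As a rough illustration of the proposal, here is a host-only sketch of a vector-like interface over a fixed-capacity array, with the template parameters ordered as `<class, int>`. The name `VecArraySketch` and its members are assumptions made for this example, not the class from the PR. A device version would additionally need `__host__ __device__` qualifiers and an atomic increment for concurrent writers, both omitted here.

```cpp
// Illustrative sketch only: a vector-like interface over a fixed-capacity
// array. Name and members are hypothetical, not taken from the PR.
template <class T, int maxSize>
struct VecArraySketch {
  // Appends an element; returns its index, or -1 if the array is full.
  int push_back(const T& element) {
    if (m_size >= maxSize) {
      return -1;  // refuse to write past the end of the storage
    }
    m_data[m_size] = element;
    return m_size++;
  }

  // Forgets the stored elements without touching the storage itself.
  void reset() { m_size = 0; }

  int size() const { return m_size; }
  static constexpr int capacity() { return maxSize; }
  const T& operator[](int i) const { return m_data[i]; }

  T m_data[maxSize];
  int m_size = 0;
};
```

Because the capacity is part of the type, the whole object can live in a single allocation, which is the use case described above.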

fwyzard · 2018-05-23T16:12:13Z

RecoPixelVertexing/PixelTriplets/plugins/BuildFile.xml

  <flags   EDM_PLUGIN="1"/>
+  <flags   CUDA_FLAGS="--expt-relaxed-constexpr"/>


you don't need to add this, it is included by default in the CUDA_FLAGS

fwyzard · 2018-05-23T16:13:14Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitQuadrupletGeneratorGPU.cc

+
+  std::vector<const HitDoublets *> hitDoublets;
+
+  const int numberOfHitsInNtuplet = 4;


this is unused

cmsbot · 2018-05-25T14:11:13Z

Pull request #48 was updated. @cmsbot, @fwyzard can you please check and sign again.

cmsbot · 2018-05-25T16:57:04Z

Pull request #48 was updated. @cmsbot, @fwyzard can you please check and sign again.

makortel

I repeat my earlier general comments on the formatting:

- separating the formatting changes from the rest would make review of the rest much easier
- there are many places where the line breaks make the code more difficult to read (IMHO)

makortel · 2018-05-25T15:07:21Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletEDProducerT.cc

+//
+// #include "CAHitQuadrupletGeneratorGPU.h"
+// using CAHitQuadrupletGPUEDProducer = CAHitNtupletEDProducerT<CAHitQuadrupletGeneratorGPU>;
+// DEFINE_FWK_MODULE(CAHitQuadrupletGPUEDProducer);


Please remove the commented code.

makortel · 2018-05-28T07:44:19Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletHeterogeneousEDProducer.cc

+CAHitNtupletHeterogeneousEDProducer::CAHitNtupletHeterogeneousEDProducer(
+    const edm::ParameterSet &iConfig)
+    : HeterogeneousEDProducer<heterogeneous::HeterogeneousDevices<
+          heterogeneous::GPUCuda, heterogeneous::CPU>>(iConfig), doubletToken_(consumes<IntermediateHitDoublets>(iConfig.getParameter<edm::InputTag>("doublets"))),


The base class constructor call can be shortened to

: HeterogeneousEDProducer(iConfig),
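
The shortening works because, inside a class derived from a template specialization, the base's injected-class-name can be used without repeating the template arguments. A small sketch with stand-in types (`Config`, `ProducerBase`, `DeviceList` and `MyProducer` are hypothetical, not the CMSSW classes):

```cpp
// Stand-in types for illustration; not the CMSSW classes.
struct Config {
  int value;
};

template <typename Devices>
class ProducerBase {
public:
  explicit ProducerBase(const Config& cfg) : value_(cfg.value) {}
  int value() const { return value_; }

private:
  int value_;
};

struct DeviceList {};

class MyProducer : public ProducerBase<DeviceList> {
public:
  // "ProducerBase" here names ProducerBase<DeviceList>: the template
  // arguments do not have to be spelled out again.
  explicit MyProducer(const Config& cfg) : ProducerBase(cfg) {}
};
```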

makortel · 2018-05-28T07:45:13Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletHeterogeneousEDProducer.cc

+
+CAHitNtupletHeterogeneousEDProducer::~CAHitNtupletHeterogeneousEDProducer() {
+
+}


I suggest to replace the empty destructor with adding = default to the declaration on line 34.
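
For illustration, the two spellings compared on hypothetical toy types. A destructor with an empty body counts as user-provided, while one defaulted on its first declaration does not, which the type trait below makes visible. For the producer itself, which has non-trivial members and a base class, the gain is mostly brevity.

```cpp
#include <type_traits>

// Toy types for illustration only.
struct WithEmptyBody {
  ~WithEmptyBody() {}  // user-provided, even though it does nothing
};

struct WithDefault {
  ~WithDefault() = default;  // defaulted on its first declaration
};

// The empty-body destructor makes the type non-trivially destructible.
constexpr bool emptyBodyIsTrivial =
    std::is_trivially_destructible<WithEmptyBody>::value;
constexpr bool defaultIsTrivial =
    std::is_trivially_destructible<WithDefault>::value;
```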

makortel · 2018-05-28T07:48:29Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletHeterogeneousEDProducer.cc

+void CAHitNtupletHeterogeneousEDProducer::beginStreamGPUCuda(
+    edm::StreamID streamId, cuda::stream_t<> &cudaStream) {
+
+}


It is not mandatory to implement beginStreamGPUCuda(), so you could remove the method altogether. But where do you allocate the GPU memory? (answering to myself: in the constructor, but see other comment why the allocations should (eventually) be moved here)

makortel · 2018-05-28T07:49:46Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletHeterogeneousEDProducer.cc

+
+void CAHitNtupletHeterogeneousEDProducer::produceGPUCuda(
+    edm::HeterogeneousEvent &iEvent, const edm::EventSetup &iSetup,
+    cuda::stream_t<> &cudaStream) {


I think ordering acquireGPUCuda() and produceGPUCuda() in that order would make them easier to read.

makortel · 2018-05-28T08:33:02Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitQuadrupletGeneratorGPU.cu

+  dim3 numberOfBlocks_find(8, numberOfRootLayerPairs);
+  ((GPUSimpleVector<maxNumberOfQuadruplets, Quadruplet>
+        *)(h_foundNtuplets[regionIndex]))
+      ->reset();


If h_foundNtuplets really has to contain void *, could we at least use reinterpret_cast here (and elsewhere)?
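
A host-only sketch of the named cast, assuming a hypothetical `FoundNtuplets` type stored behind `void*` entries (the real container in the PR is a `GPUSimpleVector` specialization). For a conversion from `void*` to an object pointer, `static_cast` would work as well; either named cast is easier to spot and search for than a C-style cast.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the container stored behind the void pointers.
struct FoundNtuplets {
  int size = 3;
  void reset() { size = 0; }
};

// Resets the container stored at the given index of a type-erased list.
inline void resetEntry(std::vector<void*>& entries, std::size_t index) {
  // The named cast states the intent explicitly at the point of use.
  reinterpret_cast<FoundNtuplets*>(entries[index])->reset();
}
```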

makortel · 2018-05-28T08:36:00Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitQuadrupletGeneratorGPU.cu

+  cudaMemsetAsync(device_isOuterHitOfCell, 0,
+                  maxNumberOfLayers * maxNumberOfHits *
+                      sizeof(GPUSimpleVector<maxCellsPerHit, unsigned int>),
+                  cudaStream_);


Could you add a comment that this resets temporary memory for the next event, and is not needed for reading the output? Just to explain why it is ok to have async call here without cudaStreamSynchronize().

makortel · 2018-05-28T08:42:45Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitQuadrupletGeneratorGPU.cc

+      const TrackingRegion &region = regionLayerPairs.region();
+      auto foundQuads = fetchKernelResult(index);
+      std::cout << foundQuads.size() << " found quads" << std::endl;
+    unsigned int numberOfFoundQuadruplets = foundQuads.size();


Indentation becomes inconsistent at this line. But I'd rather re-indent the lines above than those below.

makortel · 2018-05-28T08:46:59Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletHeterogeneousEDProducer.cc

+      const TrackingRegion &region = regionLayerPairs.region();
+      auto seedingHitSetsFiller = seedingHitSets->beginRegion(&region);
+      generator_.fillResults(regionDoublets, ntuplets, iSetup, seedingLayerHits,
+                             cudaStream.id());


To me it looks like CAHitQuadrupletGeneratorGPU::fillResults() already loops over the regions and fills ntuplets for all regions. Why does it have to be repeated here for each region? Should the loop perhaps be something along

generator_.fillResults(regionDoublets, ntuplets, iSetup, seedingLayerHits, cudaStream.id());
int index = 0;
for (const auto &regionLayerPairs : regionDoublets) {
  const TrackingRegion &region = regionLayerPairs.region();
  auto seedingHitSetsFiller = seedingHitSets->beginRegion(&region);
  fillNtuplets(seedingHitSetsFiller, ntuplets[index]);
  index++;
}

instead?

makortel · 2018-05-28T08:54:57Z

RecoPixelVertexing/PixelTriplets/plugins/BuildFile.xml

  <flags   EDM_PLUGIN="1"/>
+  <flags   CUDA_FLAGS="--expt-relaxed-constexpr"/>


@fwyzard commented earlier that this is not needed as the flag is included by default in CUDA_FLAGS.

makortel · 2018-05-29T07:26:27Z

It is also noteworthy that (because of the formatting changes, IIUC) this PR conflicts with cms-sw#23363.

fwyzard · 2018-05-29T10:24:16Z

question: with these changes, do we use the new producer for the GPU workflow ?

makortel · 2018-05-29T10:25:46Z

> question: with these changes, do we use the new producer for the GPU workflow ?

No. Additional changes in the configuration are needed to use this new producer.

cmsbot · 2018-05-30T14:24:38Z

Pull request #48 was updated. @cmsbot, @fwyzard can you please check and sign again.

fwyzard · 2018-06-05T14:21:05Z

Validation summary

Reference release CMSSW_10_2_0_pre4 at 926a81b
Development branch CMSSW_10_2_X_Patatrack at b1e6d1c
Testing PRs:

GPU Cellular Automaton #48 at f56ca1e

`makeTrackValidationPlots.py` plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

DQM GUI plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8 are missing
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

logs and `nvprof/nvvp` profiles

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

log, profile and summary for workflow
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

testing log, profile and summary for workflow 10824.9
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

Logs

The full log is available at https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/a56c9f39c1dfffed018bbd9a0f026238f9390c21/log .

fwyzard · 2018-06-05T15:30:00Z

The development summary for workflow 10824.8 is very succinct:

======== Error: Application received signal 139

Which is strange because the same workflow is successful in the validation of other PRs.

cmsbot · 2018-06-05T15:36:49Z

Pull request #48 was updated. @cmsbot, @fwyzard can you please check and sign again.

cmsbot · 2018-06-05T15:39:12Z

Pull request #48 was updated. @cmsbot, @fwyzard can you please check and sign again.

fwyzard · 2018-06-05T15:42:17Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletHeterogeneousEDProducer.cc

+
+  bool emptyRegionDoublets = false;
+  std::unique_ptr<RegionsSeedingHitSets> seedingHitSets;
+  std::vector<OrderedHitSeeds> ntuplets;


please update the member variable names to match the coding rules, i.e. add a trailing _

fwyzard · 2018-06-05T15:44:53Z

RecoPixelVertexing/PixelTriplets/plugins/CAHitQuadrupletGeneratorGPU.h

+    static constexpr int maxNumberOfHits = 1000;
+    static constexpr int maxNumberOfRegions = 30;
+
+    unsigned int numberOfRootLayerPairs = 0;


please update the member variable names to match the coding rules, i.e. add a trailing _

fwyzard · 2018-06-05T15:45:36Z

RecoPixelVertexing/PixelTriplets/plugins/GPUCACell.h

+           theInnerR, theOuterR);
+  }
+
+  //        __host__    __device__


can you delete the commented out part ?

cmsbot · 2018-06-08T16:25:14Z

Pull request #48 was updated. @cmsbot, @fwyzard can you please check and sign again.

cmsbot · 2018-06-08T16:28:54Z

Pull request #48 was updated. @cmsbot, @fwyzard can you please check and sign again.

felicepantaleo · 2018-06-08T16:29:57Z

I still need to make use of the existing non-templated GPU vector and probably rebase the PR...

fwyzard · 2018-06-14T14:41:56Z

To reformulate the comment: the code should not crash.
If we use fixed-size buffers, their use should be protected, with the algorithms doing one of:

- adapting, e.g. processing all elements, a chunk at a time
- processing only the elements that fit the buffer, and signalling a LogError
- processing no elements, and signalling a LogError
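
The first two options can be sketched on the host as follows. This is only an illustration under stated assumptions: the function names are hypothetical, `std::cerr` stands in for a LogError, and the summation is a placeholder for the real per-element work.

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

// Copy only the elements that fit the buffer and report the truncation.
// Returns the number of elements actually copied.
inline std::size_t copyWhatFits(const std::vector<int>& input, int* buffer,
                                std::size_t capacity) {
  const std::size_t n = std::min(input.size(), capacity);
  if (input.size() > capacity) {
    std::cerr << "buffer too small: keeping " << n << " of " << input.size()
              << " elements\n";
  }
  std::copy(input.begin(), input.begin() + n, buffer);
  return n;
}

// Process all elements, one buffer-sized chunk at a time.
// Returns the number of chunks used; the result is accumulated in sum.
inline std::size_t sumInChunks(const std::vector<int>& input, int* buffer,
                               std::size_t capacity, long& sum) {
  sum = 0;
  if (capacity == 0) {
    return 0;  // nothing can be staged in an empty buffer
  }
  std::size_t chunks = 0;
  for (std::size_t begin = 0; begin < input.size(); begin += capacity) {
    const std::size_t n = std::min(capacity, input.size() - begin);
    std::copy(input.begin() + begin, input.begin() + begin + n, buffer);
    for (std::size_t i = 0; i < n; ++i) {
      sum += buffer[i];
    }
    ++chunks;
  }
  return chunks;
}
```

In both cases the buffer is never written past its capacity, so an oversized input degrades the result or the throughput instead of crashing the job.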

cmsbot · 2018-06-14T15:52:28Z

Pull request #48 was updated. @cmsbot, @fwyzard can you please check and sign again.

felicepantaleo · 2018-06-14T15:53:37Z

@fwyzard, I have implemented your suggestion and replaced the assert with a LogError and return

fwyzard · 2018-06-14T16:14:00Z

Validation summary

Reference release CMSSW_10_2_0_pre5 at 30c7b03
Development branch CMSSW_10_2_X_Patatrack at 655e4ed
Testing PRs:

GPU Cellular Automaton #48 at 2ebbcf3

`makeTrackValidationPlots.py` plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

DQM GUI plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8 are missing
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8 are missing
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

logs and `nvprof/nvvp` profiles

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

Logs

The full log is available at https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/7be28d1421adf3f678cb1e9c25683564047e3e87/log .

fwyzard · 2018-06-14T16:17:09Z

As of 2ebbcf3, both 10824.8 workflows are failing: TTbar with SIGABRT (abort), Zmumu with SIGSEGV (segmentation violation).

cmsbot · 2018-06-14T17:08:59Z

Pull request #48 was updated. @cmsbot, @fwyzard can you please check and sign again.

fwyzard · 2018-06-16T10:45:58Z

Validation summary

Reference release CMSSW_10_2_0_pre5 at 30c7b03
Development branch CMSSW_10_2_X_Patatrack at 655e4ed
Testing PRs:

GPU Cellular Automaton #48 at a9f46c6

`makeTrackValidationPlots.py` plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

DQM GUI plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8 are missing
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

logs and `nvprof/nvvp` profiles

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

Logs

The full log is available at https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/c31f6d167e9a9330f83b4144f95152ae5eeffac3/log .

fwyzard · 2018-06-16T12:28:39Z

I have split the squash commit in two, to keep separate the work on the GPU::SimpleVector and GPU::VecArray, and the work on the CA.

fwyzard · 2018-06-16T14:15:28Z

@makortel sorry for not addressing your comments earlier; can you summarise which ones are still relevant, and I'll try to make the corresponding changes ?

makortel · 2018-06-18T08:00:11Z

@fwyzard No problem, I tried to gather them below (most are aesthetic and possibly subjective)

1. The intermediate event(). could be dropped here

cmssw/RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletHeterogeneousEDProducer.cc

Line 92 in 96559f3

iEvent.event().getByToken(doubletToken_, hdoublets);

2. This one could be dropped, right?

cmssw/RecoPixelVertexing/PixelTriplets/plugins/BuildFile.xml

Line 8 in 96559f3

<flags CUDA_FLAGS="--expt-relaxed-constexpr"/>

3. This result parameter seems to be unneeded

cmssw/RecoPixelVertexing/PixelTriplets/plugins/CAHitQuadrupletGeneratorGPU.cc

Lines 183 to 185 in 96559f3

void CAHitQuadrupletGeneratorGPU::hitNtuplets(
    const IntermediateHitDoublets &regionDoublets,
    std::vector<OrderedHitSeeds> &result, const edm::EventSetup &es,

4. I'd pass the cudaStream_t around as a function parameter instead of a member variable

cmssw/RecoPixelVertexing/PixelTriplets/plugins/CAHitQuadrupletGeneratorGPU.h

Line 151 in 96559f3

cudaStream_t cudaStream_;

5. Even though the reformatting was in general an improvement, there are places where it made the code harder for me to read, like here

cmssw/RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletHeterogeneousEDProducer.cc

Lines 95 to 98 in 96559f3

const SeedingLayerSetsHits &seedingLayerHits =
    regionDoublets.seedingLayerHits();
if (seedingLayerHits.numberOfLayersInSet() <
    CAHitQuadrupletGeneratorGPU::minLayers) {

I could go through them myself and propose changes (which will likely be overridden by the next clang-format...)
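
The suggestion about passing the stream as a function parameter can be sketched on the host with a stand-in handle type. `StreamHandle`, `GeneratorSketch` and `processEvent` are hypothetical names used only for this illustration; the real code would pass a `cudaStream_t`.

```cpp
#include <vector>

// Hypothetical stand-in for a CUDA stream handle, so the sketch runs on the host.
using StreamHandle = int;

// Records which stream each step was asked to use.
struct GeneratorSketch {
  // The stream is received per call instead of being stored as a data member,
  // so one object can serve calls made on different streams.
  void launch(StreamHandle stream) { usedStreams.push_back(stream); }
  void fillResults(StreamHandle stream) { usedStreams.push_back(stream); }

  std::vector<StreamHandle> usedStreams;
};

// The caller percolates the same stream through every step of one event.
inline void processEvent(GeneratorSketch& generator, StreamHandle stream) {
  generator.launch(stream);
  generator.fillResults(stream);
}
```

With no stream stored in the object, there is no stale handle that a later call could pick up by mistake.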

fwyzard · 2018-06-18T09:02:20Z

I'll implement 1. and 2., look into 3. and 4., and leave 5. as is for the moment... I'd rather make clang-format do the work for us.

makortel · 2018-06-18T09:41:15Z

@fwyzard Sounds good, thanks!

fwyzard · 2018-06-18T10:48:10Z

OK, I've done something along the lines of 3. and 4. (and maybe 5.) as well.

See #83 for the clean up.

@makortel

Apply some clean up to the code and formatting of `CAHitNtupletHeterogeneousEDProducer` and `CAHitQuadrupletGeneratorGPU`, as suggested by @makortel during the review of #48:
- clean up the `BuildFile.xml`;
- remove unused data members and arguments from function calls;
- percolate the CUDA stream instead of storing it as a data member.

Also:
- add `cudaCheck` calls around memory allocations and copies;
- reduce the number of memory allocations used to set up the GPU state.

cmsbot added comparison-pending labels May 23, 2018

makortel requested changes May 23, 2018

fwyzard requested changes May 23, 2018

makortel requested changes May 28, 2018

fwyzard force-pushed the CMSSW_10_2_X_Patatrack branch 2 times, most recently from 1965f2e to 30594f6 Compare June 4, 2018 16:10

fwyzard reviewed Jun 5, 2018

cmsbot added core-pending labels Jun 8, 2018

felicepantaleo added 2 commits June 11, 2018 12:30

clang format CellularAutomaton.cc

9ca4a91

clang format CAHitQuadrupletGenerator.cc

42d2e6d

replace assert with LogError

4da9073

increase the max number of hits

a9f46c6

fwyzard merged commit 2b0f382 into cms-patatrack:CMSSW_10_2_X_Patatrack Jun 16, 2018

fwyzard mentioned this pull request Jun 18, 2018

Clean up CAHitNtupletHeterogeneousEDProducer #83

Merged

makortel mentioned this pull request Jun 20, 2018

Fix workflow 10824.8 irreproducibility and crashes #84

Closed

fwyzard pushed a commit that referenced this pull request Feb 12, 2021

Moved layer ID and valid bit to end to be consistent with DTC emulation. (#48)

39a9c39

		@@ -39,15 +39,15 @@ class RecHitsSortedInPhi {
		typedef std::pair<HitIter,HitIter> Range;

		using DoubleRange = std::array<int,4>;

		<flags EDM_PLUGIN="1"/>
		<flags CUDA_FLAGS="--expt-relaxed-constexpr"/>


		std::vector<const HitDoublets *> hitDoublets;

		const int numberOfHitsInNtuplet = 4;


		CAHitNtupletHeterogeneousEDProducer::~CAHitNtupletHeterogeneousEDProducer() {

		}

felicepantaleo commented May 23, 2018

cmsbot commented May 23, 2018

makortel left a comment

cmsbot commented May 25, 2018

cmsbot commented May 25, 2018

makortel left a comment

makortel commented May 29, 2018

fwyzard commented May 29, 2018

makortel commented May 29, 2018

cmsbot commented May 30, 2018

fwyzard commented Jun 5, 2018 • edited

Validation summary

`makeTrackValidationPlots.py` plots
- /RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW
- /RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

DQM GUI plots
- /RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW
- /RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

logs and `nvprof/nvvp` profiles
- /RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW
- /RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

Logs

fwyzard commented Jun 5, 2018

cmsbot commented Jun 5, 2018

cmsbot commented Jun 5, 2018

cmsbot commented Jun 8, 2018

cmsbot commented Jun 8, 2018

felicepantaleo commented Jun 8, 2018

fwyzard commented Jun 14, 2018

cmsbot commented Jun 14, 2018

felicepantaleo commented Jun 14, 2018

fwyzard commented Jun 14, 2018

Validation summary

`makeTrackValidationPlots.py` plots
- /RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW
- /RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

DQM GUI plots
- /RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW
- /RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

logs and `nvprof/nvvp` profiles
- /RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW
- /RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

Logs

fwyzard commented Jun 14, 2018

cmsbot commented Jun 14, 2018

fwyzard commented Jun 16, 2018

Validation summary

`makeTrackValidationPlots.py` plots
- /RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW
- /RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

DQM GUI plots
- /RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW
- /RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW
