Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patatrack integration - Hcal conditions (13/N) #32039

Merged
merged 2 commits into from Dec 2, 2020

Conversation

mariadalfonso
Copy link
Contributor

@mariadalfonso mariadalfonso commented Nov 5, 2020

Hcal conditions on GPU are move in the appropriate CondFormats/HcalObjects package
to be used by HCAL-local reco in #31720

@fwyzard @perrotta @makortel @slava77 @jpata

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 5, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 5, 2020

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32039/19590

  • This PR adds an extra 32KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 5, 2020

A new Pull Request was created by @mariadalfonso for master.

It involves the following packages:

CondFormats/HcalObjects

@yuanchao, @christopheralanwest, @tocheng, @cmsbuild, @tlampen, @ggovi, @pohsun can you please review it and eventually sign? Thanks.
@mmusich, @abdoulline, @tocheng, @seemasharmafnal this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here


using HcalConvertedPedestalsRcd = HcalCombinedRecord<HcalPedestalsRcd, HcalQIEDataRcd, HcalQIETypesRcd>;

using HcalConvertedEffectivePedestalsRcd = HcalCombinedRecord<HcalPedestalsRcd, HcalQIEDataRcd, HcalQIETypesRcd>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By quick look HcalConvertedPedestalsRcd and HcalConvertedPedestalsRcd are identical (in fact they very same C++ type). What motivates the duplication?

HcalCombinedRecord<HcalPedestalsRcd, HcalPedestalWidthsRcd, HcalQIEDataRcd, HcalQIETypesRcd>;

using HcalConvertedEffectivePedestalWidthsRcd =
HcalCombinedRecord<HcalPedestalsRcd, HcalPedestalWidthsRcd, HcalQIEDataRcd, HcalQIETypesRcd>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with HcalConvertedPedestalWidthsRcd and HcalConvertedEffectivePedestalWidthsRcd.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like nobody really knows, so I'll just remove the duplicate ones and check that everything still works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedestal and EffectivePedestal (mean and widths) are actually two distinct payload.
We derive this one with bias voltage on and one with bias voltage off.

We need both.
we can remove the code duplicate, but we actually need to read two different numbers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two (or more) payload (EventSetup product) types are fine, I was asking about the Record types (that determine the IOVs). Framework supports arbitrary number of products and product types in a Record.


#ifndef __CUDACC__
#include "HeterogeneousCore/CUDAUtilities/interface/HostAllocator.h"
#include "HeterogeneousCore/CUDACore/interface/ESProduct.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a dependence on CUDA acceptable for CondFormats? Or would we need e.g. CUDACondFormats for these? (these are not supposed to be stored in the CondDB in any case)

Just to note that in the next revision of the CUDA framework support I hope to remove the explicit use of cms::cuda::ESProduct as a member of the "payload", and instead wrap the payload similarly to cms::cuda::Product<T> (for EDProducts) for th ES product.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note that in the next revision of the CUDA framework support I hope to remove the explicit use of cms::cuda::ESProduct as a member of the "payload"

That would remove the dependency on CUDA from this package ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note that in the next revision of the CUDA framework support I hope to remove the explicit use of cms::cuda::ESProduct as a member of the "payload"

That would remove the dependency on CUDA from this package ?

Very likely yes. Essentially e.g. HcalConvertedPedestalWidthsGPU would change such that (e.g.)

  • HcalConvertedPedestalWidthsGPU::Product becomes HcalConvertedPedestalWidthsGPU
  • The logic from HcalConvertedPedestalWidthsGPU::getProduct() moves into the ESProducer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, then I would leave this part as is for the time being.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note that in the next revision of the CUDA framework support I hope to remove the explicit use of cms::cuda::ESProduct as a member of the "payload"

That would remove the dependency on CUDA from this package ?

Very likely yes.

On a further thought I have to take it back. If the payload uses e.g. cms::cuda::device::unique_ptr (which I hope the CUDA ESProducts would move to at that point), the dependence on HeterogeneousCore/CUDAUtilities would be unavoidable.

~HcalConvertedPedestalWidthsGPU() = default;

// get device pointers
Product const& getProduct(cudaStream_t) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned by this function (and many below) effectively returning "const pointer to non-const" instead of "(const) pointers to const". Then it would be very easy (as in compiler would not catch) to have code that accidentally modifies the pointed-to data (which is not allowed for ES products).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think that changing the definition of Product to

  struct Product {
    ~Product();
    edm::propagate_const<float*> values;
  };

would fix this potential issue ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe edm::propagate_const would solve issue. I suppose it would require declaring ~all methods of propagate_const as constexpr for it to work in the device code.

Copy link
Contributor

@fwyzard fwyzard Nov 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually these changes compile fine:

diff --git a/CondFormats/HcalObjects/interface/HcalConvertedPedestalWidthsGPU.h b/CondFormats/HcalObjects/interface/HcalConvertedPedestalWidthsGPU.h
index e3adb59ca54e..1afad7ca3a66 100644
--- a/CondFormats/HcalObjects/interface/HcalConvertedPedestalWidthsGPU.h
+++ b/CondFormats/HcalObjects/interface/HcalConvertedPedestalWidthsGPU.h
@@ -5,6 +5,7 @@
 #include "CondFormats/HcalObjects/interface/HcalPedestalWidths.h"
 #include "CondFormats/HcalObjects/interface/HcalQIEData.h"
 #include "CondFormats/HcalObjects/interface/HcalQIETypes.h"
+#include "FWCore/Utilities/interface/propagate_const.h"
 
 #ifndef __CUDACC__
 #include "HeterogeneousCore/CUDAUtilities/interface/HostAllocator.h"
@@ -15,7 +16,7 @@ class HcalConvertedPedestalWidthsGPU {
 public:
   struct Product {
     ~Product();
-    float* values;
+    edm::propagate_const<float*> values;
   };
 
 #ifndef __CUDACC__
diff --git a/CondFormats/HcalObjects/src/HcalConvertedPedestalWidthsGPU.cc b/CondFormats/HcalObjects/src/HcalConvertedPedestalWidthsGPU.cc
index 1d7ca5b3394c..0af2b5eaade2 100644
--- a/CondFormats/HcalObjects/src/HcalConvertedPedestalWidthsGPU.cc
+++ b/CondFormats/HcalObjects/src/HcalConvertedPedestalWidthsGPU.cc
@@ -145,6 +145,8 @@ HcalConvertedPedestalWidthsGPU::Product const& HcalConvertedPedestalWidthsGPU::g
   auto const& product = product_.dataForCurrentDeviceAsync(
       cudaStream, [this](HcalConvertedPedestalWidthsGPU::Product& product, cudaStream_t cudaStream) {
         // malloc
-        cudaCheck(cudaMalloc((void**)&product.values, this->values_.size() * sizeof(float)));
+        float* values;
+        cudaCheck(cudaMalloc(&values, values_.size() * sizeof(float)));
+        product.values = values;
 
         // transfer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually these changes compile fine:

Ah, right, probably HcalConvertedPedestalWidthsGPU::Product is not passed to device code as such, but the individual arrays themselves.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@makortel can you confirm that the implementation that uses edm::propagate_const_array is now fine ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you confirm that the implementation that uses edm::propagate_const_array is now fine ?

Visually the use of edm::propagate_const_array looks good now, thanks.

// FIXME: add proper getters to conditions
HcalPedestalsGPU::HcalPedestalsGPU(HcalPedestals const& pedestals)
: unitIsADC_{pedestals.isADC()},
totalChannels_{pedestals.getAllContainers()[0].second.size() + pedestals.getAllContainers()[1].second.size()},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me make a note on a semi-random place. The HcalPedestals::getAllContainers() seems to have lot's of overhead (constructing vector<pair<string, vector<HcalPedestal>>> on each call). I wonder if this could be (eventually) improved, event if the price is paid only at IOV changes.

#include "CondFormats/DataRecord/interface/HcalPedestalWidthsRcd.h"
#include "CondFormats/DataRecord/interface/HcalPedestalsRcd.h"
#include "CondFormats/DataRecord/interface/HcalQIEDataRcd.h"
#include "CondFormats/DataRecord/interface/HcalQIETypesRcd.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is the only one pulling in the dependence on CondFormats/DataRecord. I wonder if it would be better to place these Record definitions in other package to avoid that (e.g. in CondFormats/DataRecord).

@fwyzard
Copy link
Contributor

fwyzard commented Nov 13, 2020

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 13, 2020

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+1
Tested at: b7e1199
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5865d8/10735/summary.html
CMSSW: CMSSW_11_2_X_2020-11-12-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@fwyzard
Copy link
Contributor

fwyzard commented Nov 13, 2020

@yuanchao, @christopheralanwest, @tocheng, @tlampen, @ggovi, @pohsun any comments from your side ?

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5865d8/10735/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2529296
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2529273
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 148 log files, 22 edm output root files, 35 DQM output files

@fwyzard
Copy link
Contributor

fwyzard commented Nov 17, 2020

please test with #31720

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 17, 2020

The tests are being triggered in jenkins.
Tested with other pull request(s) #31720

@cmsbuild
Copy link
Contributor

cmsbuild commented Nov 26, 2020

The tests are being triggered in jenkins.
Tested with other pull request(s) #31720

@cmsbuild
Copy link
Contributor

+1
Tested at: 041e3b9
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-6a9587/11069/summary.html
CMSSW: CMSSW_11_2_X_2020-11-25-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-6a9587/11069/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 36
  • DQMHistoTests: Total histograms compared: 2747014
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2746985
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 35 files compared)
  • Checked 153 log files, 34 edm output root files, 36 DQM output files

@perrotta
Copy link
Contributor

@cms-sw/alca-l2 do you have any comment here? There are other PRs queued while waiting for this one...

@christopheralanwest
Copy link
Contributor

How exactly will these conditions be used? Will they ever require new conditions to be added to the GTs or will they simply reformat the existing conditions in a way that is convenient for GPUs?

@mariadalfonso
Copy link
Contributor Author

How exactly will these conditions be used? Will they ever require new conditions to be added to the GTs or will they simply reformat the existing conditions in a way that is convenient for GPUs?

no new conditions will be introduced.
It's just a reformat the existing conditions convenient for GPUs as you say.

@christopheralanwest
Copy link
Contributor

+alca

@ggovi
Copy link
Contributor

ggovi commented Dec 2, 2020

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Dec 2, 2020

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Copy link
Contributor

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants