Replace CUDA API wrapper memory operations with native CUDA calls #395

waredjeb · 2019-10-25T17:31:29Z

PR description

This PR is part of #386:

- replace cuda::memory::copy() with cudaMemcpy(), and cuda::memory::async::copy() with cudaMemcpyAsync()
- replace cuda::memory::zero() and cuda::memory::set() with cudaMemset()
- replace cuda::memory::async::zero() and cuda::memory::async::set() with cudaMemsetAsync()

PR validation

unit tests run

makortel · 2019-10-25T18:22:29Z

CUDADataFormats/BeamSpot/src/BeamSpotCUDA.cc

@@ -4,5 +4,5 @@

 BeamSpotCUDA::BeamSpotCUDA(Data const* data_h, cuda::stream_t<>& stream) {
  data_d_ = cudautils::make_device_unique<Data>(stream);
-  cuda::memory::async::copy(data_d_.get(), data_h, sizeof(Data), stream.id());
+  cudaMemcpyAsync(data_d_.get(), data_h, sizeof(Data), cudaMemcpyHostToDevice, stream.id());


We've typically used cudaMemcpyDefault elsewhere (but I'm not against denoting the direction explicitly).

So could I leave the cudaMemcpy with the direction defined?

I'm fine with that.

makortel · 2019-10-25T18:26:15Z

There are conflicts, so this PR needs to be rebased. On the other hand, this PR also conflicts with #389, so it could be less work to wait until that one gets merged. I'll let @fwyzard comment on whether #389 could get in soon.

fwyzard

Could you

- fix the spurious lines
- fix the types used in the copies
- not add/remove whitespaces and empty lines (unless it is done on purpose)

?

Then, one thing I forgot to ask you earlier: could you wrap every call to cudaMemcpy(...), cudaMemcpyAsync(...), cudaMemset(...), cudaMemsetAsync(...) in a call to cudaCheck() ?
For example

  cudaMemcpyAsync(data_d_.get(), data_h, sizeof(Data), cudaMemcpyHostToDevice, stream);

should become

  cudaCheck(cudaMemcpyAsync(data_d_.get(), data_h, sizeof(Data), cudaMemcpyHostToDevice, stream));

To make it available, you may need to add

#include "HeterogeneousCore/CUDAUtilities/interface/cudaCheck.h"

if it was not already there.
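For readers unfamiliar with the pattern, here is a minimal sketch of what a cudaCheck-style wrapper does: evaluate the call once, and throw with the file, line and failing expression if the returned status is not success. This is not the code in HeterogeneousCore/CUDAUtilities/interface/cudaCheck.h, which may behave differently. `FakeStatus`, `fakeMemcpyAsync`, `sketchCheck` and `SKETCH_CHECK` are hypothetical stand-ins, chosen so that the example compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for cudaError_t; not a CUDA type.
enum class FakeStatus { Success = 0, InvalidValue = 1 };

// Hypothetical stand-in for a runtime call that reports failure through its return value.
inline FakeStatus fakeMemcpyAsync(void* dst, const void* src, std::size_t bytes) {
  if (dst == nullptr || src == nullptr)
    return FakeStatus::InvalidValue;
  std::memcpy(dst, src, bytes);
  return FakeStatus::Success;
}

// Throw with the call site and the text of the failing call if the status is not Success.
inline void sketchCheck(FakeStatus status, const char* file, int line, const char* expr) {
  if (status == FakeStatus::Success)
    return;
  std::ostringstream msg;
  msg << file << ':' << line << ": " << expr << " failed with status " << static_cast<int>(status);
  throw std::runtime_error(msg.str());
}

// The macro captures __FILE__, __LINE__ and the wrapped expression as a string.
#define SKETCH_CHECK(call) sketchCheck((call), __FILE__, __LINE__, #call)
```

Without such a wrapper the returned status is silently discarded, so a failed copy or memset would go unnoticed until something downstream misbehaves.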

DataFormats/Math/test/CholeskyInvert_t.cu

HeterogeneousCore/CUDAUtilities/test/copyAsync_t.cpp

RecoLocalTracker/SiPixelClusterizer/test/gpuClustering_t.h

fwyzard · 2019-10-27T06:32:30Z

RecoPixelVertexing/PixelVertexFinding/test/VertexFinder_t.h

-      cuda::memory::copy(nn, LOC_ONGPU(ndof), nv * sizeof(int32_t));
-      cuda::memory::copy(chi2, LOC_ONGPU(chi2), nv * sizeof(float));
+      cudaMemcpy(&nv, LOC_ONGPU(nvFinal), sizeof(uint32_t), cudaMemcpyDeviceToHost);
+      cudaMemcpy(nn, LOC_ONGPU(ndof), nv * sizeof(uint32_t), cudaMemcpyDeviceToHost);


here uint32_t was originally int32_t

I checked: only the line cudaMemcpy(nn, LOC_ONGPU(ndof), nv * sizeof(uint32_t), cudaMemcpyDeviceToHost); originally used int32_t.

fwyzard · 2019-10-28T14:26:49Z

Validation summary

Reference release CMSSW_11_0_0_pre7 at 411b633
Development branch CMSSW_11_0_X_Patatrack at 617f9a0
Testing PRs:

Replace CUDA API wrapper memory operations with native CUDA calls #395 at 463b495

Validation plots

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 10824.5
tracking validation plots and summary for workflow 10824.51
tracking validation plots and summary for workflow 10824.52

/RelValZMM_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 10824.5
tracking validation plots and summary for workflow 10824.51
tracking validation plots and summary for workflow 10824.52

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_design_v3-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 10824.5
tracking validation plots and summary for workflow 10824.51
tracking validation plots and summary for workflow 10824.52

Throughput plots

/EphemeralHLTPhysics1/Run2018D-v1/RAW run=323775 lumi=53

logs and `nvprof`/`nvvp` profiles

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

reference release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.51
- ✔️ step3.py: log
development release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 136.86452
testing release, workflow 10824.5
- ✔️ step3.py: log
testing release, workflow 10824.51
- ✔️ step3.py: log
testing release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 136.86452

/RelValZMM_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

reference release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.51
- ✔️ step3.py: log
development release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 136.86452
testing release, workflow 10824.5
- ✔️ step3.py: log
testing release, workflow 10824.51
- ✔️ step3.py: log
testing release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 136.86452

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_design_v3-v1/GEN-SIM-DIGI-RAW

reference release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.51
- ✔️ step3.py: log
development release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 136.86452
testing release, workflow 10824.5
- ✔️ step3.py: log
testing release, workflow 10824.51
- ✔️ step3.py: log
testing release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 136.86452

Logs

The full log is available at https://patatrack.web.cern.ch/patatrack/validation/pulls/9a76577bb63975315ef69ad1b88a362a5932bc83/log .

fwyzard · 2019-10-28T16:38:35Z

@VinInn could you have a look at this PR?

The changes should be only technical (moving from the cuda::memory::.. wrappers to the standard cudaMemcpy() etc. functions), but we observe a non-negligible change in the TTbar realistic performance for the tracks associated to the PV:

| | development-10824.5 | development-10824.52 | testing-10824.52 |
|---|---|---|---|
| Number of TrackingParticles (after cuts) | 4605 | 4950 | 5017 |
| Number of matched TrackingParticles | 2346 | 2757 | 2790 |
| Number of tracks | 3410 | 4371 | 4416 |
| Number of true tracks | 3025 | 3860 | 3905 |
| Number of fake tracks | 385 | 511 | 511 |
| Number of pileup tracks | 0 | 0 | 0 |
| Number of duplicate tracks | 44 | 0 | 0 |

while there doesn't seem to be any change in the overall tracks:

| | reference-10824.5 | development-10824.5 | development-10824.52 | testing-10824.52 |
|---|---|---|---|---|
| Efficiency | 0.5128 | 0.5252 | 0.5818 | 0.5818 |
| Number of TrackingParticles (after cuts) | 5530 | 5320 | 5320 | 5320 |
| Number of matched TrackingParticles | 2836 | 2794 | 3095 | 3095 |
| Fake rate | 0.0472 | 0.0479 | 0.0212 | 0.0212 |
| Duplicate rate | 0.0150 | 0.0152 | 0.0003 | 0.0003 |
| Number of tracks | 32648 | 32656 | 39763 | 39763 |
| Number of true tracks | 31108 | 31093 | 38921 | 38920 |
| Number of fake tracks | 1540 | 1563 | 842 | 843 |
| Number of pileup tracks | 27279 | 27270 | 34468 | 34467 |
| Number of duplicate tracks | 491 | 495 | 12 | 12 |
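As a reading aid, the three rates in the table can be reproduced from the counts. The definitions below are an assumption inferred from the numbers (they match the printed values to four decimals), not taken from the validation code: efficiency as matched TrackingParticles over TrackingParticles after cuts, and fake and duplicate rates as the corresponding track counts over the number of tracks. All function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Plain ratio of two counts, computed in double precision.
inline double ratio(long numerator, long denominator) {
  return static_cast<double>(numerator) / static_cast<double>(denominator);
}

// Assumed definitions, inferred from the table rather than from the validation code.
inline double efficiency(long matchedTP, long allTP) { return ratio(matchedTP, allTP); }
inline double fakeRate(long fakeTracks, long allTracks) { return ratio(fakeTracks, allTracks); }
inline double duplicateRate(long duplicateTracks, long allTracks) { return ratio(duplicateTracks, allTracks); }

// True when a computed value agrees with a value printed with four decimals.
inline bool agrees(double computed, double printed) {
  return std::fabs(computed - printed) < 5e-5;
}
```

For the testing-10824.52 column, `efficiency(3095, 5320)`, `fakeRate(843, 39763)` and `duplicateRate(12, 39763)` agree with the printed 0.5818, 0.0212 and 0.0003.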

VinInn · 2019-10-28T16:48:15Z

HeterogeneousCore/CUDAUtilities/interface/copyAsync.h

  }

  template <typename T>
  inline void copyAsync(cudautils::host::unique_ptr<T>& dst,
                        const cudautils::device::unique_ptr<T>& src,
                        cudaStream_t stream) {
    static_assert(std::is_array<T>::value == false, "For array types, use the other overload with the size parameter");
-    cuda::memory::async::copy(dst.get(), src.get(), sizeof(T), stream);
+    cudaCheck(cudaMemcpyAsync(dst.get(), src.get(), sizeof(T), cudaMemcpyHostToDevice, stream));


this is device2host

And "Calling cudaMemcpyAsync() with dst and src pointers that do not match the direction of the copy results in an undefined behavior." (*), so specifying the direction explicitly is actually harmful?

(*) https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1g85073372f776b4c4d5f89f7124b7bf79

Indeed. I think we agreed to remove all explicit directions.

"Calling cudaMemcpyAsync() with dst and src pointers that do not match the direction of the copy results in an undefined behavior."

I thought it was supposed to crash...
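To make the mismatch concrete, here is a small sketch of the consistency rule behind the quoted sentence. It does not call the CUDA runtime: `Space`, `CopyKind` and `kindMatches` are hypothetical stand-ins (the real enum is cudaMemcpyKind), used only to show why the flag in the diff disagrees with the pointer types of this overload and why an inferred direction cannot disagree.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real enum is cudaMemcpyKind from the CUDA runtime.
enum class Space { Host, Device };
enum class CopyKind { HostToDevice, DeviceToHost, Default };

// True when the declared kind agrees with where dst and src actually live.
// Default always agrees, because it asks the runtime to infer the direction.
constexpr bool kindMatches(Space dst, Space src, CopyKind kind) {
  return kind == CopyKind::Default
             ? true
             : kind == CopyKind::HostToDevice
                   ? (dst == Space::Device && src == Space::Host)
                   : (dst == Space::Host && src == Space::Device);
}

// The copyAsync overload under review takes a host dst and a device src.
static_assert(!kindMatches(Space::Host, Space::Device, CopyKind::HostToDevice),
              "the flag in the diff is the wrong way round");
static_assert(kindMatches(Space::Host, Space::Device, CopyKind::DeviceToHost),
              "device-to-host is the matching explicit flag");
static_assert(kindMatches(Space::Host, Space::Device, CopyKind::Default),
              "an inferred direction cannot mismatch");
```

In the real API the counterpart of `CopyKind::Default` is cudaMemcpyDefault, which the CUDA documentation allows only on systems with unified virtual addressing.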

VinInn · 2019-10-28T16:48:38Z

HeterogeneousCore/CUDAUtilities/interface/copyAsync.h

  }

  template <typename T>
  inline void copyAsync(cudautils::host::unique_ptr<T[]>& dst,
                        const cudautils::device::unique_ptr<T[]>& src,
                        size_t nelements,
                        cudaStream_t stream) {
-    cuda::memory::async::copy(dst.get(), src.get(), nelements * sizeof(T), stream);
+    cudaCheck(cudaMemcpyAsync(dst.get(), src.get(), nelements * sizeof(T), cudaMemcpyHostToDevice, stream));


fwyzard · 2019-10-28T17:29:26Z

Validation summary

Reference release CMSSW_11_0_0_pre7 at 411b633
Development branch CMSSW_11_0_X_Patatrack at 617f9a0
Testing PRs:

Replace CUDA API wrapper memory operations with native CUDA calls #395 at 49d83d6

Validation plots

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 10824.5
tracking validation plots and summary for workflow 10824.51
tracking validation plots and summary for workflow 10824.52

/RelValZMM_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 10824.5
tracking validation plots and summary for workflow 10824.51
tracking validation plots and summary for workflow 10824.52

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_design_v3-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 10824.5
tracking validation plots and summary for workflow 10824.51
tracking validation plots and summary for workflow 10824.52

Throughput plots

/EphemeralHLTPhysics1/Run2018D-v1/RAW run=323775 lumi=53

logs and `nvprof`/`nvvp` profiles

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

reference release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.51
- ✔️ step3.py: log
development release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 136.86452
testing release, workflow 10824.5
- ✔️ step3.py: log
testing release, workflow 10824.51
- ✔️ step3.py: log
testing release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 136.86452

/RelValZMM_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

reference release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.51
- ✔️ step3.py: log
development release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 136.86452
testing release, workflow 10824.5
- ✔️ step3.py: log
testing release, workflow 10824.51
- ✔️ step3.py: log
testing release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 136.86452

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_design_v3-v1/GEN-SIM-DIGI-RAW

reference release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.5
- ✔️ step3.py: log
development release, workflow 10824.51
- ✔️ step3.py: log
development release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
development release, workflow 136.86452
testing release, workflow 10824.5
- ✔️ step3.py: log
testing release, workflow 10824.51
- ✔️ step3.py: log
testing release, workflow 10824.52
- ✔️ step3.py: log
- ✔️ profile.py: log
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 136.86452

Logs

The full log is available at https://patatrack.web.cern.ch/patatrack/validation/pulls/4b8bedbe5a9102199e1d28c223042e43fcda503d/log .

fwyzard · 2019-10-28T23:50:11Z

OK, now it looks better.
The same summary comparison now gives identical results:

| | development-10824.5 | development-10824.52 | testing-10824.52 |
|---|---|---|---|
| Number of TrackingParticles (after cuts) | 4605 | 5017 | 5017 |
| Number of matched TrackingParticles | 2346 | 2790 | 2790 |
| Number of tracks | 3410 | 4416 | 4416 |
| Number of true tracks | 3025 | 3905 | 3905 |
| Number of fake tracks | 385 | 511 | 511 |
| Number of pileup tracks | 0 | 0 | 0 |
| Number of duplicate tracks | 44 | 0 | 0 |

and all the others show identical or almost identical results.

makortel reviewed Oct 25, 2019

waredjeb added 2 commits October 26, 2019 22:48

Solve conflicts with #389

2ee863c

Solve conflicts with #389

ede0cdd

fwyzard changed the title ~~Replace cuda::memory[::async]::copy() with cudaMemcpy[Async](), cuda:…~~ Replace CUDA API wrapper memory operations with native CUDA calls Oct 26, 2019

fwyzard added 5 commits October 26, 2019 23:41

Delete spurious file

a731751

Delete spurious file

c00e7bd

Delete spurious file

3b7e845

Delete spurious file

61d12f5

Whitespaces

b17e7f9

fwyzard requested changes Oct 27, 2019

waredjeb added 7 commits October 28, 2019 09:25

Merge branch 'CMSSW_11_0_X_Patatrack' of https://github.com/cms-patat…

6e103c8

…rack/cmssw into replace_cuda_memory

Merge branch 'replace_cuda_memory' of https://github.com/waredjeb/cmssw…

51d5cc3

… into replace_cuda_memory Updating changes

Wrap cudaMem calls in call to cudaCheck

01bb995

Fix errors, missing include of launch.h

c7b7f03

Apply code-format

e007a5e

Reorders memory copy operations

9a1ca24

Reoders memory copy in Device_to_Host section

463b495

fwyzard approved these changes Oct 28, 2019

makortel mentioned this pull request Oct 28, 2019

Remove the use of CUDA API wrappers #386

Closed

20 tasks

fwyzard requested a review from VinInn October 28, 2019 16:42

VinInn reviewed Oct 28, 2019

Fix direction of the copies from device to host

49d83d6

fwyzard merged commit 6bfe94f into cms-patatrack:CMSSW_11_0_X_Patatrack Oct 29, 2019

fwyzard mentioned this pull request Oct 30, 2019

Understand non-perfect reproducibility in tracks to PV association #397

Open

