shared memory support for SonicTriton #33801

kpedro88 · 2021-05-20T22:56:20Z

PR description:

- Added allocate() function to streamline creation of input vectors
- Expanded unit testing: there are now two tests, one for CPU and one for GPU (each tests both gRPC and shared memory)
- Shared memory is used automatically (by default) for the local fallback server (either CPU or GPU)
- Added useSharedMemory client config parameter to disable this for a specific algorithm/producer
- Shared memory regions are reused from one event to the next and only reallocated if the existing region is too small; this is necessary to achieve performance improvements (otherwise the cost of reallocating, e.g. on every event, dwarfs any improvement)
- Documentation updated accordingly
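
The region reuse policy described in the list above can be sketched as follows. This is only an illustration under stated assumptions, not the SonicTriton implementation: the class name `ReusableRegion` and its methods are hypothetical, and `malloc` stands in for the actual shared memory allocation so that the sketch runs anywhere.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a shared memory region. malloc/free are used
// only to keep the sketch portable; a real client would create and
// register a /dev/shm or CUDA IPC region instead.
class ReusableRegion {
public:
  ReusableRegion() = default;
  ReusableRegion(const ReusableRegion&) = delete;
  ReusableRegion& operator=(const ReusableRegion&) = delete;
  ~ReusableRegion() { std::free(data_); }

  // Make sure at least `bytes` are available.
  // Returns true if a (re)allocation was needed, false if the
  // existing region was large enough and has been reused.
  bool prepare(std::size_t bytes) {
    if (bytes <= capacity_)
      return false;  // reuse: no allocation cost for this event
    void* fresh = std::malloc(bytes);
    if (fresh == nullptr)
      throw std::bad_alloc();
    std::free(data_);  // contents are not preserved across a resize
    data_ = fresh;
    capacity_ = bytes;
    ++allocations_;
    return true;
  }

  void* data() const { return data_; }
  std::size_t capacity() const { return capacity_; }
  unsigned allocations() const { return allocations_; }

private:
  void* data_ = nullptr;
  std::size_t capacity_ = 0;
  unsigned allocations_ = 0;
};
```

With this policy, a sequence of events whose input sizes stay at or below the largest size seen so far triggers no further allocations; only an event with a larger input pays the reallocation cost.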

PR validation:

Confirmed that the same outputs are achieved whether using gRPC or shared memory, and unit tests pass as expected.

Tested performance of the two example models (ResNet50 and Graph Attention Network):

- CPU shared memory: 3-12% faster (client-side), 3-4% faster (server-side)
- GPU shared memory: 15-20% faster (client-side), 10-45% faster (server-side)

The performance improvements depend on the amount of data being transferred, as well as the size of the model (which controls how long the inference takes).

The impact of these latency decreases on throughput will be tested in the near future, once realistic workflows are prepared.

Technical details: this branch is squashed from several previous development branches.

Requires: cms-sw/cmsdist#6929

kpedro88 · 2021-05-20T22:57:07Z

test parameters
pull_request = cms-sw/cmsdist#6929

cmsbuild · 2021-05-20T23:02:33Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-33801/22787

This PR adds an extra 44KB to repository

cmsbuild · 2021-05-20T23:02:56Z

A new Pull Request was created by @kpedro88 (Kevin Pedro) for master.

It involves the following packages:

HeterogeneousCore/SonicCore
HeterogeneousCore/SonicTriton

@makortel, @cmsbuild, @fwyzard can you please review it and eventually sign? Thanks.
@makortel, @riga, @rovere this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

kpedro88 · 2021-05-20T23:08:10Z

please test

makortel · 2021-05-21T00:59:02Z

The term "shared memory" tends to be somewhat overloaded, could you clarify what exactly it means in this context?

makortel · 2021-05-21T01:21:39Z

> The term "shared memory" tends to be somewhat overloaded, could you clarify what exactly it means in this context?

Ok, I suppose this is the memory that can be shared between processes.

cmsbuild · 2021-05-21T05:10:58Z

-1

Failed Tests: HeaderConsistency
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8660a4/15222/summary.html
COMMIT: 25f67dd
CMSSW: CMSSW_12_0_X_2021-05-20-2300/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/33801/15222/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 8 differences found in the comparisons
DQMHistoTests: Total files compared: 37
DQMHistoTests: Total histograms compared: 2650486
DQMHistoTests: Total failures: 12
DQMHistoTests: Total nulls: 1
DQMHistoTests: Total successes: 2650451
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: -0.004 KiB( 36 files compared)
DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
Checked 155 log files, 37 edm output root files, 37 DQM output files
TriggerResults: no differences found

kpedro88 · 2021-05-21T14:01:56Z

@makortel yes, it's shared memory for inter-process communication. For CPU, it uses /dev/shm, while for GPU, it uses cudaIpcMemHandle_t.
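
For the CPU case mentioned above, a minimal sketch of inter-process shared memory backed by /dev/shm on Linux could look like the following. It uses the standard POSIX calls `shm_open`, `ftruncate`, and `mmap`; the helper names `mapSharedRegion`, `unmapSharedRegion`, and `roundTrip` are hypothetical and do not come from SonicTriton or the Triton client library. The GPU case with cudaIpcMemHandle_t is not shown. On older glibc versions, linking may require `-lrt`.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <cstring>
#include <string>

// Open (and optionally create) a named POSIX shared memory object and
// map it into this process. On Linux the object appears under /dev/shm.
// Returns nullptr on failure.
void* mapSharedRegion(const std::string& name, std::size_t bytes, bool create) {
  const int flags = create ? (O_CREAT | O_RDWR) : O_RDWR;
  const int fd = shm_open(name.c_str(), flags, 0600);
  if (fd < 0)
    return nullptr;
  // A newly created object has size zero, so it must be sized first.
  if (create && ftruncate(fd, static_cast<off_t>(bytes)) != 0) {
    close(fd);
    return nullptr;
  }
  void* addr = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  close(fd);  // the mapping stays valid after the descriptor is closed
  return addr == MAP_FAILED ? nullptr : addr;
}

void unmapSharedRegion(void* addr, std::size_t bytes) {
  if (addr != nullptr)
    munmap(addr, bytes);
}

// Map the same name twice, as a writer and a reader would in two
// processes, and check that the reader sees the writer's bytes.
bool roundTrip(const std::string& name) {
  constexpr std::size_t bytes = 4096;
  const char message[] = "tensor bytes";
  void* writer = mapSharedRegion(name, bytes, true);
  if (writer == nullptr)
    return false;
  std::memcpy(writer, message, sizeof(message));
  void* reader = mapSharedRegion(name, bytes, false);
  const bool same = reader != nullptr && std::memcmp(reader, message, sizeof(message)) == 0;
  unmapSharedRegion(reader, bytes);
  unmapSharedRegion(writer, bytes);
  shm_unlink(name.c_str());  // remove the name once both sides are done
  return same;
}
```

The point of the design is that only the region name and size need to travel over the regular client-server channel; the payload itself is written once and read in place, with no copy through the network stack.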

cmsbuild · 2021-05-21T14:04:07Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-33801/22801

This PR adds an extra 44KB to repository

cmsbuild · 2021-06-10T19:48:32Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-33801/23249

This PR adds an extra 84KB to repository
Found files with invalid states:
- HeterogeneousCore/SonicTriton/interface/grpc_client_gpu.h:
  - Added: 25f67dd
  - Deleted: b5e9749

cmsbuild · 2021-06-10T19:48:52Z

Pull request #33801 was updated. @makortel, @cmsbuild, @fwyzard can you please check and sign again.

kpedro88 · 2021-06-10T19:49:44Z

please test

cmsbuild · 2021-06-11T10:10:33Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8660a4/15887/summary.html
COMMIT: d92c6b4
CMSSW: CMSSW_12_0_X_2021-06-10-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/33801/15887/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

minor fix to crab client cmsdist#7009 @belforte: minor fix to crab client
Bring the -mcpu=power8 flag cmsdist#7002 @cms-sw: Bring the -mcpu=power8 flag

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8660a4/15887/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8660a4/15887/git-merge-result

Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 4 differences found in the comparisons
DQMHistoTests: Total files compared: 38
DQMHistoTests: Total histograms compared: 2862520
DQMHistoTests: Total failures: 7
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2862491
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 37 files compared)
Checked 160 log files, 37 edm output root files, 38 DQM output files
TriggerResults: no differences found

kpedro88 · 2021-06-14T21:28:56Z

@makortel any further review?

makortel · 2021-06-15T13:29:17Z

HeterogeneousCore/CUDAUtilities/interface/cudaCheck.h

@@ -21,19 +23,22 @@ namespace cms {
                                              const char* cmd,
                                              const char* error,
                                              const char* message,
-                                              const char* description = nullptr) {
+                                              std::string_view description = std::string_view()) {


@fwyzard Do you see any potential problems in using std::string_view here? (All relevant compilers for CUDA should have supported C++17 for some time already, right?)

fwyzard · 2021-06-17

Sorry for the delay, the TDR has kept me fully busy for the last few days (...).

My main concern was what happens in the vast majority of the cases, when no description is passed. Both @makortel and I have made some checks on godbolt, and it looks like the compiler should optimise the std::string_view away in that case.

As for C++17 vs earlier versions of the standard: yes, CUDA 11 fully supports C++17, so no problem there either.
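
The signature change under discussion can be illustrated with a small, self-contained sketch. The function `formatError` and its message layout are hypothetical and only loosely modeled on the parameters visible in the diff above; the actual body of the function in cudaCheck.h is not reproduced here. It requires C++17.

```cpp
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper: the optional description defaults to an empty
// std::string_view instead of a const char* defaulting to nullptr.
std::string formatError(const char* cmd,
                        const char* error,
                        const char* message,
                        std::string_view description = std::string_view()) {
  std::ostringstream out;
  out << cmd << " failed: " << error << ": " << message;
  // An emptiness check replaces the former nullptr check.
  if (!description.empty())
    out << "\n" << description;
  return out.str();
}
```

A caller can omit the description, pass a string literal, or pass a std::string built at runtime without calling c_str(), since std::string converts implicitly to std::string_view.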

kpedro88 · 2021-06-17T13:31:23Z

@makortel @fwyzard I'm happy to address any further review comments, but if the review is finished, I would like to get this merged so that other ongoing developments can rebase on top of it.

fwyzard · 2021-06-17T17:55:07Z

+heterogeneous

cmsbuild · 2021-06-17T17:55:34Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

qliphy · 2021-06-18T00:41:47Z

+1

shared memory support for SonicTriton

25f67dd

cmsbuild added this to the CMSSW_12_0_X milestone May 20, 2021

cmsbuild added code-checks-pending heterogeneous-pending orp-pending pending-signatures tests-pending labels May 20, 2021

cmsbuild added the requires-external label May 20, 2021

cmsbuild added code-checks-approved and removed code-checks-pending labels May 20, 2021

cmsbuild added tests-started and removed tests-pending labels May 20, 2021

cmsbuild added tests-rejected and removed tests-started labels May 21, 2021

fix headers

20ab568

cmsbuild added code-checks-pending tests-pending and removed code-checks-approved tests-rejected labels May 21, 2021

cmsbuild added code-checks-approved and removed code-checks-pending labels May 21, 2021

cmsbuild added code-checks-rejected and removed code-checks-pending labels Jun 10, 2021

use function template instead of macro

d92c6b4

cmsbuild added code-checks-pending and removed code-checks-rejected labels Jun 10, 2021

cmsbuild added code-checks-approved and removed code-checks-pending labels Jun 10, 2021

cmsbuild added tests-started and removed tests-pending labels Jun 10, 2021

cmsbuild added tests-approved and removed tests-started labels Jun 11, 2021

makortel reviewed Jun 15, 2021

cmsbuild added fully-signed heterogeneous-approved and removed heterogeneous-pending pending-signatures labels Jun 17, 2021

cmsbuild added orp-approved and removed orp-pending labels Jun 18, 2021

cmsbuild merged commit 55d652b into cms-sw:master Jun 18, 2021

kpedro88 mentioned this pull request Jun 22, 2021

Initial Synch of SONIC Particle Net fastmachinelearning/cmssw#3

Merged

kpedro88 mentioned this pull request Jul 16, 2021

SonicTriton feature updates, improvements, bug fixes #34508

Merged
