shared memory support for SonicTriton #33801
test parameters
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-33801/22787
A new Pull Request was created by @kpedro88 (Kevin Pedro) for master.
It involves the following packages: HeterogeneousCore/SonicCore
@makortel, @cmsbuild, @fwyzard can you please review it and eventually sign? Thanks.
cms-bot commands are listed here
please test
The term "shared memory" tends to be somewhat overloaded, could you clarify what exactly it means in this context?
Ok, I suppose this is the memory that can be shared between processes.
-1 Failed Tests: HeaderConsistency
@makortel yes, it's shared memory for inter-process communication. For CPU, it uses
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-33801/22801
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-33801/23249
please test
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8660a4/15887/summary.html The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
@makortel any further review?
@@ -21,19 +23,22 @@ namespace cms {
 const char* cmd,
 const char* error,
 const char* message,
-const char* description = nullptr) {
+std::string_view description = std::string_view()) {
@fwyzard Do you see any potential problems in using std::string_view here? (All relevant compilers for CUDA should have supported C++17 for some time already, right?)
Sorry for the delay, the TDR has kept me fully busy the last few days (...).
My main concern was what happens in the vast majority of cases, when no description is passed. Both @makortel and I have made some checks on godbolt, and it looks like the compiler should optimise the std::string_view away in that case.
As for C++17 vs earlier versions of the standard: yes, CUDA 11 fully supports C++17, so no problem there either.
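To make the pattern under review concrete, here is a minimal sketch of a function that takes an optional description as a defaulted std::string_view. The helper name formatError and its message layout are hypothetical and are not the CMSSW code touched by this diff; the sketch only shows how an empty default replaces the old nullptr check. It does not by itself demonstrate the optimisation mentioned above, which was checked separately on godbolt.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not the CMSSW function): builds an error string and
// appends the optional description only when the caller supplied one.
inline std::string formatError(const char* cmd,
                               const char* error,
                               const char* message,
                               std::string_view description = std::string_view()) {
  std::string out = std::string(cmd) + ": " + error + ": " + message;
  // A default-constructed string_view is empty, so this test plays the role
  // of the former "description != nullptr" check.
  if (!description.empty()) {
    out += " (";
    out.append(description.data(), description.size());
    out += ")";
  }
  return out;
}
```

A caller can pass a string literal, a std::string, or nothing at all; in the last case the branch is skipped and no description is appended.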
+heterogeneous
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)
+1
PR description:
- allocate() function to streamline creation of input vectors
- useSharedMemory client config parameter to disable this for a specific algorithm/producer

PR validation:
Confirmed that the same outputs are achieved whether using gRPC or shared memory, and unit tests pass as expected.
Tested performance of the two example models (ResNet50 and Graph Attention Network):
The performance improvements depend on the amount of data being transferred, as well as the size of the model (which controls how long the inference takes).
The impact of these latency decreases on throughput will be tested in the near future, once realistic workflows are prepared.
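As background on the term "shared memory" in the inter-process sense used in this PR, the following is a generic POSIX sketch and not the SonicTriton or Triton client implementation. It assumes a Linux-like system and uses an anonymous shared mapping plus fork(); the function name produceInChildReadInParent is hypothetical. One process writes floats into the region and the other reads them in place, so the payload is never serialized through a socket.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstddef>
#include <vector>

// Hypothetical demo: the child process fills a shared region and the parent
// reads it back. Returns the values seen by the parent, or an empty vector
// on failure.
inline std::vector<float> produceInChildReadInParent(std::size_t n) {
  const std::size_t bytes = n * sizeof(float);
  if (bytes == 0)
    return {};

  // MAP_SHARED makes writes visible across fork(); MAP_ANONYMOUS means the
  // region is not backed by a named file.
  void* region = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (region == MAP_FAILED)
    return {};
  float* data = static_cast<float*>(region);

  const pid_t pid = fork();
  if (pid < 0) {
    munmap(region, bytes);
    return {};
  }
  if (pid == 0) {
    // Child: write directly into the shared region, then exit immediately.
    for (std::size_t i = 0; i < n; ++i)
      data[i] = 0.5f * static_cast<float>(i);
    _exit(0);
  }

  // Parent: wait for the child, then read the data it produced.
  int status = 0;
  waitpid(pid, &status, 0);
  std::vector<float> result;
  if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
    result.assign(data, data + n);
  munmap(region, bytes);
  return result;
}
```

Unrelated processes such as a client and a separate server cannot rely on fork() and would need a named region instead; that variant is left out here to keep the sketch short.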
Technical details: this branch is squashed from several previous development branches.
Requires: cms-sw/cmsdist#6929