-
Notifications
You must be signed in to change notification settings - Fork 699
Update NNPI backend to 0.5.1.8 #4397
Conversation
Hi @omromano! Thank you for your pull request and welcome to our community.We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks! |
@arunm-git , @jfix71 , please take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DRT case makes sense here, but what would happen in a 3 partition setup A->(m)B and A->(n)C where m and n are the same PH?
Another use case is A->B but B is duplicated on 2 devices. A->B1 and A->B2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC P2P only support 1-1 connections - i.e. no broadcast.
So for the case you mentioned you'll need to add another PH on A which is a copy of the existing one and connect one to B1 and the other to B2.
Let me verify that and get back to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is it possible to avoid the getPlaceholderByName() call in the execute path? The PH* is constant for the life of the network.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This happens only on the first time and is cached in netInputPlaceholders_ (see the condition block its in)
lib/Backends/NNPI/InferencePool.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verified with Garret 👍 . An executionContext can only be used in one inference request at a time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a bit weird -- so ICEREF is broken for this test but device isn't?
1390d93
to
f434821
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arunm-git has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Update of NNPI Backend to v0.5.1.8. Pull Request resolved: pytorch/glow#4397 Reviewed By: jfix71 Differential Revision: D20938791 Pulled By: arunm-git fbshipit-source-id: 4f50d104db1ac274e92d9fce6fb86afd930d8511
@arunm-git merged this pull request in 5e3544a. |
NNPIBackendOptions NNPIBackend::backendOptions_; | ||
NNPIAdapterContainer NNPIBackend::adapter_; | ||
|
||
unsigned NNPIBackend::numDevices() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@omromano @jsubag
This change to numDevices reverts a change I made to your last PR for 5.1.4.
numDevices as it is coded will leak memory if run on a host that doesn't have a card attached (adapterCreate will allocate, adapterGetInfo will fail and we return without calling adapterDestroy). numDevices() on github master fixes this - please update your internal sources for that change so we don't try to revert it on the next PR.
deviceId_(config_.deviceID), adapter_(adapter), | ||
device_(NNPI_INVALID_NNPIHANDLE), deviceOptions_(deviceOptions) { | ||
deviceId_(config_.deviceID), device_(NNPI_INVALID_NNPIHANDLE), | ||
deviceOptions_(deviceOptions), adapter_(adapter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arunm-git
Sorry we missed that. Will make sure it updated in our code (and not reverted in the next PR).
Summary: Update of NNPI Backend to v0.5.1.8. Pull Request resolved: pytorch#4397 Reviewed By: jfix71 Differential Revision: D20938791 Pulled By: arunm-git fbshipit-source-id: 4f50d104db1ac274e92d9fce6fb86afd930d8511
Summary:
Update of NNPI Backend to v0.5.1.8.