Skip to content
This repository was archived by the owner on Jul 1, 2025. It is now read-only.

Conversation

omromano
Copy link
Contributor

@omromano omromano commented Apr 7, 2020

Summary:
Update of NNPI Backend to v0.5.1.8.

@facebook-github-bot
Copy link

Hi @omromano!

Thank you for your pull request and welcome to our community.We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@jsubag
Copy link

jsubag commented Apr 7, 2020

@arunm-git , @jfix71 , please take a look.

@omromano omromano changed the title Update backend to 0.5.1.8 Update NNPI backend to 0.5.1.8 Apr 7, 2020
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DRT case makes sense here, but what would happen in a 3 partition setup A->(m)B and A->(n)C where m and n are the same PH?
Another use case is A->B but B is duplicated on 2 devices. A->B1 and A->B2.

Copy link

@jsubag jsubag Apr 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC P2P only support 1-1 connections - i.e. no broadcast.
So for the case you mentioned you'll need to add another PH on A which is a copy of the existing one and connect one to B1 and the other to B2.
Let me verify that and get back to you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Is it possible to avoid the getPlaceholderByName() call in the execute path? The PH* is constant for the life of the network.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens only on the first time and is cached in netInputPlaceholders_ (see the condition block its in)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified with Garret 👍 . An executionContext can only be used in one inference request at a time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a bit weird -- so ICEREF is broken for this test but device isn't?

@omromano omromano force-pushed the 0518_pr branch 4 times, most recently from 1390d93 to f434821 Compare April 8, 2020 12:03
Copy link
Contributor

@jfix71 jfix71 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @omromano @jsubag

Copy link

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arunm-git has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit to pytorch/pytorch that referenced this pull request Apr 21, 2020
Summary:
Update of NNPI Backend to v0.5.1.8.
Pull Request resolved: pytorch/glow#4397

Reviewed By: jfix71

Differential Revision: D20938791

Pulled By: arunm-git

fbshipit-source-id: 4f50d104db1ac274e92d9fce6fb86afd930d8511
@facebook-github-bot
Copy link

@arunm-git merged this pull request in 5e3544a.

NNPIBackendOptions NNPIBackend::backendOptions_;
NNPIAdapterContainer NNPIBackend::adapter_;

unsigned NNPIBackend::numDevices() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@omromano @jsubag
This change to numDevices reverts a change I made to your last PR for 5.1.4.
numDevices as it is coded will leak memory if run on a host that doesn't have a card attached (adapterCreate will allocate, adapterGetInfo will fail and we return without calling adapterDestroy). numDevices() on github master fixes this - please update your internal sources for that change so we don't try to revert it on the next PR.

deviceId_(config_.deviceID), adapter_(adapter),
device_(NNPI_INVALID_NNPIHANDLE), deviceOptions_(deviceOptions) {
deviceId_(config_.deviceID), device_(NNPI_INVALID_NNPIHANDLE),
deviceOptions_(deviceOptions), adapter_(adapter) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@omromano @jsubag
This ordering was also an intentional change I made to your last PR. Without it there is a build warning/error due to incorrect initialization order. Same comment as above - please pull what's in github master into your tree so we don't try to revert it on the next PR. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arunm-git
Sorry we missed that. Will make sure it updated in our code (and not reverted in the next PR).

vdantu pushed a commit to vdantu/glow that referenced this pull request Jul 12, 2020
Summary:
Update of NNPI Backend to v0.5.1.8.
Pull Request resolved: pytorch#4397

Reviewed By: jfix71

Differential Revision: D20938791

Pulled By: arunm-git

fbshipit-source-id: 4f50d104db1ac274e92d9fce6fb86afd930d8511
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants