Turn on BUILD_NAMEDTENSOR permanently #26264

zou3519 · 2019-09-16T03:13:43Z

Stack from ghstack:

#26264 Turn on BUILD_NAMEDTENSOR permanently

This PR enables BUILD_NAMEDTENSOR by default by including a header,
c10/core/EnableNamedTensor, that sets BUILD_NAMEDTENSOR. The plan is to
get rid of the flag entirely in the future: we can incrementally delete
its usages after this PR goes in.
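A minimal sketch of how a flag-setting header of this kind could work. Only the macro name BUILD_NAMEDTENSOR comes from this PR; the guard shape and the helper `named_tensor_compiled_in` are illustrative assumptions, not the actual contents of c10/core/EnableNamedTensor.

```cpp
// Sketch only: the real header's contents may differ.

// --- what a flag-setting header could contain ---
#ifndef BUILD_NAMEDTENSOR
#define BUILD_NAMEDTENSOR
#endif

// --- a call site that is still guarded by the flag ---
// Hypothetical helper, used here only to make the flag's effect observable.
inline bool named_tensor_compiled_in() {
#ifdef BUILD_NAMEDTENSOR
  return true;   // named tensor code paths are compiled in
#else
  return false;  // not reachable once the define above is in effect
#endif
}
```

Once every translation unit sees the define, each `#ifdef BUILD_NAMEDTENSOR` guard is always taken, which is what would allow the guards to be deleted one at a time.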

This PR also keeps the distinction between the namedtensor CI and the
regular CI. test/test_namedtensor.py only runs if TEST_NAMEDTENSOR=1 is
specified, and TEST_NAMEDTENSOR=1 is set on the namedtensor CI. I'll
remove this distinction later and send out an announcement about it;
devs will be responsible for named tensor failures after that.

We originally introduced the BUILD_NAMEDTENSOR flag so that we could
quickly prototype named tensor features without worrying about adding
overhead to the framework. That overhead falls into two categories:
memory overhead and performance overhead.

Memory overhead: named tensors add 1 additional word per Tensor. This
is because TensorImpl stores a unique_ptr<NamedTensorMetaInterface>
field. This is not a lot of overhead.
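To illustrate the one-word claim, here is a small sketch that compares two reduced structs. `ImplWithoutNames`, `ImplWithNames`, their fields, and `extra_bytes_per_tensor` are hypothetical stand-ins, not the real TensorImpl layout. It also assumes a `unique_ptr` with the default deleter is pointer-sized, which holds on common standard library implementations but is not guaranteed by the C++ standard.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical stand-in for the interface named in the PR description.
struct NamedTensorMetaInterface {
  virtual ~NamedTensorMetaInterface() = default;
};

// Hypothetical, heavily reduced stand-ins for TensorImpl.
struct ImplWithoutNames {
  void* storage;
  void* sizes;
};

struct ImplWithNames {
  void* storage;
  void* sizes;
  // Null for unnamed tensors, so no names object is allocated for them.
  std::unique_ptr<NamedTensorMetaInterface> named_tensor_meta;
};

// Extra bytes per object attributable to the added field.
inline std::size_t extra_bytes_per_tensor() {
  return sizeof(ImplWithNames) - sizeof(ImplWithoutNames);
}
```

Under those assumptions `extra_bytes_per_tensor()` equals `sizeof(void*)`, i.e. one machine word, and an unnamed tensor pays only for the null pointer.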

Performance overhead: At all entry points to name inference, we check
if inputs to an op are named. If inputs are not named, we short-circuit
and don't do name inference. These calls should therefore be as
efficient as error-checking code and not take up a lot of time.
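The short-circuit described above could look roughly like the sketch below. All names here (`NamesSketch`, `TensorSketch`, `infer_names_sketch`) are hypothetical, and the slow path is a placeholder that simply copies names from the first named input; it does not reproduce the real name inference rules.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical container for dimension names.
struct NamesSketch {
  std::vector<std::string> names;
};

// Hypothetical tensor stand-in: a null pointer means "unnamed".
struct TensorSketch {
  std::unique_ptr<NamesSketch> named_meta;
  bool has_names() const { return named_meta != nullptr; }
};

// Returns true and fills `out` only when name inference actually ran.
inline bool infer_names_sketch(const TensorSketch& self,
                               const TensorSketch& other,
                               std::vector<std::string>* out) {
  // Fast path: two null checks, then return without touching `out`.
  if (!self.has_names() && !other.has_names()) {
    return false;
  }
  // Slow path (placeholder): copy names from the first named input.
  const TensorSketch& src = self.has_names() ? self : other;
  *out = src.named_meta->names;
  return true;
}
```

For unnamed inputs the cost is a pair of pointer comparisons, which is the sense in which the check is comparable to ordinary error-checking code.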

My plan is to benchmark a few functions and then post the results in a
comment to this PR.

Test Plan:

[namedtensor ci]

Differential Revision: D17392428

ghstack-source-id: 8346d1813960f6b10342a2e63cd0d69ed9c58986
Pull Request resolved: #26264

pytorchbot added the "module: internals" label (Related to internal abstractions in c10 and ATen) Sep 16, 2019

zou3519 closed this Sep 17, 2019

zou3519 mentioned this pull request Oct 18, 2019

Illegal instruction 4 in 1.3, OSX & GPU #27627

Closed

facebook-github-bot deleted the gh/zou3519/170/head branch October 28, 2019 22:24
