
[FX] Changes done internally at Facebook #1456

Merged
merged 7 commits into from
Nov 28, 2022

Conversation

frank-wei
Contributor


Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

d269be2fc7d84738a642d1d53eb44e6886a28d0c Alex Beloi <alexbeloi@fb.com> [fx] add deferred weights (xl_weight) and tracing for xl_embedding_bag
6f233bc9c72d90a908db0548c9d2dbe853895137 Alex Beloi <alexbeloi@fb.com> [fx] fix out of bounds indices/offsets for embedding_bag ops with xl_weight
3ca3b21c6a85ab9a6e9de503d0f13ee713a7b67c Janet Yang <qxy11@fb.com> Support div, torch.norm
52955d93d25e857510ed1b765220e8e5b0b0bb08 Janet Yang <qxy11@fb.com> Pass to replace sum(elmtwise(X))/numel(X) w/ mean(elmtwise(X))
89c56ef76a7a329f244a013ac5ccb099cb00c3c0 Janet Yang <qxy11@fb.com> Support scalar clamp, fixes for nan_to_num and benchmark
afdc533da031a64e162bb08c8629ff38739e24f8 Wei Wei <wwei6@fb.com> [fx2trt] disable dispatch trace leaf node test
d160a7a5e554d37c142e13f100bf4d8739ced232 Wei Wei <wwei6@fb.com> add option to remove passes
c22f691e6eae1b06ecd301eb6285b32d5dc9717c Mike Iovine <mikeiovine@fb.com> [fx2trt] Support dict inputs in acc tracer
8c05a3c57b1f5c63108b979ef8c61411525d0b1f Mike Iovine <mikeiovine@fb.com> [fx2trt] Support namedtuple access in acc tracer getattr
ff2000594e3f3ff75e0074edf9c38b5609128bbd Janet Yang <qxy11@fb.com> Generalize remove split ops more
1580805d827eb40c941e769b0b99e7c6a3ed6f89 Wei Wei <wwei6@fb.com> [fx2trt] add reshape unit test
d6a975462071a3747d18edcbe87a3b143b3ece88 Archie Sravankumar <archishmans@fb.com> Added FX tracing for `log_softmax`
6943ac0e322077b36a03c50c4c9065de6cd32837 Sungmin Cho <sungmincho@fb.com> Add replace_mutable_op lower pass
baab27b81b1275de92fdaf760a158ce951564d33 Donglin Xia <doxia@fb.com> Register avg_pool3d for acc_op in acc_op.py
ae4c4e2c3c18d78542140fcc30e1c24f7c647ef3 Wei Wei <wwei6@fb.com> [aten2trt] init check-in
87ef03338c9a25c5a610a2eb590345e8935f8d75 Wei Wei <wwei6@fb.com> [aten2trt] add binary ops
2bb168517ace7e638cffc7a241b1cbf528790b92 Mike Iovine <mikeiovine@fb.com> [fx2trt] Add acc normalization blocklist
8c912e085cf8722d572698286020ae1ce055023d Zhijing Li (Accelerator Enablement) <tissue030@fb.com> Skip unstable test_conv_add_standalone_module
b80dca9c9afa3b7d253e7806f48a890b9f83bf04 Jonathan Amazon <jonamazon@fb.com> [PyTorch][FX][Compiler] Add acc_op tracing support for torch.baddbmm in FX
137a3977ffeb03d0387e8a95ff2f32f3d15b3de8 Wei Wei <wwei6@meta.com> [aten2trt] resnet support
fef54c237589a70c007c861e2d59c4052e3de054 Kefei Lu <kefeilu@meta.com> [easy] fx2xxx: fix fuse_parallel_linear which changes getitem slices from tuple to list
4b062ef361cd7797e72c51bb4dc41766aca7b6db Kefei Lu <kefeilu@meta.com> fx2trt: fix bad reshape pattern x.reshape(y.size(0), ...)
49573920892bb2fe75fe011a8cad9887bdc8bd04 Alex Beloi <alexbeloi@meta.com> [FX] add tracing for torch.detach
fe3cc75e775af53f603a83e8b4899b28f3cb6ddc Yinghai Lu <yinghai@meta.com> [fx2ait] add support to torch.clip
42c54d69c68dc58ac348647acada88b1e5634b40 Fei Kou <feikou@meta.com> Fix clamping float32 boundary values
e013621dedf5960f81b915cef8d2ce19ca349a7a Kefei Lu <kefeilu@meta.com> trt lower: change preset application logic to in-place instead of immutable update
adc9f8ff48c01a0ce70080c930221ac81f048563 Kefei Lu <kefeilu@meta.com> [easy]: fix another instance of [slice(), ...] to (slice(), ...)
a22e9ff2cc55eb8669690eedd6971be93a2a356b Rui Zhu <zrphercule@meta.com> Support NoneType in acc_tracing by setting its meta shape to be 1
4f54ce9283f02fe416ff3f502ef1a4e4f80c0f37 Mike Iovine <mikeiovine@meta.com> [fx2ait] Avoid extra copies from view ops
0baf42ebf6ce4146df1bee2d2e62fa2b77dbd7fb Mor Tzur <mortzur@meta.com> add torch.concat to acc_ops
9cd933707772b0f05b8aca62bcc813929bd52868 Shirong Wu <shirong@meta.com> replace assert_allclose with assert_close
e418d0653752022ea4ee186036b79dc8ca0ae87b Valeriu Lacatusu <valeriu@meta.com> [PyTorch][FX][Compiler] Add acc_op tracing support for torch.nn.functional.softplus in FX
afb2f560b3995ea3a1cd440df3cdd66d92472e46 Wei Wei <wwei6@meta.com> [fx2trt] test fix to adopt new interface of dynamo
8ca2307c744f13ef15bad49f5030dddd2b787b9d Huamin Li <huaminli@meta.com> rename test_setitem to test_setitem_trt
e0b75bbfda8604d4b60599ddba4d4aa7023887a5 Valeriu Lacatusu <valeriu@meta.com> [FX] Replace deprecated torch.testing.assert_allclose with torch.testing.assert_close
4a233da979a755fa605e9750c6035ed885597afa Valeriu Lacatusu <valeriu@meta.com> [PyTorch][FX][Compiler] Add acc_op tracing support for torch.ops._caffe2.RoIAlign in FX
@github-actions github-actions bot left a comment

Code conforms to C++ style guidelines
@frank-wei frank-wei changed the title Changes done internally at Facebook [FX] Changes done internally at Facebook Nov 17, 2022
@github-actions github-actions bot left a comment

Code conforms to Python style guidelines
@frank-wei frank-wei marked this pull request as ready for review November 17, 2022 05:51

@frank-wei
Copy link
Contributor Author

@narendasan as we discussed, we are OK to leave this PR out of the 1.3 release. But let's see what we can do to merge it into master.

@yinghai

yinghai commented Nov 22, 2022

What's the issue in terms of merging to master?

@narendasan
Collaborator

Should be fine to merge; we just needed a bit of time to make sure release/1.3 was close so that cherry-picking wasn't too painful. Though, any idea about the failing tests?

@frank-wei
Contributor Author

frank-wei commented Nov 23, 2022

What's the issue in terms of merging to master?

It seems that PT 1.14 has some compilation issues with the new interface.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/circleci/.pyenv/versions/3.9.4/lib/python3.9/site-packages/torch_tensorrt/__init__.py", line 85, in <module>
    from torch_tensorrt._compile import *
  File "/opt/circleci/.pyenv/versions/3.9.4/lib/python3.9/site-packages/torch_tensorrt/_compile.py", line 2, in <module>
    from torch_tensorrt import _enums
  File "/opt/circleci/.pyenv/versions/3.9.4/lib/python3.9/site-packages/torch_tensorrt/_enums.py", line 1, in <module>
    from torch_tensorrt._C import dtype, DeviceType, EngineCapability, TensorFormat
ImportError: /opt/circleci/.pyenv/versions/3.9.4/lib/python3.9/site-packages/torch_tensorrt/lib/libtorchtrt.so: undefined symbol: _ZN3c104cuda20CUDACachingAllocator9allocatorE

@yinghai

yinghai commented Nov 23, 2022

But this symbol issue doesn't seem to be specific to this diff.

[yinghai:~:]$ c++filt _ZN3c104cuda20CUDACachingAllocator9allocatorE
c10::cuda::CUDACachingAllocator::allocator

We might need to check with the pytorch core team on what's going on (cc @malfet ). It seems to be introduced by a recent change. I wonder if this symbol got optimized away (https://github.com/pytorch/pytorch/blob/29742786f38d4873576c73917e8509908132dae2/c10/cuda/CUDACachingAllocator.cpp#L2375).

A stupid way to work around this might be just adding

namespace c10::cuda::CUDACachingAllocator {
std::atomic<CUDAAllocator*> allocator{};
}

in some C++ code in libtorchtrt. But I think it'd be better if we can fix that in pytorch.

Btw, it seems weird that fx tests are also failing because of this. Supposedly, the fx path can be Python-only.
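To narrow this down, a quick diagnostic is to demangle the symbol and then check whether the installed torch libraries actually export it. This is a sketch only: the library path construction and the choice of `libc10_cuda.so` are assumptions about the install layout.

```shell
# Demangle the undefined symbol reported in the traceback
c++filt _ZN3c104cuda20CUDACachingAllocator9allocatorE
# prints: c10::cuda::CUDACachingAllocator::allocator

# If a PyTorch install is available, check whether libc10_cuda.so exports
# the symbol in its dynamic symbol table
LIB="$(python3 -c 'import torch, os; print(os.path.dirname(torch.__file__))' 2>/dev/null)/lib/libc10_cuda.so"
if [ -f "$LIB" ]; then
  nm -D "$LIB" | grep -F _ZN3c104cuda20CUDACachingAllocator9allocatorE || echo "symbol not exported"
fi
```

If the symbol shows up in `libc10_cuda.so` but the import still fails, that points at link/visibility on the libtorchtrt side rather than a missing definition in pytorch.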

@albanD

albanD commented Nov 23, 2022

@yinghai this symbol is already properly exposed in the header: https://github.com/pytorch/pytorch/blob/a188f05e8c1788d393c072868421991dfcb55b02/c10/cuda/CUDACachingAllocator.h#L219

@yinghai

yinghai commented Nov 23, 2022

Yeah, that's the part that makes the error weird.

@malfet

malfet commented Nov 23, 2022

Does the code depend on c10_cuda library? Because if it depends on just libtorch.so the symbol might not be re-exported (unless RTLD_GLOBAL loading mode is used)
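The RTLD_GLOBAL loading mode mentioned above can be sketched from Python with `ctypes`: preload the library that defines the symbol with RTLD_GLOBAL so later-loaded shared objects can resolve it. This is a hypothetical workaround sketch, not the fix adopted here; the library name `libc10_cuda.so` is an assumption about the install layout.

```python
import ctypes

def preload_with_global_symbols(libname):
    """Sketch: load a shared library with RTLD_GLOBAL so its symbols become
    visible to shared objects loaded afterwards (e.g. an extension module
    that otherwise fails with an undefined-symbol ImportError).
    Returns True if the library was found and loaded, False otherwise."""
    try:
        ctypes.CDLL(libname, mode=ctypes.RTLD_GLOBAL)
        return True
    except OSError:
        return False

# Hypothetical usage: preload c10_cuda before importing torch_tensorrt so
# that libtorchtrt.so could resolve c10::cuda::CUDACachingAllocator::allocator.
preload_with_global_symbols("libc10_cuda.so")
```

The same effect can be had without code changes via `LD_PRELOAD`, but either way it only papers over the visibility issue rather than fixing the export.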

@frank-wei
Contributor Author

Does the code depend on c10_cuda library? Because if it depends on just libtorch.so the symbol might not be re-exported (unless RTLD_GLOBAL loading mode is used)

cc @narendasan

@emcastillo

Looking at the date of the wheel being used, I think that this PR may have caused the issue
pytorch/pytorch#87251

@yinghai

yinghai commented Nov 27, 2022

Thanks folks. The undefined symbol issue is fixed in #1479.


@frank-wei frank-wei merged commit 97b6708 into master Nov 28, 2022
@frank-wei frank-wei deleted the fb-sync-wwei6 branch January 21, 2023 00:13
Labels
cla signed · component: api [Python] · component: fx · documentation · fx
7 participants