forked from pytorch/pytorch
-
Couldn't load subscription status.
- Fork 0
Send Pull Request #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Re-enable previously filtered op tests. Expecting lotsa failures. Should dtype also be wrapped in list? cc @mruberry, @suo Pull Request resolved: #77330 Approved by: https://github.com/suo
Removes azure_pipelines removal from create_release.yml Signed-off-by: Eli Uriegas <eliuriegasfb.com> Pull Request resolved: #77369 Approved by: https://github.com/suo, https://github.com/janeyx99
Signed-off-by: Eli Uriegas <eliuriegasfb.com> Pull Request resolved: #77370 Approved by: https://github.com/suo, https://github.com/janeyx99
This reverts commit 56bed0d. Reverted #76823 on behalf of https://github.com/rohan-varma
Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: #77376 Approved by: https://github.com/atalman, https://github.com/malfet
This infrastructure was used at some point but I do not believe it is used anymore. We should remove this to reduce the amount of confusion one has when trying to contribute to pytorch ci. Signed-off-by: Eli Uriegas <eliuriegasfb.com> Pull Request resolved: #77364 Approved by: https://github.com/janeyx99
Signed-off-by: Eli Uriegas <eliuriegasfb.com> Pull Request resolved: #77383 Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Approved by: https://github.com/malfet
These jobs take a while to spin up, so let's make it so that they use custom runners Signed-off-by: Eli Uriegas <eliuriegasfb.com> Pull Request resolved: #77384 Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Approved by: https://github.com/malfet
Exposes `cuFuncSetAttribute` & `cuFuncGetAttribute` Used for runtime compilation by nvfuser Pull Request resolved: #77296 Approved by: https://github.com/davidberard98
Decompositions can be used to fill in meta support where necessary, assuming the operations they decompose to support meta key. This PR adds register_meta kwarg to register_decomposition that optionally lets you register the meta to the C++ dispatch table for meta tensors. I use this to then get the meta function for where and huber_loss for free. Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: #77353 Approved by: https://github.com/mruberry
Signed-off-by: Eli Uriegas <eliuriegasfb.com> Pull Request resolved: #77390 Approved by: https://github.com/malfet
In `_need_symbolic_context`, when the annotation is postponed evaluated, the annotation is a string and not a type. We need to use get_type_hints to get the real type. For example, ```python def g(a: int) -> int: return a def f(a: "int") -> "int": return a ``` we will get the correct type `int` for both g and f with `typing.get_type_hints`. Otherwise, the type for `a` in `f` will be a string and is not comparable to the type `int` - `issubclass` will complain. This is necessary as we will use postponed typing evaluation to break circular dependencies. Pull Request resolved: #77365 Approved by: https://github.com/BowenBao
This adds basic coverage, but can be easily made more efficient by providing a native implementation. Follow up work includes supporting CSR gradients for strided Tensors. Pull Request resolved: #77177 Approved by: https://github.com/nikitaved, https://github.com/mikaylagawarecki
None as input is legal per ONNX spec for representing optional inputs. For [example](https://github.com/onnx/onnx/blob/main/docs/Operators.md#inputs-2---3-7) `constant_value` for `ONNX::Pad`. This PR removes such constraint check that was set prior to calling onnx shape inference. For the issue below, such constraint prevents the onnx shape inference of `ONNX::Pad`, which leads to falling back on an incorrect constant traced shape. For the unit test in this PR, prior to this PR, the ONNX shape inference for `ONNX::Pad` would be skipped, and would return `None` instead. Fixes pytorch/vision#5971 Pull Request resolved: #77379 Approved by: https://github.com/garymm
Support quantization for maxpool exporting to ONNX. Pull Request resolved: #77393 Approved by: https://github.com/BowenBao
Adds support for scripting ParameterDicts and getattr() on them. It does not support iterating on ParameterDicts because torch/nn/container.py implementation of ParameterDict.items() uses a generator, which is not supported by torchscript. torch/nn/container.py would need to be updated so that iter gets correctly registered in python_sugared_value.cpp Added a test in test_module_containers.py Pull Request resolved: #77143 Approved by: https://github.com/eellison
Pull Request resolved: #77371 Approved by: https://github.com/mruberry
Reduce circular dependencies
- Lift constants and flags from `symbolic_helper` to `_constants` and `_globals`
- Standardized constant naming to make it consistant
- Make `utils` strictly dependent on `symbolic_helper`, removing inline imports from symbolic_helper
- Move side effects from `utils` to `_patch_torch`
Pull Request resolved: #77142
Approved by: https://github.com/garymm, https://github.com/BowenBao
Main question mark is that `log_sigmoid_forward` uses `acc_t` instead of `opmath_t` - not sure if we have a decorator today for that? Glad to add one if we don't. Pull Request resolved: #77329 Approved by: https://github.com/ezyang
…dows) (#77192) Ref: #74537 Pull Request resolved: #77192 Approved by: https://github.com/anjali411
Pull Request resolved: #77347 Approved by: https://github.com/datumbox, https://github.com/frank-wei
This PR makes the following changes... Prims - adds as_strided - fixes errors in flatten meta Testing - enables view consistency checking (which can be opted out of, see issues below) - adds reference inputs for view, reshape, and flatten - adds error inputs for reshape Refs - adds as_strided, reshape, and view - fixes an error in the flatten ref where it was not returning self on no-op - fixes a bug in transpose where it was not retuning a view when the transposed tensor has 1 or fewer dims Issues - #77218 - #77216 Pull Request resolved: #77220 Approved by: https://github.com/ngimel
…idesOf` (#77387) Summary: s/size/in_size/ in outer func Test Plan: CI Differential Revision: D36357483 Pull Request resolved: #77387 Approved by: https://github.com/seemethere, https://github.com/mehtanirav
Pull Request resolved: #77358 Approved by: https://github.com/ezyang
Pull Request resolved: #76296 Approved by: https://github.com/cpuhrsch, https://github.com/ngimel
…t[value={0}]"
Retry of #76875. It was reverted
due to torchvision failures, but it turned out that the failures were
caused by a different PR.
irparser previously didn't support these, which would cause failures in
log_extract.py
Pull Request resolved: #77377
Approved by: https://github.com/datumbox
Summary: The new PrivateUse1 DeviceType is associated with the PrivateUse1 DispatchKey, which can be used for non-public devices without introducing a new device type. Note that the stringified name of the PrivateUse1 device is "privateuseone". Test Plan: All CI should pass. Differential Revision: D35859437 Pull Request resolved: #77208 Approved by: https://github.com/bdhirsh
Summary: The root module may have different forward functions. The current implementation assumes only the func `forward` can be traced. In this diff, we add an argument of forward func name to enable users trace different forward functions Test Plan: N1903198 Differential Revision: D36157032 Pull Request resolved: #77109 Approved by: https://github.com/jamesr66a
…ance on CPU Pull Request resolved: #73953 Approved by: https://github.com/frank-wei
) Fixes #ISSUE_NUMBER Pull Request resolved: #76888 Approved by: https://github.com/eellison
Fixes #77412 Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: #77488 Approved by: https://github.com/mruberry
…representing tensor sizes (#76836)"" This reverts commit c35bd8d. Pull Request resolved: #77719 Approved by: https://github.com/Chillee, https://github.com/malfet
Updating nvfuser code base. This should fix the indexing issue observed in pytorch/vision#6015. Running tests locally as well. Will update the description here at a later point @bypass-github-export-checks Pull Request resolved: #77471 Approved by: https://github.com/seemethere, https://github.com/eellison
This reverts commit d03d43d. Reverted #77673 on behalf of https://github.com/suo
In preparation of adopting future rocblas library options, it is necessary to track when the backward pass of training is executing. The scope-based helper class `BackwardPassGuard` is provided to toggle state. Pull Request resolved: #71881 Approved by: https://github.com/albanD
Pull Request resolved: #77653 Approved by: https://github.com/albanD
…ong type for detach() Pull Request resolved: #77655 Approved by: https://github.com/albanD
Pull Request resolved: #77782 Approved by: https://github.com/albanD
Fixing #77748 Pull Request resolved: #77767 Approved by: https://github.com/soulitzer
Pull Request resolved: #77605 Approved by: https://github.com/cpuhrsch
Introduce error handling across all ranks when loading and saving checkpoints. This makes it a lot simpler for users to handle failures and, as a positive side-effect, coordination of when it successfully finished. This change requires 3 collectives when saving and 1 when loading. All those collectives carry a small payload so they will be latency bound and write time should dominate it. Pull Request resolved: #77091 Approved by: https://github.com/pritamdamania87, https://github.com/wanchaol
This reverts commit a7cf95a. Reverted #77708 on behalf of https://github.com/suo
Makes debugging of failures like #76999 (comment) easier, by posting a link to checkrun that have failed/still pending Pull Request resolved: #77763 Approved by: https://github.com/seemethere
This is a workaround for EFA for TensorPipe. This allows RPC enabled tests to be ran on AWS clusters. Pull Request resolved: #77363 Approved by: https://github.com/wanchaol
more! Pull Request resolved: #77803 Approved by: https://github.com/seemethere
Resubmit of #77673, which was reverted due to Windows test failures: #77673 (comment). I suspect these failures happened because I don't explicitly set a side stream for graph capture in the new test. Not setting a side stream explicitly is alright on Linux because cuda tests implicitly use a side stream. I think Windows cuda tests implicitly use the default stream, breaking capture and leaving the backend in a bad state. Other graphs tests explicitly set side streams and don't error in Windows builds, so i'm 95% sure doing the same for the new test will work. Pull Request resolved: #77789 Approved by: https://github.com/ezyang
Pull Request resolved: #77800 Approved by: https://github.com/pritamdamania87, https://github.com/fduwjj
This is the first PR to make DataPipe deterministic. Users should be able to use `torch.manual_seed(seed)` to control the shuffle order for the following cases: - Directly over `DataPipe` - For single-process DataLoader - Multiprocessing DataLoader Unfortunately, for distributed training, users have to run `apply_shuffle_seed` manually to make sure all distributed processes having the same order of shuffle. Pull Request resolved: #77741 Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
Pull Request resolved: #77663 Approved by: https://github.com/cpuhrsch
Rehash of #75426 now that a revised version of load_state_dict_post_hook has landed. Pull Request resolved: #76912 Approved by: https://github.com/awgu
Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: #77682 Approved by: https://github.com/ngimel, https://github.com/mruberry
Pull Request resolved: #77761 Approved by: https://github.com/jbschlosser
miladm
pushed a commit
that referenced
this pull request
Jun 14, 2022
…78136) This prevents `import torch` accidentally crash on machines with no metal devices Should prevent crashes reported in pytorch#77662 (comment) and https://github.com/pytorch/functorch/runs/6560056366?check_suite_focus=true Backtrace to the crash: ``` (lldb) bt * thread #1, stop reason = signal SIGSTOP * frame #0: 0x00007fff7202be57 libobjc.A.dylib`objc_msgSend + 23 frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436 frame pytorch#2: 0x000000010fda011d libtorch_cpu.dylib`_GLOBAL__sub_I_MPSAllocator.mm + 125 frame pytorch#3: 0x000000010ada81e3 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 535 frame pytorch#4: 0x000000010ada85ee dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40(lldb) up frame #1: 0x000000010fd9f524 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl() + 436 libtorch_cpu.dylib`at::mps::HeapAllocator::MPSHeapAllocatorImpl::MPSHeapAllocatorImpl: -> 0x10fd9f524 <+436>: movq %rax, 0x1b0(%rbx) 0x10fd9f52b <+443>: movw $0x0, 0x1b8(%rbx) 0x10fd9f534 <+452>: addq $0x8, %rsp 0x10fd9f538 <+456>: popq %rbx (lldb) disassemble ... 0x10fd9f514 <+420>: movq 0xf19ad15(%rip), %rsi ; "maxBufferLength" 0x10fd9f51b <+427>: movq %r14, %rdi 0x10fd9f51e <+430>: callq *0xeaa326c(%rip) ; (void *)0x00007fff7202be40: objc_msgSend ``` which corresponds to `[m_device maxBufferLength]` call, where `m_device` is not initialized in https://github.com/pytorch/pytorch/blob/2ae3c59e4bcb8e6e75b4a942cacc2d338c88e609/aten/src/ATen/mps/MPSAllocator.h#L171 Pull Request resolved: pytorch#78136 Approved by: https://github.com/seemethere
miladm
pushed a commit
that referenced
this pull request
Jun 14, 2022
… of libtorch_python (pytorch#78028) Summary: This moves torch::class_<WorkerInfo> into `rpc_agent.cpp` so it gets registered in libtorch instead of libtorch_python. This is intermediate work to getting torch::deploy to load an unmodified copy of libtorch. Current RPC is incompatible due to duplicate registrations. ``` unknown file: Failure C++ exception with description "Exception Caught inside torch::deploy embedded library: Custom class with name __torch__.torch.classes.dist_rpc.WorkerInfo is already registered. Ensure that registration with torch::class_ is only called once. Exception raised from registerCustomClass at ../aten/src/ATen/core/custom_class.cpp:61 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7f3bd9adb92e in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libc10.so) frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5c (0x7f3bd9ab7068 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libc10.so) frame pytorch#2: torch::registerCustomClass(std::shared_ptr<c10::ClassType>) + 0x110 (0x7f3bc2258980 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so) frame pytorch#3: torch::detail::class_base::class_base(std::string const&, std::string const&, std::string, std::type_info const&, std::type_info const&) + 0x3b9 (0x7f3bc225a419 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so) frame pytorch#4: [0x7f3ba45cfea1] frame pytorch#5: <unknown function> + 0x1b5334 (0x5652bdab9334 in ./test_deploy) frame pytorch#6: <unknown function> + 0x1b4f3e (0x5652bdab8f3e in ./test_deploy) frame pytorch#7: <unknown function> + 0x1b519b (0x5652bdab919b in ./test_deploy) frame pytorch#8: loadSearchFile(char const*) + 0x23e (0x7f3ba62f37f8 in /tmp/torch_deploy9ATEFg) frame pytorch#9: deploy_set_self + 0x51 (0x7f3ba62f38f9 in /tmp/torch_deploy9ATEFg) frame pytorch#10: torch::deploy::Interpreter::Interpreter(torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>) + 0x274 (0x5652bdaaa790 in ./test_deploy) frame pytorch#11: void __gnu_cxx::new_allocator<torch::deploy::Interpreter>::construct<torch::deploy::Interpreter, torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(torch::deploy::Interpreter*, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0x81 (0x5652bdaaf58b in ./test_deploy) frame pytorch#12: void std::allocator_traits<std::allocator<torch::deploy::Interpreter> >::construct<torch::deploy::Interpreter, torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(std::allocator<torch::deploy::Interpreter>&, torch::deploy::Interpreter*, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0x4a (0x5652bdaae320 in ./test_deploy) frame pytorch#13: void std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> >::_M_realloc_insert<torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(__gnu_cxx::__normal_iterator<torch::deploy::Interpreter*, std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> > >, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0xee (0x5652bdaae4a0 in ./test_deploy) frame pytorch#14: void std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> >::emplace_back<torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0xb6 (0x5652bdaad258 in ./test_deploy) frame pytorch#15: torch::deploy::InterpreterManager::InterpreterManager(unsigned long, std::shared_ptr<torch::deploy::Environment>) + 0x123 (0x5652bdaa83b1 in ./test_deploy) frame pytorch#16: TorchpyTest_InitTwice_Test::TestBody() + 0x65 (0x5652bda075a9 in ./test_deploy) frame pytorch#17: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 0x65 (0x5652bda944b7 in ./test_deploy) frame pytorch#18: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 0x5a (0x5652bda8cfe7 in ./test_deploy) frame pytorch#19: testing::Test::Run() + 0x100 (0x5652bda68622 in ./test_deploy) frame pytorch#20: testing::TestInfo::Run() + 0x10f (0x5652bda68fb3 in ./test_deploy) frame pytorch#21: testing::TestSuite::Run() + 0x121 (0x5652bda6980d in ./test_deploy) frame pytorch#22: testing::internal::UnitTestImpl::RunAllTests() + 0x38e (0x5652bda756e6 in ./test_deploy) frame pytorch#23: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 0x65 (0x5652bda9586b in ./test_deploy) frame pytorch#24: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 0x5a (0x5652bda8e0f7 in ./test_deploy) frame pytorch#25: testing::UnitTest::Run() + 0xc9 (0x5652bda73fd1 in ./test_deploy) frame pytorch#26: RUN_ALL_TESTS() + 0x11 (0x5652bda169fa in ./test_deploy) frame pytorch#27: main + 0x27 (0x5652bda10ce2 in ./test_deploy) frame pytorch#28: <unknown function> + 0x2d310 (0x7f3bc0431310 in /usr/lib/libc.so.6) frame pytorch#29: __libc_start_main + 0x81 (0x7f3bc04313c1 in /usr/lib/libc.so.6) frame pytorch#30: _start + 0x25 (0x5652bda063b5 in ./test_deploy) ``` Test Plan: CI Differential Revision: D36564258 Pull Request resolved: pytorch#78028 Approved by: https://github.com/rohan-varma
miladm
pushed a commit
that referenced
this pull request
Jun 14, 2022
…ytorch#78276) Fixes pytorch#325 **Summary**: Currently, the pytorchbot only allows for rebasing to the master branch. These modifications add functionality for rebasing to the 'viable/strict' branch of pytorch/pytorch by adding a flag to the comment. **Test Plan:** tested manually on personal fork ([#1](swang392#1)), and included a test case in test_tryrebase.py that checks if rebasing to viable/strict branch was successful. Pull Request resolved: pytorch#78276 Approved by: https://github.com/clee2000, https://github.com/janeyx99
miladm
pushed a commit
that referenced
this pull request
Jun 14, 2022
… to conform with non-quantized countertpart filenames Summary: Names of analogous files in quantized directory (previously snake case) were inconsistent with their non-quantized filename counterparts (pascal case). This is the first of a series of PRs that changes all files in quantized (and sub-directories) dir to have pascal case. `aten/src/ATen/native/quantized/qconv_unpack.cpp` has not been renamed yet because (for reasons currently unknown) after making the name change, `import torch` produces the below error (`qlinear_unpack.cpp` renaming also seems to fail some phabricator CI tests for similar reasons). We suspect that these may be undefined errors and will revisit naming these files in a future PR. ``` terminate called after throwing an instance of 'c10::Error' what(): Type c10::intrusive_ptr<ConvPackedParamsBase<2> > could not be converted to any of the known types. Exception raised from operator() at ../aten/src/ATen/core/jit_type.h:1735 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x55 (0x7f26745c0c65 in /data/users/dzdang/pytorch/torch/lib/libc10.so) frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xb1 (0x7f26745bdcd1 in /data/users/dzdang/pytorch/torch/lib/libc10.so) frame pytorch#2: <unknown function> + 0x1494e24 (0x7f2663b14e24 in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so) frame pytorch#3: <unknown function> + 0xfed0bc (0x7f266366d0bc in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so) frame pytorch#4: c10::detail::infer_schema::make_function_schema(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 0x5a (0x7f266366d71a in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so) frame pytorch#5: c10::detail::infer_schema::make_function_schema(c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 0x7b (0x7f266366e06b in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so) frame pytorch#6: <unknown function> + 0x1493f32 (0x7f2663b13f32 in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so) frame pytorch#7: <unknown function> + 0xe227dd (0x7f26634a27dd in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so) frame pytorch#8: <unknown function> + 0x14e0a (0x7f268c934e0a in /lib64/ld-linux-x86-64.so.2) ..........................truncated............. ``` Test Plan: ``` python test/test_quantization.py ``` Pull Request resolved: pytorch#77037 Approved by: https://github.com/jerryzh168
miladm
pushed a commit
that referenced
this pull request
Jul 29, 2022
### Summary: This PR implements PTQ for APoT FakeQuant. It runs models (Resnet-18 pre-trained model, ImageNet dataset) to compare accuracy metrics for different qconfig settings of uniform vs. APoT quantized activation and weight. According to the collected accuracy stats, model pytorch#2 (uniform activation and APoT weight) appears to have a slight improvement in accuracy compared to model #1 (uniform activation and uniform weight) for 8-bit and significant improvement for 4-bit (see "Accuracy Stats" section below). ### Test Plan: Run models with: `python test/quantization/core/experimental/fx_graph_mode_apot.py` ### Accuracy Stats: 8-bit (Uniform int8, APoT b = 8 k = 2) **Model #1:** Uniform activation, uniform weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 64.43% (Top-1), 85.62% (Top-5) **Model pytorch#2:** Uniform activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 64.51% (Top-1), 85.78% (Top-5) **Model pytorch#3:** APoT activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 64.32% (Top-1), 85.78% (Top-5) 4-bit (Uniform int4, APoT b = 4 k = 2) **Model #1:** Uniform activation, uniform weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 45.63% (Top-1), 71.96% (Top-5) **Model pytorch#2:** Uniform activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 64.24% (Top-1), 85.56% (Top-5) **Model pytorch#3:** APoT activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 45.40% (Top-1), 76.21% (Top-5) **Full Precision model (FX Graph Mode quantized)** Evaluation accuracy on test dataset: 69.76% (Top-1), 89.08% (Top-5) **Eager mode quantized model** Evaluation accuracy on test dataset: 69.49% (Top-1), 88.90% (Top-5) Pull Request resolved: pytorch#81040 Approved by: https://github.com/jerryzh168
miladm
pushed a commit
that referenced
this pull request
Aug 12, 2022
Hi! I was playing with libfuzzer and found bug when loading a model from file via `torch::jit::load` function. There is an unhandled exception in caffe2/serialize when calling a `stoull` function on unsanitized version string. The bug can be reproduced with `aot_model_compiler` binary: ``` aot_model_compiler --model=crash-stoull --model_name=name --model_version=1 --input_dims='1,3,224,224;2,2' --input_types='float;float' ``` Crash file is provided in [crash.zip](https://github.com/pytorch/pytorch/files/8701504/crash.zip). gdb output: ``` Temporary breakpoint 1, main (argc=6, argv=0x7ffcd160f9f8) at /pytorch_master/binaries/aot_model_compiler.cc:87 87 "Run NNC AOT compiler for pytorch model. Example usage:\n" (gdb) c Continuing. terminate called after throwing an instance of 'std::invalid_argument' what(): stoull Program received signal SIGABRT, Aborted. __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007fa637f16859 in __GI_abort () at abort.c:79 pytorch#2 0x00007fa6381c1911 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6 pytorch#3 0x00007fa6381cd38c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6 pytorch#4 0x00007fa6381cd3f7 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6 pytorch#5 0x00007fa6381cd6a9 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6 pytorch#6 0x00007fa6381c42ce in std::__throw_invalid_argument(char const*) () from /lib/x86_64-linux-gnu/libstdc++.so.6 pytorch#7 0x000000000247d567 in __gnu_cxx::__stoa<unsigned long long, unsigned long long, char, int> (__str=0x7ffcd160f228 "ZZ", __idx=0x0, __base=10, __convf=<optimized out>, __name=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/ext/string_conversions.h:83 pytorch#8 std::__cxx11::stoull (__str="ZZ", __idx=0x0, __base=10) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/basic_string.h:6577 pytorch#9 caffe2::serialize::PyTorchStreamReader::init (this=this@entry=0x8c11ce0) at /pytorch_master/caffe2/serialize/inline_container.cc:145 pytorch#10 0x000000000247d9c7 in caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader (this=0x8c11ce0, in=std::shared_ptr<class caffe2::serialize::ReadAdapterInterface> (empty) = {...}) at /pytorch_master/caffe2/serialize/inline_container.cc:88 pytorch#11 0x00000000035b7ba4 in __gnu_cxx::new_allocator<caffe2::serialize::PyTorchStreamReader>::construct<caffe2::serialize::PyTorchStreamReader, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > ( __p=0x2, __args=..., this=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/ext/new_allocator.h:150 pytorch#12 std::allocator_traits<std::allocator<caffe2::serialize::PyTorchStreamReader> >::construct<caffe2::serialize::PyTorchStreamReader, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (__a=..., __p=0x2, __p@entry=0x8c11ce0, __args=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/alloc_traits.h:512 pytorch#13 0x00000000035b1988 in std::_Sp_counted_ptr_inplace<caffe2::serialize::PyTorchStreamReader, std::allocator<caffe2::serialize::PyTorchStreamReader>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (this=0x8c11cd0, __a=..., __args=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr_base.h:551 pytorch#14 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<caffe2::serialize::PyTorchStreamReader, std::allocator<caffe2::serialize::PyTorchStreamReader>, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (this=0x7ffcd160f3a8, __p=@0x7ffcd160f3a0: 0x10, __args=..., __a=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr_base.h:683 pytorch#15 std::__shared_ptr<caffe2::serialize::PyTorchStreamReader, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<caffe2::serialize::PyTorchStreamReader>, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (this=0x7ffcd160f3a0, __args=..., __tag=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr_base.h:1371 pytorch#16 std::shared_ptr<caffe2::serialize::PyTorchStreamReader>::shared_ptr<std::allocator<caffe2::serialize::PyTorchStreamReader>, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (this=0x7ffcd160f3a0, __args=..., __tag=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr.h:408 pytorch#17 std::allocate_shared<caffe2::serialize::PyTorchStreamReader, std::allocator<caffe2::serialize::PyTorchStreamReader>, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (__args=..., __a=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr.h:859 pytorch#18 std::make_shared<caffe2::serialize::PyTorchStreamReader, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (__args=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr.h:875 pytorch#19 torch::jit::load (rai=std::shared_ptr<class caffe2::serialize::ReadAdapterInterface> (empty) = {...}, device=device@entry=..., Python Exception <class 'gdb.error'> No type named std::__detail::_Hash_node<struct std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, true>.: extra_files=std::unordered_map with 0 elements) at /pytorch_master/torch/csrc/jit/serialization/import.cpp:474 pytorch#20 0x00000000035b1ef6 in torch::jit::load (filename="crash-stoull", device=device@entry=..., Python Exception <class 'gdb.error'> No type named std::__detail::_Hash_node<struct std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, true>.: extra_files=std::unordered_map with 0 elements) at /pytorch_master/torch/csrc/jit/serialization/import.cpp:444 pytorch#21 0x00000000035b1d22 in torch::jit::load (filename="", device=device@entry=...) at /pytorch_master/torch/csrc/jit/serialization/import.cpp:424 pytorch#22 0x00000000008f9be3 in main (argc=1, argv=0x7ffcd160f9f8) at /pytorch_master/binaries/aot_model_compiler.cc:128 ``` Pull Request resolved: pytorch#77557 Approved by: https://github.com/Gamrix
miladm
pushed a commit
that referenced
this pull request
Aug 12, 2022
### Summary: This PR implements QAT for APoT FakeQuant. It runs QAT with FX graph mode quantized models (Resnet-18 pre-trained model, full ImageNet dataset) to compare accuracy metrics for different qconfig settings of uniform vs. APoT quantized activation and weight. It also refactors the APoT PTQ module `apot_fx_graph_mode_ptq.py` (previously `fx_graph_mode_apot.py`) such that shared helper functions between PTQ and QAT are in a separate file `quantization_util.py`. Model pytorch#2 (uniformly quantized activation, APoT quantized weight) shows comparable accuracy compared to model #1 (uniformly quantized activation, APoT quantized weight) for 8-bit and significant accuracy improvement for 4-bit (see "Accuracy Stats" section below). ### Test Plan: Run QAT models with: `python test/quantization/core/experimental/apot_qat.py` Run PTQ models with: `python test/quantization/core/experimental/apot_ptq.py` ### Accuracy Stats 8-bit (Uniform int8, APoT b = 8 k = 2) Model #1: Uniform activation, uniform weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 69.67% (Top-1), 89.04% (Top-5) Model pytorch#2: Uniform activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 69.72% (Top-1), 89.06% (Top-5) 4-bit (Uniform int4, APoT b = 4 k = 2) Model #1: Uniform activation, uniform weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 46.85% (Top-1), 72.85% (Top-5) Model pytorch#2: Uniform activation, APoT weight (FX Graph Mode quantized) Evaluation accuracy on test dataset: 66.45% (Top-1), 86.23% (Top-5) Pull Request resolved: pytorch#83282 Approved by: https://github.com/jerryzh168
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes #ISSUE_NUMBER