Support: remove local definition of ArrayRef #4

Merged
merged 1 commit into from
Sep 29, 2017

compnerd

Summary:

Rather than reimplement ArrayRef, pull in LLVM's implementation of ArrayRef, and
forward it along to consumers.
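
The forwarding pattern described here can be sketched roughly as follows. This is only an illustration under stated assumptions: the `upstream` namespace is a stand-in for LLVM (whose real class lives in `llvm/ADT/ArrayRef.h`), and the `project` namespace and `sum` helper are hypothetical names, not code from this change.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the upstream library. In the real change this role is played
// by llvm::ArrayRef; this reduced class exists only to keep the sketch
// self-contained.
namespace upstream {
template <typename T> class ArrayRef {
  const T *data_ = nullptr;
  size_t size_ = 0;

public:
  ArrayRef() = default;
  // Non-owning view over a vector's storage.
  ArrayRef(const std::vector<T> &v) : data_(v.data()), size_(v.size()) {}
  size_t size() const { return size_; }
  const T &operator[](size_t i) const {
    assert(i < size_ && "index out of range");
    return data_[i];
  }
};
} // namespace upstream

// Hypothetical consumer namespace: instead of defining its own ArrayRef,
// it re-exports the upstream one with a using-declaration.
namespace project {
using upstream::ArrayRef;

// Consumers keep writing project::ArrayRef (or plain ArrayRef inside the
// namespace) and get the upstream type.
inline size_t sum(ArrayRef<size_t> values) {
  size_t total = 0;
  for (size_t i = 0; i < values.size(); ++i) {
    total += values[i];
  }
  return total;
}
} // namespace project
```

With this arrangement, existing call sites that name `project::ArrayRef` should continue to compile, since the name now resolves to the upstream type.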

Test Plan: ninja

Reviewers: nrotem

Subscribers: #glow

Tasks:

Tags:

Blame Revision:


@nadavrot nadavrot left a comment


Yay!

@compnerd compnerd merged commit 4a1aba3 into pytorch:master Sep 29, 2017
@compnerd compnerd deleted the arrayref branch September 29, 2017 18:28
@HKLee2040 HKLee2040 mentioned this pull request Sep 18, 2018
facebook-github-bot pushed a commit that referenced this pull request Jun 15, 2019
Summary:
**Description**
This commit fixes two bugs in the OpenCL implementation of
`BatchedReduceAddInst` and adds a few comments for clarity.

The first is a segmentation fault caused by
incorporating feedback on #2958. A suggestion was made to make the loop
variable `i` in the loop that computes `batchSliceSizes` count down instead of
count up, but this suggestion was taken without changing the type (which was `size_t`,
an unsigned type), so the loop never terminates and eventually leads to a
segmentation fault.
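
The pitfall can be sketched in isolation as below. This is a minimal illustration, not the code from the commit; `countDownIndices` is a hypothetical helper introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the indices visited by a count-down loop
// over an unsigned index, i.e. n-1, n-2, ..., 0.
std::vector<size_t> countDownIndices(size_t n) {
  std::vector<size_t> visited;
  // Buggy form (do not use):
  //   for (size_t i = n - 1; i >= 0; --i) { ... }
  // `i >= 0` is always true for an unsigned type, so after i reaches 0 the
  // decrement wraps around to a very large value and the loop keeps going,
  // indexing far out of bounds.
  //
  // Safe form: test and decrement before the body runs.
  for (size_t i = n; i-- > 0;) {
    visited.push_back(i);
  }
  assert(visited.size() == n);
  return visited;
}
```

Switching the loop variable to a signed type is another way to make the `i >= 0` condition meaningful.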

The second bug is an incorrect computation of `destSliceSizes`. Instead of
multiplying the slice size at a dimension with the number of elements in
that same dimension, the code was multiplying the former with the number
of elements in the *adjacent* dimension. This was surfaced by the unit
test added in #2958 for `axis = 2`.
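
As a rough sketch of the intended recurrence (assuming a row-major layout; `computeSliceSizes` is a hypothetical name and this is not the OpenCL backend's actual code), each slice size is the next-inner slice size multiplied by the extent of that same inner dimension:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for row-major dims, sizes[i] is the number of
// elements covered by one step along dimension i.
std::vector<size_t> computeSliceSizes(const std::vector<size_t> &dims) {
  assert(!dims.empty() && "expected at least one dimension");
  std::vector<size_t> sizes(dims.size(), 1);
  // The innermost dimension has slice size 1; walk outward from there.
  for (size_t i = dims.size() - 1; i > 0; --i) {
    // Correct: combine the slice size at dimension i with the extent of
    // that same dimension i.
    sizes[i - 1] = sizes[i] * dims[i];
    // One possible form of the "adjacent dimension" mistake (illustrative
    // only) would be: sizes[i - 1] = sizes[i] * dims[i - 1];
  }
  return sizes;
}
```

For dimensions `{2, 3, 4}` this yields `{12, 4, 1}`, whereas the illustrative mistaken form in the comment would yield `{6, 3, 1}`. The two only diverge once there are at least three dimensions with differing extents, which is consistent with the bug surfacing in a test that reduces along `axis = 2`.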

**Test Plan**
1) `ninja check` with OpenCL enabled, DEBUG mode

```
      Start  1: BackendCorrectnessTest
 1/34 Test  #1: BackendCorrectnessTest ..............   Passed   21.28 sec
      Start  2: BackendTest
 2/34 Test  #2: BackendTest .........................   Passed    1.97 sec
      Start  3: BasicIRTest
 3/34 Test  #3: BasicIRTest .........................   Passed    0.05 sec
      Start  4: Caffe2ImporterTest
 4/34 Test  #4: Caffe2ImporterTest ..................   Passed    3.00 sec
      Start  5: DeviceManagerTest
 5/34 Test  #5: DeviceManagerTest ...................   Passed    0.76 sec
      Start  6: ThreadPoolExecutorTest
 6/34 Test  #6: ThreadPoolExecutorTest ..............   Passed    1.48 sec
      Start  7: Float16Test
 7/34 Test  #7: Float16Test .........................   Passed    0.01 sec
      Start  8: GemmTest
 8/34 Test  #8: GemmTest ............................   Passed    0.05 sec
      Start  9: GlowOnnxifiManagerTest
 9/34 Test  #9: GlowOnnxifiManagerTest ..............   Passed    0.06 sec
      Start 10: GradCheckTest
10/34 Test #10: GradCheckTest .......................   Passed    4.72 sec
      Start 11: GraphGradTest
11/34 Test #11: GraphGradTest .......................   Passed    0.06 sec
      Start 12: GraphOptzTest
12/34 Test #12: GraphOptzTest .......................   Passed    0.03 sec
      Start 13: GraphSchedulerTest
13/34 Test #13: GraphSchedulerTest ..................   Passed    0.01 sec
      Start 14: GraphTest
14/34 Test #14: GraphTest ...........................   Passed    1.03 sec
      Start 15: HostManagerTest
15/34 Test #15: HostManagerTest .....................   Passed    7.49 sec
      Start 16: HyphenTest
16/34 Test #16: HyphenTest ..........................   Passed    1.17 sec
      Start 17: IROptTest
17/34 Test #17: IROptTest ...........................   Passed    0.01 sec
      Start 18: ImageTest
18/34 Test #18: ImageTest ...........................   Passed    0.31 sec
      Start 19: LLVMIRGenTest
19/34 Test #19: LLVMIRGenTest .......................   Passed    0.01 sec
      Start 20: MLTest
20/34 Test #20: MLTest ..............................   Passed   46.30 sec
      Start 21: MemoryAllocatorTest
21/34 Test #21: MemoryAllocatorTest .................   Passed    0.03 sec
      Start 22: OCLTest
22/34 Test #22: OCLTest .............................   Passed    0.24 sec
      Start 23: OnnxImporterTest
23/34 Test #23: OnnxImporterTest ....................   Passed    0.12 sec
      Start 24: OperatorGradTest
24/34 Test #24: OperatorGradTest ....................   Passed    0.05 sec
      Start 25: OperatorTest
25/34 Test #25: OperatorTest ........................   Passed   14.47 sec
      Start 26: PartitionerTest
26/34 Test #26: PartitionerTest .....................   Passed    0.05 sec
      Start 28: ProvisionerTest
27/34 Test #28: ProvisionerTest .....................   Passed    1.00 sec
      Start 29: QuantizationTest
28/34 Test #29: QuantizationTest ....................   Passed    7.46 sec
      Start 30: TensorsTest
29/34 Test #30: TensorsTest .........................   Passed    0.36 sec
      Start 31: TensorPoolTest
30/34 Test #31: TensorPoolTest ......................   Passed    0.01 sec
      Start 32: ThreadPoolTest
31/34 Test #32: ThreadPoolTest ......................   Passed    0.01 sec
      Start 33: TraceEventsTest
32/34 Test #33: TraceEventsTest .....................   Passed   10.62 sec
      Start 34: TypeAToTypeBFunctionConverterTest
33/34 Test #34: TypeAToTypeBFunctionConverterTest ...   Passed    0.06 sec
      Start 35: UtilsTest
34/34 Test #35: UtilsTest ...........................   Passed    0.02 sec

100% tests passed, 0 tests failed out of 34

Total Test time (real) = 124.33 sec
```

2) `ninja check` with OpenCL enabled, RELEASE mode
```
      Start  1: BackendCorrectnessTest
 1/34 Test  #1: BackendCorrectnessTest ..............   Passed   11.51 sec
      Start  2: BackendTest
 2/34 Test  #2: BackendTest .........................   Passed    1.53 sec
      Start  3: BasicIRTest
 3/34 Test  #3: BasicIRTest .........................   Passed    0.02 sec
      Start  4: Caffe2ImporterTest
 4/34 Test  #4: Caffe2ImporterTest ..................   Passed    0.62 sec
      Start  5: DeviceManagerTest
 5/34 Test  #5: DeviceManagerTest ...................   Passed    0.83 sec
      Start  6: ThreadPoolExecutorTest
 6/34 Test  #6: ThreadPoolExecutorTest ..............   Passed    0.71 sec
      Start  7: Float16Test
 7/34 Test  #7: Float16Test .........................   Passed    0.01 sec
      Start  8: GemmTest
 8/34 Test  #8: GemmTest ............................   Passed    0.31 sec
      Start  9: GlowOnnxifiManagerTest
 9/34 Test  #9: GlowOnnxifiManagerTest ..............   Passed    0.33 sec
      Start 10: GradCheckTest
10/34 Test #10: GradCheckTest .......................   Passed    1.90 sec
      Start 11: GraphGradTest
11/34 Test #11: GraphGradTest .......................   Passed    0.32 sec
      Start 12: GraphOptzTest
12/34 Test #12: GraphOptzTest .......................   Passed    0.03 sec
      Start 13: GraphSchedulerTest
13/34 Test #13: GraphSchedulerTest ..................   Passed    0.02 sec
      Start 14: GraphTest
14/34 Test #14: GraphTest ...........................   Passed    0.59 sec
      Start 15: HostManagerTest
15/34 Test #15: HostManagerTest .....................   Passed   10.61 sec
      Start 16: HyphenTest
16/34 Test #16: HyphenTest ..........................   Passed    4.18 sec
      Start 17: IROptTest
17/34 Test #17: IROptTest ...........................   Passed    0.04 sec
      Start 18: ImageTest
18/34 Test #18: ImageTest ...........................   Passed    0.10 sec
      Start 19: LLVMIRGenTest
19/34 Test #19: LLVMIRGenTest .......................   Passed    0.71 sec
      Start 20: MLTest
20/34 Test #20: MLTest ..............................   Passed   52.44 sec
      Start 21: MemoryAllocatorTest
21/34 Test #21: MemoryAllocatorTest .................   Passed    0.03 sec
      Start 22: OCLTest
22/34 Test #22: OCLTest .............................   Passed    0.96 sec
      Start 23: OnnxImporterTest
23/34 Test #23: OnnxImporterTest ....................   Passed    0.89 sec
      Start 24: OperatorGradTest
24/34 Test #24: OperatorGradTest ....................   Passed    0.76 sec
      Start 25: OperatorTest
25/34 Test #25: OperatorTest ........................   Passed   33.00 sec
      Start 26: PartitionerTest
26/34 Test #26: PartitionerTest .....................   Passed    0.79 sec
      Start 28: ProvisionerTest
27/34 Test #28: ProvisionerTest .....................   Passed    3.00 sec
      Start 29: QuantizationTest
28/34 Test #29: QuantizationTest ....................   Passed   19.64 sec
      Start 30: TensorsTest
29/34 Test #30: TensorsTest .........................   Passed    0.09 sec
      Start 31: TensorPoolTest
30/34 Test #31: TensorPoolTest ......................   Passed    0.04 sec
      Start 32: ThreadPoolTest
31/34 Test #32: ThreadPoolTest ......................   Passed    0.04 sec
      Start 33: TraceEventsTest
32/34 Test #33: TraceEventsTest .....................   Passed   13.18 sec
      Start 34: TypeAToTypeBFunctionConverterTest
33/34 Test #34: TypeAToTypeBFunctionConverterTest ...   Passed    0.87 sec
      Start 35: UtilsTest
34/34 Test #35: UtilsTest ...........................   Passed    0.04 sec

100% tests passed, 0 tests failed out of 34

Total Test time (real) = 160.15 sec
```
3) `ninja check` with OpenCL enabled, ASAN+UBSAN mode
```
      Start  1: BackendCorrectnessTest
 1/34 Test  #1: BackendCorrectnessTest ..............   Passed   65.05 sec
      Start  2: BackendTest
 2/34 Test  #2: BackendTest .........................   Passed    5.42 sec
      Start  3: BasicIRTest
 3/34 Test  #3: BasicIRTest .........................   Passed    0.09 sec
      Start  4: Caffe2ImporterTest
 4/34 Test  #4: Caffe2ImporterTest ..................   Passed   11.51 sec
      Start  5: DeviceManagerTest
 5/34 Test  #5: DeviceManagerTest ...................   Passed    1.93 sec
      Start  6: ThreadPoolExecutorTest
 6/34 Test  #6: ThreadPoolExecutorTest ..............   Passed    5.08 sec
      Start  7: Float16Test
 7/34 Test  #7: Float16Test .........................   Passed    0.03 sec
      Start  8: GemmTest
 8/34 Test  #8: GemmTest ............................   Passed    0.22 sec
      Start  9: GlowOnnxifiManagerTest
 9/34 Test  #9: GlowOnnxifiManagerTest ..............   Passed    0.18 sec
      Start 10: GradCheckTest
10/34 Test #10: GradCheckTest .......................   Passed   15.40 sec
      Start 11: GraphGradTest
11/34 Test #11: GraphGradTest .......................   Passed    0.22 sec
      Start 12: GraphOptzTest
12/34 Test #12: GraphOptzTest .......................   Passed    0.12 sec
      Start 13: GraphSchedulerTest
13/34 Test #13: GraphSchedulerTest ..................   Passed    0.03 sec
      Start 14: GraphTest
14/34 Test #14: GraphTest ...........................   Passed    3.00 sec
      Start 15: HostManagerTest
15/34 Test #15: HostManagerTest .....................   Passed   13.79 sec
      Start 16: HyphenTest
16/34 Test #16: HyphenTest ..........................   Passed    3.47 sec
      Start 17: IROptTest
17/34 Test #17: IROptTest ...........................   Passed    0.05 sec
      Start 18: ImageTest
18/34 Test #18: ImageTest ...........................   Passed    1.08 sec
      Start 19: LLVMIRGenTest
19/34 Test #19: LLVMIRGenTest .......................   Passed    0.05 sec
      Start 20: MLTest
20/34 Test #20: MLTest ..............................   Passed  141.01 sec
      Start 21: MemoryAllocatorTest
21/34 Test #21: MemoryAllocatorTest .................   Passed    0.08 sec
      Start 22: OCLTest
22/34 Test #22: OCLTest .............................   Passed    0.64 sec
      Start 23: OnnxImporterTest
23/34 Test #23: OnnxImporterTest ....................   Passed    0.51 sec
      Start 24: OperatorGradTest
24/34 Test #24: OperatorGradTest ....................   Passed    0.14 sec
      Start 25: OperatorTest
25/34 Test #25: OperatorTest ........................   Passed   35.78 sec
      Start 26: PartitionerTest
26/34 Test #26: PartitionerTest .....................   Passed    0.20 sec
      Start 28: ProvisionerTest
27/34 Test #28: ProvisionerTest .....................   Passed    2.25 sec
      Start 29: QuantizationTest
28/34 Test #29: QuantizationTest ....................   Passed   17.17 sec
      Start 30: TensorsTest
29/34 Test #30: TensorsTest .........................   Passed    1.28 sec
      Start 31: TensorPoolTest
30/34 Test #31: TensorPoolTest ......................   Passed    0.03 sec
      Start 32: ThreadPoolTest
31/34 Test #32: ThreadPoolTest ......................   Passed    0.05 sec
      Start 33: TraceEventsTest
32/34 Test #33: TraceEventsTest .....................   Passed   32.11 sec
      Start 34: TypeAToTypeBFunctionConverterTest
33/34 Test #34: TypeAToTypeBFunctionConverterTest ...   Passed    0.15 sec
      Start 35: UtilsTest
34/34 Test #35: UtilsTest ...........................   Passed    0.07 sec

100% tests passed, 0 tests failed out of 34

Total Test time (real) = 358.24 sec
```
Pull Request resolved: #3118

Differential Revision: D15836207

Pulled By: SplitInfinity

fbshipit-source-id: 7bfa3c6ed5583d6a8f42b1f712f359e8e1d10b47
facebook-github-bot pushed a commit that referenced this pull request Jun 7, 2022
…ntertpart filenames (#77037)

Summary:
X-link: pytorch/pytorch#77037

Names of files in the quantized directory (previously snake case) were inconsistent with
the names of their non-quantized counterparts (pascal case). This is the first of a series of PRs that renames
all files in the quantized directory (and its sub-directories) to pascal case.

`aten/src/ATen/native/quantized/qconv_unpack.cpp` has not been renamed yet
because, for reasons currently unknown, `import torch` produces the error below after the rename. Renaming `qlinear_unpack.cpp` also seems to fail some phabricator CI tests for similar reasons. We suspect that these may be undefined errors and will revisit renaming these files in a future PR.

```
terminate called after throwing an instance of 'c10::Error'
  what():  Type c10::intrusive_ptr<ConvPackedParamsBase<2> > could not be converted to any of the known types.
Exception raised from operator() at ../aten/src/ATen/core/jit_type.h:1735 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x55 (0x7f26745c0c65 in /data/users/dzdang/pytorch/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xb1 (0x7f26745bdcd1 in /data/users/dzdang/pytorch/torch/lib/libc10.so)
frame #2: <unknown function> + 0x1494e24 (0x7f2663b14e24 in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #3: <unknown function> + 0xfed0bc (0x7f266366d0bc in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #4: c10::detail::infer_schema::make_function_schema(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 0x5a (0x7f266366d71a in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #5: c10::detail::infer_schema::make_function_schema(c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 0x7b (0x7f266366e06b in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #6: <unknown function> + 0x1493f32 (0x7f2663b13f32 in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #7: <unknown function> + 0xe227dd (0x7f26634a27dd in /data/users/dzdang/pytorch/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x14e0a (0x7f268c934e0a in /lib64/ld-linux-x86-64.so.2)
..........................truncated.............
```

Reviewed By: malfet

Differential Revision: D36862332

Pulled By: dzdang

fbshipit-source-id: 598c36656b4e71f906d940e7ff19ecf82d43031d