
[VTA][OpenCL] intelfocl #6126

Merged
merged 7 commits into apache:main from zhanghaohit:feature/intelfocl-pr on Apr 20, 2021

Conversation

zhanghaohit (Contributor)

This is related to #5840 and split from PR #5842

This PR only keeps the modifications to TVM for OpenCL support.

@zhanghaohit zhanghaohit changed the title Feature/intelfocl pr [VTA][OpenCL] intelfocl Jul 23, 2020
wkl = traverse(op.input_tensors)
if wkl:
return wkl

Contributor

do you mind explaining the changes made to this file?

Contributor Author

The original code fails if there are multiple workloads in one schedule. For example, in fused_nn_conv2d_add_add_right_shift_clip_cast_31, the conv2d and the add may both have workload attrs, so we have to pick the correct workload by comparing the task_name.

Previously this worked fine, as add was not a tunable op. But now we also want to offload middle ALU-only nodes (residual blocks), such as fused_cast_cast_add_nn_relu_clip_cast_3, to VTA, so we create a VTA schedule for add (see add.alu), as sketched below.
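
For illustration, a minimal sketch of the task_name-guarded traversal described above (the task_name parameter and the workload-attr layout are assumptions for illustration, not the PR's exact code):

# Hypothetical sketch: return the workload whose task name matches the
# current AutoTVM task, instead of the first workload encountered.
def traverse(tensors, task_name):
    """Recursively search the ops feeding `tensors` for a matching workload."""
    for t in tensors:
        op = t.op
        if "workload" in op.attrs:
            wkl = op.attrs["workload"]
            # An AutoTVM workload is a tuple whose first element is the
            # task name, e.g. ("conv2d_packed.vta", ...); skip workloads
            # attached by other ops, such as add, in the same schedule.
            if wkl[0] == task_name:
                return wkl
        wkl = traverse(op.input_tensors, task_name)
        if wkl:
            return wkl
    return None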

Contributor

Thanks for clarifying. How do we guard against extracting add as a standalone op for other backends in this case?

@@ -88,16 +88,6 @@ class BuiltinLower : public StmtExprMutator {
op = stmt.as<AllocateNode>();
// Get constant allocation bound.
int64_t nbytes = GetVectorBytes(op->dtype);

Contributor

do you mind explaining the reasoning behind this deletion?

Contributor Author

This removes the special handling for kDLCPU. Otherwise, it may cause an LLVM parameter-mismatch error:

Traceback (most recent call last):
  File "vta/tutorials/frontend/deploy_classification.py", line 210, in <module>
    params=params, target_host=env.target_host)
  File "/4pd/home/zhanghao/workspace/tvm-2/tvm/python/tvm/relay/build_module.py", line 251, in build
    graph_json, mod, params = bld_mod.build(mod, target, target_host, params)
  File "/4pd/home/zhanghao/workspace/tvm-2/tvm/python/tvm/relay/build_module.py", line 120, in build
    self._build(mod, target, target_host)
  File "tvm/_ffi/_cython/./packed_func.pxi", line 321, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 256, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./packed_func.pxi", line 245, in tvm._ffi._cy3.core.FuncCall3
  File "tvm/_ffi/_cython/./base.pxi", line 160, in tvm._ffi._cy3.core.CALL
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /4pd/home/zhanghao/workspace/tvm-2/tvm/build/libtvm.so(TVMFuncCall+0x4c) [0x7f385ac9bc1c]
  [bt] (7) /4pd/home/zhanghao/workspace/tvm-2/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::relay::backend::RelayBuildModule::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#3}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0x316) [0x7f385ab2a566]
  [bt] (6) /4pd/home/zhanghao/workspace/tvm-2/tvm/build/libtvm.so(tvm::relay::backend::RelayBuildModule::BuildRelay(tvm::IRModule, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::NDArray, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, tvm::runtime::NDArray> > > const&)+0xe31) [0x7f385ab29c11]
  [bt] (5) /4pd/home/zhanghao/workspace/tvm-2/tvm/build/libtvm.so(tvm::build(tvm::Map<tvm::runtime::String, tvm::IRModule, void, void> const&, tvm::Target const&)+0x3c4) [0x7f385a4322d4]
  [bt] (4) /4pd/home/zhanghao/workspace/tvm-2/tvm/build/libtvm.so(tvm::build(tvm::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)+0x326) [0x7f385a4318c6]
  [bt] (3) /4pd/home/zhanghao/workspace/tvm-2/tvm/build/libtvm.so(tvm::codegen::Build(tvm::IRModule, tvm::Target const&)+0x67a) [0x7f385a74f68a]
  [bt] (2) /4pd/home/zhanghao/workspace/tvm-2/tvm/build/libtvm.so(+0x1277ea1) [0x7f385ac7eea1]
  [bt] (1) /4pd/home/zhanghao/workspace/tvm-2/tvm/build/libtvm.so(tvm::codegen::LLVMModuleNode::Init(tvm::IRModule const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)+0x1388) [0x7f385ac82c68]
  [bt] (0) /4pd/home/zhanghao/workspace/tvm-2/tvm/build/libtvm.so(+0x1276a57) [0x7f385ac7da57]
  File "/4pd/home/zhanghao/workspace/tvm-2/tvm/src/target/llvm/llvm_module.cc", line 230
TVMError: LLVM module verification failed with the following errors: 
Call parameter type does not match function signature!
  %.sub = getelementptr inbounds [4 x <8 x float>], [4 x <8 x float>]* %3, i64 0, i64 0
 i8*  %34 = call i8* @VTABufferCPUPtr(i8* %17, <8 x float>* nonnull %.sub)
Call parameter type does not match function signature!
  %.sub = getelementptr inbounds [8 x float], [8 x float]* %3, i64 0, i64 0
 i8*  %31 = call i8* @VTABufferCPUPtr(i8* %14, float* nonnull %.sub)

The raised error comes from this LLVM code (lib/IR/Verifier.cpp):

2598   // Verify that all arguments to the call match the function type.
2599   for (unsigned i = 0, e = FTy->getNumParams(); i != e; ++i)
2600     Assert(CS.getArgument(i)->getType() == FTy->getParamType(i),
2601            "Call parameter type does not match function signature!",
2602            CS.getArgument(i), FTy->getParamType(i), I);

This error is raised if the special handling for kDLCPU is kept. I think it is because the signature for the AllocateNode is not consistent with the parameter? Any ideas for an alternative fix?

Contributor

@tqchen perhaps you'd have some input on why this code was needed in the first place?

Contributor

@tmoreau89 @tqchen @zhanghaohit

I've been searching for a bug introduced by this PR that somehow doesn't show up in CI; I've tested it locally with the docker image and still see the failure.

Anyway, if I run python/tests/onnx/test_forward.py:test_loop on main locally, it fails. If I revert the change to this file, it passes.

I'm tempted to revert this PR until we can find a better way to fix this for VTA; do you guys have a better suggestion here?

Contributor

@mbrookhart I am in favor of reverting the changes applied to this file; in a separate PR we can ensure that the error encountered by @zhanghaohit is resolved while making sure that python/tests/onnx/test_forward.py:test_loop passes.

Contributor

I don't think it would be necessary to revert the entire PR; rather, introduce a PR that just reverts the changes applied to this file. Given that the Intelfocl backend is not CI-tested, it's not going to break unit tests in TVM.

Contributor Author

It seems that the error is due to _concatenate_shape_func, where an Any is checked against a concrete shape.

The changes in this PR may introduce an Any shape, thus triggering this bug.

One quick fix is to remove the assert out[i] == inputs[j][i], "Dims mismatch in the inputs of concatenate.", as there may be an Any shape in the inputs, and that should be allowed. See the sketch of the shape function below.
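
For reference, the shape function in question looks roughly like this (paraphrased from the Relay op library's hybrid-script implementation; a sketch, not the exact source):

from tvm.te.hybrid import script

@script
def _concatenate_shape_func(inputs, axis):
    ndim = inputs[0].shape[0]
    out = output_tensor((ndim,), "int64")
    for i in const_range(ndim):
        if i != axis:
            out[i] = inputs[0][i]
            for j in const_range(1, len(inputs)):
                # The assert the quick fix would drop: with an Any dim in
                # the inputs it compares a symbolic dim to a concrete one.
                assert out[i] == inputs[j][i], \
                    "Dims mismatch in the inputs of concatenate."
        else:
            out[i] = int64(0)
            for j in const_range(len(inputs)):
                out[i] += inputs[j][i]
    return out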

Contributor

@jwfromm took a look at the IR generated with and without this code snippet; this is what he got:

it seems like it's trying to figure out how to allocate for the shape function
and without that simplifying if statement it gets into a weird recursive let construction

correct IR with fix:
allocate(v_copy_shape_func: Pointer(int64), int64, [1]) {
  attr [0] "extern_scope" = 0 {
    v_expand_dim_shape_func: Pointer(int64)[0] = 1i64
    v_expand_dim_shape_func[1] = (int64*)v_copy_shape_func[0]
  }
  attr [0] "extern_scope" = 0 {
    v_concatenate_shape_func: Pointer(int64)[0] = 0i64
    v_concatenate_shape_func[0] = ((int64*)v_concatenate_shape_func[0] + (int64*)placeholder: Pointer(int64)[0])
    v_concatenate_shape_func[0] = ((int64*)v_concatenate_shape_func[0] + (int64*)v_expand_dim_shape_func[0])
    v_concatenate_shape_func[1] = (int64*)placeholder[1]
    assert(((int64*)v_concatenate_shape_func[1] == (int64*)v_expand_dim_shape_func[1]), "Dims mismatch in the inputs of concatenate.")
    0
  }
}


naughty IR:
attr [v_expand_dim_shape_func: Pointer(int64)] "storage_alignment" = 128 {
  let v_expand_dim_shape_func = @tir.TVMBackendAllocWorkspace(1, dev_id: int32, 16u64, 0, 64, dtype=handle)
   {
    if @tir.isnullptr(v_expand_dim_shape_func, dtype=bool) {
      @tir.tvm_throw_last_error(, dtype=int32)
    }
    attr [v_copy_shape_func: Pointer(int64)] "storage_scope" = "global";
    attr [v_copy_shape_func] "storage_alignment" = 128 {
      let v_copy_shape_func = @tir.TVMBackendAllocWorkspace(1, dev_id, 8u64, 0, 64, dtype=handle)
       {
        if @tir.isnullptr(v_copy_shape_func, dtype=bool) {
          @tir.tvm_throw_last_error(, dtype=int32)
        }
         {
          attr [0] "extern_scope" = 0 {
            v_expand_dim_shape_func[0] = 1i64
            v_expand_dim_shape_func[1] = (int64*)v_copy_shape_func[0]
          }
          attr [0] "extern_scope" = 0 {
            v_concatenate_shape_func: Pointer(int64)[0] = 0i64
            v_concatenate_shape_func[0] = ((int64*)v_concatenate_shape_func[0] + (int64*)placeholder: Pointer(int64)[0])
            v_concatenate_shape_func[0] = ((int64*)v_concatenate_shape_func[0] + (int64*)v_expand_dim_shape_func[0])
            v_concatenate_shape_func[1] = (int64*)placeholder[1]
            assert(((int64*)v_concatenate_shape_func[1] == (int64*)v_expand_dim_shape_func[1]), "Dims mismatch in the inputs of concatenate.")
            0
          }
        }
      }
      if (@tir.TVMBackendFreeWorkspace(1, dev_id, v_copy_shape_func, dtype=int32) != 0) {
        @tir.tvm_throw_last_error(, dtype=int32)
      }
    }
  }
  if (@tir.TVMBackendFreeWorkspace(1, dev_id, v_expand_dim_shape_func, dtype=int32) != 0) {
    @tir.tvm_throw_last_error(, dtype=int32)
  }
}
I have a feeling this could cause some non-trivial performance regressions.

Contributor

It looks like this is fundamentally changing how shape functions get lowered; I don't think that just removing the assert is the right way to go about it?

Contributor

> It seems that the error is due to _concatenate_shape_func, where an Any is checked against a concrete shape.
>
> The changes in this PR may introduce an Any shape, thus triggering this bug.
>
> One quick fix is to remove the assert out[i] == inputs[j][i], "Dims mismatch in the inputs of concatenate.", as there may be an Any shape in the inputs, and that should be allowed.

:/ It should not be possible to pass Any to a shape function; everything should be concrete in the VM before the shape_func is called. I'm not sure I buy this.
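
For context, a minimal way to get a dynamic dim into concatenate so that the VM exercises the shape function (an assumed repro sketch, not code from the PR):

import tvm
from tvm import relay

# One input carries a dynamic (Any) first dim, so the VM must invoke
# the concatenate shape function at runtime to size the output.
x = relay.var("x", shape=(relay.Any(), 4), dtype="float32")
y = relay.var("y", shape=(1, 4), dtype="float32")
mod = tvm.IRModule.from_expr(
    relay.Function([x, y], relay.concatenate([x, y], axis=0))
)
exe = relay.vm.compile(mod, target="llvm")  # lowers the shape func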

@tmoreau89 (Contributor)

Thanks @zhanghaohit. This is converging nicely. I made some additional comments.

In addition, I'd like to request further partitioning given the large size of the PR.

(1) the following files will need to migrate to #6125:

  • src/relay/op/annotation/annotation.cc
  • python/tvm/relay/op/_tensor.py

(2) changes made for quantization should be isolated to an additional PR, this includes:

  • src/relay/quantize/realize.cc
  • python/tvm/relay/quantize/_partition.py
  • python/tvm/relay/quantize/_annotate.py

@tmoreau89 tmoreau89 (Contributor) left a comment

Please address the aforementioned changes, thank you.

@tmoreau89 (Contributor)

I've also noticed that the issue with the CI is that the VTA submodule depends on changes that haven't landed. We'll first merge the changes you have made in incubator-tvm-vta, then this one here.

@zhanghaohit (Contributor Author)

> Please address the aforementioned changes, thank you.

Done. Thanks @tmoreau89 for the comments.

@zhanghaohit (Contributor Author)

> Thanks @zhanghaohit. This is converging nicely. I made some additional comments.
>
> In addition, I'd like to request further partitioning given the large size of the PR.
>
> (1) the following files will need to migrate to #6125:
>
>   • src/relay/op/annotation/annotation.cc
>   • python/tvm/relay/op/_tensor.py
>
> (2) changes made for quantization should be isolated to an additional PR, this includes:
>
>   • src/relay/quantize/realize.cc
>   • python/tvm/relay/quantize/_partition.py
>   • python/tvm/relay/quantize/_annotate.py

Changes made for quantization have been moved to #6191.

vta/runtime/runtime.cc: outdated review threads, resolved
@tmoreau89 (Contributor)

Thanks; overall the PR is in good shape. I think that we'll need to merge the incubator-tvm-vta PR first, before we merge this one in (due to the bump of the VTA submodule).

@tmoreau89 (Contributor)

@liangfu would you mind doing a pass on this PR as well?

@tmoreau89 tmoreau89 requested a review from liangfu August 26, 2020 04:25
@tqchen tqchen changed the base branch from master to main October 11, 2020 18:28
@zhanghaohit (Contributor Author)

> @liangfu would you mind doing a pass on this PR as well?

ping @liangfu

@liangfu liangfu (Member) left a comment

Thanks for the update, and really sorry for the late reply.

I took another look, and the changes look good.

@tmoreau89 (Contributor)

apache/tvm-vta#9 is finally merged! @zhanghaohit can you please update the submodule, and let's make sure that the unit tests pass now that the ISA has changed.

Note: we'll also need to modify the bitstreams for the FPGAs that VTA supports. @liangfu are you able to update the bitstreams for DE10 in the next week or so?

@tmoreau89 (Contributor)

@zhanghaohit can you update this PR to update the VTA submodule to the latest? Currently other PRs that update the VTA submodule will break unless we merge this PR. Thanks!

@zhanghaohit (Contributor Author)

> @zhanghaohit can you update this PR to update the VTA submodule to the latest? Currently other PRs that update the VTA submodule will break unless we merge this PR. Thanks!

Sure. Apologies for the delay. I think some unit tests are failing for this PR; I'll try to fix them.

@zhanghaohit (Contributor Author)

@tmoreau89 CI doesn't seem to be working? It has been waiting for 9 hours but hasn't started yet.

@zhanghaohit (Contributor Author)

@tqchen it seems the CI is hung? Could you take a look at it? Thanks.

@tmoreau89 tmoreau89 merged commit d0a0194 into apache:main Apr 20, 2021
@tmoreau89 (Contributor)

Thank you @zhanghaohit @liangfu @vegaluisjose, this PR has been merged!

mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Apr 22, 2021
* intelfocl support

* disable tsim test

* bugfix to vta autotvm

* disable tsim test in task_python_vta_tsim.sh

* fix integration test

* update vta submodule and re-enable tsim tests

* remove unnecessary comments
echuraev pushed a commit to echuraev/tvm that referenced this pull request Apr 29, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021
@zhanghaohit zhanghaohit deleted the feature/intelfocl-pr branch May 7, 2021 02:07
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request May 11, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Jun 17, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Jul 28, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Jul 29, 2021
zhanghaohit pushed a commit to zhanghaohit/incubator-tvm that referenced this pull request Aug 5, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Aug 13, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Aug 16, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Aug 18, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Aug 19, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Aug 26, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Aug 30, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Aug 31, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Aug 31, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Sep 2, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Sep 3, 2021
mbrookhart pushed a commit to mbrookhart/tvm that referenced this pull request Sep 7, 2021
tmoreau89 pushed a commit that referenced this pull request Sep 9, 2021
* revert a change to lower_tvm_builtin.cc from #6126

* disable sim target on VTA tutorial

fix bad refactor

try again
ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
…pache#8274)

* revert a change to lower_tvm_builtin.cc from apache#6126

* disable sim target on VTA tutorial

fix bad refactor

try again
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022