
[TOPI] Conv2d Schedule for Intel HD Graphics Target fails and produces wrong output #1420

Closed
rajh619 opened this issue Jul 11, 2018 · 9 comments


rajh619 (Contributor) commented Jul 11, 2018

nnvm.compiler.build() fails for the intel_graphics target.
Sample model: https://s3.amazonaws.com/download.onnx/models/opset_3/resnet50.tar.gz
Error log:

Traceback (most recent call last):
  File "C:\Users\rg\Documents\Visual Studio 2015\Projects\nnvm_tvm_resnet\src\nnvm_tvm_igpu_.py", line 144, in <module>
    graph, lib, params = nnvm.compiler.build(sym, tvm.target.intel_graphics(), input_dict, params=params)
  File "C:\tvm\nnvm\python\nnvm\compiler\build_module.py", line 294, in build
    graph = graph.apply("GraphFusePartition").apply("GraphFuseCompile")
  File "C:\tvm\nnvm\python\nnvm\graph.py", line 234, in apply
    check_call(_LIB.NNGraphApplyPasses(self.handle, npass, cpass, ctypes.byref(ghandle)))
  File "C:\tvm\nnvm\python\nnvm\_base.py", line 75, in check_call
    raise NNVMError(py_str(_LIB.NNGetLastError()))
nnvm._base.NNVMError: TVMCall CFunc Error:
Traceback (most recent call last):
  File "C:\tvm\python\tvm\_ffi\_ctypes\function.py", line 54, in cfun
    rv = local_pyfunc(*pyargs)
  File "C:\tvm\nnvm\python\nnvm\top\nn.py", line 164, in compute_contrib_conv2d_NCHWc
    strides, padding, layout, out_layout)
  File "<decorator-gen-40>", line 2, in conv2d_NCHWc
  File "C:\tvm\python\tvm\target.py", line 345, in dispatch_func
    return dispatch_dict[k](*args, **kwargs)
TypeError: _decl_conv2d() takes from 6 to 7 positional arguments but 9 were given

The call stack hits from here:
https://github.com/dmlc/tvm/blob/fd1a572058aef5a07e1e1032e26e67fe1906f9b2/nnvm/python/nnvm/top/nn.py#L169-L170

https://github.com/dmlc/tvm/blob/fd1a572058aef5a07e1e1032e26e67fe1906f9b2/nnvm/python/nnvm/top/nn.py#L192-L193

and further calls the intel_graphics schedules below.
But the conv2d implementation for Intel graphics is missing the layout and out_layout function parameters, as shown below:
https://github.com/dmlc/tvm/blob/fd1a572058aef5a07e1e1032e26e67fe1906f9b2/topi/python/topi/intel_graphics/conv2d.py#L60

https://github.com/dmlc/tvm/blob/fd1a572058aef5a07e1e1032e26e67fe1906f9b2/topi/python/topi/intel_graphics/conv2d.py#L99

After working around this error by adding default parameters layout=None and out_layout=None, it compiled successfully.
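A minimal sketch of that workaround in topi/python/topi/intel_graphics/conv2d.py (the decorator, imports, and compute body stay as they are in the file; parameter names other than layout and out_layout are assumed from the generic conv2d_NCHWc interface, not copied from the file):

```python
# Hedged sketch only: accept the two extra arguments (with defaults) that
# compute_contrib_conv2d_NCHWc forwards, so the nine positional arguments
# no longer overflow the function signature.
def _decl_conv2d(data, kernel, num_filter, kernel_size, stride, padding,
                 layout=None, out_layout=None, out_dtype='float32'):
    # original Intel graphics compute declaration, unchanged
    ...
```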
But now the prediction output (inference result) turns out to be wrong with tvm.target.intel_graphics()!
[I tested in the default mode, i.e. target='opencl', and the prediction result is good.]
Why does the inference output change when using tvm.target.intel_graphics()?

rajh619 changed the title from "[TOPI] Conv2d for Intel HD Graphics Target fails and produces wrong output" to "[TOPI] Conv2d Schedule for Intel HD Graphics Target fails and produces wrong output" on Jul 11, 2018
Laurawly (Contributor) commented Jul 11, 2018

@rajh619 Are you using opt_level=3? It seems that a recent PR, #1299, changed the layout-transform API, and I haven't updated accordingly yet. It should work with opt_level=2; I have tested on Intel HD Graphics 500. Also, which hardware are you running on?
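A minimal sketch of building with opt_level=2 (sym, params, and the input shape dict are assumed to come from the original script):

```python
import nnvm.compiler
import tvm

target = tvm.target.intel_graphics()
input_shapes = {"data": (1, 3, 224, 224)}  # assumed ResNet-50 input shape

# AlterOpLayout only runs at opt_level >= 3, so opt_level=2 avoids the
# conv2d_NCHWc path that hits the signature mismatch above.
with nnvm.compiler.build_config(opt_level=2):
    graph, lib, params = nnvm.compiler.build(sym, target, input_shapes,
                                             params=params)
```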

tqchen (Member) commented Jul 12, 2018

Thanks @Laurawly for answering. Is it possible for us to move this to http://discuss.tvm.ai/ and, when we have actionable items, open issues to resolve them? Thanks!

rajh619 (Contributor, Author) commented Jul 12, 2018

@Laurawly Yes, you are correct, I was using opt_level=3.
I have tried it with opt_level=2 and found that it works. But, as I said before, when using opt_level=2 the prediction accuracy changes a lot, which is a major concern since the output changes!

I have a test script to test with here. If I set target='opencl' the prediction is correct, but if the target is set to tvm.target.intel_graphics(), the prediction is wrong.
I am testing on Intel HD Graphics 4600.
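A minimal sketch of the comparison (sym, params, and a preprocessed input x are assumed to be prepared as in the test script, and a (1, 1000) output shape is assumed for ResNet-50):

```python
import numpy as np
import nnvm.compiler
import tvm
from tvm.contrib import graph_runtime

def predict(target):
    # Build and run the same model for the given target on the OpenCL device.
    graph, lib, built_params = nnvm.compiler.build(
        sym, target, {"data": x.shape}, params=params)
    module = graph_runtime.create(graph, lib, tvm.opencl(0))
    module.set_input("data", tvm.nd.array(x.astype("float32")))
    module.set_input(**built_params)
    module.run()
    out = module.get_output(0, tvm.nd.empty((1, 1000)))  # assumed output shape
    return np.argmax(out.asnumpy())

print(predict("opencl"))                     # prediction is correct
print(predict(tvm.target.intel_graphics()))  # prediction is wrong on HD 4600
```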

@tqchen this issue is w.r.t. the existing implementation. Should we move to the forum?

Thanks

Laurawly (Contributor) commented:

@rajh619 What's your hardware specification? I tested both the performance and accuracy on Intel HD Graphics 500 and there's no problem on that hardware.

tqchen (Member) commented Jul 12, 2018

@rajh619 The community will always try to solve the problem together, be it on the forum or in an issue ;) The forum is preferred because issues are for actionable items: we aggressively close issues and expect them to be active, actionable items that can get closed in an expected time span (so we won't have a pile of issues that get missed).

rajh619 (Contributor, Author) commented Jul 13, 2018

@Laurawly My hardware specs are: iGPU: Intel HD 4600; CPU: Intel Core i5-4590S, 8 GB RAM.

Using opt_level=2, the accuracy changes.
I have tried the ResNet-50 model mentioned below:
https://s3.amazonaws.com/download.onnx/models/opset_3/resnet50.tar.gz

You can load the model and check the change in accuracy with this script:
https://gist.github.com/rajh619/bbcbb2e776c7c24a016bb61bf1a2d156

rajh619 (Contributor, Author) commented Jul 13, 2018

@Laurawly I have an update:
I tested on Intel HD Graphics 520, and the prediction is correct for opt_level=2.
I also modified the code to use opt_level=3, and the prediction is correct as well.
The problem, I think, is that I was using the Intel HD 4600, which is not in the "Intel® Processor Graphics Gen9" category. I can also see an improvement in performance!
It would be great if schedules for other operations like dense, pooling, etc. were available.
Thanks

Laurawly (Contributor) commented:

@rajh619 Glad to hear that you found the problem. Yeah, I agree that the intel_graphics target's schedules are more suitable for Intel integrated graphics cards, while when you use independent Intel graphics the original CUDA schedules can do the job better. We'll add more coverage of both operators and networks.

@tqchen Maybe we are good to close the issue.

tqchen closed this as completed on Jul 13, 2018
tqchen (Member) commented Jul 13, 2018

Thanks for all the discussion. This issue is closed; let us move further discussion to the forum.
