[U May Need It]nvdla_runtime options #9
unhandled level 1 translation fault
1. I compiled the AlexNet Caffe model on my virtual machine, which runs Ubuntu 14.04. The compile process was OK, and I got the loadable file, output.protobuf, and the output directory, wisdom.dir, which contains layers/networks/tensors.
2. I want to run nvdla_runtime on the NVDLA VP, so I copied the output file and the output directory to the VP.
Here is the error:
/# ./nvdla_runtime --loadable output.protobuf
Can anyone help?
/# gdb |
@wujunning2011 I think your input loadable file is wrong. You can try any file and you will get the same error. I used the loadable file in the Docker image, but I get an out-of-bounds error. Any suggestions? ./nvdla_runtime --loadable BDMA_L0_0_fbuf |
@xmchen1987 Have you installed the drivers (drm.ko, opendla.ko) first? |
@xmchen1987 Can you share your loadable file? @jarodw0723 When I install drm.ko/opendla.ko, I get a level 2 translation fault. |
@wujunning2011 You can find the loadable file in https://github.com/nvdla/sw/tree/master/regression/flatbufs/kmd |
@jarodw0723 After installing the driver, it succeeds. Thanks a lot. @jarodw0723 Is the VP currently able to dump performance data, or is it just for software development? |
@xmchen1987 It is just for software development. |
@jarodw0723 @xmchen1987 When I run in Docker mode, it is OK. |
@jarodw0723 I see that cmod currently has an interface like: Do you plan to develop cmod as a performance model? |
I have the same problem as wujunning2011. |
+1, same request: to compile and run a custom model. Also, @jarodw0723 is there a schedule for when the performance profiling function will be ready? |
@wujunning2011 @geyijun @blueardour output.protobuf is not the target loadable file. I used default.nvdla and succeeded in running the test. |
@xmchen1987 Thank you very much. Maybe default.nvdla is the loadable file. |
@xmchen1987 May I know more about your "default.nvdla" test? |
@xmchen1987 Thanks. |
Hi,
also tried again: |
@blueardour I suppose it's because AlexNet is too large; it may take about 20 minutes to create the context. You can try LeNet first. |
@wujunning2011 Hi, thanks for your tips. This is the last output after 14 hours of execution. As you suggested, I will try LeNet. The computational complexity of AlexNet is about 1G MACs according to the Netscope CNN Analyzer tool. However, most of the networks I focus on are bigger than AlexNet, so if the simulator is this slow, it may be unacceptable for running my own networks. |
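For context on the MAC figure quoted above, here is a rough sketch of how a per-layer multiply-accumulate count is usually computed. The helper names are hypothetical, and the layer parameters are the standard BVLC AlexNet conv1 values (227x227x3 input, 96 filters of 11x11, stride 4), not numbers taken from this thread.

```python
def conv_output_size(input_size, kernel, stride=1, pad=0):
    """Spatial output size of a square convolution (floor division)."""
    return (input_size + 2 * pad - kernel) // stride + 1


def conv_macs(in_channels, out_channels, kernel, out_size, groups=1):
    """Multiply-accumulates for one square convolution layer."""
    per_output = kernel * kernel * (in_channels // groups)
    return out_size * out_size * out_channels * per_output


if __name__ == "__main__":
    # AlexNet conv1 (standard values, assumed here for illustration).
    out = conv_output_size(227, kernel=11, stride=4)
    print(out, conv_macs(3, 96, 11, out))  # → 55 105415200
```

Summing this over all layers gives the whole-network figure, so the first layer alone already accounts for roughly 0.1G MACs.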
@blueardour My AlexNet run is not successful either; it also gets stuck somewhere. With such a huge NN, I suggest you use Cadence's Protium or Synopsys's ZeBu. BTW, when I run the tiny LeNet, there are still some errors. I hope you can try LeNet and give me some help. |
Hi @wujunning2011, sorry for the late reply. After trying LeNet, I also failed to run it successfully. |
Hi, has anyone found a solution to this? I am having the same issue, and it seems that it is not a system virtual memory issue. Any help is appreciated. Thanks! |
@JunningWu The NVDLA VP configuration should be using 1GB of system memory. Which config file are you checking for it? Are you able to run LeNet? |
Hi, I am still having issues with AlexNet, running it with 1GB of system memory. I tried running with the latest NVDLA updates (with these, we can load a .jpg image). Please see the attached log file for more info. There are some error messages that are ignored, and it hangs at the last point shown in the log file: 20180205_pascalvoc_BoatRes227x227.jpg.log Regarding LeNet: I was able to run it all the way through without any issues (here, the input file format used is .pgm). |
I am able to reproduce it. I created #21 for debugging the AlexNet failure. |
FYI -- some of the team is out for CNY this week. I'll follow up to see who's around, but expect a little more latency on this one. Thanks! |
We have resolved this issue and will push the fix to KMD. We are waiting for some verification results. |
Great, thanks. Much appreciated! |
Hi @prasshantg, @jwise. Would it be possible to get a ballpark estimate of when the AlexNet fix will be available so we can update our team’s schedule? |
@qdchau 5th Mar 2018 |
Awesome. Thank you! |
@qdchau @ned-varnica @JunningWu The fix for AlexNet has been pushed. Please test it. |
Hi @prasshantg. The fix works for us. Thanks for your help! |
Thanks so much @prasshantg. Should we be expecting the correct output at this point? We tried this AlexNet with some images and got outputs that look like noise (negative values close to 0). On the other hand, when we run the same network on our local CPU, we get very good predictions with the same input images (1 out of 20 output values is a large positive number, and this matches the correct label). Do you have any recommendations on how to proceed with debugging? Thanks! |
@ned-varnica I think the rawdump file will contain 1000 predictions, like this http://ddl.escience.cn/f/Qdtr. |
@JunningWu do you get expected results? |
@prasshantg I am trying to figure out whether the result is indicating "CAT". |
@JunningWu In the example we are running, it has 20 outputs. The network was taken from Caffe Model Zoo http://heatmapping.org/files/bvlc_model_zoo/pascal_voc_2012_multilabel/deploy_x30.prototxt It was trained on the following 20 categories:
In your example, looking at the rawdump file, it seems you are seeing the same issue as we do. All the entries (in your case 1000 of them, in our case 20 of them) show very small values and nothing stands out. At least, this is our experience so far. @prasshantg |
This could be due to the missing mean subtraction feature in the compiler. Let me confirm it. |
Thanks @prasshantg. I agree this is part of it, but there is probably more to it. FYI, I tried removing mean subtraction in our local simulator (just to test this hypothesis) and the result still looks OK: it can still produce outputs showing that 'Boat' is much more likely than the other 19 outputs. The confidence is worse (compared to the confidence when the appropriate means are used), but looks fine. On the other hand, the outputs we get in the file 20180306_pascalvoc_BoatRes227x227.jpg.dimg.txt (please see previous message) are not showing this behavior. |
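The check described in the comments above (does one output clearly stand out from the rest?) can be sketched in a few lines. This is only an illustration: it assumes the dumped output is a plain-text file of whitespace-separated numbers, which is not confirmed in this thread, and all function names are hypothetical.

```python
import math


def parse_scores(text):
    """Parse whitespace-separated numbers (assumed dump format) into floats."""
    return [float(token) for token in text.split()]


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, k=5):
    """Return (index, score) pairs for the k largest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def top_gap(scores):
    """Difference between the best score and the runner-up."""
    ranked = sorted(scores, reverse=True)
    return ranked[0] - ranked[1]


if __name__ == "__main__":
    # Made-up values for illustration only, not real NVDLA output.
    sample = parse_scores("-0.01 -0.02 4.7 -0.015")
    print("top entries:", top_k(sample, k=2))
    print("gap to runner-up:", top_gap(sample))
    print("softmax of best:", max(softmax(sample)))
```

Running the same script on the CPU reference output and on the NVDLA dump gives a quick side-by-side view: a healthy classifier output should show a large gap, while the noise-like results reported here would show a gap close to zero.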
I'm still not clear. Also, how do I create a loadable file from output.protobuf, which is the compiler output? |
@ferin08 Output from compiler is default.nvdla |
After giving the prototxt and Caffe model file that I got from the DIGITS model output using this command:
I get an output file from the compiler called output.protobuf |
Hi @ferin08 , After running the above command I get "basic.nvdla" file in the current folder from where it is run. Check for *.nvdla in your current folder. Thanks. |
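To make the "check for *.nvdla" step concrete, here is a minimal sketch that lists candidate loadables in a directory and builds the matching runtime command line. It only uses the --loadable and --image flags that appear in the usage text in this thread; the helper names and the default ./nvdla_runtime path are assumptions.

```python
from pathlib import Path


def find_loadables(directory="."):
    """Return the *.nvdla files in `directory`, sorted by name."""
    return sorted(Path(directory).glob("*.nvdla"))


def runtime_command(loadable, image=None, runtime="./nvdla_runtime"):
    """Build an nvdla_runtime argument list from flags shown in the usage text."""
    args = [runtime, "--loadable", str(loadable)]
    if image is not None:
        args += ["--image", str(image)]
    return args


if __name__ == "__main__":
    loadables = find_loadables()
    if not loadables:
        print("No *.nvdla file found; output.protobuf is not the loadable.")
    for loadable in loadables:
        print(" ".join(runtime_command(loadable)))
```

The script would run on the machine where the compiler ran; the printed command is then what gets executed on the VP after copying the .nvdla file over.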
Hi @smsharif1991 Is there any documentation for the compiler and runtime test other than http://nvdla.org/sw/test_application.html |
Hi @ferin Thanks. |
How do we use '-s' to "launch test in server mode"? Also, I tested a few images using --rawdump. |
Which model are you using to test? |
lenet |
Still have not been able to resolve this issue. Also, I tested a few images using --rawdump. Is there any way we can solve this problem? |
@prasshantg @smsharif1991 |
./nvdla_runtime -loadable output.protobuf
Usage: ./nvdla_runtime [-options] --loadable <loadable_file>
where options include:
    -h                          print this help message
    -s                          launch test in server mode
    --loadable <loadable_file>
    --image <image_file>
    --imgshift <shift_value>
    --imgscale <scale_value>
    --imgpower <power_value>
    --softmax