AlexNet NN on NVDLA small #70
Comments
Hi, the AlexNet test has a fairly long run time. Please wait long enough, for example ~6000 s. Best regards, |
@giusecesa4 Could you check what happens when you reserve more RAM for NVDLA and disable dla_info and dla_debug in the kernel module? I get this assertion:
UPDATE: Added extra dump before assertion. Looks like some issue with input data structure.
|
Could you please share your DSA file with me, if possible? I am new to FPGAs, and I have spent several days setting up the environment without any progress. Thanks a lot.
Junning Wu
Email: wujunning11@gmail.com
On 07/04/2018 20:30, giusecesa4 wrote: Hi, I am running the NVDLA small architecture on an FPGA.
I succeeded in running almost all the flatbufs tests available in the UMD. I just have a problem with the loadable NN_L0_1_small_fbuf, which is supposed to be an AlexNet NN precompiled for the small architecture. The process stalls at a certain point and the FPGA stops working. Does anyone have the same problem or know how to solve it?
When will the compiler for the NVDLA small be available?
|
@HaiqingSun Even if I wait a long time, the process stalls. What about the compiler? Looking at the forum, I see that it is going to be committed in a few days. @mmaciag I am going to try it very soon. Thank you for the advice. I also have another question: is it possible to inspect the properties of the loadable files? I am trying to measure the performance of the different engines, but this is not possible without knowing the characteristics of the loadable I give as input. Thank you for your collaboration. |
@giusecesa4 I think you don't have to, since your logs contain an assertion failure as well. My first log had it too, I just had to read it more carefully :). The atomic size obviously should be 8 for nv_small. There are a lot of 'Skip dequeue op' messages due to the assertion. Exiting from the erroneous state could also be improved, IMHO. Right now it hangs the application and the kernel driver, which cannot be unloaded until the system is rebooted. |
Hi @giusecesa4, I ran NN_L0_1_small_fbuf on FPGA and platform and it ran successfully. Here is the output:
insmod drm.ko
insmod opendla_small.ko
[ 70.747544] opendla: loading out-of-tree module taints kernel.
./nvdla_runtime --loadable ../../../NN_L0_1_small_fbuf
creating new runtime context...
Thanks. |
Thanks for logs. I will try to track down why the assertion fails. Let me know if you could give me a clue where to start. |
@smsharif1991 Which FPGA are you using? At which frequency are you running the test? @mmaciag I was thinking that the "reserved" memory used to run bigger networks might not be big enough. I will try this. Did you address this point? |
@giusecesa4 Ok... That was 🤦♂️ kind of issue. Device tree node obviously needs to be set to
|
I was just about to write you the same thing. I tried to run AlexNet again: the assertion problem has been solved, but it still does not work. Another issue could be related to the clock of the FPGA. I used 50 MHz, but maybe I should also change the Memory Interface Device Frequency (in the Zynq IP), because the data transfer may be too fast. Did you leave the options for the other clocks as they are? |
Tested with 1GB reserved memory. I have disabled traces, and left only that line before assertion.
|
Unfortunately, even after changing the amount of reserved memory, the process still stalls. The problem could be related to the convolution engine, which is the engine that stalls (this happens when I run more than one convolution test). Do you have any ideas about this? The memory I reserved is from |
@giusecesa4 No idea. I changed the DTSI and it worked like a charm. I am also using the same memory region. @smsharif1991 How can I verify that |
Hi @mmaciag, I have another question about the Vivado design. Did you reserve more than 64K for the CSB part of the NVDLA, or did you keep the default configuration? |
I reserved 64K. NVDLA engines are aligned to 0x1000 boundaries, but they all fit within those 64K. |
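To make the address arithmetic above concrete, here is a minimal C sketch that checks whether a register block offset is 0x1000-aligned and still inside a 64K CSB window. The macro and function names are hypothetical and not taken from the NVDLA sources, and the sketch assumes each engine occupies at most one 0x1000-sized slot.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CSB_WINDOW_SIZE 0x10000u /* 64K reserved for the CSB register space */
#define ENGINE_ALIGN    0x1000u  /* engines sit on 0x1000 boundaries */

/* True if a block starting at `offset` is 0x1000-aligned and its whole
 * 0x1000-sized slot lies inside the 64K window (hypothetical helper). */
static bool csb_offset_ok(uint32_t offset)
{
    if (offset % ENGINE_ALIGN != 0)
        return false;
    return offset <= CSB_WINDOW_SIZE - ENGINE_ALIGN;
}

/* Check a whole table of engine offsets. */
static bool csb_map_ok(const uint32_t *offsets, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (!csb_offset_ok(offsets[i]))
            return false;
    return true;
}
```

A 64K window holds sixteen such slots, so the last valid offset under this assumption is 0xF000.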
@mmaciag I think the problem is that the ISR is not called when running the second convolution test. Which FPGA interrupt did you connect to the NVDLA? I used the |
I don't have the design open right now, but I am sure it was irq0. I don't have a clue why it does not work for you, but you could quite easily verify whether interrupts are working at all:
EDIT: |
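The snippet from this comment is not preserved in the thread. Purely as an illustration, and not as the method mmaciag posted, one common way to see whether an interrupt fires on Linux is to compare its counters in /proc/interrupts before and after a test run. The C sketch below sums the per-CPU counters on a single line of that file; the function name and the sample line layout are assumptions.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters on one /proc/interrupts line, e.g.
 *   " 61:       1234          5     GIC-0  61 Level     nvdla"
 * Returns -1 if the line contains no ':' (such as the CPU header line). */
static long irq_line_total(const char *line)
{
    const char *p = strchr(line, ':');
    long total = 0;

    if (p == NULL)
        return -1;
    p++;
    for (;;) {
        char *end;
        long n;

        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;          /* first non-numeric column: controller name */
        n = strtol(p, &end, 10);
        if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
            break;          /* token only starts with digits: not a counter */
        total += n;
        p = end;
    }
    return total;
}
```

If the total for the NVDLA line does not increase between two reads taken around a test, the ISR was most likely never invoked during that test.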
Hi @mmaciag, thanks for your explanation. In my project I observed that the interrupt is not raised when running the second convolution, that is, the convolution running at GROUP 1. I think the problem could be related to the clock I used for this project. When changing the frequency, I just changed this value: I changed the divisors of PL0 in order to obtain a frequency of 33 MHz (this is just an example, I also tried 10, 20 and 50). PL0 is the only clock connected to all the elements of the block diagram. When you changed the frequency of the Zynq, did you change ONLY this parameter, or other parameters as well? Also, did you use any clock buffers? I am sorry, but I am not an expert in Vivado, so any help would be really appreciated. Thank you. |
If you think this is a timing problem, just check the timing report in "Open Implemented Design". WNS should have a good margin and definitely cannot be negative. My rule of thumb is to keep it above 1 ns. With a 10 MHz clock it is actually hard not to meet timing closure. I did not instantiate global clock buffers manually, since Vivado infers them pretty well, unless you forget to disable clock gating in NVDLA, in which case the design is completely unroutable. The clock frequency you choose in the IP configurator does not need to match any other clock in the system, because the Zynq AXI4 ports support clock domain crossing. On larger -2 grade devices I managed to synthesize at 150 MHz and still had WNS around ~2 ns. For a ZU3EG-1 device it should go up to 90 MHz, but you need to disable CDP or PDP in order to fit the logic. BTW, did you try the latest nvdla/hw master? |
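As a rough illustration of the timing arithmetic above, the C sketch below converts a clock frequency to a period, applies the 1 ns rule of thumb to a reported WNS, and estimates the highest frequency the same implementation run might reach. The function names are hypothetical, and the f_max formula is a common approximation based on setup slack only (the critical path takes roughly period minus WNS); it is not something stated in this thread.

```c
#include <stdbool.h>

/* Clock period in nanoseconds for a frequency given in MHz. */
static double clock_period_ns(double freq_mhz)
{
    return 1000.0 / freq_mhz;
}

/* Rule of thumb from the comment above: keep WNS above 1 ns. */
static bool has_comfortable_margin(double wns_ns)
{
    return wns_ns > 1.0;
}

/* Approximate highest frequency (MHz) for the same implementation run:
 * the critical path takes (period - WNS), so f_max = 1000 / (period - WNS).
 * Assumes wns_ns is smaller than the clock period. */
static double estimated_fmax_mhz(double freq_mhz, double wns_ns)
{
    return 1000.0 / (clock_period_ns(freq_mhz) - wns_ns);
}
```

For example, a 100 MHz constraint has a 10 ns period; with 2 ns of WNS the critical path is about 8 ns, which corresponds to roughly 125 MHz. Hold slack (WHS) is independent of the clock period and has to be checked separately.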
Without disabling clock gating, the only problem I have is with the worst hold slack (WHS), which in any case is very small. |
@smsharif1991 @prasshantg @giusecesa4: is The image speedboat is referenced in this thread: #16. Also, do I have to pre-process the image such that: |
I can run all nv_small flatbufs tests successfully on FPGA/zc706 using a 32-bit system, and I use the rawdump output.dimg obtained from the VP test environment to verify the correctness of the FPGA test runs.
|
@anakin1028 I think NN_L0_1_small_fbuf is an AlexNet-like network. The DTB file must be modified to reflect your hardware info, such as the compatible field, the address range, etc. Please refer to this thread: nvdla/hw#110. I implemented nv_small on a zc706 and run the software on an ARM Cortex-A9, which communicates with NVDLA through an AXI interface (data path) and an AXI-to-APB bridge (register path). |
@ddkevin Thanks a lot. Your comment is really helpful. I will take a look at that thread. |
@smsharif1991 Is NN_L0_1_small_fbuf a sanity test? Changes to the --image, --normalize and --mean values have no effect on output.dimg. With the --rawdump option, the output is always the same one-dimensional array (length 142560). How should the .dimg file be analyzed? Is it an image file or a report? How do I read it correctly? |
For the NN_L0_1_small_fbuf sanity test, the options --image, --normalize and --mean have no effect at all because the input tensor is zero. |
@ddkevin Do you mean that the --image argument has no effect, since the test is hard-coded to a zero input tensor? |
@ddkkevin Thanks for your answer. So it is a compiler problem, which cannot be solved until NVIDIA releases the nv_small compiler. Is NN_L0_1_small_fbuf just a sanity test? |
@anakin1028 I think the input image data is hard coded internally and there is no need to input image data from outside. |
Hi again @mmaciag, I am running different configurations of nv_small on a Zynq FPGA and am trying to compare run-time performance. In your comment you mentioned that you turned off traces for the nvdla kernel module. I'm interested in this because I believe the run time of the flatbuf tests is influenced by the debug messages (around 6 s), and I would like to reduce them to make run-time performance more comparable. |
Hi @nookfoo, As I remember I just added some #ifdefs around dla_debug/dla_info/dla_warn/dla_error functions in nvdla_core_callbacks.c which I could control easily from Makefile. As I also remember it influenced the driver performance significantly. |
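The idea described above can be sketched roughly as follows. This is a user-space illustration only, not the actual change made in nvdla_core_callbacks.c: the macro DLA_TRACE_ENABLED, the function name dla_debug_sketch and the counter are hypothetical, and a kernel module would call a kernel print routine such as vprintk instead of vprintf.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical switch: build with -DDLA_TRACE_ENABLED to keep the messages. */
static int dla_messages_emitted; /* counts messages that were actually printed */

static void dla_debug_sketch(const char *fmt, ...)
{
#ifdef DLA_TRACE_ENABLED
    va_list args;

    va_start(args, fmt);
    vprintf(fmt, args);   /* a kernel module would use vprintk here */
    va_end(args);
    dla_messages_emitted++;
#else
    (void)fmt;            /* tracing compiled out: no formatting, no I/O */
#endif
}
```

With a kbuild Makefile, such a flag could be passed with a line like `ccflags-y += -DDLA_TRACE_ENABLED` (again an assumed name). Leaving the flag out keeps the call sites unchanged but removes the printing cost from the driver.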
Thanks, I was able to implement your tip and also experienced significant improvement to run time of the flatbuf tests |
@giusecesa4 @JunningWu @nookfoo @ddkkevin I am trying to run the NN_L0_1_small_fbuf test on NVDLA small on a ZCU102 FPGA. Running PDP/PDP_L0_0_small_fbuf cannot continue. Did you have the same problem? |
~6000 s? But I waited for 12000 s. The process stalls at a certain point and the FPGA stops working. I am trying to run the NN_L0_1_small_fbuf test on NVDLA small on a ZCU102 FPGA. Running PDP/PDP_L0_0_small_fbuf cannot continue. Did you have the same problem? What might be the problem? |
Can you share the test code for the interrupt check you mentioned? |
Hi @smsharif1991, I was not able to get the test to pass. Do I need to change something in the UMD before compilation? Can you please guide me in resolving this issue? |
Hi, I am running the NVDLA small architecture on an FPGA.
I succeeded in running almost all the flatbufs tests available in the UMD. I just have a problem with the loadable NN_L0_1_small_fbuf, which is supposed to be an AlexNet NN precompiled for the small architecture. The process stalls at a certain point and the FPGA stops working. Does anyone have the same problem or know how to solve it?
When will the compiler for the NVDLA small be available?