
SSD Object Detector Dual output #72

Closed
bomerzz opened this issue Jan 17, 2022 · 4 comments

bomerzz (Contributor) commented Jan 17, 2022

Hello,

I've been trying to run the tf2_ssdinception_v2 model with DPU-PYNQ on an Ultra96-V2, using a modified version of the dpu_tf_inceptionv1.ipynb notebook.

Using the default shapeOut = tuple(outputTensors[0].dims) I get an output array of shape (1, 1917, 91), which I assume holds the confidence scores for each box.

If I access the second output tensor, shapeOut2 = tuple(outputTensors[1].dims) gives me a shape of (1, 1917, 4), which I assume holds the box locations.

However, when I try to run with the second output tensor, the following error is thrown:

```
job_id = dpu.execute_async(input_data, output_data2)
double free or corruption (out)
Aborted (core dumped)
```

I'm not sure how to proceed with the detection: the graph generated by analyze_subgraphs.sh shows that the subgraph has two outputs, both of which need CPU post-processing to obtain the final result.
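For context, here is a hedged, numpy-only sketch of how two tensors with these shapes would typically pair up in SSD post-processing. The data is random, the 0.5 threshold is hypothetical, and no NMS or box decoding is done; it only illustrates the per-anchor pairing of the (1, 1917, 91) and (1, 1917, 4) outputs.

```python
import numpy as np

# scores: (1, 1917, 91) class confidences per anchor
# boxes:  (1, 1917, 4)  box encoding per anchor
rng = np.random.default_rng(0)
scores = rng.random((1, 1917, 91)).astype(np.float32)
boxes = rng.random((1, 1917, 4)).astype(np.float32)

best_class = scores[0].argmax(axis=1)  # (1917,) best class id per anchor
best_score = scores[0].max(axis=1)     # (1917,) its confidence
keep = best_score > 0.5                # naive threshold; real SSD adds NMS
kept_boxes = boxes[0][keep]            # (k, 4) surviving box encodings
print(kept_boxes.shape[1])             # 4
```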

Any help will be appreciated. Thank you!

[Graph image attached]

skalade (Collaborator) commented Jan 17, 2022

Hi there,

I think this is a limitation of DpuOverlay: looking at dpu.py, only the first subgraph is used to create a runner object. We might support multiple subgraphs in the future, but for now you would have to use VART directly and run separate runner instances, as in the linked example.

Thanks
Shawn

bomerzz (Contributor, Author) commented Jan 18, 2022

Hi @skalade, thanks for the reply.
I've tried the method mentioned above, but it still throws the double-free error when using an output buffer sized for the second output (1, 1917, 4). From the graph image above, it looks like the DPU subgraph can output two different tensors. Is there a way to set the subgraph's output to the second tensor? Below is the code I used, where I set output_data to the size of the second output. I'm not sure this is the correct way to do it.

Thanks!

```python
shapeOut2 = tuple(outputTensors[1].dims)
outputSize2 = int(outputTensors[1].get_data_size() / shapeIn[0])
output_data2 = [np.empty(shapeOut2, dtype=np.float32, order="C")]

dpu_1 = vart.Runner.create_runner(subgraph[0], "run")
dpu_2 = vart.Runner.create_runner(subgraph[0], "run")
job_id = dpu_1.execute_async(input_data, output_data)
dpu_1.wait(job_id)
print("Job 1")
job_id2 = dpu_2.execute_async(input_data, output_data2)
dpu_2.wait(job_id2)
print("Job 2")
```
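For reference, VART runners generally expect one buffer per output tensor of the subgraph, passed together in a single list to execute_async, rather than one call per tensor. A hedged sketch of that pattern, using the shapes from this thread; the vart calls themselves are commented out since they need the board and the compiled model to run.

```python
import numpy as np

# One buffer for each of the runner's output tensors, in a single list.
shape_scores = (1, 1917, 91)  # per-anchor class confidences
shape_boxes = (1, 1917, 4)    # per-anchor box encodings

output_data = [
    np.empty(shape_scores, dtype=np.float32, order="C"),
    np.empty(shape_boxes, dtype=np.float32, order="C"),
]

# On the board (untested here):
# dpu = vart.Runner.create_runner(subgraph[0], "run")
# job_id = dpu.execute_async(input_data, output_data)
# dpu.wait(job_id)
```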

skalade (Collaborator) commented Jan 18, 2022

Hi, since this doesn't seem like a DPU-PYNQ bug, I'd recommend posting on the PYNQ discussion forum, or as a more general VART question on the Xilinx forums / Vitis AI GitHub issues.

I haven't really worked with models like SSD before, so I can't give much advice. But from parsing your model, it looks like the fourth CPU subgraph corresponds to the box encodings you can see on one of the outputs in the graph image you provided. So maybe you could grab those somehow; there should be some examples in the Vitis AI library.
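In the Vitis AI examples, the subgraph list is usually obtained with xir and filtered on each subgraph's "device" attribute. The xir calls are shown as comments (they need the .xmodel file); a small stand-in class below makes the filtering logic itself runnable, with hypothetical subgraph names.

```python
# On the board this would be:
#   graph = xir.Graph.deserialize("model.xmodel")
#   subgraphs = graph.get_root_subgraph().toposort_child_subgraph()
# A minimal stand-in mimicking the has_attr/get_attr interface:
class Subgraph:
    def __init__(self, name, device=None):
        self.name = name
        self._device = device

    def has_attr(self, key):
        return key == "device" and self._device is not None

    def get_attr(self, key):
        return self._device

# Hypothetical subgraph names for illustration only.
subgraphs = [
    Subgraph("input", "CPU"),
    Subgraph("dpu_body", "DPU"),
    Subgraph("box_decode", "CPU"),
]
cpu_sgs = [s for s in subgraphs
           if s.has_attr("device") and s.get_attr("device") == "CPU"]
print([s.name for s in cpu_sgs])  # ['input', 'box_decode']
```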

[Image: subgraph listing attached]

Hope this helps a bit...

I'm going to close this issue because this isn't a core DPU-PYNQ problem. If you still have issues I encourage you to post on one of the forums!

Thanks
Shawn

skalade closed this as completed Jan 18, 2022
bomerzz (Contributor, Author) commented Jan 19, 2022

Hi @skalade thanks for the help! I'll give the forum a shot.
