
Object detection - CPU resources #18

Closed
aaguiar96 opened this issue Nov 4, 2019 · 7 comments

Comments

@aaguiar96

aaguiar96 commented Nov 4, 2019

Hello,

I'm performing object detection the same way as classify_image.cc, but using the detection engine and running it in a loop over several images.
However, when I do so, it consumes 100% of one of my CPU cores.
I didn't expect this, since the detection is supposed to be executed on the USB accelerator. Right?

Thanks in advance.

@aaguiar96 aaguiar96 changed the title Classification example execution - CPU resources Object detection - CPU resources Nov 4, 2019
@Namburger

Namburger commented Nov 4, 2019

@aaguiar96 Can you give some code snippets, as well as the models used? It's a little hard to tell from the given info, but yes, most of the inference work will be delegated to the TPU (depending on the model). Keep in mind that input processing is executed on the CPU.
Edit: some data transfer will also use the CPU.
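
A minimal sketch of that split, for illustration (LoadAndResizeImage is a hypothetical stand-in for your own CPU-side decode/resize code; the engine calls are the coral C++ API used later in this thread):

#include <cstdint>
#include <string>
#include <vector>
#include "src/cpp/detection/engine.h"  // header path as in the edgetpu repo; adjust for your build

// Hypothetical helper: decode an image file and resize it to the model's
// input dimensions, returning the flattened uint8 tensor. Runs on the CPU.
std::vector<uint8_t> LoadAndResizeImage(const std::string& path,
                                        int height, int width, int channels);

void RunLoop(const std::string& model_path,
             const std::vector<std::string>& image_paths) {
  coral::DetectionEngine engine(model_path);           // loads the model, opens the TPU
  const auto shape = engine.get_input_tensor_shape();  // {1, height, width, channels}
  for (const auto& path : image_paths) {
    // CPU: decoding and resizing the image into the input tensor.
    std::vector<uint8_t> input =
        LoadAndResizeImage(path, shape[1], shape[2], shape[3]);
    // TPU: ops mapped by the edgetpu_compiler run on the accelerator;
    // the CPU still drives the USB transfer of input/output tensors.
    const auto results = engine.DetectWithInputTensor(input);
  }
}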

@aaguiar96

aaguiar96 commented Nov 4, 2019

My main question is:

  • If I use the detection engine, for example as:
engine             = new coral::DetectionEngine(params->model);   // load the model, open the TPU
input_tensor_shape = engine->get_input_tensor_shape();            // {1, H, W, C}
labels             = coral::ReadLabelFile(params->labels);
...
// Flatten the image into the model's input tensor, then run detection on it.
std::vector<uint8_t> input_tensor = coral::GetInputFromImage(
    in_image, {input_tensor_shape[1], input_tensor_shape[2], input_tensor_shape[3]});
auto results = engine->DetectWithInputTensor(input_tensor);

will this code be delegated to the USB Accelerator device?

I'm using a retrained MobileNet model, compatible with the edgetpu.

Thanks in advance.

@Namburger

Namburger commented Nov 4, 2019

@aaguiar96 Hi, as far as I can tell, the code looks okay at a glance. As for your retrained model, have you compiled it for the edgetpu? And could you also confirm with htop that it is indeed the CPU being used?

@aaguiar96

aaguiar96 commented Nov 4, 2019

Yes, it is compiled for the edgetpu.
So, should this code be working? I profiled the code, and the Invoke() routine in the RunInference() function is the one responsible for most of the CPU usage.

I think this should not happen if the detection is running on the coral device.
Am I wrong?

@Namburger

Hi, you're not wrong: running inference after creating a DetectionEngine should automatically delegate work to the TPU. However, I'm wondering whether all operations were mapped to the edgetpu during compilation. Any operation that was not mapped to the TPU will be executed on the CPU. Can you provide the log file generated by the edgetpu_compiler, by any chance? It gives very nice details on which operations are mapped to the edgetpu and which are not.
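
For reference, the compile step looks like this (hypothetical input file name; the two output names are the compiler's defaults):

edgetpu_compiler retrained_mobilenet.tflite
# writes retrained_mobilenet_edgetpu.tflite and
# retrained_mobilenet_edgetpu.log; the .log lists, per operation,
# whether it was mapped to the Edge TPU or will run on the CPU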

@aaguiar96

Ok, you saved me... My bad!

I did not compile the model.
I only converted it to .tflite and assumed that conversion was the compile step.

Thank you for your help.
I'm closing the issue.

@Namburger

No problems!
