Major challenges to integrating hard real-time image processing using neural networks in MPSoCs

philippemilletresearch edited this page Jan 30, 2019 · 5 revisions

Guideline Information

| Item | Value |
|------|-------|
| Guideline Number | 31 |
| Guideline Responsible (Name, Affiliation) | Alvin Sashala Naik, Thales |
| Guideline Reviewer (Name, Affiliation) | Pedro Machado, Sundance |
| Guideline Audience (Category) | Application developers |
| Guideline Expertise (Category) | Application developers, System Architects |
| Guideline Keywords (Category) | Neural Networks, Computer vision, Deep-learning |

Guideline advice

To integrate artificial neural networks into low-power embedded computer vision under embedded and real-time constraints, it is necessary to rework the network. The methodology is very similar to that used for other applications, but it is applied at a higher level of abstraction.

Insights that led to the guideline

There are barriers to implementing computationally intensive deep-learning algorithms on low-power embedded systems-on-chip under real-time constraints, because embedded resources are limited in terms of computing power and of memory bandwidth and size. It is very often impossible to execute a neural network designed on servers directly on an embedded target: the model produced by training a neural network does not take constraints such as memory size into account.

Some processors, designed for neural network execution or already well suited to accelerating the computations found in neural networks, come with a tool chain that adapts the operators of the neural network to the target.

When such a tool chain does not exist, or when it is not able to shrink the network to fit the embedded platform, some steps have to be done manually.

Recommended implementation method of the guideline along with a solid motivation for the recommendation

  1. Reduce the neural network weights to the order of kilobytes so that they fit into SRAM (use precision lower than INT8 where possible).
  2. Prune the network to make it as efficient as possible (up to 90% pruning is possible, and often needed).
  3. Modify the basic operators of each layer.
  4. Because of the kind of computation used by the learning methods, networks are trained with floating-point arithmetic. Converting to integers reduces the cost of each computation and makes it possible to target simpler chips and FPGAs.
  5. Reduce the dynamic range of the integer data. It has been shown that 16-bit integers can deliver the same performance as floating point, and that 8 bits has only a marginal effect on network accuracy. Some layers, such as the final classification layers, can even be binary. Low-power-oriented chips implement 4-bit, 8-bit and 16-bit operators to deal with such optimized networks.
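Steps 4 and 5 above can be sketched with a simple post-training quantization pass. This is a minimal illustration, not the tool-chain flow of any particular chip: it maps a float32 weight tensor onto the int8 range with an affine scale/zero-point scheme, which is the usual way to trade dynamic range for a 4x smaller memory footprint.

```python
import numpy as np

def quantize_int8(weights):
    """Affine quantization of float32 weights to int8 (range [-128, 127]).

    Illustrative sketch of steps 4-5: the full floating-point range of
    the tensor is mapped onto 256 integer levels."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
    zero_point = int(round(-128.0 - w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127)
    return q.astype(np.int8), scale, zero_point

def dequantize(q, scale, zero_point):
    """Reconstruct approximate float weights from int8 values."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 64)).astype(np.float32)

q, s, z = quantize_int8(w)
err = np.abs(dequantize(q, s, z) - w).max()
# int8 storage is 4x smaller than float32; the reconstruction error
# is bounded by roughly one quantization step (the scale).
```

The same scale/zero-point pair must then be carried through the inference code, since every integer multiply-accumulate has to be rescaled before the next layer.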

This goes further than optimizing the source code of the application and requires changing the algorithm itself. Some of the changes, such as modifying the dynamic range, require redoing part of the training, since they modify the weights of the neurons.
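The pruning mentioned in step 2 can be sketched as simple magnitude pruning: the smallest-magnitude weights are zeroed out, and the resulting mask is kept so that the surviving weights can be fine-tuned (retrained) afterwards, as the paragraph above notes. This is a generic sketch, not the method of any specific tool chain.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the `sparsity` fraction of smallest-magnitude weights.

    Returns the pruned tensor and the binary mask, which is needed to
    keep the zeroed weights at zero during fine-tuning."""
    k = int(weights.size * sparsity)
    threshold = np.sort(np.abs(weights), axis=None)[k]
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

rng = np.random.default_rng(1)
w = rng.normal(size=(100, 100))

pruned, mask = magnitude_prune(w, sparsity=0.9)
achieved = 1.0 - mask.mean()  # fraction of weights removed, ~0.9
```

At 90% sparsity only one weight in ten is stored, which is what makes the kilobyte-scale SRAM budget of step 1 reachable; sparse storage formats or structured pruning are then needed for the hardware to actually skip the zeros.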

Instantiation of the recommended implementation method in the reference platform

This has an impact on the selection of the target. An FPGA is a well-suited device: it can handle several dynamic ranges and can benefit greatly from binary data.
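The benefit of binary data can be illustrated with the dot product used in binary layers: when weights and activations are constrained to {-1, +1} and packed as bits, a multiply-accumulate collapses into XNOR plus popcount, which maps directly onto FPGA LUTs. The sketch below (an assumption-laden illustration, not Tulipp code) checks the bit-packed version against a plain arithmetic reference.

```python
import numpy as np

def binary_dot(a_packed, b_packed, n_bits):
    """Dot product of two {-1,+1} vectors stored as packed bits.

    XNOR counts matching bits; (matches - mismatches) equals the
    arithmetic dot product. On an FPGA this is LUTs + a popcount tree."""
    xnor = np.invert(np.bitwise_xor(a_packed, b_packed))  # uint8 bitwise NOT
    matches = int(np.unpackbits(xnor)[:n_bits].sum())
    return 2 * matches - n_bits

rng = np.random.default_rng(2)
a = rng.integers(0, 2, 64, dtype=np.uint8)  # bit 1 encodes +1, bit 0 encodes -1
b = rng.integers(0, 2, 64, dtype=np.uint8)

ref = int(np.sum((2 * a.astype(int) - 1) * (2 * b.astype(int) - 1)))
result = binary_dot(np.packbits(a), np.packbits(b), n_bits=64)
# result == ref: one 64-bit XNOR + popcount replaces 64 multiply-adds
```

This is why the last, binary layers mentioned in step 5 are so cheap in FPGA fabric compared to even 8-bit arithmetic.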

Because the goal of the Tulipp platform is to provide an image-processing platform, and because neural networks are now the state-of-the-art algorithms for analyzing images, the platform is ready for these AI applications.

Evaluation of the guideline in reference applications

Neural networks were not implemented in the applications during the project, but a PhD project that is now starting will use this platform for neural network development. A follow-up project will also be proposed to add neural network analysis to robots based on the Tulipp platform.

References

Review

Related guidelines

none