
Inference


According to the SNPE C++ Tutorial - Build the Sample, a basic inference SDK should include the components shown in the diagram below.

[Figure: basic SNPE inference SDK workflow]

This part of the pipeline is model-agnostic: it stays essentially the same regardless of which model is deployed. It is encapsulated in SNPEPipeline.cpp.

Check Available Runtime

Before checking the runtime, the constructor calls the SNPE interface to query the library's version information, which serves as a basic sanity check.

static zdl::DlSystem::Version_t version = zdl::SNPE::SNPEFactory::getLibraryVersion();
std::cout << "[info] SNPE version: " << version.asString().c_str() << std::endl;
  • This log line is printed while the program executes.
  • The SNPE version used here is 2.5.0.4052, as reported by the log above.
if (zdl::SNPE::SNPEFactory::isRuntimeAvailable(zdl::DlSystem::Runtime_t::GPU)) {
    m_runtime = zdl::DlSystem::Runtime_t::GPU;
    std::cout << "[info] SNPE runtime: GPU" << std::endl;
} else {
    m_runtime = zdl::DlSystem::Runtime_t::CPU;
    std::cout << "[info] SNPE runtime: CPU" << std::endl;
}

It calls the corresponding interface to check whether the current platform supports the runtime requested by the user. The runtimes SNPE supports are platform-specific; the official documentation lists the runtimes supported by specific SoC models:

[Figure: runtimes supported by each SoC model, from the official SNPE documentation]

Knowing the details of a runtime requires understanding the datasheet of the SoC in use, along with the necessary hardware and driver support. Typically, the CPU and GPU (Adreno) runtimes work without issue. Therefore, if an unsupported runtime is detected, the code falls back to the CPU runtime.

  • You can also use the snpe-net-run tool to check whether a runtime is available; a small in-code alternative is sketched below.
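As a rough in-code alternative, each runtime can be probed with isRuntimeAvailable(). A minimal sketch; the header paths assume the SDK's usual layout, and CPU/GPU/DSP are standard Runtime_t members:

#include "SNPE/SNPEFactory.hpp"
#include "DlSystem/DlEnums.hpp"
#include <iostream>
#include <utility>

// Probe each runtime and report whether this device supports it.
static void printRuntimeAvailability() {
    const std::pair<zdl::DlSystem::Runtime_t, const char *> runtimes[] = {
        {zdl::DlSystem::Runtime_t::CPU, "CPU"},
        {zdl::DlSystem::Runtime_t::GPU, "GPU"},
        {zdl::DlSystem::Runtime_t::DSP, "DSP"},
    };
    for (const auto &rt : runtimes) {
        std::cout << "[info] " << rt.second << ": "
                  << (zdl::SNPE::SNPEFactory::isRuntimeAvailable(rt.first)
                          ? "available" : "not available")
                  << std::endl;
    }
}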

Load Network

The process of loading the model is relatively straightforward. Simply call the corresponding interface, passing the absolute path of the DLC, and it will automatically parse the DLC to obtain the basic network information.

m_container = zdl::DlContainer::IDlContainer::open(model_path);
  • DLC (Deep Learning Container), as the name suggests, is simply a container for a deep learning network.
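Since open() returns a null pointer when the file cannot be read or parsed, guarding the call is worthwhile. A minimal sketch, assuming model_path is a std::string and the surrounding function returns a bool:

// Open the DLC and fail early if parsing did not succeed.
m_container = zdl::DlContainer::IDlContainer::open(model_path);
if (m_container == nullptr) {
    std::cerr << "[error] failed to open DLC: " << model_path << std::endl;
    return false;  // hypothetical: propagate the failure to the caller
}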

Set Network Builder Options

  • Note that SNPE v2.5, because it is built on the C API, no longer supports the chained setter calls of the deprecated C++ API from older versions; a non-chained equivalent is sketched after the code block below.
zdl::SNPE::SNPEBuilder snpe_builder(m_container.get());
m_snpe = snpe_builder.setOutputLayers(output_layers)
                     .setRuntimeProcessor(m_runtime)
                     .setCPUFallbackMode(true)
                     .setUseUserSuppliedBuffers(false)
                     .setPerformanceProfile(zdl::DlSystem::PerformanceProfile_t::HIGH_PERFORMANCE)
                     .build();
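If the chained form fails to build on v2.5, the same configuration can be expressed as separate statements, since each setter can be invoked on the builder directly. A minimal sketch:

// Non-chained equivalent: call each setter as its own statement.
zdl::SNPE::SNPEBuilder snpe_builder(m_container.get());
snpe_builder.setOutputLayers(output_layers);
snpe_builder.setRuntimeProcessor(m_runtime);
snpe_builder.setCPUFallbackMode(true);
snpe_builder.setUseUserSuppliedBuffers(false);
snpe_builder.setPerformanceProfile(zdl::DlSystem::PerformanceProfile_t::HIGH_PERFORMANCE);
m_snpe = snpe_builder.build();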

SetOutputLayers

Setting the output layers of the current model means SNPE can expose the output of any layer in the network, provided you configure it accordingly. If you do not specify particular output layers, the default behavior is to use the model's last layer as the output. Single-output networks can rely on that default, but for networks like YOLO, which typically have three output layers, the default behavior is not sufficient.

// Layer names come from the model itself; they can be inspected with the
// snpe-dlc-info tool.
zdl::DlSystem::StringList output_layers;
const char *layers_name[3] = {"Conv_134", "Conv_148", "Conv_162"};
for (auto &name : layers_name) {
    output_layers.append(name);
}

ITensors & UserBuffers

ITensors and UserBuffers represent two kinds of memory. ITensors correspond to ordinary user-space memory (such as memory allocated with malloc/new), while UserBuffers correspond to DMA-capable (ION) memory. The most noticeable difference in usage is that ITensors involve an extra std::copy compared to UserBuffers. See the official SNPE documentation for more information.

  • This project uses ITensors.
  • loadInputTensor() and getOutputTensor() are exposed as the external API; the sketch below shows roughly what the ITensor flow looks like.
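A minimal sketch of what a loadInputTensor()-style helper might do; the signature and the float-vector input are illustrative assumptions, not the project's exact code, and the header paths assume the SDK's usual layout:

#include "DlSystem/ITensorFactory.hpp"
#include "SNPE/SNPE.hpp"
#include "SNPE/SNPEFactory.hpp"
#include <algorithm>
#include <memory>
#include <vector>

// Allocate an ITensor matching the network's input shape, then copy the user
// data into it (this extra std::copy is the cost relative to UserBuffers).
std::unique_ptr<zdl::DlSystem::ITensor> loadInputTensor(
        std::unique_ptr<zdl::SNPE::SNPE> &snpe, const std::vector<float> &data) {
    // Query the input shape the network expects.
    const auto &input_shape_opt = snpe->getInputDimensions();
    std::unique_ptr<zdl::DlSystem::ITensor> input =
        zdl::SNPE::SNPEFactory::getTensorFactory().createTensor(*input_shape_opt);
    std::copy(data.begin(), data.end(), input->begin());
    return input;
}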

Execute

Invoking the inference interface is straightforward: once the prepared input data is in place, simply call it directly.

bool SNPEPipeline::execute() {
    ...
    m_snpe->execute(input_tmp, output_tmp);
    ...
}
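For the ITensor path, the surrounding code might look roughly like this; input_tensor and the output iteration are illustrative, not the project's exact implementation:

// Run inference on one input tensor and walk the named outputs.
zdl::DlSystem::TensorMap output_map;
if (!m_snpe->execute(input_tensor.get(), output_map)) {
    std::cerr << "[error] SNPE execute failed" << std::endl;
    return false;
}
for (const char *name : output_map.getTensorNames()) {
    zdl::DlSystem::ITensor *out = output_map.getTensor(name);
    std::cout << "[info] output " << name << " has "
              << out->getSize() << " elements" << std::endl;
}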