
How to use DLA to build engine #27

Closed
c7934597 opened this issue Jan 28, 2021 · 14 comments
@c7934597

https://forums.developer.nvidia.com/t/how-to-use-dla-in-deepstream-yolov5/161550/25

Hi @marcoslucianops ,
I'm using deepstream-yolov4, and I checked the engine: it was built on the GPU. I saw the article above. How should I modify the following code to build a DLA engine?

// Build the engine
std::cout << "Building the TensorRT Engine" << std::endl;
nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network);
if (engine) {
    std::cout << "Building complete\n" << std::endl;
} else {
    std::cerr << "Building engine failed\n" << std::endl;
}

@marcoslucianops
Owner

marcoslucianops commented Jan 31, 2021

Hi, edit the nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder) function (lines 61-90) in the yolo.cpp file to:

nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder)
{
    assert (builder);

    if (m_DeviceType == "kDLA") {
        builder->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    }

    std::vector<float> weights = loadWeights(m_WtsFilePath, m_NetworkType);
    std::vector<nvinfer1::Weights> trtWeights;

    nvinfer1::INetworkDefinition *network = builder->createNetwork();
    if (parseModel(*network) != NVDSINFER_SUCCESS) {
        network->destroy();
        return nullptr;
    }

    // Build the engine
    std::cout << "Building the TensorRT Engine" << std::endl;
    nvinfer1::ICudaEngine * engine = builder->buildCudaEngine(*network);
    if (engine) {
        std::cout << "Building complete\n" << std::endl;
    } else {
        std::cerr << "Building engine failed\n" << std::endl;
    }

    // destroy
    network->destroy();
    return engine;
}

and add these lines in config_infer_primary.txt file (in [property] section):

enable-dla=1
use-dla-core=0

Note: edit these lines according to:

  • enable-dla: Indicates whether to use the DLA engine for inferencing.
    Boolean: 0 or 1

  • use-dla-core: DLA core to be used.
    Integer: ≥0
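On newer TensorRT releases (7.x and later), where createEngine also receives an nvinfer1::IBuilderConfig*, the DLA selection moves onto the config object and the deprecated buildCudaEngine is replaced by buildEngineWithConfig. A hedged sketch of the equivalent fragment (the kGPU_FALLBACK flag is an addition here, not part of the original fix; variable names assume the same function body as above):

```cpp
// Sketch for TensorRT 7+: DLA is chosen on the IBuilderConfig, not the builder.
if (m_DeviceType == "kDLA") {
    config->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    config->setDLACore(0);  // should match use-dla-core in config_infer_primary.txt
    // Let layers the DLA can't run (e.g. LEAKY_RELU, single-input routes)
    // fall back to the GPU instead of failing the build.
    config->setFlag(nvinfer1::BuilderFlag::kGPU_FALLBACK);
}
nvinfer1::ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
```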

I don't have a Xavier board to test on. Please tell me if it works.

@c7934597
Author

It runs on Xavier, but it seems some layers fail to convert to DLA.

I used tegrastats and jtop to check the effect. Performance is the same as the original.


Building YOLO network complete
Building the TensorRT Engine
ERROR: [TRT]: leaky_1: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_1 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_2: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_2 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_3: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_3 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_3: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_3 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer (Unnamed Layer* 10) [Slice] is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_5: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_5 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_6: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_6 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_8: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_8 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_11: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_11 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_11: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_11 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer (Unnamed Layer* 27) [Slice] is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_13: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_13 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_14: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_14 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_16: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_16 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_19: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_19 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_19: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_19 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer (Unnamed Layer* 44) [Slice] is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_21: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_21 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_22: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_22 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_24: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_24 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_27: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_27 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_28: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_28 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_29: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_29 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_31 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_31: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_31 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_33: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer preMul_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer postMul_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer mm1_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer mm2_33 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_36: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_36 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_38 is not supported on DLA, falling back to GPU.
INFO: [TRT]: mm1_33: broadcasting input0 to make tensors conform, dims(input0)=[1,26,13][NONE] dims(input1)=[128,13,13][NONE].
INFO: [TRT]: mm2_33: broadcasting input1 to make tensors conform, dims(input0)=[128,26,13][NONE] dims(input1)=[1,13,26][NONE].
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_6
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_8
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_13
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_14
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_16
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_21
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_22
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_24
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_28
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_36
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_30
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_37
INFO: [TRT]:
INFO: [TRT]: --------------- Layers running on DLA:
INFO: [TRT]: {conv_1,batch_norm_1}, {conv_2,batch_norm_2}, {conv_3,batch_norm_3}, {conv_5,batch_norm_5}, {maxpool_10,conv_11,batch_norm_11}, {maxpool_18,conv_19,batch_norm_19}, {maxpool_26,conv_27,batch_norm_27}, {conv_29,batch_norm_29,conv_33,batch_norm_33},
INFO: [TRT]: --------------- Layers running on GPU:
INFO: [TRT]: preMul_33, postMul_33, leaky_1, leaky_2, leaky_3, (Unnamed Layer* 10) [Slice], leaky_5, conv_6, leaky_6, conv_8, leaky_8, (Unnamed Layer* 8) [Activation]_output copy, leaky_11, (Unnamed Layer* 27) [Slice], conv_13, leaky_13, conv_14, leaky_14, conv_16, leaky_16, (Unnamed Layer* 25) [Activation]_output copy, leaky_19, (Unnamed Layer* 44) [Slice], conv_21, leaky_21, conv_22, leaky_22, conv_24, leaky_24, (Unnamed Layer* 42) [Activation]_output copy, leaky_27, conv_28, leaky_28, leaky_29, leaky_33, mm1_33, mm2_33, conv_30, (Unnamed Layer* 75) [Matrix Multiply]_output copy, (Unnamed Layer* 54) [Activation]_output copy, conv_36, yolo_31, leaky_36, conv_37, yolo_38,
INFO: [TRT]: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
INFO: [TRT]: Detected 1 inputs and 2 output network tensors.
Building complete

0:02:48.599896399 13265 0x7f20002300 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1748> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo/model_b1_dla0_fp16.engine successfully
INFO: [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT data 3x416x416
1 OUTPUT kFLOAT yolo_31 24x13x13
2 OUTPUT kFLOAT yolo_38 24x26x26

0:02:48.925054321 13265 0x7f20002300 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-5.0/sources/yolo/config_infer_primary.txt sucessfully

@marcoslucianops
Owner

Some layers can't run on the DLA core. It's a TensorRT limitation. I think it will be addressed in future releases of the TensorRT/DeepStream SDK.

@satyajitghana

satyajitghana commented Mar 20, 2021

What could be the reason for the following error?

I'm trying to run on a Xavier NX.

ERROR: [TRT]: ../rtExt/dla/native/dlaExecuteRunner.cpp (135) - Assertion Error in updateContextResources: 0 (execParams.dlaCore >= 0 && execParams.dlaCore < core->numEngines())
Building engine failed

Failed to build CUDA engine on yolov4-tiny-3l-1024.cfg
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:06.471895338 18672   0x55c26e8f30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
0:00:06.472071755 18672   0x55c26e8f30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1821> [UID = 1]: build backend context failed
0:00:06.472165676 18672   0x55c26e8f30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1148> [UID = 1]: generate backend failed, check config file settings
0:00:06.472758673 18672   0x55c26e8f30 WARN                 nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:06.472804529 18672   0x55c26e8f30 WARN                 nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo/config_infer_primary.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: <main:655>: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to create NvDsInferContext instance
Debug info: gstnvinfer.cpp(810): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo/config_infer_primary.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
App run failed

@marcoslucianops
Owner

marcoslucianops commented Mar 25, 2021

Did you follow these steps? You need to recompile nvdsinfer_custom_impl_Yolo too.
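For reference, a hedged sketch of the rebuild step; the path matches the DeepStream 5.0 install seen in the logs above, and the CUDA_VER value is an assumption that depends on your JetPack version:

```shell
# Rebuild the custom YOLO parser after editing yolo.cpp.
# CUDA_VER=10.2 is typical for DeepStream 5.0 on Jetson; adjust to your setup.
cd /opt/nvidia/deepstream/deepstream-5.0/sources/yolo
CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo clean
CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo
```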

@satyajitghana

@marcoslucianops yes! It seems that with batch-size=1 I am able to use the DLA, but with batch size >= 2 it doesn't work.

INFO: [TRT]: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
INFO: [TRT]: Detected 1 inputs and 3 output network tensors.
ERROR: [TRT]: ../builder/cudnnBuilder2.cpp (1757) - Assertion Error in operator(): 0 (et.region->getType() == RegionType::kNVM)
Building engine failed

Failed to build CUDA engine on drone_tiny_3l_1024_test.cfg
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:03:12.310730296 10901     0x31115e30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
0:03:12.310807833 10901     0x31115e30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1821> [UID = 1]: build backend context failed
0:03:12.310911066 10901     0x31115e30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1148> [UID = 1]: generate backend failed, check config file settings
0:03:12.311629282 10901     0x31115e30 WARN                 nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:03:12.311680930 10901     0x31115e30 WARN                 nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo_60fps_awesome_tracking/config_infer_primary.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

@willosonico

Is this fix still up to date? The function prototype

nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder)

doesn't match

nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder, nvinfer1::IBuilderConfig* config)

@marcoslucianops
Owner

I don't know if it will work on DLA because I don't have a board to test. Can you check, please?

@willosonico

I'm checking, but it won't compile because the method in the new version has a different prototype, so maybe the code was updated and the fix you provided a while ago won't work anymore?

@marcoslucianops
Owner

Can you test only adding these lines in the config_infer_primary.txt file (in the [property] section)?

enable-dla=1
use-dla-core=0

@willosonico

It seems to ignore it; in fact it uses model_b2_gpu0_fp16.engine:

Deserialize yoloLayer plugin: yolo_93
Deserialize yoloLayer plugin: yolo_96
Deserialize yoloLayer plugin: yolo_99
Running Healthcheck
0:00:07.012733708 22422 0x7f8c82a550 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 1]: deserialized trt engine from :/home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/model_b2_gpu0_fp16.engine
INFO: [Implicit Engine Info]: layers num: 4
0 INPUT kFLOAT data 3x640x640
1 OUTPUT kFLOAT yolo_93 255x80x80
2 OUTPUT kFLOAT yolo_96 255x40x40
3 OUTPUT kFLOAT yolo_99 255x20x20

@marcoslucianops
Owner

Can you send the output from when the model is building?

@willosonico

In fact the engine fails to build. I deleted the .engine file and, with batch-size=2, I got the following output:

ERROR: Deserialize engine failed because file path: /home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/model_b2_gpu0_fp16.engine open error
0:00:02.280438983 7734 0x7f7882a350 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1889> [UID = 1]: deserialize engine from file :/home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/model_b2_gpu0_fp16.engine failed
0:00:02.280554183 7734 0x7f7882a350 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1996> [UID = 1]: deserialize backend context from engine from file :/home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/model_b2_gpu0_fp16.engine failed, try rebuild
0:00:02.280615048 7734 0x7f7882a350 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 1]: Trying to create engine from model files

Loading pre-trained weights
Running Healthcheck
Loading weights of /home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/fogsphere/deepstream/models/yolov5s6/yolov5s complete
Total weights read: 7254397
Building YOLO network

  layer                        input               output         weightPtr

(0) conv_silu 3 x 640 x 640 32 x 320 x 320 3584
(1) conv_silu 32 x 320 x 320 64 x 160 x 160 22272
(2) conv_silu 64 x 160 x 160 32 x 160 x 160 24448
(3) route - 64 x 160 x 160 24448
(4) conv_silu 64 x 160 x 160 32 x 160 x 160 26624
(5) conv_silu 32 x 160 x 160 32 x 160 x 160 27776
(6) conv_silu 32 x 160 x 160 32 x 160 x 160 37120
(7) shortcut_linear: 4 - 32 x 160 x 160 -
(8) route - 64 x 160 x 160 37120
(9) conv_silu 64 x 160 x 160 64 x 160 x 160 41472
(10) conv_silu 64 x 160 x 160 128 x 80 x 80 115712
(11) conv_silu 128 x 80 x 80 64 x 80 x 80 124160
(12) route - 128 x 80 x 80 124160
(13) conv_silu 128 x 80 x 80 64 x 80 x 80 132608
(14) conv_silu 64 x 80 x 80 64 x 80 x 80 136960
(15) conv_silu 64 x 80 x 80 64 x 80 x 80 174080
(16) shortcut_linear: 13 - 64 x 80 x 80 -
(17) conv_silu 64 x 80 x 80 64 x 80 x 80 178432
(18) conv_silu 64 x 80 x 80 64 x 80 x 80 215552
(19) shortcut_linear: 16 - 64 x 80 x 80 -
(20) route - 128 x 80 x 80 215552
(21) conv_silu 128 x 80 x 80 128 x 80 x 80 232448
(22) conv_silu 128 x 80 x 80 256 x 40 x 40 528384
(23) conv_silu 256 x 40 x 40 128 x 40 x 40 561664
(24) route - 256 x 40 x 40 561664
(25) conv_silu 256 x 40 x 40 128 x 40 x 40 594944
(26) conv_silu 128 x 40 x 40 128 x 40 x 40 611840
(27) conv_silu 128 x 40 x 40 128 x 40 x 40 759808
(28) shortcut_linear: 25 - 128 x 40 x 40 -
(29) conv_silu 128 x 40 x 40 128 x 40 x 40 776704
(30) conv_silu 128 x 40 x 40 128 x 40 x 40 924672
(31) shortcut_linear: 28 - 128 x 40 x 40 -
(32) conv_silu 128 x 40 x 40 128 x 40 x 40 941568
(33) conv_silu 128 x 40 x 40 128 x 40 x 40 1089536
(34) shortcut_linear: 31 - 128 x 40 x 40 -
(35) route - 256 x 40 x 40 1089536
(36) conv_silu 256 x 40 x 40 256 x 40 x 40 1156096
(37) conv_silu 256 x 40 x 40 512 x 20 x 20 2337792
(38) conv_silu 512 x 20 x 20 256 x 20 x 20 2469888
(39) route - 512 x 20 x 20 2469888
(40) conv_silu 512 x 20 x 20 256 x 20 x 20 2601984
(41) conv_silu 256 x 20 x 20 256 x 20 x 20 2668544
(42) conv_silu 256 x 20 x 20 256 x 20 x 20 3259392
(43) shortcut_linear: 40 - 256 x 20 x 20 -
(44) route - 512 x 20 x 20 3259392
(45) conv_silu 512 x 20 x 20 512 x 20 x 20 3523584
(46) conv_silu 512 x 20 x 20 256 x 20 x 20 3655680
(47) maxpool 256 x 20 x 20 256 x 20 x 20 3655680
(48) maxpool 256 x 20 x 20 256 x 20 x 20 3655680
(49) maxpool 256 x 20 x 20 256 x 20 x 20 3655680
(50) route - 1024 x 20 x 20 3655680
(51) conv_silu 1024 x 20 x 20 512 x 20 x 20 4182016
(52) conv_silu 512 x 20 x 20 256 x 20 x 20 4314112
(53) upsample 256 x 20 x 20 256 x 40 x 40 -
(54) route - 512 x 40 x 40 4314112
(55) conv_silu 512 x 40 x 40 128 x 40 x 40 4380160
(56) route - 512 x 40 x 40 4380160
(57) conv_silu 512 x 40 x 40 128 x 40 x 40 4446208
(58) conv_silu 128 x 40 x 40 128 x 40 x 40 4463104
(59) conv_silu 128 x 40 x 40 128 x 40 x 40 4611072
(60) route - 256 x 40 x 40 4611072
(61) conv_silu 256 x 40 x 40 256 x 40 x 40 4677632
(62) conv_silu 256 x 40 x 40 128 x 40 x 40 4710912
(63) upsample 128 x 40 x 40 128 x 80 x 80 -
(64) route - 256 x 80 x 80 4710912
(65) conv_silu 256 x 80 x 80 64 x 80 x 80 4727552
(66) route - 256 x 80 x 80 4727552
(67) conv_silu 256 x 80 x 80 64 x 80 x 80 4744192
(68) conv_silu 64 x 80 x 80 64 x 80 x 80 4748544
(69) conv_silu 64 x 80 x 80 64 x 80 x 80 4785664
(70) route - 128 x 80 x 80 4785664
(71) conv_silu 128 x 80 x 80 128 x 80 x 80 4802560
(72) conv_silu 128 x 80 x 80 128 x 40 x 40 4950528
(73) route - 256 x 40 x 40 4950528
(74) conv_silu 256 x 40 x 40 128 x 40 x 40 4983808
(75) route - 256 x 40 x 40 4983808
(76) conv_silu 256 x 40 x 40 128 x 40 x 40 5017088
(77) conv_silu 128 x 40 x 40 128 x 40 x 40 5033984
(78) conv_silu 128 x 40 x 40 128 x 40 x 40 5181952
(79) route - 256 x 40 x 40 5181952
(80) conv_silu 256 x 40 x 40 256 x 40 x 40 5248512
(81) conv_silu 256 x 40 x 40 256 x 20 x 20 5839360
(82) route - 512 x 20 x 20 5839360
(83) conv_silu 512 x 20 x 20 256 x 20 x 20 5971456
(84) route - 512 x 20 x 20 5971456
(85) conv_silu 512 x 20 x 20 256 x 20 x 20 6103552
(86) conv_silu 256 x 20 x 20 256 x 20 x 20 6170112
(87) conv_silu 256 x 20 x 20 256 x 20 x 20 6760960
(88) route - 512 x 20 x 20 6760960
(89) conv_silu 512 x 20 x 20 512 x 20 x 20 7025152
(90) route - 128 x 80 x 80 7025152
(91) conv_logistic 128 x 80 x 80 255 x 80 x 80 7058047
(92) yolo 255 x 80 x 80 255 x 80 x 80 7058047
(93) route - 256 x 40 x 40 7058047
(94) conv_logistic 256 x 40 x 40 255 x 40 x 40 7123582
(95) yolo 255 x 40 x 40 255 x 40 x 40 7123582
(96) route - 512 x 20 x 20 7123582
(97) conv_logistic 512 x 20 x 20 255 x 20 x 20 7254397
(98) yolo 255 x 20 x 20 255 x 20 x 20 7254397
Output YOLO blob names:
yolo_93
yolo_96
yolo_99
Total number of YOLO layers: 273
Building YOLO network complete
Building the TensorRT Engine

WARNING: [TRT]: route_3: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_3 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_12: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_12 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_24: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_24 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_39: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_39 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer upsample_53 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_56: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_56 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer upsample_63 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_66: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_66 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_75: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_75 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_84: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_84 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_90: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_90 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_93 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_93: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_93 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_96 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_96: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_96 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_99 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_1
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer route_54
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer route_64
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_98
WARNING: [TRT]: Detected invalid timing cache, setup a local cache instead

ERROR: [TRT]: 2: [nvdlaUtils.cpp::getInputDesc::176] Error Code 2: Internal Error (Assertion idx < num failed.Index is out of range of valid number of input tensors.)
Building engine failed

@marcoslucianops
Owner

marcoslucianops commented Apr 25, 2022
