
How to use DLA to build engine #27

Closed
c7934597 opened this issue Jan 28, 2021 · 14 comments
@c7934597

https://forums.developer.nvidia.com/t/how-to-use-dla-in-deepstream-yolov5/161550/25

Hi @marcoslucianops ,
I'm using deepstream-yolov4, and I checked the engine: it was built on the GPU. I saw the article above. How should I modify the following code to build a DLA engine?

// Build the engine
std::cout << "Building the TensorRT Engine" << std::endl;
nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network);
if (engine) {
    std::cout << "Building complete\n" << std::endl;
} else {
    std::cerr << "Building engine failed\n" << std::endl;
}

@marcoslucianops
Owner

marcoslucianops commented Jan 31, 2021

Hi, edit the nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder) function (lines 61-90) in the yolo.cpp file to:

nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder)
{
    assert (builder);

    if (m_DeviceType == "kDLA") {
        builder->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    }

    std::vector<float> weights = loadWeights(m_WtsFilePath, m_NetworkType);
    std::vector<nvinfer1::Weights> trtWeights;

    nvinfer1::INetworkDefinition *network = builder->createNetwork();
    if (parseModel(*network) != NVDSINFER_SUCCESS) {
        network->destroy();
        return nullptr;
    }

    // Build the engine
    std::cout << "Building the TensorRT Engine" << std::endl;
    nvinfer1::ICudaEngine * engine = builder->buildCudaEngine(*network);
    if (engine) {
        std::cout << "Building complete\n" << std::endl;
    } else {
        std::cerr << "Building engine failed\n" << std::endl;
    }

    // destroy
    network->destroy();
    return engine;
}

and add these lines in config_infer_primary.txt file (in [property] section):

enable-dla=1
use-dla-core=0

Note: edit these lines according to:

  • enable-dla: Indicates whether to use the DLA engine for inferencing.
    Boolean: 0 or 1

  • use-dla-core: DLA core to be used.
    Integer: ≥0
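On newer TensorRT releases (7.x and later), where createEngine also receives an nvinfer1::IBuilderConfig*, the DLA selection moves onto the config object and the deprecated buildCudaEngine is replaced by buildEngineWithConfig. A hedged sketch of the equivalent fragment (the kGPU_FALLBACK flag is an addition here, not part of the original fix; variable names assume the same function body as above):

```cpp
// Sketch for TensorRT 7+: DLA is chosen on the IBuilderConfig, not the builder.
if (m_DeviceType == "kDLA") {
    config->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    config->setDLACore(0);  // should match use-dla-core in config_infer_primary.txt
    // Let layers the DLA can't run (e.g. LEAKY_RELU, single-input routes)
    // fall back to the GPU instead of failing the build.
    config->setFlag(nvinfer1::BuilderFlag::kGPU_FALLBACK);
}
nvinfer1::ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
```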

I don't have a Xavier board to test on. Please tell me if it works.

@c7934597
Author

It runs on Xavier, but it seems some layers fail to convert to DLA.

I used tegrastats and jtop to check the effect. Performance is the same as the original.


Building YOLO network complete
Building the TensorRT Engine
ERROR: [TRT]: leaky_1: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_1 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_2: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_2 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_3: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_3 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_3: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_3 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer (Unnamed Layer* 10) [Slice] is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_5: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_5 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_6: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_6 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_8: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_8 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_11: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_11 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_11: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_11 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer (Unnamed Layer* 27) [Slice] is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_13: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_13 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_14: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_14 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_16: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_16 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_19: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_19 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_19: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_19 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer (Unnamed Layer* 44) [Slice] is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_21: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_21 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_22: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_22 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_24: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_24 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_27: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_27 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_28: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_28 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_29: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_29 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_31 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_31: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_31 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_33: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer preMul_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer postMul_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer mm1_33 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer mm2_33 is not supported on DLA, falling back to GPU.
ERROR: [TRT]: leaky_36: ActivationLayer (with ActivationType = LEAKY_RELU) not supported for DLA.
WARNING: [TRT]: Default DLA is enabled but layer leaky_36 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_38 is not supported on DLA, falling back to GPU.
INFO: [TRT]: mm1_33: broadcasting input0 to make tensors conform, dims(input0)=[1,26,13][NONE] dims(input1)=[128,13,13][NONE].
INFO: [TRT]: mm2_33: broadcasting input1 to make tensors conform, dims(input0)=[128,26,13][NONE] dims(input1)=[1,13,26][NONE].
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_6
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_8
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_13
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_14
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_16
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_21
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_22
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_24
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_28
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_36
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_30
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_37
INFO: [TRT]:
INFO: [TRT]: --------------- Layers running on DLA:
INFO: [TRT]: {conv_1,batch_norm_1}, {conv_2,batch_norm_2}, {conv_3,batch_norm_3}, {conv_5,batch_norm_5}, {maxpool_10,conv_11,batch_norm_11}, {maxpool_18,conv_19,batch_norm_19}, {maxpool_26,conv_27,batch_norm_27}, {conv_29,batch_norm_29,conv_33,batch_norm_33},
INFO: [TRT]: --------------- Layers running on GPU:
INFO: [TRT]: preMul_33, postMul_33, leaky_1, leaky_2, leaky_3, (Unnamed Layer* 10) [Slice], leaky_5, conv_6, leaky_6, conv_8, leaky_8, (Unnamed Layer* 8) [Activation]_output copy, leaky_11, (Unnamed Layer* 27) [Slice], conv_13, leaky_13, conv_14, leaky_14, conv_16, leaky_16, (Unnamed Layer* 25) [Activation]_output copy, leaky_19, (Unnamed Layer* 44) [Slice], conv_21, leaky_21, conv_22, leaky_22, conv_24, leaky_24, (Unnamed Layer* 42) [Activation]_output copy, leaky_27, conv_28, leaky_28, leaky_29, leaky_33, mm1_33, mm2_33, conv_30, (Unnamed Layer* 75) [Matrix Multiply]_output copy, (Unnamed Layer* 54) [Activation]_output copy, conv_36, yolo_31, leaky_36, conv_37, yolo_38,
INFO: [TRT]: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
INFO: [TRT]: Detected 1 inputs and 2 output network tensors.
Building complete

0:02:48.599896399 13265 0x7f20002300 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1748> [UID = 1]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo/model_b1_dla0_fp16.engine successfully
INFO: [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT data 3x416x416
1 OUTPUT kFLOAT yolo_31 24x13x13
2 OUTPUT kFLOAT yolo_38 24x26x26

0:02:48.925054321 13265 0x7f20002300 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-5.0/sources/yolo/config_infer_primary.txt sucessfully

@marcoslucianops
Owner

Some layers can't run on the DLA core. It's a TensorRT limitation. I think it will be addressed in future releases of the TensorRT/DeepStream SDK.

@satyajitghana

satyajitghana commented Mar 20, 2021

What could be the reason for the following error?

I'm trying to run on a Xavier NX.

ERROR: [TRT]: ../rtExt/dla/native/dlaExecuteRunner.cpp (135) - Assertion Error in updateContextResources: 0 (execParams.dlaCore >= 0 && execParams.dlaCore < core->numEngines())
Building engine failed

Failed to build CUDA engine on yolov4-tiny-3l-1024.cfg
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:06.471895338 18672   0x55c26e8f30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
0:00:06.472071755 18672   0x55c26e8f30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1821> [UID = 1]: build backend context failed
0:00:06.472165676 18672   0x55c26e8f30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1148> [UID = 1]: generate backend failed, check config file settings
0:00:06.472758673 18672   0x55c26e8f30 WARN                 nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:06.472804529 18672   0x55c26e8f30 WARN                 nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo/config_infer_primary.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: <main:655>: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to create NvDsInferContext instance
Debug info: gstnvinfer.cpp(810): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo/config_infer_primary.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
App run failed

@marcoslucianops
Owner

marcoslucianops commented Mar 25, 2021

Did you follow these steps? You need to recompile nvdsinfer_custom_impl_Yolo too.
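For reference, a hedged sketch of the rebuild step; the path matches the DeepStream 5.0 install seen in the logs above, and the CUDA_VER value is an assumption that depends on your JetPack version:

```shell
# Rebuild the custom YOLO parser after editing yolo.cpp.
# CUDA_VER=10.2 is typical for DeepStream 5.0 on Jetson; adjust to your setup.
cd /opt/nvidia/deepstream/deepstream-5.0/sources/yolo
CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo clean
CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo
```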

@satyajitghana

@marcoslucianops yes! It seems that with batch-size=1 I am able to use the DLA, but with batch size >= 2 it doesn't work.

INFO: [TRT]: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
INFO: [TRT]: Detected 1 inputs and 3 output network tensors.
ERROR: [TRT]: ../builder/cudnnBuilder2.cpp (1757) - Assertion Error in operator(): 0 (et.region->getType() == RegionType::kNVM)
Building engine failed

Failed to build CUDA engine on drone_tiny_3l_1024_test.cfg
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:03:12.310730296 10901     0x31115e30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1735> [UID = 1]: build engine file failed
0:03:12.310807833 10901     0x31115e30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1821> [UID = 1]: build backend context failed
0:03:12.310911066 10901     0x31115e30 ERROR                nvinfer gstnvinfer.cpp:614:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1148> [UID = 1]: generate backend failed, check config file settings
0:03:12.311629282 10901     0x31115e30 WARN                 nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:03:12.311680930 10901     0x31115e30 WARN                 nvinfer gstnvinfer.cpp:810:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/yolo_60fps_awesome_tracking/config_infer_primary.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

@willosonico

Is this fix still up to date? The function prototype

nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder)

doesn't match

nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder, nvinfer1::IBuilderConfig* config)

@marcoslucianops
Owner

I don't know if it will work on DLA because I don't have a board to test. Can you check, please?

@willosonico

I'm checking, but it won't compile because the method in the new version has a different prototype, so maybe the code was updated and the fix you provided a while ago won't work anymore?

@marcoslucianops
Owner

Can you test only adding these lines in the config_infer_primary.txt file (in the [property] section)?

enable-dla=1
use-dla-core=0

@willosonico

It seems to ignore it; in fact it uses model_b2_gpu0_fp16.engine:

Deserialize yoloLayer plugin: yolo_93
Deserialize yoloLayer plugin: yolo_96
Deserialize yoloLayer plugin: yolo_99
Running Healthcheck
0:00:07.012733708 22422 0x7f8c82a550 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 1]: deserialized trt engine from :/home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/model_b2_gpu0_fp16.engine
INFO: [Implicit Engine Info]: layers num: 4
0 INPUT kFLOAT data 3x640x640
1 OUTPUT kFLOAT yolo_93 255x80x80
2 OUTPUT kFLOAT yolo_96 255x40x40
3 OUTPUT kFLOAT yolo_99 255x20x20

@marcoslucianops
Owner

Can you send the output from when the model is building?

@willosonico

In fact the engine fails to build. I deleted the .engine file and, with batch-size=2, I got the following output:

ERROR: Deserialize engine failed because file path: /home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/model_b2_gpu0_fp16.engine open error
0:00:02.280438983 7734 0x7f7882a350 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1889> [UID = 1]: deserialize engine from file :/home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/model_b2_gpu0_fp16.engine failed
0:00:02.280554183 7734 0x7f7882a350 WARN nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1996> [UID = 1]: deserialize backend context from engine from file :/home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/model_b2_gpu0_fp16.engine failed, try rebuild
0:00:02.280615048 7734 0x7f7882a350 INFO nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 1]: Trying to create engine from model files

Loading pre-trained weights
Running Healthcheck
Loading weights of /home/aaeon/sdcard/fogsphere-edge/fogsphere-engine-py/fogsphere/deepstream/models/yolov5s6/yolov5s complete
Total weights read: 7254397
Building YOLO network

  layer                        input               output         weightPtr

(0) conv_silu 3 x 640 x 640 32 x 320 x 320 3584
(1) conv_silu 32 x 320 x 320 64 x 160 x 160 22272
(2) conv_silu 64 x 160 x 160 32 x 160 x 160 24448
(3) route - 64 x 160 x 160 24448
(4) conv_silu 64 x 160 x 160 32 x 160 x 160 26624
(5) conv_silu 32 x 160 x 160 32 x 160 x 160 27776
(6) conv_silu 32 x 160 x 160 32 x 160 x 160 37120
(7) shortcut_linear: 4 - 32 x 160 x 160 -
(8) route - 64 x 160 x 160 37120
(9) conv_silu 64 x 160 x 160 64 x 160 x 160 41472
(10) conv_silu 64 x 160 x 160 128 x 80 x 80 115712
(11) conv_silu 128 x 80 x 80 64 x 80 x 80 124160
(12) route - 128 x 80 x 80 124160
(13) conv_silu 128 x 80 x 80 64 x 80 x 80 132608
(14) conv_silu 64 x 80 x 80 64 x 80 x 80 136960
(15) conv_silu 64 x 80 x 80 64 x 80 x 80 174080
(16) shortcut_linear: 13 - 64 x 80 x 80 -
(17) conv_silu 64 x 80 x 80 64 x 80 x 80 178432
(18) conv_silu 64 x 80 x 80 64 x 80 x 80 215552
(19) shortcut_linear: 16 - 64 x 80 x 80 -
(20) route - 128 x 80 x 80 215552
(21) conv_silu 128 x 80 x 80 128 x 80 x 80 232448
(22) conv_silu 128 x 80 x 80 256 x 40 x 40 528384
(23) conv_silu 256 x 40 x 40 128 x 40 x 40 561664
(24) route - 256 x 40 x 40 561664
(25) conv_silu 256 x 40 x 40 128 x 40 x 40 594944
(26) conv_silu 128 x 40 x 40 128 x 40 x 40 611840
(27) conv_silu 128 x 40 x 40 128 x 40 x 40 759808
(28) shortcut_linear: 25 - 128 x 40 x 40 -
(29) conv_silu 128 x 40 x 40 128 x 40 x 40 776704
(30) conv_silu 128 x 40 x 40 128 x 40 x 40 924672
(31) shortcut_linear: 28 - 128 x 40 x 40 -
(32) conv_silu 128 x 40 x 40 128 x 40 x 40 941568
(33) conv_silu 128 x 40 x 40 128 x 40 x 40 1089536
(34) shortcut_linear: 31 - 128 x 40 x 40 -
(35) route - 256 x 40 x 40 1089536
(36) conv_silu 256 x 40 x 40 256 x 40 x 40 1156096
(37) conv_silu 256 x 40 x 40 512 x 20 x 20 2337792
(38) conv_silu 512 x 20 x 20 256 x 20 x 20 2469888
(39) route - 512 x 20 x 20 2469888
(40) conv_silu 512 x 20 x 20 256 x 20 x 20 2601984
(41) conv_silu 256 x 20 x 20 256 x 20 x 20 2668544
(42) conv_silu 256 x 20 x 20 256 x 20 x 20 3259392
(43) shortcut_linear: 40 - 256 x 20 x 20 -
(44) route - 512 x 20 x 20 3259392
(45) conv_silu 512 x 20 x 20 512 x 20 x 20 3523584
(46) conv_silu 512 x 20 x 20 256 x 20 x 20 3655680
(47) maxpool 256 x 20 x 20 256 x 20 x 20 3655680
(48) maxpool 256 x 20 x 20 256 x 20 x 20 3655680
(49) maxpool 256 x 20 x 20 256 x 20 x 20 3655680
(50) route - 1024 x 20 x 20 3655680
(51) conv_silu 1024 x 20 x 20 512 x 20 x 20 4182016
(52) conv_silu 512 x 20 x 20 256 x 20 x 20 4314112
(53) upsample 256 x 20 x 20 256 x 40 x 40 -
(54) route - 512 x 40 x 40 4314112
(55) conv_silu 512 x 40 x 40 128 x 40 x 40 4380160
(56) route - 512 x 40 x 40 4380160
(57) conv_silu 512 x 40 x 40 128 x 40 x 40 4446208
(58) conv_silu 128 x 40 x 40 128 x 40 x 40 4463104
(59) conv_silu 128 x 40 x 40 128 x 40 x 40 4611072
(60) route - 256 x 40 x 40 4611072
(61) conv_silu 256 x 40 x 40 256 x 40 x 40 4677632
(62) conv_silu 256 x 40 x 40 128 x 40 x 40 4710912
(63) upsample 128 x 40 x 40 128 x 80 x 80 -
(64) route - 256 x 80 x 80 4710912
(65) conv_silu 256 x 80 x 80 64 x 80 x 80 4727552
(66) route - 256 x 80 x 80 4727552
(67) conv_silu 256 x 80 x 80 64 x 80 x 80 4744192
(68) conv_silu 64 x 80 x 80 64 x 80 x 80 4748544
(69) conv_silu 64 x 80 x 80 64 x 80 x 80 4785664
(70) route - 128 x 80 x 80 4785664
(71) conv_silu 128 x 80 x 80 128 x 80 x 80 4802560
(72) conv_silu 128 x 80 x 80 128 x 40 x 40 4950528
(73) route - 256 x 40 x 40 4950528
(74) conv_silu 256 x 40 x 40 128 x 40 x 40 4983808
(75) route - 256 x 40 x 40 4983808
(76) conv_silu 256 x 40 x 40 128 x 40 x 40 5017088
(77) conv_silu 128 x 40 x 40 128 x 40 x 40 5033984
(78) conv_silu 128 x 40 x 40 128 x 40 x 40 5181952
(79) route - 256 x 40 x 40 5181952
(80) conv_silu 256 x 40 x 40 256 x 40 x 40 5248512
(81) conv_silu 256 x 40 x 40 256 x 20 x 20 5839360
(82) route - 512 x 20 x 20 5839360
(83) conv_silu 512 x 20 x 20 256 x 20 x 20 5971456
(84) route - 512 x 20 x 20 5971456
(85) conv_silu 512 x 20 x 20 256 x 20 x 20 6103552
(86) conv_silu 256 x 20 x 20 256 x 20 x 20 6170112
(87) conv_silu 256 x 20 x 20 256 x 20 x 20 6760960
(88) route - 512 x 20 x 20 6760960
(89) conv_silu 512 x 20 x 20 512 x 20 x 20 7025152
(90) route - 128 x 80 x 80 7025152
(91) conv_logistic 128 x 80 x 80 255 x 80 x 80 7058047
(92) yolo 255 x 80 x 80 255 x 80 x 80 7058047
(93) route - 256 x 40 x 40 7058047
(94) conv_logistic 256 x 40 x 40 255 x 40 x 40 7123582
(95) yolo 255 x 40 x 40 255 x 40 x 40 7123582
(96) route - 512 x 20 x 20 7123582
(97) conv_logistic 512 x 20 x 20 255 x 20 x 20 7254397
(98) yolo 255 x 20 x 20 255 x 20 x 20 7254397
Output YOLO blob names:
yolo_93
yolo_96
yolo_99
Total number of YOLO layers: 273
Building YOLO network complete
Building the TensorRT Engine

WARNING: [TRT]: route_3: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_3 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_12: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_12 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_24: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_24 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_39: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_39 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer upsample_53 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_56: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_56 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer upsample_63 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_66: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_66 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_75: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_75 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_84: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_84 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_90: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_90 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_93 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_93: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_93 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_96 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: route_96: Concatenation on DLA requires at least two inputs.
WARNING: [TRT]: Default DLA is enabled but layer route_96 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: Default DLA is enabled but layer yolo_99 is not supported on DLA, falling back to GPU.
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_1
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer route_54
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer route_64
WARNING: [TRT]: DLA supports only 8 subgraphs per DLA core. Switching to GPU for layer conv_98
WARNING: [TRT]: Detected invalid timing cache, setup a local cache instead

ERROR: [TRT]: 2: [nvdlaUtils.cpp::getInputDesc::176] Error Code 2: Internal Error (Assertion idx < num failed.Index is out of range of valid number of input tensors.)
Building engine failed

@marcoslucianops
Owner

marcoslucianops commented Apr 25, 2022
