Ensemble model inference fails from clients with no clear error log to debug #70

Closed
Edwardmark opened this issue May 19, 2021 · 20 comments

@Edwardmark:

Description
I am running an ensemble model that contains three models executed sequentially, one after another. I checked each model individually and each one is OK, and an ensemble of two of the models works as well. But when I connect all three together and run the gRPC client, the server crashes without a meaningful error log:

Traceback (most recent call last):
  File "grpc_client.py", line 209, in <module>
    main()
  File "grpc_client.py", line 182, in main
    results = triton_client.infer(model_name=model_name, inputs=inputs, outputs=outputs)
  File "/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py", line 1086, in infer
    raise_error_grpc(rpc_error)
  File "/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py", line 61, in raise_error_grpc
    raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Socket closed

Triton Information
21.03 docker container

To Reproduce
The ensemble config.pbtxt is as follows:

platform: "ensemble"
max_batch_size: 16
input [
  {
    name: "IMAGE_RAW"
    data_type: TYPE_UINT8
    dims: [ -1 ]
  }
]
output [
  {
    name: "SCALE_RATIO" # ratio
    data_type: TYPE_FP32
    dims: [2]
  },
  {
    name: "NUM_DETECTIONS"
    data_type: TYPE_INT32
    dims: [ 1 ]
  },
  {
    name: "NMSED_SCORES"
    data_type: TYPE_FP32
    dims: [ 100 ]
  },
  {
    name: "NMSED_CLASSES"
    data_type: TYPE_FP32
    dims: [ 100 ]
  },
  {
    name: "SCALED_NMSED_BOXES"
    data_type: TYPE_FP32
    dims: [ 100, 4 ]
  }
]

ensemble_scheduling {
  step [
    {
      model_name: "dali_det_pre"
      model_version: -1
      input_map {
        key: "IMAGE_RAW"
        value: "IMAGE_RAW"
      }
      output_map {
        key: "DALI_OUTPUT_0"
        value: "NORM_IMG"
      }
      output_map {
        key: "DALI_OUTPUT_1"
        value: "SCALE_RATIO"
      }
    },
    {
      model_name: "face_det-ucs"
      model_version: -1
      input_map {
        key: "images"
        value: "NORM_IMG"
      }
      output_map {
        key: "num_detections"
        value: "NUM_DETECTIONS"
      }
      output_map {
        key: "nmsed_boxes"
        value: "NMSED_BOXES"
      }
      output_map {
        key: "nmsed_scores"
        value: "NMSED_SCORES"
      }
      output_map {
        key: "nmsed_classes"
        value: "NMSED_CLASSES"
      }
    },
    {
      model_name: "dali_det_post"
      model_version: -1
      input_map {
        key: "NMSED_BOXES"
        value: "NMSED_BOXES"
      }
      input_map {
        key: "SCALE_RATIO_INPUT"
        value: "SCALE_RATIO"
      }
      output_map {
        key: "SCALED_NMSED_BOXES_OUTPUT"
        value: "SCALED_NMSED_BOXES"
      }
    }
  ]
}
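
(A quick sanity check before sending inference requests is to ask the server whether the ensemble loaded correctly and which inputs/outputs it exposes. Below is a minimal sketch with tritonclient.grpc; the URL and the model name "ensemble-face_det-ucs" are assumptions based on the paths shown later in this issue.)

import tritonclient.grpc as t_client

client = t_client.InferenceServerClient(url="localhost:8001")

# Fail early if the server or the ensemble did not come up cleanly.
assert client.is_server_live()
assert client.is_model_ready("ensemble-face_det-ucs")

# Compare the resolved inputs/outputs against the config.pbtxt above.
print(client.get_model_metadata("ensemble-face_det-ucs"))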

The client is as follows:

import tritonclient.grpc

# parse_args, load_images, array_from_list, generate_inputs and
# generate_outputs are helper functions defined elsewhere in the script.
FLAGS = parse_args()

triton_client = tritonclient.grpc.InferenceServerClient(url=FLAGS.url,
                                                        verbose=FLAGS.verbose)

model_name = FLAGS.model_name
model_version = -1

print("Loading images")

image_data = load_images(FLAGS.img_dir if FLAGS.img_dir is not None else FLAGS.img,
                         max_images=FLAGS.batch_size * FLAGS.n_iter)

image_data = array_from_list(image_data)
inputs = generate_inputs(FLAGS.input_names, image_data.shape, "UINT8")
outputs = generate_outputs(FLAGS.output_names)

# Initialize the data
inputs[0].set_data_from_numpy(image_data)
# Test with outputs
results = triton_client.infer(model_name=model_name, inputs=inputs, outputs=outputs)
print(results)
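
(The helper functions above are not shown in the issue; with tritonclient.grpc they might look roughly like the sketch below. The names and signatures here are assumptions inferred from how they are called, not the actual client code.)

import glob
import os

import numpy as np
import tritonclient.grpc as t_client


def load_images(path, max_images):
    # Read up to max_images encoded image files as raw byte arrays;
    # the DALI preprocessing model decodes them on the server side.
    files = sorted(glob.glob(os.path.join(path, "*"))) if os.path.isdir(path) else [path]
    return [np.fromfile(f, dtype=np.uint8) for f in files[:max_images]]


def array_from_list(arrays):
    # Zero-pad the encoded images to a common length so that they can be
    # batched into a single [batch, max_len] uint8 tensor.
    max_len = max(a.shape[0] for a in arrays)
    return np.stack([np.pad(a, (0, max_len - a.shape[0])) for a in arrays])


def generate_inputs(input_names, input_shape, datatype):
    return [t_client.InferInput(name, list(input_shape), datatype) for name in input_names]


def generate_outputs(output_names):
    return [t_client.InferRequestedOutput(name) for name in output_names]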

Expected behavior
Results should be returned without error, but currently the server just crashes.
@deadeyegoodwin Looking forward to your reply.

The dali_det_post model config.pbtxt is as follows:

backend: "dali"
max_batch_size: 32
input [
  {
    name: "NMSED_BOXES"
    data_type: TYPE_FP32
    dims: [ 100, 4 ]
  },
  {
    name: "SCALE_RATIO_INPUT"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
output [
  {
    name: "SCALED_NMSED_BOXES_OUTPUT" 
    data_type: TYPE_FP32
    dims: [100, 4]
  }
]
dynamic_batching {
  preferred_batch_size: [ 4, 8, 16, 32 ]
  max_queue_delay_microseconds: 100
}

The above pipeline is generated using the following code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import nvidia.dali as dali
import nvidia.dali.fn as fn
import nvidia.dali.types as types

pipe = dali.pipeline.Pipeline(batch_size=32, num_threads=8)
with pipe:
    nmsed_boxes = fn.external_source(device='gpu', name="NMSED_BOXES")
    scale_ratio = fn.external_source(device='gpu', name='SCALE_RATIO_INPUT')
   
    # Rescale BBOX
    ratio = fn.reductions.min(scale_ratio)
    nmsed_boxes /= ratio
    pipe.set_outputs(nmsed_boxes)

pipe.serialize(filename="1/model.dali")
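
(For what it's worth, a graph like this can also be sanity-checked outside Triton by feeding dummy data into the external sources. A rough sketch, assuming a recent DALI release where Pipeline.feed_input accepts the external_source names used above; CPU placement and device_id=0 are choices made just for this local check.)

import numpy as np
import nvidia.dali as dali
import nvidia.dali.fn as fn

# Rebuild the same graph with batch_size=1 and run it locally on dummy data
# to check output shapes before deploying. device_id=0 assumes a visible GPU.
test_pipe = dali.pipeline.Pipeline(batch_size=1, num_threads=1, device_id=0)
with test_pipe:
    boxes_in = fn.external_source(device='cpu', name="NMSED_BOXES")
    ratio_in = fn.external_source(device='cpu', name="SCALE_RATIO_INPUT")
    test_pipe.set_outputs(boxes_in / fn.reductions.min(ratio_in))
test_pipe.build()

test_pipe.feed_input("NMSED_BOXES", [np.random.rand(100, 4).astype(np.float32)])
test_pipe.feed_input("SCALE_RATIO_INPUT", [np.array([0.5, 0.6], dtype=np.float32)])
out, = test_pipe.run()
print(out.as_array().shape)  # expected: (1, 100, 4)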

The above dali_det_post model runs correctly by itself, but connecting it to the first two models causes the server to crash.

Replacing the above post-processing model with a Python-backend model as follows runs without error:

for request in requests:
    # Get INPUT0
    in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
    # Get INPUT1
    in_1 = pb_utils.get_input_tensor_by_name(request, "INPUT1")

    out_0 = in_0.as_numpy() / np.min(in_1.as_numpy())

    # Create output tensors. You need pb_utils.Tensor
    # objects to create pb_utils.InferenceResponse.
    out_tensor_0 = pb_utils.Tensor("OUTPUT0",
                                   out_0.astype(output0_dtype))

    # Create InferenceResponse. You can set an error here in case
    # there was a problem with handling this inference request.
    # Below is an example of how you can set errors in inference
    # response:
    #
    # pb_utils.InferenceResponse(
    #    output_tensors=..., TritonError("An error occurred"))
    inference_response = pb_utils.InferenceResponse(
        output_tensors=[out_tensor_0])
    responses.append(inference_response)
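
(For context, the loop above lives inside the execute method of a Triton Python-backend model. A minimal complete model.py might look like the sketch below; reading the OUTPUT0 dtype from the model config in initialize is an assumption about how output0_dtype is obtained.)

import json

import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Resolve the numpy dtype of OUTPUT0 from the model configuration.
        model_config = json.loads(args["model_config"])
        output0_config = pb_utils.get_output_config_by_name(model_config, "OUTPUT0")
        self.output0_dtype = pb_utils.triton_string_to_numpy(output0_config["data_type"])

    def execute(self, requests):
        responses = []
        for request in requests:
            in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            in_1 = pb_utils.get_input_tensor_by_name(request, "INPUT1")
            # Rescale the boxes by the smaller of the two resize ratios.
            out_0 = in_0.as_numpy() / np.min(in_1.as_numpy())
            out_tensor_0 = pb_utils.Tensor("OUTPUT0", out_0.astype(self.output0_dtype))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_tensor_0]))
        return responses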

Python config.pbtxt is as follows:

name: "python_det_post"
backend: "python"

input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 100, 4 ]
  },
  {
    name: "INPUT1"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 100, 4 ]
  }
]

instance_group [{ kind: KIND_CPU }]

Any suggestions please? @deadeyegoodwin Thanks in advance.

@szalpal szalpal self-assigned this May 19, 2021
@szalpal (Member) commented May 19, 2021:

Hi @Edwardmark!

Thank you for the extensive description of the problem. I suspect your issue might be connected to the "gpu" placement of the external_source operator in DALI. Currently, GPU input is not yet supported; we are finishing this effort (#53). It is going to be released in tritonserver:21.06.

If you'd like to verify that the GPU input is the cause, please update your tritonserver to 21.04. With this version we added the missing error log in the DALI backend (#43).

@Edwardmark (Author):

@szalpal I changed the dali_det_post pipeline as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import nvidia.dali as dali
import nvidia.dali.fn as fn
import nvidia.dali.types as types

pipe = dali.pipeline.Pipeline(batch_size=32, num_threads=8)
with pipe:
    nmsed_boxes = fn.external_source(device='cpu', name="NMSED_BOXES")
    scale_ratio = fn.external_source(device='cpu', name='SCALE_RATIO_INPUT')
   
    # Rescale BBOX
    ratio = fn.reductions.min(scale_ratio)
    nmsed_boxes /= ratio
    pipe.set_outputs(nmsed_boxes)

pipe.serialize(filename="1/model.dali")

But I ran into the same error:

I0520 02:15:41.783194 133528 ensemble_scheduler.cc:509] Internal response allocation: nmsed_classes, size 400, addr 0x7fb0844b0e00, memory type 2, type id 0
I0520 02:15:41.788463 133528 ensemble_scheduler.cc:524] Internal response release: size 4, addr 0x7fb0844b0200
I0520 02:15:41.788483 133528 ensemble_scheduler.cc:524] Internal response release: size 1600, addr 0x7fb0844b0400
I0520 02:15:41.788489 133528 ensemble_scheduler.cc:524] Internal response release: size 400, addr 0x7fb0844b0c00
I0520 02:15:41.788496 133528 ensemble_scheduler.cc:524] Internal response release: size 400, addr 0x7fb0844b0e00
I0520 02:15:41.788517 133528 infer_request.cc:502] prepared: [0x0x7fadd40015e0] request id: , model: dali_det_post, requested version: -1, actual version: 1, flags: 0x0, correlation id: 0, batch size: 1, priority: 0, timeout (us): 0
original inputs:
[0x0x7fadd40019b8] input: NMSED_BOXES, type: FP32, original shape: [1,100,4], batch + shape: [1,100,4], shape: [100,4]
[0x0x7fadd4001868] input: SCALE_RATIO_INPUT, type: FP32, original shape: [1,2], batch + shape: [1,2], shape: [2]
override inputs:
inputs:
[0x0x7fadd4001868] input: SCALE_RATIO_INPUT, type: FP32, original shape: [1,2], batch + shape: [1,2], shape: [2]
[0x0x7fadd40019b8] input: NMSED_BOXES, type: FP32, original shape: [1,100,4], batch + shape: [1,100,4], shape: [100,4]
original requested outputs:
SCALED_NMSED_BOXES_OUTPUT
requested outputs:
SCALED_NMSED_BOXES_OUTPUT

tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Socket closed
> /app/model_repository/ensemble-face_det-ucs/grpc_client.py(182)main()

In addition, my first preprocess model is defined as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import nvidia.dali as dali
import nvidia.dali.fn as fn
import nvidia.dali.types as types
import argparse
import numpy as np
import os

pipe = dali.pipeline.Pipeline(batch_size=32, num_threads=8)
with pipe:
    expect_output_size = (640., 640.)
    images = fn.external_source(device='cpu', name="IMAGE_RAW")
    images = fn.image_decoder(images, device="mixed", output_type=types.RGB)
    raw_shapes = fn.shapes(images, dtype=types.INT32)
    images = fn.resize(
        images,
        mode='not_larger',
        size=expect_output_size,
    )
    resized_shapes = fn.shapes(images, dtype=types.INT32)
    ratio = fn.slice(resized_shapes / raw_shapes, 0, 2, axes=[0])
    images = fn.crop_mirror_normalize(images, mean=[0.], std=[255.], output_layout='CHW')
    images = fn.pad(images, axis_names="HW", align=expect_output_size)
    pipe.set_outputs(images, ratio)
os.system('rm -rf 1 && mkdir -p 1')
pipe.serialize(filename="1/model.dali")

Any advice on how to make it work, please? Thanks. @szalpal

@Edwardmark (Author):

@szalpal I changed the version to 21.04 and changed all inputs to cpu, but still no error log is shown and I get the same log as below. What is your advice? Thanks.
The output is the same as with 21.03:

I0520 02:58:13.877026 1181 plan_backend.cc:2447] Running face_det-ucs_0_gpu0 with 1 requests
I0520 02:58:13.877071 1181 plan_backend.cc:3378] Optimization profile default [0] is selected for face_det-ucs_0_gpu0
I0520 02:58:13.877337 1181 plan_backend.cc:2869] Context with profile default [0] is being executed for face_det-ucs_0_gpu0
I0520 02:58:14.543531 1181 infer_response.cc:139] add response output: output: num_detections, type: INT32, shape: [1,1]
I0520 02:58:14.543578 1181 ensemble_scheduler.cc:509] Internal response allocation: num_detections, size 4, addr 0x7f7bf04b0200, memory type 2, type id 0
I0520 02:58:14.543609 1181 infer_response.cc:139] add response output: output: nmsed_boxes, type: FP32, shape: [1,100,4]
I0520 02:58:14.543621 1181 ensemble_scheduler.cc:509] Internal response allocation: nmsed_boxes, size 1600, addr 0x7f7bf04b0400, memory type 2, type id 0
I0520 02:58:14.543642 1181 infer_response.cc:139] add response output: output: nmsed_scores, type: FP32, shape: [1,100]
I0520 02:58:14.543653 1181 ensemble_scheduler.cc:509] Internal response allocation: nmsed_scores, size 400, addr 0x7f7bf04b0c00, memory type 2, type id 0
I0520 02:58:14.543672 1181 infer_response.cc:139] add response output: output: nmsed_classes, type: FP32, shape: [1,100]
I0520 02:58:14.543683 1181 ensemble_scheduler.cc:509] Internal response allocation: nmsed_classes, size 400, addr 0x7f7bf04b0e00, memory type 2, type id 0
I0520 02:58:14.544713 1181 ensemble_scheduler.cc:524] Internal response release: size 4, addr 0x7f7bf04b0200
I0520 02:58:14.544741 1181 ensemble_scheduler.cc:524] Internal response release: size 1600, addr 0x7f7bf04b0400
I0520 02:58:14.544749 1181 ensemble_scheduler.cc:524] Internal response release: size 400, addr 0x7f7bf04b0c00
I0520 02:58:14.544764 1181 ensemble_scheduler.cc:524] Internal response release: size 400, addr 0x7f7bf04b0e00
I0520 02:58:14.544789 1181 infer_request.cc:497] prepared: [0x0x7f79300016b0] request id: , model: dali_det_post, requested version: -1, actual version: 1, flags: 0x0, correlation id: 0, batch size: 1, priority: 0, timeout (us): 0
original inputs:
[0x0x7f7930001a88] input: NMSED_BOXES, type: FP32, original shape: [1,100,4], batch + shape: [1,100,4], shape: [100,4]
[0x0x7f7930001938] input: SCALE_RATIO_INPUT, type: FP32, original shape: [1,2], batch + shape: [1,2], shape: [2]
override inputs:
inputs:
[0x0x7f7930001938] input: SCALE_RATIO_INPUT, type: FP32, original shape: [1,2], batch + shape: [1,2], shape: [2]
[0x0x7f7930001a88] input: NMSED_BOXES, type: FP32, original shape: [1,100,4], batch + shape: [1,100,4], shape: [100,4]
original requested outputs:
SCALED_NMSED_BOXES_OUTPUT
requested outputs:
SCALED_NMSED_BOXES_OUTPUT

tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Socket closed
> /app/model_repository_2104/ensemble-face_det-ucs/grpc_client.py(182)main()

@szalpal (Member) commented May 20, 2021:

@Edwardmark

It's possible that even though you changed the ExternalSource to "cpu", the bug still prevents normal processing. Anyhow, we've just merged the GPU input feature upstream. It's going to be released in tritonserver:21.06; however, it's very easy to run the upstream dali_backend with the latest tritonserver release.

Could you try it out and verify whether the GPU input solves your problem, or whether we need to dig deeper? The instructions on how to build the dali_backend Docker image are here: Docker build

@Edwardmark (Author):

@szalpal It works, thank you very much.

@Edwardmark (Author):

@szalpal How can I build the Docker image without downloading the git repositories? I mean, if I download the related git repos beforehand, what changes should I make to the CMakeLists in dali_backend? When building the Docker image, the following error occurs, which looks like a network error:

Step 12/19 : RUN mkdir build_in_ci && cd build_in_ci &&     cmake                                                   -D CMAKE_INSTALL_PREFIX=/opt/tritonserver             -D CMAKE_BUILD_TYPE=Release                           -D TRITON_COMMON_REPO_TAG="r$TRITON_VERSION"          -D TRITON_CORE_REPO_TAG="r$TRITON_VERSION"            -D TRITON_BACKEND_REPO_TAG="r$TRITON_VERSION"         .. &&                                               make -j"$(grep ^processor /proc/cpuinfo | wc -l)" install
 ---> Running in e11becb3e19f
-- The C compiler identification is GNU 9.3.0
-- The CXX compiler identification is GNU 9.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ - works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Build configuration: Release
-- RapidJSON found. Headers: /usr/include
-- RapidJSON found. Headers: /usr/include
Scanning dependencies of target repo-core-populate
[ 11%] Creating directories for 'repo-core-populate'
[ 22%] Performing download step (git clone) for 'repo-core-populate'
Cloning into 'repo-core-src'...
Switched to a new branch 'r21.05'
Branch 'r21.05' set up to track remote branch 'r21.05' from 'origin'.
[ 33%] No patch step for 'repo-core-populate'
[ 44%] Performing update step for 'repo-core-populate'
fatal: unable to access 'https://github.com/triton-inference-server/core.git/': GnuTLS recv error (-110): The TLS connection was non-properly terminated.
CMake Error at /dali/build_in_ci/_deps/repo-core-subbuild/repo-core-populate-prefix/tmp/repo-core-populate-gitupdate.cmake:55 (message):
  Failed to fetch repository
  'https://github.com/triton-inference-server/core.git'


make[2]: *** [CMakeFiles/repo-core-populate.dir/build.make:117: repo-core-populate-prefix/src/repo-core-populate-stamp/repo-core-populate-update] Error 1
make[1]: *** [CMakeFiles/Makefile2:96: CMakeFiles/repo-core-populate.dir/all] Error 2
make: *** [Makefile:104: all] Error 2

CMake Error at /usr/local/share/cmake-3.17/Modules/FetchContent.cmake:912 (message):
  Build step for repo-core failed: 2
Call Stack (most recent call first):
  /usr/local/share/cmake-3.17/Modules/FetchContent.cmake:1003 (__FetchContent_directPopulate)
  /usr/local/share/cmake-3.17/Modules/FetchContent.cmake:1044 (FetchContent_Populate)
  CMakeLists.txt:72 (FetchContent_MakeAvailable)

@szalpal szalpal transferred this issue from triton-inference-server/server May 26, 2021
@szalpal szalpal added the question Further information is requested label May 26, 2021
@szalpal (Member) commented May 26, 2021:

@Edwardmark,

As far as I know, cloning the git repos is unfortunately inherent to building backends in Triton. Is there a particular reason you would like to clone the repos beforehand? If you want to use the latest tritonserver version (21.05), I merged the PR that enables that today (#68), so you can clone the upstream dali_backend.

@szalpal szalpal reopened this May 26, 2021
@Edwardmark (Author):

@szalpal Because the network is not always good, I want to clone the repos beforehand and then just use them so that the build process is quicker.

@szalpal (Member) commented May 27, 2021:

@Edwardmark,

I see. It would be possible to tweak the root CMakeLists.txt file to achieve what you want. However, this is not in our scope right now (and I doubt it ever will be), so we will not implement it; you would need to try it yourself.

IMPORTANT: this is a rough explanation of a workaround, and we do not support, nor plan to support, this way of building in the foreseeable future. We also highly discourage changing the build procedure like this for production environments.

The point is that there are three repos that need to be acquired to properly build any backend: core, common and backend. Our build procedure acquires them in these three declarations:

FetchContent_Declare(
  repo-common
  GIT_REPOSITORY https://github.com/triton-inference-server/common.git
  GIT_TAG ${TRITON_COMMON_REPO_TAG}
  GIT_SHALLOW ON
)
FetchContent_Declare(
  repo-core
  GIT_REPOSITORY https://github.com/triton-inference-server/core.git
  GIT_TAG ${TRITON_CORE_REPO_TAG}
  GIT_SHALLOW ON
)
FetchContent_Declare(
  repo-backend
  GIT_REPOSITORY https://github.com/triton-inference-server/backend.git
  GIT_TAG ${TRITON_BACKEND_REPO_TAG}
  GIT_SHALLOW ON
)

If you'd like them to be acquired from your disk instead, first clone all three repos you need; then you can switch from fetching the content from a git repository to fetching it from a disk location by changing the GIT_TAG, GIT_SHALLOW and GIT_REPOSITORY options. Below is the documentation of the FetchContent functions, which might be helpful:
https://cmake.org/cmake/help/latest/module/FetchContent.html
https://cmake.org/cmake/help/latest/module/ExternalProject.html#command:externalproject_add
You should pay attention to the Directory Options of the ExternalProject_Add directive.

@Edwardmark (Author):

@szalpal Thank you very much.

@Edwardmark (Author) commented Jun 17, 2021:

@szalpal Could you please give me more hints on how to change the GIT_TAG, GIT_SHALLOW and GIT_REPOSITORY options? Thanks. I changed the lines as follows:


FetchContent_Declare(
  repo-common
  SOURCE_DIR /dali/common/
)
FetchContent_Declare(
  repo-core
  SOURCE_DIR /dali/core/
)
FetchContent_Declare(
  repo-backend
  SOURCE_DIR /dali/backend/
)
FetchContent_MakeAvailable(repo-common repo-core repo-backend)

Is that right?
The directories /dali/common/, /dali/core/ and /dali/backend/ are obtained by:

 git clone https://github.com/triton-inference-server/common.git
 git clone https://github.com/triton-inference-server/core.git
 git clone https://github.com/triton-inference-server/backend.git

I built the docker image successfully.

@Edwardmark Edwardmark reopened this Jun 17, 2021
@szalpal (Member) commented Jun 17, 2021:

@Edwardmark,

What is the problem you are facing?

@Edwardmark (Author) commented Jun 17, 2021:

@szalpal I just want to make sure that the way I tried is the correct way to replace the git repos with local repos.
The docker build process is OK, but when I try to run the server, it crashes:

I0617 08:17:59.462826 81 dali_backend.cc:269] Triton TRITONBACKEND API version: 1.0
I0617 08:17:59.462836 81 dali_backend.cc:273] 'dali' TRITONBACKEND API version: 1.4
 Segmentation fault (core dumped)

How should I deal with that?

@szalpal (Member) commented Jun 17, 2021:

@Edwardmark,

As I mentioned above, we do not support, nor plan to support, this kind of build procedure, so unfortunately I won't be able to answer all the questions, simply because I haven't tried or tested it.

The error you're facing occurs because the server verifies the API version the backend has been built with. Be sure to use the proper version of the backend.git repo, which has the following defines:

#define TRITONBACKEND_API_VERSION_MAJOR 1
#define TRITONBACKEND_API_VERSION_MINOR 0

@Edwardmark (Author) commented Jun 17, 2021:

I checked out the 21.05 branch, and the problem is solved. Thank you very much. @szalpal

@Edwardmark (Author) commented Jun 18, 2021:

@szalpal Do I have to install nvidia-dali-nightly?
https://github.com/triton-inference-server/dali_backend/blob/main/docker/Dockerfile.release#L65
Thanks.

@Edwardmark (Author):

@szalpal Thanks.

@szalpal (Member) commented Jun 18, 2021:

@szalpal Do I have to install nvidia-dali-nightly?
https://github.com/triton-inference-server/dali_backend/blob/main/docker/Dockerfile.release#L65
Thanks.

Not necessarily. We recommend using the latest DALI release.

@Edwardmark (Author):

If I use dali 1.2, would the dali_backend support gpu input?

@szalpal (Member) commented Jun 18, 2021:

If I use dali 1.2, would the dali_backend support gpu input?

@Edwardmark, yes, although we don't guarantee backwards compatibility. Therefore, only the latest DALI version is properly tested and maintained.
