Ensemble model inference fails from clients with no clear error log to debug #70

Closed
Edwardmark opened this issue May 19, 2021 · 20 comments

@Edwardmark:

Description
I am running an ensemble model that contains three models executed sequentially, one after another. I checked each model individually and each one is OK, and an ensemble of two of the models works as well. But when I connect all three together and run the gRPC client, the server crashes without a meaningful error log:

Traceback (most recent call last):
  File "grpc_client.py", line 209, in <module>
    main()
  File "grpc_client.py", line 182, in main
    results = triton_client.infer(model_name=model_name, inputs=inputs, outputs=outputs)
  File "/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py", line 1086, in infer
    raise_error_grpc(rpc_error)
  File "/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py", line 61, in raise_error_grpc
    raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Socket closed

Triton Information
21.03 docker container

To Reproduce
The ensemble config.pbtxt is as follows:

platform: "ensemble"
max_batch_size: 16
input [
  {
    name: "IMAGE_RAW"
    data_type: TYPE_UINT8
    dims: [ -1 ]
  }
]
output [
  {
    name: "SCALE_RATIO" # ratio
    data_type: TYPE_FP32
    dims: [2]
  },
  {
    name: "NUM_DETECTIONS"
    data_type: TYPE_INT32
    dims: [ 1 ]
  },
  {
    name: "NMSED_SCORES"
    data_type: TYPE_FP32
    dims: [ 100 ]
  },
  {
    name: "NMSED_CLASSES"
    data_type: TYPE_FP32
    dims: [ 100 ]
  },
  {
    name: "SCALED_NMSED_BOXES"
    data_type: TYPE_FP32
    dims: [ 100, 4 ]
  }
]

ensemble_scheduling {
  step [
    {
      model_name: "dali_det_pre"
      model_version: -1
      input_map {
        key: "IMAGE_RAW"
        value: "IMAGE_RAW"
      }
      output_map {
        key: "DALI_OUTPUT_0"
        value: "NORM_IMG"
      }
      output_map {
        key: "DALI_OUTPUT_1"
        value: "SCALE_RATIO"
      }
    },
    {
      model_name: "face_det-ucs"
      model_version: -1
      input_map {
        key: "images"
        value: "NORM_IMG"
      }
      output_map {
        key: "num_detections"
        value: "NUM_DETECTIONS"
      }
      output_map {
        key: "nmsed_boxes"
        value: "NMSED_BOXES"
      }
      output_map {
        key: "nmsed_scores"
        value: "NMSED_SCORES"
      }
      output_map {
        key: "nmsed_classes"
        value: "NMSED_CLASSES"
      }
    },
    {
      model_name: "dali_det_post"
      model_version: -1
      input_map {
        key: "NMSED_BOXES"
        value: "NMSED_BOXES"
      }
      input_map {
        key: "SCALE_RATIO_INPUT"
        value: "SCALE_RATIO"
      }
      output_map {
        key: "SCALED_NMSED_BOXES_OUTPUT"
        value: "SCALED_NMSED_BOXES"
      }
    }
  ]
}
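
(A quick sanity check before sending inference requests is to ask the server whether the ensemble loaded correctly and which inputs/outputs it exposes. Below is a minimal sketch with tritonclient.grpc; the URL and the model name "ensemble-face_det-ucs" are assumptions based on the paths shown later in this issue.)

import tritonclient.grpc as t_client

client = t_client.InferenceServerClient(url="localhost:8001")

# Fail early if the server or the ensemble did not come up cleanly.
assert client.is_server_live()
assert client.is_model_ready("ensemble-face_det-ucs")

# Compare the resolved inputs/outputs against the config.pbtxt above.
print(client.get_model_metadata("ensemble-face_det-ucs"))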

The client is as follows:

import tritonclient.grpc

# parse_args, load_images, array_from_list, generate_inputs and
# generate_outputs are helper functions defined elsewhere in the script.
FLAGS = parse_args()

triton_client = tritonclient.grpc.InferenceServerClient(url=FLAGS.url,
                                                        verbose=FLAGS.verbose)

model_name = FLAGS.model_name
model_version = -1

print("Loading images")

image_data = load_images(FLAGS.img_dir if FLAGS.img_dir is not None else FLAGS.img,
                         max_images=FLAGS.batch_size * FLAGS.n_iter)

image_data = array_from_list(image_data)
inputs = generate_inputs(FLAGS.input_names, image_data.shape, "UINT8")
outputs = generate_outputs(FLAGS.output_names)

# Initialize the data
inputs[0].set_data_from_numpy(image_data)
# Test with outputs
results = triton_client.infer(model_name=model_name, inputs=inputs, outputs=outputs)
print(results)
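
(The helper functions above are not shown in the issue; with tritonclient.grpc they might look roughly like the sketch below. The names and signatures here are assumptions inferred from how they are called, not the actual client code.)

import glob
import os

import numpy as np
import tritonclient.grpc as t_client


def load_images(path, max_images):
    # Read up to max_images encoded image files as raw byte arrays;
    # the DALI preprocessing model decodes them on the server side.
    files = sorted(glob.glob(os.path.join(path, "*"))) if os.path.isdir(path) else [path]
    return [np.fromfile(f, dtype=np.uint8) for f in files[:max_images]]


def array_from_list(arrays):
    # Zero-pad the encoded images to a common length so that they can be
    # batched into a single [batch, max_len] uint8 tensor.
    max_len = max(a.shape[0] for a in arrays)
    return np.stack([np.pad(a, (0, max_len - a.shape[0])) for a in arrays])


def generate_inputs(input_names, input_shape, datatype):
    return [t_client.InferInput(name, list(input_shape), datatype) for name in input_names]


def generate_outputs(output_names):
    return [t_client.InferRequestedOutput(name) for name in output_names]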

Expected behavior
Results should be returned without error, but currently the server just crashes.
@deadeyegoodwin Looking forward to your reply.

The dali_det_post model config.pbtxt is as follows:

backend: "dali"
max_batch_size: 32
input [
  {
    name: "NMSED_BOXES"
    data_type: TYPE_FP32
    dims: [ 100, 4 ]
  },
  {
    name: "SCALE_RATIO_INPUT"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
output [
  {
    name: "SCALED_NMSED_BOXES_OUTPUT" 
    data_type: TYPE_FP32
    dims: [100, 4]
  }
]
dynamic_batching {
  preferred_batch_size: [ 4, 8, 16, 32 ]
  max_queue_delay_microseconds: 100
}

The above pipeline is generated using the following code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import nvidia.dali as dali
import nvidia.dali.fn as fn
import nvidia.dali.types as types

pipe = dali.pipeline.Pipeline(batch_size=32, num_threads=8)
with pipe:
    nmsed_boxes = fn.external_source(device='gpu', name="NMSED_BOXES")
    scale_ratio = fn.external_source(device='gpu', name='SCALE_RATIO_INPUT')
   
    # Rescale BBOX
    ratio = fn.reductions.min(scale_ratio)
    nmsed_boxes /= ratio
    pipe.set_outputs(nmsed_boxes)

pipe.serialize(filename="1/model.dali")
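
(For what it's worth, a graph like this can also be sanity-checked outside Triton by feeding dummy data into the external sources. A rough sketch, assuming a recent DALI release where Pipeline.feed_input accepts the external_source names used above; CPU placement and device_id=0 are choices made just for this local check.)

import numpy as np
import nvidia.dali as dali
import nvidia.dali.fn as fn

# Rebuild the same graph with batch_size=1 and run it locally on dummy data
# to check output shapes before deploying. device_id=0 assumes a visible GPU.
test_pipe = dali.pipeline.Pipeline(batch_size=1, num_threads=1, device_id=0)
with test_pipe:
    boxes_in = fn.external_source(device='cpu', name="NMSED_BOXES")
    ratio_in = fn.external_source(device='cpu', name="SCALE_RATIO_INPUT")
    test_pipe.set_outputs(boxes_in / fn.reductions.min(ratio_in))
test_pipe.build()

test_pipe.feed_input("NMSED_BOXES", [np.random.rand(100, 4).astype(np.float32)])
test_pipe.feed_input("SCALE_RATIO_INPUT", [np.array([0.5, 0.6], dtype=np.float32)])
out, = test_pipe.run()
print(out.as_array().shape)  # expected: (1, 100, 4)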

The above dali_det_post model runs correctly by itself, but connecting it to the first two models causes the server to crash.

Replacing the above post-processing model with a Python-backend model as follows runs without error:

for request in requests:
    # Get INPUT0
    in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
    # Get INPUT1
    in_1 = pb_utils.get_input_tensor_by_name(request, "INPUT1")

    out_0 = in_0.as_numpy() / np.min(in_1.as_numpy())

    # Create output tensors. You need pb_utils.Tensor
    # objects to create pb_utils.InferenceResponse.
    out_tensor_0 = pb_utils.Tensor("OUTPUT0",
                                   out_0.astype(output0_dtype))

    # Create InferenceResponse. You can set an error here in case
    # there was a problem with handling this inference request.
    # Below is an example of how you can set errors in inference
    # response:
    #
    # pb_utils.InferenceResponse(
    #    output_tensors=..., TritonError("An error occurred"))
    inference_response = pb_utils.InferenceResponse(
        output_tensors=[out_tensor_0])
    responses.append(inference_response)
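
(For context, the loop above lives inside the execute method of a Triton Python-backend model. A minimal complete model.py might look like the sketch below; reading the OUTPUT0 dtype from the model config in initialize is an assumption about how output0_dtype is obtained.)

import json

import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Resolve the numpy dtype of OUTPUT0 from the model configuration.
        model_config = json.loads(args["model_config"])
        output0_config = pb_utils.get_output_config_by_name(model_config, "OUTPUT0")
        self.output0_dtype = pb_utils.triton_string_to_numpy(output0_config["data_type"])

    def execute(self, requests):
        responses = []
        for request in requests:
            in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            in_1 = pb_utils.get_input_tensor_by_name(request, "INPUT1")
            # Rescale the boxes by the smaller of the two resize ratios.
            out_0 = in_0.as_numpy() / np.min(in_1.as_numpy())
            out_tensor_0 = pb_utils.Tensor("OUTPUT0", out_0.astype(self.output0_dtype))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_tensor_0]))
        return responses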

Python config.pbtxt is as follows:

name: "python_det_post"
backend: "python"

input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 100, 4 ]
  },
  {
    name: "INPUT1"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 100, 4 ]
  }
]

instance_group [{ kind: KIND_CPU }]

Any suggestions please? @deadeyegoodwin Thanks in advance.

@szalpal szalpal self-assigned this May 19, 2021
@szalpal (Member) commented May 19, 2021:

Hi @Edwardmark!

Thank you for the extensive description of the problem. I suspect your issue might be connected to the "gpu" placement of the external_source operator in DALI. Currently, GPU input is not yet supported; we are finishing this effort (#53). It is going to be released in tritonserver:21.06.

If you'd like to verify that the GPU input is the cause, please update your tritonserver to 21.04. With this version we added the missing error log in the DALI backend (#43).

@Edwardmark (Author):

@szalpal I changed the dali_det_post pipeline as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import nvidia.dali as dali
import nvidia.dali.fn as fn
import nvidia.dali.types as types

pipe = dali.pipeline.Pipeline(batch_size=32, num_threads=8)
with pipe:
    nmsed_boxes = fn.external_source(device='cpu', name="NMSED_BOXES")
    scale_ratio = fn.external_source(device='cpu', name='SCALE_RATIO_INPUT')
   
    # Rescale BBOX
    ratio = fn.reductions.min(scale_ratio)
    nmsed_boxes /= ratio
    pipe.set_outputs(nmsed_boxes)

pipe.serialize(filename="1/model.dali")

But I ran into the same error:

I0520 02:15:41.783194 133528 ensemble_scheduler.cc:509] Internal response allocation: nmsed_classes, size 400, addr 0x7fb0844b0e00, memory type 2, type id 0
I0520 02:15:41.788463 133528 ensemble_scheduler.cc:524] Internal response release: size 4, addr 0x7fb0844b0200
I0520 02:15:41.788483 133528 ensemble_scheduler.cc:524] Internal response release: size 1600, addr 0x7fb0844b0400
I0520 02:15:41.788489 133528 ensemble_scheduler.cc:524] Internal response release: size 400, addr 0x7fb0844b0c00
I0520 02:15:41.788496 133528 ensemble_scheduler.cc:524] Internal response release: size 400, addr 0x7fb0844b0e00
I0520 02:15:41.788517 133528 infer_request.cc:502] prepared: [0x0x7fadd40015e0] request id: , model: dali_det_post, requested version: -1, actual version: 1, flags: 0x0, correlation id: 0, batch size: 1, priority: 0, timeout (us): 0
original inputs:
[0x0x7fadd40019b8] input: NMSED_BOXES, type: FP32, original shape: [1,100,4], batch + shape: [1,100,4], shape: [100,4]
[0x0x7fadd4001868] input: SCALE_RATIO_INPUT, type: FP32, original shape: [1,2], batch + shape: [1,2], shape: [2]
override inputs:
inputs:
[0x0x7fadd4001868] input: SCALE_RATIO_INPUT, type: FP32, original shape: [1,2], batch + shape: [1,2], shape: [2]
[0x0x7fadd40019b8] input: NMSED_BOXES, type: FP32, original shape: [1,100,4], batch + shape: [1,100,4], shape: [100,4]
original requested outputs:
SCALED_NMSED_BOXES_OUTPUT
requested outputs:
SCALED_NMSED_BOXES_OUTPUT

tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Socket closed
> /app/model_repository/ensemble-face_det-ucs/grpc_client.py(182)main()

In addition, my first preprocess model is defined as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import nvidia.dali as dali
import nvidia.dali.fn as fn
import nvidia.dali.types as types
import argparse
import numpy as np
import os

pipe = dali.pipeline.Pipeline(batch_size=32, num_threads=8)
with pipe:
    expect_output_size = (640., 640.)
    images = fn.external_source(device='cpu', name="IMAGE_RAW")
    images = fn.image_decoder(images, device="mixed", output_type=types.RGB)
    raw_shapes = fn.shapes(images, dtype=types.INT32)
    images = fn.resize(
        images,
        mode='not_larger',
        size=expect_output_size,
    )
    resized_shapes = fn.shapes(images, dtype=types.INT32)
    ratio = fn.slice(resized_shapes / raw_shapes, 0, 2, axes=[0])
    images = fn.crop_mirror_normalize(images, mean=[0.], std=[255.], output_layout='CHW')
    images = fn.pad(images, axis_names="HW", align=expect_output_size)
    pipe.set_outputs(images, ratio)
os.system('rm -rf 1 && mkdir -p 1')
pipe.serialize(filename="1/model.dali")

Any advice on how to make it work, please? Thanks. @szalpal

@Edwardmark (Author):

@szalpal I changed the version to 21.04 and changed all inputs to cpu, but still no error log is shown and I get the same log as below. What is your advice? Thanks.
The output is the same as with 21.03:

I0520 02:58:13.877026 1181 plan_backend.cc:2447] Running face_det-ucs_0_gpu0 with 1 requests
I0520 02:58:13.877071 1181 plan_backend.cc:3378] Optimization profile default [0] is selected for face_det-ucs_0_gpu0
I0520 02:58:13.877337 1181 plan_backend.cc:2869] Context with profile default [0] is being executed for face_det-ucs_0_gpu0
I0520 02:58:14.543531 1181 infer_response.cc:139] add response output: output: num_detections, type: INT32, shape: [1,1]
I0520 02:58:14.543578 1181 ensemble_scheduler.cc:509] Internal response allocation: num_detections, size 4, addr 0x7f7bf04b0200, memory type 2, type id 0
I0520 02:58:14.543609 1181 infer_response.cc:139] add response output: output: nmsed_boxes, type: FP32, shape: [1,100,4]
I0520 02:58:14.543621 1181 ensemble_scheduler.cc:509] Internal response allocation: nmsed_boxes, size 1600, addr 0x7f7bf04b0400, memory type 2, type id 0
I0520 02:58:14.543642 1181 infer_response.cc:139] add response output: output: nmsed_scores, type: FP32, shape: [1,100]
I0520 02:58:14.543653 1181 ensemble_scheduler.cc:509] Internal response allocation: nmsed_scores, size 400, addr 0x7f7bf04b0c00, memory type 2, type id 0
I0520 02:58:14.543672 1181 infer_response.cc:139] add response output: output: nmsed_classes, type: FP32, shape: [1,100]
I0520 02:58:14.543683 1181 ensemble_scheduler.cc:509] Internal response allocation: nmsed_classes, size 400, addr 0x7f7bf04b0e00, memory type 2, type id 0
I0520 02:58:14.544713 1181 ensemble_scheduler.cc:524] Internal response release: size 4, addr 0x7f7bf04b0200
I0520 02:58:14.544741 1181 ensemble_scheduler.cc:524] Internal response release: size 1600, addr 0x7f7bf04b0400
I0520 02:58:14.544749 1181 ensemble_scheduler.cc:524] Internal response release: size 400, addr 0x7f7bf04b0c00
I0520 02:58:14.544764 1181 ensemble_scheduler.cc:524] Internal response release: size 400, addr 0x7f7bf04b0e00
I0520 02:58:14.544789 1181 infer_request.cc:497] prepared: [0x0x7f79300016b0] request id: , model: dali_det_post, requested version: -1, actual version: 1, flags: 0x0, correlation id: 0, batch size: 1, priority: 0, timeout (us): 0
original inputs:
[0x0x7f7930001a88] input: NMSED_BOXES, type: FP32, original shape: [1,100,4], batch + shape: [1,100,4], shape: [100,4]
[0x0x7f7930001938] input: SCALE_RATIO_INPUT, type: FP32, original shape: [1,2], batch + shape: [1,2], shape: [2]
override inputs:
inputs:
[0x0x7f7930001938] input: SCALE_RATIO_INPUT, type: FP32, original shape: [1,2], batch + shape: [1,2], shape: [2]
[0x0x7f7930001a88] input: NMSED_BOXES, type: FP32, original shape: [1,100,4], batch + shape: [1,100,4], shape: [100,4]
original requested outputs:
SCALED_NMSED_BOXES_OUTPUT
requested outputs:
SCALED_NMSED_BOXES_OUTPUT

tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Socket closed
> /app/model_repository_2104/ensemble-face_det-ucs/grpc_client.py(182)main()

@szalpal (Member) commented May 20, 2021:

@Edwardmark

It's possible that even though you changed the ExternalSource to "cpu", the bug still prevents normal processing. Anyhow, we've just merged the GPU input feature upstream. It's going to be released in tritonserver:21.06; however, it's very easy to run the upstream dali_backend with the latest tritonserver release.

Could you try it out and verify whether the GPU input solves your problem, or whether we need to dig deeper? The instructions on how to build the dali_backend Docker image are here: Docker build

@Edwardmark (Author):

@szalpal It works, thank you very much.

@Edwardmark (Author):

@szalpal How can I build the Docker image without downloading the git repositories? I mean, if I download the related git repos beforehand, what changes should I make to the CMakeLists in dali_backend? When building the Docker image, the following error occurs, which looks like a network error:

Step 12/19 : RUN mkdir build_in_ci && cd build_in_ci &&     cmake                                                   -D CMAKE_INSTALL_PREFIX=/opt/tritonserver             -D CMAKE_BUILD_TYPE=Release                           -D TRITON_COMMON_REPO_TAG="r$TRITON_VERSION"          -D TRITON_CORE_REPO_TAG="r$TRITON_VERSION"            -D TRITON_BACKEND_REPO_TAG="r$TRITON_VERSION"         .. &&                                               make -j"$(grep ^processor /proc/cpuinfo | wc -l)" install
 ---> Running in e11becb3e19f
-- The C compiler identification is GNU 9.3.0
-- The CXX compiler identification is GNU 9.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc - works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ - works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Build configuration: Release
-- RapidJSON found. Headers: /usr/include
-- RapidJSON found. Headers: /usr/include
Scanning dependencies of target repo-core-populate
[ 11%] Creating directories for 'repo-core-populate'
[ 22%] Performing download step (git clone) for 'repo-core-populate'
Cloning into 'repo-core-src'...
Switched to a new branch 'r21.05'
Branch 'r21.05' set up to track remote branch 'r21.05' from 'origin'.
[ 33%] No patch step for 'repo-core-populate'
[ 44%] Performing update step for 'repo-core-populate'
fatal: unable to access 'https://github.com/triton-inference-server/core.git/': GnuTLS recv error (-110): The TLS connection was non-properly terminated.
CMake Error at /dali/build_in_ci/_deps/repo-core-subbuild/repo-core-populate-prefix/tmp/repo-core-populate-gitupdate.cmake:55 (message):
  Failed to fetch repository
  'https://github.com/triton-inference-server/core.git'


make[2]: *** [CMakeFiles/repo-core-populate.dir/build.make:117: repo-core-populate-prefix/src/repo-core-populate-stamp/repo-core-populate-update] Error 1
make[1]: *** [CMakeFiles/Makefile2:96: CMakeFiles/repo-core-populate.dir/all] Error 2
make: *** [Makefile:104: all] Error 2

CMake Error at /usr/local/share/cmake-3.17/Modules/FetchContent.cmake:912 (message):
  Build step for repo-core failed: 2
Call Stack (most recent call first):
  /usr/local/share/cmake-3.17/Modules/FetchContent.cmake:1003 (__FetchContent_directPopulate)
  /usr/local/share/cmake-3.17/Modules/FetchContent.cmake:1044 (FetchContent_Populate)
  CMakeLists.txt:72 (FetchContent_MakeAvailable)

@szalpal szalpal transferred this issue from triton-inference-server/server May 26, 2021
@szalpal szalpal added the question Further information is requested label May 26, 2021
@szalpal (Member) commented May 26, 2021:

@Edwardmark,

As far as I know, cloning the git repos is unfortunately inherent to building backends in Triton. Is there a particular reason you would like to clone the repos beforehand? If you want to use the latest tritonserver version (21.05), I merged the PR that enables that today (#68), so you can clone the upstream dali_backend.

@szalpal szalpal reopened this May 26, 2021
@Edwardmark (Author):

@szalpal Because the network is not always good, I want to clone the repos beforehand and then just use them so that the build process is quicker.

@szalpal (Member) commented May 27, 2021:

@Edwardmark,

I see. It would be possible to tweak the root CMakeLists.txt file to achieve what you want. However, this is not in our scope right now (and I doubt it ever will be), so we will not implement it; you would need to try it yourself.

IMPORTANT: this is a rough explanation of a workaround, and we do not support, nor plan to support, this way of building in the foreseeable future. We also highly discourage changing the build procedure like this for production environments.

The point is that there are three repos that need to be acquired to properly build any backend: core, common and backend. Our build procedure acquires them in these three declarations:

FetchContent_Declare(
  repo-common
  GIT_REPOSITORY https://github.com/triton-inference-server/common.git
  GIT_TAG ${TRITON_COMMON_REPO_TAG}
  GIT_SHALLOW ON
)
FetchContent_Declare(
  repo-core
  GIT_REPOSITORY https://github.com/triton-inference-server/core.git
  GIT_TAG ${TRITON_CORE_REPO_TAG}
  GIT_SHALLOW ON
)
FetchContent_Declare(
  repo-backend
  GIT_REPOSITORY https://github.com/triton-inference-server/backend.git
  GIT_TAG ${TRITON_BACKEND_REPO_TAG}
  GIT_SHALLOW ON
)

If you'd like them to be acquired from your disk instead, first clone all three repos you need; then you can switch from fetching the content from a git repository to fetching it from a disk location by changing the GIT_TAG, GIT_SHALLOW and GIT_REPOSITORY options. Below is the documentation of the FetchContent functions, which might be helpful:
https://cmake.org/cmake/help/latest/module/FetchContent.html
https://cmake.org/cmake/help/latest/module/ExternalProject.html#command:externalproject_add
You should pay attention to the Directory Options of the ExternalProject_Add directive.

@Edwardmark (Author):

@szalpal Thank you very much.

@Edwardmark (Author) commented Jun 17, 2021:

@szalpal Could you please give me more hints on how to change the GIT_TAG, GIT_SHALLOW and GIT_REPOSITORY options? Thanks. I changed the lines as follows:


FetchContent_Declare(
  repo-common
  SOURCE_DIR /dali/common/
)
FetchContent_Declare(
  repo-core
  SOURCE_DIR /dali/core/
)
FetchContent_Declare(
  repo-backend
  SOURCE_DIR /dali/backend/
)
FetchContent_MakeAvailable(repo-common repo-core repo-backend)

Is that right?
The directories /dali/common/, /dali/core/ and /dali/backend/ are obtained by:

 git clone https://github.com/triton-inference-server/common.git
 git clone https://github.com/triton-inference-server/core.git
 git clone https://github.com/triton-inference-server/backend.git

I built the docker image successfully.

@Edwardmark Edwardmark reopened this Jun 17, 2021
@szalpal (Member) commented Jun 17, 2021:

@Edwardmark,

What is the problem you are facing?

@Edwardmark (Author) commented Jun 17, 2021:

@szalpal I just want to make sure that the way I tried is the correct way to replace the git repos with local repos.
The docker build process is OK, but when I try to run the server, it crashes:

I0617 08:17:59.462826 81 dali_backend.cc:269] Triton TRITONBACKEND API version: 1.0
I0617 08:17:59.462836 81 dali_backend.cc:273] 'dali' TRITONBACKEND API version: 1.4
 Segmentation fault (core dumped)

How should I deal with that?

@szalpal (Member) commented Jun 17, 2021:

@Edwardmark,

As I mentioned above, we do not support, nor plan to support, this kind of build procedure, so unfortunately I won't be able to answer all the questions, simply because I haven't tried or tested it.

The error you're facing occurs because the server verifies the API version the backend has been built with. Be sure to use the proper version of the backend.git repo, which has the following defines:

#define TRITONBACKEND_API_VERSION_MAJOR 1
#define TRITONBACKEND_API_VERSION_MINOR 0

@Edwardmark (Author) commented Jun 17, 2021:

I checked out the 21.05 branch, and the problem is solved. Thank you very much. @szalpal

@Edwardmark (Author) commented Jun 18, 2021:

@szalpal Do I have to install nvidia-dali-nightly?
https://github.com/triton-inference-server/dali_backend/blob/main/docker/Dockerfile.release#L65
Thanks.

@Edwardmark (Author):

@szalpal Thanks.

@szalpal (Member) commented Jun 18, 2021:

@szalpal Do I have to install nvidia-dali-nightly?
https://github.com/triton-inference-server/dali_backend/blob/main/docker/Dockerfile.release#L65
Thanks.

Not necessarily. We recommend using the latest DALI release.

@Edwardmark (Author):

If I use dali 1.2, would the dali_backend support gpu input?

@szalpal (Member) commented Jun 18, 2021:

If I use dali 1.2, would the dali_backend support gpu input?

@Edwardmark, yes, although we don't guarantee backwards compatibility. Therefore, only the latest DALI version is properly tested and maintained.
