
not able to load the gptlm example given in readme #5

Closed
rkoystart opened this issue Sep 29, 2020 · 3 comments
@rkoystart

When I run the following command, I get the error below. Is it because of some mistake in the gpt.pb file?
sudo docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/media/zlabs-nlp/hdd1/ravi/ravi/lightseq/modelzoo:/models nvcr.io/nvidia/tensorrtserver:19.05-py3 trtserver --model-store=/models

===============================
== TensorRT Inference Server ==
===============================

NVIDIA Release 19.05 (build 6393584)

Copyright (c) 2018-2019, NVIDIA CORPORATION.  All rights reserved.
Copyright 2019 The TensorFlow Authors.  All rights reserved.
Copyright 2019 The TensorFlow Serving Authors.  All rights reserved.
Copyright (c) 2016-present, Facebook Inc. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for the inference server.  NVIDIA recommends the use of the following flags:
   nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

I0929 14:23:02.049226 1 main.cc:267] Starting endpoints, 'inference:0' listening on
I0929 14:23:02.049323 1 main.cc:271]  localhost:8001 for gRPC requests
I0929 14:23:02.049440 1 grpc_server.cc:265] Building nvrpc server
I0929 14:23:02.049452 1 grpc_server.cc:272] Register TensorRT GRPCService
I0929 14:23:02.049464 1 grpc_server.cc:275] Register Infer RPC
I0929 14:23:02.049470 1 grpc_server.cc:279] Register StreamInfer RPC
I0929 14:23:02.049474 1 grpc_server.cc:284] Register Status RPC
I0929 14:23:02.049480 1 grpc_server.cc:288] Register Profile RPC
I0929 14:23:02.049484 1 grpc_server.cc:292] Register Health RPC
I0929 14:23:02.049490 1 grpc_server.cc:304] Register Executor
I0929 14:23:02.054788 1 main.cc:282]  localhost:8000 for HTTP requests
I0929 14:23:02.096256 1 main.cc:294]  localhost:8002 for metric reporting
I0929 14:23:02.098009 1 metrics.cc:149] found 1 GPUs supporting NVML metrics
I0929 14:23:02.103680 1 metrics.cc:158]   GPU 0: GeForce GTX 1080 Ti
I0929 14:23:02.104140 1 server.cc:243] Initializing TensorRT Inference Server
I0929 14:23:02.109893 1 server_status.cc:106] New status tracking for model 'gptlm'
2020-09-29 14:23:02.110033: I external/tf_serving/tensorflow_serving/model_servers/server_core.cc:465] Adding/updating models.
2020-09-29 14:23:02.110065: I external/tf_serving/tensorflow_serving/model_servers/server_core.cc:562]  (Re-)adding model: gptlm
2020-09-29 14:23:02.210513: I external/tf_serving/tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: gptlm version: 1}
2020-09-29 14:23:02.210607: I external/tf_serving/tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: gptlm version: 1}
2020-09-29 14:23:02.210631: I external/tf_serving/tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: gptlm version: 1}
I0929 14:23:02.211445 1 custom_bundle.cc:164] Creating instance gptlm_0_0_gpu0 on GPU 0 (6.1) using libgptlm.so
2020-09-29 14:23:02.219823: I external/tf_serving/tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: gptlm version: 1}
Trtis instance init start
plz set environment variable MODEL_ZOO !
E0929 14:23:10.055644 1 dynamic_batch_scheduler.cc:162] Initialization failed for dynamic-batch scheduler thread 0: initialize error for 'gptlm': (18) load gpt weight in .pb failed

This is the example given in the README. Also, if possible, I would like the config.pbtxt and weight file for the transformer model that you support.
Thanks in advance!

@Taka152
Contributor

Taka152 commented Sep 30, 2020

plz set environment variable MODEL_ZOO !

It seems you forgot to set this environment variable, which is required by the server.

i want the config.pbtxt and weight file for the transformer model that is supported by you if possible

You can check the proto/transformer.proto file to generate your transformer model (.pb file); the config.pbtxt only differs in the names of the inputs and outputs. Also, we'll consider adding a transformer example after the National Day holiday.
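For reference, generating a .pb from the proto definition would follow the usual protobuf workflow, sketched below. The message name `Transformer` and the generated module name are assumptions for illustration only; check proto/transformer.proto for the actual top-level message and fields.

```shell
# Hedged sketch, not a verified recipe: compile the proto shipped in the
# repo, then fill the generated Python message and serialize it to a .pb.
protoc --proto_path=proto --python_out=. proto/transformer.proto
python -c "
import transformer_pb2                  # module generated by protoc above
model = transformer_pb2.Transformer()   # assumed top-level message name
# ... populate weights and hyperparameters per the proto fields ...
open('transformer.pb', 'wb').write(model.SerializeToString())
"
```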

@rkoystart
Author

rkoystart commented Sep 30, 2020

@Taka152 even after setting the environment variable, I am getting the same error:

(allennlp)nlp@nlp:/hdd1/ravi/ravi/lightseq$ export MODEL_ZOO="/hdd1/ravi/ravi/lightseq/modelzoo"
(allennlp)nlp@nlp:/hdd1/ravi/ravi/lightseq$ sudo docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/hdd1/ravi/ravi/lightseq/modelzoo:/models
nvcr.io/nvidia/tensorrtserver:19.05-py3 trtserver --model-store=/models

@Taka152
Contributor

Taka152 commented Oct 12, 2020

@rkoystart You can try `docker run -e MODEL_ZOO="/models"` to set the environment variable inside the container; exporting it in the host shell does not propagate into the container. This variable is used to locate the .pb file when the custom engine starts.
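Concretely, combining this with the command from the issue, the full invocation might look like the following (paths are taken from the reporter's setup; `-e MODEL_ZOO=/models` points at the mount path inside the container, not the host path):

```shell
# Run TensorRT Inference Server with MODEL_ZOO set inside the container;
# /models is the bind-mounted model repository.
sudo docker run --gpus=1 --rm \
  -p8000:8000 -p8001:8001 -p8002:8002 \
  -e MODEL_ZOO="/models" \
  -v/hdd1/ravi/ravi/lightseq/modelzoo:/models \
  nvcr.io/nvidia/tensorrtserver:19.05-py3 \
  trtserver --model-store=/models
```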
