16B 2gpu models on HF Hub are corrupt #82

Open
3 tasks
moyix opened this issue Oct 14, 2022 · 1 comment
@moyix
Collaborator

moyix commented Oct 14, 2022

Not sure how this happened, but currently the 16B 2gpu models fail in unzstd with Decoding error (36) : Corrupted block detected. I will re-convert and re-upload them. Steps to fix:

  • Double-check to see if any other models are affected.
  • Re-run the conversion script and recreate the .tar.zst files.
  • Add a SHASUMS file for each model and verify it in ./setup.sh so that corrupted downloads are caught in the future.
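
The checksum step in the last item could look roughly like the sketch below. It is only an illustration, not what ./setup.sh does: it assumes the SHASUMS file uses the common `sha256sum` layout (hex digest, whitespace, file name), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte archives need not fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_shasums(text: str) -> dict[str, str]:
    """Parse lines of the assumed form '<hex digest>  <file name>'."""
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        # sha256sum marks binary-mode entries with a leading '*'.
        expected[name.lstrip("*")] = digest.lower()
    return expected


def verify(shasums_file: Path, download_dir: Path) -> list[str]:
    """Return the names of files that are missing or whose digest differs."""
    expected = parse_shasums(shasums_file.read_text())
    bad = []
    for name, digest in expected.items():
        candidate = download_dir / name
        if not candidate.is_file() or sha256_of(candidate) != digest:
            bad.append(name)
    return bad
```

An empty list from `verify` means every listed archive matched, so a corrupted download would be flagged before decompression is attempted.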
moyix self-assigned this Oct 14, 2022
moyix added the bug (Something isn't working) label Oct 14, 2022
@bananasmoothii

bananasmoothii commented Oct 21, 2023

I'm not sure if this is the same issue, but I tried to use a 2gpu model and got this:

terminate called after throwing an instance of 'std::runtime_error'
  what():  [FT][ERROR] shared_ft_model->getTensorParaSize() * shared_ft_model->getPipelineParaSize() == world_size Assertion fail: /workspace/build/fastertransformer_backend/src/libfastertransformer.cc:498
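
For anyone reading along, the failed assertion compares the product of the two parallelism settings with world_size. The sketch below reproduces that arithmetic on the "parameters" section of the model configuration printed in the logs. It rests on an assumption: that world_size corresponds to the number of GPU ranks the backend actually sees. The function name is hypothetical and the world_size values are illustrative, not taken from this report.

```python
def layout_matches(parameters: dict, world_size: int) -> bool:
    """Mirror the asserted condition: tensor_para_size * pipeline_para_size == world_size."""
    tensor_para = int(parameters["tensor_para_size"]["string_value"])
    pipeline_para = int(parameters["pipeline_para_size"]["string_value"])
    return tensor_para * pipeline_para == world_size


# Values as they appear in the logged model configuration.
params = {
    "tensor_para_size": {"string_value": "2"},
    "pipeline_para_size": {"string_value": "1"},
}

# Illustrative world sizes only (assumption: one rank per visible GPU).
for world_size in (1, 2):
    status = "ok" if layout_matches(params, world_size) else "mismatch"
    print(f"world_size={world_size}: {status}")
```

Under that assumption, a configuration converted for 2 GPUs would only pass the check when two GPUs are exposed to the Triton container.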
Complete logs
[+] Building 2.2s (17/17) FINISHED                                                                                                                                                                                    docker:default
 => [copilot_proxy internal] load .dockerignore                                                                                                                                                                                 0.0s
 => => transferring context: 2B                                                                                                                                                                                                 0.0s
 => [copilot_proxy internal] load build definition from proxy.Dockerfile                                                                                                                                                        0.1s
 => => transferring dockerfile: 307B                                                                                                                                                                                            0.0s
 => [triton internal] load .dockerignore                                                                                                                                                                                        0.1s
 => => transferring context: 2B                                                                                                                                                                                                 0.0s
 => [triton internal] load build definition from triton.Dockerfile                                                                                                                                                              0.1s
 => => transferring dockerfile: 325B                                                                                                                                                                                            0.0s
 => [copilot_proxy internal] load metadata for docker.io/library/python:3.10-slim-buster                                                                                                                                        1.7s
 => [triton internal] load metadata for docker.io/moyix/triton_with_ft:22.09                                                                                                                                                    0.8s
 => [triton 1/3] FROM docker.io/moyix/triton_with_ft:22.09@sha256:5a15c1f29c6b018967b49c588eb0ea67acbf897abb7f26e509ec21844574c9b1                                                                                              0.0s
 => CACHED [triton 2/3] RUN python3 -m pip install --disable-pip-version-check -U torch --extra-index-url https://download.pytorch.org/whl/cu116                                                                                0.0s
 => CACHED [triton 3/3] RUN python3 -m pip install --disable-pip-version-check -U transformers bitsandbytes accelerate                                                                                                          0.0s
 => [triton] exporting to image                                                                                                                                                                                                 0.0s
 => => exporting layers                                                                                                                                                                                                         0.0s
 => => writing image sha256:79dd3771c789003418dd215e18f816ca7e796d4d77a4de792907f7d8aa8a5bee                                                                                                                                    0.0s
 => => naming to docker.io/library/fauxpilot-triton                                                                                                                                                                             0.0s
 => [copilot_proxy 1/5] FROM docker.io/library/python:3.10-slim-buster@sha256:37aa274c2d001f09b14828450d903c55f821c90f225fdfdd80c5180fcca77b3f                                                                                  0.0s
 => [copilot_proxy internal] load build context                                                                                                                                                                                 0.3s
 => => transferring context: 1.10kB                                                                                                                                                                                             0.3s
 => CACHED [copilot_proxy 2/5] WORKDIR /python-docker                                                                                                                                                                           0.0s
 => CACHED [copilot_proxy 3/5] COPY copilot_proxy/requirements.txt requirements.txt                                                                                                                                             0.0s
 => CACHED [copilot_proxy 4/5] RUN pip3 install --no-cache-dir -r requirements.txt                                                                                                                                              0.0s
 => CACHED [copilot_proxy 5/5] COPY copilot_proxy .                                                                                                                                                                             0.0s
 => [copilot_proxy] exporting to image                                                                                                                                                                                          0.0s
 => => exporting layers                                                                                                                                                                                                         0.0s
 => => writing image sha256:6aaa5d89d067dcc60e23eed04bb393abeb1d1e62ff46fd6031ee15d63a480801                                                                                                                                    0.0s
 => => naming to docker.io/library/fauxpilot-copilot_proxy                                                                                                                                                                      0.0s
[+] Running 2/0
 ✔ Container fauxpilot-copilot_proxy-1  Created                                                                                                                                                                                 0.0s
 ✔ Container fauxpilot-triton-1         Created                                                                                                                                                                                 0.0s
Attaching to fauxpilot-copilot_proxy-1, fauxpilot-triton-1
fauxpilot-triton-1         |
fauxpilot-triton-1         | =============================
fauxpilot-triton-1         | == Triton Inference Server ==
fauxpilot-triton-1         | =============================
fauxpilot-triton-1         |
fauxpilot-triton-1         | NVIDIA Release 22.06 (build 39726160)
fauxpilot-triton-1         | Triton Server Version 2.23.0
fauxpilot-triton-1         |
fauxpilot-triton-1         | Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
fauxpilot-triton-1         |
fauxpilot-triton-1         | Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
fauxpilot-triton-1         |
fauxpilot-triton-1         | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
fauxpilot-triton-1         | By pulling and using the container, you accept the terms and conditions of this license:
fauxpilot-triton-1         | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
fauxpilot-copilot_proxy-1  | INFO:     Started server process [1]
fauxpilot-copilot_proxy-1  | INFO:     Waiting for application startup.
fauxpilot-copilot_proxy-1  | INFO:     Application startup complete.
fauxpilot-copilot_proxy-1  | INFO:     Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit)
fauxpilot-triton-1         |
fauxpilot-triton-1         | I1021 20:33:12.659520 88 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x204e00000' with size 268435456
fauxpilot-triton-1         | I1021 20:33:12.659623 88 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
fauxpilot-triton-1         | I1021 20:33:17.888475 88 model_repository_manager.cc:1191] loading: fastertransformer:1
fauxpilot-triton-1         | I1021 20:33:18.058662 88 libfastertransformer.cc:1226] TRITONBACKEND_Initialize: fastertransformer
fauxpilot-triton-1         | I1021 20:33:18.058688 88 libfastertransformer.cc:1236] Triton TRITONBACKEND API version: 1.10
fauxpilot-triton-1         | I1021 20:33:18.058691 88 libfastertransformer.cc:1242] 'fastertransformer' TRITONBACKEND API version: 1.10
fauxpilot-triton-1         | I1021 20:33:18.058712 88 libfastertransformer.cc:1274] TRITONBACKEND_ModelInitialize: fastertransformer (version 1)
fauxpilot-triton-1         | W1021 20:33:18.059506 88 libfastertransformer.cc:149] model configuration:
fauxpilot-triton-1         | {
fauxpilot-triton-1         |     "name": "fastertransformer",
fauxpilot-triton-1         |     "platform": "",
fauxpilot-triton-1         |     "backend": "fastertransformer",
fauxpilot-triton-1         |     "version_policy": {
fauxpilot-triton-1         |         "latest": {
fauxpilot-triton-1         |             "num_versions": 1
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     },
fauxpilot-triton-1         |     "max_batch_size": 1024,
fauxpilot-triton-1         |     "input": [
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "input_ids",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "start_id",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "end_id",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "input_lengths",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "request_output_len",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "runtime_top_k",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "runtime_top_p",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "beam_search_diversity_rate",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "temperature",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "len_penalty",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "repetition_penalty",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "random_seed",
fauxpilot-triton-1         |             "data_type": "TYPE_INT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "is_return_log_probs",
fauxpilot-triton-1         |             "data_type": "TYPE_BOOL",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "beam_width",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "reshape": {
fauxpilot-triton-1         |                 "shape": []
fauxpilot-triton-1         |             },
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "bad_words_list",
fauxpilot-triton-1         |             "data_type": "TYPE_INT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 2,
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "stop_words_list",
fauxpilot-triton-1         |             "data_type": "TYPE_INT32",
fauxpilot-triton-1         |             "format": "FORMAT_NONE",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 2,
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "is_shape_tensor": false,
fauxpilot-triton-1         |             "allow_ragged_batch": false,
fauxpilot-triton-1         |             "optional": true
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     ],
fauxpilot-triton-1         |     "output": [
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "output_ids",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1,
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "label_filename": "",
fauxpilot-triton-1         |             "is_shape_tensor": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "sequence_length",
fauxpilot-triton-1         |             "data_type": "TYPE_UINT32",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "label_filename": "",
fauxpilot-triton-1         |             "is_shape_tensor": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "cum_log_probs",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "label_filename": "",
fauxpilot-triton-1         |             "is_shape_tensor": false
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "output_log_probs",
fauxpilot-triton-1         |             "data_type": "TYPE_FP32",
fauxpilot-triton-1         |             "dims": [
fauxpilot-triton-1         |                 -1,
fauxpilot-triton-1         |                 -1
fauxpilot-triton-1         |             ],
fauxpilot-triton-1         |             "label_filename": "",
fauxpilot-triton-1         |             "is_shape_tensor": false
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     ],
fauxpilot-triton-1         |     "batch_input": [],
fauxpilot-triton-1         |     "batch_output": [],
fauxpilot-triton-1         |     "optimization": {
fauxpilot-triton-1         |         "priority": "PRIORITY_DEFAULT",
fauxpilot-triton-1         |         "input_pinned_memory": {
fauxpilot-triton-1         |             "enable": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "output_pinned_memory": {
fauxpilot-triton-1         |             "enable": true
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "gather_kernel_buffer_threshold": 0,
fauxpilot-triton-1         |         "eager_batching": false
fauxpilot-triton-1         |     },
fauxpilot-triton-1         |     "instance_group": [
fauxpilot-triton-1         |         {
fauxpilot-triton-1         |             "name": "fastertransformer_0",
fauxpilot-triton-1         |             "kind": "KIND_CPU",
fauxpilot-triton-1         |             "count": 1,
fauxpilot-triton-1         |             "gpus": [],
fauxpilot-triton-1         |             "secondary_devices": [],
fauxpilot-triton-1         |             "profile": [],
fauxpilot-triton-1         |             "passive": false,
fauxpilot-triton-1         |             "host_policy": ""
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     ],
fauxpilot-triton-1         |     "default_model_filename": "codegen-6B-mono",
fauxpilot-triton-1         |     "cc_model_filenames": {},
fauxpilot-triton-1         |     "metric_tags": {},
fauxpilot-triton-1         |     "parameters": {
fauxpilot-triton-1         |         "model_name": {
fauxpilot-triton-1         |             "string_value": "codegen-6B-mono"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "is_half": {
fauxpilot-triton-1         |             "string_value": "1"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "enable_custom_all_reduce": {
fauxpilot-triton-1         |             "string_value": "0"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "vocab_size": {
fauxpilot-triton-1         |             "string_value": "51200"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "tensor_para_size": {
fauxpilot-triton-1         |             "string_value": "2"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "decoder_layers": {
fauxpilot-triton-1         |             "string_value": "33"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "size_per_head": {
fauxpilot-triton-1         |             "string_value": "256"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "max_seq_len": {
fauxpilot-triton-1         |             "string_value": "2048"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "end_id": {
fauxpilot-triton-1         |             "string_value": "50256"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "inter_size": {
fauxpilot-triton-1         |             "string_value": "16384"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "head_num": {
fauxpilot-triton-1         |             "string_value": "16"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "model_type": {
fauxpilot-triton-1         |             "string_value": "GPT-J"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "model_checkpoint_path": {
fauxpilot-triton-1         |             "string_value": "/model/fastertransformer/1/2-gpu"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "rotary_embedding": {
fauxpilot-triton-1         |             "string_value": "64"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "pipeline_para_size": {
fauxpilot-triton-1         |             "string_value": "1"
fauxpilot-triton-1         |         },
fauxpilot-triton-1         |         "start_id": {
fauxpilot-triton-1         |             "string_value": "50256"
fauxpilot-triton-1         |         }
fauxpilot-triton-1         |     },
fauxpilot-triton-1         |     "model_warmup": []
fauxpilot-triton-1         | }
fauxpilot-triton-1         | I1021 20:33:18.059575 88 libfastertransformer.cc:1320] TRITONBACKEND_ModelInstanceInitialize: fastertransformer_0 (device 0)
fauxpilot-triton-1         | W1021 20:33:18.059594 88 libfastertransformer.cc:453] Faster transformer model instance is created at GPU '0'
fauxpilot-triton-1         | W1021 20:33:18.059596 88 libfastertransformer.cc:459] Model name codegen-6B-mono
fauxpilot-triton-1         | W1021 20:33:18.059601 88 libfastertransformer.cc:578] Get input name: input_ids, type: TYPE_UINT32, shape: [-1]
fauxpilot-triton-1         | W1021 20:33:18.059603 88 libfastertransformer.cc:578] Get input name: start_id, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059605 88 libfastertransformer.cc:578] Get input name: end_id, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059606 88 libfastertransformer.cc:578] Get input name: input_lengths, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059608 88 libfastertransformer.cc:578] Get input name: request_output_len, type: TYPE_UINT32, shape: [-1]
fauxpilot-triton-1         | W1021 20:33:18.059609 88 libfastertransformer.cc:578] Get input name: runtime_top_k, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059611 88 libfastertransformer.cc:578] Get input name: runtime_top_p, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059612 88 libfastertransformer.cc:578] Get input name: beam_search_diversity_rate, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059614 88 libfastertransformer.cc:578] Get input name: temperature, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059615 88 libfastertransformer.cc:578] Get input name: len_penalty, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059616 88 libfastertransformer.cc:578] Get input name: repetition_penalty, type: TYPE_FP32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059618 88 libfastertransformer.cc:578] Get input name: random_seed, type: TYPE_INT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059619 88 libfastertransformer.cc:578] Get input name: is_return_log_probs, type: TYPE_BOOL, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059621 88 libfastertransformer.cc:578] Get input name: beam_width, type: TYPE_UINT32, shape: [1]
fauxpilot-triton-1         | W1021 20:33:18.059623 88 libfastertransformer.cc:578] Get input name: bad_words_list, type: TYPE_INT32, shape: [2, -1]
fauxpilot-triton-1         | W1021 20:33:18.059625 88 libfastertransformer.cc:578] Get input name: stop_words_list, type: TYPE_INT32, shape: [2, -1]
fauxpilot-triton-1         | W1021 20:33:18.059628 88 libfastertransformer.cc:620] Get output name: output_ids, type: TYPE_UINT32, shape: [-1, -1]
fauxpilot-triton-1         | W1021 20:33:18.059630 88 libfastertransformer.cc:620] Get output name: sequence_length, type: TYPE_UINT32, shape: [-1]
fauxpilot-triton-1         | W1021 20:33:18.059632 88 libfastertransformer.cc:620] Get output name: cum_log_probs, type: TYPE_FP32, shape: [-1]
fauxpilot-triton-1         | W1021 20:33:18.059634 88 libfastertransformer.cc:620] Get output name: output_log_probs, type: TYPE_FP32, shape: [-1, -1]
fauxpilot-triton-1         | terminate called after throwing an instance of 'std::runtime_error'
fauxpilot-triton-1         |   what():  [FT][ERROR] shared_ft_model->getTensorParaSize() * shared_ft_model->getPipelineParaSize() == world_size Assertion fail: /workspace/build/fastertransformer_backend/src/libfastertransformer.cc:498
fauxpilot-triton-1         |
fauxpilot-triton-1         | [7704449ec6f1:00088] *** Process received signal ***
fauxpilot-triton-1         | [7704449ec6f1:00088] Signal: Aborted (6)
fauxpilot-triton-1         | [7704449ec6f1:00088] Signal code:  (-6)
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 0] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7f3aa66b6420]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f3aa50aa00b]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f3aa5089859]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911)[0x7f3aa5463911]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c)[0x7f3aa546f38c]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7)[0x7f3aa546f3f7]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa6a9)[0x7f3aa546f6a9]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 7] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x2a9a0)[0x7f3a930639a0]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 8] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x1e79f)[0x7f3a9305779f]
fauxpilot-triton-1         | [7704449ec6f1:00088] [ 9] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x1fd42)[0x7f3a93058d42]
fauxpilot-triton-1         | [7704449ec6f1:00088] [10] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(TRITONBACKEND_ModelInstanceInitialize+0x38c)[0x7f3a9305b63c]
fauxpilot-triton-1         | [7704449ec6f1:00088] [11] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x10c275)[0x7f3aa5958275]
fauxpilot-triton-1         | [7704449ec6f1:00088] [12] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x10d9c3)[0x7f3aa59599c3]
fauxpilot-triton-1         | [7704449ec6f1:00088] [13] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1019de)[0x7f3aa594d9de]
fauxpilot-triton-1         | [7704449ec6f1:00088] [14] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1b3b7a)[0x7f3aa59ffb7a]
fauxpilot-triton-1         | [7704449ec6f1:00088] [15] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1c29a1)[0x7f3aa5a0e9a1]
fauxpilot-triton-1         | [7704449ec6f1:00088] [16] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xd6de4)[0x7f3aa549bde4]
fauxpilot-triton-1         | [7704449ec6f1:00088] [17] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7f3aa66aa609]
fauxpilot-triton-1         | [7704449ec6f1:00088] [18] /usr/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f3aa5186133]
fauxpilot-triton-1         | [7704449ec6f1:00088] *** End of error message ***
fauxpilot-triton-1         | --------------------------------------------------------------------------
fauxpilot-triton-1         | Primary job  terminated normally, but 1 process returned
fauxpilot-triton-1         | a non-zero exit code. Per user-direction, the job has been aborted.
fauxpilot-triton-1         | --------------------------------------------------------------------------
fauxpilot-triton-1         | --------------------------------------------------------------------------
fauxpilot-triton-1         | mpirun noticed that process rank 0 with PID 0 on node 7704449ec6f1 exited on signal 6 (Aborted).
fauxpilot-triton-1         | --------------------------------------------------------------------------
fauxpilot-triton-1 exited with code 134

Edit: my first GPU is Intel Xeon integrated graphics, so it might not be a usable GPU for fauxpilot since the error
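
For reference, the assertion in the log above requires `getTensorParaSize() * getPipelineParaSize() == world_size`. Below is a minimal Python sketch of that condition, not the backend's actual code. The `tensor_para_size` value of 2 is an assumption (it is not visible in this part of the log and is only guessed from the `2-gpu` checkpoint path), the helper name `check_parallel_layout` is hypothetical, and treating `world_size` as the number of usable GPUs is also an assumption.

```python
def check_parallel_layout(tensor_para_size: int,
                          pipeline_para_size: int,
                          world_size: int) -> bool:
    """Return True when the condition from the failed assertion holds."""
    return tensor_para_size * pipeline_para_size == world_size


# Parameters appear in the model config as strings ("string_value": "1"),
# so they are converted to int before comparing.
params = {
    "tensor_para_size": {"string_value": "2"},    # ASSUMED, not shown in the log
    "pipeline_para_size": {"string_value": "1"},  # taken from the log above
}
tp = int(params["tensor_para_size"]["string_value"])
pp = int(params["pipeline_para_size"]["string_value"])

# Compare against a few hypothetical counts of usable GPUs.
for usable_gpus in (1, 2):
    ok = check_parallel_layout(tp, pp, usable_gpus)
    status = "ok" if ok else "assertion would fail"
    print(f"usable GPUs = {usable_gpus}: {status}")
```

If only one of the two devices is actually usable, a 2-GPU layout would fail this check under these assumptions. That would be consistent with the crash being a configuration mismatch and not a corrupt download.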
