
TGI does not seem to support other position encodings, like baichuan2-7B (ALiBi position). #1786

Closed · 2 of 4 tasks
Night-Quiet opened this issue Apr 22, 2024 · 1 comment

Comments


Night-Quiet commented Apr 22, 2024

System Info

2024-04-22T02:18:37.204227Z  INFO text_generation_launcher: Runtime environment:
Target: x86_64-unknown-linux-gnu
Cargo version: 1.75.0
Commit sha: 2d0a7173d4891e7cd5f9b77f8e0987b82a339e51
Docker label: N/A
nvidia-smi:
Mon Apr 22 10:18:37 2024       
   +---------------------------------------------------------------------------------------+
   | NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
   |-----------------------------------------+----------------------+----------------------+
   | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
   | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
   |                                         |                      |               MIG M. |
   |=========================================+======================+======================|
   |   0  NVIDIA GeForce RTX 4090        On  | 00000000:1C:00.0 Off |                  Off |
   | 30%   29C    P8              30W / 450W |      2MiB / 24564MiB |      0%      Default |
   |                                         |                      |                  N/A |
   +-----------------------------------------+----------------------+----------------------+
                                                                                            
   +---------------------------------------------------------------------------------------+
   | Processes:                                                                            |
   |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
   |        ID   ID                                                             Usage      |
   |=======================================================================================|
   |  No running processes found                                                           |
   +---------------------------------------------------------------------------------------+

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

text-generation-launcher --model-id /root/autodl-tmp/baichuan --trust-remote-code

2024-04-22T02:11:00.927702Z  INFO download: text_generation_launcher: Successfully downloaded weights.
2024-04-22T02:11:00.928097Z  INFO shard-manager: text_generation_launcher: Starting shard rank=0
2024-04-22T02:11:03.592505Z ERROR text_generation_launcher: exllamav2_kernels not installed.

2024-04-22T02:11:03.623388Z  WARN text_generation_launcher: Could not import Flash Attention enabled models: cannot import name 'FastLayerNorm' from 'text_generation_server.utils.layers' (/root/text-generation-inference/server/text_generation_server/utils/layers.py)

2024-04-22T02:11:03.623918Z  WARN text_generation_launcher: Could not import Mamba: No module named 'mamba_ssm'

2024-04-22T02:11:04.132360Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:

Traceback (most recent call last):

  File "/root/miniconda3/envs/tgi/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
             ^^^^^

  File "/root/text-generation-inference/server/text_generation_server/cli.py", line 71, in serve
    from text_generation_server import server

  File "/root/text-generation-inference/server/text_generation_server/server.py", line 16, in <module>
    from text_generation_server.models.vlm_causal_lm import VlmCausalLMBatch

  File "/root/text-generation-inference/server/text_generation_server/models/vlm_causal_lm.py", line 14, in <module>
    from text_generation_server.models.flash_mistral import (

  File "/root/text-generation-inference/server/text_generation_server/models/flash_mistral.py", line 18, in <module>
    from text_generation_server.models.custom_modeling.flash_mistral_modeling import (

  File "/root/text-generation-inference/server/text_generation_server/models/custom_modeling/flash_mistral_modeling.py", line 30, in <module>
    from text_generation_server.utils.layers import (

ImportError: cannot import name 'PositionRotaryEmbedding' from 'text_generation_server.utils.layers' (/root/text-generation-inference/server/text_generation_server/utils/layers.py)
 rank=0
2024-04-22T02:11:04.230211Z ERROR text_generation_launcher: Shard 0 failed to start
2024-04-22T02:11:04.230247Z  INFO text_generation_launcher: Shutting down shards
Error: ShardCannotStart
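
For what it's worth, the warnings above show that `FastLayerNorm` also failed to import from the same module, so the missing `PositionRotaryEmbedding` likely shares the same root cause (the flash-attention-dependent layers not being available in this build). A small diagnostic sketch, assuming the same Python environment as the traceback, to confirm which symbols resolve:

```python
import importlib

# Diagnostic sketch (not an official TGI tool): run in the same Python
# environment as text-generation-server to check which of the layer
# symbols named in the traceback actually resolve.
layers = importlib.import_module("text_generation_server.utils.layers")
for name in ("FastLayerNorm", "PositionRotaryEmbedding"):
    status = "OK" if hasattr(layers, name) else "missing"
    print(f"{name}: {status}")
```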

Expected behavior

Please let me know whether baichuan-inc/Baichuan2-7B-Chat is supported, or whether this error was caused by a mistake in my setup. Thank you.
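
For background, Baichuan2-7B uses ALiBi instead of rotary position embeddings: a per-head linear bias is added to the pre-softmax attention scores, and the query/key vectors are not rotated at all. A minimal sketch of that bias, following the ALiBi paper (illustrative only; not TGI's or Baichuan's implementation):

```python
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    # Per-head geometric slopes; this closed form assumes num_heads is a
    # power of two, as in the original ALiBi paper.
    start = 2.0 ** (-8.0 / num_heads)
    return torch.tensor([start ** (i + 1) for i in range(num_heads)])

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # Bias added to attention scores: each head penalizes distant key
    # positions linearly. Entries above the diagonal are positive but are
    # removed by the causal mask in practice.
    slopes = alibi_slopes(num_heads)                      # (H,)
    pos = torch.arange(seq_len)
    distance = pos[None, :] - pos[:, None]                # j - i, (S, S)
    return slopes[:, None, None] * distance[None, :, :]   # (H, S, S)
```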

Night-Quiet (Author)

I fixed this by referring to the following link:
#1778
