Does the opt_13b model support tensor parallelism via inferflow? #49

LHQUer · 2024-03-18T02:06:23Z

The settings are as follows:

devices = 0&1&2&3;4&5&6&7
decoder_cpu_layer_count = 0
cpu_threads = 8

max_concurrent_queries = 6

return_output_tensors = true

;debug options
is_study_mode = false
show_tensors = false

When I run opt_13b with inferflow, I get the following error:

Configuration = release; Platform = x64
========== ========== ========== ========== ========== ==========
Loading model specifications...
Loading model opt_13b...
vocab_size: 50272, embd_dims: 5120, decoder layers: 40, decoder heads: 40, decoder kv heads: 40
qkv_format 1 is not compatible with tensor parallelism
Failed to load the model
Failed to initialize the inference engine
Memory usage (MB): 203.21, 203.21 (Peak)
Press the enter key to quit...

shumingshi · 2024-03-19T10:53:29Z

Thank you for raising this issue. Tensor parallelism is not supported so far for this model. We will try to fix this issue. As a mitigation, you can either apply pipeline parallelism or serve the model on one GPU device (together with quantization if the VRAM of each device is less than 32GB).
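For the single-GPU option, a back-of-the-envelope estimate of the weight footprint shows roughly where the 32GB figure comes from. The sketch below is only an approximation built from the shapes printed in the load log (vocab_size, embd_dims, decoder layers). It assumes a standard decoder block with a 4x feed-forward width and ignores biases, layer norms, position embeddings, the KV cache, and activations. The function names are hypothetical helpers, not part of inferflow.

```python
# Rough weight-memory estimate for opt_13b, using shapes from the load log.
# Assumption: a standard decoder block with a 4x feed-forward width, i.e.
# about 12 * dims^2 weights per layer. Biases, layer norms and position
# embeddings are ignored, so the result is an approximation only.

VOCAB_SIZE = 50272
EMBD_DIMS = 5120
DECODER_LAYERS = 40


def estimate_params(vocab_size: int, dims: int, layers: int) -> int:
    """Approximate parameter count of a decoder-only transformer."""
    attention = 4 * dims * dims      # Q, K, V and output projections
    feed_forward = 8 * dims * dims   # two matrices of shape dims x (4 * dims)
    embeddings = vocab_size * dims   # token embedding table
    return layers * (attention + feed_forward) + embeddings


def weight_gib(params: int, bits_per_weight: int) -> float:
    """Size of the weights alone, in GiB, at a given precision."""
    return params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    params = estimate_params(VOCAB_SIZE, EMBD_DIMS, DECODER_LAYERS)
    print(f"approx. parameters: {params / 1e9:.2f}B")
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_gib(params, bits):.1f} GiB")
```

Under these assumptions the 16-bit weights alone come to a little under 24 GiB, which leaves limited headroom on a 32GB device once the KV cache and activations are added. 8-bit or 4-bit quantization cuts the weight footprint to roughly a half or a quarter of that.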

LHQUer · 2024-03-19T11:10:49Z

> Thank you for raising this issue. Tensor parallelism is not supported so far for this model. We will try to fix this issue. As a mitigation, you can either apply pipeline parallelism or serve the model on one GPU device (together with quantization if the VRAM of each device is less than 32GB).

Thanks for your general answer!
