When I run opt_13b with Inferflow, I get the following error:
Configuration = release; Platform = x64
========== ========== ========== ========== ========== ==========
Loading model specifications...
Loading model opt_13b...
vocab_size: 50272, embd_dims: 5120, decoder layers: 40, decoder heads: 40, decoder kv heads: 40
qkv_format 1 is not compatible with tensor parallelism
Failed to load the model
Failed to initialize the inference engine
Memory usage (MB): 203.21, 203.21 (Peak)
Press the enter key to quit...
Thank you for raising this issue. Tensor parallelism is not yet supported for this model. We will try to fix this. As a workaround, you can either use pipeline parallelism or serve the model on a single GPU (with quantization if the device has less than 32GB of VRAM).
The settings are as follows:
devices = 0&1&2&3;4&5&6&7
decoder_cpu_layer_count = 0
cpu_threads = 8
max_concurrent_queries = 6
return_output_tensors = true
;debug options
is_study_mode = false
show_tensors = false
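To see why this `devices` value might trigger the tensor-parallelism check, here is a rough Python sketch that splits the string into groups. It is only an illustration, not Inferflow code. It assumes that `;` separates device groups and `&` joins the devices inside one group, and that a group with more than one device is what requests tensor parallelism; please verify both against the Inferflow documentation. The helper names (`parse_devices`, `has_multi_device_group`, `one_device_per_group`) are made up for this example.

```python
from typing import List


def parse_devices(spec: str) -> List[List[int]]:
    """Split a devices string such as '0&1&2&3;4&5&6&7' into groups of ids.

    ASSUMPTION (not confirmed by the thread): ';' separates groups and
    '&' joins the devices that belong to the same group.
    """
    groups = []
    for part in spec.split(";"):
        part = part.strip()
        if part:
            groups.append([int(dev) for dev in part.split("&")])
    return groups


def has_multi_device_group(spec: str) -> bool:
    """True if any group holds more than one device.

    ASSUMPTION: a multi-device group is what asks for tensor parallelism.
    """
    return any(len(group) > 1 for group in parse_devices(spec))


def one_device_per_group(spec: str) -> str:
    """Rewrite the spec so that every device forms its own group."""
    return ";".join(str(dev) for group in parse_devices(spec) for dev in group)


if __name__ == "__main__":
    spec = "0&1&2&3;4&5&6&7"
    print(parse_devices(spec))
    print(has_multi_device_group(spec))
    print(one_device_per_group(spec))
```

Under those assumptions, the setting above describes two groups of four devices each, so it would ask for tensor parallelism inside each group. A value with one device per group, or a single device such as `devices = 0`, would be the candidates to try for the pipeline-parallel or single-GPU workaround.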