Hi,
When I load the model onto 4 GPUs with model parallelism:
transformers.pipeline(model='fixie-ai/ultravox-v0_4_1-llama-3_1-70b', trust_remote_code=True, device_map='auto')
it fails with the following error:
ValueError: weight is on the meta device, we need a `value` to put in on 0.
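For context, this error usually means accelerate's `device_map='auto'` dispatch could not place every weight, so some remained on the meta device. A minimal sketch of one commonly suggested workaround is below: capping per-device memory with `max_memory` and allowing CPU offload so the dispatcher has room to place everything. Note this is an assumption about the cause, not a confirmed fix; the 4-GPU layout and the specific memory budgets ("35GiB" per GPU, "120GiB" CPU) are illustrative values, not numbers from this report.

```python
# Hedged workaround sketch: give accelerate an explicit per-device memory
# budget plus a CPU offload budget, so "auto" dispatch can place all weights.
# The GPU count and the GiB values below are assumptions, not measured limits.

# One "max_memory" entry per visible GPU, plus a CPU budget for offload.
max_memory = {i: "35GiB" for i in range(4)}
max_memory["cpu"] = "120GiB"

def build_pipeline():
    # Requires `transformers` and `accelerate` installed; imported lazily so
    # the config above can be inspected without the heavy dependencies.
    import transformers

    return transformers.pipeline(
        model="fixie-ai/ultravox-v0_4_1-llama-3_1-70b",
        trust_remote_code=True,
        device_map="auto",
        # model_kwargs are forwarded to from_pretrained, which accepts
        # max_memory when device_map is set.
        model_kwargs={"max_memory": max_memory},
    )
```

If the error persists even with offload headroom, it may also be worth checking that the `transformers` and `accelerate` versions are recent, since meta-device dispatch handling has changed across releases.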