This repository was archived by the owner on Oct 9, 2024. It is now read-only.
Issues: huggingface/transformers-bloom-inference
- #68 "bloom-ds-zero-inference.py" works but "inference_server.cli --deployment_framework ds_zero" fails (by richarddwang; closed Jun 17, 2024; updated Jun 17, 2024)
- #16 Running error: "Bus error: nonexistent physical address" (by Emerald01; closed Feb 18, 2023; updated Nov 24, 2023)
- #80 Cannot generate text correctly after loading an int8 model (by moonlightian; closed Jul 8, 2023; updated Jul 8, 2023)
- #51 How to run inference across multiple nodes and multiple GPUs? (by vicwer; closed Mar 14, 2023; updated Jun 19, 2023)
- #89 Bloom176B RuntimeError: expected scalar type Half but found BFloat16 (by wohenniubi; closed Jun 9, 2023; updated Jun 9, 2023)
- #92 ds_inference succeeds but OOM when using tp_presharded_mode=True (by LiuShixing; closed Jun 1, 2023; updated Jun 1, 2023)
- #86 Why is deepspeed.init_inference not used in the ZeRO benchmark? (by tingshua-yts; closed May 31, 2023; updated May 31, 2023)
- #93 Can FasterTransformer be combined to make it faster? (by xsj4cs; closed May 31, 2023; updated May 31, 2023)
- #79 Why does ds-inference int8 run slower than ds-inference fp16? (by DominickZhang; closed May 10, 2023; updated May 10, 2023)
- #87 Question regarding float16 and bfloat16 (by allanj; closed May 10, 2023; updated May 10, 2023)
- #74 Is there a way to initialize a random weight for (by PannenetsF; closed Apr 19, 2023; updated Apr 19, 2023)
- #59 Why is the throughput of DS-inference doubled when using 4 A100 GPUs compared to 8 A100 GPUs? (by DominickZhang; closed Apr 6, 2023; updated Apr 6, 2023)
- #61 Distributed training using the same loading method (by ananda1996ai; closed Mar 14, 2023; updated Mar 14, 2023)