This repository was archived by the owner on Oct 9, 2024. It is now read-only.

Issues: huggingface/transformers-bloom-inference

Issues list

running error with "Bus error: nonexistent physical address"
#16 by Emerald01 was closed Feb 18, 2023 updated Nov 24, 2023
does this work for llama 65B
#98 by GradientGuru was closed Jul 31, 2023 updated Jul 31, 2023
Can not generate text correctly after loading an int8 model
#80 by moonlightian was closed Jul 8, 2023 updated Jul 8, 2023
how to inference by multi-node and multi-card?
#51 by vicwer was closed Mar 14, 2023 updated Jun 19, 2023
ds_inference success but OOM when use tp_presharded_mode=True
#92 by LiuShixing was closed Jun 1, 2023 updated Jun 1, 2023
OOM of CUDA when using one GPU
#60 by xiongjun19 was closed May 31, 2023 updated May 31, 2023
why no use deepspeed.init_inference in zero benchmark
#86 by tingshua-yts was closed May 31, 2023 updated May 31, 2023
Can I combine fastertransformer to make it faster
#93 by xsj4cs was closed May 31, 2023 updated May 31, 2023
accelerate in bloom-inference-scripts?
#91 by jeromeku was closed May 23, 2023 updated May 23, 2023
The details of hf-accelerate pp.
#83 by tohneecao was closed May 20, 2023 updated May 20, 2023
Unable to reload a quantized model
#85 by moonlightian was closed May 10, 2023 updated May 10, 2023
How to parse the garbled text in tokenizer.json
#78 by hongshengxin was closed May 10, 2023 updated May 10, 2023
Why does ds-inference int8 run slower than ds-inference fp16?
#79 by DominickZhang was closed May 10, 2023 updated May 10, 2023
BUILD ERROR with nvcc
#81 by tohneecao was closed May 10, 2023 updated May 10, 2023
question regarding the float16 and bfloat
#87 by allanj was closed May 10, 2023 updated May 10, 2023
beam search
#73 by syp1997 was closed Apr 19, 2023 updated Apr 19, 2023
Is there a way to initialize a random weight for
#74 by PannenetsF was closed Apr 19, 2023 updated Apr 19, 2023
concurrent requests
#75 by ustclan was closed Apr 7, 2023 updated Apr 7, 2023
how to run the server?
#64 by raihan0824 was closed Mar 29, 2023 updated Mar 29, 2023
Short response for bloom inferring
#70 by raihan0824 was closed Mar 29, 2023 updated Mar 29, 2023
Should I use bf16 or fp16?
#69 by richarddwang was closed Mar 23, 2023 updated Mar 24, 2023
Distributed Training using the same loading method
#61 by ananda1996ai was closed Mar 14, 2023 updated Mar 14, 2023