This repository was archived by the owner on Oct 9, 2024. It is now read-only.
Issues: huggingface/transformers-bloom-inference
#39: Sharding a model checkpoint for deepspeed usage (opened Dec 5, 2022 by CoderPat; updated Dec 5, 2022)
#50: NotImplementedError: Cannot copy out of meta tensor; no data! (opened Feb 10, 2023 by zdaoguang; updated Mar 10, 2023)
#62: RuntimeError: This event loop is already running (opened Mar 9, 2023 by syp1997; updated Mar 14, 2023)
#65: The generated results are different when using greedy search during generation (opened Mar 14, 2023 by FrostML; updated Mar 22, 2023)
#76: cuBLAS error with NVIDIA H100 HGX, CUDA v12.1, and cuDNN 8.8.1 (opened Apr 14, 2023 by BenFauber; updated Apr 14, 2023)
#82: root_dir in TemporaryCheckpointsJSON is redundant (opened Apr 24, 2023 by dc3671; updated Apr 24, 2023)
#84: Big batchsize cause OOM in bloom-ds-inference.py, how to adjust max_split_size_mb value (opened Apr 27, 2023 by tohneecao; updated May 10, 2023)
#94: Are there fine-tuning and inference scripts available for int4 quantization in bloom-7b? Is it possible to limit the GPU memory usage to within 10GB? (opened May 31, 2023 by dizhenx; updated May 31, 2023)
#95: AttributeError: 'BloomForCausalLM' object has no attribute 'module' (opened Jun 1, 2023 by detectiveJoshua; updated Jun 1, 2023)
#96: The Makefile execution was successful, but there is no response when entering text. (opened Jun 2, 2023 by dizhenx; updated Jun 2, 2023)
#97: When deploying the Bloom model, I noticed that the POST method is used for the generation task. Is it possible to modify it to perform question-answering instead? (opened Jun 5, 2023 by dizhenx; updated Jun 5, 2023)
#99: How to understand this note: "note: Since Deepspeed-ZeRO can process multiple generate streams in parallel its throughput can be further divided by 8 or 16 ..." (opened Jun 14, 2023 by HuipengXu; updated Jun 15, 2023)
#100: It does not work with Falcon-40B correctly (opened Jun 23, 2023 by AGrosserHH; updated Jun 23, 2023)
#77: [Bug] Int8 quantize inference failed using bloom-inference-scripts/bloom-ds-inference.py with deepspeed==0.9.0 on multi-gpus (opened Apr 17, 2023 by hanrui1sensetime; updated Jun 27, 2023)
#101: ValueError: Couldn't instantiate the backend tokenizer from one of: (opened Jun 30, 2023 by SeekPoint; updated Jun 30, 2023)
#88: pip install command does not work as expected (opened May 9, 2023 by Billijk; updated Jul 10, 2023)
#90: Inference(chatbot) does not work as expected on 2 gpus with bigscience/bloom-7b1 model (opened May 19, 2023 by dantalyon; updated Sep 22, 2023)