Description
Firstly: Fantastic work! This is the way!
I followed the instructions in your doc file, except that instead of opt66b I used bloom and bloom-3b.
Both models load properly on my 8 V100 32GB GPUs (bloom-3b obviously needs only 1 GPU).
Decoding also finishes, but the output is problematic:
My input: text = """The translation of 'I am a boy' in French is"""
My output: The translation of 'I am a boy' in French is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is is
This happens for both models.
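For reference, the setup described above can be sketched as a minimal repro script. The checkpoint name `bigscience/bloom-3b`, the `generate()` arguments, and the `longest_token_run` helper are my assumptions filled in from the description, not the exact code I ran:

```python
def longest_token_run(text: str) -> int:
    """Length of the longest run of one repeated whitespace-separated token,
    to quantify degenerate output like 'is is is ...'."""
    tokens = text.split()
    if not tokens:
        return 0
    best = run = 1
    for prev, cur in zip(tokens, tokens[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def run_repro(model_name: str = "bigscience/bloom-3b",
              max_new_tokens: int = 40) -> str:
    """Load the model in 8-bit across the available GPUs and greedily
    decode the prompt from this report (assumed call pattern)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # heavy deps kept local

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",   # shard across the visible GPUs
        load_in_8bit=True,   # int8 quantization via bitsandbytes
    )
    text = "The translation of 'I am a boy' in French is"
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Usage (on the 8-GPU machine):
#   completion = run_repro()
#   longest_token_run(completion)  # very large for the degenerate output above
```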
Some details about my setup:
- V100 GPUs
- transformers-4.22.0.dev0
- CUDA 11.1
- cuDNN 8.x
- bitsandbytes (I am assuming it's the latest version compatible with CUDA 11.x)
Kindly let me know how this can be fixed.
Thanks and regards.