Thanks for your great work!

I did a fresh install with Docker on Ubuntu 22.04.3.

Question: I get an error and the container keeps rebooting. How can I solve it? Is it because of CUDA version 12.2? I cannot set it up with 12.1: for NVIDIA driver 535.154.05, cuda_12.2.0_535.54.03_linux.run installs CUDA 12.2. (With ubuntu-drivers devices I see nvidia-driver-530, but installing it failed.) I googled and found an issue that seems related. Thanks in advance.
Inside the Docker container:

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
h2ogpt@d1225cb3fb68:~$ nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.154.05 Driver Version: 535.154.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
I ran into the messages below:
h2ogpt-h2ogpt-1 | Using Model h2oai/h2ogpt-4096-llama2-7b-chat
h2ogpt-h2ogpt-1 | fatal: not a git repository (or any of the parent directories): .git
h2ogpt-h2ogpt-1 | load INSTRUCTOR_Transformer
h2ogpt-h2ogpt-1 | max_seq_length 512
h2ogpt-h2ogpt-1 | Traceback (most recent call last):
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1364, in _get_module
h2ogpt-h2ogpt-1 | return importlib.import_module("." + module_name, self.__name__)
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/importlib/__init__.py", line 126, in import_module
h2ogpt-h2ogpt-1 | return _bootstrap._gcd_import(name[level:], package, level)
h2ogpt-h2ogpt-1 | File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
h2ogpt-h2ogpt-1 | File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
h2ogpt-h2ogpt-1 | File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
h2ogpt-h2ogpt-1 | File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
h2ogpt-h2ogpt-1 | File "<frozen importlib._bootstrap_external>", line 883, in exec_module
h2ogpt-h2ogpt-1 | File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/transformers/models/whisper/modeling_whisper.py", line 50, in <module>
h2ogpt-h2ogpt-1 | from flash_attn import flash_attn_func, flash_attn_varlen_func
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
h2ogpt-h2ogpt-1 | from flash_attn.flash_attn_interface import (
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 10, in <module>
h2ogpt-h2ogpt-1 | import flash_attn_2_cuda as flash_attn_cuda
h2ogpt-h2ogpt-1 | ImportError: /h2ogpt_conda/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
h2ogpt-h2ogpt-1 |
h2ogpt-h2ogpt-1 | The above exception was the direct cause of the following exception:
h2ogpt-h2ogpt-1 |
h2ogpt-h2ogpt-1 | Traceback (most recent call last):
h2ogpt-h2ogpt-1 | File "/workspace/generate.py", line 16, in <module>
h2ogpt-h2ogpt-1 | entrypoint_main()
h2ogpt-h2ogpt-1 | File "/workspace/generate.py", line 12, in entrypoint_main
h2ogpt-h2ogpt-1 | H2O_Fire(main)
h2ogpt-h2ogpt-1 | File "/workspace/src/utils.py", line 65, in H2O_Fire
h2ogpt-h2ogpt-1 | fire.Fire(component=component, command=args)
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
h2ogpt-h2ogpt-1 | component_trace = _Fire(component, args, parsed_flag_args, context, name)
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
h2ogpt-h2ogpt-1 | component, remaining_args = _CallAndUpdateTrace(
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
h2ogpt-h2ogpt-1 | component = fn(*varargs, **kwargs)
h2ogpt-h2ogpt-1 | File "/workspace/src/gen.py", line 1701, in main
h2ogpt-h2ogpt-1 | transcriber = get_transcriber(model=stt_model,
h2ogpt-h2ogpt-1 | File "/workspace/src/stt.py", line 15, in get_transcriber
h2ogpt-h2ogpt-1 | transcriber = pipeline("automatic-speech-recognition", model=model, device_map=device_map)
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/transformers/pipelines/__init__.py", line 870, in pipeline
h2ogpt-h2ogpt-1 | framework, model = infer_framework_load_model(
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line 249, in infer_framework_load_model
h2ogpt-h2ogpt-1 | _class = getattr(transformers_module, architecture, None)
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1355, in __getattr__
h2ogpt-h2ogpt-1 | value = getattr(module, name)
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1354, in __getattr__
h2ogpt-h2ogpt-1 | module = self._get_module(self._class_to_module[name])
h2ogpt-h2ogpt-1 | File "/h2ogpt_conda/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1366, in _get_module
h2ogpt-h2ogpt-1 | raise RuntimeError(
h2ogpt-h2ogpt-1 | RuntimeError: Failed to import transformers.models.whisper.modeling_whisper because of the following error (look up to see its traceback):
h2ogpt-h2ogpt-1 | /h2ogpt_conda/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
h2ogpt-h2ogpt-1 exited with code 0
Hi, I only saw that when I had the newer torch 2.2.0 together with flash-attn. It happened because some langchain package was upgrading torch to 2.2.0 (because it could), and I had only set the constraint in requirements.txt, not in the langchain requirements file. That constraint is in place now, and I no longer hit the issue.

I see that the latest Docker image I made has the same problem of still going to torch 2.2.0, so this should be fixable.
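In the meantime, a workaround sketch (my assumption of the intended pin, not the project's official constraint file) is to hold torch below 2.2.0 wherever pip resolves it, so the prebuilt flash-attn wheel's C++ ABI still matches:

```text
# constraints.txt (hypothetical) — apply with: pip install -c constraints.txt ...
# Keep torch below 2.2.0 so the flash_attn_2_cuda extension, built against
# an earlier torch, does not hit the "undefined symbol" ABI mismatch.
torch<2.2.0
```

Alternatively, force-reinstalling flash-attn after the torch version is fixed rebuilds or refetches the extension against whatever torch is actually present.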
(h2ogpt) jon@gpu:~/h2ogpt$ docker run -ti --entrypoint=bash gcr.io/vorvan/h2oai/h2ogpt-runtime:0.1.0
h2ogpt@f0692c4f43b7:~$ python
Python 3.10.9 (main, Jan 11 2023, 15:21:40) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import flash_attn
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/h2ogpt_conda/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
from flash_attn.flash_attn_interface import (
File "/h2ogpt_conda/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 10, in <module>
import flash_attn_2_cuda as flash_attn_cuda
ImportError: /h2ogpt_conda/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
>>> import torch
>>> torch.__version__
'2.2.0+cu121'
>>>
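The undefined symbol in the session above is a C++ ABI mismatch: the prebuilt flash_attn_2_cuda extension was compiled against an older torch than the 2.2.0+cu121 that ended up installed. A minimal check (a sketch; the 2.1 target is an assumption about what the wheel was built against) is just a version comparison:

```python
def torch_too_new(installed: str, built_for: tuple = (2, 1)) -> bool:
    """Return True if the installed torch is newer than the series the
    flash-attn wheel is assumed to have been built against."""
    # Strip the local build tag, e.g. "2.2.0+cu121" -> "2.2.0".
    major, minor = installed.split("+")[0].split(".")[:2]
    return (int(major), int(minor)) > built_for

# The torch reported inside the image above:
print(torch_too_new("2.2.0+cu121"))  # True -> expect the import failure
```

Running this against the version string from the session confirms the mismatch; with a 2.1.x torch it returns False and the import succeeds.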