File "/app/llama_cpp/server/app.py", line 313, in create_app
llama = llama_cpp.Llama(
^^^^^^^^^^^^^^^^
File "/app/llama_cpp/llama.py", line 313, in __init__
assert self.model is not None
^^^^^^^^^^^^^^^^^^^^^^
AssertionError
Exception ignored in: <function Llama.__del__ at 0xffff863f6200>
Traceback (most recent call last):
File "/app/llama_cpp/llama.py", line 1510, in __del__
if self.ctx is not None:
^^^^^^^^
AttributeError: 'Llama' object has no attribute 'ctx'
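The secondary `AttributeError` is a cosmetic side effect, not a second bug: `__init__` hit its assertion before `self.ctx` was ever assigned, so the finalizer touched a missing attribute. A minimal sketch of the pattern (this `Llama` class is a stand-in to illustrate the failure mode, not the real implementation):

```python
class Llama:
    def __init__(self, model_path: str):
        # Model loading fails, so the assertion fires before
        # self.ctx is ever assigned on the instance.
        self.model = None
        assert self.model is not None

    def __del__(self):
        # Guarding with getattr() avoids the noisy secondary
        # AttributeError when __init__ bailed out early.
        if getattr(self, "ctx", None) is not None:
            pass  # free the native context here

try:
    Llama("/models/llama-2-70b-chat.bin")
except AssertionError:
    print("load failed")  # the real error; __del__ stays quiet
```

With the `getattr` guard, only the genuine `AssertionError` surfaces; the "Exception ignored in" noise disappears.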
/usr/local/lib/python3.11/site-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` and ``easy_install``.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://github.com/pypa/setuptools/issues/917 for details.
********************************************************************************
!!
easy_install.initialize_options(self)
/usr/local/lib/python3.11/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
self.initialize_options()
/usr/local/lib/python3.11/site-packages/pydantic/_internal/fields.py:126: UserWarning: Field "model_alias" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ('settings_',).
warnings.warn(
llama.cpp: loading model from /models/llama-2-70b-chat.bin
llama_model_load_internal: format = ggjt v3 (latest)
Thanks for reporting, @itamargero. We have added N_GQA: 8 to the docker-compose.yml for the 70B model. Token generation is slow on M1 because there is no GPU offloading or Metal support yet, so only the CPU is being used. We hope to add Metal support soon. Closing this issue for now.
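For reference, the fix amounts to passing the grouped-query-attention factor through to the server. A sketch of the relevant docker-compose.yml fragment, assuming the service name and the MODEL environment variable (only N_GQA is confirmed by the comment above):

```yaml
services:
  llama-cpp-server:
    environment:
      MODEL: /models/llama-2-70b-chat.bin
      N_GQA: "8"   # required for the 70B model; smaller LLaMA-2 models omit it
```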
Using /usr/local/lib/python3.11/site-packages
Finished processing dependencies for llama-cpp-python==0.1.77
Initializing server with:
Batch size: 2096
Number of CPU threads: 8
Number of GPU layers: 0
Context window: 4096
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/app/llama_cpp/server/__main__.py", line 46, in <module>
File "/app/llama_cpp/server/app.py", line 313, in create_app
File "/app/llama_cpp/llama.py", line 313, in __init__
AssertionError
Exception ignored in: <function Llama.__del__ at 0xffff863f6200>
Traceback (most recent call last):
File "/app/llama_cpp/llama.py", line 1510, in __del__
AttributeError: 'Llama' object has no attribute 'ctx'
/usr/local/lib/python3.11/site-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!
!!
easy_install.initialize_options(self)
/usr/local/lib/python3.11/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
!!
self.initialize_options()
/usr/local/lib/python3.11/site-packages/pydantic/_internal/fields.py:126: UserWarning: Field "model_alias" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ('settings_',).
warnings.warn(
llama.cpp: loading model from /models/llama-2-70b-chat.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 4096
llama_model_load_internal: n_embd = 8192
llama_model_load_internal: n_mult = 4096
llama_model_load_internal: n_head = 64
llama_model_load_internal: n_head_kv = 64
llama_model_load_internal: n_layer = 80
llama_model_load_internal: n_rot = 128
llama_model_load_internal: n_gqa = 1
llama_model_load_internal: rnorm_eps = 1.0e-06
llama_model_load_internal: n_ff = 24576
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: model size = 65B
llama_model_load_internal: ggml ctx size = 0.21 MB
warning: failed to mlock 221184-byte buffer (after previously locking 0 bytes): Cannot allocate memory
Try increasing RLIMIT_MLOCK ('ulimit -l' as root).
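The mlock warning is separate from the fatal shape error below: `use_mlock` asked the OS to pin the model in RAM, but the process's locked-memory limit is too small, so llama.cpp falls back to unpinned memory. On Unix you can inspect that limit with the stdlib `resource` module (raising it is done outside Python, e.g. `ulimit -l` or Docker's `--ulimit memlock=-1`):

```python
import resource

# RLIMIT_MEMLOCK is the per-process cap on mlock()ed bytes that the
# "failed to mlock" warning above is running into.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print(f"locked-memory limit: soft={soft}, hard={hard}")
```

A value like 65536 explains the failure immediately: a 70B Q4_0 model is tens of gigabytes, far beyond a 64 KiB lock budget.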
error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024
llama_load_model_from_file: failed to load model
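The shape mismatch is the grouped-query-attention fingerprint. LLaMA-2 70B uses GQA with 8 KV groups, so its key/value projection weights are 8x narrower than the query projection; loading it with the default n_gqa = 1 (visible in the log above) makes llama.cpp expect square 8192 x 8192 K/V weights. The arithmetic, using the n_embd and n_head values from the log:

```python
# Dimensions printed by llama_model_load_internal above.
n_embd = 8192
n_head = 64
head_dim = n_embd // n_head  # 128, matches n_rot in the log

# With n_gqa = 1 every query head gets its own KV head: square weights.
n_head_kv_wrong = n_head // 1
expected = (n_embd, n_head_kv_wrong * head_dim)  # (8192, 8192)

# With n_gqa = 8 (correct for 70B) only 8 KV heads exist.
n_head_kv = n_head // 8
actual = (n_embd, n_head_kv * head_dim)  # (8192, 1024)

print(expected, actual)
```

The "expected 8192 x 8192, got 8192 x 1024" message is exactly this discrepancy, which is why setting N_GQA to 8 fixes the load.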
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/app/llama_cpp/server/__main__.py", line 46, in <module>
File "/app/llama_cpp/server/app.py", line 313, in create_app
File "/app/llama_cpp/llama.py", line 313, in __init__
AssertionError
Exception ignored in: <function Llama.__del__ at 0xffff9187a200>
Traceback (most recent call last):
File "/app/llama_cpp/llama.py", line 1510, in __del__
AttributeError: 'Llama' object has no attribute 'ctx'