random assertion errors due to evaluate_nochat #1600
Comments
Hi, I see the issue is from awq:
It's likely a bug in AWQ, perhaps when combined with attention sinks, flash attention, or model compilation. While we expose those options from transformers, I cannot be sure arbitrary combinations work. If I run:
I don't see any general problem running the model. I removed options that shouldn't be relevant to the AWQ issue. However, when I upload some text and then ask a question, I get the same issue:
This does the same thing:
As does this:
So it seems to be a pure AWQ issue. The latest 0.2.5 does the same thing. Reducing to (say) 15000 does the same thing.
A small script does the same thing, so it's not related to h2oGPT itself.
I'm getting a similar error with LLAMA-3 GGUF as well (same model mentioned in the FAQ), it only includes the evaluate_nochat exception from the log above. error running LLAMA-3evaluate_nochat exception: : ('', '', '', True, 'unknown', "{ 'PreInput': None,\n 'PreInstruct': None,\n 'PreResponse': None,\n 'botstr': None,\n 'can_handle_system_prompt': False,\n 'chat_sep': '\\n',\n 'chat_turn_
sep': '\\n',\n 'generates_leading_space': False,\n 'humanstr': None,\n 'promptA': None,\n 'promptB': None,\n 'system_prompt': '',\n 'terminate_response': []}", 0, 1, 1, 0, 1, 1024, 0, False, 600, 1.07, 1, False, 0, Tr
ue, '', '', 'LLM', True, 'Query', None, 10, True, 512, 'Relevant', ['All'], None, None, None, None, 'Pay attention and remember the information below, which will help to answer the question or imperative after the context ends.', 'Acco
rding to only the information in the document sources provided within the context above, write an insightful and well-structured response to: ', 'In order to write a concise single-paragraph or bulleted list summary, pay attention to t
he following text.', 'Using only the information in the document sources above, write a condensed and concise summary of key results (preferably as about 10 bullet points).', 'Answer this question with vibrant details in order for some
NLP embedding model to use that answer as better query than original question: ', 'Who are you and what do you do?', 'Ensure your entire response is outputted as a single piece of strict valid JSON text.', 'Ensure your response is str
ictly valid JSON text.', 'Ensure your entire response is outputted as strict valid JSON text inside a Markdown code block with the json language identifier. Ensure all JSON keys are less than 64 characters, and ensure JSON key names are made of only alphanumerics, underscores, or hyphens.', 'Ensure you follow this JSON schema:\n```json\n{properties_schema}\n```', 'auto', ['DocTR', 'Caption', 'ASR'], ['PyPDF'], ['Unstructured'], '.[]', 10, 'auto', [], None, '', False, '[]', '[]', 'best_near_prompt', 512, -1, -1, 'split_or_merge', '\n\n', 0, 'auto', False, False, '[]', 'None', None, None, 1, None, None, 'text', '', '', '', '', {'model': 'model', 'tokenizer': 'tokenizer', 'device': 'cpu', 'base_model': 'llama', 'tokenizer_base_model': '', 'lora_weights': '[]', 'inference_server': '[]', 'prompt_type': 'unknown', 'prompt_dict': {'promptA': None, 'promptB': None, 'PreInstruct': None, 'PreInput': None, 'PreResponse': None, 'terminate_response': [], 'chat_sep': '\n', 'chat_turn_sep': '\n', 'humanstr': None, 'botstr': None, 'generates_leading_space': False, 'system_prompt': '', 'can_handle_system_prompt': False}, 'visible_models': 0, 'h2ogpt_key': None}, {'MyData': [None, '6164574a-865b-4659-8fda-d35faa6a1d09', 'test']}, {'langchain_modes': ['Disabled', 'LLM', 'MyData', 'UserData'], 'langchain_mode_paths': {'UserData': None}, 'langchain_mode_types': {'UserData': 'shared', 'github h2oGPT': 'shared', 'DriverlessAI docs': 'shared', 'wiki': 'shared', 'wiki_full': '', 'MyData': 'personal', 'LLM': 'personal', 'Disabled': 'personal'}}, {'headers': '', 'host': '0.0.0.0:7850', 'username': 'test', 'connection': 'keep-alive', 'content-length': '117', 'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0', 'dnt': '1', 'content-type': 'application/json', 'accept': '*/*', 'origin': 'http://0.0.0.0:7850', 'referer': 'http://0.0.0.0:7850/', 'accept-encoding': 'gzip, deflate', 'accept-language': 
'en-US,en;q=0.9,ar;q=0.8', 'cookie': 'access-token-unsecure-hhN8py5JLVRfL-0OTPND8TGcb3qhs2GvSJQ8qV1LI50=vrLRNuXKqoKCZDSCqo1OHg; access-token-unsecure-s-dRx26Pws-xf2TfvaYIjqwWsGjiH9960S06PrlT6tg=AnrezJi1hR1NjfFx29n_bg; access-token-unsecure-SF0CZ7POfi6Imk0jDfN44qO9W9VB0hu3nUcGevVPMYw=SU1SQYZL79hpAN43hEDgIQ; access-token-unsecure-9LIDZewsE4If1yY7ixHa-yOZJO20M-PQVSDjJtfYQYA=o8YMAhHGtoLQDjMVZVITsQ; access-token-unsecure-qS0zsQdPdQYJsrMX4RXh3HQwEDeknaNz0RppngdPvGY=AGmuVQm8_KVKkMg8HdQtqg; access-token-unsecure--qfFGcbj-JQc0O0MamjIfNGlfgUrb6t7xyB3hRUL1I8=NVbKjP5O7Q3xJxHYvaiUfw; access-token-unsecure-YeY4iDfE2-hlA1izGtL7vBNbLbCosRLpSAJFo-j6_e0=xkWJTIiCTZGbhG1H60OTBg; access-token-unsecure-BwVTmtTwIzOYqtTpvsZkHQvnjr8N60WJaX_V6njwUAw=8uPW51j557W7S8ZO_e5iSQ; access-token-unsecure-JSzZdmZ5Fn4S9ekIB_5lXnXTrnwvQu1X7IyivtmRjuk=mm2CzLGIw9b3H9xFfS1KpQ; access-token-unsecure-IxTC1FBXOKLvW0SXsNRzMYxrHxvTPTIwwB4y69dHG9A=fYRfZinU_x99RK1k11fvIA; access-token-unsecure-eJjGwBq3ju0P30aflFl-P8uUU2QqEgAIsxgw-FsdJgU=fDM1Je-16fNS5ndkdNRL6g; access-token-unsecure-uiZ2ybGZZJgrCTxTV22rFbgZVNssg73T8oL2xiUqp1I=VrTDb6L_l-gQy27srKSq0Q; access-token-unsecure-U2bTOnaNSeVnj1LFMlrf1Mtm6lWYkZALH_MqD44PYNU=lt481IiB7f9YGqWkkOinIA; access-token-unsecure-RxkNbfFX2paq-I07CliUNsj55vZP1qWOWwp2u-TDNCc=mmNdzgtQmfvlDlQe6i4PvA; access-token-unsecure-zfJ9zz3Jn9sTfeKgGIdQFP7gIY4kjYQ8rW5jEkOaylQ=IsBrbZibgh9mmFvTDmEOAg; access-token-unsecure-L9Xt7VY0kFHSJqZAK8-I90p4rp6XNVtxCYJp3nAmxfs=LZu9e7Ggvl2vYBE1AuZbLQ', 'host2': '14.1.246.124', 'picture': 'None'}, {}, [['hello', '']])
Traceback (most recent call last):
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/queueing.py", line 566, in process_events
response = await route_utils.call_process_api(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/route_utils.py", line 261, in call_process_api
output = await app.get_blocks().process_api(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/blocks.py", line 1788, in process_api
result = await self.call_function(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/blocks.py", line 1352, in call_function
prediction = await utils.async_iteration(iterator)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 595, in async_iteration
return await iterator.__anext__()
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 588, in __anext__
return await anyio.to_thread.run_sync(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
return await future
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 571, in run_sync_iterator_async
return next(iterator)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 754, in gen_wrapper
response = next(iterator)
File "/workspace/src/gradio_runner.py", line 5053, in bot
for res in get_response(fun1, history, chatbot_role1, speaker1, tts_language1, roles_state1,
File "/workspace/src/gradio_runner.py", line 4948, in get_response
for output_fun in fun1():
File "/workspace/src/gen.py", line 4402, in evaluate
prompt_basic = prompter.generate_prompt(data_point, context_from_history=False)
File "/workspace/src/prompter.py", line 1729, in generate_prompt
assert self.use_chat_template
AssertionError
run command
export IMAGE_TAG=4059a2c9
export HF_TOKEN=hf_xxx
docker run \
--init \
--gpus all \
--runtime=nvidia \
--shm-size=2g \
-p 7850:7860 \
-v /etc/passwd:/etc/passwd:ro \
-v /etc/group:/etc/group:ro \
-u $(id -u):$(id -g) \
gcr.io/vorvan/h2oai/h2ogpt-runtime:$IMAGE_TAG /workspace/generate.py \
--openai_server=False \
--auth="/workspace/auth/users.json" \
--h2ogpt_api_keys="/workspace/auth/api_keys.json" \
--use_gpu_id=False \
--score_model=None \
--base_model=llama \
--model_path_llama=https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf?download=true
--tokenizer_base_model=meta-llama/Meta-Llama-3-8B-Instruct \
--save_dir='/workspace/save/' \
--user_path='/workspace/user_path/' \
--langchain_mode="UserData" \
--langchain_modes="['UserData', 'LLM']" \
--visible_langchain_actions="['Query']" \
--visible_langchain_agents="[]" \
--use_llm_if_no_docs=True \
--enable_ocr=True \
--enable_tts=False \
--enable_stt=False
It should fail and make you pass:
as well. e.g.
gives:
I don't see the error you see. And when I debug the code with the above command on the latest h2oGPT, I see that chat_template is True. Perhaps you are using an older docker image or an older h2oGPT?
There was a missing slash in the command after
This occurs on both the latest and previous docker images with tags
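A missing line-continuation backslash like the one mentioned above makes the shell treat the remaining options as a separate command, so they never reach the program. The following is a minimal sketch of a check for that mistake. The helper name `find_missing_continuations` is hypothetical, the heuristic (a line without a trailing backslash followed by a line starting with `-`) is only an assumption about how such commands are usually laid out, and the URL and model names in the sample are placeholders.

```python
def find_missing_continuations(command: str) -> list[int]:
    """Return 1-based numbers of lines that lack a trailing backslash
    although the following line starts with an option flag."""
    lines = command.splitlines()
    missing = []
    for i, line in enumerate(lines[:-1]):
        stripped = line.rstrip()
        next_line = lines[i + 1].strip()
        # A non-empty line that does not end in "\" ends the shell command,
        # so a following "--option" line would be run on its own.
        if stripped and not stripped.endswith("\\") and next_line.startswith("-"):
            missing.append(i + 1)
    return missing


# Placeholder command; only the shape matters here.
command = """docker run \\
  --base_model=llama \\
  --model_path_llama=https://example.invalid/model.gguf
  --tokenizer_base_model=some-org/some-model \\
  --enable_tts=False"""

print(find_missing_continuations(command))  # [3]
```

Running a check like this over a pasted command before launching the container can catch options that would otherwise be silently dropped.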
Can you share a stack trace of where it's failing?
Here is the full stack trace. I'm authenticated on the huggingface-cli, have exported
The error is not raised while starting the container; it is raised only after sending a query.
evaluate_nochat exception: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.
401 Client Error. (Request ID: Root=1-6639f802-164d6a07579d391a189244e2;12303a0c-d620-412a-b17f-58372bd127d5)
Cannot access gated repo for url https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/generation_config.json.
Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must be authenticated to access it.: ('', '', '', True, 'unknown', "{ 'PreInput': None,\n 'PreInstruct': None,\n 'PreResponse': None,\n 'botstr': None,\n
'can_handle_system_prompt': False,\n 'chat_sep': '\\n',\n 'chat_turn_sep': '\\n',\n 'generates_leading_space': False,\n 'humanstr': None,\n 'promptA': None,\n 'promptB': None,\n 'system_prompt': '',\n 'termi
nate_response': []}", 0, 1, 1, 0, 1, 1024, 0, False, 600, 1.07, 1, False, 0, True, '', '', 'UserData', True, 'Query', None, 10, True, 512, 'Relevant', ['/workspace/user_path/9b999f43-2ade-4148-97cf-d2448125168c/res/b1a173b2_user_upload
_Abubakar-Yusif-March-2024-Progress-Report.pdf'], None, None, None, None, 'Pay attention and remember the information below, which will help to answer the question or imperative after the context ends.', 'According to only the informat
ion in the document sources provided within the context above, write an insightful and well-structured response to: ', 'In order to write a concise single-paragraph or bulleted list summary, pay attention to the following text.', 'Usin
g only the information in the document sources above, write a condensed and concise summary of key results (preferably as about 10 bullet points).', 'Answer this question with vibrant details in order for some NLP embedding model to us
e that answer as better query than original question: ', 'Who are you and what do you do?', 'Ensure your entire response is outputted as a single piece of strict valid JSON text.', 'Ensure your response is strictly valid JSON text.', '
Ensure your entire response is outputted as strict valid JSON text inside a Markdown code block with the json language identifier. Ensure all JSON keys are less than 64 characters, and ensure JSON key names are made of only alphanume
rics, underscores, or hyphens.', 'Ensure you follow this JSON schema:\n```json\n{properties_schema}\n```', 'auto', ['OCR', 'DocTR', 'Caption', 'ASR'], ['PyPDF'], ['Unstructured'], '.[]', 10, 'auto', [], None, '', False, '[]', '[]', 'be
st_near_prompt', 512, -1, -1, 'split_or_merge', '\n\n', 0, 'auto', False, False, '[]', 'None', None, None, 1, None, None, 'text', '', '', '', '', {'model': 'model', 'tokenizer': 'tokenizer', 'device': 'cpu', 'base_model': 'llama', 'tok
enizer_base_model': 'meta-llama/Meta-Llama-3-8B-Instruct', 'lora_weights': '[]', 'inference_server': '[]', 'prompt_type': 'unknown', 'prompt_dict': {'promptA': None, 'promptB': None, 'PreInstruct': None, 'PreInput': None, 'PreResponse'
: None, 'terminate_response': [], 'chat_sep': '\n', 'chat_turn_sep': '\n', 'humanstr': None, 'botstr': None, 'generates_leading_space': False, 'system_prompt': '', 'can_handle_system_prompt': False}, 'visible_models': 0, 'h2ogpt_key':
None}, {'MyData': [None, '2db742f5-7880-4708-bfcc-061f995f6c51', 'test']}, {'langchain_modes': ['Disabled', 'LLM', 'UserData'], 'langchain_mode_paths': {'UserData': '/workspace/user_path/'}, 'langchain_mode_types': {'UserData': 'shared
', 'github h2oGPT': 'shared', 'DriverlessAI docs': 'shared', 'wiki': 'shared', 'wiki_full': '', 'LLM': 'personal', 'Disabled': 'personal'}}, {'headers': '', 'host': '0.0.0.0:7850', 'username': 'test', 'connection': 'keep-alive', 'c
ontent-length': '155', 'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0', 'dnt': '1', 'content-type': 'application/json', 'accept': '*/*', 'origin': 'htt
p://0.0.0.0:7850', 'referer': 'http://0.0.0.0:7850/', 'accept-encoding': 'gzip, deflate', 'accept-language': 'en-US,en;q=0.9,ar;q=0.8', 'cookie': 'access-token-unsecure-hhN8py5JLVRfL-0OTPND8TGcb3qhs2GvSJQ8qV1LI50=vrLRNuXKqoKCZD
SCqo1OHg; access-token-unsecure-s-dRx26Pws-xf2TfvaYIjqwWsGjiH9960S06PrlT6tg=AnrezJi1hR1NjfFx29n_bg; access-token-unsecure-SF0CZ7POfi6Imk0jDfN44qO9W9VB0hu3nUcGevVPMYw=SU1SQYZL79hpAN43hEDgIQ; access-token-unsecure-9LIDZewsE4If1yY7ixHa-yO
ZJO20M-PQVSDjJtfYQYA=o8YMAhHGtoLQDjMVZVITsQ; access-token-unsecure-qS0zsQdPdQYJsrMX4RXh3HQwEDeknaNz0RppngdPvGY=AGmuVQm8_KVKkMg8HdQtqg; access-token-unsecure--qfFGcbj-JQc0O0MamjIfNGlfgUrb6t7xyB3hRUL1I8=NVbKjP5O7Q3xJxHYvaiUfw; access-tok
en-unsecure-YeY4iDfE2-hlA1izGtL7vBNbLbCosRLpSAJFo-j6_e0=xkWJTIiCTZGbhG1H60OTBg; access-token-unsecure-BwVTmtTwIzOYqtTpvsZkHQvnjr8N60WJaX_V6njwUAw=8uPW51j557W7S8ZO_e5iSQ; access-token-unsecure-JSzZdmZ5Fn4S9ekIB_5lXnXTrnwvQu1X7IyivtmRjuk
=mm2CzLGIw9b3H9xFfS1KpQ; access-token-unsecure-IxTC1FBXOKLvW0SXsNRzMYxrHxvTPTIwwB4y69dHG9A=fYRfZinU_x99RK1k11fvIA; access-token-unsecure-eJjGwBq3ju0P30aflFl-P8uUU2QqEgAIsxgw-FsdJgU=fDM1Je-16fNS5ndkdNRL6g; access-token-unsecure-uiZ2ybGZ
ZJgrCTxTV22rFbgZVNssg73T8oL2xiUqp1I=VrTDb6L_l-gQy27srKSq0Q; access-token-unsecure-U2bTOnaNSeVnj1LFMlrf1Mtm6lWYkZALH_MqD44PYNU=lt481IiB7f9YGqWkkOinIA; access-token-unsecure-RxkNbfFX2paq-I07CliUNsj55vZP1qWOWwp2u-TDNCc=mmNdzgtQmfvlDlQe6i4
PvA; access-token-unsecure-zfJ9zz3Jn9sTfeKgGIdQFP7gIY4kjYQ8rW5jEkOaylQ=IsBrbZibgh9mmFvTDmEOAg; access-token-unsecure-L9Xt7VY0kFHSJqZAK8-I90p4rp6XNVtxCYJp3nAmxfs=LZu9e7Ggvl2vYBE1AuZbLQ; access-token-unsecure-Q1siuNCSLCiNRxIYSoa5j3shaosW
MURmotg6HLCC7_U=GnFkZ58wi1tNt8jvtK_UXw; access-token-unsecure-18dc1X-LAJmB7lwYCfNUZLG0jDJ14hzU62wssf0wc68=sl8WTC1xPw4lEirkcWQTwg; access-token-unsecure-EjGzYB6qR9D1LPBN82KeJcwOVChluLaElinAd1OgeSk=jzCYJhyTI3Aka1vIsd-b8g; access-token-un
secure-X7gaKAyAK8MUU18su8lDmyVdbuDVILxO9xngYdFnvgo=1JT2Rzr5QHd5rLGbEgqASg; access-token-unsecure-SBEZ_P_Ugyx9zNJcGUYmCbOTE7jSx0BgJFTIyH3OkSg=_X_Xbu1W23MfAgZgdLlWZg; access-token-unsecure-wxxoptBvH1YNkOHaad2CST68Cg5SrC23tRQMsGy2TOc=Vl3F
HsOce3Q6zMWHJq79tA; access-token-unsecure-6WqVOYaEXoF6RktDUq-aghEiDdfSuCY2tkPQHCqGjck=N4QVkzJs0_AgRFQRdx5Y-Q; access-token-unsecure-PX6qZPvMN8mpgv3RwObslqNOspATAkytwLj-5mrO12Q=CxUwwnlESuufP4AeTnOh3Q; access-token-unsecure-a8SnDigFzjafo
M84LMb3tCUsrk44E9VoXdpmv36kpNo=DjNp22jvVqPfWo9k6fWxUg; access-token-unsecure-xy6fU_AFL6Tp1nllr3w_QAOlrvJRFgRn0bTPJnCrTAo=Nqt1rhRGgIiZD_m8zaERrg; access-token-unsecure-9JU3qRVH6hK4ZC54rM0J035t62b8-m26tT3T5I-wEWA=ESa5PlhZ1RUHvb1Klzxujg',
'host2': '14.1.246.126', 'picture': 'None'}, {}, [['summarize the given data', '']])
Traceback (most recent call last):
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
response.raise_for_status()
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/generation_config.json
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/transformers/utils/hub.py", line 398, in cached_file
resolved_file = hf_hub_download(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1221, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1325, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1823, in _raise_on_head_call_error
raise head_call_error
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1722, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(url=url, proxies=proxies, timeout=etag_timeout, headers=headers)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
r = _request_wrapper(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
response = _request_wrapper(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 396, in _request_wrapper
hf_raise_for_status(response)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 321, in hf_raise_for_status
raise GatedRepoError(message, response) from e
huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-6639f802-164d6a07579d391a189244e2;12303a0c-d620-412a-b17f-58372bd127d5)
Cannot access gated repo for url https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/generation_config.json.
Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must be authenticated to access it.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/queueing.py", line 566, in process_events
response = await route_utils.call_process_api(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/route_utils.py", line 261, in call_process_api
output = await app.get_blocks().process_api(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/blocks.py", line 1788, in process_api
result = await self.call_function(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/blocks.py", line 1352, in call_function
prediction = await utils.async_iteration(iterator)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 595, in async_iteration
return await iterator.__anext__()
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 588, in __anext__
return await anyio.to_thread.run_sync(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
return await future
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 571, in run_sync_iterator_async
return next(iterator)
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/gradio/utils.py", line 754, in gen_wrapper
response = next(iterator)
File "/workspace/src/gradio_runner.py", line 5053, in bot
for res in get_response(fun1, history, chatbot_role1, speaker1, tts_language1, roles_state1,
File "/workspace/src/gradio_runner.py", line 4948, in get_response
for output_fun in fun1():
File "/workspace/src/gen.py", line 4278, in evaluate
prompter = Prompter(prompt_type, prompt_dict, debug=debug, stream_output=stream_output,
File "/workspace/src/prompter.py", line 1706, in __init__
self.terminate_response = update_terminate_responses(self.terminate_response,
File "/workspace/src/stopping.py", line 25, in update_terminate_responses
generate_eos_token_id = GenerationConfig.from_pretrained(tokenizer.name_or_path).eos_token_id
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/transformers/generation/configuration_utils.py", line 843, in from_pretrained
resolved_config_file = cached_file(
File "/h2ogpt_conda/envs/h2ogpt/lib/python3.10/site-packages/transformers/utils/hub.py", line 416, in cached_file
raise EnvironmentError(
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.
401 Client Error. (Request ID: Root=1-6639f802-164d6a07579d391a189244e2;12303a0c-d620-412a-b17f-58372bd127d5)
Cannot access gated repo for url https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/generation_config.json.
Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must be authenticated to access it.
I see. But if you pass the env e.g. the docker line would add:
But I made some changes to that particular piece of code.
I can confirm passing the environment variable to the docker image with
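As a rough illustration of forwarding a token into the container, the sketch below builds `-e NAME` flags for `docker run`, which pass a variable through from the host environment. The helper `docker_token_flags` is hypothetical. `HF_TOKEN` and `HUGGING_FACE_HUB_TOKEN` are names read by huggingface_hub, but which of them the h2oGPT image expects is an assumption here, not something this thread confirms.

```python
import os
from typing import Mapping

# Token variable names read by huggingface_hub (assumed relevant here).
TOKEN_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")


def docker_token_flags(env: Mapping[str, str]) -> list[str]:
    """Return `-e NAME` flags for every token variable that is set in `env`."""
    flags: list[str] = []
    for name in TOKEN_VARS:
        if env.get(name):
            # `-e NAME` without a value forwards the host's value to the container.
            flags += ["-e", name]
    return flags


flags = docker_token_flags(os.environ)
if flags:
    print("docker run", " ".join(flags), "...")
else:
    print("No Hugging Face token variable is set; gated repos may be inaccessible.")
```

Exporting the variable on the host is not enough on its own, since a container only sees variables that are explicitly passed to it.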
When using the docker image, I randomly get assertion errors when making a request from the Gradio UI. Sometimes it works and sometimes it does not. Here is the raised error.
This occurs with the latest two docker images, tagged 4059a2c9 and 7297519c.
Full error
Docker command
used command to run h2ogpt