
specialized_infer.py returns 401 Client Error #9

Closed · yumemio opened this issue May 28, 2024 · 3 comments

yumemio commented May 28, 2024

Hello! Kudos for making this repository, and the paper was great too. Combining multiple domain-expert models seems like a promising approach, especially in low-resource settings where we can't run a huge general-purpose model!

I'm having an issue running end-to-end inference with specialized_infer.py (by "end-to-end inference" I mean calling the Octopus model and then calling an expert model to get the final answer).

First, I commented out the experts that don't exist yet:

from utils import functional_token_mapping, extract_content
from specialized_models_inference import (
    inference_biology,
    inference_business,
    inference_chemistry,
    inference_computer_science,
    inference_math,
    inference_physics,
    inference_electrical_engineering,
    inference_history,
    inference_philosophy,
    inference_law,
    # inference_politics,
    inference_culture,
    inference_economics,
    inference_geography,
    # inference_psychology,
    # inference_health,
    # inference_general,
)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import time

torch.random.manual_seed(0)

model_import_mapping = {
    "physics_gpt": lambda: inference_physics.model(),
    "chemistry_gpt": lambda: inference_chemistry.model(),
    "biology_gpt": lambda: inference_biology.model(),
    "computer_science_gpt": lambda: inference_computer_science.model(),
    "math_gpt": lambda: inference_math.model(),
    "business_gpt": lambda: inference_business.model(),
    "electrical_engineering_gpt": lambda: inference_electrical_engineering.model(),
    "history_gpt": lambda: inference_history.model(),
    "philosophy_gpt": lambda: inference_philosophy.model(),
    "law_gpt": lambda: inference_law.model(),
    #"politics_gpt": lambda: inference_politics.model(),
    "culture_gpt": lambda: inference_culture.model(),
    "economics_gpt": lambda: inference_economics.model(),
    "geography_gpt": lambda: inference_geography.model(),
    #"psychology_gpt": lambda: inference_psychology.model(),
    #"health_gpt": lambda: inference_health.model(),
    #"general_gpt": lambda: inference_general.model(),
}

But then I got this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/NexaAIDev/octopus-v4-finetuned-v1/resolve/main/tokenizer_config.json

...

Traceback (most recent call last):
  File "/content/octopus-v4/specialized_infer.py", line 108, in <module>
    tokenizer = AutoTokenizer.from_pretrained("NexaAIDev/octopus-v4-finetuned-v1")
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 817, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 649, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 422, in cached_file
    raise EnvironmentError(
OSError: NexaAIDev/octopus-v4-finetuned-v1 is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

The error suggests that the code is trying to access a 🤗 model that hasn't been released yet. Are there any plans to make the model public?
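
(In case the repo is simply private rather than unreleased, the workaround the error message itself suggests is to authenticate against the Hub. A minimal sketch, assuming you have a token with read access; HF_TOKEN is just an illustrative env var name:)

import os

from transformers import AutoTokenizer

# Workaround sketch, not the repo's code: pass a token explicitly, or run
# `huggingface-cli login` once beforehand and drop the token argument.
tokenizer = AutoTokenizer.from_pretrained(
    "NexaAIDev/octopus-v4-finetuned-v1",
    token=os.environ.get("HF_TOKEN"),  # assumed env var holding your token
)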

Thanks for looking into this!

LyH88 (Collaborator) commented Jun 3, 2024

Hi yumemio! The error is due to a model name change that hasn't been reflected in the code yet. For now, changing the model name to "NexaAIDev/Octopus-v4" should resolve the issue. Specifically, update the tokenizer initialization at line 108 of specialized_infer.py to:
tokenizer = AutoTokenizer.from_pretrained("NexaAIDev/Octopus-v4")
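
If you want to sanity-check which repo id is live before re-running, recent versions of huggingface_hub ship a repo_exists helper; a quick sketch:

from huggingface_hub import repo_exists

# Expected at the time of this thread: the renamed repo resolves,
# the old id does not.
print(repo_exists("NexaAIDev/Octopus-v4"))               # True
print(repo_exists("NexaAIDev/octopus-v4-finetuned-v1"))  # False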

zhiyuan8 (Contributor) commented Jun 3, 2024

@yumemio Please try the updated code; the repo name passed to AutoTokenizer has been changed.

yumemio (Author) commented Jun 4, 2024

@LyH88 @zhiyuan8 Now it works like a charm. Thank you! 🤗

Complete log output

$ python specialized_infer.py
`flash-attention` package not found, consider installing for better performance: No module named 'flash_attn'.
Current `flash-attenton` does not support `window_size`. Either upgrade or use `attn_implementation='eager'`.
Loading checkpoint shards: 100% 2/2 [00:02<00:00,  1.45s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.

============= Below is Octopus-V4 response ==============

You are not running the flash-attention implementation, expect numerical differences.
2024-06-04 00:10:05.319401: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-06-04 00:10:05.372195: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-06-04 00:10:05.372249: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-06-04 00:10:05.374141: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-06-04 00:10:05.382628: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-04 00:10:06.476382: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
<nexa_4> ('Determine the derivative of the function f(x) = x^3 at the point where x equals 2, and interpret the result within the context of rate of change and tangent slope.')<nexa_end>
Elapsed time: 7.09s
Functional Token: <nexa_4>
Format Argument: Determine the derivative of the function f(x) = x^3 at the point where x equals 2, and interpret the result within the context of rate of change and tangent slope.

============= Below is specialized LLM response ==============

config.json: 100% 623/623 [00:00<00:00, 5.82MB/s]
pytorch_model.bin.index.json: 100% 23.9k/23.9k [00:00<00:00, 57.6MB/s]
Downloading shards:   0% 0/2 [00:00<?, ?it/s]
pytorch_model-00001-of-00002.bin:   0% 0.00/9.94G [00:00<?, ?B/s]
...
pytorch_model-00001-of-00002.bin: 100% 9.94G/9.94G [00:34<00:00, 290MB/s]
Downloading shards:  50% 1/2 [00:34<00:34, 34.50s/it]
pytorch_model-00002-of-00002.bin:   0% 0.00/4.54G [00:00<?, ?B/s]
...
pytorch_model-00002-of-00002.bin: 100% 4.54G/4.54G [00:15<00:00, 290MB/s]
Downloading shards: 100% 2/2 [00:50<00:00, 25.15s/it]
Loading checkpoint shards: 100% 2/2 [00:03<00:00,  1.98s/it]
generation_config.json: 100% 120/120 [00:00<00:00, 1.15MB/s]
WARNING:root:Some parameters are on the meta device device because they were offloaded to the cpu.
tokenizer_config.json: 100% 1.69k/1.69k [00:00<00:00, 16.3MB/s]
tokenizer.model: 100% 493k/493k [00:00<00:00, 198MB/s]
added_tokens.json: 100% 90.0/90.0 [00:00<00:00, 711kB/s]
special_tokens_map.json: 100% 101/101 [00:00<00:00, 948kB/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


To find the derivative of the function f(x) = x^3 at the point where x equals 2, we will use the power rule of differentiation. The power rule states that if a function is in the form f(x) = x^n, then the derivative of the function is f'(x) = n * x^(n-1).

In this case, n = 3, so the derivative of f(x) = x^3 is f'(x) = 3 * x^2.

Now, we need to evaluate the derivative at x = 2:

f'(2) = 3 * (2)^2 = 3 * 4 = 12

So, the derivative of f(x) = x^3 at the point where x equals 2 is f'(2) = 12.

Interpreting the result within the context of rate of change and tangent slope:

The derivative of a function represents the rate of change of the function with respect to the independent variable. In this case, the rate of change of f(x) = x^3 with respect to x at x = 2 is 12.

The tangent slope at the point (2, f(2)) is also equal to the derivative f'(2) = 12. This means that the tangent line to the curve y = x^3 at the point (2, 8) has a slope of 12.

In conclusion, the derivative of f(x) = x^3 at the point where x equals 2 is f'(2) = 12, which represents the rate of change of the function and the slope of the tangent line at that point.
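
(For anyone skimming the log above: Octopus-V4 first emits a functional token such as <nexa_4> together with a reformatted query, and specialized_infer.py then maps that token to an expert model. Below is a hypothetical parsing sketch of that step; the repo's utils.extract_content and functional_token_mapping are the authoritative versions, and their real signatures may differ.)

# Hypothetical illustration of parsing the router output shown above;
# see utils.py in the repo for the actual implementation.
raw = "<nexa_4> ('Determine the derivative of the function f(x) = x^3 at x = 2.')<nexa_end>"

functional_token = raw.split(" ", 1)[0]            # -> "<nexa_4>"
query = raw[raw.find("('") + 2 : raw.rfind("')")]  # -> the reformatted question

print(functional_token)  # routed to the math expert in the log above
print(query)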

Closing the issue as resolved.
