
RecursionError: maximum recursion depth exceeded #442

Closed
philwee opened this issue Apr 27, 2023 · 10 comments
Labels: bug (Something isn't working.)

@philwee
Contributor

philwee commented Apr 27, 2023

.local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:1142 in unk_token_id

  1139 │        """
  1140 │        if self._unk_token is None:
  1141 │            return None
❱ 1142 │        return self.convert_tokens_to_ids(self.unk_token)
  1143 │
  1144 │    @property
  1145 │    def sep_token_id(self) -> Optional[int]:
RecursionError: maximum recursion depth exceeded

Weird bug that happens when using hf-causal-experimental with a model and a PEFT adapter.

@haileyschoelkopf
Collaborator

Thanks for raising this! I have a suspicion this might be fixable by setting the environment variable TOKENIZERS_PARALLELISM=false.

Do you have a model + task combination / command that can replicate this consistently?
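For reference, a minimal sketch of that workaround (an assumption about placement: the variable has to be set before tokenizers is ever imported, e.g. at the very top of main.py, or exported in the shell before launching):

import os

# Disable the Rust tokenizers' internal thread pool before transformers is imported,
# so forked worker processes don't contend with it.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

import transformers  # this import must come after the environment variable is set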

@philwee
Contributor Author

philwee commented Apr 28, 2023

Yup, I was using the gpt4all-lora adapter on llama-7b, running arc_easy, arc_challenge (acc), piqa (acc), sciq, mnli and truthful_qa_mc:

python main.py --model hf-causal-experimental --model_args pretrained=decapoda-research/llama-7b-hf,peft=nomic-ai/gpt4all-lora --tasks piqa,wikitext,mnli,arc_easy,arc_challenge,openbookqa,truthfulqa_mc,sciq --device cuda:0

If relevant: it ran perfectly fine, but when it came time for the results to show up, it crashed with the error message in the issue.

@haileyschoelkopf
Collaborator

Thanks, I’ll give this a try in a minute!

This does sound like an interaction between the multiprocessing used for bootstrap stderr estimation and tokenizers in this case.
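A hedged illustration of that suspected interaction, not the harness's actual code: once a fast tokenizer has been used in the parent process, forking worker pools (roughly what bootstrap resampling of per-example metrics does) can trip the tokenizers library's fork handling, which is what TOKENIZERS_PARALLELISM=false silences.

import multiprocessing as mp
from transformers import AutoTokenizer

# Warm up a fast tokenizer in the parent process (the model name is just a stand-in).
tokenizer = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
tokenizer("warm up in the parent process")

def score(_seed: int) -> int:
    # Forked workers inherit the already-initialized tokenizer state.
    return len(tokenizer("a bootstrap resample")["input_ids"])

if __name__ == "__main__":
    # Assumes the default 'fork' start method on Linux.
    with mp.Pool(4) as pool:
        print(pool.map(score, range(8)))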

@haileyschoelkopf
Collaborator

  File "/home/hailey/anaconda3/envs/new-harness/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 699, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.

Since it seems you've been able to get this running--what's the recommended fix for this LLaMA upload? @philwee

@philwee
Contributor Author

philwee commented Apr 28, 2023

You can change the transformers package you're on via one of these two:

pip install git+https://github.com/mbehm/transformers (old one where it worked)
pip install git+https://github.com/huggingface/transformers (I think they fixed it at some point, but I'm not sure how it's going right now)
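After either install, a quick sanity check that the installed build actually exposes the renamed class (a minimal sketch; LlamaTokenizer assumes a sufficiently recent transformers, roughly 4.28+):

import transformers
print(transformers.__version__)

# Older builds only know the misspelled LLaMATokenizer (or nothing at all);
# if this import fails, the install didn't pick up LLaMA support.
from transformers import LlamaTokenizer
print(LlamaTokenizer)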

Please let me know if this helps! :)

@StellaAthena
Member

Closing this issue as it seems to be a bug in the HF library that has now been fixed. Anyone encountering this issue should make sure they’ve updated to the latest version of transformers before reporting the bug.

@StellaAthena added the bug label Apr 30, 2023
@haileyschoelkopf
Collaborator

Do you have a link for the bug in transformers that raises + fixes this? The bug being raised in this issue is not the LLamaTokenizer class name error, if that's what you're referring to.

@StellaAthena reopened this May 1, 2023
@StellaAthena
Member

Do you have a link for the bug in transformers that raises + fixes this? The bug being raised in this issue is not the LLamaTokenizer class name error, if that's what you're referring to.

Oh you’re right, I misread the end of the convo. The issue you’re having is that it’s LlamaTokenizer now, not LLaMATokenizer.
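For anyone stuck on an older LLaMA upload that still spells the class LLaMATokenizer, a hedged sketch of one common local workaround: patch the tokenizer_class field in the checkpoint's tokenizer_config.json. The local path is a placeholder, and this assumes the upload stores the class name in that file.

import json
from pathlib import Path

# Placeholder path to a locally downloaded checkpoint.
config_path = Path("llama-7b-hf/tokenizer_config.json")

config = json.loads(config_path.read_text())
if config.get("tokenizer_class") == "LLaMATokenizer":
    # Recent transformers releases register the class as "LlamaTokenizer".
    config["tokenizer_class"] = "LlamaTokenizer"
    config_path.write_text(json.dumps(config, indent=2))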

@upunaprosk

upunaprosk commented May 15, 2023

Had the same issue with Llama models. The problem stems from tokenizer initialization.
Put exact bos, eos and unk tokens in your tokenizer config:
{"bos_token": "<s>", "eos_token": "</s>", ... "unk_token": "<unk>"}
and pass the tokenizer path explicitly:
python main.py --model hf-causal-experimental --model_args pretrained="hf-format-llama-7B",tokenizer="hf-format-llama-7B" --device cuda:0 --tasks crows_pairs_english
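A hedged sketch of the same idea from Python, overriding the special tokens at load time and checking that the property from the traceback now resolves instead of recursing (the checkpoint path is a placeholder):

from transformers import AutoTokenizer

# Placeholder local checkpoint; the special-token overrides are forwarded to the tokenizer's init.
tokenizer = AutoTokenizer.from_pretrained(
    "hf-format-llama-7B",
    bos_token="<s>",
    eos_token="</s>",
    unk_token="<unk>",
)

# unk_token_id is the property that recursed in the original traceback.
print(tokenizer.unk_token_id)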

@StellaAthena
Member

@upunaprosk if correcting the tokenizer solves the problem, it seems like this issue should be opened on the HF transformers repo instead of this one. We are loading the model the way we are told to, it’s just that the transformers library doesn’t know how to load the model.

@philwee @haileyschoelkopf if one of you can verify that this patch solves the problem, I’m happy to mark this as closed and open a corresponding issue on the transformers repo.
