
Why only the official Llama model from Meta? #10

Closed
namin opened this issue Dec 5, 2023 · 3 comments

Comments

@namin
Contributor

namin commented Dec 5, 2023

I am trying to use
https://github.com/chaoyi-wu/PMC-LLaMA
https://huggingface.co/axiong/PMC_LLaMA_13B
instead of the official Llama, and the repo doesn't let me.

Is there a reason why only the official Llama model from Meta is allowed?

Thanks.

@karthiksoman
Collaborator

karthiksoman commented Dec 6, 2023

I see! Can you please share the exception raised while trying this Llama model?

The use of the official Llama model was motivated by the aim of conducting a comparative analysis with GPT models. I thought the performance comparison would only be fair if we used the official Llama model rather than a quantized version of it.

@namin
Contributor Author

namin commented Dec 6, 2023

$ python -m kg_rag.run_setup

Starting to set up KG-RAG ...

Did you update the config.yaml file with all necessary configurations (such as GPT .env path, vectorDB file paths, other file paths)? Enter Y or N: Y

Checking disease vectorDB ...
vectorDB already exists!

Do you want to install Llama model? Enter Y or N: Y
Did you update the config.yaml file with proper configuration for downloading Llama model? Enter Y or N: Y
Are you using official Llama model from Meta? Enter Y or N: Y
Did you get access to use the model? Enter Y or N: Y
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. If you see this, DO NOT PANIC! This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Model is not downloaded! Make sure the above mentioned conditions are satisfied
Congratulations! Setup is completed.

and

$ python -m kg_rag.rag_based_generation.Llama.text_generation interactive
...
Press enter for Step 5 - LLM prompting
Prompting  llama
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. If you see this, DO NOT PANIC! This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Traceback (most recent call last):
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/scratch/namin/KG_RAG/kg_rag/rag_based_generation/Llama/text_generation.py", line 52, in <module>
    main()
  File "/scratch/namin/KG_RAG/kg_rag/rag_based_generation/Llama/text_generation.py", line 43, in main
    interactive(question, vectorstore, node_context_df, embedding_function_for_context_retrieval, "llama")
  File "/scratch/namin/KG_RAG/kg_rag/utility.py", line 344, in interactive
    llm = llama_model(config_data["LLAMA_MODEL_NAME"], config_data["LLAMA_MODEL_BRANCH"], config_data["LLM_CACHE_DIR"], stream=True) 
  File "/scratch/namin/KG_RAG/kg_rag/utility.py", line 133, in llama_model
    tokenizer = AutoTokenizer.from_pretrained(model_name,
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 736, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2017, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 128, in __init__
    self.update_post_processor()
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 141, in update_post_processor
    bos_token_id = self.bos_token_id
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1141, in bos_token_id
    return self.convert_tokens_to_ids(self.bos_token)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 277, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 284, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1160, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 277, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 284, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1160, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 277, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 284, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
...
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1160, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 277, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 284, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1160, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/namin/mambaforge/envs/kg_rag2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1040, in unk_token
    return str(self._unk_token)
RecursionError: maximum recursion depth exceeded while getting the str of an object

@karthiksoman
Collaborator

karthiksoman commented Dec 6, 2023

The issue was with the tokenizer used to download the model. Apparently, PMC_Llama requires 'LlamaTokenizer', whereas 'AutoTokenizer' was used for the Meta Llama model. In addition, PMC_Llama requires an additional 'legacy' flag to be configured.
I have made those changes, and the model downloaded successfully on my server.
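To make the difference concrete, here is a minimal sketch of the two loading paths (the function is illustrative, not the exact utility.py code, and the method names mirror the -m flag described below):

from transformers import AutoTokenizer, LlamaTokenizer

def load_tokenizer(model_name, method="method-1"):
    if method == "method-1":
        # official Meta Llama: AutoTokenizer resolves the tokenizer class itself
        return AutoTokenizer.from_pretrained(model_name)
    # method-2 (e.g. PMC_Llama): use the slow LlamaTokenizer, which avoids the
    # fast-tokenizer recursion shown in the traceback above, and set the legacy
    # flag explicitly (True is an assumed value; see transformers PR #24565)
    return LlamaTokenizer.from_pretrained(model_name, legacy=True)

Please note the following: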

  1. run_setup.py is updated now. You can run it, and it should download PMC_Llama (just follow the prompts that appear interactively).

  2. Even though it downloads PMC_Llama, for me it worked only in the prompt-based mode. In the RAG-based mode, no exceptions are raised, but it just printed back the question and the context without generating any response. I presume that, unlike vanilla Llama, this model may require some specific way of prompting (this is a guess); I haven't explored PMC_Llama much using KG-RAG. Please let me know how it goes for you.

  3. I have now changed the command-line arguments to make running KG-RAG more flexible.
    For interactive mode: -i True (if you don't pass this, it defaults to non-interactive)
    For choosing the GPT model: -g gpt-4 (if you don't pass this, it defaults to gpt-35-turbo)
    For running PMC_Llama: -m method-2 (if you don't pass this, it defaults to 'method-1', which uses 'AutoTokenizer')
    So, for example, if you want to run gpt-4 in interactive mode:

python -m kg_rag.rag_based_generation.GPT.text_generation -i True -g gpt-4
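And, assuming the Llama entry point accepts the same flags, running PMC_Llama in interactive mode would presumably be:

python -m kg_rag.rag_based_generation.Llama.text_generation -i True -m method-2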

All video demos in the README are updated based on these changes, and I have also cut a new release of KG-RAG to reflect them.

I am closing this issue, since this should address the download of PMC_Llama. Feel free to re-open it if you hit any walls.
