codellama support? #328

Closed
johndpope opened this issue Jul 24, 2023 · 7 comments

Comments

@johndpope

Is this possible, or does training have to be redone, or something else?

@smellslikeml

It does! We've tested on a custom fine-tuned Llama-2-7B model (remyxai/ffmperative-7b, hosted on Hugging Face). Both LLaMA 1 and Llama 2 models use the same HF code.

guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")
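For reference, a minimal sketch of how that model is then used in an old-style (pre-0.1) guidance program; the instruction/response template and the ffmpeg task below are illustrative placeholders, not something from this thread:

import guidance

# The same class covers LLaMA 1 and Llama 2 checkpoints hosted on Hugging Face.
guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

# {{gen ...}} fills the named variable with the model's completion.
program = guidance("Instruction: {{instruction}}\nResponse: {{gen 'response' max_tokens=64}}")
out = program(instruction="Trim the first 10 seconds off input.mp4 with ffmpeg")
print(out["response"])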

@johndpope
Author

johndpope commented Aug 8, 2023

https://github.com/QuangBK/localLLM_guidance

@QuangBK - can you help get this working with Llama 2?

UPDATE - this fork by @fullstackwebdev is much better, though it needs Llama 2 updates.
(Screenshot from 2023-08-09 01-37-19; shows 2 checkpoints.)
https://github.com/fullstackwebdev/localLLM_guidance

Agent drop-down. (Screenshot from 2023-08-09 07-32-17)

UPDATE 2

I attempted to use @danikhan632's fork, but no dice.


        else:
            # tokenizer = transformers.LlamaTokenizer.from_pretrained(MODEL_PATH, use_fast=True, device_map="auto")
            # model = transformers.LlamaForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto")
            # Because LLaMA already has role start and end tokens, we don't need to pass role_start/role_end.
            # guidance.llm = guidance.llms.transformers.LLaMA(model=model, tokenizer=tokenizer)
            guidance.llm = guidance.llms.TGWUI("http://127.0.0.1:5000")

N.B. this didn't work.

    if model_string == "TheBloke_Llama-2-13B-chat-GGML":
        MODEL_PATH = '/media/2TB/text-generation-webui/models/TheBloke_Llama-2-13B-chat-GGML/llama-2-13b-chat.ggmlv3.q5_K_S.bin'
        CHECKPOINT_PATH = None

Attempting now to use the solution above. @smellslikeml - are there any video-workflow guidance programs / canned prompts you crafted that would make sense and that you can share?

            else:
                # tokenizer = transformers.LlamaTokenizer.from_pretrained(MODEL_PATH, use_fast=True, device_map="auto")
                # model = transformers.LlamaForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto")
                # Because LLaMA already has role start and end tokens, we don't need to pass role_start/role_end.
                # guidance.llm = guidance.llms.transformers.LLaMA(model=model, tokenizer=tokenizer)
                # guidance.llm = guidance.llms.TGWUI("http://127.0.0.1:5000")
                guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

UPDATE 3

Using this:
        guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

Getting this error:
    raise NotImplementedError("In order to use chat role tags you need to use a chat-specific subclass of Transformers for your LLM from guidance.transformers.*!")
NotImplementedError: In order to use chat role tags you need to use a chat-specific subclass of Transformers for your LLM from guidance.transformers.*!

Error in program:  In order to use chat role tags you need to use a chat-specific subclass of Transformers for your LLM from guidance.transformers.*!

pip list - https://gist.github.com/johndpope/2bc86b8b976a81e47f655267c4daf537
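The error is raised because the program uses chat role tags ({{#user}} ... {{/user}} etc.) while the plain transformers.LLaMA class only does raw completion; the role tags need a chat-specific subclass (the exact class name depends on the guidance version), or the template has to drop the role blocks. A minimal sketch of the role-free workaround, with a placeholder prompt:

import guidance

guidance.llm = guidance.llms.transformers.LLaMA("remyxai/ffmperative-7b", device_map="auto")

# A template like the following would raise the NotImplementedError above with this class:
# program = guidance("{{#user}}Describe {{clip}}{{/user}}{{#assistant}}{{gen 'desc'}}{{/assistant}}")

# A completion-style template with no role blocks works with the plain (non-chat) class.
program = guidance("Describe the video clip {{clip}} in one sentence.\nDescription: {{gen 'desc' max_tokens=48}}")
print(program(clip="holiday.mp4")["desc"])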

@danikhan632

Let me try updating; unsure if the repo is still active.

@VarunGumma

Any update on using Llama 2 chat models with guidance?

@danikhan632

Expect something maybe Friday.

@johndpope
Author

It looks like @iiis-ai has a working example with LLaMA 1; maybe it also works with Llama 2?

https://github.com/iiis-ai/cumulative-reasoning/blob/6c8632577699a8b3f8eee88671ee83c677fa4aea/AutoTNLI/autotnli-direct.py#L22

guidance.llm = guidance.llms.transformers.LLaMA(args.model, device_map="auto", token_healing=True, torch_dtype=torch.bfloat16)
https://github.com/yifanzhang-pro/cumulative-reasoning-anonymous/blob/07bcc6b21aedbee7c82f44b52aa3c0fc123e4d03/AutoTNLI/autotnli-cr.py#L27
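For context, that script wires the model name in through a CLI flag and then makes the constructor call linked above; roughly as follows (the argparse plumbing and the default model id are assumptions, only the constructor call is from the linked file):

import argparse
import torch
import guidance

parser = argparse.ArgumentParser()
parser.add_argument("--model", type=str, default="meta-llama/Llama-2-7b-hf")  # default is assumed
args = parser.parse_args()

# Constructor call as in the linked autotnli-direct.py: token healing on, bfloat16 weights.
guidance.llm = guidance.llms.transformers.LLaMA(args.model, device_map="auto", token_healing=True, torch_dtype=torch.bfloat16)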

@marcotcr
Collaborator

Llama 2 works fine in the new release, both with HF transformers and with llama.cpp. Please check it out.
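For anyone landing here later, a minimal sketch of loading Llama 2 with the post-0.1 guidance API; the model id and GGUF path are placeholders, and exact class/argument names may differ slightly between releases:

from guidance import models, gen

# Hugging Face transformers backend
llama2 = models.Transformers("meta-llama/Llama-2-7b-hf")

# Or a local llama.cpp GGUF file:
# llama2 = models.LlamaCpp("./llama-2-7b.Q5_K_M.gguf")

lm = llama2 + "Q: What does ffmpeg do?\nA: " + gen("answer", stop="\n", max_tokens=64)
print(lm["answer"])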
