Can anyone post an already-trained model? #52

Open
HCBlackFox opened this issue Mar 17, 2023 · 14 comments
@HCBlackFox

No description provided.

@collant

collant commented Mar 17, 2023

Hello, you can find this 13B one here: https://huggingface.co/samwit/alpaca13B-lora

Otherwise, there is the 7B one here: https://huggingface.co/tloen/alpaca-lora-7b

Please note these are LoRA models; they need the base model to work.

And here is the base model for the 7B: https://huggingface.co/decapoda-research/llama-7b-hf
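The two-step load (base model first, then the LoRA adapter on top) can be sketched roughly like this, using the model IDs from the links above; this assumes `transformers` and `peft` are installed and enough VRAM/disk for the 7B weights:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

# Load the base LLaMA 7B weights first.
base = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")

# Then apply the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(base, "tloen/alpaca-lora-7b")
```

The adapter repo only contains the low-rank delta weights, which is why it is small and useless on its own.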

@HCBlackFox
Author

Thank you

@0xbitches

0xbitches commented Mar 18, 2023

Is there a 30B 4-bit LoRA out there? I think I read somewhere that finetuning in 4-bit might not be supported?

@ttio2tech

Can the original LLaMA-7B weights (consolidated.00.pth) be used? Or can they be converted to hf?
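For reference, `transformers` ships a conversion script for the original Meta checkpoint format (consolidated.00.pth + params.json). A sketch of the invocation; the script path and flags may differ between `transformers` versions, and the input/output paths are placeholders:

```shell
# Convert the original Meta LLaMA checkpoint into the Hugging Face format.
# /path/to/LLaMA should contain the 7B/ directory and tokenizer.model.
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/LLaMA \
    --model_size 7B \
    --output_dir /path/to/llama-7b-hf
```

The resulting directory can then be passed to `from_pretrained` in place of `decapoda-research/llama-7b-hf`.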

@gururise
Contributor

Any links for models trained w/3-epochs on the new cleaned dataset?

@mattreid1

> Any links for models trained w/3-epochs on the new cleaned dataset?

I just finished training this 13B one but haven't gotten it to work yet (I'm using multiple GPUs, so maybe that's the issue): https://huggingface.co/mattreid/alpaca-lora-13b

@felri

felri commented Mar 18, 2023

@collant can you help me understand how I can load the LoRA model trained with the 52k dataset and use it to train on another data.json?

In finetune.py I can find where the LLaMA 7B model is loaded:

model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map=device_map,
)
tokenizer = LlamaTokenizer.from_pretrained(
    "decapoda-research/llama-7b-hf", add_eos_token=True
)

and afterwards the LoRA config object is created:

config = LoraConfig(
    r=LORA_R,
    lora_alpha=LORA_ALPHA,
    target_modules=TARGET_MODULES,
    lora_dropout=LORA_DROPOUT,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)

Does loading the LoRA model from HF involve calling another function and loading that checkpoint? I can see there is a save_pretrained function; maybe I need to load the LoRA model via that? Sorry if this sounds confusing.

edit: after a bit more googling I found this load_attn_procs function; maybe it's something around here

edit2: it seems it was inside generate.py all along:

    model = LlamaForCausalLM.from_pretrained(
        "decapoda-research/llama-7b-hf",
        load_in_8bit=True,
        torch_dtype=torch.float16,
        device_map="auto",
    )
    model = PeftModel.from_pretrained(
        model, "tloen/alpaca-lora-7b",
        torch_dtype=torch.float16
    )

@aspctu

aspctu commented Mar 19, 2023

30B LoRa adapters here https://huggingface.co/baseten/alpaca-30b

@T-Atlas
Contributor

T-Atlas commented Mar 20, 2023

> @collant can you help me understand how I can load the LoRA model trained with the 52k dataset and use it to train on another data.json? [...]

Have you found a solution? I found #44, which may help, but I'm still confused about what it is.

@diegolondrina

> Any links for models trained w/3-epochs on the new cleaned dataset?

+1

@wafflecomposite

Please report @larasatistevany for spamming.

https://support.github.com/contact/report-abuse?category=report-abuse&report=larasatistevany

-> I want to report abusive content or behavior.
-> I want to report SPAM, a user that is disrupting me or my organization's experience on GitHub, or a user who is using my personal information without my permission
-> A user is disrupting me or my organization's experience and productivity by posting SPAM off-topic or other types of disruptive content in projects they do not own.

Put this in the form:

spamming in issue comments
https://github.com/tloen/alpaca-lora/issues/52#issuecomment-1570561693
https://github.com/tloen/alpaca-lora/issues/52#issuecomment-1571059071

Thanks!

12 participants