Which models correspond here #9

Closed
iamdrivenbymywonderfullife opened this issue May 23, 2024 · 1 comment

Comments

@iamdrivenbymywonderfullife

I'm a computer nerd, but very interested in large language models.
`python3 -m fastchat.model.apply_delta \
--base-model-path /path/to/hf-model/llama-7b \
--target-model-path /path/to/hf-model/character-llm-beethoven-7b \
--delta-path fnlp/character-llm-beethoven-7b-wdiff`

character-llm-beethoven-7b-wdiff is the model that can be downloaded here (it contains a large pytorch_model file).

But what do the models llama-7b and character-llm-beethoven-7b correspond to, and where do they come from?
Could you be more specific?

Image 1

@choosewhatulike
Owner

Hi, the model was trained from Llama 1, and we release the weight differences (the delta of the weights) rather than the actual weights. To recover the weights for training and inference, first download the Llama 1 base model (e.g. https://huggingface.co/luodian/llama-7b-hf), then run the apply_delta script to recover the model. The arguments are as follows.

--base-model-path (the directory where you saved the downloaded Llama 1 base model)
--target-model-path (the output directory for the recovered model)
--delta-path (the character-llm wdiff directory, e.g. fnlp/character-llm-beethoven-7b-wdiff, which you can download from our repo)
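
To make the relationship between the three paths concrete, here is a minimal conceptual sketch. It assumes the released delta is a plain per-parameter difference (target minus base), so the recovered model is base plus delta. Plain Python lists stand in for weight tensors, and the parameter name and the function below are made up for illustration; this is not the actual fastchat code.

```python
# Conceptual sketch only: lists stand in for weight tensors, and the
# parameter name "layer.weight" is hypothetical.

def apply_delta(base_weights, delta_weights):
    """Recover target weights as base + delta, parameter by parameter."""
    target = {}
    for name, delta in delta_weights.items():
        if name not in base_weights:
            raise KeyError(f"base model is missing parameter {name!r}")
        base = base_weights[name]
        if len(base) != len(delta):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Elementwise addition restores the fine-tuned weight.
        target[name] = [b + d for b, d in zip(base, delta)]
    return target


base = {"layer.weight": [1.0, 2.0, 3.0]}      # role of --base-model-path
delta = {"layer.weight": [0.5, -1.0, 0.25]}   # role of --delta-path
recovered = apply_delta(base, delta)          # role of --target-model-path
print(recovered)
```

So llama-7b is the input you download separately, the wdiff directory is what this repo releases, and character-llm-beethoven-7b is the output that exists only after the script has run.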

PS: We use Llama 1 because, at that time, there were not many choices of good open-source LLMs for fine-tuning. You can always switch to a more powerful LLM (e.g. Llama 3) to train a better character-llm.
