
Is Alpaca 13B and 30B tested? #12

Closed
imranraad07 opened this issue Mar 23, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@imranraad07

I tried running it by setting the path from Hugging Face; it worked for the 7B version but didn't work for the 13B and 30B versions.

@PotatoSpudowski
Owner

I will try and get it integrated tonight ;)

PotatoSpudowski added the enhancement New feature or request label Mar 24, 2023
@PotatoSpudowski
Owner

PotatoSpudowski commented Mar 25, 2023

Hi,
I was able to get the 30B param model working.
13B should work fine too, and 65B as well (if someone releases it xD)

You can look at this branch
https://github.com/PotatoSpudowski/fastLLaMa/tree/alpaca-lora

You will have to follow the build steps and convert the model again.

The issue with LoRA models is their embedding size. Based on how the LoRA method works (it creates low-rank decomposition matrices and freezes the pretrained weights), I suspect that is why we have different embedding sizes compared to non-LoRA models.
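
For intuition, here is a minimal sketch of the LoRA idea (illustrative only, not fastLLaMa code): the pretrained weight W stays frozen and only the two low-rank factors are trained, so the effective weight is W plus a low-rank update.

```python
import numpy as np

# Minimal LoRA sketch (illustrative only, not fastLLaMa code).
d, k, r = 4096, 4096, 8           # layer dims and low rank r << min(d, k)
W = np.random.randn(d, k)         # pretrained weight, frozen during fine-tuning
A = np.random.randn(r, k) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))              # trainable, zero-init so the update starts at 0

W_eff = W + B @ A                 # effective weight used at inference
print(W_eff.shape)                # (4096, 4096) -- same shape as W
```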

Will need to sort out a few things before merging to main but feel free to use this and let me know if you face any issues :)

@PotatoSpudowski
Owner

Merged to main.

Structure of fastLlama.Model() is updated. Please change accordingly!
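
For reference, initialisation would look roughly like this after the change (a minimal sketch; the argument names here are assumptions, so check the example files in the repo for the exact signature):

```python
import fastLlama

# Rough sketch -- argument names and values are assumptions, see the repo's example files.
MODEL_PATH = "./models/ALPACA-LORA-7B/ggml-model-q4_0.bin"

model = fastLlama.Model(
    id="ALPACA-LORA-7B",  # ModelIdentifier: picks the backend config (n_parts, vocab size, ...)
    path=MODEL_PATH,      # path to the converted ggml weights
    num_threads=8,        # CPU threads to use
)
```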

@robin-coac

Hi @PotatoSpudowski. I was curious how Alpaca models are handled differently. For example, llama.cpp requires Alpaca models to have the n_parts and ins flags. Are those things accounted for?
My C/C++ skills are not good enough to navigate your code.

@PotatoSpudowski
Owner

Yup, that's why we require users to specify the ModelIdentifier when initialising the model.
Based on the identifier, we choose the config from the backend (which tells us about parts, vocab size, etc.). It is an underrated feature of fastLLaMa which imo is the right way to go about it.
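
Conceptually, it is just a lookup like the one below (a sketch with made-up names and values, not the actual backend code):

```python
# Illustrative ModelIdentifier -> config lookup; names and values are made up for this sketch.
MODEL_CONFIGS = {
    "LLAMA-7B":       {"n_parts": 1, "vocab_size": 32000},
    "LLAMA-13B":      {"n_parts": 2, "vocab_size": 32000},
    "ALPACA-LORA-7B": {"n_parts": 1, "vocab_size": 32001},  # LoRA checkpoints can differ in embedding size
}

def get_config(model_id: str) -> dict:
    """Pick the backend config from the ModelIdentifier instead of per-run CLI flags."""
    return MODEL_CONFIGS[model_id]
```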

The ins flag, if I am not wrong, is supposed to specify that it is running in instruction mode, right? Either way, we have example files for Alpaca and LLaMA models which show how to use them for either text completion or QnA tasks.
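
In practice that just means wrapping the prompt in the Alpaca instruction template before feeding it to the model, roughly like this (the ingest/generate calls are assumptions on my part; the Alpaca example file shows the exact API):

```python
# Alpaca-style instruction prompt; the model calls below are assumptions,
# see the Alpaca example file in the repo for the exact API.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(instruction="Summarise what the n_parts setting controls.")
model.ingest(prompt)                                    # feed the prompt into the context
model.generate(num_tokens=200,                          # then sample the response
               streaming_fn=lambda t: print(t, end="", flush=True))
```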

Finally, we are also working on redesigning our save and load feature and optimising it for latency and size in the feature/save_load branch. Extremely GOATED implementation!

Developers should be allowed to implement their own workflows using features built with first-principles thinking, rather than us deciding workflows for them. Will document everything extensively so it is easier for everyone!!!
