
Is it possible to fine-tune a model to extend the token limit length? #93

Closed
nai-kon opened this issue Jun 20, 2023 · 3 comments

Comments

@nai-kon

nai-kon commented Jun 20, 2023

Currently, most open LLM models have a token limit of 2048, and I need to extend it to 4096 or more.
Is it possible to extend the token limit by fine-tuning with LoRA on texts longer than 2048 tokens?
Or does it require re-training from scratch?
I searched for information on this but could not find any.
Since LoRA freezes the original model weights and only adds trainable layers, I suspect it may be difficult.
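
For illustration, a minimal sketch of the concern above (the `gpt2` checkpoint, the `transformers`/`peft` libraries, and the `c_attn` target module are assumptions, not from this thread): under a default LoRA setup the base weights, including the learned positional embedding table, stay frozen and only the adapter matrices are trainable.

```python
# Hypothetical illustration: default LoRA freezes all base weights,
# including the learned positional embedding table (wpe in GPT-2).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # assumed example checkpoint
peft_model = get_peft_model(
    base,
    LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], task_type="CAUSAL_LM"),
)

# The positional embedding is frozen, so simply feeding longer texts
# cannot by itself teach the model new position slots.
print(peft_model.base_model.model.transformer.wpe.weight.requires_grad)  # False
peft_model.print_trainable_parameters()
```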

@edwardjhu
Collaborator

It might work if you also retrain the positional embedding. I'd love to know if it works!
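
As a rough sketch of that suggestion (assumptions: a GPT-2-style model with learned positional embeddings, the `transformers`/`peft` libraries, and a naive initialization for the new position rows): extend the positional embedding table to the new maximum length and keep it trainable alongside the LoRA adapters via `modules_to_save`.

```python
# Sketch only: extend a learned positional embedding table from 1024 to 4096
# positions and train it together with LoRA adapters. The module names
# (transformer.wpe, c_attn) are GPT-2-specific assumptions.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")

old_wpe = model.transformer.wpe.weight.data            # shape (1024, hidden)
new_max_pos = 4096
new_wpe = torch.nn.Embedding(new_max_pos, old_wpe.size(1))
with torch.no_grad():
    new_wpe.weight[: old_wpe.size(0)] = old_wpe
    # Naive initialization for the extra positions: tile the existing rows.
    new_wpe.weight[old_wpe.size(0):] = old_wpe.repeat(3, 1)
model.transformer.wpe = new_wpe
model.config.n_positions = new_max_pos

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],   # attention projection in GPT-2
    modules_to_save=["wpe"],     # keep the positional embedding fully trainable
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```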

@byamasu-patrick

@nai-kon Any update on this? Did you manage to achieve this token expansion?

@nai-kon
Author

nai-kon commented Oct 25, 2023

Thank you for the advice.
In the end, I did not try this token expansion, because Llama 2 supports longer contexts, and methods have since appeared that further extend the context length by changing the RoPE frequency.
So I am closing this issue.
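
For reference, a minimal sketch of the RoPE frequency scaling mentioned above, using the `rope_scaling` option in Hugging Face `transformers` (the library, the model id, and the scaling factor are assumptions; exact config keys depend on the `transformers` version).

```python
# Sketch: extend a RoPE-based model's usable context by rescaling the rotary
# position frequencies, without retraining. Model id and factor are placeholders.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"  # assumed example checkpoint
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {"type": "linear", "factor": 2.0}  # roughly 4096 -> 8192 positions
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
```

Dynamic NTK scaling (`{"type": "dynamic", ...}`) is another commonly used variant of the same idea.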

@nai-kon nai-kon closed this as completed Oct 25, 2023