
Is it possible to fine-tune a model to extend the token limit length? #93

Closed
nai-kon opened this issue Jun 20, 2023 · 3 comments

Comments

@nai-kon

nai-kon commented Jun 20, 2023

Currently, most open LLM models have a token limit of 2048, and I need to extend it to 4096 or more.
Is it possible to extend the token limit by fine-tuning with LoRA on texts longer than 2048 tokens?
Or does it require re-training from scratch?
I searched for information on this but could not find any.
Since LoRA freezes the original model weights and only adds trainable layers, I suspect it may be difficult.
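
For illustration, a minimal sketch of the concern above (the `gpt2` checkpoint, the `transformers`/`peft` libraries, and the `c_attn` target module are assumptions, not from this thread): under a default LoRA setup the base weights, including the learned positional embedding table, stay frozen and only the adapter matrices are trainable.

```python
# Hypothetical illustration: default LoRA freezes all base weights,
# including the learned positional embedding table (wpe in GPT-2).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # assumed example checkpoint
peft_model = get_peft_model(
    base,
    LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], task_type="CAUSAL_LM"),
)

# The positional embedding is frozen, so simply feeding longer texts
# cannot by itself teach the model new position slots.
print(peft_model.base_model.model.transformer.wpe.weight.requires_grad)  # False
peft_model.print_trainable_parameters()
```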

@edwardjhu
Collaborator

It might work if you also retrain the positional embedding. I'd love to know if it works!
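
As a rough sketch of that suggestion (assumptions: a GPT-2-style model with learned positional embeddings, the `transformers`/`peft` libraries, and a naive initialization for the new position rows): extend the positional embedding table to the new maximum length and keep it trainable alongside the LoRA adapters via `modules_to_save`.

```python
# Sketch only: extend a learned positional embedding table from 1024 to 4096
# positions and train it together with LoRA adapters. The module names
# (transformer.wpe, c_attn) are GPT-2-specific assumptions.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")

old_wpe = model.transformer.wpe.weight.data            # shape (1024, hidden)
new_max_pos = 4096
new_wpe = torch.nn.Embedding(new_max_pos, old_wpe.size(1))
with torch.no_grad():
    new_wpe.weight[: old_wpe.size(0)] = old_wpe
    # Naive initialization for the extra positions: tile the existing rows.
    new_wpe.weight[old_wpe.size(0):] = old_wpe.repeat(3, 1)
model.transformer.wpe = new_wpe
model.config.n_positions = new_max_pos

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],   # attention projection in GPT-2
    modules_to_save=["wpe"],     # keep the positional embedding fully trainable
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```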

@byamasu-patrick

@nai-kon Any update on this? Did you manage to achieve this token expansion?

@nai-kon
Author

nai-kon commented Oct 25, 2023

Thank you for the advice.
In the end, I did not try this token expansion, because Llama 2 supports longer contexts, and methods have since appeared that further extend the context length by changing the RoPE frequency.
So I am closing this issue.
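
For reference, a minimal sketch of the RoPE frequency scaling mentioned above, using the `rope_scaling` option in Hugging Face `transformers` (the library, the model id, and the scaling factor are assumptions; exact config keys depend on the `transformers` version).

```python
# Sketch: extend a RoPE-based model's usable context by rescaling the rotary
# position frequencies, without retraining. Model id and factor are placeholders.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"  # assumed example checkpoint
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {"type": "linear", "factor": 2.0}  # roughly 4096 -> 8192 positions
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
```

Dynamic NTK scaling (`{"type": "dynamic", ...}`) is another commonly used variant of the same idea.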

@nai-kon nai-kon closed this as completed Oct 25, 2023