experts based on Phi 3 model with PEFT library

Hello,

Thank you for your interesting research work.

I have trained 10 experts based on the Phi 3 model (the datasets were selected using the clustering described in the paper). I used the TRL and PEFT libraries for training, so the checkpoints follow the structure those libraries expect.

I trained the experts with LoRA on a 4-bit quantized base model. The LoRA adapters were applied to the attention output (o) and fused query/key/value (qkv) projections in each layer.
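A setup like the one described above might be configured roughly as follows. This is only a minimal sketch: the rank, alpha, dropout, and NF4 quantization settings are placeholder assumptions rather than values from my training runs, and `o_proj` / `qkv_proj` are assumed to be the module names of the attention projections in the Hugging Face Phi 3 implementation.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization settings for the frozen base model.
# NF4 and bfloat16 compute are assumptions, not confirmed settings.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA on the attention output and fused qkv projections of every layer.
# r, lora_alpha and lora_dropout are placeholder values.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["o_proj", "qkv_proj"],
    task_type="CAUSAL_LM",
)
```

Each of the 10 expert checkpoints would then hold one pair of LoRA matrices per targeted module per layer, stored in full precision alongside the quantized base weights.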

I would like to know how I can use your code to execute Arrow for merging these experts for each token in every model layer.
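To make the question concrete, here is how I understand per-token Arrow routing for a single LoRA-adapted linear layer, written as a small NumPy sketch. It assumes that each expert's prototype is the top right singular vector of its LoRA update `B @ A`, that a token's score for an expert is the absolute dot product with that prototype, and that the top-k scores are turned into weights with a softmax. The function names, shapes, and `top_k=2` are hypothetical and are not taken from this repository, and the LoRA scaling factor (`lora_alpha / r`) is omitted for brevity.

```python
import numpy as np

def arrow_prototypes(experts):
    """One unit-norm prototype per expert: top right singular vector of B @ A."""
    protos = []
    for A, B in experts:  # A: (r, d_in), B: (d_out, r)
        _, _, vt = np.linalg.svd(B @ A, full_matrices=False)
        protos.append(vt[0])
    return np.stack(protos)  # (n_experts, d_in)

def arrow_route(h, protos, top_k=2):
    """h: (n_tokens, d_in). Returns (n_tokens, n_experts) weights, zero outside the top-k."""
    logits = np.abs(h @ protos.T)  # sign of a singular vector is arbitrary
    idx = np.argsort(-logits, axis=1)[:, :top_k]
    masked = np.full_like(logits, -np.inf)
    np.put_along_axis(masked, idx, np.take_along_axis(logits, idx, axis=1), axis=1)
    masked -= masked.max(axis=1, keepdims=True)  # numerically stable softmax
    w = np.exp(masked)
    return w / w.sum(axis=1, keepdims=True)

def arrow_lora_delta(h, experts, weights):
    """Per-token weighted sum of expert LoRA outputs: sum_i w_i * B_i A_i h."""
    out = np.zeros((h.shape[0], experts[0][1].shape[0]))
    for i, (A, B) in enumerate(experts):
        out += weights[:, i:i + 1] * (h @ A.T @ B.T)
    return out

# Toy example: 10 random experts on a 16-dimensional layer, 5 tokens.
rng = np.random.default_rng(0)
d_in, d_out, r, n_experts = 16, 16, 4, 10
experts = [(rng.normal(size=(r, d_in)), rng.normal(size=(d_out, r)))
           for _ in range(n_experts)]
h = rng.normal(size=(5, d_in))

protos = arrow_prototypes(experts)
weights = arrow_route(h, protos, top_k=2)
delta = arrow_lora_delta(h, experts, weights)  # added to the frozen layer's output
```

In a full model this would be repeated for every adapted module (here the o and qkv projections) in every layer, with prototypes computed once per module from the 10 expert checkpoints.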

I am currently running into errors when I try to run the code.

Could you please explain the process step by step? I am a beginner in this field.

Thank you, and I would appreciate your response.

experts based on Phi 3 model with PEFT library #124
