Hello, I saw that the fine-tuning documentation uses the following command for dense retrieval:
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node 1 -m FlagEmbedding.baai_general_embedding.finetune.run \
--output_dir ./results/bge-m3 \
--model_name_or_path /Embedding/bge-m3 \
--train_data new_neg30.jsonl \
--learning_rate 1e-5 \
--fp16 \
--num_train_epochs 3 \
--per_device_train_batch_size 4 \
--gradient_accumulation_steps 2 \
--dataloader_drop_last True \
--normlized True \
--temperature 0.02 \
--query_max_len 64 \
--passage_max_len 256 \
--train_group_size 3 \
--negatives_cross_device
After fine-tuning finishes, I use the following code to merge the base model with the fine-tuned one:
model = mix_models(
    model_names_or_paths=["/Embedding/Embedding/bge-m3", "./results/bge-m3"],
    model_type='encoder',
    weights=[0.5, 0.5],  # you can change the weights to get a better trade-off
    output_path='/results/dense_merge',
)
However, I encounter the following error. How can I resolve it? Are my steps correct?
from LM_Cocktail import mix_models, mix_models_with_data
ImportError: cannot import name 'mix_models' from 'LM_Cocktail' (unknown location)
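One thing that might help narrow this down: the "(unknown location)" part of the message usually means Python resolved LM_Cocktail to something with no file behind it, for example a plain directory named LM_Cocktail (such as a cloned repository folder) in the working directory, instead of the installed package. This is only a guess about the cause, not a confirmed diagnosis. A minimal diagnostic sketch follows; the helper name describe_package is made up for illustration and is not part of LM_Cocktail.

```python
import importlib.util


def describe_package(name):
    """Report how Python resolves a top-level package name.

    Returns a (status, paths) tuple where status is one of
    "not installed", "namespace package", or "regular".
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing importable under this name on sys.path.
        return ("not installed", [])
    if spec.origin in (None, "namespace"):
        # No __init__.py was found: Python is treating a bare directory
        # as a namespace package, so names such as mix_models are absent.
        return ("namespace package", list(spec.submodule_search_locations or []))
    # A real package or module; origin points at the file that gets loaded.
    return ("regular", [spec.origin])


# Run this from the same directory where the import fails.
print(describe_package("LM_Cocktail"))
```

If the status comes back as "namespace package", the listed path shows which directory is shadowing the real package; running the script from a different directory, or installing the package properly into the environment, would be the things to try next.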