bge-m3 dense retrieval fine-tuning #839

@sevenandseven

Hello, I followed the fine-tuning documentation and used the following command for dense retrieval fine-tuning:

CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node 1 -m FlagEmbedding.baai_general_embedding.finetune.run \
  --output_dir ./results/bge-m3 \
  --model_name_or_path /Embedding/bge-m3 \
  --train_data new_neg30.jsonl \
  --learning_rate 1e-5 \
  --fp16 \
  --num_train_epochs 3 \
  --per_device_train_batch_size 4 \
  --gradient_accumulation_steps 2 \
  --dataloader_drop_last True \
  --normlized True \
  --temperature 0.02 \
  --query_max_len 64 \
  --passage_max_len 256 \
  --train_group_size 3 \
  --negatives_cross_device

After fine-tuning finishes, I use the following code to merge the fine-tuned model with the base model:

model = mix_models(
    model_names_or_paths=["/Embedding/Embedding/bge-m3", "./results/bge-m3"],
    model_type='encoder',
    weights=[0.5, 0.5],  # you can change the weights to get a better trade-off.
    output_path='/results/dense_merge')

However, I run into the following error. How can I resolve it? Are my steps correct?

from LM_Cocktail import mix_models, mix_models_with_data
ImportError: cannot import name 'mix_models' from 'LM_Cocktail' (unknown location)
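
A possible way to narrow this down: the "(unknown location)" part of the message usually means Python resolved `LM_Cocktail` to something without a file of its own, for example a bare directory named `LM_Cocktail` (with no `__init__.py`) in the working directory, which then shadows the installed package. The sketch below uses only the standard library to report how the name resolves. The helper name `describe_package` is hypothetical, and the diagnosis is an assumption about the cause, not a confirmed one.

```python
import importlib.util


def describe_package(name):
    """Report how a top-level package name resolves on the current sys.path."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        spec = None
    if spec is None:
        return "not installed"
    if spec.origin in (None, "namespace"):
        # No file backs the module: Python found a bare directory
        # (a namespace package), which exposes no functions of its own.
        locations = list(spec.submodule_search_locations or [])
        return "namespace directory: " + ", ".join(locations)
    return "regular module at " + spec.origin


# Run this from the same working directory as the failing script.
print(describe_package("LM_Cocktail"))
```

If this reports a namespace directory, running the script from a different working directory, or installing the package itself (for example with `pip install -U LM_Cocktail`), may resolve the import. If it reports "not installed", the package is missing from the active environment.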
