bge-m3 dense retrieval fine-tuning

Hello, I saw that the fine-tuning documentation uses the following command for dense retrieval:


CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node 1 -m FlagEmbedding.baai_general_embedding.finetune.run \
--output_dir ./results/bge-m3 \
--model_name_or_path /Embedding/bge-m3 \
--train_data new_neg30.jsonl \
--learning_rate 1e-5 \
--fp16 \
--num_train_epochs 3 \
--per_device_train_batch_size 4 \
--gradient_accumulation_steps 2 \
--dataloader_drop_last True \
--normlized True \
--temperature 0.02 \
--query_max_len 64 \
--passage_max_len 256 \
--train_group_size 3 \
--negatives_cross_device
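
Before launching the command above, it may be worth confirming that `new_neg30.jsonl` has the expected shape. The sketch below assumes the layout described in the FlagEmbedding fine-tuning documentation: one JSON object per line with a string `query` and list-of-string `pos` and `neg` fields. The helper name `check_record` is hypothetical and not part of FlagEmbedding.

```python
import json


def check_record(line):
    """Return a list of problems found in one JSONL training line.

    Assumed layout (not verified against the installed FlagEmbedding version):
    {"query": str, "pos": [str, ...], "neg": [str, ...]}
    """
    record = json.loads(line)
    problems = []
    if not isinstance(record.get("query"), str):
        problems.append("query must be a string")
    for key in ("pos", "neg"):
        value = record.get(key)
        is_text_list = (
            isinstance(value, list)
            and len(value) > 0
            and all(isinstance(item, str) for item in value)
        )
        if not is_text_list:
            problems.append(f"{key} must be a non-empty list of strings")
    return problems


# Illustrative record only; real data would be read line by line from the file.
sample = '{"query": "what is bge-m3?", "pos": ["a passage"], "neg": ["x", "y"]}'
print(check_record(sample))
```

Running the helper over every line of the training file and printing only the non-empty results would point to malformed records before any GPU time is spent.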


After fine-tuning finished, I used the following code to merge the models:

    from LM_Cocktail import mix_models, mix_models_with_data

    model = mix_models(
        model_names_or_paths=["/Embedding/Embedding/bge-m3", "./results/bge-m3"],
        model_type='encoder',
        weights=[0.5, 0.5],  # you can change the weights to get a better trade-off.
        output_path='/results/dense_merge')


However, I get the following error. How can I resolve it? Are my steps correct?

  from LM_Cocktail import mix_models, mix_models_with_data
ImportError: cannot import name 'mix_models' from 'LM_Cocktail' (unknown location)
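
One way to narrow this down: the "(unknown location)" part of the message usually appears when Python resolves the name to a package without an `__init__.py` (a namespace package), for example a bare `LM_Cocktail` directory in the working directory or an incomplete install. The sketch below only inspects how the name resolves and does not import it. The helper name `describe_module` is hypothetical, and the interpretation is an assumption to verify on the affected machine.

```python
import importlib.util


def describe_module(name):
    """Report how Python would resolve `name`, without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "namespace": False, "locations": []}
    locations = list(spec.submodule_search_locations or [])
    # Namespace packages (directories without __init__.py) have no origin file;
    # older Python versions report the string "namespace" instead of None.
    is_namespace = bool(locations) and spec.origin in (None, "namespace")
    return {
        "found": True,
        "origin": spec.origin,
        "namespace": is_namespace,
        "locations": locations,
    }


print(describe_module("LM_Cocktail"))
```

If `namespace` comes back `True`, the listed `locations` show which directory is being picked up in place of a properly installed package. If `found` is `False`, the package is not installed in the interpreter that runs the script.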


bge-m3 dense retrieval fine-tuning #839
