
Merging fails with RuntimeError: weight required but not present in model #284

Open · w601sxs opened this issue Apr 22, 2024 · 7 comments

w601sxs commented Apr 22, 2024

I'm trying to merge some embedding models with this config file. The architectures are similar, but I think it is erroring out on some layer names. I'd love some suggestions on how to change the YAML to make it work.

YAML config:

models:
  - model: mixedbread-ai/mxbai-embed-large-v1
  - model: BAAI/bge-large-en-v1.5
    parameters:
      density: [0, 0.25, 0.5, 0.75, 1]
      weight: [0, 0.25, 0.5, 0.75, 1]
  - model: avsolatorio/GIST-large-Embedding-v0
    parameters:
      density: [0, 0.25, 0.5, 0.75, 1]
      weight: [0, 0.25, 0.5, 0.75, 1]
  - model: WhereIsAI/UAE-Large-V1
    parameters:
      density: [0, 0.25, 0.5, 0.75, 1]
      weight: [0, 0.25, 0.5, 0.75, 1]
merge_method: dare_ties
base_model: mixedbread-ai/mxbai-embed-large-v1
parameters:
  int8_mask: true
dtype: bfloat16

Error

RuntimeError: Tensor bert.encoder.layer.23.output.LayerNorm.weight required but not present in model WhereIsAI/UAE-Large-V1

CLI used

!mergekit-yaml merge.yaml ./output --cuda

w601sxs commented Apr 22, 2024

Maybe an example of how to frankenmerge with passthrough?
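
For context, a passthrough merge in mergekit stacks layer slices from one or more models into a new model rather than averaging weights. A minimal sketch using two of the BERT-large embedding models from the config above (both are 24-layer models; the layer ranges here are illustrative, not a tested recipe):

merge_method: passthrough
slices:
- sources:
  - model: mixedbread-ai/mxbai-embed-large-v1
    layer_range: [0, 16]
- sources:
  - model: BAAI/bge-large-en-v1.5
    layer_range: [8, 24]
dtype: bfloat16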

metric-space self-assigned this Apr 24, 2024

metric-space (Contributor) commented

Hey there, thank you for the detailed issue. This is definitely a bug.

For now, a quick fix to get things working on your end is to open mergekit/_data/architectures/bert.json and replace all instances of bert. with an empty string. That should hopefully get you going with your current config.

That said, we will be putting out a proper fix soon.
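
For concreteness, assuming a from-source checkout where mergekit/_data/architectures/bert.json is editable in place, the replacement described above could be done with a one-liner like this (a sketch of the workaround, not an official fix; with BSD sed on macOS, pass an empty string after -i):

sed -i 's/bert\.//g' mergekit/_data/architectures/bert.json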


w601sxs commented Apr 25, 2024

I'll try that in a local branch and wait for the fix! Thanks.


cg123 commented Apr 29, 2024

Thanks for the bug report! PR #295 should fix this issue. If you run into any further trouble please let me know - the BERT support is quite fresh and I appreciate knowing where it fails.


yaof20 commented Apr 29, 2024

Hi Charles! Thanks for the great work!

I am encountering a similar issue.

I am using the phi-1 and phi-1.5 models; the config YAML file is as follows.

dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 8]
    model: microsoft/phi-1
- sources:
  - layer_range: [4, 12]
    model: microsoft/phi-1
- sources:
  - layer_range: [8, 16]
    model: microsoft/phi-1
- sources:
  - layer_range: [12, 20]
    model: microsoft/phi-1
- sources:
  - layer_range: [16, 24]
    model: microsoft/phi-1
- sources:
  - layer_range: [20, 28]
    model: microsoft/phi-1
- sources:
  - layer_range: [24, 32]
    model: microsoft/phi-1

Both phi-1 and phi-1.5 give me the error below. (I also tried TinyLlama; it gave me the same issue.)

RuntimeError: Tensor model.layers.31.mlp.fc2.weight required but not present in model microsoft/phi-1_5

In addition, how can I run the same YAML config for the phi-3 model, whose architecture is not currently included in the package?

Thanks!
@cg123


cg123 commented Apr 30, 2024

@yaof20 This is because microsoft/phi-1 only has 24 layers, but you're telling mergekit to look for 32 in total. If you adjust your config to use only layers 0-24, it should work properly.

As for Phi-3 - I'll add support for it in the next couple of days!
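
For illustration, the same overlapping-slice pattern capped at phi-1's 24 layers might look like the sketch below (the slice boundaries are illustrative, not a recommendation):

dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 8]
    model: microsoft/phi-1
- sources:
  - layer_range: [4, 12]
    model: microsoft/phi-1
- sources:
  - layer_range: [8, 16]
    model: microsoft/phi-1
- sources:
  - layer_range: [12, 20]
    model: microsoft/phi-1
- sources:
  - layer_range: [16, 24]
    model: microsoft/phi-1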


yaof20 commented Apr 30, 2024

Thanks for the reply!
