[Feature Request] Support for Falcon-11B #10

Open
s-smits opened this issue May 17, 2024 · 0 comments

s-smits commented May 17, 2024

Hi, because of its multilinguality and how undertrained it is, I'd like to slice Falcon-11B.

The model has 60 of these layers. Would supporting it just take an easy layer-name change, or bumping the requirements?
[Screenshot 2024-05-17 at 23 32 36]

yaml_config = """
slices:
  - sources:
      - model: tiiuae/falcon-11B
        layer_range: [0, 25]
  - sources:
      - model: tiiuae/falcon-11B
        layer_range: [56, 59]
            
merge_method: passthrough
dtype: bfloat16"""

with open('config.yaml', 'w', encoding="utf-8") as f:
    f.write(yaml_config)
    
!mergekit-yaml config.yaml merge --copy-tokenizer --allow-crimes --out-shard-size 1B --lazy-unpickle

/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Warmup loader cache:   0%|                                | 0/1 [00:00<?, ?it/s]
Fetching 11 files: 100%|█████████████████████| 11/11 [00:00<00:00, 94737.87it/s]
Warmup loader cache: 100%|████████████████████████| 1/1 [00:00<00:00,  2.22it/s]
Executing graph:   0%|                        | 1/946 [00:00<00:00, 1742.54it/s]
Traceback (most recent call last):
  File "/opt/conda/bin/mergekit-yaml", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/kaggle/working/PruneMe/mergekit/mergekit/options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "/kaggle/working/PruneMe/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/kaggle/working/PruneMe/mergekit/mergekit/merge.py", line 92, in run_merge
    for _task, value in exec.run(quiet=options.quiet):
  File "/kaggle/working/PruneMe/mergekit/mergekit/graph.py", line 197, in run
    res = task.execute(**arguments)
  File "/kaggle/working/PruneMe/mergekit/mergekit/io/tasks.py", line 86, in execute
    raise RuntimeError(
RuntimeError: Tensor transformer.h.59.ln_mlp.weight required but not present in model tiiuae/falcon-11B
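To check whether this really is just a naming mismatch, one option is to compare the per-layer tensor names the checkpoint actually contains against the names the merge tool asks for. The sketch below assumes the checkpoint ships a sharded-checkpoint index JSON with a `weight_map` key (the usual Hugging Face layout). The helper names `layer_suffixes`, `missing_in_layer`, and `load_weight_map` are hypothetical, and the demo map is made up for illustration; it is not Falcon-11B's real tensor list.

```python
import json
import re
from collections import defaultdict

# Matches per-layer tensor names such as "transformer.h.59.ln_mlp.weight".
LAYER_RE = re.compile(r"^transformer\.h\.(\d+)\.(.+)$")


def layer_suffixes(weight_map):
    """Group tensor-name suffixes by layer index."""
    by_layer = defaultdict(set)
    for name in weight_map:
        m = LAYER_RE.match(name)
        if m:
            by_layer[int(m.group(1))].add(m.group(2))
    return dict(by_layer)


def missing_in_layer(weight_map, layer, expected):
    """Return the expected suffixes that the checkpoint lacks for one layer."""
    present = layer_suffixes(weight_map).get(layer, set())
    return sorted(set(expected) - present)


def load_weight_map(index_path):
    """Read the "weight_map" dict from a sharded-checkpoint index JSON."""
    with open(index_path, encoding="utf-8") as f:
        return json.load(f)["weight_map"]


# Made-up map for illustration only; NOT the real Falcon-11B tensor list.
demo_map = {
    "transformer.h.59.input_layernorm.weight": "shard-a.safetensors",
    "transformer.h.59.mlp.dense_h_to_4h.weight": "shard-a.safetensors",
}
demo_missing = missing_in_layer(
    demo_map, 59, ["ln_mlp.weight", "mlp.dense_h_to_4h.weight"]
)
print(demo_missing)
```

With a real checkpoint, `load_weight_map` would be pointed at the downloaded index file and `expected` filled with the suffixes named in the error. If only the layer-norm names differ, that would support the "easy layer name change" guess.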
