adding llama fairscale #2604
base: master
Conversation
Codecov Report
```
@@           Coverage Diff           @@
##           master    #2604   +/-   ##
=======================================
  Coverage   72.44%   72.44%
=======================================
  Files          85       85
  Lines        3963     3963
  Branches       58       58
=======================================
  Hits         2871     2871
  Misses       1088     1088
  Partials        4        4
```
Thanks @HamidShojanazeri for this PR. Please see comments inline. It would be good to match the readme and config files for a consistent example -- e.g. use the 13b model as the base and explain everything for that.
```yaml
model_path: "PATH/TO/MODEL_CHECKPOINTS"
tokenizer_path: "PATH/TO/MODEL_CHECKPOINTS/tokenizer.model"
```
Could you change the dir the same way as this PR: https://github.com/pytorch/serve/pull/2623/files#diff-8dff1fb7c93d43b560e8ef09c2e2c6f93b55309399d807e231131ea962303dae?
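For illustration, following the layout in that PR, the config would hold paths relative to the model artifacts directory, with the handler prefixing `model_dir` at load time. The directory names below are assumptions, not taken from this PR:

```yaml
# Hypothetical sketch: paths relative to the extracted model directory,
# resolved in the handler by prefixing model_dir (see the handler comment below)
handler:
    model_path: "model-weights/llama-2-13b"
    tokenizer_path: "model-weights/tokenizer.model"
```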
### Step 3: Generate MAR file

```bash
torch-model-archiver --model-name llama --version 1.0 --handler llama-handler.py --config-file model-config.yaml --archive-format tgz -r requirements.txt
```
change "--archive-format tgz" to "--archive-format no-archive"
```python
model_path = ctx.model_yaml_config["handler"]["model_path"]
tokenizer_path = ctx.model_yaml_config["handler"]["tokenizer_path"]
```
Could you change this to match https://github.com/pytorch/serve/blob/master/examples/large_models/tp_llama/llama-handler.py#L68C1-L68C1?
i.e. `model_path = f'{model_dir}/{ctx.model_yaml_config["handler"]["model_path"]}'`
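Applied to this handler, the suggestion would read as follows; extending the same prefix to `tokenizer_path` is an assumption on my part, mirroring the linked example:

```python
# model_dir is provided by TorchServe in the request context
model_dir = ctx.system_properties.get("model_dir")
# Prefix the configured relative paths with model_dir, per the linked example;
# applying the same pattern to tokenizer_path is assumed here.
model_path = f'{model_dir}/{ctx.model_yaml_config["handler"]["model_path"]}'
tokenizer_path = f'{model_dir}/{ctx.model_yaml_config["handler"]["tokenizer_path"]}'
```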
```python
torch.manual_seed(seed)

logger.info("Instantiating Llama model")
self.model = Llama.build(
```
qq, should we provide an option to defer init for llama2-70b?
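One way to defer init is PyTorch's meta device; the sketch below is only an illustration of the idea, not code from this PR, and the `Transformer`/`ModelArgs` names assume the Llama reference implementation rather than the `Llama.build` wrapper above:

```python
# Hypothetical sketch of deferred initialization (not part of this PR).
import torch
from llama.model import ModelArgs, Transformer  # assumed reference-impl API

def build_deferred(params: ModelArgs) -> Transformer:
    # Construct on the meta device: parameter shapes are tracked, but no
    # real memory is allocated, which matters for llama2-70b sized models.
    with torch.device("meta"):
        model = Transformer(params)
    # Materialize uninitialized storage on the target device; loading the
    # checkpoint shards into it (omitted here) would follow.
    model = model.to_empty(device="cuda")
    return model
```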
Description
Adding Fairscale Llama to TorchServe
Fixes #(issue)
Type of change
- New feature (non-breaking change which adds functionality)
Feature/Issue validation/testing
Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Test A
Logs
Test B
Logs for Test B
Checklist: