[WIP] Updating deepspeed handler for more models #359
Conversation
self.data_type = None
self.max_tokens = None
self.device = None
self.world_size = None
world_size should be identical to tensor_parallel_degree
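The suggestion above can be sketched as follows. This is an illustrative stand-alone function, not the handler's actual code; the property names mirror the diff, and the assumption is that `world_size` is simply copied from `tensor_parallel_degree`:

```python
def parse_properties(properties):
    """Sketch: derive world_size directly from tensor_parallel_degree,
    so the two values can never drift apart."""
    tensor_parallel_degree = int(properties.get("tensor_parallel_degree", 1))
    # world_size should be identical to tensor_parallel_degree
    world_size = tensor_parallel_degree
    return {
        "tensor_parallel_degree": tensor_parallel_degree,
        "world_size": world_size,
    }

print(parse_properties({"tensor_parallel_degree": "4"}))
```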
f"tensor_parallel_degree: {self.tensor_parallel_degree}")
self.initialized = True

def parse_properties(self, properties):
Suggested change:
- def parse_properties(self, properties):
+ def _parse_properties(self, properties):
👍
"max_out_tokens": self.max_tokens,
}

def validate_model_type_and_task(self):
Suggested change:
- def validate_model_type_and_task(self):
+ def _validate_model_type_and_task(self):
👍
self.model_id = properties.get("model_id")
self.task = properties.get("task")
self.data_type = properties.get("data_type", "fp32")
self.max_tokens = int(properties.get("max_tokens", 1024))
What would be a proper default for max_tokens?
DeepSpeed uses 1024 as the default, so I think keeping it the same makes sense.
Though I think task-specific pipelines on the HF side might override that and set other defaults.
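The defaulting chain discussed here can be sketched as a small helper. This is illustrative only: `resolve_max_tokens` and its `pipeline_default` parameter are hypothetical names, not part of the handler, and the only grounded fact is DeepSpeed's 1024 default:

```python
def resolve_max_tokens(properties, pipeline_default=None):
    """Sketch: an explicit user setting wins, then a (hypothetical)
    task-pipeline override, then DeepSpeed's own default of 1024."""
    if "max_tokens" in properties:
        return int(properties["max_tokens"])
    if pipeline_default is not None:
        return int(pipeline_default)
    return 1024  # DeepSpeed's default

print(resolve_max_tokens({"max_tokens": "2048"}))  # user setting wins
```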
"enable_cuda_graph": properties.get("enable_cuda_graph", "false").lower() == "true",
"triangular_masking": properties.get("triangular_masking", "true").lower() == "true",
"checkpoint": properties.get("checkpoint"),
"base_dir": properties.get("base_dir"),
Why does the user need to set base_dir?
If base_dir is not set, DeepSpeed will look for the checkpoint in the current directory. If the checkpoint is in some child directory, base_dir helps to identify it (though it also works if you specify the full path to the checkpoint and leave base_dir empty). Not sure if we need to expose this, but users coming from the DeepSpeed world might expect it.
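The lookup behavior described above can be sketched as plain path logic. This is an illustrative helper (`resolve_checkpoint` is a made-up name, not DeepSpeed's actual code): a relative checkpoint path is resolved against base_dir when one is given, while an absolute path ignores it:

```python
import os


def resolve_checkpoint(checkpoint, base_dir=None):
    """Sketch of the lookup described above: join a relative checkpoint
    path with base_dir; a full (absolute) path works without base_dir."""
    if base_dir and not os.path.isabs(checkpoint):
        return os.path.join(base_dir, checkpoint)
    # Absolute path, or no base_dir: resolved as-is
    # (relative to the current directory if not absolute).
    return checkpoint


print(resolve_checkpoint("ds_inference_config.json", base_dir="/opt/ml/model"))
```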
self.model_dir = None
self.model_id = None
self.data_type = None
self.max_tokens = None
We might need min_tokens
Description
This PR adds support for built in handlers to serve models using the DeepSpeed engine.
Two handlers are added/modified:
Stable Diffusion and most language models (except BLOOM) have been tested. Things left to be done:
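For reference, the properties exercised in this review could be supplied together in a single configuration file. The keys mirror the diffs above; the `engine` key, the model ID, and all values are illustrative assumptions, not part of this PR:

```properties
# Illustrative configuration for the DeepSpeed handler (values are examples).
engine=DeepSpeed
task=text-generation
model_id=EleutherAI/gpt-j-6B
tensor_parallel_degree=2
data_type=fp16
max_tokens=1024
# Optional DeepSpeed checkpoint lookup, as discussed above:
# checkpoint=ds_inference_config.json
# base_dir=/opt/ml/model
```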