Update DJLModel class for latest container releases #4754
Conversation
fastertransformer_predictor = fastertransformer_model.deploy("ml.g5.12xlarge",
                                                             initial_instance_count=1)

Regardless of which way you choose to create your model, a ``Predictor`` object is returned. You can use this ``Predictor``
should we still leave in some docs explaining that a ``Predictor`` is returned on ``deploy()``?
yea i can do that, I think i deleted this line by mistake!
Each ``Predictor`` provides a ``predict`` method, which can do inference with json data, numpy arrays, or Python lists.
Inference data are serialized and sent to the DJL Serving model server by an ``InvokeEndpoint`` SageMaker operation. The
``predict`` method returns the result of inference against your model.

By default, the inference data is serialized to a json string, and the inference result is a Python dictionary.
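As a hedged sketch of the default behavior described above (the payload shape and the model server response here are hypothetical, not from a real endpoint), the request/response handling looks roughly like this:

```python
import json

def predict_sketch(data):
    """Sketch of the default (de)serialization around a predict call."""
    # Inference data is serialized to a json string before being sent to the
    # DJL Serving model server via an InvokeEndpoint SageMaker operation.
    request_body = json.dumps(data)

    # A real call would invoke the endpoint with request_body; here we fake a
    # model server response purely for illustration.
    fake_response_body = '{"generated_text": "hello world"}'

    # The json response is deserialized into a Python dictionary by default.
    return json.loads(fake_response_body)

result = predict_sketch({"inputs": "hello", "parameters": {"max_new_tokens": 16}})
print(result["generated_text"])
```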
Model Directory Structure
is all of this removed information being moved to the LMI docs on the AWS documentation site?
Yes, we have this all on our LMI docs site.
We are not publishing on AWS docs anymore, but have aligned on all our docs living here https://docs.djl.ai/docs/serving/serving/docs/lmi/index.html. The model directory structure specifically is here https://docs.djl.ai/docs/serving/serving/docs/lmi/deployment_guide/model-artifacts.html
def _set_serve_properties(hf_model_config: dict, schema_builder: SchemaBuilder) -> tuple:
def _get_default_djl_configurations(
    model_id: str, hf_model_config: dict, schema_builder: SchemaBuilder
) -> tuple:
nit: tuple[dict, int]
src/sagemaker/djl_inference/model.py
Outdated
logger.info("Using provided engine %s", self.engine)
return self.engine

if self.task is not None:
nit: unnecessary, None == "text-embedding" is not an error
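The reviewer's point can be shown directly: comparing None to a string is a valid Python expression that simply evaluates to False, so the None guard before the task comparison is redundant.

```python
# Comparing None against a string does not raise; it just evaluates to False,
# so a preceding "if self.task is not None" check is unnecessary.
task = None
print(task == "text-embedding")  # → False
```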
@akrishna1995 - can you review this PR? I have two approvals, one from my team (owning DJLModel class), and one from the team that owns the ModelBuilder class.
Issue #, if available:
Description of changes:
The DJLModel class has not been updated in multiple container releases, and as such its functionality is completely broken.
This change updates the DJLModel class for the latest DJL container releases. Furthermore, it simplifies the interface and functionality so that it is more future-proofed.
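As a hedged sketch of the environment-variable configuration style this change moves toward: the specific variable names below (HF_MODEL_ID and the OPTION_* prefix) follow LMI container conventions but are assumptions here and should be checked against the LMI docs; the model id is hypothetical.

```python
# Illustrative environment-variable configuration replacing a
# serving.properties file. Variable names are assumed from LMI container
# conventions, not taken from this PR.
env = {
    "HF_MODEL_ID": "mistralai/Mistral-7B-v0.1",   # hypothetical model id
    "OPTION_TENSOR_PARALLEL_DEGREE": "4",
    "OPTION_MAX_ROLLING_BATCH_SIZE": "64",
}

# A real deployment would pass this dict to DJLModel, e.g.:
# model = DJLModel("mistralai/Mistral-7B-v0.1", role=role, env=env)
# predictor = model.deploy("ml.g5.12xlarge", initial_instance_count=1)

print(sorted(env))
```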
Specifically:

- Removes the serving.properties requirement in favor of environment variables. A serving.properties file can still be provided and used, but the DJLModel class will not read/parse this configuration.

Testing done:
Merge Checklist
Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General
Tests
unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.