Describe the bug
When using microsoft/Phi-3-small-8k-instruct with the DJLModel class in the SageMaker Python SDK, the entry_point argument has no effect: the specified code is not loaded into the container. Endpoint creation then fails because the container can't locate /opt/ml/model/code/model.py.
To reproduce
Steps to reproduce the behavior:
- Set up a SageMaker environment with the required IAM roles and permissions.
- Use the following code snippet to create a DJLModel:
from sagemaker.djl import DJLModel

djl_model = DJLModel(
    model_id="microsoft/Phi-3-small-128k-instruct/",
    sagemaker_session=sagemaker_session,
    djl_version="0.29.0",
    role=iam_role,
    task="text-generation",
    engine="Python",
    entry_point="model.py",
    source_dir="code",
    env={
        "OPTION_ROLLING_BATCH": "vllm",
        "TENSOR_PARALLEL_DEGREE": "1",
        "OPTION_MAX_ROLLING_BATCH_SIZE": "2",
        "OPTION_DTYPE": "fp16",
        "HF_MODEL_TRUST_REMOTE_CODE": "True",
        "DJL_ENTRY_POINT": "/opt/ml/model/code/model.py",
    },
)
- Deploy the model using SageMaker.
Expected behavior
The entry_point specified should be correctly loaded into the container, allowing the model to function as intended.
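One way to narrow this down is to check whether the script was packaged at all. The sketch below is not part of the SDK. It assumes a model archive has been downloaded locally (the file name model.tar.gz in the usage line is hypothetical) and relies on SageMaker extracting a model archive into /opt/ml/model, so a member named code/model.py would end up at /opt/ml/model/code/model.py. Whether the SDK produces such an archive in this configuration is itself an assumption.

```python
import posixpath
import tarfile


def archive_has_entry_point(archive_path, member="code/model.py"):
    """Return True if the tar archive contains `member`.

    Hypothetical helper for debugging, not a SageMaker API.
    Member names are normalized so "./code/model.py" and
    "code/model.py" are treated as the same path.
    """
    wanted = posixpath.normpath(member)
    with tarfile.open(archive_path, "r:*") as tar:
        names = {posixpath.normpath(name) for name in tar.getnames()}
    return wanted in names
```

Calling archive_has_entry_point("model.tar.gz") on the downloaded archive returns False if code/model.py was never packaged, which would match the failure described above.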
System information
- SageMaker Python SDK version: 2.229.0
- Framework name (e.g., PyTorch) or algorithm (e.g., KMeans): DJL
- Framework version: 0.29.0
- Python version: 3.11
- CPU or GPU: GPU
- Custom Docker image (Y/N): N