Description
Describe the bug
I ran into the following error when trying to deploy a remote function (training job) with SageMaker using a custom Docker container that I built.
Running `from sagemaker.remote_function import remote` raises a Pydantic validation error at import time in recent versions of the SageMaker Python SDK. The error occurs before any code that uses `remote` is executed.
The issue stems from models inside sagemaker_core (imported transitively by the SDK) that define fields named json, which raise the following exception under Pydantic v2:
NameError: Field name "json" shadows a BaseModel attribute; use a different field name with "alias='json'".
The result is that the SageMaker SDK becomes unusable with Pydantic v2 because of this import-time crash.
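To illustrate the kind of name collision the message describes, here is a plain-Python sketch. It does not use or reproduce Pydantic's code: `Base`, `check_field_name`, and `try_field` are hypothetical names, and the check itself is only an assumption about how such a guard might work when a base class already exposes a `json` attribute.

```python
class Base:
    """Hypothetical stand-in for a model base class that already defines json()."""

    def json(self) -> str:
        return "{}"


def check_field_name(bases: tuple, name: str) -> None:
    """Raise NameError if `name` is already a truthy attribute on any base class."""
    for base in bases:
        if getattr(base, name, None):
            raise NameError(
                f'Field name "{name}" shadows a BaseModel attribute; '
                f"use a different field name with \"alias='{name}'\"."
            )


def try_field(name: str) -> str:
    """Report whether a field name would be accepted by the check above."""
    try:
        check_field_name((Base,), name)
    except NameError as exc:
        return f"rejected: {exc}"
    return "accepted"


print(try_field("json"))         # collides with Base.json
print(try_field("json_format"))  # hypothetical non-colliding name
```

In this sketch the collision is detected while the class is being defined, which would be consistent with the failure appearing at import time rather than when `remote` is called.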
To reproduce
I hit this error while running a remote SageMaker training job in a custom Docker container. The following steps reproduce it in a fresh environment:
# Create a fresh virtual environment
python -m venv venv
source venv/bin/activate
# Install SageMaker SDK and Pydantic v2
pip install "sagemaker==2.254.1" "pydantic"
# Try importing `remote` from `sagemaker.remote_function`
python - << 'EOF'
from sagemaker.remote_function import remote
EOF
Expected behavior
The import should succeed without errors, allowing use of the remote function and other SageMaker SDK components.
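A small check along these lines could confirm whether the import succeeds in a given environment. This is a sketch, not part of the SDK: `try_import` is a hypothetical helper, and it only reports the exception type and message instead of letting the import crash the process.

```python
import importlib
from typing import Optional, Tuple


def try_import(module_name: str) -> Tuple[bool, Optional[str]]:
    """Import a module by name; return (ok, error summary) instead of raising."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # import-time failures can be any exception type
        return False, f"{type(exc).__name__}: {exc}"
    return True, None


if __name__ == "__main__":
    ok, error = try_import("sagemaker.remote_function")
    print("import ok" if ok else f"import failed -> {error}")
```

Running this inside the Docker image before submitting a job would surface the import-time failure early.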
Screenshots or logs
File "/usr/local/lib/python3.9/site-packages/sagemaker/jumpstart/types.py", line 20, in <module>
from sagemaker_core.shapes import ModelAccessConfig as CoreModelAccessConfig
File "/usr/local/lib/python3.9/site-packages/sagemaker_core/main/shapes.py", line 2509, in <module>
class MonitoringDatasetFormat(Base):
NameError: Field name "json" shadows a BaseModel attribute; use a different field name with "alias='json'".
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.254.1
- Framework version: Pydantic V2
- Python version: 3.9.13
- CPU or GPU: CPU
- Custom Docker image (Y/N): YES
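Because the reproduction installs `pydantic` without a version pin, it may help to record exactly which versions ended up in the environment. The sketch below uses the standard library's `importlib.metadata`. The distribution names passed in, in particular `sagemaker-core`, are assumptions about how the packages are published, and `installed_versions` is a hypothetical helper.

```python
import platform
from importlib import metadata
from typing import Dict, Optional, Sequence


def installed_versions(packages: Sequence[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", platform.python_version())
    # Distribution names below are assumptions; adjust them to match `pip list`.
    for name, version in installed_versions(
        ["sagemaker", "sagemaker-core", "pydantic"]
    ).items():
        print(name, version or "not installed")
```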
Additional context
I resolved the issue by downgrading my SageMaker SDK version... The issue appears to be rooted in sagemaker_core model definitions being incompatible with the Pydantic v2 API.
Here are the other dependencies that I installed on the Docker image:
gensim==4.3.2
joblib==1.5.1
imbalanced-learn==0.11.0
nltk==3.8.1
numpy==1.21.5
pandas==1.4.2
psycopg2-binary==2.9.10
regex==2022.3.15
scikit-learn==1.0.2
SQLAlchemy==2.0.30
vaderSentiment==3.3.2
sagemaker==2.243.2
spacy==3.7.5
pyarrow
boto3