Modify the Model and ModelStep to use the entry_point parameter (as you would to prepare your model for inference):
from sagemaker.model import Model

model = Model(
    image_uri=image_uri,
    model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
    entry_point="inference.py",
    sagemaker_session=pipeline_session,
    role=role,
)
Build and upsert the pipeline from a Windows environment (simulating a debug/test launch from a local IDE such as PyCharm).
You should get an error of this type during the execution:
With the following in the CloudWatch logs of the failing step:
My understanding is:
During step creation/build, sagemaker.workflow._utils._RepackModelStep is called, more specifically its _inject_repack_script_and_launcher method. It writes a bash script file (_repack_script_launcher.sh) from a Python string variable. If this write is executed on Windows, carriage return characters appear to be written into the bash file, which is then pushed with the rest of the pipeline to the cloud for execution.
Once in the SageMaker pipeline execution environment (Linux), _repack_script_launcher.sh generates several errors during the repack_model step. It still manages to launch the _repack_model.py script, but passes a model_archive path with an extra trailing character (#015, a carriage return). The repack step then fails because it cannot find the object model.tar.gz#015, i.e. model.tar.gz\r.
Note: the exact same code (build, upsert, run) launched from our CI/CD (Linux environment) does not cause this error, which suggests that writing the bash script from a Python variable causes no problem when executed on Linux.
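The suspected mechanism can be illustrated with a minimal Python sketch. It relies only on documented behavior of the built-in open(): in text mode with the default newline=None, "\n" is translated to os.linesep on write, which is "\r\n" on Windows. The script text and file names below are hypothetical stand-ins, not the SDK's actual launcher, and passing newline="\r\n" is just a way to reproduce the Windows result on any OS.

```python
import os
import tempfile

# Hypothetical stand-in for the launcher text; not the SDK's actual script.
LAUNCHER_TEXT = '#!/bin/bash\nmodel_archive=model.tar.gz\necho "$model_archive"\n'


def write_text(path, text, newline=None):
    """Write text in text mode; newline=None applies the platform translation."""
    with open(path, "w", newline=newline) as handle:
        handle.write(text)


def count_carriage_returns(path):
    """Count raw carriage return bytes in the file on disk."""
    with open(path, "rb") as handle:
        return handle.read().count(b"\r")


with tempfile.TemporaryDirectory() as tmp:
    windows_like = os.path.join(tmp, "launcher_crlf.sh")
    forced_lf = os.path.join(tmp, "launcher_lf.sh")

    # newline="\r\n" reproduces on any OS what the default does on Windows.
    write_text(windows_like, LAUNCHER_TEXT, newline="\r\n")
    # newline="\n" keeps LF line endings regardless of the host OS.
    write_text(forced_lf, LAUNCHER_TEXT, newline="\n")

    crlf_count = count_carriage_returns(windows_like)
    lf_count = count_carriage_returns(forced_lf)
    print(crlf_count, lf_count)
```

If this is indeed the cause, writing the launcher with an explicit newline="\n" (or in binary mode) would produce the same file on Windows and Linux. Whether the SDK writes the file this way is an assumption based on the symptoms described above.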
Expected behavior
Be able to build, upsert, and run the pipeline from anywhere, especially from my local IDE environment.
Temporary Workaround
I did not succeed in altering the string variable that is written to the bash file, or the way it is written on Windows, so that the result is still readable without error once transferred to the Linux environment.
So I duplicated sagemaker\workflow\_repack_model.py in my project code (in an MLOps tools folder) and added a small string correction to make sure that ".gz" are the last 3 characters of the model_archive path. This does nothing if the bash script was already written from a Linux environment (CI/CD).
I then alter the SageMaker SDK installation and overwrite sagemaker\workflow\_repack_model.py with my corrected file, but this is obviously not a viable way to patch code.
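A string correction of the kind described could look like the sketch below. The function name clean_model_archive is hypothetical and this is not the exact code used in the workaround; it only assumes that the unwanted characters are trailing carriage returns, newlines, or whitespace.

```python
def clean_model_archive(name: str) -> str:
    """Drop trailing CR/LF and whitespace so the name ends with its real suffix."""
    return name.rstrip("\r\n \t")


# A name polluted by a Windows line ending is repaired...
fixed = clean_model_archive("model.tar.gz\r")
# ...and a clean name (script written from Linux) passes through unchanged.
untouched = clean_model_archive("model.tar.gz")
print(repr(fixed), repr(untouched))
```

Stripping only trailing characters keeps the correction a no-op for archives produced by a Linux build.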
System information
A description of your system. Please provide:
SageMaker Python SDK version: 2.140.0
Framework name (eg. PyTorch) or algorithm (eg. KMeans): basic random forest algorithm from scikit-learn
Framework version:
Python version: 3.9
CPU or GPU: CPU
Custom Docker image (Y/N): Y
Additional context
Schneider Electric AI-HUB accounts
Hi @Mathonal, thanks for reaching out!
I really appreciate your efforts on providing all these details, doing the investigation and presenting the workaround! Your investigated root cause makes sense to me.
Currently the SageMaker Python SDK supports Unix/Linux and Mac OS only, see https://github.com/aws/sagemaker-python-sdk#supported-operating-systems.
However, this is a good callout for supporting Windows environment. I'll re-label this issue to "feature request" and bring this up to my internal team to evaluate.
Synced up with the internal team. Given that the entire SageMaker PySDK does not support Windows OS, will remove the component: pipelines tag and leave this feature request in the general PySDK queue.
Will notify the SageMaker PySDK team offline on this as well.
Describe the bug
RepackModel steps in a pipeline execution fail when the pipeline is built and upserted from a Windows environment.