Sagemaker Local-Mode permission errors #4764

@david-waterworth

Describe the bug
I keep running into issues trying to run bring-your-own-container (BYOC) images: they work either in local mode or in "cloud mode", but not both. The issue is almost certainly a Docker permission problem, but I cannot see a way of resolving it.

To reproduce
I have a Docker image built from this Dockerfile:

FROM continuumio/miniconda3

WORKDIR /opt/ml/code/
COPY src/ /opt/ml/code/

ENTRYPOINT ["bash", "/opt/ml/code/processor.sh"]

The processor.sh script creates a conda environment, installs packages into it, then runs conda pack with the output set to /opt/ml/processing/output/pyenv.tar.gz

In local mode this step fails on this line:

sagemaker_session.upload_data(source, bucket, path)

It fails because the container user (root) writes the file with permissions that are too restrictive. Note there's a "fix" on line 99 for a similar issue, but it doesn't help.

I tried to work around this by adding a non-root (uid/gid 1000) user to the container. This "fixed" the local issue, but now the container fails when I run it normally. It appears to be the opposite issue: the container user doesn't have access to the volume /opt/ml/processing/output/, so the script fails.
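
To see which side of the mismatch is in play, a small check like the following could be run both inside the container and on the host. This is a minimal sketch only: the helper name `describe_path` is hypothetical and not part of the SageMaker SDK, and it just reports the process uid/gid next to the owner and mode of a path.

```python
import os
import stat


def describe_path(path: str) -> dict:
    """Report who the current process is and who owns `path` (hypothetical helper)."""
    st = os.stat(path)
    return {
        "process_uid": os.getuid(),            # uid the script is running as
        "process_gid": os.getgid(),            # gid the script is running as
        "owner_uid": st.st_uid,                # uid that owns the path
        "owner_gid": st.st_gid,                # gid that owns the path
        "mode": stat.filemode(st.st_mode),     # e.g. "-rw-------"
        "readable": os.access(path, os.R_OK),  # can this process read it?
        "writable": os.access(path, os.W_OK),  # can this process write it?
    }


# Example (path taken from the issue; run inside the container or on the host):
# print(describe_path("/opt/ml/processing/output"))
```

Comparing the output from local mode and from a normal SageMaker run should show whether the process uid matches the owner of the output directory in each case.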

Expected behavior
I need to be able to run pipelines in local and SageMaker mode without permission errors. It seems that in local mode scripts need to run with the same uid as the host user, but in SageMaker mode they need to run as root. Is there some way they can work in both?
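
One direction that might work in both modes, sketched here under assumptions that have not been verified against SageMaker, is to keep running as root and relax the mode bits on the output directory as the last step of the job, so that a non-root host user can read the files in local mode. The helper name `relax_permissions` is hypothetical. Whether file modes alone are the cause of the `upload_data` failure is also an assumption.

```python
import os
import stat

# Read bits for user, group and other; execute bits so directories can be traversed.
READ_ALL = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH
EXEC_ALL = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def relax_permissions(root: str) -> None:
    """Add read access for everyone under `root` (hypothetical helper).

    Directories also get execute bits so they can be listed and entered.
    Existing bits are kept; nothing is removed.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        dir_mode = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, dir_mode | READ_ALL | EXEC_ALL)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # leave symlinks alone rather than chmod their targets
            file_mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, file_mode | READ_ALL)


# Example (assumed to run as the final step, after conda pack has written its output):
# relax_permissions("/opt/ml/processing/output")
```

This only widens read access, so it should be harmless when the container runs as root in SageMaker mode, but it makes the output readable by any user on the host in local mode.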
