Description
The functionality introduced in PR #4871, which supports command substitution in SKYPILOT_DOCKER_PASSWORD (e.g., $(aws ecr get-login-password ...)), cannot be effectively used when resources.image_id points to an ECR image. The initial docker pull of this ECR image on the provisioned VM fails due to a dependency timing conflict.
Problem Details:
- SkyPilot provisions a base VM.
- When the feature from PR #4871 (Support cli var substitution in docker login command env) is used, SkyPilot attempts `docker login` by executing the command in `SKYPILOT_DOCKER_PASSWORD`, followed by `docker pull` for the specified ECR `image_id`. This happens on the newly provisioned VM.
- The critical issue is that SkyPilot's standard procedure for copying necessary user credentials, files, or tools (e.g., `~/.aws/credentials`, the AWS CLI itself if not pre-installed, or a custom script, if the password command is one) typically happens in a later provisioning stage.
- Consequently, the command in `SKYPILOT_DOCKER_PASSWORD` executes before its required dependencies are available on the VM. The command fails, `docker login` fails, and ultimately the pull of the ECR `image_id` fails.
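The failure mode above can be illustrated with a minimal sketch. It assumes a hypothetical helper, `resolve_password`, that treats a value of the form `$(...)` as a shell command and uses its output as the password. This is not SkyPilot's actual implementation from PR #4871; it only shows why the step fails when the command's tooling is missing on the VM.

```python
import re
import subprocess

# Hypothetical helper for illustration only; not SkyPilot's real code.
# Matches a value that consists entirely of a single $(...) substitution.
_SUBSTITUTION = re.compile(r"^\$\((.+)\)$", re.DOTALL)


def resolve_password(value: str, timeout: float = 30.0) -> str:
    """Return the literal value, or the stdout of a $(...) command."""
    match = _SUBSTITUTION.match(value.strip())
    if match is None:
        return value  # plain password, nothing to execute

    # Run the inner command through the shell, as `docker login` setup would.
    result = subprocess.run(
        match.group(1),
        shell=True,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        # This is the branch hit when e.g. the AWS CLI or its credentials
        # have not yet been copied to the freshly provisioned VM.
        raise RuntimeError(
            f"password command failed (exit {result.returncode}): "
            f"{result.stderr.strip()}"
        )
    return result.stdout.strip()
```

On a machine where the tool named inside `$(...)` is not installed, the call raises before any `docker login` can be attempted, which mirrors the ordering problem described in the list.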
Impact on PR #4871:
This timing conflict is a significant blocker for using PR #4871's command substitution feature for a primary use case: dynamically authenticating to ECR (or similar registries) when the ECR image itself is the runtime environment specified via image_id. While the substitution mechanism itself might be in place, it fails in practice for image_id scenarios due to this provisioning order.
Expected Behavior/Resolution Path:
To enable PR #4871 for ECR image_id use cases, the dependencies (tools, configs, scripts) required by the SKYPILOT_DOCKER_PASSWORD command must be present and configured on the VM at the moment docker login is attempted for the initial image_id pull.
Potential solutions could involve:
- Modifying SkyPilot's provisioning sequence to ensure these specific dependencies are deployed to the VM before the `docker login` for `image_id` is attempted.
- Introducing a dedicated pre-flight mechanism for staging just these critical Docker authentication dependencies.
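As a rough sketch of what a pre-flight check could look like, the function below reports which dependencies of the password command are absent before `docker login` runs. The name `missing_dependencies` and its parameters are hypothetical and do not correspond to any existing SkyPilot API. It only inspects the first token of the command, so pipelines or nested commands would need more careful parsing.

```python
import os
import shlex
import shutil
from typing import List, Sequence


def missing_dependencies(
    command: str, required_files: Sequence[str] = ()
) -> List[str]:
    """List the executable and files the command needs but cannot find.

    Hypothetical pre-flight check: `command` is the text inside $(...),
    `required_files` are paths such as ~/.aws/credentials.
    """
    tokens = shlex.split(command)
    if not tokens:
        return ["<empty command>"]

    missing: List[str] = []

    # The first token is assumed to be the executable to look up on PATH.
    executable = tokens[0]
    if shutil.which(executable) is None:
        missing.append(executable)

    # Check each required file after expanding a leading ~.
    for path in required_files:
        if not os.path.exists(os.path.expanduser(path)):
            missing.append(path)

    return missing
```

A provisioning step could call this with, for example, the AWS CLI command and `["~/.aws/credentials"]`, and stage whatever is reported as missing before attempting the initial pull.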
Addressing this is essential for PR #4871 to successfully support dynamic ECR authentication for containerized runtime environments.