@nammn nammn commented Dec 17, 2025

Summary

Fixes flaky test failures on IBM Power and Z static machines caused by mixed root/rootless podman usage.

Root cause: Static machines persist state between runs. The scripts mixed sudo podman (root) and podman (user), creating two separate container namespaces with different storage and auth locations. This caused:

  • Intermittent failures to locate crun (crun was configured in user space, but sudo podman looked in root space)
  • Crun version mismatch between root and user podman instances
  • boto3/auth errors (ECR credentials stored in user's auth.json but sudo podman looked in /root/)

Fix: Standardize on rootless podman everywhere:

  • Remove all sudo podman usage
  • Use user-level config paths (~/.config/containers/)
  • Set up XDG_RUNTIME_DIR for rootless operation
  • Clean up lingering podman containers
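
The XDG_RUNTIME_DIR step above can be sketched as follows. This is a minimal Python illustration, not the shell code from this PR: the function name `rootless_podman_env`, the `run_user_root` parameter, and the temp-directory fallback are assumptions made for the example. Only the use of XDG_RUNTIME_DIR and the user-level `~/.config` location come from the summary.

```python
import tempfile
from pathlib import Path


def rootless_podman_env(uid: int, home: Path, run_user_root: Path = Path("/run/user")) -> dict[str, str]:
    """Return environment variables for a rootless podman call (illustrative only)."""
    runtime_dir = run_user_root / str(uid)
    if not runtime_dir.is_dir():
        # Assumed fallback: a private per-user directory for hosts where no
        # login session has created /run/user/<uid>.
        runtime_dir = Path(tempfile.gettempdir()) / f"podman-run-{uid}"
        runtime_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    return {
        # Rootless podman keeps its runtime state under this directory.
        "XDG_RUNTIME_DIR": str(runtime_dir),
        # User-level config lives under ~/.config (e.g. ~/.config/containers/).
        "XDG_CONFIG_HOME": str(home / ".config"),
    }
```

A caller would merge the returned mapping into the environment of every podman invocation, so that all calls run as the same user and resolve the same storage and auth locations.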

Proof of Work

Checklist

  • Have you linked a jira ticket and/or is the ticket in the title?
  • Have you checked whether your jira ticket required DOCSP changes?
  • Have you added a changelog file?

@github-actions
⚠️ (this preview might not be accurate if the PR is not rebased on current master branch)

MCK 1.6.2 Release Notes

@nammn added the skip-changelog label Dec 17, 2025
fi
export XDG_RUNTIME_DIR="${runtime_dir}"

# Clean up stale podman state (fixes "cannot re-exec process to join the existing user namespace")

This still happens, but once the evg agents properly clean up podman containers we should be able to remove this: https://jira.mongodb.org/browse/DEVPROD-25447

local start_args=("--driver=podman")
start_args+=("--cpus=4" "--memory=8g")
# Use containerd as container runtime inside minikube for better rootless support
start_args+=("--container-runtime=containerd")

Also, containerd is more stable.

Fetches an auth token from ECR via boto3 and logs
into the Docker daemon via the Docker SDK.
"""
import boto3

Let's import this only where it is used; otherwise build_image (for podman and minikube), and thus the IBM container, will need these deps.
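
A minimal sketch of the deferred import suggested here, assuming a simplified layout: `ecr_password` and this version of `build_image` are hypothetical stand-ins, not the functions in this PR. The boto3 calls used (`boto3.client("ecr")` and `get_authorization_token`) are the standard ECR client API.

```python
import base64
import sys


def ecr_password(region: str) -> str:
    """Return the ECR registry password; boto3 is needed only when this is called."""
    # Deferred import: resolved at call time, not when the module is imported.
    import boto3

    client = boto3.client("ecr", region_name=region)
    data = client.get_authorization_token()["authorizationData"][0]
    # The token is base64("AWS:<password>").
    return base64.b64decode(data["authorizationToken"]).decode().split(":", 1)[1]


def build_image(tag: str, context: str = ".") -> list[str]:
    """Hypothetical stand-in for build_image: returns the podman command it would run."""
    return ["podman", "build", "--tag", tag, context]


# The podman/minikube path never touches ECR, so boto3 is never imported here.
cmd = build_image("example:latest")
print(cmd, "boto3" in sys.modules)
```

With this shape, an environment that only builds images (such as the IBM container) can import the module without installing boto3; the dependency is required only on the code path that actually logs in to ECR.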

@nammn changed the title from "use rootless podman" to "CLOUDP-362015 - use rootless podman" Dec 18, 2025