It's not possible to build image using UDI #22914

Closed
svor opened this issue Apr 9, 2024 · 6 comments
Labels
area/dogfooding (Using Eclipse Che to code, test and build Eclipse Che)
area/udi (Issues and PRs related to the universal developer image https://github.com/devfile/developer-images)
kind/bug (Outline of a bug - must adhere to the bug report template.)
severity/P1 (Has a major impact to usage or development of the system.)
team/A (This team is responsible for the Che Operator and all its operands as well as chectl and Hosted Che)

Comments

@svor
Contributor

svor commented Apr 9, 2024

Describe the bug

Currently, it is not possible to run podman commands such as podman build or podman info in UDI.
The error is:
Error: failed to mount overlay for metacopy check with "" options: permission denied

Che version

next (development version)

Steps to reproduce

  1. Start any workspace that uses UDI in its dev component (for example https://github.com/che-samples/web-nodejs-sample)
  2. Run the podman info command in the terminal

Expected behavior

It should be possible to build images in the component that uses UDI

Runtime

OpenShift

Screenshots

[Screenshot of the che-dogfooding instance, taken 2024-04-09 13:28:20]

Installation method

chectl/latest

Environment

Linux

Eclipse Che Logs

No response

Additional context

@svor added the kind/bug, area/udi, area/dogfooding, severity/P1 and team/A labels Apr 9, 2024
@AObuchow

AObuchow commented Apr 9, 2024

It seems that /home/user/.config/containers/storage.conf is missing, while /home/tooling/.config/containers/storage.conf is present. For some reason, the stow command in the Dockerfile does not seem to be creating a symbolic link for storage.conf from /home/tooling/ to /home/user/. EDIT: The stow command in the Dockerfile is working as expected; it's the stow command in the entrypoint that's failing.

What's weird is that running stow . -t /home/user/ -d /home/tooling/ --no-folding -v 2 > /tmp/stow.log 2>&1 in the Che Code terminal will create this missing symbolic link without error (cat'ing /tmp/stow.log shows no errors for stow).

When looking at the GH Actions for the UDI, the stow command in the Dockerfile is not failing:

#57 [50/53] RUN stow . -t /home/user/ -d /home/tooling/ --no-folding
#57 DONE 6.7s

It's probably worth adding -v 2 to the stow command in the Dockerfile to see whether any errors get logged while debugging this, though I doubt there are any.

My current guess is some file ownership/permissions issue is happening with the related storage.conf files.
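
A minimal sketch for checking the link state described above, assuming the paths mentioned in this comment. The function name check_stow_link and its output labels are made up for illustration; they are not part of the UDI or of stow.

```shell
# Hypothetical helper: report whether a file in the user home is a symlink
# managed by stow, given its path relative to the home directory.
# Defaults match the directories named in this issue.
check_stow_link() {
  rel_path=$1
  user_home=${2:-/home/user}
  tooling_home=${3:-/home/tooling}

  if [ ! -e "$tooling_home/$rel_path" ]; then
    echo "source-missing"   # nothing to link from in the tooling directory
  elif [ -L "$user_home/$rel_path" ]; then
    echo "linked"           # a symlink exists in the user home
  elif [ -e "$user_home/$rel_path" ]; then
    echo "not-a-link"       # a regular file or directory is in the way
  else
    echo "missing"          # stow never created the link
  fi
}

check_stow_link .config/containers/storage.conf
```

In a workspace hit by this bug, the call would be expected to print "missing", since the file exists under /home/tooling/ but has no counterpart under /home/user/.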

@AObuchow

AObuchow commented Apr 9, 2024

What's weird is that git blame for the storage.conf lines in the Dockerfile shows they haven't been touched in months (for some lines, in years)...

@dkwon17
Contributor

dkwon17 commented Apr 10, 2024

I'm able to reproduce the issue for the empty sample workspace if I have the following in CheCluster:

spec:
  devEnvironments:
    storage:
      pvcStrategy: per-workspace
    persistUserHome:
      enabled: true

From my testing, this issue happens for workspaces with CheCode versions after this change: https://github.com/che-incubator/che-code/pull/221/files. I cannot reproduce the issue for CheCode images created before that PR.

Also, storage.conf is not the only file left unlinked; other links are missing as well, such as .kubectl_aliases -> ../tooling/.kubectl_aliases:

~ $ ls -la ~
total 12
drwxrwsr-x. 5 root user 123 Apr 10 19:06 .
drwxrwxr-x. 1 root root  21 Apr  9 11:13 ..
-rw-r--r--. 1 user user 141 Apr 10 19:06 .bash_profile
-rw-r--r--. 1 user user 376 Apr 10 19:06 .bashrc
drwxr-sr-x. 3 user user  24 Apr 10 19:06 .config
drwxr-sr-x. 2 user user  20 Apr 10 19:06 .kube
drwx--S---. 3 user user  19 Apr 10 19:06 .local
-rw-r--r--. 1 user user   0 Apr 10 19:06 .stow_completed
-rw-r-----. 1 user user 532 Apr 10 19:06 .viminfo
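
To get the full list of missing links rather than spotting them in the ls output, something like the following could be used. It is only a sketch: list_unlinked is a hypothetical helper, it assumes absolute directory paths, and it only looks at regular files in the tooling directory, which is roughly what stow links with --no-folding.

```shell
# Hypothetical helper: print every regular file under the tooling directory
# that has no symlink at the same relative path in the user home.
# Both arguments are expected to be absolute paths.
list_unlinked() {
  tooling_home=${1:-/home/tooling}
  user_home=${2:-/home/user}

  # Nothing to report when the tooling directory does not exist.
  [ -d "$tooling_home" ] || return 0

  (
    cd "$tooling_home" || exit 1
    find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      rel=${f#./}
      # Print the relative path only when the user home lacks a symlink for it.
      [ -L "$user_home/$rel" ] || printf '%s\n' "$rel"
    done
  )
}

list_unlinked /home/tooling /home/user
```

An empty result would mean every file in the tooling directory has a link in the user home.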

@AObuchow

Thank you for the valuable info @dkwon17 🙏
David and I made some more findings regarding this bug:

As mentioned, persistUserHome needs to be enabled in the CheCluster CR to reproduce this. When this feature is enabled, stow is run from the UDI's entrypoint.

If you check the contents of /tmp/stow.log, you can see that the stow command in the entrypoint is currently failing:

LINK: .gitconfig => ../tooling/.gitconfig
Planning stow of package .... done
Processing tasks...
stow: ERROR: Could not create directory: .local (File exists)

Upon inspection, it seems that /home/user/.local/share/containers/storage/ is being created and populated at some point during workspace startup. If you run quay.io/devfile/universal-developer-image:latest as a standalone container in Docker, there's no /home/user/.local/share/ directory.

Because /home/user/.local/share/... is non-empty, stow aborts when run in the entrypoint, causing other important files to not be symbolically linked, such as .kubectl_aliases and /home/user/.config/containers/storage.conf.

As a temporary workaround for this bug, you can do: rm -rf .local/share/ && rm /home/user/.stow_completed && /entrypoint.sh. Afterwards, running podman info should work as expected.

I don't think the issue is coming from the UDI itself. My guess is this is CheCode related: at some point during the workspace bootstrap process /home/user/.local/share/... is being populated.

I also noticed that if you rm -rf .local/share/ and then run podman info (or podman build, and probably other podman commands), /home/user/.local/share/containers/storage is created again. So it's possible that a recent change to Che Code executes a podman command, which causes this bug.
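
A rough way to tell whether a workspace is already in that state is to look at the storage directory directly. The sketch below assumes the path from this comment; podman_storage_state and its three output labels are hypothetical names used only for illustration.

```shell
# Hypothetical helper: report whether podman's per-user storage directory
# exists under the given home directory and whether it has any content.
podman_storage_state() {
  user_home=${1:-/home/user}
  storage_dir=$user_home/.local/share/containers/storage

  if [ ! -d "$storage_dir" ]; then
    echo "absent"      # no podman command has created the directory yet
  elif [ -n "$(ls -A "$storage_dir" 2>/dev/null)" ]; then
    echo "populated"   # the directory has content, as observed in this bug
  else
    echo "empty"
  fi
}

podman_storage_state /home/user
```

Running this right after workspace startup, before typing any podman command in the terminal, would show whether something in the bootstrap process has already populated the directory.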

@AObuchow

This issue should be resolved now that devfile/devworkspace-operator#1251 is merged. Once the nightly build of DevWorkspace Operator is live on the dogfooding instance, verification should be done to ensure this issue is no longer reproducible.

@AObuchow

Now that devfile/devworkspace-operator#1251 & devfile/developer-images#173 are merged, this issue seems to finally be resolved. Great work @dkwon17!

