Sudo permission issue for cuquantum-appliance:23.10 container #125

Open
namehta4 opened this issue Mar 13, 2024 · 7 comments
Comments

@namehta4

Hi All,

I am trying to use cuquantum-appliance:23.10 with shifter on the NERSC Perlmutter system.
I am facing the following sudo/permission issue with this container:

namehta4@perlmutter:login36:~> salloc -N 1 -G 4 -C gpu -t 120 -c 64 -A nstaff -q interactive --image=nvcr.io/nvidia/cuquantum-appliance:23.10
salloc: Granted job allocation 22896843
salloc: Waiting for resource configuration
salloc: Nodes nid200432 are ready for job
namehta4@nid200432:~> shifter /bin/bash
(base) namehta4@nid200432:~$ cd /home/cuquantum/
bash: cd: /home/cuquantum/: Permission denied
(base) namehta4@nid200432:~$ sudo cd /home/cuquantum
sudo: The "no new privileges" flag is set, which prevents sudo from running as root.
sudo: If sudo is running in a container, you may need to adjust the container configuration to disable the flag.

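Without sudo, the failing cd can still be diagnosed from inside the shifter shell; a minimal sketch (commands only, assuming the same session as above):

id                                       # uid/gid the shifter session is mapped to
ls -ld /home/cuquantum                   # owner, group, and mode bits on the directory
stat -c '%a %U:%G %n' /home/cuquantum    # the same information in numeric form
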
As far as I know, this is a new issue, as the behavior differs from the previous image (23.03):

namehta4@perlmutter:login36:~> salloc -N 1 -G 4 -C gpu -t 120 -c 64 -A nstaff -q interactive --image=nvcr.io/nvidia/cuquantum-appliance:23.03
salloc: Pending job allocation 22896859
salloc: job 22896859 queued and waiting for resources
salloc: job 22896859 has been allocated resources
salloc: Granted job allocation 22896859
salloc: Waiting for resource configuration
salloc: Nodes nid200436 are ready for job
namehta4@nid200436:~> shifter /bin/bash
(base) namehta4@nid200436:~$ cd /home/cuquantum/
(base) namehta4@nid200436:/home/cuquantum$ ls
LICENSE  conda	examples

May I please ask for your help in resolving this issue?

Thank you!
Neil Mehta

@erinaldiq

@namehta4 The 23.06 image does not have the sudo permission issue. It seems that the commands used with the 23.03 image also work on 23.06.

@haidarazzam
Collaborator

Dear @namehta4
was this issue resolved for you?
Thanks

@namehta4
Author

namehta4 commented Apr 1, 2024

Hi @haidarazzam, no, the issue still persists:

namehta4@perlmutter:login25:~> salloc -N 1 -G 4 -C gpu -t 120 -c 64 -A nstaff -q interactive --image=nvcr.io/nvidia/cuquantum-appliance:23.10-devel-ubuntu22.04
salloc: Pending job allocation 23801046
salloc: job 23801046 queued and waiting for resources
salloc: job 23801046 has been allocated resources
salloc: Granted job allocation 23801046
salloc: Waiting for resource configuration
salloc: Nodes nid001180 are ready for job
namehta4@nid001180:~> shifter /bin/bash
(base) namehta4@nid001180:~$ cd /home/cuquantum/
bash: cd: /home/cuquantum/: Permission denied

@mtjrider
Collaborator

mtjrider commented Apr 1, 2024

--image=nvcr.io/nvidia/cuquantum-appliance:23.10-devel-ubuntu22.04

Can you confirm if the issue exists with --image=nvcr.io/nvidia/cuquantum-appliance:23.10-devel-ubuntu20.04?

@namehta4
Author

namehta4 commented Apr 1, 2024

Hi @mtjrider,
Ha! The issue seems resolved. Thank you!

namehta4@perlmutter:login25:~> salloc -N 1 -G 4 -C gpu -t 120 -c 64 -A nstaff -q interactive --image=nvcr.io/nvidia/cuquantum-appliance:23.10-devel-ubuntu20.04
salloc: Pending job allocation 23801343
salloc: job 23801343 queued and waiting for resources
salloc: job 23801343 has been allocated resources
salloc: Granted job allocation 23801343
salloc: Waiting for resource configuration
salloc: Nodes nid001249 are ready for job
namehta4@nid001249:~> shifter /bin/bash
(base) namehta4@nid001249:~$ cd /home/cuquantum/
(base) namehta4@nid001249:/home/cuquantum$ cd conda/envs/cuquantum-23.10/bin/
(base) namehta4@nid001249:/home/cuquantum/conda/envs/cuquantum-23.10/bin$ ./python
Python 3.10.13 | packaged by conda-forge | (main, Oct 26 2023, 18:07:37) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cupy
>>> import cusvaer
>>> exit()

@erinaldiq, I will add ipykernel etc. to this base image (along the lines sketched below) and upload it ASAP. Please test it out in roughly an hour or so.

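Roughly what I have in mind (a sketch, not the exact recipe: it assumes docker is available on a build host, that the env's python ships pip, and the registry/tag names here are placeholders):

cat > Dockerfile.ipykernel <<'EOF'
FROM nvcr.io/nvidia/cuquantum-appliance:23.10-devel-ubuntu20.04
RUN /home/cuquantum/conda/envs/cuquantum-23.10/bin/python -m pip install ipykernel
EOF
docker build -t my-registry/cuquantum-appliance:23.10-ipykernel -f Dockerfile.ipykernel .
docker push my-registry/cuquantum-appliance:23.10-ipykernel
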
Thank you again @mtjrider and @haidarazzam

@mtjrider
Collaborator

mtjrider commented Apr 1, 2024

> Hi @mtjrider, Ha! The issue seems resolved. Thank you! […]

Great. This means the root cause is a change in the default file permissions for the home directory under Ubuntu 22.04.
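For context, a sketch of what that change means in practice (assuming the Ubuntu 21.04+ adduser default of private, 0750 home directories; I have not re-checked the exact image recipes here):

# Ubuntu 20.04 bases create /home/cuquantum with mode 755 (drwxr-xr-x), so any uid can enter it.
# Ubuntu 21.04 and later (including 22.04) default to 750 (drwxr-x---), so a shifter user running
# under their own uid, outside the cuquantum group, is denied.
ls -ld /home/cuquantum     # expected to show drwxr-x--- on the 22.04-based image if that default applies
# A derived image (or a future appliance release) could restore the old behaviour with:
chmod 755 /home/cuquantum
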
Thanks for reporting this.

@namehta4
Author

Adding for posterity: this issue is also observed in the cuda_quantum:0.6 image. Would this have to be filed separately?
