Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure bitsandbytes handles permission errors in the right order #622

Merged

Conversation

fozziethebeat
Copy link
Contributor

When trying to run bitsandbytes on GCP, I ran into an unexpected permission error during cuda driver lookup

  File "/opt/conda/envs/cubrio-trainer/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py", line 201, in remove_non_existent_dirs
    raise exc
  File "/opt/conda/envs/cubrio-trainer/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py", line 197, in remove_non_existent_dirs
    if path.exists():
  File "/opt/conda/envs/cubrio-trainer/lib/python3.9/pathlib.py", line 1424, in exists
    self.stat()
  File "/opt/conda/envs/cubrio-trainer/lib/python3.9/pathlib.py", line 1232, in stat
    return self._accessor.stat(self)
PermissionError: [Errno 13] Permission denied: '/root/google_vm_config.lock'

I looked at the code and everything appeared correct until I discovered that PermissionError is a sub-type of OSError and thus the first except clause would catch any permission errors. This fixes that by swapping the except clauses.

@RemiLacroix-IDRIS
Copy link

RemiLacroix-IDRIS commented Jul 24, 2023

I just ran into this, I was about to open the same PR. I can confirm that @fozziethebeat's fix is correct.

@limjiayi
Copy link

I can also confirm that this is the right fix, can this be merged soon?

@xaptronic
Copy link
Contributor

Hmm - this is my fix for the same problem: #715

Could ignoring permission errors mask legit errors?

@fozziethebeat
Copy link
Contributor Author

I've only had this problem on Google VMs, so your fix looks just as good to me. I figured since the code is already written this way, ordering it to work as intended should be fine.

@RemiLacroix-IDRIS
Copy link

Could ignoring permission errors mask legit errors?

I doubt it, the goal of that function is to remove paths that cannot be used to find the CUDA libs anyway.

@madhavthaker1
Copy link

madhavthaker1 commented Sep 29, 2023

I'm also running into the same error in my google VM. Are there plans to merge this fix?

@fozziethebeat
Copy link
Contributor Author

As far as I can tell, this repo has been frozen since early August. looks like this won't be getting merged anytime soon.

@asaiacai
Copy link

This also fixes my issue on GCP. Will this be merged soon?

Copy link

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@RemiLacroix-IDRIS
Copy link

This definitely needs to be addressed.

@TimDettmers
Copy link
Owner

Sorry for the delay on this and thank you for your contribution!

@TimDettmers TimDettmers merged commit 6ba3e62 into TimDettmers:main Jan 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants