Unable to read the namespace properly #5

Closed
junzhin opened this issue Jan 7, 2023 · 17 comments

@junzhin

junzhin commented Jan 7, 2023

I ran the command below, as specified in the README, but I keep running into this exception.
Does anyone know why?
python tools/run.py test --config configs/cityscapes_acdc/refign_daformer.yaml --ckpt_path ./pretrained/refign_daformer_acdc.ckpt --trainer.gpus 1

image

@brdav
Owner

brdav commented Jan 16, 2023

Did you set up the environment using "pip install -r requirements.txt"?

@junzhin
Author

junzhin commented Jan 16, 2023

Yes, I did. I created a new env and ran pip install in that environment.

@junzhin
Author

junzhin commented Jan 16, 2023

This error message happens every time I run the demo described in the readme.

@brdav
Owner

brdav commented Jan 16, 2023

The problem is that it can't find the correlation.cpp file. In your case it seems to be looking for it in a strange directory, so I think there is a problem with your environment installation.
Can you post the output of "pip list" as well as the content of the file "refign.egg-info/SOURCES.txt"?

@junzhin
Author

junzhin commented Jan 16, 2023

I am currently setting up the env on my university's HPC cluster, which has an A100 GPU.

image

image

This was the first problem, which occurred when installing torch. Installing a previous version as described at "https://pytorch.org/get-started/previous-versions/" seems to solve it.

I have attached the output of "pip list" and the SOURCES.txt file below:
piplist.txt
SOURCES.txt

Since last time, I have rebuilt the env because of limited space in my cloud storage, and this time a slightly different error message appeared. (The information above relates to this rebuilt env.)

image

@brdav
Owner

brdav commented Jan 16, 2023

I think I can't help you with the second error, since that seems to be hardware-related.

But regarding your original error, could you print the 'cwd' variable here?

cwd = os.path.dirname(os.path.realpath(__file__))

It's possible that the cluster changed the working directory, which means that the wrong paths are written into 'sources' there.
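To make that check concrete, here is a minimal sketch of how source paths for a JIT-compiled extension can be derived from the module's own location and verified before compilation. Only correlation.cpp is named in this thread; the helper name resolve_sources is hypothetical and this is not the repository's actual code.

```python
import os


def resolve_sources(module_file, names):
    """Build absolute source paths next to module_file and list missing ones.

    resolve_sources is a hypothetical helper used for illustration only.
    """
    # Same expression as in the comment above: the directory of the real
    # (symlink-resolved) file, independent of the process working directory.
    cwd = os.path.dirname(os.path.realpath(module_file))
    sources = [os.path.join(cwd, name) for name in names]
    # Any path listed here would make the compilation step fail.
    missing = [path for path in sources if not os.path.isfile(path)]
    return cwd, sources, missing


# Usage inside the module that builds the extension (sketch):
# cwd, sources, missing = resolve_sources(__file__, ["correlation.cpp"])
# print("cwd:", cwd, "missing:", missing)
```

If missing is non-empty, the printed paths show which directory the build is actually looking in, which is the information requested here.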

@junzhin
Author

junzhin commented Jan 17, 2023

Here is the value of the 'cwd' variable when running the command "python tools/run.py fit --config configs/cityscapes_acdc/refign_hrda_star.yaml --trainer.gpus 1 --trainer.precision 16":
image

Regarding the second error related to "compute 80", could you suggest what might be causing it? Is it caused by CUDA version 11.7 not matching the torch version installed in this env?

With my HPC account, I do not think I can change the CUDA toolkit version. Is there any way to bypass this error?

Please correct me if I am wrong, as I am very new to this area. Thank you!

@brdav
Owner

brdav commented Jan 26, 2023

The 'cwd' seems to point to the correct directory. Unfortunately I can't give you answers to your questions, since such setup issues can be very platform-dependent.

@BruceLin30

I also encountered the same problem, did you solve it?

@junzhin
Author

junzhin commented Feb 7, 2023

> I also encountered the same problem, did you solve it?

Not yet! Do you have any idea why this happened?

@zyuanbing

zyuanbing commented Feb 8, 2023

> Here is the value of 'cwd' variable when running "python tools/run.py fit --config configs/cityscapes_acdc/refign_hrda_star.yaml --trainer.gpus 1 --trainer.precision 16" command
> image
>
> For the second error related to "compute 80", could you give a suggestion about what is the reason behind it? Is this caused by the incorrect cuda version 11.7 matched with the torch installed in this env?
>
> For my hpc account, I do not think I am able to change the version of cuda toolkit version, is there any way to bypass this error?
>
> Please Correct me if I am wrong and I am very new to this area, Thank u!

Have you checked your nvcc version by running "/usr/local/cuda/bin/nvcc -V"?
It seems your nvcc version is lower than CUDA Toolkit 11.0, according to this.
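As a rough illustration of that check: compute capability 8.0 targets (the A100) require CUDA Toolkit 11.0 or newer, so the release number printed by nvcc can be compared against (11, 0). The sketch below parses that output; the function names are made up for this example, and the sample strings in the comments are only typical of what nvcc prints.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc -V`."""
    # nvcc prints a line such as:
    #   Cuda compilation tools, release 11.7, V11.7.64
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def supports_compute_80(nvcc_output):
    """Return True if this nvcc is new enough to target compute_80 (A100)."""
    return parse_nvcc_release(nvcc_output) >= (11, 0)
```

The input text can be obtained with subprocess.run(["/usr/local/cuda/bin/nvcc", "-V"], capture_output=True, text=True).stdout. On a cluster, the nvcc on the PATH may differ from the CUDA version that torch was built against, so both are worth checking.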

@brdav
Owner

brdav commented Feb 16, 2023

I'm closing the issue since it does not seem to be a problem with this code.

@brdav brdav closed this as completed Feb 16, 2023
@junzhin
Author

junzhin commented Feb 16, 2023

Thank you for your help, I will try to resolve this problem first and see if there are any other issues!

@brdav
Owner

brdav commented Feb 16, 2023

FYI, I just added the option of installing the CUDA extension beforehand (see the README update), in which case no JIT compilation is needed. This could potentially help with your original error.
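For readers unfamiliar with the idea, the general pattern of preferring a pre-installed extension and compiling at runtime only as a fallback might look like the sketch below. This is an assumption about the pattern in general, not the repository's code: load_extension and the module name in the usage comment are hypothetical, and the fallback is assumed to wrap a JIT call such as torch.utils.cpp_extension.load.

```python
import importlib
import importlib.util


def load_extension(module_name, jit_fallback):
    """Import a pre-installed extension if present, else use the JIT fallback.

    module_name is expected to be a top-level module name.
    jit_fallback is a zero-argument callable that compiles and returns the
    extension; it is only called when no installed module is found.
    """
    if importlib.util.find_spec(module_name) is not None:
        return importlib.import_module(module_name)
    return jit_fallback()


# Usage (hypothetical module name and fallback):
# ext = load_extension("correlation_ext", lambda: compile_with_jit())
```

With the extension installed beforehand, the fallback is never called, so the source path lookup that failed in the original error is skipped entirely.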

@JCongLang

> Thank you for your help, I will try to resolve this problem first and see if there are any other issues!

I also encountered the same problem, did you solve it?

@junzhin
Author

junzhin commented Jun 14, 2023

> > Thank you for your help, I will try to resolve this problem first and see if there are any other issues!
>
> I also encountered the same problem, did you solve it?

In my case, the CUDA version was the key.

@JCongLang

> > > Thank you for your help, I will try to resolve this problem first and see if there are any other issues!
> >
> > I also encountered the same problem, did you solve it?
>
> In my case, cuda version is the key

Thank you for your reply. I have solved the problem; in my case it was caused by the gcc version.


5 participants