Isonet support NVIDIA A100 #26

Open
cron-weasley opened this issue Apr 16, 2022 · 5 comments

Comments

@cron-weasley

Dear Author,
First, thanks a lot for this powerful software!
I have a question: does IsoNet support CUDA 11.2 and the NVIDIA A100 with tensorflow-gpu 2.7?
I installed IsoNet in a conda environment with Python 3.9, tensorflow-gpu 2.7 and CUDA 11.2 on NVIDIA A100 GPUs, and I get this error:

(py39) [root@Isonet]$ isonet.py refine subtomo.star --gpuID 0,1,2,3 --iterations 30 --noise_start_iter 10,15,20,25 --noise_level 0.05,0.1,0.15,0.2
04-16 22:14:09, INFO
######Isonet starts refining######

04-16 22:14:27, INFO Note: detected 128 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
04-16 22:14:27, INFO Note: NumExpr detected 128 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
04-16 22:14:27, INFO NumExpr defaulting to 8 threads.
04-16 22:14:30, WARNING The results folder already exists before the 1st iteration
The old results folder will be renamed (to results~)
04-16 22:14:50, INFO Done preperation for the first iteration!
04-16 22:14:50, INFO Start Iteration1!
/data1/apps/miniconda3/envs/py39/lib/python3.9/site-packages/keras/optimizer_v2/adam.py:105: UserWarning: The lr argument is deprecated, use learning_rate instead.
super(Adam, self).__init__(name, **kwargs)
/data1/apps/miniconda3/envs/py39/lib/python3.9/site-packages/keras/engine/functional.py:1410: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
layer_config = serialize_layer_fn(layer)
04-16 22:14:54, INFO Noise Level:0.0
2022-04-16 22:15:06.826516: F tensorflow/stream_executor/cuda/cuda_driver.cc:153] Failed setting context: CUDA_ERROR_NOT_INITIALIZED: initialization error

Thanks a lot!
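
For anyone hitting the same `CUDA_ERROR_NOT_INITIALIZED` message, a quick first check is whether TensorFlow can see the GPUs at all and which CUDA version it was built against. The following is a minimal sketch, not part of IsoNet; the helper name `describe_tensorflow_gpu_setup` is hypothetical, and it assumes a TensorFlow release that provides `tf.sysconfig.get_build_info()` (2.3 or newer).

```python
def describe_tensorflow_gpu_setup():
    """Return TensorFlow's version, its CUDA/cuDNN build versions and visible GPUs.

    Returns None when TensorFlow cannot be imported in this environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # CPU-only builds do not carry the CUDA keys, hence .get() below.
    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow": tf.__version__,
        "cuda_build": build.get("cuda_version"),
        "cudnn_build": build.get("cudnn_version"),
        "gpus": [gpu.name for gpu in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    print(describe_tensorflow_gpu_setup())
```

An empty `gpus` list, or a CUDA build version that does not match the installed toolkit, would point to an environment problem rather than to IsoNet itself.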

@procyontao
Collaborator

Hi,

We have only tested TensorFlow 2.5 and Python 3.6 on A100 GPUs. IsoNet works fine on the A100 with that combination.

Could you run the command with the parameter "--log_level debug" and see what the error message is?

@cron-weasley
Author

> We have only tested TensorFlow 2.5 and Python 3.6 on A100 GPUs. IsoNet works fine on the A100 with that combination.
>
> Could you run the command with the parameter "--log_level debug" and see what the error message is?

Dear procyontao,
Thanks a lot!
Could you also tell me which CUDA version and which NVIDIA driver you used with the A100?

@cron-weasley
Author

I solved the problem.
Thanks, procyontao!
I used Python 3.8 with Miniconda and first installed:
pip install imageio==2.10.5 numpy==1.19.2
Then I installed:
pip install tensorflow-gpu==2.5.0
pip install -r requirements.txt
After that, IsoNet ran successfully.
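
To confirm that an environment actually ended up with the versions listed above, a small check along these lines may help. This is a sketch using only the standard library (`importlib.metadata`, Python 3.8+); the pins come from this comment and are not an official IsoNet requirement list, and `find_mismatches` is a hypothetical helper name.

```python
from importlib import metadata

# Versions reported as working in the comment above (an assumption, not an
# official IsoNet requirement list).
PINS = {"tensorflow-gpu": "2.5.0", "numpy": "1.19.2", "imageio": "2.10.5"}


def _installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=_installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

Running it inside the conda environment prints nothing when all three pins match.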

@proteincommandr

proteincommandr commented May 18, 2022

Hi all,
I also required TensorFlow v2.6.0 to run IsoNet properly on A100 GPUs.
I figured it would be nice to provide some GPU benchmarks for the tutorial dataset.

I used TF 2.6.0, CUDA 11.6 and IsoNet 0.1 on the three tutorial tomograms:

| System     | Time/step | Speedup |
|------------|-----------|---------|
| 2x2070S    | 950 ms    | 1       |
| 4x2080TI   | 700 ms    | ~1.4    |
| 4x1080TI   | 900 ms    | ~1      |
| 1xRTX8000  | 1000 ms   | ~1      |
| 2xRTX8000  | 700 ms    | ~1.3    |
| 4xA100     | 270 ms    | ~3.5    |

Cheers

@procyontao
Collaborator

Hi proteincommandr,

Thank you for providing the GPU benchmarks. We cannot even afford that many types of GPU for a speed test.

One question: did you account for the differences in batch_size (i.e. the number of subtomograms processed in one step)?
The default relation between the number of GPUs and batch_size is listed below:
| nGPUs | batch_size |
|-------|------------|
| 1     | 4          |
| 2     | 4          |
| 3     | 6          |
| 4     | 8          |
| 5     | 10         |
| 6     | 12         |
| 7     | 14         |
| 8     | 16         |
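
The defaults listed above follow a simple pattern: two subtomograms per GPU, with a floor of four. A minimal sketch that reproduces the list is shown here; it is inferred from these values only and is not necessarily how IsoNet computes the default internally. The function name `default_batch_size` is hypothetical.

```python
def default_batch_size(n_gpus):
    """Reproduce the listed defaults: 2 subtomograms per GPU, at least 4 in total."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return max(4, 2 * n_gpus)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, default_batch_size(n))
```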

If you do not specify the batch size in your command, your list should be:
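| System     | Time/step | Speedup | batch_size | Speedup per subtomogram |
|------------|-----------|---------|------------|-------------------------|
| 2x2070S    | 950 ms    | 1       | 4          | 0.5                     |
| 4x2080TI   | 700 ms    | ~1.4    | 8          | 1.4                     |
| 4x1080TI   | 900 ms    | ~1      | 8          | 1                       |
| 1xRTX8000  | 1000 ms   | ~1      | 4          | 0.5                     |
| 2xRTX8000  | 700 ms    | ~1.3    | 4          | 0.65                    |
| 4xA100     | 270 ms    | ~3.5    | 8          | 3.5                     |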
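
The per-subtomogram column above appears to be the reported per-step speedup scaled by batch_size / 8, i.e. normalized to the four-GPU default batch size. A short sketch of that arithmetic follows, using the rounded speedups from this thread; the reference batch of 8 is inferred from the numbers rather than stated, and the function names are hypothetical.

```python
REFERENCE_BATCH = 8  # assumption: inferred from the listed values, not stated

# (system, time per step in ms, reported speedup, default batch_size)
BENCHMARKS = [
    ("2x2070S", 950, 1.0, 4),
    ("4x2080TI", 700, 1.4, 8),
    ("4x1080TI", 900, 1.0, 8),
    ("1xRTX8000", 1000, 1.0, 4),
    ("2xRTX8000", 700, 1.3, 4),
    ("4xA100", 270, 3.5, 8),
]


def speedup_per_subtomo(speedup, batch_size, reference_batch=REFERENCE_BATCH):
    """Scale a per-step speedup by how many subtomograms each step processes."""
    return speedup * batch_size / reference_batch


def ms_per_subtomo(time_per_step_ms, batch_size):
    """Wall-clock milliseconds spent per subtomogram within one step."""
    return time_per_step_ms / batch_size


if __name__ == "__main__":
    for system, ms, speedup, batch in BENCHMARKS:
        print(system, speedup_per_subtomo(speedup, batch), ms_per_subtomo(ms, batch))
```

Comparing `ms_per_subtomo` directly avoids the rounding in the reported speedups.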


3 participants