We have a GPU cluster of four nodes, each with four 2080Ti GPUs. The CPU is E5-2650 (2x12 cores) and physical memory is 256G. We have installed NVIDIA driver 410.93, CUDA 9.2, and NCCL 2.4.2 for THUNDER. We are trying to run THUNDER on this cluster and have several questions.
How should we specify processes and threads? We read that on a cluster we should run one process per node and set the thread number to the number of CPU cores. However, when we want to run THUNDER on only 2 nodes (the other two are in use by others), we found that mpirun -np 2 does not work. Would more processes speed up the job, or do only more threads speed it up? Please also suggest thread settings for running THUNDER on 2 nodes.
We want to run a benchmark to check that our THUNDER installation has no problems. I am currently using the RELION benchmark data (EMPIAR 10028 ribosome, 51G, ~10K particles). How long should THUNDER take to process such a dataset on one of our nodes? I also found a THUNDER benchmark dataset on GitHub, but I cannot download it. Should I use that dataset to benchmark THUNDER instead?
Our cluster has SSD scratch on each machine, not shared. In RELION and cryoSPARC v2, we can point the scratch directory at the local scratch. However, I could not find where to set or enable local scratch in THUNDER. Can I use local scratch? If so, how?
I found that THUNDER copies my benchmark data into physical memory. However, when there are millions of particles, it runs out of physical memory and the job fails (this happened on our old workstation, which has only 128G, when running the EMPIAR 10028 particles). If I do not want the particles loaded into physical memory, what should I do?
Thanks!
Sorry, currently, there is no local scratch support in THUNDER.
Currently, THUNDER reads all particles into physical memory. We are working on a memory buffer system that will load particles into physical memory only when needed, and we hope to release this feature soon. For now, a workaround for this "out of memory" issue is to use swap: configuring a larger swap partition should resolve it.
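To get a rough idea of how large that swap partition needs to be, the arithmetic can be sketched as below. This is only an estimate under stated assumptions, not a description of THUNDER's actual memory use: it assumes particles are held as 32-bit floats, that the in-memory footprint is roughly the raw stack size, and that a 20% margin covers everything else the job allocates. The function names, the margin, and the example dataset are all hypothetical.

```python
GIB = 1024 ** 3


def estimate_stack_bytes(n_particles: int, box_size: int, bytes_per_pixel: int = 4) -> int:
    """Raw size of a square-image particle stack held fully in memory.

    Assumes 32-bit float pixels by default; headers and working copies are ignored.
    """
    return n_particles * box_size * box_size * bytes_per_pixel


def swap_shortfall_bytes(stack_bytes: int, ram_bytes: int, headroom_percent: int = 20) -> int:
    """Swap needed so that the stack plus a safety margin fits in RAM + swap.

    headroom_percent is an assumed margin, not a measured THUNDER figure.
    """
    needed = stack_bytes * (100 + headroom_percent) // 100
    return max(0, needed - ram_bytes)


if __name__ == "__main__":
    # Hypothetical dataset: 1,000,000 particles of 256 x 256 pixels.
    stack = estimate_stack_bytes(1_000_000, 256)
    shortfall = swap_shortfall_bytes(stack, 128 * GIB)
    print(f"particle stack: {stack / GIB:.1f} GiB")
    print(f"swap needed on a 128 GiB machine: {shortfall / GIB:.1f} GiB")
```

Once a target size is known, the swap space itself is set up with the usual Linux tools (`mkswap` and `swapon`). Expect a job that relies heavily on swap to run much more slowly than one that fits in RAM.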