How to use Torvalds server
A new server has been set up called torvalds with the following specifications:
32 CPUs, 256 GB RAM and 8 Nvidia GeForce GTX 1080 Ti GPUs
Ubuntu 18.04.1 LTS (Bionic Beaver)
If you want to use it, send an email to Adrian to request an account.
- Disks and quotas
Your /home directory is stored on an SSD.
Your /data directory is stored on an HDD. You should have a link /home/username/data -> /data/username.
To check your quotas, current usage and limits, run:
$ /usr/sbin/xfs_quota -x -c 'report -h' 2>/dev/null
- Use of GPUs
Users don't have direct access to the GPUs. Instead, there is a user-gpu account that does have access to the GPUs through the queue system (Slurm).
You must use the wrapper gpu to run any slurm command as user-gpu. Example:
$ gpu sbatch slurm.job
$ gpu squeue
$ gpu scancel 123
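The slurm.job file in the sbatch example is whatever batch script you want to run. A minimal sketch, assuming the default partition; the job name, GPU count and output pattern below are illustrative placeholders, not required values:

```shell
# Write a minimal, illustrative Slurm batch script. The #SBATCH
# directives are placeholders -- adapt them to your actual job.
cat > slurm.job <<'EOF'
#!/bin/bash
#SBATCH --job-name=my-gpu-job
#SBATCH --gres=gpu:1
#SBATCH --output=slurm-%j.out
nvidia-smi
EOF
```

You would then submit it with `$ gpu sbatch slurm.job` and monitor it with `$ gpu squeue`.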
Nvidia driver version 390.77 is installed.
CUDA toolkits 8.0 (the default) and 9.1 are installed in /usr/local.
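To build or run against the non-default toolkit, prepend it to your environment. The cuda-9.1 directory name below is an assumption based on the usual installer layout under /usr/local; check what actually exists there:

```shell
# Put CUDA 9.1 (assumed install prefix) first in the search paths
export CUDA_HOME=/usr/local/cuda-9.1
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:${LD_LIBRARY_PATH:-}"
```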
- Scipion in torvalds
There is a general Scipion installation in /usr/local/scipion.
It has been compiled with CUDA=True and OpenCV (some changes required).
Packages installed are Gautomatch 0.53, Gctf 1.06, bsoft 1.9.0, chimera 1.10.1, ctffind4 4.1.10, eman 2.12, frealign 9.07, gEMpicker 1.1, motioncor2 1.1.0, relion 2.1, resmap 1.1.5s2, spider 21.13.
Gautomatch, Gctf, motioncor2, relion and xmipp work with CUDA 8. gEMpicker does not work because it needs CUDA 7.5 or below, which is not installed. Remember to adapt your $HOME/.config/scipion/scipion.conf variables to point to the right CUDA and binaries.
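The relevant scipion.conf entries might look like the fragment below; the variable names and paths are illustrative assumptions, so check the comments in your own config file for the exact keys your Scipion version uses:

```
# Illustrative fragment of $HOME/.config/scipion/scipion.conf;
# keys and paths are assumptions, verify against your own file.
CUDA_LIB = /usr/local/cuda-8.0/lib64
CUDA_BIN = /usr/local/cuda-8.0/bin
```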
- Installing your own Scipion
If you want to install your own Scipion you need some changes:
Scipion must use queues. Otherwise, jobs will not work (errors like “no CUDA available” will appear). The main Scipion in Torvalds (/usr/local/scipion/) has already been configured to use the gpu wrapper. Copy /usr/local/scipion/config/hosts.conf to your own installation.
Sqlite files must have 664 permissions (so user and user-gpu can both access those files).
You need to modify the following in pyworkflow/mapper/sqlite_db.py (check with a diff on /usr/local/scipion if you are not sure):
+import os
 from pyworkflow.utils import envVarOn
@@ -50,6 +51,10 @@ class SqliteDb():
             self.connection = self.OPEN_CONNECTIONS[dbName]
         else:
             self.connection = sqlite.Connection(dbName, timeout, check_same_thread=False)
+            try:
+                os.chmod(dbName, 0664)
+            except Exception as exc:
+                print("cannot set permission", dbName, exc)
             self.connection.row_factory = sqlite.Row
             self.OPEN_CONNECTIONS[dbName] = self.connection
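The effect of the patch can be checked on its own: mode 664 leaves a file readable and writable by both owner and group, which is what lets user and user-gpu share the sqlite files. A standalone sketch:

```shell
# Demonstrate that chmod 664 yields user/group read-write (rw-rw-r--)
f=$(mktemp)
chmod 664 "$f"
stat -c '%a %A' "$f"    # prints: 664 -rw-rw-r--
rm -f "$f"
```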
Change the MPI paths in your scipion/config/scipion.conf to the right ones:
#MPI_INCLUDE = /usr/lib64/mpi/gcc/openmpi/include
#MPI_BINDIR = /usr/bin
MPI_LIBDIR = /usr/lib/x86_64-linux-gnu/openmpi/lib
MPI_INCLUDE = /usr/lib/x86_64-linux-gnu/openmpi/include
MPI_BINDIR = /usr/bin
To compile Scipion you need to change the gcc and g++ version (6 by default on Ubuntu 18) to 5 in your scipion/config/scipion.conf:
CC = gcc-5
CXX = g++-5
- Using GPUs from Scipion
As said before, the use of GPUs is restricted to the queue system, so to execute a protocol that requires a GPU the 'Use queue' option must be selected.
For Motioncor2, Gctf and Gautomatch Scipion gives the possibility to select the GPU devices to be used. You should select them as follows:
To select 1 GPU: 0
To select 2 GPUs: 0 1
To select 3 GPUs: 0 1 2
and so forth.
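The device list is simply the indices 0..n-1 separated by spaces; if you want to double-check the syntax for a given count, it can be generated with seq:

```shell
# Indices for n GPUs, space separated (seq is part of GNU coreutils)
n=4
seq -s ' ' 0 $((n-1))    # prints: 0 1 2 3
```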
For Relion protocols the GPU parameters are on the Additional tab and can either be left empty (in which case Relion will calculate the number of GPUs to use based on the MPI / threads requested) or specified using the same syntax as explained above.
It is not possible to select the specific GPU devices that you want to use; it is the queue system that assigns your job to the available GPU(s), so specifying something like 1 4 6 will fail.
Besides, when the queue dialog appears, you should select the number of GPUs (which should be consistent with whatever was chosen in the protocol). Take into account that this is the number of GPUs that the queue system will reserve for your job, so do not request more than you need.
If you encounter any problem or need some library to be installed, you can contact Laura or Adrian.