CryoSPARC is set up to run on the CURC clusters using VMs built in CUmulus. This setup is rather complex and requires special permissions to implement.
- Make instance from ``cryosparc-base`` snapshot (:ref:`VM`)
- Edit ``fstab`` on VM (:ref:`Mount PL`)
- Delete and reinstall master (:ref:`Cryomaster`)
- Install worker (:ref:`Cryoworker`)
- Set aliases (:ref:`Cryo aliases`)
We will spin up a small VM to run the 'master' instance of CryoSPARC on CURC's CUmulus cloud service. Currently, only the BioKEM IT admin has access to this allocation. We will follow these instructions:
- Go to OpenStack
- Instances > Launch Instance
- Details > Add name
- Source > Ubuntu 20.04 LTS
- Set volume to 16GB
- Flavor > m5.large
- Networks > projectnet2023-private
- Security Groups > hpc-ssh, default, ssh-restricted, icmp, rfc-1918
- Key Pair > add BioKEM global user's RSA key
- Associate Floating IP

  - Pool > scinet-internal
  - Allocate IP
  - Associate
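If the OpenStack command-line client is configured for the CUmulus project, the same launch can be scripted. This is only a rough sketch of the equivalent calls; the key-pair name and any value not listed in the steps above is an assumption to be checked against the web UI::

   # Launch the instance (the key-pair name is an assumption; match it to the BioKEM global user's key)
   openstack server create \
     --image "Ubuntu 20.04 LTS" \
     --flavor m5.large \
     --boot-from-volume 16 \
     --network projectnet2023-private \
     --security-group hpc-ssh --security-group default --security-group ssh-restricted \
     --security-group icmp --security-group rfc-1918 \
     --key-name biokem-global \
     <instance-name>

   # Allocate a floating IP from the scinet-internal pool and attach it
   openstack floating ip create scinet-internal
   openstack server add floating ip <instance-name> <floating-ip>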
In order to submit jobs to Alpine's SLURM environment, we need to install the right version of SLURM, import Alpine's SLURM config, and set up a user that has permission to submit jobs. We will be using a variation of this procedure.
Log on to the VM and install the build dependencies::

   ssh -o KexAlgorithms=ecdh-sha2-nistp521 ubuntu@<IP>
   sudo apt-get update
   sudo apt install -y libmysqlclient-dev libjwt-dev munge gcc make
Check SLURM version (on RC)::

   ml slurm/alpine
   sbatch --version
On VM (make sure to clone the correct SLURM branch)::

   cd /opt
   sudo git clone -b slurm-22.05 https://github.com/SchedMD/slurm.git
   cd slurm
   sudo ./configure --with-jwt --disable-dependency-tracking
   sudo make && sudo make install
   sudo mkdir -p /etc/slurm
   cd /etc/slurm
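As a quick sanity check (assuming ``make install`` used the default ``/usr/local`` prefix, so the binaries are on the PATH), confirm the VM now reports the same SLURM release as RC::

   sbatch --version    # should match the version printed on RC above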
Copy Alpine's ``slurm.conf`` from RC and edit it::

   sudo scp <user>@login.rc.colorado.edu:/curc/slurm/alpine/etc/slurm.conf .
   sudo nano slurm.conf

Confirm the controller entries point at Alpine's controllers::

   ControlMachine=alpine-slurmctl1.rc.int.colorado.edu
   BackupController=alpine-slurmctl2.rc.int.colorado.edu
Edit ``/etc/default/useradd`` and change ``SHELL=/bin/sh`` to ``SHELL=/bin/bash``.
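Equivalently, a one-line edit (a sketch that assumes the stock Ubuntu file, where the ``SHELL=`` line is uncommented)::

   sudo sed -i 's|^SHELL=/bin/sh|SHELL=/bin/bash|' /etc/default/useradd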
Make slurm user and group::

   sudo groupadd -g 515 slurm
   sudo useradd -u 515 -g 515 slurm
Make biokem user and group::

   sudo groupadd -g 2004664 biokempgrp
   sudo useradd -u 2004664 -g 2004664 biokem
   sudo mkdir /home/biokem
   sudo chown -R biokem /home/biokem
   sudo su biokem
   cd
   cp ../ubuntu/.profile .
   cp ../ubuntu/.bashrc .
   source .profile
   mkdir .ssh
   cd .ssh
   touch authorized_keys
In the future, we should add the lab-specific user group (this will also require editing ``fstab``).
Copy over the ``curc.pub`` key into ``authorized_keys``.
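A minimal way to do this (the key itself has to be pasted in by hand; the permissions matter for sshd to accept the file)::

   # As the biokem user on the VM
   nano ~/.ssh/authorized_keys      # paste the contents of curc.pub here
   chmod 700 ~/.ssh
   chmod 600 ~/.ssh/authorized_keys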
Update ``/projects/biokem/software/biokem/users/src/lab_specific/cryosparc_vms.src``.
Now we need to mount the lab's PetaLibrary to the VM, according to CURC's instructions.
Set up directories::

   exit
   sudo apt-get install sshfs
   sudo mkdir -p /pl/active/<lab's PL>
   sudo mkdir -p /pl/active/BioKEM/software/cryosparc/<lab>
   sudo chmod -R o+w /pl
Make key pair on VM::

   ssh-keygen -t ed25519
Add key to biokem on RC
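Concretely (a sketch that assumes the key pair above was saved as ``~/.ssh/cryo``, the path the ``fstab`` entry below expects), append the VM's public key to the ``biokem`` user's ``authorized_keys`` on RC::

   # On the VM: print the public key so it can be copied
   cat ~/.ssh/cryo.pub

   # On RC, as the biokem user: paste that one line into authorized_keys
   nano ~/.ssh/authorized_keys
   chmod 600 ~/.ssh/authorized_keys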
Mount directories through ``fstab``::

   # User lab PL
   biokem@dtn.rc.int.colorado.edu:/pl/active/<lab> /pl/active/<lab> fuse.sshfs defaults,_netdev,allow_other,default_permissions,identityfile=/home/ubuntu/.ssh/cryo,uid=biokem,gid=biokempgrp,reconnect 0 0
If you want to mount manually::

   sudo sshfs -o allow_other,IdentityFile=/home/ubuntu/.ssh/cryo biokem@dtn.rc.int.colorado.edu:/pl/active/<lab> /pl/active/<lab>
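To check the ``fstab`` entry without rebooting (a standard check, nothing CURC-specific)::

   sudo mount -a               # mounts everything listed in fstab; errors appear here
   df -h /pl/active/<lab>      # the sshfs mount should be listed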
Install the 'master' CryoSPARC instance on the VM using their instructions, but we need to make a few important changes for this to work.
Bring in presets::

   sudo su biokem
   cd
   git clone https://github.com/CU-BioKEM/cryosparc_setup.git
   cd cryosparc_setup
   nano license.src             # set: export LICENSE_ID=" "
   mkdir ~/cryosparc
   cd ~/cryosparc
Follow instructions::

   source ../cryosparc_setup/license.src
   curl -L https://get.cryosparc.com/download/master-latest/$LICENSE_ID -o cryosparc_master.tar.gz
   tar -xf *gz
   cd ../cryosparc_setup
Edit ``run_installer.sh`` and run it.

Edit ``fix_cluster.sh`` to correct the IP and run it.

Start CryoSPARC::

   source ~/.bashrc
   cryosparcm restart
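Once the restart finishes, ``cryosparcm status`` (a standard CryoSPARC master command) should show the database and app processes running::

   cryosparcm status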
Connect cluster::

   cd alpine
   nano cluster_info.json       # edit to correct worker bin path
   nano cluster_script.sh       # edit job names to cs-<lab>...
   cryosparcm cluster connect
Edit ``run_first_user.sh`` and run it.

The last thing to do is set up auto-restarting of the instance in the event of a reboot::

   crontab -e

Append this to the end::

   @reboot rm /tmp/cryo*
   @reboot sleep 60 && /home/biokem/cryosparc/cryosparc_master/bin/cryosparcm restart
Now that we've installed the 'master' instance, we can install the worker on Alpine.
Log onto RC::

   ssh login10
   cd /pl/active/BioKEM/software/cryosparc
Make a new directory for each lab::

   sudo -u biokem mkdir <labname>
   cd <labname>
   git clone https://github.com/CU-BioKEM/cryosparc_setup.git
   cd cryosparc_setup
Edit ``license.src`` to add the correct CryoSPARC license and download the worker::

   nano license.src
   cd ..
   source cryosparc_setup/license.src
   curl -L https://get.cryosparc.com/download/worker-latest/$LICENSE_ID -o cryosparc_worker.tar.gz
   tar -xf *gz
Start an interactive job and load CUDA::

   ssh login10
   ml slurm/alpine
   ainteractive
   ml cuda/11.4
   cd cryosparc_setup
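Before running the installer, it is worth confirming the CUDA toolkit is actually available in the interactive session::

   nvcc --version    # should report release 11.4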
Edit ``run_worker_install.sh`` and run it::

   ./run_worker_install.sh
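For reference, ``run_worker_install.sh`` presumably wraps CryoSPARC's standard worker installer; a minimal sketch of that kind of invocation is below. The CUDA path is an assumption and should match whatever ``ml cuda/11.4`` provides (``which nvcc`` shows it); the wrapper script in ``cryosparc_setup`` remains the source of truth::

   # Sketch only -- run_worker_install.sh is the authoritative version
   cd ../cryosparc_worker
   ./install.sh --license $LICENSE_ID --cudapath <path_to_cuda_11.4>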
Open a new terminal and run::

   cryosparc
Log in and try to test it out. Make sure you create all projects in the PetaLibrary (PL).
To keep everything as simple as possible for the end user, I have made lab-specific aliases in ``/projects/biokem/software/biokem/users/src/lab_specific``. These give users from each lab access to their lab-specific CryoSPARC builds.
Edit ``cryosparc_vms.src`` to add easy access to the VM::

   alias <lab>-cryosparc-vm="ssh -o KexAlgorithms=ecdh-sha2-nistp521 ubuntu@<IP>"
(This only gives access to BioKEM IT.)

Update ``/projects/biokem/software/biokem/users/src/lab_specific/labs.src`` with the new lab group.

Make lab-specific functions::

   touch <lab>lab.src

In ``<lab>lab.src``::

   #cryosparc
   alias cryosparc='firefox http://<IP>:<base_port>'
Make admin functions (may enable later, but not now)::

   # Only define the admin wrapper for the designated admin user
   for USER in $(users); do
       if [ "$USER" == "<admin>" ]; then
           alias cryosparcm='ssh -o KexAlgorithms=ecdh-sha2-nistp521 <user>@<ip> "/home/<user>/cryosparc/cryosparc_master/bin/cryosparcm ${1}"'
       fi
   done