Our notes on using the CRI cluster, gardner.
Fill out the form here. Note that you will likely need to apply for a BSDAD account.
See these slides and these. The CRI also has a Wiki with some basic information about the gardner cluster.
Run

```
groups
```

to list the groups you belong to.
To find out who the members of a group are, you can use the `getent` command. For example, to list all members of Xin's and Matthew's labs, run

```
getent group cri-xhe_lab
getent group cri-stephens_lab
```
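The member list is the fourth colon-separated field of the `getent` output. A small sketch of extracting one member per line; the sample line and member names below are made up for illustration:

```shell
# getent prints lines of the form "group:password:gid:member1,member2,...".
# A sample line in that format (hypothetical members):
line='cri-xhe_lab:*:12345:alice,bob,carol'

# Extract the 4th colon-separated field and put each member on its own line.
printf '%s\n' "$line" | cut -d: -f4 | tr ',' '\n'

# On gardner, pipe the real output instead:
#   getent group cri-xhe_lab | cut -d: -f4 | tr ',' '\n'
```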
If you are on campus, to connect to gardner from the command-line shell (e.g., Terminal), run

```
ssh userid@gardner.cri.uchicago.edu
```

where `userid` should be replaced by the name of your CRI account (typically the same as your BSD account, although not always).
The Gardner cluster is not on the public internet; the login node will only accept ssh connections from within the university network. To connect to gardner via ssh from off campus, first connect to the university VPN using the university-provided VPN client, Cisco AnyConnect. Linux users who have difficulty with AnyConnect can try openconnect-sso.
Another option for those with an account on midway2 is to use midway2 as an ssh proxy. To do so, add something like the following to the file `~/.ssh/config`:

```
Host gardner_proxy
  HostName gardner.cri.uchicago.edu
  User your_cri_user_name
  LogLevel error
  ProxyJump your_midway2_user_name@midway2.rcc.uchicago.edu:22
```

You can then connect to gardner with

```
ssh gardner_proxy
```
If you are connecting frequently, you may want to set up SSH keys for your account, which will allow you to connect without having to type a password every time.
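A minimal sketch of setting up key-based login, assuming an ed25519 key and the hostname from above; the key filename is illustrative:

```shell
# Create ~/.ssh if it does not exist, then generate an ed25519 key pair.
# -N "" sets an empty passphrase for illustration; consider using one.
mkdir -p "$HOME/.ssh"
ssh-keygen -t ed25519 -f "$HOME/.ssh/id_ed25519_gardner" -N "" -q

# Copy the public key to gardner (replace userid with your CRI account):
#   ssh-copy-id -i ~/.ssh/id_ed25519_gardner.pub userid@gardner.cri.uchicago.edu
```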
Most of your files will be stored in one of these four locations:
- `/home`: Where your home directory is located. Not a lot of space.

- `/gpfs/data`: Where many labs store their files (e.g., `/gpfs/data/xhe-lab`, `/gpfs/data/stephens-lab`). There is a lot of space here. You can use `df` to check how much space is left in an individual directory, e.g.,

  ```
  df -h /gpfs/data/xhe-lab
  df -h /gpfs/data/stephens-lab
  ```

- `/group`: Where some other labs store their files (e.g., `/group/bergelson-lab`).

- `/scratch`: Although each user has a separate directory only accessible to them, this is a shared file system, and there is also a great deal of freely available space here.
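To see how much space a particular directory tree is *using* (rather than how much is left on the file system), `du` is handy. A sketch; the gardner path in the comment is illustrative:

```shell
# -s: print one summary line per argument; -h: human-readable sizes.
du -sh "$HOME"

# On gardner, e.g. (illustrative path):
#   du -sh /gpfs/data/xhe-lab/your_user_name
```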
TO DO: add info about backups.
The gardner cluster uses Lmod to organize the many different versions of software that have been installed. It may seem a little complicated at first, but it isn't too bad once you start using it.

For example, suppose you want to use R. First, find out which versions of R are available by running

```
module spider R
```

You will likely see that several versions of R are available. Suppose you want to use R version 3.3.3. Next, run

```
module spider R/3.3.3
```

It should tell you that you first need to load two modules, gcc 6.2.0 and mpich 3.2, in order to use R 3.3.3. Go ahead and load these modules:

```
module load gcc/6.2.0 mpich/3.2 R/3.3.3
```

Now you are ready to use R.
Because you can only have one version of a module loaded at a time, unloading the `R` module is as simple as

```
module unload R
```

To unload all modules:

```
module purge
```

To see all currently loaded modules:

```
module list
```
Gardner uses TORQUE as its job scheduler. `qsub` is the main command for interacting with the scheduler; it can be used to request interactive sessions or to submit jobs.

In this example, we request a 5-hour interactive session on a compute node with 32 GB of memory and 12 CPUs:

```
qsub -I -l walltime=05:00:00 -l nodes=1:ppn=12 -l mem=32gb
```
Keep in mind that if you load modules on a login node, these modules will not automatically be loaded on a compute node.
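One way to handle this is to load the modules inside the job script itself. A sketch of a minimal TORQUE job script follows; the job name, resource requests, and module versions are illustrative, and the script is written to a file so you can inspect it before submitting:

```shell
# Write a minimal TORQUE job script to a file (all values illustrative).
cat > myjob.sh <<'EOF'
#!/bin/bash
#PBS -N demo-job
#PBS -l walltime=00:10:00
#PBS -l nodes=1:ppn=1
#PBS -l mem=1gb

# Modules loaded on the login node are not inherited by the compute
# node, so load them again here.
module load gcc/6.2.0 mpich/3.2 R/3.3.3

# TORQUE starts the job in your home directory; move to the directory
# the job was submitted from.
cd "$PBS_O_WORKDIR"
Rscript --version
EOF

# Submit with: qsub myjob.sh
```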
We have provided a small example script in which the qsub options are specified inside the script. In this example, we request a compute node with 10 CPUs and 3 GB of memory, and at most 10 minutes of runtime.

To run this job, you will first need to install the SuppDists package in R. You will also need to modify this line in demo.sh to point to the location of the git repository that you have cloned or downloaded to gardner:

```
cd $HOME/git/gardner
```

Once you have done these two things, submit the job to the scheduler by running the following from the root directory of this repository:

```
qsub demo.sh
```
You can use `qstat` to check the status of the job in the queue. While the job is running, it should write the output to a text file named `demo.oxxx`, where "xxx" is the job id. Once the job has completed (it may take a few minutes), the end of the text file should look something like this:

```
Total time for all 15 tests_________________________ (sec): 39.0573333333333
Overall mean (sum of I, II and III trimmed means/3)_ (sec): 1.17529314479865
--- End of test ---
```

followed by a "Job WrapUp" section.
For more information on submitting jobs, see the "Job submission" section of the Torque user guide.