diff --git a/docs/source/_static/images/gates1.jpg b/docs/source/_static/images/gates1.jpg new file mode 100644 index 0000000..89f3318 Binary files /dev/null and b/docs/source/_static/images/gates1.jpg differ diff --git a/docs/source/_static/images/gates2.jpg b/docs/source/_static/images/gates2.jpg new file mode 100644 index 0000000..8b5f028 Binary files /dev/null and b/docs/source/_static/images/gates2.jpg differ diff --git a/docs/source/advanced/index.md b/docs/source/advanced/index.md index 70968b4..d5c8c7d 100644 --- a/docs/source/advanced/index.md +++ b/docs/source/advanced/index.md @@ -5,6 +5,6 @@ This section covers advanced usage of the Lane Cluster, including containerizati ```{toctree} :maxdepth: 1 +apptainer nextflow -singularity -spack \ No newline at end of file +spack diff --git a/docs/source/advanced/nextflow.md b/docs/source/advanced/nextflow.md new file mode 100644 index 0000000..5cf6985 --- /dev/null +++ b/docs/source/advanced/nextflow.md @@ -0,0 +1 @@ +# Nextflow diff --git a/docs/source/basics/index.md b/docs/source/basics/index.md new file mode 100644 index 0000000..504c1e9 --- /dev/null +++ b/docs/source/basics/index.md @@ -0,0 +1,8 @@ +# The Basics + +This section covers essential skills for using the Lane Cluster, such as working with environment modules, accessing software tools, and preparing scripts for job submission. + +```{toctree} +:maxdepth: 1 + +modules diff --git a/docs/source/basics/modules.md b/docs/source/basics/modules.md new file mode 100644 index 0000000..b0ebfde --- /dev/null +++ b/docs/source/basics/modules.md @@ -0,0 +1,112 @@ +# Using Modulefiles on the Lane Cluster + +The **Lane Cluster** uses *environment modules* to manage software installations. +Modules let you easily load, unload, and switch between different versions of tools without altering your shell configuration. + +Environment modules are implemented in **Tcl**, but as a user, you’ll primarily *use* them — not write them. 
+ +--- + +## 🔍 Searching for Available Software + +To see what software is available through modulefiles: + +```bash +module avail +``` + +This lists all the modules installed on the system. +You can narrow your search using keywords, for example: + +```bash +module avail python +module avail gcc +``` + +If the output is long, pipe it through `less` (redirecting stderr, since `module` often prints there): + +```bash +module avail 2>&1 | less +``` + +--- + +## 📦 Loading and Unloading Modules + +To load a module (for example, Python 3.11): + +```bash +module load python/3.11 +``` + +To confirm it was loaded: + +```bash +module list +``` + +You can unload it when done: + +```bash +module unload python/3.11 +``` + +Or clear all active modules: + +```bash +module purge +``` + +--- + +## 🧰 Checking Module Information + +To learn what a module does and what environment variables it modifies: + +```bash +module show python/3.11 +``` + +This displays the environment variables, such as `PATH`, `LD_LIBRARY_PATH`, and `PYTHONPATH`, that the module sets. + +--- + +## 🧩 Using Modules in Job Scripts + +When writing a SLURM batch or shell script for the Lane Cluster, you can include module commands to set up your environment automatically. + +Example SLURM script: + +```bash +#!/bin/bash +#SBATCH --job-name=example +#SBATCH --output=output.log +#SBATCH --time=01:00:00 +#SBATCH --ntasks=1 + +# Load necessary modules +module load python/3.11 +module load gcc/11.2.0 + +# Run your program +python my_script.py +``` + +This ensures that when your job runs, the environment matches what you expect. + +--- + +## 🧠 Tips and Best Practices + +- Always load modules inside your batch or shell scripts to ensure reproducibility. +- Use `module list` interactively to check which modules are active before launching long jobs. +- If a command fails, run `module purge` and reload only what you need. +- Prefer explicit versions (e.g., `python/3.11`) instead of default aliases.
+ +--- + +## 🔗 More Resources + +- [Environment Modules Project Documentation](http://modules.sourceforge.net/) +- [Tcl Environment Modules GitHub](https://github.com/cea-hpc/modules) +- [Lane Cluster Overview](https://www.cbd.cmu.edu/research/computational-biology-cluster/) diff --git a/docs/source/basics/slurm.md b/docs/source/basics/slurm.md new file mode 100644 index 0000000..d4304b4 --- /dev/null +++ b/docs/source/basics/slurm.md @@ -0,0 +1,87 @@ +# Using SLURM on the Lane Cluster + +SLURM (Simple Linux Utility for Resource Management) is the workload manager used on the Lane Cluster. It schedules and runs computational jobs on the cluster’s compute nodes. This guide introduces the essential SLURM commands and explains how to submit, monitor, and manage jobs effectively. + +## Submitting a Batch Job + +Batch jobs are submitted with the `sbatch` command. Below is an example job script: + +```bash +#!/bin/bash +#SBATCH --job-name=test_job +#SBATCH --output=output.txt +#SBATCH --time=01:00:00 +#SBATCH --partition=cpu +#SBATCH --cpus-per-task=4 +#SBATCH --mem=8G + +module load python/3.11 +python my_script.py +``` + +Submit the job: + +```bash +sbatch my_job.sh +``` + +## Checking Job Status + +```bash +squeue -u "$USER" +``` + +## Canceling Jobs + +```bash +scancel <job_id> +``` + +## Running an Interactive Job + +```bash +salloc --partition=cpu --time=01:00:00 --cpus-per-task=2 --mem=4G +``` + +## Viewing Job Output + +```bash +tail -f output.txt +``` + +## Specifying Resources + +| Purpose | Directive | Example | +|---------|-----------|---------| +| Job name | `--job-name` | `--job-name=align` | +| Output file | `--output` | `--output=run.out` | +| Time limit | `--time` | `--time=4:00:00` | +| Partition | `--partition` | `--partition=cpu` | +| CPUs | `--cpus-per-task` | `--cpus-per-task=8` | +| Memory | `--mem` | `--mem=32G` | +| GPUs | `--gres` | `--gres=gpu:1` | +| Array jobs | `--array` | `--array=0-99` | + +## Job Arrays + +```bash +#!/bin/bash +#SBATCH 
--job-name=array_example +#SBATCH --output=array_%A_%a.out +#SBATCH --array=0-9 + +python script.py "$SLURM_ARRAY_TASK_ID" +``` + +## Monitoring Resource Usage + +```bash +sacct -j <job_id> --format=JobID,Elapsed,ReqMem,MaxRSS,State +``` + +## Best Practices + +- Never run computations on the login node. +- Request only the resources you need. +- Use job arrays for parameter sweeps. +- Check your output files regularly. diff --git a/docs/source/conf.py b/docs/source/conf.py index 7cf6b22..d15d360 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -40,6 +40,7 @@ # -- HTML ------------------------------------------------------------- html_theme = "furo" +html_theme = 'sphinx_rtd_theme' html_title = "Ray and Stephanie Lane cluster documentation" # -- Autodoc / Napoleon ---------------------------------------------- diff --git a/docs/source/getting-started.md b/docs/source/getting-started.md index e7f9fea..0fbcb2e 100644 --- a/docs/source/getting-started.md +++ b/docs/source/getting-started.md @@ -1,16 +1,49 @@ # Getting Started -Write your first section here. You can use **Markdown**, callouts, tabs, and diagrams. +![Gates-Hillman Center](_static/images/gates1.jpg) + +## What is the Lane cluster? + +The Lane cluster provides a powerful high-performance computing (HPC) environment for researchers working in computational biology, genomics, machine learning, and data-intensive scientific workflows. This chapter introduces the essential concepts you need before running your first job, including how to access the system, navigate the filesystem, load software with environment modules, and submit workloads with SLURM. + +## Who can access the cluster? + +Access to the Lane Cluster is limited to members of the Ray and Stephanie Lane Center for Computational Biology or collaborating labs. The cluster is intended to support computational biology, genomics, machine learning, and data-intensive scientific research.
+ +Users must be sponsored by a CMU faculty member, typically a principal investigator (PI) whose research relies on computational resources. Students, postdocs, research staff, and collaborators may receive access as part of a faculty-led project. + +If you are unsure whether you are eligible, start by speaking with your advisor or PI to determine whether your project aligns with the cluster’s intended use and to initiate the sponsorship process. + +### How to Request an Account :::{note} -This is a MyST *admonition*. +If you require access to the Lane cluster, consult your advisor to initiate the sponsorship process. ::: -::::{tabs} -:::{tab} Linux -Use this on Linux… -::: -:::{tab} macOS -Use this on macOS… -::: -:::: \ No newline at end of file +Access to the Lane Cluster requires sponsorship from a CMU faculty member whose research aligns with the mission of the Lane Center for Computational Biology. If you need access for a course, rotation, research project, or collaboration, follow the steps below to initiate the request. + +#### Confirm Eligibility + +Before submitting a request, verify that: +* You are a CMU/Pitt student, postdoc, staff member, or collaborator working under an approved PI. +* Your project includes computational work appropriate for the cluster. +* Your advisor or PI is willing to sponsor your access. + +If you are unsure, speak with your advisor to confirm your eligibility. + +#### Contact Your Advisor or PI + +Your sponsor must authorize your access. Most users begin by: +* Discussing the computational needs of their project +* Confirming that Lane is the appropriate resource +* Asking the PI to approve account creation + +Depending on internal policies, advisors may not need to provide funding or project information. + +#### Submit an Account Request + +Once your sponsor approves, the sponsor or an administrator will submit an account request using the Lane Center’s designated process. 
If you are a sponsor, please complete this [form](https://computing.cs.cmu.edu/accounts-access/forms/scs-account). + +#### Wait for Account Provisioning + +System administrators will review your request and create your account. Provisioning time may vary depending on the number of pending requests. diff --git a/docs/source/index.md b/docs/source/index.md index 615ee37..f3ae928 100644 --- a/docs/source/index.md +++ b/docs/source/index.md @@ -1,5 +1,7 @@ # Welcome to the *Lane cluster documentation* +![Gates-Hillman Center](_static/images/gates2.jpg) + This guide provides an introduction to navigating and effectively using the **[Lane Cluster](https://www.cbd.cmu.edu/research/computational-biology-cluster/)** — a high-performance computing (**HPC**) resource maintained by the **[Ray and Stephanie Lane Computational Biology Department](https://www.cbd.cmu.edu/)** at **[Carnegie Mellon University](https://www.cmu.edu/)**. It is intended for **students, researchers, and collaborators** who wish to leverage the cluster for computational biology research and data-intensive analysis. @@ -20,3 +22,5 @@ By the end of this guide, users will be familiar with the essential tools and pr :hidden: getting-started +basics/index +advanced/index