GitHub - rwchakra/claude-code-hpc: Claude Code skills for running research code on HPC clusters

Claude Code skills and configuration for running research code on HPC clusters. For the complete text, see Workflow.md. If you already know your way around Claude Code, the README should suffice.

The README.md was written with the help of Claude (obviously), but the main text is all me.

Overview

This workflow automates the end-to-end process of taking a research repository and getting it running on HPC infrastructure. It is designed for clusters with containerized environments (Apptainer/Singularity) and GPU scheduling (Slurm).

Installation

Please install Claude Code first.

Configuration

CLAUDE.md - Main configuration file defining:

Active cluster settings (GPU type, CPU architecture, container runtime)
Storage paths and resource allocations
Slurm templates and environment setup
Cluster-specific gotchas and workarounds

Edit this and other files to match your HPC environment and paths.
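
If you are unsure what to enter for the cluster settings, the values can usually be read off a compute node. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes `rocm-smi` or `nvidia-smi` is on the `PATH` of the node you run it on (often not the case on login nodes).

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with this repo): print facts about the
# current node that are useful when filling in the cluster section of CLAUDE.md.

detect_arch() {
  # Typically x86_64 on AMD/Intel nodes, aarch64 on ARM Grace nodes
  uname -m
}

detect_container_runtime() {
  # Prefer apptainer, fall back to singularity, print "none" if neither is found
  if command -v apptainer >/dev/null 2>&1; then
    echo apptainer
  elif command -v singularity >/dev/null 2>&1; then
    echo singularity
  else
    echo none
  fi
}

detect_gpu_stack() {
  # rocm-smi ships with ROCm, nvidia-smi with the NVIDIA driver
  if command -v rocm-smi >/dev/null 2>&1; then
    echo rocm
  elif command -v nvidia-smi >/dev/null 2>&1; then
    echo cuda
  else
    echo none
  fi
}

print_cluster_facts() {
  printf 'arch=%s\n' "$(detect_arch)"
  printf 'container_runtime=%s\n' "$(detect_container_runtime)"
  printf 'gpu_stack=%s\n' "$(detect_gpu_stack)"
}
```

Sourcing the file and calling `print_cluster_facts` inside an interactive Slurm allocation prints three `key=value` lines that you can copy into CLAUDE.md by hand.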

Skills

`/navigate-repo`

Builds a structured map of a cloned repository by analyzing README, directory structure, and key files. Identifies entry points, dependencies, configs, and data requirements.

Output: REPO_MAP.md
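
To give a feel for the kind of signals such a map is built from, here is a rough manual equivalent. It is only an illustration under stated assumptions: the function names are hypothetical, the file patterns assume a Python project, and the skill itself may look at different files.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not the skill's implementation): list files that
# usually reveal dependencies and runnable entry points in a Python repo.

list_dependency_files() {
  local repo="${1:-.}"
  # Look only two levels deep and skip the .git directory
  find "$repo" -maxdepth 2 -type f \( \
      -name 'requirements*.txt' -o -name 'environment.yml' -o \
      -name 'pyproject.toml' -o -name 'setup.py' -o -name 'Dockerfile' \
    \) -not -path '*/.git/*' | sort
}

list_entry_points() {
  local repo="${1:-.}"
  # Python files with a __main__ guard are likely meant to be run directly
  grep -rl --include='*.py' "__name__ == ['\"]__main__['\"]" "$repo" 2>/dev/null | sort
}
```

Running `list_dependency_files .` and `list_entry_points .` from the repo root gives a quick sanity check against what ends up in REPO_MAP.md.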

`/resolve-deps`

Creates a working environment with all dependencies installed and verified. Handles cluster-specific constraints (ROCm vs CUDA, x86 vs ARM, containerization requirements).

Prerequisites: Repo map exists

Output: Environment spec, activation commands, PATCHES.md
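
The ROCm vs CUDA split mostly shows up in how the container is launched: Apptainer and Singularity expose AMD GPUs with `--rocm` and NVIDIA GPUs with `--nv`. The sketch below builds a GPU visibility check for either case. The function names and the image name are hypothetical, and it assumes the environment contains PyTorch, which the repo you are porting may not use.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not the skill's implementation).

gpu_flag() {
  # Map the GPU stack to the matching Apptainer/Singularity flag
  case "$1" in
    rocm) echo "--rocm" ;;
    cuda) echo "--nv" ;;
    *) echo "unknown GPU stack: $1" >&2; return 1 ;;
  esac
}

verify_cmd() {
  # Print (do not run) a GPU visibility check for the given runtime and image.
  # Assumes PyTorch; torch.cuda.is_available() is also used on ROCm builds.
  local runtime="$1" stack="$2" image="$3"
  local flag
  flag="$(gpu_flag "$stack")" || return 1
  printf '%s exec %s %s python -c %s\n' "$runtime" "$flag" "$image" \
    "'import torch; print(torch.cuda.is_available())'"
}
```

For example, `verify_cmd singularity rocm env.sif` prints a command you can paste into a GPU allocation; `env.sif` stands in for whatever image the environment spec points to.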

`/sample-run`

Executes a short verification run to confirm the full pipeline works. Reduces scope (steps, batch size) to target ~5-10 min wall time.

Prerequisites: Dependencies resolved, assets verified

Output: Sample run summary with loss trends, resource usage, readiness verdict
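
A short verification job of this kind might look roughly like the batch script generated below. This is a sketch under assumptions, not the script the skill writes: the function name, job name, and the 15-minute limit (a margin above the ~5-10 min target) are illustrative, and the account, partition, and run command are placeholders you supply.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the skill's implementation): print a small Slurm
# batch script for a single-GPU sample run.

render_sample_job() {
  local account="$1" partition="$2" run_cmd="$3"
  cat <<EOF
#!/bin/bash
#SBATCH --job-name=sample-run
#SBATCH --account=${account}
#SBATCH --partition=${partition}
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1
#SBATCH --time=00:15:00
#SBATCH --output=sample-run-%j.out

set -euo pipefail
${run_cmd}
EOF
}
```

Redirect the output to a file, for example `render_sample_job <account> <partition> "<reduced-scope run command>" > sample.sbatch`, then submit it with `sbatch sample.sbatch`. How to reduce steps or batch size depends on the options the repo's own entry point accepts.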

`/monitor-slurm`

Monitors running Slurm jobs and inspects their logs for errors. If no jobs are running, summarizes recent job history.

Options: --tail N, --full, --verbose
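
For reference, the manual equivalent is a combination of `squeue`, `sacct`, and a scan of the job logs. The sketch below is an assumption about what that looks like, not the skill's implementation: the function names, the one-day history window, and the error patterns are all illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not the skill's implementation).

list_running_jobs() {
  # Jobs of the current user that are queued or running
  squeue -u "$USER"
}

list_recent_jobs() {
  # Job history for the last day, with final state and exit code
  sacct -u "$USER" --starttime now-1days \
    --format=JobID,JobName,State,Elapsed,ExitCode
}

scan_log() {
  # Print lines from the last N lines of a log (default 50) that look like errors
  local log="$1" n="${2:-50}"
  tail -n "$n" "$log" | grep -i -E 'error|traceback|out of memory|killed' || true
}
```

For example, `scan_log sample-run-<jobid>.out 200` checks the last 200 lines of a job log and prints nothing if no pattern matches.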

Workflow

# 1. Clone the repo
git clone <repo-url> && cd <repo-name>

# 2. Navigate and map the repo
/navigate-repo

# 3. Resolve dependencies
/resolve-deps

# 4. Run a sample job
/sample-run

# 5. Monitor job progress
/monitor-slurm

Cluster Support

Currently configured for:

LUMI: AMD MI250X GPUs (ROCm), x86_64, Singularity
Olivia: NVIDIA GH200 GPUs (CUDA), ARM Grace CPUs, Apptainer

The workflow adapts automatically based on the active cluster in CLAUDE.md.

Notes

All modifications to repo code are logged in PATCHES.md
Container-first approach: direct conda/pip installs on the filesystem are avoided
Skills are designed to work with poorly-maintained or undocumented repos
Verification is thorough: imports, GPU visibility, repo module loading

License

MIT
