# Introduction to Machine Learning and High Performance Computing

---

## 1. What is HPC?

High Performance Computing (HPC) refers to the use of clusters or supercomputers composed of **many interconnected processors** that operate in parallel to perform calculations at very high speed.

An HPC cluster consists of several computers that work together to run programs very fast. Each individual computer is called a **node**.

By distributing tasks across multiple nodes and using high-speed networks for communication, HPC systems can handle large-scale simulations, data-intensive computations, and machine learning workloads that are infeasible on standard desktop or laptop computers.

HPC is like having hundreds of computers (nodes) work together at once so big problems can be solved much faster than on a single laptop.

## 2. HPC@UTD

The **HPC team** at UT Dallas manages two main HPC clusters:

- **Ganymede2**  
  A computation condo HPC system. Ganymede2 assets are primarily owned by private researchers, but the system also has “preempt” queues that accept job submissions from all Ganymede2 users.

- **Juno**  
  An HPC cluster available for demanding research computing workloads. Juno is available to faculty, students and staff working on research projects.

For full documentation, see: [HPC@UTD](https://hpc.utdallas.edu/)


## 3. Getting Started with Ganymede2

### Step 1: Request an Account
Apply here: [Account Request](https://hpc.utdallas.edu/services/)  
Accounts are created per research group. The faculty PI applies first, and then the members of their research group apply.

### Step 2: Set Up VPN
Your computer must be connected to the UTD campus network in one of these ways:
- A wired Ethernet connection on campus
- A WiFi connection to CometNet
- A connection through the **UT Dallas VPN**  
[VPN Instructions](https://atlas.utdallas.edu/TDClient/30/Portal/KB/ArticleDet?ID=152)

### Step 3: Choose a Terminal Client

To use Ganymede2 (or Juno), you will connect through the **command-line interface (CLI)**.  
When you log in, you are given a **command prompt (shell)** where you type commands and see the output.  
Accessing the shell requires a **terminal application**, and the choice depends on your operating system and preference.

---

**Linux**  
- All Linux distributions have a built-in terminal.  
- To open it, search for "Terminal" in your applications menu.  
- The exact name may vary depending on your desktop environment, but it should always be available.  

---

**macOS**  
- macOS comes with a built-in terminal application called **Terminal**.  
- It may not appear in the Dock by default. To open it:  
  - Use **LaunchPad** and search for "Terminal"  
  - Or use **Spotlight Search** (Command + Space) → type "terminal" → hit Enter  
- Alternative terminal applications (like [Ghostty](https://ghostty.org/docs)) provide extra features.

---

**Windows**  
- Windows does not include a native UNIX-like terminal, but it has **Command Prompt** and **PowerShell**.  
- These can connect to HPC clusters but are very limited in UNIX functionality.  
- Better options include:  
  - **MobaXterm**: provides a UNIX-like shell inside Windows, with a package manager  
  - **PuTTY**: lightweight tool for connecting to remote systems (no native shell)  
  - **Windows Subsystem for Linux (WSL)**: creates a Linux environment with a full command-line interface  

**Note:** While you may use Command Prompt or PowerShell, they are only sufficient for basic connectivity.  

---

### Step 4: Connecting to the HPC System

Before connecting, make sure:  
1. You are on the **UTD campus network** or **UT Dallas VPN**.  
2. You have a **terminal client** installed (macOS/Linux: Terminal; Windows: MobaXterm, PuTTY, or WSL).

#### Option 1: Basic SSH Login

The standard way to log in to a Linux system remotely is with Secure Shell (SSH). To connect to an HPC system, run the `ssh` command in a terminal:

```bash
ssh <USERNAME>@<HOSTNAME>
```

- Replace `<USERNAME>` with your UT Dallas NetID
- Replace `<HOSTNAME>` with the HPC system name (e.g., `ganymede2.utdallas.edu`)

The terminal will prompt you for a password (NetID password).

#### Option 2: Generate an SSH Key (Optional, Recommended)

An SSH key lets you log in without typing your NetID password.

In a terminal, run:

```bash
ssh-keygen -t ed25519
```
- Press **Enter** to save to default location (`~/.ssh/id_ed25519`).  
- Optionally, set a passphrase (you’ll enter it twice).  

This creates two files:  
- Private key: `~/.ssh/id_ed25519` (keep secret!)  
- Public key: `~/.ssh/id_ed25519.pub` (shareable)

Then copy your public key to the HPC system:

```bash
ssh-copy-id -i ~/.ssh/id_ed25519.pub <USERNAME>@<HOSTNAME>
```

- Enter your NetID password once.  
- You should see confirmation: `Number of key(s) added: 1`

Finally, log in with your SSH key:

```bash
ssh <USERNAME>@<HOSTNAME>
```
- If you set a passphrase, enter it now. Otherwise, you’ll log in directly without your NetID password.  

Detailed instructions can be found here: [UTD CIRC Connecting Guide](https://docs.circ.utdallas.edu/user-guide/intro-to-hpc/connecting.html). Please note that the HPC team is planning to retire this guide page, and a replacement is not yet available. For updates, please contact HPC@UTD.


## 4. Once Logged In

You should see a welcome message, your current disk quotas, and the Ganymede2 (G2) command prompt. You are now ready to use Ganymede2!

Common commands include:

Directories:

- `pwd` → print/show current working directory
- `cd` → change current directory
- `mkdir` → make a new directory

Editor:

- `nano` → a full screen interactive editor to edit text files

The module command:

- `module avail` → Lists all available software modules on the system.
- `module list`  → Displays the modules that are currently loaded in your session.
- `module load <module name>` → Loads a specific module (e.g., python/3.8, R/4.4.1).  
- `module unload <module name>` → Unloads a specific module
- `module spider <module_name>` → Searches for a specific module and shows detailed information, including available versions and dependencies.  
- `module keyword <keyword>` → Finds all modules whose names or descriptions contain the specified keyword.

Check the available nodes:

- `sinfo` → shows status of partitions/queues  
- `scontrol show node c-05-11` → view detailed node information

Job related:

- `sbatch <script_name>` → submits a job script to the scheduler
- `squeue -u <netID>` → lists running and waiting jobs  
- `scontrol show job <job_id>` → shows detailed information about a specific job  
- `scancel <job_id>` → cancel a submitted job

In [None]:
# @title
import pandas as pd

# Create a DataFrame for HPC node states
data = {
    "STATE": ["idle", "alloc", "mix", "down", "drain", "comp"],
    "Meaning": [
        "The node is free and ready to run a job.",
        "The node is fully allocated: all resources are in use.",
        "The node is partially allocated: some resources are still available.",
        "The node is offline (e.g., under maintenance or error).",
        "The node is being drained of jobs and not accepting new ones.",
        "The node is completing its current jobs before becoming idle."
    ],
    "Indicator": ["✅ Available", "🚫 Busy", "⚙️ Partially Used", "🛠️ Unavailable", "🧹 Draining", "🕒 Finishing Up"]
}

df = pd.DataFrame(data)

# Display nicely formatted table
df.style.set_caption("HPC Node States (from sinfo)") \
    .set_table_styles(
        [{"selector": "caption", "props": [("caption-side", "top"), ("font-size", "16px"), ("font-weight", "bold")]}]
    ) \
    .hide(axis="index")


| STATE | Meaning | Indicator |
| --- | --- | --- |
| idle | The node is free and ready to run a job. | ✅ Available |
| alloc | The node is fully allocated: all resources are in use. | 🚫 Busy |
| mix | The node is partially allocated: some resources are still available. | ⚙️ Partially Used |
| down | The node is offline (e.g., under maintenance or error). | 🛠️ Unavailable |
| drain | The node is being drained of jobs and not accepting new ones. | 🧹 Draining |
| comp | The node is completing its current jobs before becoming idle. | 🕒 Finishing Up |
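
As a rough illustration of how the node states above might be used, the sketch below tallies nodes per state from `sinfo`-style text. The sample text, including its partition and node names, is hypothetical and is not output from Ganymede2. The helper name `count_nodes_by_state` is also an assumption, and the code expects the default `sinfo` column layout (PARTITION, AVAIL, TIMELIMIT, NODES, STATE, NODELIST).

```python
from collections import Counter

# Hypothetical sample in the default sinfo column layout (not real cluster output).
SAMPLE_SINFO = """\
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
normal*   up    2-00:00:00    4 idle  c-01-[01-04]
normal*   up    2-00:00:00    2 mix   c-01-[05-06]
gpu       up    2-00:00:00    1 alloc g-01-01
gpu       up    2-00:00:00    1 down  g-01-02
"""


def count_nodes_by_state(sinfo_text):
    """Sum the NODES column for each STATE value."""
    counts = Counter()
    lines = sinfo_text.strip().splitlines()
    header = lines[0].split()
    nodes_col = header.index("NODES")
    state_col = header.index("STATE")
    for line in lines[1:]:
        fields = line.split()
        # sinfo may append a flag character such as "*" to a state; strip it.
        state = fields[state_col].rstrip("*")
        counts[state] += int(fields[nodes_col])
    return dict(counts)


if __name__ == "__main__":
    print(count_nodes_by_state(SAMPLE_SINFO))
```

On a login node, the same function could be fed the real output of `sinfo` captured with `subprocess.run`.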


## 5. Slurm Batch Scripts

Slurm is the job scheduler that manages the programs running on the compute nodes.

It is preferable to submit your job **non-interactively** to the compute nodes.  

With a **batch script**, your job is queued until resources are available and then runs automatically without further input.

---

### Slurm Specifications in a Batch Script

At the beginning of your batch script, you specify Slurm settings by prefixing each line with `#SBATCH`. Common settings include:

- **Partition**: the resource group to request
  ```bash
  #SBATCH --partition=cpu-preempt
  ```

- **Number of nodes** required:
  ```bash
  #SBATCH --nodes=2
  ```
- **Total number of tasks** (or tasks per node):
  ```bash
  #SBATCH --ntasks=32
  #SBATCH --ntasks-per-node=16 # 16 tasks per node (one CPU per task by default)
  ```
- **Maximum runtime** in the format Days-Hours:Minutes:Seconds:
  ```bash
  #SBATCH --time=1-12:00:00  # 1 day, 12 hours
  ```
- **Email notifications** (optional):
  ```bash
  #SBATCH --mail-type=ALL
  #SBATCH --mail-user=your.email@utdallas.edu
  ```

For a full list of available settings, see the [sbatch documentation](https://slurm.schedmd.com/sbatch.html).
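
To make the `--time` format concrete, here is a minimal sketch that converts a time limit into seconds. The function name `slurm_time_to_seconds` is hypothetical, and the sketch handles only the `Days-Hours:Minutes:Seconds` and `Hours:Minutes:Seconds` forms used in this tutorial. Slurm accepts other, shorter forms that are not covered here.

```python
def slurm_time_to_seconds(spec):
    """Convert 'D-HH:MM:SS' or 'HH:MM:SS' to a number of seconds."""
    days = 0
    if "-" in spec:
        # Split off the optional leading day count.
        day_part, spec = spec.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in spec.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


if __name__ == "__main__":
    print(slurm_time_to_seconds("1-12:00:00"))  # 129600 (36 hours)
    print(slurm_time_to_seconds("02:00:00"))    # 7200 (2 hours)
```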


### Example: Job with Slurm

This example shows a Slurm batch script that runs a Python script (`example.py`) and an R script (`example.R`) using 16 cores on one node of the `turing` partition, with a maximum runtime of 2 hours.

```bash
#!/bin/bash
#SBATCH -J example          # Job name
#SBATCH -o example.out      # Output file
#SBATCH -e example.err      # Error file
#SBATCH -p turing            # Partition/queue
#SBATCH -N 1                 # Number of nodes
#SBATCH -n 16                # Total number of tasks (CPUs)
#SBATCH -t 02:00:00          # Maximum runtime (2 HOURS)

# Run the Python script using default Python
python example.py
# Run the R script using default R
Rscript example.R
```
**Note:** Using `python example.py` assumes your desired Python is the default in your environment. If you need a specific conda environment, activate it before running the script.
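
If you submit many similar jobs, it can help to generate the batch script text rather than edit it by hand. The sketch below is one possible approach, not an HPC@UTD tool: the function `make_batch_script` and its parameters are hypothetical, and it emits only the short `#SBATCH` options used in the example above.

```python
def make_batch_script(job_name, partition, nodes, ntasks, time_limit, command):
    """Return the text of a Slurm batch script using the short option forms."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH -J {job_name}",        # Job name
        f"#SBATCH -o {job_name}.out",    # Output file
        f"#SBATCH -e {job_name}.err",    # Error file
        f"#SBATCH -p {partition}",       # Partition/queue
        f"#SBATCH -N {nodes}",           # Number of nodes
        f"#SBATCH -n {ntasks}",          # Total number of tasks
        f"#SBATCH -t {time_limit}",      # Maximum runtime
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    script = make_batch_script(
        "example", "turing", 1, 16, "02:00:00", "python example.py"
    )
    print(script)
```

Writing the returned text to a `.sh` file gives a script that can be submitted with `sbatch`.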


## 6. Python Environment

Python programs often require a specific version of Python to run.

- The combination of a Python interpreter, its modules, and libraries is referred to as a Python **environment**.
- Since it is not feasible for the HPC team to install and maintain every possible Python version and package combination, users are encouraged to create and manage their own Python environments.
- This can be done using Miniconda (recommended by HPC@UTD), a lightweight Python package and environment manager that provides Conda’s functionality without installing unnecessary default packages.

### Before Starting

To check if Conda is available on HPC resources, run:

```bash
module keyword conda
```
If available through the module system, load it with:
```bash
module load miniconda/4.12.0
```

### Configuring and Managing Conda Environments

Conda environments contain Python versions, packages, and dependencies isolated from other environments. They let you customize Python for specific workflows.

#### Initializing Conda

To enable the `conda` command in your shell:

```bash
conda init bash
```
Then either log out and back in, or run:
```bash
source ~/.bashrc
```
Your prompt should show `(base)` indicating the base Conda environment is active.  
To undo initialization:
```bash
conda init --reverse
```
#### Creating Environments

Create a new environment named `myenv`:
```bash
conda create --name myenv
```
Activate it:
```bash
conda activate myenv
```
By default, new environments have no packages installed. Install Python and packages like this:
```bash
conda install python=3.9
conda install numpy scipy matplotlib
conda install -c pytorch pytorch
```
#### Verifying Your Environment

Ensure the correct Python and pip executables are being used:
```bash
which python
which pip
python --version
```
You should see paths pointing to `$HOME/.conda/envs/myenv`.
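
The same check can be done from inside Python, which is handy at the top of a job script's output. This is a minimal sketch: the function name `describe_python_environment` is hypothetical, and it relies on the `CONDA_DEFAULT_ENV` variable that `conda activate` sets in the shell.

```python
import os
import sys


def describe_python_environment():
    """Collect the interpreter path, version, and active Conda environment name."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Set by `conda activate`; None if no Conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_python_environment().items():
        print(f"{key}: {value}")
```

With `myenv` active, `executable` should point inside the environment directory and `conda_env` should read `myenv`.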


## 7. Running a Job on HPC

To run a Python/R job on an HPC system, you typically need **two files**:

1. **Python script (`.py`) or R script (`.R`)**  
   - This file contains the Python or R code you want to run.
   - Example: `example.py`, `example.R`

2. **Slurm batch script (`.sh`)**  
   - This file tells Slurm how to run your Python or R script on the cluster.
   - Example: `example_python.sh`, `example_R.sh`

### Example Workflow

1. **Create your Python or R script**
   ```bash
   nano example.py
   nano example.R
   ```

In [None]:
# example.py
print("Hello from Python on HPC!")
x = [i**2 for i in range(5)]
print("Squares:", x)

Hello from Python on HPC!
Squares: [0, 1, 4, 9, 16]


In [None]:
# Installs the rpy2 package, which acts as a bridge between Python and R.
!pip install rpy2
# Loads the R extension
%load_ext rpy2.ipython



In [None]:
%%R
# example.R
cat("Hello from R on HPC!\n")
x <- 1:5
cat("Squares:", x^2, "\n")

Hello from R on HPC!
Squares: 1 4 9 16 25 


2. **Create your Slurm batch script**

    ```bash
    nano example_python.sh
    ```

    ```bash
    #!/bin/bash
    #SBATCH -J example_python
    #SBATCH -o /home/txw200000/ParallelComputingPython/example_python.out
    #SBATCH -e /home/txw200000/ParallelComputingPython/example_python.err
    #SBATCH -p turing
    #SBATCH -N 1
    #SBATCH -n 16
    #SBATCH -t 02:00:00

    # -----------------------------
    # Paths
    PYTHON=/home/txw200000/.conda/envs/ml_env/bin/python
    SCRIPT=/home/txw200000/ParallelComputingPython/example.py
    # -----------------------------

    echo "===== Environment Info ====="
    echo "SLURM Job ID: $SLURM_JOB_ID"
    echo "SLURM Nodes: $SLURM_NODELIST"
    echo "CPUs allocated: $SLURM_CPUS_ON_NODE"
    echo "----------------------------"

    echo "Running Python code"
    $PYTHON $SCRIPT
    ```

    ```bash
    nano example_R.sh
    ```

    ```bash
    #!/bin/bash         
    #SBATCH -J example_R
    #SBATCH -o /home/txw200000/ParallelComputingPython/example_R.out
    #SBATCH -e /home/txw200000/ParallelComputingPython/example_R.err
    #SBATCH -p turing
    #SBATCH -N 1
    #SBATCH -n 16
    #SBATCH -t 02:00:00

    # -----------------------------
    # Paths
    R_SCRIPT=/home/txw200000/ParallelComputingPython/example.R
    # -----------------------------

    echo "===== Environment Info ====="
    echo "SLURM Job ID: $SLURM_JOB_ID"
    echo "SLURM Nodes: $SLURM_NODELIST"
    echo "CPUs allocated: $SLURM_CPUS_ON_NODE"
    echo "----------------------------"

    echo "Running R code"

    module swap gnu12 gnu9/9.4.0
    module load R/4.4.1

    Rscript $R_SCRIPT
    ```


3. **Submitting the Job**

    Use the following commands to submit your batch scripts to Slurm:

    ```bash
    sbatch ~/ParallelComputingPython/example_python.sh
    sbatch ~/ParallelComputingPython/example_R.sh
    ```

    - Slurm will queue your job and run it when resources are available.  
    - Output and errors will be saved in `example_python.out` (`example_R.out`) and `example_python.err` (`example_R.err`) respectively.

    ```text
    Python/R script (.py/.R)
        │
        ▼
    Slurm batch script (.sh)
        │
        ▼
    sbatch command
        │
        ▼
    Job queued in Slurm
        │
        ▼
    Job runs on compute nodes
        │
        ▼
    Output saved in .out / .err files
    ```

4. **Check HPC output**

    ```bash
    ===== Environment Info =====
    SLURM Job ID: 2573603
    SLURM Nodes: g-01-07
    CPUs allocated: 16
    ----------------------------
    Running Python code
    Hello from Python on HPC!
    Squares: [0, 1, 4, 9, 16]
    ```

    ```bash
    ===== Environment Info =====
    SLURM Job ID: 2573601
    SLURM Nodes: g-01-07
    CPUs allocated: 16
    ----------------------------
    Running R code
    Hello from R on HPC!
    Squares: 1 4 9 16 25
    ```
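
The submission step in the workflow above can also be driven from Python. When a job is accepted, `sbatch` prints a line of the form `Submitted batch job <id>`. The sketch below parses that line so the ID can be reused with `squeue` or `scancel`. The function names `parse_job_id` and `submit` are hypothetical, and `submit` only works where the `sbatch` command is available.

```python
import re
import subprocess


def parse_job_id(sbatch_output):
    """Extract the job ID from sbatch's 'Submitted batch job <id>' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"Unexpected sbatch output: {sbatch_output!r}")
    return int(match.group(1))


def submit(script_path):
    """Submit a batch script with sbatch and return the job ID (requires Slurm)."""
    result = subprocess.run(
        ["sbatch", script_path], capture_output=True, text=True, check=True
    )
    return parse_job_id(result.stdout)


if __name__ == "__main__":
    # Parsing demo only; calling submit() needs a cluster login node.
    print(parse_job_id("Submitted batch job 2573603"))  # 2573603
```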

## 8. Parallel Computing


Parallel computing on an HPC system takes two main forms:

1.  One computer (node), multiple cores: several processor cores on the same node work simultaneously to speed up a task.
2.  Multiple computers (nodes): networked nodes collaborate to solve large problems faster.

### One computer (node), multiple cores

#### Python

In [None]:
import numpy as np
import time
import os
from multiprocessing import Pool, cpu_count

# Simulate one replicate: mean of sum of two normals.
def simulate(_):
    x = np.random.normal(0, 1, 100)
    y = np.random.normal(5, 1, 100)
    sums = x + y
    return np.mean(sums)

# Sequential run 100000 simulations
def run_sequential(n_iter=100000):
    start = time.time()
    results = [simulate(i) for i in range(n_iter)]
    mean_val = np.mean(results)
    end = time.time()
    elapsed = end - start
    print(f"Sequential mean: {mean_val:.4f}, time: {elapsed:.4f} seconds")
    return elapsed

# Reseed NumPy in each worker: forked workers otherwise inherit the parent's
# random state and would all produce the same draws.
def init_worker():
    np.random.seed()

# Parallel run 100000 simulations
def run_parallel(n_iter=100000):
    start = time.time()
    # Use the CPUs allocated by Slurm, or all local CPUs outside a Slurm job
    n_cpus = int(os.environ.get("SLURM_CPUS_ON_NODE", os.cpu_count()))
    with Pool(n_cpus, initializer=init_worker) as pool:
        results = pool.map(simulate, range(n_iter))
    print(f"Number of CPUs: {n_cpus}")
    mean_val = np.mean(results)
    end = time.time()
    elapsed = end - start
    print(f"Parallel mean: {mean_val:.4f}, time: {elapsed:.4f} seconds")
    return elapsed

if __name__ == "__main__":
    n_iter = 100000
    print("=== Simulation of Normal Sums ===")
    time_seq = run_sequential(n_iter)
    time_par = run_parallel(n_iter)

    # Speedup calculation
    print(f"Speedup: {time_seq/time_par:.2f}x")


=== Simulation of Normal Sums ===
Sequential mean: 5.0001, time: 1.7886 seconds
Number of CPUs: 2
Parallel mean: 5.0005, time: 1.9126 seconds
Speedup: 0.94x


```bash
#!/bin/bash
#SBATCH -J sim_normal
#SBATCH -o /home/txw200000/ParallelComputingPython/sim_normal.out
#SBATCH -e /home/txw200000/ParallelComputingPython/sim_normal.err
#SBATCH -p turing
#SBATCH -N 1
#SBATCH -n 16
#SBATCH -t 02:00:00

# -----------------------------
# Paths
PYTHON=/home/txw200000/.conda/envs/ml_env/bin/python
SCRIPT=/home/txw200000/ParallelComputingPython/sim_normal.py
# -----------------------------

echo "===== Environment Info ====="
echo "SLURM Job ID: $SLURM_JOB_ID"
echo "SLURM Nodes: $SLURM_NODELIST"
echo "CPUs allocated: $SLURM_CPUS_ON_NODE"
echo "Using Python: $($PYTHON -c 'import sys; print(sys.executable)')"
echo "----------------------------"

echo "Running Simulation: Sequential vs Parallel"
$PYTHON $SCRIPT
```

##### HPC Output

```bash
===== Environment Info =====
SLURM Job ID: 2540987
SLURM Nodes: g-01-07
CPUs allocated: 16
Using Python: /home/txw200000/.conda/envs/ml_env/bin/python
----------------------------
Running Simulation: Sequential vs Parallel
=== Simulation of Normal Sums ===
Sequential mean: 4.9997, time: 1.4248 seconds
Parallel mean:   5.0000, time: 0.5081 seconds
Speedup:    2.80x
```
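
To put the reported timings in context, the sketch below recomputes the speedup and also derives the parallel efficiency (speedup divided by the number of workers). Efficiency is a standard metric that the job script does not print, so treat it as an added illustration. The function names are hypothetical, and the inputs are the timings from the output above.

```python
def speedup(time_sequential, time_parallel):
    """Ratio of sequential to parallel wall-clock time."""
    return time_sequential / time_parallel


def efficiency(time_sequential, time_parallel, n_workers):
    """Speedup per worker; 1.0 would mean perfect scaling."""
    return speedup(time_sequential, time_parallel) / n_workers


if __name__ == "__main__":
    # Timings taken from the HPC output above (16 CPUs allocated).
    s = speedup(1.4248, 0.5081)
    e = efficiency(1.4248, 0.5081, 16)
    print(f"Speedup: {s:.2f}x")      # Speedup: 2.80x
    print(f"Efficiency: {e:.1%}")    # Efficiency: 17.5%
```

An efficiency well below 100% is expected here: each replicate is very cheap, so process start-up and data transfer take a large share of the parallel run time.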

#### R

In [None]:
%%R
# Load the 'parallel' package for multicore computation
library(parallel)

# Simulate one replicate: mean of sum of two normals.
simulate_one <- function(i) {
  x <- rnorm(100, 0, 1)
  y <- rnorm(100, 5, 1)
  mean(x + y)
}

# Sequential run 100000 simulations
run_sequential <- function(n_iter = 100000) {
  start <- Sys.time()
  results <- sapply(1:n_iter, simulate_one)
  mean_val <- mean(results)
  end <- Sys.time()
  elapsed <- as.numeric(difftime(end, start, units = "secs"))
  cat(sprintf("Sequential mean: %.4f, time: %.4f seconds\n", mean_val, elapsed))
  return(elapsed)
}

# Parallel run 100000 simulations
run_parallel <- function(n_iter = 100000) {
  # Use SLURM allocated CPUs
  n_cores <- as.numeric(Sys.getenv("SLURM_CPUS_ON_NODE", 2))
  print(sprintf("Using %d CPUs", n_cores))
  # For Linux nodes, FORK type is faster
  cl <- makeCluster(n_cores, type = "FORK")
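  # Export function to workers
  clusterExport(cl, "simulate_one")
  # Give each worker an independent random-number stream; forked workers
  # would otherwise inherit the same seed from the parent process
  clusterSetRNGStream(cl)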
  start <- Sys.time()
  results <- parSapply(cl, 1:n_iter, simulate_one)
  mean_val <- mean(results)
  end <- Sys.time()
  stopCluster(cl)
  elapsed <- as.numeric(difftime(end, start, units = "secs"))
  cat(sprintf("Parallel mean:   %.4f, time: %.4f seconds\n", mean_val, elapsed))
  return(elapsed)
}

# Main
n_iter <- 100000
cat("=== Simulation of Normal Sums ===\n")
time_seq <- run_sequential(n_iter)
time_par <- run_parallel(n_iter)

# Speedup
cat(sprintf("Speedup: %.2fx\n", time_seq / time_par))

=== Simulation of Normal Sums ===
Sequential mean: 5.0003, time: 2.5477 seconds
[1] "Using 2 CPUs"
Parallel mean:   4.9999, time: 2.2929 seconds
Speedup: 1.11x


```bash
#!/bin/bash
#SBATCH -J sim_normal_R
#SBATCH -o /home/txw200000/ParallelComputingPython/sim_normal_R.out
#SBATCH -e /home/txw200000/ParallelComputingPython/sim_normal_R.err
#SBATCH -p turing
#SBATCH -N 1
#SBATCH -n 16
#SBATCH -t 02:00:00

# -----------------------------
# Paths
R_SCRIPT=/home/txw200000/ParallelComputingPython/sim_normal.R
# -----------------------------

echo "===== Environment Info ====="
echo "SLURM Job ID: $SLURM_JOB_ID"
echo "SLURM Nodes: $SLURM_NODELIST"
echo "CPUs allocated: $SLURM_CPUS_ON_NODE"
echo "----------------------------"

echo "Running Simulation: Sequential vs Parallel"

module swap gnu12 gnu9/9.4.0
module load R/4.4.1

Rscript $R_SCRIPT
```

##### HPC Output

```bash
===== Environment Info =====
SLURM Job ID: 2540988
SLURM Nodes: g-01-07
CPUs allocated: 16
R version: [1] "R version 4.4.1 (2024-06-14)"
----------------------------
Running Simulation: Sequential vs Parallel
=== Simulation of Normal Sums ===
Sequential mean: 5.0000, time: 1.2385 seconds
Parallel mean:   5.0002, time: 0.3258 seconds
Speedup: 3.80x
```


### Multiple computers (nodes)

In [None]:
!pip install mpi4py

Collecting mpi4py
  Downloading mpi4py-4.1.0-cp312-cp312-manylinux1_x86_64.manylinux_2_5_x86_64.whl.metadata (16 kB)
Downloading mpi4py-4.1.0-cp312-cp312-manylinux1_x86_64.manylinux_2_5_x86_64.whl (1.5 MB)
Installing collected packages: mpi4py
Successfully installed mpi4py-4.1.0


In [None]:
from mpi4py import MPI
import numpy as np
import time

# --- Fixed parameters ---
N_ITER = 100000
SAMPLE_SIZE = 100

# Simulate one replicate
def simulate_one():
    x = np.random.normal(0, 1, SAMPLE_SIZE)
    y = np.random.normal(5, 1, SAMPLE_SIZE)
    return np.mean(x + y)

# Simulate a chunk of replicates
def simulate_chunk(n_iter):
    return [simulate_one() for _ in range(n_iter)]

if __name__ == "__main__":
    # Initialize the MPI communicator
    comm = MPI.COMM_WORLD
    # Rank (ID) of this process, from 0 to size-1
    rank = comm.Get_rank()
    # Total number of MPI processes in this communicator
    size = comm.Get_size()

    if rank == 0:
        print(f"Total MPI tasks: {size}")
        print(f"Total simulations: {N_ITER}")
        print(f"Sample size per replicate: {SAMPLE_SIZE}")

        # Sequential run (optional)
        start_seq = time.time()
        seq_results = simulate_chunk(N_ITER)
        mean_seq = np.mean(seq_results)
        time_seq = time.time() - start_seq
        print(f"[Sequential] Mean: {mean_seq:.4f}, time: {time_seq:.4f} s")

    # Parallel run
    n_iter_per_task = N_ITER // size
    # Synchronize all MPI processes
    comm.Barrier()
    start_par = time.time()

    local_results = simulate_chunk(n_iter_per_task)
    all_results = comm.gather(local_results, root=0)

    end_par = time.time()

    if rank == 0:
        all_results = np.concatenate(all_results)
        mean_par = np.mean(all_results)
        time_par = end_par - start_par
        print(f"[Parallel] Mean: {mean_par:.4f}, time: {time_par:.4f} s")
        print(f"Total replicates collected: {len(all_results)}")
        print(f"Speedup: {time_seq / time_par:.2f}x")
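
One detail in the script above is worth noting: `N_ITER // size` drops the remainder, so some replicates are silently skipped whenever `N_ITER` is not a multiple of the number of MPI tasks. A minimal sketch of one way to spread the remainder across ranks follows. The helper name `split_counts` is hypothetical and is not part of mpi4py.

```python
def split_counts(n_iter, size):
    """Return a list of per-rank replicate counts that sums to n_iter."""
    base, extra = divmod(n_iter, size)
    # The first `extra` ranks take one additional replicate each.
    return [base + 1 if rank < extra else base for rank in range(size)]


if __name__ == "__main__":
    print(split_counts(100000, 20)[:3])  # [5000, 5000, 5000]
    print(split_counts(10, 3))           # [4, 3, 3]
```

In the MPI script, each process would then use `split_counts(N_ITER, size)[rank]` in place of `N_ITER // size`.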

```bash
#!/bin/bash
#SBATCH -J sim_normal_mpi
#SBATCH -o /home/txw200000/ParallelComputingPython/sim_normal_mpi.out
#SBATCH -e /home/txw200000/ParallelComputingPython/sim_normal_mpi.err
#SBATCH -p turing
#SBATCH -N 2
#SBATCH -n 20
#SBATCH --ntasks-per-node=10
#SBATCH -t 02:00:00

# -----------------------------
# Paths
PYTHON=/home/txw200000/.conda/envs/ml_env/bin/python
SCRIPT=/home/txw200000/ParallelComputingPython/sim_normal_mpi.py
# -----------------------------

echo "===== Environment Info ====="
echo "SLURM Job ID: $SLURM_JOB_ID"
echo "SLURM Nodes: $SLURM_NODELIST"
echo "MPI tasks:   $SLURM_NTASKS"
echo "Using Python: $($PYTHON -c 'import sys; print(sys.executable)')"
echo "----------------------------"

echo "Running MPI bootstrap simulation"
prun $PYTHON $SCRIPT
```

##### HPC Output

```bash
===== Environment Info =====
SLURM Job ID: 2581721
SLURM Nodes: c-05-12,g-01-07
MPI tasks:   20
Using Python: /home/txw200000/.conda/envs/ml_env/bin/python
----------------------------
Running MPI bootstrap simulation
[prun] Master compute host = c-05-12
[prun] Resource manager = slurm
[prun] Launch cmd = mpirun /home/txw200000/.conda/envs/ml_env/bin/python /home/txw200000/ParallelComputingPython/sim_normal_mpi.py (family=openmpi4)
Total MPI tasks: 20
Total simulations: 100000
Sample size per replicate: 100
[Sequential] Mean: 4.9999, time: 1.6111 s
[Parallel] Mean: 4.9995, time: 0.2374 s
Total replicates collected: 100000
Speedup: 6.79x
```


## 9. Getting Help

- Visit the [HPC@UTD services page](https://hpc.utdallas.edu/services/)

- Email circ-assist@utdallas.edu

- Stop by the HPC@UTD office on weekdays between 9:30 am and 3:30 pm in the Administration Building, AD 3.207