This issue was moved to a discussion.

High-Performance Computing (HPC) #16182

Closed
hongbo-miao opened this issue Apr 27, 2024 · 1 comment · Fixed by #16260 or #16319

hongbo-miao commented Apr 27, 2024

High-Performance Computing (HPC)

AWS ParallelCluster

HPC job managing and scheduling tools

Comparison

@hongbo-miao hongbo-miao self-assigned this Apr 27, 2024
@hongbo-miao hongbo-miao changed the title from "HPC job managing and scheduling" to "HPC" Apr 27, 2024
@hongbo-miao hongbo-miao linked a pull request Apr 28, 2024 that will close this issue
@hongbo-miao hongbo-miao changed the title from "HPC" to "High-Performance Computing (HPC)" Apr 28, 2024

hongbo-miao commented Apr 29, 2024

Added by

Added diagram as part of the full architecture.

[image: architecture diagram]

➜ pcluster create-cluster --cluster-name=hm-hpc-cluster --cluster-configuration=config/hm-hpc-cluster-config.yaml
{
  "cluster": {
    "clusterName": "hm-hpc-cluster",
    "cloudformationStackStatus": "CREATE_IN_PROGRESS",
    "cloudformationStackArn": "arn:aws:cloudformation:us-west-2:272394222652:stack/hm-hpc-cluster/d8cc5ef0-05b0-11ef-bf6e-0ac24af65aed",
    "region": "us-west-2",
    "version": "3.9.1",
    "clusterStatus": "CREATE_IN_PROGRESS",
    "scheduler": {
      "type": "slurm"
    }
  }
}
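
The response above only reports `CREATE_IN_PROGRESS`, so the cluster is not ready for `pcluster ssh` yet. A minimal sketch of one way to wait for it, assuming `pcluster describe-cluster` prints JSON containing a `clusterStatus` field like the output above. The helper names `cluster_status` and `wait_for_cluster` and the 30-second interval are hypothetical, not part of the original setup.

```shell
#!/usr/bin/env bash
# Sketch only: poll a ParallelCluster cluster until creation finishes.
# Assumption: `pcluster describe-cluster` emits JSON with a "clusterStatus" field.

# Hypothetical helper: read JSON on stdin and print the clusterStatus value.
cluster_status() {
  sed -n 's/.*"clusterStatus"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: poll until the status leaves CREATE_IN_PROGRESS.
# Returns 0 only if the final status is CREATE_COMPLETE.
wait_for_cluster() {
  local name="$1" interval="${2:-30}" status
  while true; do
    status="$(pcluster describe-cluster --cluster-name="$name" | cluster_status)"
    echo "clusterStatus: ${status}"
    [ "$status" != "CREATE_IN_PROGRESS" ] && break
    sleep "$interval"
  done
  [ "$status" = "CREATE_COMPLETE" ]
}

# Usage (not run here): wait_for_cluster hm-hpc-cluster 30
```

Parsing with `sed` keeps the sketch dependency-free; a JSON-aware tool would be more robust if one is available.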

➜ pcluster ssh --cluster-name=hm-hpc-cluster

ubuntu@ip-172-31-32-251:~$ sbatch --nodes=3 --partition=spot-queue --constraint="[c7gn-16xlarge*1&c7gn-metal*2]" --wrap="srun jobs/hello.sh"
Submitted batch job 1

ubuntu@ip-172-31-32-251:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 1 spot-queu     wrap   ubuntu CF       0:06      3 spot-queue-dy-c7gn16xlarge-1,spot-queue-dy-c7gnmetal-[1-2]

ubuntu@ip-172-31-32-251:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)

ubuntu@ip-172-31-32-251:~$ ls
jobs  slurm-1.out

ubuntu@ip-172-31-32-251:~$ cat slurm-1.out
Hello from spot-queue-dy-c7gn16xlarge-1
Hello from spot-queue-dy-c7gnmetal-1
Hello from spot-queue-dy-c7gnmetal-2
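
The contents of `jobs/hello.sh` are not shown in this thread. A minimal sketch that would produce output of the same shape as `slurm-1.out`, assuming the script just prints the name of the node it runs on. `SLURMD_NODENAME` is set by Slurm for job steps; the `hello` function name and the `uname -n` fallback are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical jobs/hello.sh: the real script is not shown in the issue.
# srun starts one copy per allocated node, so each node prints one line.

hello() {
  # Prefer the Slurm-provided node name; fall back to the system hostname
  # when run outside Slurm (assumption for local testing).
  echo "Hello from ${SLURMD_NODENAME:-$(uname -n)}"
}

hello
```

With `--nodes=3` and `srun`, a script like this would run once on each of the three nodes, which matches the three lines in the job output.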

Repository owner locked and limited conversation to collaborators May 29, 2024
@hongbo-miao hongbo-miao converted this issue into discussion #16981 May 29, 2024
