Commit 53882a5

update core and node counts for alpine phase 3 (#255)

b-reyes committed Jul 31, 2023
1 parent 75430c0 commit 53882a5

Showing 3 changed files with 14 additions and 25 deletions.
4 changes: 2 additions & 2 deletions docs/clusters/alpine/alpine-hardware.md
@@ -4,7 +4,7 @@

| Count & Type | Scheduler Partition | Processor | Sockets | Cores (total) | Threads/Core | RAM/Core (GB) | L3 Cache (MB) | GPU type | GPU count | Local Disk Capacity & Type | Fabric | OS |
| --------------------- | ------------------- | ---------------- | ------- | ------------- | ------------ | ------------- | ------------- | ----------- | --------- | -------------------------- | -------------------------------------------- | -------- |
-| 192 Milan General CPU | amilan | x86_64 AMD Milan | 1 or 2 | 64 | 1 | 3.8 | 32 | N/A | 0 | 416G SSD | HDR-100 InfiniBand (200Gb inter-node fabric) | RHEL 8.4 |
+| 256 Milan General CPU | amilan | x86_64 AMD Milan | 1 or 2 | 64 | 1 | 3.8 | 32 | N/A | 0 | 416G SSD | HDR-100 InfiniBand (200Gb inter-node fabric) | RHEL 8.4 |
| 12 Milan High-Memory | amem | x86_64 AMD Milan | 2 | 48 | 1 | 21.5 | tbd | N/A | 0 | 416G SSD | HDR-100 InfiniBand (200Gb inter-node fabric) | RHEL 8.4 |
| 8 Milan AMD GPU | ami100 | x86_64 AMD Milan | 2 | 64 | 1 | 3.8 | 32 | AMD MI100 | 3 | 416G SSD | 2x25 Gb Ethernet +RoCE | RHEL 8.4 |
| 8 Milan NVIDIA GPU | aa100 | x86_64 AMD Milan | 2 | 64 | 1 | 3.8 | 32 | NVIDIA A100 | 3 | 416G SSD | 2x25 Gb Ethernet +RoCE | RHEL 8.4 |
@@ -63,7 +63,7 @@ Partitions available on Alpine:

| Partition | Description | # of nodes | cores/node | RAM/core (GB) | Billing_weight/core | Default/Max Walltime | Resource Limits |
| --------- | ---------------------------- | ---------- | ---------- | ------------- | ------------------- | ------------------------ | ----------------------|
-| amilan | AMD Milan (default) | 283 | 64 | 3.75 | 1 | 24H, 24H | see qos table |
+| amilan | AMD Milan (default) | 347 | 32, 48, or 64 | 3.75 | 1 | 24H, 24H | see qos table |
| ami100 | GPU-enabled (3x AMD MI100) | 8 | 64 | 3.75 | 6.1<sup>3</sup> | 24H, 24H | 15 GPUs across all jobs |
| aa100 | GPU-enabled (3x NVIDIA A100)<sup>4</sup> | 12 | 64 | 3.75 | 6.1<sup>3</sup> | 24H, 24H | 22 GPUs across all jobs |
| amem<sup>1</sup> | High-memory | 14 | 48 | 20.83<sup>2</sup> | 4.0 | 4H, 7D | 96 cores across all jobs |
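The partition table above maps directly onto Slurm batch directives. As an illustrative sketch (not part of this commit), a minimal job script targeting the default `amilan` partition might look like the following; the job name is a placeholder, and site-specific `--account`/`--qos` flags may also be required:

```bash
#!/bin/bash
#SBATCH --partition=amilan     # default CPU partition (see table above)
#SBATCH --nodes=1
#SBATCH --ntasks=4             # 4 cores, billed at weight 1 per core
#SBATCH --time=04:00:00        # within the 24H maximum walltime
#SBATCH --job-name=test-job    # placeholder name

echo "Running on $(hostname) with ${SLURM_NTASKS} tasks"
```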
22 changes: 6 additions & 16 deletions docs/clusters/alpine/index.rst
@@ -2,25 +2,15 @@ Alpine
===============

On May 18, 2022, CU Research Computing released phase 1 of Alpine, the third generation CURC High Performance
-Computing Cluster. Phase 2 was released in September 2022. The warranty of the Summit cluster's storage ended on
-September 30, 2022. The full sun-setting of the Summit system is March 1, 2023.
-
-.. image:: ./images/alpine_summit_timeline.png
-   :width: 400
-   :align: center
-   :alt: Summit-Alpine timeline
-
+Computing Cluster. Phase 2 was released in September 2022. Phase 3 was released in July 2023.

**Overview:**

-Alpine is the University of Colorado Boulder Research Computing’s third-generation high performance computing (HPC) cluster. Alpine is a heterogeneous compute cluster currently composed of hardware provided from University of Colorado Boulder, Colorado State University, and Anschutz Medical Campus. Alpine currently offers 317 compute nodes and a total of 18,080 cores.
-
-Alpine can be securely accessed anywhere, anytime using OpenOnDemand or ssh connectivity to the CURC system.
-
-Total Core Count: **18,080**
-
-All nodes are available to all users. For full details about node access, please refer to the Alpine Node Access and FairShare policy.
+Alpine is the University of Colorado Boulder Research Computing’s third-generation high performance computing (HPC) cluster.
+Alpine is a heterogeneous compute cluster currently composed of hardware provided by the University of Colorado Boulder, Colorado
+State University, and the Anschutz Medical Campus. Alpine currently offers 382 compute nodes and a total of 22,180 cores. Alpine can
+be securely accessed anywhere, anytime using Open OnDemand or ssh connectivity to the CURC system. All nodes are available to all
+users. For full details about node access, please refer to the Alpine Node Access and FairShare policy.

.. toctree::
:maxdepth: 1
13 changes: 6 additions & 7 deletions docs/clusters/alpine/quick-start.md
@@ -2,9 +2,8 @@

Alpine is the University of Colorado Boulder Research Computing's third-generation high performance computing (HPC)
cluster. Alpine is a heterogeneous compute cluster currently composed of hardware provided from University of Colorado
-Boulder, Colorado State University, and Anschutz Medical Campus. Alpine currently offers 317 compute nodes and a total of 18,080 cores.
-
-Alpine can be securely accessed anywhere, anytime using OpenOnDemand or ssh connectivity to the CURC system.
+Boulder, Colorado State University, and Anschutz Medical Campus. Alpine currently offers 382 compute nodes and a total
+of 22,180 cores. Alpine can be securely accessed anywhere, anytime using Open OnDemand or ssh connectivity to the CURC system.

### Alpine Quick-Start

@@ -29,23 +28,23 @@ request](https://curc.readthedocs.io/en/latest/clusters/alpine/software.html?hig
### Cluster Summary
#### Nodes
The Alpine cluster is made up of different types of nodes outlined below:
-- **CPU nodes**: 188 AMD Milan compute nodes (184 nodes with 64 cores/node, 4 nodes with 48 cores/node) with 256 GB RAM
+- **CPU nodes**: 347 AMD Milan compute nodes (270 nodes with 64 cores/node, 28 nodes with 48 cores/node, 49 nodes with 32 cores/node) with 256 GB RAM
- **GPU nodes**:
- 8 GPU-enabled (3x AMD MI100) atop AMD Milan CPU
-  - 8 GPU-enabled (3x NVIDIA A100) atop AMD Milan CPU
+  - 12 GPU-enabled (3x NVIDIA A100) atop AMD Milan CPU
- **High-memory nodes**: 12 AMD Milan nodes with 1TB of memory
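As a hedged sketch of how one of the GPU node types above might be requested (the `--gres` syntax is standard Slurm, but additional site-specific account/QOS flags may be required on CURC systems):

```bash
# Request one NVIDIA A100 on the aa100 partition for an interactive shell.
# Flags shown are generic Slurm; account/QOS options are intentionally omitted.
srun --partition=aa100 --gres=gpu:1 --ntasks=1 --time=01:00:00 --pty /bin/bash
```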

Alpine also includes nodes contributed by partner institutions. Contributors with nodes in either deployment or production are:
- **Colorado State University**: 77 AMD Milan compute nodes (28 nodes with 48 cores/node, 49 nodes with 32 cores/node)
-- **CU Anschutz Medical Campus**: 16 AMD Milan compute nodes (64 cores/node), 2 AMD Milan nodes with 1TB of RAM, and 4 GPU-enabled (3x NVIDIA A100 atop AMD Milan)
+- **CU Anschutz Medical Campus**: 14 AMD Milan compute nodes (64 cores/node), 2 AMD Milan nodes with 1TB of RAM, and 4 GPU-enabled (3x NVIDIA A100 atop AMD Milan)

All nodes are available to all users. For full details about node access, please read the [Alpine node access and FairShare policy](condo-fairshare-and-resource-access.md).

> For a full list of nodes on Alpine use the command: `scontrol show nodes`. Get single node details with the `scontrol show nodes <node name>` command.
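A few related queries built on the commands in the note above, as an illustrative sketch using standard Slurm tools; the node name in the last line is hypothetical:

```bash
scontrol show nodes | grep -c "NodeName="   # total number of nodes reported
sinfo --format="%P %D %c %m"                # partition, node count, CPUs, memory per node
scontrol show node c3cpu-c11-u1-1           # details for a single node (hypothetical name)
```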
#### Interconnect
The Alpine cluster has different types of interconnects/fabrics which connect different types of hardware, outlined below:
-- **CPU nodes**: HDR-100 InfiniBand (200Gb inter-node fabric); available on most CPU nodes as of September 2022 and on most remaining CPU nodes pending hardware arrivals
+- **CPU nodes**: HDR-100 InfiniBand (200Gb inter-node fabric); available on most CPU nodes as of July 2023 and on most remaining CPU nodes pending hardware arrivals
- **GPU nodes**: 2x25 Gb Ethernet +RoCE
- **High-memory nodes**: 2x25 Gb Ethernet +RoCE
- **Scratch storage**: 25Gb Ethernet +RoCE
