Merge pull request #286 from ResearchComputing/allocations_year2
Alpine allocations docs updates
Trevhall52 committed Dec 22, 2023
2 parents 13fe436 + be40409 commit bc16c7d
Showing 3 changed files with 63 additions and 4 deletions.
16 changes: 12 additions & 4 deletions docs/clusters/alpine/allocations.md
@@ -54,8 +54,8 @@ combat this is to apply for an allocation.

In addition to the Trailhead auto-allocation (`ucb-general`) that all users are awarded automatically, CURC offers two
additional tiers to accommodate larger computing needs on Alpine. The **Ascent Allocation** tier provides users
-with 250,000 SUs over a 12 month period. The **Peak Allocation** tier is
-aimed at projects that will consume between 250,000 and 5,000,000 SUs in a
+with 350,000 SUs over a 12 month period. The **Peak Allocation** tier is
+aimed at projects that will consume between 350,000 and 6,000,000 SUs in a
12 month period. Users may apply for these tiers as described below.
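
As a rough illustration of the updated thresholds, the sketch below maps an estimated yearly SU need to one of the two tiers that require an application. The function name `suggest_tier` and the handling of the exact boundary values are assumptions made for this example, not CURC policy; needs small enough for the automatic `ucb-general` allocation are not modeled because its size is not stated here.

```python
ASCENT_SUS = 350_000      # Ascent award, taken from the updated text
PEAK_MAX_SUS = 6_000_000  # upper end of the Peak range, taken from the updated text


def suggest_tier(estimated_sus):
    """Return the tier name for an estimated 12-month SU need.

    Hypothetical helper: returns None when the estimate exceeds the
    Peak range, in which case the docs give no further guidance.
    """
    if estimated_sus < 0:
        raise ValueError("estimated_sus must be non-negative")
    if estimated_sus <= ASCENT_SUS:
        return "Ascent"
    if estimated_sus <= PEAK_MAX_SUS:
        return "Peak"
    return None


if __name__ == "__main__":
    for need in (100_000, 1_000_000, 7_000_000):
        print(need, suggest_tier(need))
```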

CURC's tiered allocations are structured in a way such that your jobs are
@@ -85,7 +85,7 @@ use.
#### Get a Peak Allocation

Step 1: Download and complete the [Peak Allocation Request Supplementary
-Information](https://o365coloradoedu.sharepoint.com/:x:/s/RC-Team/EW-ZgyLKV8VNhDrYVwH_UvoBGLFgZcZVU2-W2_xjx1EoAg?e=EsOj4M)
+Information](https://o365coloradoedu.sharepoint.com/:x:/s/RC-Team/EajdPBAejjpDru7kvEEA29QBI8CoO8lj7-kUjotBIIusEg?e=geLBBP)
document. You need to be logged into Office365 with your CU Boulder
account.

@@ -95,10 +95,18 @@ logged into Office365 with your CU Boulder account.
The last question will ask you to upload your completed Peak Allocation
Request Supplementary Information document from step 1.

-Step 3: Look out for email messages from the CURC ticketing system (<rc-help@colordo.edu>). User Support will contact you when the proposal
+Step 3: Look out for email messages from the CURC ticketing system (<rc-help@colorado.edu>). User Support will contact you when the proposal
is received, during the initial
review stages, and when the allocation is ready to use.

#### Renewing Your Allocation

Step 1: Keep an eye on your email inbox for a notification that your allocation is about to expire. Notifications will be sent one month prior to expiration to give you plenty of time to renew. Allocations will automatically expire one year after they are provisioned.

Step 2: Fill out either the [Peak Allocation Renewal](https://forms.office.com/r/wimT1SCsWz) form, or the [Ascent Allocation Renewal](https://forms.office.com/r/1ymj7gxQF3) form. You need to be logged into Office365 with your CU Boulder account.

Step 3: Look out for email messages from the CURC ticketing system (<rc-help@colorado.edu>). User Support will contact you when the renewal request is received and when the renewed allocation is ready to use.

Alpine is jointly funded by the University of Colorado Boulder, the University of Colorado Anschutz, Colorado State University, and the
National Science Foundation (award 2201538).

Binary file modified docs/clusters/alpine/images/alpine-allocation-tiers-chart.png
51 changes: 51 additions & 0 deletions docs/compute/monitoring-resources.md
@@ -41,6 +41,8 @@ You have sucessfully loaded slurmtools, a collection of functions
'seff' (CPU and RAM efficiency for a specified job)
'seff-array' (CPU, RAM, and time efficiency for a specified array job)
'jobstats' (job statistics for all jobs run by a specified user over N days)
'levelfs' (current fair share priority for a specified user)
@@ -253,6 +255,55 @@ This output tells us that:

This information is also sent to users who include the `--mail` directive in jobs.

___How can I check the efficiency of array jobs?___

Use the `seff-array` command with the help flag for a usage hint:
```
$ seff-array -h
```
```
usage: seff-array.py [-h] [-c CLUSTER] [--version] jobid
positional arguments:
  jobid
options:
  -h, --help            show this help message and exit
  -c CLUSTER, --cluster CLUSTER
  --version             show program's version number and exit
```
In order to check the efficiency of all jobs in job array 8636572, run the command:
```
$ seff-array 8636572
```
This will display the status of all jobs in the array:
```
--------------------------------------------------------
Job Status
COMPLETED: 249
FAILED: 4
PENDING: 1
RUNNING: 22
TIMEOUT: 4
--------------------------------------------------------
```
Additionally, `seff-array` will display a histogram of the efficiency statistics for all of the jobs in the array, separated into 10% increments. For example:
```
CPU Efficiency (%)
---------------------
+0.00e+00 - +1.00e+01 [ 3] ▌
+1.00e+01 - +2.00e+01 [244] ████████████████████████████████████████
+2.00e+01 - +3.00e+01 [ 8] █▎
+3.00e+01 - +4.00e+01 [ 2] ▍
+4.00e+01 - +5.00e+01 [ 0]
+5.00e+01 - +6.00e+01 [ 0]
+6.00e+01 - +7.00e+01 [ 0]
+7.00e+01 - +8.00e+01 [ 0]
+8.00e+01 - +9.00e+01 [ 0]
+9.00e+01 - +1.00e+02 [ 0]
```
The above indicates that all of the jobs displayed less than 40% CPU efficiency, with the majority (244 of the 257 jobs counted in the histogram) falling between 10% and 20%. The same breakdown is also displayed for memory and time efficiency.
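
To make the 10% binning concrete, here is a minimal Python sketch that groups per-job CPU efficiencies into the same ten ranges and reports the most populated one. The helper names `bin_efficiencies` and `summarize` and the sample values are invented for illustration; this is not how `seff-array` is implemented, only one way to reproduce the counts shown above.

```python
def bin_efficiencies(efficiencies, width=10):
    """Count efficiencies (in percent, 0-100) per bin of `width` percent."""
    n_bins = 100 // width
    counts = [0] * n_bins
    for value in efficiencies:
        # Clamp the index so 100% lands in the last bin and
        # any negative input lands in the first one.
        index = max(0, min(int(value // width), n_bins - 1))
        counts[index] += 1
    return counts


def summarize(counts, width=10):
    """Return the job total and the range and share of the fullest bin."""
    total = sum(counts)
    peak = max(range(len(counts)), key=counts.__getitem__)
    return {
        "total": total,
        "peak_range": (peak * width, (peak + 1) * width),
        "peak_share": counts[peak] / total if total else 0.0,
    }


if __name__ == "__main__":
    # Made-up efficiencies chosen to match the example histogram counts.
    sample = [5.0] * 3 + [15.0] * 244 + [25.0] * 8 + [35.0] * 2
    counts = bin_efficiencies(sample)
    print(counts)
    print(summarize(counts))
```

A histogram concentrated in the lowest bins, as here, usually suggests the jobs requested more cores than they used.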

### XDMoD

XDMoD is a web portal for viewing metrics at the system-, partition- and user-levels.
