Merge pull request #9 from flindersuni/develop
Etiquette & Access Permissions
The-Scott-Flinders committed Sep 8, 2020
2 parents a5035d6 + 0e94a9b commit bbde94e
Showing 6 changed files with 229 additions and 22 deletions.
25 changes: 23 additions & 2 deletions docs/source/FAQ/faq.rst
@@ -75,7 +75,7 @@ When all goes well, your prompt should read something similar to

Notice the (/home/ande0548/isoseq3)? That's a marker to tell you which Python/Conda environment you have active at this point.

BX-Python
BX Python
----------
The bx-python version given in the wiki doesn't install correctly, and even if it *does* install, it will fail at runtime. To get a working version, run the following.
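
(The exact command is collapsed in this diff view; based on the BX-Python section added further down in this same file, it is presumably the same conda install shown there)::

    # Assumed from the BX-Python section below: install a working build from conda-forge/bioconda
    conda install -c conda-forge -c bioconda bx-python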

@@ -108,4 +108,25 @@ Right at the start of your script, add the following lines:
* source /home/FAN/.bashrc
* conda activate /path/to/conda/environment

This will load conda, initialise all of your conda environments, force a shell refresh to load the new configuration, then finally activate your environment. Your job can now run without strange conda-based initialisation errors.
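
For example, a minimal job script using this pattern might look like the following sketch (the resource requests and the final ``python`` command are placeholders for illustration)::

    #!/bin/bash
    #SBATCH --job-name=conda-example   # illustrative job name
    #SBATCH --ntasks=1
    #SBATCH --mem=2G                   # placeholder resource requests
    #SBATCH --time=00:30:00

    # Load conda and initialise all of your conda environments
    source /home/FAN/.bashrc

    # Activate the environment this job should use
    conda activate /path/to/conda/environment

    # Your actual workload goes here (placeholder)
    python my_analysis.py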


BX-Python
=========
bx-python is a problematic module that appears in many of the BioScience packages in Conda; the command below will get you a working Python 3 version.
These steps are the same as the IsoSeq3 installation, but given how often this particular Python package gives the support team issues, it gets its own section!

* conda install -c conda-forge -c bioconda bx-python


What can I do on the Head Node?
================================
The head nodes are for small jobs only: less than 10 minutes, as a rough guide.
Things like:

* Compiling software
* Copying / Decompressing Files
* Preparing Scripts

As a good rule, if it takes more than 10-15 minutes or more than 2GB of RAM, it should be run as a SLURM job, not on the head nodes.
Anything that uses too many resources on the head nodes will be *Terminated* **WITHOUT WARNING**.
2 changes: 1 addition & 1 deletion docs/source/FAQ/knownissues.rst
@@ -20,4 +20,4 @@ EPUB Version
Web / ReadTheDocs / Online Version
====================================

* Some builds seem to have magic-ed away the images and they no longer display correctly.
None at the moment.
10 changes: 9 additions & 1 deletion docs/source/index.rst
@@ -37,6 +37,14 @@ The new Flinders High Performance Computing (HPC) solution is called Deep Though
FAQ/knownissues.rst


.. toctree::
:maxdepth: 2
:caption: Software & Support Policies

policies/fairuse.rst
policies/accessandpermissions.rst


Acknowledgements
----------------

@@ -45,4 +53,4 @@ We recognise and respect the trademarks of all third-party providers referenced
License
^^^^^^^

This documentation is released under the `Creative-Commons: Attribution-ShareAlike 4.0 International <http://creativecommons.org/licenses/by-sa/4.0/>`_ license.
87 changes: 87 additions & 0 deletions docs/source/policies/accessandpermissions.rst
@@ -0,0 +1,87 @@
HPC Etiquette
==================
The HPC is a shared resource, and to help make sure everybody can
continue to use it together, the following outlines the expected behaviour.

Head / Login / Management Nodes
--------------------------------

1) The Head / Login / Management Nodes are to be used for light, single-threaded tasks only.

2) If it takes more than 5-10 minutes or > 2-3GB of RAM, do NOT run it on the head node.

3) Some acceptable tasks include:

* Compiling software for your own use
* Transferring / Decompressing Files
* Light Pre/Post Processing
* SLURM Job Management (see the example below)
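
For example, typical SLURM job management run from the head node looks like the following (the job ID and script name are placeholders)::

    sbatch my_job.sh    # submit a batch job to the scheduler
    squeue -u $USER     # check the state of your queued and running jobs
    scancel 123456      # cancel one of your own jobs by its job ID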


General Cluster Rules
------------------------

1) Use only the resources you need, remembering this is a shared resource.

2) Clean up your disk usage in /scratch and /home regularly.

3) Do not attempt to bypass security controls.

4) Do not attempt to bypass job Scheduling.

5) Do not access the compute nodes directly.

6) Utilise /local on the compute nodes for your data sets if possible (see the sketch below).
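
A common pattern for this (all paths and filenames below are purely illustrative) is to stage your input data onto /local at the start of a job, work against the fast node-local disk, then copy the results back and clean up::

    # Inside your SLURM job script: stage data onto the node-local disk
    mkdir -p /local/$USER/$SLURM_JOB_ID
    cp /scratch/FAN/my_dataset.tar.gz /local/$USER/$SLURM_JOB_ID/
    cd /local/$USER/$SLURM_JOB_ID
    tar -xzf my_dataset.tar.gz

    # ... run your analysis against the local copy ...

    # Copy results back to /scratch and clean up after yourself
    cp -r results/ /scratch/FAN/
    rm -rf /local/$USER/$SLURM_JOB_ID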


Permissions & Access Levels
----------------------------
The HPC has the following capabilities. If a capability is NOT listed,
then you are not permitted to perform the action. Security is approached
via a list of allowed actions, not a list of denied actions.

General User
+++++++++++++++

1) Full read & write permission to your own /scratch/FAN and /home/FAN locations

2) The ability to compile and run your own software, stored in your /home directory

3) The ability to run any module present in the module system (see the example below)

4) Manipulate your own jobs via SLURM
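
For example, finding and loading software from the module system looks like the following (the module name is illustrative; check what is actually available first)::

    module avail    # list the modules available on the HPC
    module load R   # load a module for use in the current shell (illustrative name)
    module list     # show what you currently have loaded
    module purge    # unload everything when you are done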


Group Admins
+++++++++++++
Trusted partners may be appointed 'Group Administrators' at the *sole discretion of the HPC Team*, allowing them to:

1) Perform all actions of a general user

2) Manipulate SLURM actions of users under their remit


Permissions that are Never Granted
+++++++++++++++++++++++++++++++++++++
The following is a non-exhaustive list of permissions that are never, and will never, be granted to end-users. The HPC is a complicated system
and, while the Support Team is asked for these permissions quite often, the potential inadvertent damage to systems means these permissions cannot be provided.

1) Root or Sudo access

2) Global 'Module' Software Installation

3) Elevated Access to the HPC System

4) Access to Managerial Systems


If You Break These Rules
----------------------------
If you break these rules, the HPC Team may take any or all of the following actions:

* Cancellation of Tasks
* Removal of problematic files and/or programs
* A warning about expected behaviour
* Termination of identified problematic processes
* Revocation of HPC Access
124 changes: 106 additions & 18 deletions docs/source/policies/fairuse.rst
Expand Up @@ -3,22 +3,24 @@ Fair Usage Guidelines

The Deepthought HPC provides moderate use at no charge to Flinders University Colleges.
This is enforced by the Fairshare System and is under constant tweaking and monitoring to ensure
the best possible outcomes.
the best possible outcomes. The current split of resources between colleges is:

The following can be taken as general guidelines:
* 45 % for CSE
* 45 % for CMPH
* 10 % for General

* The current split between colleges is:
For example, if the HPC had 1000 'shares' representing its resources, the following demonstrates how they would be allocated:

* 45 % CSE
* 45 % CMPH
* 10 % General
* 450 'shares' for CSE
* 450 'shares' for CMPH
* 100 'shares' for General
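
If you want to see where your usage sits against these shares, SLURM's standard ``sshare`` utility reports fairshare information (a generic SLURM command, not Deepthought-specific tooling)::

    sshare -U    # show the fairshare values for your own associations
    sshare -a    # show the full fairshare tree for all accounts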


.. GettingAccess: ../Access/GettingAccess.md
.. _Getting Access: ../Access/GettingAccess.html
.. _Module System: ../ModuleSystem/LMod.html

Storage Usage Guidelines
============================
As explained in 'Getting Access', /scratch and /home have different targets. Some guidelines to follow :
As explained in `Getting Access`_, /scratch and /home have different targets. Some guidelines to follow:

* Assume that anything on the HPC is *volatile* storage, and take appropriate backups
* Cleanup /scratch when you are done
@@ -30,24 +32,94 @@ Software Support Guidelines
====================================

HPC Users are encouraged to compile and install their own software when they are comfortable to do so.
This can be done freely on the head nodes.
This can be done freely on the head node(s).

The HPC Support team cannot maintain and provide active support for every piece of software that users of the HPC may need.
The following guidelines are an excerpt from our HPC Software Support Policy to summarise the key points.
For more information on how the software can be loaded/unloaded on the HPC, head on over to the `Module System`_.


Supported Software Categories
-------------------------------
The following categories are how the HPC Team assesses and manages the differing types of software present on the HPC.
Each has its own section below with more information.

* Core 'Most Used' Programs
* Licensed Software
* Libraries
* Toolchains
* Transient Packages
* Interpreters / Scripting Interfaces


Core 'Most Used' Programs
++++++++++++++++++++++++++++++++++++
This category holds the most used packages on the HPC. The HPC Team monitors the loading and unloading of modules so it can manage the lifecycle of software on the HPC.
As an example, some of the most used programs on the HPC are:

* R
* Python 3.8
* RGDAL
* CUDA 10.1 Toolkit

While not an exhaustive list of the common software, it does allow the team to focus its efforts and provide more in-depth support for these programs.
This means they are usually the first to be updated and have a wider range of tooling attached to them by default.

Licensed Software
+++++++++++++++++++++++++++++++++++++++++++
Licensed Software covers massive packages like ANSYS (which, fully installed, is about 300 *gigabytes*) that are licensed either to the HPC specifically or
obtained for use via University site licenses. This covers things like:

* ANSYS Suite (Structures, Fluids, Electronics, PrepPost & Photonics)


Compiler Toolchains
---------------------
Generally, we support the FOSS (Free Open Source Software) Toolchain, comprising of:
Libraries
++++++++++++++++++++++++
Generally, these libraries are required as dependencies for other software; however, there are some core libraries that are used more than others.
As an example, the Geometry Engine - Open Source (GEOS) Library is commonly used by other languages (R, Python) to allow for Geo-Spatial calculations.
Some of the common libraries include:

* GNU C, C++ & Fortran Compilers
* GEOS
* ZLib
* ImageMagick
* OpenSSL

Most of these are only directly useful for software compilation or runtime usage.

Unless compiling your own software, you can safely ignore this section - the `Module System`_ takes care of this for you.

Toolchains
+++++++++++++++++++++++++
Generally, we support the FOSS (Free Open Source Software) Toolchain, consisting of:

* GNU C, C++ & Fortran Compilers (gcc, g++ and gfortran)
* GNU Bin Utils
* Open MPI Library
* Open BLAS, LAPACK & ScaLAPACK
* FFTW Library
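
As a rough sketch of how this toolchain is used in practice (the module name here is an assumption and the source file is a placeholder; check ``module avail`` for the real name on Deepthought)::

    module load foss                      # assumed name for the FOSS toolchain bundle
    mpicc -O2 -o hello_mpi hello_mpi.c    # compile an MPI C program with the bundled GCC + Open MPI
    srun --ntasks=4 ./hello_mpi           # run it under SLURM across 4 tasks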

Transient Packages
+++++++++++++++++++++
Listed here for completeness, these packages are install-time dependencies or other packages that do not fit within the above schema.

Software Versioning & Support
-------------------------------

Not all software is suitable for the HPC, and the HPC team cannot manage every single piece of sftware that exists!
Below is a list of the core software & libraries that are under active support.
Scripting Languages
+++++++++++++++++++++
Interpreters like Python & R (and Perl, Ruby, Scala, Lua, etc.) are complex things with their own entire ecosystems of packages, versioning and tools.
These scripting interfaces (the technical term is 'interpreters') are each managed as their own standalone aspect of the HPC.

Using Python as an example you have:

* The interpreter 'Python'
* The package manager 'Pip'
* The Meta-Manager 'Conda'/'Mini-Conda'
* The Virtual Environments Manager 'venv'

Each of these interacts in slightly different ways and can cause its own issues (a short illustration follows at the end of this section). To ensure that the HPC team can support a core set of modules, the interpreters are only updated when:

* Security patches are needed
* A new *Major* Version is available
* A commonly requested feature requires an upgrade
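
As a concrete illustration of the layers listed above, the same package can be installed at several different levels (the package, paths and environment names are placeholders)::

    python3 -m venv ~/venvs/myproject        # create an isolated environment with 'venv'
    source ~/venvs/myproject/bin/activate    # activate it in the current shell
    pip install numpy                        # install a package into it with 'pip'

    # or, manage the whole stack with conda instead
    conda create -p /path/to/conda/environment numpy
    conda activate /path/to/conda/environment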

Versioning Support
+++++++++++++++++++++++
@@ -60,3 +132,19 @@ Most major packages will be supported in a Latest - 1 fashion. Below show an exa

As not all software follows such clean release patterns, the HPC Team will hold final say on updating a piece of software in the global module lists.



Upgrade Cycles
=====================================
The HPC Team does its best to adhere to the following upgrade cycle for software and associated systems.

======================== ============= =============== ==================================
Software Category Upgrade Cycle Outage Required Versioning Type
======================== ============= =============== ==================================
Core Supported Programs Quarterly No N - 1
Core Licensed Programs Bi-Yearly No N - 1
OS & Managerial Tools Yearly Yes Latest
Software Images Bi-Yearly Partial Latest
Scripting Interfaces Quarterly No Major, Security & Feature Minor
Scripting Modules Quarterly No Latest
======================== ============= =============== ==================================
3 changes: 3 additions & 0 deletions docs/source/system/deepthoughspecifications.md
@@ -11,6 +11,9 @@ The SLURM Scheduler has the notion of 'Job Queue' or 'Partitions'. These manage
|hpc_general | 13 | General Usage Pool | UNLIMITED |
|hpc_melfeu | 2 | Molecular Biology Lab private Nodes. | UNLIMITED |

## Storage Layout

Scratch: ~80TB of scratch disk, mounted on all nodes

## Node Breakdown

