Merge pull request #53 from flindersuni/feature/gaussian16
Feature/gaussian16
The-Scott-Flinders committed Oct 10, 2022
2 parents 1143a16 + 2293e7c commit 7294269
Showing 11 changed files with 73 additions and 52 deletions.
10 changes: 5 additions & 5 deletions docs/source/FAQ/faq.rst
@@ -4,8 +4,8 @@ FAQ

Below are some of the common issues that the team has been asked to resolve more than once, so we have put them here to (hopefully) answer your questions before you have to wait in the Ticket Queue!

Host Not Found
===============
When Connecting, Host Not Found?
================================

When attempting to connect to the HPC, you receive a message that says 'Could not find deepthought.flinders.edu.au'.
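A minimal diagnostic sketch, assuming you have nslookup and ssh available, of how to check whether the hostname resolves from your machine:

# Check that the hostname resolves from your current network.
# (If this fails off-campus, a VPN connection is often required - check the HPC access docs.)
nslookup deepthought.flinders.edu.au

# Then attempt a verbose SSH connection to see where it fails.
ssh -v <FAN>@deepthought.flinders.edu.au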

@@ -48,14 +48,14 @@ Running the Job
There are several ways to correctly start OpenMPI/MPI-based programs. SLURM does an excellent job of integrating with OpenMPI/MPI, so usually it will 'Just Work'. It is, however, highly dependent upon how the program is structured and written. Here are some options that can help you boot things when they do not go to plan (a short sketch follows the list).

* mpirun - bootstraps a program under MPI. Best tested under a manual allocation via salloc.
* srun - Acts nearly the same as 'sbatch' but runs immediacy via SLURM, instead of submitting the job for later execution.
* srun - Acts nearly the same as 'sbatch' but runs immediately via SLURM, instead of submitting the job for later execution.
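A minimal sketch of both launch styles, assuming a placeholder binary ./my_mpi_program and a 16-task allocation:

#!/bin/bash
#SBATCH --ntasks=16
# srun inherits this job's allocation from SLURM, so the task count is not repeated.
srun ./my_mpi_program

# Interactive alternative: request an allocation first, then bootstrap with mpirun.
#   salloc --ntasks=16
#   mpirun -np 16 ./my_mpi_program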

OOM Killer
-----------
Remember, that each 'task' is its own little bucket - which means that SLURM tracks it individually! If a single task goes over its resource allocation, SLURM will kill it, and usually that causes a cascade failure with the rest of your program, as you suddenly have a process missing.
Remember that each 'task' is its own little bucket - which means that SLURM tracks it individually! If a single task goes over its resource allocation, SLURM will kill it, and usually that causes a cascade failure of your program, as you suddenly have a process missing.
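A hedged sketch of sizing those per-task buckets in a job script (the numbers are placeholders to tune for your workload):

# Per-task resource buckets in a job script.
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=4G   # each 2-CPU task gets an 8G bucket; exceed it and that task is OOM-killed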


Issues Installed ISoSeq3
Installing IsoSeq3
=====================

IsoSeq3, from Pacific Biosciences, has install instructions that won't get you all the way on DeepThought. There are some missing packages and some commands that must be altered to get you up and running.
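As a hedged starting point, the usual bioconda route is sketched below; the channel and package names are assumptions, and the DeepThought-specific alterations are covered in the rest of this FAQ entry.

# Create an isolated conda environment and pull IsoSeq3 plus common companion tools from bioconda.
conda create -n isoseq3 -c bioconda -c conda-forge isoseq3 pbccs lima
conda activate isoseq3
isoseq3 --version   # confirm the install before submitting jobs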
20 changes: 6 additions & 14 deletions docs/source/SLURM/SLURMIntro.md
@@ -405,18 +405,10 @@ An excellent guide to [submitting jobs](https://support.ceci-hpc.be/doc/_content
# data on /local, you will need to manually clean up that
# directory as a part of your job script.

# Example using the HPC Set $TMPDIR Variable
cd /local/
mkdir $SLURM_JOB_ID/ ; cd $SLURM_JOBID
# Example using the SLURM $BGFS Variable (the Parallel Filesystem)
cd $BGFS
cp /scratch/user/<FAN>/dataset ./

# A Manual 'Shared' Data-Set Directory
# DATADIR=/local/$SLURM_USER/dataset/
# mkdir -p $DATADIR
# cd $DATADIR
# cp -r /scratch/users/$USER/dataset/ ./


##################################################################
# Enter the command-line arguments that your job needs to run.

@@ -425,12 +417,12 @@ An excellent guide to [submitting jobs](https://support.ceci-hpc.be/doc/_content
# Once your job has finished its processing, copy back your results
# and ONLY the results to /scratch, then clean up the temporary
# working directory
# This command assumes that the destination exists

cp -r /$TMPDIR/<OUTPUT_FOLDER> /scratch/user/<FAN>/<JOB_RESULT_FOLDER>
cp -r $BGFS/<OUTPUT_FOLDER> /scratch/user/<FAN>/<JOB_RESULT_FOLDER>

# Using the example above with a shared dataset directory, your final step
# in the script should remove the directory folder
# rm -rf $DATADIR
# No need to cleanup $BGFS, SLURM handles the cleanup for you.
# Just don't forget to copy out your results, or you will lose them!

##################################################################
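Pulling the fragments above together, a minimal end-to-end job script using $BGFS might look like the sketch below; the placeholder command, dataset name and result folder are assumptions.

#!/bin/bash
#SBATCH --job-name=bgfs-example
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# Stage the input data into the per-job parallel filesystem directory SLURM created.
cd $BGFS
cp -r /scratch/user/<FAN>/dataset ./

# Run your program against the staged copy (placeholder command).
./my_program dataset/ -o results/

# Copy ONLY the results back; $BGFS is removed automatically when the job ends.
cp -r $BGFS/results /scratch/user/<FAN>/<JOB_RESULT_FOLDER>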

2 changes: 1 addition & 1 deletion docs/source/index.rst
@@ -56,7 +56,7 @@ HPC Visual Dashboard
+++++++++++++++++++++++++++
DeepThought has a time-series-based visual statistics dashboard that can be viewed via a web browser while on campus.

THe 'Default' and 'Public' URL's are using an *Alpha Release* feature set, and may not display correctly. In that case, please use the
The 'Default' and 'Public' URL's are using an *Alpha Release* feature set, and may not display correctly. In that case, please use the
'Full Version' link, and any display strangeness will be resolved.

The following URLs all link to the dashboard.
2 changes: 1 addition & 1 deletion docs/source/software/ansys.rst
@@ -262,7 +262,7 @@ Below are some command-line examples to get you started.

3. Single-Node Execution, GPU Enabled

``ansysedt -ng -batchsolve -Distributed --achinelist list="$SLURM_NODELIST:$SLURM_NTASKS:$SLURM_CPUS_PER_TASK -monitor -batchoptions "EnbleGPU=1" /path/to/project.aedt``
``ansysedt -ng -batchsolve -Distributed --machinelist list="$SLURM_NODELIST:$SLURM_NTASKS:$SLURM_CPUS_PER_TASK" -monitor -batchoptions "EnableGPU=1" /path/to/project.aedt``


1. Multi-Node
29 changes: 29 additions & 0 deletions docs/source/software/gaussian16.rst
@@ -0,0 +1,29 @@
-------------------------
Gaussian
-------------------------
=====================
Gaussian Status
=====================

Gaussian 16 is installed and available for use on the HPC.

.. _Gaussian16 Home: https://gaussian.com/gaussian16/

====================
Gaussian Overview
====================

From `Gaussian16 Home`_:

Gaussian 16 is the latest in the Gaussian series of programs. It provides state-of-the-art capabilities for electronic structure modelling.
Gaussian 16 is licensed for a wide variety of computer systems. All versions of Gaussian 16 contain every scientific/modelling feature,
and none imposes any artificial limitations on calculations other than your computing resources and patience.


++++++++++++++++++++++++++++++++++++++++++++++++++
Gaussian Program Quick List
++++++++++++++++++++++++++++++++++++++++++++++++++

The main binary for Gaussian is ``gau16``.
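A hedged sketch of a minimal Gaussian 16 job script; the module name, resource requests and input file are assumptions, and the %nprocshared/%mem directives in your input should match the SLURM request.

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=12:00:00

module load gaussian           # module name is an assumption - check 'module avail'

# gau16 reads the route section and geometry from the input and writes the log.
gau16 < molecule.com > molecule.log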


4 changes: 2 additions & 2 deletions docs/source/software/jupyter.rst
@@ -76,13 +76,13 @@ Conda Environment & Tensorflow Install
It is possible to use the GPU nodes and access the GPUs via the JupyterHub interface. In this example, Tensorflow is used as the GPU-enabled package of choice. To access the GPUs
from your Conda environment and use Tensorflow, perform the following steps:

1. Follow the 'Conda Environement Preparation' steps up to step 4.
1. Follow the 'Conda Environment Preparation' steps up to step 4.
2. Run the following commands to install Tensorflow. **DO NOT** use ``conda install tensorflow``, as it has issues with GPU detection. You *must* use pip.
a. ``python3 -m pip install tensorflow``
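Once installed, a quick way to confirm Tensorflow can actually see a GPU from your environment (run from a session that has a GPU allocated) is sketched below.

# Should print at least one PhysicalDevice entry when a GPU is visible.
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"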

There are a few known issues that can occur here; however, the main one is solved below:

1. An error about HTML has not attribute 'parser' or similar. This is usually resolved by upgrade pip.
1. An error about HTML has not attribute 'parser' or similar. This is usually resolved by upgrading pip.
a. ``python3 -m pip install --upgrade pip``

Try to re-install Tensorflow as above. If it still fails, upgrade the rest of the installation tooling for Python:
18 changes: 9 additions & 9 deletions docs/source/software/lammps.rst
@@ -10,7 +10,7 @@ LAMMPS was installed from the Development Branch on 7th Jan, 2022.
There are two versions of LAMMPS installed on DeepThought, each with their own modules:

1. A CPU only version, with the program called lmp
2. a GPU only version, with the program called lmp_gpu
2. A GPU only version, with the program called lmp_gpu

*You cannot run the GPU enabled version without access to a GPU, as it will cause errors.*
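A hedged sketch of how each binary is typically launched under SLURM; the module names and input file are assumptions.

# CPU build - run across the allocated MPI tasks.
module load lammps             # module name is an assumption - check 'module avail'
srun lmp -in in.melt

# GPU build - only on a GPU node, enabling the GPU suffix/package.
#   module load lammps-gpu
#   srun lmp_gpu -sf gpu -pk gpu 1 -in in.melt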

@@ -37,14 +37,14 @@ LAMMPS Installed Packages

The following is an extract from the ``lmp -h`` output, showing the enabled packages and capabilities of the LAMMPS installation.

ASPHERE ATC AWPMD BOCS BODY BROWNIAN CG-DNA CG-SDK CLASS2 COLLOID COLVARS
COMPRESS CORESHELL DIELECTRIC DIFFRACTION DIPOLE DPD-BASIC DPD-MESO DPD-REACT
DPD-SMOOTH DRUDE EFF EXTRA-COMPUTE EXTRA-DUMP EXTRA-FIX EXTRA-MOLECULE
EXTRA-PAIR FEP GPU GRANULAR H5MD INTERLAYER KIM KSPACE LATBOLTZ LATTE MACHDYN
MANIFOLD MANYBODY MC MDI MEAM MESONT MESSAGE MGPT MISC ML-HDNNP ML-IAP ML-PACE
ML-QUIP ML-RANN ML-SNAP MOFFF MOLECULE MOLFILE MPIIO MSCG NETCDF OPENMP OPT
ORIENT PERI PHONON PLUGIN PLUMED POEMS PTM PYTHON QEQ QMMM QTB REACTION REAXFF
REPLICA RIGID SCAFACOS SHOCK SMTBQ SPH SPIN SRD TALLY UEF VORONOI VTK YAFF

======================================
LAMMPS Quickstart Command Line Guide
14 changes: 8 additions & 6 deletions docs/source/software/softwaresuitesoverview.rst
@@ -20,14 +20,16 @@ List of Enterprise Software Suites
.. _VASP: vasp.html
.. _Delft 3D: delft3d.html
.. _Open Data Cube: opendatacube.html
.. _Gaussian16: gaussian16.html

1. `ANSYS`_
2. `Delft 3D`_
3. `GROMACS`_
4. `Jupyter Hub`_
5. `LAMMPS`_
6. `MATLAB`_
7. `Singularity Containers`_
8. `VASP`_
9. `Open Data Cube`_
4. `Gaussian16`_
5. `Jupyter Hub`_
6. `LAMMPS`_
7. `MATLAB`_
8. `Singularity Containers`_
9. `VASP`_
10. `Open Data Cube`_

2 changes: 1 addition & 1 deletion docs/source/software/vasp.rst
@@ -4,7 +4,7 @@ VASP
=======================================
VASP Status
=======================================
VASP: The Vienna Ab initio Simulation Package version 6.2.0 is operational with OpenACC GPU support on the HPC for the Standard, Gamma Ponit and Non-Collinear versions of the program.
VASP: The Vienna Ab initio Simulation Package version 6.2.0 is operational with OpenACC GPU support on the HPC for the Standard, Gamma Point and Non-Collinear versions.

.. _VASP: https://www.vasp.at/
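A hedged sketch of launching the three builds under SLURM; the binary names follow the upstream VASP 6 convention (vasp_std, vasp_gam, vasp_ncl) and the module name is an assumption.

module load vasp               # module name is an assumption - check 'module avail'

srun vasp_std                  # standard build
# srun vasp_gam                # Gamma-point-only build
# srun vasp_ncl                # non-collinear build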

16 changes: 7 additions & 9 deletions docs/source/storage/storageusage.rst
@@ -70,23 +70,21 @@ creates directories for you to use on this filesystem. See the environment varia
Once your job completes, is cancelled, or errors out, SLURM removes the entire directory of your job. That means, *if you do not move your data from the /cluster
filesystem, you will lose all of it*. This is by design, and the HPC Team cannot recover any data lost this way.

Each college is also limited to a **hard limit** on storage that mirrors their HPC SLURM allocation. This is currently
Each college is also limited to a **soft limit** on storage that mirrors their HPC SLURM allocation. This is currently

1. 45% CSE, ~18TB
2. 45% CMPH, ~18TB
3. 10% Other, ~5TB

When this quota is exceeded, no more files can be written, so be mindful of your and others usage. The HPC Team is actively monitoring and
improving the quota system and the above may change without warning.

When this quota is exceeded, files can still be written, but the HPC Team is notified of the user and their associated usage.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^
What to store in /cluster?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* Your working data sets
* Temporary job files
* Results, before you copy them back to /scratch

=======
/Home
@@ -99,7 +97,7 @@ What to store in /home
Here is a rough guide as to what should live in your /home/$FAN directory. In general, you only want to keep small items in here.

* SLURM Scripts
* Results from Jobs.
* 'Small' Results from Jobs
* 'Small' Data-Sets (<5GB)


@@ -113,4 +111,4 @@ Local is the per-node, high speed flash storage that is specific to each node. W
What to Store in /local
^^^^^^^^^^^^^^^^^^^^^^^^^

Only *transient files* should live on /local. Anything that your job is currently working on should be on /local. Once your job has finished with these files, they should be copied (or moved) to /scratch. The directory you were working in on /local should then cleaned, removing all files from your job.
Only *transient files* should live on /local. Anything that your job is currently working on can be on /local. Once your job has finished with these files, they should be copied (or moved) to /scratch. The directory you were working in on /local should then be cleaned, removing all files from your job.
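A minimal sketch of that /local workflow inside a job script, including the cleanup step; the dataset and result paths are placeholders.

# Work in a job-specific directory on the node-local flash storage.
WORKDIR=/local/$SLURM_JOB_ID
mkdir -p $WORKDIR
cd $WORKDIR
cp -r /scratch/user/<FAN>/dataset ./

# ... run your job against the local copy ...

# Copy the results back to /scratch, then clean up after yourself.
cp -r results/ /scratch/user/<FAN>/<JOB_RESULT_FOLDER>
rm -rf $WORKDIR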
8 changes: 4 additions & 4 deletions docs/source/system/deepthoughspecifications.md
@@ -34,7 +34,7 @@ There are 17 General Purpose nodes, each with:
- RAM:
- 256GB DDR4 @ 2666Mhz
- Local Storage
- ~3.2TB of NVMe SSD's
- 1TB of NVMe SSD's

### GPU Nodes

@@ -48,7 +48,7 @@ There are 3 dedicated GPU nodes. They comprise of two 'Standard' and One 'Light'
- GPU:
- 2 x TESLA V100 w/ 32GB VRAM
- Local Storage
- 3.2TB of NVMe
- 1TB of NVMe

#### Light GPU Node
- CPU:
@@ -58,7 +58,7 @@ There are 3 dedicated GPU nodes. They comprise of two 'Standard' and One 'Light'
- GPU:
- 1 x TESLA V100 w/ 32GB VRAM
- Local Storage
- 1.5TB of NVMe
- 1TB of NVMe

### High Capacity Node

@@ -69,7 +69,7 @@ There are 3 High-Capacity nodes with:
- RAM:
- 2TB (1.8TB) DDR4 @ 3200Mhz
- Local Storage
- 2.6TB of NVMe
- 1TB of NVMe

### Private Nodes

