
High performance computation


This section provides a guide on how to use CLOVER on Imperial College London's high-performance computers (HPCs). If you do not intend to run CLOVER on the HPCs, it is recommended that you skip this page as, whilst interesting, its content will not be applicable to you.

If you wish to gain access to Imperial College's HPCs, it is recommended that you get in touch with Imperial College London's RCS Service Desk to discuss your needs.


CLOVER is a powerful tool in and of itself, capable of running many simulations and optimisations with ease. However, there are some use cases, such as simulating the energy demands for multiple locations, where the computation time becomes too great to run reasonably on your own computer. In these cases, it is necessary to utilise the parallel processing power of a cluster-based system to carry out simulations and optimisations.

CLOVER was developed by researchers at Imperial College London. As such, the scripts developed for parallel high-performance computation were designed to run on Imperial College London's HPC. These scripts can be launched from the HPC's command-line interface (CLI) with only a single configuration file needed in addition to what is described elsewhere in the wiki pages.

Setting up your input files

Every run that CLOVER would normally carry out, e.g., a simulation or an optimisation called from the command line, can be sent to a separate "node" (computer) on Imperial's HPC. These runs are then carried out in parallel across the cluster, reducing the overall run time. We define these runs within a single YAML file:

---
################################################################################
# hpc_runs.yaml - A list of runs to be carried out on Imperial's HPC.          #
#                                                                              #
# Author: Ben Winchester                                                       #
# Copyright: Ben Winchester, 2022                                              #
# Date created: 19/08/2022                                                     #
# License: Open source                                                         #
################################################################################

- location: Bahraich
  pv_system_size: 20
  storage_size: 5
  type: simulation
- location: Bahraich
  pv_system_size: 20
  scenario: default
  storage_size: 15
  total_load: total_load.csv
  type: simulation
- location: Bahraich
  total_load: false
  type: optimisation
- location: Bahraich
  total_load: total_load.csv
  type: optimisation

This file contains a list of four HPC runs to be carried out. There are multiple ways that runs can be specified, and all of these are contained in the above snippet. We'll take a look first at what we can specify, and then at the examples which demonstrate this.

  • location: The name of the location for which a simulation or optimisation should be carried out.
  • type: Whether an optimisation (optimisation) or a simulation (simulation) should be carried out.
  • total_load: The name of a total-load file to use. The entry should match a valid .csv file name within your location's inputs/load directory. You can specify false explicitly if you want CLOVER to generate a load profile based on your input information rather than use a file, or you can simply leave this key out.
  • scenario: If you are running a simulation, the name of the scenario that CLOVER should use for the simulation.
  • pv_system_size: Matches the command-line flag used for simulations to specify the PV system size.
  • storage_size: Matches the command-line flag used for simulations to specify the storage size.

Now that we've covered each of these, we'll take a look at the runs that are specified in our file:

  • The first run is a simulation for Bahraich:
    location: Bahraich
    pv_system_size: 20
    storage_size: 5
    type: simulation
    Here, CLOVER will run a simulation for Bahraich, with a PV system size of 20 kWp and 5 kWh of storage. This is the same as running
    clover-energy -l Bahraich -pv 20 -b 5
    on your local computer;
  • The second run is a simulation for Bahraich, but where we specify more information:
    location: Bahraich
    pv_system_size: 20
    scenario: default
    storage_size: 15
    total_load: total_load.csv
    type: simulation
    Here, CLOVER will run a simulation for Bahraich with 20 kWp of power and 15 kWh of storage, with the default scenario explicitly specified and with the file total_load.csv used in lieu of load-profile generation;
  • The third run is an optimisation for Bahraich:
    location: Bahraich
    total_load: false
    type: optimisation
    Optimisations require less input information than simulations, as more of the required information is contained within input files rather than being passed on the command line. Here, the optional total_load: false entry has been used to state explicitly that no overall total-load file should be used. Writing this out is helpful when other runs in the same file do use a total-load file, as it makes clear that the omission here is deliberate.
  • The final run is an optimisation for Bahraich:
    location: Bahraich
    total_load: total_load.csv
    type: optimisation
    This is the same as the previous optimisation run, but where a total-load file is being used.

You should create a file, similar to the one above, for use within your CLOVER setup. This should sit at the same level as your locations folder:

├── hpc_runs.yaml
└── locations
    └── Bahraich
        ├── inputs
        │   ├── generation
        │   │   ├── diesel_inputs.yaml
        │   │   ├── generation_inputs.yaml
        │   │   ├── grid_times.csv
        │   │   └── solar_generation_inputs.yaml
        │   ├── impact
        │   │   ├── finance_inputs.yaml
        │   │   └── ghg_inputs.yaml
... etc ...

Correctly configuring your HPC

In order to correctly configure your HPC environment to run CLOVER 5, it is necessary to install a series of Python packages on which CLOVER depends. A helper script for doing this is included when CLOVER is obtained via a git clone. If you have instead installed the clover-energy package, you will need to set up your environment manually, and it is recommended that you download a copy of the helper script (./bin/hpc-setup.sh) from the master branch of this repository.
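
If you would rather set the environment up by hand, the following is a minimal sketch of one way to do so, assuming that Anaconda is already available on the HPC and that you are using the clover-energy package from PyPI; the environment name and Python version simply mirror those that appear in the helper script's output below:

$ conda create -n clover python=3.9    # create a dedicated environment for CLOVER
$ conda activate clover                # activate the new environment
$ pip install clover-energy            # install CLOVER and the Python packages it depends on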

Simply execute the helper script from the command-line interface:

$ ./bin/hpc-setup.sh
(clover) 17:39:04 bjw120 login-a /rds/general/user/_/home/CLOVER-energy ./bin/
clover.sh        hpc-setup.sh     new-location.sh  test-clover.sh   
(clover) 17:39:04 bjw120 login-a /rds/general/user/_/home/CLOVER-energy ./bin/hpc-setup.sh 
clover                *  /rds/general/user/_/home/anaconda3/envs/clover
Anaconda environment clover already exists, skipping.
Installing necessary packages .................................    
Collecting black>=22.3.0
  Downloading black-22.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.5 MB)
     |████████████████████████████████| 1.5 MB 27.6 MB/s 
Collecting mypy>=0.960
  Downloading mypy-0.971-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (17.6 MB)
     |████████████████████████████████| 17.6 MB 10.1 MB/s 
Requirement already satisfied: numpy>=1.20.1 in /rds/general/user/bjw120/home/anaconda3/lib/python3.9/site-packages (from -r requirements.txt (line 3)) (1.21.2)
...

Troubleshooting

If you encounter an error when running this script, contact the CLOVER development team and the RDS team.

A common error is a lack of disk space:

ERROR: Could not install packages due to an OSError: [Errno 122] Disk quota exceeded: '/rds/general/user/user/home/anaconda3/lib/python3.9/site-packages/pylint/testutils/functional/test_file.py'

If you encounter this error, your disk quota has been exceeded, meaning that you have too many Anaconda environments configured for a new CLOVER environment to be set up. Either contact the RDS team or delete old Anaconda environments to free up disk space.
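
For example, the standard conda commands will show which environments are configured and how much space they take up, and will let you remove any that are no longer needed; the environment name below is purely illustrative:

$ conda env list                     # list the Anaconda environments currently configured
$ du -sh ~/anaconda3/envs/*          # check how much disk space each environment occupies
$ conda env remove -n my_old_env     # remove an environment that is no longer needed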

Launching

In order to integrate CLOVER with Imperial's HPC, a new launch script is provided. This is run from the command-line in a similar way to how you would normally run CLOVER:

python -u -m src.clover.scripts.hpc_clover --runs hpc_runs.yaml --walltime 1


        (((((*    /(((
        ((((((( ((((((((
   (((((((((((( ((((((((((((
   ((((((((((((*(((((((((((((       _____ _      ______      ________ _____
     *((((((((( ((((((((((((       / ____| |    / __ \ \    / /  ____|  __ \
   (((((((. /((((((((((/          | |    | |   | |  | \ \  / /| |__  | |__) |
 ((((((((((((((((((((((((((,      | |    | |   | |  | |\ \/ / |  __| |  _  /
 (((((((((((*  (((((((((((((      | |____| |___| |__| | \  /  | |____| | \ \
   ,(((((((. (  (((((((((((/       \_____|______\____/   \/   |______|_|  \_\
   .((((((   (   ((((((((
             /     (((((
             ,
              ,
               (
                 (
                   (

                ___                     _      _   _  _ ___  ___
               |_ _|_ __  _ __  ___ _ _(_)__ _| | | || | _ \/ __|
                | || '  \| '_ \/ -_) '_| / _` | | | __ |  _/ (__
               |___|_|_|_| .__/\___|_| |_\__,_|_| |_||_|_|  \___|
                         |_|

       Continuous Lifetime Optimisation of Variable Electricity Resources
                         Copyright Phil Sandwell, 2018
                                 Version 5.0.4                                  

   This version of CLOVER has been adapted for Imperial College London's HPC
  See the user guide for more information on how to use this version of CLOVER

                         For more information, contact
                   Phil Sandwell (philip.sandwell@gmail.com),
                    Hamish Beath (hamishbeath@outlook.com),
               or Ben Winchester (benedict.winchester@gmail.com)


Checking HPC runs .............................................    [   DONE   ]
Processing HPC job submission script ..........................    [   DONE   ]
Sending jobs to the HPC:
6061663[].pbs
Sending jobs to the HPC .......................................    [   DONE   ]

Here, CLOVER has processed our runs and sent them to the HPC. It has given us the job number 6061663 to keep track of them.

The arguments we used are

  • --runs <hpc_runs_file.yaml> to point CLOVER to our newly created hpc_runs.yaml file;
  • and --walltime <walltime_in_hours> to tell CLOVER how long, in hours, each job should be allowed to run. The maximum permitted value is 72 (see the example below).
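
If, for example, you expect each of your runs to need considerably longer than an hour, you can simply request a longer walltime when launching, up to the 72-hour limit:

python -u -m src.clover.scripts.hpc_clover --runs hpc_runs.yaml --walltime 24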

We can use the qstat command to keep track of our jobs now that we have set them running:

$ qstat -t 6061663[]
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
6061672[].pbs     hpc.sh           bjw120                   0 Q v1_short8a      
6061672[1].pbs    hpc.sh           bjw120                   0 Q v1_short8a      
6061672[2].pbs    hpc.sh           bjw120                   0 Q v1_short8a      
6061672[3].pbs    hpc.sh           bjw120                   0 Q v1_short8a      
6061672[4].pbs    hpc.sh           bjw120                   0 Q v1_short8a   

Because we specified a one-hour walltime, our jobs have been put in the short-jobs queue, denoted by v1_short8a. For each of these jobs, logs are generated as they run:

$ ls -l logs/
total 16451
-rw-r--r--. 1 <user> hpc-<group>  1786 Aug 19 17:52 hpc_clover.log
-rw-------. 1 <user> hpc-<group>   812 Aug 19 17:55 hpc_run_1.log
-rw-------. 1 <user> hpc-<group>   883 Aug 19 17:55 hpc_run_2.log
-rw-------. 1 <user> hpc-<group>   778 Aug 19 17:55 hpc_run_3.log
-rw-------. 1 <user> hpc-<group>   848 Aug 19 17:55 hpc_run_4.log
-rw-------. 1 <user> hpc-<group> 21007 Aug 19 17:55 Bahraich_clover_1.log
-rw-------. 1 <user> hpc-<group> 20547 Aug 19 17:55 Bahraich_clover_2.log
-rw-------. 1 <user> hpc-<group> 33081 Aug 19 17:56 Bahraich_clover_3.log
-rw-------. 1 <user> hpc-<group> 33927 Aug 19 17:56 Bahraich_clover_4.log
-rw-------. 1 <user> hpc-<group>  1199 Aug 19 17:55 Bahraich_solar_generation.log

At first, there seem to be a lot of logs here! These are, though, exactly the logs that we would expect 😄

  • hpc_clover.log keeps track of our HPC launch script. If something had failed here, such as a job not being correctly formatted, we would have seen an error in this file. As it happens though, our file looks fine:
    19/08/2022 05:52:48 PM: hpc_clover: INFO: HPC-CLOVER script called.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: Arguments: --runs, hpc_runs.yaml, --walltime, 1
    19/08/2022 05:52:48 PM: hpc_clover: INFO: Command-line arguments successfully parsed.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: HPC input successfully parsed.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: Command-line arguments successfully parsed. Run file: hpc_runs.yaml
    19/08/2022 05:52:48 PM: hpc_clover: INFO: Checking all run files are valid.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: Parsing scenario input file.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: No desalination scenarios files provided, skipping.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: No hot-water scenario file provided, skipping.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: Parsing scenario input file.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: No desalination scenarios files provided, skipping.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: No hot-water scenario file provided, skipping.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: All HPC runs valid.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: Parsing base HPC job submission script.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: HPC job submission file successfully parsed.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: HPC job submission script updated with 4 runs, 1 walltime.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: Writing temporary HPC submission script.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: HPC job submission script successfully submitted.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: HPC job submission script permissions successfully updated.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: Submitting CLOVER jobs to the HPC.
    19/08/2022 05:52:48 PM: hpc_clover: INFO: HPC runs submitted. Exiting.
    
  • The hpc_run_<number>.log files are from the HPC end. Once our runs have been submitted, they sit in the HPC's queue until they are picked up. These logs are generated when the runs are picked up, and they contain information about whether CLOVER was successfully launched and whether the HPC successfully handled the runs;
  • The Bahraich_solar_generation.log file denotes the progress of CLOVER in downloading solar profiles from the renewables.ninja web interface;
  • And, finally, the Bahraich_clover_<run_number>.log files are the standard log files that are generated when CLOVER runs.
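
If you want to follow one of these logs while its run is still in progress, tailing it from the command line works well; the file name here is just one of the logs from the listing above:

$ tail -f logs/Bahraich_clover_1.log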

Troubleshooting

If you encounter any errors, your first port of call should be these log files. Otherwise, contact the CLOVER development team.