Added restart points and automatic restart support #512

Merged 10 commits on Jan 23, 2023
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -12,7 +12,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
-

### added
-
- **config** added setting cfg$keep_restarts which controls whether restart files should be kept after a run finished
- **config** changed default for `s_use_gdx` from 2 to 0
- **scripts** added restart points after each time step from which the model can now be restarted if the simulation aborts at some point
- **scripts** added SLURM dayMax submission type for standby QOS


### removed
-
7 changes: 6 additions & 1 deletion config/default.cfg
@@ -125,7 +125,7 @@ cfg$gms$c_timesteps <- "coup2100"
cfg$gms$c_past <- "till_2010"

# use of gdx files
cfg$gms$s_use_gdx <- 2 # def = 2
cfg$gms$s_use_gdx <- 0 # def = 0
#* 0: gdx will not be loaded
#* 1: gdx is loaded in the first time step
#* 2: gdx is loaded in all time steps
@@ -1804,6 +1804,7 @@ cfg$sequential <- NA
# * priority (immediate start, but slots limited to 5 in parallel)
# * priority_maxMem (same as priority but with 16 CPUs and max Memory)
# * standby (1 week max, preemption possible)
# * standby_dayMax (24h max, preemption possible)
# * standby_maxMem (same as standby but with 16 CPUs and max Memory)
# * NULL (educated guess of best option based
# * available resources)
@@ -1865,6 +1866,10 @@ cfg$model_name <- "MAgPIE"
# configuration
cfg$info <- list()

# Should the restart files of each iteration be kept in the output folder (TRUE)
# or deleted after the run finished (FALSE)
cfg$keep_restarts <- FALSE
tscheypidi marked this conversation as resolved.

# Should the model run in developer mode? This will loosen some restrictions,
# such as temporary toleration of coding etiquette violations
# Please make sure to set it to FALSE for production runs!
10 changes: 8 additions & 2 deletions core/calculations.gms
@@ -33,11 +33,15 @@ file dummy; dummy.pw=2000; put dummy;
* clear ct set
ct(t) = no;
pt(t) = no;

***************************TIMESTEP LOOP START**********************************
loop (t,
$label TimeLoop
$if not set TIMESTEP $set TIMESTEP 0

loop (t$(m_year(t) > %TIMESTEP%),

* set ct to current time period
ct(t) = yes;
ct(t) = yes;
pt(t) = yes$(ord(t) = 1);
pt(t-1) = yes$(ord(t) > 1);

@@ -90,6 +94,8 @@ $batinclude "./modules/include.gms" postsolve
ct(t) = no;
pt(t) = no$(ord(t) = 1);
pt(t-1) = no$(ord(t) > 1);

put_utility 'save' / 'restart_' t.tl:0;;
********************************************************************************
);
****************************TIMESTEP LOOP END***********************************
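
The loop condition `t$(m_year(t) > %TIMESTEP%)` in this hunk can be illustrated outside GAMS. The following is only a sketch in shell, assuming that `m_year(t)` maps a time-step label such as `y2020` to the number 2020; the function name `remaining_steps` is hypothetical and not part of the PR.

```shell
#!/usr/bin/env bash
# Sketch only: mimics the filter t$(m_year(t) > TIMESTEP) from the time loop.
# Assumption: a label "y2020" corresponds to the year 2020.

# Usage: remaining_steps TIMESTEP LABEL...
# Prints every label whose year is strictly greater than TIMESTEP.
remaining_steps() {
  local timestep="$1" t
  shift
  for t in "$@"; do
    # Strip the leading "y" and compare numerically.
    if [ "${t#y}" -gt "$timestep" ]; then
      printf '%s\n' "$t"
    fi
  done
}
```

With the default `TIMESTEP` of 0 every time step passes the filter, so a fresh run is unaffected. After a restart from the file saved at the end of `y2000`, only later time steps are executed.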
24 changes: 13 additions & 11 deletions main.gms
@@ -5,6 +5,8 @@
*** | MAgPIE License Exception, version 1.0 (see LICENSE file).
*** | Contact: magpie@pik-potsdam.de

$if set RESTARTPOINT $goto %RESTARTPOINT%

$title magpie

*' @title MAgPIE - Modelling Framework
@@ -147,24 +149,24 @@ $title magpie
*##################### R SECTION START (VERSION INFO) ##########################
*
* Used data set: rev4.79_h12_magpie.tgz
* md5sum: 4f3f5fd72716fe371d646c69c30e6fd3
* Repository: /p/projects/rd3mod/inputdata/output
* md5sum: NA

Review comment (Contributor): Unrelated to this PR, but why is the md5sum NA when the data comes from our public repo?

Reply (Member Author): idk

* Repository: https://rse.pik-potsdam.de/data/magpie/public
*
* Used data set: rev4.79_h12_fd712c0b_cellularmagpie_c200_MRI-ESM2-0-ssp370_lpjml-8e6c5eb1.tgz
* md5sum: a18af444eb4a3d24956f66e31e8634d8
* Repository: /p/projects/rd3mod/inputdata/output
* md5sum: NA
* Repository: https://rse.pik-potsdam.de/data/magpie/public
*
* Used data set: rev4.79_h12_validation.tgz
* md5sum: 0a617c2999127a50146ac106cd6ee4bf
* Repository: /p/projects/rd3mod/inputdata/output
* md5sum: NA
* Repository: https://rse.pik-potsdam.de/data/magpie/public
*
* Used data set: additional_data_rev4.36.tgz
* md5sum: e24c46872f77dc15ad8603bdac1e6065
* Repository: /p/projects/rd3mod/mirror/rse.pik-potsdam.de/data/magpie/public
* md5sum: NA
* Repository: https://rse.pik-potsdam.de/data/magpie/public
*
* Used data set: calibration_H12_09Jan23.tgz
* md5sum: 0fd18901ec047862918bf598ac126411
* Repository: /p/projects/rd3mod/mirror/rse.pik-potsdam.de/data/magpie/public
* md5sum: NA
* Repository: https://rse.pik-potsdam.de/data/magpie/public
*
* Low resolution: c200
* High resolution: 0.5
@@ -193,7 +195,7 @@ $title magpie
* * Call: withCallingHandlers(expr, message = messageHandler, warning = warningHandler, error = errorHandler)
*
*
* Last modification (input data): Tue Jan 17 11:25:32 2023
* Last modification (input data): Wed Jan 18 12:28:24 2023
*
*###################### R SECTION END (VERSION INFO) ###########################

17 changes: 16 additions & 1 deletion scripts/run_submit/submit.R
@@ -12,6 +12,19 @@ library(gms)

#options(error=function()traceback(2))

.getRestartFiles <- function() {
restartFiles <- dir(pattern="restart_.*\\.g00")
names(restartFiles) <- gsub("restart_y([0-9]*)\\.g00", "\\1", restartFiles)
return(restartFiles)
}

.getRestartCode <- function() {
restartFiles <- .getRestartFiles()
if (length(restartFiles) == 0) return("")
restartFiles <- tail(restartFiles, 1)
return(paste0(" --RESTARTPOINT=TimeLoop --TIMESTEP=", names(restartFiles), " r=", restartFiles))
}

cfg <- gms::loadConfig("config.yml")

maindir <- cfg$magpie_folder
@@ -20,7 +33,9 @@ maindir <- cfg$magpie_folder
timeGAMSStart <- Sys.time()

cat("\nStarting MAgPIE...\n")
system(paste("gams full.gms -errmsg=1 -lf=full.log -lo=",cfg$logoption,sep=""))
system(paste0("gams full.gms -errmsg=1 -lf=full.log -lo=",cfg$logoption, .getRestartCode()))

if (isFALSE(cfg$keep_restarts)) unlink(.getRestartFiles())

# Capture runtimes
timeGAMSEnd <- Sys.time()
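
The restart selection done by `.getRestartFiles()` and `.getRestartCode()` can be sketched in shell for readers who want to check by hand which arguments a restarted run would receive. This is an illustration and not the code used by the PR. It assumes restart files are named `restart_y<year>.g00`, as the `gsub` pattern in the diff suggests; the function name `restart_args` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the R helpers in submit.R under the assumption that
# restart files follow the pattern restart_y<year>.g00.

# Usage: restart_args [DIR]
# Prints the extra GAMS arguments for the newest restart file in DIR
# (default: current directory), or nothing if no restart file exists.
restart_args() {
  local dir="${1:-.}" latest="" f
  for f in "$dir"/restart_y*.g00; do
    [ -e "$f" ] || continue   # an unmatched glob stays literal, so skip it
    latest="$f"               # globs expand in sorted order; the last one wins
  done
  [ -n "$latest" ] || return 0

  local file year
  file="$(basename "$latest")"
  year="${file#restart_y}"    # drop the prefix ...
  year="${year%.g00}"         # ... and the extension, leaving the year
  printf ' --RESTARTPOINT=TimeLoop --TIMESTEP=%s r=%s' "$year" "$file"
}
```

With files for 1995, 2000 and 2010 present, the function prints the arguments pointing at `restart_y2010.g00`. With no restart files it prints nothing, so appending its output to the `gams full.gms ...` call leaves a fresh run unchanged.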
11 changes: 11 additions & 0 deletions scripts/run_submit/submit_standby_dayMax.sh
@@ -0,0 +1,11 @@
#!/bin/bash

#SBATCH --qos=standby
#SBATCH --job-name=mag-run
#SBATCH --output=slurm.log
#SBATCH --mail-type=END
#SBATCH --cpus-per-task=3
#SBATCH --partition=priority
#SBATCH --time=24:00:00

Rscript submit.R