The draft for read.settings compatibility with benchmarking #1095

bcow · 2016-10-17T15:47:27Z

Major changes:

Added benchmarking stage to workflow.R
Reworking the process of reading settings:
- read.settings reads the pecan.xml file
- add additional settings such as benchmarking settings
- update settings and write pecan.CHECKED.xml
- This new configuration works with normal pecan.xml files. You can test by doing runs with my pecan
New improved benchmarking.workflow.R that can be used when not using workflow.R
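The ordering described above (read, then add supplemental settings, then update and write) can be sketched in base R. This is only an illustration of the sequence: `read_stage`, `add_benchmark_stage` and `write_stage` are hypothetical stand-ins, not the PEcAn functions, and the output is written with `dput` rather than as XML.

```r
# Hypothetical stand-ins for the three stages; none of these are PEcAn functions.

read_stage <- function(parsed_xml) {
  # Stage 1: only turn the parsed file into a settings list, no editing.
  parsed_xml
}

add_benchmark_stage <- function(settings, benchmark) {
  # Stage 2: attach supplemental settings (here: a benchmark block).
  settings$benchmark <- benchmark
  settings
}

write_stage <- function(settings, outputfile) {
  # Stage 3: final update/check, then write the checked settings to disk.
  settings$checked <- TRUE
  dput(settings, file = outputfile)
  invisible(settings)
}

settings <- read_stage(list(outdir = "/tmp/run"))
settings <- add_benchmark_stage(settings, list(ensemble_id = "1"))
out <- file.path(tempdir(), "pecan.CHECKED.txt")
settings <- write_stage(settings, out)
```

Keeping the write as the last step means anything added in stage 2 ends up in the checked file.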

Example XML files

Can be used to test benchmarking (both with workflow.R and benchmark.workflow.R)
(I'll check them off when they are working)

1 variable, 1 metric, 1 site, 1 model

settings.file: modules/benchmark/inst/scripts/bm.1var.1metric.1site.1model.xml
New Run
Existing Run

2 variables, 2 metrics, 1 site, 1 model

settings.file: modules/benchmark/inst/scripts/bm.2var.2metric.1site.1model.xml
New Run
Existing Run

2 variables, 2 metrics, 2 sites, 1 model

settings.file: modules/benchmark/inst/scripts/bm.2var.2metric.2site.1model.xml
settings will now be a multisite object
New Run
Existing Run

mdietze

I think all the requested changes are quick fixes. Would love to see a quick turn-around and have this pulled today.

mdietze · 2016-10-25T14:01:57Z

.gitignore

@@ -69,5 +69,6 @@ tests/BC*
 compile_on_geo.sh
 documentation/tutorials/*/*.html
 pecan.Rproj
+shiny/BenchmarkReport/*


shiny/BenchmarkReport already exists in the repo, so a commit that adds it to .gitignore seems like a bad idea. Could you remove that? Better yet, if you've made changes to BenchmarkReport, could you commit those changes too?

mdietze · 2016-10-25T14:03:16Z

modules/benchmark/R/create.BRR.R

@@ -9,48 +9,49 @@
 ##' 
 ##' @author Betsy Cowdery 

-create.BRR <- function(ensemble.id, workflow, con) {
+create.BRR <- function(ens_wf, con){


changed argument to something hard to understand, but didn't update the documentation (param still listed as ensemble.id)

✔️ Updated the documentation

mdietze · 2016-10-25T14:04:35Z

modules/benchmark/R/create.BRR.R

-    # If the ensemble run was done on localhost, turn into a BRR
+  cnd1 <- ens_wf$hostname == fqdn() 
+  cnd2 <- ens_wf$hostname == 'test-pecan.bu.edu' & fqdn() == 'pecan2.bu.edu'
+  cnd3 <- ens_wf$hostname == 'pecan2.bu.edu' & fqdn() == 'test-pecan.bu.edu'


would be nice to figure out how to avoid having hacks like this hardcoded (Fix not required in this PR)

You commented on this in the last PR but I still don't have any ideas on what to do with it.
It's just a special case that involves our pecan servers. Could simplify it to:

ens_wf$hostname == fqdn() | length(grep("pecan.bu.edu", c(fqdn(), "test-pecan.bu.edu")))==2

Would it be possible to add a comment explaining why we cannot run a benchmark reference run on any of those other machines, and what conditions a host needs to meet to be added to the list of exceptions?

Here I'm just trying to find a way to say that test-pecan, and pecan2 are actually the "same" machine in that the files are accessible from either.
But this is skirting around the issue that if one wants to turn an existing ensemble into a benchmark reference run and that ensemble was not run on localhost, getting all the information to populate the settings section of the reference run table is going to be difficult and may not be possible. This is because we may not be able to read the pecan.xml file on the remote host and the ensemble settings aren't saved anywhere else. This is something that @mdietze and I talked about earlier this week and maybe Mike can explain it better than I can.

What if you look at settings$outdir and check whether it exists on the local machine? If so, you can (I think) assume that all the files are local.
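A minimal sketch of the two ideas in this thread, using base R only. The alias list and the names `host_aliases`, `same_machine` and `files_are_local` are hypothetical; in the real code the local hostname would come from `fqdn()`, whereas here it is passed in as an argument so the example runs on its own.

```r
# Hypothetical alias groups: hosts in one group are assumed to share a filesystem.
host_aliases <- list(c("test-pecan.bu.edu", "pecan2.bu.edu"))

# TRUE if the run host is the local host, or both belong to the same alias group.
same_machine <- function(run_host, local_host, aliases = host_aliases) {
  if (identical(run_host, local_host)) {
    return(TRUE)
  }
  any(vapply(aliases,
             function(group) all(c(run_host, local_host) %in% group),
             logical(1)))
}

# The outdir suggestion: treat the run as local if its output directory
# is visible from this machine, regardless of hostname.
files_are_local <- function(outdir) {
  is.character(outdir) && length(outdir) == 1 && dir.exists(outdir)
}

same_machine("pecan2.bu.edu", "test-pecan.bu.edu")
files_are_local(tempdir())
```

The directory check avoids hardcoding hostnames, but it only shows that the path exists locally, not that it holds the same run, so it is a heuristic rather than a guarantee.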

mdietze · 2016-10-25T14:07:46Z

modules/benchmark/R/create.BRR.R

-# Case in which we would need a remote connection to get the pecan.xml file - not functional
+    settings_xml <- toString(listToXml(clean, "pecan"))
+
+    ref_run <- db.query(paste0(" SELECT * from reference_runs where settings = '", settings_xml,"'"),con)


This is OK for now, but in the long run it's a pretty weak test -- a settings XML that is LOGICALLY the same but not character-for-character identical (e.g. whitespace, order of tags) wouldn't come up.

Good point. I will put this on the todo list.
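One possible direction for that todo, sketched in base R: canonicalize the nested settings list (as produced by something like `xmlToList`) before comparing, so that tag order and surrounding whitespace no longer matter. `canonicalize_settings` is a hypothetical helper, and it assumes tag order carries no meaning, which may not hold for repeated tags such as multiple PFTs.

```r
# Hypothetical helper: recursively sort named list elements and trim strings.
canonicalize_settings <- function(x) {
  if (is.list(x)) {
    x <- lapply(x, canonicalize_settings)
    if (!is.null(names(x))) {
      # order() is stable, so repeated tags keep their relative order.
      x <- x[order(names(x))]
    }
    return(x)
  }
  if (is.character(x)) trimws(x) else x
}

a <- list(model = list(type = "SIPNET ", id = "1"), outdir = "/tmp/out")
b <- list(outdir = " /tmp/out", model = list(id = "1", type = "SIPNET"))

identical(a, b)
identical(canonicalize_settings(a), canonicalize_settings(b))
```

The canonical form could then be serialized and stored or hashed, so the database lookup compares normalized strings instead of raw XML.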

mdietze · 2016-10-25T14:12:54Z

modules/benchmark/R/create.BRR.R

+      ref_run <- db.query(paste0("INSERT INTO reference_runs",
+                                 "(model_id, settings, user_id, created_at, updated_at)",
+                                 "VALUES(",ens_wf$model_id,", '",settings_xml,"' , ",user_id,
+                                 ", NOW() , NOW()) RETURNING *;"),con)


Per #1083 we're supposed to remove created_at and updated_at NOW(),NOW() from SQL. The database inserts these automatically and @dlebauer noted it is a bug since most machines insert NOW() in local time, not UTC. @tonygardella make sure this goes into the style guide.

✔️ Removed all created_at, updated_at, NOW(),NOW() in benchmarking
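For illustration, the insert statement without the timestamp columns might be assembled like this. This is a sketch, not the code in the PR: `as_id`, `sql_quote` and `build_reference_run_insert` are hypothetical helpers, and the resulting string would still have to be passed to `db.query`.

```r
# Hypothetical helper: accept only plain non-negative integer ids.
as_id <- function(x) {
  if (is.numeric(x)) {
    x <- format(x, scientific = FALSE, trim = TRUE)
  }
  stopifnot(length(x) == 1, grepl("^[0-9]+$", x))
  x
}

# Hypothetical helper: quote a string literal, doubling embedded single quotes.
sql_quote <- function(x) {
  paste0("'", gsub("'", "''", x, fixed = TRUE), "'")
}

# Build the INSERT without created_at / updated_at; the database fills those in.
build_reference_run_insert <- function(model_id, settings_xml, user_id) {
  paste0(
    "INSERT INTO reference_runs (model_id, settings, user_id) ",
    "VALUES (", as_id(model_id), ", ", sql_quote(settings_xml), ", ",
    as_id(user_id), ") RETURNING *;"
  )
}

sql <- build_reference_run_insert(2, "<pecan><a>it's</a></pecan>", 1000000001)
cat(sql, "\n")
```

Doubling single quotes also keeps a settings XML that contains an apostrophe from breaking the statement.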

mdietze · 2016-10-25T14:25:40Z

modules/benchmark/R/read.settings.RR.R

+  settings <- tbl(bety,"reference_runs") %>% 
+    filter(id == settings$benchmark$reference_run_id) %>% 
+      dplyr::select(settings) %>% collect() %>% unlist() %>%
+      xmlToList(.,"pecan") %>% append(settings,.) %>% Settings()


Impressive that you can do the query and insert all in one line. Probably would have taken me a page of code.
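One detail of the `append(settings, .)` step is worth keeping in mind. Base R's `append` concatenates lists without merging names, so a tag present in both the existing settings and the reference-run settings appears twice. The toy lists below are made up for illustration; `utils::modifyList` is shown only as a contrast, not as what the PR uses.

```r
# Toy stand-ins for the existing settings and the settings loaded from the database.
existing <- list(outdir = "/tmp/run", benchmark = list(reference_run_id = "7"))
from_db  <- list(model = list(type = "SIPNET"), outdir = "/data/ref")

# append() keeps both "outdir" entries; `$` returns the first match.
appended <- append(existing, from_db)
names(appended)
appended$outdir

# modifyList() merges by name, so the second list wins on conflicts.
merged <- utils::modifyList(existing, from_db)
merged$outdir
```

Which behaviour is wanted depends on whether the values already in `settings` or the stored reference-run values should take precedence.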

mdietze · 2016-10-25T14:27:04Z

settings/R/check.all.settings.R

+  library(XML)
+  library(lubridate)
+  library(PEcAn.DB)
+  library(PEcAn.utils)


mdietze · 2016-10-25T14:27:58Z

settings/R/fix.deprecated.settings.R

+  library(XML)
+  library(lubridate)
+  library(PEcAn.DB)
+  library(PEcAn.utils)


mdietze · 2016-10-25T14:31:26Z

settings/R/write.settings.R

+write.settings <- function(settings, outputfile = "pecan.CHECKED.xml"){
+  library(XML)
+  library(PEcAn.DB)
+  library(PEcAn.utils)


Can you think of a more appropriate name for this function? write.settings sounds like a function that reads in a settings object and writes it to file. This does a lot more than that! Unfortunately clean.settings and update.settings are already in use.

packages

settings2pecan.CHECKED?
Or should we not call the other function update.settings?

OK for today, but renaming the update.settings function so you can name this function update is definitely an option.

mdietze · 2016-10-25T14:33:10Z

web/workflow.R

+
+# Write pecan.CHECKED.xml
+settings <- write.settings(settings, outputfile = "pecan.CHECKED.xml")
+


I don't see the call to calc.benchmarks at the end of workflow.R

✔️ Added it!

bcow added 10 commits August 18, 2016 14:50

Switching from require to library

a5cab2c

Changing workflow$folder to workflow$params as what will be entered in for reference_run$settings

244d6da

Merge branch 'master' of github.com:PecanProject/pecan into bench

6370f09

Merge branch 'master' of github.com:PecanProject/pecan into bench

e1a589f

Function for loading settings from an XML file. Copied from the original `read.settings` function.

72001f0

Function to read settings in from the reference runs table in bety. I didn't call it BRR because I thought reference runs weren't always specifically for benchmarking.

2f40d52

Moving all "internal" functions out of the actual read.settings script. Simply because it was too hard to read, I can always recombine them.

161399c

First pass at a read.settings that can call different functions for different settings types (i.e. XML or reference run ids). Also, trying to figure out where I'm adding in supplemental settings (such as benchmarking settings)

187777b

Adding additional "cleaning" steps after clean.settings is run to prep settings XML for insertion into the reference_runs table

60b78c6

update gitignore

0215f30

tonygardella assigned araiho Oct 18, 2016

bcow added 16 commits October 19, 2016 01:56

Separating out the functions addSecrets, fix.deprecated.settings, `update.settings`, `check.settings` into individual Rscript files.

bdd422e

read.settings only reads an XML document and produces a settings object. It does not do any additional editing.

bbaf0c4

write.settings completes the process of checking settings and creating the pecan.CHECKED.xml file

1307a0e

Adding in new steps for writing pecan.CHECKED.xml, moving `read.settings.RR` to the benchmarking module

dd53978

Merge branch 'master' of github.com:PecanProject/pecan into bench

88c16b2

# Conflicts: # modules/benchmark/R/create.BRR.R # modules/benchmark/R/load.data.R

Merge branch 'bench' of github.com:bcow/pecan into read.settings.bench

4652ddc

Removing unnecessary loading of libraries

2e7318b

documentation

d677d02

More documentation

e93f104

More Documentation

b5d921c

Fixing naming mistake

873fc19

Still fixing documentation!

9bb023f

Switching from pecan.xml to pecan.CHECKED.xml for the source of settings for BRR

77a4a76

Explicitly state package for run.meta.analysis.pft

c9f1543

Fixing external references

9b94f1a

Merge branch 'master' of github.com:PecanProject/pecan into read.settings.bench

8c0d940

# Conflicts: # settings/R/read.settings.R # utils/R/utils.R

mdietze requested changes Oct 25, 2016


bcow added 2 commits October 26, 2016 13:58

Adding create.benchmark function for making benchmarking records in BETY

3810b0c

clean.settings now has a write argument so that one can run clean.settings without it saving an XML file.

5e17aa9

bcow and others added 20 commits October 26, 2016 14:03

create.BRR now called from within read.settings.RR assuming that the xml settings file may only contain ensemble ids (and not BRR ids)

a71a56b

Small updates in variable names and function arguments in response to new settings configuration

f9bd4c3

Merge branch 'master' into read.settings.bench

6bffddd

Fixing variable names so they don't conflict with variable names that are used elsewhere in the benchmarking workflow. This shouldn't affect anyone else's work. Adding a special case for annual data. This may need more work.

631399a

Using variables names that are consistent with calc.metrics.

b8416ae

Populating rundir, modeloutdir, outdir in settings from the workflow table entry if there is no new run.

0a01921

load.data now uses site timezone

01740b5

Small fixes

7efa6b9

Re-enabling insertion into the scores table in BETY. Previously disabled because scores wasn't in the schema.

212a027

Slightly reworking the results table from calc.metrics.

e489c17

Commenting out insert scores again - but keeping changes I made

905d0e4

Fixing typos, adding scores back in

62b9dfe

Fixing confusion between ensemble_id and benchmarks_ensemble_id

ad4bfc7

Fixing plot metrics & calc.benchmark so that outputs/plots are saved in outdir

16d80b2

Adding very basic documentation and example XML files. Updating benchmark.workflow

aede4e2

Adding benchmarking into workflow.R

b4fb1d9

Fixing date problem in metric.timeseries.plot

fe378af

Merge branch 'master' of github.com:PecanProject/pecan into read.settings.bench

996b21f

Merge branch 'read.settings.bench' of github.com:bcow/pecan into read.settings.bench

9875102

Removing created.at, updated.at, NOW(). Updating documentation.

f962b57

bcow force-pushed the read.settings.bench branch from 12b7b48 to f962b57 Compare October 27, 2016 13:31

Updates to .Rd files

36e77d9

mdietze approved these changes Oct 27, 2016


mdietze merged commit 3431df5 into PecanProject:master Oct 27, 2016

mdietze mentioned this pull request Nov 1, 2016

Implement #1156 #1157

Merged

8 tasks
