-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The draft for read.settings compatibility with benchmarking #1095
Conversation
…n for reference_run$settings
…nal `read.settings` function.
… didn't call it BRR because I thought reference runs weren't always specifically for benchmarking.
…ipt. Simply because it was too hard to read, I can always recombine them.
…ifferent settings types (ie XML or reference run id's). Also, trying to figure out where I'm adding in supplemental settings (such as benchmarking settings)
…rep settings XML for insertion into the reference_runs table
…, `update.settings`, `check.settings` into individual Rscript files.
…ject. It does not do any additional editing.
…ing the pecan.CHECKED.xml file
…ngs.RR` to the benchmarking module
# Conflicts: # modules/benchmark/R/create.BRR.R # modules/benchmark/R/load.data.R
…ings.bench # Conflicts: # settings/R/read.settings.R # utils/R/utils.R
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all the requested changes are quick fixes. Would love to see a quick turn-around and have this pulled today.
@@ -69,5 +69,6 @@ tests/BC* | |||
compile_on_geo.sh | |||
documentation/tutorials/*/*.html | |||
pecan.Rproj | |||
shiny/BenchmarkReport/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shiny/BenchmarkReport exists in the repo already, so a commit that adds it to the ignore seems like a bad idea. Could you remove that. Better yet, if you've made changes to BenchmarkReport could you commit those changes too?
@@ -9,48 +9,49 @@ | |||
##' | |||
##' @author Betsy Cowdery | |||
|
|||
create.BRR <- function(ensemble.id, workflow, con) { | |||
create.BRR <- function(ens_wf, con){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed argument to something hard to understand, but didn't update the documentation (param still listed as ensemble.id)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️ Updated the documentation
# If the ensemble run was done on localhost, turn into a BRR | ||
cnd1 <- ens_wf$hostname == fqdn() | ||
cnd2 <- ens_wf$hostname == 'test-pecan.bu.edu' & fqdn() == 'pecan2.bu.edu' | ||
cnd3 <- ens_wf$hostname == 'pecan2.bu.edu' & fqdn() == 'test-pecan.bu.edu' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to figure out how to avoid having hacks like this hardcoded (Fix not required in this PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You commented on this in the last PR but I still don't have any ideas on what to do with it.
It's just a special case that involves our pecan servers. Could simplify it to:
ens_wf$hostname == fqdn() | length(grep("pecan.bu.edu", c(fqdn(), "test-pecan.bu.edu")))==2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add some text as a comment on the reason why we can not run a benchmark reference run on any of those other machines, what conditions do we need to be added to the list of exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I'm just trying to find a way to say that test-pecan, and pecan2 are actually the "same" machine in that the files are accessible from either.
But this is skirting around the issue that if one wants to turn an existing ensemble into a benchmark reference run and that ensemble was not run on localhost, getting all the information to populate the settings section of the reference run table id going to be difficult / may not be possible. This is because we may not be able to read the pecan.xml file on the remote host and the ensemble settings aren't saved anywhere else. This is something that @mdietze and I talked about earlier this week and maybe Mike can explain it better than I can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if you look at the settings$outdir and see if that exists on the local machine, if so you can (I think) assume that all the files are locally.
# Case in which we would need a remote connection to get the pecan.xml file - not functional | ||
settings_xml <- toString(listToXml(clean, "pecan")) | ||
|
||
ref_run <- db.query(paste0(" SELECT * from reference_runs where settings = '", settings_xml,"'"),con) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is OK for now, but in the long run it's a pretty weak test -- an settings that is LOGICALLY the same but not character-for-character identical (e.g. whitespace, order of tags) wouldn't come up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I will put this on the todo list.
ref_run <- db.query(paste0("INSERT INTO reference_runs", | ||
"(model_id, settings, user_id, created_at, updated_at)", | ||
"VALUES(",ens_wf$model_id,", '",settings_xml,"' , ",user_id, | ||
", NOW() , NOW()) RETURNING *;"),con) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per #1083 we're supposed to remove created_at and updated_at NOW(),NOW() from SQL. The database inserts these automatically and @dlebauer noted it is a bug since most machines insert NOW() in local time, not UTC. @tonygardella make sure this goes into the style guide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️ Removed all created_at, updated_at, NOW(),NOW() in benchmarking
settings <- tbl(bety,"reference_runs") %>% | ||
filter(id == settings$benchmark$reference_run_id) %>% | ||
dplyr::select(settings) %>% collect() %>% unlist() %>% | ||
xmlToList(.,"pecan") %>% append(settings,.) %>% Settings() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Impressive that you can do the query and insert all in one line. Probably would have taken me a page of code.
library(XML) | ||
library(lubridate) | ||
library(PEcAn.DB) | ||
library(PEcAn.utils) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
packages
library(XML) | ||
library(lubridate) | ||
library(PEcAn.DB) | ||
library(PEcAn.utils) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
packages
write.settings <- function(settings, outputfile = "pecan.CHECKED.xml"){ | ||
library(XML) | ||
library(PEcAn.DB) | ||
library(PEcAn.utils) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Can you think of a more appropriate name for this function? write.settings sounds like a function that reads in a settings object and writes it to file. This does a lot more than that! Unfortunately clean.settings and update.settings are already in use.
- packages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
settings2pecan.CHECKED?
Or should we not call the other function update.settings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK for today, but renaming the update.settings function so you can name this function update is definitely an option.
|
||
# Write pecan.CHECKED.xml | ||
settings <- write.settings(settings, outputfile = "pecan.CHECKED.xml") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the call to calc.benchmarks at the end of workflow.R
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️ Added it!
…tings without it saving an XML file.
…xml settings file may ony contain ensemble ids (and not BRR ids)
… new settings configuration
… are used elsewhere in the benchmarking workflow. This shouldn't affect anyone else's work. Adding a special case for annual data. This may need more work.
…table entry if there is no new run.
…led because scores wasn't in the schema.
12b7b48
to
f962b57
Compare
Major changes:
workflow.R
read.settings
reads thepecan.xml
filepecan.CHECKED.xml
pecan.xml
files - You can test by doing runs with my pecanbenchmarking.workflow.R
that can be used when not usingworkflow.R
Example XML files
Can be used to test benchmarking (both with
workflow.R
andbenchmark.workflow.R
(I'll check them off when they are working)
1 variable, 1 metric, 1 site, 1 model
modules/benchmark/inst/scripts/bm.1var.1metric.1site.1model.xml
2 variables, 2 metric, 1 site, 1 model
modules/benchmark/inst/scripts/bm.2var.2metric.1site.1model.xml
2 variables, 2 metric, 1 site, 1 model
modules/benchmark/inst/scripts/bm.2var.2metric.2site.1model.xml