Interpolation/date-matching for R timeseries objects #9

Open
rburghol opened this issue Nov 29, 2018 · 0 comments · Fixed by #65
rburghol commented Nov 29, 2018

Overview

The current implementation of timeseries objects in the openmi-om R version is very simplistic: it considers only an exact timestamp match and fails catastrophically if no match is found (the data retrieval line doesn't fail, only the method that sets the state variable). It needs to be more robust, handling missing or mismatched data and failing gracefully with an error. There also needs to be a means of interpolation. The PHP version of the vahydro timeseries object had the following modes:

  • 0: linear,
  • 1: previous value,
  • 2: next value,
  • 3: period mean,
  • 4: period min,
  • 5: period max,
  • 6: period sum
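As a starting point for discussion, here is a minimal base R sketch of how the seven modes could behave. The function name `ts_value`, its arguments, and the sample values are hypothetical and are not part of openmi-om or vahydro. It also assumes that the "period" for modes 3 to 6 is the half-open window from the requested time to one timestep later, which may not match the PHP behavior.

```r
# Hypothetical helper: value of a series at time t_query under one of the seven modes.
# times: sorted POSIXct vector; values: numeric vector; dt: timestep length in seconds.
# Assumption: the "period" for modes 3-6 is the window [t_query, t_query + dt).
ts_value <- function(times, values, t_query, dt, mode = 0) {
  if (!(mode %in% 0:6)) stop("unknown interpolation mode: ", mode)
  x <- as.numeric(times)
  t0 <- as.numeric(t_query)
  if (mode == 0) {
    # linear interpolation; NA outside the range of the data
    return(approx(x, values, xout = t0, method = "linear")$y)
  }
  if (mode == 1) {
    # previous value: last observation at or before t_query
    i <- findInterval(t0, x)
    return(if (i == 0) NA_real_ else values[i])
  }
  if (mode == 2) {
    # next value: first observation at or after t_query
    i <- which(x >= t0)
    return(if (length(i) == 0) NA_real_ else values[i[1]])
  }
  # period statistics over the observations that fall inside the window
  in_period <- values[x >= t0 & x < t0 + dt]
  if (length(in_period) == 0) return(NA_real_)
  switch(as.character(mode),
         "3" = mean(in_period),
         "4" = min(in_period),
         "5" = max(in_period),
         "6" = sum(in_period))
}

# Made-up sample data: three daily values, queried at noon on the first day
times <- as.POSIXct(c("1999-01-09", "1999-01-10", "1999-01-11"), tz = "UTC")
values <- c(0.48, 0.07, 0.20)
t_query <- as.POSIXct("1999-01-09 12:00:00", tz = "UTC")

ts_value(times, values, t_query, dt = 86400, mode = 0)  # halfway between 0.48 and 0.07
ts_value(times, values, t_query, dt = 86400, mode = 1)  # previous value, 0.48
ts_value(times, values, t_query, dt = 86400, mode = 2)  # next value, 0.07
```

Returning `NA` when no data is available, rather than erroring inside the lookup, leaves it to the caller to decide whether a gap should stop the run.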

Using SQL to create a perfectly matched and interpolated timeseries before run execution

See https://github.com/HARPgroup/openmi-om/blob/master/examples/timestamp_queries.R

  • Before the model run, use SQL to create a timeseries whose timestamps match the timer's timestamps exactly.
  • All data should be stored in UTC to avoid daylight saving time problems.
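The pre-run alignment described above can be sketched roughly as follows. This does not reproduce the linked timestamp_queries.R; it uses plain base R as a stand-in for the SQL step, and the timer range, column names, and values are all made up for illustration. Gaps are filled with the previous value (mode 1 in the list above), and the run stops with a clear error if the data does not cover the whole timer.

```r
# Hypothetical timer: hourly steps over two days, built in UTC so no step is
# shifted or duplicated by a daylight saving time change
timer <- data.frame(
  time = seq(as.POSIXct("1999-01-09 00:00:00", tz = "UTC"),
             as.POSIXct("1999-01-10 23:00:00", tz = "UTC"),
             by = "hour")
)

# Illustrative daily observations (made-up values)
obs <- data.frame(
  time = as.POSIXct(c("1999-01-09", "1999-01-10"), tz = "UTC"),
  precip = c(0.48, 0.07)
)

# For every timer step, find the last observation at or before it
idx <- findInterval(as.numeric(timer$time), as.numeric(obs$time))
idx[idx == 0] <- NA                 # steps before the first observation have no value
timer$precip <- obs$precip[idx]     # previous-value fill

# Fail before the run starts, with a clear message, instead of mid-run
if (anyNA(timer$precip)) {
  stop("input timeseries does not cover the full timer range")
}

head(timer, 3)
```

After this step every timer timestamp has exactly one row, so the run-time lookup can stay an exact match.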

xts date range retrieval functions

The xts class (extending zoo) allows for retrieval of date ranges:

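# Load xts for the timeseries class and lubridate for hours()
library(xts)
library(lubridate)
# Now create an instance of the timeSeriesInput class we've just made
k <- openmi.om.timeSeriesInput()
# Create dat by reading the tab-separated file at tmp_file
tmp_file <- "http://deq2.bse.vt.edu/files/icprb/potomac_111518_precip_in.tsv"
dat <- read.table(tmp_file, sep = "\t", header = TRUE)
# Convert dat into xts, indexed by the parsed Date column
# (dat still includes the Date column, so every value is stored as a string)
k$tsvalues <- xts(dat, order.by = as.POSIXct(dat$Date, format = "%m/%d/%Y"))
# Use xts timespan notation to select a 36-hour range, i.e.:
# tr <- "1999-01-09/1999-01-10 12:00:00"
tr <- paste(as.POSIXct("1999-01-09"), as.POSIXct("1999-01-09") + hours(36), sep = "/")
k$tsvalues[tr]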

Produces:

           Date        Northern Shenandoah
1999-01-09 "1/9/1999"  "0.48"   "0.52"    
1999-01-10 "1/10/1999" "0.07"   "0.01"  

Summary functions can then be applied to the selected range:

> mean(as.numeric(k$tsvalues[tr]$Northern))
[1] 0.275
> min(as.numeric(k$tsvalues[tr]$Northern))
[1] 0.07
> max(as.numeric(k$tsvalues[tr]$Northern))
[1] 0.48

Benchmarking/Performance

Previous versions of the model used timeseries inputs that were pre-processed so that all interpolation was done before execution, which allowed simple exact matching at run time. (Example: CBP runoff timeseries input https://github.com/HARPgroup/vahydro/issues/7 )

  • Benchmark exact matching against period summing.
  • Examine methods of pre-run period summing and benchmark the time taken.
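A rough way to set up that comparison in base R is sketched below, using `system.time`. The data is synthetic (random hourly values for one year, summed to daily steps) and all variable names are hypothetical, so the timings only indicate the relative cost of the two approaches, not how openmi-om itself would perform.

```r
set.seed(1)

# Synthetic inputs: hourly values for one non-leap year, daily model steps
obs_time <- seq(as.POSIXct("1999-01-01", tz = "UTC"), by = "hour", length.out = 24 * 365)
obs_val <- runif(length(obs_time))
step_time <- seq(as.POSIXct("1999-01-01", tz = "UTC"), by = "day", length.out = 365)
obs_x <- as.numeric(obs_time)
step_x <- as.numeric(step_time)

# Approach A: sum each period once before the run, then exact-match at run time
pre_time <- system.time({
  period_id <- findInterval(obs_x, step_x)
  daily_sum <- as.numeric(tapply(obs_val, period_id, sum))
})
exact_time <- system.time({
  exact <- vapply(step_x, function(s) daily_sum[match(s, step_x)], numeric(1))
})

# Approach B: sum the period on every step at run time
runtime_time <- system.time({
  runtime_sum <- vapply(step_x, function(s) {
    sum(obs_val[obs_x >= s & obs_x < s + 86400])
  }, numeric(1))
})

# Both approaches should give the same values; only the cost differs
stopifnot(isTRUE(all.equal(exact, runtime_sum)))
rbind(pre_run = pre_time, exact_match = exact_time, runtime_sum = runtime_time)
```

For a real benchmark the loop would need to run over the actual model timer and input files, ideally repeated several times, since single `system.time` calls on small inputs are noisy.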

rburghol linked a pull request on Dec 16, 2021 that will close this issue