plan interface for the suite of expected models #56

aappling-usgs · 2015-07-20T16:21:39Z

(This issue will be modified as I continue to think about it)

Desired models:

Day-by-day MLE with observation error to estimate P+R+K
Day-by-day Bayes with observation error to estimate P+R+K
Day-by-day MLE with process error to estimate P+R+K
Day-by-day Bayes with process error to estimate P+R+K
Nighttime regression by OLS to estimate K
Bayesian hierarchical approach to estimate K vs Q function?
Day-by-day MLE with observation error to estimate P+R given K
Day-by-day Bayes with observation error to estimate P+R given K
Day-by-day MLE with process error to estimate P+R given K
Day-by-day Bayes with process error to estimate P+R given K
Hierarchical Bayes with observation error to estimate P+R+K. With which hierarchy? So many options; we could do any combination of the following (see issue #57, "implement hierarchical bayesian options", for more):
- Constrain overall average (mean and/or tau) P, R, and/or K.
- Constrain day-to-day variation in P, R, and/or K.
- Constrain K to be near the daily K values estimated by nighttime regression

Options shared across MLE, Bayesian models

observation vs process error: calc_DO_fun = c('calc_DO_mod', 'calc_DO_mod_by_diff')
date delineation: c(start_hour, end_hour)
if taking K as given, then ts of K values should be supplied as an arg to metab_xxx
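
To make the date delineation option concrete, here is a rough sketch of how a c(start_hour, end_hour) pair might carve a timeseries into daily windows. It is written in Python purely for illustration (the package itself is R). The function names and the example hour values are hypothetical, and hours are assumed to be counted from midnight of the labeled date.

```python
from datetime import datetime, timedelta

def day_windows(first_date, n_days, start_hour, end_hour):
    """Return (label, start, end) for each modeled day.

    start_hour and end_hour are hours relative to midnight of the labeled
    date, so end_hour - start_hour > 24 yields overlapping windows.
    """
    windows = []
    for i in range(n_days):
        midnight = first_date + timedelta(days=i)
        windows.append((
            midnight.date(),
            midnight + timedelta(hours=start_hour),
            midnight + timedelta(hours=end_hour),
        ))
    return windows

def overlap_hours(windows):
    """Hours shared between each window and the one that follows it."""
    return [
        max(0.0, (a_end - b_start).total_seconds() / 3600)
        for (_, _, a_end), (_, b_start, _) in zip(windows, windows[1:])
    ]

# Hypothetical delineations: a strict 24-hour day vs. a night-day-night window.
strict = day_windows(datetime(2015, 7, 20), 3, start_hour=4, end_hour=28)
extended = day_windows(datetime(2015, 7, 20), 3, start_hour=-1.5, end_hour=30)

print(overlap_hours(strict))    # no shared data between consecutive days
print(overlap_hours(extended))  # each day shares part of a night with the next
```

With the extended window, consecutive days share 7.5 hours of nighttime data, which is the overlap the Questions section asks about.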

MLE models: metab_mle

constant parameters: inits are c(GPP=3, ER=-5, K600=5)
if taking K as given, then we should use a variant on onestation_negloglik that doesn't expect K600.daily among the params
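
A minimal sketch of the "K estimated" vs "K given" variants, assuming a simplified one-station oxygen budget with observation error only. This is Python for illustration, not the package's onestation_negloglik. The names `predict_do` and `make_negloglik`, the use of a gas exchange rate in d^-1 instead of K600, and all data values are assumptions made to keep the example short.

```python
import math

def predict_do(do_init, light_frac, do_sat, depth, gpp, er, k, dt):
    """Euler-step a simplified oxygen budget from the first observation.

    gpp and er are daily rates per unit area (er < 0), k is a gas exchange
    rate in d^-1, dt is the step in days, and light_frac sums to 1 over the day.
    """
    do = [do_init]
    for i in range(len(light_frac) - 1):
        ddo = (gpp * light_frac[i] / depth        # photosynthesis
               + er * dt / depth                  # respiration
               + k * dt * (do_sat[i] - do[-1]))   # gas exchange
        do.append(do[-1] + ddo)
    return do

def make_negloglik(do_obs, light_frac, do_sat, depth, dt, k_given=None):
    """Build a negative log-likelihood; K is a parameter only if k_given is None."""
    def negloglik(params):
        if k_given is None:
            gpp, er, k = params
        else:
            gpp, er = params
            k = k_given
        do_mod = predict_do(do_obs[0], light_frac, do_sat, depth, gpp, er, k, dt)
        n = len(do_obs)
        # Profile out the observation-error variance at its ML value.
        sigma2 = max(sum((o - m) ** 2 for o, m in zip(do_obs, do_mod)) / n, 1e-12)
        return 0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return negloglik

# Synthetic demo data (all values hypothetical), using the inits GPP=3, ER=-5, K=5.
n = 48
dt = 1.0 / n
light = [max(0.0, math.sin(math.pi * (i - 12) / 24)) for i in range(n)]
light_frac = [x / sum(light) for x in light]
do_sat = [9.0] * n
truth = predict_do(8.0, light_frac, do_sat, 1.0, 3.0, -5.0, 5.0, dt)
do_obs = [d + 0.01 * (-1) ** i for i, d in enumerate(truth)]

nll_free_k = make_negloglik(do_obs, light_frac, do_sat, 1.0, dt)
nll_fixed_k = make_negloglik(do_obs, light_frac, do_sat, 1.0, dt, k_given=5.0)
```

Either function can be handed to a general-purpose optimizer. The only difference is the length of the parameter vector, which is why a separate variant (or a switch like `k_given`) is needed when K is supplied.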

Bayesian models: metab_bayes

if non-hierarchical (independent days), constant parameters: DO.err.tau.shape=0.001, DO.err.tau.rate=0.001, GPP.daily.mu = 10, GPP.daily.tau = 1/(10^2), ER.daily.mu = -10, ER.daily.tau = 1/(10^2), K600.daily.mu = 10, K600.daily.tau = 1/(10^2)
if hierarchical, use mm_model_by_ply to produce new input data with non-overlapping (partially copied) plys
if taking K as given, then we should use variants on prepjags_bayes_simple and runjags_bayes_simple that don't expect K600.daily among the params to estimate
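
For reference, the prior constants above are easier to judge on the standard-deviation scale. JAGS parameterizes the normal distribution by precision (tau = 1/sd^2), so tau = 1/(10^2) means sd = 10. A small Python sketch of the conversion follows; the helper names are made up for this example.

```python
import math

def tau_to_sd(tau):
    """Convert a precision (tau = 1 / sd^2, as in JAGS's dnorm) to an sd."""
    return 1.0 / math.sqrt(tau)

def normal_cdf(x, mu, sd):
    """P(X < x) for X ~ Normal(mu, sd)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

# Constants copied from the non-hierarchical list above.
priors = {
    "GPP.daily": {"mu": 10.0, "tau": 1 / (10 ** 2)},
    "ER.daily": {"mu": -10.0, "tau": 1 / (10 ** 2)},
    "K600.daily": {"mu": 10.0, "tau": 1 / (10 ** 2)},
}

for name, p in priors.items():
    print(f"{name}: mean={p['mu']}, sd={tau_to_sd(p['tau']):.1f}")

# Share of the GPP prior that falls below zero.
gpp_sd = tau_to_sd(priors["GPP.daily"]["tau"])
p_negative_gpp = normal_cdf(0.0, priors["GPP.daily"]["mu"], gpp_sd)

# Gamma(shape, rate) prior on the observation-error precision DO.err.tau.
shape, rate = 0.001, 0.001
tau_prior_mean = shape / rate       # mean of the gamma prior
tau_prior_var = shape / rate ** 2   # variance of the gamma prior (very diffuse)
```

On this scale the normal priors are fairly wide: roughly 16% of the GPP prior mass sits below zero, so these constants do not by themselves keep GPP positive or ER negative.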

Questions

Is it OK to have overlapping days when modeling consecutive days? Does it matter whether the model is distinct for each day vs hierarchical using the distribution of daily estimates?
What to do about hierarchical models for which we have missing days? Can we ignore that there are gaps?


lawinslow · 2015-07-20T16:28:00Z

Is it OK to have overlapping days when modeling consecutive days?

I think it's OK, but keep in mind that overlap could add to autocorrelation in the timeseries of estimates. The estimates cannot be considered independent (they will probably be autocorrelated anyway, but overlap adds a new dimension), so you'd have to account for that in subsequent analysis. For example, what is average GPP when you have overlap? What does the math do to the estimate? I think it would increase the weight of the overlapping part of the signal.
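
A quick way to see the weighting effect is a toy Python calculation. It assumes, as a deliberate simplification, that each daily estimate weights every hour in its window equally (a real metabolism fit will not). The function name and window values are hypothetical.

```python
from collections import Counter

def hour_weights(n_days, start_hour, end_hour):
    """Weight of each hour (counted from midnight of day 0) in the mean of
    the daily estimates, if every window weighted its hours equally."""
    window_len = end_hour - start_hour
    weights = Counter()
    for day in range(n_days):
        for h in range(start_hour, end_hour):
            weights[24 * day + h] += 1.0 / (n_days * window_len)
    return weights

# Three 32-hour windows running from 22:00 the night before to 06:00 the day after.
w = hour_weights(n_days=3, start_hour=-2, end_hour=30)

print(w[12] * 96)  # noon on day 0: covered by one window
print(w[23] * 96)  # 23:00 on day 0: covered by two windows
```

Under this assumption, hours in the overlap count twice as much toward the multi-day average as hours covered by a single window.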

aappling-usgs · 2015-07-20T19:42:58Z

from group discussion at lunch, we've decided that from overlapping days we can expect:

increased autocorrelation of daily metabolism estimates
excessive confidence in daily estimates (narrower intervals than justified)
possibly some bias toward values of K and ER that predict nighttime curves better at the expense of predicting daytime curves well (because these overlapping days effectively each span 2 nights and 1 day).

robohall · 2015-07-20T21:03:31Z

The use of night-day-night extends back to Odum. The idea was to get night on both sides of the day to nail the ER term. It would be true for gas exchange too. But this was before extensive continuous data. To do one day would require starting at complete darkness (which will vary with time of year), just before dawn, to get GPP and then going as far as possible into the next night to get ER.

I think we should decide how we solve for ER based on data and not a hunch one way or the other. The thing to do would be to generate a month-long time series with varying and known ER, then try both approaches and see which gives back the best ER. The key is how to generate the fake data. We do not want ER varying randomly. Maybe allow it to wander. Or put a shock in (say a flood) lowering ER and then let it recover.

One thing about ER is that it is particularly tricky to measure. Unlike GPP, which is a relative change in O2, ER is an absolute difference. So from an estimation perspective it is probably best to use both nights to get more data. Yes, it adds autocorrelation, but if ER is not biologically autocorrelated, then we have big problems. The one way that we could make a mistake with using both nights is in high-GPP streams where the ER on any one night is a function of GPP the day before, so that daily variation in ER responds to the daily variation in GPP. Furthering the problem is that ER will change through the night as the stream temperature changes or as the yummy carbon from the day's photosynthesis gets eaten up. And ER during the day might be 2-10 times higher than at night, but there is not much we can do about that.

Hmm, that might be a way to vary ER with fake data, vary GPP and make ER a fraction of GPP above some base, as did Hall and Beaulieu.
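
A rough Python sketch of the three ways of generating known ER mentioned above: letting it wander, applying a shock with recovery, and tying it to GPP above a base. Every function name and number here is a placeholder chosen for illustration, and the GPP-linked version is only in the spirit of the idea, not a reproduction of any published model.

```python
import math
import random

def simulate_gpp(n_days, rng, mean=3.0, sd=1.0):
    """Daily GPP drawn around a mean and truncated at zero."""
    return [max(0.0, rng.gauss(mean, sd)) for _ in range(n_days)]

def er_wander(n_days, rng, start=-5.0, step_sd=0.3):
    """ER as a random walk (autocorrelated), capped at zero."""
    er = [start]
    for _ in range(n_days - 1):
        er.append(min(0.0, er[-1] + rng.gauss(0.0, step_sd)))
    return er

def er_shock(n_days, base=-5.0, shock_day=10, shocked=-1.0, recovery_rate=0.2):
    """ER magnitude drops on shock_day (say a flood), then recovers
    exponentially toward the base value."""
    er = []
    for d in range(n_days):
        if d < shock_day:
            er.append(base)
        else:
            decay = math.exp(-recovery_rate * (d - shock_day))
            er.append(base + (shocked - base) * decay)
    return er

def er_from_gpp(gpp, base=-2.0, fraction=0.5):
    """ER as a base rate plus a fixed fraction of the same day's GPP."""
    return [base - fraction * g for g in gpp]

rng = random.Random(1)  # fixed seed so the fake month is reproducible
gpp = simulate_gpp(30, rng)
scenarios = {
    "wander": er_wander(30, rng),
    "shock": er_shock(30),
    "gpp_linked": er_from_gpp(gpp),
}
```

Each scenario gives a month of known daily ER that could be run through an oxygen model and then recovered with both the single-night and two-night windows.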

Bob

aappling-usgs · 2015-07-20T21:34:07Z

That's interesting that the whole point of the extended day is to compensate for weak ER information, i.e., to intentionally bias the estimates toward those that fit the nighttime data best. For multi-day data, I wonder if hierarchical models that constrain day-to-day variation in ER (or even just the mean ER) would basically do the same job.

Agreed, it'd be cool to explore simulated ER and see which modeling approach extracts it best. Non-trivial, though, so it'll be a while before I might be able to take that on. I'll add a new issue to remind us that the opportunity is there for the taking when we have time.

aappling-usgs · 2015-07-28T21:56:13Z

this is coming together, at least for current needs.

aappling-usgs mentioned this issue Jul 20, 2015

implement hierarchical bayesian options #57

Closed

aappling-usgs mentioned this issue Jul 20, 2015

simulate ER to compare the effectiveness of different daily time windows #58

Closed

aappling-usgs added the blocker label Jul 21, 2015

aappling-usgs closed this as completed Jul 28, 2015
