plan interface for the suite of expected models #56

Closed
aappling-usgs opened this issue Jul 20, 2015 · 5 comments

@aappling-usgs
Contributor

(This issue will be modified as I continue to think about it)

Desired models:

  • Day-by-day MLE with observation error to estimate P+R+K
  • Day-by-day Bayes with observation error to estimate P+R+K
  • Day-by-day MLE with process error to estimate P+R+K
  • Day-by-day Bayes with process error to estimate P+R+K
  • Nighttime regression by OLS to estimate K
  • Bayesian hierarchical approach to estimate K vs Q function?
  • Day-by-day MLE with observation error to estimate P+R given K
  • Day-by-day Bayes with observation error to estimate P+R given K
  • Day-by-day MLE with process error to estimate P+R given K
  • Day-by-day Bayes with process error to estimate P+R given K
  • Hierarchical Bayes with observation error to estimate P+R+K. With which hierarchy? So many options. Could do any combination of the following. See issue #57 (implement hierarchical bayesian options) for more.
    • Constrain overall average (mean and/or tau) P, R, and/or K.
    • Constrain day-to-day variation in P, R, and/or K.
    • Constrain K to be near the daily K values estimated by nighttime regression

Options shared across MLE, Bayesian models

  • observation vs process error: calc_DO_fun = c('calc_DO_mod', 'calc_DO_mod_by_diff')
  • date delineation: c(start_hour, end_hour)
  • if taking K as given, then ts of K values should be supplied as an arg to metab_xxx
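
A minimal sketch of how a c(start_hour, end_hour) date delineation could translate into time windows. It assumes the hours are counted from midnight of the labeled date and may run below 0 or above 24; that convention, and the helper names day_window and overlap_hours, are illustrative assumptions and not the package's API.

```python
from datetime import date, datetime, timedelta


def day_window(day, start_hour, end_hour):
    """Return the (start, end) datetimes of one modeling 'day'.

    Hours are offsets from midnight of `day` (an assumption), so
    start_hour=-1.5 begins at 22:30 on the previous calendar date.
    """
    midnight = datetime(day.year, day.month, day.day)
    return (midnight + timedelta(hours=start_hour),
            midnight + timedelta(hours=end_hour))


def overlap_hours(start_hour, end_hour):
    """Hours shared by the windows of two consecutive days (0 if none)."""
    return max(0.0, (end_hour - start_hour) - 24.0)


start, end = day_window(date(2015, 7, 20), -1.5, 30)
print(start, end, overlap_hours(-1.5, 30))
```

Any window longer than 24 hours overlaps its neighbors, which is the situation the Questions section below asks about.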

MLE models: metab_mle

  • constant parameters: inits are c(GPP=3, ER=-5, K600=5)
  • if taking K as given, then we should use a variant on onestation_negloglik that doesn't expect K600.daily among the params

Bayesian models: metab_bayes

  • if non-hierarchical (independent days), constant parameters: DO.err.tau.shape=0.001, DO.err.tau.rate=0.001, GPP.daily.mu = 10, GPP.daily.tau = 1/(10^2), ER.daily.mu = -10, ER.daily.tau = 1/(10^2), K600.daily.mu = 10, K600.daily.tau = 1/(10^2)
  • if hierarchical, use mm_model_by_ply to produce new input data with non-overlapping (partially copied) plys
  • if taking K as given, then we should use variants on prepjags_bayes_simple and runjags_bayes_simple that don't expect K600.daily among the params to estimate
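
To make the prior constants above easier to read, here is a small sketch that converts them to more familiar scales. It assumes each tau is a precision (1/variance), as in JAGS's dnorm, and that DO.err.tau has a gamma(shape, rate) prior; the helper names are hypothetical.

```python
import math


def tau_to_sd(tau):
    """Standard deviation implied by a precision tau = 1 / sd^2."""
    return 1.0 / math.sqrt(tau)


def gamma_mean_var(shape, rate):
    """Mean and variance of a gamma(shape, rate) distribution."""
    return shape / rate, shape / rate ** 2


# (mu, tau) pairs copied from the list above
normal_priors = {
    "GPP.daily": (10.0, 1 / (10 ** 2)),
    "ER.daily": (-10.0, 1 / (10 ** 2)),
    "K600.daily": (10.0, 1 / (10 ** 2)),
}

for name, (mu, tau) in normal_priors.items():
    print(f"{name}: mean={mu}, sd={tau_to_sd(tau):.1f}")

print("DO.err.tau prior mean, variance:", gamma_mean_var(0.001, 0.001))
```

Under that assumption, tau = 1/(10^2) corresponds to a prior standard deviation of 10 for each daily parameter, and the gamma(0.001, 0.001) prior on the error precision has mean 1 with a very large variance.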

Questions

  • Is it OK to have overlapping days when modeling consecutive days? Does it matter whether the model is distinct for each day vs hierarchical using the distribution of daily estimates?
  • What to do about hierarchical models for which we have missing days? Can we ignore that there are gaps?
@lawinslow

Is it OK to have overlapping days when modeling consecutive days?

I think it's OK, but keep in mind that overlap could add to autocorrelation in the time series of estimates. The estimates cannot be considered independent (they will probably be autocorrelated anyway, but overlap adds a new dimension). So you'd have to account for it in subsequent analysis. For example, what is average GPP when you have overlap? What does the math do to the estimate? I think it would increase the weight of the overlapping part of the signal.
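
A small sketch (not from the thread) of the weighting point: it counts how many daily windows each hour of data falls into. The window of -2 to 30 hours relative to midnight is an arbitrary illustration, and the function name coverage is hypothetical.

```python
from collections import Counter


def coverage(n_days, start_hour, end_hour):
    """Count how many daily windows include each whole hour.

    Day d covers hours [24*d + start_hour, 24*d + end_hour).
    """
    counts = Counter()
    for d in range(n_days):
        for h in range(24 * d + start_hour, 24 * d + end_hour):
            counts[h] += 1
    return counts


counts = coverage(n_days=3, start_hour=-2, end_hour=30)
print("noon on day 0 is used", counts[12], "time(s)")
print("23:00 on day 0 is used", counts[23], "time(s)")
```

Hours inside the overlap enter two daily fits, so any average over the daily estimates implicitly gives those hours more weight than the hours that appear in only one window.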

@aappling-usgs
Contributor Author

from group discussion at lunch, we've decided that from overlapping days we can expect:

  • increased autocorrelation of daily metabolism estimates
  • excessive confidence in daily estimates (narrower intervals than justified)
  • possibly some bias toward values of K and ER that predict nighttime curves better at the expense of predicting daytime curves well (because these overlapping days effectively each span 2 nights and 1 day).

@robohall
Copy link

The use of night-day-night extends back to Odum. The idea was to get night on both sides of the day to nail the ER term. It would be true for gas exchange too. But this was before extensive continuous data. To do one day would require starting at complete darkness (will vary with time of year), just before dawn to get GPP, and then going as far as possible into the next night to get ER.

I think we should base how we solve for ER on data and not a hunch one way or the other. The thing to do would be to generate a month-long time series with varying and known ER and then try both approaches and see which gives back the best ER. The key is how to generate the fake data. We do not want ER varying randomly. Maybe allow it to wander. Or put a shock in (say a flood) lowering ER and then it recovers.

One thing about ER is that it is particularly tricky to measure. Unlike GPP, which is a relative change in O2, ER is an absolute difference. So from an estimation perspective it is probably best to use both nights to get more data. Yes, it adds autocorrelation, but if ER is not biologically autocorrelated, then we have big problems. The one way that we could make a mistake with using both nights is in high-GPP streams where the ER on any one night is a function of GPP the day before, so that daily variation in ER responds to the daily variation in GPP. Furthering the problem is that ER will change through the night as the stream temp changes or as the yummy carbon from the day's photosynthesis gets eaten up. And ER during the day might be 2-10 times higher than the night, but there is not much we can do about that.

Hmm, that might be a way to vary ER with fake data, vary GPP and make ER a fraction of GPP above some base, as did Hall and Beaulieu.

Bob
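
One way the fake-data idea above might be sketched: let daily GPP wander as a random walk and set ER to a base rate plus a fraction of that day's GPP. Everything numeric here (base ER of -2, fraction of 0.5, step size, starting GPP) is an arbitrary assumption for illustration, ER is taken as negative to match the ER=-5 inits listed earlier, and this is not the procedure from Hall and Beaulieu.

```python
import random


def simulate_daily_metab(n_days, er_base=-2.0, er_frac=0.5,
                         gpp_start=3.0, gpp_step=0.3, seed=1):
    """Simulate known daily GPP and ER values for testing estimators.

    GPP follows a random walk (floored at 0) so it wanders rather than
    jumping independently each day. ER = er_base - er_frac * GPP, so
    respiration gets more negative on high-GPP days.
    """
    rng = random.Random(seed)
    gpp = gpp_start
    days = []
    for day in range(n_days):
        gpp = max(0.0, gpp + rng.gauss(0.0, gpp_step))
        er = er_base - er_frac * gpp
        days.append((day, gpp, er))
    return days


for day, gpp, er in simulate_daily_metab(5):
    print(f"day {day}: GPP={gpp:.2f}, ER={er:.2f}")
```

The known GPP and ER values could then drive a simulated DO time series, which each modeling approach would be asked to recover.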

@aappling-usgs
Contributor Author

That's interesting that the whole point of the extended day is to compensate for weak ER information, i.e., to intentionally bias the estimates toward those that fit the nighttime data best. For multi-day data, I wonder if hierarchical models that constrain day-to-day variation in ER (or even just the mean ER) would basically do the same job.

Agreed, it'd be cool to explore simulated ER and see which modeling approach extracts it best. Non-trivial, though, so it'll be a while before I might be able to take that on. I'll add a new issue to remind us that the opportunity is there for the taking when we have time.

@aappling-usgs
Contributor Author

this is coming together, at least for current needs.
